How to update records only storing changes?

Issue

My backend, written in Python with Flask, splits JSON data across several endpoints so that my Swift client app can retrieve only what it needs instead of downloading the full data set client-side.

JSON file:

{
  "pilots": [
    {
      "cid": 1234567,
      "name": "John Smith",
      "callsign": "TIA1",
      "server": "USA-WEST",
      "pilot_rating": 3,
      "latitude": 18.42663,
      "longitude": 116.15007,
      "altitude": 41038,
      "groundspeed": 435,
      "transponder": "2200",
      "heading": 154,
      "qnh_i_hg": 29.96,
      "qnh_mb": 1015,
      "flight_plan": {
        "flight_rules": "I",
        "aircraft": "B737/M-VGDW/C",
        "aircraft_faa": "B737/L",
        "aircraft_short": "B737",
        "departure": "LTBA",
        "arrival": "WAMM",
        "alternate": "",
        "cruise_tas": "437",
        "altitude": "33000",
        "deptime": "1230",
        "enroute_time": "1415",
        "fuel_time": "1542",
        "remarks": "PBN/B1D1O1S1 DOF/221107 REG/VPCTY EET/LTAA0020 UDDD0137 UBBA0151 UTAK0222 UTAA0247 UTAV0309 UTSD0322 UTTR0345 UAII0352 UTTR0354 UCFO0412 UCFM0434 ZWUQ0451 ZLHW0606 ZPKM0741 ZGZU0856 VHHK0946 RPHI1020 WAAF1251 SEL/EJKS CODE/ADF5D2 OPR/TEXAS AIR LLC ORGN/KCHIUALE PER/C  RMK/CALLSIGN \"TEXAS\"  /V/",
        "route": "ASMAP UL333 SIV UA4 ERZ UB374 INDUR N449 DUKAN A480 KRS B701 TUGTA A909 BABUM A477 POGON L143 TISIB L141 KAMUD W186 SADAN Y1 OMBON B330 AVPAM A599 POU B330 CH A583 ZAM A461 BONDA",
        "revision_id": 4,
        "assigned_transponder": "0363"
      },
      "logon_time": "2022-11-06T07:07:42.1130925Z",
      "last_updated": "2022-11-07T22:36:19.1087668Z"
    }
  ]
}

I import the JSON into several SQLite tables. The JSON is updated every 60 seconds, so I need to update my copy accordingly. My current solution is to delete the data in the database and then reinsert it, but that is most likely not the right way. I'm not sure how to go about diffing the records in the database against the latest JSON. I could retrieve all records and compare old and new row by row, but this could be even less efficient. What's a robust way of doing this in Python?

Code to insert pilots and associated flight plans:

import sqlite3


def _store_pilots(pilots):
    """Removes all records from the db, then stores pilots and associated flight plans, checking for duplicate CIDs."""
    pilots_list = []
    seen_cids = set()  # a set makes the duplicate check O(1) per pilot
    fp_list = []
    for pilot in pilots:
        cid = int(pilot['cid'])
        if cid in seen_cids:
            continue
        seen_cids.add(cid)
        pilot_tuple = (
            pilot['cid'], pilot['name'],
            pilot['callsign'], pilot['server'],
            pilot['pilot_rating'],
            pilot['latitude'], pilot['longitude'],
            pilot['altitude'], pilot['groundspeed'],
            pilot['transponder'], pilot['heading'],
            pilot['qnh_i_hg'], pilot['qnh_mb'],
            pilot['logon_time'], pilot['last_updated']
        )
        pilots_list.append(pilot_tuple)
        if pilot['flight_plan']:
            fp = pilot['flight_plan']
            fp_tuple = (
                pilot['cid'], fp['flight_rules'],
                fp['aircraft'], fp['aircraft_faa'], fp['aircraft_short'],
                fp['departure'], fp['arrival'], fp['alternate'],
                fp['cruise_tas'], fp['altitude'],
                fp['deptime'], fp['enroute_time'], fp['fuel_time'],
                fp['remarks'], fp['route'], fp['revision_id']
            )
            fp_list.append(fp_tuple)

    with sqlite3.connect(DATABASE_PATH) as connection:
        connection.execute("PRAGMA foreign_keys = 1")
        c = connection.cursor()
        c.execute("DELETE FROM pilots")
        c.executemany("""
            INSERT INTO pilots(cid, name, callsign, server, pilot_rating, latitude, longitude, altitude, groundspeed, transponder, heading, qnh_i_hg, qnh_mb, logon_time, last_updated)
            values(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
        """, pilots_list)

        c.executemany("""
        INSERT INTO flight_plans VALUES (?, ?, ?, ?, ?, ?, ?, ?,
        ?, ?, ?, ?, ?, ?, ?, ?
        )
        """, fp_list)

Solution

You can create a table with the columns cid, raw_json_value and json_hash, and compare the hash value before each update or insert. Here is an example:

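import hashlib
import json

data = [{'cid': 1, ...}, {'cid': 2, ...}, {'cid': 3, ...}]
for item in data:  # type: dict
    cid = item['cid']
    # Hash the individual item, not the whole list. The built-in hash() is
    # salted per process for strings, so a stored value needs hashlib instead.
    payload = json.dumps(item, sort_keys=True)
    json_hash = hashlib.sha256(payload.encode('utf-8')).hexdigest()
    # Record - let's say a model from db
    # (get_record_by_cid, save_record and update_record stand in for your db layer)
    record = get_record_by_cid(cid)
    if not record:
        save_record(cid, item, json_hash)
        continue

    if record.json_hash != json_hash:
        update_record(cid, item, json_hash)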
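
The hash check above could be wired to SQLite roughly as follows. This is only a sketch under some assumptions: the table name pilot_snapshots and the functions stable_hash and sync_pilots are made up for illustration, SHA-256 is one possible choice of stable hash, and deleting pilots that no longer appear in the feed is an addition that mirrors the effect of the question's DELETE rather than part of the answer itself.

```python
import hashlib
import json
import sqlite3


def stable_hash(item):
    """Return a SHA-256 hex digest that does not depend on key order or process."""
    payload = json.dumps(item, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def sync_pilots(connection, pilots):
    """Insert new pilots, update changed ones, delete absent ones.

    Returns a tuple of (inserted, updated, deleted) counts.
    """
    # Hypothetical table with the three columns suggested in the answer.
    connection.execute(
        """CREATE TABLE IF NOT EXISTS pilot_snapshots (
               cid INTEGER PRIMARY KEY,
               raw_json_value TEXT NOT NULL,
               json_hash TEXT NOT NULL
           )"""
    )
    # Load every stored hash with one query instead of one lookup per pilot.
    stored = dict(connection.execute("SELECT cid, json_hash FROM pilot_snapshots"))

    to_insert, to_update, seen = [], [], set()
    for pilot in pilots:
        cid = int(pilot["cid"])
        if cid in seen:  # skip duplicate CIDs, as the question's code does
            continue
        seen.add(cid)
        digest = stable_hash(pilot)
        raw = json.dumps(pilot)
        if cid not in stored:
            to_insert.append((cid, raw, digest))
        elif stored[cid] != digest:
            to_update.append((raw, digest, cid))

    # Assumption: a pilot missing from the latest feed should be removed.
    to_delete = [(cid,) for cid in stored if cid not in seen]

    with connection:  # commit all three steps as a single transaction
        connection.executemany(
            "INSERT INTO pilot_snapshots VALUES (?, ?, ?)", to_insert
        )
        connection.executemany(
            "UPDATE pilot_snapshots SET raw_json_value = ?, json_hash = ? WHERE cid = ?",
            to_update,
        )
        connection.executemany(
            "DELETE FROM pilot_snapshots WHERE cid = ?", to_delete
        )
    return len(to_insert), len(to_update), len(to_delete)
```

Calling sync_pilots(sqlite3.connect(DATABASE_PATH), pilots) every 60 seconds would then write only the rows that differ. One caveat: if a field such as last_updated changes on every refresh, every record will look changed, so it may be worth leaving such fields out of the dict passed to stable_hash.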

Answered By – Danila Ganchar

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0
