Handling large SQL Query results

Recently I encountered an odd problem while working on an app for the current book I’m writing. The app worked as-is, but I wanted to improve the startup performance by using a SQLite database rather than using an XHR to load a large JSON file. I added the cordova-sqlite-ext plugin because it had built-in support for pre-populated databases. I also have a promise-based wrapper that I like to use.

I should mention that both the JSON and resulting SQLite database are large – the JSON file is 27MiB and the SQLite database is 45MiB. As such, an unlimited result set from a naive query could return a lot of rows – nearly 150,000.

The naive code, by the way, looked something like this:

db.query({sql: "select lemma from lemmas order by lemma"})
  .then((r) => { this.sortedIndex = r.rows.map(r) => r.lemma })
  ...

This looks like it should work, and it did work in Safari, in Chrome, and in the iOS simulator. However, this failed miserably on a physical device (yet another example of why it is always important to test on physical devices).

Unfortunately, there was no error. There was nothing interesting on the console. The app just didn’t work right. I investigated further, and it turned out that the above promise wasn’t resolving correctly. What blew my mind was that even unrelated promises failed to resolve once this particular piece of code was executed. It took an alert(1); to get any promises to work again.

Today I took a closer look and stepped through the native code, and it suddenly occurred to me: the plugin has to transfer a lot of data over the Cordova bridge. Although querying the SQLite database is a lot faster than iterating over a JSON object, the latter stays entirely within the JavaScript context. The plugin, on the other hand, was having to transfer data between the native and hybrid contexts. Furthermore, my wrapper converts the rows in the result set into a normal Array. While this makes for easier processing later (I can use map and the like), this also takes time and memory.

I tried a shorter query that would return one row. BOOM, it worked. It was now obvious that the problem was related to the amount of data being selected. So I tried some more numbers. I was able to successfully load up to one-hundred thousand records at once, but the more records I loaded, the more I recognized a problem: the app would become increasingly unresponsive. As far as I can tell, this is due in part to the data transfer and also due the conversion of the rows into an array by my wrapper, and obviously the more data one has to transfer and convert, the longer it takes.

At some point after one-hundred thousand records, the process breaks down completely, and the data is never returned to the promise. As such, the promise is just sitting there, twiddling its thumbs, waiting for data that will never arrive.

So how can we fix this? Turns out, rather easily. SQLite understands the LIMIT keyword, which allows us to break up a query into smaller segments. Typically this might be used for paging a result set, but it works for our purposes as well. This is what the final result looks like:

db.query({sql: "select count(*) as numLemmas from lemmas"}).then((r) => {
    const ITEMS_PER = 10000;
    let numLemmas = r.rows[0].numLemmas;
    let p = Promise.resolve();
    for (let i = 0; i < numLemmas; i += ITEMS_PER) {
        p = p.then(() => db.query({sql: "select lemma from lemmas order by lemma limit ?, ?", binds:[i, ITEMS_PER]}))
             .then((r) => this.sortedIndex.push(...r.rows.map((r) => r.lemma)));
    }
    return p;
})...

This is what the above code does:

  • Selects the number of records that the query will return

  • Creates a series of promises that process only a portion of the total query, in 10,000 record blocks.

    • The key here is LIMIT ?, ?. The first parameter is the starting position in the result set, and the second is the number of records that are returned.
  • Pushes the results into the target array.

Of course, for a query that returns nearly 150,000 records, this has to generate 15 queries. Interestingly enough, however, the return is nearly instantaneous, and doesn’t ever block the app itself, so everything remains responsive, and should the query ever return even more rows, the app will still continue to function correctly.

There is, of course, an upper limit to how much we can store in an array, of course. That said, JavaScript doesn’t do terribly badly here – A sorted 150,000 array of words doesn’t take up much space at all. Right now the app is consuming around 20MiB of RAM using Xcode’s profiler, and that’s with a full DOM and a few views on screen. While this might be tight on devices with a ridiculously low amount of RAM, it’s more than workable on any device within the last three or so years.

Kerri Shotts

Kerri Shotts is a mobile hybrid app developer, an author of several tech books on Cordova, and a photographer.

Southern Illinois