Offline *Only* Viewing

It made the rounds a while back when Chris Bolin built a page of his personal website that can only be viewed while you're offline.

This page itself is an experiment in that vein: What if certain content required us to disconnect? What if readers had access to that glorious focus that makes devouring a novel for hours at a time so satisfying? What if creators could pair that with the power of modern devices? Our phones and laptops are amazing platforms for inventive content—if only we could harness our own attention.

Now Bolin has a whole magazine around this same concept called The Disconnect!

The Disconnect is an offline-only, digital magazine of commentary, fiction, and poetry. Each issue forces you to disconnect from the internet, giving you a break from constant distractions and relentless advertisements.

I believe it’s some Service Worker trickery to serve different files depending on the state of the network. Usually, Service Workers are used to serve cached files when the network is off or slow, so that the website continues to work. This flips that logic on its head, preventing files from being served until the network is off.
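A minimal sketch of that inversion might look like this (hypothetical file names, not Bolin's actual code):

```js
// sw.js: a hypothetical sketch of "offline-only" serving.
// Both pages are pre-cached at install time.
self.addEventListener("install", evt => {
  evt.waitUntil(
    caches.open("offline-only").then(cache =>
      cache.addAll(["/please-disconnect.html", "/the-content.html"])
    )
  );
});

self.addEventListener("fetch", evt => {
  if (evt.request.mode !== "navigate") return;
  evt.respondWith(
    // If the network answers, the user is online: show the nag page.
    // Only when the fetch fails do we serve the real content.
    fetch(evt.request)
      .then(() => caches.match("/please-disconnect.html"))
      .catch(() => caches.match("/the-content.html"))
  );
});
```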



Making your web app work offline, Part 2: The Implementation

This two-part series is a gentle, high-level introduction to offline web development. In Part 1 we got a basic service worker running, which caches our application resources. Now let’s extend it to support offline.

Article Series:

  1. The Setup
  2. The Implementation (you are here!)

Making an `offline.htm` file

Next, let’s add some code to detect when the application is offline and, if so, redirect our users to a (cached) `offline.htm` file.

But wait, if the service worker file is generated automatically, how do we add our own code to it, manually? Well, we can add an entry for importScripts, which tells our service worker to import the scripts we specify. It does this through the service worker’s native importScripts function, which is well-named. And we’ll also add our `offline.htm` file to our statically cached list of files. The new entries are shown below:

```js
new SWPrecacheWebpackPlugin({
  mergeStaticsConfig: true,
  filename: "service-worker.js",
  importScripts: ["../sw-manual.js"],
  staticFileGlobs: [
    //...
    "offline.htm"
  ],
  // the rest of the config is unchanged
})
```

Now, let’s go in our `sw-manual.js` file, and add code to load the cached `offline.htm` file when the user is offline.

```js
toolbox.router.get(/books$/, handleMain);
toolbox.router.get(/subjects$/, handleMain);
toolbox.router.get(/localhost:3000\/$/, handleMain);
toolbox.router.get(/mylibrary.io$/, handleMain);

function handleMain(request) {
  return fetch(request).catch(() => {
    return caches.match("react-redux/offline.htm", { ignoreSearch: true });
  });
}
```

We’ll use the toolbox.router object we saw before to catch all our top-level routes, and if the main page doesn’t load from the network, send back the (hopefully cached) `offline.htm` file.

This is one of the few times in this post you’ll see promises used directly, instead of with the async syntax, mainly because in this case it’s easier to tack on a `.catch()` than to set up a `try`/`catch` block.
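For comparison, here’s the same fallback written both ways (a sketch; the router code above uses the first form):

```js
// Promise form: just tack a .catch() onto the fetch
function handleMain(request) {
  return fetch(request).catch(() =>
    caches.match("react-redux/offline.htm", { ignoreSearch: true })
  );
}

// async/await form: the same logic needs a full try/catch block
async function handleMainAsync(request) {
  try {
    return await fetch(request);
  } catch (er) {
    return caches.match("react-redux/offline.htm", { ignoreSearch: true });
  }
}
```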

The `offline.htm` file will be pretty basic: just some HTML that reads cached books from IndexedDB and displays them in a rudimentary table. But before showing that, let’s walk through how to actually use IndexedDB (if you want to see the page now, it’s shown near the end of this post).

Hello World, IndexedDB

IndexedDB is an in-browser database. It’s ideal for enabling offline functionality since it can be accessed without network connectivity, but it’s by no means limited to that.

The API pre-dates Promises, so it’s callback-based. We’ll go through everything with the native API, but in practice you’ll likely want to wrap and simplify it, either with your own helper methods which wrap the functionality with Promises, or with a third-party utility.
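For a taste of what such a wrapper might look like, here’s a minimal sketch (not part of the project’s code):

```js
// Wrap a callback-based IDBRequest in a promise so it can be awaited.
function promisify(request) {
  return new Promise((resolve, reject) => {
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// usage (assuming the database already exists at this version):
// let db = await promisify(indexedDB.open("books", 1));
```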

Let me repeat: the API for IndexedDB is awful. Here’s Jake Archibald saying he wouldn’t even teach it directly.

We’ll still go over it because I really want you to see everything as it is, but please don’t let it scare you away. There are plenty of simplifying abstractions out there, for example dexie and idb.

Setting up our database

Let’s add code to `sw-manual.js` that subscribes to the service worker’s activate event and checks to see if we already have our IndexedDB set up; if not, we’ll create it, and then fill it with data.

First, the creating bit.

self.addEventListener("activate", () => { //1 is the version of IDB we're opening let open = indexedDB.open("books", 1); //should only be called the first time, when version 1 does not exist open.onupgradeneeded = evt => { let db = open.result; //this callback should only ever be called upon creation of our IDB, when an upgrade is needed //for version 1, but to be doubly safe, and also to demonstrade this, we'll check to see //if the stores exist if (!db.objectStoreNames.contains("books") || !db.objectStoreNames.contains("syncInfo")) { if (!db.objectStoreNames.contains("books")) { let bookStore = db.createObjectStore("books", { keyPath: "_id" }); bookStore.createIndex("imgSync", "imgSync", { unique: false }); } if (!db.objectStoreNames.contains("syncInfo")) { db.createObjectStore("syncInfo", { keyPath: "id" }); evt.target.transaction .objectStore("syncInfo") .add({ id: 1, lastImgSync: null, lastImgSyncStarted: null, lastLoadStarted: +new Date(), lastLoad: null }); } evt.target.transaction.oncomplete = fullSync; } };
});

The code’s messy and manual; as I said, you’ll likely want to add some abstractions in practice. Some of the key points: we check for the object stores (tables) we’ll be using, and create them as needed. Note that we can even create indexes, as seen on the books store with the imgSync index. We also create a syncInfo store (table) which we’ll use to record when we last synced our data, so we don’t pester our servers too frequently, asking for updates.

When the transaction has completed, at the very bottom, we call the fullSync method, which loads all our data. Let’s see what that looks like.

Performing an initial sync

Below is the relevant portion of the syncing code, which makes repeated calls to our endpoint to load our books, page by page, adding each result to IDB along the way. Again, this is using zero abstractions, so expect a lot of bloat.

See this GitHub gist for the full code, which includes some additional error handling, and code which runs when the last page is finished.

```js
function fullSyncPage(db, page) {
  let pageSize = 50;
  doFetch("/book/offlineSync", { page, pageSize })
    .then(resp => resp.json())
    .then(resp => {
      if (!resp.books) return;
      let books = resp.books;
      let i = 0;
      putNext();

      function putNext() {
        //callback for an insertion, with an indicator that it hasn't had images cached yet
        if (i < pageSize) {
          let book = books[i++];
          let transaction = db.transaction("books", "readwrite");
          let booksStore = transaction.objectStore("books");
          //extend the book with the imgSync indicator, add it, and on success, do this for the next book
          booksStore.add(Object.assign(book, { imgSync: 0 })).onsuccess = putNext;
        } else {
          //either load the next page, or call loadDone()
        }
      }
    });
}
```

The putNext() function is where the real work is done; it serves as the success callback for each insertion. In real life we’d hopefully have a nice method that adds each book, wrapped in a promise, so we could do a simple for...of loop and await each insertion (sketched below). But this is the “vanilla” solution, or at least one of them.
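Here’s what that nicer version might look like (a hypothetical helper, not the project’s code):

```js
// Wrap each insertion in a promise so a plain loop can await it.
function addBook(db, book) {
  return new Promise((resolve, reject) => {
    let booksStore = db.transaction("books", "readwrite").objectStore("books");
    let req = booksStore.add(Object.assign(book, { imgSync: 0 }));
    req.onsuccess = resolve;
    req.onerror = () => reject(req.error);
  });
}

async function putAll(db, books) {
  for (let book of books) {
    await addBook(db, book);
  }
}
```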

We modify each book before inserting it, setting the imgSync property to 0 to indicate that this book has not had its image cached yet.

And after we’ve exhausted the last page, and there are no more results, we call loadDone(), to set some metadata indicating the last time we did a full data sync.

In real life, this would be a good time to sync all those images, but let’s instead trigger that on demand from the web app itself, in order to demonstrate another feature of service workers.

Communicating between the web app and the service worker

Let’s just pretend it would be a good idea to load the books’ covers the next time the user visits our page, while the service worker is running. Let’s have our web app send a message to the service worker, and we’ll have the service worker receive it and then sync the book covers.

From our app code, we attempt to send a message to a running service worker, instructing it to sync images.

In the web app:

if ("serviceWorker" in navigator) { try { navigator.serviceWorker.controller.postMessage({ command: "sync-images" }); } catch (er) {}
}

In `sw-manual.js`:

self.addEventListener("message", evt => { if (evt.data && evt.data.command == "sync-images") { let open = indexedDB.open("books", 1); open.onsuccess = evt => { let db = open.result; if (db.objectStoreNames.contains("books")) { syncImages(db); } }; }
});

In `sw-manual.js` we have code to catch that message and call the syncImages() method. Let’s look at that next.

```js
function syncImages(db) {
  let tran = db.transaction("books");
  let booksStore = tran.objectStore("books");
  let idx = booksStore.index("imgSync");
  let booksCursor = idx.openCursor(0);
  let booksToUpdate = [];

  //a cursor's onsuccess callback will fire for EACH item that's read from it
  booksCursor.onsuccess = evt => {
    let cursor = evt.target.result;
    //if (!cursor), the cursor has been exhausted; there are no more results
    if (!cursor) return runIt();

    let book = cursor.value;
    booksToUpdate.push({ _id: book._id, smallImage: book.smallImage });
    //read the next item from the cursor
    cursor.continue();
  };

  async function runIt() {
    if (!booksToUpdate.length) return;

    for (let book of booksToUpdate) {
      try {
        //fetch, and cache the book's image
        await preCacheBookImage(book);
        let tran = db.transaction("books", "readwrite");
        let booksStore = tran.objectStore("books");
        //now save the updated book - we'll wrap the IDB callback-based operation in
        //a manual promise, so we can await it
        await new Promise(res => {
          let req = booksStore.get(book._id);
          req.onsuccess = ({ target: { result: bookToUpdate } }) => {
            bookToUpdate.imgSync = 1;
            booksStore.put(bookToUpdate);
            res();
          };
          req.onerror = () => res();
        });
      } catch (er) {
        console.log("ERROR", er);
      }
    }
  }
}
```

We’re cracking open the imgSync index from before, and reading all books that have a zero, which means they haven’t had their images synced yet. The booksCursor.onsuccess callback will be called over and over again, until there are no books left; I’m using this to put them all into an array, at which point I call the runIt() method, which runs through them, calling preCacheBookImage() for each. This method will cache the image, and if there are no unforeseen errors, update the book in IDB to indicate that imgSync is now 1.

If you’re wondering why in the world I’m going through the trouble to save all the books from the cursor into an array, before calling runIt(), rather than just walking through the results of the cursor, and caching and updating as I go, well — it turns out transactions in IndexedDB are a bit weird. They complete when you yield to the event loop unless you yield to the event loop in a method provided by the transaction. So if we leave the event loop to go do other things, like make a network request to pull down an image, then the cursor’s transaction will complete, and we’ll get an error if we try to continue reading from it later.
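To make that failure mode concrete, here’s a sketch of the pattern that does not work:

```js
// ANTI-PATTERN (sketch): awaiting non-IDB work in the middle of a cursor.
booksCursor.onsuccess = async evt => {
  let cursor = evt.target.result;
  if (!cursor) return;

  // Yielding to the event loop for a network fetch lets the cursor's
  // transaction auto-commit behind our back...
  await preCacheBookImage(cursor.value);

  // ...so by the time we get here the transaction is finished, and this
  // throws (a TransactionInactiveError in most browsers).
  cursor.continue();
};
```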

Manually updating the cache

Let’s wrap this up and look at the preCacheBookImage() method, which actually pulls down a cover image and adds it to the relevant cache (but only if it’s not there already).

```js
async function preCacheBookImage(book) {
  let smallImage = book.smallImage;
  if (!smallImage) return;

  let cachedImage = await caches.match(smallImage);
  if (cachedImage) return;

  if (/https:\/\/s3.amazonaws.com\/my-library-cover-uploads/.test(smallImage)) {
    let cache = await caches.open("local-images1");
    let img = await fetch(smallImage, { mode: "no-cors" });
    await cache.put(smallImage, img);
  }
}
```

If the book has no image, we’re done. Next, we check if it’s cached already — if so, we’re done. Lastly, we inspect the URL, and figure out which cache it belongs in.

The local-images1 cache name is the same from before, which we set up in our dynamic cache. If the image in question isn’t already there, we fetch it, and add it to cache. Each cache operation returns a promise, so the async/await syntax simplifies things nicely.

Testing it out

The way it’s set up, if we clear our service worker, either in dev tools or by just opening a fresh incognito window…

…then the first time we view our app, all our books will get saved to IndexedDB.

When we refresh, the image sync will happen. So if we start on a page that’s already pulling down these images, we’ll see our normal service worker saving them to cache (ahem, assuming we delay the ajax call to give our service worker a chance to install), which shows up as fetch events in our network tab.

Then, if we navigate elsewhere and refresh, we won’t see any network requests for those images, since our sync method is already finding everything in cache.

If we clear our service workers again, and start on this same page, which is not otherwise pulling these images down, then refresh, we’ll see the network requests to pull down, and sync these images to cache.

Then if we navigate back to the page that uses these images, we won’t see the calls to cache these images, since they’re already cached; moreover, we’ll see these images being retrieved from cache by the service worker.

Both the runtimeCaching provided by sw-toolbox and our own manual code are working together, off of the same cache.

It works!

As promised, here’s the `offline.htm` page:

<div style="padding: 15px"> <h1>Offline</h1> <table class="table table-condescend table-striped"> <thead> <tr> <th></th> <th>Title</th> <th>Author</th> </tr> </thead> <tbody id="booksTarget"> <!--insertion will happen here--> </tbody> </table>
</div>
let open = indexedDB.open("books");
open.onsuccess = evt => { let db = open.result; let transaction = db.transaction("books", "readonly"); let booksStore = transaction.objectStore("books"); var request = booksStore.openCursor(); let rows = ``; request.onsuccess = function(event) { var cursor = event.target.result; if(cursor) { let book = cursor.value; rows += ` <tr> <td><img src="${book.smallImage}" /></td> <td>${book.title}</td> <td>${Array.isArray(book.authors) ? book.authors.join("<br/>") : book.authors}</td> </tr>`; cursor.continue(); } else { document.getElementById("booksTarget").innerHTML = rows; } };
}

Now let’s tell Chrome to pretend to be offline, and test it out:

Cool!

Where to, from here?

We’re barely scratching the surface. Your users can update this data from multiple devices, and each one will need to keep in sync somehow. You could periodically wipe your IDB tables and re-sync; have the user manually trigger a re-sync when they want; or get really ambitious and try to log all your mutations on your server, and have each service worker on each device request all changes that happened since the last time it ran, in order to sync up.

The most interesting solution here is PouchDB, which does this syncing for you; the catch is it’s designed to work with CouchDB, which you may or may not be using.
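For a flavor of what that buys you, here’s a minimal sketch of PouchDB’s replication (the database name and remote URL are placeholders):

```js
import PouchDB from "pouchdb";

// a local database, backed by IndexedDB in the browser
let local = new PouchDB("books");
// the CouchDB (or CouchDB-compatible) server it mirrors
let remote = new PouchDB("https://example.com/db/books");

// continuous two-way replication: local edits flow up, and changes
// made on other devices flow back down
local
  .sync(remote, { live: true, retry: true })
  .on("change", info => console.log("synced", info))
  .on("error", err => console.error("sync error", err));
```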

Syncing local changes

For one last piece of code, let’s consider an easier problem to solve: keeping your IndexedDB in sync with changes made right this minute by your user, in your web app. We can already intercept fetch requests in the service worker, so it should be easy to listen for the right mutation endpoint, run it, then peek at the results and update IndexedDB accordingly. Let’s take a look.

```js
toolbox.router.post(/graphql/, request => {
  //just run the request as is
  return fetch(request).then(response => {
    //clone it by necessity
    let respClone = response.clone();
    //do this later - get the response back to our user NOW
    setTimeout(() => {
      respClone.json().then(resp => {
        //this graphQL endpoint is for lots of things - inspect the data response to see
        //which operation we just ran
        if (resp && resp.data && resp.data.updateBook && resp.data.updateBook.Book) {
          syncBook(resp.data.updateBook.Book);
        }
      });
    }, 5);
    //return the response to our user NOW, before the IDB syncing
    return response;
  });
});

function syncBook(book) {
  let open = indexedDB.open("books", 1);

  open.onsuccess = evt => {
    let db = open.result;
    if (db.objectStoreNames.contains("books")) {
      let tran = db.transaction("books", "readwrite");
      let booksStore = tran.objectStore("books");
      booksStore.get(book._id).onsuccess = ({ target: { result: bookToUpdate } }) => {
        //update the book with the new values
        ["title", "authors", "isbn"].forEach(prop => (bookToUpdate[prop] = book[prop]));
        //and save it
        booksStore.put(bookToUpdate);
      };
    }
  };
}
```

This may seem a bit more involved than you were hoping. We can only read the fetch response once, and our application thread will also need to read it, so we’ll first clone the response. Then, we’ll run a setTimeout() so we can return the original response to the web application/user as quickly as possible, and do what we need thereafter. Don’t just rely on the promise in respClone.json() to do this, since promises use microtasks. I’ll let Jake Archibald explain what exactly that means, but the short of it is that they can starve the main event loop. I’m not quite smart enough to be certain whether that applies here, so I just went with the safe approach of setTimeout.

Since I’m using GraphQL, the responses are in a predictable format, and it’s easy to see if I just performed the operation I’m interested in, and if so I can re-sync the affected data.

Further reading

Literally everything here is explained in wonderful depth in Building Progressive Web Apps by Tal Ater. If you’re interested in learning more, you can’t beat that as a learning resource.

For some more immediate, quick resources, here’s an MDN article on IndexedDB, plus a service workers introduction and an offline cookbook, both from Google.

Parting thoughts

Giving your user useful things to do with your web app when they don’t even have network connectivity is an amazing new ability web developers have. As you’ve seen though, it’s no easy task. Hopefully this post has given you a realistic idea of what to expect, and a decent introduction to the things you’ll need to do to accomplish this.

Article Series:

  1. The Setup
  2. The Implementation (you are here!)


Making your web app work offline, Part 1: The Setup

This two-part series is a gentle introduction to offline web development. Getting a web application to do something while offline is surprisingly tricky, requiring a lot of things to be in place and functioning correctly. We’re going to cover all of these pieces from a high level, with working examples. This post is an overview, but there are plenty of more-detailed resources listed throughout.

Article Series:

  1. The Setup (you are here!)
  2. The Implementation

Basic approach

I’ll be making heavy use of JavaScript’s async/await syntax. It’s supported in all major browsers and Node, and it greatly simplifies Promise-based code. The link above explains async well, but in a nutshell, it allows you to resolve a promise and access its value directly in code with await, rather than calling .then() and accessing the value in the callback, which often leads to the dreaded “rightward drift.”
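For instance, here’s the same fetch written both ways (a generic sketch; the URL is made up):

```js
// .then() style: each value only exists inside its callback
function getTitle(id) {
  return fetch(`/book/${id}`)
    .then(resp => resp.json())
    .then(book => book.title);
}

// async/await style: awaited values read like synchronous code
async function getTitleAsync(id) {
  let resp = await fetch(`/book/${id}`);
  let book = await resp.json();
  return book.title;
}
```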

What are we building?

We’ll be extending an existing booklist project to sync the current user’s books to IndexedDB, and create a simplified offline page that’ll show even when the user has no network connectivity.

Starting with a service worker

The one non-negotiable thing you need for offline development is a service worker. A service worker is a background process that can, among other things, intercept network requests; redirect them; short-circuit them by returning cached responses; or execute them as normal and do custom things with the response, like caching.
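In its rawest form, that interception looks something like this (a generic sketch; the rest of this series uses tooling to generate this kind of code for us):

```js
// Inside a service worker: every network request from pages it controls
// passes through this handler first.
self.addEventListener("fetch", evt => {
  evt.respondWith(
    // serve from cache if we have it, otherwise fall through to the network
    caches.match(evt.request).then(cached => cached || fetch(evt.request))
  );
});
```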

Basic caching

Probably the first, most basic, yet highest-impact thing you’ll do with a service worker is have it cache your application’s resources. Service workers and the caches they use are extremely low-level primitives; everything is manual. In order to properly cache your resources, you’ll need to fetch and add them to a cache, but then you’ll also need to track changes to those resources: when they change, remove the prior version, then fetch and cache the new one.

In practice, this means your service worker code will need to be generated as part of a build step, which hashes your files, and generates a file that’s smart enough to record these changes between versions, and update caches as needed.
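To see why, here’s roughly what even a naive hand-rolled version looks like (a sketch; the hashed file names change on every build, which is exactly the bookkeeping the tooling below automates):

```js
// sw.js (hand-rolled sketch): the cache name doubles as a version
const CACHE = "app-cache-v2"; // bump on every deploy

self.addEventListener("install", evt => {
  evt.waitUntil(
    caches.open(CACHE).then(cache =>
      // hypothetical hashed file names; these change whenever the files do
      cache.addAll(["/", "/main.abc123.js", "/styles.def456.css"])
    )
  );
});

self.addEventListener("activate", evt => {
  evt.waitUntil(
    // delete every cache left over from prior versions
    caches.keys().then(keys =>
      Promise.all(keys.filter(k => k !== CACHE).map(k => caches.delete(k)))
    )
  );
});
```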

Abstractions to the rescue

This is extremely tedious and error-prone code that you’d likely never want to write yourself. Luckily, some smart people have written abstractions to help, namely sw-precache and sw-toolbox, from the great people at Google. Note that Google has since deprecated these tools in favor of the newer Workbox. I’ve yet to move my code over since sw-* works so well, but in any event the ideas are the same, and I’m told the conversion is easy. It’s also worth mentioning that sw-precache currently has about 30,000 downloads per day, so it’s still widely used.

Hello World, sw-precache

Let’s jump right in. We’re using webpack, and as webpack goes, there’s a plugin, so let’s check that out first.

```js
// inside your webpack config
new SWPrecacheWebpackPlugin({
  mergeStaticsConfig: true,
  filename: "service-worker.js",
  staticFileGlobs: [
    //static resources to cache
    "static/bootstrap/css/bootstrap-booklist-build.css",
    ...
  ],
  ignoreUrlParametersMatching: /./,
  stripPrefixMulti: {
    //any paths that need adjusting
    "static/": "react-redux/static/",
    ...
  },
  ...
})
```

By default ALL of the bundles webpack makes will be precached. We’re also manually providing some paths to static resources I want cached in the staticFileGlobs property, and I’m adjusting some paths in stripPrefixMulti.

```js
// inside your webpack config
const getCache = ({ name, pattern, expires, maxEntries }) => ({
  urlPattern: pattern,
  handler: "cacheFirst",
  options: {
    cache: {
      maxEntries: maxEntries || 500,
      name: name,
      maxAgeSeconds: expires || 60 * 60 * 24 * 365 * 2 //2 years
    },
    successResponses: /0|[123].*/
  }
});

new SWPrecacheWebpackPlugin({
  ...
  runtimeCaching: [
    //pulls in sw-toolbox and caches dynamically based on a pattern
    getCache({
      pattern: /^https:\/\/images-na.ssl-images-amazon.com/,
      name: "amazon-images1"
    }),
    getCache({ pattern: /book\/searchBooks/, name: "book-search", expires: 60 * 7 }), //7 minutes
    ...
  ]
})
```

Adding the runtimeCaching section to our SWPrecacheWebpackPlugin pulls in sw-toolbox and lets us cache URLs matching a certain pattern, dynamically, as needed, with getCache helping keep the boilerplate to a minimum.

Hello World, sw-toolbox

The entire service worker file that’s generated is pretty big, but let’s just look at a small piece, namely one of the dynamic caches from above:

```js
toolbox.router.get(
  /^https:\/\/images-na.ssl-images-amazon.com/,
  toolbox.cacheFirst,
  {
    cache: {
      maxEntries: 500,
      name: "amazon-images1",
      maxAgeSeconds: 63072000
    },
    successResponses: /0|[123].*/
  }
);
```

sw-toolbox has provided us with a nice, high-level router object we can use to hook into various URL requests, MVC-style. We’ll use this to set up offline support shortly.

Don’t forget to register the service worker

And, of course, the generated service worker file is of no use by itself; it needs to be registered. The code looks like this, but be sure to put it inside an onload listener, or some other place that’s guaranteed to run after the page has loaded.

if ("serviceWorker" in navigator) { navigator.serviceWorker.register("https://cdn.css-tricks.com/service-worker.js");
}

There we have it! We got a basic service worker running, which caches our application resources. Tune in tomorrow when we extend it to support offline.

Article Series:

  1. The Setup (you are here!)
  2. The Implementation
