Send Mongo Change Streams Over SocketIO

Create a NodeJS app that streams Mongo data changes over web sockets.


Introduction

This is a full tutorial on creating a NodeJS app that listens for changes in a Mongo database and streams those changes to clients connected to the app through a SocketIO web socket. The tutorial also covers the client-side setup and the Mongo configuration.

Using this setup, the Mongo database will be the "Source of Truth" and users can get live updates on any changes.

For the full working example, reference this repo: MongoChangeStreamSockets

What This Tutorial Covers

  1. Installing & Configuring Mongo For Change Streams
  2. Creating A NodeJS App With SocketIO & Mongo Driver
  3. Configuring A Back End For Basic Mongo Updates
  4. Configuring Client Side SocketIO

What You Need For This Tutorial

Docker


Mongo Set Up

Mongo requires either a replica set or a sharded cluster to enable change streams. We'll be using a replica set. The following file is the Docker Compose configuration for a Mongo replica set, along with the builds and deployments for the back and front ends.

Btw, on Kubernetes you would deploy a Mongo replica set using a headless service and a stateful set. Unfortunately, Docker Compose doesn't have those concepts, so we'll have to define each Mongo replica as a separate Docker service.

Some important things to note:

Also notice that we added a script to the first Mongo instance. That script configures our replica set for us. It's important to note that Mongo will only run this script the first time it is launched. If your Mongo instances already exist, you'll have to connect to your Mongo shell and run the script from there.
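As a rough idea of what such a replica set script contains, here is a minimal sketch. The set name rs0 and the host names mongo1, mongo2 and mongo3 are assumptions for illustration, and buildReplicaSetConfig is a hypothetical helper; substitute the service names from your own Docker Compose file.

```javascript
// Hypothetical helper: builds the config object that rs.initiate() expects.
// The set name and host names passed in below are assumptions, not values
// taken from the tutorial's repo.
function buildReplicaSetConfig(setName, hosts) {
  return {
    _id: setName,
    // Each member needs a unique numeric _id and a host:port string.
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const config = buildReplicaSetConfig('rs0', [
  'mongo1:27017',
  'mongo2:27017',
  'mongo3:27017',
]);

// Inside the Mongo shell (or an init script) you would then run:
//   rs.initiate(config);
// and check the result with:
//   rs.status();
```

Each mongod process also has to be started with the matching replica set name (for example with the --replSet option), otherwise rs.initiate() will refuse the config.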

SocketIO Web Sockets

We'll set up our sockets server here. First we need to connect to the Mongo replica set and set up the SocketIO server. This is done in the database() and sockets() functions. Once they resolve, we set up our change streams in watchChangeStreams(). In that function, we connect to whatever databases we want to listen to and then listen to changes in that database with the watch() function.
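The flow described above can be sketched roughly as follows. This is not the repo's actual code: it assumes client is an already-connected MongoClient from the official driver, dbNames is a hypothetical list of database names, and onChange is whatever callback you want to run for each change event.

```javascript
// Sketch only. Connecting might look like this (the URI is an assumption):
//   const client = new MongoClient('mongodb://mongo1:27017/?replicaSet=rs0');
//   await client.connect();
function watchChangeStreams(client, dbNames, onChange) {
  return dbNames.map((dbName) => {
    // Db.watch() opens a change stream covering every collection in the database.
    // 'updateLookup' asks Mongo to attach the current full document to update events.
    const changeStream = client.db(dbName).watch([], { fullDocument: 'updateLookup' });

    changeStream.on('change', onChange);
    changeStream.on('error', (err) => console.error('change stream error:', err));
    return changeStream;
  });
}
```

Returning the streams lets the caller close them on shutdown. The updateLookup option is a design choice in this sketch: it makes the full document, including any fields stored on it, available on update events as well as inserts.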

The real key to all of this is that we store a UUID on every document. Then we create socket rooms for those UUIDs. That way, when a client joins a UUID room, it only gets updates for documents that match that UUID. While the room is the UUID, the event being emitted is the collection name. You'll see how this is important when we discuss the front end.
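To make the room and event mapping concrete, here is a hedged sketch of how a change event could be routed. It assumes the UUID lives in a field named uuid, that the stream was opened with fullDocument set to 'updateLookup' so update events carry the document, and that room names are dashed hex strings. The helper names toRoomName, routeChange and emitChange are hypothetical.

```javascript
// Converts a UUID stored as BSON Binary (or a plain string) into a room name.
function toRoomName(uuid) {
  if (typeof uuid === 'string') return uuid;
  // BSON Binary values expose their raw bytes on `.buffer`.
  const hex = Buffer.from(uuid.buffer).toString('hex');
  return hex.replace(/^(.{8})(.{4})(.{4})(.{4})(.{12})$/, '$1-$2-$3-$4-$5');
}

// Works out which room and event a change belongs to, or null if it has no UUID.
function routeChange(change) {
  const doc = change.fullDocument;
  if (!doc || !doc.uuid) return null; // e.g. delete events carry no fullDocument

  // Updates send only the changed fields; inserts send the whole document.
  const payload =
    change.operationType === 'update'
      ? change.updateDescription.updatedFields
      : doc;

  return { room: toRoomName(doc.uuid), event: change.ns.coll, payload };
}

// Room = document UUID, event name = collection name.
function emitChange(io, change) {
  const route = routeChange(change);
  if (route) io.to(route.room).emit(route.event, route.payload);
}
```

Keeping routeChange free of any socket calls makes it easy to test without a running server, and emitChange is then a one-line bridge to SocketIO.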

Below are general events we use for joining rooms, leaving rooms, and just a check event for testing.
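A minimal sketch of such events might look like this. The event names join, leave and check are assumptions, as is the registerRoomEvents helper; io is expected to be a SocketIO server instance.

```javascript
// Registers the room-management events on every new connection.
function registerRoomEvents(io) {
  io.on('connection', (socket) => {
    // The client sends the UUID it wants updates for; that UUID is the room name.
    socket.on('join', (room) => socket.join(room));
    socket.on('leave', (room) => socket.leave(room));

    // Simple round trip for testing: replies to the sender only.
    socket.on('check', () => socket.emit('check', 'ok'));
  });
}
```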

Mongo Back End

To test all this out, we need a back end for updating data in Mongo. Below is a very simple ExpressJS app that has one route for updating documents in a collection called "test". A key thing to notice is that we're adding Mongo Binary UUIDs to all documents. You could really use anything, but binary UUIDs are very efficient. Although not used in this tutorial, it's highly recommended that you create an index on the UUID.
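The update route described above could be sketched along these lines. The field name uuid, the helper names and the injected toBinaryUuid converter are all assumptions. With the official driver the converter might be (s) => new UUID(s), but check that against the driver version you have installed.

```javascript
const { randomUUID } = require('crypto');

// Builds the filter/update pair for an upsert keyed on the document's UUID.
// `toBinaryUuid` is injected so the sketch does not depend on a driver version.
function buildUpsert(body, toBinaryUuid) {
  const { uuid: provided, ...data } = body;
  const uuid = provided || randomUUID(); // generate one when the client sends none
  return {
    uuid,
    filter: { uuid: toBinaryUuid(uuid) },
    update: { $set: data },
    options: { upsert: true },
  };
}

// Returns an Express-style handler; `collection` is a driver Collection.
function makeUpdateHandler(collection, toBinaryUuid) {
  return async (req, res) => {
    try {
      const { uuid, filter, update, options } = buildUpsert(req.body, toBinaryUuid);
      await collection.updateOne(filter, update, options);
      res.json({ uuid });
    } catch (err) {
      res.status(500).json({ error: err.message });
    }
  };
}

// Possible wiring (the route path is an assumption):
//   app.use(express.json());
//   app.post('/update', makeUpdateHandler(db.collection('test'), toBinaryUuid));
// The recommended index could be created once at startup:
//   await db.collection('test').createIndex({ uuid: 1 });
```

Because the upsert filters on an equality match for uuid, a newly inserted document gets that field automatically, so every document ends up carrying its UUID.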

The Front End

On the front end, we have two simple functions. The first, connect(), connects to the SocketIO server and joins a room based on the UUID provided. The second, update(), updates or inserts a document with the provided UUID and arbitrary data. If you do not provide a UUID, a random one will be generated on the back end.
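Here is a hedged sketch of those two functions. To keep it testable outside a browser, the SocketIO client factory (io) and fetch are passed in as parameters. The join event name, the URLs and the request shape are assumptions, not the repo's actual API.

```javascript
// Connects to the sockets server, joins the UUID room, and listens for the
// event named after the collection (room = UUID, event = collection name).
function connect(io, serverUrl, uuid, collection, onUpdate) {
  const socket = io(serverUrl);
  // Joining inside the 'connect' handler also re-joins after a reconnect.
  socket.on('connect', () => socket.emit('join', uuid));
  socket.on(collection, onUpdate);
  return socket;
}

// Upserts a document through the back end. Omitting the UUID leaves it to the
// back end to generate one.
async function update(fetchFn, backendUrl, uuid, data) {
  const body = uuid ? { uuid, ...data } : { ...data };
  const response = await fetchFn(backendUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!response.ok) throw new Error(`update failed: ${response.status}`);
  return response.json();
}
```

In a browser you would call connect(io, ...) with the global provided by the SocketIO client script and update(fetch, ...) with the built-in fetch.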

If you are connected to the UUID room of the document you just updated, you should see the updates printed below on the HTML page. The first time you create the document, you'll see the _id key. On later updates you won't, because only the fields that changed are sent and _id was not one of them.

Dunners!

Overall I think it's fairly easy to grasp, though there are a lot of moving pieces and gotchas when figuring this out. One final important note: if you want to use this kind of architecture in production, you'll need to add authentication/authorization for who can listen to which UUIDs. That's too much extra stuff to go over in this tutorial, but when I implemented it at my company, the general gist was to store an httpOnly OAuth token with the client and then use that to ping an OAuth back end service for auth information on the user. Good luck!
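For a rough idea of where such a check could hook in, here is a sketch using SocketIO server middleware. It is not the production setup described above: lookupUser is a hypothetical function that would take the cookie header and ask your OAuth service who the user is.

```javascript
// Rejects connections that cannot be tied to a user.
// `lookupUser` is hypothetical: (cookieHeader) => Promise<user | null>.
function requireUser(io, lookupUser) {
  io.use(async (socket, next) => {
    try {
      // httpOnly cookies are sent with the handshake request.
      const user = await lookupUser(socket.handshake.headers.cookie);
      if (!user) return next(new Error('unauthorized'));
      socket.data.user = user; // available later, e.g. when handling room joins
      next();
    } catch (err) {
      next(err);
    }
  });
}
```

A handler for joining rooms could then consult socket.data.user before letting the client into a UUID room.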