Optimize an Express Server in Node.js
We can optimize several aspects of an Express API server to increase the performance of a Node.js application. Two major strategies are compressing responses and caching requests. Compression reduces the size of the responses sent back to clients, saving bandwidth for both the server and the client. Caching lets you reuse previously retrieved data to serve repeated requests to your API more quickly.
By the end of this tutorial, you should be able to:
- Understand caching and compression
- Explain the benefits of compressing responses
- Identify when we should and shouldn’t use caching
- Describe different options for a caching layer
This tutorial is part 5 of 7 tutorials that walk through using Express.js to create an API proxy server.
Goal
Understand how compression and caching can improve performance of an API server.
Prerequisites
- None
Watch: Optimize an Express Server
Compression
Data compression is the process of encoding information into a smaller size than the original. When sending a response back to a client, we can use compression to decrease the size of the payload, whether that’s a CSS, HTML, or JavaScript file, or a JSON response to an API call.
When we talk about compression on the web, we are primarily referring to gzip compression, which is well supported by essentially all modern browsers and web servers.
The main benefits of compression are:
- Decreased bandwidth usage for your application
- Faster download speeds for clients
When using compression, your application sends less data over the network which in turn reduces the bandwidth requirements of the application. Compressing resources sent to the client can reduce load times for responses and assets, as it takes less time to send less data.
The gains in speed from compression on an individual request are typically small; the client may save just a few hundred milliseconds, for example. With that said, every optimization has a cumulative effect as traffic scales up.
You can add gzip compression to an Express application with the compression middleware. Enabling the middleware takes a single line of code, and from then on your application’s responses will be gzip compressed.
The client should be able to handle decompressing the responses without needing to change anything, so long as it sends an Accept-Encoding header that includes gzip. (That’s typically enabled by default for most browsers and HTTP request libraries.)
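As a quick sketch, enabling the middleware might look like the following. The route and port here are illustrative; the compression package is installed with npm install compression:

```js
const express = require('express');
const compression = require('compression');

const app = express();

// Enable gzip compression for every response the app sends.
app.use(compression());

// An illustrative route: the JSON payload below is gzip-compressed
// whenever the client's Accept-Encoding header includes gzip.
app.get('/api/data', (req, res) => {
  res.json({ message: 'This response is compressed when the client supports it.' });
});

app.listen(3000);
```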
Caching
Caching temporarily stores data, such as the result of an operation, somewhere it can be accessed quickly with a minimal amount of effort. After an initial time-consuming process completes, the result can be stored temporarily and then retrieved without having to perform the time-consuming process again.
In the context of an Express API, we can cache the result of some requests to the API so the application doesn’t have to repeat the operation to serve the request. You can cache any expensive operation in your application, such as the result of database calls, but right now we are focusing on caching entire responses to requests.
The main benefits of caching are:
- Decreased latency. Cached requests are served in a fraction of the time of an uncached request.
- Reduced application load. The server does much less work to serve cached results.
The basic structure of a cache is a set of key/value pairs, where the key is a unique identifier associated with a cached value. Data is stored in the cache by associating it with a key, and retrieved later by looking up that key.
When caching requests, we use the full path of the request (including any query parameters) as a key to check if we have a response cached. If there is a cached response associated with the key, we can return that response to the user. That’s called a “cache hit”. If there isn’t a response cached for the key, that’s called a “cache miss”. The request will be handled as normal, and the generated response will be added to the cache with the path used as the key.
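As a concrete illustration, here is one way such caching middleware might be sketched. The in-memory Map, the 30-second lifetime, and the interception of res.json are illustrative choices under these assumptions, not the only approach:

```js
const express = require('express');
const app = express();

const cache = new Map();
const TTL_MS = 30 * 1000; // illustrative: how long a cached response stays fresh

function cacheResponses(req, res, next) {
  // Only cache GET requests; requests that change data must always run.
  if (req.method !== 'GET') return next();

  // The full path, including any query parameters, serves as the cache key.
  const key = req.originalUrl;
  const entry = cache.get(key);

  // Cache hit: return the stored response without doing the work again.
  if (entry && Date.now() - entry.storedAt < TTL_MS) {
    return res.json(entry.body);
  }

  // Cache miss: wrap res.json so the generated response body is stored
  // under the key before being sent, then handle the request as normal.
  // (Expired entries are simply overwritten on the next miss.)
  const originalJson = res.json.bind(res);
  res.json = (body) => {
    cache.set(key, { body, storedAt: Date.now() });
    return originalJson(body);
  };
  next();
}

// Register the middleware before the route handlers it should cache.
app.use(cacheResponses);
```

Note that the sketch skips anything other than GET requests, which anticipates the cautions below.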
Think carefully about what you decide to cache. Typically, you should only cache GET requests, which retrieve data.
Don’t cache requests that change the state of your application or alter data, like POST, PUT, and DELETE requests. If a mutating request is served from the cache, the intended operation never actually runs.
It’s also best practice to cache data that doesn’t change often. If you cache resources that change frequently, or cache them for longer than is appropriate, you run the risk of sending back stale data. Caching works best with data that doesn’t change frequently, or that can tolerate being a little stale.
Caching can also introduce complexity when data records relate to each other. If you update or alter a record that other records depend on, your cached data can become invalid. In this situation, you’d want to clear any cached records that relate to the record you updated. Implementing this type of cache clearing can become complicated, and it’s a separate concern from letting data expire out of the cache after a set period of time. Think about how interrelated your data is before deciding what to cache, to reduce the complexity of managing your cached data.
At its most basic, a cache can be implemented internally within the local memory of an application. With an internal cache, the data is available only to the single instance of the application and is not shared with other servers or instances. If the application goes down or restarts, the contents of the cache will be lost. Internal caches are limited to the amount of available memory on the server, so they can’t be used for large object storage. Depending on your needs, this can be an acceptable trade-off as an internal cache is incredibly easy to set up and use. The node-cache module is a nice example of an internal cache.
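A minimal sketch of what using node-cache might look like, with illustrative keys, values, and a 60-second TTL (the package is installed with npm install node-cache):

```js
const NodeCache = require('node-cache');

// stdTTL: cached entries expire automatically after 60 seconds.
const cache = new NodeCache({ stdTTL: 60 });

// Store a value under a key (here, a request path).
cache.set('/api/users?page=1', { users: ['alice', 'bob'] });

// Lookups return the stored value until the TTL expires;
// after that, get() returns undefined (a cache miss).
const users = cache.get('/api/users?page=1');

// Entries can also be removed explicitly, for example after
// the underlying data changes.
cache.del('/api/users?page=1');
```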
A more scalable approach would be to use a data structure store like Redis or a cache like Memcached running on separate server(s) that all instances of your application can interact with. This is more complex to implement and manage, but avoids the limitations of an internal cache.
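As a sketch, a shared cache backed by Redis might be used like this with the node-redis client. It assumes a Redis server reachable at the default redis://localhost:6379, and the keys and expiry are illustrative:

```js
const { createClient } = require('redis');

async function main() {
  // Connects to redis://localhost:6379 by default.
  const client = createClient();
  await client.connect();

  // Store a serialized response with a 60-second expiry (EX).
  await client.set('/api/users?page=1', JSON.stringify({ users: [] }), { EX: 60 });

  // Any application instance connected to the same Redis server can
  // read this entry; get() returns null on a cache miss.
  const cached = await client.get('/api/users?page=1');
  if (cached !== null) {
    console.log(JSON.parse(cached));
  }

  await client.quit();
}

main();
```

Because the cached data lives outside the application process, it survives restarts and is shared across instances, which an internal cache can’t offer.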
Where to add caching and compression?
Both caching and compression functionality can take place at the application level (in the code you write), or outside of the application as part of your infrastructure. If you are running multiple instances of your application, you may already be using a reverse proxy like Nginx to load balance between the instances, at which point you can let the reverse proxy handle caching and compression for you.
If you aren’t already using a reverse proxy or are running a single instance of an application, implementing caching and compression at the application level is appropriate.
Where to implement caching and compression is an infrastructure choice that depends on factors outside the scope of this tutorial. Regardless of where you use them, caching and compression are useful tools that can help increase the performance of most applications.
Recap
In this tutorial we learned about caching and compression as optimization strategies for an Express API. Compression reduces the size of payloads sent over the network to clients, improving latency and reducing bandwidth consumption. Caching is a way to prevent your application from repeating work to regenerate the same response. Caching can reduce latency and the load on your application. Both of these strategies can be implemented in the code of your application, or handled by parts of your infrastructure.
Keep going with the next tutorial in this set: Add Compression to Express in Node.js.
Further your understanding
- What are the differences between Redis and Memcached?
- Can you think of operations within an application you’re working on where it would be beneficial to cache?
- When would you want to explicitly clear a cache rather than just let it expire?
Additional resources
- Express Performance and Reliability Best Practices (expressjs.com)