Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (2024)

Clustering is a technique used to horizontally scale a Node.js server on a single machine by spawning child processes (workers) that run concurrently and share a single port. It is a common tactic to reduce downtime, slowdowns, and outages by distributing the incoming connections across all the available worker processes so that the available CPU cores are utilized to their full potential. Since a Node.js instance runs on a single thread, it cannot take advantage of multi-core systems properly - hence the need for clustering.

In this article, we're going to talk about how to optimize the performance of our Node.js applications. Let's start with a quick recap of Node.js fundamentals.

🏛️Node.js Server Architecture

Our Node.js servers normally take in requests, process them on the event loop, and send the response back to the browser. This all happens on one thread: your Node server, all of the JavaScript code you're running, and the event loop live on a single thread, which means they can only ever run one line of code at a time, processing one request at any point in time.

That event loop is generally pretty good at juggling multiple requests as they come in and passing off any hard work, so your server doesn't freeze or, as we call it, block.


Let's go back to our more detailed view of Node's internals: the event loop can juggle these multiple incoming requests by taking advantage of the many threads that your operating system already has.


Node makes use of the thread pool by passing off input and output tasks that take a long time to complete. It hands these long-running tasks to the thread pool and to the operating system so that your JavaScript code and the event loop don't block.

For example, if you have a node server that handles an incoming request by reading and writing to a file on your machine or on another server, in that case, NodeJS passes off the work to the OS to read that file and your JavaScript code doesn't block, even though that JavaScript code is itself running on a single thread.

As we've seen, in the vast majority of Node applications this setup works great. Node.js is ideal when working with non-blocking operations, for example, making requests to servers hosted somewhere on the internet that do things for us and that we can put together into a meaningful application.

However, sometimes we have code that requires a lot of processing power: code that blocks our event loop from continuing, code that doesn't involve the basic file and network operations that Node.js knows how to handle efficiently. What happens when we need to do a lot of heavy lifting in JavaScript, which only runs on a single thread? Let's investigate this and use what we learn to optimize the performance of our Node web servers, especially on machines with multiple cores.

🛠️Building A Simple Blocking Server

Before we demonstrate how we can improve performance, we first need to show how we might run into performance problems. Let's start by building out a very simple example to show the effects of blocking the event loop in a Node server. We're going to create a basic Node.js server with Express.


Inside the server.js file, we're going to have two GET routes:

  • Root route: It just sends the text “Performance example”.
  • /timer route: This one is a little more interesting. It also takes in a request and sends a response, but it delays the response for 9 seconds (i.e. 9000 milliseconds) before responding to the client.

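The two routes might be sketched like this (a sketch assuming Express is installed; `delay` is the busy-wait helper described next):

```javascript
const express = require('express');

const app = express();

// Root route: responds immediately.
app.get('/', (req, res) => {
  res.send('Performance example');
});

// Timer route: blocks the event loop for 9 seconds before responding.
app.get('/timer', (req, res) => {
  delay(9000); // busy-wait helper, defined below
  res.send('Ding dong!');
});
```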

The delay function is nothing special: it's a simple busy-wait. We constantly check whether the current time minus the start time is still less than the duration of our delay. JavaScript loops over this condition until we've been waiting longer than the duration, and while it's checking the condition and looping, our event loop is blocked.
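Based on that description, the busy-wait can be sketched as:

```javascript
// Blocks the event loop for `duration` milliseconds by spinning on the clock.
function delay(duration) {
  const startTime = Date.now();
  while (Date.now() - startTime < duration) {
    // The event loop is blocked while this loop spins.
  }
}

// Quick sanity check with a short 50 ms spin:
const before = Date.now();
delay(50);
const elapsed = Date.now() - before;
console.log(elapsed >= 50); // true
```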

JavaScript code like our delay function is processed on the event loop. It's not a file or network operation, so unlike many of Node's built-in functions it won't be sent to the thread pool or passed off to the operating system. Instead, while we're in our loop, the event loop is completely blocked. This means that if we call our delay function in our /timer route, then for 9000 milliseconds, or 9 seconds, our server can't do anything else while the delay function is executing. It can't process other requests, serve any files, or make queries to our database. No other JavaScript code will be processed.

Let's find out what effect this has on our server. We still need to call the app.listen() function on a port like 8000, then start the server. Once it's started, let's go to the browser and try making some requests: open up the JavaScript console, go to the Network tab, and make some requests to our server at localhost on port 8000.

  • Navigate to the ‘/’ route i.e. root route and take a look at the Time section inside the network tab for every request.


The request took about 14 milliseconds. This number may vary on different machines depending on your system configuration.

  • Now, in another tab, navigate to the ‘/timer’ route and wait until our server responds. The response arrives after just over 9 seconds: our delay function is simulating work that takes 9 seconds to complete.


Just as we expect. Now, here's the catch. To see it, open two new tabs (tab 1 and tab 2), each with the Network tab open. Clear the Network tab and turn on “Disable cache” in both tabs. Finally, in tab 1 open the ‘/timer’ route, and then quickly open the ‘/timer’ route in tab 2 as well.

  • In Tab 1: We get back the response after 9 seconds as expected.


  • In Tab 2: But in the second tab, it's still pending. It was only after 16 seconds that our response came back.


That's nearly twice as long as it should have taken. The first request blocked the call stack, so the second request wasn't able to start: the Express server couldn't respond to it until the first response was completed and that timer expired. Our CPU and the event loop are really busy here, spinning on this condition as fast as the CPU can go. Our event loop won't continue until our delay function returns, which means our server won't be able to respond to any other requests.

If we go back to the browser, open the ‘/timer’ endpoint in one tab first, and then open the ‘/’ (root) route in a second tab, you'll notice that the ‘/timer’ endpoint takes 9 seconds to respond as usual.


But this time, the ‘/’ endpoint takes about 5 seconds (5000 milliseconds) to respond, when it previously took about 14 milliseconds.


Hence, our blocking code is slowing down the entire server. There needs to be a better way of handling this so that our server doesn't grind to a complete halt. Let's keep going and investigate some different approaches we can take to improve this situation.

🏃 Running Multiple Node Processes

Let's look at the main approach we can take to improve the performance of our node servers in general when solving difficult problems. The best way to approach them is to divide your large, difficult problem into smaller pieces and solve those smaller, more achievable pieces.


Similarly, when dealing with servers that are overloaded with too much work, we want to divide that work up and spread the load. As we well know, a Node.js process runs on a single thread; Node and JavaScript don't follow the multi-threading approach that languages like Java and C++ do. What we do in Node instead is run multiple Node processes side by side, allowing them to share the work amongst themselves like a team working together towards a common objective.

With servers, the work that we take in is broken down into these requests that come into our server.


And so rather than taking in each request and handling it in one Node.js server with one node process, we can instead spread our requests out between multiple Node.js processes that each respond to that request in the same way.


They each contain a copy of your server code, running side by side in parallel. Now the second request can be handled by the second Node process, and the third request by the third Node process.


And if we have more requests than processes? Well, each of these processes can still handle multiple requests at a time. The important part is that they're now sharing the load equally between them. Using this technique allows your single-threaded Node application to make full use of all the CPU cores on your machine.

Remember, most computers today have multiple CPU cores that can each run code side by side without affecting the performance of code running on any of the other cores. Next, let's take a look at one approach we can take to have multiple Node processes working side by side to respond to multiple incoming server requests.

🕸️The Node Cluster Module

Our first approach to improving node performance is using the built-in node cluster module. The cluster module allows you to create copies of your node process that each run your server code side by side in parallel.

The way this works is when you type: node server.js to start your node application, the main node process is created. Node calls this the master process inside of the cluster module.


Inside the cluster module, we have access to a function called fork. Any time we call the fork function in our server.js file, Node takes the master process and creates a copy of it, called a worker process.


We can call this fork function however many times we like to create many worker processes that are attached to a single master.


It's these worker processes which do all of the hard work of taking in HTTP requests, processing them, and responding to them. Each worker contains all of the code required to respond to any request in our server, while the master is only responsible for coordinating the creation of these worker processes using the fork function.

In this example, where the fork function has been called twice, we now have three Node processes running: the master, which we created when we told Node to execute the server.js file, and the two worker processes that were each created by the fork function. It's these worker processes that accept incoming requests and share them using what's called the round-robin approach: the first request goes to the first worker and the next request to the second worker, and because we only have two worker processes, the third request goes back to the first worker.

The workers take turns responding to requests as they come in. Not all requests are equal: some might take 15 seconds to complete, while others take just a few milliseconds. In the grand scheme of things, though, round robin is the simplest approach to implement, and you might be surprised that it ends up being one of the fairest ways of distributing work between workers, even compared to fancier approaches that try to predict priorities and assign different importance or weight to each request.

🚨 Attention: One little caveat is that on Windows, because of how Windows manages processes, Node.js makes no guarantees about using the round-robin approach. Instead, to maximize performance, it leaves how tasks are divided between worker processes up to the Windows operating system, which could use round robin or a slightly different approach.

🎬Clustering In Action

We've talked about the theory, but how do we use the cluster module? How can we apply clustering to our node servers to improve performance? Let's go back to our performance example code and create two forks of our server, which can each handle requests concurrently side by side.

First things first: to use the cluster module, we need to require the built-in cluster module, assigning it to a constant that we'll call cluster.


Now we make our cluster of processes. Remember our diagram,


When we type node server.js, the master process is started, and it's from this master process that we call our fork function to create the two worker processes. The worker processes that we have forked run the same code that we have in server.js.

The only way we differentiate our master process from the worker processes is the isMaster Boolean flag from the cluster module. Right above our app.listen() call, we can write if (cluster.isMaster), and any code inside this block will only be executed when our server.js file runs as the master. Let's log to the console there so we can see that the master has been started. Anything we write in the else branch runs when isMaster is false and our code is running as a worker process, so there we'll log "worker process started".

What we're going to do is fork two workers, which will run the Express server with the two routes we've defined. So, from the top: when we run node server.js we're the master, and now we want to call cluster.fork().


Each time we call this function from the cluster module, we create a worker. We can do this however many times we like, but let's just start with two for now. Perfect.

What about the workers? It's only as a worker that we want to listen for incoming HTTP requests and handle them using our Express routes. So we'll only call app.listen() when we're running as a worker process: move that app.listen(8000) into the else clause.

When we start an HTTP server in one of our worker processes, whether it's Node's built-in barebones HTTP server or something like Express, the cluster module understands the listen function. Node.js knows that each of our workers will be listening on the same port, 8000, and the Node HTTP server knows to divide incoming requests arriving on port 8000 between the different worker processes.

In each of our request handlers, let's use a template string to send not just the "Performance example" text, but also the process ID. The built-in process module has a process.pid property that gives us the ID of the current process from the operating system.

app.get('/timer', (req, res) => {
  delay(9000);
  res.send(`Ding dong! - ${process.pid}`);
});

This is useful to us because our master process, as well as all of our worker processes, will have different IDs, which helps us see that different processes are indeed handling different requests and sharing the load.
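Putting all of those pieces together, the whole clustered server.js might look something like this (a sketch, assuming Express; newer Node versions also expose cluster.isMaster under the name cluster.isPrimary):

```javascript
const express = require('express');
const cluster = require('cluster');

const app = express();

app.get('/', (req, res) => {
  res.send(`Performance example - ${process.pid}`);
});

app.get('/timer', (req, res) => {
  delay(9000); // busy-wait from earlier; blocks this worker for 9 seconds
  res.send(`Ding dong! - ${process.pid}`);
});

// Busy-wait helper: spins until `duration` milliseconds have elapsed.
function delay(duration) {
  const startTime = Date.now();
  while (Date.now() - startTime < duration) { /* event loop blocked */ }
}

if (cluster.isMaster) {
  console.log('Master has been started');
  cluster.fork();
  cluster.fork();
} else {
  console.log('Worker process started');
  app.listen(8000); // only workers listen for requests
}
```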

Now we'll know if our different requests are being handled on the same process or a different one. Let's run our tiny little cluster of two workers. We can start our master process by running node server.js or by typing npm start or their equivalent.


Our master has been started, and two workers were created immediately afterward to listen for incoming requests. Before we play around with our server to see how much clustering helps it respond quickly, I want to reinforce one important concept: whether we're in the master process or a worker process, we're always running the server.js file. The code for each process is the same; the only difference is the isMaster flag.

To demonstrate this, we can add a console.log right after our conditional that says "running server.js".


Now what do we expect to see if we run npm start? Let's take a look.


You can see that the same server.js code was executed for the initial master process as well as for the two workers: first as the master forking the two worker processes, and then again as each worker listening for incoming connections.

Let's open up Chrome and test our cluster. Make sure to disable the cache!


We're going to open the ‘/timer’ endpoint in two different tabs to see how much time the server takes to respond as we did before. Remember when we had just one node process, the blocking timer request prevented our quicker route like the root route (i.e. ‘/’) from responding until it was finished and trying to navigate to the ‘/timer’ endpoint in two different tabs resulted in one of those requests taking more than twice as long.

So open up the Chrome Network tab in both of the two tabs. Firstly, open ‘/timer’ in one tab and then quickly in the other tab as well. Now watch out.

  • Tab 1 (’/timer’ route): As soon as you navigate to the ‘/timer’ endpoint and hit enter, you'll see the connection pending in the first tab. We wait at least 9 seconds, and here we go: a 200 response.


  • Tab 2 (’/timer’ route): At almost the exact same time, the response came back in the other tab.


Both requests took 9 seconds. They ran in parallel on my computer because they were handled by two separate processes running on separate CPU cores. We can see that we logged the process ID for each request.

Similarly, I can open another tab to load the ‘/’ (root) endpoint. That means opening ‘/timer’, ‘/timer’, and ‘/’ at nearly the same time in three different tabs.

  • Tab 3 (’/’ route): The response came back almost instantly, in about 10 milliseconds.


Hence the difference is drastic. Our root endpoint didn't have to wait for the timer in the first tab to complete. This is exactly what we want. We've solved our performance issues... or have we? Let's discuss how to best use clustering in the real world.

🚀 Maximizing Cluster Performance 📈

We saw the huge improvement to our server's response time that clustering can give us. But our cluster has its limits. Let's set up another little experiment, continuing from our example.

Let's open up 4 new tabs, each with the Network tab open and "Disable cache" checked, and load the ‘/timer’ endpoint in all of them at nearly the same time. Let's see -


After waiting around, the requests came back after:

  • 9 seconds in tab 1
  • 9 seconds in tab 2
  • nearly 14 seconds in tab 3, far longer than it should have taken
  • nearly 14 seconds in tab 4, far longer than it should have taken

This means that only two of our requests were made side by side in parallel. The other two had to wait because those processes were already blocked. We can confirm this by looking at the process IDs that we logged.


Looking at the logged process IDs:

  • In Tab 1: the first request was handled by a process with an ID ending in 60.
  • In Tab 2: the second request was handled by a process with an ID ending in 92.
  • In Tab 3: with the round-robin approach to dividing work, the third request reused the first process; it has the same process ID ending in 60.
  • In Tab 4: the fourth request reused the second process; it has the same process ID ending in 92.

Using clustering is not what we call a silver bullet, something that solves all our problems on its own. We're limited by the number of processes that can execute in parallel. Right now our cluster has two processes, so we can only handle two requests simultaneously.

That means we should keep adding those fork function calls to create more worker processes, right?


Well, there is a limit to how far this approach can take us. Let me show you what I mean. Let's be a little bit more clever and automate the creation of our forks. We're going to use the OS module, which is another node built-in.


And we're going to use the operating system to give us the information we need to create the correct number of worker processes. You see, to run efficiently, each process needs to use a separate processor in your CPU. In general, we want to limit the number of worker processes we create to the number of logical or physical cores in your CPU.

💡What are physical cores and logical cores?

Physical cores are separate processors in your computer that each handle work in parallel, whereas logical cores are a little more complicated. They use some fancy logic to let you run multiple threads in parallel on one physical core, but only in certain cases and not as efficiently as a physical core could run your code.

To maximize the performance of our server, we generally want to use the number of logical cores, which we can get from the built-in operating system module: os.cpus() gives us an array of objects representing each logical core, so os.cpus().length is the number of logical cores. For each of these, we'll create a new fork.


Now we're creating exactly the number of worker processes that will maximize the usage of all the cores in our server machine, which is what we want in the real world. So, on my machine, I'll restart the server.


And I have 4 physical cores but 8 logical cores, which means I now have 8 workers. That means in my browser, I can make up to 8 requests at the same time and get increased performance. The 4 physical cores should allow the timer responses to return in about 9 seconds, and the 4 remaining logical cores might see performance somewhere in between.

Let's see. I'll make 6 requests in total: I'll open 6 new tabs loading the same ‘/timer’ route, with the browser's cache disabled.


Collecting the data from all 6 tabs: each of the 6 requests was handled by a different process with a different process ID, and we get responses in almost 9 seconds from every tab. That's a pleasant surprise. We're now maximizing the performance of our server based on the number of CPU cores in our machine. Very good stuff.

⚖️Load Balancing

Remember how we mentioned the round-robin approach? Round robin is just one strategy that is used for what's called load balancing.


Load balancing is a fairly advanced topic, and you could write an entire article about it. But it's such an important topic when it comes to building back-ends that you'll gain a lot just from being introduced to the concept. Load balancing is basically a way of distributing a set of tasks across a set of resources, for example, deciding which requests will be handled by which processes.

If you have a cluster of worker processes for your server, we use what's called a load balancer to determine how our requests should be divided among the different processes handling them.


The load balancer is what takes requests from users and distributes them so that the responsibility of handling those requests is shared by many different processes or potentially many different applications or servers.

For example, you could have 2 servers, i.e. 2 different machines, each running a set of processes that can handle requests. You're then balancing the load of requests both across different servers and across the many processes inside those servers. Load balancing applies whenever you're running multiple servers or processes in parallel, each handling the same kind of request.

Just like we saw in our example, using the same set of possible routes, you might hear load balancing talked about in the context of what's called horizontal scaling.


Horizontal scaling is a fancy term for what we've been doing in our cluster, as opposed to vertical scaling, where we add more speed to our one Node process, for example by replacing our CPU with a faster one. That's vertical scaling.

For horizontal scaling, we instead take our server (it doesn't have to be super fast or big) and grow, or scale, our application to handle more requests more quickly by adding more servers or, in the case of our cluster, more Node processes. That's horizontal scaling and load balancing in a nutshell.

Now, what are the most common strategies load balancers use to distribute requests? In the case where we have no prior knowledge of how long a request will take to complete, which is usually the case when a server handles many different types of requests, some taking much longer than others, the two main approaches for load balancing are:

  • Round-robin scheduling, which we now have quite some experience with.
  • Randomized static, which is basically just randomized distribution of requests where each new request is assigned to one of the processes at random.

Believe it or not, these ridiculously simple algorithms for deciding which process should handle a request are actually the most effective when you don't have really good knowledge about exactly how long a request will take to complete.
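As a toy illustration (not Node's actual internal implementation), the two strategies can be sketched like this:

```javascript
// Round robin: hand requests to workers in turn, wrapping around.
function roundRobin(workerCount) {
  let next = 0;
  return function pick() {
    const chosen = next;
    next = (next + 1) % workerCount;
    return chosen;
  };
}

// Randomized static: assign each request to a worker at random.
function randomized(workerCount) {
  return function pick() {
    return Math.floor(Math.random() * workerCount);
  };
}

const pick = roundRobin(2);
console.log([pick(), pick(), pick(), pick()]); // [ 0, 1, 0, 1 ]
```

With two workers, round robin sends requests 1 and 3 to worker 0 and requests 2 and 4 to worker 1, exactly as in our experiment above.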

To summarize: in Node, we can use the cluster module to load-balance requests as they come into our Node HTTP servers, and the cluster module uses the round-robin approach to determine which process will handle each incoming request; the requests are the load being balanced across your server.


🛠️The PM2 Tool

The cluster module is a great tool to improve the performance of our servers. However, there are many other features that are commonly asked for when running clustered servers in production. The good news is we have a really superb tool built on the functionality in the cluster module that includes any functionality you could possibly need. This is the PM2 tool.


PM2 uses the cluster module internally, which we can see if we go to the documentation.


Under the hood, it's using the cluster module that we're already familiar with. So what we've learned applies, but PM2 provides many added capabilities to help you manage that cluster, like -

  • You may want to restart your cluster's processes when there's a code change, just like we do with nodemon when running a single process; PM2 has a watch-and-restart mode for that. You may also want processes restarted automatically when one of them fails, so we can look at restart strategies and graceful shutdowns.
  • We might want to monitor the status of our processes or manage where the logs for each process go.

PM2 is really rich when it comes to features, and it's widely used even in projects that don't use NodeJS. We can find PM2 in our NPM registry and use it as a tool.

🏗️Using PM2 To Create Clusters

Let's get PM2 installed on our machines and see how it can help us to manage a cluster of worker processes. By the way, PM2 stands for Process Manager 2. PM2 is on NPM, so to install it, in our terminal

npm i pm2 

Now, I suggest you include PM2 in your dependencies by saving it to your package.json file, just like we do with all of our other dependencies. So I'm going to install PM2 as a local package in our performance example folder. It's not a development dependency, because we'll be running our Node server with PM2 in production. However, for convenience, many people like to install PM2 as a global module, which you can do with

npm i pm2 -g 

I'll do that here as well to make it easier to demonstrate some of the features inside of our command line. Once PM2 is installed as a global module, to start our server process, we can run

pm2 start server.js 

You’ll see that we have one instance of our server and it's currently online using zero percent of our CPU and about 38.6 MB of memory.


Just like before, we can make requests in our browser to our server at Port 8000 & navigate to different routes - ‘/’ or ‘/timer’.

You’ll also notice that PM2 is running in the background. I can type other commands into the terminal and the server is still running. For example,

  • To list the processes that are being managed by PM2, we can run the following command:

// Any one of these three:
pm2 list
// Or
pm2 ls
// Or
pm2 status

If we type one of these commands, we get the current status of our server. We have one process. This is our master process that's being managed by PM2. We'll see how to manage the worker processes in just a second.

  • To stop the server managed by PM2, we can run the following command:

// Using the ID of the process: 0
pm2 stop 0
// Using the name of the process: server
pm2 stop server

  • To terminate and remove the process from the list of processes managed by PM2, we would type:

// Using the ID of the process: 0
pm2 delete 0
// Using the name of the process: server
pm2 delete server

All right. This is all fine, but why do we really want to use PM2? Well, PM2 comes with clustering built in, which means we can simplify our code and remove any usage of the cluster module. We don't need to fork inside our JavaScript; instead, PM2 will fork our process for us.

Our code in server.js will be run as the worker process, so we don't need to check the isMaster flag. We can get rid of our entire master block and leave just the worker part, where we listen on Port 8000. Finally, it looks like this:

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (52)

and also remove the OS module import,

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (53)

To start a cluster in PM2, what we'll do is run the following command:

pm2 start server.js -i max 

where the flags

  • -i = instances; it specifies the number of worker processes to create in our cluster.
  • max = start the maximum number of workers, to take full advantage of all the CPU cores in our machine.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (54)

You can see that on my machine, 8 processes have been started and are being managed in our PM2 cluster. Just like when we were using our cluster module, we can make two requests side by side, say to the ‘/timer’ endpoint, and they're executed in different processes. One request doesn't block the other.

In fact, remember that the PM2 tool takes advantage of our cluster module under the hood. We can always run pm2 list to show the current status of our server. And you might be asking: if PM2 is managing all of these processes, where do my logs go? Why am I not seeing any logs in the terminal?

  • To get a real-time view of what's being logged in our server right now, you can run:

pm2 logs 

  • To restart our cluster, we can run:

pm2 restart server 

You’ll see something like this:

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (55)

Then you’ll see that our list now shows each server process has been restarted once, and each is using a little bit of our CPU because it's just starting up. The restart also shows up in our logs.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (56)

PM2 also includes a lot of advanced logging functionality, including sending logs to a file and even log rotation, so that the server doesn't get overwhelmed by one giant log file. We'll demonstrate more features later. Next, we'll take a look at some of the really useful things that PM2 allows us to do when managing live clusters of node processes.

📋Managing Live Clusters With PM2

When we're in development, we don't use PM2 so much beyond configuring it and testing it periodically to make sure that our cluster still works after any changes. But PM2 really shines when we're in production and we have a live cluster. We're going to take a look at some of those things that PM2 allows us to do that would be quite difficult to duplicate with the built-in node cluster functionality.

Let's start with a fresh cluster. I can delete any of my running processes from the list of processes that PM2 tracks by running

pm2 delete server 

There we go. We have no running cluster. Now I'll clear the terminal, and we can start our server using

pm2 start server.js -l logs.txt -i max 

Where

  • server.js = the name of our main JavaScript file, the entry point,
  • -l = the flag to specify the name of a file to send our logs to,
  • logs.txt = the name of the log file that we want created,
  • -i max = match the number of instances to the number of logical cores in our machine.

Once you hit enter to run the command, we'll have 8 instances along with some basic information about the overall status of each process.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (57)

You’ll also notice that a new log file named logs.txt is created as well.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (58)

We could get more detailed information about each of these processes by running

// Using the ID of the process: 2
pm2 show 2

There you’ll have fancy code metrics showing -

  • Used Heap Size: the amount of memory that we're using
  • Event Loop Latency p95: the 95th percentile of event loop latency, i.e. how long 95% of passes through the event loop take at most
  • Event Loop Latency: the wait time before a command queued on the event loop gets to run

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (59)

Scrolling up, we can do a lot more advanced things and get information about the process that's currently running -

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (60)

like -

  • uptime: how long the process has been running
  • entire log path: the path to the log file we just specified
  • script path: the path of the entry script

We can also manage each of our processes from our PM2 list. So maybe we've detected a problem in the process with ID 4 and we want to temporarily bring it down to see what effect that has on the rest of the cluster. We can do that by running:

// Using the ID of the process we want to stop: 4
pm2 stop 4

You’ll see something like this, where the status of process ID 4 is now stopped.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (61)

And that individual process can be managed directly by PM2. When we're ready, we can start each process up individually as well by running:

// Using the ID of the process we want to start: 4
pm2 start 4

And all our processes are back online.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (62)

All right. To wrap up our overview of PM2, we can use PM2 to get a fancy dashboard right in our terminal for monitoring, by running:

pm2 monit 

It's not going to look great in VSCode's small integrated terminal, so open it up in a separate terminal outside of VSCode.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (63)

Here we get a live dashboard of the status of each of our server processes as they accept incoming requests. We can see what effect that has on the memory being used by each process, or on its CPU usage. We should see the CPU usage go way up when we make a request to our server, like one to the ‘/timer’ endpoint, which rapidly spins the CPU doing nothing.

🔄Zero Downtime Restart

Say we have our cluster running, just like we can see with the command pm2 monit on the screen right now, but we've been tasked with making a change in our server code. Maybe our timer, rather than responding with “Ding sound!”, now needs to respond with “Beep beep!”.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (64)

Our server code has changed and we now need to restart our server to apply the changes. But we have live users, there are people using our application, and we want to make sure it's still available to them while we make these changes. No one likes browsing to a page on the internet and seeing a notification about scheduled downtime, or worse, unscheduled downtime.

These situations can very often be avoided by doing zero downtime restarts, just like what we're going to do here. Maybe, as part of our change, we also want to decrease the delay to only 4 seconds = 4000 milliseconds. We'll save our server.js file. But if we go to our browser and refresh our ‘/timer’ endpoint, well, it's still running the old version of the code.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (65)

It's still going to take 9 seconds for this response to come back. Check that out.

What we could do in our terminal is restart our server using:

pm2 restart server 

But there would be a point where our server is unavailable to all of our users because all of the processes are shut down. And then maybe they take a little while to restart.

💡What is Zero Downtime Restart?

With zero downtime reload or restart, instead of terminating all the processes and restarting them all at once, we can use the command:

pm2 reload server 

Notice it's ‘reload’ and not ‘restart’: reload restarts the processes one by one, keeping at least one process running at all times. This is the best way to update servers that are already live and serving users, particularly with applications that are time-sensitive.

We want our users to be able to access our application 24x7 any day of the year. So let's see what happens if we run that command. We'll see our updated code being applied and our process is restarted. And if we had this monitor dashboard open at the same time, we would see that each server is being brought offline one by one.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (66)

We can now see that the uptime for each server is down in the seconds, here it's around 75s, because it just got restarted. Now, when I make a request to the ‘/timer’ endpoint:

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (67)
Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (68)

We get “Beep beep!” instead of “Ding sound!”, and it now takes 4 seconds instead of 9 to respond. Zero downtime reloads are all about making sure your applications are available to your users at all times, even when changes need to be made to your code.

👷Worker Threads

Before we move on, there's an exciting feature in Node that I'd like to introduce you to: worker threads. This is a built-in module that enables the use of threads to execute JavaScript in parallel. Worker threads are useful for performing CPU-intensive JavaScript operations, operations that would otherwise block our code.

Whoa, whoa, whoa, whoa. I know what you're thinking, doesn't our JavaScript code run on a single thread? Was everything that we learned earlier wrong? Don't worry. Everything that we've learned so far very much applies. Worker threads don't change how NodeJS works at the core. But they do add something new. Worker threads take Node, just one step closer to making JavaScript a multi-threaded language.

JavaScript, the language, doesn't have multithreading features, and that won't change any time soon. So what's the difference between traditional multithreading and worker threads in Node.js? What exactly do worker threads do? To understand this, we need to look at the bigger picture.

Worker threads in Node are based on the web workers available in your browser through the Web Worker API. Web workers let you run a piece of JavaScript code in your browser on a separate background thread. That's right.

But worker threads in Node.js are a more recent feature, one that's still evolving as we speak. Worker threads were introduced thanks to a shiny new feature of the V8 engine called V8 isolates.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (69)

V8 isolates are isolated instances of the V8 engine. You can think of them as sandboxes that run JavaScript code independently of each other. Worker threads use these isolates to create new threads that execute your JavaScript code side by side, with each V8 isolate handling the JavaScript code for one thread.

What this means for us in practice is that worker threads, just like clusters, help us take advantage of the multiple CPU cores in our machine. So worker threads are similar to the cluster module, but they do things very differently: the cluster module uses processes, while worker threads use V8 isolates. Why does this matter? How does this difference affect us as developers? Let's explore these questions by comparing how the two work.

To recap, our cluster module allows us to start a server.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (70)

This creates a master process that can then use the fork function to create child processes, or worker processes, which can do things like respond to requests in your server. You can call fork however many times you like to create new workers. We're very familiar with this flow.

Now compare this with worker threads.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (71)

When we run our JavaScript file, we create what's called the main thread. This thread can use the Worker() constructor to create a new worker thread by calling:

new Worker(); 

And just like with the fork function, we can create as many workers as we'd like. If this seems very similar to you, that's because it is. If we look at them side by side.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (72)

The flow to create worker threads and processes follows the same structure. Basically, you can think of the cluster module as allowing us to run multiple instances of Node in separate processes, while the worker thread module allows us to run multiple instances of Node in the same process.

It does this by taking advantage of that V8 isolates feature we mentioned. We'll see this whole flow in action when we write some code that uses worker threads. There are also some very important functional differences between worker threads and clusters.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (73)

If you're paying really close attention, you might have noticed that in the worker thread example, I'm not calling my JavaScript file server.js, but instead index.js. This is to highlight the difference that each worker thread here is not designed to share requests coming into a server. The worker threads module doesn't include any built-in functionality to run a server on one port and distribute incoming requests between threads.

That's specific to the cluster module. So we could run a server using worker threads, but we'd have to implement the distribution of work ourselves. And here's the main difference: unlike processes made with the cluster module, worker threads can share memory with each other. We'll see this in action very soon.

For now, what we need to know is that the worker thread approach isn't as rock solid as the cluster approach with multiple processes. The cluster approach has been used in Node since pretty much the day it was created. In production, I highly recommend you stick with clustering for your servers. But that's also what makes worker threads so exciting: there's still a lot of potential in how they can be used.

🧪Worker Threads In Action

How do we use worker threads? Let's put this shiny Node feature to work and understand it with an example. Create a project folder & a JavaScript file named threads.js.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (74)

To put our threads to use, we need to require the ‘worker_threads’ module, and we'll take a few different values from it.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (75)

The most important value will be the Worker constructor. Similar to what isMaster allowed us to do in the cluster module, we can check whether we're currently in the main thread or a worker thread by using the isMainThread value, also made available by the ‘worker_threads’ module. We can combine these two features to create new worker threads by creating a new instance of the Worker class: new Worker(). The Worker constructor takes a parameter, a string that points to a file containing the JavaScript code to be executed in that worker thread. So we can pass in the current file path using __filename, which Node makes available in every module.

As it stands, this code would create worker threads over and over until our machine can no longer create them, because -

  • We would run our program (i.e. threads.js) ⇒ It would create workers that run our program (i.e. threads.js) again.

So what we want to do is only create new workers if we're in the main thread.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (76)

When we run threads.js for the first time, we can check isMainThread, and only if it's true, i.e. we're in the main thread, call the Worker() constructor, providing the current file name. To demonstrate how this might be used, let's create just 2 workers.

Both of those 2 workers run the same code, the code in the threads.js file. So if we run this, what do you expect to see?

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (77)

We have two worker threads that were executed. Both with the ability to run our worker code in parallel. But let's demonstrate how worker threads are different from processes.

Remember, worker threads are all part of the same process. Unlike with clusters, we run Node multiple times inside one process. To confirm this, we console.log the process ID from both the main thread and the worker threads.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (78)

Unlike with our cluster module, we should expect to see three identical process IDs. Let's see.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (79)

All right, so this is all fine in theory. But right now, our worker thread isn't doing anything useful. Let's give it some actual work to do in parallel with the other worker thread.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (80)

What work can we give it? Say we have an array of numbers. Remember that the built-in JavaScript sort operation, which we can call on any array, is blocking. Say we wanted to sort a few different arrays on separate threads to take advantage of the multiple CPU cores on my machine. We can send work from our main thread to our worker threads by using the second parameter of the Worker constructor, which accepts an object with a workerData property.

With the help of workerData, we can pass the array we want each worker to sort as the value of this property. This data will then be available inside the worker thread through the workerData value that's part of our ‘worker_threads’ module. Now, while this sort operation will still block each thread, because I have multiple cores on my machine, the two different arrays that I pass in can be sorted in parallel, side by side. Let's see what happens if I run our program here.

Ultimate Guide to NodeJS Performance feat. Clustering, PM2, Worker Threads (81)

You can see what happened: the main thread started, and then each of our two worker threads started, both doing our sort operation more or less side by side. Sorting is a fairly expensive operation in terms of CPU usage. By using worker threads, we can multiply the effectiveness of our CPU by taking advantage of its multiple cores, cores which can run each thread in parallel. And because we're using the worker threads module, all of this happens inside one process in the most efficient way possible. Good stuff.
