Let's take the simple code snippet:

var express = require('express');
var app = express();
var counter = 0;

app.get('/', function (req, res) {
   // LOCK
   counter++;
   // UNLOCK
   res.send('hello world')
})

Let's say that app.get(...) is called a huge number of times. As you can see, I don't want the line counter++ to be executed concurrently by two different threads.

Therefore, I want to lock this line so that only one thread at a time can access it. My question is: how do I do this in node.js?

I know there is a lock package: https://www.npmjs.com/package/locks, but I'm wondering whether there is a "native" way of doing it without an external library.

2 Comments

Node is still single threaded. Don’t mix up parallel and concurrent. bytearcher.com/articles/parallel-vs-concurrent Commented Jun 22, 2018 at 5:45
Why do you want to block the thread? If you block the thread, none of the requests will execute; they will all wait in the queue. It's not good practice to block the thread in Node.js. Commented Jun 22, 2018 at 6:05

5 Answers

19

I don't want the line counter++ to be executed concurrently by the two different threads

That cannot happen in node.js with just regular Javascript coding.

node.js is single threaded and event-driven, so there's only ever one piece of Javascript code running at a time that can access that variable. You do not have to worry about the typical pre-emptive concurrency issues of multi-threaded systems.

That said, you can still have concurrency issues in node.js if you are using asynchronous code because the node.js asynchronous model returns control back to the system to process the next event and the asynchronous callback gets called on some future event. But, the concurrency issues are non-pre-emptive so you fully control when they can occur.
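To make that kind of non-pre-emptive concurrency issue concrete, here is a minimal sketch. The `delay` helper and both increment functions are hypothetical names made up for this illustration; `delay` stands in for any asynchronous operation such as a database call.

```javascript
// Hypothetical stand-in for any asynchronous operation (database call, HTTP request, ...).
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

let counter = 0;

// Unsafe: the read and the write are separated by an await, so other
// callbacks can run in between and both calls may read the same value.
async function unsafeIncrement() {
  const current = counter; // read
  await delay(10);         // control returns to the event loop here
  counter = current + 1;   // write, based on a possibly stale read
}

// Safe: the read-modify-write happens in one synchronous step.
async function safeIncrement() {
  await delay(10);
  counter++;               // cannot be interleaved with other Javascript
}

async function demo() {
  counter = 0;
  await Promise.all([unsafeIncrement(), unsafeIncrement()]);
  const unsafeResult = counter; // one update is lost

  counter = 0;
  await Promise.all([safeIncrement(), safeIncrement()]);
  const safeResult = counter;   // both updates are kept

  return { unsafeResult, safeResult };
}
```

Calling `demo()` resolves to `{ unsafeResult: 1, safeResult: 2 }`. The issue only appears because the handler gives up control between reading and writing; a plain `counter++` never does that.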

If you show us your actual code in your app.get() route handler, we can advise more specifically about whether you have a concurrency issue there. And, if you do, we can advise on how best to deal with it.

Threads in the thread pool are all native code that runs behind the scenes. They only trigger actual Javascript to run by queuing events through the event queue. So, because all Javascript that runs is serialized through the event queue, you only get one piece of Javascript ever running at a time. The basic scheme of the event queue is that the interpreter runs a piece of Javascript until it returns control back to the system. At that point, the interpreter looks in the event queue and if there's an event waiting, it pulls that event out and calls the callback associated with that event. Meanwhile, if there is native code running in the background, when it completes, it adds an event to the event queue. That event is not processed until the current Javascript returns control back to the system and it can then grab the next event out of the event queue. So, it's this event-queue that serializes running only one piece of Javascript at a time.
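The run-to-completion behavior described above can be observed with a small experiment. This is only an illustrative sketch; the `order` array and the `done` promise are names invented for it.

```javascript
const order = [];

// Queue a timer callback with a 0 ms delay. It cannot run until the
// currently executing Javascript returns control to the event loop.
setTimeout(() => order.push('timer callback'), 0);

// Keep the main script busy with synchronous work.
let sum = 0;
for (let i = 0; i < 1e6; i++) {
  sum += i;
}
order.push('synchronous work finished');

// A later timer lets us inspect the recorded order once both entries exist.
const done = new Promise((resolve) => setTimeout(() => resolve(order), 20));
```

`done` resolves to `['synchronous work finished', 'timer callback']`: even though the timer was due immediately, its callback waited in the event queue until the synchronous code had finished.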

Edit: Nodejs does now have WorkerThreads, which enable separate threads of Javascript, but each thread has its own heap and its own variables, so a variable in one thread cannot be directly accessed from another thread. You can configure shared memory that multiple WorkerThreads can access, but that memory is exposed as raw blocks of memory rather than ordinary variables, and if you use it, you do indeed need to code your own synchronization to make sure you are accessing it atomically. The code you show in your question is not using any of this, so access to the counter variable is already atomic and it cannot be simultaneously accessed by any other Javascript, even if you are using WorkerThreads.
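For the shared-memory case only (not the code in the question), a counter could be kept in a SharedArrayBuffer and updated with Atomics. The following is a rough sketch under stated assumptions: `runCounterWorkers` and `workerSource` are hypothetical names, and the worker code is inlined as a string with `eval: true` purely so the example fits in one file.

```javascript
const { Worker } = require('node:worker_threads');

// Worker body, inlined as a string for this sketch (normally a separate file).
const workerSource = `
  const { workerData } = require('node:worker_threads');
  const view = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.iterations; i++) {
    Atomics.add(view, 0, 1); // atomic read-modify-write on shared memory
  }
`;

function runCounterWorkers(workerCount, iterations) {
  // One 32-bit integer slot shared by every thread.
  const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
  const view = new Int32Array(shared);

  const workers = Array.from({ length: workerCount }, () =>
    new Promise((resolve, reject) => {
      const worker = new Worker(workerSource, {
        eval: true,
        workerData: { shared, iterations },
      });
      worker.on('error', reject);
      worker.on('exit', resolve);
    })
  );

  // Read the final value once every worker has exited.
  return Promise.all(workers).then(() => Atomics.load(view, 0));
}
```

For example, `runCounterWorkers(2, 1000)` resolves to 2000, because `Atomics.add` performs each increment as a single indivisible step. A plain `view[0]++` from several threads would carry no such guarantee.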


7 Comments

OK. So you're saying that even if I have 100 threads in the thread pool, I'll never see two threads that execute in parallel?
@CrazySynthax - Javascript itself doesn't have threads. So, I don't know what you mean by 100 threads in a thread pool. If you're talking about the thread pool used internally in node.js, that is for native code only. When a native code action (like, say, a file system operation) wants to notify completion back to Javascript, it goes through the event queue as described in my answer. There are never two pieces of Javascript running at the same time. node.js does not have Javascript threads. All native code threads are serialized through the event queue for calling Javascript callbacks.
@CrazySynthax - Yes, that is true. It will execute single threaded until it returns control back to the interpreter at which point the interpreter will pull the next event from the event queue and call the callback associated with it. Note, due to the asynchronous design of Javascript, the first callback may not actually be done with its work. It may have returned and scheduled some other event to occur in the future (timer, I/O request, etc...) that will trigger a new callback in the future where it will finish its work. But, it will run uninterrupted until it returns.
I wonder how would you deal in the situation when counter is saved in DB and you have multi instance server?
@VitaliyMarkitanov - For multi-process access to a database, you have to use atomic database operations. For example, you don't fetch a counter, increment it and write it back because that's subject to multi-process race conditions. Instead, you use the atomic operations your database provides. That might be a lock or that might be an atomic increment operation or something else. Each database has its own set of concurrency/atomic features expressly for these types of problems. This is part of proper database coding.
3

If you block the thread, none of the requests will execute; they will all wait in the queue.

It's not good practice to block the thread in Node.js.

var express = require('express');
var app = express();
var counter = 0;

const getPromise = () => {
    return new Promise((resolve) => {
        setTimeout(() => {
            resolve('Done')
        }, 100);
    });
}

app.get('/', async (req, res) => {
    const localCounter = counter++;
    // Use the local copy for the rest of the operation so the value won't vary

    // No lock needed: await pauses only this handler, other requests keep running
    await getPromise();

    console.log(localCounter); // Same value as before the await

    res.send('hello world')
})

1 Comment

await does not block the event loop, it is just syntactic sugar over promises. More info about blocking the event loop: nodejs.org/en/learn/asynchronous-work/dont-block-the-event-loop
1

Node.js is single-threaded, which means that any single process running your app will not have data races like you anticipate. In fact, a quick inspection of the locks library shows that they use a boolean flag and a system of Array objects to determine whether something is locked or not.
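If you do want a lock without an external library, the same idea (a boolean flag plus an array of waiters) can be sketched in a few lines. This is my own minimal illustration, not the actual code of the locks package; `Mutex`, `delay`, `lockedIncrement`, and `demo` are hypothetical names.

```javascript
// A minimal promise-based mutex: a boolean flag plus an array of waiters.
class Mutex {
  constructor() {
    this.locked = false;
    this.waiters = [];
  }

  // Resolves once the caller holds the lock.
  acquire() {
    if (!this.locked) {
      this.locked = true;
      return Promise.resolve();
    }
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  // Hands the lock to the next waiter, or marks it free.
  release() {
    const next = this.waiters.shift();
    if (next) {
      next(); // lock stays held; ownership passes to the next waiter
    } else {
      this.locked = false;
    }
  }
}

// Hypothetical stand-in for an asynchronous operation.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const mutex = new Mutex();
let counter = 0;

// The read and the write are separated by an await, so they need the lock.
async function lockedIncrement() {
  await mutex.acquire();
  try {
    const current = counter;
    await delay(5);
    counter = current + 1;
  } finally {
    mutex.release(); // always release, even if the critical section throws
  }
}

async function demo() {
  counter = 0;
  await Promise.all([lockedIncrement(), lockedIncrement(), lockedIncrement()]);
  return counter;
}
```

`demo()` resolves to 3. A lock like this only matters when the critical section contains an `await`; a bare `counter++` as in the question needs none.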

You should only really worry about this if you plan on sharing data with multiple processes. In that case, you could use Alan's lockfile approach from this stackoverflow thread here.

2 Comments

So can we conclude that once a thread (in thread pool) starts to execute, it will never be interrupted until it terminates?
There is only ever one thread running your Javascript in Node.js. However, jfriend00 does a good job explaining how Node accomplishes concurrency despite being single-threaded. Put simply, it is safe to assume that as long as a statement is executing, it will not be manipulated in such a way that you have to worry about data races (reads and writes happening at the same time). However, it is completely possible for callbacks to be interleaved in such a way that while one callback is waiting for something to complete, another can be started. Again, jfriend00 explains this with the event queue.
1

You can also approach this problem with the native cluster package of node.js.

const cluster = require('node:cluster');

But it's important to understand the difference between the cluster package and the worker_threads package.

The former runs multiple instances of the node.js application as separate processes and distributes the workload among them; you do not have full control over these, as they are system managed. The latter creates a thread that you can fully manage. Both can solve this particular problem.

With cluster, the processes run in isolation, without interfering with one another, but they can share the same server port, and the system distributes incoming work among them automatically.

You can fork a worker to accomplish your task like so:

if (cluster.isPrimary) {
  // Fork a worker.
  const worker = cluster.fork();
}

You can use the following, inside the same block where worker is defined, to run your code once the worker is up.

worker.on('online', () => {
  // Worker is online
});

Make sure that you terminate it with the disconnect() or kill() method on the worker, depending on your intentions.


1

Node.js can run multiple threads, but by default your Javascript runs on a single main thread. To run multiple parallel threads that handle different tasks separately and simultaneously, you have to use the worker_threads package and assign your task above to a worker thread, which handles it independently of the main thread. You need to do the following.

  1. Confirm that you have a multi-core processor. You can do that by calling the availableParallelism or cpus methods of the native "os" package, by running the nproc command on Linux, or by checking the number of logical processors in the Performance tab of the Windows Task Manager.
const os = require("node:os");

// Number of logical CPUs available to this process.
// Alternative: os.cpus().length
const THREAD_COUNT = os.availableParallelism();
  2. Implement a worker thread in a separate JS file. You can put your counter task there so that it executes as a separate thread without blocking the main thread, using the workerData and parentPort properties of the native worker_threads package.
const { workerData, parentPort } = require("worker_threads");

In node.js the main thread and the worker threads can communicate in one direction or in both directions using these properties of the worker_threads package.

workerData contains the data passed to the worker thread from the main thread, while parentPort has a method called postMessage that the worker can use to communicate back to the main thread.

  3. Create a Worker instance, which runs as an independent JS execution thread, separate from the main thread. The Worker class is also part of the native worker_threads package.
const { Worker } = require("worker_threads");

Instantiate a new Worker and pass it the path of the worker file you created earlier, like so.

const worker = new Worker("path to the worker thread", {
  workerData: { data: "data to the worker thread" },
});

Pass the path to the worker file to the Worker constructor, and optionally pass data to the worker thread as well.

It's recommended that you wrap the Worker in a new promise so that you know when the worker thread completes, especially if you intend to communicate between the worker thread and the main thread.

The communication is event driven and works like node.js's native event implementation, but you don't need to instantiate an EventEmitter yourself, because the Worker class has its own event implementation by default. This enables duplex communication between the main thread and the worker threads, like so.

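worker.on("message", (data) => {
  // Handle data sent by the worker via parentPort.postMessage()
});

worker.on("error", (msg) => {
  // Handle an error thrown inside the worker
});

Register handlers for both the message and error events on the Worker instance to receive messages or errors from the worker thread.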

If you have wrapped the Worker class in a new promise, you can use resolve and reject functions of the promise to handle each case respectively.
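A minimal sketch of that promise wrapper is shown below. The names `runWorker` and `workerSource` are hypothetical, and the worker body is inlined as a string with `eval: true` only to keep the example in a single file; in practice you would pass the path to your separate worker file instead.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical worker body, inlined for this sketch (normally its own JS file).
const workerSource = `
  const { workerData, parentPort } = require('node:worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.limit; i++) {
    sum += i;
  }
  parentPort.postMessage(sum); // send the result back to the main thread
`;

// Wraps a Worker in a promise: resolves with the first message,
// rejects on an error or on a non-zero exit code.
function runWorker(limit) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { limit },
    });
    worker.on('message', resolve);
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0) {
        reject(new Error(`Worker stopped with exit code ${code}`));
      }
    });
  });
}
```

For example, `runWorker(100)` resolves to 5050, and the main thread stays free to handle other events while the worker computes.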

With this setup, you have implemented a multithreaded node.js application in a non-blocking fashion, and you can send different tasks to the main thread or to the worker threads.

You can check that you have done this properly by keeping the main JS thread busy with a huge CPU task, say a loop of 1_000_000_000 iterations, which would normally block a single-threaded JS application. If you send a task to the worker thread while the main thread is busy, the worker will still execute it; if you have implemented it as described above, the result is delivered to the main thread once the main thread is free to process the message.

