How to avoid multiple node processes doing repetitive things? - javascript

I have a module in Node.js which repeatedly picks a document from MongoDB and processes it. Each document should be processed only once. I also want to use multiple processes: I want to run the same module (process) several times independently, possibly on different processors.
The problem is that there might be a scenario where the same document is picked and processed by two different workers. How can multiple processes know that a particular document is being processed by some other worker, so that they don't touch it? There is no way for my independent processes to communicate, and I cannot use a parent which forks multiple processes and acts as a bridge between them. How do I avoid this kind of problem in Node.js?

One way to do it is to assign a unique numeric ID to each of your MongoDB documents, and a unique numeric identifier to each of your node.js workers.
For example, have an env var called NUM_WORKERS, and then in your node.js module:
var NumWorkers = process.env.NUM_WORKERS || 1;
You then need to assign a unique, contiguous instance number (in the range 0 to NumWorkers-1) to each of your workers (e.g. via a command line parameter read by your node.js process when it initializes). You can store that in a variable called MyWorkerInstanceNum.
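For example, a minimal sketch of that setup (reading the instance number from the first command line argument is just one possible convention):
var NumWorkers = parseInt(process.env.NUM_WORKERS, 10) || 1;
// e.g. launched as: NUM_WORKERS=4 node worker.js 2
var MyWorkerInstanceNum = parseInt(process.argv[2], 10);
if (isNaN(MyWorkerInstanceNum) || MyWorkerInstanceNum < 0 || MyWorkerInstanceNum >= NumWorkers) {
    throw new Error('Pass an instance number between 0 and ' + (NumWorkers - 1));
}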
When you pick a document from MongoDB, call the following function (passing the document's unique documentId as a parameter):
function isMine(documentId){
    //
    // Example: documentId = 10
    //          NumWorkers = 4
    //          (10 % 4)   = 2
    // If MyWorkerInstanceNum is 2, return true, else return false.
    return ((documentId % NumWorkers) === MyWorkerInstanceNum);
}
Only continue to actually process the document if isMine() returns true.
So, multiple workers may "pick" a document, but only one worker will actually process it.

Simply keep a transaction log of each document being processed, keyed by its unique ID. In the transaction log table for the processed documents, write the status as one of the following (for example):
requested
initiated
processed
failed
You may also want a column in that table for stderr/stdout in case you want to know why something failed or succeeded, and timestamps - that sort of thing.
When you initialize the processing of a document in your Node app, look up that document's ID in the transaction log and check its status. If no entry exists, then you're free to process it.
Pseudo-code (sorry, I'm not a Mongo guy!):
db.collection('collectionName').find().forEach(function(doc) {
    db.collection('transactions').findOne({ documentId: doc._id }, function(err, trx) {
        if (!trx || trx.status === 'failed') {
            DocProcessor.child.process(doc)
        } else {
            // don't need to process it, it's already been done
        }
    })
})
You'll also want to enable concurrency locking on the transactions log collection so that you ensure a row (and subsequent job) can't be duplicated. If this becomes a challenge to ensure docs are being queued properly, consider adding in an AMQP service to handle queuing of the docs. Set up a handler to manage distribution of the child processes and transaction logging. Flow would be something like:
MQ ⇢ Log ⇢ Handler ⇢ Doc processor children
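One way to get that "can't be duplicated" guarantee in MongoDB (a sketch, assuming the log lives in a transactions collection with one entry per document ID) is a unique index plus an insert that acts as a claim: whichever worker inserts the entry first wins, and every other worker gets a duplicate-key error and skips the document.
// run once: enforce a single log entry per document
db.collection('transactions').createIndex({ documentId: 1 }, { unique: true });

// per document: try to claim it before processing
db.collection('transactions').insertOne(
    { documentId: doc._id, status: 'requested', requestedAt: new Date() },
    function (err) {
        if (err && err.code === 11000) {
            // duplicate key: another worker already claimed this document
            return;
        }
        if (err) throw err;
        DocProcessor.child.process(doc); // same hypothetical processor as above
    }
);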

Related

Global memoizing fetch() to prevent multiple of the same request

I have an SPA and for technical reasons I have different elements potentially firing the same fetch() call pretty much at the same time.[1]
Rather than going insane trying to prevent multiple unrelated elements from orchestrating their loading, I am thinking about creating a globalFetch() call where:
the init argument is serialised (along with the resource parameter) and used as hash
when a request is made, it's queued and its hash is stored
when another request comes, and the hash matches (which means it's in-flight), another request will NOT be made, and it will piggy back from the previous one
async function globalFetch(resource, init) {
    const sigObject = { ...init, resource }
    const sig = JSON.stringify(sigObject)

    // If it's already happening, return that one
    if (globalFetch.inFlight[sig]) {
        // NOTE: I know I don't yet have sig.timeStamp, this is just to show
        // the logic
        if (Date.now - sig.timeStamp < 1000 * 5) {
            return globalFetch.inFlight[sig]
        } else {
            delete globalFetch.inFlight[sig]
        }
    }

    const ret = globalFetch.inFlight[sig] = fetch(resource, init)
    return ret
}
globalFetch.inFlight = {}
It's obviously missing a way to have the requests' timestamps. Plus, it's missing a way to delete old requests in batch. Other than that... is this a good way to go about it?
Or, is there something already out there, and I am reinventing the wheel...?
[1] If you are curious, I have several location-aware elements which will reload data independently based on the URL. It's all nice and decoupled, except that it's a little... too decoupled. Nested elements (with partially matching URLs) needing the same data potentially end up making the same request at the same time.
Your concept will generally work just fine.
Some things missing from your implementation:
Failed responses should either not be cached in the first place or removed from the cache when you see the failure. And failure is not just rejected promises, but also any request that doesn't return an appropriate success status (probably a 2xx status).
JSON.stringify(sigObject) is not a canonical representation of the exact same data because properties might not be stringified in the same order depending upon how the sigObject was built. If you grabbed the properties, sort them and inserted them in sorted order onto a temporary object and then stringified that, it would be more canonical.
I'd recommend using a Map object instead of a regular object for globalFetch.inFlight because it's more efficient when you're adding/removing items regularly and will never have any name collision with property names or methods (though your hash would probably not conflict anyway, but it's still a better practice to use a Map object for this kind of thing).
Items should be aged from the cache (as you apparently know already). You can just use a setInterval() that runs every so often (it doesn't have to run very often - perhaps every 30 minutes) that just iterates through all the items in the cache and removes any that are older than some amount of time. Since you're already checking the time when you find one, you don't have to clean the cache very often - you're just trying to prevent non-stop build-up of stale data that isn't going to be re-requested - so it isn't getting automatically replaced with newer data and isn't being used from the cache.
If you have any case insensitive properties or values in the request parameters or the URL, the current design would see different case as different requests. Not sure if that matters in your situation or not or if it's worth doing anything about it.
When you write the real code, you need Date.now(), not Date.now.
Here's a sample implementation that implements all of the above (except for case sensitivity because that's data-specific):
function makeHash(url, obj) {
    // put properties in sorted order to make the hash canonical
    // the canonical sort is top level only,
    // does not sort properties in nested objects
    let items = Object.entries(obj).sort((a, b) => b[0].localeCompare(a[0]));
    // add URL on the front
    items.unshift(url);
    return JSON.stringify(items);
}

async function globalFetch(resource, init = {}) {
    const key = makeHash(resource, init);

    const now = Date.now();
    const expirationDuration = 5 * 1000;
    const newExpiration = now + expirationDuration;

    const cachedItem = globalFetch.cache.get(key);
    // if we found an item and it expires in the future (not expired yet)
    if (cachedItem && cachedItem.expires >= now) {
        // update expiration time
        cachedItem.expires = newExpiration;
        return cachedItem.promise;
    }

    // couldn't use a value from the cache
    // make the request
    let p = fetch(resource, init);
    p.then(response => {
        if (!response.ok) {
            // if response not OK, remove it from the cache
            globalFetch.cache.delete(key);
        }
    }, err => {
        // if promise rejected, remove it from the cache
        globalFetch.cache.delete(key);
    });
    // save this promise (will replace any expired value already in the cache)
    globalFetch.cache.set(key, { promise: p, expires: newExpiration });
    return p;
}
// initialize cache
globalFetch.cache = new Map();

// clean up interval timer to remove expired entries
// does not need to run that often because .expires is already checked above
// this just cleans out old expired entries to avoid memory increasing
// indefinitely
globalFetch.interval = setInterval(() => {
    const now = Date.now();
    for (const [key, value] of globalFetch.cache) {
        if (value.expires < now) {
            globalFetch.cache.delete(key);
        }
    }
}, 10 * 60 * 1000); // run every 10 minutes
Implementation Notes:
Depending upon your situation, you may want to customize the cleanup interval time. This is set to run a cleanup pass every 10 minutes just to keep it from growing unbounded. If you were making millions of requests, you'd probably run that interval more often or cap the number of items in the cache. If you aren't making that many requests, this can be less frequent. It is just to clean up old expired entries sometime so they don't accumulate forever if never re-requested. The check for the expiration time in the main function already keeps it from using expired entries - that's why this doesn't have to run very often.
This looks at response.ok from the fetch() result and promise rejection to determine a failed request. There could be some situations where you want to customize what is and isn't a failed request with some different criteria than that. For example, it might be useful to cache a 404 to prevent repeating it within the expiration time if you don't think the 404 is likely to be transitory. This really depends upon your specific use of the responses and behavior of the specific host you are targeting. The reason to not cache failed results is for cases where the failure is transitory (either a temporary hiccup or a timing issue and you want a new, clean request to go if the previous one failed).
There is a design question for whether you should or should not update the .expires property in the cache when you get a cache hit. If you do update it (like this code does), then an item could stay in the cache a long time if it keeps getting requested over and over before it expires. But, if you really want it to only be cached for a maximum amount of time and then force a new request, you can just remove the update of the expiration time and let the original result expire. I can see arguments for either design depending upon the specifics of your situation. If this is largely invariant data, then you can just let it stay in the cache as long as it keeps getting requested. If it is data that can change regularly, then you may want it to be cached no more than the expiration time, even if it's being requested regularly.
Consider using a ServiceWorker or Workbox to separate caching logic from your application. The Stale-While-Revalidate strategy could apply here.
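As a rough sketch of that idea (assuming Workbox is bundled into the service worker and that the shared requests live under an illustrative /api/ path), a stale-while-revalidate route registration might look like:
// service-worker.js -- sketch only; the route pattern and cache name are assumptions
import { registerRoute } from 'workbox-routing';
import { StaleWhileRevalidate } from 'workbox-strategies';

registerRoute(
    ({ url }) => url.pathname.startsWith('/api/'),
    new StaleWhileRevalidate({ cacheName: 'api-cache' })
);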

How to cancel a wasm process from within a webworker

I have a wasm process (compiled from c++) that processes data inside a web application. Let's say the necessary code looks like this:
std::vector<JSONObject> data;
for (size_t i = 0; i < data.size(); i++)
{
    process_data(data[i]);
    if (i % 1000 == 0) {
        bool is_cancelled = check_if_cancelled();
        if (is_cancelled) {
            break;
        }
    }
}
This code basically "runs/processes a query", similar to a SQL query interface.
However, queries may take several minutes to run/process, and at any given time the user may cancel their query. The cancellation would occur in the normal JavaScript/web application, outside of the Web Worker running the wasm.
My question then is: what would be an example of how we could know that the user has clicked the 'cancel' button and communicate that to the wasm process, so that it knows the process has been cancelled and can exit? Using worker.terminate() is not an option, as we need to keep all the loaded data for that worker and cannot just kill it (it needs to stay alive with its stored data, so another query can be run...).
What would be an example way to communicate here between the javascript and worker/wasm/c++ application so that we can know when to exit, and how to do it properly?
Additionally, let us suppose a typical query takes 60s to run and processes 500MB of data in-browser using cpp/wasm.
Update: I think there are the following possible solutions here based on some research (and the initial answers/comments below) with some feedback on them:
Use two workers, with one worker storing the data and another worker processing the data. In this way the processing-worker can be terminated, and the data will always remain. Feasible? Not really, as it would take way too much time to copy over ~ 500MB of data to the webworker whenever it starts. This could have been done (previously) using SharedArrayBuffer, but its support is now quite limited/nonexistent due to some security concerns. Too bad, as this seems like by far the best solution if it were supported...
Use a single worker using Emterpreter and using emscripten_sleep_with_yield. Feasible? No, destroys performance when using Emterpreter (mentioned in the docs above), and slows down all queries by about 4-6x.
Always run a second worker and in the UI just display the most recent. Feasible? No, would probably run into quite a few OOM errors if it's not a shared data structure and the data size is 500MB x 2 = 1GB (500MB seems to be a large though acceptable size when running in a modern desktop browser/computer).
Use an API call to a server to store the status and check whether the query is cancelled or not. Feasible? Yes, though it seems quite heavy-handed to long-poll with network requests every second from every running query.
Use an incremental-parsing approach where only a row at a time is parsed. Feasible? Yes, but it would also require a tremendous amount of rewriting of the parsing functions so that every function supports this (the actual data parsing is handled in several functions -- filter, search, calculate, group by, sort, etc.).
Use IndexedDB and store the state in javascript. Allocate a chunk of memory in WASM, then return its pointer to JavaScript. Then read database there and fill the pointer. Then process your data in C++. Feasible? Not sure, though this seems like the best solution if it can be implemented.
[Anything else?]
In the bounty then I was wondering three things:
Do the above six analyses seem generally valid?
Are there other (perhaps better) approaches I'm missing?
Would anyone be able to show a very basic example of doing #6 -- seems like that would be the best solution if it's possible and works cross-browser.
For Chrome (only) you may use shared memory (a shared buffer as memory), and raise a flag in that memory when you want to halt. I'm not a big fan of this solution (it's complex and supported only in Chrome). It also depends on how your query works, and whether there are places where the lengthy query can check the flag.
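A minimal sketch of that shared-flag idea (assuming the page is cross-origin isolated so SharedArrayBuffer is available; every name below is illustrative):
// main.js -- create one shared byte and hand the buffer to the worker
const cancelBuffer = new SharedArrayBuffer(1);
const cancelFlag = new Uint8Array(cancelBuffer);
const worker = new Worker('worker.js');
worker.postMessage({ cmd: 'init', cancelBuffer });

document.querySelector('#cancel').onclick = () => {
    Atomics.store(cancelFlag, 0, 1); // raise the flag; the wasm loop polls it
};

// worker.js -- check_if_cancelled() (imported into the wasm module from JS)
// reads the same byte, so the C++ loop above can poll it every 1000 rows
let cancelFlag;
onmessage = (e) => {
    if (e.data.cmd === 'init') {
        cancelFlag = new Uint8Array(e.data.cancelBuffer);
    }
};
function check_if_cancelled() {
    return Atomics.load(cancelFlag, 0) === 1;
}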
Instead you should probably call the c++ function multiple times (e.g. for each query) and check if you should halt after each call (just send a message to the worker to halt).
What I mean by multiple times is making the query in stages (multiple function calls for a single query). It may not be applicable in your case.
Regardless, AFAIK there is no way to send a signal to a Webassembly execution (e.g. Linux kill). Therefore, you'll have to wait for the operation to finish in order to complete the cancellation.
I'm attaching a code snippet that may explain this idea.
worker.js:
... init webassembly
onmessage = function(q) {
    // query received from main thread.
    const result = ... call webassembly(q);
    postMessage(result);
}
main.js:
const worker = new Worker("worker.js");
let cancel = false;
let processing = false;

worker.onmessage = function(r) {
    // when worker has finished processing the query.
    // r is the results of the processing.
    processing = false;
    if (cancel === true) {
        // processing is done, but result is not required.
        // instead of showing the results, update that the query was canceled.
        cancel = false;
        ... update UI "canceled".
        return;
    }
    ... update UI "results r".
}

function onCancel() {
    // Occurs when user clicks on the cancel button.
    if (cancel) {
        // sanity test - prevent this in UI.
        throw "already cancelling";
    }
    cancel = true;
    ... update UI "canceling".
}

function onQuery(q) {
    if (processing === true) {
        // sanity test - prevent this in UI.
        throw "already processing";
    }
    processing = true;
    // Send the query to the worker.
    // When the worker receives the message it will process the query via webassembly.
    worker.postMessage(q);
}
An idea from a user-experience perspective:
You may create two workers. This will take twice the memory, but will allow you to "cancel" "immediately" once: in the background the 2nd worker simply runs the next query, and when the 1st worker finishes the cancelled one, cancellation will again become immediate.
Shared Thread
Since the worker and the C++ function that it called share the same thread, the worker will also be blocked until the C++ loop is finished, and won't be able to handle any incoming messages. I think a solid option would be to minimize the amount of time that the thread is blocked by instead initiating one iteration at a time from the main application.
It would look something like this.
main.js -> worker.js -> C++ function -> worker.js -> main.js
Breaking up the Loop
Below, C++ has a variable initialized at 0, which will be incremented at each loop iteration and stored in memory.
The C++ function then performs one iteration of the loop, increments the variable to keep track of the loop position, and immediately breaks.
static int x = 0; // counter persists between calls, initialized at 0
std::vector<JSONObject> data;
for (size_t i = x; i < data.size(); i++)
{
    process_data(data[i]);
    x++;   // increment counter
    break; // stop function until told to iterate again starting at x
}
Then you should be able to post a message to the web worker, which then sends a message to main.js that the thread is no longer blocked.
Canceling the Operation
From this point, main.js knows that the web worker thread is no longer blocked, and can decide whether or not to tell the web worker to execute the C++ function again (with the C++ variable keeping track of the loop increment in memory.)
let continueOperation = true
// here you can set it to false at any time since the thread is not blocked here

worker.expensiveThreadBlockingFunction()
// results in one iteration of the loop being run until the message is received below

worker.onmessage = function(e) {
    if (continueOperation) {
        worker.expensiveThreadBlockingFunction()
        // execute the worker function again, ultimately continuing the increment in C++
    } else {
        return false
        // or send a message to the worker to reset the C++ counter to prepare for the next execution
    }
}
Continuing the Operation
Assuming all is well, and the user has not cancelled the operation, the loop should continue until finished. Keep in mind you should also send a distinct message for whether the loop has completed, or needs to continue, so you don't keep blocking the worker thread.
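A sketch of what those distinct messages could look like on the worker side (processNextChunk and resetCounter stand in for hypothetical wasm exports):
// worker.js (sketch): run one chunk per message and report whether more work remains
onmessage = function (e) {
    if (e.data.cmd === 'step') {
        const finished = processNextChunk(); // hypothetical wasm export doing one iteration
        postMessage({ type: finished ? 'done' : 'progress' });
    } else if (e.data.cmd === 'reset') {
        resetCounter(); // hypothetical wasm export resetting the loop counter
        postMessage({ type: 'reset' });
    }
};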

How to synchronously read data from Azure DocumentDb collection inside a server-side trigger?

I am trying to implement a trigger on an Azure DocumentDb collection, which is supposed to auto-increment a version of a document, which is being inserted. The trigger is created as a pre-trigger.
The challenge I am facing is that the collection class doesn't seem to provide a synchronous API for querying data. My plan for the trigger was to query existing documents, get the top version, increment it, and assign the +1 value to the document being inserted into the collection. But since the result of the query is only available asynchronously, by that time my trigger has already completed and the document has been inserted unmodified.
How can I await the query result?
Here is what my current trigger looks like:
// TRIGGER Auto increment version
function autoIncrementVersion() {
    var collection = getContext().getCollection();
    var request = getContext().getRequest();
    var docToCreate = request.getBody();

    // Reject documents that do not have a Version property by throwing an exception.
    if (!docToCreate.Version) {
        throw new Error('Document must include a "Version" property.');
    }

    var lastVersion;
    var filter = "SELECT TOP 1 d.Version FROM CovenantsDocuments d ORDER BY d.Version DESC";
    var result = collection.queryDocuments(collection.getSelfLink(), filter, {},
        function (err, documents, responseOptions) {
            if (err) throw new Error("Error: " + err.message);

            if (documents.length != 1 || !documents[0]) {
                lastVersion = 0;
            } else {
                lastVersion = documents[0];
            }
            //By the time we reach this line, our trigger has already completed?
            docToCreate.Version = lastVersion + 1;
        });

    if (!result) throw "Unable to read last version of the document";
}
UPDATE: The issue was with the way I was submitting the request. It looks like triggers are not fired by default; their names need to be explicitly provided as an argument to the request.
In my case the trigger wasn't firing until I changed the client code to this:
RequestOptions options = new RequestOptions
{
    PreTriggerInclude = new[] { "autoIncrementVersion" }
};

client.CreateDocumentAsync(url, document, options);
The trigger will automatically wait until all pending async operations either complete, fail, or time out before returning, so you don't need to explicitly await the query result. What you have is close. The only thing that I can see missing is that you never call request.setBody(docToCreate) after you alter docToCreate.
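In other words, the end of the query callback would look roughly like this (a sketch reusing the names from the question's trigger):
// inside the queryDocuments callback, after computing the new version
docToCreate.Version = lastVersion + 1;
request.setBody(docToCreate); // without this, the unmodified document is inserted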
That said, I'm not 100% certain that this approach is safe. All operations inside of a trigger, sproc, or UDF are atomic, but I'm not sure that the combination of a pre-trigger plus a write operation is atomic. The risk is that two simultaneous writes will both run and complete the trigger part, which would give them the same .Version. You would probably have to ask the DocumentDB Product Managers to confirm this. They hang out here so they may respond here.
If you find that it's not atomic, then you can move everything (read to find latest version and write) into a stored procedure (sproc).
You might also consider creating a single document whose id you hard code to something like 'LAST_VERSION' to hold the last used version. That means that every write will result in a read + two writes (one for the document and one to update this document), but it may be more efficient than your query + one write approach. You could do all of this in one sproc or you could use a pre-trigger (to fetch the 'LAST_VERSION' + write operation + post-trigger (to update the 'LAST_VERSION' document) depending upon what the Product Managers say about atomicity.
One more caution about your current approach... Make sure the precision of the index on the Version field is set to -1 (Maximum precision).
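For reference, the relevant fragment of a (legacy DocumentDB) indexing policy would look roughly like the following; the exact path depends on where Version sits in your documents, so treat this as a sketch:
{
    "includedPaths": [
        {
            "path": "/Version/?",
            "indexes": [
                { "kind": "Range", "dataType": "Number", "precision": -1 }
            ]
        }
    ]
}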

Firebase JavaScript API: get all data nodes satisfying a condition in one callback only. Also, does it make a new TCP connection for each query?

There is a similar question here. However, it's related to REST and I want to ask about the JavaScript API. Also, my case is a bit different, so maybe someone can suggest some other solution.
I want to perform a query similar to this:
"SELECT * FROM db.table WHERE field1 ="val1";"
With firebase we can do following:
var ref = new Firebase("https://db.firebaseio.com/table");
ref.orderByChild("field1").equalTo("val1").on("value", function(record) {
    console.log(record.val())
});
So Firebase triggers my callback function for each child that satisfies field1="val1". Does it open a new TCP connection for each of these child queries? Also, is there any way to get all the children satisfying the condition in one go (that is, one callback is triggered when all of them have been downloaded at the client)?
So firebase triggers my callback function for each child that satisfies field1="val1"
Not exactly. It triggers the callback function exactly once, passing all the matching nodes in the DataSnapshot parameter. You can loop over them with:
var ref = new Firebase("https://db.firebaseio.com/table");
ref.orderByChild("field1").equalTo("val1").on("value", function(snapshot) {
    snapshot.forEach(function(record) {
        console.log(record.val())
    });
});
The loop is needed, even if there's only one child. You can use snapshot.numChildren() to determine if there are any nodes matching your query.
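For example, a sketch of that check using once() (so it runs a single time) with the same query as above:
ref.orderByChild("field1").equalTo("val1").once("value", function(snapshot) {
    if (snapshot.numChildren() === 0) {
        console.log("no children match the query");
        return;
    }
    snapshot.forEach(function(record) {
        console.log(record.val());
    });
});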
Does it open a new TCP connection for each of these child queries?
Nope. The Firebase client establishes a WebSocket connection when you first call new Firebase(...). After that all communication goes over that WebSocket. Only when the environment doesn't support WebSockets, does Firebase fall back to HTTP long-polling. Have a look in the network tab of your browser's debugger to see what's going over the wire. It's quite educational.
Also, is there any way to get all the children satisfying the condition in one go (that is, one callback is triggered when all of them have been downloaded at the client)?
I think I answered that one already.
Update based on the comments
Are the callback functions passed to forEach called synchronously?
Yes

Node.js fs.writeFile() empties the file

I have an update method which gets called about every 16-40ms, and inside I have this code:
this.fs.writeFile("./data.json", JSON.stringify({
    totalPlayersOnline: this.totalPlayersOnline,
    previousDay: this.previousDay,
    gamesToday: this.gamesToday
}), function (err) {
    if (err) {
        return console.log(err);
    }
});
If the server throws an error, the "data.json" file sometimes becomes empty. How do I prevent that?
Problem
fs.writeFile is not an atomic operation. Here is an example program which I will run strace on:
#!/usr/bin/env node

const { writeFile, } = require('fs');

// nodejs won’t exit until the Promise completes.
new Promise(function (resolve, reject) {
    writeFile('file.txt', 'content\n', function (err) {
        if (err) {
            reject(err);
        } else {
            resolve();
        }
    });
});
When I ran that under strace -f and tidied up the output to show just the syscalls from the writeFile operation (which actually spans multiple IO threads), I got:
open("file.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 9
pwrite(9, "content\n", 8, 0) = 8
close(9) = 0
As you can see, writeFile completes in three steps.
The file is open()ed. This is an atomic operation that, with the provided flags, either creates an empty file on disk or, if the file exists, truncates it. Truncating the file is an easy way to make sure that only the content you write ends up in the file: if the file already had data and were longer than the data you subsequently write, the extra data would otherwise remain past the end of the new content.
The content is written. Because I wrote such a short string, this is done with a single pwrite() call, but for larger amounts of data I assume nodejs might write only a chunk at a time.
The handle is closed.
My strace had each of these steps occurring on a different node IO thread. This suggests to me that fs.writeFile() might actually be implemented in terms of fs.open(), fs.write(), and fs.close(). Thus, nodejs does not treat this complex operation like it is atomic at any level—because it isn’t. Therefore, if your node process terminates, even gracefully, without waiting for the operation to complete, the operation could be at any of the steps above. In your case, you are seeing your process exit after writeFile() finishes step 1 but before it completes step 2.
Solution
The common pattern for transactionally replacing a file’s contents with a POSIX layer is to use these steps:
Write the data to a differently named file, fsync() the file (See “When should you fsync?” in “Ensuring data reaches disk”), and then close() it.
rename() (or, on Windows, MoveFileEx() with MOVEFILE_REPLACE_EXISTING) the differently-named file over the one you want to replace.
Using this algorithm, the destination file is either updated or not regardless of when your program terminates. And, even better, journalled (modern) filesystems will ensure that, as long as you fsync() the file in step 1 before proceeding to step 2, the two operations will occur in order. I.e., if your program performs step 1 and then step 2 but you pull the plug, when you boot up you will find the filesystem in one of the following states:
Neither of the two steps is completed. The original file is intact (or if it never existed before, it doesn’t exist). The replacement file is either nonexistent (step 1 of the writeFile() algorithm, open(), effectively never succeeded), existent but empty (step 1 of the writeFile() algorithm completed), or existent with some data (step 2 of the writeFile() algorithm partially completed).
The first step completed. The original file is intact (or if it didn’t exist before it still doesn’t exist). The replacement file exists with all of the data you want.
Both steps completed. At the path of the original file, you can now access your replacement data—all of it, not a blank file. The path you wrote the replacement data to in the first step no longer exists.
The code to use this pattern might look like the following:
const { writeFile, rename, } = require('fs');
function writeFileTransactional (path, content, cb) {
// The replacement file must be in the same directory as the
// destination because rename() does not work across device
// boundaries.
// This simple choice of replacement filename means that this
// function must never be called concurrently with itself for the
// same path value. Also, properly guarding against other
// processes trying to use the same temporary path would make this
// function more complicated. If that is a concern, a proper
// temporary file strategy should be used. However, this
// implementation ensures that any files left behind during an
// unclean termination will be cleaned up on a future run.
let temporaryPath = `${path}.new`;
writeFile(temporaryPath, content, function (err) {
if (err) {
return cb(err);
}
rename(temporaryPath, path, cb);
});
};
This is basically the same solution you’d use for the same problem in any langage/framework.
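Used with the data from the question, a call might look like this (a sketch; the payload is just the question's object):
writeFileTransactional("./data.json", JSON.stringify({
    totalPlayersOnline: this.totalPlayersOnline,
    previousDay: this.previousDay,
    gamesToday: this.gamesToday
}), function (err) {
    if (err) {
        return console.log(err);
    }
});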
If the error is caused by bad input (the data you want to write), then make sure the data is what it should be and then do the writeFile.
If the error is caused by a failure of writeFile even though the input is OK, you could keep retrying until the file is actually written. One way is to use the async library's doWhilst function.
async.doWhilst(
    function (callback) {
        writeFile(callback); // your function here; on failure, call back "success" so the loop runs again
    },
    check_if_file_null, // a test function returning true while the file is still null/empty (so the write is retried)
    function (err) {
        // here the file is not null
    }
);
I didn't run real tests with this; I just noticed, when manually reloading my IDE, that sometimes the file was empty.
What I tried first was the rename method, and I noted the same problem; also, recreating a new file was less desirable (considering file watchers etc.).
My suggestion, or what I'm doing now, is: in your own readFileSync wrapper, check if the file is missing or the data returned is empty, and sleep for 100 milliseconds before giving it another try. I suppose a third try with more delay would really push the sigma up a notch, but I'm currently not going to do it, as the added delay is hopefully an unnecessary negative (I would consider a promise at that point). There are other recovery options you can add relative to your own code, just in case. "File not found or empty?" is basically a retry by another route.
My custom writeFileSync has an added flag to toggle between using the rename method (with creation of a '._new' write sub-dir) and the normal direct method, as your code's needs may vary. Choosing based on file size is my recommendation.
In this use case the files are small and only updated by one node instance/server at a time. I can see adding a random file name as another option with rename, to allow multiple machines to write, as an option for later if needed. Maybe a retry-limit argument as well?
I was also thinking that you could write to a local temp file and then copy to the shared target by some means (maybe also rename on the target for a speed increase), and then clean up (unlink the local temp) of course. I guess that idea kind of pushes things toward shell commands, so it's not better.
Anyway, the main idea here is still to read twice if the file is found empty. I'm sure it's safe from being partially written, via Node.js 8+ onto a shared Ubuntu-type NFS mount, right?
