Express server that creates a maximum of 2 child/worker processes - javascript

I'm experimenting with Node and its child_process module.
My goal is to create a server which will run on a maximum of 3 processes (1 main and optionally 2 children).
I'm aware that the code below may be incorrect, but it displays interesting results.
const app = require("express")();
const { fork } = require("child_process");

const maxChildrenRunning = 2;
let childrenRunning = 0;

app.get("/isprime", (req, res) => {
    if (childrenRunning + 1 <= maxChildrenRunning) {
        childrenRunning += 1;
        console.log(childrenRunning);
        const childProcess = fork('./isprime.js');
        childProcess.send({ "number": parseInt(req.query.number) });
        childProcess.on("message", message => {
            console.log(message);
            res.send(message);
            childrenRunning -= 1;
        });
    }
});

function isPrime(number) {
    ...
}

app.listen(8000, () => console.log("Listening on 8000"));
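(The question does not show isprime.js; a minimal child that would fit the send()/on("message") protocol and the logged output shape might look like this sketch, which is an assumption rather than the asker's actual code:)

// isprime.js - hypothetical child side, inferred from the parent's
// send()/on("message") protocol and the logged { number, isPrime, time } output
process.on("message", ({ number }) => {
    const start = Date.now();
    let isPrime = number > 1;
    // naive O(n) scan; this is the kind of loop that takes ~30 s for 5*10^9
    for (let i = 2; i < number; i++) {
        if (number % i === 0) { isPrime = false; break; }
    }
    process.send({ number, isPrime, time: Date.now() - start });
    process.exit(0); // one job per fork in the question's design
});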
I'm launching 3 requests with numbers around 5*10^9.
After 30 seconds I receive 2 responses with correct results, and the CPU stops doing hard work and goes idle.
Surprisingly, after another 1 minute 30 seconds, one process starts working on the still-pending 3rd request and finishes after another 30 seconds with the correct answer. Console log displayed below:
> node index.js
Listening on 8000
1
2
{ number: 5000000029, isPrime: true, time: 32471 }
{ number: 5000000039, isPrime: true, time: 32557 }
1
{ number: 5000000063, isPrime: true, time: 32251 }
Either Express checks pending requests once in a while, or my browser re-sends the actual request every so often while it is pending. Can anybody explain what is happening here and why? How can I correctly achieve my goal?

The way your server code is written, if you receive a /isprime request and two child processes are already running, your request handler for /isprime does nothing. It never sends any response. You don't pass that first if test and then nothing happens afterwards. So, that request will just sit there with the client waiting for a response. Depending upon the client, it will probably eventually time out as a dead/inactive request and the client will shut it down.
Some clients (like browsers) may assume that something just got lost in the network and they may retry the request by sending it again. It would be my guess that this is what is happening in your case. The browser eventually times out and then resends the request. By the time it retries, there are fewer than two child processes running, so it gets processed on the retry.
You could verify that the browser is retrying automatically by going to the network tab in the Chrome debugger and watching exactly what the browser sends to your server: watch that third request, see it time out, and see whether it is the browser retrying the request.
Note, this code seems to be only partially implemented because you initially start two child processes, but you don't reuse those child processes. Once they finish and you decrement childrenRunning, your code will then start another child process. Probably what you really want to do is keep track of the two child processes you started and, when one finishes, add it to an array of "available child processes" so that when a new request comes in, you can just use an existing child process that is already started, but idle.
You also need to either queue incoming requests when all the child processes are busy, or send some sort of error response to the HTTP request. Never sending an HTTP response to an incoming request is a poor design that just leads to great inefficiencies (connections hanging around much longer than needed that never actually accomplish anything).
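For illustration, here is a minimal sketch of that pool-plus-queue design. It assumes isprime.js is changed to stay alive and reply with exactly one message per request (rather than exiting after one job); the availableChildren and pendingRequests names are mine, not from the question:

const app = require("express")();
const { fork } = require("child_process");

const availableChildren = [];  // idle, already-started workers
const pendingRequests = [];    // requests waiting for a free worker
for (let i = 0; i < 2; i++) {
    availableChildren.push(fork("./isprime.js"));
}

function runOnChild(child, req, res) {
    child.once("message", (message) => {
        res.send(message);
        const next = pendingRequests.shift();
        if (next) runOnChild(child, next.req, next.res); // reuse the worker
        else availableChildren.push(child);              // or return it to the pool
    });
    child.send({ number: parseInt(req.query.number) });
}

app.get("/isprime", (req, res) => {
    const child = availableChildren.pop();
    if (child) runOnChild(child, req, res);
    else pendingRequests.push({ req, res }); // queue instead of silently dropping
});

app.listen(8000, () => console.log("Listening on 8000"));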

Related

Only one message is received from SQS (nodejs aws sdk)

I created an SQS queue with default settings. I published two messages to it, and I would like to read them back at the same time. I tried it like this:
const { SQSClient, ReceiveMessageCommand } = require("@aws-sdk/client-sqs");

const sqsClient = new SQSClient({ region: REGION });
const params = {
    AttributeNames: ["SentTimestamp"],
    MaxNumberOfMessages: 5,
    MessageAttributeNames: ["All"],
    QueueUrl: queueURL,
    WaitTimeSeconds: 5,
};
const data = await sqsClient.send(new ReceiveMessageCommand(params));
const messages = data.Messages ?? [];
console.log(messages.length);
Unfortunately only one message is returned, no matter what I provide in MaxNumberOfMessages. What can cause this? How is it possible to fix this issue?
I was able to find a similar question, but it has only one answer, referring to a 3rd-party library.
A ReceiveMessageCommand does not guarantee that you will get exactly the number of messages specified for MaxNumberOfMessages. In fact the documentation says the following:
Short poll is the default behavior where a weighted random set of machines is sampled on a ReceiveMessage call. Thus, only the messages on the sampled machines are returned. If the number of messages in the queue is small (fewer than 1,000), you most likely get fewer messages than you requested per ReceiveMessage call. If the number of messages in the queue is extremely small, you might not receive any messages in a particular ReceiveMessage response. If this happens, repeat the request.
You must use long-polling to receive multiple messages. This is essentially setting the WaitTimeSeconds to a greater value (5 seconds should be enough).
And you must have a larger number of messages in the queue to be able to fetch multiple messages with one call.
To summarize:
SQS is a distributed system; each call will poll one machine only.
Messages are distributed across those machines; if you have a small number of messages, it might happen that you fetch only one message, or none.
Test your code with a larger set of sent messages and put your receiving call in a loop, as sketched below.
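A minimal sketch of such a loop, reusing the question's REGION and queueURL (a real consumer would also delete each message after processing, which is omitted here):

const { SQSClient, ReceiveMessageCommand } = require("@aws-sdk/client-sqs");

const sqsClient = new SQSClient({ region: REGION });

async function receiveAll(queueURL) {
    const all = [];
    while (true) {
        const data = await sqsClient.send(new ReceiveMessageCommand({
            QueueUrl: queueURL,
            MaxNumberOfMessages: 10, // upper bound per call, not a guarantee
            WaitTimeSeconds: 20,     // long polling; 20 seconds is the maximum
        }));
        const messages = data.Messages ?? [];
        if (messages.length === 0) break; // queue drained, as far as we can tell
        all.push(...messages);
    }
    return all;
}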

Express retriggering API endpoint on request timeouts

Context:
My Express.js web server is currently serving an API which wraps a SOAP service (some legacy service which I can't change). The SOAP service takes a dynamic number of items to process and takes about 1.5 seconds to process each request. The Nginx server has a timeout of 60 seconds.
Problem:
For a request to this API which, let's say, takes more than 60 seconds to complete, I am observing that the service is getting re-triggered automatically (I am assuming by Express.js). So if in the original request I was expecting to insert, let's say, 50 records into a table, now due to the re-triggering of the API I am ending up with 100 records inserted (duplication).
Here is a skeleton/sample of log that kind of shows the issue: (sensitive info stripped)
January 10, 2022 15:35:44 [... ee905] - Starting myAwesomeAPI() <-- Original API trigger
January 10, 2022 15:36:44 [... ff870] - Starting myAwesomeAPI() <-- Re-trigger happens
January 10, 2022 15:36:54 [... ee905] - Completed myAwesomeAPI() <-- Original API ends (inserts 50 records in the table)
January 10, 2022 15:37:54 [... ff870] - Completed myAwesomeAPI() <-- Re-triggered API ends (inserting another 50 records in the table resulting in duplication)
What I have tried:
To reproduce the issue and check whether the re-triggering can happen independently of Nginx, I kept the Nginx timeout at 60 seconds, changed my Express server's timeout to 10 seconds, and used 15 items to process (to force a timeout before processing can complete):
const express = require("express");
const server = express();
server.setTimeout(10000); // sets all requests to have a 10-second timeout
// myAwesomeAPI code
Testing showed that after 10 seconds, the timeout "did" re-trigger the API and the 15 items were duplicated (I saw 30 records inserted). So this tells me that the API is getting re-triggered by Express.js.
Question(s):
How do I stop the re-trigger from happening? Is there an Express server configuration to enable/disable the auto re-triggering on timeout?
Solutions & Ideas:
Since the max items = 100 (set by the team), increasing the Nginx and Express.js timeouts to 300 seconds should be a quick but dirty fix. I understand that tying async API calls to some approximation of time is pure foolishness (tell me about trying to explain this to other engineers in my team ;-p), so I would like to avoid this approach.
Create a composite key from some combination of columns and enforce insert restrictions on the table. Combine this with checking whether the composite key is already present in the table, and decide to skip or insert. This approach seems a bit better.
Another approach can be to respond to the API call immediately on receipt (which will close the request) and then continue with the request processing. Something like this (inspiration): https://www.bennadel.com/blog/3275-you-can-continue-to-process-an-express-js-request-after-the-client-response-has-been-sent.htm.
This will make me independent of the platform's timeout settings, but it will take away the real-time nature of the response being delivered with statuses for different items, and it adds a bit more complexity for tracking request statuses via other lookups etc.
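A minimal sketch of that last idea, assuming an Express app with the express.json() body parser and a hypothetical processItems() that wraps the SOAP calls:

const express = require("express");
const { randomUUID } = require("crypto");

const app = express();
app.use(express.json());

const jobs = new Map(); // jobId -> { status, result }

app.post("/process", (req, res) => {
    const jobId = randomUUID();
    jobs.set(jobId, { status: "running", result: null });
    res.status(202).json({ jobId }); // close the HTTP request immediately

    processItems(req.body.items)     // hypothetical wrapper around the SOAP calls
        .then((result) => jobs.set(jobId, { status: "done", result }))
        .catch((err) => jobs.set(jobId, { status: "failed", result: String(err) }));
});

app.get("/process/:jobId", (req, res) => { // clients poll here for status
    const job = jobs.get(req.params.jobId);
    if (!job) return res.status(404).end();
    res.json(job);
});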
If you have the ability to alter the front end, you can add a transaction ID to each request. Store the transaction routine in an object keyed by the transaction ID; then, if you get a retried API request for an ongoing transaction, you can simply await the transaction already in flight.
Something like this:
let transactions = {};

router.get('/myapi', async (req, res, next) => {
    try {
        let { transactionID } = req.query;
        delete req.query.transactionID;
        let transaction = transactions[transactionID];
        if (!transaction) {
            transaction = (async () => {
                let ret = await SOAPCall(req.query);
                // hold onto the finished transaction for some period of time
                let to = setTimeout(() => {
                    delete transactions[transactionID];
                }, 5000);
                to.unref(); // don't hold up process exit
                return ret;
            })();
            transactions[transactionID] = transaction;
        }
        // a retried request with the same ID awaits the same ongoing promise
        let ret = await transaction;
        res.json(ret);
    }
    catch (err) { next(err) }
});

Is it possible to pause/suspend (don’t accept(2) on the socket) a Node.js server?

Goal: To have a Node.js server where only one connection is active at a time.
I can temporarily remove the connection event listener on the server, or only set it up once in the first place by calling once instead of on, but then any connection that gets made while there is no connection event listener seems to get lost. From strace, I can see that Node is still accept(2)ing on the socket. Is it possible to get it to not do that, so that the kernel will instead queue up all incoming requests until the server is ready to accept them again (or the backlog configured in listen(2) is exceeded)?
Example code that doesn’t work as I want it to:
#!/usr/bin/node
const net = require("net");

const server = net.createServer();

function onConnection(socket) {
    socket.on("close", () => server.once("connection", onConnection));
    let count = 0;
    socket.on("data", (buffer) => {
        count += buffer.length;
        if (count >= 16) {
            socket.end();
        }
        console.log("read " + count + " bytes total on this connection");
    });
}

server.once("connection", onConnection);
server.listen(8080);
Connect to localhost, port 8080, with the agent of your choice (nc, socat, telnet, …).
Send less than 16 bytes, and witness the server logging to the terminal.
Without killing the first agent, connect a second time in another terminal. Try to send any number of bytes – the server will not log anything.
Send more bytes on the first connection, so that the total number of bytes sent there exceeds 16. The server will close this connection (and again log this to the console).
Send yet more bytes on the second connection. Nothing will happen.
I would like the second connection to block until the first one is over, and then to be handled normally. Is this possible?
.. so that the kernel will instead queue up all incoming requests until the server is ready to accept them again (or the backlog configured in listen(2) is exceeded)?
...
I would like the second connection to block until the first one is over, and then to be handled normally. Is this possible?
Unfortunately, it is not possible without catching the connection events that are sent and managing the accepted connections in your application rather than with the OS backlog. node calls libuv with an OnConnection callback that will try to accept all connections and make them available in the JS context.
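As an illustration of managing them in the application, here is a minimal sketch reworking the example above: every connection is still accept(2)ed by the OS, but each socket is paused and queued until the active one closes, which approximates the desired one-at-a-time behavior:

#!/usr/bin/node
const net = require("net");

const waiting = []; // sockets accepted by the OS but not yet being served
let busy = false;

function serve(socket) {
    busy = true;
    socket.resume(); // start delivering the buffered "data" events
    let count = 0;
    socket.on("data", (buffer) => {
        count += buffer.length;
        if (count >= 16) socket.end();
        console.log("read " + count + " bytes total on this connection");
    });
    socket.on("close", () => {
        busy = false;
        const next = waiting.shift();
        if (next) serve(next); // handle the next queued connection
    });
}

const server = net.createServer((socket) => {
    socket.on("error", () => {}); // don't crash on client resets while queued
    socket.pause();               // buffer incoming data instead of emitting it
    if (busy) waiting.push(socket);
    else serve(socket);
});
server.listen(8080);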

Postgresql. One process inserts, a second tries to select but does not find the row

Strange situation.
I'm trying to build a chat application.
I use PostgreSQL 9.3 and Tomcat as the web server.
Here is what happens when one browser sends a message to another:
1 - Browser A sends a message to the server (Tomcat)
2 - Tomcat puts the msg into the database and gets its id
INSERT INTO messages VALUES ('first message') RETURNING id INTO MSGID;
3 - Tomcat resends the message to Browser B (websocket recipient)
4 - Browser B sends a system answer: MSGID_READED
5 - Tomcat updates the message in the database
UPDATE messages SET readtime = now() WHERE id = MSGID;
Everything works, but sometimes at step 5 the UPDATE can't find the message by MSGID...
Very strange, because at step 2 I get the message record ID, but at step 5 the row is not found.
Could PostgreSQL be writing slowly, so that the record is not yet visible from a parallel DB connection?
UPDATE
I found a solution for me: just put the insert inside a BEGIN/EXCEPTION/END block.
BEGIN
    INSERT INTO messages (...)
    VALUES (...)
    RETURNING id INTO MSGID;
EXCEPTION
    WHEN unique_violation THEN
        -- nothing
END;
UPDATE 2
In more detailed tests, the BEGIN block change above had no effect.
The solution was in JavaScript! I sent the websocket messages deferred via setTimeout, and the problem was solved!
// WebSocket send message function
// Part of code. so is a web socket
send = function (msg) {
    if (msg != null && msg != '') {
        var f = function () {
            var mm = msg;
            // JCC.log('SENT: [' + mm + ']');
            so.send(mm);
        };
        setTimeout(f, 1);
    }
};
Ok, so the problem is that normally writers do not block readers. This means that your insert happens, and the update fires before the insert's transaction commits. This introduces a race condition in your application, which produces the problem you see.
Your best option here is either to switch to serializable snapshot isolation or to do what you have done and handle the exception on the insert. One way or another you end up with additional exception handling (if serializable, then a serialization failure exception may sometimes happen and you may have to retry the transaction).
In your case, despite the performance penalty of exception handling in plpgsql, you are best off doing things the way you are currently doing them, because that avoids the locking issues and waiting for the transaction to complete.
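The original application is Java/Tomcat, but as a sketch of the serializable-plus-retry idea in JavaScript, using the node-postgres (pg) client (the body column is hypothetical, and 40001 is PostgreSQL's serialization_failure SQLSTATE):

const { Pool } = require("pg"); // node-postgres
const pool = new Pool();        // connection settings from the usual PG* env vars

async function insertMessage(text) {
    for (;;) { // retry until the transaction commits
        const client = await pool.connect();
        try {
            await client.query("BEGIN ISOLATION LEVEL SERIALIZABLE");
            const { rows } = await client.query(
                "INSERT INTO messages (body) VALUES ($1) RETURNING id", [text]);
            await client.query("COMMIT");
            return rows[0].id;
        } catch (err) {
            try { await client.query("ROLLBACK"); } catch {}
            if (err.code !== "40001") throw err; // only retry serialization failures
        } finally {
            client.release();
        }
    }
}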

Why does the browser freeze when executing many ajax requests?

I have a page that executes around 200 ajax requests using jquery.load, but it behaves in a very un-ajax way because the browser is frozen while the results are fetched.
By freezing I mean losing control of the browser, not even being able to scroll it up and down. Then the results all display at once when it has finished all the requests, even though I know it is actually fetching the results 6 at a time (the browser-controlled "same host" connection limit) from watching the access log of the target server.
Though the jquery.load commands are built using a "foreach" loop, they are already written into the source of the page when the user loads it (so for all intents and purposes they could all be hand-written individually), so it's not as if the page is waiting for the loop to finish. The last "symptom" is that even with only 30 requests instead, the issue is just the same.
So it's odd to me and I am looking for ideas about what could cause this and how it could be worked around. It's definitely confusing to the end user, especially as it could take 90-100 seconds until all the responses are back and the user regains control of the browser.
One small update:
I have very similar code running in another webapp that does around 20 requests simultaneously without issue. The difference is that instead of fetching a page, it is ssh'ing to the server and reading/updating a file on the file system via a script. I would have thought that would actually have a little more overhead, but it has none of these issues.
And as I have said, even 20 requests cause the same issue with the code in question... so I am tempted to think it's perhaps curl related... though that's pure speculation.
The Bigger Update: Now with infinitely more Code!!!
The fuller background to app is this. We run a cluster of some of the highest trafficked WebSphere AppServers in the world, which are running our Commerce applications. The intensity of the traffic means that if we simply let traffic on to an appserver before the JVM is warmed up, they crash! So we hit a few key pages before allowing traffic on, as this precompiles all the major servlets, proportions the JVM, and populates some of the servlet caches. Then the traffic can come onto the server with no issues and they run great.
We had a version of the app written in CGI which worked but was so slow due to being synchronous. We are talking about 10 minutes on some clusters to run. Due to being synchronous requests, only one thread on the appserver and one jdbc connection was being used.
So what the new webapp does is use a template of these key pages, to combine with a bunch of market definitions (country code, language code, catalog id's etc) to produce a list of all those URL's that need to be hit. By hitting them all in an asynchronous way it not only runs faster (now taking only 90 seconds), it also does a better job of proportioning the JVM, uses up to 30 threads and opens the JDBC pool to its full number of connections. Thus it's REALLY in a production-like state by the time we let traffic on. So I am very pleased with results, but this browser freeze is annoying me from a purely cosmetic and puzzle-solving point of view.
So now some code. The user simply selects an appserver, the app decides which cluster it is from, and displays the list of computed URLs it will hit. At this point the page is a table of 'Markets x Urls', with each cell having a unique id that the jQuery uses to put the right result in the right cell (as we can't guarantee the order in which the results come back - nor do we want to, as that takes us back into synchronous territory again).
So at the point at which the user is ready to click Go, the table is written and the jQuery commands prepared. On clicking Go, the jQuery script is executed and the URLs are hit, each returning an HTTP status code so we know whether they were successful.
The JQ part generated looks like (shortened to just a few markets)
$("a#submit").click(function(event) {
alert(" booya ");
$("#sesv-1").load("psurl.php?server=servera.domain.com&url=/se/sv");
$("#sesv-2").load("psurl.php?server=servera.domain.com&url=/se/sv/catalog/productsaz/");
$("#sesv-3").load("psurl.php?server=servera.domain.com&url=/se/sv/catalog/products/12345678");
$("#sesv-4").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?storeId=14&productId=103406&StoreNumber=099&langId=-13&ddkey=http:StockSearch");
$("#sesv-5").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?query=testProd&storeId=14&langId=-11&StoreNumber=011");
$("#atde-1").load("psurl.php?server=servera.domain.com&url=/at/de");
$("#atde-2").load("psurl.php?server=servera.domain.com&url=/at/de/catalog/productsaz/");
$("#atde-3").load("psurl.php?server=servera.domain.com&url=/at/de/catalog/products/12345678");
$("#atde-4").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?storeId=1&productId=103406&StoreNumber=114&langId=-99&ddkey=http:StockSearch");
$("#atde-5").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?query=testProd&storeId=1&langId=-21&StoreNumber=273");
$("#benl-1").load("psurl.php?server=servera.domain.com&url=/be/nl");
$("#benl-2").load("psurl.php?server=servera.domain.com&url=/be/nl/catalog/productsaz/");
$("#benl-3").load("psurl.php?server=servera.domain.com&url=/be/nl/catalog/products/12345678");
$("#benl-4").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?storeId=18&productId=103406&StoreNumber=412&langId=-44&ddkey=http:StockSearch");
$("#benl-5").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?query=testProd&storeId=18&langId=-23&StoreNumber=482");
$("#befr-1").load("psurl.php?server=servera.domain.com&url=/be/fr");
$("#befr-2").load("psurl.php?server=servera.domain.com&url=/be/fr/catalog/productsaz/");
$("#befr-3").load("psurl.php?server=servera.domain.com&url=/be/fr/catalog/products/12345678");
$("#befr-4").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?storeId=130&productId=103406&StoreNumber=048&langId=-73&ddkey=http:StockSearch");
$("#befr-5").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?query=testProd&storeId=130&langId=-24&StoreNumber=482");
$("#caen-1").load("psurl.php?server=servera.domain.com&url=/ca/en");
$("#caen-2").load("psurl.php?server=servera.domain.com&url=/ca/en/catalog/productsaz/");
$("#caen-3").load("psurl.php?server=servera.domain.com&url=/ca/en/catalog/products/12345678");
$("#caen-4").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?storeId=30&productId=103406&StoreNumber=006&langId=-11&ddkey=http:StockSearch");
$("#caen-5").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?query=testProd&storeId=30&langId=-15&StoreNumber=216");
$("#cafr-1").load("psurl.php?server=servera.domain.com&url=/ca/fr");
$("#cafr-2").load("psurl.php?server=servera.domain.com&url=/ca/fr/catalog/productsaz/");
$("#cafr-3").load("psurl.php?server=servera.domain.com&url=/ca/fr/catalog/products/12345678");
$("#cafr-4").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?storeId=33&productId=103406&StoreNumber=124&langId=-09&ddkey=http:StockSearch");
$("#cafr-5").load("psurl.php?server=servera.domain.com&url=/webapp/wcs/stores/servlet/StockSearch?query=testProd&storeId=33&langId=-16&StoreNumber=216")
});
});
psurl.php is simply a curl request function that responds with 404, 200, 500, etc., which is then used to populate the relevant cell.
function getPage( $url ) {
    $options = array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,       // return web page
        CURLOPT_HEADER         => true,       // return headers
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects
        CURLOPT_ENCODING       => "",         // handle all encodings
        CURLOPT_USERAGENT      => "pre-surf", // who am i
        CURLOPT_AUTOREFERER    => true,       // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,        // timeout on connect
        CURLOPT_TIMEOUT        => 120,        // timeout on response
        CURLOPT_MAXREDIRS      => 10,         // stop after 10 redirects
        CURLOPT_POST           => 0,          // i am not sending post data
        CURLOPT_SSL_VERIFYHOST => 0,          // don't verify ssl
        CURLOPT_SSL_VERIFYPEER => false,
    );

    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $content = curl_exec($ch);
    $err = curl_errno($ch);
    $errmsg = curl_error($ch);
    $header = curl_getinfo($ch);
    curl_close($ch);

    // $header['errno'] = $err;
    // $header['errmsg'] = $errmsg;
    // $header['content'] = $content;

    return $header['http_code']; // curl_getinfo() reports the status as 'http_code'
}
The problem here is not the Ajax requests; the problem is that each one of those requests updates the DOM. The browser redraw is what is causing the browser to lock up.
You need to find a better solution that does not write to the DOM so often.
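For example, the generated handler above could buffer the responses and flush them to the table in batches, so the browser repaints a handful of times instead of once per response. A minimal sketch (the 250 ms flush interval is arbitrary):

$("a#submit").click(function (event) {
    var pending = {};   // buffer: cell id -> response text
    var flushTimer = null;

    function flush() {  // one burst of DOM writes instead of 200 separate ones
        for (var id in pending) {
            $("#" + id).html(pending[id]);
        }
        pending = {};
        flushTimer = null;
    }

    function check(id, path) {
        $.get("psurl.php", { server: "servera.domain.com", url: path }, function (data) {
            pending[id] = data; // no DOM touch per response
            if (!flushTimer) flushTimer = setTimeout(flush, 250);
        });
    }

    check("sesv-1", "/se/sv");
    check("sesv-2", "/se/sv/catalog/productsaz/");
    // ... one check() call per market/URL cell, as in the original list
});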
