IBM MQ cmit and rollback with syncpoint - javascript

Infra-Overview:
I have a setup where I am reading a set of messages from IBM MQ and processing those messages in k8 cluster env and sending it to the destination host.
Issue:
I observed that the message volume is sometimes very high, and our pod fails and restarts before the messages are sent to the destination host. When that happens we lose all in-flight messages, because we follow the read-and-delete approach from the ibmmq example.
Expected Solution:
I am looking for a solution where we don't lose track of the messages until they have been sent to the destination host.
What I tried:
IBM MQ has the concept of a unit of work, but we can't afford a delay in reading and processing. I can't wait for a single message to be processed before reading the next one, as that might be a major performance setback.
Code language:
NodeJs

As the comments suggest there are a number of ways to skin this cat, but you will need to use transactions.
As soon as you create the connection with the transaction option, the transaction scope begins. This gets closed and next transaction begins when you either commit or rollback.
So you should handle the messages in batches that make sense for your application, and commit when the batch is complete. If your application is killed by k8s, all uncommitted read messages get rolled back. Combine this with a backout queue process to stop poison messages.
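A minimal sketch of the batching idea, written against injected callbacks so that it does not need a live queue manager. `getMessage`, `handle`, `commit` and `rollback` are hypothetical stand-ins: in a real app they would wrap the syncpoint get, the send to the destination host, `mq.Cmit` and `mq.Back`. The sketch is synchronous for brevity, whereas the real calls are asynchronous.

```javascript
// Process up to batchSize messages inside one unit of work.
// All four callbacks are hypothetical stand-ins supplied by the caller.
function processBatch({ getMessage, handle, commit, rollback, batchSize }) {
  let processed = 0;
  try {
    while (processed < batchSize) {
      const msg = getMessage();   // assumed to return null when the queue is empty
      if (msg === null) break;
      handle(msg);                // e.g. forward the message to the destination host
      processed += 1;
    }
    commit();                     // the whole batch is removed from the queue
    return { committed: true, processed };
  } catch (err) {
    rollback();                   // every message read in this batch goes back
    return { committed: false, processed, error: err };
  }
}
```

The batch size is the trade-off: a larger batch means fewer commits, but more messages are redelivered if the pod dies mid-batch.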
Section added to show sample code, and explanation of backout queues.
In your normal processing, if an app gets stopped before it has had time to process the message, you will want that message returned to the queue so that it is still available to be processed.
To enable this rollback you need to OR MQC.MQGMO_SYNCPOINT into the get message options
gmo.Options |= MQC.MQGMO_SYNCPOINT
Then if all goes well, you can commit.
mq.Cmit(hConn, function(err) {
if (err) {
debug_warn('Error on commit', err);
} else {
debug_info('Commit was successful');
}
});
or rollback
mq.Back(hConn, function(err) {
if (err) {
debug_warn('Error on rollback', err);
} else {
debug_info('rollback was successful');
}
});
If you roll back, the message goes back to the queue, which means it is also the next message that your app will read. This can generate a poison message loop. So you should also set up a backout queue, with pass-all-context permissions for your app user, and a backout threshold.
Say you set the threshold to 5. The message can be read 5 times, with rollback. Your app needs to check the threshold and decide that it is a poison message and move it off the queue.
To check the backout threshold (and the backout queue name) you can use the following code
// Remember to or in the Inquire option on the Open
openOptions |= MQC.MQOO_INQUIRE;
...
attrs = [ new mq.MQAttr(MQC.MQIA_BACKOUT_THRESHOLD),
new mq.MQAttr(MQC.MQCA_BACKOUT_REQ_Q_NAME) ];
mq.Inq(hObj, attrs, (err, selectors) => {
if (err) {
debug_warn('Error retrieving backout threshold', err);
} else {
debug_info('Attributes have been found');
selectors.forEach((s) => {
switch (s.selector) {
case MQC.MQIA_BACKOUT_THRESHOLD:
debug_info('Threshold is ', s.value);
break;
case MQC.MQCA_BACKOUT_REQ_Q_NAME:
debug_info('Backout queue is ', s.value);
break;
}
});
}
});
When getting the message your app can use mqmd.BackoutCount to check how often the message has been rolled back.
if (mqmd.BackoutCount >= threshold) {
...
}
What I have noticed is that if the same application instance repeatedly calls rollback on the same message, then an MQRC_HOBJ_ERROR error is thrown at the threshold. Your app can check for that error and then discard the message.
If it's a different app instance then it doesn't get the MQRC_HOBJ_ERROR error, so it can check the backout threshold and discard the message, remembering to commit the discard action.
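The decision described above can be sketched as a small pure function. The action names are hypothetical labels, and two things are assumptions of this sketch rather than documented behaviour: a threshold of 0 is treated as "no limit configured", and the queue name is trimmed in case it comes back blank-padded.

```javascript
// Decide what to do with a message that was just read under syncpoint.
// Assumption: a threshold of 0 means no backout limit is configured.
function nextAction(backoutCount, threshold, backoutQueueName) {
  const limited = threshold > 0;
  if (!limited || backoutCount < threshold) {
    return 'process';
  }
  // Poison message: move it if a backout queue is configured, else discard it.
  const name = (backoutQueueName || '').trim();
  return name.length > 0 ? 'move-to-backout-queue' : 'discard';
}
```

Whichever branch is taken for a poison message, the move or discard still has to be committed, otherwise the message returns to the queue again.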
See https://github.com/ibm-messaging/mq-dev-patterns/tree/master/transactions/JMS/SE for more information.
As an alternative you could use keda - https://keda.sh - which works with k8s to monitor your queue depth and scale according to the number of messages waiting to be processed, as opposed to CPU / memory consumption. That way you can scale up when there are lots of messages waiting to be processed, and slowly scale down when the queue becomes manageable. Here is a link to getting started - https://github.com/ibm-messaging/mq-dev-patterns/tree/master/Go-K8s - the example is for a Go app, but it applies equally to Node.js

Related

Only one message is received from SQS ( nodejs aws sdk)

I created an SQS queue with default settings. I published two messages to it, and I would like to read them back at the same time. I tried it like this:
const sqsClient = new SQSClient({ region: REGION });
const params = {
AttributeNames: ["SentTimestamp"],
MaxNumberOfMessages: 5,
MessageAttributeNames: ["All"],
QueueUrl: queueURL,
WaitTimeSeconds: 5,
};
const data = await sqsClient.send(new ReceiveMessageCommand(params));
const messages = data.Messages ?? [];
console.log(messages.length);
Unfortunately only one message is returned, no matter what I provide in MaxNumberOfMessages. What can cause this? How is it possible to fix this issue?
I was able to find a similar question, but it has only one answer, referring to a 3rd party library.
A ReceiveMessageCommand does not guarantee that you will get exactly the number of messages specified for MaxNumberOfMessages. In fact the documentation says the following:
Short poll is the default behavior where a weighted random set of machines is sampled on a ReceiveMessage call. Thus, only the messages on the sampled machines are returned. If the number of messages in the queue is small (fewer than 1,000), you most likely get fewer messages than you requested per ReceiveMessage call. If the number of messages in the queue is extremely small, you might not receive any messages in a particular ReceiveMessage response. If this happens, repeat the request.
You must use long-polling to receive multiple messages. This is essentially setting the WaitTimeSeconds to a greater value (5 seconds should be enough).
And you must have a larger number of messages in the queue to be able to fetch multiple messages with one call.
To summarize:
SQS is a distributed system; a short-poll call samples only a subset of its machines.
Messages are distributed across those machines. If you have a small number of messages, you might fetch only one message, or none.
Test your code with a larger set of sent messages and put your receiving call in a loop.
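A minimal sketch of "put your receiving call in a loop". `receive` is a hypothetical wrapper that resolves to an array of messages, and `maxEmptyPolls` is an assumed stop condition so the loop ends once the queue looks drained.

```javascript
// Keep polling until `wanted` messages are collected or several
// polls in a row come back empty.
// `receive` is a hypothetical async function returning an array of messages.
async function receiveUpTo(receive, wanted, maxEmptyPolls = 3) {
  const collected = [];
  let emptyPolls = 0;
  while (collected.length < wanted && emptyPolls < maxEmptyPolls) {
    const batch = await receive();
    if (batch.length === 0) {
      emptyPolls += 1;            // nothing this time, try again
    } else {
      emptyPolls = 0;
      collected.push(...batch);   // may end up with slightly more than `wanted`
    }
  }
  return collected;
}
```

With the code from the question, `receive` could be `async () => (await sqsClient.send(new ReceiveMessageCommand(params))).Messages ?? []`.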

Express server that creates maximum of 2 child/worker processes

I'm experimenting with node and its child_process module.
My goal is to create a server which will run on a maximum of 3 processes (1 main and optionally 2 children).
I'm aware that code below may be incorrect, but it displays interesting results.
const app = require ("express")();
const {fork} = require("child_process")
const maxChildrenRuning = 2
let childrenRunning = 0
app.get("/isprime", (req, res) => {
if(childrenRunning+1 <= maxChildrenRuning) {
childrenRunning+=1;
console.log(childrenRunning)
const childProcess = fork('./isprime.js');
childProcess.send({"number": parseInt(req.query.number)})
childProcess.on("message", message => {
console.log(message)
res.send(message)
childrenRunning-=1;
})
}
})
function isPrime(number) {
...
}
app.listen(8000, ()=>console.log("Listening on 8000") )
I launch 3 requests with numbers around 5*10^9.
After 30 seconds I receive 2 responses with correct results.
The CPU stops doing hard work and goes idle.
Surprisingly, after another 1 minute 30 seconds, one thread starts to process the still-pending 3rd request and finishes 30 seconds later with the correct answer. The console log is displayed below:
> node index.js
Listening on 8000
1
2
{ number: 5000000029, isPrime: true, time: 32471 }
{ number: 5000000039, isPrime: true, time: 32557 }
1
{ number: 5000000063, isPrime: true, time: 32251 }
Either express checks pending requests only once in a while, or my browser resends the request every so often while it is pending. Can anybody explain what is happening here and why? How can I correctly achieve my goal?
The way your server code is written, if you receive a /isprime request and two child processes are already running, your request handler for /isprime does nothing. It never sends any response. You don't pass that first if test and then nothing happens afterwards. So, that request will just sit there with the client waiting for a response. Depending upon the client, it will probably eventually time out as a dead/inactive request and the client will shut it down.
Some clients (like browsers) may assume that something just got lost in the network and they may retry the request by sending it again. It would be my guess that this is what is happening in your case. The browser eventually times out and then resends the request. By the time it retries, there are fewer than two child processes running so it gets processed on the retry.
You could verify that the browser is retrying automatically by going to the network tab in the Chrome debugger and watching exactly what the browser sends to your server and watch that third request, see it timeout and see if it is the browser retrying the request.
Note, this code seems to be only partially implemented because you initially start two child processes, but you don't reuse those child processes. Once they finish and you decrement childrenRunning, your code will then start another child process. Probably what you really want to do is to keep track of the two child processes you started and when one finishes, add it to an array of "available child processes" so when a new request comes in, you can just use an existing child process that is already started, but idle.
You also need to either queue incoming requests when all the child processes are full or you need to send some sort of error response to the http request. Never sending an http response to an incoming request is a poor design that just leads to great inefficiencies (connections hanging around much longer than needed that never actually accomplish anything).
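One way to queue incoming work while all workers are busy is a small limiter like the sketch below. `JobLimiter` and its `done` callback convention are hypothetical names for illustration, not part of express or child_process.

```javascript
// Run at most `max` jobs at once; extra jobs wait in a FIFO queue.
// A job is a function that receives a `done` callback and calls it when finished.
class JobLimiter {
  constructor(max) {
    this.max = max;
    this.running = 0;
    this.waiting = [];
  }

  submit(job) {
    if (this.running < this.max) {
      this.start(job);
    } else {
      this.waiting.push(job);     // held until a running job finishes
    }
  }

  start(job) {
    this.running += 1;
    let finished = false;
    job(() => {
      if (finished) return;       // ignore repeated done() calls
      finished = true;
      this.running -= 1;
      const next = this.waiting.shift();
      if (next) this.start(next);
    });
  }
}
```

In the /isprime handler, the job would fork (or reuse) the child, and its "message" listener would call both res.send(message) and done(), so every request eventually gets a response.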

How to run a Parse cloud job in batches of users to avoid time limit?

I'm currently running the following Parse cloud code job which iterates through every user in the database, and I'm starting to hit the 15 minute time limit. How can I set this to run in batches of 500 users at a time instead of all users at once? I would need it to run through the users in order, so user 1-500 for the first batch, then 500-1000 for the 2nd batch, and so on, that way it doesn't repeat anyone.
Parse.Cloud.job("MCBackground", function(request, status) {
// ... other code to setup usersQuery ...
Parse.Cloud.useMasterKey();
var usersQuery = new Parse.Query(Parse.User);
return usersQuery.each(function(user) {
return processUser(user)
.then(function(eBayResults) {
return mcComparison(user, eBayResults);
});
})
.then(function() {
// Set the job's success status
status.success("MCBackground completed successfully.");
}, function(error) {
// Set the job's error status
status.error("Got an error " + JSON.stringify(error));
});
});
Well, for starters, from your comment I see you have a constraint you can add: only query for user objects that have a matchCenterItem:
usersQuery.exists("matchCenterItem");
I don't know what percentage of your users lack this field, but this alone may already reduce the number of users to fetch.
I have no idea how often this job is run or what you are comparing with, but how likely is it that this data will change? Do you always need to run it on ALL users, or can you set a flag so that you don't run it on the same user again? At all, or until some time has passed?
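The "set a flag until some time has passed" idea can be combined with a batch size. The sketch below works on plain objects; `lastProcessedAt` is a hypothetical field, and in a real Parse job the same filter would be expressed as query constraints plus a limit rather than in memory.

```javascript
// Pick the next batch of users that have not been processed recently.
// `lastProcessedAt` is a hypothetical field: ms since epoch, or undefined
// if the user has never been processed.
function pickNextBatch(users, now, minIntervalMs, batchSize) {
  return users
    .filter((u) => u.lastProcessedAt === undefined ||
                   now - u.lastProcessedAt >= minIntervalMs)
    .slice(0, batchSize);
}
```

Each run then handles at most one batch and stamps the users it processed, so the next run picks up where this one stopped without repeating anyone.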

Postgresql. One process insert, second try to select but not found

Strange situation.
I am trying to build a chat application.
I use postgresql 9.3 and tomcat as the web server.
This is what happens when one browser sends a message to another:
1 - Browser A sends a message to the server (tomcat)
2 - Tomcat puts the message into the database and gets its id
INSERT INTO messages VALUES('first message') RETURNING id INTO MSGID
3 - Tomcat forwards the message to Browser B (websocket recipient)
4 - Browser B sends a system answer: MSGID_READED
5 - Tomcat updates the message in the database
UPDATE messages SET readtime = now() WHERE id = MSGID
It all works, but sometimes at point 5 the update can't find the message by MSGID...
This is very strange, because at point 2 I get the message record ID, but at point 5 the record is not found.
Could postgresql be writing slowly, so that this record is not yet visible from a parallel db connection?
UPDATE
I found a solution that works for me: just put the insert inside a begin/exception/end block.
BEGIN
INSERT INTO messages (...)
VALUES (...)
RETURNING id INTO MSGID;
EXCEPTION
WHEN unique_violation THEN
-- nothing
END;
UPDATE 2
In more detailed tests, the change above with the BEGIN block had no effect.
The solution was in JavaScript! I now send the websocket messages from another thread, and the problem is solved!
// WebSocket send message function
// Part of code. so is a web socket
send = function(msg) {
if (msg != null && msg != '') {
var f = function() {
var mm = msg;
// JCC.log('SENT: [' + mm + ']');
so.send(mm);
};
setTimeout(f, 1);
}
};
Ok, so the problem is that normally writers do not block readers. This means that your insert happens, and the update fires before the insert commits. This introduces a race condition in your application which causes the problem you see.
Your best option here is either to switch to serializable snapshot isolation or to do what you have done and do exception handling on the insert. One way or another you end up with additional exception handling (if serializable, then a serialization failure exception may sometimes happen and you may have to wait for it).
In your case, despite the performance penalty of exception handling in plpgsql, you are best off to do things the way you are currently doing them because that avoids the locking issues and waiting for the transaction to complete.

JavaScript: Using a queue for network communication

I'm working on a project in which a client must be able to communicate with a server via WebSockets. Since the application we develop has to be highly responsive to user input, we have decided to let a WebWorker do all the communication over the network so that a slow connection cannot interrupt the GUI. That works fine so far.
Now we thought about optimizations which will be necessary if the network is slow and the number of messages to send is high. Since most of these messages only synchronize other clients' user interfaces, we can drop some of them if necessary. But to do so we need a way to detect when there is congestion.
We came up with the idea of a queue: Each message to be sent is pushed in a queue and the WebWorker does nothing else than permanently iterating over the queue and sending all the messages it finds there. With that we can later let the Worker act differently if there is a certain number of elements in the queue (i.e. messages are sent too slowly). The idea is simple and straightforward, however, the implementation doesn't seem to be.
var ws;
var queue = new Array();
function processMessageQueue() {
while(true)
if(queue.length > 0) ws.send(queue.shift());
}
self.addEventListener('message', function(e) {
var msg = e.data;
switch(msg.type) {
case 'SEND':
queue.push(JSON.stringify(msg));
break;
case 'CREATE_WS':
ws = new WebSocket(msg.wsurl);
ws.onmessage = function(e) {
self.postMessage(JSON.parse(e.data));
}
processMessageQueue();
}
}, false);
As you can see, the worker creates the WebSocket as soon as it received the message to do so. It then executes the function processMessageQueue() which is the loop which permanently empties the queue by sending their data via the WebSocket. Now, my problem is that there seems to be no possibility to push messages into the queue. That should happen when a message of type 'SEND' arrives but it cannot for the Worker is too busy to handle any events. So it can either loop over the queue or push messages, not both.
What I need is a way to somehow push data on this queue. If that is not easily possible I would like to know if anyone can think of another way to find out when messages arrive faster than they are sent. Any hints?
Thanks in advance!
You have to give the worker time to handle the onmessage event. Doing while(true) will take all the processing time and not allow any events to be executed.
You could for instance change it to
function processMessageQueue() {
if(queue.length > 0) ws.send(queue.shift());
setTimeout(processMessageQueue, 50);
}
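A variation on the same idea, as a sketch: drain everything that is pending on each tick (a finite loop still returns control to the event loop, unlike while(true)) and report when the backlog passes a limit. `createOutbox`, `send` and `highWaterMark` are hypothetical names chosen for this example.

```javascript
// Queue that drains all pending messages per tick and flags congestion.
// `send` is whatever actually transmits a message, e.g. (m) => ws.send(m).
function createOutbox(send, highWaterMark) {
  const queue = [];
  return {
    // Returns true when the backlog has grown past highWaterMark.
    push(msg) {
      queue.push(msg);
      return queue.length > highWaterMark;
    },
    // Sends everything queued so far and returns how many were sent.
    drain() {
      let sent = 0;
      while (queue.length > 0) {
        send(queue.shift());
        sent += 1;
      }
      return sent;
    },
    size() {
      return queue.length;
    },
  };
}
```

In the worker, the 'SEND' case would call push() and a timer would call drain(). Because ws.send() only buffers data, the socket's bufferedAmount property may be a better congestion signal than the queue length alone.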
