RxJS share memory across node.js instances using Redis

We're working on a project where we are creating an event processor using RxJS. We have a few 'rules', so to speak, where input is provided from a few different sources and output has to be generated based on the number of times an input is above a set value (a simple rule).
Now, all this works without any problems, but we want to move the project from beta to production. This means running multiple instances of Node.js with RxJS on top of it.
We're wondering if it's possible for RxJS to share its memory using Redis, for example. That way, when one of the instances dies for whatever reason, another one can pick up where the dead one stopped, ensuring that the count of how many times the value exceeded the threshold is retained.
This would also allow us to spread the load over multiple instances if the 'rules' get more complex and the amount of data increases.
Is something like this possible with RxJS, or should we build our own administration around it?

You can't share memory between Node.js processes, as far as I know. Doing so would be super-unsafe, since you'd then be dealing with concurrency problems that can't be mitigated in JavaScript (what happens when one process interrupts another?).
That said, you can pass messages back and forth with Redis. Generally, what I do is establish a work queue as a Redis list. Some servers push into the work queue, and some workers pull from the queue, process the data, and deal with the results as needed.
A concrete example is generating outbound emails in response to some REST event (a new post or whatever). The webapp does an LPUSH to a known queue. The worker process can use BRPOPLPUSH to atomically pull an entry from the work queue and push it into an "in process" queue. Once the mail is sent, it can be removed from the in-process queue. If there's a server crash or a long timeout, entries in the in-process queue can be pushed back into the work queue and retried.
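A rough sketch of that pattern using the ioredis client (the queue names, the handleJob function, and the error handling are illustrative assumptions, not part of the original answer):
const Redis = require('ioredis');
const redis = new Redis(); // defaults to localhost:6379

// Producer: push a unit of work onto the queue.
async function enqueue(job) {
  await redis.lpush('work', JSON.stringify(job));
}

// Worker: atomically move a job into an "in process" list, handle it, then
// remove it. If the process dies mid-job, the entry remains in 'inprocess'
// and a recovery task can push it back onto 'work' to be retried.
async function workLoop() {
  for (;;) {
    const raw = await redis.brpoplpush('work', 'inprocess', 0); // block until work arrives
    try {
      await handleJob(JSON.parse(raw)); // stand-in for sending the email, etc.
      await redis.lrem('inprocess', 1, raw);
    } catch (err) {
      console.error('job failed; leaving it in the in-process list', err);
    }
  }
}

async function handleJob(job) { /* hypothetical business logic */ }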
But you're not going to get a fancy shared-memory solution here, I don't think at least.

Related

Meteor server restarts itself upon slow requests

I have a Meteor app that is performing some calls that are currently hanging. I'm processing a lot of items in a loop and then upserting to server-side Mongo (I think this is done asynchronously). I understand that upserting in a loop is not good.
This whole piece of functionality seems to make the app hang for a while. I'm even noticing sockjs and websocket errors in the console. I think this is all due to DDP, async Mongo upserts, and the slow requests.
Here's some pseudocode for what I'm talking about:
for (const record of lotsOfRecords) {
  // Is this async?
  Collection.upsert(record);
}
Eventually this function will complete. However, I'll notice that Meteor "restarts" (I think this is the case because I see Accounts.onLogin being called again). It's almost like the client refreshes after the slow request has actually finished. This results in something that looks like an infinite loop.
My question is why the app is "restarting". Is this due to something in the framework and how it handles slow requests? I.e. does it queue up all bad requests and then eventually retry them automatically?
I am not sure exactly what is going on here, but it sounds like the client isn't able to reach the server while it is "busy", the client's DDP connection then times out, and it ends up with a client refresh. The server process probably doesn't restart.
One technique for improving this is to implement a queue in your database. One piece of code detects there are a bunch of database upserts to do, so it records the information in a table which is used as a queue.
You set up a cron job (using e.g. the npm module node-cron) that looks for items in the queue on a regular basis. When it finds an unprocessed record, it does the upsert work needed and then either updates a status value in the queue record to 'done' or simply deletes it from the queue. You can decide how many records to process at a time to minimise interruptions.
Another approach is to do the processing in another node process on your server, basically like a worker process. If this process is busy, it is not going to impact your front end. The same queueing technique can be used to make sure this doesn't get bogged down either.
You lose a little reactivity this way, but given it's some kind of bulk process, that shouldn't matter.
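A rough sketch combining both suggestions (a separate Node worker process driven by node-cron; the collection names, the status field, and the schedule are illustrative assumptions, not Meteor APIs):
const cron = require('node-cron');
const { MongoClient } = require('mongodb');

const BATCH_SIZE = 100; // bound each run so no single pass hogs the event loop

async function main() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const db = client.db('app');
  const queue = db.collection('upsertQueue');   // hypothetical queue collection
  const records = db.collection('records');     // hypothetical target collection

  // Run every 30 seconds and process at most BATCH_SIZE queued items.
  cron.schedule('*/30 * * * * *', async () => {
    const batch = await queue.find({ status: 'pending' }).limit(BATCH_SIZE).toArray();
    for (const item of batch) {
      await records.updateOne({ _id: item.recordId }, { $set: item.payload }, { upsert: true });
      await queue.updateOne({ _id: item._id }, { $set: { status: 'done' } });
    }
  });
}

main().catch(console.error);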

Bull Queue Concurrency Questions

I need help understanding how Bull Queue (bull.js) processes concurrent jobs.
Suppose I have 10 Node.js instances that each instantiate a Bull Queue connected to the same Redis instance:
const bullQueue = require('bull');
const queue = new bullQueue('taskqueue', { /* ... */ });
const concurrency = 5;
queue.process('jobTypeA', concurrency, job => { /* ...do something... */ });
Does this mean that globally across all 10 node instances there will be a maximum of 5 (concurrency) concurrently running jobs of type jobTypeA? Or am I misunderstanding and the concurrency setting is per-Node instance?
What happens if one Node instance specifies a different concurrency value?
Can I be certain that jobs will not be processed by more than one Node instance?
The TL;DR is: under normal conditions, jobs are processed only once. If things go wrong (say, the Node.js process crashes), jobs may be double processed.
Quoting from Bull's official README.md:
Important Notes
The queue aims for an "at least once" working strategy. This means that in some situations, a job could be processed more than once. This mostly happens when a worker fails to keep a lock for a given job during the total duration of the processing.
When a worker is processing a job it will keep the job "locked" so other workers can't process it.
It's important to understand how locking works to prevent your jobs from losing their lock - becoming stalled - and being restarted as a result. Locking is implemented internally by creating a lock for lockDuration on interval lockRenewTime (which is usually half lockDuration). If lockDuration elapses before the lock can be renewed, the job will be considered stalled and is automatically restarted; it will be double processed. This can happen when:
The Node process running your job processor unexpectedly terminates.
Your job processor was too CPU-intensive and stalled the Node event loop, and as a result, Bull couldn't renew the job lock (see #488 for how we might better detect this). You can fix this by breaking your job processor into smaller parts so that no single part can block the Node event loop. Alternatively, you can pass a larger value for the lockDuration setting (with the tradeoff being that it will take longer to recognize a real stalled job).
As such, you should always listen for the stalled event and log this to your error monitoring system, as this means your jobs are likely getting double-processed.
As a safeguard so problematic jobs won't get restarted indefinitely (e.g. if the job processor always crashes its Node process), jobs will be recovered from a stalled state a maximum of maxStalledCount times (default: 1).
Bull is designed for processing jobs concurrently with "at least once" semantics, although if the processors are working correctly (i.e. not stalling or crashing) it in fact delivers "exactly once". However, you can set the maximum stalled retries to 0 (maxStalledCount, https://github.com/OptimalBits/bull/blob/develop/REFERENCE.md#queue) and then the semantics become "at most once".
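For example, the setting can be passed when constructing the queue (a sketch; see the REFERENCE.md link above for the exact options):
const Queue = require('bull');

// With maxStalledCount set to 0, a job whose lock is lost is failed rather
// than retried, which gives "at most once" processing.
const queue = new Queue('taskqueue', {
  settings: { maxStalledCount: 0 }
});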
Having said that, I will try to answer the two questions asked by the poster:
What happens if one Node instance specifies a different concurrency value?
I will assume you mean "queue instance". If so, the concurrency is specified in the processor. If the concurrency is X, what happens is that at most X jobs will be processed concurrently by that given processor.
Can I be certain that jobs will not be processed by more than one Node instance?
Yes, as long as your job does not crash or your max stalled jobs setting is 0.
I spent a bunch of time digging into it as a result of facing a problem with too many processor threads.
The short story is that bull's concurrency is at a queue object level, not a queue level.
If you dig into the code, the concurrency setting is applied at the point at which you call .process on your queue object. This means that even within the same Node application, if you create multiple queues and call .process multiple times, they will add to the number of concurrent jobs that can be processed.
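A small illustration of that point (a sketch based on the explanation above; the queue and job-type names are placeholders): two Queue objects in the same Node process, both backed by the same Redis queue, each register a processor, and their concurrency values add up.
const Queue = require('bull');

const queueA = new Queue('taskqueue'); // same underlying Redis-backed queue
const queueB = new Queue('taskqueue');

queueA.process('jobTypeA', 5, async job => { /* ... */ }); // up to 5 jobs here
queueB.process('jobTypeA', 3, async job => { /* ... */ }); // plus up to 3 more here

// Within this single Node process, up to 5 + 3 = 8 jobs of type 'jobTypeA'
// can now run concurrently.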
One contributor posted the following:
Yes, it was a little surprising for me too when I used Bull the first time. Queue options are never persisted in Redis. You can have as many Queue instances per application as you want, each can have different settings. The concurrency setting is set when you're registering a processor; it is in fact specific to each process() function call, not the Queue. If you'd use named processors, you can call process() multiple times. Each call will register N event loop handlers (with Node's process.nextTick()), by the amount of concurrency (default is 1).
So the answer to your question is: yes, your jobs WILL be processed by multiple Node instances if you register process handlers in multiple Node instances.
Ah, welcome! This is a meta answer and probably not what you were hoping for, but here is a general process for solving this:
Read the documentation ultra carefully to identify which guarantees the library aims to provide:
You can specify a concurrency argument. Bull will then call your handler in parallel respecting this maximum value.
I personally don't really understand this or the guarantees that bull provides. Since it's not super clear:
Dive into source to better understand what is actually happening. I usually just trace the path to understand:
https://github.com/OptimalBits/bull/blob/f05e67724cc2e3845ed929e72fcf7fb6a0f92626/lib/queue.js#L629
https://github.com/OptimalBits/bull/blob/f05e67724cc2e3845ed929e72fcf7fb6a0f92626/lib/queue.js#L651
https://github.com/OptimalBits/bull/blob/f05e67724cc2e3845ed929e72fcf7fb6a0f92626/lib/queue.js#L658
...and more; this file is pretty big :p
If the implementation and the guarantees offered are still not clear, then create test cases to try to invalidate your assumptions. It sounds like you could:
Initialize processors for the same queue with two different concurrency values
Create a queue and two workers, set a concurrency of 1 and a callback that logs the message and then sleeps on each worker, enqueue two jobs, and observe whether both are processed concurrently or whether processing is limited to one at a time (a sketch of this experiment follows below)
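A sketch of that second experiment (the queue name, timings, and log format are arbitrary):
const Queue = require('bull');

// Simulate two workers by creating two Queue objects for the same queue,
// each with a processor whose concurrency is 1.
const makeWorker = label => {
  const q = new Queue('concurrency-test');
  q.process(1, async job => {
    console.log(`${label} started job ${job.id} at ${new Date().toISOString()}`);
    await new Promise(resolve => setTimeout(resolve, 5000)); // simulate slow work
    console.log(`${label} finished job ${job.id} at ${new Date().toISOString()}`);
  });
  return q;
};

const workerA = makeWorker('worker A');
const workerB = makeWorker('worker B');

// Enqueue two jobs and watch the logs: if both start immediately, the limit
// is per worker; if the second waits for the first, it is global.
workerA.add({ n: 1 });
workerA.add({ n: 2 });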
IMO the biggest thing is:
Can I be certain that jobs will not be processed by more than one Node instance?
If exclusive message processing is an invariant, and violating it would result in incorrectness for your application, then even with great documentation I would highly recommend performing due diligence on the library :p
Looking into it more, I think Bull doesn't handle being distributed across multiple Node instances at all, so the behavior is at best undefined.

How to handle >100 Messages per Second with AMQP/Node

We're currently prototyping a microservice (Node v8.3.0) which has to consume about 60-150 messages per second from RabbitMQ (RabbitMQ 3.6.12, Erlang 19.2.1). Sometimes it works like a charm and there are no remaining messages in the queue. But most of the time the messages get stuck: only 5-20 messages per second get handled, and they accumulate up to 3M messages in the queue.
Now we're really curious how to handle all those messages with a single consumer, because there are already some Java consumers handling all those messages without any delays. We use this Node library based on amqplib. Furthermore, the handler acknowledges incoming messages immediately; the business logic is completely asynchronous. So even without any business logic it gets stuck. The exchange's type is topic, it isn't durable, and the queue has the auto-delete feature enabled. We tried disabling prefetch, and prefetch values of 1 and 100, without any success.
So..
1) Which AMQP/RabbitMQ library for node do you use?
2) How many messages get handled per second?
3) Any further improvements/suggestions?
Thanks!
The RabbitMQ team monitors this mailing list and only sometimes answers questions on stackoverflow.
The fact that you have a Java consumer that works correctly points to either amqplib-easy, amqplib or your code as the culprit. Also, note that using a single queue in RabbitMQ is an anti-pattern as queues are the unit of concurrency in the broker.
I have put together a test project that includes a README on running a Node consumer with the RabbitMQ PerfTest application (Java). You should familiarize yourself with PerfTest as it provides many features for evaluating the performance capabilities of your environment.
In my test environment, I can sustain a publish rate of 4096msg/sec easily. If I increase that to 8192 I can see messages back up due to the fact that the Node app can't consume fast enough. It would be interesting to compare using "plain" amqplib as well.
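For reference, a minimal consumer using plain amqplib with an explicit prefetch looks roughly like this (the queue name, prefetch value, and the handleMessage stub are placeholders):
const amqp = require('amqplib');

async function main() {
  const conn = await amqp.connect('amqp://localhost');
  const ch = await conn.createChannel();

  await ch.assertQueue('events', { durable: false });
  ch.prefetch(100); // cap unacknowledged deliveries per consumer

  await ch.consume('events', msg => {
    if (msg === null) return;     // consumer was cancelled
    ch.ack(msg);                  // ack immediately, as in the question
    handleMessage(msg.content);   // fire-and-forget async business logic
  });
}

async function handleMessage(payload) { /* hypothetical business logic */ }

main().catch(console.error);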

Save to 3 firebase locations with a slow internet connection

Sometimes I'm having issues with Firebase when the user is on a slow mobile connection. When the user saves an entry to Firebase, I actually have to write to three different locations. Sometimes the first one works, but if the connection is slow, the second and third may fail.
This leaves me with entries in the first location that I constantly need to clean up.
Is there a way to help prevent this from happening?
var newTikiID = ref.child("tikis").push(tiki, function(error) {
  if (!error) {
    console.log("new tiki created");
    var tikiID = newTikiID.key();
    saveToUser(tikiID);
    saveToGeoFire(tikiID, tiki.tikiAddress);
  } else {
    console.log("an error occurred during tiki save");
  }
});
There is no Firebase method to write to multiple paths at once. Some future tools planned by the team (e.g. Triggers) may resolve this in the future.
This topic has been explored before and the firebase-multi-write README contains a lot of discussion on the topic. The repo also has a partial solution to client-only atomic writes. However, there is no perfect solution without a server process.
It's important to evaluate your use case and see if this really matters. If the second and third writes fail, chances are there's really no consequence: most likely it's essentially the same as if the first write had failed, or if all writes had failed; the entry simply won't appear in searches by geo location. So the complexity of resolving this issue is probably a time sink.
Of course, it does cost a few bytes of storage. If we're working with millions of records, that may matter. A simple solution for this scenario would be to run an audit report that detects broken links between the data and GeoFire tables and cleans up old data.
If an atomic operation is really necessary, such as gaming mechanics where fairness or cheating could be an issue, or where integrity is lost by having partial results, there are a couple options:
1) Master Record approach
Pick a master path (the one that must exist) and use security rules to ensure other records cannot be written, unless the master path exists.
".write": "root.child('maste_path').child(newData.child('master_record_id')).exists()"
2) Server-side script approach
Instead of writing the paths separately, use a queue strategy.
Create a single event by writing it to a queue
Have a server-side process monitor the queue and process events
The server-side process does the multiple writes and ensures they all succeed
If any fail, the server-side process handles rollbacks or retries
By using the server-side queue, you remove the risk of a client going offline between writes. The server can safely survive restarts and retry events or failures when using the queue model.
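A server-side sketch of that queue approach with the Firebase Admin SDK (the paths, field names, and the assumption that credentials and the database URL come from the environment are all illustrative):
const admin = require('firebase-admin');
admin.initializeApp(); // assumes credentials and databaseURL are configured via the environment

const db = admin.database();
const queueRef = db.ref('queue/tikis'); // clients push a single event here

queueRef.on('child_added', async snapshot => {
  const event = snapshot.val(); // e.g. { tiki, userId } written by the client
  const tikiRef = db.ref('tikis').push();

  // Perform the dependent writes from a process that outlives flaky clients.
  await tikiRef.set(event.tiki);
  await db.ref(`users/${event.userId}/tikis/${tikiRef.key}`).set(true);
  await db.ref(`geofire/${tikiRef.key}`).set(event.tiki.tikiAddress);

  // Remove the queue entry only after all writes succeed; otherwise it stays
  // queued and can be retried after a restart.
  await snapshot.ref.remove();
});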
I had the same problem, and I ended up choosing to use conditional requests with the Firebase REST API in order to write data transactionally. See my question and answer: Firebase: How to update multiple nodes transactionally? Swift 3.
If you need to write concurrently (but not transactionally) to several paths, you can do that now as Firebase supports multi-path updates. https://firebase.google.com/docs/database/rest/save-data
https://firebase.googleblog.com/2015/09/introducing-multi-location-updates-and_86.html
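For the multi-path case, a sketch with the web SDK (userId and the exact paths are assumptions based on the question's code):
// Build one update object whose keys are full paths; the server applies
// them in a single request.
var newTikiKey = ref.child('tikis').push().key; // .key is a property in the 3.x+ SDK (.key() in 2.x)

var updates = {};
updates['tikis/' + newTikiKey] = tiki;
updates['users/' + userId + '/tikis/' + newTikiKey] = true;
updates['geofire/' + newTikiKey] = tiki.tikiAddress;

ref.update(updates, function (error) {
  if (error) {
    console.log('multi-location update failed', error);
  } else {
    console.log('tiki written to all three locations');
  }
});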

How to prevent HTML5 Web Workers from locking up thus correctly responding to messages from parent

I'm using web workers to do some CPU intensive work but have the requirement that the worker will respond to messages from the parent script while the worker is still processing.
The worker, however, will not respond to messages while it is locked in a processing loop, and I have not found a way to, say, poll the message queue. So it seems the only solution is to break up the processing at intervals so that any messages in the queue can be serviced.
The obvious option is to use a timer (say with setInterval), however I have read that the minimum delay between firings is quite long (http://ajaxian.com/archives/settimeout-delay), which is unfortunate as it will slow down processing a lot.
What are other people's thoughts on this? I'm going to try having the worker dispatch onmessage to itself at the end of each onmessage handler, effectively implementing one step of the processing loop per event received from itself, but I just wanted to see if anyone had other ideas.
Thanks,
A worker can spawn sub workers. You can have your main worker act as your message queue, and when it receives a request for a long running operation, spawn a sub worker to process that data. The sub worker can then send the results back to the main worker to remove the event from the queue and return the results to the main thread. That way your main worker will always be free to listen for new messages and you have complete control over the queue.
--Nick
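For illustration, a bare-bones sketch of the main-worker/sub-worker layout described above (the file names and message shapes are placeholders):
// main-worker.js: stays responsive by farming each long task out to a sub worker.
var nextId = 0;

self.onmessage = function (event) {
  var id = nextId++;
  var sub = new Worker('sub-worker.js'); // placeholder file name

  sub.onmessage = function (result) {
    sub.terminate();
    self.postMessage({ id: id, result: result.data }); // hand the result back to the page
  };
  sub.postMessage(event.data); // kick off the long-running job
};

// sub-worker.js: does the CPU-intensive work in isolation.
// self.onmessage = function (event) {
//   var result = crunch(event.data); // 'crunch' is a hypothetical heavy computation
//   self.postMessage(result);
// };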
I ran into this issue myself when playing with workers for the first time. I also debated using setInterval, but felt that it would be a rather hacky approach to the problem (and I had already gone this way for my emulated multithreading). Instead, I settled on terminating the workers from the main thread (worker.terminate()) and recreating them if the task they are involved in needs to be interrupted. Garbage collection etc. seemed to be handled fine in my testing.
If there is data from these tasks that you want to save, you can always post it back to the main thread for storage at regular intervals, and if there is some logic you wish to implement regarding whether they are terminated or not, you can post the relevant data back at regular enough intervals to allow it.
Spawning subworkers would lead to the same set of issues anyway; you'd still have to terminate the subworkers (or create new ones) according to some logic, and I'm not sure subworkers are as well supported everywhere (in Chrome, for example).
James
Having the same problem, I searched the Web Workers draft and found something in the Processing model section, steps 9 to 12. As far as I understand it, a worker that starts processing a task will not process another one until the first is completed. So, if you don't care about stopping and resuming a task, nciagra's answer should give better performance than rescheduling each iteration of the task.
Still investigating, though.
