Trying to better understand NodeJS with Redis and using a background task - javascript

I'm trying to test a Node server locally for something I eventually want to deploy on DigitalOcean (which is a whole other story). I have successfully set up a local Node server with REST endpoints and a self-signed cert for the time being. My issue is that I want to store data that my users can retrieve by hitting a REST endpoint. My current thought is that I would have some sort of background task running on the server that is constantly getting new data, so that when some of the more popular queries come through, I have the newest data for them, sanitized and ready to go.
My problem is that I can't understand how I am supposed to have a background function running that will call itself over and over again without eventually causing a memory problem. I was looking at Bull or Kue, but I am not sure if those are suitable for my specific needs. I've also never dealt with NoSQL databases before, so Redis is fairly new to me too. Any suggestions or pointers? I'm a little overwhelmed and not sure where to go from here even though I have a general idea of what I am trying to do.

Two options come to mind. If you're using the "hiredis" package you can attain 200k queries / sec on a modern quad core. It's conceivable you could just retrieve the freshest data every time.
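The other option involves updating application memory at startup and again at intervals, using a function that reschedules itself. Because each run is started by setTimeout, it begins on a fresh call stack, so the repeated calls do not pile up in memory:
function updateValuesFromRedis(seconds) {
  // redishmgetAsyc is not defined in this snippet; presumably a promisified HMGET helper.
  return redishmgetAsyc('keyhash')
    .then(function(values) {
      return saveValues(values);
    })
    .finally(function() {
      // Schedule the next refresh whether or not this one succeeded.
      setTimeout(function() {
        console.log('Updating...');
        updateValuesFromRedis(seconds);
      }, 1000 * seconds);
    })
    .catch(function(error) {
      console.error('Error updating values from Redis!', error);
    });
}
(function schedule() {
  updateValuesFromRedis(60);
})();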
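For comparison, here is a minimal, self-contained sketch of the same self-rescheduling idea without any Redis dependency. The helper name startRefreshLoop and the fetchFn parameter are hypothetical and only stand in for whatever actually loads the data; this is one possible shape, not the answerer's implementation.

```javascript
// Hypothetical helper: runs `fetchFn` repeatedly, keeps the latest result in
// memory, and schedules the next run only after the current one has settled.
function startRefreshLoop(fetchFn, intervalMs) {
  const state = { value: undefined, runs: 0, timer: null, stopped: false };

  async function tick() {
    try {
      state.value = await fetchFn();
    } catch (error) {
      // Keep serving the previous value if a refresh fails.
      console.error('Refresh failed, keeping previous value', error);
    } finally {
      state.runs += 1;
      if (!state.stopped) {
        // A new timer per run: every tick starts on an empty call stack.
        state.timer = setTimeout(tick, intervalMs);
      }
    }
  }

  const ready = tick(); // the first refresh starts immediately

  return {
    get: () => state.value,   // latest cached value
    runs: () => state.runs,   // number of completed refresh attempts
    ready,                    // resolves once the first refresh has finished
    stop() {
      state.stopped = true;
      clearTimeout(state.timer);
    },
  };
}
```

A request handler would then call get() instead of querying the data store on every request, and stop() lets the process shut down cleanly.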

Related

Firebase "update" operation downloads data?

I was profiling a "download leak" in my firebase database (I'm using JavaScript SDK/firebase functions Node.js) and finally narrowed down to the "update" function which surprisingly caused data download (which impacts billing in my case quite significantly - ~50% of the bill comes from this leak):
Firebase functions index.js:
exports.myTrigger = functions.database.ref("some/data/path").onWrite((data, context) => {
  var dbRootRef = data.after.ref.root;
  return dbRootRef.child("/user/gCapeUausrUSDRqZH8tPzcrqnF42/wr").update({field1:"val1", field2:"val2"});
});
This function generates downloads at "/user/gCapeUausrUSDRqZH8tPzcrqnF42/wr" node
If I change the paths to something like this:
exports.myTrigger = functions.database.ref("some/data/path").onWrite((data, context) => {
  var dbRootRef = data.after.ref.root;
  return dbRootRef.child("/user/gCapeUausrUSDRqZH8tPzcrqnF42").update({"wr/field1":"val1", "wr/field2":"val2"});
});
It generates download at "/user/gCapeUausrUSDRqZH8tPzcrqnF42" node.
Here are the results of firebase database:profile
How can I get rid of the download while updating data, or at least reduce the usage, since I only need to upload?
I don't think it is possible in a Firebase Cloud Functions trigger.
The onWrite((data, context) => ...) handler has a data argument, which is the complete DataSnapshot.
And there is no way to configure it not to fetch its value.
Still, there are two things that you might do to help reduce the data cost:
Watch a smaller path in the trigger, e.g. functions.database.ref("some/data/path") vs. ("some").
Use a more specific hook, i.e. onCreate() and onUpdate() vs. onWrite().
You should expect that all operations will round trip with your client code. Otherwise, how would the client know when the work is complete? It's going to take some space to express that. The screenshot you're showing (which is very tiny and hard to read - consider copying the text directly into your question) indicates a very small amount of download data.
To get a better sense of what the real cost is, run multiple tests and see if that tiny cost is itself actually just part of the one-time handshake between the client and server when the connection is established. That cost might not be an issue as your function code maintains a persistent connection over time as the Cloud Functions instance is reused.

Meteor js publish and subscribe is very slow to react

I have used Meteor's publish and subscribe methods to pass data between the server and the client. In my scenario I am using D3.js to generate a bar chart: as soon as data is entered in a MongoDB collection, a client-side function regenerates the chart. My issue is that publish and subscribe is too slow to react. Even if I limit the number of documents returned by MongoDB, the issue persists. It is also inconsistent, i.e. sometimes it reacts in under 1 second and other times it takes 4-5 seconds. Please guide me on what to do and what is wrong with my implementation.
Here is the server side code,
Test = new Mongo.Collection("test")
Meteor.publish('allowedData', function() {
return Test.find({});
})
and here is the client side code,
Test = new Mongo.Collection("test")
Meteor.subscribe('allowedData');
Meteor.setTimeout(function() {
  Test.find().observe({
    added: function(document){
      //something
    },
    changed:function(){
      //something
    },
    removed:function(){
      //something
    },
  })
From your comments I see that you need a report chart which is reactive. Even though it is your requirement, a chart like this is too expensive. In fact, when your system grows bigger, say you have around 10000 documents for one single chart, this kind of chart will crash your server frequently.
To work around this problem, I have two suggestions:
Define a method that returns data for the chart. Set up a job/interval timer in the client to call that method periodically. The interval value depends on your needs; 10 seconds should be fine for charts. It is not completely reactive this way, since you only get the newest data after an interval, but it is still better than a slow and crash-prone system. You could find good modules to manage job/timer here.
Use the Meteor package meteor-publish-join (disclaimer: I am the author). It is made to solve the kind of problem you have: the need to do reactive aggregations/joins on a big data set and still have good overall performance.

dealing with long server side calculations in meteor

I am using jimp (https://www.npmjs.com/package/jimp) in meteor JS to generate an image server side. In other words I am 'calculating' the pixels of the image using a recursive algorithm. The algorithm takes quite some time to complete.
The issue I am having is that this seems to completely block the meteor server. Users trying to visit the webpage while an image is being generated are forced to wait. The website is therefore not rendered at all.
Is there any (meteor) way to run the heavy recursive algorithm in a thread or something so that it does not block the entire website?
Node (and consequently Meteor) runs your JavaScript on a single thread, so CPU-heavy work blocks everything else. In short, Node works really well when you are IO-bound, but as soon as you do anything that's compute-bound you need another approach.
As was suggested in the comments above, you'll need to offload this CPU-intensive activity to another process which could live on the same server (if you have multiple cores) or a different server.
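As a rough illustration of offloading, here is a sketch that moves a CPU-heavy calculation off the main thread. Note the substitution: it uses Node's built-in worker_threads module (a thread, not the separate process or server described here), simply because that keeps the example in a single file. The function name runHeavyTask and the recursive Fibonacci stand-in for the image algorithm are assumptions made for illustration.

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline so the sketch fits in one file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // Stand-in for the expensive recursive algorithm: naive Fibonacci.
  function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }
  parentPort.postMessage(fib(workerData.n));
`;

// Hypothetical helper: runs the heavy calculation in a worker thread and
// resolves with its result, leaving the main event loop free to serve requests.
function runHeavyTask(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}
```

The main thread only awaits the promise, so incoming requests keep being served while the worker computes. The worker still competes for CPU with the web app, so on a busy machine a separate server or a lower process priority may still be needed.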
We have a similar problem at Edthena where we need to transcode a subset of our video files. For now I decided to use a Meteor-based solution, because it was easy to set up. Here's what we did:
When new transcode jobs need to happen, we insert a "video job" document into the database.
On a separate server (we max out the full CPU when transcoding), we have an app which calls observe like this:
Meteor.startup(function () {
  // Listen for non-failed transcode jobs in creation order. Use a limit of 1 to
  // prevent multiple jobs of this type from running concurrently.
  var selector = {
    type: 'transcode',
    state: { $ne: 'failed' },
  };
  var options = {
    sort: { createdAt: 1 },
    limit: 1,
  };
  VideoJobs.find(selector, options).observe({
    added: function (videoJob) {
      transcode(videoJob);
    },
  });
});
As the comments indicate this allows only one job to be called at a time, which may or may not be what you want. This has the further limitation that you can only run it on one app instance (multiple instances calling observe would simultaneously complete the job). So it's a pretty simplistic job queue, but it may work for your purposes for a while.
As you scale, you could use a more robust mechanism for dequeuing and processing the tasks like Amazon's sqs service. You can also explore other meteor-based solutions like job-collection.
I believe you're looking for Meteor.defer(yourFunction).
Relevant Kadira article: https://kadira.io/academy/meteor-performance-101/content/make-your-app-faster
Thanks for the comments and answers! It seems to be working now. What I did is what David suggested. I am running a meteor app on the same server. This app deals with the generating of the images. However, this resulted in the app still eating away all the processing power.
As a result of this I set a slightly lower priority on the generating algorithm with the renice command on the PID. (https://www.nixtutor.com/linux/changing-priority-on-linux-processes/) This works! Any time a user logs into the website the other (client) meteor application gains priority over the generating algorithm. Absolutely no delay at all anymore now.
The only issue I am having now is that whenever the server restarts I somehow have to rerun or run the (re)nice command.
Since I am using meteor up for deployment both apps run the same user and the same 'command': node main.js. I am currently trying to figure out how to run the nice command within the startup script of meteor up. (located at /etc/init/.conf)

Handling rollbacked MySQL transactions in Node.js

I've been dealing with a problem for a couple of days, and I'm really hoping you can help me.
It's a Node.js-based API using Sequelize for MySQL.
On certain API calls the code starts SQL transactions which lock certain tables, and if I send multiple requests to the API simultaneously, I get LOCK_WAIT_TIMEOUT errors.
var SQLProcess = function () {
  var self = this;
  var _arguments = arguments;
  return sequelize.transaction(function (transaction) {
    return doSomething({transaction: transaction});
  })
  .catch(function (error) {
    // On a lock wait timeout, retry after a random delay of up to one second.
    // Promise.delay comes from Bluebird, not from native promises.
    if (error && error.original && error.original.code === 'ER_LOCK_WAIT_TIMEOUT') {
      return Promise.delay(Math.random() * 1000)
        .then(function () {
          return SQLProcess.apply(self, _arguments);
        });
    } else {
      throw error;
    }
  });
};
My problem is that the simultaneously running requests lock each other for a long time, so my request only returns after a very long time (~60 seconds).
I hope I have explained it clearly, and that you can offer me a solution.
This may not be a direct answer to your question, but looking at why you have this problem may also help.
1) What does doSomething() do? Is there any way to improve it?
First, a transaction that takes 60 seconds is suspicious. If you lock a table for that long, chances are the design should be revisited, given that a typical database operation runs in 10 - 100 ms.
Ideally, all the data preparation should be done outside of the transaction, including data read from the database. The transaction should really be only for transactional operations.
2) Is it possible to use a MySQL stored procedure?
True, a MySQL stored procedure is not compiled the way PL/SQL is for Oracle, but it still runs on the database server. If your application is really complicated and generates a lot of back-and-forth network traffic between the database and your Node application in that transaction, and considering there are so many layers of JavaScript calls, it could really slow things down. If 1) doesn't save you a lot of time, consider using a MySQL stored procedure.
The drawback of this approach, obviously, is that it is harder to maintain code in both Node.js and MySQL.
If 1) and 2) are definitely not possible, you may consider some kind of flow control or queuing tool. Either your app makes sure the second request doesn't start until the first one finishes, or you use a third-party queuing tool to handle that. It seems you don't need any parallelism in running those requests anyway.
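The in-app flow control mentioned above could look roughly like the following sketch: a tiny promise-based queue that lets only one task run at a time. The name createSerialQueue is hypothetical, and the sketch only serializes work inside a single Node process; it does nothing for requests handled by other app instances.

```javascript
// Hypothetical helper: returns an `enqueue` function that runs async tasks
// strictly one after another, in the order they were submitted.
function createSerialQueue() {
  let tail = Promise.resolve();

  return function enqueue(task) {
    // Start this task only after everything queued before it has settled.
    const result = tail.then(() => task());
    // Swallow the error on the internal chain so one failed task does not
    // block the tasks queued behind it; the caller still sees the rejection.
    tail = result.catch(() => {});
    return result;
  };
}
```

Each API handler would wrap its transactional work, for example enqueue(() => SQLProcess(args)), so the second transaction is not started until the first has committed or rolled back.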
The main reason for deadlocks is poor database design. Without further information about your database design and which exact queries might or might not lock each other it is impossible to give you a specific solution for your problem.
However, I can give you some general advice on how to approach this issue:
I would make sure that your database is normalized at least into Third Normal Form or, if that still isn't enough, even further. There might be tools to automate this process for you.
Aside from reducing the likelihood of deadlocks this also helps keeping your data consistent, which is always a good thing.
Keep your transactions as slim as possible. If you are inserting new rows into your tables and update other tables accordingly you might want to use a Trigger rather than another SQL statement to do so. The same applies to reading rows and values. Such things can be done before or after your transaction.
Choose the correct Isolation Level. Possible isolation levels are:
READ_UNCOMMITTED
READ_COMMITTED
REPEATABLE_READ
SERIALIZABLE
Sequelize's official documentation describes how you can set the isolation level and lock/unlock transactions by yourself.
As I said, without further insight into your database and query design, that's all I can do for you right now.
Hope this helps.

Breezejs, Browser freezes while manager assembles the objects

I have the following problem:
Breeze fetches metadata (23.4KB)
Breeze fetches lookups (4.5MB)
Right after lookups are downloaded, the browser will become unresponsive for about 30 seconds.
After this, everything works like a charm.
Why does Breeze not use timeouts to give the UI a chance to update?
Firefox complains about a long-running, unresponsive script. The task manager shows the browser (Firefox, Chrome, etc.) as unresponsive.
Am I doing something wrong, or is this by design?
If this is by design, can I use a Web Worker to do all the heavy operations and then return the whole model or something?
I tried something like this:
var test = function (name) {
  return Q.fcall(function () {
    setTimeout(function () {
      toastr.success(name); // Notify me
      return EntityQuery.from(name)
        .using(manager).execute()
    }, 1000) // This should be zero
  });
};
var primeData = function (name) {
  return test('Languages')
    .then(test('dummy1'))
    .then(test('dummy2'))
    .then(test('dummy3'))
    .then(test('dummy4'))
};
However, the notifications seem to be popping up all at the same time, indicating that
return EntityQuery.from(name)
  .using(manager).execute()
does not return when entity construction finishes but when the JSON data for this entity has arrived.
EDIT
Answer with webWorker provided here : BreezeJs with dedicated web worker
I think I see your point. Breeze hogs the UI thread while processing those thousands of arriving entities. If Breeze could somehow realize how much work it was doing, and would be doing, it could throw a timeout in there to give the UI a chance to breathe.
I'm not sure how safe that would be as Breeze would have to pick a moment that didn't leave the cache in an unstable state from someone's perspective.
I believe you can make this easier on yourself by breaking the one giant Lookups call into several smaller ones. You could still async await completion of all the smaller lookup promises if that is critical to your app. The fact that they are independent promise callbacks should give you the relief you seek.
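A minimal sketch of that idea follows, with no Breeze dependency. The names loadLookups and fetchLookup are hypothetical; fetchLookup is only a placeholder for one small lookup query. Each step is passed to then() as a function, so the next query does not start until the previous result has been handled.

```javascript
// Hypothetical stand-in for one small lookup query. In a real app this would
// be replaced by a query against the server that returns a promise.
function fetchLookup(name) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ name: name, rows: [] }), 0);
  });
}

// Loads the named lookups one after another and resolves with all results.
// Between two lookups control returns to the event loop, so the UI can repaint.
function loadLookups(names, fetchOne) {
  return names.reduce(
    (chain, name) =>
      chain.then((loaded) =>
        fetchOne(name).then((result) => loaded.concat([result]))
      ),
    Promise.resolve([])
  );
}
```

Calling loadLookups(['Languages', 'Countries'], fetchLookup) resolves with the results in the order requested. If ordering does not matter, the smaller queries could instead be started together and collected with Promise.all.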
Please try that and let us know how it works for you.
P.S.: You also have a cool opportunity here to optionally cache these lookups in local storage (IndexedDB) so you don't have to download them every time. You'd need a versioning scheme of course and some plumbing, so this lies in your future once things are looking good.
