Memcache vs Redis vs Javascript Hash object - javascript

I know Memcached and Redis are used when caching needs to be shared across more than one server.
I'm creating a Node.js application which will run on a single server only and uses MySQL as its DB. I need to cache around 100,000 keys, where each key maps to a JSON string about 200 characters long, so that I don't have to call MySQL for reads.
If I use Memcached or Redis I will need a callback to get my data, but if I use a JavaScript hash I can get the data synchronously. Will that affect the application somehow, e.g. through high memory usage? Which one should I be using for an application like this?

I know Memcached and Redis are used when caching needs to be shared across more than one server.
Not necessarily. For instance, Facebook puts a Memcached instance in front of each of its MySQL servers. You can also use Redis/Memcached for fast computation (e.g. real-time analytics) without having a whole cluster.
and I need to cache around 100,000 keys, where each key maps to a JSON string about 200 characters long, so that I don't have to call MySQL for reads.
It seems like premature optimization to me. If MySQL has enough RAM (so the dataset lives in memory) you don't have to worry about performance; that's just 100,000 keys.
If I use Memcached or Redis I will need a callback to get my data
It really depends on what language you use (Ruby and Python offer synchronous Redis clients) and what type of paradigm is used (event loop, thread pool, ...).
but if I use a JavaScript hash I can get the data synchronously
To be more precise, that's just because you are using node_redis (an asynchronous client), not because of the JavaScript "hash" (an object, in fact).
but will it affect the application somehow, e.g. through high memory usage
It depends on whether you load all keys into your process or not. If you use a Redis hash, you will be able to query only the field you want, rather than the whole object each time.
Which one should I be using for an application like this?
The best thing to keep in mind is to lower the number of applications you have to maintain in your stack while still using the right tool for the right job. Here MySQL could be enough, but if you really want to use Redis or Memcached, I would go for Redis. It offers roughly the same features as Memcached with the same performance, while allowing you to use its other data structures in the future without adding another application to your stack.
Moreover, if you put all your data in a Redis hash, you will be able to retrieve a single field (HGET), a group of fields (HMGET), or all fields (HGETALL) with just one call.
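For example, a minimal sketch using the classic callback API of the node_redis client (key and field names are hypothetical):

    const redis = require('redis');
    const client = redis.createClient();

    // Store one record as a Redis hash, one field per attribute.
    client.hmset('user:1', { name: 'Ada', score: '42' });

    // Fetch a single field (HGET)...
    client.hget('user:1', 'name', (err, name) => console.log(name));

    // ...a group of fields (HMGET)...
    client.hmget('user:1', ['name', 'score'], (err, values) => console.log(values));

    // ...or every field at once (HGETALL).
    client.hgetall('user:1', (err, obj) => console.log(obj));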
Finally, regarding recent statistics and the Redis ecosystem (GUIs, hosting, libraries, ...), Redis seems to be far more future-proof than Memcached if you really want to go that way.
Disclaimer: I am the founder of Redsmin, an online developer-oriented service for administering and monitoring Redis.

It depends: you could even opt for Memcached over MySQL :). For a simple read-only operation, just storing the data within your JavaScript code (as a plain object used as a dictionary) is enough. But be sure that you have enough RAM :).
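For the use case in the question, a minimal sketch of that in-process approach, assuming the mysql2 client and hypothetical table/column names:

    const mysql = require('mysql2/promise');

    const cache = new Map();

    async function warmCache() {
      const conn = await mysql.createConnection({ host: 'localhost', user: 'app', database: 'app' });
      // Hypothetical table holding the 100,000 key/JSON pairs.
      const [rows] = await conn.execute('SELECT cache_key, json_value FROM cache_table');
      for (const row of rows) cache.set(row.cache_key, row.json_value);
      await conn.end();
    }

    // 100,000 keys of ~200-character JSON strings is on the order of
    // tens of MB, well within a typical Node.js heap.
    warmCache().then(() => {
      const value = cache.get('some-key'); // synchronous read, no callback
      console.log(value);
    });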

Related

Best way to cache data on Node.js

I have a JSON file in which I need to look for a specific value (the file is pretty big), and I want to turn this JSON into an array so that the lookup is easier and faster.
BUT what is the best way to save this array, so that I won't need to run over the JSON every time and it stays cached until a recycle or service restart?
(Node.js project)
1) Use Redis (recommended)
Pros:
Access objects super fast.
Isolated from the Node process.
Will not affect heap memory.
Can be deployed on a separate server.
Data persists if your application crashes.
Supports compression.
Low memory consumption.
Cons:
You may face some limitations if you have nested objects, but there is a workaround which requires extra work to handle.
2) Use a database (MongoDB preferred)
Pros:
Save/load objects easily, since MongoDB supports JSON.
Same as Redis pro #2 above (isolated from the Node process).
Cons:
You have to query every time to access objects.
3) Use files (not recommended). When your application starts/restarts, load your objects from a file into a global array, and when your application closes/shuts down, dump your objects from the global array back into the file.
Pros:
Access objects fast.
Cons:
High heap memory usage if your objects are huge.
Data loss if your application crashes.
In the end it's your choice: if speed matters, choose Redis; if you want the easy way, choose MongoDB. If losing some of your data is not a problem, go for files. You can also mix options 2 and 3.
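A minimal sketch of option 3, assuming a hypothetical cache.json file next to the application:

    const fs = require('fs');

    const FILE = './cache.json';
    let items = [];

    // Load on startup, if a previous dump exists.
    if (fs.existsSync(FILE)) {
      items = JSON.parse(fs.readFileSync(FILE, 'utf8'));
    }

    // Dump on shutdown; anything written after a crash is lost,
    // as the cons above note.
    process.on('SIGINT', () => {
      fs.writeFileSync(FILE, JSON.stringify(items));
      process.exit(0);
    });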

Node.js, store object in memory or database?

I am developing a Node.js application which reads a JSON list from a centralised DB.
The list object is around 1.2 MB (if kept in a txt file).
The requirement is that the data be refreshed every 24 hours, so I set up a cron job for it.
After fetching the data I keep it in a DB (Couchbase) which runs locally on my server.
Data access is very frequent; I get around 1 or 2 requests per second and nearly every request needs that object.
Is it better to keep that object as an in-memory object in Node.js, or to keep it in the local DB?
What are the advantages and disadvantages of both?
The object is only read by requests; it is written only once, by the cron job.
It's a high-end system: i7 quad core, 16 GB RAM.
It depends on your hardware.
If this object is immutable across requests, it's better to keep it in memory. If not, it depends.
In any case, the workflow of open connection to DB, fetch data, return result, free data will consume more resources than caching in memory.
For example, in our project we process high-definition images and keep all objects in memory, 3-7 MB in raw format. Tests show that this is much more efficient than using any caching system such as Redis or Couchbase.
I would keep the most recent version as a memory object, and store it as well. That way you have a backup if anything crashes. If you edit the file, however, I would only keep it as a database object.
Accessing the DB for that object every 2 seconds would probably work fine, but 1.2 MB of memory is not that much, and if you can keep that contained, your server won't likely run into problems.
The DB is a little slow compared to memory, but has the advantage of (most likely) being thread-safe. If you were to edit the document, you could run into threading problems with a memory object.
You know the application and the requirements; you should be able to tell whether you need a thread-safe database or need to save memory on the server. If you don't know, we would need to see the actual code and use cases to tell you what you could do best.
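Given the access pattern above (read-only per request, refreshed once a day), a minimal in-memory sketch might look like this; fetchListFromDb() is a hypothetical stand-in for the query against the centralised DB:

    // Hypothetical stand-in for the query against the centralised DB.
    async function fetchListFromDb() {
      return []; // replace with the real DB call
    }

    let cachedList = null;

    async function refresh() {
      cachedList = await fetchListFromDb(); // ~1.2 MB object, swapped atomically
    }

    // Refresh once at startup, then every 24 hours
    // (in place of an external cron job).
    refresh();
    setInterval(refresh, 24 * 60 * 60 * 1000);

    // Request handlers read the current snapshot synchronously.
    function handleRequest(req, res) {
      res.end(JSON.stringify(cachedList));
    }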

postgresql stored procedures vs server-side javascript functions

In my application I receive JSON data in a POST request and store it as raw JSON data in a table. I use PostgreSQL (9.5) and Node.js.
In this example, the data is an array of about 10 quiz questions experienced by a user, that looks like this:
[{"QuestionId":1, "score":1, "answerList":["1"], "startTime":"2015-12-14T11:26:54.505Z", "clickNb":1, "endTime":"2015-12-14T11:26:57.226Z"},
{"QuestionId":2, "score":1, "answerList":["3", "2"], "startTime":"2015-12-14T11:27:54.505Z", "clickNb":1, "endTime":"2015-12-14T11:27:57.226Z"}]
I need to store (temporarily or permanently) several indicators computed by aggregating data from this JSON at quiz level, as I need these indicators to perform other procedures in my database.
So far I have been computing the indicators in JavaScript functions at the time of handling the POST request, and inserting the values into my table alongside the raw JSON data. I'm wondering whether it would be more performant to have the calculation performed by a stored trigger function in my PostgreSQL DB (knowing that the SQL function would need to retrieve the data from inside the raw JSON).
I have read other posts on this topic, but they were asked many years ago and not with Node.js, so I thought people might have new insight on the pros and cons of SQL stored procedures vs server-side JavaScript functions.
edit: I should probably have mentioned that most of my application's logic already lies in PostgreSQL stored procedures and views.
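For reference, the current Node-side approach looks roughly like this; a minimal sketch assuming the pg client, with hypothetical table/column names and indicators:

    const { Pool } = require('pg');
    const pool = new Pool(); // connection settings from the environment

    // Aggregate indicators from the quiz array described above.
    function computeIndicators(questions) {
      const totalScore = questions.reduce((sum, q) => sum + q.score, 0);
      const totalMs = questions.reduce(
        (sum, q) => sum + (new Date(q.endTime) - new Date(q.startTime)), 0);
      return { totalScore, totalMs };
    }

    // Store the raw JSON alongside the computed indicators.
    async function saveQuiz(questions) {
      const { totalScore, totalMs } = computeIndicators(questions);
      await pool.query(
        'INSERT INTO quiz_results (raw_data, total_score, total_ms) VALUES ($1, $2, $3)',
        [JSON.stringify(questions), totalScore, totalMs]
      );
    }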
Generally, I would not use that approach, due to the risk of the triggers getting out of sync with the code. The single responsibility principle should be the guide: the DB stores data and the code manipulates it. Unless you have a really pressing business need to break this pattern, I'd advise against it.
Do you have a migration that will recreate the triggers if you wipe the DB and start from scratch? Will you or a coworker not realise they are there at a later point when reading the app code and wonder what is going on? If there is a standardised way to manage the triggers where the configuration will be stored as code with the rest of your app, then maybe not a problem. If not, be wary. A small performance gain may well not be worth the potential for lost developer time and shipping bugs.
I currently work somewhere that has gone all-in on SQL functions; we have over a thousand. I'd strongly advise against it.
Having logic split between JavaScript and SQL is a real pain when debugging issues, especially if, like me, you are much more familiar with JS.
The functions are at least all tracked in source control and get updated/created in the DB as part of the deployment process, but this means you have two places to look when trying to follow the code.
I fully agree with the other answer: single responsibility principle, DB for storage, server/app for logic.

Structure of IndexedDB database/object store

I'm trying to get my head around the use of IndexedDB. I have an SQL database which I access via REST, and I'm planning on providing some local caching using IndexedDB.
My SQL structure uses a large (and variable) number of tables, each table storing an array of data (time sequence and value) for a specific sensor value. Ideally, I would have assumed I'd create a new object store for each of my MySQL tables. However, it seems that you can only create an object store when the database is opened, which is a bit of a pain.
So, I see a number of options:
I could use a single object store and add two indexes, one for the time and one for the sensor. I'm a little worried that this might have performance issues, but I'm not sure how data is stored under the hood.
I could probably detect a new sensor somehow, and open the database with a new version number. This just feels a little wrong to me.
I could alternatively use different databases for each sensor, but I've read somewhere that it's not recommended to use multiple databases (although it's unclear why, since this is possibly the easiest solution).
I'd welcome any thoughts people have regarding the best structure for this sort of data, that will provide good performance.
If your data sets are independent, for example if you don't need to combine results from multiple sensors, I suggest you split them into different tables and/or different databases. The separate-database option is more convenient for deleting data.
Performance in a single IndexedDB database starts to be a concern above roughly 50K records, depending on the browser and hardware. I have a couple of tests which can measure the speed; just tweak the size of the object being inserted and you can test your use case.
If you have fewer than 10K records per sensor (object store/database) you won't hit big performance issues. One common mistake when inserting a batch of data is using a separate transaction for each insert; this is completely unnecessary, since you can store 10K records with one transaction. If you are working with an even larger data set, you can split the inserting across a couple of transactions, so you won't block reads of that database. A sketch of a single-transaction batch insert follows below.
Also, for every transaction that you do in IndexedDB you need an open connection. Some people keep one connection alive and reuse it; I prefer closing and opening a separate connection for each transaction.
Also, for faster access, you can store all database info in Local Storage; that way you can track how many databases you have and keep descriptions for each of them.
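Here is the single-transaction batch insert mentioned above, as a minimal sketch; database, store, and index names are hypothetical:

    const readings = [
      { sensorId: 's1', time: Date.now(), value: 21.5 },
      // ...thousands more
    ];

    const request = indexedDB.open('sensors', 1);

    request.onupgradeneeded = (event) => {
      const db = event.target.result;
      const store = db.createObjectStore('readings', { autoIncrement: true });
      store.createIndex('bySensor', 'sensorId');
      store.createIndex('byTime', 'time');
    };

    request.onsuccess = (event) => {
      const db = event.target.result;
      const tx = db.transaction('readings', 'readwrite');
      const store = tx.objectStore('readings');
      for (const r of readings) store.put(r); // many puts, ONE transaction
      tx.oncomplete = () => db.close(); // close the connection when done
    };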
Additionally you can take a look at this similar question

CouchDB: How to change view function via JavaScript?

I am playing around with CouchDB to test if it is "possible" [1] to store scientific data (simulated and experimental raw data + metadata). A big pro is the schema-less approach of CouchDB: we have to be very flexible with the metadata, as the set of parameters changes very often.
Up to now I have some code to feed raw data, plots (both as attachments), and hierarchical metadata (as JSON) into CouchDB documents, and have written some prototype JavaScript for filtering and display. But the filtering is done on the client side (a.k.a. the browser): the map function simply returns everything.
How could I change the (or push a second) map function of a specific _design-document with simple browser-JS?
I do not think that a temporary view would yield any performance gain...
Thanks for your time and answers.
[1]: of course it is possible, but is it also useful? feasible? reasonable?
[added]
Ah, jquery.couch.js (version 0.9.0) provides a saveDoc() function, which could update the _design document with the new map function.
But I also tried out the query function, which uses a temporary view. Okay, "do not use this in the real product, only during development"... but scientific research is steady development, right?
Temporary views get cached, as I noticed, and this works well for ~1000 documents per DB. A second plus: all users (think of 1 to 3, so big user management would be quite an overkill) can work with their own temporary views.
Never ever use temporary views. They are really only there for dev and debugging purposes. For more information, see http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views (specifically the bold "NOTE").
And yes, because design documents are really just documents with special powers, you can run your GET/POST/PUT/DELETE methods on them. However, you will usually need admin privileges to do this. So, if you are allowing a piece of client-side software to do that, you are making your entire database public for read/write access; this may be fine for your application, but it is important to remember.
For example, if you restrict access to your database but put the username and password in client-side JavaScript, then anyone can see that username and password.
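As a concrete illustration, a minimal browser-side sketch that updates a design document over plain HTTP (database, design doc, and view names are hypothetical; admin credentials are normally required):

    const db = 'http://localhost:5984/mydb';

    async function setMapFunction(mapSource) {
      // Fetch the current design doc (if any) so we can supply its _rev.
      const res = await fetch(`${db}/_design/app`);
      const ddoc = res.ok ? await res.json() : { _id: '_design/app' };
      ddoc.views = { byType: { map: mapSource } };
      // PUT the design doc back; the map function is stored as a string.
      await fetch(`${db}/_design/app`, {
        method: 'PUT',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(ddoc),
      });
    }

    setMapFunction('function (doc) { if (doc.type) emit(doc.type, null); }');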
Cheers.
I've written some helper functions for jquery.couch and design docs; take a look at:
https://github.com/grischaandreew/jquery.couch.js
