I have a JSON file (it's pretty big) and I need to look up specific values in it. I want to turn this JSON into an array so lookups are easier and faster.
BUT what is the best way to keep this array around, so I don't have to re-scan the JSON every time and it stays available until a recycle or service restart?
(Node.js project)
1) Use Redis (recommended; a small caching sketch follows this list)
Pros:
Access objects super fast.
Isolated from the Node process.
Will not affect heap memory.
Can be deployed on a separate server.
Data persists if your application crashes.
Supports compression.
Low memory consumption.
Cons:
You may face some limitations if you have nested objects, but there is a workaround that requires some extra work to handle.
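A minimal sketch of the Redis option, assuming the `redis` npm package (v4+) and made-up file and key names: parse the big JSON once, keep it in Redis as a string, and read it back on demand.

```js
// Minimal sketch (assumes the `redis` npm package, v4+): cache a parsed JSON
// object in Redis as a string so it survives Node restarts and heap pressure.
const { createClient } = require('redis');
const fs = require('fs');

async function cacheJson() {
  const client = createClient(); // defaults to redis://localhost:6379
  await client.connect();

  // Parse the big file once and store it under a single key (names are made up).
  const data = JSON.parse(fs.readFileSync('./big-data.json', 'utf8'));
  await client.set('app:bigData', JSON.stringify(data));

  // Later lookups: fetch and parse only when needed.
  const cached = JSON.parse(await client.get('app:bigData'));
  console.log(Object.keys(cached).length);

  await client.quit();
}

cacheJson().catch(console.error);
```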
2) Use a database, preferably MongoDB (a query sketch follows this list)
Pros:
Save/Load objects easily since MongoDB supports JSON.
Same as Redis pro #2 (isolated from the Node process).
Cons:
You have to query the database every time you access the objects.
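A minimal sketch of the MongoDB option, assuming the official `mongodb` npm driver; the database, collection, and field names are made up for illustration.

```js
// Minimal sketch: MongoDB stores JSON documents natively, so each lookup is
// just a query (assumes a local MongoDB and the official `mongodb` driver).
const { MongoClient } = require('mongodb');

async function lookup(id) {
  const client = new MongoClient('mongodb://localhost:27017');
  await client.connect();
  try {
    const docs = client.db('appCache').collection('bigJson');
    return await docs.findOne({ id }); // one round trip per access
  } finally {
    await client.close();
  }
}

lookup('item-42').then(console.log).catch(console.error);
```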
3) Use files (not recommended; a sketch follows the cons below). When your application starts/restarts, load your objects from a file into a global array, and when it closes/shuts down, dump the objects from the global array back into the file.
Pros:
Access objects fast.
Cons:
Heap memory pressure if your objects are huge.
Data loss if your application crashes.
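A minimal sketch of the file-based option, assuming a made-up ./cache.json path: load once at startup, write back on shutdown.

```js
// Minimal sketch: global in-memory object backed by a file (./cache.json is a
// made-up path). A hard crash still loses anything not yet written back.
const fs = require('fs');

let cache = {};

function loadCache() {
  try {
    cache = JSON.parse(fs.readFileSync('./cache.json', 'utf8'));
  } catch (err) {
    cache = {}; // first run or unreadable file
  }
}

function dumpCache() {
  fs.writeFileSync('./cache.json', JSON.stringify(cache));
}

loadCache();
// Flush on normal shutdown signals.
process.on('SIGINT', () => { dumpCache(); process.exit(0); });
process.on('SIGTERM', () => { dumpCache(); process.exit(0); });
```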
In the end it's your choice: if speed matters, choose Redis; if you want the easy way, choose MongoDB; if losing some of your data isn't a problem, go for files. You can also mix options 2 and 3.
Related
I am developing a Node.js application which reads a JSON list from a centralised DB.
The list object is around 1.2 MB (if kept in a txt file).
The requirement is that the data is refreshed every 24 hours, so I set up a cron job for it.
After fetching the data I keep it in a DB (Couchbase) which runs locally on my server.
Data access is very frequent; I get around 1 or 2 requests per second and nearly all requests need that object.
Is it better to keep that object as an in-memory object in Node.js, or to keep it in the local DB?
What are the advantages and disadvantages of both?
The object is only read by requests; it is written only once by the cron job.
It's a high-end system: i7 quad core, 16 GB RAM.
It depends on your hardware.
If this object is immutable across requests, it's better to keep it in memory. If not, it depends.
In any case, the workflow open connection to DB - fetch data - return result - free data will consume more resources than caching in memory.
For example, in our project we process high-definition images and keep all objects in memory, 3-7 MB each in raw format. Tests show that this is much more efficient than using any caching system such as Redis or Couchbase.
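To make the in-memory option concrete, here is a minimal sketch of a cron-refreshed cache. The `node-cron` package and the fetchListFromDb() helper are assumptions for illustration, not part of the original setup.

```js
// Minimal sketch: keep the 1.2 MB object in process memory and refresh it
// every 24 hours. `node-cron` and fetchListFromDb() are illustrative stand-ins.
const cron = require('node-cron');

let cachedList = null; // lives for the lifetime of the process

async function fetchListFromDb() {
  // ...replace with the real Couchbase / central-DB query; a dummy object
  // keeps the sketch runnable.
  return { refreshedAt: new Date().toISOString(), items: [] };
}

async function refresh() {
  cachedList = await fetchListFromDb();
}

// Warm the cache at startup, then refresh once a day at midnight.
refresh().catch(console.error);
cron.schedule('0 0 * * *', () => refresh().catch(console.error));

// Request handlers read the object synchronously, no DB round trip per request.
function handleRequest(req, res) {
  res.json(cachedList); // Express-style response assumed
}
```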
I would keep the most recent version as an in-memory object and store it in the DB as well. That way you have a backup if anything crashes. If you edit the data, however, I would keep it only as a database object.
Accessing the DB for that object once or twice per second would probably work fine, but 1.2 MB of memory is not that much, and if you can keep it contained your server won't likely run into problems.
The DB is a little slow compared to memory, but has the advantage of (most likely) being thread-safe. If you edited the document, you could run into concurrency problems with an in-memory object.
You know the application and the requirements; you should be able to tell whether you need a thread-safe database or whether you need to save memory on the server. If you don't know, we would need to see the actual code and use cases to tell you what would work best.
I'm trying to get my head around the use of IndexedDB. I have an SQL database which I access via REST, and I'm planning on providing some local caching using IndexedDB.
My SQL structure uses a large (and variable) number of tables, each table storing an array of data (time sequence and value) for a specific sensor value. Ideally, I would have assumed I'd create a new object store for each of my MySQL tables. However, it seems you can only create an object store when the database is opened, which is a bit of a pain.
So, I see a number of options:
1) I could use a single object store and add two indexes: one for the time and one for the sensor. I'm a little worried that this might have performance issues, but I'm not sure how data is stored under the hood.
2) I could probably detect a new sensor somehow and reopen the database with a new version number. This just feels a little wrong to me.
3) I could alternatively use a different database for each sensor, but I've read somewhere that it's not recommended to use multiple databases (although it's unclear why, since this is possibly the easiest solution).
I'd welcome any thoughts people have regarding the best structure for this sort of data, that will provide good performance.
If your data sets are independent, e.g. you don't need to combine results from multiple sensors, I suggest you split them into different tables and/or different databases. The separate-database option is more convenient for deleting data.
IndexedDB performance in a single database starts to become a concern beyond roughly 50K records, depending on browser and hardware. I have a couple of tests which can measure the speed; just tweak the size of the object that is inserted and you can test your use case.
If you have fewer than 10K records per sensor (object store/database) you won't hit big performance issues. One common mistake when inserting a batch of data is using a separate transaction for each insert - this is completely unnecessary, since you can store 10K records in one transaction. If you are working with an even larger data set, you can split the inserting across a couple of transactions so you won't block reads from that database.
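A minimal sketch of the one-transaction batch insert; the database, store, index, and field names are made up for illustration.

```js
// Minimal sketch: all records go through a single readwrite transaction
// instead of one transaction per record.
const batch = [
  { sensorId: 'temp-01', time: Date.now(), value: 21.4 },
  { sensorId: 'temp-01', time: Date.now() + 1000, value: 21.6 },
];

const request = indexedDB.open('sensors', 1);

request.onupgradeneeded = (event) => {
  const db = event.target.result;
  const store = db.createObjectStore('readings', { keyPath: 'id', autoIncrement: true });
  store.createIndex('bySensor', 'sensorId');
  store.createIndex('byTime', 'time');
};

request.onsuccess = (event) => {
  const db = event.target.result;
  const tx = db.transaction('readings', 'readwrite');
  const store = tx.objectStore('readings');

  for (const reading of batch) { // one transaction covers the whole batch
    store.put(reading);
  }

  tx.oncomplete = () => db.close();
};
```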
Also, every transaction you do in IndexedDB needs an open connection. Some people take the approach of keeping one connection alive and reusing it; I prefer opening and closing a separate connection for each transaction.
Also, for faster access you can store metadata about all your databases in localStorage; that way you can track how many databases you have and keep a description of each of them.
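A small sketch of that localStorage registry idea; the key name and fields are made up.

```js
// Minimal sketch: keep a small registry of the per-sensor databases in
// localStorage so you can enumerate them without opening each one.
function registerSensorDb(name, description) {
  const registry = JSON.parse(localStorage.getItem('sensorDbRegistry') || '{}');
  registry[name] = { description, createdAt: Date.now() };
  localStorage.setItem('sensorDbRegistry', JSON.stringify(registry));
}

function listSensorDbs() {
  return JSON.parse(localStorage.getItem('sensorDbRegistry') || '{}');
}

registerSensorDb('sensor-temperature', 'time/value pairs for the temperature probe');
console.log(Object.keys(listSensorDbs()));
```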
Additionally you can take a look at this similar question
I know memcached and Redis are used when caching needs to be shared by more than one server.
I'm creating a Node application which will run on a single server only and uses MySQL as its DB, and I need to hash around 100,000 keys, each containing a JSON string of about 200 characters, so that I don't have to call MySQL for reads.
If I use memcached or Redis I will use a callback to get my data, but if I use a JavaScript hash I can get the data synchronously. Will it affect the application somehow, like high memory usage? Which one should I be using for an application like this?
I know memcached and Redis are used when caching needs to be shared by more than one server.
Not necessarily; for instance, Facebook puts a memcached instance in front of each of their MySQL servers. You can use Redis/memcached for fast computation (e.g. real-time analytics) without having a whole cluster.
and I need to hash around 100,000 keys, each containing a JSON string of about 200 characters, so that I don't have to call MySQL for reads.
It seems like premature optimization to me; if MySQL has enough RAM (so the dataset lives in memory) you don't have to worry about performance, it's just 100,000 keys.
If I use memcached or Redis I will use a callback to get my data
It really depends on what language you use (Ruby and Python offer synchronous Redis clients) and what paradigm is used (event loop, thread pool...).
but if I use a JavaScript hash I can get the data synchronously
To be more precise, the need for a callback in the other case comes from node_redis being an asynchronous client, not from anything special about a JavaScript "hash" (which is really just an object).
but will it affect the application somehow, like high memory usage
It depends on whether you load all the keys into your process or not. If you use a Redis hash, you will be able to query only the field you want instead of the whole hash each time.
Which one should I be using for an application like this?
The best thing to keep in mind is to lower the number of applications you have to maintain in your stack while still using the right tool for the job. Here MySQL could be enough, but if you really want to use Redis or memcached, I would go for Redis. It offers roughly the same features and performance as memcached, while letting you use its other data structures in the future without adding another application to your stack.
Moreover, if you put all your data in a Redis hash, you will be able to retrieve a single field (HGET), a group of fields (HMGET), or all fields (HGETALL) with just one call.
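For illustration, a small sketch of the Redis hash approach from Node, assuming the `redis` npm package (v4+) and made-up key and field names.

```js
// Minimal sketch: store each record as a field of one hash and read it back
// with HGET / HMGET / HGETALL (key and field names are made up).
const { createClient } = require('redis');

async function demo() {
  const client = createClient();
  await client.connect();

  await client.hSet('users', {
    'user:1': JSON.stringify({ name: 'Ada', plan: 'pro' }),
    'user:2': JSON.stringify({ name: 'Alan', plan: 'free' }),
  });

  const one = await client.hGet('users', 'user:1');               // single field
  const some = await client.hmGet('users', ['user:1', 'user:2']); // group of fields
  const all = await client.hGetAll('users');                      // every field
  console.log(JSON.parse(one), some.length, Object.keys(all).length);

  await client.quit();
}

demo().catch(console.error);
```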
Finally, regarding recent usage statistics and the Redis ecosystem (GUIs, hosting, libraries, ...), Redis seems to be far more future-proof than memcached if you really want to go that way.
Disclaimer: I am the founder of Redsmin, an online developer-oriented service for administrating and monitoring Redis.
It depends - you could even opt for memcached over MySQL :). For a simple read-only use case, just storing it within your JavaScript code (as dictionary objects, I believe) is enough. But be sure that you have enough RAM :).
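A minimal sketch of that plain in-memory option, assuming the `mysql2` package and made-up table/column names: load everything into a Map at startup and read it synchronously afterwards.

```js
// Minimal sketch: warm an in-process Map from MySQL once, then serve reads
// synchronously (table and column names are made up).
const mysql = require('mysql2/promise');

const cache = new Map();

async function warmCache() {
  const conn = await mysql.createConnection({ host: 'localhost', user: 'app', database: 'app' });
  const [rows] = await conn.execute('SELECT cache_key, json_value FROM cache_entries');
  for (const row of rows) {
    cache.set(row.cache_key, JSON.parse(row.json_value));
  }
  await conn.end();
}

// ~100,000 entries of ~200 bytes is roughly 20 MB of raw string data,
// well within a typical Node heap.
warmCache().then(() => {
  console.log('cached entries:', cache.size);
  console.log(cache.get('some-key')); // synchronous lookup, no callback
});
```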
I'm working on a PhoneGap app and find it most efficient to cache as much user data as possible on the device when the user logs in (profile information, etc.). Due to project constraints, I can't use local storage.
I'm making an API call and pulling back JSON data that I use to power the app. My specific question: is it somewhat safe to assume that the byte size of the JSON response will be roughly equal to the memory consumed? i.e. if the API call response is 200k of JSON data, will about 200k of memory be used to store it in a JavaScript object?
The best answer is, "Test it."
As several commenters have said, under many common conditions there will be a rough 1:1 correspondence, but there are so many caveats and gotchas, some of which are engine-specific, that it's impossible to say with any certainty.
In other words, if you test it and it looks like the assumption holds under your particular use-case, then 'somewhat safe' is a reasonable assumption. Just don't bet the farm on it.
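As a starting point for that test, here is a rough Node.js sketch; browser engines need their own dev-tools memory profiler instead, so treat the numbers as indicative only.

```js
// Minimal sketch: compare heap usage before and after parsing a JSON payload
// of a known byte size (the sample payload below is made up).
const payload = JSON.stringify(
  Array.from({ length: 5000 }, (_, i) => ({ id: i, name: `user-${i}`, active: i % 2 === 0 }))
);

global.gc && global.gc(); // run with `node --expose-gc` for a cleaner baseline
const before = process.memoryUsage().heapUsed;

const parsed = JSON.parse(payload);

const after = process.memoryUsage().heapUsed;
console.log('payload bytes:', Buffer.byteLength(payload));
console.log('approx heap delta:', after - before, 'bytes');
console.log('records:', parsed.length);
```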
I recall reading that developers should think of records in a store as rows in a database where each column is a simple data type. We store complex JS objects in Ext JS stores without any apparent ramifications.
Does anybody know of pitfalls with storing JS objects in an Ext JS store?
Modern web browsers have gotten very good at managing this kind of memory usage and processing speed. Internally we had implemented the sort of record associations that Ext JS 4 has built in now, and we have scenarios with ~250k complex nested records stored without any real problem. I believe that the negligible impact on performance will continue to hold up for a long time, as the framework is also pretty good about cleaning up its own memory usage.
We mirrored our web server's ORM models to Ext JS record definitions, and regularly query against these nested stores in much the same way we would hit a more traditional database. You have to be careful what you do with it, e.g. trying to render a grid of 250k records at once will not work out very well, but that is almost entirely the cost of DOM rendering and not the iteration or storage of the record/store data. All of this seems to be even more true when testing with the recent Ext JS 4 beta releases.
Looking at the Ext JS source, it seems like a Store is a wrapper around an Object which offers sorting/filtering and event functionality. It's simply key/value pairs.
The complexity of the object should not cause any problems.
Now, treating a Store as anything but a wrapper around a key/value pair collection is going to cause problems. Thinking of it as a table is going to cause problems. That kind of misunderstanding leads to poorly designed code.
A Store should be treated as a Bag of key/value data with helper methods to organise that bag.
I suppose there can potentially be a performance impact if your JS objects are very large; it could take some time if you end up doing a lot of serialization/deserialization of those objects. If you are dealing with grids, you can mitigate this by using pagination.
Stores are not necessarily used strictly for grid rows. They are used in many Ext objects such as comboboxes (drop down menus). Here it is used with a key/value pair. Usually this is done for a value and a displayValue relationship for the data.
If you need an even lower level object to work with, check out the Ext.util.MixedCollection object. There are lots of fun stuff in there. It's basically a hashmap of key/value pairs. I believe in the Ext source code, Stores use these objects at its core.
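For illustration, a small sketch of MixedCollection used directly (Ext JS 4-style API; the keys and objects are made up).

```js
// Hedged sketch: Ext.util.MixedCollection behaves like a hashmap of key/value
// pairs with helper methods, which is what a Store uses under the hood.
var sensors = new Ext.util.MixedCollection();

sensors.add('temp-01', { label: 'Temperature probe 1', value: 21.4 });
sensors.add('hum-01',  { label: 'Humidity probe 1',   value: 0.46 });

console.log(sensors.get('temp-01').label); // lookup by key
console.log(sensors.getCount());           // number of items

sensors.each(function (item) {             // iterate like a collection
    console.log(item.label, item.value);
});
```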