Best way to parse JSON files from a POST request in Node.js - javascript

Scenario: I have a Node.js REST server which accepts a JSON file, parses it, and then inserts it into a DB. I expect hundreds of hits per second.
Requirement: Only insertions are done after parsing the JSON from the request. Since Node.js is single-threaded and JSON.parse is synchronous, how can I increase performance? What is the correct design pattern for maximum performance in Node.js?

Before designing a more complex server (maybe with worker threads), you need to profile the actual performance. The bottleneck might not be the JSON parsing.
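As a first step, a minimal sketch of that kind of measurement, assuming an Express handler and a hypothetical insertIntoDb function; the point is to see whether parsing or the DB insert dominates before restructuring anything:

const express = require('express');
const app = express();

// Receive the raw body as text so JSON.parse can be timed separately from the insert.
app.post('/ingest', express.text({ type: 'application/json', limit: '10mb' }), async (req, res) => {
  const t0 = process.hrtime.bigint();
  const doc = JSON.parse(req.body);              // the suspected bottleneck
  const t1 = process.hrtime.bigint();
  await insertIntoDb(doc);                       // hypothetical DB insert
  const t2 = process.hrtime.bigint();
  console.log(`parse: ${(t1 - t0) / 1000n}us, insert: ${(t2 - t1) / 1000n}us`);
  res.sendStatus(201);
});

app.listen(3000);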

Related

Streaming JSON to Redis

I have a large JSON file that I would like to store in Redis. The problem is that when I parse it I run out of memory in Node.js.
I extended the heap memory from 1.39GB to 4GB and it's still happening, and I believe I am not doing it properly.
After a lot of searching, I found out that streaming is my best bet. The thing is I am not really fluent with streaming and I am not sure it would even resolve my problem.
I have read a lot and there is a lot of scattered information, so I was wondering whether this approach is even feasible and correct:
Would I be able to stream a JSON object into Redis?
Do I have to stringify it, or will that happen automatically?
Should I stringify my JSON chunk by chunk?
Or will streaming into Redis end up being a string anyway?
I am using the ioredis client to interact with Redis.
I appreciate your help in advance.
If you can guarantee that only one process will be updating that key, you could possibly use SETRANGE. As you parse the file, you can keep a reference to the next offset:
(pseudo-code)
offset = 0
offset = redis.set_range(key, offset, "string")
Then you can load pieces of the file up to Redis without having to load everything into memory at once.
SETRANGE returns the length of the string after it was modified.
This also assumes that you can load pieces of the file contents without having to parse everything as JSON then convert it back to a string. Also assumes that only one process is updating that key -- if multiple processes try to update it, the JSON value can get corrupted.
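A minimal sketch of that approach with ioredis and a file read stream; the key name and chunking are assumptions about your setup, and the raw file text is written as-is without parsing it as JSON:

const fs = require('fs');
const Redis = require('ioredis');

const redis = new Redis();
const key = 'big:json:doc';                      // hypothetical key name

async function streamFileToRedis(path) {
  await redis.del(key);                          // start from an empty value
  let offset = 0;
  for await (const chunk of fs.createReadStream(path, { encoding: 'utf8' })) {
    // SETRANGE returns the string's new length, i.e. the next write offset
    offset = await redis.setrange(key, offset, chunk);
  }
  return offset;                                 // total bytes written
}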

Call a stored procedure via javascript on the client side?

Is it possible to call a stored procedure within javascript on the client side?
I know how to do it on the server side, but I am interested in doing it on the client side.
Basically it boils down to directly contacting a SQL server from within the client. Is that possible?
tldr; No, it is not possible to connect to SQL Server 'directly' from browser-JavaScript [1].
JavaScript can "speak" HTTP and WebSockets, but SQL Server "speaks" TDS. To communicate there needs to be a common medium/protocol that both the client and server use.
While there are WebSocket proxies that technically make this possible, it still requires a separate proxy service (and you'd still have to write/find a JavaScript TDS driver). I don't recommend eliminating the controlled access layer.
Likewise, an HTTP proxy where raw SQL commands are sent to/from the client could be used. I wouldn't advise this either, but some do exist.
External code/libraries (eg. ActiveX, Java) could establish the SQL connection and proxy through to the JavaScript client.
In all of these cases there is an intermediate helper and browser-JavaScript never connects 'directly'.
[1] JavaScript is a language and this answer focuses on a browser implementation with browser-supported libraries/functions. One could argue that using Node modules would still 'be JavaScript', and they would be correct ... in a different environment.
You cannot establish a direct connection to a database from a client's web browser. What you will need to do is create a server side application to expose an API for getting the data over HTTP.
Take a look at Microsoft's ASP.NET Web API
Sort of
You could create an endpoint that is a wrapper for stored procedure(s) that takes the procedure name as a parameter, as well as the parameters for the procedure.
Once you have such a mechanism in place, you can create endpoints that expose procedures automagically.
http://yourserver/services/yourprocname?prm1=val,prm2=val,etc
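A minimal client-side sketch of calling such a wrapper endpoint from the browser, assuming the endpoint above exists and returns JSON (query parameters are joined with & here rather than commas):

// Hypothetical wrapper endpoint that executes the named stored procedure and returns JSON
const url = 'http://yourserver/services/yourprocname?prm1=val&prm2=val';

fetch(url)
  .then((res) => {
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return res.json();
  })
  .then((rows) => console.log(rows))
  .catch((err) => console.error('Procedure call failed:', err));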
If you feel really ambitious you can try out SQL 2016 and return JSON directly from those procedures. Then you can nest data using subqueries and return the JSON in a single payload. No serialization, no objects, just read the data and return it.
Before 2016 you could put the results into a Dictionary<string, object> and use Newtonsoft to serialize it. Assuming you are returning flat data you'd be good to go. Just use a reader and take the metadata from the column names for the key, and the value as object. Newtonsoft will convert that into JSON for you.
If you are returning hierarchical data you could (by convention) create a series of runners that take the reader and pump it into a Dictionary<string, object> where the object is another Dictionary<string, object>. Again, Newtonsoft will help you out with the serialization.
Hope this helps. We are using this approach with 2016 and it is very nice to be able to create a stored procedure and call it without any middle-tier code, deployment, etc. It just works.
Yes, you can connect to SQL Server from the client side directly by using WebAssembly. You can write your function that calls SQL Server in C or C++ first, compile it to .wasm with the Emscripten compiler, and then call the C or C++ code from JavaScript. In the future, you should be able to do that with C# as well, but Microsoft has only just started work on that.
I am writing a post about it, and I will share it when it's ready.
Now, just because you can do it doesn't mean you should, because of the security issues. But I am not here to give a lecture about what you should or should not do.

NodeJS JSON.stringify() bottleneck

My service returns responses of very large JSON objects - around 60MB. After some profiling I have found that it spends almost all of the time doing the JSON.stringify() call, which is used to convert the object to a string and send it as a response. I have tried custom implementations of stringify and they are even slower.
This is quite a bottleneck for my service. I want to be able to handle as many requests per second as possible - currently 1 request takes 700ms.
My questions are:
1) Can I optimize the sending of response part? Is there a more effective way than stringify-ing the object and sending the response?
2) Will using the async module and performing the JSON.stringify() in a separate thread improve the overall number of requests/second (given that over 90% of the time is spent in that call)?
You've got two options:
1) Find a JSON module that will allow you to stream the stringify operation and process it in chunks. I don't know if such a module is out there; if not, you'd have to build it. EDIT: Thanks to Reinard Mavronicolas for pointing out JSONStream in the comments (see the sketch after this list). I've actually had it on my back burner to look for something like this, for a different use case.
2) async does not use threads. You'd need to use cluster or some other actual threading module to drop the processing into a separate thread. The caveat here is that you're still processing a large amount of data; you gain bandwidth by using threads, but depending on your traffic you may still hit a limit.
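A minimal sketch of option 1 using the JSONStream package, assuming the 60MB payload can be produced as a sequence of records, and that app and getRecords() stand in for your HTTP framework (Express-style here) and data source:

const JSONStream = require('JSONStream');

app.get('/report', (req, res) => {
  res.type('application/json');
  const out = JSONStream.stringify();            // emits '[', separators and ']' around records
  out.pipe(res);                                 // chunks reach the client as they are produced
  for (const record of getRecords()) {           // getRecords() is a hypothetical data source
    out.write(record);
  }
  out.end();
});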
Some years later, this question has a new answer to the first part: the yieldable-json lib.
As described in this talk by Gireesh Punathil (IBM India), this lib can process a 60MB JSON payload without blocking the Node.js event loop, letting you accept new requests and improve your throughput.
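A minimal sketch of that approach, assuming yieldable-json's callback-style stringifyAsync API and an Express-style handler; buildHugeObject() is a placeholder for however the 60MB response is assembled:

const yj = require('yieldable-json');

app.get('/big-report', (req, res) => {
  yj.stringifyAsync(buildHugeObject(), (err, json) => {   // buildHugeObject() is a placeholder
    if (err) return res.sendStatus(500);
    res.type('application/json').send(json);
  });
});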
For the second part, as of Node.js 11 you can use worker threads (still experimental) to increase your web server's throughput.
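And a minimal sketch of the worker-thread approach, assuming Node.js with the worker_threads module available; the stringify happens off the main event loop, though postMessage still has to clone the object into the worker:

// stringify-worker.js
const { parentPort } = require('worker_threads');
parentPort.on('message', (obj) => {
  parentPort.postMessage(JSON.stringify(obj));
});

// main.js
const { Worker } = require('worker_threads');

function stringifyInWorker(obj) {
  return new Promise((resolve, reject) => {
    const worker = new Worker('./stringify-worker.js');
    worker.once('message', (json) => { worker.terminate(); resolve(json); });
    worker.once('error', reject);
    worker.postMessage(obj);
  });
}

Note that postMessage still has to structured-clone the object into the worker, so the main thread is not entirely free of the cost.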

Memcache vs Redis vs Javascript Hash object

I know memcache and redis are used when caching needs to span more than one server.
I'm creating a Node application which will run on a single server only and uses MySQL as its DB, and I need to hold around 100,000 keys in a hash, each containing a JSON string of 200 characters in length, so that I don't have to call MySQL for reads.
If I use memcache or redis I will use a callback to get my data, but if I use a JavaScript hash I can get the data synchronously. Will that affect the application somehow, like high usage of memory? Which one should I be using for an application like this?
I know memcache and redis are used when caching needs to span more than one server.
Not necessarily; for instance, Facebook puts a memcache instance in front of each of their MySQL servers. You can use Redis/Memcached for fast computation (e.g. real-time analytics) without having a whole cluster.
and I need to hold around 100,000 keys in a hash, each containing a JSON string of 200 characters in length, so that I don't have to call MySQL for reads
It seems like premature optimization to me; if MySQL has enough RAM (the dataset lives in memory) you don't have to worry about performance, that's just 100k keys.
If I use memcache or redis I will use a callback to get my data
It really depends on what language you use (Ruby and Python offer synchronous Redis clients) and what type of paradigm is used (event loop, thread pool...).
but if I use a JavaScript hash I can get the data synchronously
To be more precise, that's just because you are using node_redis and not because you are using a JavaScript "hash" (an object, in fact).
but will it affect the application somehow, like high usage of memory
It depends on whether you are loading all keys into your process or not; if you are using a Redis Hash, you will be able to query only the field you want and not the whole hash each time.
Which one should I be using for an application like this?
The best thing to keep in mind is to lower the number of applications you have to maintain in your stack while still using the right tool for the right job. Here MySQL could be enough, but if you really want to use Redis or Memcached, I would go for Redis. It offers roughly the same features as memcached with the same performance, while allowing you to use its other data structures in the future without needing another application in your stack.
Moreover, if you put all your data in a Redis HASH, you will be able to retrieve a field (hget) or a group of fields (hmget) or all fields (hgetall) with just one call.
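A minimal sketch of that layout with ioredis, assuming each of the 100,000 records is stored as a field of one hash keyed by its ID; the key and field names are placeholders:

const Redis = require('ioredis');
const redis = new Redis();

async function demo() {
  // Write one record as a field of the hash (the value is the ~200-char JSON string)
  await redis.hset('cache:records', '42', JSON.stringify({ id: 42, name: 'example' }));

  // Read back a single field, a group of fields, or all fields
  const one  = JSON.parse(await redis.hget('cache:records', '42'));
  const some = await redis.hmget('cache:records', '42', '43', '44'); // array of JSON strings (or nulls)
  const all  = await redis.hgetall('cache:records');                 // { field: jsonString, ... }
  return { one, some, all };
}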
Finally, regarding recent statistics and the Redis ecosystem (GUIs, hosting, libraries, ...), Redis seems to be way more future-proof than Memcached if you really want to go that way.
Disclaimer: I am the founder of Redsmin, an online developer oriented service for administrating and monitoring Redis.
It depends: you could even opt for memcached over MySQL :). For simple read-only operations, just storing it within your JavaScript code (I believe as dictionary objects) is enough. But be sure that you have enough RAM :).
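For completeness, a minimal sketch of that in-process approach, assuming the rows can be loaded from MySQL once at startup (db.query here is a hypothetical mysql2-style client) and kept in a Map keyed by ID; 100,000 entries of ~200-character strings is on the order of tens of MB of heap:

// Load once at startup, then serve reads synchronously from memory.
const cache = new Map();

async function warmCache(db) {                   // db is a hypothetical MySQL client
  const [rows] = await db.query('SELECT id, payload FROM records');
  for (const row of rows) {
    cache.set(row.id, row.payload);              // payload is the ~200-char JSON string
  }
}

function getRecord(id) {
  const json = cache.get(id);                    // synchronous lookup, no callback needed
  return json ? JSON.parse(json) : null;
}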

Javascript Fastest Local Database

What would be the best format for storing a relatively large amount of data (essentially a big hashmap) for quick retrieval using javascript? It would need to support Unicode as well.
XML, JSON?
Gigantic JavaScript objects are generally a sign that you're trying to do something you really shouldn't be doing. XML is even worse; it has to be parsed to form meaningful data.
In this case, an AJAX query to a RESTful interface backed by a proper database would probably serve you well.
Javascript object access (particularly for any query beyond accessing a single item by its hash) is very slow compared to even a basic database.
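A minimal browser-side sketch of that approach, assuming a hypothetical /api/lookup endpoint that performs the query against the backend database and returns JSON:

// Ask the backend to do the lookup instead of shipping the whole dataset to the browser.
async function lookup(key) {
  const res = await fetch(`/api/lookup?key=${encodeURIComponent(key)}`); // hypothetical endpoint
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();                             // Unicode survives, since JSON travels as UTF-8
}

lookup('någon-nyckel').then(console.log);        // works with Unicode keys as well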
There is a nice piece of research by the people at Flickr on this topic. They ended up using CSV over XML and JSON.
JSON definitely beats XML for performance reasons.
But a query against a DB on the backend would probably be the only feasible solution once a certain scale is reached, since local resources cannot match a database's retrieval from a large data store.
