Parse.com batch save from client (saveAll) timing out - javascript

I have a bunch of Parse objects (can be as many as 200) that need to be updated with a common field set to a common (short) string value. I tried using a loop with save() on each one, but that spiked my API usage beyond the limit, as you can imagine, when there were hundreds of them.
So I looked into using saveAll to do a batch save from the JavaScript client. I got the code itself working and it attempts to update all of the objects as expected. The problem appears to be that it bundles the PUTs inside a single batch POST to https://api.parse.com/1/batch, and while the client treats this as a single HTTP operation, the Parse.com servers also treat it as a single operation with respect to the timeout limit.
If I have more than about 5 objects in the batch it times out (giving error 124), since for some reason each individual save in the batch appears to take ~3 seconds according to Chrome's network panel. How can a single save take so long?
This also raises the question of why it is timing out at all, since each save should be a separate API call (as shown in the requests internal to the batch operation). And since I am running this batch save from the client, shouldn't there be no timeout at all, unlike in Cloud Code where the limit is 15 seconds?
Can someone help me understand this? It is a huge bottleneck and I cannot figure out any other workaround. Saving a batch of 5+ objects (with only a single string field dirty) shouldn't be so arduous!
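For reference, a minimal sketch of the kind of batch update described above, using the old Parse JavaScript SDK's Parse.Object.saveAll and splitting the objects into smaller chunks so no single batch runs long enough to hit the timeout (the chunk size of 25, the photos array, and the field/value names are placeholders):

// Set the common field on every object, then save in small sequential chunks;
// each saveAll() call is one batch request against /1/batch.
function saveInChunks(objects, chunkSize) {
  var chunks = [];
  for (var i = 0; i < objects.length; i += chunkSize) {
    chunks.push(objects.slice(i, i + chunkSize));
  }
  return chunks.reduce(function (promise, chunk) {
    return promise.then(function () {
      return Parse.Object.saveAll(chunk);
    });
  }, Parse.Promise.as());
}

photos.forEach(function (obj) { obj.set("status", "processed"); }); // placeholder field/value
saveInChunks(photos, 25);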

Since the objects are all being updated with the same string in the same field, have you considered using a collection? As the docs say, you can create a new subclass using either a model class or a particular Parse.Query. The update code is simple:
collection.reset([
  { "name": "Hawk" },
  { "name": "Jane" }
]);
https://parse.com/docs/js_guide#collections

Related

Choosing the right pattern for filtering large arrays - REST vs local array

Which is the best approach for an app that needs to filter data, say 5000+ records, with response speed as the main concern?
Filter local in-memory arrays
Query the db through HTTP API request calls
For my app I use AngularJS, PHP and SQLite3. Right now I load all records from the SQLite db into my table and then filter them by search. All works great, but once I exceed 3000 records I notice a certain slowdown. By limiting the search to two fields, I get better performance.
My question is whether changing the model and querying the db would give better performance or not.
Local array advantages
I can use the JavaScript Array map() method
low data bandwidth consumption
I can see all records in the table before filtering
I can keep working offline after the data has loaded.
Local array disadvantages
performance slows down beyond 2000 records.
So can you help me evaluate the advantages and disadvantages of making an HTTP API call for each filter request, with performance in focus?
I can't speak to caching in PHP, but on the AngularJS end there's an approach you can follow:
1. When the user searches for the first time, fetch the result(s) from the db.
2. Make 2 copies of the data: one presented to the user directly, the other stored in a local JSON object with a key-value approach.
3. The next time the user searches for anything, look in the local JSON first. If the data is present locally there is no need for a db query; otherwise make the db query and repeat step 2.
The idea is not to make the user wait for every search. You cannot simply pull all 5000+ records at once and store them locally, and you definitely cannot make a db query every time, since an RDBMS with that many records will have performance issues.
So this seems best to me.
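A minimal sketch of that caching idea on the AngularJS side (the app module, the /api/records endpoint, and the q parameter are all hypothetical):

// Cache results in a plain object keyed by the search term, so a repeated
// search is served locally and never triggers another db query.
app.factory('recordSearch', ['$http', '$q', function ($http, $q) {
  var cache = {};
  return function search(term) {
    if (cache[term]) {
      return $q.when(cache[term]);          // step 3: serve the local copy
    }
    return $http.get('/api/records', { params: { q: term } }).then(function (response) {
      cache[term] = response.data;          // step 2: keep a local copy
      return response.data;
    });
  };
}]);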

Parse.com save slow with pointers to files

I have a class called Photo with Parse.Files for different sizes of the photo and a bunch of photo metadata. Everything works fine until I try to update one of the non-File fields on a fully populated Photo object. If all of the File fields (original photo, resized photo, and thumbnail) are populated, then updating a single String metadata field takes >3 seconds!
I have checked save performance against every other class I have and they all update as quickly as expected (<1 second). Why would it be checking whether the File binaries are dirty (or whatever else it might be doing on the Parse server during the save), making the save operation take so long?
Here is an image of the performance of a Photo object save in Chrome's network panel:
And this is an example of a save for an object of a class that has only primitive data types (no files):
Anyone have any insights about what is going on or how I can get around this? 3 seconds is way too long just to update a String field on a Parse object!
I was able to determine that this was actually my afterSave handler for this class doing unnecessary work.
It was only supposed to run at initial object creation, but it was running on every save and sometimes even recursively. I added some logic to get the desired behavior and everything now works as expected.
A good point to note is that the HTTP request for save() does not return until the afterSave cloud module has completely finished running. This wasn't clear to me until after thorough testing.
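A minimal sketch of that kind of guard, assuming the old Parse Cloud Code API and using existed() to skip objects that have been saved before (the class name and the initialization step are placeholders):

Parse.Cloud.afterSave("Photo", function (request) {
  if (request.object.existed()) {
    // The object was created earlier; skip the expensive work so that
    // routine field updates return quickly.
    return;
  }
  // ...one-time initialization for newly created Photo objects goes here...
});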

NodeJS JSON.stringify() bottleneck

My service returns very large JSON responses - around 60 MB. After some profiling I have found that it spends almost all of its time in the JSON.stringify() call used to convert the object to a string before sending it as the response. I have tried custom implementations of stringify and they are even slower.
This is quite a bottleneck for my service. I want to be able to handle as many requests per second as possible - currently 1 request takes 700ms.
My questions are:
1) Can I optimize the response-sending part? Is there a more effective way than stringify-ing the object and sending the string as the response?
2) Will using the async module and performing the JSON.stringify() in a separate thread improve the overall number of requests/second (given that over 90% of the time is spent in that call)?
You've got two options:
1) find a JSON module that will allow you to stream the stringify operation and process it in chunks. I don't know if such a module is out there; if it's not, you'd have to build it. EDIT: Thanks to Reinard Mavronicolas for pointing out JSONStream in the comments. I've actually had it on my back burner to look for something like this, for a different use case.
2) async does not use threads. You'd need to use cluster or some other actual threading module to drop the processing into a separate thread. The caveat here is that you're still processing a large amount of data; you gain some headroom by using threads, but depending on your traffic you may still hit a limit.
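A minimal sketch of the streaming idea using JSONStream, assuming the 60 MB payload is (or can be produced as) an array of records and res is the usual HTTP response stream:

var JSONStream = require('JSONStream');

// Stream the records into the response one at a time instead of building
// one giant string in memory with JSON.stringify().
function sendRecords(res, records) {
  res.setHeader('Content-Type', 'application/json');
  var out = JSONStream.stringify();   // wraps the written objects in "[ ... ]"
  out.pipe(res);
  records.forEach(function (record) {
    out.write(record);
  });
  out.end();
}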
Some years later, there is a new answer to the first question: the yieldable-json lib.
As described in this talk by Gireesh Punathil (IBM India), this lib can process a 60 MB JSON payload without blocking the Node.js event loop, letting you keep accepting new requests and so improve your throughput.
For the second question: with Node.js 11 you can use the (still experimental) worker threads to increase your web server's throughput.
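A hedged sketch of that worker-thread option using the built-in worker_threads module (experimental in Node.js 11, behind a flag in some versions). Note that the object still has to be cloned into the worker and the resulting string copied back, so this mainly keeps the event loop responsive rather than reducing total work:

// The same file acts as both main thread and worker.
const { Worker, isMainThread, parentPort } = require('worker_threads');

if (isMainThread) {
  // Offload JSON.stringify() so the event loop stays free to accept requests.
  function stringifyInWorker(payload) {
    return new Promise((resolve, reject) => {
      const worker = new Worker(__filename);
      worker.once('message', (json) => { resolve(json); worker.terminate(); });
      worker.once('error', reject);
      worker.postMessage(payload);
    });
  }
  module.exports = { stringifyInWorker };
  // usage elsewhere: stringifyInWorker(bigObject).then(json => res.end(json));
} else {
  parentPort.on('message', (obj) => {
    parentPort.postMessage(JSON.stringify(obj));
  });
}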

Reduce requested file size or reduce number of browser calculations?

I have some data that I want to display on a web page. There's quite a lot of data so I really need to figure out the most optimized way of loading and parsing it. In CSV format, the file size is 244K, and in JSON it's 819K. As I see it, I have three different options:
Load the web page and fetch the data in CSV format as an Ajax request. Then transform the data into a JS object in the browser (I'm using a built-in method of the D3.js library to accomplish this).
Load the web page and fetch the data in JSON format as an Ajax request. Data is ready to go as is.
Hard code the data in the main JS file as a JS object. No need for any async requests.
Method number one has the advantage of reduced file size, but the disadvantage of having to loop through all (2700) rows of data in the browser. Method number two gives us the data in the end-format so there's no need for heavy client-side operations. However, the size of the JSON file is huge. Method number three has the advantage of skipping additional requests to the server, with the disadvantage of a longer initial page load time.
What method is the best one in terms of optimization?
In my experience, data processing times in JavaScript are usually dwarfed by transfer times and the time it takes to render the display. Based on this, I would recommend going with option 1.
However, what's best really does depend on your particular case -- you'll have to try. It sounds like you have all the code and data you need to do that anyway, so why not run a simple experiment to see which one works best for you.
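A rough sketch of such an experiment, assuming a promise-based D3 (v5+) and hypothetical data.csv / data.json URLs, timing fetch plus parse for each format:

// Compare CSV (smaller download, client-side parsing) with JSON (larger, ready to use).
function timeCsv() {
  var t0 = performance.now();
  return d3.csv('data.csv').then(function (rows) {
    console.log('CSV: ' + rows.length + ' rows in ' + (performance.now() - t0).toFixed(1) + ' ms');
  });
}

function timeJson() {
  var t0 = performance.now();
  return fetch('data.json').then(function (r) { return r.json(); }).then(function (rows) {
    console.log('JSON: ' + rows.length + ' rows in ' + (performance.now() - t0).toFixed(1) + ' ms');
  });
}

timeCsv().then(timeJson);   // run them one after the other so the measurements don't overlap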

How to pass large data to web workers

I am working with web workers and I am passing a large amount of data to the worker, which takes a lot of time. I want to know the most efficient way to send the data.
I have tried the following code:
var worker = new Worker('js2.js');

// Pass the buffers as transferables (second argument) so they are moved, not copied.
worker.postMessage(buffer, [buffer]);
worker.postMessage(obj, [obj.mat2]);

// A successful transfer detaches the original buffer (byteLength becomes 0),
// so a non-zero length here means transferables are not supported.
if (buffer.byteLength) {
  alert('Transferables are not supported in your browser!');
}
UPDATE
Modern versions of Chrome, Edge, and Firefox now support SharedArrayBuffer (though not Safari at the time of this writing; see SharedArrayBuffer on MDN), so that would be another possibility for fast data transfer, with a different set of trade-offs compared to a transferable (see MDN for the trade-offs and requirements of SharedArrayBuffer).
UPDATE:
According to Mozilla, SharedArrayBuffer has been disabled in all major browsers, so the option described in the following EDIT no longer applies.
Note that SharedArrayBuffer was disabled by default in all major
browsers on 5 January, 2018 in response to Spectre.
EDIT: There is now another option: sending a SharedArrayBuffer. This is part of ES2017, under shared memory and atomics, and is now supported in Firefox 54 Nightly. If you want to read about it you can look here. I will probably write something up at some point and add it to my answer, and will try to add it to the performance benchmark as well.
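For completeness, a minimal sketch of the shared-memory approach (worker.js is a hypothetical file; note that where browsers have since re-enabled SharedArrayBuffer, they generally require the page to be cross-origin isolated):

// Main thread: both sides see the same memory, so nothing is copied or transferred.
var sab = new SharedArrayBuffer(4 * 1024);
var shared = new Int32Array(sab);
Atomics.store(shared, 0, 42);      // coordinate reads/writes through Atomics
var worker = new Worker('worker.js');
worker.postMessage(sab);           // the worker receives a handle to the same buffer

// worker.js
onmessage = function (e) {
  var view = new Int32Array(e.data);
  console.log(Atomics.load(view, 0));   // reads 42 without any copy
};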
To answer the original question:
I am working on web workers and I am passing large amount of data to
web worker, which takes a lot of time. I want to know the efficient
way to send the data.
The alternative to @MichaelDibbets' answer, which sends a copy of the object to the web worker, is using a transferable object, which is zero-copy.
Your code shows that you were intending to make your data transferable, but I'm guessing it didn't work out. So I will explain what it means for some data to be transferable, for you and future readers.
Transferring objects "by reference" (although that isn't the perfect term for it, as explained in the next quote) doesn't just work on any JavaScript object. It has to be a transferable data type.
[With Web Workers] Most browsers implement the structured cloning
algorithm, which allows you to pass more complex types in/out of
Workers such as File, Blob, ArrayBuffer, and JSON objects. However,
when passing these types of data using postMessage(), a copy is still
made. Therefore, if you're passing a large 50MB file (for example),
there's a noticeable overhead in getting that file between the worker
and the main thread.
Structured cloning is great, but a copy can take hundreds of
milliseconds. To combat the perf hit, you can use Transferable
Objects.
With Transferable Objects, data is transferred from one context to
another. It is zero-copy, which vastly improves the performance of
sending data to a Worker. Think of it as pass-by-reference if you're
from the C/C++ world. However, unlike pass-by-reference, the 'version'
from the calling context is no longer available once transferred to
the new context. For example, when transferring an ArrayBuffer from
your main app to Worker, the original ArrayBuffer is cleared and no
longer usable. Its contents are (quite literally) transferred to the
Worker context.
- Eric Bidelman, Developer at Google; source: html5rocks
The only problem is that only two things are transferable as of now: ArrayBuffer and MessagePort (Canvas proxies are hopefully coming later). An ArrayBuffer cannot be manipulated directly through its API; it should be used to create a typed array or a DataView, which gives a particular view into the buffer and lets you read from and write to it.
From the html5rocks link
To use transferrable objects, use a slightly different signature of
postMessage():
worker.postMessage(arrayBuffer, [arrayBuffer]);
window.postMessage(arrayBuffer, targetOrigin, [arrayBuffer]);
In the worker case, the first argument is the data and the second is the
list of items that should be transferred. The first argument doesn't
have to be an ArrayBuffer by the way. For example, it can be a JSON
object:
worker.postMessage({data: int8View, moreData: anotherBuffer}, [int8View.buffer, anotherBuffer]);
So according to that, your
var worker = new Worker('js2.js');
worker.postMessage(buffer, [buffer]);
worker.postMessage(obj, [obj.mat2]);
should be transferring at great speed, zero-copy. The only problem would be if your buffer or obj.mat2 is not an ArrayBuffer or otherwise transferable. You may be confusing an ArrayBuffer with a typed-array view of it, when what you should be sending is its buffer.
So say you have this ArrayBuffer and its Int32 representation (though the variable is named view it is not a DataView; DataViews do have a buffer property just as typed arrays do, and at the time this was written MDN used the name 'view' for the result of calling a typed array constructor, so it seemed a reasonable name):
var buffer = new ArrayBuffer(90000000);
var view = new Int32Array(buffer);
for (var c = 0; c < view.length; c++) {
  view[c] = 42;
}
This is what you should not do (send the view):
worker.postMessage(view);
This is what you should do (send the ArrayBuffer):
worker.postMessage(buffer, [buffer]);
These are the results after running this test on plnkr.
Average for sending views is 144.12690000608563
Average for sending ArrayBuffers is 0.3522000042721629
EDIT: As stated by @Bergi in the comments, you don't need the buffer variable at all if you have the view, because you can just send view.buffer like so:
worker.postMessage(view.buffer, [view.buffer]);
Just a side note for future readers: if you send an ArrayBuffer without the last argument specifying which ArrayBuffers should be transferred, the ArrayBuffer will not be sent transferably.
In other words when sending transferrables you want this:
worker.postMessage(buffer, [buffer]);
Not this:
worker.postMessage(buffer);
EDIT: And one last note: since you are sending a buffer, don't forget to turn it back into a view once it's received by the web worker. Once it's a view you can manipulate it (read from and write to it) again.
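A minimal sketch of what that looks like on the worker side:

// js2.js (worker): wrap the received ArrayBuffer in a typed-array view again.
onmessage = function (e) {
  var view = new Int32Array(e.data);      // e.data is the transferred ArrayBuffer
  view[0] = view[0] + 1;                  // read and write as usual
  // When done, transfer it back to the main thread, again zero-copy.
  postMessage(view.buffer, [view.buffer]);
};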
And for the bounty:
I am also interested in official size limits for firefox/chrome (not
only time limit). However answer the original question qualifies for
the bounty (;
As to a web browser's limit on the size of what can be sent, I am not completely sure, but in the html5rocks entry quoted above Eric Bidelman mentions a 50 MB file taking hundreds of milliseconds to pass to a worker without a transferable data type, while my test shows only around a millisecond when using a transferable one. And 50 MB is honestly pretty large.
Purely my own opinion, but I don't believe there is a limit on the size of what you send, transferable or not, other than the limits of the data type itself. Of course your biggest worry would be the browser stopping a long-running script if it has to copy the whole thing instead of doing a zero-copy transfer.
Hope this post helps. Honestly I knew nothing about transferables before this, but it was fun figuring them out through some tests and through that blog post by Eric Bidelman.
I had issues with web workers too, until I just passed a single argument to the web worker.
So instead of
worker.postMessage(buffer, [buffer]);
worker.postMessage(obj, [obj.mat2]);
Try
var myobj = { buffer: buffer, obj: obj };
worker.postMessage(myobj);
This way I found it gets passed by reference and it's insanely fast. I post over 20,000 data elements back and forth in a single push every 5 seconds without noticing the data transfer.
I've been exclusively working with chrome though, so I don't know how it'll hold up in other browsers.
Update
I've done some testing for some stats.
var tmp = new ArrayBuffer(90000000);
var test = new Int32Array(tmp);
for (var c = 0; c < test.length; c++) {
  test[c] = 42;
}

for (var c = 0; c < 4; c++) {
  window.setTimeout(function () {
    // Clone the array: "we" would lose it once it's sent to the web worker,
    // and this makes sure we don't have to repopulate it.
    var testsend = new Int32Array(test);
    // Mark the time; the sister mark is in the web worker.
    console.log("sending at at " + window.performance.now());
    // Post the clone to the thread.
    FieldValueCommunicator.worker.postMessage(testsend);
  }, 1000 * c);
}
Results of the tests. I don't know if this falls into your category of slow or not, since you did not define "slow".
sending at at 28837.418999988586
recieved at 28923.06199995801
86 ms
sending at at 212387.9840001464
recieved at 212504.72499988973
117 ms
sending at at 247635.6210000813
recieved at 247760.1259998046
125 ms
sending at at 288194.15999995545
recieved at 288304.4079998508
110 ms
It depends on how large the data is.
I found this article, which says that the better strategy is to pass large data to a web worker and back in small bits. It also discourages the use of ArrayBuffers.
Please have a look: https://developers.redhat.com/blog/2014/05/20/communicating-large-objects-with-web-workers-in-javascript
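A hedged sketch of that chunked approach (the message shape, the chunk size, and the data and worker variables are assumptions):

// Main thread: send a large array to the worker in slices instead of one huge message.
var CHUNK_SIZE = 10000;
for (var i = 0; i < data.length; i += CHUNK_SIZE) {
  worker.postMessage({ type: 'chunk', rows: data.slice(i, i + CHUNK_SIZE) });
}
worker.postMessage({ type: 'done' });

// Worker: collect (or process) the slices as they arrive.
var received = [];
onmessage = function (e) {
  if (e.data.type === 'chunk') {
    received = received.concat(e.data.rows);
  } else if (e.data.type === 'done') {
    // all chunks have arrived; start the heavy work here
  }
};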
