I have large amounts (gigabytes) of json data that I would like to make available via a restful web service. The consumer of the data will be another service, and it will all be happening on a server (so the browser is not involved). Is there a practical limit on how much data can be transferred over http? Will http timeouts start to occur, or is that more a function of a browser?
There's no inherent size limit on an HTTP body, just like when you download a huge file through a web browser. And a timeout is a setting of the underlying socket connection that HTTP is built on, so it is not a browser-specific feature.
However, I've run into the same issue when transporting quite large JSON objects. What needs to be considered is network load, serialization/deserialization time, and memory cost. The whole process is slow (for 2 GB of data, over an intranet, using JSON.NET plus some calculation, it takes us 2-3 minutes) and it uses quite a lot of memory. Fortunately, we only need to do it once a day and it is a back-end process, so we haven't paid much more attention to it. We just use a synchronous HTTP connection and set a long timeout value to prevent timeout exceptions (async might be a better choice).
So I think it depends on your hardware and infrastructure.
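One way to reduce the serialization memory cost mentioned above is to stream the JSON out piece by piece instead of building one giant string. A minimal sketch, assuming the data is an iterable of records (the tiny record source here is a stand-in for the real gigabytes of data, not part of the original setup):

```javascript
// Sketch: emit a large JSON array chunk by chunk instead of one
// JSON.stringify() call over everything, to cap peak memory on the sender.
function* jsonArrayChunks(records) {
  yield '[';
  let first = true;
  for (const rec of records) {
    yield (first ? '' : ',') + JSON.stringify(rec);
    first = false;
  }
  yield ']';
}

// In a real server you would write each chunk to the HTTP response
// (e.g. res.write(chunk)) rather than joining them in memory as done here.
const body = [...jsonArrayChunks([{ id: 1 }, { id: 2 }])].join('');
console.log(body); // [{"id":1},{"id":2}]
```

The same idea applies on the receiving side: a streaming JSON parser avoids holding both the raw text and the object graph in memory at once.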
Related
When initiating a network request in the browser via fetch or xmlhttprequest, what happens if the network request takes a very long time? E.g. 20 minutes. Does the browser have a time limit after which it rejects requests? Is it possible to extend this?
I am thinking about a large file upload using a single network request to a server endpoint, but which might take a very long time over slow connections. Though I am only asking about browser behavior.
Usually these values are set on the web server.
You may want to reach out to the web administrator and see if they can adjust the XMLHttpRequest timeout value.
https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/timeout
Additional Unsolicited Suggestion: For large uploads / big data, try to utilize Jumbo Frames if possible.
Some background: I am working with legacy code, and am attempting to upload a binary file (~2MB) to an embedded microhttpd web server via an HTTP form (POST request). Lately I've noticed that the upload speed from Windows 10 machines is significantly slower than from non-Windows 10 machines: the upload will send a handful of bytes at a time (about 6-7 chunks of ~1500 bytes each) and will then pause, sometimes for 30-60 seconds, before sending another handful of bytes. The decrease in speed caused by this issue renders the whole upload process unusable.
I performed some debugging on the embedded server and found that it was indeed waiting for more data to come in on the socket created for the POST request, and that for some reason this data was not being sent by the client machine. After some analysis in Wireshark on Win10 vs non-Win10 traffic, the messages I am seeing appear to tally up with the issue described by Microsoft here: https://support.microsoft.com/en-gb/help/823764/slow-performance-occurs-when-you-copy-data-to-a-tcp-server-by-using-a.
Specifically, I am seeing that in the case of Windows 10, the first TCP packet sent to the embedded web server is indeed "a single send call [that] fills the whole underlying socket send buffer", as per the Microsoft article. This does not appear to be the case for non-Windows 10 machines. Hence, I need to be able to set up my sockets so that the web client does not send so much data as to fill up the receive buffer in one packet.
Unfortunately, major modifications to the web server itself (aside from little config tweaks) are out of the question, since the legacy code I'm working with is notoriously coupled and unstable. Therefore, I'm looking for a way to specify socket settings via JavaScript, if this is possible. I'm currently using the jQuery Form plugin, which operates on top of XMLHttpRequests. Since I have complete control over both the JavaScript page and the embedded web backend, I can hard-code the socket buffer sizes appropriately in both cases.
Is there a way to tweak low-level socket settings like this from JavaScript? If not, would there be another workaround for this issue?
There is no way you can do the TCP-stack-specific tuning you need from inside JavaScript running in a browser on the client side. It simply does not allow that kind of low-level access.
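Since the socket buffers themselves are out of reach, one possible application-level workaround is to slice the upload into small pieces and POST them sequentially, so that no single send fills the server's receive buffer. This is only a sketch under assumptions: the 1400-byte chunk size is a guess (just under a typical Ethernet MTU), and the server would need a matching reassembly endpoint, which may not be feasible given the legacy constraints described above.

```javascript
// Split data into fixed-size pieces; .slice() works the same way on
// strings here and on Blob/File objects in the browser.
function sliceIntoChunks(data, chunkSize) {
  const chunks = [];
  for (let off = 0; off < data.length; off += chunkSize) {
    chunks.push(data.slice(off, off + chunkSize));
  }
  return chunks;
}

const file = 'X'.repeat(5000); // stand-in for the ~2 MB binary file
const chunks = sliceIntoChunks(file, 1400);
console.log(chunks.length); // 4
// In the browser, each chunk would then be sent in order, one
// XMLHttpRequest (or fetch POST) per chunk, waiting for each to complete
// before sending the next.
```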
I have a web app that's constantly sending and requesting JSON objects to/from the server. These JSON objects can get as big as 20-40kb, and these requests might happen once every 5 to 20 seconds, depending on the user interaction.
I decided to keep my processing on the client side, so the user can use my web app without needing an active internet connection, but I need to sync the data to the server every once in a while. I couldn't think of a better solution than storing/processing the data on the client as JavaScript objects and eventually saving them as JSON on a server. (This would also let me serve these objects via an API to mobile applications in the future.)
I'd like to know how sending these relatively large JSON payloads back and forth could hurt my application's performance, compared to just sending simple AJAX requests of a few bytes and doing all the processing on the server, and how I could optimize this.
20-40 KB JSON objects are pretty small for requests, according to tests done by Josh Zeigler, where the DOM Ready event took less than 62 milliseconds (the maximum, in IE) across four major browsers for a 40 KB JSON payload.
The tests were done on a 2011 2.2GHz i7 MacBook Pro with 8GB of RAM.
Here's the detailed test and test results: How Big is TOO BIG for JSON? Credit: Josh Zeigler
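For context, here is a rough sketch of the kind of measurement behind such benchmarks: timing JSON.parse on a payload in the tens-of-kilobytes range. The payload shape is made up here (it is not Zeigler's actual test data), and the numbers will vary by machine.

```javascript
// Build a synthetic ~46 KB JSON payload and time how long parsing takes.
const payload = JSON.stringify({
  rows: new Array(1000).fill({ name: 'x'.repeat(20), value: 12345 }),
});
const t0 = process.hrtime.bigint();
const parsed = JSON.parse(payload);
const t1 = process.hrtime.bigint();
console.log(`${payload.length} bytes parsed in ${Number(t1 - t0) / 1e6} ms`);
```

On modern hardware this typically lands in the low single-digit milliseconds, which is why payloads of this size are rarely the bottleneck compared to network latency.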
I am running SocketIO on NodeJS and I don't care much about wide browsers support as it's my pet project where I want to use all the power of new technologies to ease the development. My concern is about how I should send large amounts of JSON data from server to client and back. Well, these amounts are not as large as could be for video or image binary data, I suppose not larger than hundreds of kilobytes per request.
Two scenarios I see are:
Send a notification via WebSockets from server to client that some data should be fetched. Then client code runs a regular XHR request to server and gets some data via XHR.
Send the whole data set over WebSockets from server to client. In this case I don't need to run any additional requests - I just get all the data via WebSockets.
I saw the first approach in Meteor.js, so I wondered about the reasons behind it.
Please share your opinion.
WebSockets should support large data sets (the protocol's length field allows frames of up to 2^63 bytes in theory), so from that point of view it should work fine. The advantage of XHR is that you can observe progress over time, and it is in general better tested for large data blocks. For example, I have seen WebSocket server implementations which (thinking retrospectively) wouldn't handle large data well, because they loaded the entire payload into memory rather than streaming it; that's not necessarily the case for socket.io, though (I don't know). Case in point: try it out with socket.io while observing memory usage and stability. If it works, definitely go with WebSockets, because long term the support for big data packages will only get better and definitely not worse. If it turns out to be unstable, or if socket.io can't stream larger data files, then use the XHR construct.
Btw, a quick Google search turned up siofile. I haven't looked into it that much, but it might be just the thing you need.
Since I'm using WebSocket connections on a more regular basis, I became interested in how things work under the hood. So I dug into the endless spec documents for a while, but so far I couldn't really find anything about chunking the transmission stream itself.
The WebSocket protocol calls them data frames (they describe the pure data stream, so they're also called non-control frames). As far as I understood the spec, there is no defined maximum length and no defined MTU (maximum transfer unit) value, which in turn means a single WebSocket data frame may contain, by spec(!), an unlimited amount of data (please correct me if I'm wrong here; I'm still a student on this).
After reading that, I instantly set up my little Node WebSocket server. Since I have a strong Ajax background (including streaming and Comet), my expectation originally was, "there must be some kind of interactive mode for reading data while it is transferred". But I'm wrong about that, aren't I?
I started out small, with 4kb of data.
server
testSocket.emit( 'data', new Array( 4096 ).join( 'X' ) );
and like expected this arrives on the client as one data-chunk
client
wsInstance.onmessage = function( event ) {
console.log( event.data.length ); // 4095 -- onmessage receives a MessageEvent; the payload is in event.data
};
so I increased the payload, and again I was expecting that at some point the client-side onmessage handler would fire repeatedly, effectively chunking the transmission. But to my shock, that never happened (Node server, tested with Firefox, Chrome and Safari on the client side). My biggest payload was 80 MB:
testSocket.emit( 'data', new Array( 1024*1024*80 ).join( 'X' ) );
and it still arrived in one big data chunk on the client. Of course, this takes a while even with a pretty good connection. My questions are:
is there any possibility to chunk those streams, similar to the XHR readyState 3 mode?
is there any size limit for a single ws data-frame ?
are websockets not supposed to transfer such large payloads? (which would make me wonder again why there isn't a defined max-size)
Or am I still looking at WebSockets from the wrong perspective? Perhaps the need for sending large amounts of data just isn't there, and you should chunk/split any data logically yourself before sending?
First, you need to differentiate between the WebSocket protocol and the WebSocket API within browsers.
The WebSocket protocol has a frame-size limit of 2^63 octets, but a WebSocket message can be composed of an unlimited number of frames.
The WebSocket API within browsers does not expose a frame-based or streaming API, but only a message-based API. The payload of an incoming message is always completely buffered up (within the browser's WebSocket implementation) before providing it to JavaScript.
APIs of other WebSocket implementations may provide frame- or streaming-based access to payload transferred via the WebSocket protocol. For example, AutobahnPython does. You can read more in the examples here https://github.com/tavendo/AutobahnPython/tree/master/examples/twisted/websocket/streaming.
Disclosure: I am the original author of Autobahn and work for Tavendo.
More considerations:
As long as there is no frame/streaming API in browser JS WebSocket API, you can only receive/send complete WS messages.
A single (plain) WebSocket connection cannot interleave the payload of multiple messages. So if you use large messages, they are delivered in order, and you won't be able to send small messages in between while a big message is still in flight.
There is an upcoming WebSocket extension (extensions are a built-in mechanism for extending the protocol): WebSocket multiplexing. It allows you to have multiple (logical) WebSocket connections over a single underlying TCP connection, which has multiple advantages.
Note also: you can open multiple WS connections (over different underlying TCPs) to a single target server from a single JS / HTML page today.
Note also: you can do "chunking" yourself at the application layer: send your stuff in smaller WS messages and reassemble them yourself.
I agree: in an ideal world, you'd have a message/frame/streaming API in the browser plus WebSocket multiplexing. That would give you all the power and convenience.
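The application-layer chunking mentioned above can be sketched as follows: split a big payload into small WS messages tagged with a message id and sequence number, and reassemble on the other side. This is only an illustration; the frame fields (id, seq, last, data) are my own convention, not part of any spec, and a real implementation would also handle out-of-order or lost peers' state.

```javascript
// Split one logical message into small tagged frames.
function splitMessage(id, payload, chunkSize) {
  const frames = [];
  for (let i = 0, seq = 0; i < payload.length; i += chunkSize, seq++) {
    frames.push({
      id,
      seq,
      last: i + chunkSize >= payload.length,
      data: payload.slice(i, i + chunkSize),
    });
  }
  return frames;
}

// Receiver side: collect frames per message id, deliver when the last
// frame arrives (WS messages on one connection arrive in order).
const pending = new Map();
function onFrame(frame, deliver) {
  if (!pending.has(frame.id)) pending.set(frame.id, []);
  const parts = pending.get(frame.id);
  parts[frame.seq] = frame.data;
  if (frame.last) {
    deliver(parts.join(''));
    pending.delete(frame.id);
  }
}

// Round-trip check: a 10-character message sent in 4-character chunks.
let result;
splitMessage('m1', 'XXXXXXXXXX', 4).forEach(f => onFrame(f, m => (result = m)));
console.log(result); // XXXXXXXXXX
```

Each frame would go over the wire as its own WS message (e.g. JSON-encoded), which also means small unrelated messages can be interleaved between the chunks of a big one.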
RFC 6455 Section 1.1:
This is what the WebSocket Protocol provides: [...] an alternative to HTTP polling for two-way communication from a web page to a remote server.
As stated, WebSockets are for communication between a web page and a server. Please note the difference between a web page and a web browser. Typical examples are browser games and chat applications, which exchange many small messages.
If you want to send many MBs in one message, I think you're not using WebSockets the way they were intended. If you want to transfer files, then do so using a plain old HTTP request, answered with Content-Disposition to let the browser download the file.
So if you explain why you want to send such large amounts of data, perhaps someone can help come up with a more elegant solution than using WebSockets.
Besides, a client or server may refuse too large messages (although it isn't explicitly stated how it'll refuse):
RFC 6455 Section 10.4:
Implementations that have implementation- and/or platform-specific limitations regarding the frame size or total message size after reassembly from multiple frames MUST protect themselves against exceeding those limits. (For example, a malicious endpoint can try to exhaust its peer's memory or mount a denial-of-service attack by sending either a single big frame (e.g., of size 2**60) or by sending a long stream of small frames that are a part of a fragmented message.) Such an implementation SHOULD impose a limit on frame sizes and the total message size after reassembly from multiple frames.