In the Chrome developer tools, I notice that I have 6 TCP connections to a particular origin. The first 5 connections are idle from what I can tell. On the last of those connections, Chrome is making calls to our Amazon S3 to get some images per the application logic. What I notice is that all the requests on that connection are queued until a certain point in time (say T1), and only then are the images downloaded. Of course, this scenario is hard to reproduce, so I am looking for some hints on what might be going on.
My questions:
The connection in question does not have the "initial connection" in the timing information, which means that the connection might have been established before in a different tab. Is that plausible?
The other 5 connections for the same origin are to different remote addresses. Is that the reason they cannot be used to retrieve images that the 6th connection is retrieving?
Is there a mechanism to avoid this queueing delay in this scenario on the front end?
From the docs (emphasis mine)
A request being queued indicates that:
• The request was postponed by the rendering engine because it's considered lower priority than critical resources (such as scripts/styles). This often happens with images.
• The request was put on hold to wait for an unavailable TCP socket that's about to free up.
• The request was put on hold because the browser only allows six TCP connections per origin on HTTP/1.
• Time was spent making disk cache entries (typically very quick).
This could be related to the number of images you are requesting from your Amazon service. According to this excerpt, requests to different origins should not impact each other.
If you are loading a lot of images, sprite sheets or something similar may help you along, but that will depend on the nature of the images you are requesting.
It seems like you are making too many requests at once.
Since HTTP/1.1 restricts the maximum number of active requests per origin to 6, all other requests get queued until the active requests complete.
As an alternative, you can enable HTTP/2 (the successor to SPDY) on the server, which doesn't have any such restriction and has many other benefits for applications making a huge number of parallel requests.
You can easily enable HTTP 2 on nginx / apache.
Related
In the app I'm working on, there's a page which makes an excessive number of requests. A few hundred requests are sent to the server at a time. Some of them are batched to reduce the count, but batching all of them is quite a challenging task, so I am looking for some "cheap trick" to try out first.
As it currently stands, the requests that are deeper in the list end up resolving later than the others, and it's because their "stalled" time keeps increasing. The screenshot displays one of the "latest" requests.
We're using HTTP/3, so it's not because of the TCP connection limit. I feel like it's either because Chrome's network thread can't handle so many requests at once and "queues" them, or because the server can't respond to them quickly enough.
If option 1 is correct, I'm wondering if a web worker can help with it. Hence the question:
Does a web worker spawn another network thread, or is it just for calculations, using the same process for performing XHR requests as the main thread?
I have an array with a length of one million. Each element is a string. I have a cloud function that takes a string and processes it. What is the fastest way to POST all million strings in my array to the cloud function? I don't care about the response of the cloud function. Ideally, it would POST, not wait for a response, then move on and POST the next one, iterating through the entire list as fast as possible. The issue is that, apparently, with HTTP you cannot skip waiting for a response; you must wait for it. Each cloud function takes about 10 seconds to execute, so if I need to wait for each response before moving on to the next one, this would take 10 million seconds, whereas if I could POST each one without waiting, I could probably run through the entire array in a few seconds.
A lot of this has been covered before in prior questions/answers, but none that I found is a pure duplicate of what you're asking so I'll reference some that have come before and add some explanation. First the ones that have come before:
How to make millions of parallel http requests from nodejs app
How to fire off 1,000,000 requests
Is there a limit to how many promises can or should run concurrently when making requests
In Node js. How many simultaneous requests can I send with the "request" package
What is the limit of sending concurrent ajax requests with node.js?
How to loop many http requests with axios in node.js
Handling large number of outbound HTTP requests
Promise.all consumes all my RAM
Properly batch nested promises in Node
How can I handle a file of 30,000 urls without memory leaks?
First off, you can send a lot of parallel outbound requests. You do not have to wait for a prior response before sending the next one.
Second, you have resource limits on both client and server and ultimately, you will have to explore with testing your local configuration and your target server to find out where those resource limits are and then write your code to stay within those limits. There is no way to reliably send a request and then immediately kill the socket because you don't care about the response. If your socket gets queued by the target server (because you've already overwhelmed it), then killing the socket may drop it from the target server's queue before it gets processed by the target server.
Your local configuration will be limited by how many simultaneous sockets you can have open and how much memory you have (as each outbound request takes some amount of memory to keep track of).
The target server will be limited by its own resources. It may have built-in protections that limit how many posts per second it will receive from one particular source (rate limiting). It may have overall protections on how many incoming requests it can handle at once. Typically, servers protect themselves from overload by configuring things so that once the incoming request queue reaches a certain level, they just immediately hang up on new requests. The idea is to preserve some level of service and deflect new requests when they come in too fast.
If this isn't your own target server and there isn't any documentation about what its limits are supposed to be, then you will just have to test how many simultaneous requests you can have "in flight" at the same time. If it implements rate limiting from a given source, then it's not uncommon for this to be a fairly low number, such as 5. If there is no rate limiting, then you're really just trying to figure out what its HTTP server can handle without it dropping connections in defense of service.
Once you figure out (with testing) how many simultaneous in-flight requests the target server can comfortably handle, you will have to structure your code to deliver that. Usually, you would take an approach like the one shown in this mapConcurrent() function, where you code things so that only N requests are in flight at the same time, with N being a number you figured out experimentally by testing the target server.
Relevant pieces of helper code:
mapConcurrent(array, maxConcurrent, fn)
rateLimitMap(array, requestsPerSec, maxInFlight, fn)
runN(fn, limit, cnt, options)
pMap(array, fn, limit)
And, if you want a pre-made library, the async library contains a bunch of control flow helpers like these.
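As a rough sketch of what such a concurrency limiter looks like (illustrative only, not the exact code behind the links above): run fn over the array with at most maxConcurrent calls pending at once, starting the next item whenever one finishes.

```javascript
// Illustrative concurrency limiter: run fn(item, index) over the array
// with at most maxConcurrent calls in flight at any one time.
function mapConcurrent(array, maxConcurrent, fn) {
    return new Promise((resolve, reject) => {
        const results = new Array(array.length);
        let nextIndex = 0;   // next item to start
        let inFlight = 0;    // calls currently running
        let failed = false;

        function runMore() {
            if (nextIndex === array.length && inFlight === 0) {
                return resolve(results);   // everything finished
            }
            // top up to the concurrency limit
            while (nextIndex < array.length && inFlight < maxConcurrent) {
                const i = nextIndex++;
                inFlight++;
                Promise.resolve().then(() => fn(array[i], i)).then(result => {
                    results[i] = result;
                    inFlight--;
                    if (!failed) runMore();
                }, err => {
                    failed = true;
                    reject(err);
                });
            }
        }
        runMore();
    });
}
```

With something like `mapConcurrent(urls, 5, url => fetch(url))`, five requests stay in flight and a new one starts as soon as any of them finishes.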
I learned that under HTTP/1.1, the default maximum number of simultaneous persistent connections per host name (origin?) is 6, at least for Chrome. I am not asking about the exact number of the limit since I know it varies from browser to browser. I am more curious about when a new connection is opened for new requests: does the browser somehow reuse the same TCP connection, or does it always start a new TCP connection until it reaches the limit of concurrent requests?
Let's say we are using HTTP1.1 and we have Connection: Keep-Alive
if in the html we have
<script src="https://foo/foo1.js"></script>
<script src="https://foo/foo2.js"></script>
<script src="https://foo/foo3.js"></script>
<script src="https://foo/foo4.js"></script>
<script src="https://foo/foo5.js"></script>
<script src="https://foo/foo6.js"></script>
<script src="https://foo/foo7.js"></script>
will each one of the scripts result in a new TCP connection being established, or will all the subsequent requests reuse the TCP connection established by the first script tag? And if each of these scripts results in a new TCP connection, given the browser's limit of 6 concurrent requests, does the 7th request have to wait for one of the first 6 to finish before it can establish its connection?
The above example is about initiating requests from HTML tags. What about API calls made from JavaScript? Let's say in our JavaScript we have
const result1 = apiCall1()
const result2 = apiCall2()
const result3 = apiCall3()
const result4 = apiCall4()
const result5 = apiCall5()
const result6 = apiCall6()
const result7 = apiCall7()
And assuming all of those API calls hit the same endpoint, api.foo.com/v1/tasks, my questions are, again: will each API call result in a new TCP connection being established, or will all the subsequent requests reuse the TCP connection established by the first API call? And if each of these API calls results in a new TCP connection, given the browser's limit of 6 concurrent requests, does the 7th request have to wait for one of the first 6 to finish before it can establish its connection?
My last question is, compared to http1.1, does http2 address this problem by allowing sending many requests at the same time over one single TCP connection?
will each one of the scripts result in a new TCP connection being established, or will all the subsequent requests reuse the TCP connection established by the first script tag?
Yes, it would download them one by one, and it would start to open more TCP connections to do that, up to the maximum of 6. The 7th request would have to wait for one of the connections to free up before it could be downloaded.
But in reality, the first request may have finished by the time later TCP connections are opened, so it might not quite reach the limit of 6 for only 6 or 7 requests.
What about API calls made from JavaScript? Let's say in our JavaScript
Exactly the same thing: a limit of 6 per origin. One thing to note, though, is that certain CORS requests sent without credentials effectively count as another origin (even though it's the same actual origin) and so get another 6 connections.
My last question is, compared to http1.1, does http2 address this problem by allowing sending many requests at the same time over one single TCP connection?
Basically yes. Not quite at the same time due to the way TCP works, but as near as possible. See my answer here: What does multiplexing mean in HTTP/2
The process is simple: if you enable keep-alive, the connection is remembered for a faster handshake, so a user can make many requests without having to re-open a costly secure connection.
Without keep-alive, there will always be the SYN/ACK process for each request to the server, since a new connection is needed for every item your user requested. You can bypass this a little with caching, which helps your bandwidth and reduces the number of requests to the server; connections are ended as soon as the request is served.
So in a scenario where 100 browsers want to hit your site and each requests 1.js, 2.js, and so on, the output should come back in order, but this can depend greatly on a lot of things: the language you're coding in server-side, how requests are handled and served, and whether you manage any queues. If a request requires longer processing ("will get back to you in the future"), other requests can go ahead as long as you're not blocking the event loop (this comes down to your server).
Below you can see the process of establishing a connection to the server; this is engaged on each and every request. The cost of TLS can be improved, but the initial request is expensive.
Per the title, is there a maximum number of GET requests?
I need to make a couple hundred GET requests to a REST API in order to dynamically load data into a webpage, but I find that if I build a Promise.all array and output the promise results in the .then, I eventually get undefined due to request timeouts.
Is this due to a limit on the number of connections? Is there a best practice for making a large number of simultaneous requests?
Thanks for your insight!
A receiving server has a particular capability for how many simultaneous requests it can handle. It could be a small number or a very large number depending upon a whole bunch of things including the server configuration, the server software architecture, the types of request being sent, etc...
If you're getting timeouts from the server, then you are probably sending enough requests that the server can't process all of them before whatever request timeout is configured (on either client or server) and thus you get a timeout error.
The usual way of handling this on the client is to control how many simultaneous requests you will send at once and then when one finishes, you can send the next and so on. You will have to test to find out what the capabilities are of the receiving server and then you should back off a bit from that to allow other load from other sources some room to execute while your requests are running.
Assuming your requests are not unusually heavy-weight things to do on the server, I would typically test 5 or 10 requests at a time and see how the receiving server handles that.
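A very simple way to do that (cruder than a sliding-window limiter, but easy to try first) is to send the requests in fixed-size batches, waiting for each batch to finish before starting the next. A sketch, where `fn` stands in for whatever request function you use:

```javascript
// Send requests in batches of `batchSize`: fire one batch in parallel,
// wait for all of it to settle, then start the next batch. Simpler than
// a sliding window, at the cost of idle time at the end of each batch.
async function inBatches(items, batchSize, fn) {
    const results = [];
    for (let i = 0; i < items.length; i += batchSize) {
        const batch = items.slice(i, i + batchSize);
        // all requests in this batch run concurrently
        results.push(...await Promise.all(batch.map(fn)));
    }
    return results;
}
```

For example, `inBatches(urls, 10, url => fetch(url).then(r => r.json()))` keeps at most 10 requests outstanding; the trade-off is that each batch waits on its slowest request before the next batch begins.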
There's a discussion of a lot of options for controlling this here:
Promise.all consumes all my RAM
Make several requests to an API that can only handle 20 request a minute
Concurrency control is also part of Promise.map() in the Bluebird promise library.
Is there a maximum number of Get requests?
Servers are limited on how many requests they can handle at once for a whole variety of reasons. Every server setup will likely be different and it also depends upon the types of requests you're sending too (and what they have to do). Some servers may be able to handle hundreds of thousands of requests (probably because there's a cluster behind them and they're configured for big load). Smaller configurations may only handle dozens at a time.
Is this due to a limit on the number of connections?
Any receiving server will have a limit on how many incoming connections it will allow to queue. What that is will depend upon many factors and there is no way for you (from the outside) to know exactly what that limit is. Timeout errors usually don't mean you're hitting this limit.
We have executed performance tests. We started with 25 users, and we see the application crash and stop responding at 15 users.
The error percentage gradually increased from 0 to 100 for 15 users within 2 to 4 minutes. Below are the errors and snapshots from the performance tests.
Errors:
• Server side errors
• The resource cannot be found.
• HTTP 404 errors.
Can you give some pointers to improve the performance?
Well, there could be many reasons for the failure. Did you run JMeter and the server on the same machine?
If yes, that is probably the reason for the failures, as both JMeter and the server consume resources.
If not, then check the following settings.
Following are the Checklist/Suggestions/Pointers:
Client Side: Simulate the anticipated load, neither less nor more (there is a chance that a wrong test design/script resulted in more or less than the anticipated load).
Ramp-Up: Check how the 25 threads are launched. There should be enough ramp-up time (say, 25 threads in 2 minutes) unless you are conducting a stress/spike test.
User Think Time: Check whether you added any timers between the transactions. If not, you are generating more load than anticipated. Add timers between transactions to replicate a real-time scenario.
Pacing: Check whether you allow any time between iterations. If not, it speeds up the execution beyond what real users would produce.
Misplaced timers (scope): Check whether your timer is applicable to all the samplers/transactions.
Cache configuration: Configure cache for static resources using HTTP Cache Manager, so that from second iteration JMeter uses its cache instead of requesting the server. This decision must be taken only if the server is allowing the client to cache the resources (check Cache-control and other related headers). Otherwise, this configuration can be ignored.
Parallel requests: If you are using the Parallel Downloads field in the HTTP Sampler, don't set it to more than 6 (modern browsers use multi-threading to download resources in parallel, typically with that limit).
Above factors, if misconfigured, can result in unwanted load.
Server Side:
Scarcity of resources on server machines: Use nmon for Linux, PerfMon for Windows. Analyze the results and find which resource is causing the trouble, i.e., CPU, memory, network, or hard disk I/O. (This is the most common reason for a server crash.)
Misconfiguration of the server: Check for maxThreads, minThreads, maxConnections, Keep-Alive timeout, connection timeout etc. and tweak as per the needs.
Possible solution:
If resources are the bottleneck and you are generating the anticipated load, then you either have to scale up (add more of the bottlenecked resource to the existing machine) or scale out (deploy the app on more server machines).