Chrome timers throttling and @microsoft/signalr issue - javascript

According to a blog post from Jake Archibald, Chrome 88 implements three stages of timer throttling.
According to the throttling implementation shipped in Chrome 57:
There are a number of automatic exemptions from this throttling:
Applications playing audio are considered foreground and aren’t throttled.
Applications with real-time connections (WebSockets and WebRTC), to avoid closing these connections by timeout. The run-timers-once-a-second rule is still applied in these cases.
The second citation states plainly that once an application has a WebSocket connection, the application is exempt from throttling.
The fact is, we use the @microsoft/signalr library as a top-level API for WebSocket connections, and this library sends its own internal ping messages (not ping opcodes) scheduled with setTimeout. After 5 minutes of background work, that timer gets throttled and stops sending ping messages, which leads to a close event and the WebSocket connection being closed.
I'm asking for a more detailed explanation:
Does Chrome 88 enable throttling for applications that have real-time connections?
Will timers be throttled regardless of the presence of a WebSocket connection, with only the WebSocket instances themselves exempt from throttling?

According to this post, the same issue has been reported.
The quick explanation:
As Jake wrote in his blog post about heavy throttling:
the browser will check timers in this group once per minute. Similar to before, this means timers will batch together in these minute-by-minute checks.
That's it! After a tab has spent 5 minutes in the background, the SignalR ping timer is throttled to run once per minute. But the defaults are keepAliveIntervalInMilliseconds = 15 s and serverTimeoutInMilliseconds = 30 s, both well below the heavy-throttling timer delay. The server counts the missing pings as a ping failure, which is the predicate for invoking the lifetime methods and stopping the connection. The server first tries to stop the connection with a disconnect handshake, and since the client is still physically connected, the result is a CloseEvent with code 1000 and wasClean = true. This behaviour won't produce any errors.
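You can observe the throttling directly. Here is a minimal probe (my illustration, not part of SignalR) that logs how late a 15-second setTimeout actually fires once the tab is backgrounded:

    // Logs how far each 15 s timer fires past its scheduled time.
    // In a heavily throttled background tab the drift can reach tens of
    // seconds, since timers in this group are only checked once per minute.
    let expected = Date.now() + 15000;
    setTimeout(function tick() {
        console.log("timer drift:", Date.now() - expected, "ms");
        expected = Date.now() + 15000;
        setTimeout(tick, 15000);
    }, 15000);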
Front-end clients must update @microsoft/signalr to version >= 5.0.6 to solve this problem. Changes
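If you cannot update right away, one possible stopgap (my assumption, not part of the linked fix) is to widen the keep-alive and timeout settings past the one-minute throttling resolution, on both the client and the server:

    import { HubConnectionBuilder } from "@microsoft/signalr";

    const connection = new HubConnectionBuilder()
        .withUrl("/hub") // hub path is a placeholder
        .build();

    // Ping less often, and tolerate a longer silence (values are guesses):
    connection.keepAliveIntervalInMilliseconds = 90000;
    connection.serverTimeoutInMilliseconds = 180000;

    // The server must be configured to match (e.g. ClientTimeoutInterval
    // in ASP.NET Core HubOptions), or it will still drop throttled clients.
    connection.start();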

Related

Chrome TCP connection queuing for many seconds

In the Chrome developer tools, I notice that I have 6 TCP connections to a particular origin. The first 5 connections are idle from what I can tell. On the last of those connections, Chrome is making calls to our Amazon S3 to fetch some images per the application logic. What I notice is that all the requests on that connection are queued until a certain point in time (say T1) and then the images are downloaded. Of course, this scenario is hard to reproduce, so I am looking for some hints on what might be going on.
My questions:
The connection in question does not have the "initial connection" in the timing information, which means that the connection might have been established before in a different tab. Is that plausible?
The other 5 connections for the same origin are to different remote addresses. Is that the reason they cannot be used to retrieve images that the 6th connection is retrieving?
Is there a mechanism to avoid this queueing delay in this scenario on the front end?
From the docs (emphasis mine):
A request being queued indicates that:
• The request was postponed by the rendering engine because it's considered lower priority than critical resources (such as scripts/styles). This often happens with images.
• The request was put on hold to wait for an unavailable TCP socket that's about to free up.
• The request was put on hold because the browser only allows six TCP connections per origin on HTTP 1.
• Time spent making disk cache entries (typically very quick).
This could be related to the number of images you are requesting from your Amazon service. According to this excerpt, requests to different origins should not impact each other.
If you are loading a lot of images, sprite sheets or a similar technique may help you along - but that will depend on the nature of the images you are requesting.
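As a quick sanity check (a sketch using the standard Resource Timing API, not something from the docs above), you can confirm from the page itself which protocol the images were actually fetched over, since the six-connection limit only applies to HTTP/1.x:

    // Log each image's URL and negotiated protocol ("http/1.1", "h2", ...).
    performance.getEntriesByType("resource")
        .filter((entry) => entry.initiatorType === "img")
        .forEach((entry) => console.log(entry.name, entry.nextHopProtocol));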
It seems like you are making too many requests at once.
Since HTTP/1.1 restricts the maximum number of active connections per origin to six, all other requests get queued until the active requests complete.
As an alternative, you can use HTTP/2 (the successor to SPDY) on the server, which doesn't have any such restriction and brings many other benefits for applications making a huge number of parallel requests.
You can easily enable HTTP/2 on nginx or Apache.
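For example, a minimal nginx server block (hostname and certificate paths are placeholders; HTTP/2 in nginx requires TLS in practice and nginx 1.9.5 or later):

    server {
        # The http2 flag on the listen directive enables HTTP/2.
        listen 443 ssl http2;
        server_name example.com;
        ssl_certificate     /etc/nginx/certs/example.com.pem;
        ssl_certificate_key /etc/nginx/certs/example.com.key;
    }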

Understanding socket.io ping interval/timeout settings

According to an accepted SO answer, the ping timeout must be greater than the ping interval, but according to the examples in the official socket.io docs, the timeout is less than the interval. Which one is correct? Also, what would be ideal values for both settings for a shared whiteboard application where the moderator should not be able to edit the canvas while disconnected (internet drops off)?
According to the socket.io documentation:
Among those options:
pingTimeout (Number): how many ms without a pong packet to consider the connection closed (60000)
pingInterval (Number): how many ms before sending a new ping packet (25000).
Those two parameters will impact the delay before a client knows the server is not available anymore. For example, if the underlying TCP connection is not closed properly due to a network issue, a client may have to wait up to pingTimeout + pingInterval ms before getting a disconnect event.
This leads me to believe there is no dependency on one value being greater than the other. You will likely want a higher timeout to give a slow network connection time to deliver a response. The interval is the time before trying again after a failure; it should be set long enough to allow reconnection, but not so long that you are holding the connection open needlessly.
As for ideal values, this will be application specific. A few things I would consider:
How responsive must your application be?
How long will your application take to respond to a network request?
How large is the data being passed back and forth?
How many concurrent users will you have?
These are just a few considerations; for a small local application you would likely be fine with a timeout of 10000 and an interval of 5000, but this is an absolute guess. You will need to weigh the previously mentioned bullet points.
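As a concrete starting point, here is how those guessed values would look on a socket.io server (a sketch assuming the v4 Server API; tune per the points above):

    const { Server } = require("socket.io");

    const io = new Server(3000, {
        pingInterval: 5000,  // ms between ping packets
        pingTimeout: 10000,  // ms to wait for a pong before closing
    });

    io.on("connection", (socket) => {
        socket.on("disconnect", (reason) => {
            // For the whiteboard case: lock the moderator's canvas here.
            console.log("disconnected:", reason);
        });
    });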

Server-sent events (SSE) connection loss on mobile phones

I'm working on a website where some events are broadcasted to the clients using SSE (EventSource API). When testing my site on the mobile version of Chrome for Android, I noticed the connection is lost when the channel is idle for about five minutes.
I used several Android devices, with different carriers and different browsers and the result is the same, no matter if the screen is on or off. Desktop Chrome seems to keep the connection alive for a longer time.
Couldn't find any info about this, and when trying to debug the issue, all I got was a TCP "FIN" packet received from the telephone IP address about 3 and a half minutes after the last event was sent.
EventSource's onerror event doesn't get fired so I can't know when the connection was dropped to initiate a new one.
Is there any way to avoid this problem or should I just send some fake message every 30 secs to prevent connection idling?
Thanks in advance
Your connection was probably taken by a "push proxy" - a feature that is designed to improve battery life in phones.
Quote from the HTML Standard (https://html.spec.whatwg.org/multipage/server-sent-events.html):
User agents running in controlled environments, e.g. browsers on mobile handsets tied to specific carriers, may offload the management of the connection to a proxy on the network. In such a situation, the user agent for the purposes of conformance is considered to include both the handset software and the network proxy.
For example, a browser on a mobile device, after having established a connection, might detect that it is on a supporting network and request that a proxy server on the network take over the management of the connection. The timeline for such a situation might be as follows:
1. Browser connects to a remote HTTP server and requests the resource specified by the author in the EventSource constructor.
2. The server sends occasional messages.
3. In between two messages, the browser detects that it is idle except for the network activity involved in keeping the TCP connection alive, and decides to switch to sleep mode to save power.
4. The browser disconnects from the server.
5. The browser contacts a service on the network, and requests that the service, a "push proxy", maintain the connection instead.
6. The "push proxy" service contacts the remote HTTP server and requests the resource specified by the author in the EventSource constructor (possibly including a Last-Event-ID HTTP header, etc).
7. The browser allows the mobile device to go to sleep.
8. The server sends another message.
9. The "push proxy" service uses a technology such as OMA push to convey the event to the mobile device, which wakes only enough to process the event and then returns to sleep.
This can reduce the total data usage, and can therefore result in considerable power savings.
You can set the retry field to adjust the reconnection time of the EventSource instance. From the spec:
If the field name is "retry": if the field value consists of only ASCII digits, then interpret the field value as an integer in base ten, and set the event stream's reconnection time to that integer. Otherwise, ignore the field.
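For the fallback you suggested, here is a sketch of a Node.js SSE endpoint that sets retry and sends a periodic comment as a keep-alive (the port and the 30-second interval are my assumptions):

    const http = require("http");

    http.createServer((req, res) => {
        res.writeHead(200, {
            "Content-Type": "text/event-stream",
            "Cache-Control": "no-cache",
            "Connection": "keep-alive",
        });
        res.write("retry: 5000\n\n"); // client reconnects after 5 s

        // Comment lines (starting with ":") are ignored by EventSource
        // but keep intermediaries from idling out the connection.
        const keepAlive = setInterval(() => res.write(": keep-alive\n\n"), 30000);
        req.on("close", () => clearInterval(keepAlive));
    }).listen(8080);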

JMeter results - advice on performance testing

We have executed the performance tests. We started with 25 users and saw the application crash; it does not respond even for 15 users.
The error percentage gradually increased from 0 to 100 for 15 users within 2 to 4 minutes. Below are the errors and screenshots from the performance tests.
Errors:
• Server side errors
• The resource cannot be found.
• HTTP 404 errors.
Can you give some pointers to improve the performance?
Well, there could be many reasons for the failure. Did you run JMeter and the server on the same machine?
If yes, that is probably the reason for the failures, as both JMeter and the server consume resources.
If not, then check the following settings.
Here is a checklist of suggestions/pointers:
Client Side: Simulate the anticipated load - neither less nor more (there is a chance that a wrong test design/script generates more or less than the anticipated load).
Ramp-Up: Check how the 25 threads are launched. There should be enough ramp-up time (say, 25 threads in 2 minutes) unless you are conducting a Stress/Spike test.
User Think Time: Check whether you added any timers between the transactions. If not, you are generating more load than anticipated. Add timers between transactions to replicate a real-world scenario.
Pacing: Check whether you are allowing any time between iterations. If not, the execution runs faster than intended.
Misplaced timers (scope): Check whether your timer applies to all the samplers/transactions it should.
Cache configuration: Configure caching for static resources using the HTTP Cache Manager, so that from the second iteration JMeter uses its cache instead of requesting the server. Do this only if the server allows the client to cache the resources (check Cache-Control and related headers); otherwise this configuration can be ignored.
Parallel requests: If you are using the Parallel Downloads field in the HTTP sampler, don't set it higher than 6 (modern browsers download resources in parallel over about six connections).
The factors above, if misconfigured, can result in unwanted load.
Server Side:
Scarcity of resources on the server machines: Use nmon for Linux or PerfMon for Windows. Analyze the results and find which resource is causing the trouble, i.e., CPU, memory, network, or disk I/O. (This is the most common reason for a server crash.)
Misconfiguration of the server: Check maxThreads, minThreads, maxConnections, keep-alive timeout, connection timeout, etc., and tweak them as needed, as in the sketch below.
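For instance, on a Tomcat server those settings live on the HTTP connector in server.xml (a hypothetical illustration; the right numbers depend on your hardware and load):

    <!-- server.xml: illustrative values only -->
    <Connector port="8080" protocol="HTTP/1.1"
               maxThreads="200"
               minSpareThreads="25"
               maxConnections="8192"
               connectionTimeout="20000"
               keepAliveTimeout="60000" />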
Possible solution:
If resources are the bottleneck and you are generating the anticipated load, then either scale up (add the bottlenecked resource to the existing machine) or scale out (deploy the app on more server machines).

Monitoring WebSockets latency

We're building a latency-sensitive web application that uses websockets (or a Flash fallback) for sending messages to the server. While there is an excellent tool called Yahoo Boomerang for measuring bandwidth/latency for web apps, the latency value produced by Boomerang also includes the time necessary to establish HTTP connection, which is not what I need, since the websockets connection is already established and we actually only need to measure the ping time. Is there any way to solve it?
Second, Boomerang appears to fire only once when the page is loaded and doesn't seem to rerun the tests later even if commanded to. Is it possible to force it to run connection tests e.g. every 60 seconds?
Seems pretty trivial to me.
Send PING to the server. Time is t1.
Read PONG response. Time is t2 now.
ping time = t2 - t1
Repeat every once in a while (and optionally report to the stats server).
Obviously, your server would have to know to send PONG in response to a PING command.
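A minimal sketch over an existing connection (assuming ws is your open WebSocket and, as above, that your server answers each {type: "ping", id} message with a matching {type: "pong", id}):

    function measurePing(ws, report) {
        const id = Math.random().toString(36).slice(2);
        const t1 = performance.now();
        const onMessage = (event) => {
            const msg = JSON.parse(event.data);
            if (msg.type === "pong" && msg.id === id) {
                ws.removeEventListener("message", onMessage);
                report(performance.now() - t1); // round-trip time in ms
            }
        };
        ws.addEventListener("message", onMessage);
        ws.send(JSON.stringify({ type: "ping", id }));
    }

    // Repeat every 60 s; the stats URL is a placeholder.
    setInterval(() => measurePing(ws, (rtt) =>
        navigator.sendBeacon("/stats/latency", JSON.stringify({ rtt }))
    ), 60000);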
