Worker.onmessage firing synchronously?

Investigating a strange bug for which the stack trace (on Firefox 87) points to the line where the onmessage handler is assigned:
const worker = new Worker(/* URI */);
worker.onmessage = msg => { // stack trace points here
  let trx = JSON.parse(msg.data);
  // ...
};
// more init code
The calls in the stack trace leading into this spot match the context in which the worker is created and onmessage is assigned, but the frames above this spot make it look as if the handler were invoked synchronously at the moment of assignment.
The worker itself connects to the server and can push messages to the main thread without the main thread first having to post anything to it. It is therefore entirely possible for a message to arrive before the onmessage assignment has executed. However, I've been unable to reproduce the behavior; it seems messages posted before the handler is assigned are discarded.
Short of a race condition in Firefox itself, is there anything else that could be going on?
(The race in question here is different from this similar question)
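For anyone trying to reproduce this, here is a minimal harness (my own sketch, not from the post; the file names are hypothetical). Message events are queued as tasks, so a handler assigned later in the same synchronous init code should still receive a message posted during the busy-wait, and should never be invoked synchronously:
// main.js
const worker = new Worker('worker.js');
const t0 = Date.now();
while (Date.now() - t0 < 500) { /* busy-wait so the worker can post first */ }
worker.onmessage = msg => {
  // if delivery were synchronous, this would have run during the assignment
  console.log('received:', msg.data);
};

// worker.js
postMessage('sent immediately on worker startup');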

Related

Web worker onMessage receives message twice

When I issue webworker.postMessage('any message'), it gets processed twice by the onMessage listener. I tried calling stopPropagation on the message, but it still runs again. I verified via console.log that postMessage in the main thread is only called once, and that the web worker instance is unique. How can I fix it so that one postMessage results in only one onMessage event?
Snippet showing call to web worker:
this.webworker.postMessage('Message one');
}
My web worker:
/// <reference lib="webworker" />
onmessage = function (data) {
  console.log('## in web worker ' + JSON.stringify(data.data));
  data.stopPropagation();
};
Something similar happened to me as well. I can't explain why, even after running some tests. Anyway, my only contribution is that the same somewhat buggy script gave me the doubled console.log message only in Firefox, not in Edge or Chrome.
Anyhow, in my simple test worker I set a counter. This way I could confirm that the execution happens only once and that the issue is related only to console.log:
onmessage = function (e) {
  self.curcounter = (self.curcounter || 0) + 1; // this has been executed once
  self.console.log('WORKER received an INT: ' + e.data + ' self.curcounter: ' + self.curcounter); // this logged twice
};

Web workers terminating abruptly

I initiated a web worker on Chrome, and it had a simple function that was called repeatedly using setTimeout. Surprisingly, the web worker terminated after the function was called around 1000 times. Can anyone explain why? I guess Chrome is doing some optimization.
webworker.js
function hi() {
  postMessage('1');
  setTimeout(hi, 1);
}
hi();
main.js
var blob = new Blob([code]);
var blobURL = window.URL.createObjectURL(blob);
var worker = new Worker(blobURL);
worker.onmessage = function (data) {
  console.log(data.data); // gets called around 1000 times and done
};
EDIT:
Reproduced in a fiddle:
http://jsfiddle.net/meovfpv3/1/
It seems to take an arbitrarily long time for the onmessage callback to stop firing: as quickly as a few seconds, or as long as 5+ minutes.
Here is my best guess at what is happening. By posting a message from the Web Worker every 1ms, you are demanding that the main thread processes each posted message within 1ms.
If the main thread isn't able to process the message within 1ms, you are still sending it a new message even though it isn't finished processing the last message. I would imagine this puts it into a queue of messages waiting to be processed.
Now since you are sending messages from the web worker faster than they can be processed, this queue of unprocessed messages is going to get bigger and bigger. At some point Chrome is going to throw up its hands and say "There are too many messages in the queue", and instead of queueing new messages for processing, it drops them.
This is why, if you use a reasonable number in your timeout like 100ms, each message has plenty of time to be processed before the next one is sent, and no backlog of unprocessed messages builds up.
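As a minimal illustration of that point (my own variation on the snippet above, not from the original answer), widening the timer alone avoids the backlog:
function hi() {
  postMessage('1');
  setTimeout(hi, 100); // 100ms leaves the main thread time to drain each message
}
hi();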
I've created a jsFiddle where the worker sends a message to the main thread, and the main thread sends the message back to the worker. If that process doesn't happen before the next message is sent, the counters in both threads will be mismatched and the web worker will terminate.
http://jsfiddle.net/meovfpv3/3/
You can see that with a reasonable setTimeout of 100ms, all messages have adequate time to process before the next message occurs.
When you lower the setTimeout to 1ms, the message chain doesn't have time to finish before the next message is sent; the counters in the two threads eventually desync, tripping the if clause and terminating the web worker.
One way to fix this problem is instead of blindly posting a message every 1ms whether the last one has been processed or not, only post a new message after you have received a message back from the main thread. This means that you are only posting messages as fast as the main thread can process them.
For completeness here is a copy of the JSFiddle code:
Worker:
var counter2 = 0;
var rcvd = true;

function hi() {
  counter2++;
  console.log("");
  console.log("postMessage", counter2);
  postMessage(counter2);
  if (!rcvd) {
    self.close();
    console.log("No message received");
  }
  rcvd = false;
  setTimeout(hi, 1);
}
hi();

onmessage = function (e) {
  rcvd = true;
  console.log("secondMessage", e.data);
};
Main:
var ww = document.querySelector('script[type="text/ww"]'),
    code = ww.textContent,
    blob = new Blob([code], { type: 'text/javascript' }),
    blobUrl = URL.createObjectURL(blob),
    worker = new Worker(blobUrl),
    counter = 0;

worker.onmessage = function (e) {
  counter++;
  console.log("onmessage:", counter);
  worker.postMessage(e.data);
};
First, a couple of observations that I cannot explain but that are kind of interesting and might inspire someone:
@Anson: if I put your jsFiddle code into CodePen (still in Chrome), there are no problems there; the onmessage callback just keeps working!
And back in jsFiddle... it fails even with the setTimeout changed to a long gap like 10s, so it's not the number of times the worker posts a message; it's how long until the onmessage callback stops firing, and that varies a lot.
Then I found some ways to keep the onmessage handler alive in this specific example:
Add a button/link in the HTML and a handler (I used jQuery) that terminates the worker on click. Just adding this code fixes it: $("#stop").on("click", function (e) { e.preventDefault(); worker.terminate(); });
Just add console.log(worker) after defining onmessage.
Inspired by an answer posted in the related question, you can also simply add window.worker = worker after defining onmessage.
Something about mentioning worker again in all three cases seems to keep it alive.
Are you trying to postMessage every 1ms? Then you probably meant to use setInterval():
setInterval(function () {
  postMessage('1');
}, 1);
Edit: I incorrectly saw recursion which wasn't there, just because I was looking for it. I would still use setInterval over setTimeout though.

How to let javascript main thread sleep?

I'm writing a firefox addon and I want to do something before every http request is issued. Pseudocode:
var httpRequestObserver =
{
  observe: function (subject, topic, data)
  {
    var httpChannel = subject.QueryInterface(Ci.nsIHttpChannel);
    asyncFunction1() // asynchronous function No. 1
      .then(function () {
        // asynchronous function No. 2, called only after asyncFunction1 has finished
        // do something to httpChannel (edit the request header etc.)
      });
    // point 1
  }
};
observerService.addObserver(httpRequestObserver, "http-on-modify-request", false);
The problem is that asyncFunction2() may finish after observe() has returned, and according to Firefox the request is issued once observe() returns. So by the time asyncFunction2() edits httpChannel, the variable 'httpChannel' is already stale (because observe() has ended).
To keep the request from being issued before asyncFunction2() finishes, I need to make the main thread wait for asyncFunction2() at point 1. I have tried putting setTimeout(function waitfor(){}, xxxx) at point 1, but waitfor() starts only after observe() ends. I also tried putting asyncFunction1 and 2 in a ChromeWorker (similar to a web worker) and having the worker thread send a message when asyncFunction2() finished. However, the main thread cannot be interrupted while it is executing: JavaScript only puts the 'message received' event into the task queue, so by the time the message is handled, observe() has already returned.
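Worth noting (my own sketch, not from the thread): instead of making the main thread sleep, the channel itself can be paused, since nsIHttpChannel inherits suspend() and resume() from nsIRequest. A hedged sketch of observe() using that approach:
observe: function (subject, topic, data) {
  var httpChannel = subject.QueryInterface(Ci.nsIHttpChannel);
  httpChannel.suspend(); // hold the request; it will not go out yet
  asyncFunction1()
    .then(asyncFunction2) // edit httpChannel's headers etc. here
    .then(
      function () { httpChannel.resume(); }, // release on success...
      function () { httpChannel.resume(); }  // ...and on failure, so the request is never stuck
    );
}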

Internals (client and server) of aborting an XMLHttpRequest

So I'm curious about the actual underlying behaviours that occur when aborting an async javascript request. There was some related info in this question but I've yet to find anything comprehensive.
My assumption has always been that aborting the request causes the browser to close the connection and stop processing it entirely, thus causing the server to do the same if it's been setup to do so. I imagine however that there might be browser-specific quirks or edge cases here I'm not thinking of.
My understanding is as follows, I'm hoping someone can correct it if necessary and that this can be a good reference for others going forwards.
Aborting the XHR request client-side causes the browser to internally close the socket and stop processing the response. I would expect this behaviour, rather than the browser simply ignoring the incoming data and wasting memory. I wouldn't bet on IE doing the same, though.
An aborted request on the server would be up to what's running there:
I know with PHP the default behaviour is to stop processing when the client socket is closed, unless ignore_user_abort() has been called. So closing XHR connections saves you server power as well.
I'm really interested to know how this could be handled in node.js; I assume some manual work would be needed there.
I have no idea really about other server languages / frameworks and how they behave but if anyone wants to contribute specifics I'm happy to add them here.
For the client, the best place to look is in the source, so let's do this! :)
Let's look at Blink's implementation of XMLHttpRequest's abort method (lines 1083-1119 in XMLHttpRequest.cpp):
void XMLHttpRequest::abort()
{
    WTF_LOG(Network, "XMLHttpRequest %p abort()", this);

    // internalAbort() clears |m_loader|. Compute |sendFlag| now.
    //
    // |sendFlag| corresponds to "the send() flag" defined in the XHR spec.
    //
    // |sendFlag| is only set when we have an active, asynchronous loader.
    // Don't use it as "the send() flag" when the XHR is in sync mode.
    bool sendFlag = m_loader;

    // internalAbort() clears the response. Save the data needed for
    // dispatching ProgressEvents.
    long long expectedLength = m_response.expectedContentLength();
    long long receivedLength = m_receivedLength;

    if (!internalAbort())
        return;

    // The script never gets any chance to call abort() on a sync XHR between
    // send() call and transition to the DONE state. It's because a sync XHR
    // doesn't dispatch any event between them. So, if |m_async| is false, we
    // can skip the "request error steps" (defined in the XHR spec) without any
    // state check.
    //
    // FIXME: It's possible open() is invoked in internalAbort() and |m_async|
    // becomes true by that. We should implement more reliable treatment for
    // nested method invocations at some point.
    if (m_async) {
        if ((m_state == OPENED && sendFlag) || m_state == HEADERS_RECEIVED || m_state == LOADING) {
            ASSERT(!m_loader);
            handleRequestError(0, EventTypeNames::abort, receivedLength, expectedLength);
        }
    }
    m_state = UNSENT;
}
So from this, it looks like the majority of the grunt work is done within internalAbort, which looks like this:
bool XMLHttpRequest::internalAbort()
{
    m_error = true;

    if (m_responseDocumentParser && !m_responseDocumentParser->isStopped())
        m_responseDocumentParser->stopParsing();

    clearVariablesForLoading();

    InspectorInstrumentation::didFailXHRLoading(executionContext(), this, this);

    if (m_responseLegacyStream && m_state != DONE)
        m_responseLegacyStream->abort();

    if (m_responseStream) {
        // When the stream is already closed (including canceled from the
        // user), |error| does nothing.
        // FIXME: Create a more specific error.
        m_responseStream->error(DOMException::create(!m_async && m_exceptionCode ? m_exceptionCode : AbortError, "XMLHttpRequest::abort"));
    }

    clearResponse();
    clearRequest();

    if (!m_loader)
        return true;

    // Cancelling the ThreadableLoader m_loader may result in calling
    // window.onload synchronously. If such an onload handler contains open()
    // call on the same XMLHttpRequest object, reentry happens.
    //
    // If, window.onload contains open() and send(), m_loader will be set to
    // non 0 value. So, we cannot continue the outer open(). In such case,
    // just abort the outer open() by returning false.
    RefPtr<ThreadableLoader> loader = m_loader.release();
    loader->cancel();

    // If abort() called internalAbort() and a nested open() ended up
    // clearing the error flag, but didn't send(), make sure the error
    // flag is still set.
    bool newLoadStarted = m_loader;
    if (!newLoadStarted)
        m_error = true;

    return !newLoadStarted;
}
I'm no C++ expert but from the looks of it, internalAbort does a few things:
Stops any processing it's currently doing on a given incoming response
Clears out any internal XHR state associated with the request/response
Tells the inspector to report that the XHR failed (this is really interesting! I bet it's where those nice console messages originate)
Closes either the "legacy" version of a response stream, or the modern version of the response stream (this is probably the most interesting part pertaining to your question)
Deals with some threading issues to ensure the error is propagated properly (thanks, comments).
After a lot of digging around, I came across an interesting function within HttpResponseBodyDrainer (lines 110-124) called Finish, which looks to me like something that would eventually be called when a request is cancelled:
void HttpResponseBodyDrainer::Finish(int result) {
  DCHECK_NE(ERR_IO_PENDING, result);
  if (session_)
    session_->RemoveResponseDrainer(this);

  if (result < 0) {
    stream_->Close(true /* no keep-alive */);
  } else {
    DCHECK_EQ(OK, result);
    stream_->Close(false /* keep-alive */);
  }

  delete this;
}
It turns out that stream_->Close, at least in the BasicHttpStream, delegates to HttpStreamParser::Close, which, when given a non-reusable flag (which does seem to happen when the request is aborted, as seen in HttpResponseBodyDrainer), does close the socket:
void HttpStreamParser::Close(bool not_reusable) {
  if (not_reusable && connection_->socket())
    connection_->socket()->Disconnect();
  connection_->Reset();
}
So, in terms of what happens on the client, at least in the case of Chrome, it looks like your initial intuitions were correct as far as I can tell. Most of the quirks and edge cases have to do with scheduling/event-notification/threading issues, as well as browser-specific handling, e.g. reporting the aborted XHR to the devtools console.
In terms of the server, in the case of NodeJS you'd want to listen for the 'close' event on the http response object. Here's a simple example:
'use strict';
var http = require('http');

var server = http.createServer(function (req, res) {
  res.on('close', console.error.bind(console, 'Connection terminated before response could be sent!'));
  setTimeout(res.end.bind(res, 'yo'), 2000);
});

server.listen(8080);
Try running that and canceling the request before it completes. You'll see an error at your console.
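To drive that from the browser side, here is a minimal client sketch (my own, not from the answer; it assumes the server above is listening on localhost:8080):
var xhr = new XMLHttpRequest();
xhr.open('GET', 'http://localhost:8080/');
xhr.onabort = function () { console.log('aborted on the client'); };
xhr.onload = function () { console.log('response:', xhr.responseText); };
xhr.send();
setTimeout(function () { xhr.abort(); }, 500); // abort well before the server's 2s delay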
Hope you found this useful. Digging through the Chromium/Blink source was a lot of fun :)

When are Javascript events executed?

When I look at JS code like:
socket = new WebSocket(server);
socket.onopen = function (evt)
{
  // STUFF
};
I'm always a little confused by this. If you wrote something like that in almost any other language, there would be a very high chance of the onopen 'event handler' being bound AFTER the connection to the server had already been established, causing you to miss the onopen event. Even if the first line were executed asynchronously by the JavaScript interpreter, there would still be a slight chance of being too late on the second line.
Why does the above code run fine in Javascript while in C# (for example) it should be written as:
WebSocket socket = new WebSocket();
socket.onopen = new EventHandler<EventArgs>(Open);
socket.Connect(server);
Unlike most other languages, Javascript is strictly single-threaded.
While your code is running, nothing else can happen.
onopen cannot fire until control returns to the event loop (after the synchronous portion of that code has finished).
Note that this is true because onopen is fired in response to an asynchronous event (in this case, a socket).
If it were raised synchronously, that would not hold; to fix that, code that raises events synchronously in this pattern should instead raise them asynchronously, e.g. via process.nextTick.
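A hedged Node.js sketch of that pattern (my own illustration; connect and the 'open' event name are made up):
var EventEmitter = require('events').EventEmitter;

function connect(url) {
  var emitter = new EventEmitter();
  process.nextTick(function () {
    // deferred: the caller's synchronous code runs (and can attach listeners)
    // before the event fires
    emitter.emit('open', { url: url });
  });
  return emitter;
}

var socket = connect('ws://example');
socket.on('open', function (evt) {
  console.log('open', evt.url); // never missed, even though attached "late"
});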
A DIY version might be easier to visualize than a black-box native function:
function Sock(url) {
  this.init = function () {
    this.url = url;
    this.onopen.call(this, { name: "open", url: url, dt: +new Date });
  }.bind(this);
  setTimeout(this.init, 0); // defer so the caller can assign onopen first
}

socket = new Sock("123");
socket.onopen = function (evt) {
  alert(JSON.stringify(evt, null, "\t"));
};
It should be a lot more obvious why this works than why new WebSocket(server) works...
