Balancing clean design against browser limits - javascript

I'm trying to design an interface to a model (part of an MVC/MVP design) which could represent data stored remotely (on a server) or locally. To do this requires that my interface be asynchronous, i.e. I will request some data from the model and a callback will provide the actual data. However, the model could also be stored locally, that is, the request could be satisfied synchronously.
The issue I am running into is that generating the view could conceivably take a lot of calls to the model. If the model is synchronous, this could result in a stack overflow, since each callback recurses one level deeper and no browser JavaScript implementation supports tail recursion. I thought about using setTimeout, but the minimum delay is 4ms, and anything that runs into a stack overflow (which under my version of Chrome takes a bit over 10k frames) is going to be unacceptably slow with setTimeout (over 40s!) due to all those unnecessary 4ms waits between iterations through the model.
So I'm wondering how to solve this problem and I'm not coming up with any good solutions. The browser doesn't seem capable of doing what I need it to, and I can't have two different interfaces - one for local and one for remote - because the calling code shouldn't have to care.
Update: Here is the source of the recursion:
I'm using continuation.js to turn synchronous looking code here:
function test2() {
  var i
  for (i=0;i<10;i++) {
    AJaX(some_query+i,cont(tab))
    console.log(tab.status)
  }
}
Into this:
function test2() {
  var i, tab;
  i = 0;
  function _$loop_0(_$loop_0__$cont) {
    if (i < 10) {
      AJaX(some_query + i, function (arguments, _$param0) {
        tab = _$param0;
        console.log(tab.status);
        i++;
        _$loop_0(_$loop_0__$cont);
      }.bind(this, arguments));
    } else {
      _$loop_0__$cont();
    }
  }
  _$loop_0(function () {
  });
}
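One way to keep a single callback-style interface without recursing on every synchronous reply is to detect, inside a loop, whether the callback already ran before the request returned. The following is a minimal sketch of that idea, not what continuation.js generates; `getLocal` and `forEachRequest` are hypothetical names introduced for illustration.

```javascript
// Hypothetical local model: answers synchronously through the callback.
function getLocal(i, callback) {
  callback(i * 2);
}

// Calls request(0), request(1), ... one after another. Synchronous replies
// are handled by a plain loop; asynchronous replies resume from a fresh stack.
function forEachRequest(request, count, onItem, onDone) {
  let i = 0;
  function run() {
    let insideLoop = true;
    let repliedSync;
    do {
      if (i >= count) {
        onDone();
        return;
      }
      repliedSync = false;
      request(i, function (value) {
        onItem(value);
        i++;
        if (insideLoop) {
          repliedSync = true; // still inside request(): let the loop continue
        } else {
          run();              // replied later: start a new loop
        }
      });
    } while (repliedSync);
    insideLoop = false;
  }
  run();
}

let total = 0;
let finished = false;
forEachRequest(getLocal, 100000, function (v) { total += v; }, function () { finished = true; });
console.log(finished, total);
```

With a synchronous model the stack depth stays constant no matter how many requests are made, and a remote model that replies later goes through the same interface.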

Related

How to run an infinite blocking process in NodeJS?

I have a set of API endpoints in Express. One of them receives a request and starts a long running process that blocks other incoming Express requests.
My goal is to make this process non-blocking. To better understand the inner logic of the Node event loop and how to do this properly, I want to replace the long-running function with a dummy long-running blocking function that starts when I send a request to its endpoint.
I suppose that different ways of making the dummy function block could cause Node to manage the blocking differently.
So, my question is - how can I make a basic blocking process as a function that would run infinitely?
You can use node-webworker-threads.
var Worker, i$, x$, spin;
Worker = require('webworker-threads').Worker;
for (i$ = 0; i$ < 5; ++i$) {
  x$ = new Worker(fn$);
  x$.onmessage = fn1$;
  x$.postMessage(Math.ceil(Math.random() * 30));
}
(spin = function(){
  return setImmediate(spin);
})();
function fn$(){
  var fibo;
  fibo = function(n){
    if (n > 1) {
      return fibo(n - 1) + fibo(n - 2);
    } else {
      return 1;
    }
  };
  return this.onmessage = function(arg$){
    var data;
    data = arg$.data;
    return postMessage(fibo(data));
  };
}
function fn1$(arg$){
  var data;
  data = arg$.data;
  console.log("[" + this.thread.id + "] " + data);
  return this.postMessage(Math.ceil(Math.random() * 30));
}
https://github.com/audreyt/node-webworker-threads
So, my question is - how can I make a basic blocking process as a function that would run infinitely?
function block() {
  // not sure why you need that though
  while(true);
}
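For experiments it can be more convenient to block for a bounded time rather than forever, so that the server recovers afterwards. A minimal sketch of such a variant; `blockFor` is a hypothetical helper name, not part of Node:

```javascript
// Busy-waits on the calling thread for roughly `ms` milliseconds.
// Nothing else on the event loop (timers, I/O callbacks, incoming requests)
// can run until it returns.
function blockFor(ms) {
  const end = Date.now() + ms;
  let spins = 0;
  while (Date.now() < end) {
    spins++;
  }
  return spins; // number of loop iterations, only useful for curiosity
}

blockFor(50);
```

Calling it from an Express handler should stall every other request for about the given duration, which makes the effect of blocking easy to observe.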
I suppose, that different ways of making the dummy function blocking could cause Node manage these blockings differently.
Not really. I can't think of a "special way" to block the engine differently.
My goal to make this process non-blocking.
If it is really that long running you should really offload it to another thread.
There are shortcuts for a quick fix if it's a one-time thing: an npm module can do the job.
But the right way to do it is to set up a common design pattern called 'work queues'. You will need a queuing mechanism such as RabbitMQ or ZeroMQ. Whenever you get a computation-heavy task, instead of doing it in the same thread, you send it to the queue with the relevant id values. A separate Node process, commonly called a 'worker', listens for new messages on the queue and processes them as they arrive. You can read up on this worker queue pattern here:
https://www.rabbitmq.com/tutorials/tutorial-one-javascript.html
I would strongly advise you to learn this pattern, as you will come across many tasks that require this kind of mechanism. With this in place you can also scale your Node servers and your workers independently.
I am not sure what exactly your 'long processing' is, but in general you can approach this kind of problem in two different ways.
Option 1:
Use the webworker-threads module as @serkan pointed out. The usual 'thread' limitations apply in this scenario: you will need to communicate with the Worker through messages.
This method should be preferable only when the logic is too complicated to be broken down into smaller independent problems (explained in option 2). Depending on complexity you should also consider if native code would better serve the purpose.
Option 2:
Break down the problem into smaller problems. Solve a part of the problem, schedule the next part to be executed later, and yield to let NodeJS process other events.
For example, consider calculating the factorial of a number.
Sync way:
function factorial(inputNum) {
  let result = 1;
  while(inputNum) {
    result = result * inputNum;
    inputNum--;
  }
  return result;
}
Async way:
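function factorial(inputNum) {
  return new Promise(resolve => {
    let result = 1;
    const calcFactOneLevel = () => {
      // Check first so that factorial(0) and factorial(1) resolve to 1
      if (inputNum <= 1) {
        return resolve(result);
      }
      result = result * inputNum;
      inputNum--;
      // setImmediate (unlike process.nextTick) lets pending I/O run between steps
      setImmediate(calcFactOneLevel);
    };
    calcFactOneLevel();
  });
}
The code in the second example will not block the Node process, because setImmediate lets pending I/O events run between steps (process.nextTick would not: its callbacks run before the event loop continues). You can send the response when the returned promise resolves.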

Emscripten sandwiched by asynchronous Javascript Code

I'm trying to use Emscripten to write software that runs in the browser but also on other platforms (e.g. Android, PC standalone app).
The software structure is something like this:
main_program_loop() {
  if (gui.button_clicked()) {
    run_async(some_complex_action, gui.text_field.to_string())
  }
  if (some_complex_action_has_finished())
  {
    make_use_of(get_result_from_complex_action());
  }
}
some_complex_action(string_argument)
{
  some_object = read_local(string_argument);
  interm_res = simple_computation(some_object);
  other_object = expensive_computation(interm_res);
  send_remote(some_object.member_var, other_object);
  return other_object.member_var;
}
Let's call main_program_loop the GUI or frontend, some_complex_action the intermediate layer, and read_local, send_remote and expensive_computation the backend or lower layer.
Now the frontend and backend would be architecture-specific (e.g. for JavaScript, read_local could use IndexedDB and send_remote could use fetch), but the intermediate layer should make up more than 50% of the code (that's why I do not want to write it twice in two different languages, and instead write it once in C and transpile it to JavaScript; for Android I would use JNI).
Problems arise because in JavaScript the functions on the lowest layer (fetch etc.) run asynchronously (they return a promise or require a callback).
One approach I tried was to use promises and send IDs through the intermediate layer:
var promises = {};
var last_id = 0;
handle_click() {
  var id = Module.ccall('some_complex_action', 'number', ['string'], [text_field.value]);
  promises[id].then((result) => make_use_of(result));
}
recv_remote: function(str) {
  promises[last_id] = fetch(get_url(str)).then((response) => response.arrayBuffer());
  last_id += 1;
  return last_id - 1;
}
It works for the simple case of
some_complex_action(char *str)
{
  return recv_remote(str);
}
But for real cases it seems to get really complicated, maybe impossible. (I tried an approach where I gave every function a state, and every time a backend function finishes, the function is called again and advances its state, but the code got complicated like hell.) For comparison, if I were to call some_complex_action from C or Java, I'd just call it in a thread separate from the GUI thread, and inside that thread everything would happen synchronously.
I wish I could just call some_complex_action from an async function and put await inside recv_remote, but of course I can put await only directly in the async function, not in some function called further down the line. So that idea did not work out either.
Ideally I could somehow suspend execution of the Emscripten-transpiled intermediate code until the backend function has completed, then return from the backend function with the result and continue executing the transpiled code.
Has anyone used Emterpreter and can imagine that it could help me get to my goal?
Any ideas what I could do?

Understanding execute async script in Selenium

I've been using Selenium (mostly with the Python bindings and through Protractor) for a rather long time, and every time I needed to execute JavaScript code, I've used the execute_script() method. For example, for scrolling the page (Python):
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
Or, for infinite scrolling inside another element (Protractor):
var div = element(by.css('div.table-scroll'));
var lastRow = element(by.css('table#myid tr:last-of-type'));
browser.executeScript("return arguments[0].offsetTop;", lastRow.getWebElement()).then(function (offset) {
  browser.executeScript('arguments[0].scrollTop = arguments[1];', div.getWebElement(), offset).then(function() {
    // assertions
  });
});
Or, for getting a dictionary of all element attributes (python):
driver.execute_script('var items = {}; for (index = 0; index < arguments[0].attributes.length; ++index) { items[arguments[0].attributes[index].name] = arguments[0].attributes[index].value }; return items;', element)
But, WebDriver API also has execute_async_script() which I haven't personally used.
What use cases does it cover? When should I use execute_async_script() instead of the regular execute_script()?
The question is selenium-specific, but language-agnostic.
When should I use execute_async_script() instead of the regular execute_script()?
When it comes to checking conditions on the browser side, all checks you can perform with execute_async_script can be performed with execute_script. Even if what you are checking is asynchronous. I know because once upon a time there was a bug with execute_async_script that made my tests fail if the script returned results too quickly. As far as I can tell, the bug is gone now so I've been using execute_async_script but for months beforehand, I used execute_script for tasks where execute_async_script would have been more natural. For instance, performing a check that requires loading a module with RequireJS to perform the check:
driver.execute_script("""
    // Reset in case it's been used already.
    window.__selenium_test_check = undefined;
    require(["foo"], function (foo) {
        window.__selenium_test_check = foo.computeSomething();
    });
""")
result = driver.wait(lambda driver:
                     driver.execute_script("return window.__selenium_test_check;"))
The require call is asynchronous. The problem with this though, besides leaking a variable into the global space, is that it multiplies the network requests. Each execute_script call is a network request. The wait method works by polling: it runs the test until the returned value is true. This means one network request per check that wait performs (in the code above).
When you test locally it is not a big deal. If you have to go through the network because you are having the browsers provisioned by a service like Sauce Labs (which I use, so I'm talking from experience), each network request slows down your test suite. So using execute_async_script not only allows writing a test that looks more natural (call a callback, as we normally do with asynchronous code, rather than leak into the global space) but it also helps the performance of your tests.
result = driver.execute_async_script("""
    var done = arguments[0];
    require(["foo"], function (foo) {
        done(foo.computeSomething());
    });
""")
The way I see it now is that if a test is going to hook into asynchronous code on the browser side to get a result, I use execute_async_script. If it is going to do something for which there is no asynchronous method available, I use execute_script.
Here's the reference to the two APIs (well it's Javadoc, but the functions are the same), and here's an excerpt from it that highlights the difference
[executeAsyncScript] Execute an asynchronous piece of JavaScript in
the context of the currently selected frame or window. Unlike
executing synchronous JavaScript, scripts executed with this method
must explicitly signal they are finished by invoking the provided
callback. This callback is always injected into the executed function
as the last argument.
Basically, executeScript returns as soon as the injected script returns, while executeAsyncScript keeps waiting until the script invokes the provided callback (or the script timeout expires).
Since you've worked with protractor, I'll use that as example.
Protractor uses executeAsyncScript in both get and waitForAngular.
In waitForAngular, Protractor needs to wait until Angular announces that all events have settled. You can't use executeScript because that needs to return a value at the end (although I guess you could implement a busy loop that polls Angular constantly until it's done). The way it works is that Protractor provides a callback, which Angular calls once all events have settled, and that requires executeAsyncScript.
In get, Protractor needs to poll the page until the global window.angular is set by Angular. One way to do it is driver.wait(function() {return driver.executeScript('return window.angular')}, 5000), but that way Protractor would pound at the browser every few ms. Instead, we do this (simplified):
functions.testForAngular = function(attempts, callback) {
  var check = function(n) {
    if (window.angular) {
      callback('good');
    } else if (n < 1) {
      callback('timedout');
    } else {
      setTimeout(function() {check(n - 1);}, 1000);
    }
  };
  check(attempts);
};
Again, that requires executeAsyncScript because we don't have a return value immediately.
All in all, use executeAsyncScript when you care about a return value in a calling script, but that return value won't be available immediately. This is especially necessary if you can't poll for the result, but must get the result using a callback or promise (which you must translate to callback yourself).
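The promise-to-callback translation mentioned above could look roughly like this. It is only a sketch: the wrapper function `asyncScriptBody` and the `startWork` argument are hypothetical stand-ins so that the logic can run outside a browser. With WebDriver, only the function body would be sent as the script string, and `startWork()` would be replaced by whatever promise-returning call the page provides.

```javascript
function asyncScriptBody(startWork /*, ...otherArgs, done */) {
  // WebDriver appends the completion callback as the last argument.
  var done = arguments[arguments.length - 1];
  startWork().then(
    function (value) { done({ ok: true, value: value }); },
    // Report failures through the same callback so the test does not just time out.
    function (err) { done({ ok: false, error: String(err) }); }
  );
}

// Simulated call: the last argument plays the role of the injected callback.
asyncScriptBody(
  function () { return Promise.resolve(42); },
  function (result) { console.log(result); }
);
```

Passing both outcomes through `done` is a design choice of this sketch; it lets the calling test distinguish a rejected promise from a script that never finished.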

How do I make this JS function asynchronous?

function takesTime(){
  for (var i = 0; i<some_very_large_number; i++){
    //do something synchronous
  }
  console.log('a');
}
takesTime();
console.log('b');
This prints:
a
b
How would you make it print:
b
a
for (var i = 0; i < someVeryLargeNumber; ++i) {
  setTimeout(function () {
    //do something synchronous
  }, 0);
}
Also see setZeroTimeout to gain a few milliseconds each loop, although the work people are doing there seems to be browser-based.
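Scheduling one timer per iteration creates a very large number of timers. A variation on the same idea is to run the loop in slices and yield between them. This is a minimal sketch under that assumption; `runInChunks` and its parameters are hypothetical names, and the slice size is arbitrary.

```javascript
// Runs `total` iterations of `step` in slices of `chunkSize`, returning to the
// event loop between slices so other callbacks can run.
function runInChunks(total, chunkSize, step, onDone) {
  let i = 0;
  function slice() {
    const end = Math.min(i + chunkSize, total);
    for (; i < end; i++) {
      step(i);
    }
    if (i < total) {
      setTimeout(slice, 0); // schedule the next slice, then yield
    } else {
      onDone();
    }
  }
  setTimeout(slice, 0); // defer even the first slice so the caller continues first
}

const order = [];
let sum = 0;
runInChunks(1000000, 10000, function (i) { sum += i; }, function () {
  order.push('a');
  console.log(order.join(' ')); // prints "b a"
});
order.push('b');
```

The work still runs on the same thread, so this only interleaves it with other events; it does not make it faster.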
I see this is tagged node.js, so I'll answer it from that perspective: you shouldn't. Usually, if you're blocking, it will be: network-bound (you should be using and/or reusing network libraries around asynchronous methods), I/O-bound (you should be using and/or reusing I/O libraries), or CPU-bound. You haven't provided any context for what the long-running task is, and given that you have a loop invariant containing some_very_large_number, I'm assuming you're imagining some CPU-intensive task iterating over a large field.
If you're actually CPU-bound, you should rethink your strategy. Node only lives on one core, so even if you were able to use multithreading, you'd really just be spinning your wheels, as each request would still require a certain amount of CPU time. If you actually intend on doing something computationally-intensive, you may want to look into using a queuing system, and having something else processing the data that's better designed for crunching it.
Javascript is event-based, and everything happens in a single thread. The way for you to make it "asynchronous" is to use a timeout (setTimeout()).
You can use web workers to achieve your objective, but you'll require a separate js file, and you'll have to add plumbing code to post messages and handle those messages.
node.js doesn't support web workers natively, but an implementation is available at:
https://github.com/cramforce/node-worker/
Otherwise, it's similar to the following code:
var child = require('child_process').spawn('node', ['childScript.js']);
child.stdout.on('data', function(data) {
  // data is a Buffer; convert it before printing
  console.log(data.toString());
});
console.log('b');
childScript.js:
for (var i = 0; i < some_very_large_number; i++) {
  // do something synchronous
}
console.log('a');

Node.js and Mutexes

I'm wondering if mutexes/locks are required for data access within Node.js. For example, let's say I've created a simple server. The server provides a couple of protocol methods to add to and remove from an internal array. Do I need to protect the internal array with some type of mutex?
I understand JavaScript (and thus Node.js) is single-threaded. I'm just not clear on how events are handled. Do events interrupt? If that is the case, my app could be in the middle of reading the array, get interrupted to run an event callback which changes the array, and then continue processing the array which has now been changed by the event callback.
Locks and mutexes are indeed necessary sometimes, even if Node.js is single-threaded.
Suppose you have two files that must have the same content and not having the same content is considered an inconsistent state. Now suppose you need to change them without blocking the server. If you do this:
fs.writeFile('file1', 'content', function (error) {
  if (error) {
    // ...
  } else {
    fs.writeFile('file2', 'content', function (error) {
      if (error) {
        // ...
      } else {
        // ready to continue
      }
    });
  }
});
you are in an inconsistent state between the two calls, during which another function in the same script may read the two files.
The rwlock module is perfect to handle these cases.
I'm wondering if mutexes/locks are required for data access within Node.js.
Nope! Events are handled the moment there's no other code to run. This means there will be no contention, as only the currently running code has access to that internal array. As a side effect of Node being single-threaded, long computations will block all other events until the computation is done.
I understand Javascript (and thus Node.js) is single threaded. I'm just not clear on how events are handled. Do events interrupt?
Nope, events are not interrupted. For example, if you put a while(true){} into your code, it would stop any other code from being executed, because there is always another iteration of the loop to be run.
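A small sketch of this run-to-completion behaviour: a timer that is already due still cannot modify the array while synchronous code is reading it. The variable names are made up for the illustration.

```javascript
const items = [1, 2, 3];

// Due "immediately", but it can only run once the current code has finished.
setTimeout(function () {
  items.push(4);
}, 0);

// Stay busy for about 20 ms, long after the timer is due.
const end = Date.now() + 20;
let lengthSeen = items.length;
while (Date.now() < end) {
  lengthSeen = items.length;
}
console.log(lengthSeen); // prints 3: the callback has not run yet
```

The push only happens after this script has returned control to the event loop.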
If you have a long-running computation, it is a good idea to use process.nextTick, as this will allow it to be run when nothing else is running (I'm fuzzy on this: the example below shows that I'm probably right about it running uninterrupted, probably).
If you have any other questions, feel free to stop into #node.js and ask questions. Also, I asked a couple people to look at this and make sure I'm not totally wrong ;)
var count = 0;
var numIterations = 100;
while(numIterations--) {
  process.nextTick(function() {
    count = count + 1;
  });
}
setTimeout(function() {
  console.log(count);
}, 2);
//
//=> 100
//
Thanks to AAA_awright of #node.js :)
I was looking for a solution for Node mutexes. Mutexes are sometimes necessary: you could be running multiple instances of your Node application and may want to ensure that only one of them is doing some particular thing. All the solutions I could find were either not cross-process or dependent on Redis.
So I made my own solution using file locks: https://github.com/Perennials/mutex-node
Mutexes are definitely necessary for a lot of back end implementations. Consider a class where you need to maintain synchronicity of async execution by constructing a promise chain.
let _ = new WeakMap();
class Foobar {
  constructor() {
    _.set(this, { pc : Promise.resolve() } );
  }
  doSomething(x) {
    return new Promise( (resolve,reject) => {
      _.get(this).pc = _.get(this).pc.then( () => {
        // y = some value obtained asynchronously
        resolve(y);
      })
    })
  }
}
How can you be sure that a promise is not left dangling via race condition? It's frustrating that node hasn't made mutexes native since javascript is so inherently asynchronous and bringing third party modules into the process space is always a security risk.
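For in-process serialization of async work, the promise chain itself can act as the lock. The following is a minimal sketch of that idea; the `Mutex` class is hypothetical and is not the API of the rwlock or mutex-node modules mentioned earlier.

```javascript
// Minimal promise-chain mutex: tasks passed to run() execute one at a time,
// in call order, even when each task is asynchronous.
class Mutex {
  constructor() {
    this._tail = Promise.resolve();
  }
  run(task) {
    const result = this._tail.then(() => task());
    // Keep the chain usable after a failed task; callers still see the
    // rejection through `result`.
    this._tail = result.catch(() => {});
    return result;
  }
}

const lock = new Mutex();
const log = [];
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

lock.run(async () => { log.push('A start'); await sleep(20); log.push('A end'); });
lock.run(async () => { log.push('B start'); await sleep(1); log.push('B end'); });
const allDone = lock.run(() => log.slice());
allDone.then((snapshot) => console.log(snapshot.join(', ')));
// prints "A start, A end, B start, B end"
```

Because every task is appended to the same chain, the second task cannot start until the first has settled, so no promise is left dangling as long as each task's own promise settles.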
