What Does 'Then' Really Mean in CasperJS - javascript

I'm using CasperJS to automate a series of clicks, completed forms, parsing data, etc through a website.
Casper seems to be organized into a list of preset steps in the form of then statements (see their example here: http://casperjs.org/quickstart.html) but it's unclear what triggers the next statement to actually run.
For example, does then wait for all pending requests to complete? Does injectJS count as a pending request? What happens if I have a then statement nested - chained to the end of an open statement?
casper.thenOpen('http://example.com/list', function(){
    casper.page.injectJs('/libs/jquery.js');
    casper.evaluate(function(){
        var id = jQuery("span:contains('"+itemName+"')").closest("tr").find("input:first").val();
        casper.open("http://example.com/show/"+id); //what if 'then' was added here?
    });
});
casper.then(function(){
    //parse the 'show' page
});
I'm looking for a technical explanation of how the flow works in CasperJS. My specific problem is that my last then statement (above) runs before my casper.open statement & I don't know why.

then() basically adds a new navigation step in a stack. A step is a javascript function which can do two different things:
waiting for the previous step, if any, to finish executing
waiting for a requested url and the related page to load
Let's take a simple navigation scenario:
var casper = require('casper').create();
casper.start();
casper.then(function step1() {
    this.echo('this is step one');
});
casper.then(function step2() {
    this.echo('this is step two');
});
casper.thenOpen('http://google.com/', function step3() {
    this.echo('this is step 3 (google.com is loaded)');
});
You can print out all the created steps within the stack like this:
require('utils').dump(casper.steps.map(function(step) {
    return step.toString();
}));
That gives:
$ casperjs test-steps.js
[
"function step1() { this.echo('this is step one'); }",
"function step2() { this.echo('this is step two'); }",
"function _step() { this.open(location, settings); }",
"function step3() { this.echo('this is step 3 (google.com is loaded)'); }"
]
Notice the _step() function which has been added automatically by CasperJS to load the url for us; when the url is loaded, the next step available in the stack — which is step3() — is called.
When you have defined your navigation steps, run() executes them one by one sequentially:
casper.run();
Footnote: the callback/listener stuff is an implementation of the Promise pattern.

then() merely registers a series of steps.
run() and its family of runner functions, callbacks, and listeners, are all what actually do the work of executing each step.
Whenever a step is completed, CasperJS will check against 3 flags: pendingWait, loadInProgress, and navigationRequested. If any of those flags is true, then do nothing, go idle until a later time (setInterval style). If none of those flags is true, then the next step will get executed.
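To make that concrete, here is a minimal sketch of the idea (this is not the actual CasperJS source; the property access is simplified): on each tick, the runner only advances to the next step when none of the three flags is raised.
// conceptual sketch only, not CasperJS internals
function tryNextStep(casper) {
    if (casper.pendingWait || casper.loadInProgress || casper.navigationRequested) {
        return; // something is still pending; try again on the next tick
    }
    var step = casper.steps[casper.step++];
    if (step) {
        step.call(casper); // run the next queued step
    }
}
// polled setInterval-style:
// setInterval(function() { tryNextStep(casper); }, 100);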
As of CasperJS 1.0.0-RC4, a flaw exists where, under certain time-based circumstances, the "try to do next step" method is triggered before CasperJS has had time to raise either the loadInProgress or the navigationRequested flag. The solution is to raise one of those flags before leaving any step where those flags are expected to be raised (e.g. raise a flag either before or after asking for a casper.click()), maybe like so:
(Note: this is only illustrative, more like pseudocode than proper CasperJS form...)
step_one = function(){
    casper.click(/* something */);
    do_whatever_you_want();
    casper.click(/* something else */); // Click something else, why not?
    more_magic_that_you_like();
    here_be_dragons();
    // Raise a flag before exiting this "step"
    profit();
};
To wrap up that solution into a single-line of code, I introduced blockStep() in this github pull request, extending click() and clickLabel() as a means to help guarantee that we get the expected behaviour when using then(). Check out the request for more info, usage patterns, and minimum test files.

According to the CasperJS Documentation:
then()
Signature: then(Function then)
This method is the standard way to add a new navigation step to the stack, by providing a simple function:
casper.start('http://google.fr/');
casper.then(function() {
    this.echo('I\'m in your google.');
});
casper.then(function() {
    this.echo('Now, let me write something');
});
casper.then(function() {
    this.echo('Oh well.');
});
casper.run();
You can add as many steps as you need. Note that the current Casper instance automatically binds the this keyword for you within step functions.
To run all the steps you defined, call the run() method, and voila.
Note: You must start() the casper instance in order to use the then() method.
Warning: Step functions added to then() are processed in two different cases:
when the previous step function has been executed,
when the previous main HTTP request has been executed and the page loaded;
Note that there's no single definition of page loaded; is it when the DOMReady event has been triggered? Is it "all requests being finished"? Is it "all application logic being performed"? Or "all elements being rendered"? The answer always depends on the context. That's why you're encouraged to always use the waitFor() family of methods to keep explicit control over what you actually expect.
A common trick is to use waitForSelector():
casper.start('http://my.website.com/');
casper.waitForSelector('#plop', function() {
    this.echo('I\'m sure #plop is available in the DOM');
});
casper.run();
Behind the scenes, the source code for Casper.prototype.then is shown below:
/**
 * Schedules the next step in the navigation process.
 *
 * @param  function  step  A function to be called as a step
 * @return Casper
 */
Casper.prototype.then = function then(step) {
    "use strict";
    this.checkStarted();
    if (!utils.isFunction(step)) {
        throw new CasperError("You can only define a step as a function");
    }
    // check if casper is running
    if (this.checker === null) {
        // append step to the end of the queue
        step.level = 0;
        this.steps.push(step);
    } else {
        // insert substep a level deeper
        try {
            step.level = this.steps[this.step - 1].level + 1;
        } catch (e) {
            step.level = 0;
        }
        var insertIndex = this.step;
        while (this.steps[insertIndex] && step.level === this.steps[insertIndex].level) {
            insertIndex++;
        }
        this.steps.splice(insertIndex, 0, step);
    }
    this.emit('step.added', step);
    return this;
};
Explanation:
In other words, then() schedules the next step in the navigation process.
When then() is called, it is passed a function as a parameter which is to be called as a step.
It checks if an instance has started, and if it has not, it displays the following error:
CasperError: Casper is not started, can't execute `then()`.
Next, it checks if the page object is null.
If the condition is true, Casper creates a new page object.
After that, then() validates the step parameter to check if it is not a function.
If the parameter is not a function, it displays the following error:
CasperError: You can only define a step as a function
Then, the function checks if Casper is running.
If Casper is not running, then() appends the step to the end of the queue.
Otherwise, if Casper is running, it inserts a substep a level deeper than the previous step.
Finally, the then() function concludes by emitting a step.added event, and returns the Casper object.
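As a small illustration of that substep behaviour, calling then() from inside a step that is already running splices the new step in right after the current one, rather than appending it to the end of the queue (the output shown is inferred from the insertion logic above):
casper.start('http://example.com/', function() {
    this.echo('step A');
    this.then(function() {          // added while Casper is running,
        this.echo('substep A.1');   // so it is inserted right after step A
    });
});
casper.then(function() {
    this.echo('step B');            // still runs after the substep
});
casper.run();
// prints: step A, substep A.1, step B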

Related

How does node process concurrent requests?

I have been reading up on nodejs lately, trying to understand how it handles multiple concurrent requests. I know NodeJs is a single threaded event loop based architecture, and at a given point in time only one statement is going to be executing, i.e. on the main thread and that blocking code/IO calls are handled by the worker threads (default is 4).
Now my question is, what happens when a web server built using NodeJs receives multiple requests? I know that there are lots of similar questions here, but haven't found a concrete answer to my question.
So as an example, let's say we have following code inside a route like /index:
app.use('/index', function(req, res, next) {
    console.log("hello index routes was invoked");
    readImage("path", function(err, content) {
        status = "Success";
        if(err) {
            console.log("err :", err);
            status = "Error";
        }
        else {
            console.log("Image read");
        }
        return res.send({ status: status });
    });
    var a = 4, b = 5;
    console.log("sum =", a + b);
});
Let's assume that the readImage() function takes around 1 minute to read the image.
If two requests, T1 and T2, come in concurrently, how is NodeJS going to process them?
Is it going to take the first request, T1, and process it while queueing T2? I assume that when any async/blocking work such as readImage is encountered, it is handed to a worker thread (and at some later point, when the async work is done, that thread notifies the main thread, which then runs the callback?), and execution continues with the next line of code.
When it is done with T1, does it then process the T2 request? Is that correct? Or can it process T2 in between (meaning that while the code for readImage is running, it can start processing T2)?
Is that right?
Your confusion might be coming from not focusing on the event loop enough. Clearly you have an idea of how this works, but maybe you do not have the full picture yet.
Part 1, Event Loop Basics
When you call the use method, what happens behind the scenes is another thread is created to listen for connections.
However, when a request comes in, because we're in a different thread than the V8 engine (and cannot directly invoke the route function), a serialized call to the function is appended onto the shared event loop, for it to be called later. ('event loop' is a poor name in this context, as it operates more like a queue or stack)
At the end of the JavaScript file, the V8 engine will check if there are any running threads or messages in the event loop. If there are none, it will exit with a code of 0 (this is why server code keeps the process running). So the first timing nuance to understand is that no request will be processed until the synchronous end of the JavaScript file is reached.
If the event loop was appended to while the process was starting up, each function call on the event loop will be handled one by one, in its entirety, synchronously.
For simplicity, let me break down your example into something more expressive.
function callback() {
    setTimeout(function inner() {
        console.log('hello inner!');
    }, 0); // †
    console.log('hello callback!');
}
setTimeout(callback, 0);
setTimeout(callback, 0);
† setTimeout with a time of 0 is a quick and easy way to put something on the event loop without any timer complications, since, no matter what, at least 0 ms will always have elapsed.
In this example, the output will always be:
hello callback!
hello callback!
hello inner!
hello inner!
Both serialized calls to callback are appended to the event loop before either of them is called. This is guaranteed. That happens because nothing can be invoked from the event loop until after the full synchronous execution of the file.
It can be helpful to think of the execution of your file, as the first thing on the event loop. Because each invocation from the event loop can only happen in series, it becomes a logical consequence, that no other event loop invocation can occur during its execution; Only when the previous invocation is finished, can the next event loop function be invoked.
Part 2, The inner Callback
The same logic applies to the inner callback as well, and can be used to explain why the program will never output:
hello callback!
hello inner!
hello callback!
hello inner!
Like you might expect.
By the end of the execution of the file, two serialized function calls will be on the event loop, both for callback. As the event loop is a FIFO (first in, first out), the setTimeout that came first will be invoked first.
The first thing callback does is perform another setTimeout. As before, this will append a serialized call, this time to the inner function, to the event loop. setTimeout immediately returns, and execution will move on to the first console.log.
At this time, the event loop looks like this:
1 [callback] (executing)
2 [callback] (next in line)
3 [inner] (just added by callback)
The return of callback is the signal for the event loop to remove that invocation from itself. This leaves 2 things in the event loop now: 1 more call to callback, and 1 call to inner.
Now callback is the next function in line, so it will be invoked next. The process repeats itself. A call to inner is appended to the event loop. A console.log prints hello callback! and we finish by removing this invocation of callback from the event loop.
This leaves the event loop with 2 more functions:
1 [inner] (next in line)
2 [inner] (added by most recent callback)
Neither of these functions mess with the event loop any further. They execute one after the other, the second one waiting for the first one's return. Then when the second one returns, the event loop is left empty. This fact, combined with the fact that there are no other threads currently running, triggers the end of the process, which exits with a return code of 0.
Part 3, Relating to the Original Example
The first thing that happens in your example, is that a thread is created within the process which will create a server bound to a particular port. Note, this is happening in precompiled C++ code, not JavaScript, and is not a separate process, it's a thread within the same process. see: C++ Thread Tutorial.
So now, whenever a request comes in, the execution of your original code won't be disturbed. Instead, incoming connection requests will be opened, held onto, and appended to the event loop.
The use function is the gateway into catching the events for incoming requests. It's an abstraction layer, but for the sake of simplicity, it's helpful to think of the use function like you would a setTimeout. Except, instead of waiting a set amount of time, it appends the callback to the event loop upon incoming HTTP requests.
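As a rough illustration of that analogy (using the bare http module, not Express internals), each incoming request simply queues a call to your handler on the event loop, and those calls run one at a time on the main thread:
var http = require('http');
http.createServer(function(req, res) {
    // invoked from the event loop; never runs in parallel with other JavaScript
    res.end('handled ' + req.url);
}).listen(3000);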
So, let's assume that there are two requests coming in to the server: T1 and T2. In your question you say they come in concurrently, since this is technically impossible, I'm going to assume they are one after the other, with a negligible time in between them.
Whichever request comes in first, will be handled first by the secondary thread from earlier. Once that connection has been opened, it's appended to the event loop, and we move on to the next request, and repeat.
At any point after the first request is added to the event loop, V8 can begin execution of the use callback.
A quick aside about readImage
Since it's unclear whether readImage is from a particular library, something you wrote, or otherwise, it's impossible to tell exactly what it will do in this case. There are only 2 possibilities though, so here they are:
It's entirely synchronous, never using an alternate thread or the event loop
const fs = require('fs'); // assuming the built-in fs module

function readImage (path, callback) {
    let image = fs.readFileSync(path);
    callback(null, image);
    // a definition like this will force the callback to
    // fully return before readImage returns. This means
    // readImage will block any subsequent calls.
}
It's entirely asynchronous, and takes advantage of fs' async callback.
const fs = require('fs');

function readImage (path, callback) {
    fs.readFile(path, (err, data) => {
        callback(err, data);
    });
    // a definition like this will force readImage
    // to immediately return, and allow execution
    // to continue.
}
For the purposes of explanation, I'll be operating under the assumption that readImage will immediately return, as proper asynchronous functions should.
Once the use callback execution is started, the following will happen:
The first console log will print.
readImage will kick off a worker thread and immediately return.
The second console log will print.
During all of this, it's important to note that these operations are happening synchronously; no other event loop invocation can start until these are finished. readImage may be asynchronous, but calling it is not; the callback and the use of a worker thread are what make it asynchronous.
After this use callback returns, the next request has probably already finished parsing and was added to the event loop, while V8 was busy doing our console logs and readImage call.
So the next use callback is invoked, and repeats the same process: log, kick off a readImage thread, log again, return.
After this point, the readImage functions (depending on how long they take) have probably already retrieved what they needed and appended their callback to the event loop. So they will get executed next, in order of whichever one retrieved its data first. Remember, these operations were happening in separate threads, so they happened not only in parallel to the main javascript thread, but also parallel to each other, so here, it doesn't matter which one got called first, it matters which one finished first, and got 'dibs' on the event loop.
Whichever readImage completed first will be the first one to execute. So, assuming no errors occurred, we'll print out to the console, then write to the response for the corresponding request, held in lexical scope.
When that send returns, the next readImage callback will begin execution: console log, and writing to the response.
At this point, both readImage threads have died, and the event loop is empty, but the thread that holds the server port binding is keeping the process alive, waiting for something else to add to the event loop, and the cycle to continue.
I hope this helps you understand the mechanics behind the asynchronous nature of the example you provided.
For each incoming request, node will handle it one by one. That means there must be an order, just like a queue: first in, first served. When node starts processing a request, all the synchronous code executes, and the asynchronous parts are passed to worker threads, so node can start processing the next request. When the asynchronous part is done, it goes back to the main thread and continues.
So when your synchronous code takes too long, you block the main thread and node won't be able to handle other requests. It's easy to test:
app.use('/index', function(req, res, next) {
    // synchronous part
    console.log("hello index routes was invoked");
    var sum = 0;
    // useless heavy task to keep running and block the main thread
    for (var i = 0; i < 100000000000000000; i++) {
        sum += i;
    }
    // asynchronous part, passed to a worker thread
    readImage("path", function(err, content) {
        // when the worker thread finishes, this callback is added to the end
        // of the event loop and waits to be processed by the main thread
        status = "Success";
        if(err) {
            console.log("err :", err);
            status = "Error";
        }
        else {
            console.log("Image read");
        }
        return res.send({ status: status });
    });
    // continue the synchronous part at the same time
    var a = 4, b = 5;
    console.log("sum =", a + b);
});
Node won't start processing the next request until it has finished all the synchronous parts of the current one. That is why people say: don't block the main thread.
There are a number of articles that explain this such as this one
The long and the short of it is that nodejs is not really a single threaded application, it's an illusion. The diagram at the top of the above link explains it reasonably well, however as a summary:
NodeJS event-loop runs in a single thread
When it gets a request, it hands that request off to a new thread
So, in your code, your running application will have a PID of 1, for example. When you get request T1, it creates PID 2 that processes that request (taking 1 minute). While that's running you get request T2, which spawns PID 3, also taking 1 minute. Both PID 2 and 3 will end after their task is completed, however PID 1 will continue listening and handing off events as and when they come in.
In summary, NodeJS being 'single threaded' is true, however it's just an event-loop listener. When events are heard (requests), it passes them off to a pool of threads that execute asynchronously, meaning it's not blocking other requests.
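A small way to see this for yourself (the file paths are placeholders): both reads are serviced in the background while the main thread carries on immediately.
var fs = require('fs');
console.time('read1');
console.time('read2');
fs.readFile('/tmp/a.bin', function() { console.timeEnd('read1'); }); // placeholder path
fs.readFile('/tmp/b.bin', function() { console.timeEnd('read2'); }); // placeholder path
console.log('still on the main thread, nothing blocked');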
You can simply create a child process by moving the readImage() function into a different file and using fork().
The parent file, parent.js:
const { fork } = require('child_process');
const forked = fork('child.js');

forked.on('message', (msg) => {
    console.log('Message from child', msg);
});

forked.send({ hello: 'world' });
The child file, child.js:
process.on('message', (msg) => {
    console.log('Message from parent:', msg);
});

let counter = 0;

setInterval(() => {
    process.send({ counter: counter++ });
}, 1000);
The above article might be useful to you.
In the parent file above, we fork child.js (which will execute the file with the node command) and then we listen for the message event. The message event will be emitted whenever the child uses process.send, which we’re doing every second.
To pass down messages from the parent to the child, we can execute the send function on the forked object itself, and then, in the child script, we can listen to the message event on the global process object.
When executing the parent.js file above, it’ll first send down the { hello: 'world' } object to be printed by the forked child process and then the forked child process will send an incremented counter value every second to be printed by the parent process.
The V8 JS interpreter (i.e. Node) is basically single threaded. But the processes it kicks off can be asynchronous, for example 'fs.readFile'.
As the express server runs, it will open new processes as it needs to complete the requests. So the 'readImage' function will be kicked off (usually asynchronously) meaning that they will return in any order. However the server will manage which response goes to which request automatically.
So you will NOT have to manage which readImage response goes to which request.
So basically, T1 and T2 will not return concurrently; this is virtually impossible. They are both heavily reliant on the filesystem to complete the 'read' and they may finish in ANY ORDER (this cannot be predicted). Note that processes are handled by the OS layer and are by nature multithreaded (in a modern computer).
If you are looking for a queue system, it should not be too hard to implement/ensure that images are read/returned in the exact order that they are requested.
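For example, here is a minimal sketch of such a queue, assuming the readImage(path, callback) function from the question: reads are started one at a time, so responses go out in the order the requests arrived.
var jobs = [];
var busy = false;

function enqueueRead(path, res) {
    jobs.push({ path: path, res: res });
    processNext();
}

function processNext() {
    if (busy || jobs.length === 0) return;
    busy = true;
    var job = jobs.shift();
    readImage(job.path, function(err, content) {
        job.res.send({ status: err ? "Error" : "Success" });
        busy = false;
        processNext(); // start the next queued read, preserving order
    });
}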
Since there's not really more to add to the previous answer from Marcus - here's a graphic that explains the single threaded event-loop mechanism:

Stopping synchronous function after 2 seconds

I'm using the npm library jsdiff, which has a function that determines the difference between two strings. This is a synchronous function, but given two large, very different strings, it will take extremely long periods of time to compute.
diff = jsdiff.diffWords(article[revision_comparison.field], content[revision_comparison.comparison]);
This function is called in a stack that handles a request through Express. How can I, for the sake of the user, make the experience more bearable? I think my two options are:
Cancelling the synchronous function somehow.
Cancelling the user request somehow. (But would this keep the function still running?)
Edit: I should note that given two very large and different strings, I want different logic to take place in the code. Therefore, simply waiting for the process to finish is unnecessary and puts a needless load on the server - I definitely don't want it to run for any long period of time.
Fork a child process for that specific task; you can even create a queue to limit the number of child processes that can be running at a given moment.
Here is a basic example of a worker: the master hands the heavy synchronous work off to a child process so it doesn't block the main (master) thread, and once the child has finished it returns the outcome to the master, which then answers the original Express request.
Worker (fork example), worker.js:
process.on('message', function(msg) {
    /* > Your jsdiff logic goes here */
    // change this for your heavy synchronous work; note that only plain,
    // serializable data crosses the IPC boundary, so the master sends the
    // input string rather than the req/res objects themselves
    var outcome = false;
    if(msg.input == 'testlongerstring'){ outcome = true; }
    // Pass the result back to the master process:
    process.send({ outcome: outcome });
});
And from your master:
var cp = require('child_process');
var child = cp.fork(__dirname + '/worker.js');
You can then hand the work to the child from a route handler like this (from the master; imagine app = express):
app.get('/stringCheck/:input', function(req, res) {
    // listen once for the child's reply to this request (with several
    // concurrent requests you would tag each message with an id so the
    // right reply reaches the right response)
    child.once('message', function(msg) {
        // receive results from the child process
        console.log('received: ' + msg.outcome);
        res.send(msg.outcome); // end response with data
    });
    // send only the data the child needs; req/res cannot be serialized
    child.send({ input: req.params.input });
});
I found this on jsdiff's repository:
All methods above which accept the optional callback method will run in sync mode when that parameter is omitted and in async mode when supplied. This allows for larger diffs without blocking the event loop. This may be passed either directly as the final parameter or as the callback field in the options object.
This means that you should be able to add a callback as the last parameter, making the function asynchronous. It will look something like this:
jsdiff.diffWords(article[x], content[y], function(err, diff) {
    //add whatever you need
});
Now, you have several choices:
Return directly to the user and keep the function running in the background.
Set a 2 second timeout (or whatever limit fits your application) using setTimeout, as outlined in this answer.
If you go with option 2, your code should look something like this
var finished = false; // guard so the callback only ever runs once
jsdiff.diffWords(article[x], content[y], function(err, diff) {
    if (finished) return;
    finished = true;
    //add whatever you need
    return callback(err, diff);
});
//if this fires first, it means that the above operation took more than 2000ms (2 seconds)
setTimeout(function() {
    if (!finished) { finished = true; return callback(); }
}, 2000);

Forcing code to run sequentially in JavaScript

In an Angular.js and Socket.io app, I want to show a loading indicator before sending an image via Socket.io. I wrote this code:
When a button is clicked this code runs; first I want to run the startLoading function and after that I want to send the image:
$scope.startLoading(function(){
    $scope.socket.emit('sendImg', {
        data: $scope.newImg,
    });
});
and this is my startLoading function:
$scope.startLoading = function (callback) {
    //DO SOME STUFF LIKE ENABLING LOADING ANIMATION
    $scope.animation = true; //bound to the UI
    $scope.errors = false;   //bound to the UI
    callback(); //it seems this code runs before the lines above
};
But it seems the callback() line runs before the first two lines, and because of that my loading indicator appears after the image has been sent to the server! Why? I changed the callback line to a timeout like this and it works fine, but is this a good solution? I don't think so! What should I do to write this properly?
$scope.startLoading = function (callback) {
    //DO SOME STUFF LIKE ENABLING LOADING ANIMATION
    $scope.animation = true; //bound to the UI
    $scope.errors = false;   //bound to the UI
    $timeout(function(){ callback(); }, 1000);
};
Actually, the code runs sequentially, but calling the callback and sending the image freezes the page, and because of that my loading indicator appears only after the freeze ends. I need the loading indicator to appear before the freeze starts.
Your code really does run sequentially, but the first two lines don't change the UI immediately.
When you assign some value to a scope variable, it's just that, a variable assignment. It doesn't trigger any events. Angular will only update the UI later, when it evaluates the bindings and finds the change. So here is what happens:
$scope.startLoading = function (callback) {
    // Presumably this is called from some event from Angular,
    // so all this is run in an $apply, in an Angular "context".
    // But this is still "plain" javascript, so the next two lines
    // are just plain variable assignments.
    $scope.animation = true;
    $scope.errors = false;
    callback(); // This function does its thing, then returns
    // When this function returns, Angular will evaluate all of its
    // bindings, will find that the above values have changed,
    // and will update the DOM.
};
For details, see the "Integration with the browser event loop" section in the dev guide.
What you want is to ensure that the DOM is updated before your callback runs. I think there is nothing wrong with using $timeout for this.
There might be a better/nicer way, but I haven't found it yet...
So it would become something like this:
$scope.startLoading = function (callback) {
    $scope.animation = true; // at this point, these are just
    $scope.errors = false;   // plain variable assignments
    $timeout(callback);      // schedule callback to run later
    // After this returns, Angular will evaluate its bindings,
    // and update the DOM, so if $scope.animation and $scope.errors
    // are bound to something, they can trigger some visible change.
    // callback will be called in the next $digest cycle, so _after_
    // the DOM has been updated
};
(There is no need to specify a value for the timeout if you only want to run it in the next "tick". Also, there is no need to wrap callback, it can be directly passed to $timeout.)
Hope this helps!

Why will my subsequent Meteor method calls not wait for the first one to finish when I call Meteor.setTimeout()?

I am fairly new to Meteor, fibers and futures and I am trying to understand how Meteor methods work. It is my understanding that each method call from a client would wait for a previous one to finish. This belief is mostly based on the documentation of the this.unblock() function in the Meteor docs. However, when I try setting up a simple example with a Meteor.setTimeout() call this does not seem to be a correct assumption.
methodCall.js:
if (Meteor.isClient) {
    Template.hello.events({
        'click button': function () {
            Meteor.call('test', function(error, result){
            });
        }
    });
}
if (Meteor.isServer) {
    Meteor.methods({
        test: function(){
            console.log("outside");
            Meteor.setTimeout(function(){
                console.log("inside");
                return 'done';
            }, 2000);
        }
    });
}
When triggering the 'click button' event several times the terminal output is as follows:
outside
outside
outside
outside
inside
inside
inside
inside
and not alternating between outside and inside as I would expect. I think there is a very relevant bit of information on Meteor.setTimeout() I am missing, but I could not find anything in the documentation indicating this behaviour. What am I missing and is there a way of making the Meteor method invocations from a client wait until a previous invocation is finished before starting the execution of the next?
I found this question on SO which seemed promising, but it is more focused on blocking the possibility of calling the method from the client side. Likewise, the accepted answer is not completely satisfying, as it focuses on making subsequent calls skip certain code blocks of the Meteor method instead of waiting for the first invocation to finish. This may very well be the answer, I guess, but I really want to understand why the method call is not blocked in the first place, as I feel the Meteor documentation indicates it should be.
The answer is that the setTimeout callback is executed outside the fiber in which the method is running. What that means is that the method actually finishes execution (returning undefined) before the setTimeout callback is ever invoked, and you get the behavior you observed.
To provide a better test (and for an example of using asynchronous functions in methods), try this:
if (Meteor.isServer) {
    var Future = Npm.require('fibers/future');
    Meteor.methods({
        test: function(){
            var fut = new Future();
            console.log("outside");
            Meteor.setTimeout(function(){
                console.log("inside");
                fut.return('done');
                return 'done';
            }, 2000);
            return fut.wait();
        }
    });
}
The return value from your setTimeout callback doesn't actually go anywhere, it just curtails that function (i.e. the callback, not the method). The way it's written above, the Future object, fut, is supplied with the return value once the callback runs, but the main method function (which is still running in its original fiber) is prevented from returning until that value has been supplied.
The upshot is that unless you unblock this method, you will get the expected output as the next method invocation won't start until the previous one has returned.
UPDATE
In general, anything with a callback will have that callback added to the event loop after the current fiber is closed, so timeouts, HTTP calls, asynchronous DB queries - all of these fall into this category. If you want to recreate the environment of the method within the callback, you need to use Meteor.bindEnvironment, otherwise you can't use any Meteor API functionality. This is an old, but very good, video on the subject.
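For instance, here is a small sketch of wrapping a plain Node-style callback with Meteor.bindEnvironment so that it runs with the method's environment available (SomeCollection and the file path are hypothetical):
var fs = Npm.require('fs');
var wrapped = Meteor.bindEnvironment(function(err, data) {
    if (err) throw err;
    // Meteor API calls are safe here thanks to the bound environment
    SomeCollection.insert({ data: data.toString() }); // hypothetical collection
}, function(e) {
    console.log('bindEnvironment error:', e);
});
fs.readFile('/tmp/example.txt', wrapped);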

How would I design a client-side Queue system?

OVERVIEW
I'm working on a project and I've come across a bit of a problem in that things aren't happening in the order I want them to happen. So I have been thinking about designing some kind of Queue that I can use to organize function calls and other miscellaneous JavaScript/jQuery instructions used during start-up, i.e., while the page is loading. What I'm looking for doesn't necessarily need to be a Queue data structure but some system that will ensure that things execute in the order I specify and only when the previous task has been completed can the new task begin.
I've briefly looked at the jQuery Queue and the AjaxQueue but I really have no idea how they work yet so I'm not sure if that is the approach I want to take... but I'll keep reading more about these tools.
SPECIFICS
Currently, I have set things up so that some work happens inside $(document).ready(function() {...}); and other work happens inside $(window).load(function() {...});. For example,
<head>
    <script type="text/javascript">
        // I want this to happen 1st
        $().LoadJavaScript();
        // ... do some basic configuration for the stuff that needs to happen later...
        // I want this to happen 2nd
        $(document).ready(function() {
            // ... do some work that depends on the previous work to have been completed
            var script = document.createElement("script");
            // ... do some more work...
        });
        // I want this to happen 3rd
        $(window).load(function() {
            // ... do some work that depends on the previous work to have been completed
            $().InitializeSymbols();
            $().InitializeBlock();
            // ... other work ... etc...
        });
    </script>
</head>
... and this is really tedious and ugly, not to mention bad design. So instead of dealing with that mess, I want to design a pretty versatile system so that I can, for example, enqueue $().LoadJavaScript();, then var script = document.createElement("script");, then $().InitializeSymbols();, then $().InitializeBlock();, etc... and then the Queue would execute the function calls and instructions such that after one instruction is finished executing, the other can start, until the Queue is empty instead of me calling dequeue repeatedly.
The reasoning behind this is that some work needs to happen, like configuration and initialization, before other work can begin because of the dependency on the configuration and initialization steps to have completed. If this doesn't sound like a good solution, please let me know :)
SOME BASIC WORK
I've written some code for a basic Queue, which can be found here, but I'm looking to expand its functionality so that I can store various types of "Objects", such as individual JavaScript/jQuery instructions and function calls, essentially pieces of code that I want to execute.
UPDATE
With the current Queue that I've implemented, it looks like I can store functions and execute them later, for example:
// a JS file...
$.fn.LoadJavaScript = function() {
    $.getScript("js/Symbols/Symbol.js");
    $.getScript("js/Structures/Structure.js");
};
// another JS file...
function init() { /* symbols and structures */ }
// index.html
var theQueue = new Queue();
theQueue.enqueue($().LoadJavaScript);
theQueue.enqueue(init);
var LJS = theQueue.dequeue();
var INIT = theQueue.dequeue();
LJS();
INIT();
I also think I've figured out how to store individual instructions, such as $('#equation').html(""); or perhaps even if-else statements or loops, by wrapping them as such:
theQueue.enqueue(function() { $('#equation').html(""); /* other instructions, etc... */ });
But this approach would require me to wait until the Queue is done with its work before I can continue doing my work. This seems like an incorrect design. Is there a more clever approach to this? Also, how can I know that a certain function has completed executing so that the Queue can know to move on? Is there some kind of return value that I can wait for or a callback function that I can specify to each task in the Queue?
WRAP-UP
Since I'm doing everything client-side and I can't have the Queue do its own thing independently (according to an answer below), is there a more clever solution than me just waiting for the Queue to finish its work?
Since this is more of a design question than a specific code question, I'm looking for suggestions on an approach to solving my problem, advice on how I should design this system, but I definitely welcome, and would love to see, code to back up the suggestions :) I also welcome any criticism regarding the Queue.js file I've linked to above and/or my description of my problem and the approach I'm planning to take to resolve it.
Thanks, Hristo
I would suggest using http://headjs.com/ It allows you to load js files in parallel, but execute them sequentially, essentially the same thing you want to do. It's pretty small, and you could always use it for inspiration.
I would also mention that handlers that rely on execution order are not good design. I am always able to place all my bootstrap code in the ready event handler. There are cases where you'd need to use the load handler if you need access to images, but it hasn't been very often for me.
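For what it's worth, here is a minimal sketch of that approach, reusing the script path and initialisation calls from the question: the ordered start-up work lives in one ready handler, with the asynchronous piece chained through its callback.
$(document).ready(function() {
    $.getScript("js/Symbols/Symbol.js", function() {
        // runs only once the script has actually loaded
        $().InitializeSymbols();
        $().InitializeBlock();
    });
});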
Here is something that might work, is this what you're after?
var q = (function(){
    var queue = [];
    var enqueue = function(fnc){
        if(typeof fnc === "function"){
            queue.push(fnc);
        }
    };
    var executeAll = function(){
        var someVariable = "Inside the Queue";
        while(queue.length > 0){
            queue.shift()();
        }
    };
    return {
        enqueue: enqueue,
        executeAll: executeAll
    };
}());

var someVariable = "Outside!";
q.enqueue(function(){ alert("hi"); });
q.enqueue(function(){ alert(someVariable); });
q.enqueue(function(){ alert("bye"); });
alert("test");
q.executeAll();
the alert("test"); runs before anything you've put in the queue.
how do I store pieces of code in the Queue and have it execute later
Your current implementation already works for that. There are no declared types in JavaScript, so your queue can hold anything, including function objects:
queue.enqueue(myfunc);
var f = queue.dequeue();
f();
how can I have the Queue do its own thing independently
JavaScript is essentially single-threaded, meaning only one thing can execute at any instant of time. So the queue can't really operate "independently" of the rest of your code, if that is what you mean.
You basically have two choices:
Run all the queued functions, one after the other, in a single go -- this doesn't even require a queue since it is the same as simply putting the function calls directly in your code.
Use timed events: run one function at a time and once it completes, set a timeout to execute the next queued function after a certain interval. An example of this follows.
function run() {
    var func = this.dequeue();
    func();
    var self = this;
    setTimeout(function() { self.run(); }, 1000);
}
If func is an asynchronous request, you'll have to move setTimeout into the callback function.
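A sketch of what that could look like, assuming each queued func accepts a completion callback (the isEmpty() guard is hypothetical; use whatever your Queue actually exposes):
function run() {
    if (this.isEmpty()) return; // hypothetical guard: stop once the queue is drained
    var func = this.dequeue();
    var self = this;
    func(function done() {
        // only schedule the next function once this one reports completion
        setTimeout(function() { self.run(); }, 1000);
    });
}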
The main functions
From there we can define the main elements required:
var q=[];//our queue container
var paused=false; // a boolean flag
function queue() {}
function dequeue() {}
function next() {}
function flush() {}
function clear() {}
You may also want to 'pause' the queue, so we will use a boolean flag too.
Now let's see the implementation; this is going to be very straightforward:
var q = [];
var paused = false;

function queue() {
    for(var i = 0; i < arguments.length; i++)
        q.push(arguments[i]);
}

function dequeue() {
    if(!empty()) q.pop();
}

function next() {
    if(empty()) return; //check that we have something in the queue
    paused = false; //if we call the next function, set paused to false
    q.shift()(); // the same as var func = q.shift(); func();
}

function flush() {
    paused = false;
    while(!empty()) next(); //call all stored elements
}

function empty() { //helper function
    if(q.length == 0) return true;
    return false;
}

function clear() {
    q = [];
}
And here we have our basic queue system!
Let's see how we can use it:
queue(function() { alert(1)},function(){ alert(2)},function(){alert(3)});
next(); // alert 1
dequeue(); // the last function, alerting 3 is removed
flush(); // call everything, here alert 2
clear(); // the queue is already empty in that case but anyway...
