So I've been using require.js for a while now, but I realized that I actually don't know how it works under the hood. It says that it's an AMD loader.
I do understand that CommonJS is synchronous, which means that it blocks execution of other code while a module is being loaded. On the other hand, AMD is asynchronous. This is where I get confused.
When I define a module, it has to load a,b,c in order to execute the callback. How does asynchronous work here?
Isn't it synchronous when it has to load those three dependencies first?
Does it mean that AMD loads a,b,c asynchronously, then checks to see if those files are loaded (doesn't care about the order), and then executes the callback?
define("name",["a","b","c"], function(a,b,c){
});
As you know, "AMD" (Asynchronous Module Definition) is a specific API. There are many AMD-compatible "loaders", including RequireJS, curl.js and Dojo (among others).
Just as frameworks like jQuery and Dojo give you an API over raw Javascript, a program that uses AMD:
1) requires an AMD-compatible .js library,
2) demands certain programming "rules" and "conventions", and
3) ultimately sits "on top" of Javascript, which runs on your "Javascript engine" (be it IE, Chrome, Firefox - whatever).
Here are a couple of links I found useful:
https://www.ibm.com/developerworks/mydeveloperworks/blogs/94e7fded-7162-445e-8ceb-97a2140866a9/entry/loading_jquery_with_dojo_1_7_amd_loader2?lang=en
http://dojotoolkit.org/reference-guide/1.8/loader/amd.html
http://blog.millermedeiros.com/amd-is-better-for-the-web-than-commonjs-modules/
http://addyosmani.com/writing-modular-js/
PS:
To answer your immediate question, the latter link has a bit of discussion about "require()" and "dynamically_loaded dependencies".
Since I wrote an AMD loader, I'll try to answer the questions directly:
Isn't it synchronous when it has to load those three dependencies first?
Javascript, by definition, is single threaded. That means that anything you run in it always runs sequentially. The only thing that you can do in a browser is include scripts using the "async" attribute on the script tag, which makes the order in which the scripts execute undefined (asynchronous). Once a script executes, it will be the only one executing at that point in time.
Does it mean that AMD loads a,b,c asynchronously then checks to see if those files are loaded (doesn't care about the order) then execute the callback?
Correct. AMD-define() allows you to load all scripts in any order you wish (i.e. ultimately you let the browser roll the dice and load them in any order it sees fit at any time it sees fit).
Then any time a define() is called, the AMD-loader will check if the current list of dependencies for this define has already been satisfied. If so, it will call the current callback immediately, and after that, it will check if any of the previously registered define-callbacks can be called too (because all of their dependencies have been satisfied). If the dependencies for this callback have not all been satisfied yet, the callback is added to the queue to be resolved later.
This eventually results in all callbacks being called in the correct dependency order, regardless of the order in which the scripts have been loaded/executed in the first place.
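The queueing behaviour described above can be sketched in a few lines. This is a minimal illustration, not how RequireJS or any real loader is implemented: `createLoader`, `ready` and `pending` are hypothetical names, and the "scripts" are simulated by calling `define` in a deliberately unhelpful order.

```javascript
// Hypothetical miniature loader: not RequireJS, just the queueing idea.
function createLoader() {
  const ready = new Map();   // module name -> value returned by its callback
  let pending = [];          // definitions still waiting on dependencies

  // Run a definition if (and only if) all of its dependencies are ready.
  function tryRun(def) {
    if (!def.deps.every((d) => ready.has(d))) return false;
    const args = def.deps.map((d) => ready.get(d));
    ready.set(def.name, def.factory(...args));
    return true;
  }

  function define(name, deps, factory) {
    pending.push({ name, deps, factory });
    // Keep sweeping the queue until a full pass resolves nothing new.
    let progressed = true;
    while (progressed) {
      progressed = false;
      pending = pending.filter((def) => {
        if (tryRun(def)) { progressed = true; return false; }
        return true;
      });
    }
  }

  return { define, ready };
}

// Simulate scripts arriving in the "wrong" order.
const loader = createLoader();
const executionOrder = [];

loader.define("name", ["a", "b", "c"], (a, b, c) => {
  executionOrder.push("name");
  return a + b + c;
});
loader.define("c", ["a"], (a) => { executionOrder.push("c"); return a + 2; });
loader.define("b", [], () => { executionOrder.push("b"); return 10; });
loader.define("a", [], () => { executionOrder.push("a"); return 1; });

console.log(executionOrder);           // [ 'b', 'a', 'c', 'name' ]
console.log(loader.ready.get("name")); // 14
```

Even though "name" is defined first, its callback runs last, because it is the only one that needs all of the others.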
Related
Thanks to this post I've become aware, for modules (not regular scripts & without async attribute) that import/export (from each other), one of the only simple rules that determines the order in which they're executed is that a module that imports from another module will not execute before the module it imports from.
I'm worried we can't always control the complete order in which modules execute (I've put some examples of this at the bottom of the post). For example, what if I want a timeout to start as soon as the page is loaded? Is it okay for a setTimeout to occur on a module (at the end of a very long list of modules) as the time to execute the initial modules will be negligible?
P.s. Am I safe to assume modules that don't import or export (from each other) always execute in the order they appear in?
Examples: ModuleA imports from ModuleB. ModuleC imports from ModuleD. Only rule is that D or B executes first (sometimes DB or BD executes first; sometimes DC or BA executes first).
I personally find that assuming something will happen before something else, without enforcing that explicitly, results in a dangerous situation. Even if your modules would always execute in the correct order, you have created an implicit dependency between two modules that will make the code more difficult to understand.
Will another developer working on your project be able to understand that certain modules need to execute first? Will you remember this when you revisit this code a few months down the line? What about when you leave the project and someone else inherits it?
Working on code like this can create spaghetti situations where making a change in one place creates cascading changes in other, unexpected places.
Alleviating this can really depend on what framework you're using, or if you're not using any framework at all. If you need certain dependencies loaded before executing certain code, use a dependency loader to ensure that dependencies are always available when they are needed. If you need a certain function to be executed before executing a function in a different module, create an event system. Redux/flux and Ngrx are really nice solutions for situations like this.
If you use plain script tags on an HTML page, rendering is blocked until the script has been downloaded and parsed. To avoid that, for faster page display, you can add the 'async' attribute, which tells the browser to continue processing down the page without waiting for that script. However, that inherently means that other javascript that refers to anything in that script will probably crash, because the objects it requires don't exist yet.
As far as I know, there's no allScriptsLoaded event you can tie into, so I'm looking for ways to simulate one.
I'm aware of the following strategies to defer running other code until an async script is available:
For a single script, use their 'onload' event or attribute. However, there's no built-in way I know of to tell when ALL scripts have loaded if there's more than one.
Run all dependent code in onload event handlers attached to the window. However, those wait for all images too, not just all scripts, so they run later than would be ideal.
Use a loader library to load all scripts; those typically provide for a callback to run when everything has loaded. The downside (besides needing a library to do this, which has to load early) is that all code has to be wrapped in a (typically anonymous) function that you pass into the loader library. That's as opposed to just creating a function that runs when my mythical allScriptsLoaded fires.
Am I missing something, or is that the state of the art?
The best you could hope for would be to know if there are any outstanding async calls (XMLHttpRequest, setTimeout, setInterval, setImmediate, process.nextTick, Promise), and wait for there to not be one. However, that is an implementation detail that is lost to the underlying native code--javascript only has its own event loop, and async calls are passed off to the native code, if I understand it correctly. On top of that, you don't have access to the event loop. You can only insert, you can't read or control flow (unless you're in io.js and feeling frisky).
The way to simulate one would be to track your script calls yourself, and call your handler after all scripts are complete (i.e., track every time you insert a relevant script into the event loop).
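A minimal sketch of that tracking idea follows. `createLoadTracker` and the file names are hypothetical; in a browser each async script would report in from its own load event, which is simulated here by calling the returned function directly.

```javascript
// Hypothetical helper: calls onAllLoaded once every expected script has reported in.
function createLoadTracker(expected, onAllLoaded) {
  const remaining = new Set(expected);
  let fired = false;

  return function markLoaded(name) {
    remaining.delete(name);
    if (remaining.size === 0 && !fired) {
      fired = true;
      onAllLoaded();
    }
  };
}

// In a browser, each async script tag would report in from its load event, e.g.
//   script.addEventListener("load", () => markLoaded("charts.js"));
// Here the load events are simulated by calling markLoaded directly.
const events = [];
const markLoaded = createLoadTracker(["charts.js", "maps.js"], () => {
  events.push("allScriptsLoaded");
});

markLoaded("maps.js");    // arrives first; nothing fires yet
events.push("after maps.js");
markLoaded("charts.js");  // last one in triggers the callback
markLoaded("charts.js");  // duplicate notifications are ignored

console.log(events);      // [ 'after maps.js', 'allScriptsLoaded' ]
```

The sketch assumes the list of scripts is known up front; with an empty list the callback never fires.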
But yeah, the DOM doesn't provide a NoAsyncPending global or something, which is what you'd really require.
In Webpack 1.x I used to do the following on regular basis:
require.ensure([ './mod2.js' ], ( require ) => {
setTimeout(() => {
// some later point in time, most likely through any kind of event
var data = require( './mod2.js' ); // actual evaluating the code
},1100);
}, 'myModule2');
With this technique, we were able to transfer a webpack-bundle over the wire, but evaluate the actual contents (the JavaScript code) from that bundle at some later point in time. Also, using require.ensure we could name the bundle, in this case myModule2, so we could see the name / alias in the output when webpack did the bundling.
In Webpack 2.x, the new way to go is using System.import. While I love receiving a Promise object now, I have two issues with that style. The equivalent of the above code would look like:
System.import( './mod2.js' ).then( MOD2 => {
// bundle was transferred AND evaluated at this point
});
How can we split the transfer and the evaluation now?
How can we still name the bundle?
The Webpack documentation on Github says the following:
Full dynamic requires now fail by default
A dependency with only an expression (i. e. require(expr)) will now
create an empty context instead of an context of the complete
directory.
Best refactor this code as it won't work with ES6 Modules. If this is
not possible you can use the ContextReplacementPlugin to hint the
compiler to the correct resolving.
I'm not sure if that plays a role in this case. They also talk about code splitting in there, but only briefly, and they don't mention any of the "issues" or how to work around them.
tl;dr: System.resolve and System.register do most of what you want. The rest of this answer is why require.ensure cannot and how System.import calls the others.
I think ES6 modules prevent this from working well, although following it through the relevant specs is tricky, so I may be totally wrong.
That said, let's start with a few references:
the WhatWG module loader
the ES6 specification on modules (§15.2)
the CommonJS module specification
the fantastic 2ality article on ES6 modules
The first reference explains more behavior, although I'm not entirely sure how normative it is. The latter explains the implementation details on the JS side. Since no platforms implement this yet, I don't have references for how it actually works in real life, and we'll have to rely on the spec.
The require that has been available in webpack 1.x is a mashup of the CommonJS and AMD requires. The CommonJS side of that is described in ref#3, specifically the "Module Context" section. Nowhere does that mention require.ensure, nor does the AMD "specification" (such as it is), so this is purely an invention of webpack. That is, the feature was never real, in the sense of being specified somewhere official and fancy looking.
That said, I think require.ensure conflicts with ES6 modules. Calling System.import should invoke the import method from a Loader object. The relevant section in ref#2 does not lay that out explicitly, but §10.1 does mention attaching a loader to System.
The Loader.prototype.import method is not terribly involved, and step 4 is the only one that interests us:
Return the result of transforming Resolve(loader, name, referrer) with a fulfillment handler that, when called with argument key, runs the following steps:
Let entry be EnsureRegistered(loader, key).
Return the result of transforming LoadModule(entry, "instantiate") with a fulfillment handler that, when called, runs the following steps:
Return EnsureEvaluated(entry).
The flow is resolve-register-load-evaluate, and you want to break between load and evaluate. Note, however, that the load stage calls LoadModule with stage set to "instantiate". That implies and probably requires the module has already been translated via RequestTranslate, which does much of the heavy parsing as it tries to find the module's entry point and so on.
This has already done more work than you want, from the sounds of it. Since the basics of module loading require a known entry point, I don't think there's a way to avoid parsing and partially evaluating the module with the calls exposed from System. You already knew that.
The problem is that System.import can't possibly know -- until after parsing -- whether the module is an ES6 module that must be evaluated or a webpack bundle that could be deferred. The parsing must be done to figure out if we need to parse, leading to a chicken-and-egg problem.
Up to this point, we've been following the path from System.import through the Loader. The import call is dictating what stage of import we're at, assuming you want to go through the full end-to-end loading process. The underlying calls, like Loader.prototype.load, provide fine grained control over those stages.
I'm not sure how you would invoke the first two stages (fetch and translate), but if you were able to translate and register a module, later calls should simply evaluate and return it.
If the spec is accurate, this should be exposed (in supporting implementations) through the System.loader property and would have the methods you need to call. There are large parts of the flow you don't have access to, so I would suggest not doing this and instead setting up your code so nothing significant runs when a module is loaded. If that's not possible, you need to recreate the flow up through registering but stop shy of evaluation.
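The suggestion to make sure nothing significant runs when a module is loaded can be sketched as follows. This is a single-file stand-in, not webpack or loader API: `createMod2` plays the role of the module body of ./mod2.js, and `start` is a hypothetical entry point the caller invokes when it actually wants the work done.

```javascript
// Hypothetical module body: evaluating it only defines functions.
const log = [];

function createMod2() {
  log.push("mod2 loaded");          // cheap: runs when the module is evaluated
  let started = false;
  return {
    start() {                        // expensive work lives here, run on demand
      if (started) return;
      started = true;
      log.push("mod2 started");
    },
  };
}

const mod2 = createMod2();           // stands in for the import/require step
log.push("page ready");
mod2.start();                        // the caller decides when the real work happens

console.log(log); // [ 'mod2 loaded', 'page ready', 'mod2 started' ]
```

With this shape, transfer and evaluation can happen together, while the costly part stays deferred until `start()` is called.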
I don't understand WHY and in what scenario this would be used..
My current web setup consists of lots of components, which are just functions or factory functions, each in their own file, and each function "rides" the app namespace, like : app.component.breadcrumbs = function(){... and so on.
Then GULP just combines all the files, and I end up with a single file, so a page controller (each "page" has a controller which loads the components the page needs) can just load its components, like: app.component.breadcrumbs(data).
All the components can be easily accessed on demand, and the single javascript file is well cached and everything. This way of working seems extremely good, and I have never seen any problem with it. Of course, this can be (and is) scaled nicely.
So how are ES6 imports for functions any better than what I described?
What's the deal with importing functions instead of just attaching them to the App's namespace? It makes much more sense for them to be "attached".
Files structure
/dist/app.js // web app namespace and so on
/dist/components/breadcrumbs.js // some component
/dist/components/header.js // some component
/dist/components/sidemenu.js // some component
/dist/pages/homepage.js // home page controller
// GULP concat all above to
/js/app.js // this file is what is downloaded
Then inside homepage.js it can look like this:
app.routes.homepage = function(){
"use strict";
var DOM = { page : $('#page') };
// append whatever components I want to this page
DOM.page.append(
app.component.header(),
app.component.sidemenu(),
app.component.breadcrumbs({a:1, b:2, c:3})
)
};
This is an extremely simplified code example but you get the point
Answers to this are probably a little subjective, but I'm going to do my best.
At the end of the day, both methods support creating a namespace for a piece of functionality so that it does not conflict with other things. Both work, but in my view, modules, ES6 or any other, provide a few extra benefits.
Explicit dependencies
Your example seems very biased toward a "load everything" approach, but you'll generally find that to be uncommon. If your components/header.js needs to use components/breadcrumbs.js, assumptions must be made. Has that file been bundled into the overall JS file? You have no way of knowing. Your two options are:
Load everything
Maintain a file somewhere that explicitly lists what needs to be loaded.
The first option is easy and in the short term is probably fine. The second is harder to maintain because the list lives outside the code: it would be very easy to stop needing one of your component files but forget to remove it.
It also means that you are essentially defining your own syntax for dependencies when again, one has now been defined in the language/community.
What happens when you want to start splitting your application into pieces? Say you have an application that is a single large file that drives 5 pages on your site, because they started out simple and it wasn't big enough to matter. Now the application has grown and should be served with a separate JS file per-page. You have now lost the ability to use option #1, and some poor soul would need to build this new list of dependencies for each end file.
What if you start using a file in a new place? How do you know which JS target files actually need it? What if you have twenty target files?
What if you have a library of components that are used across your whole company, and one of them starts relying on something new? How would that information be propagated to any number of the developers using these?
Modules allow you to know with 100% certainty what is used where, with automated tooling. You only need to package the files you actually use.
Ordering
Related to dependency listing is dependency ordering. If your library needs to create a special subclass of your header.js component, you are no longer only accessing app.component.header() from app.routes.homepage(), which would presumably be running at DOMContentLoaded. Instead you need to access it during the initial application execution. Simple concatenation offers no guarantees that it will have run yet. If you are concatenating alphabetically and your new thing is app.component.blueHeader() then it would fail.
This applies to anything that you might want to do immediately at execution time. If you have a module that immediately looks at the page when it runs, or sends an AJAX request or anything, what if it depends on some library to do that?
This is another argument against #1 (Load everything), so you start having to maintain a list again. That list is again going to be a custom thing you'll have come up with instead of a standardized system.
How do you train new employees to use all of this custom stuff you've built?
Modules execute files in order based on their dependencies, so you know for sure that the stuff you depend on will have executed and will be available.
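To make the ordering point concrete, here is a rough sketch comparing alphabetical concatenation with dependency-based ordering. The graph and file names are made up for illustration, and `executionOrder` is a hypothetical helper doing a plain depth-first walk; real module systems do considerably more (cycles, async loading), so treat it as an approximation of the idea only.

```javascript
// Hypothetical dependency graph: file -> files it imports.
const graph = {
  "blueHeader.js": ["header.js"],
  "header.js": [],
  "homepage.js": ["blueHeader.js", "breadcrumbs.js"],
  "breadcrumbs.js": [],
};

// Depth-first ordering: every file is emitted after all of its dependencies.
function executionOrder(graph, entry) {
  const order = [];
  const visited = new Set();

  function visit(file) {
    if (visited.has(file)) return;   // already scheduled (also stops cycles looping)
    visited.add(file);
    for (const dep of graph[file]) visit(dep);
    order.push(file);
  }

  visit(entry);
  return order;
}

const alphabetical = Object.keys(graph).sort();
const byDependency = executionOrder(graph, "homepage.js");

console.log(alphabetical);
// [ 'blueHeader.js', 'breadcrumbs.js', 'header.js', 'homepage.js' ]
console.log(byDependency);
// [ 'header.js', 'blueHeader.js', 'breadcrumbs.js', 'homepage.js' ]
```

In the alphabetical order, blueHeader.js runs before the header.js it subclasses; in the dependency order it cannot.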
Scoping
Your solution treats everything as a standard script file. That's fine, but it means that you need to be extremely careful to not accidentally create global variables by placing them in the top-level scope of a file. This can be solved by manually adding (function(){ ... })(); around file content, but again, it's one more thing you need to know to do instead of having it provided for you by the language.
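As a small illustration of that manual wrapper, assuming the app.component namespace from the question (the `cache` variable and the breadcrumbs body are made up):

```javascript
// Wrapping file content in an IIFE keeps its variables out of the enclosing scope.
var app = { component: {} };

(function () {
  var cache = {};                       // private to this "file"
  app.component.breadcrumbs = function (data) {
    cache.last = data;
    return Object.keys(data).join(" > ");
  };
})();

console.log(typeof cache);                                    // "undefined"
console.log(app.component.breadcrumbs({ a: 1, b: 2, c: 3 })); // "a > b > c"
```

An ES6 module gives each file this kind of private scope without the wrapper.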
Conflicts
app.component.* is something you've chosen, but there is nothing special about it, and it is global. What if you wanted to pull in a new library from Github for instance, and it also used that same name? Do you refactor your whole application to avoid conflicts?
What if you need to load two versions of a library? That has obvious downsides if it's big, but there are plenty of cases where you'll still accept big over non-functional. If you rely on a global object, it is now up to that library to make sure it also exposes an API like jQuery's noConflict. What if it doesn't? Do you have to add it yourself?
Encouraging smaller modules
This one may be more debatable, but I've certainly observed it within my own codebase. With modules, and the lack of boilerplate necessary to write modular code with them, developers are encouraged to look closely at how things get grouped. It is very easy to end up making "utils" files that are giant bags of functions thousands of lines long because it is easier to add to an existing file than it is to make a new one.
Dependency webs
Having explicit imports and exports makes it very clear what depends on what, which is great, but the side-effect of that is that it is much easier to think critically about dependencies. If you have a giant file with 100 helper functions, that means that if any one of those helpers needs to depend on something from another file, it needs to be loaded, even if nothing is ever using that helper function at the moment. This can easily lead to a large web of unclear dependencies, and being aware of dependencies is a huge step toward thwarting that.
Standardization
There is a lot to be said for standardization. The JavaScript community has moved heavily in the direction of reusable modules. This means that if you hop into a new codebase, you don't need to start off by figuring out how things relate to each other. Your first step, at least in the long run, won't be to wonder whether something is AMD, CommonJS, System.register or what. By having a syntax in the language, it's one less decision to have to make.
The long and short of it is, modules offer a standard way for code to interoperate, whether that be your own code, or third-party code.
If your current process is to always concatenate everything into a single large file, only ever execute things after the whole file has loaded, and you have 100% control over all the code that you are executing, then you've essentially defined your own module specification based on your own assumptions about your specific codebase. That is totally fine, and no one is forcing you to change that.
No such assumptions can be made for the general case of JavaScript code however. It is precisely the objective of modules to provide a standard in such a way as to not break existing code, but to also provide the community with a way forward. What modules offer is another approach to that, which is one that is standardized, and one that offers clearer paths for interoperability between your own code and third-party code.
Is 'require' synchronous in AMD (asynchronous module definition)? If so, what makes this specification asynchronous? What if I have require() (and it hasn't been loaded yet) in the middle of my code, will it stall execution? Talking browser-side.
There are two different synchronous concepts here.
The first is "Will it stop my entire webpage, and sit and wait for the file.".
The answer is no. RequireJS doesn't do that if you've got a script with dependencies.
If you use it appropriately, it uses a promise-system.
What that means is that if you send in your callback and define your requirements for that file, the callback won't be run until all of the required files are loaded.
If there's a require inside of one of those required files, then THAT callback won't be run until ITS dependencies have loaded.
The outermost callback (the one that would be at the bottom of your script, normally), won't run until everything inside has.
This works on a promise system.
It's worth understanding how promise systems work (similar to an observer-pattern, in a way).
They're meant to be passed around or chained, based on an event, rather than having multiple people listen in any order.
var widget = new Widget(),
widgetLoaded = widget.load(url); // return a promise to let the program use the widget
widgetLoaded.then(function () { widget.move(35); })
.then(function () { widget.setColour("Blue"); })
.then(function () { widget.show(); });
This is like returning this so that you can chain function calls, except that the calls don't actually happen until widget.load() completes.
The widget will actually control when this happens, by keeping its promise if the widget loads and everything is fine, or by breaking its promise if something went wrong.
In most promise systems, .then or whatever they call it, either takes two functions (kept and broken -- in my systems, brokens are always optional), or they take an object with success and failure -- $.ajax does this, and then lets you predetermine what you want to do with the data when it's loaded, or if it fails -- promises.
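For illustration, here is the two-function form using the standard built-in Promise rather than any particular library from this answer. `load` is a hypothetical stand-in that keeps its promise for .js URLs and breaks it for anything else.

```javascript
// Hypothetical loader that keeps or breaks its promise depending on the URL.
function load(url) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (url.endsWith(".js")) resolve("loaded " + url);
      else reject(new Error("cannot load " + url));
    }, 0);
  });
}

const outcomes = [];

// then(onKept, onBroken): the second handler is optional.
const done = Promise.all([
  load("widget.js").then(
    (value) => outcomes.push("kept: " + value),
    (error) => outcomes.push("broken: " + error.message)
  ),
  load("widget.css").then(
    (value) => outcomes.push("kept: " + value),
    (error) => outcomes.push("broken: " + error.message)
  ),
]).then(() => outcomes);

done.then((result) => console.log(result));
// [ 'kept: loaded widget.js', 'broken: cannot load widget.css' ]
```

Neither handler runs until the corresponding load has settled, and exactly one of the two runs for each promise.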
So your page still works 100% asynchronously (without interrupting the UI), but it's 100% synchronous in that all of the modules will fire in the right order.
One thing you MUST REMEMBER:
If you have these dependencies in your code, you cannot have any dependent code lying around at the bottom of your script, waiting to run, inline.
They must all be locked away inside of your callback, or locked inside a function waiting to be called by your callback.
This is simply because it is an asynchronous process, in terms of actual processing, and will not block the browser from running events/JS, rendering the page, et cetera.
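A small sketch of why inline code cannot rely on the module. `fakeRequire` is a made-up stand-in for an AMD-style require(), using a timer in place of a network load:

```javascript
// Hypothetical stand-in for an AMD-style require(): calls back once the "module" has arrived.
function fakeRequire(deps, callback) {
  setTimeout(() => callback({ move: (x) => "moved to " + x }), 0);
}

const trace = [];
let widget;                       // filled in later by the callback

fakeRequire(["widget"], (loaded) => {
  widget = loaded;
  trace.push("callback: " + widget.move(35));   // safe: the module is here
});

// This line runs before the callback, so the module is not available yet.
trace.push("inline: widget is " + typeof widget);

// Inspect the trace once the simulated load has finished.
const finished = new Promise((resolve) => setTimeout(() => resolve(trace), 10));
finished.then((t) => console.log(t));
// [ 'inline: widget is undefined', 'callback: moved to 35' ]
```

The inline statement executes first even though it appears later in the file, which is why dependent code belongs inside the callback.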
For requireJS:
You have to pass a callback method alongside the required modules to .require(), that will get fired when the resources were loaded successfully. So, of course you should/can only access loaded AMD or CommonJS modules just within that callback.
For NodeJS:
Yes, .require() does work synchronously. NodeJS uses the CommonJS module system, not AMD.
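A quick way to see the synchronous behaviour in Node, using the built-in path module (the joined path is just an example value):

```javascript
// Node's require() returns the module's exports before the next statement runs.
const path = require("path");

// No callback needed: the module is usable on the very next line.
const joined = path.posix.join("dist", "components", "header.js");
console.log(joined); // "dist/components/header.js"
```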