Doesn't ES module system guarantee module singleton? - javascript

Let's suppose we have a module dependency graph where more than one module imports the same one, something like this: (A) ---> (B) <--- (C)
It seems the ECMAScript specification doesn't guarantee that both A and C would get the same instance of the object representing the B module (or its exported entities).
To wit: a JS module is represented by an instance of a specification type called Abstract Module Record (its concrete version is Source Text Module Record). An object of this type, in particular, stores the bindings for the module (that is, the declared variable names with their corresponding values). In other words, different Module Records have different bindings (a pretty simple idea, since this is the very reason we use modules in the first place).
Before code evaluation, a Module Record's Link() method is executed, and it in turn recursively calls InitializeEnvironment() on each Module Record in the graph. The aim of InitializeEnvironment() is to create bindings for every imported module. To obtain the needed Module Record representing one of them, this function uses HostResolveImportedModule(). It receives as its arguments referencingScriptOrModule (the importing module, that is, A or C) and a string specifier (something like b.js).
Here is where the bottom line starts to appear:
Each time this operation is called with a specific
referencingScriptOrModule, specifier pair as arguments it must return
the same Module Record instance if it completes normally.
I can't conclude from this excerpt that there is any guarantee A and C will get the same instance of the Module Record representing B, because (A, "b.js") and (C, "b.js") are not the same pairs.
Does all this mean, in effect, that the ES module system doesn't guarantee the possibility of creating a module singleton? There is a good deal of advice out there to export a singleton entity from a module. It is said that a module is evaluated once and then cached, so this is an acceptable way to deal with singletons. However, although the statement about caching is true, this fact per se doesn't entail that a module system implementation would resolve to the same imported module for different importing ones.
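For reference, the pattern that advice describes is roughly the following (a hypothetical counter.js whose single exported object is supposed to be shared by A and C):
// counter.js - the would-be singleton
let count = 0;
export default {
  increment() { return ++count; },
  value() { return count; }
};

// a.js
import counter from './counter.js';
counter.increment();

// c.js
import counter from './counter.js';
console.log(counter.value()); // expected to print 1, but only if both imports resolved to the same Module Record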

Yes, there is no guarantee in ECMAScript - the rules for how to resolve module specifiers are determined by the host.
Multiple different referencingScriptOrModule, specifier pairs may
map to the same Module Record instance. The actual mapping
semantic is implementation-defined but typically a normalization
process is applied to specifier as part of the mapping process. A
typical normalization process would include actions such as alphabetic
case folding and expansion of relative and abbreviated path
specifiers.
So only your host (like node.js or the browser) will give you the guarantee that it will always resolve certain specifiers to the same module.
There are many examples where (A, "b.js") and (C, "b.js") should not resolve to the same module - e.g. when A is one/a.js, it should return the module one/b.js, while C might be two/c.js and result in the module two/b.js.
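A minimal sketch of that case (the file names are only for illustration):
// one/a.js
import * as b from './b.js'; // resolves to one/b.js

// two/c.js
import * as b from './b.js'; // resolves to two/b.js - same specifier string, a different module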


ES6 named import introduces a const?

In implementing the Python module mechanism on top of ES6 modules for the Transcrypt Python to JavaScript compiler, I am faced with the following problem:
There are a large number of standard functions imported from the Python runtime module, e.g. the Python input function (implemented in JS), which can be made available using named imports (since they shouldn't have to be prefixed with anything in the user code, so input rather than __runtime__.input, to be consistent with Python).
In Python it's allowed to rebind named imports. So I define another function input, which will override the one from the runtime. But if I do so in JS, I get an error:
Identifier 'input' has already been declared
It seems that all imported names are regarded as JS consts, so non-rebindable according to this article. I can think of several clever workarounds, like importing under an alias and then assigning to a module-global var rather than a const, but I'd like to keep things simple, so my questions are:
Am I right that JS named imports are consts, so non-rebindable (and if so, just curious, does anyone know WHY)? Where can I find details on this?
Is there a simple way to circumvent that and still put them in the global namespace of the importing module, but override them at will?
As per the language specification, imported bindings are immutable bindings, so they cannot be changed. The identifiers are reserved as the module gets parsed because of how ES6 modules work: Unlike in Python, imports are not statements that are included as they are executed; instead, all a module’s imports are basically collected during the early compilation and then resolved before the module starts executing.
This makes ES6 modules kind of unsuitable as an implementation for Python’s import system.
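A minimal illustration of both symptoms, with a hypothetical runtime.js and user module:
// runtime.js
export function input(prompt) { /* read a line from the user */ }

// user_module.js
import { input } from './runtime.js';
function input() {}  // SyntaxError: Identifier 'input' has already been declared
// input = () => {}; // plain reassignment fails as well, since the imported binding is immutable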
As a general approach, to avoid losing those names, you can simply give the imported bindings different names. For example, a from foo import bar, baz may be compiled to the following:
import { bar as _foo__bar, baz as _foo__baz } from 'foo';
let bar = _foo__bar;
let baz = _foo__baz;
That will only reserve some special names while keeping the bar and baz identifiers mutable.
Another way, which would probably also help you deal with possible differences in import semantics, would be to simply create a closure:
import { bar, baz } from 'foo';
(function (bar, baz) {
// …
})(bar, baz);
Or even add some other lookup mechanism in between.
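One possible shape of such a lookup mechanism - only a sketch, and note that copying the bindings into a plain object gives up live bindings:
import * as _runtime from 'foo';

// a mutable namespace object that the compiled code resolves names against
const __ns__ = Object.assign(Object.create(null), _runtime);
__ns__.bar = function bar() { /* user-defined override of the imported bar */ };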
Btw. Python’s import is very similar to Node’s require, so it might be worth looking into all those solutions that made Node’s module system work in the browser.

Migrating from Webpack 1.x to 2.x

In Webpack 1.x I used to do the following on regular basis:
require.ensure([ './mod2.js' ], ( require ) => {
    setTimeout(() => {
        // some later point in time, most likely through any kind of event
        var data = require( './mod2.js' ); // actually evaluating the code
    }, 1100);
}, 'myModule2');
With this technique, we were able to transfer a webpack bundle over the wire, but evaluate the actual contents (the JavaScript code) from that bundle at some later point in time. Also, using require.ensure we could name the bundle, in this case myModule2, so we could see the name / alias when the bundling happened while running webpack.
In Webpack 2.x, the new way to go is using System.import. While I love receiving a Promise object now, I have two issues with that style. The equivalent of the above code would look like:
System.import( './mod2.js' ).then( MOD2 => {
    // bundle was transferred AND evaluated at this point
});
How can we split the transfer and the evaluation now?
How can we still name the bundle?
The Webpack documentation on Github says the following:
Full dynamic requires now fail by default
A dependency with only an expression (i. e. require(expr)) will now
create an empty context instead of a context of the complete
directory.
Best refactor this code as it won't work with ES6 Modules. If this is
not possible you can use the ContextReplacementPlugin to hint the
compiler to the correct resolving.
I'm not sure if that plays a role in this case. They also talk about code splitting in there, but it's pretty brief and they don't mention any of the "issues" or how to work around them.
tl;dr: System.resolve and System.register do most of what you want. The rest of this answer is why require.ensure cannot and how System.import calls the others.
I think ES6 modules prevent this from working well, although following it through the relevant specs is tricky, so I may be totally wrong.
That said, let's start with a few references:
the WhatWG module loader
the ES6 specification on modules (§15.2)
the CommonJS module specification
the fantastic 2ality article on ES6 modules
The first reference explains more of the behavior, although I'm not entirely sure how normative it is. The latter explains the implementation details on the JS side. Since no platforms implement this yet, I don't have references for how it actually works in real life, and we'll have to rely on the spec.
The require that has been available in webpack 1.x is a mashup of the CommonJS and AMD requires. The CommonJS side of that is described in ref#3, specifically the "Module Context" section. Nowhere does that mention require.ensure, nor does the AMD "specification" (such as it is), so this is purely an invention of webpack. That is, the feature was never real, in the sense of being specified somewhere official and fancy looking.
That said, I think require.ensure conflicts with ES6 modules. Calling System.import should invoke the import method from a Loader object. The relevant section in ref#2 does not lay that out explicitly, but §10.1 does mention attaching a loader to System.
The Loader.prototype.import method is not terribly involved, and step 4 is the only one that interests us:
Return the result of transforming Resolve(loader, name, referrer) with a fulfillment handler that, when called with argument key, runs the following steps:
Let entry be EnsureRegistered(loader, key).
Return the result of transforming LoadModule(entry, "instantiate") with a fulfillment handler that, when called, runs the following steps:
Return EnsureEvaluated(entry).
The flow is resolve-register-load-evaluate, and you want to break between load and evaluate. Note, however, that the load stage calls LoadModule with stage set to "instantiate". That implies and probably requires the module has already been translated via RequestTranslate, which does much of the heavy parsing as it tries to find the module's entry point and so on.
This has already done more work than you want, from the sounds of it. Since the basics of module loading require a known entry point, I don't think there's a way to avoid parsing and partially evaluating the module with the calls exposed from System. You already knew that.
The problem is that System.import can't possibly know -- until after parsing -- whether the module is an ES6 module that must be evaluated or a webpack bundle that could be deferred. The parsing must be done to figure out if we need to parse, leading to a chicken-and-egg problem.
Up to this point, we've been following the path from System.import through the Loader. The import call is dictating what stage of import we're at, assuming you want to go through the full end-to-end loading process. The underlying calls, like Loader.prototype.load, provide fine grained control over those stages.
I'm not sure how you would invoke the first two stages (fetch and translate), but if you were able to translate and register a module, later calls should simply evaluate and return it.
If the spec is accurate, this should be exposed (in supporting implementations) through the System.loader property and would have the methods you need to call. There are large parts of the flow you don't have access to, so I would suggest not doing this and instead setting up your code so nothing significant runs when a module is loaded. If that's not possible, you need to recreate the flow up through registering but stop shy of evaluation.
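As a sketch of that last suggestion (not an equivalent of require.ensure, just a way to keep the expensive part out of module evaluation): export the heavy work as a function and call it later.
// mod2.js - the top level does almost nothing when evaluated
export function run() {
  // ... the expensive work that used to run on module evaluation ...
}

// consumer
System.import('./mod2.js').then(MOD2 => {
  setTimeout(() => {
    MOD2.run(); // the meaningful work happens here, at some later point in time
  }, 1100);
});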

Update a single CommonJS module without having to touch every single module it is required in

This may be a bit complicated to follow but I'll do my best. I have a series of API modules that I'm using with Browserify, and some of them call other modules as well. They are pretty much interwoven, which is going to make updates difficult.
For example, let's say I have five modules: A, B, C, D, and E. A is required by B and C, and B is required by C, D, and E. If I need to make some updates to A that are breaking changes, I could version it, but then I need to update the require statements in B and C. And since B is now using a different version of A, I need to version B as well, which means changing the require statements in C, D, and E. So a single change in one module means I have to reversion everything else.
I should note that the main reason for this is that I have to keep old versions around. These are small microsites, and one site might get built with A - E and the next might get built with A' - E', but I still need to be able to build both independently. Even though the change to A' may not have an effect on the API it exposes, I have no desire to have to go back and re-test every single project ever built with each single file modification.
I thought about having a separate per-project file that could be required in and it then requires all of the versioned modules for that project, but that's a circular dependency.
If it matters, I'm using Gulp and Browserify to build the final JS file.
I found something that would do the trick: aliasify
I changed my require statements to require('Api/ModuleA') and then in the aliasify config I map Api/ModuleA to ./libs/Api/ModuleA-1.0 and it picks up the version I want every time it requires Module A.
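For anyone finding this later, the relevant part of my package.json looks roughly like this (I'm writing it from memory, so check the aliasify docs for the exact option names):
"aliasify": {
  "aliases": {
    "Api/ModuleA": "./libs/Api/ModuleA-1.0"
  }
}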
I used a mechanism similar to Java classpaths. I created a custom module resolver and used separate module roots for separate projects. There was also a folder for common files. The module resolver first searched for the module in the project's folder, and if it wasn't found there, searched for it in the common folder. This way you can provide a specialized implementation of module A for a specific project. Not sure if it's suitable in your case. It is something like aliasify, with the difference that the config is backed by filesystem folders instead of config files.
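A stripped-down sketch of that lookup order (a hypothetical helper, not the actual resolver):
const fs = require('fs');
const path = require('path');

function resolveModule(name, projectRoot, commonRoot) {
    const projectPath = path.join(projectRoot, name + '.js');
    if (fs.existsSync(projectPath)) {
        return projectPath; // a project-specific implementation wins
    }
    return path.join(commonRoot, name + '.js'); // otherwise fall back to the common folder
}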
A is a module with only a few dependencies, and it is referenced by many modules. They usually call that kind of module "mature". These mature modules should be well-tested, and their public interface should not change often, as every dependent module would need to be updated. So you might try to make the changes without breaking the api, possibly by creating a new module with a versioned name and a wrapper module that provides the old api on top of the new one. New components could use the new module; old components are not affected.
They use multiple numbers to version software for a reason. In the simplest case there are two numbers: major and minor version. The minor version can change with every release; the major version increases when the public api changes. All components that depend on this one only need to be updated when the major version changes. (Of course, sometimes there are bugs in the implementation of a minor version that break some depending components, but that's not the usual case.) If you change the public api of A, you need to change B and C too, but not the others. A would have a major version change, B and C a minor one. The rest stays the same. This needs a more complicated module resolver which can resolve the latest module by its major version. But that's what npm does anyway.
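To make the npm comparison concrete: a depending module declares a semver range, and only a major bump falls outside of it. For example, a dependency declared as
"dependencies": {
  "module-a": "^1.2.0"
}
picks up any 1.x release at or above 1.2.0 automatically, while a breaking 2.0.0 has to be adopted explicitly (the module name here is made up).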

How does Node.js handle big arrays and objects included multiple times using require()?

In my project I indirectly use some big arrays of data - in my specific case these are Minecraft block and item info.
As I said, I'm using the data indirectly - one of my dependencies uses it. But now I want to use it too, which means I'll need to require() the .js file that contains all the data. Since there are no constants in javascript and the loaded object will be mutable, my question is whether it will really be loaded twice into Node's memory. If it is, what can I do to save memory?
https://nodejs.org/docs/latest/api/modules.html#modules_caching
Modules are cached after the first time they are loaded. This means (among other things) that every call to require('foo') will get exactly the same object returned, if it would resolve to the same file.
So if your dependency is inside your project, and it requires the same file that you want to require, it will return the same object. But if your dependency A is a node module that requires dependency B (the big array) as a separate node module, and you also add B as a dependency of your whole project, it may resolve to a different file. That means it will be a different object.
Modules are cached based on their resolved filename. Since modules may resolve to a different filename based on the location of the calling module (loading from node_modules folders), it is not a guarantee that require('foo') will always return the exact same object, if it would resolve to different files.
When you require the same file twice it only gets loaded into memory once; you just get two references to it. The data will usually be mutable as well, so if one place modifies it the other place will see that change (which can lead to some confusing bugs!). You can avoid that either by using immutable data structures (like those in immutable.js) or by using Object.freeze (but be aware that this only gives you shallow immutability, so if any of the values of your object are themselves mutable objects they will remain mutable).
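A small sketch of both points - the shared cached object and the shallowness of Object.freeze (file names are hypothetical):
// blocks.js
module.exports = { stone: { id: 1 } };

// a.js
const blocks = require('./blocks.js');
blocks.stone.hardness = 1.5; // mutates the single cached object

// b.js
require('./a.js');
const blocks = require('./blocks.js'); // the very same object a.js got
console.log(blocks.stone.hardness);    // 1.5 - the change made in a.js is visible here

Object.freeze(blocks);
blocks.stone.id = 42; // still allowed - the freeze is shallow, nested objects stay mutable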

Nodejs uses variable assignment to load modules

Most languages use 'import' directives to load other module code, like
java -
import a.b.c
elisp -
(load a)
python -
from a import b
But why does nodejs use a variable assignment to load other module functions, like
var a = require('a')
I see that most IDEs for javascript, like tern.js-emacs and nodeclipse, are not able to do source code lookup (for loaded modules) properly, because the IDE has to run the code (or do an eval) to find out what properties a loaded module object contains.
You could say JS belongs to a category of languages where the idea that everything is an object on equal footing is part of the "philosophy" that has guided its development. Node's require is a function (an object) supplied by the environment, as is the module object. This pattern is called the CommonJS format.
You actually don't have to assign the result of the require function to a variable. It's rare in practice, but the node module you're calling on could just be invoked to cause an action to take place, for example one might require sugar.js which alters some of the native objects but has no methods of its own to offer, so there would be no point in assigning the return value (which is the module.exports object that was supplied during that module's execution).
A more common example of not assigning a module to a variable is when one uses require just to grab some property off the module -- e.g. var x = require('module').methodOfInterest. Similarly, some modules return a constructor, so you may sometimes see var instance = new (require('ConstructorModule'))(options) (which is ugly in my opinion; requires should generally be grouped at the top of a file and acted on only afterwards).
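Putting the variants from the last two paragraphs side by side (the module names are made up):
var fs = require('fs');                     // typical: keep a reference to the exports object
require('sugar');                           // side effects only, nothing worth assigning
var x = require('someModule').methodOfInterest;              // grab a single property
var instance = new (require('ConstructorModule'))(options);  // module that exports a constructor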
Note: There's really no concrete answer to your question so odds are high that it will get closed as SO-inappropriate.
