Nodejs uses variable assignment to load modules


Most languages use an import directive to load code from other modules, for example:
Java:
import a.b.c;
Emacs Lisp:
(load "a")
Python:
from a import b
Why, then, does Node.js use a variable assignment to load another module's functions, as in
var a = require('a')
As far as I can see, most JavaScript IDE tooling, such as tern.js for Emacs or Nodeclipse, cannot do source code lookup for loaded modules properly, because the tool has to run (or eval) the code to find out which properties a loaded module object contains.

You could say JS belongs to a category of languages where the idea that everything is an object on equal footing is part of the "philosophy" that has guided its development. Node's require is a function (an object) supplied by the environment, as is the module object. This pattern is called the CommonJS module format.
You don't actually have to assign the result of require to a variable. It's rare in practice, but a module can be required purely for its side effects. For example, one might require sugar.js, which alters some of the native objects but offers no methods of its own, so there would be no point in keeping the return value (which is the module.exports object that the module populated while it executed).
A more common case of not assigning the module itself to a variable is using require just to grab one property off the module, e.g. var x = require('module').methodOfInterest. Similarly, some modules export a constructor, so you may sometimes see var instance = new (require('ConstructorModule'))(options) (which is ugly in my opinion; requires should generally be grouped at the top of a file and acted on only afterwards).
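The three usages described above can be sketched with Node's built-in modules. This is only an illustration: the choice of path, events and os is an assumption made so the snippet runs without installing anything.

```javascript
// require is an ordinary function supplied by Node; it returns module.exports.
const path = require('path');              // keep the whole exports object
const join = require('path').join;         // keep a single property
const emitter = new (require('events'))(); // the 'events' module exports a constructor
require('os');                             // return value discarded; the module is still loaded

console.log(typeof require);               // "function"
console.log(join === path.join);           // true
console.log(typeof emitter.on);            // "function"
```

Because require is just a function call, nothing in the language forces the result to be bound to a name.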
Note: There's really no concrete answer to your question so odds are high that it will get closed as SO-inappropriate.

Related

Doesn't ES module system guarantee module singleton?

Let's suppose we have a module dependency graph in which more than one module imports the same module, something like this: (A) ---> (B) <--- (C)
It seems the ECMAScript specification doesn't guarantee that both A and C get the same instance of the object that represents module B (or its exported entity).
To wit: a JS module is represented by an instance of a specification object called an Abstract Module Record (its concrete version is the Source Text Module Record). An object of this type stores, in particular, the bindings for the module (that is, the declared variable names with their corresponding values). In other words, different Module Records have different bindings (a simple idea, and the very reason we use modules).
Before code evaluation, a Module Record's Link() method is executed, and it in turn recursively calls InitializeEnvironment() on each Module Record in the graph. The aim of InitializeEnvironment() is to create bindings for every imported module. To obtain the Module Record that represents one of them, this function uses HostResolveImportedModule(), which receives as its arguments referencingScriptOrModule (the importing module, that is A or C) and a string specifier (something like b.js).
This is where the crux appears. The specification says:
Each time this operation is called with a specific
referencingScriptOrModule, specifier pair as arguments it must return
the same Module Record instance if it completes normally.
From this excerpt I cannot conclude that A and C are guaranteed to get the same Module Record instance representing B, because (A, "b.js") and (C, "b.js") are not the same pair.
Does all this mean that, in effect, the ES module system doesn't guarantee that a module singleton can be created? There is plenty of advice to export a singleton entity from a module. It is said that a module is evaluated once and then cached, so this is an acceptable way to deal with singletons. However, although the statement about caching is true, that fact per se doesn't entail that a module system implementation will resolve to the same imported module for different importing modules.

Yes, there is no guarantee in ECMAScript - the rules for how to resolve module specifiers are determined by the host. The specification goes on to say:
Multiple different referencingScriptOrModule, specifier pairs may
map to the same Module Record instance. The actual mapping
semantic is implementation-defined but typically a normalization
process is applied to specifier as part of the mapping process. A
typical normalization process would include actions such as alphabetic
case folding and expansion of relative and abbreviated path
specifiers.
So only your host (like node.js or the browser) will give you the guarantee that it will always resolve certain specifiers to the same module.
There are many examples where (A, "b.js") and (C, "b.js") should not resolve to the same module - e.g. when A is one/a.js, it should return the module one/b.js, while C might be two/c.js and result in the module two/b.js.
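To make the host's role concrete, here is a minimal sketch of a resolver that normalizes the specifier against the referrer's URL and uses the result as its cache key, roughly the way browsers treat relative specifiers. The function resolveImportedModule, the moduleMap cache and the file URLs are all hypothetical; real hosts do considerably more.

```javascript
// Hypothetical host-side module map, keyed by the normalized (resolved) URL.
const moduleMap = new Map();

function resolveImportedModule(referrerUrl, specifier) {
  // Normalization step: expand the relative specifier against the referrer.
  const resolved = new URL(specifier, referrerUrl).href;
  if (!moduleMap.has(resolved)) {
    moduleMap.set(resolved, { url: resolved }); // stand-in for a Module Record
  }
  return moduleMap.get(resolved);
}

const fromA = resolveImportedModule('file:///app/one/a.js', './b.js');
const fromC = resolveImportedModule('file:///app/one/c.js', './b.js');
const fromD = resolveImportedModule('file:///app/two/d.js', './b.js');

console.log(fromA === fromC); // true: both pairs normalize to file:///app/one/b.js
console.log(fromA === fromD); // false: this pair normalizes to file:///app/two/b.js
```

Under this scheme the singleton guarantee comes from the host's normalization plus its cache, not from the (referrer, specifier) pair itself.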

Altering Global Scope for dynamically loaded module in Node.js

Loading a module from a source dynamically:
var src="HERE GOES MY SOURCE"
var Module = module.constructor;
var m = new Module();
m._compile(src, 'a-path-that-does-not-exist');
I need to achieve the following:
1. Pass some variables/functions so that they can be used inside the src script globally. I can set them on "m.foo", but I want the script to use "foo" without writing "module.foo". "global.foo" works, but see point 2.
2. How can I restrict the src script from accessing the global scope?
3. How can I restrict the src script from loading other modules using require or other means?
4. How can I restrict the src script from running async operations?

All I can think of is to wrap the script in its own function, much as Node.js already does for CommonJS modules. This is the regular wrapper:
(function(exports, require, module, __filename, __dirname) {
// Module code actually lives in here
});
If you wrap the user code in your own wrapper, then when you call the resulting function to execute it you can supply your own values for require, module and any other semi-global symbols.
If you also put 'use strict'; as the very first line of your wrapper function (before any of the user code), then that will eliminate default assignment to the global object with just something like x = 4 because that will be an error without explicitly defining x first. If you then also make your own global object and pass it as an argument, that can keep anyone from assigning to the real global object. I don't think you can prevent implicit read access to pre-existing globals.
So, your wrapper could look like this:
(function(exports, require, module, __filename, __dirname, global) {
'use strict';
// insert user code here before evaluating it with eval()
// and getting the function which you can then call and pass the desired arguments
});
Then, when you call this function, you pass it the values you want to for all the arguments (something other than the real ones).
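A minimal sketch of that wrapper approach follows. The names runWrapped, fakeRequire and fakeModule are hypothetical, and the injected object simply stands in for whatever you want the script to see as global. This is not a security boundary: the user code can still reach real globals such as process by name.

```javascript
function runWrapped(src, injectedGlobal) {
  // Build the wrapper source with 'use strict' ahead of the user code.
  const wrapperSource =
    '(function (exports, require, module, __filename, __dirname, global) {\n' +
    "'use strict';\n" +
    src +
    '\n})';
  // Indirect eval runs in the global scope, so locals of runWrapped stay hidden.
  const wrapped = (0, eval)(wrapperSource);

  const fakeModule = { exports: {} };
  const fakeRequire = (name) => {
    throw new Error('require is disabled in this script: ' + name);
  };

  // Pass substitutes instead of the real require, module and global.
  wrapped.call(fakeModule.exports, fakeModule.exports, fakeRequire, fakeModule,
    'sandboxed.js', '/', injectedGlobal);
  return fakeModule.exports;
}

const result = runWrapped('module.exports = { doubled: global.foo * 2 };', { foo: 21 });
console.log(result.doubled); // 42
```

Here the script reads foo through the substituted global parameter and reports back through the substituted module object.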
Note, it's hard to tell how leak-proof this type of scheme really is. Any real security should likely be run in a resource restricted VM.
Another idea: you could run the code in a Worker Thread, which has its own fresh set of globals. So you would do all of the above and run it in a Worker Thread.
Addressing your questions in comments:
Does the 'use strict'; need to go inside the wrapper function or outside?
It needs to be the first line of code inside the wrapper function, right before where you insert the user code. The idea is to force that function scope (where the user code lives) inside that wrapper to be in strict mode to limit some of the things it can do.
Could you explain the "I don't think you can prevent implicit read access to pre-existing globals."? If i provide my own object as global, how can the inner script access preexisting globals?
Any code, even strict mode code can access pre-existing globals without the global prefix. While you can prevent the code from creating new globals by shadowing it with your own global in the wrapper function arguments and by forcing it into strict mode, you can't prevent strict mode code from reading existing globals because they can do so without the global prefix. So, if there's a pre-existing global called "foo", then existing code can reference that like:
console.log(foo);
or
foo = 12;
If there is no foo in a closer scope, the interpreter will find the foo on the global object and use that.
Note that strict mode prevents the automatic creation of a new global with something like:
greeting = "happy birthday"
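The three cases can be checked with a small sketch. The function userCode is a hypothetical stand-in for the wrapped script, and globalThis.foo plays the role of the pre-existing global.

```javascript
globalThis.foo = 1; // stand-in for a pre-existing global

function userCode(global) {
  'use strict';
  const seen = foo; // unqualified read finds the real global, not the parameter
  foo = 12;         // assigning to an existing global is still allowed
  let error = null;
  try {
    greeting = 'happy birthday'; // no such binding exists: strict mode throws
  } catch (e) {
    error = e;
  }
  return { seen, error };
}

const outcome = userCode({}); // the substitute "global" object is never consulted
console.log(outcome.seen);                            // 1
console.log(globalThis.foo);                          // 12
console.log(outcome.error instanceof ReferenceError); // true
```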
Could you elaborate more on "resource restricted VM"?
I was talking about real hardware/OS level VMs that allow you to fully control what resources a process may use (disk access, sockets, memory, hardware, etc...). It's essentially a virtual computer environment separate from any other VMs on the same system. This is a more rigorous level of control.
WorkerThread is a very interesting concept. will take a look! My understanding was that WorkerThreads provide memory isolation and the only way to share data is by sending messages (effectively creating copies)?
Yes, Worker Threads provide pretty good isolation, as each one starts up a whole new JS engine instance and has its own globals. They can share a SharedArrayBuffer or transfer an ArrayBuffer (if you choose to do that), but normal JS variables are not accessible across thread boundaries. Workers normally communicate via messaging (which is automatically synchronized through the event queue), but you could also communicate via sockets if you wanted.
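As a rough sketch of the Worker Thread idea, the snippet below runs a source string in a worker via the eval option of the worker_threads Worker constructor. The names userSource, leakedGlobal and resultPromise are made up for illustration. Note that the worker still has its own require, so this alone does not stop the script from loading modules.

```javascript
const { Worker } = require('worker_threads');

// Source string executed inside the worker (eval: true treats it as code, not a path).
const userSource = `
  const { parentPort, workerData } = require('worker_threads');
  parentPort.postMessage({
    sum: workerData.a + workerData.b,
    sawLeak: typeof leakedGlobal, // globals set on the main thread are not visible here
  });
`;

globalThis.leakedGlobal = 'main thread only';

const resultPromise = new Promise((resolve, reject) => {
  const worker = new Worker(userSource, { eval: true, workerData: { a: 40, b: 2 } });
  worker.once('message', resolve); // data arrives as a structured-clone copy
  worker.once('error', reject);
});

resultPromise.then((msg) => console.log(msg)); // { sum: 42, sawLeak: 'undefined' }
```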

difference between function context (this) in node.js and browser

I am aware of how "this" works in a browser context, and how its value changes in different scenarios, such as when using arrow functions or depending on how the function is invoked.
I printed out "this" in different scenarios in Node.js (Express.js, to be more specific), and it contains a lot more data, including path names etc.
My question is:
1. Are the rules concerning 'this' exactly the same in Node.js?
2. Could anyone explain the properties of the Node.js 'this' object, or point me to a simple article?
Thank you!

There are no different rules for this in a browser vs. node.js. The rules are set by the ECMAScript standards and both the browser's Javascript implementation and the one in node.js follow the same ECMAScript standards.
What you are probably looking at is a "default" value for this in some particular context. In a browser, you are probably looking at a default value for this that may be the window object. In node.js, if you see filenames, you may be looking at a module handle as the default value for this or the global object.
To help you more specifically, we would need to see the code around where you were examining the value of this in each environment and also know whether you were running in strict mode or not.
In most cases, this is not used with just a default value, but rather a specific object that the this value is set to. For example, if you are calling something like:
obj.method();
Then, inside the implementation of method, the Javascript interpreter will set the value of this to obj. This is a part of the object oriented nature of Javascript.
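A short sketch of how the call site determines this, which behaves the same in a browser and in Node. The objects obj and other are illustrative only, and the method opts into strict mode so that a plain call yields undefined rather than a default object.

```javascript
const obj = {
  name: 'obj',
  method() {
    'use strict';
    return this;
  },
};
const other = { name: 'other' };
const detached = obj.method;

const viaObject = obj.method();        // called through obj
const viaPlainCall = detached();       // called with no receiver
const viaCall = detached.call(other);  // receiver supplied explicitly

console.log(viaObject === obj);  // true
console.log(viaPlainCall);       // undefined
console.log(viaCall === other);  // true
```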

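The default this is whatever the global object is in that context. In Node that is the global object (global, also reachable as globalThis), not process; at the top level of a CommonJS module, this is module.exports instead.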

I observed a difference between this in a module when running on node (tests) and a browser (production).
In tests, the following kind of code worked fine:
export function A()
{
}
export function B()
{
// NOTE: DON'T DO THIS. prepending "this." is not needed and might break.
this.A();
}
But on production it would throw:
TypeError: Cannot read property 'A' of undefined
However, this is NOT a difference between node and browser/webview, but rather a difference between production code (a production build of bundle.js via webpack v4) and code running in tests.
With a debug build of bundle.js, this points to the module (an object containing the exported module symbols), e.g.:
{
A : [Function: A]
B : [Function: B]
}
Whereas in a release build of bundle.js, this is undefined.
This difference in behaviour is caused by webpack's concatenateModules optimization.

In Javascript, can constants be shared between files, as in Ruby?

This is probably an odd question because it's more typical for people to ask how to avoid using globals.
Coming from the Ruby world, I've become very comfortable using globals in two specific examples:
Constants. When a file is imported in Ruby, all of its constants are automatically made available to the other files in the program.
(and this ties in with the first) Packages. When I load a Ruby Gem in a required file, it also becomes available in my other files.
I've started to use module.exports, but I'm finding that I'm importing the same modules in lots of different files.
I'd really like to have these features in Javascript. The way I'm writing my code at the moment, I'm using a functional approach and passing all my constants as parameters. The problem is my code is getting too verbose for my liking.
I'm really not looking for a "short answer: no" type of response, here. Even if it is too difficult, I'd appreciate being pointed in a direction for how to avoid passing constants as parameters to functions.

One method of using globals could be to use HTML5 Local Storage.
My thinking is, have an object with your globals, and on page load save each global variable into its own local storage location.
So you have an object with your globals stored:
var globals = {
GLOBAL1: "SomeString",
GLOBAL2: 400
}
Then on load (or earlier, if you prefer), a function can run through your globals and save the values into local storage:
for(var key in globals) {
localStorage.setItem(key, globals[key]);
}
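Then, later on, when a function needs, for example, GLOBAL2, you can call the following (note that local storage returns values as strings, so GLOBAL2 comes back as "400"):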
localStorage.getItem("GLOBAL2");
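Because local storage only holds strings, one option is to round-trip each value through JSON. The sketch below assumes that approach; makeMemoryStorage is a hypothetical in-memory stand-in with the same setItem/getItem shape, used only so the example runs outside a browser. In a page you would use localStorage itself.

```javascript
// Hypothetical stand-in for localStorage: stores everything as strings.
function makeMemoryStorage() {
  const items = new Map();
  return {
    setItem(key, value) { items.set(String(key), String(value)); },
    getItem(key) { return items.has(String(key)) ? items.get(String(key)) : null; },
  };
}

const storage = makeMemoryStorage();
const globals = { GLOBAL1: 'SomeString', GLOBAL2: 400 };

// Serialize on the way in so the original type can be recovered later.
for (const key of Object.keys(globals)) {
  storage.setItem(key, JSON.stringify(globals[key]));
}

const raw = storage.getItem('GLOBAL2'); // the stored string
const restored = JSON.parse(raw);       // back to a number

console.log(typeof raw, typeof restored); // string number
```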

Possible to enumerate or access module-level function declarations in NodeJs?

In Node.js, if I load a module which contains code in module-scope like:
this["foo"] = function() { console.log("foo"); }
...then I appear to get a globally available function that I can call just by saying foo() from any code using the module. It can be seen as one of the printed items with Object.getOwnPropertyNames(this).
However, if I put the following in module scope instead:
function foo() { console.log("foo"); }
...then it produces a function which can similarly be called within that module as foo(), but is invisible outside of it (e.g. does not show up as one of the items with Object.getOwnPropertyNames(this)).
I gather this is a change in runtime behavior from what's done in browsers. A browser seems to poke everything into global scope by default (and for years people have had to consciously avoid this by wrapping things up in anonymous functions/etc.)
My question is whether NodeJs has some secret way of interacting with these declarations outside of the module in which they are declared BESIDES using exports.(...) = (...). Can they be enumerated somehow, or are they garbage collected as soon as they are declared if they're not called by a module export? If I knew what the name of such a function was going to be in advance of loading a module...could I tell Node.js to "capture it" when it was defined?
I'm not expecting any such capabilities to be well-documented...but perhaps there's a debugger feature or other system call. One of the best pointers would be to the specific code in the Node.js project where this kind of declaration is handled, to see if there are any loopholes.
Note: In researching a little into V8 I saw that a "function definition" doesn't get added to the context. It's put into an "activation object" of the "execution context", and cannot be programmatically accessed. If you want some "light reading" I found:
http://coachwei.sys-con.com/node/676031/mobile
http://perfectionkills.com/understanding-delete/

If you add exports.foo = foo; at the end of your file, the function will be available in other files in Node, assuming that you do var myFile = require('./myFile.js') with the file name and call the function via myFile.foo(). You can even rename the function for outside use in exports, and name the variable that holds the module whatever you like when you use require.
BTW, you can enumerate the exported functions just as you would the properties of any plain object (i.e. for (var k in myFile) ...).

This is impossible in node without more advanced reflection tools like the debugger.
The only way to do this would be to use __parent__, which was removed due to security issues and other things (hard to optimize, not standard to begin with). When you run the script, those variables become closed over by the module. You cannot access them elsewhere without explicitly exporting them.
This is not a bug, it's by design. It's just how node works. See this closely related question.
If this sort of reflection were available without tools like the debugger, it would be extremely hard to optimize the source code (at least the way V8 works), meaning the answer to your question is no. Sorry.
