Altering Global Scope for dynamically loaded module in Node.js - javascript

Loading a module from a source dynamically:
var src="HERE GOES MY SOURCE"
var Module = module.constructor;
var m = new Module();
m._compile(src, 'a-path-that-does-not-exist');
Need to achieve following:
Pass some variables/functions so that they can be used inside the src script globally. Can set them in "m.foo", but want the script to use "foo" without using "module.foo". "global.foo" works, but see the point 2.
How to restrict the src script from accessing global scope?
How to restrict the src from loading other modules using require or other means.
How to restrict the src from running async operations?

All, I can think of is to wrap the script in its own function, kind of like nodejs already does for commonJS modules. This is the regular wrapper.
(function(exports, require, module, __filename, __dirname) {
// Module code actually lives in here
});
If you wrap that user code with your own wrapper and then when you call it to execute it, you can define your own values for require, module and any other semi-global symbols.
If you also put 'use strict'; as the very first line of your wrapper function (before any of the user code), then that will eliminate default assignment to the global object with just something like x = 4 because that will be an error without explicitly defining x first. If you then also make your own global object and pass it as an argument, that can keep anyone from assigning to the real global object. I don't think you can prevent implicit read access to pre-existing globals.
So, your wrapper could look like this:
(function(exports, require, module, __filename, __dirname, global) {
'use strict';
// insert user code here before evaluating it with eval()
// and getting the function which you can then call and pass the desired arguments
});
Then, when you call this function, you pass it the values you want to for all the arguments (something other than the real ones).
Note, it's hard to tell how leak-proof this type of scheme really is. Any real security should likely be run in a resource restricted VM.
Another idea, you could run in a Worker Thread which has it's own virgin set of globals. So, you do all of the above and run it in a Worker Thread.
Addressing your questions in comments:
Does the 'use strict'; need to go inside the wrapper function or outside?
It needs to be the first line of code inside the wrapper function, right before where you insert the user code. The idea is to force that function scope (where the user code lives) inside that wrapper to be in strict mode to limit some of the things it can do.
Could you explain the "I don't think you can prevent implicit read access to pre-existing globals."? If i provide my own object as global, how can the inner script access preexisting globals?
Any code, even strict mode code can access pre-existing globals without the global prefix. While you can prevent the code from creating new globals by shadowing it with your own global in the wrapper function arguments and by forcing it into strict mode, you can't prevent strict mode code from reading existing globals because they can do so without the global prefix. So, if there's a pre-existing global called "foo", then existing code can reference that like:
console.log(foo);
or
foo = 12;
If there is no foo in a closer scope, the interpreter will find the foo on the global object and use that.
Note that strict mode prevents the automatic creation of a new global with something like:
greeting = "happy birthday"
Could you elaborate more no "resource restricted VM"?
I was talking about real hardware/OS level VMs that allow you to fully control what resources a process may use (disk access, sockets, memory, hardware, etc...). It's essentially a virtual computer environment separate from any other VMs on the same system. This is a more rigorous level of control.
WorkerThread is a very interesting concept. will take a look! My understanding was that WorkerThreads provide memory isolation and the only way to share data is by sending messages (effectively creating copies)?
Yes, Worker Threads provide pretty good isolation as they start up a whole new JS engine and have their own globals. They can shared ArrayBuffers in certain ways (if you choose to do that), but normal JS variables are not accessible across thread boundaries. They would normally communicate via messaging (which is automatically synchronized through the event queue), but you could also communicate via sockets if you wanted.

Related

What is global.isProd?

I've just started learning gulp, and I've been seeing a lot of people using global.isProd
Example of a gulp file:
'use strict';
global.isProd = false;
require('./gulp');
I just want to understand what is global and what is global.isProd or if you can tell me where I can find this information.
global is how you access true global variables in node.js.
So, global.isProd = false; is assigning a globally reachable property named isProd an initial value.
In node.js, the top level scope in a module (e.g. a var xxx declared at the top level of a module) is not actually the global scope. This is different than Javascript in the browser. The top level in a module is local to the module as a module is actually declared within a function scope that node.js sets up for each module.
So, to actually reach the global scope in node.js, it defines the symbol global that works somewhat like the window symbol in a browser. If you want a globally accessible variable in node.js, you make it a property of the global symbol.
Here's what the node.js doc has to say about global.
The usual practice in nodejs is to avoid globals when practical because it prevents global naming conflicts and letting modules store their own state generally makes code more modular and more reusable. Since module references are cached, you can usually get access to a common resource by simply referencing a property of a module or by calling a method in that module and just let the module itself take care of storing the common resource in its own module variables without using globals. So, the cached module handles tend to make it simpler to get access to singleton resources via a given module.
global is just a variable in the module's scope. It's similar to window in the browser, except global's scope covers the module only.
isProd is simply a "global variable" (again, only in the module) that tells your gulp file "Hey! This is NOT production." It's nothing special by the way, probably just used in your script.

Execute arbitrary javascript in a custom global context

My only requirement for executing this arbitrary js code is that certain global variables/functions (e.g. setInterval) are not exposed.
My current strategy involves parsing through the js code and making a var declaration (at the beginning of the enclosing closure) for every global reference.
I'm wondering if there's any other obvious way to solving this.
Also just to clarify, this arbitrary code is not being run with things like eval. Rather, it's being wrapped inside a closure and appended to the base code.
One of the options is to override the globals by supplying your own function or variables. Example:
window.alert = function() {
// your code goes here
// Optionally call window.alert if needed
}
This should be manageable if you have a small finite list of things you wish to hide. This will make them globally unavailable.

What's the global object within a module in nodejs?

I know in node, every module gets an local scope and variable defined within one module won't escape to the global unless explicitly exported.
What I want to know is when I declare a variable in one module file as the following, what's the global object this variable defined on?
var obj = {};
// In browser, this is defined on window.obj
// How about in the node?
There is one statement that global is the object used as the local global scope, however the following test code fails:
a = 1
console.log global.a
// undefined
So what is the global variable for a module?
There is in fact a global object in node; it's simply called global. You can assign properties to it and access those properties just like any other object, and you can do it across modules. However, directly accessing it (global.foo=bar in one module, baz=global.foo in another) is the only way to access it. Unqualified variable names never automatically resolve to its properties the way that, say, an unqualified variable name in a browser environment would resolve to window.somethingOrOther. In node, every module is conceptually wrapped inside an immediate function call, so unqualified names represent local variables and have module-level scope.
edit again I'm just about ready to conclude that, in a Node module, you can't get a reference to the global object. You really shouldn't need to; that's kind-of the whole point of the module mechanism, I think. You import what you need and export what you choose.
Whether there even is a global object in Node is an interesting question, I guess. I know that in Rhino, there definitely is; there's no implicit wrapper function around code fed to Rhino from the Java container. Via the Java ScriptEngine mechanism (and presumably from Mozilla APIs in "naked" Rhino) there are ways of pushing symbols into the global context to make them visible to JavaScript as global object properties.
wow this got complicated well things seem to be on the move in the Node.js world. What I wrote above was true for Node 0.6.2, but in the 0.9.0-pre build I just did there is indeed an object called "global" that behaves more or less like the global object in a browser.
The stuff below applies to browsers and Rhino and other contexts like that
You can use this in the global context. If you need a name, you can give it one.
var global = this;
var obj = "Hi!";
global.obj = "Bye"; // sets "obj"
A (somewhat) common idiom is to wrap your code in a function, to protect the global namespace.
(function( global ) {
// everything
})( this );
Caveat: I'm not a Node person so there may be some emerging idiom in that culture.
edit — it occurs to me that if Node really does wrap code from files it evaluates in a function, and it doesn't pass this into it from the global context, then there's no way to "find" it, I don't think. If you use "use strict" (and you should), it doesn't matter, because you can't really mess with the global context anyway.

Possible to enumerate or access module-level function declarations in NodeJs?

In Node.js, if I load a module which contains code in module-scope like:
this["foo"] = function() { console.log("foo"); }
...then I appear to get a globally available function that I can call just by saying foo() from any code using the module. It can be seen as one of the printed items with Object.getOwnPropertyNames(this).
However, if I put the following in module scope instead:
function foo() { console.log("foo"); }
...then it produces a function which can similarly be called within that module as foo(), but is invisible outside of it (e.g. does not show up as one of the items with Object.getOwnPropertyNames(this)).
I gather this is a change in runtime behavior from what's done in browsers. A browser seems to poke everything into global scope by default (and for years people have had to consciously avoid this by wrapping things up in anonymous functions/etc.)
My question is whether NodeJs has some secret way of interacting with these declarations outside of the module in which they are declared BESIDES using exports.(...) = (...). Can they be enumerated somehow, or are they garbage collected as soon as they are declared if they're not called by a module export? If I knew what the name of such a function was going to be in advance of loading a module...could I tell Node.js to "capture it" when it was defined?
I'm not expecting any such capabilities to be well-documented...but perhaps there's a debugger feature or other system call. One of the best pointers would be to the specific code in the Node.js project where this kind of declaration is handled, to see if there are any loopholes.
Note: In researching a little into V8 I saw that a "function definition" doesn't get added to the context. It's put into an "activation object" of the "execution context", and cannot be programmatically accessed. If you want some "light reading" I found:
http://coachwei.sys-con.com/node/676031/mobile
http://perfectionkills.com/understanding-delete/
if you fill in exports.foo = foo; at the end of your file it will be available in other files in node, assuming that you do var myFile = require('myFile.js') with the file name and you call the function via myFile.foo(); You can even rename the function for outside use in exports and set whatever you want to call the package when you use require.
BTW you can enumerate these functions just like you do on any JSON object (ie for k in ...)
This is impossible in node without more advanced reflection tools like the debugger.
The only way to do this would be to use __parent__ which was removed due to security issues and other things (hard to optimize, not standard to begin with) . When you run the script those variables become closed under the module. You can not access them elsewhere without explicitly exporting them.
This is not a bug, it's by design. It's just how node works. see this closely related question.
If this sort of reflection was available without tools like the debugger it would have been extremely hard to optimize the source code (at least the way v8 works) , meaning the answer to your question is no. Sorry.

What are some of the problems of "Implied Global variables"?

JavaScript: The Good Parts defines these kinds of declarations as bad:
foo = value;
The book says "JavaScript’s policy of making forgotten variables global creates
bugs that can be very difficult to find."
What are some of the problems of these implied global variables other than the usual dangers of typical global variables?
As discussed in the comments on this answer, setting certain values can have unexpected consequences.
In Javascript, this is more likely because setting a global variable actually means setting a property of the window object. For instance:
function foo (input) {
top = 45;
return top * input;
}
foo(5);
This returns NaN because you can't set window.top and multiplying a window object doesn't work. Changing it to var top = 45 works.
Other values that you can't change include document. Furthermore, there are other global variables that, when set, do exciting things. For instance, setting window.status updates the browser's status bar value and window.location goes to a new location.
Finally, if you update some values, you may lose some functionality. If, for instance, you set window.frames to a string, for instance, you can't use window.frames[0] to access a frame.
Global variable make it very difficult to isolate your code, and to reuse it in new contexts.
Point #1:
If you have a Javascript object that relies on a global var. you will not be able to create several instances of this object within your app because each instance will change the value of the global thereby overwriting data the was previously written by the another instance.
(Unless of course this variable holds a value that is common to all instances - but more often than not you'll discover that such an assumption is wrong).
Point #2:
Globals make it hard to take existing pieces of code and reuse them in new apps. Suppose you have a set of functions defined in one file and you want to use them in some other file (in another app). So you extract them to a new file and have that file included in the new app. If these function rely on a global your 2nd app will fail at runtime because the global variable is not there. The dependency on globals is not visible in the code so forgetting these variables (when moving functions to new files) is a likely danger.
They're global variables, so yes, all of the "usual dangers" apply. The main thing that distinguishes them from global variables in other languages is that:
You don't explicitly declare them in a global scope. If you mistakenly omit var in a variable declaration, you've accidentally declared a global variable. JavaScript makes it a little too easy to unintentionally declare global variables; contrast this with Scheme, which raises an error if a variable is not defined in the global scope.
Global variables, at least in the browser, are aliased by window[variable_name]. This is potentially worrisome. For example, some of your code might access window['foo'] (with the intention of accessing a global variable). Then, if you accidentally type foo instead of var foo elsewhere in the program, you have declared a reference to window['foo'], which you meant to keep separate.
One issue is that you may be trampling on already defined variables and not know it, causing weird side effects in other parts of the code that can be a bear to track down.
Another is that it is just sloppy code. You should not be creating variable with more scope than they need since at the very least it keeps more variables in memory and at worst it can create data scenarios you didn't intend.
The bottom line is that when you do that you don't know for sure that you are not messing up other functions that use a global variable of the same name. Sometimes it isn't even your fault, a lazy programmer of another plugin left something global that was meant to have scope inside of a function. So it is a very practical safeguard for writing better and less buggy code.
The problems of typical global variables is that they are, well, global - there is no scope to enclose them, and any code that you are executing / interacting with (such as a library that you call down the road) could modify the variable without a warning.
However, these problems are compounded in Javascript by two things:
You can define a global variable anywhere - the only requirement for that is to forget the var keyword.
It is extremely easy to define a global variable when you had no intent to do so. That is the problem that "implied" globals have over "typical" globals - you will create them without even knowing you did.
In a reasonably-designed language that includes truly global variables (ok, so not that reasonably-designed), you would have a limited handful of places to define globals, and it would require a special keyword to do so.

Categories