function myClass()
{
//lots and lots of vars and code here.
this.bar = function()
{
//do something with the numerous enclosed myClass vars
}
this.foo = function()
{
alert('Hello'); //don't do anything with the enclosed variables.
}
}
Each instance of myClass gets its own copy of bar and foo, which is why prototyped methods use less memory. However, I would like to know more about the memory use of inner methods.
It seems obvious to me that (1) below must be true. Do you agree?
Not only do distinct myClass instances have their own distinct copies of bar, but they also must have their own distinct copies of the myClass enclosure. (or else how does the bar method keep the myClass variables straight on a per instance basis?)
Now (2) is the question I'm really after.
Since the inner foo method doesn't use anything in the myClass enclosure a natural qustion is: Is javascript smart enough to not keep a myClass enclosure in memory for the use of foo?
This will depend entirely on the implementation not the language. Naive implementations will keep far more around than others. Any answer that is true for e.g. V8 (chrome's JS engine) may not be true for Spidermonkey (Firefox's) or JScript (IE) etc...
In my experience (this is definitely how GHC handles closures), the actual code of the functions (inner or outer) only exists once, but an "environment" mapping variables to values must be kept around for the closure (bar), so no, I wouldn't expect the JS implementation to keep around more than one copy of MyClass, but it will keep around a copy of its environment at least with this.bar, sensible implementations might realize that no environment is actually needed foo is not actually a closure, and thus does not need to keep a copy of the environment from the call to MyClass however do not rely on this behaviour.
Especially since the behavior of JS's eval function which evaluates a string in the same lexical environemnt as the eval call, it is quite likely that JS implementations always keep the lexical environment around.
I'm reasonably certain that any function has the a pointer to the entire state. It's irrelevant whether it references any variables it still has access to them.
Of course you do release that both bar & foo point to the same "entire state" of the object. So it doesn't take any more memory. I guess you have to allocate one more pointer internally so you can point to the "entire state" of the object but optimising for something like that is silly.
Whether it optimises it away is javascript engine specific. By all means read the chromium source and find out.
I'll see if I can dig some quotes out of the spec for you.
Entering Function code from the spec:
10.4.3 Entering Function Code The following steps are performed when
control enters the execution context
for function code contained in
function object F, a caller provided
thisArg, and a caller provided
argumentsList:
If the function code is strict code, set the ThisBinding to thisArg.
Else if thisArg is null or undefined, set the ThisBinding to the
global object.
Else if Type(thisArg) is not Object, set the ThisBinding to
ToObject(thisArg).
Else set the ThisBinding to thisArg.
Let localEnv be the result of calling NewDeclarativeEnvironment
passing the value of the [[Scope]]
internal property of F as the
argument.
Set the LexicalEnvironment to localEnv.
Set the VariableEnvironment to localEnv.
That tracks your environment to NewDeclarativeEnvironment.
NewDeclarativeEnvironment
10.2.2.2 NewDeclarativeEnvironment (E) When the abstract operation
NewDeclarativeEnvironment is called
with either a Lexical Environment or
null as argument E the following steps
are performed:
Let env be a new Lexical Environment.
Let envRec be a new declarative environment record containing no
bindings.
Set env’s environment record to be envRec.
Set the outer lexical environment reference of env to E.
Return env.
This tracks your environment to E which is the [[Scope]] of the function Object
13.2 Creating Function Objects Given an optional parameter list specified
by FormalParameterList, a body
specified by FunctionBody, a Lexical
Environment specified by Scope, and a
Boolean flag Strict, a Function object
is constructed as follows:
Create a new native ECMAScript object and let F be that object.
Set all the internal methods, except for [[Get]], of F as described
in 8.12.
Set the [[Class]] internal property of F to "Function".
Set the [[Prototype]] internal property of F to the standard built-in
Function prototype object as specified
in 15.3.3.1.
Set the [[Get]] internal property of F as described in 15.3.5.4.
Set the [[Call]] internal property of F as described in 13.2.1.
Set the [[Construct]] internal property of F as described in 13.2.2.
Set the [[HasInstance]] internal property of F as described in
15.3.5.3.
Set the [[Scope]] internal property of F to the value of Scope.
Let names be a List containing, in left to right textual or
Turn's out the spec is just telling you things you already know. Every functions contain the Scope. As the other poster mentioned it is upto individual implementation to "optimise" some of the scope away in terms of memory management.
If you ask me you really don't care about this unless you have >million instances of these objects.
The following block from the spec is more applicable:
The outer environment reference is
used to model the logical nesting of
Lexical Environment values. The outer
reference of a (inner) Lexical
Environment is a reference to the
Lexical Environment that logically
surrounds the inner Lexical
Environment. An outer Lexical
Environment may, of course, have its
own outer Lexical Environment. A
Lexical Environment may serve as the
outer environment for multiple inner
Lexical Environments. For example, if
a FunctionDeclaration contains two
nested FunctionDeclarations then the
Lexical Environments of each of the
nested functions will have as their
outer Lexical Environment the Lexical
Environment of the current execution
of the surrounding function.
The two nested functions both have the same reference to the outer Lexical Environment
Well, let's ask some implementations what they do, shall we :-)
Spidermonkey: Internals-Functions -- agrees with other answers. Talks about how closures can be classified.
V8: Are Closures Optimized -- very terse but does mention "static optimizations". Various articles on the web talk about "hidden classes" which, I believe, are how the V8 GC tries to optimize closures.
Sadly those meager links are about all I can find. A direct analysis of the engine source-code is likely required to add more meaningful input.
So, yes, different engines do implement different optimization strategies when they can.
Related
I've read many questions on closures and JavaScript on SO, but I can't find information about what happens to a function when it gets closured in. Usually the examples are string literals or simple objects. See the example below. When you closure in a function, the original function is preserved even if you change it later on.
What technically happens to the function preserved in a closure? How is it stored in memory? How is it preserved?
See the following code as an example:
var makeFunc = function () {
var fn = document.getElementById; //Preserving this, I will change it later
function getOriginal() {
return fn; //Closure it in
}
return getOriginal;
};
var myFunc = makeFunc();
var oldGetElementById = myFunc(); //Get my preserved function
document.getElementById = function () { //Change the original function
return "foo"
};
console.log(oldGetElementById.call(document, "myDiv")); //Calls the original!
console.log(document.getElementById("myDiv")); //Returns "foo"
Thanks for the comments and discussion. After your recommendations, I found the following.
What technically happens to the function preserved in a closure?
Functions as objects are treated no differently than any other simple object as closured variables (such as a string or object).
How is it stored in memory? How is it preserved?
To answer this, I had to dig through some texts on programming languages. John C. Mitchell's Concepts in Programming Languages explains that closured variables usually end up in the program heap.
Therefore, the variables defined in nesting subprograms may need lifetimes that are of the entire program, rather than just the time during which the subprogram in which they were defined is active. A variable whose lifetime is that of the whole program is said to have unlimited extent. This usually means they must be heap-dynamic, rather than stack-dynamic.
And more specific to JavaScript runtimes, Dmitry Soshnikov describes
As to implementations, for storing local variables after the context is destroyed, the stack-based implementation is not fit any more (because it contradicts the definition of stack-based structure). Therefore in this case closured data of the parent context are saved in the dynamic memory allocation (in the “heap”, i.e. heap-based implementations), with using a garbage collector (GC) and references counting. Such systems are less effective by speed than stack-based systems. However, implementations may always optimize it: at parsing stage to find out, whether free variables are used in function, and depending on this decide — to place the data in the stack or in the “heap”.
Further, Dmitry shows a varied implementation of the parent scope in function closures:
As we mentioned, for optimization purpose, when a function does not use free variables, implementations may not to save a parent scope chain. However, in ECMA-262-3 specification nothing is said about it; therefore, formally (and by the technical algorithm) — all functions save scope chain in the [[Scope]] property at creation moment.
Some implementations allow access to the closured scope directly. For example in Rhino, the [[Scope]] property of a function corresponds to a non-standard property __parent__.
As said in sec 10.4.3
The following steps are performed when control enters the execution
context for function code contained in function object F, a caller
provided thisArg, and a caller provided argumentsList:
If the function code is strict code, set the ThisBinding to thisArg.
Else if thisArg is null or undefined, set the ThisBinding to the global object.
Else if Type(thisArg) is not Object, set the ThisBinding to ToObject(thisArg).
Else set the ThisBinding to thisArg.
Let localEnv be the result of calling NewDeclarativeEnvironment passing the value of the [[Scope]] internal property of F as the
argument.
Set the LexicalEnvironment to localEnv.
Set the VariableEnvironment to localEnv.
Let code be the value of F‘s [[Code]] internal property.
Perform Declaration Binding Instantiation using the function code code and argumentsList as described in 10.5.
Consider the following code snippet:
function foo(){
var a={p:'p'};
o={c:'c'};
}
Thus we have the following:
Code of our function isnt a strict code
thisArg is null hence, ThisBinding set to the global object
---
---
I dont understand what bindings will be contains environment record represented by [[Scope]] internal property.
Set the LexicalEnvironment to environment which geted at step 5.
Set the VariableEnvironment to environment which geted at step 5.
Perform declaration binding instatiation.
At step 8 bindings are created in the VariableEnvironment, but not in LexicalEnvironment. But in sec 10.3 said that
When an execution context is created its LexicalEnvironment and
VariableEnvironment components initially have the same value.
Question:
Why just after creation of execution context LexicalEnvironment and VariableEnvironment is still equal in my case above?
I am not sure I understand your question, but this is how I understand sec 10.4.3 :
steps 1 to 4 are dealing with the value of this. Basically, in strict mode, this will be left to a null or undefined value instead of defaulting to the global object (window in case of a browser). This covers the cases when a function is not invoked through usual object or event handler mechanisms.
step 5 to 7 mean that a new naming environment is created each time you enter a function.
It describes the creation of this environment, which is linked to the previous one to form the current name scope.
For each new function, two environments are co-existing.
When a name is resolved, Lexical environment is searched first, then Variable environment. If both searches fail, the process is repeated at the upper level of the environment chain, until the 'catch-all' global scope is encountered. In this scope, all identifiers are handled as properties of the global (window) object. You can picture it as the whole code being enclosed in a with (window) block.
Lexical environment can be seen as a temporary augmentation of the variable scope.
Lexical and Variable environments are functionnally identical until you alter the lexical environment with two specific statements: with or catch. It does not mean they are implemented as identical data structures.
In terms of implementation, you could imagine the lexical environment as an empty list and the variable environment as a list containing all local variable and parameter names.
When a catch or with statement is encountered, the lexical list is populated with new names that will take precedence over the ones stored in the variable list.
catch will simply make its argument available for name resolution (i.e. allow you to reference the exception parameter). No big deal since the new name is just as explicit as a function parameter.
with is a rather more dangerous beast. It will create a new environment with the names of all properties of the object passed as its argument. The scope will consist of the chain of variable environments plus this new lexical environment.
Here the new names available for resolution are 'hidden' inside the object.
For instance:
var a = 'a', b = 'surprise!', o = {a:'a'};
with (o) { a = b; }
console.log (a+" "+b+" "+o.a);
will yield
a surprise! surprise!
a is resolved as o.a since o contains a property named a.
b is not found inside the lexical environment and thus the current variable environment is tried and variable 'b' is found.
This is a pretty dangerous mechanism, because if you believe an object contains a given property while it actually doesn't, you will instead reference a variable outside the current scope.
For instance, a simple typo like this:
with (element.style) {leftt = '10px';}
will set the window.leftt property to '10px', unless you happen to have declared a variable named leftt somewhere in the current scope.
Now if you give silly names like 'i' or 'j' to your object properties, chances are you will clobber a random loop index somewhere up the scope chain while believing you are setting the property of an object.
step 8 describes the parameters binding once the function scope is established. Basically, the parameters are bound with the values and their names are added to the variable environment.
The whole point of keeping two separate environments is that the hoisting mechanism always uses the variable environment chain as a scope.
The idea is that a variable or function should behave as if it had been declared at the top of the current scope block, so for instance a function declared inside a with block should not resolve its names with the object properties exposed by the with statement.
Frankly this is a rather moot point, since ECMA spec does not allow function declarations inside blocks, although most implementations do, with varrying results.
Now for your example:
function foo(){
var a={p:'p'};
o={c:'c'};
}
Your function does not contain any with or catch statements, so the scope chain inside 'foo()' is just a list of two variable environments :
global (a bunch of DOM objects all seen as properties of 'window')
function foo (var a)
once you call foo(),
a will resolve as foo's local variable, will be created as an object with the property p of value 'p' (and garbage collected as soon as you leave foo(), unless you manage to reference it from a persistent variable).
o will not be found in foo's variable environment, so it will be caught by the 'catch-all' global scope and thus resolved as a (new) property of the window object. It will create a window.o.c property with the value 'c'.
Does that answer somewhat to your question?
I'm reading the Execution Context / Lexical Environment section of the ECMA 262 5 specification. It states the following: (emphasis added)
A Lexical Environment is a specification type used to define the association of Identifiers to specific variables and functions based upon the lexical nesting structure of ECMAScript code. A Lexical Environment consists of an Environment Record and a possibly null reference to an outer Lexical Environment. Usually a Lexical Environment is associated with some specific syntactic structure of ECMAScript code such as a FunctionDeclaration, a WithStatement, or a Catch clause of a TryStatement and a new Lexical Environment is created each time such code is evaluated.
I noticed that it says nothing about creating a lexical environment for Function Expressions. Is a lexical environment created for function expressions, or is one created only for function declarations? Am I missing something?
Edit: I notice that function code will have its own execution context, which is why I'm also confused why function expression is not mentioned in the lexical environment section.
Yes, every function gets (§10.4.3) its own ExecutionContext when it is called (§13.2.1). That new context is initialised with a new LexicalEnvironment (created by NewDeclarativeEnvironment, §10.2.2.2), deriving from the [[Scope]] of the function - i.e. the LexicalEnvironment it was declared/"expressed" in (§13).
As #Pointy pointed out, the sentence you stumpled upon does not exhaustively list them: "…some [structure] such as…".
If a name is included in a FunctionExpression, that name becomes a read-only binding that shadows any outer declarations of the same name. But that binding may itself be shadowed by a formal parameter or local declaration within the function. Such a binding for the function name is only created for named FunctionExpressions and not for anonymous FunctionExpressions or FunctionDeclarations. The name binding for a FunctionDeclaration is created in the surrounding VariableEnvironment.
Here is a more detailed explantion referencing the ES5.1 spec:
There is more than one environment record associated with a function object. Whenever a function is called a new DeclarativeEnvironmentRecord is created to contain the local bindings of that particular function invocation. That record becomes both the VariableEnvironment and the initial LexicalEnvironment of the ExecutionContext that is created for that invocation. This is specified in section 10.4.3.
When this environment record is created its "outer environment" is set to the value of the [[Scope]] internal property of the function object that is being called. (line 5, 10.4.3) The outer environment provide the bindings for all non-local declarations. The [[Scope]] is set when a function object is created (see the semantics in section 13 and also 13.2). So, each distinct invocation of a specific function object has a different local environment but all invocations of that function share the same outer [[Scope]].
For most functions, the captured [[Scope]] is simply the LexicalEnvironment of the ExecutionContext that was active when the function is created. However FunctionExpressions that include an Identifier as the function name have an extra DeclarativeEnvironmentRecord insert at the head of its [[Scope]] chain. (see steps 1-3 of the third algorithm in section 13).
This extra environment record is used to capture the binding for the function name given in the FunctionExpression.
An instantiated function has a scope. It doesn't matter whether it was instantiated as part of a function declaration statement or a function instantiation expression.
(It's probably more correct to say that an instantiated function has a scope when it is called, and that every call produces a distinct scope.)
Each function object should have two "hidden" properties ( per JavaScript The Good Parts, The Functions Chapter )
context
and
code
Is there a way to access these properties?
Well, you can access the function code pretty easily - by using toString() (or Mozilla's non-standard toSource()):
var x = function() { alert('Here is my happy function'); };
console.log(x.toString());
As for context, I suppose DC meant more than simple this, and actually wrote about Execution Context.
UPDATE: Have found an interesting snippet in ES5 specification, where these two properties are actually described in some details - and not as abstract concepts:
13.2 Creating Function Objects
Given an optional parameter list specified by FormalParameterList, a
body specified by FunctionBody, a Lexical Environment specified by
Scope, and a Boolean flag Strict, a Function object is constructed as
follows:
...
Set the [[Scope]] internal property of F to the value of Scope.
...
Set the [[Code]] internal property of F to FunctionBody.
At the same time:
Lexical Environments and Environment Record values are purely
specification mechanisms and need not correspond to any specific
artefact of an ECMAScript implementation. It is impossible for an
ECMAScript program to directly access or manipulate such values.
So I guess that closes the question about accessing Scope property of function.
As for Code property, its read-only accessing with toString(), as was rightly noticed by Matt, is implementation-dependent - yet is more often implemented than not. )
Compare this code1:
somevar = 5;
delete window.somevar;
alert(typeof somevar) //=> undefined, so deleted
to this code:
var somevar = 5;
delete window.somevar;
alert(typeof somevar) //=> number, so NOT deleted
See it in action here
Now in the first block, somevar is deleted, in the second block it's not. The only difference is using the var keyword in the second block. Both blocks run in the global scope.
Can this be explained?
1 the code can't be tested in a chrome-console or firebug, and not in jsfiddle either. In those environments all code is evalled, and in evalled code delete works on anything that is the result of eval (see more about that). In IE < 9 delete window[anything] is not allowed anyway.
What you're seeing is an aspect of the fact that the global object (window, on browsers) is a conflation of two different things which are distinct everywhere except the global execution context.
In the first block, someVar is a normal property of the window object. Properties can be removed via delete.
In the second block, someVar is a property of the binding object of the variable context of the global execution context — which is also window. You cannot delete properties the binding object receives in its role as the binding object (even though you can delete properties it receives in other ways). That is, you cannot delete variables declared with var (and a few other things that are added the same way).
(Sorry, not my terminology; it comes from the spec, which features some very fun language indeed.)
It's only the global execution context where we have this conflation of concepts. The variable binding object for other execution contexts (function calls, for instance) is still a very real thing (and crucial to proper functioning of closures), but there's no programmatic way to directly access it. In the global execution context, though, it's the global object, which of course we can access.
It helps to understand this if we look at functions first, and then look at the global execution context. when you call a function, these things happen:
Set this to point to the object designated by the call (the value of this is usually implicitly set, but there are ways to set it explicitly).
Create an execution context for this call.
Create a variable context for that execution context.
Create a binding object for that variable context.
Add the function's name, if it has one, to the binding object as a property referring to the function.
Add the arguments property to the binding object, referring to the pseudo-array of arguments to the function.
Add any named arguments declared in the function definition as properties of the binding object, referring to their entries in the arguments.
Add the names of of any variables declared via var statements (anywhere in the function body) as properties of the binding object, initially with the value undefined.
If there are named functions declared within the function, add their names as properties of the binding object, referring to those functions.
Put the binding object at the top of the scope chain (more below).
...and then step-by-step execution of the code in the body of the function begins. Any var statements with initializers (e.g., var a = 5; rather than just var a; are treated as assignment statements (a = 5;) when the execution point reaches them.
Throughout the above, whenever a property is added "to the binding object", it's added with a flag indicating that it cannot be deleted. This is why vars (and the names of declared functions, etc.) can't be deleted.
Any unqualified reference is looked up via the scope chain. So when you refer to a in your code, the first place the interpreter looks is the binding object at the top of the scope chain. If it has a property called a, that's what gets used; if not, we look at the next link down the scope chain and use that property if we find it; and so on until we run out of links on the scope chain. The global object is the bottommost link of that chain (which is why global variables work).
So what's different about the global context? Well, very little, actually. Here's the sequence (roughly):
Create an execution context for this call.
Create a variable context for that execution context.
Create a binding object for that variable context.
Set this to point to the binding object; that makes it the global object.
Set some default properties on that object as defined by the environment (in browsers, for instance, the property window is added to the object, referring to itself).
...and then we basically pick up with step 8 in the function stuff:
Add the names of of any variables declared via var statements (anywhere in the global scope) as properties of the binding/global object, initially with the value undefined.
If there are named functions declared within the global scope, add their names as properties of the binding/global object, referring to those functions.
Put the binding/global object at the top of the scope chain (more below).
...and start step-by-step execution of the code (again with var initializers becoming assignments).