I've run across a JavaScript library that implement a cross-browser WeakMap in ES5. (WeakMap is slated for ES6.)
How can this possibly work without support in the JavaScript language itself?
Edit: Just to be clear, I'm referring to a Weak Map, not a regular Map. I tested this project out using Chrome's profiler and the keys are not held by strong references. They get GC'ed without having to remove them from the WeakMap.
It took me a while to grok the code, but then it hit me: the key itself is used to store a reference to the value.
For example, several layers into set it does
defProp(obj, globalID, { value: store });
where defProp has been defined to be Object.defineProperty, obj is the key, globalID is a guid and store is a storage object that contains the value.
Then down in get it looks up the value with
obj[globalID];
This is very clever. The WeakMap doesn't actually contain a reference to anything (weak or otherwise)-- it just sets up a policy of where to secretly store the value. The use of Object.defineProperty means that you won't accidentally discover the value storage-- you have to know the magic guid to look it up.
Since the key directly refers to the value (and the WeakMap doesn't refer to it), when all references to the key are gone, it gets GCed like normal.
Related
Objects in JavaScript can be used as Hashtable
(the key must be String)
Is it perform well as Hashtable the data structure?
I mean , does it implemented as Hashtable behind the scene?
Update: (1) I changed HashMap to hashtable (2) I guess most of the browser implement it the same, if not why not? is there any requirement how to implement it in the ECMAScript specs?
Update 2 : I understand, I just wonder how V8 and the Firefox JS VM implements the Object.properties getters/setters?
V8 doesn't implement Object properties access as hashtable, it actually implement it in a better way (performance wise)
So how does it work? "V8 does not use dynamic lookup to access properties. Instead, V8 dynamically creates hidden classes behind the scenes" - that make the access to properties almost as fast as accessing properties of C++ objects.
Why? because in fixed class each property can be found on a specific fixed offset location..
So in general accessing property of an object in V8 is faster than Hashtable..
I'm not sure how it works on other VMs
More info can be found here: https://v8.dev/blog/fast-properties
You can also read more regarding Hashtable in JS here:(my blog) http://simplenotions.wordpress.com/2011/07/05/javascript-hashtable/
"I guess most of the browser implement it the same, if not why not? is there any requirement how to implement it in the ECMAScript specs?"
I am no expert, but I can't think of any reason why a language spec would detail exactly how its features must be implemented internally. Such a constraint would have absolutely no purpose, since it does not impact the functioning of the language in any way other than performance.
In fact, this is absolutely correct, and is in fact the implementation-independence of the ECMA-262 spec is specifically described in section 8.6.2 of the spec:
"The descriptions in these tables indicate their behaviour for native
ECMAScript objects, unless stated otherwise in this document for particular kinds of native ECMAScript objects. Host objects may support these internal properties with any implementation-dependent behaviour as long as it is consistent with the specific host object restrictions stated in this document"
"Host objects may implement these internal methods in any manner unless specified otherwise;"
The word "hash" appears nowhere in the entire ECMA-262 specification.
(original, continued)
The implementations of JavaScript in, say, Internet Explorer 6.0 and Google Chrome's V8 have almost nothing in common, but (more or less) both conform to the same spec.
If you want to know how a specific JavaScript interpreter does something, you should research that engine specifically.
Hashtables are an efficient way to create cross references. They are not the only way. Some engines may optimize the storage for small sets (for which the overhead of a hashtable may be less efficient) for example.
At the end of the day, all you need to know is, they work. There may be faster ways to create lookup tables of large sets, using ajax, or even in memory. For example see the interesting discussion on this post from John Reseig's blog about using a trie data structure.
But that's neither here nor there. Your choice of whether to use this, or native JS objects, should not be driven by information about how JS implements objects. It should be driven only by performance comparison: how does each method scale. This is information you will get by doing performance tests, not by just knowing something about the JS engine implementation.
Most modern JS engines use pretty similar technique to speed up the object property access. The technique is based on so called hidden classes, or shapes. It's important to understand how this optimization works to write efficient JS code.
JS object looks like a dictionary, so why not use one to store the properties? Hash table has O(1) access complexity, it looks like a good solution. Actually, first JS engines have implemented objects this way. But in static typed languages, like C++ or Java a class instance property access is lightning fast. In such languages a class instance is just a segment of memory, end every property has its own constant offset, so to get the property value we just need to take the instance pointer and add the offset to it. In other words, in compile time an expression like this point.x is just replaced by its address in memory.
May be we can implement some similar technique in JS? But how? Let's look at a simple JS function:
function getX(point) {
return point.x;
}
How to get the point.x value? The first problem here is that we don't have a class (or shape) which describes the point. But we can calculate one, that is what modern JS engines do. Most of JS objects at runtime have a shape which is bound to the object. The shape describes properties of the object and where these properties values are stored. It's very similar to how a class definition describes the class in C++ or Java. It's a pretty big question, how the Shape of an object is calculated, I won't describe it here. I recommend this article which contains a great explanation of the shapes in general, and this post which explains how the things are implemented in V8. The most important thing you should know about the shapes is that all objects with the same properties which are added in the same order will have the same shape. There are few exceptions, for example if an object has a lot of properties which are frequently changed, or if you delete some of the object properties using delete operator, the object will be switched into dictionary mode and won't have a shape.
Now, let's imagine that the point object has an array of property values, and we have a shape attached to it, which describes where the x value in this property array is stored. But there is another problem - we can pass any object to the function, it's not even necessary that the object has the x property. This problem is solved by the technique called Inline caching. It's pretty simple, when getX() is executed the first time, it remembers the shape of the point and the result of the x lookup. When the function is called second time, it compares the shape of the point with the previous one. If the shape matches no lookup is required, we can take the previous lookup result.
The primary takeaway is that all objects which describe the same thing should have the same shape, i.e. they should have the same set of properties which are added in the same order. It also explains why it's better to always initialize object properties, even if they are undefined by default, here is a great explanation of the problem.
Relative resources:
JavaScript engine fundamentals: Shapes and Inline Caches and a YouTube video
A tour of V8: object representation
Fast properties in V8
JavaScript Engines Hidden Classes (and Why You Should Keep Them in Mind)
Should I put default values of attributes on the prototype to save space?
this article explains how they are implemented in V8, the engine used by Node.js and most versions of Google Chrome
https://v8.dev/blog/fast-properties
apparently the "tactic" can change over time, depending on the number of properties, going from an array of named values to a dictionary.
v8 also takes the type into account, a number or string will not be treated in the same way as an object (or function, a type of object)
if i understand this correctly a property access frequently, for example in a loop, will be cached.
v8 optimises code on the fly by observing what its actually doing, and how often
v8 will identify the objects with the same set of named properties, added in the same order (like a class constructor would do, or a repetitive bit of JSON, and handle them in the same way.
see the article for more details, then apply at google for a job :)
I am trying to mock a very simple browser environment that mimics how browsers react to user's change in location variable. As far as I know, users can alter
self.location
location
document.location
window.location
one of those to navigate the current window by either
assigning a string of url directly onto the variable (e.g. self.location = 'http://www.google.com')
or assigning a string onto href inside the location object (e.g. self.location.href = 'http://www.google.com')
or maybe explicitly instantiating a location object.
So my real question is how can I instantiate the browser environment to allow users to assign location variable is anyway they wish and let me retrieve the location variable later on? Sorry if this sounds stupid, (I've never coded in javascript before) but in real world circumstances, do browsers convert string data into location objects through macro or does Javascript have some sort of "implicit constructor" mechanism that can automatically invoke the constructor of a class just by assigning a value?
(I am aware that there are dom libraries available, but I find them quite an overkill as I am only interested in the navigating mechanism.)
Well, there are actually several different answers, here.
First, don't anticipate being on the same page, when retrieving the location.
If a user changed it's value, then your page will change (and your in-memory application with it), so you'd need to do state management using some form of storage (online/offline) there.
In terms of how the object actually works, that isn't exactly JS (in all cases).
Long story short, there's JavaScript the language, and then there's the DOM/BOM.
Document-Object Model and Browser-Object Model.
The DOM is a set of functions/objects that let you interface with the HTML/CSS(as it applies to an element) of the page.
The BOM contains things that relate, not directly to the HTML, but to other parts of web functionality.
console.log( ); is a good example.
JS has no native console object or Console constructor. That's a part of the BOM that's added to a browser's environment, by the browser vendor (or third-party plugin), and is implementation a specific, with no real standard.
Same thing is true for URLs.
JS has a global object (in the BOM, it's called window), but it doesn't have a location natively.
So the implementation -- the "how", is hard to answer.
Some browsers might do it in legitimate JS, while others do it in C++ or C, and old IE even had ActiveX components.
That said, new JS engines do have magic get / set methods which can perform actions, while you set data.
Cutting-edge JS even has Proxies, which are sort of like that, but on steroids (these won't be widely supported everywhere for a few years)...
But older JS engines didn't have those features in place in the native language, so they went off into some other language and wired things up to behave in ways that wouldn't have been supported in JS itself, but were needed to fill out the BOM/DOM.
these days, using a .set might be all you need to grab an instance of a constructor.
So in terms of setting up your own object with similar functionality, you could RegEx parse (or iterate pieces of) the value handed to you.
On your object, you could have magic get/set methods assigned to a property name, where you could then, on a set, either modify all aspects of your current instance (based on value), or save a new instance to the var currently occupied by the old one, passing the new value to the constructor.
...but "where's the Location" constructor is a question that's not going to be answered in any simple way, without going browser by browser, version by version.
In a browser, window is effectively an alias for the global object so self === window.self and:
self.location
location
window.location
are all equivalent.
The location object is documented at MDN: window.location and at MSDN: The Window’s Location Object.
Users can set a location by entering it in the browser's address bar. You might also provide say an input element that they can type into, then a button that assigns the value to window.location (or uses any of the methods in the references provided).
do browsers convert string data into location objects through macro or does Javascript have some sort of "implicit constructor" mechanism that can automatically invoke the constructor of a class just by assigning a value?
The window and location objects are host objects that are not required to have any constructor, they just "are". Even if they are implemented as instances with prototype inheritance, there's no specification requiring users to have access to the constructor.
A new window can be created using:
var newWindow = window.open(...);
which I suppose is like calling a constructor.
When a location object is assigned a new URL, it behaves per the documentation referenced. How that happens is an implementation detail that is not specified, so you can implement it any way you like as long as it behaves like it should. That is the general philosophy behind javascript.
What you're looking for here is a getter/setter pair.
var window = {
get location() {
return {
get href() { return "foo"; },
set href() { /* change stored location data */; },
get port() { return 80; },
set port() { /* change stored location data */; },
/* etc ... */
};
}
set location() {
// change stored location data
}
};
I have a big object defined in the global scope called global. I would like to dynamically find all the referenced properties under my variable global. That is, all the properties that were accessed during the execution of the code.
I want to do static code analysis to extract all the referenced properties under my variable. I can search for these patterns: global.PROPERTY_NAME AND global[PROPERTY_NAM]. However, what about the complicated cases like these ones
var tmp="PROPERTY_NAME";
global[tmp]
OR
var tmp=global;
tmp.PROPERTY_NAME
and the other ones?
I don't want to get all the variable's properties. I only want a list of the referenced ONES!! the properties that were referenced in my source code only
After your edit:
What you're looking for is JavaScript Proxy objects. Here is a tutorial on how to do this using them.
Proxy objects let you wrap an object and execute a method whenever its properties are accessed. Unfortunately as it currently stands they are not widely supported.
This is currently only way in JavaScript to accomplish this without changing your original global object.
You can turn them on in Chrome by enabling experimental JavaScript in the about:flags tab.
Before your edit:
The feature you're looking for is called reflection, JavaScript supports it well and natively
Here is some code that iterates through an object and gets its properties
for(var prop in global){
if(global.hasOwnProperty(prop)){ //this is to only get its properties and not its prototype's
alert(prop+" => "+global[prop]);
}
}
This is fairly cross-browser. More modern browsers allow you to do this in simpler ways like Object.keys(global) which returns an array containing all its enumerable properties, or Object.getOwnPropertyNames(global) which returns both enumerable and not-enumerable properties.
Due to the dynamic nature of JavaScript you won't achieve that with static code analysis. Think about cases like this:
var prop = document.getElementById('prop').value;
global[prop];
Impossible. The alternative, dynamic analysis, would mean that you modify your global object to log access to its properties, then run the code. This is easily possible in JavaScript but it won't help you either because how would you assure that you have covered every possible access? Especially in a 5 MB JavaScript, there are most likely edge cases that you will oversee.
So, if you can't narrow down your requirement, it won't be possible.
Say I instantiate a large object with var and new. After I'm done with it, can I "free" it by setting it to null? Is this something Javascript GCs look for?
The garbage collection is interested in objects, that are not referenced by ANY other object. So make sure there are no references left anywhere in your application (Arrays etc.)
You can break the reference by setting the variable to null, but it doesn't break any other references.
All references need to be broken individually before the object can be GC'd.
So yes, if the only reference to the object that is held by that variable, then setting it to null will free it for eventual GC.
as am not i am stated, you'll need to break all references in order for a variable to be eligible for garbage collection. This can be a difficult task if you can't track down that last reference to a specific Object, so use the tools available to you for this task. Personally I use the Chrome's Heap Profiler which you can read about in the chrome docs.
Also, note that only non-primitive types are passed by reference (and therefore only non-primitive types can be GC'd).
Objects in JavaScript can be used as Hashtable
(the key must be String)
Is it perform well as Hashtable the data structure?
I mean , does it implemented as Hashtable behind the scene?
Update: (1) I changed HashMap to hashtable (2) I guess most of the browser implement it the same, if not why not? is there any requirement how to implement it in the ECMAScript specs?
Update 2 : I understand, I just wonder how V8 and the Firefox JS VM implements the Object.properties getters/setters?
V8 doesn't implement Object properties access as hashtable, it actually implement it in a better way (performance wise)
So how does it work? "V8 does not use dynamic lookup to access properties. Instead, V8 dynamically creates hidden classes behind the scenes" - that make the access to properties almost as fast as accessing properties of C++ objects.
Why? because in fixed class each property can be found on a specific fixed offset location..
So in general accessing property of an object in V8 is faster than Hashtable..
I'm not sure how it works on other VMs
More info can be found here: https://v8.dev/blog/fast-properties
You can also read more regarding Hashtable in JS here:(my blog) http://simplenotions.wordpress.com/2011/07/05/javascript-hashtable/
"I guess most of the browser implement it the same, if not why not? is there any requirement how to implement it in the ECMAScript specs?"
I am no expert, but I can't think of any reason why a language spec would detail exactly how its features must be implemented internally. Such a constraint would have absolutely no purpose, since it does not impact the functioning of the language in any way other than performance.
In fact, this is absolutely correct, and is in fact the implementation-independence of the ECMA-262 spec is specifically described in section 8.6.2 of the spec:
"The descriptions in these tables indicate their behaviour for native
ECMAScript objects, unless stated otherwise in this document for particular kinds of native ECMAScript objects. Host objects may support these internal properties with any implementation-dependent behaviour as long as it is consistent with the specific host object restrictions stated in this document"
"Host objects may implement these internal methods in any manner unless specified otherwise;"
The word "hash" appears nowhere in the entire ECMA-262 specification.
(original, continued)
The implementations of JavaScript in, say, Internet Explorer 6.0 and Google Chrome's V8 have almost nothing in common, but (more or less) both conform to the same spec.
If you want to know how a specific JavaScript interpreter does something, you should research that engine specifically.
Hashtables are an efficient way to create cross references. They are not the only way. Some engines may optimize the storage for small sets (for which the overhead of a hashtable may be less efficient) for example.
At the end of the day, all you need to know is, they work. There may be faster ways to create lookup tables of large sets, using ajax, or even in memory. For example see the interesting discussion on this post from John Reseig's blog about using a trie data structure.
But that's neither here nor there. Your choice of whether to use this, or native JS objects, should not be driven by information about how JS implements objects. It should be driven only by performance comparison: how does each method scale. This is information you will get by doing performance tests, not by just knowing something about the JS engine implementation.
Most modern JS engines use pretty similar technique to speed up the object property access. The technique is based on so called hidden classes, or shapes. It's important to understand how this optimization works to write efficient JS code.
JS object looks like a dictionary, so why not use one to store the properties? Hash table has O(1) access complexity, it looks like a good solution. Actually, first JS engines have implemented objects this way. But in static typed languages, like C++ or Java a class instance property access is lightning fast. In such languages a class instance is just a segment of memory, end every property has its own constant offset, so to get the property value we just need to take the instance pointer and add the offset to it. In other words, in compile time an expression like this point.x is just replaced by its address in memory.
May be we can implement some similar technique in JS? But how? Let's look at a simple JS function:
function getX(point) {
return point.x;
}
How to get the point.x value? The first problem here is that we don't have a class (or shape) which describes the point. But we can calculate one, that is what modern JS engines do. Most of JS objects at runtime have a shape which is bound to the object. The shape describes properties of the object and where these properties values are stored. It's very similar to how a class definition describes the class in C++ or Java. It's a pretty big question, how the Shape of an object is calculated, I won't describe it here. I recommend this article which contains a great explanation of the shapes in general, and this post which explains how the things are implemented in V8. The most important thing you should know about the shapes is that all objects with the same properties which are added in the same order will have the same shape. There are few exceptions, for example if an object has a lot of properties which are frequently changed, or if you delete some of the object properties using delete operator, the object will be switched into dictionary mode and won't have a shape.
Now, let's imagine that the point object has an array of property values, and we have a shape attached to it, which describes where the x value in this property array is stored. But there is another problem - we can pass any object to the function, it's not even necessary that the object has the x property. This problem is solved by the technique called Inline caching. It's pretty simple, when getX() is executed the first time, it remembers the shape of the point and the result of the x lookup. When the function is called second time, it compares the shape of the point with the previous one. If the shape matches no lookup is required, we can take the previous lookup result.
The primary takeaway is that all objects which describe the same thing should have the same shape, i.e. they should have the same set of properties which are added in the same order. It also explains why it's better to always initialize object properties, even if they are undefined by default, here is a great explanation of the problem.
Relative resources:
JavaScript engine fundamentals: Shapes and Inline Caches and a YouTube video
A tour of V8: object representation
Fast properties in V8
JavaScript Engines Hidden Classes (and Why You Should Keep Them in Mind)
Should I put default values of attributes on the prototype to save space?
this article explains how they are implemented in V8, the engine used by Node.js and most versions of Google Chrome
https://v8.dev/blog/fast-properties
apparently the "tactic" can change over time, depending on the number of properties, going from an array of named values to a dictionary.
v8 also takes the type into account, a number or string will not be treated in the same way as an object (or function, a type of object)
if i understand this correctly a property access frequently, for example in a loop, will be cached.
v8 optimises code on the fly by observing what its actually doing, and how often
v8 will identify the objects with the same set of named properties, added in the same order (like a class constructor would do, or a repetitive bit of JSON, and handle them in the same way.
see the article for more details, then apply at google for a job :)