Object variables: Pointers vs References - what's common in languages? [duplicate] - javascript

This question already has answers here:
What are the differences between a pointer variable and a reference variable?
(44 answers)
When to use references vs. pointers
(17 answers)
What is the difference between a pointer and a reference variable in Java?
(9 answers)
Closed 6 months ago.
I've learned programming in Java at university. The book I read, explained the concept of pointers being used for object variables. Later I checked the actual concepts of pointers & references in C++.
I know these aren't 100% applicable to other high level / scripting languages, but there are recurring patterns that are similar.
When I started learning new languages, I noticed that most of them rely on the concept of storing some sort of object identifier inside a variable. When using an accessing-operator on the object-variable, it will read it, identify the object based on the identifier and access its content. I'd say this comes pretty close to the basic idea of a pointer.
I've confirmed that behavior for PHP and JS.
Example for PHP:
https://www.php.net/manual/en/language.oop5.references.php
When an object is sent by argument, returned or assigned to another variable, the different variables are not aliases: they hold a copy of the identifier, which points to the same object.
However, I often see people mixing up the terms reference and pointer. This started confusing me a little.
Personally, I think pointers are a more elegant way to store objects.
I'd like to know if pointers are actually a common case as object-variable or if references are just as common. Do pointers have notable advantages over references?
It's hard to tell which general concept is being used, because from a programmers perspective, the internals are often abstracted away. But I'd like to have a rough understanding of what's happening under the hood. Not in all detail, but conceptually.

At the CPU level, there are pointers (and nothing else except the registers). Therefore, all so called low-level languages must have some pointer mechanism because that's how the internals work. Every variable, function, etc is a pointer at the assembly level.
At the language level however, it's a different story if and how pointers and references are used. If they are implemented in similar way, that's up to the implementation. At the language level, they are different. So you need to learn both. It's not only the concept of a pointer or a reference also, it's the whole process of memory management, variable visibility and lifetime etc.
Suggestion: Start cleanly with a good C++ book and don't take assumptions on anything. Java's memory management is quite different from C++ and C++ 11 and later memory management is also different than earlier versions.

Related

How does an object store keys inside it in javascript [duplicate]

Objects in JavaScript can be used as Hashtable
(the key must be String)
Is it perform well as Hashtable the data structure?
I mean , does it implemented as Hashtable behind the scene?
Update: (1) I changed HashMap to hashtable (2) I guess most of the browser implement it the same, if not why not? is there any requirement how to implement it in the ECMAScript specs?
Update 2 : I understand, I just wonder how V8 and the Firefox JS VM implements the Object.properties getters/setters?
V8 doesn't implement Object properties access as hashtable, it actually implement it in a better way (performance wise)
So how does it work? "V8 does not use dynamic lookup to access properties. Instead, V8 dynamically creates hidden classes behind the scenes" - that make the access to properties almost as fast as accessing properties of C++ objects.
Why? because in fixed class each property can be found on a specific fixed offset location..
So in general accessing property of an object in V8 is faster than Hashtable..
I'm not sure how it works on other VMs
More info can be found here: https://v8.dev/blog/fast-properties
You can also read more regarding Hashtable in JS here:(my blog) http://simplenotions.wordpress.com/2011/07/05/javascript-hashtable/
"I guess most of the browser implement it the same, if not why not? is there any requirement how to implement it in the ECMAScript specs?"
I am no expert, but I can't think of any reason why a language spec would detail exactly how its features must be implemented internally. Such a constraint would have absolutely no purpose, since it does not impact the functioning of the language in any way other than performance.
In fact, this is absolutely correct, and is in fact the implementation-independence of the ECMA-262 spec is specifically described in section 8.6.2 of the spec:
"The descriptions in these tables indicate their behaviour for native
ECMAScript objects, unless stated otherwise in this document for particular kinds of native ECMAScript objects. Host objects may support these internal properties with any implementation-dependent behaviour as long as it is consistent with the specific host object restrictions stated in this document"
"Host objects may implement these internal methods in any manner unless specified otherwise;"
The word "hash" appears nowhere in the entire ECMA-262 specification.
(original, continued)
The implementations of JavaScript in, say, Internet Explorer 6.0 and Google Chrome's V8 have almost nothing in common, but (more or less) both conform to the same spec.
If you want to know how a specific JavaScript interpreter does something, you should research that engine specifically.
Hashtables are an efficient way to create cross references. They are not the only way. Some engines may optimize the storage for small sets (for which the overhead of a hashtable may be less efficient) for example.
At the end of the day, all you need to know is, they work. There may be faster ways to create lookup tables of large sets, using ajax, or even in memory. For example see the interesting discussion on this post from John Reseig's blog about using a trie data structure.
But that's neither here nor there. Your choice of whether to use this, or native JS objects, should not be driven by information about how JS implements objects. It should be driven only by performance comparison: how does each method scale. This is information you will get by doing performance tests, not by just knowing something about the JS engine implementation.
Most modern JS engines use pretty similar technique to speed up the object property access. The technique is based on so called hidden classes, or shapes. It's important to understand how this optimization works to write efficient JS code.
JS object looks like a dictionary, so why not use one to store the properties? Hash table has O(1) access complexity, it looks like a good solution. Actually, first JS engines have implemented objects this way. But in static typed languages, like C++ or Java a class instance property access is lightning fast. In such languages a class instance is just a segment of memory, end every property has its own constant offset, so to get the property value we just need to take the instance pointer and add the offset to it. In other words, in compile time an expression like this point.x is just replaced by its address in memory.
May be we can implement some similar technique in JS? But how? Let's look at a simple JS function:
function getX(point) {
return point.x;
}
How to get the point.x value? The first problem here is that we don't have a class (or shape) which describes the point. But we can calculate one, that is what modern JS engines do. Most of JS objects at runtime have a shape which is bound to the object. The shape describes properties of the object and where these properties values are stored. It's very similar to how a class definition describes the class in C++ or Java. It's a pretty big question, how the Shape of an object is calculated, I won't describe it here. I recommend this article which contains a great explanation of the shapes in general, and this post which explains how the things are implemented in V8. The most important thing you should know about the shapes is that all objects with the same properties which are added in the same order will have the same shape. There are few exceptions, for example if an object has a lot of properties which are frequently changed, or if you delete some of the object properties using delete operator, the object will be switched into dictionary mode and won't have a shape.
Now, let's imagine that the point object has an array of property values, and we have a shape attached to it, which describes where the x value in this property array is stored. But there is another problem - we can pass any object to the function, it's not even necessary that the object has the x property. This problem is solved by the technique called Inline caching. It's pretty simple, when getX() is executed the first time, it remembers the shape of the point and the result of the x lookup. When the function is called second time, it compares the shape of the point with the previous one. If the shape matches no lookup is required, we can take the previous lookup result.
The primary takeaway is that all objects which describe the same thing should have the same shape, i.e. they should have the same set of properties which are added in the same order. It also explains why it's better to always initialize object properties, even if they are undefined by default, here is a great explanation of the problem.
Relative resources:
JavaScript engine fundamentals: Shapes and Inline Caches and a YouTube video
A tour of V8: object representation
Fast properties in V8
JavaScript Engines Hidden Classes (and Why You Should Keep Them in Mind)
Should I put default values of attributes on the prototype to save space?
this article explains how they are implemented in V8, the engine used by Node.js and most versions of Google Chrome
https://v8.dev/blog/fast-properties
apparently the "tactic" can change over time, depending on the number of properties, going from an array of named values to a dictionary.
v8 also takes the type into account, a number or string will not be treated in the same way as an object (or function, a type of object)
if i understand this correctly a property access frequently, for example in a loop, will be cached.
v8 optimises code on the fly by observing what its actually doing, and how often
v8 will identify the objects with the same set of named properties, added in the same order (like a class constructor would do, or a repetitive bit of JSON, and handle them in the same way.
see the article for more details, then apply at google for a job :)

Objects and the "in" keyword in JavaScript [duplicate]

Objects in JavaScript can be used as Hashtable
(the key must be String)
Is it perform well as Hashtable the data structure?
I mean , does it implemented as Hashtable behind the scene?
Update: (1) I changed HashMap to hashtable (2) I guess most of the browser implement it the same, if not why not? is there any requirement how to implement it in the ECMAScript specs?
Update 2 : I understand, I just wonder how V8 and the Firefox JS VM implements the Object.properties getters/setters?
V8 doesn't implement Object properties access as hashtable, it actually implement it in a better way (performance wise)
So how does it work? "V8 does not use dynamic lookup to access properties. Instead, V8 dynamically creates hidden classes behind the scenes" - that make the access to properties almost as fast as accessing properties of C++ objects.
Why? because in fixed class each property can be found on a specific fixed offset location..
So in general accessing property of an object in V8 is faster than Hashtable..
I'm not sure how it works on other VMs
More info can be found here: https://v8.dev/blog/fast-properties
You can also read more regarding Hashtable in JS here:(my blog) http://simplenotions.wordpress.com/2011/07/05/javascript-hashtable/
"I guess most of the browser implement it the same, if not why not? is there any requirement how to implement it in the ECMAScript specs?"
I am no expert, but I can't think of any reason why a language spec would detail exactly how its features must be implemented internally. Such a constraint would have absolutely no purpose, since it does not impact the functioning of the language in any way other than performance.
In fact, this is absolutely correct, and is in fact the implementation-independence of the ECMA-262 spec is specifically described in section 8.6.2 of the spec:
"The descriptions in these tables indicate their behaviour for native
ECMAScript objects, unless stated otherwise in this document for particular kinds of native ECMAScript objects. Host objects may support these internal properties with any implementation-dependent behaviour as long as it is consistent with the specific host object restrictions stated in this document"
"Host objects may implement these internal methods in any manner unless specified otherwise;"
The word "hash" appears nowhere in the entire ECMA-262 specification.
(original, continued)
The implementations of JavaScript in, say, Internet Explorer 6.0 and Google Chrome's V8 have almost nothing in common, but (more or less) both conform to the same spec.
If you want to know how a specific JavaScript interpreter does something, you should research that engine specifically.
Hashtables are an efficient way to create cross references. They are not the only way. Some engines may optimize the storage for small sets (for which the overhead of a hashtable may be less efficient) for example.
At the end of the day, all you need to know is, they work. There may be faster ways to create lookup tables of large sets, using ajax, or even in memory. For example see the interesting discussion on this post from John Reseig's blog about using a trie data structure.
But that's neither here nor there. Your choice of whether to use this, or native JS objects, should not be driven by information about how JS implements objects. It should be driven only by performance comparison: how does each method scale. This is information you will get by doing performance tests, not by just knowing something about the JS engine implementation.
Most modern JS engines use pretty similar technique to speed up the object property access. The technique is based on so called hidden classes, or shapes. It's important to understand how this optimization works to write efficient JS code.
JS object looks like a dictionary, so why not use one to store the properties? Hash table has O(1) access complexity, it looks like a good solution. Actually, first JS engines have implemented objects this way. But in static typed languages, like C++ or Java a class instance property access is lightning fast. In such languages a class instance is just a segment of memory, end every property has its own constant offset, so to get the property value we just need to take the instance pointer and add the offset to it. In other words, in compile time an expression like this point.x is just replaced by its address in memory.
May be we can implement some similar technique in JS? But how? Let's look at a simple JS function:
function getX(point) {
return point.x;
}
How to get the point.x value? The first problem here is that we don't have a class (or shape) which describes the point. But we can calculate one, that is what modern JS engines do. Most of JS objects at runtime have a shape which is bound to the object. The shape describes properties of the object and where these properties values are stored. It's very similar to how a class definition describes the class in C++ or Java. It's a pretty big question, how the Shape of an object is calculated, I won't describe it here. I recommend this article which contains a great explanation of the shapes in general, and this post which explains how the things are implemented in V8. The most important thing you should know about the shapes is that all objects with the same properties which are added in the same order will have the same shape. There are few exceptions, for example if an object has a lot of properties which are frequently changed, or if you delete some of the object properties using delete operator, the object will be switched into dictionary mode and won't have a shape.
Now, let's imagine that the point object has an array of property values, and we have a shape attached to it, which describes where the x value in this property array is stored. But there is another problem - we can pass any object to the function, it's not even necessary that the object has the x property. This problem is solved by the technique called Inline caching. It's pretty simple, when getX() is executed the first time, it remembers the shape of the point and the result of the x lookup. When the function is called second time, it compares the shape of the point with the previous one. If the shape matches no lookup is required, we can take the previous lookup result.
The primary takeaway is that all objects which describe the same thing should have the same shape, i.e. they should have the same set of properties which are added in the same order. It also explains why it's better to always initialize object properties, even if they are undefined by default, here is a great explanation of the problem.
Relative resources:
JavaScript engine fundamentals: Shapes and Inline Caches and a YouTube video
A tour of V8: object representation
Fast properties in V8
JavaScript Engines Hidden Classes (and Why You Should Keep Them in Mind)
Should I put default values of attributes on the prototype to save space?
this article explains how they are implemented in V8, the engine used by Node.js and most versions of Google Chrome
https://v8.dev/blog/fast-properties
apparently the "tactic" can change over time, depending on the number of properties, going from an array of named values to a dictionary.
v8 also takes the type into account, a number or string will not be treated in the same way as an object (or function, a type of object)
if i understand this correctly a property access frequently, for example in a loop, will be cached.
v8 optimises code on the fly by observing what its actually doing, and how often
v8 will identify the objects with the same set of named properties, added in the same order (like a class constructor would do, or a repetitive bit of JSON, and handle them in the same way.
see the article for more details, then apply at google for a job :)

Literal notation VS. constructor to create objects in JavaScript [duplicate]

This question already has answers here:
Should I be using object literals or constructor functions?
(12 answers)
Closed 7 years ago.
I am learning JavaScript from the basics (although I program in other languages such as C#). It popped up to me the question of which is of this two ways is more efficient and should be use as general rule.
I am sure and expecting no definitive answer but I would like to know the general pros and cons.
Thank you!!
Object literals are usually the way to go. They only need to be parsed when loading the script, which can introduce various optimizations by the scripting engine.
Constructors need to be executed. That means they will be slower, but you can easily add some validation code etc. to them, and they allow the construction of complex objects with public, privileged methods and private "attributes" hidden in the constructors scope. Also, they of course construct objects that share a prototype, which you might find useful.
Not aware of any performance efficiency of one over the other. However, the literal notation seems to get its preference due to the simplicity argument, and because it avoids using constructors and new keyword.
Constructors and the new keyword are seen by some as negative features of the JavaScript language (see Crockford - JavaScript:The Good Parts). JSLint even calls out when finding new Array() or new Object() use.

anonymous functions considered harmful? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
The more I delve into javascript, the more I think about the consequences of certain design decisions and encouraged practices. In this case, I am observing anonymous functions, a feature which is not only JavaScript-provided, but I see strongly used.
I think we can all agree on the following facts:
the human mind does not deal with more than 7 plus minus two entities (Miller's law)
deep indentation is considered bad programming practice, and generally points out at design issues if you indent more than three or four levels. This extends to nested entities, and it's well presented in the python Zen entry "Flat is better than nested."
the idea of having a function name is both for reference, and for easy documentation of the task it performs. We know, or can expect, what a function called removeListEntry() does. Self-documenting, clear code is important for debugging and readability.
While anonymous functions appears to be a very nice feature, its use leads to deeply nested code design. The code is quick to write, but difficult to read. Instead of being forced to invent a named context for a functionality, and flatten your hierarchy of callable objects, it encourages a "go deep one level", pushing your brain stack and quickly overflowing the 7 +/- 2 rule. A similar concept is expressed in Alan Cooper's "About Face", quoting loosely "people don't understand hierarchies". As programmers we do understand hierarchies, but our biology still limits our grasping of deep nesting.
I'd like to hear you on this point. Should anonymous functions be considered harmful, an apparent shiny syntactic sugar which we find later on to be salt, or even rat poison ?
CW as there's no correct answer.
As I see it, the problem you're facing is not anonymous functions, rather an unwillingness to factor out functionality into useful and reusable units. Which is interesting, because it's easier to reuse functionality in languages with first-class functions (and, necessarily, anonymous functions) than in languages without.
If you see a lot of deeply nested anonymous functions in your code, I would suggest that there may be a lot of common functionality that can be factored out into named higher-order functions (i.e. functions that take or return ("build") other functions). Even "simple" transformations of existing functions should be given names if they are used often. This is just the DRY principle.
Anonymous functions are more useful functionally than they are harmful legibly. I think that if you format your code well enough, you shouldn't have a problem. I don't have a problem with it, and I'm sure I can't handle 7 elements, let alone 7 + 2 :)
Actually, hierarchies help to overcome 7+/-2 rule the same way as OOP does. When you're writing or reading a class, you read its content and nothing of outside code so you are dealing with relatively small portion of entities. When you're looking at class hierarchies, you don't look inside them, meaning then again you are dealing with small number of entities.
The same if true for nested functions. By dividing your code into multiple levels of hierarchy, you keep each level small enough for human brain to comprehend.
Closures (or anonymous functions) just help to break your code into slightly different way than OOP does but they doesn't really create any hierarchies. They are here to help you to execute your code in context of other block of code. In C++ or Java you have to create a class for that, in JavaScript function is enough. Granted, standalone class is easier to understand as it is just easier for human to look at it as at standalone block. Function seems to be much smaller in size and brain sometimes think it can comprehend it AND code around it at the same time which is usually a bad idea. But you can train your brain not to do that :)
So no, I don't think anonymous functions are at all harmful, you just have to learn to deal with them, as you learnt to deal with classes.
Amusingly, JavaScript will let you name "anonymous" functions:
function f(x) {
return function add(y) {
return x+y;
};
}
I think closures have enormous benefits which should not be overlooked. For example, Apple leverages "blocks"(closures for C) with GCD to provide really easy multithreading - you don't need to setup context structs, and can just reference variables by name since they're in scope.
I think a bigger problem with Javascript is that it doesn't have block scope(blocks in this case referring to code in braces, like an if statement). This can lead to enormous complications, forcing programmers to use unnecessary closures to get around this Javascript design limitation.
I also think anonymous functions (in latest languages often referred as closures) have great benefits and make code often more readable and shorter. I sometimes am getting really nuts when I have to work with Java (where closures aren't first class language features).
If indentation and too many encapsulated function-variables are the problem then you should refactor the code to have it more modular and readable.
Regarding java-script I think that function-variables look quite ugly and make code cluttered (the encapsulated function(...){} string makes java-script code often less readable). As an example I much prefer the closure syntax of groovy ('{}' and '->' chars).
If a function is not understandable without a name, the name is probably too long.
Use comments to explain cryptic code, don't rely on names.
Who ever came up with the idea of requiring functions to be bound to identifiers did every programmer a disservice. If you've never done functional programming and you're not familiar with and comfortable with functions being first-class values, you're not a real programmer.
In fact, to counter your own argument, I would go so far as to consider functions bound to (global) names to be harmful! Check Crockford's article about private and public members and learn more.

What lessons can be learned from the prototype model in javascript?

The question is from a language design perspective.
I should explain a little about the situation. I am working on a javascript variant which does not support prototypes, however it is overdue a decent type system (most importantly support for instanceof). The ecmascript spec is not important, so i have the freedom to implement something different and better suited.
In the variant:-
You do not declare constructors with function foo(), rather constructors are declared in template files, which means constructors exist in a namespace (detirmined by the path of the file)
Currently all inheritance of behaviour is done by applying templates which means all shared functions are copied to each individual object (there are no prototypes afterall).
Never having been a web developer, this puts me in the slightly bizarre position of never having used prototypes in anger. Though this hasn't stopped me having opinions on them.
My principal issues with the prototype model as i understand it are
unnecessary littering of object namespace, obj.prototype, obj.constructor (is this an immature objection, trying to retain ability to treat objects as maps, which perhaps they are not?)
ability to change shared behaviour at runtime seems unnecessary, when directly using an extra level of indirection would be more straight forward obj.shared.foo(). Particularly it is quite a big implementation headache
people do not seem to understand prototypes very well generally e.g. the distinction between a prototype and a constructor.
So to get around these my idea is to have a special operator constructorsof. Basically the principal is that each object has a list of constructors, which occasionally you will want access to.
var x = new com.acme.X();
com.acme.Y(x,[]); // apply y
(constructorsof x) // [com.acme.Y,com.acme.X,Object];
x instanceof com.acme.X; // true
x instanceof com.acme.Y; // true
All feedback appreciated, I appreciate it maybe difficult to appreciate my POV as there is a lot i am trying to convey, but its an important decision and an expert opinion could be invaluable.
anything that can improve my understanding of the prototype model, the good and the bad.
thoughts on my proposal
thanks,
mike
edit: proposal hopefully makes sense now.
Steve Yegge has written a good technical article about the prototype model.
I don't think your issues with the prototype model are valid:
unnecessary littering of object namespace, obj.prototype, obj.constructor
prototype is a property of the contructor function, not the object instance. Also, the problem isn't as bad as it sounds because of the [[DontEnum]] attribute, which unfortunately can't be set programatically. Some of the perceived problems would go away if you could.
is this an immature objection, trying to retain ability to treat objects as maps, which perhaps they are not?
There's no problem with using objects as maps as long as the keys are strings and you check hasOwnProperty().
ability to change shared behaviour at runtime seems unnecessary, when directly using an extra level of indirection would be more straight forward obj.shared.foo(). Particularly it is quite a big implementation headache
I don't see where the big implementation headache in implementning the prototype chain lies. In fact, I consider prototypical inheritance conceptually simpler than class-based inheritance, which doesn't offer any benefits in languages with late-binding.
people do not seem to understand prototypes very well generally e.g. the distinction between a prototype and a constructor.
People who only know class-based oo languages like Java and C++ don't understand JavaScript's inheritance system, news at 11.
In addition to MarkusQ's suggestions, you might also want to check out Io.
It might just be easier to try a few things with practical code. Create the language with one simple syntax, whatever that is, and implement something in that language. Then, after a few iterations of refactoring, identify the features that are obstacles to reading and writing the code. Add, alter or remove what you need to improve the language. Do this a few times.
Be sure your test-code really exercises all parts of your language, even with some bits that really try to break it. Try to do everything wrong in your tests (as well as everything right)
Reading up on "self", the language that pioneered the prototype model, will probably help you more than just thinking of it in terms of javascript (especially since you seem to associate that, as many do, with "web programming"). A few links to get you started:
http://selflanguage.org/
http://www.self-support.com/
Remember, those who fail to learn history are doomed to reimplement it.

Categories