JavaScript - how do primitives really work

I have the following code:
String.prototype.isLengthGreaterThan = function (limit) {
    return this.length > limit;
};
console.log("John".isLengthGreaterThan(3));
Number.prototype.isPositive = function () {
    return this > 0;
};
console.log(5.isPositive(3)); // error
The first example works, the second doesn't. Why?
From my understanding, primitives are not objects – though they do have access to their function constructor's prototype (Number, String, etc.). So, you can add methods directly to the prototype. In my example above it didn't work in one instance.
I am looking for "under the hood" answer, to understand what happens when you do, for example:
var a = 1;
How does it really have an access to its prototype if it isn't an object?

The dot thing isn't to do with primitives, it's just the syntax of writing numeric literals. In a number, the . is a decimal point, whereas of course that's not the case for a string. To use a method on a literal number, you have two choices:
Parens:
console.log((5).isPositive(3));
Two dots (yes, really):
console.log(5..isPositive(3));
In the second case, it works because the first dot is a decimal point; this means that the dot following can't be a decimal point, and so it's the property accessor operator.
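As a quick check of the literal syntax described above, here is a small sketch that calls the built-in Number.prototype.toFixed on a numeric literal in several ways. The variable names are only illustrative; the whitespace and bracket forms are additional spellings that rely on the same rule (anything that ends the numeric literal before the dot works).

```javascript
// Each line calls Number.prototype.toFixed on a numeric literal.
const viaParens = (5).toFixed(1);     // parentheses end the literal
const viaTwoDots = 5..toFixed(1);     // "5." is the literal, the second dot is the accessor
const viaSpace = 5 .toFixed(1);       // whitespace ends the literal before the dot
const viaBrackets = 5["toFixed"](1);  // bracket notation has no dot to misread
const withFraction = 5.5.toFixed(1);  // the literal already contains its decimal point

console.log(viaParens, viaTwoDots, viaSpace, viaBrackets, withFraction);
```

Writing 5.toFixed(1) directly is a syntax error for the same reason 5.isPositive(3) is.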
How does it really have an access to its prototype if it isn't an object?
It's promoted to an object automatically by the JavaScript engine when you do the property access. In the case of numbers, it's as though you called new Number(5) or similar. In the case of strings, it's as though you called new String("the string"). The temporary object is discarded as soon as the expression is complete. Naturally, the JavaScript engine can optimize the object allocation out, but conceptually that's what happens.
So conceptually:
We do (say) var n = 1; The variable n contains the primitive number value 1.
We do n.isPositive(3); to call our method on it. That's a property access operation:
The engine evaluates the left-hand side, n, and gets the result 1 (a primitive number); this is the base it'll use for the property access operation.
The engine evaluates the right-hand side (isPositive) to determine the property key (name) to look up. In this case it's a literal, so the key is the string "isPositive". This is the property key for the property access operation.
The engine goes to look up the property on the base. Since the base we have is a primitive, the JavaScript engine promotes it (coerces it) to an equivalent Number object.
Since that object doesn't have a "isPositive" property, the engine looks at its prototype, Number.prototype, and finds it; it's a reference to a function.
Various things happen that aren't really germane, and ultimately isPositive is called with this being an object coerced from the primitive value 1. It does its work and generates its return value.
Since the temporary object isn't referenced by anything anymore, it's eligible for garbage collection.
The mechanism by which this happens is a bit scattered in the specification:
The runtime semantics of the property accessor operator return something called a Reference specification type, which basically is a purely abstract specification holding object that will get evaluated later. It says that the base will be the result of evaluating the left-hand side of the property accessor, and the property name will be the result of evaluating the right-hand side.
The GetValue operation on the Reference type, which says (step 5) that if it's a property reference and the base is primitive, it's coerced via the specification's ToObject operation.
ToObject, which defines the rules for doing that.
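The conceptual steps above can be imitated in user code. This is only a sketch: getPropertyLikeTheSpec is a hypothetical helper, not something the engine exposes, and it uses Object(value) as a stand-in for the specification's ToObject operation.

```javascript
// Hypothetical helper (not a real API): imitates in user code what the
// specification describes for reading a property from a primitive base.
function getPropertyLikeTheSpec(base, key) {
  if (base === null || base === undefined) {
    // null and undefined have no wrapper object to convert to
    throw new TypeError("Cannot read property '" + key + "' of " + base);
  }
  const isObject = typeof base === "object" || typeof base === "function";
  // Object(value) wraps a primitive and returns an object unchanged
  const target = isObject ? base : Object(base);
  // Normal lookup: own properties first, then the prototype chain
  return target[key];
}

const n = 1;
const method = getPropertyLikeTheSpec(n, "toFixed");
const foundOnPrototype = method === Number.prototype.toFixed;
const result = method.call(n, 2);

console.log(foundOnPrototype, result); // true "1.00"
```

The wrapper created inside the helper is unreachable once the function returns, which mirrors the "temporary object" described above.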

Related

Why is the string literal considered a primitive type in JavaScript?

The official documentation as well as tons of articles on the internet say that 'some string' is a primitive value, meaning that it creates a copy each time we assign it to a variable.
However, this question (and answer to it) How to force JavaScript to deep copy a string? demonstrates that actually V8 does not copy a string even on the substr method.
It would also be insane to copy strings every time we pass them into functions and would not make sense. In languages like C#, Java, or Python, the String data type is definitely a reference type.
Furthermore, this link shows the hierarchy and we can see HeapObject after all.
https://thlorenz.com/v8-dox/build/v8-3.25.30/html/d7/da4/classv8_1_1internal_1_1_sliced_string.html
Finally, after inspecting
let copy = someStringInitializedAbove
in Devtools it is clear that a new copy of that string has not been created!
So I am pretty sure that strings are not copied on assignment. But I still do not understand why so many articles like JS Primitives vs Reference say that they are.
Fundamentally, because the specification says so:
string value
primitive value that is a finite ordered sequence of zero or more 16-bit unsigned integer values
The specification also defines that there are String objects, as distinct from primitive strings. (Similarly there are primitive number, boolean, and symbol types, and Number and Boolean and Symbol objects.)
Primitive strings follow all the rules of other primitives. At a language level, they're treated exactly the way primitive numbers and booleans are. For all intents and purposes, they are primitive values. But as you say, it would be insane for a = b to literally make a copy of the string in b and put that copy in a. Implementations don't have to do that because primitive string values are immutable (just like primitive number values). You can't change any characters in a string, you can only create a new string. If strings were mutable, the implementation would have to make a copy when you did a = b (but if they were mutable the spec would be written differently).
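A minimal sketch of the immutability mentioned above, assuming nothing beyond standard string behaviour. The function name tryToChangeFirstChar is made up for illustration; it uses strict mode so that the failed write throws instead of being silently ignored.

```javascript
function tryToChangeFirstChar(str) {
  "use strict"; // in strict mode the failed write throws instead of being ignored
  try {
    str[0] = "J";
    return "no error";
  } catch (e) {
    return e.constructor.name;
  }
}

const b = "hello";
const a = b;                        // a and b now hold the same string value
const outcome = tryToChangeFirstChar(b);
const upper = b.toUpperCase();      // string methods return a new string

console.log(outcome, a, b, upper);  // TypeError hello hello HELLO
```

Because no operation can alter the characters of b, an engine is free to let a and b share the same underlying storage without the program being able to tell.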
Note that primitive strings and String objects really are different things:
const s = "hey";
const o = new String("hey");
// Here, the string `s` refers to is temporarily
// converted to a string object so we can perform an
// object operation on it (setting a property).
s.foo = "bar";
// But that temporary object is never stored anywhere,
// `s` still just contains the primitive, so getting
// the property won't find it:
console.log(s.foo); // undefined
// `o` is a String object, which means it can have properties
o.foo = "bar";
console.log(o.foo); // "bar"
So why have primitive strings? You'd have to ask Brendan Eich (and he's reasonably responsive on Twitter), but I suspect it was so that the definition of the equivalence operators (==, ===, !=, and !==) didn't have to either be something that could be overloaded by an object type for its own purposes, or special-cased for strings.
So why have string objects? Having String objects (and Number objects, and Boolean objects, and Symbol objects) along with rules saying when a temporary object version of a primitive is created makes it possible to define methods on primitives. When you do:
console.log("example".toUpperCase());
in specification terms, a String object is created (by the GetValue operation) and then the property toUpperCase is looked up on that object and (in the above) called. Primitive strings therefore get their toUpperCase (and other standard methods) from String.prototype and Object.prototype. But the temporary object that gets created is not accessible to code except in some edge cases,¹ and JavaScript engines can avoid literally creating the object outside of those edge cases. The advantage to that is that new methods can be added to String.prototype and used on primitive strings.
¹ "What edge cases?" I hear you ask. The most common one I can think of is when you've added your own method to String.prototype (or similar) in loose mode code:
Object.defineProperty(String.prototype, "example", {
value() {
console.log(`typeof this: ${typeof this}`);
console.log(`this instance of String: ${this instanceof String}`);
},
writable: true,
configurable: true
});
"foo".example();
// typeof this: object
// this instance of String: true
There, the JavaScript engine was forced to create the String object because this can't be a primitive in loose mode.
Strict mode makes it possible to avoid creating the object, because in strict mode this isn't required to be an object type, it can be a primitive (in this case, a primitive string):
"use strict";
Object.defineProperty(String.prototype, "example", {
value() {
console.log(`typeof this: ${typeof this}`);
console.log(`this instance of String: ${this instanceof String}`);
},
writable: true,
configurable: true
});
"foo".example();
// typeof this: string
// this instance of String: false

Valid property names, property assignment and access in JavaScript

Updated Question
What, exactly, qualifies as a valid property name in Javascript? How do various methods of property assignment differ? And how does the property name affect property access?
Note
The answers to my original question (seen below) helped to clear some things up, but also opened a new can of worms. Now that I've had a chance to become a bit more familiar with JavaScript, I believe I've been able to figure a lot of it out.
Since I had a hard time finding this information consolidated into one explanation, I thought it might be helpful to expand my original question, and attempt to answer it.
Original Question
Originally, there was some confusion with the MDN JavaScript guide (object literals). Specifically, I wondered why they claimed that if a property name was not a valid JavaScript identifier, then it would have to be enclosed in quotes. Yet, they offered example code that showed that the number 7 could be used — without quotes — as a property name.
As it turns out, the guide simply left off one important part, and Pointy updated it (changes in bold):
If the property name would not be a valid JavaScript identifier or number, it must be enclosed in quotes.
I also wondered why property names were allowed to deviate from the "may not start with a digit" rule that applies to identifiers. That question actually reveals the complete misunderstanding that I had of property names, and is what led me to do some more research.
Answer for 1st question:
Yes, the statement given in the MDN guide is not 100% accurate, but in your daily work it'd be better to follow it as a rule. You really don't need to create property names which are numbers.
Answer for 2nd question:
A property name may not start with a digit, but a property name that is a number without any other characters in its name is fine.
This exception exists because properties with a number for a name are the same as indexes.
Let's try this:
var obj = {7: "abc"};
obj[7]; // works fine
obj.7; // gives an error (SyntaxError)
Now try to call Array.push on the object and observe what happens:
Array.prototype.push.call(obj, "xyz");
console.log(obj);
console.log(obj[0]);
// Prints
Object {0: "xyz", 7: "abc", length: 1}
"xyz"
You can see that a couple of new properties (one named 0 and another named length) have been added to the object. Moreover, you can use the object as an array:
var obj = { "0": "abc", "1": "xyz", length: 2 };
Array.prototype.pop.call(obj); // Returns: "xyz"
Array.prototype.pop.call(obj); // Returns: "abc"
You can use array methods on objects; this is called duck typing.
Arrays are nothing more than objects with some predefined methods.
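To illustrate the duck typing described above with non-mutating methods, here is a small sketch. The object and variable names are illustrative; the only assumption is that the object has index-like keys and a length property.

```javascript
// A plain object that merely looks like an array: index-like keys plus length
const arrayLike = { 0: "abc", 1: "xyz", length: 2 };

// Generic array methods only need index-like keys and a length
const upper = Array.prototype.map.call(arrayLike, (s) => s.toUpperCase());
const joined = Array.prototype.join.call(arrayLike, "-");

// Array.from builds a real array from the same shape
const copy = Array.from(arrayLike);
const isRealArray = Array.isArray(arrayLike);

console.log(upper, joined, copy, isRealArray);
// [ 'ABC', 'XYZ' ] abc-xyz [ 'abc', 'xyz' ] false
```

Note that Array.isArray still reports false: the object is accepted by the methods, but it is not turned into an array.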
From MDN:
Array elements are object properties in the same way that length is a property, but trying to access an element of an array with dot notation throws a syntax error, because the property name is not valid. There is nothing special about JavaScript arrays and the properties that cause this. JavaScript properties that begin with a digit cannot be referenced with dot notation and must be accessed using bracket notation.
Now you can understand why a number is valid as a property name. Such names are simply indexes, and they are used in JavaScript arrays. And since JavaScript needs to be consistent with other languages, numbers are valid as index/property names.
Hope this makes it clear.
Here are some interesting articles:
JavaScript identifiers (in ECMAScript 5)
JavaScript identifiers (in ECMAScript 6)
Short Answer
Object property names can be any valid identifier, numeric literal, or string literal (including the empty string).
With that said, there are some potentially confusing intricacies to keep in mind about JavaScript property names, as described below.
And unless you're working with valid (non-negative integer) array indexes, it's a good idea to explicitly assign all numerical property names as strings.
Negative Numbers
What might look like a negative number is actually an expression — something property names do not support.
// SyntaxError
const obj = { -12: 'nope' };
Fortunately, bracket notation handles expressions for us.
// Successful property assignment.
const obj = {};
obj[-12] = 'yup';
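Since ES2015, an object literal can also take an expression as a key if it is wrapped in square brackets (a computed property name), and a quoted string key works in any version. A short sketch with illustrative names:

```javascript
const obj = {
  [-12]: "computed", // the expression -12 is evaluated, then converted to the key "-12"
  "-13": "quoted"    // a string literal is always a valid property name
};

const keys = Object.keys(obj);

console.log(keys);                   // [ '-12', '-13' ]
console.log(obj[-12], obj["-12"]);   // computed computed
console.log(obj[-13]);               // quoted
```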
Typecasting
All property names are typecast into strings before being stored.
const obj = {
12: '12'
};
console.log(typeof Object.keys(obj)[0]); // -> string
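The same conversion applies to keys used with bracket notation, which is why a number and its string form address the same property. A small sketch with illustrative values (symbol keys, added in ES2015, are not converted and are left out here):

```javascript
const obj = {};
obj[1] = "set with a number";
obj["1"] = "set with a string";   // same key "1", so this overwrites the line above
obj[true] = "set with a boolean"; // stored under the key "true"
obj[{}] = "set with an object";   // stored under the key "[object Object]"

const keys = Object.keys(obj);

console.log(keys);   // [ '1', 'true', '[object Object]' ]
console.log(obj[1]); // set with a string
```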
Parsing
But even before typecasting occurs, keys are parsed according to the syntax used, and transformed into a decimal literal.
const obj = {
// Valid string literal
'022': '022',
// Interpreted as decimal
6: '6',
// Interpreted as floating-point
.345: '0.345',
// Interpreted as floating-point
1.000: '1',
// Interpreted as floating-point
8.9890: '8.989',
// Interpreted as decimal
000888: '888',
// Interpreted as octal
0777: '511',
// Interpreted as hexadecimal
0x00111: '273',
// Interpreted as binary
0b0011: '3',
};
/* Quoted property name */
console.log(obj['022']); // "022"; as expected
console.log(obj[022]); // undefined; 022 is an octal literal that evaluates to 18 before our lookup ever occurs
/* Valid (non-negative integer) array index */
console.log(obj[6]); // "6"; as expected
console.log(obj['6']); // "6"; as expected
/* Non-valid array index */
console.log(obj[0x00111]); // "273"; we're accessing the property name as it was assigned (before it was parsed and typecasted)
console.log(obj['0x00111']); // undefined; after parsing and typecasting, our property name seems to have disappeared
console.log(obj['273']); // "273"; there it is, we found it using the evaluation of our original assignment
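One way to predict the key a numeric literal will end up with is to convert the number to a string yourself. The sketch below uses a hypothetical helper, keyFor, which is just String(n); it avoids the legacy octal form (0777), since that literal is rejected in strict mode.

```javascript
// Hypothetical helper: the key a numeric property name is stored under
const keyFor = (n) => String(n);

const samples = {
  hex: keyFor(0x00111),          // "273"
  binary: keyFor(0b0011),        // "3"
  trailingZeros: keyFor(8.9890), // "8.989"
  leadingDot: keyFor(.345),      // "0.345"
  large: keyFor(1e21)            // "1e+21"
};

const obj = { 0x00111: "hex literal", 1e21: "large literal" };
const keys = Object.keys(obj);

console.log(samples);
console.log(keys); // [ '273', '1e+21' ]
```

Looking a property up by the string you typed in the source ("0x00111") fails for the same reason: only the converted form exists as a key.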

Is null a regular primitive in JavaScript?

There are plenty of discussions about what null in JavaScript actually is. For example, Why is null an object and what's the difference between null and undefined?.
MDN lists null among primitive values and states that it is:
a special keyword denoting a null value; null is also a primitive
value
(The above emphasis is mine)
My last reference will be to Programming JavaScript Applications book by Eric Elliott, which in its Chapter 3. Objects says the following:
In JavaScript, ... even primitive types
get the object treatment when you refer to them with the property
access notations. They get automatically wrapped with an object so
that you can call their prototype methods.
Primitive types behave like objects when you use the property access notations, but you can't assign new properties to them. Primitives
get wrapped with an object temporarily, and then that object is
immediately thrown away. Any attempt to assign values to properties
will seem to succeed, but subsequent attempts to access that new
property will fail.
And indeed the following statements will execute without a problem:
"1".value = 1;
(1).value = "1";
false.value = "FALSE";
while this one
null.value = "Cannot set property of null";
throws Uncaught TypeError. See JS Fiddle.
So at least in this regard, null behaves differently than other primitives.
Is null considered a regular primitive in JavaScript?
Yes it's an actual primitive.
The exceptions for property access are null and undefined, because they have no wrapper type like strings, booleans and numbers do.
ECMAScript 5, Section 4.3.2 primitive
value
member of one of the types Undefined, Null, Boolean, Number, or String
as defined in Clause 8.
NOTE A primitive value is a datum that is represented directly at the
lowest level of the language implementation.
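The difference between primitives that have a wrapper type and those that don't can be observed directly. In this sketch, readValue is a made-up helper that reports the name of the error instead of letting it propagate; the optional chaining line assumes an ES2020-capable engine.

```javascript
// Made-up helper: reads .value and reports the error type if the read throws
function readValue(x) {
  try {
    return x.value;
  } catch (e) {
    return e.constructor.name;
  }
}

const fromString = readValue("1");          // wrapped in a String object, property not found
const fromNumber = readValue(1);            // wrapped in a Number object, property not found
const fromNull = readValue(null);           // no wrapper type exists
const fromUndefined = readValue(undefined); // no wrapper type exists
const guarded = null?.value;                // optional chaining skips the access

console.log(fromString, fromNumber, fromNull, fromUndefined, guarded);
// undefined undefined TypeError TypeError undefined
```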

Confusion about prototype chain , primitives and objects

In the Firebug console:
>>> a=12
12
>>> a.__proto__
Number {}
>>> (12).__proto__
Number {}
>>> a.constructor.prototype === (12).__proto__
true
>>> a.constructor.prototype.isPrototypeOf(a)
false
The final line confuses me greatly compared to the other lines. Also see Constructor.prototype not in the prototype chain?
When you use the . operator with a primitive, the language auto-boxes it with the appropriate Object type (in this case, Number). That's because simple primitive types in JavaScript really are not Object instances.
Thus, the actual left-hand side of
a.__proto__
is not the number 12 but essentially new Number(12). However, the variable "a" continues to be the simple number value 12.
edit — Section 8.7 of the spec "explains" this with typical ECMA 262 moon language. I can't find a clear paragraph that describes the way that a primitive baseValue is treated as a Number, Boolean, or String instance, but that section directly implies it. I think that because those non-primitive synthetic values are ephemeral (they're only "real" while the . or [] expression is being evaluated) that the spec just talks about the behavior without explicitly requiring that an actual Number is constructed. I'm guessing on that however.
@Pointy has explained it very well. Basically, if you want your last statement to be true, you would have to write it like:
a.constructor.prototype.isPrototypeOf(new Number(a));
In JavaScript primitives do not have a prototype chain. Only objects do. A primitive value includes:
Booleans
Numbers
Strings
Null
Undefined
Hence if you call isPrototypeOf with a primitive value then it'll always return false.
If you try to use a boolean, number or string as an object then JavaScript automatically coerces it into an object for you. Hence a.constructor evaluates to new Number(a).constructor behind the scenes. This is the reason you can use a primitive value as an object.
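The checks from the question can be repeated side by side for the primitive and for an explicitly wrapped value. A small sketch; the variable names are illustrative, and Object(a) is used as an equivalent of new Number(a) for a number.

```javascript
const a = 12;

// isPrototypeOf and instanceof receive the primitive itself, so no wrapping happens
const onPrimitive = Number.prototype.isPrototypeOf(a);
const viaInstanceof = a instanceof Number;

// Property access wraps the primitive first, so the lookup succeeds
const viaConstructor = a.constructor === Number;

// An explicit wrapper object does have Number.prototype in its chain
const onWrapper = Number.prototype.isPrototypeOf(Object(a));

console.log(onPrimitive, viaInstanceof, viaConstructor, onWrapper);
// false false true true
```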
If you wish to use a variable storing a primitive value as an object often then it's better to explicitly make it an object. For example in your case it would have been better to define a as new Number(12). The advantages are:
JavaScript doesn't need to coerce the primitive to an object every time you try to use it as an object. You only create the object once. Hence it's performance efficient.
The isPrototypeOf method in your case will return true as a will be an instance of Number. Hence it will have Number.prototype in its prototype chain.
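If you do define a as a Number object, it may help to see how such an object behaves in ordinary expressions. This is only a sketch with illustrative names; it shows standard behaviour of Number objects and makes no claim about performance.

```javascript
const boxed = new Number(12);
const otherBoxed = new Number(12);

const inChain = Number.prototype.isPrototypeOf(boxed); // true, as described above
const type = typeof boxed;                // "object", not "number"
const strictEqual = boxed === 12;         // false: an object is never === a primitive
const looseEqual = boxed == 12;           // true: the object is converted first
const twoObjects = boxed === otherBoxed;  // false: two distinct objects
const sum = boxed + 1;                    // 13: arithmetic unwraps the value
const zeroIsTruthy = Boolean(new Number(0)); // true: every object is truthy

console.log(inChain, type, strictEqual, looseEqual, twoObjects, sum, zeroIsTruthy);
```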

JavaScript and String as primitive value

In JavaScript a String is a primitive value.
But is also a String object...
A primitive value is a value put directly into a variable.
So my question is:
var d = "foo";
Does d directly contain foo, or a reference to a string object as in other languages?
Thanks.
If I understand it correctly, d will contain the string literal "foo", and not a reference to an object. However, the JavaScript engine will effectively cast the literal to an instance of String when necessary, which is why you can call methods of String.prototype on string literals:
"some string".toUpperCase(); //Method of String.prototype
The following snippet from MDN may help to explain it further (emphasis added):
String literals (denoted by double or single quotes) and strings
returned from String calls in a non-constructor context (i.e., without
using the new keyword) are primitive strings. JavaScript automatically
converts primitives and String objects, so that it's possible to use
String object methods for primitive strings. In contexts where a
method is to be invoked on a primitive string or a property lookup
occurs, JavaScript will automatically wrap the string primitive and
call the method or perform the property lookup.
This is all explained in detail in the specification, but it's not exactly easy reading. I asked a related question recently (about why it is possible to do the above), so it might be worth reading the (very) detailed answer.
if you define
var d = "foo";
then d directly contains foo
but, if you define
var S = new String("foo");
then S is an Object
Example:
var s1 = "1";
var s2 = "1";
console.log(s1 == s2); // true
var S1 = new String("2");
var S2 = new String("2");
console.log(S1 == S2); // false
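A slightly extended version of the comparison above, as a sketch with the same values: two String objects are compared by identity, while their underlying primitive values compare equal.

```javascript
const s1 = "1";
const s2 = "1";
const S1 = new String("2");
const S2 = new String("2");

const primitivesEqual = s1 === s2;                    // true: compared by value
const objectsEqual = S1 == S2;                        // false: two distinct objects
const unwrappedEqual = S1.valueOf() === S2.valueOf(); // true: both unwrap to "2"
const looseWithPrimitive = S1 == "2";                 // true: the object is converted first
const types = [typeof s1, typeof S1];                 // ["string", "object"]

console.log(primitivesEqual, objectsEqual, unwrappedEqual, looseWithPrimitive, types);
```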
I think that every variable in Javascript actually represents an Object. Even a function is an Object.
I found two useful articles detailing this, located here and here. Seems like primitive types in JavaScript are passed by VALUE (i.e. when you pass it to a function it gets "sandboxed" within the function and the original variable's value won't change), while reference types are passed, you guessed it, by REFERENCE and passing it through to a function will change the original variable.
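The behaviour described above can be tried with three small functions. This is a sketch with made-up names; the third case is included because reassigning an object parameter does not affect the caller's variable, only mutating the object does.

```javascript
function reassign(value) {
  value = "changed";            // only rebinds the local parameter
}
function mutate(obj) {
  obj.name = "changed";         // changes the object the caller also sees
}
function replace(obj) {
  obj = { name: "replaced" };   // rebinds the local parameter, caller unaffected
}

let text = "original";
reassign(text);

const person = { name: "original" };
mutate(person);

const other = { name: "original" };
replace(other);

console.log(text, person.name, other.name); // original changed original
```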
Primitive types in JavaScript are text (string), numeric (float / int), boolean and NULL (and the dreaded "undefined" type). Any custom objects, functions or standard arrays are considered reference types. I haven't researched the Date type though, but I'm sure it will fall into the primitive types.
Found this page about javascript variables, seems that:
Primitive type for javascript are booleans, numbers and text.
I believe there are no primitives in Javascript, in the Java sense at least - everything is an object of some kind.
So yes it is a reference to an object - if you extend the String object, d would have that extension.
If you mean primitives as in those types provided by the language, you've got a few, boolean, numbers, strings and dates are all defined by the language.
