Isn't string.length actually a method in JavaScript? - javascript

I would like to get a better understanding of what is actually going on when I find the length of a string. I tried looking on W3, ECMA, and at the V8 Ignition website but not much luck.
I keep reading that 'JavaScript treats primitive values as objects when executing methods and properties.' But, I can't seem to find out how exactly this happens. If I call a method/property on a primitive which, I assume gets interpreted as an object by Ignition, doesn't the String class need to call a function at some point to iterate the string? I feel like myString.length should be called a method and String.length could MAYBE be called a property, depending on at which point the "property" is found and how it's found.
Basically, I don't understand why it's touted as a property if it doesn't seem to be inherent and has to be fetched/determined. That seems like a method to me (let alone the fact that string.length isn't even a real thing and is interpreted).

(V8 developer here.)
I can see several issues here that can be looked at separately:
1. From a language specification perspective, is something a method or a property?
Intuitively, the distinction is: if you write a function call like obj.method(), then it's a method; if you write obj.property (no ()), then it's a property.
Of course in JavaScript, you could also say that everything is a property, and in case the current value of the property is a function, then that makes it a method. So obj.method gets you a reference to that function, and obj.method() gets and immediately calls it:
var obj = {};
obj.foo = function() { console.log("function called"); return 42; }
var x = obj.foo(); // A method!
var func = obj.foo; // A property!
x = func(); // A call!
obj.foo = 42;
obj.foo(); // A TypeError!
2. When it looks like a property access, is it always a direct read/write from/to memory, or might some function get executed under the hood?
The latter. JavaScript itself even provides this capability to objects you can create:
var obj = {};
Object.defineProperty(obj, "property", {
get: function() { console.log("getter was called"); return 42; },
set: function(x) { console.log("setter was called"); }
});
// *Looks* like a pair of property accesses, but will call getter and setter:
obj.property = obj.property + 1;
The key is that users of this obj don't have to care that getters/setters are involved, to them .property looks like a property. This is of course very much intentional: implementation details of obj are abstracted away; you could modify the part of the code that sets up obj and its .property from a plain property to a getter/setter pair or vice versa without having to worry about updating other parts of the code that read/write it.
Some built-in objects rely on this trick, the most common example is arrays' .length: while it's specified to be a property with certain "magic" behavior, the most straightforward way for engines to implement this is to use a getter/setter pair under the hood, where in particular the setter does the work of truncating any extra array elements if you set the length to a smaller value than before.
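The idea of a length-like property whose setter truncates can be sketched in user code. This is only an illustration with a hypothetical class (TruncatingList, push and toArray are made-up names); it is not how V8 or any other engine implements arrays:

```javascript
// Hypothetical class for illustration; not an engine's actual array implementation.
class TruncatingList {
  #items = [];

  push(item) {
    this.#items.push(item);
  }

  // Reading `list.length` runs this getter.
  get length() {
    return this.#items.length;
  }

  // Writing `list.length = n` runs this setter, which drops any extra elements.
  set length(newLength) {
    if (newLength < this.#items.length) {
      this.#items = this.#items.slice(0, newLength);
    }
  }

  toArray() {
    return [...this.#items];
  }
}

const list = new TruncatingList();
list.push("a");
list.push("b");
list.push("c");
list.length = 2;              // looks like a plain property write
console.log(list.length);     // 2
console.log(list.toArray());  // ["a", "b"]
```

To the caller, list.length reads and writes like an ordinary property, even though a function runs each time.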
3. So what does "abc".length do in V8?
It reads a property directly from memory. All strings in V8 always have a length field internally. As commenters have pointed out, JavaScript strings are immutable, so the internal length field is written only once (when the string is created), and then becomes a read-only property.
Of course this is an internal implementation detail. Hypothetically, an engine could use a "C-style" string format internally, and then it would have to use a strlen()-like function to determine a string's length when needed. However, on a managed heap, being able to quickly determine each object's size is generally important for performance, so I'd be surprised if an engine actually made this choice. "Pascal-style" strings, where the length is stored explicitly, are more suitable for JavaScript and similar garbage-collected languages.
So, in particular, I'd say it's fair to assume that reading myString.length in JavaScript is always a very fast operation regardless of the string's length, because it does not iterate the string.
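What is observable from JavaScript itself is consistent with this. The sketch below says nothing about V8's memory layout; it only inspects how the language exposes length on a String wrapper object (the variable names are chosen for the example):

```javascript
// Object("abc") creates a String wrapper object around the primitive.
const wrapper = Object("abc");
const descriptor = Object.getOwnPropertyDescriptor(wrapper, "length");

console.log(descriptor.value);     // 3
console.log(descriptor.writable);  // false
console.log("get" in descriptor);  // false: a data property, not a getter
```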
4. What about String.length?
Well, this doesn't have anything to do with strings or their lengths :-)
String is a function (e.g. you can call String(123) to get "123"), and all functions have a length property describing their number of formal parameters:
function two_params(a, b) { }
console.log(two_params.length); // 2
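Applied to String itself, a quick check along the same lines shows that String.length is the function's parameter count and unrelated to the length of any particular string:

```javascript
console.log(typeof String);   // "function"
console.log(String(123));     // "123"
console.log(String.length);   // 1
console.log("abc".length);    // 3
```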
As for whether that's a "simple property" or a getter under the hood: there's no reason to assume that it's not a simple property, but there's also no reason to assume that engines can't internally do whatever they want (so long as there's no observable functional difference) if they think it increases performance or saves memory or simplifies things or improves some other metric they care about :-)
(And engines can and do make use of this freedom, for various forms of "lazy"/on-demand computation, caching, optimization -- there are plenty of internal function calls that you probably wouldn't expect, and on the flip side what you "clearly see" as a function call in the JS source might (or might not!) get inlined or otherwise optimized away. The details change over time, and across different engines.)

Length is not a method, it is a property. It doesn't actually do anything but return the length of an array, a string, or the number of parameters expected by a function. When you use .length, you are just asking the JavaScript interpreter to return a variable stored within an object; you are not calling a method.
Also, note that a string's length property gives the number of UTF-16 code units in the string, not the number of characters as a reader would count them. One code unit is 16 bits, as defined by UTF-16 (the encoding JavaScript uses). Some characters need two code units (32 bits), so for a string containing one of them, length is higher than the visible character count.
Link:- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/length
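A small example of the code-unit point, using an emoji outside the Basic Multilingual Plane (the variable names are just for the example):

```javascript
const plain = "abc";
const emoji = "\u{1F600}"; // a single emoji, code point U+1F600

console.log(plain.length);       // 3
console.log(emoji.length);       // 2 (two UTF-16 code units)
console.log([...emoji].length);  // 1 (string iteration walks code points)
```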
One more fact: length behaves very differently on strings than on arrays. Assigning to a string's length has no effect, while assigning to an array's length truncates the array:
let myString = "bluebells";
myString.length = 4;
console.log(myString); //bluebells
console.log(myString.length); //9
//--
let myArr = [5,6,8,2,4,7];
myArr.length = 2;
console.log(myArr); //[5, 6]
console.log(myArr.length); //2
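The silent no-op on the string only happens in non-strict code. As a sketch (tryToSetLength is a made-up helper name), the same assignment inside a strict-mode function throws instead:

```javascript
function tryToSetLength(text) {
  "use strict";
  try {
    text.length = 4;   // a string's length is read-only
    return "no error";
  } catch (err) {
    return err.name;
  }
}

console.log(tryToSetLength("bluebells")); // "TypeError"
```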

Related

Javascript - primitive string implicit conversion to object

I will refer to Mozilla's docs about the String object.
JavaScript automatically converts primitives to String objects, so that it's possible to use String object methods for primitive strings. In contexts where a method is to be invoked on a primitive string or a property lookup occurs, JavaScript will automatically wrap the string primitive and call the method or perform the property lookup.
A good example of such a situation would be accessing the length property:
let word = "Hello";
word.length;
I have understood that what happens in this situation is:
let word = "Hello";
String(word).length;
But after trying some benchmarks it's clear to me that word.length is much faster than String(word).length. It seems that the implicit conversion is something completely different from String(word), and much faster. I cannot find any info about how this implicit conversion works, but it might be helpful to know in some optimization problems.
The reason is probably the time to parse, plus the fact that the engine's internal string object (which is actually C++) works faster than creating an extra interface to interact with the string at the high JS level...
So it's all about optimisation.
There is a good article about it (https://dev.to/promhize/what-you-need-to-know-about-javascripts-implicit-coercion-e23).
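One detail worth checking in the question's mental model: String(word), called as a plain function, returns a primitive string rather than a wrapper object. The wrapping the docs describe is conceptually closer to new String(word) or Object(word), and engines are free to skip creating that object at all. A minimal check, which makes no claim about timings:

```javascript
const word = "Hello";

console.log(typeof word);              // "string"
console.log(typeof String(word));      // "string": still a primitive
console.log(typeof new String(word));  // "object": an actual wrapper object
console.log(typeof Object(word));      // "object": the same kind of wrapper

console.log(word.length === new String(word).length); // true
```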

What's the difference between str.fun() / str.fun / fun(str) in JavaScript?

I tried googling but couldn't find a precise answer, so allow me to try and ask here. If the question does not seem proper, please let me know and I'll delete it.
In JS you've got three different ways of writing certain built-in functionality:
str.length
str.toString()
parseInt(str)
I wonder if there is a reason behind these different ways of writing. As a new user I don't grasp why it couldn't be streamlined as: length(str) / toString(str) / parseInt(str) or with dot formulation.
I however think if I do know the reason behind these differences, it would give me a better understanding of JavaScript.
Length is one of the attributes of string in JavaScript. Hence you use string.length to get the length of the string.
toString is a function for string objects, hence we use stringobj.toString().
parseInt(str) is a global function which takes a string as a parameter.
JavaScript is object-oriented, so there are functions or procedures which require first an object to use as this in their bodies. str.length is a property, both syntactically and semantically. It doesn't require any parameters and represents some quality of the object. obj.toString() is a method (a function attached to an object), which doesn't represent any characteristics of the object, but rather operates on its state, computes some new values, or changes the state of the object a lot. parseInt(str) is a "global" function, which represents an operation not attached to any type or object.
Under the hood, these three ways may well be implemented by just calling a function and passing this as the first parameter (like C# does, for example). The semantic difference is the important one.
So why not use just the third syntax, like for example PHP does? First, it doesn't bloat the global environment with lots of functions which only work for one specific case and type, allowing you to specify any new function you want without breaking the old functionality. Second, it encourages you to use object-oriented concepts, because you can already see working objects and methods in the language, and can try to make something similar.
And why isn't parseInt a method? It can as well be str.toInt() without any issues, it's just the way JavaScript designers wanted it to be, although it seems also a bit logical to me to make it a static method Number.parseInt(str), because the behaviour of the function is relevant more to the Number type than the String type.
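For what it's worth, ES2015 did add such a static method, and it refers to the very same function as the global one:

```javascript
console.log(typeof Number.parseInt);        // "function"
console.log(Number.parseInt === parseInt);  // true
console.log(Number.parseInt("42px", 10));   // 42
```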
JavaScript is based around objects. Objects have properties (e.g. a User object may have name and age properties). These are what define the user and are related to the user. Properties are accessed via dot-notation or brackets notation (to access Eliott’s age, we’ll use either eliott.age or eliott['age'] — these are equivalent).
These properties can be of any type — String, Number, Object, you name it — even functions. Now the proper syntax to call a function in JS is to put round brackets: eliott.sayHello(). This snippet actually fetches Eliott’s sayHello property, and calls it right away.
You can see Eliott as a box of properties, some of which can be functions. They only exist within the box and have no meaning out of the box: what would age be? Whose age? Who’s saying hello?
Now some functions are defined at the global level: parseInt or isNaN for instance. These functions actually belong to the global box, named window (because legacy). You can also call them like that: window.parseInt(a, 10) or window.isNaN(a). Omitting window is allowed for brevity.
var eliott = {
name: 'Eliott',
age: 32,
sayHello: function () { console.log('Hello, I’m Eliott'); }
};
eliott.name; // access the `name` property
eliott.age; // access the `age` property
eliott.sayHello; // access the `sayHello` property
eliott.sayHello(); // access the `sayHello` property and calls the function
sayHello(eliott); // ReferenceError: there is no global `sayHello`!
Note: Some types (String, Number, Boolean, etc.) are not real objects but do have properties. That’s how you can fetch the length of a string ("hello".length) and reword stuff ("hello, Eliott".replace("Eliott", "Henry")).
Behaviour of these expressions is defined in ECMAScript grammar. You could read the specification to understand it thoroughly: ECMAScript2015 specification. However, as pointed out by Bergi, it's probably not the best resource for beginners because it doesn't explain anything, it just states how things are. Moreover I think it might be too difficult for you to be able to grasp concepts described in this specification because of the very formal language used.
Therefore I recommend starting with something way simpler, such as a very basic introduction to JavaScript: JavaScript Basics on MDN. MDN is a great resource.
But to answer your question just briefly:
str.length is accessing a property of the str object.
parseInt(str) is a function call
str.toString() is a call of a function which is a property of the str object. Such functions are usually named methods.
Functions and methods are in fact very similar, but one of the differences (apart from the obvious syntax difference) is that methods by default have their context (this) set to the object they are called on. In this case, inside the toString function, this is equal to str.
Note: Accessing a property (as in str.length) could in effect call a getter function but it depends on how the object is defined, and is in fact transparent for the user.
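The point about this can be sketched with a small object (person, getName and detached are names made up for the example). The function is marked strict so that a call without an object leaves this undefined rather than falling back to the global object:

```javascript
const person = {
  name: "Eliott",
  getName() {
    "use strict";
    // In strict mode, `this` is undefined when the function is called without an object.
    return this === undefined ? "(no this)" : this.name;
  },
};

const detached = person.getName;     // property access only, no call

console.log(person.getName());       // "Eliott"
console.log(detached());             // "(no this)"
console.log(detached.call(person));  // "Eliott"
```

The same function behaves as a method or as a plain function depending only on how it is called.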

Why are Strings and Numbers in JS considered immutable? [duplicate]

If a string is immutable, does that mean that....
(let's assume JavaScript)
var str = 'foo';
alert(str.substr(1)); // oo
alert(str); // foo
Does it mean, when calling methods on a string, it will return the modified string, but it won't change the initial string?
If the string was mutable, does that mean the 2nd alert() would return oo as well?
It means that once you instantiate the object, you can't change its properties. In your first alert you aren't changing foo. You're creating a new string. This is why in your second alert it will show "foo" instead of oo.
Does it mean, when calling methods on a string, it will return the modified string, but it won't change the initial string?
Yes. Nothing can change the string once it is created. Now this doesn't mean that you can't assign a new string object to the str variable. You just can't change the current object that str references.
If the string was mutable, does that mean the 2nd alert() would return oo as well?
Technically, no, because the substring method returns a new string. Making an object mutable wouldn't change the method. Making it mutable means that, technically, you could make it so that substring would change the original string instead of creating a new one.
On a lower level, immutability means that the memory the string is stored in will not be modified. Once you create a string "foo", some memory is allocated to store the value "foo". This memory will not be altered. If you modify the string with, say, substr(1), a new string is created and a different part of memory is allocated which will store "oo". Now you have two strings in memory, "foo" and "oo". Even if you're not going to use "foo" anymore, it'll stick around until it's garbage collected.
One reason why string operations are comparatively expensive.
Immutable means that which cannot be changed or modified.
So when you assign a value to a string, this value is created from scratch as opposed to being replaced. So every time a new value is assigned to the same string, a copy is created. So in reality, you are never changing the original value.
I'm not certain about JavaScript, but in Java, strings take an additional step to immutability, with the "String Constant Pool". Strings can be constructed with string literals ("foo") or with a String class constructor. Strings constructed with string literals are a part of the String Constant Pool, and the same string literal will always be the same memory address from the pool.
Example:
String lit1 = "foo";
String lit2 = "foo";
String cons = new String("foo");
System.out.println(lit1 == lit2); // true
System.out.println(lit1 == cons); // false
System.out.println(lit1.equals(cons)); // true
In the above, both lit1 and lit2 are constructed using the same string literal, so they're pointing at the same memory address; lit1 == lit2 results in true, because they are exactly the same object.
However, cons is constructed using the class constructor. Although the parameter is the same string constant, the constructor allocates new memory for cons, meaning cons is not the same object as lit1 and lit2, despite containing the same data.
Of course, since the three strings all contain the same character data, using the equals method will return true.
(Both types of string construction are immutable, of course)
The text-book definition of mutability is liable or subject to change or alteration.
In programming, we use the word to mean objects whose state is allowed to change over time. An immutable value is the exact opposite – after it has been created, it can never change.
If this seems strange, allow me to remind you that many of the values we use all the time are in fact immutable.
var statement = "I am an immutable value";
var otherStr = statement.slice(8, 17);
I think no one will be surprised to learn that the second line in no way changes the string in statement.
In fact, no string methods change the string they operate on, they all return new strings. The reason is that strings are immutable – they cannot change, we can only ever make new strings.
Strings are not the only immutable values built into JavaScript. Numbers are immutable too. Can you even imagine an environment where evaluating the expression 2 + 3 changes the meaning of the number 2? It sounds absurd, yet we do this with our objects and arrays all the time.
Immutable means the value cannot be changed. Once created, a string object cannot be modified, as it's immutable. If you request a substring of a string, a new String with the requested part is created.
In Java, using StringBuffer while manipulating Strings makes the operation more efficient, as StringBuffer stores the string in a character array, with variables to hold the capacity of the character array and the length of the array (the String in char array form).
From strings to stacks... a simple to understand example taken from Eric Lippert's blog:
Path Finding Using A* in C# 3.0, Part Two...
A mutable stack like System.Collections.Generic.Stack is clearly not suitable. We want to be able to take an existing path and create new paths from it for all of the neighbours of its last element, but pushing a new node onto the standard stack modifies the stack. We’d have to make copies of the stack before pushing it, which is silly because then we’d be duplicating all of its contents unnecessarily.
Immutable stacks do not have this problem. Pushing onto an immutable stack merely creates a brand-new stack which links to the old one as its tail. Since the stack is immutable, there is no danger of some other code coming along and messing with the tail contents. You can keep on using the old stack to your heart’s content.
To go deep on understanding immutability, read Eric's posts starting with this one:
Immutability in C# Part One: Kinds of Immutability
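The immutable stack described in the quote can be sketched in JavaScript. This is a hypothetical minimal version (EMPTY, push and toArray are names invented here), not the code from the blog post, which is in C#:

```javascript
// Hypothetical minimal immutable stack: each node points at the previous stack.
const EMPTY = Object.freeze({ isEmpty: true });

function push(stack, value) {
  // Creates a new node; the old stack is reused as the tail, not copied.
  return Object.freeze({ isEmpty: false, top: value, tail: stack });
}

function toArray(stack) {
  const out = [];
  for (let node = stack; !node.isEmpty; node = node.tail) {
    out.push(node.top);
  }
  return out;
}

const path = push(push(EMPTY, "A"), "B");
const viaC = push(path, "C");
const viaD = push(path, "D");

console.log(toArray(path));            // ["B", "A"]
console.log(toArray(viaC));            // ["C", "B", "A"]
console.log(toArray(viaD));            // ["D", "B", "A"]
console.log(viaC.tail === viaD.tail);  // true: both new stacks share the old one
```

Pushing never touches path, so both extended paths can safely share it as their tail.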
One way to get a grasp of this concept is to look at how JavaScript treats all objects, which is by reference. All objects are mutable after being instantiated, which means that you can add new methods and properties to an object. This matters because if you want an object to be immutable, it must not change after being instantiated.
Try this:
let string = "name";
string[0] = "N";
console.log(string); // name not Name
string = "Name";
console.log(string); // Name
So what that means is that a string is immutable but not constant. In simple words, the variable can be re-assigned, but no part of the string itself can be mutated.
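The reverse combination also exists, which may help separate the two ideas: a const binding cannot be re-assigned, yet the array it refers to can still be mutated. A short sketch (the variable names are only for the example):

```javascript
let text = "name";
text = "Name";               // fine: the variable now refers to a different string
console.log(text);           // "Name"

const letters = ["n", "a"];
letters.push("m");           // fine: arrays are mutable, even behind `const`
console.log(letters);        // ["n", "a", "m"]

let reassignError = null;
try {
  letters = [];              // not fine: a `const` binding cannot be re-assigned
} catch (err) {
  reassignError = err.name;
}
console.log(reassignError);  // "TypeError"
```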

Javascript strings are primitive types?

In JavaScript, numbers, strings and booleans are said to be primitive types.
Primitive types are passed around by copy.
OK consider the following code:
var s1 = "this is a string of 1000 characters ...";
var s2 = s1; // (2)
What happens in line (2)? 1000 characters are copied to the variable s2?
OR is there one memory location and both s1 and s2 refer to this memory location?
I believe the second is true.
If so, why do all books say that strings are primitive types? They are not; they are reference types, aren't they?
What happens in line (2)? 1000 characters are copied to the variable s2? OR is there one memory location and both s1 and s2 refer to this memory location?
It is an implementation detail of the JavaScript engine, there is no way to tell the difference from inside the JavaScript program.
why all books say that strings are primitive types
The language defines them as such.
they are reference types, aren't they?
They might be implemented that way at a level lower than is exposed to JS, but that doesn't matter to the JS author.
What happens in line (2)?
That's more or less implementation defined. To you, it will look like a copy. However, the engine is free to optimise it, and probably will. No doubt, something like copy-on-write.
In JavaScript, there are primitive strings and string objects. It's worth knowing the differences. A string object is passed around by reference, but seeing as all string methods return a new string, you're unlikely to modify it.
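The difference between primitive strings and String objects shows up in comparisons and in aliasing. A small sketch (the variable names and the note property are made up for the example):

```javascript
const primitive = "abc";
const objectA = new String("abc");
const objectB = new String("abc");

console.log(primitive === "abc");   // true: primitives compare by value
console.log(objectA === objectB);   // false: two distinct objects
console.log(objectA == primitive);  // true: the object is converted to a primitive first

const alias = objectA;              // copies the reference, not the characters
alias.note = "shared";
console.log(objectA.note);          // "shared"
```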

Is there a way to customize/override assignment operations in JavaScript?

Every time I assign a string, I'd actually like to assign a string object, without the extra code.
This var foo = "bar";
becomes var foo = new String("bar");
Basically hi-jacking the assignment.
Follow-up:
If the above is not possible is there a way to prototype the string variable type, rather than the String object?
As pointed out by armando, the foo would be a string type, but is essentially a customized array. It would be nice to be able to prototype functions to that class.
No, this is not possible.
If it was possible, you really would not want to do this, at least not globally.
The string variable type does not have all the extra overhead that an object does.
Note: the string that is created (in your case, foo) would still have other properties (e.g. foo.length).
Objects come with a performance hit.
It's not quite what you're looking for, but you may want to look at Overriding assignment operator in JS
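On the follow-up question: property lookups on primitive strings go through String.prototype, so a function added there is callable on plain string values without new String. A hedged sketch, where shout is a made-up method name; note that extending built-in prototypes is widely discouraged in shared code because of naming conflicts:

```javascript
// `shout` is a made-up method name used only for this illustration.
Object.defineProperty(String.prototype, "shout", {
  value: function () {
    return this.toUpperCase() + "!";
  },
  writable: true,
  configurable: true,
  enumerable: false,
});

const foo = "bar";          // still a primitive, no `new String` needed
console.log(typeof foo);    // "string"
console.log(foo.shout());   // "BAR!"
```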
