I'll refer to Mozilla's docs on the String object.
JavaScript automatically converts primitives to String objects, so that it's possible to use String object methods for primitive strings. In contexts where a method is to be invoked on a primitive string or a property lookup occurs, JavaScript will automatically wrap the string primitive and call the method or perform the property lookup.
A good example of such a situation is accessing the length property:
let word = "Hello";
word.length;
My understanding was that what happens in this situation is:
let word = "Hello";
String(word).length;
But after running some benchmarks, it's clear to me that word.length is much faster than String(word).length. It seems the implicit conversion is something completely different from String(word), and much faster. I can't find any information about how this implicit conversion works, but it might be helpful to know for some optimization problems.
The reason is probably parsing time, plus the fact that the engine's internal string object (which is actually implemented in C++) is faster than creating an extra interface to interact with the string at the JavaScript level.
So it's all about optimisation.
There is a good article about it (https://dev.to/promhize/what-you-need-to-know-about-javascripts-implicit-coercion-e23).
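To make the difference concrete, here is a minimal sketch that relies only on standard JavaScript semantics. Calling String(word) as a plain function is an explicit call that returns another primitive, so it is not a model of what happens for word.length; only new String(word) creates a wrapper object. The names converted and boxed are just illustrative.

```javascript
const word = "Hello";

// String(word) called as a plain function returns a primitive string, not an object.
const converted = String(word);
// new String(word) explicitly creates a wrapper object.
const boxed = new String(word);

const typeOfConverted = typeof converted; // "string"
const typeOfBoxed = typeof boxed;         // "object"

// All three report the same length.
const lengthsAgree =
  word.length === converted.length && word.length === boxed.length;

console.log(typeOfConverted, typeOfBoxed, lengthsAgree);
```

So the benchmark compares a property read against a function call followed by a property read, which may explain part of the gap.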
I would like to get a better understanding of what is actually going on when I find the length of a string. I tried looking on W3, ECMA, and at the V8 Ignition website but didn't have much luck.
I keep reading that 'JavaScript treats primitive values as objects when executing methods and properties.' But, I can't seem to find out how exactly this happens. If I call a method/property on a primitive which, I assume gets interpreted as an object by Ignition, doesn't the String class need to call a function at some point to iterate the string? I feel like myString.length should be called a method and String.length could MAYBE be called a property, depending on at which point the "property" is found and how it's found.
Basically, I don't understand why it's touted as a property if it doesn't seem to be inherent and has to be fetched/determined. That seems like a method to me (let alone the fact that string.length isn't even a real thing and is interpreted).
(V8 developer here.)
I can see several issues here that can be looked at separately:
1. From a language specification perspective, is something a method or a property?
Intuitively, the distinction is: if you write a function call like obj.method(), then it's a method; if you write obj.property (no ()), then it's a property.
Of course in JavaScript, you could also say that everything is a property, and in case the current value of the property is a function, then that makes it a method. So obj.method gets you a reference to that function, and obj.method() gets and immediately calls it:
var obj = {};
obj.foo = function() { console.log("function called"); return 42; }
var x = obj.foo(); // A method!
var func = obj.foo; // A property!
x = func(); // A call!
obj.foo = 42;
obj.foo(); // A TypeError!
2. When it looks like a property access, is it always a direct read/write from/to memory, or might some function get executed under the hood?
The latter. JavaScript itself even provides this capability to objects you can create:
var obj = {};
Object.defineProperty(obj, "property", {
get: function() { console.log("getter was called"); return 42; },
set: function(x) { console.log("setter was called"); }
});
// *Looks* like a pair of property accesses, but will call getter and setter:
obj.property = obj.property + 1;
The key is that users of this obj don't have to care that getters/setters are involved, to them .property looks like a property. This is of course very much intentional: implementation details of obj are abstracted away; you could modify the part of the code that sets up obj and its .property from a plain property to a getter/setter pair or vice versa without having to worry about updating other parts of the code that read/write it.
Some built-in objects rely on this trick, the most common example is arrays' .length: while it's specified to be a property with certain "magic" behavior, the most straightforward way for engines to implement this is to use a getter/setter pair under the hood, where in particular the setter does the work of truncating any extra array elements if you set the length to a smaller value than before.
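As a rough illustration of that idea, here is a sketch of a made-up class (TinyList is hypothetical and says nothing about how any engine really implements arrays) whose length looks like a plain property but is backed by a getter/setter pair, with the setter truncating extra elements.

```javascript
// Hypothetical class, not an engine internal: mimics an array-like .length with accessors.
class TinyList {
  constructor() {
    this._items = [];
  }
  push(value) {
    this._items.push(value);
  }
  get length() {
    return this._items.length;
  }
  set length(newLength) {
    // Only truncation is modelled here; growing is ignored in this sketch.
    if (newLength < this._items.length) {
      this._items.splice(newLength);
    }
  }
  toArray() {
    return this._items.slice();
  }
}

const list = new TinyList();
[5, 6, 8, 2].forEach((n) => list.push(n));
list.length = 2; // looks like a property write, but runs the setter
const remaining = list.toArray();
console.log(remaining, list.length);
```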
3. So what does "abc".length do in V8?
It reads a property directly from memory. All strings in V8 always have a length field internally. As commenters have pointed out, JavaScript strings are immutable, so the internal length field is written only once (when the string is created), and then becomes a read-only property.
Of course this is an internal implementation detail. Hypothetically, an engine could use a "C-style" string format internally, and then it would have to use a strlen()-like function to determine a string's length when needed. However, on a managed heap, being able to quickly determine each object's size is generally important for performance, so I'd be surprised if an engine actually made this choice. "Pascal-style" strings, where the length is stored explicitly, are more suitable for JavaScript and similar garbage-collected languages.
So, in particular, I'd say it's fair to assume that reading myString.length in JavaScript is always a very fast operation regardless of the string's length, because it does not iterate the string.
4. What about String.length?
Well, this doesn't have anything to do with strings or their lengths :-)
String is a function (e.g. you can call String(123) to get "123"), and all functions have a length property describing their number of formal parameters:
function two_params(a, b) { }
console.log(two_params.length); // 2
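A few more data points along the same lines, as a small sketch (the function names are made up for illustration). It shows that String.length is a function's parameter count, while "abc".length is the length of an actual string.

```javascript
function twoParams(a, b) {}
function withDefault(a, b = 1) {}  // parameters from the first default onward are not counted
function withRest(a, ...others) {} // the rest parameter is not counted

const counts = {
  stringFn: String.length,         // 1: the String function declares one parameter
  twoParams: twoParams.length,     // 2
  withDefault: withDefault.length, // 1
  withRest: withRest.length,       // 1
  text: "abc".length               // 3: this one is a string's length
};

console.log(counts);
```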
As for whether that's a "simple property" or a getter under the hood: there's no reason to assume that it's not a simple property, but there's also no reason to assume that engines can't internally do whatever they want (so long as there's no observable functional difference) if they think it increases performance or saves memory or simplifies things or improves some other metric they care about :-)
(And engines can and do make use of this freedom, for various forms of "lazy"/on-demand computation, caching, optimization -- there are plenty of internal function calls that you probably wouldn't expect, and on the flip side what you "clearly see" as a function call in the JS source might (or might not!) get inlined or otherwise optimized away. The details change over time, and across different engines.)
Length is not a method, it is a property. It doesn't actually do anything but return the length of an array, a string, or the number of parameters expected by a function. When you use .length, you are just asking the JavaScript interpreter to return a variable stored within an object; you are not calling a method.
Also, note that a string's length property gives the number of UTF-16 code units in the string, not the number of characters. One code unit is 16 bits, as defined by UTF-16 (the encoding JavaScript uses). Some characters need two code units (32 bits), so for a string containing one of these characters, length is higher than the number of characters you see.
Link: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/length
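A quick sketch of the code-unit point, using an emoji written as an escape so the snippet doesn't depend on file encoding. Spreading a string iterates by code point, which is closer to a character count, though it still isn't the same as user-perceived characters in every case.

```javascript
// U+1F600 lies outside the Basic Multilingual Plane, so it is stored as a surrogate pair.
const withEmoji = "hi\u{1F600}";

const codeUnits = withEmoji.length;       // counts UTF-16 code units: 4
const codePoints = [...withEmoji].length; // string iteration goes by code point: 3

console.log(codeUnits, codePoints);
```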
Also, note that length behaves very differently on strings than on arrays:
let myString = "bluebells";
myString.length = 4;
console.log(myString); //bluebells
console.log(myString.length); //9
//--
let myArr = [5,6,8,2,4,7];
myArr.length = 2;
console.log(myArr); //[5, 6]
console.log(myArr.length); //2
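The string assignment above is silently ignored only in sloppy mode. As a sketch (the helper name tryTruncate is made up), the same assignment inside a strict-mode function throws a TypeError, and slice is the usual way to get a shorter string.

```javascript
function tryTruncate(text) {
  "use strict";
  try {
    text.length = 4; // a string's length is read-only
    return "no error";
  } catch (err) {
    return err instanceof TypeError ? "TypeError" : "other error";
  }
}

const outcome = tryTruncate("bluebells");  // "TypeError"
const shortened = "bluebells".slice(0, 4); // "blue", a new string

console.log(outcome, shortened);
```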
I know that you can use String methods and properties on string primitives, as JavaScript temporarily treats string primitives as instances of String:
n = "string";
n2 = n.toUpperCase();
console.log(n2);
Behind the scenes, is JavaScript treating the primitive by doing something like this:
new String(stringPrimitive)? E.g.
n = "string";
n2 = new String(n).toUpperCase();
console.log(n2);
You can instantiate the built-in String object and then apply any string method. For example:
let str = new String('stringPrimitive')
console.log(str)
console.log(str.split(''))
The term for this behavior is boxing.
In theory, it does exactly that: it creates a new String object and uses that. However, it should be noted that in reality engines can take a lot of shortcuts internally and make many optimizations.
When reading the spec for 6.2.4.8 GetValue, you can see this boxing in the steps (use ToObject(base), if it's a primitive), as well as an explicit note towards the permitted optimization:
The object that may be created in step 5.a.ii is not accessible outside of the above abstract operation and the ordinary object [[Get]] internal method. An implementation might choose to avoid the actual creation of the object.
Therefore, to know what really happens, you'd probably have to read the code of the engine in question. For normal JavaScript programmers, knowing the engine internals is imho not necessary, but if you want to anyway: V8, for example, is open source (written in C++).
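For what it's worth, the observable side of boxing can be sketched without touching engine internals. Object(primitive) performs the same kind of conversion to a wrapper object explicitly; whether an engine actually allocates such an object for a method call is, as noted, up to the implementation.

```javascript
const primitive = "string";
const wrapper = Object(primitive); // explicit conversion of a primitive to a String object

const results = {
  primitiveType: typeof primitive,                 // "string"
  wrapperType: typeof wrapper,                     // "object"
  isStringInstance: wrapper instanceof String,     // true
  sameText: wrapper.valueOf() === primitive,       // true
  // Every explicit boxing produces a distinct object.
  distinctWrappers: Object(primitive) !== Object(primitive),
  // The method found via the primitive is the one on String.prototype.
  sameMethod: primitive.toUpperCase === String.prototype.toUpperCase,
  sameResult: primitive.toUpperCase() === wrapper.toUpperCase()
};

console.log(results);
```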
In JavaScript, numbers, strings and booleans are said to be primitive types.
Primitive types are passed around by copy.
OK consider the following code:
var s1 = "this is a string of 1000 characters ...";
var s2 = s1; // (2)
What happens in line (2)? 1000 characters are copied to the variable s2?
OR is there one memory location and both s1 and s2 refer to this memory location?
I believe the second is true.
If so, why do all books say that strings are primitive types? They are not; they are reference types, aren't they?
What happens in line (2)? 1000 characters are copied to the variable s2? OR is there one memory location and both s1 and s2 refer to this memory location?
It is an implementation detail of the JavaScript engine, there is no way to tell the difference from inside the JavaScript program.
why all books say that strings are primitive types
The language defines them as such.
they are reference types, aren't they?
They might be implemented that way at a level lower than is exposed to JS, but that doesn't matter to the JS author.
What happens in line (2)?
That's more or less implementation defined. To you, it will look like a copy. However, the engine is free to optimise it, and probably will. No doubt, something like copy-on-write.
In JavaScript, there are primitive strings and string objects. It's worth knowing the differences. A string object is passed around by reference, but seeing as all string methods return a new string, you're unlikely to modify it.
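A small sketch of the difference you can actually observe from JavaScript: primitive strings compare by content, so it never matters whether the engine copied or shared them, while String objects compare by identity.

```javascript
const s1 = "this is a string";
const s2 = s1;
const rebuilt = ["this is a", "string"].join(" "); // same text, built separately

// Primitives are compared by content; copy vs. shared storage is unobservable.
const primitivesEqual = s1 === s2 && s1 === rebuilt; // true

const o1 = new String("text");
const o2 = new String("text");
const o3 = o1;

const objectsEqual = o1 === o2; // false: two distinct objects
const sameObject = o1 === o3;   // true: same reference
const looseEqual = o1 == "text"; // true: the object is converted to a primitive first

console.log(primitivesEqual, objectsEqual, sameObject, looseEqual);
```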
I am making a small search engine where strings are searched often. Since JavaScript converts a string primitive (declared like var thatString = "a string") to an object when we use methods like indexOf on it, and then back to a primitive, I thought that converting all the primitive strings to objects with var aString = new String("aString") in the array of strings to be analysed could bring a speed advantage. But is it really worth it?
The search engine prototype can be seen at http://bottinbio.com and its code (open source) at http://ogfor.com/bottinbio/code.js
Even when we do not add a property to the String object made with new String, the primitive is much faster (43-45%) in Firefox and Chrome on Ubuntu.
Thanks DhruvPathak for the link to jsperf.com
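For anyone who wants to repeat the comparison locally, here is a rough sketch of such a test. The helper timeSearch and the sample words are made up for illustration, and this is not the original jsperf test; the timings depend heavily on the engine, so treat them as indicative only.

```javascript
// Hypothetical helper: counts how many entries contain `needle` and times the loop.
function timeSearch(entries, needle, rounds) {
  const start = Date.now();
  let hits = 0;
  for (let r = 0; r < rounds; r++) {
    for (const entry of entries) {
      if (entry.indexOf(needle) !== -1) hits++;
    }
  }
  return { hits, ms: Date.now() - start };
}

const words = ["organic farm", "bakery", "farm shop", "market"];
const asPrimitives = words;
const asObjects = words.map((w) => new String(w)); // pre-boxed String objects

const primitiveRun = timeSearch(asPrimitives, "farm", 10000);
const objectRun = timeSearch(asObjects, "farm", 10000);

// Both runs find the same matches; only the elapsed time may differ.
console.log(primitiveRun, objectRun);
```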
Does javascript use immutable or mutable strings? Do I need a "string builder"?
They are immutable. You cannot change a character within a string with something like var myString = "abbdef"; myString[2] = 'c'. The string manipulation methods such as trim, slice return new strings.
In the same way, if you have two references to the same string, modifying one doesn't affect the other:
let a = "hello";
let b = a; // declared separately so b is not an implicit global
a = a + " world";
// b is not affected
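Here is a sketch of the index-assignment case mentioned above (the helper name tryReplaceChar is made up). The assignment is silently ignored in sloppy mode and throws a TypeError in strict mode, so the helper catches the error to behave the same either way.

```javascript
function tryReplaceChar(text, index, ch) {
  try {
    text[index] = ch; // ignored in sloppy mode, TypeError in strict mode
  } catch (err) {
    // Strict mode ends up here; the string is unchanged either way.
  }
  return text;
}

const original = "abbdef";
const afterAttempt = tryReplaceChar(original, 2, "c"); // still "abbdef"

// To "change" a character, build a new string instead.
const fixed = original.slice(0, 2) + "c" + original.slice(3); // "abcdef"

console.log(afterAttempt, fixed);
```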
However, I've always heard what Ash mentioned in his answer (that using Array.join is faster for concatenation) so I wanted to test out the different methods of concatenating strings and abstracting the fastest way into a StringBuilder. I wrote some tests to see if this is true (it isn't!).
This was what I believed would be the fastest way, though I kept thinking that adding a method call may make it slower...
function StringBuilder() {
this._array = [];
this._index = 0;
}
StringBuilder.prototype.append = function (str) {
this._array[this._index] = str;
this._index++;
}
StringBuilder.prototype.toString = function () {
return this._array.join('');
}
Here are performance speed tests. All three of them create a gigantic string made up of concatenating "Hello diggity dog" one hundred thousand times into an empty string.
I've created three types of tests
Using Array.push and Array.join
Using Array indexing to avoid Array.push, then using Array.join
Straight string concatenation
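The three approaches can be sketched roughly as below. This is a reconstruction for illustration, not the code from the jsperf page; the function names are made up and the count is kept small so it runs quickly.

```javascript
const PART = "Hello diggity dog";

// 1. Array.push, then Array.join
function buildWithPush(count) {
  const parts = [];
  for (let i = 0; i < count; i++) parts.push(PART);
  return parts.join("");
}

// 2. Array indexing instead of push, then Array.join
function buildWithIndex(count) {
  const parts = [];
  for (let i = 0; i < count; i++) parts[i] = PART;
  return parts.join("");
}

// 3. Straight string concatenation
function buildWithConcat(count) {
  let result = "";
  for (let i = 0; i < count; i++) result += PART;
  return result;
}

const viaPush = buildWithPush(1000);
const viaIndex = buildWithIndex(1000);
const viaConcat = buildWithConcat(1000);

// All three produce the same string; they differ only in how it is built.
console.log(viaPush === viaIndex && viaIndex === viaConcat, viaConcat.length);
```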
Then I created the same three tests by abstracting them into StringBuilderConcat, StringBuilderArrayPush and StringBuilderArrayIndex http://jsperf.com/string-concat-without-sringbuilder/5 Please go there and run tests so we can get a nice sample. Note that I fixed a small bug, so the data for the tests got wiped, I will update the table once there's enough performance data. Go to http://jsperf.com/string-concat-without-sringbuilder/5 for the old data table.
Here are some numbers (latest update in March 2018), if you don't want to follow the link. The number for each test is in thousands of operations per second (higher is better).
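| Browser | Index | Push | Concat | SBIndex | SBPush | SBConcat |
|---|---|---|---|---|---|---|
| Chrome 71.0.3578 | 988 | 1006 | 2902 | 963 | 1008 | 2902 |
| Firefox 65 | 1979 | 1902 | 2197 | 1917 | 1873 | 1953 |
| Edge | 593 | 373 | 952 | 361 | 415 | 444 |
| Exploder 11 | 655 | 532 | 761 | 537 | 567 | 387 |
| Opera 58.0.3135 | 1135 | 1200 | 4357 | 1137 | 1188 | 4294 |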
Findings
Nowadays, all evergreen browsers handle string concatenation well. Array.join only helps IE 11.
Overall, Opera is fastest; string concatenation there is about 4 times as fast as Array.join.
Firefox is second, and Array.join is only slightly slower in Firefox but considerably slower (3x) in Chrome.
Chrome is third, but string concatenation is 3 times faster than Array.join.
Creating a StringBuilder seems not to affect performance too much.
Hope somebody else finds this useful
Different Test Case
Since @RoyTinker thought that my test was flawed, I created a new case that doesn't create a big string by concatenating the same string; it uses a different character for each iteration. String concatenation still seemed faster or just as fast. Let's get those tests running.
I suggest everybody should keep thinking of other ways to test this, and feel free to add new links to different test cases below.
http://jsperf.com/string-concat-without-sringbuilder/7
From the Rhino book:
In JavaScript, strings are immutable objects, which means that the
characters within them may not be changed and that any operations on
strings actually create new strings. Strings are assigned by
reference, not by value. In general, when an object is assigned by
reference, a change made to the object through one reference will be
visible through all other references to the object. Because strings
cannot be changed, however, you can have multiple references to a
string object and not worry that the string value will change without
your knowing it
Just to clarify for simple minds like mine (from MDN):
Immutables are the objects whose state cannot be changed once the object is created.
Strings and numbers are immutable.
Immutable means that:
You can make a variable name point to a new value, but the previous value is still held in memory. Hence the need for garbage collection.
var immutableString = "Hello";
// In the above code, a new object with string value is created.
immutableString = immutableString + "World";
// We are now appending "World" to the existing value.
This looks like we're mutating the string 'immutableString', but we're not. Instead:
On appending a string value to "immutableString", the following events occur:
Existing value of "immutableString" is retrieved
"World" is appended to the existing value of "immutableString"
The resultant value is then allocated to a new block of memory
"immutableString" object now points to the newly created memory space
Previously created memory space is now available for garbage collection.
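The steps above can be observed with a second variable that keeps pointing at the original value. This is only a sketch of the visible behavior (the names are illustrative); where and when memory is actually allocated or collected is up to the engine.

```javascript
let greeting = "Hello";
const earlier = greeting; // a second reference to the original value

greeting = greeting + "World"; // builds a new string and rebinds the variable

// The original value was not modified; only the variable now refers to a new string.
console.log(earlier);  // Hello
console.log(greeting); // HelloWorld
```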
Performance tip:
If you have to concatenate large strings, you can put the string parts into an array and use the Array.join() method to get the overall string. This can be many times faster for concatenating a large number of strings in some engines, so benchmark in your target browsers.
There is no StringBuilder in JavaScript.
The string type value is immutable, but the String object, which is created by using the String() constructor, is mutable, because it is an object and you can add new properties to it.
> var str = new String("test")
undefined
> str
[String: 'test']
> str.newProp = "some value"
'some value'
> str
{ [String: 'test'] newProp: 'some value' }
Meanwhile, although you can add new properties, you can't change the already existing properties.
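A sketch of that last point using Reflect.set, which reports failure as a return value instead of throwing, so it behaves the same in sloppy and strict mode. The variable names are illustrative.

```javascript
const strObj = new String("test");

// Adding a brand-new property works, because strObj is an ordinary extensible object.
strObj.newProp = "some value";

// The built-in properties are read-only.
const lengthDescriptor = Object.getOwnPropertyDescriptor(strObj, "length");
const lengthChanged = Reflect.set(strObj, "length", 1); // false: length is not writable
const charChanged = Reflect.set(strObj, "0", "b");      // false: indexed characters are not writable

console.log(lengthDescriptor.writable, lengthChanged, charChanged, strObj.toString());
```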
In conclusion,
1. All string-type values (primitives) are immutable.
2. The String object is mutable, but the string-type value (primitive) it contains is immutable.
Strings are immutable – they cannot change, we can only ever make new strings.
Example:
var str = "Immutable value"; // it is immutable
var other = str.slice(2, 10); // new string; str itself is unchanged
Regarding your question (in your comment to Ash's response) about the StringBuilder in ASP.NET Ajax the experts seem to disagree on this one.
Christian Wenz says in his book Programming ASP.NET AJAX (O'Reilly) that "this approach does not have any measurable effect on memory (in fact, the implementation seems to be a tick slower than the standard approach)."
On the other hand Gallo et al say in their book ASP.NET AJAX in Action (Manning) that "When the number of strings to concatenate is larger, the string builder becomes an essential object to avoid huge performance drops."
I guess you'd need to do your own benchmarking and results might differ between browsers, too. However, even if it doesn't improve performance it might still be considered "useful" for programmers who are used to coding with StringBuilders in languages like C# or Java.
It's a late post, but I didn't find a good book quote among the answers.
Here's a definitive excerpt from a reliable book:
Strings are immutable in ECMAScript, meaning that once they are created, their values cannot change. To change the string held by a variable, the original string must be destroyed and the variable filled with another string containing a new value...
—Professional JavaScript for Web Developers, 3rd Ed., p.43
Now, the answer that quotes the Rhino book's excerpt is right about string immutability but wrong in saying "Strings are assigned by reference, not by value." (they probably meant to put the words the other way around).
The "reference/value" misconception is clarified in the "Professional JavaScript", chapter named "Primitive and Reference values":
The five primitive types...[are]: Undefined, Null, Boolean, Number, and String. These variables are said to be accessed by value, because you are manipulating the actual value stored in the variable.
—Professional JavaScript for Web Developers, 3rd Ed., p.85
that's opposed to objects:
When you manipulate an object, you’re really working on a reference to that object rather than the actual object itself. For this reason, such values are said to be accessed by reference.
—Professional JavaScript for Web Developers, 3rd Ed., p.85
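The contrast between the two quotes can be sketched like this (variable names are illustrative): changing a string through one variable never shows up through another, while changing an object through one reference does.

```javascript
// Primitive string: "changing" it through one variable rebinds that variable only.
let first = "text";
let second = first;
second += "!";

// Object: both variables refer to the same object, so a change is visible through both.
const box1 = { label: "text" };
const box2 = box1;
box2.label += "!";

console.log(first, second);           // text text!
console.log(box1.label, box2.label);  // text! text!
```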
JavaScript strings are indeed immutable.
Strings in JavaScript are immutable.