Accessing Array beyond its size in javascript

Accessing Array beyond its size in javascript - javascript

I read in a particular book that an array in JavaScript can hold 4,294,967,295 items and would throw exception if the number reaches beyond that.
I tried out the functionality using the following code:
var a = ["a","b","c"];
a[4294967300] = "d";
console.log(a[4294967300]);
It shows the output "d" and no exception or error. Am I missing something here? Can someone put some light on the topic and share some knowledge regarding max array items in JavaScript and various scenarios related to it?

An array doesn't have to hold all the items from 0 to N to contain one with index N.
That's because arrays in JavaScript engines can switch to a dictionnary mode when the holes are too big, those arrays are called sparse arrays (vs dense arrays).
It's important to know this distinction because the implementation is leaking on one point : performance. You should read this on this topic : http://www.html5rocks.com/en/tutorials/speed/v8/
But regarding indexes starting at 2³², sebcap26 is right, there's a distinction due to the fact the index is handled as a string. This distinction is important and can be verified by logging a.length : you'll see the length isn't impacted by such an element. There's no exception or error per se but it makes it impossible to use normal array operations like iterating up to the length or using array functions like map or filter (the elements with index greater than the numeric index limit are ignored by those functions).

If I understand well the ECMAScript specifications, an index which is not in [0 .. 2^32-1] is converted into a String and used as an Object key, not as an Array index.
A property name P (in the form of a String value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32−1.

Try running this code: fiddle : http://jsfiddle.net/vXtfE/
var a = ["a","b","c"];
a[4294967300] = "d";
console.log(a.length);
console.log(a);
console.log(a[4294967300]);
You will see this output:
3
["a", "b", "c", 4294967300: "d"]
d
The initial items get stored as array elements, but for large index, the storage changes to hash based sparse array. Hence, it is a mix of both in your case.
Good explanation of this :
Why is array.push sometimes faster than array[n] = value?

Related

How does native sort method deal with sparse arrays

When a sparse array is sorted [5, 2, 4, , 1].sort() -> [1, 2, 4, 5, empty], the empty value is always last even with callback no matter the return statement.
I'm building my own sort method as a challenge and I solved this problem by using filter method since filter skips empty values. Then iterate over filtered array and set original array's index to filtered array's values. Then I shorten the original array's length since the remaining items will be duplicates, and I can finally feed it in my sorting algorithm. Once that's done, then I set it's length back to original which adds appropriate amount of empty items at the end. Here's a snippet of code, but here's a link of the entire code
const collectUndefined = [];
// remove empty items and collect undefined
const removeSparse = this.filter(el => {
if (el === undefined) {
collectUndefined.push(el);
}
return el !== undefined;
});
const tempLength = this.length;
// reset values but will contain duplicates at the end
for (let i = 0; i < removeSparse.length; i++) {
this[i] = removeSparse[i];
}
// shorten length which will remove extra duplicates
this.length = removeSparse.length;
// sort algorithm ...
// place undefineds back into the array at the end
this.push(...collectUndefined);
// restores original length and add empty elemnts at the end
this.length = tempLength;
return this
Is the native sort implemented in this similar fashion when dealing with sparse arrays, or no.

When it comes to implementation of Array.sort you have to also ask which engine? They are not all equal in terms of how they end up getting to the final sorted version of the array. For example V8 has a pre-processing and post-processing step before it does any sorting:
V8 has one pre-processing step before it actually sorts anything and
also one post-processing step. The basic idea is to collect all
non-undefined values into a temporary list, sort this temporary list
and then write the sorted values back into the actual array or object.
This frees V8 from caring about interacting with accessors or the
prototype chain during the sorting itself.
You can find pretty detailed explanation of the entire process V8 goes through here
The actual source code for the V8 sort (using Timsort) can be found here and is now in Torque language.
The js tests for V8 Array.sort can be seen here
Bottom line however is that nothing is actually removed from the original array since it should not be. Sort is not supposed to mutate the original array. That would be super weird if you call myArray.sort() and all of a sudden it has 5 elements less from its 8 total (for example). That is not something you would find in any Array.sort specs.
Also Array.sort pays close attention to the types it sorts and orders them specifically. Example:
let arr = [4,2,5,,,,3,false,{},undefined,null,0,function(){},[]]
console.log(arr.sort())
Notice in the output above how array is first, followed by numeric values, object literal, Boolean, function, null and then undefined / empty. So if you want to really match the spec you would have to consider how different types are also sorted.

Why can't we provide size of array in Javascript?

Why can't we provide size of array in JavaScript?
I mean even if it is possible why don't we why we just simply define the array.

Because standard arrays in JavaScript aren't really arrays at all (spec | post on my blog), they're just objects backed by Array.prototype with special handling for a class of property names ("array indexes"), a special length property, and a built-in literal notation. They aren't contiguous blocks of memory as in some other languages (barring optimization, of course).
I have a question in my mind about why can't we provide size of array in JavaScript ??
You can create an array with a given length via Array(n) where n is the length as a number. But again, it doesn't preallocate memory for that many slots or anything. You just end up with a sparse array with length set to n and no entries in it:
var a = Array(42);
console.log(a.length); // 42
console.log(0 in a); // false, it doesn't have an entry 0
a.forEach(function(entry) { // Never calls the callback
console.log(entry); // because the array is empty
});
I mean even if it is possible why don't we why we just simply define the array.
Because it serves no purpose.
Now, for typed arrays (Uint8Array and similar), we do indeed create them with a specific length (var a = new Uint8Array(42);), and that length is fixed (cannot change), because they're true arrays.

You can provide size of array. If its not given, you can add multiple values dynamically.
var arr = new Array(5);

You can provide size of array in java-script.
Java-script array is different from array in C language.
You can read more on following link
understanding-javascript-arrays

You can provide a size of an array and that size of an array will not change in the program.
Array is an object backed by Array.prototype, so there is a function called seal.
var myArray = Object.seal([5, 6, "saurabh", "text"]); // this is an array of size 4 fixed.
//myArray.push('new text'); //throw exception error
console.log(myArray[2]); //"saurabh"
myArray[0] = "change text";
console.log("print myArray: ", myArray);
You can read more over here:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/seal

Is it an antipattern to set an array length in JavaScript?

Is it bad to use code like:
var a = [1,2,3,4];
a.length = 2; // 3 and 4 are removed
Does it have decent browser support? Do the removed values get garbage collected properly?

Does it have decent browser support?
Yes. This has been present since the very first edition of ECMAScript:
Specifically, whenever a property is added whose name is an array index, the
length property is changed, if necessary, to be one more than the numeric value of that array index; and whenever
the length property is changed, every property whose name is an array index whose value is not smaller than the
new length is automatically deleted.
Standard ECMA-262, 15.4
In ECMAScript 5, this was moved to 15.4.5.2.
Attempting to set the length property of an Array object to a value that is numerically less than or equal to the largest numeric property name of an existing array indexed non-deletable property of the array will result in the length being set to a numeric value that is one greater than that largest numeric property name.
Standard ECMA-262, 15.4.5.2
All browsers support this behavior.
Do the removed values get garbage collected properly?
Yes. (See quotes above, and 15.4.5.1.)
Is it an antipattern to set array length in Javascript?
No, not really. While arrays are "exotic objects" in ES6-speak, the real crazily unique thing about arrays are their indexing, not setting the length property. You could replicate the same behavior with a property setter.
It's true that property setters with non-subtle effects are somewhat unusual in JS, as their side-effects can be unobvious. But since .length has been there in Javascript from Day 0, it should be fairly widely understood.
If you want an alternative, use .splice():
// a is the array and n is the eventual length
a.length = n;
a.splice(n, a.length - n); // equivalent
a.splice(n, a.length); // also equivalent
If I were avoid setting .length for some reason, it would be because it mutates the array, and I prefer immutable programming where reasonable. The immutable alternative is .slice().
a.slice(0, n); // returns a new array with the first n elements

Arrays are exotic objects.
An exotic object is any form of object whose property semantics differ in any way from the default semantics.1
The property semantics for arrays are special, in this case that changing length affects the actual contents of the array.
An Array object is an exotic object that gives special treatment to array index property keys [...] 2
The behavior of the specific property key for this array exotic object is outlined as well.
[...] Every Array object has a length property whose value is always a nonnegative integer less than 2^32. The value of the length property is numerically greater than the name of every own property whose name is an array index; whenever an own property of an Array object is created or changed, other properties are adjusted as necessary to maintain this invariant [...]
This section is detailing that the length property is a 32 bit integer. It is also saying that if an array index is added, the length is changed. What this is implying is that when an own property name that is not an index is used, the length is not changed and also that names which are not indexes are considered numerically less than the indexes. This means that if you have an array and also add a string (not implicitly numeric either, as in not "3") property to it, that changing the length to delete elements will not remove the value associated with the string property. For example,
var a = [1,2,3,4];
a.hello = "world";
a.length = 2; // 3 and 4 are removed
console.log(a.hello);//"world"
[...] Specifically, whenever an own property is added whose name is an array index, the value of the length property is changed, if necessary, to be one more than the numeric value of that array index; [...]
In addition to the truncation, expansion is also available by use of an index. If an index value is used (an integer basically) then the length will be updated to reflect that change. As arrays in JavaScript are sparse (as in, no gaps allowed) this means that adding a value for a larger index can make an array rather larger.
var a = [1,2,3,4];
a[50] = "hello";
console.log(a.length);//51
[...] and whenever the value of the length property is changed, every own property whose name is an array index whose value is not smaller than the new length is deleted.
Finally, this is the specific aspect in question which the OP raises. When the length property of an array is modified, it will delete every index and value which is numerically more than the new length value minus 1. In the OP's example, we can clearly see that changing the length to 2 removed the values at index 2 and 3, leaving only the first two values who had index 0 and 1.
The algorithm for the deletion mentioned above can be found in the ArraySetLength(A, Desc) definition.
b. Let deleteSucceeded be A.[[Delete]](ToString(oldLen)). 3
Which is converting the index to a string and using the delete behavior for objects on the index. At which point the entire property is removed. As this internally reaches the delete call, there is no reason to believe that it will leak memory or even that it is an anti pattern as it is explicitly described in the language specification.
In conclusion, Does it have decent browser support? Yes. Do the removed values get garbage collected properly? Yes.
However, is it an anti-pattern? This could be argued either way. What it really breaks down to is the level of familiarity with the language of others who would be using the same code which is a nice way of saying that it may not have the best readability. On the other hand, since JavaScript does take up bandwidth or memory with each character used (hence the need for minification and bundling) it can be useful to take this approach to truncate an array. More often than not, worrying about this type of minutia is going to be a micro-optimization and the use of this special property should be considered on a case to case basis in context with the overall design of the related code.
1. The Object Type ECMA 6
2. Array Exotic Objects ECMA 6
3. ArraySetLength(A, Desc) ECMA 6

As it is writeable https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/length (seen here) It should be okay to use it that way
You can set the length property to truncate an array at any time. When you extend an array by changing its length property, the number of actual elements does not increase; for example, if you set length to 3 when it is currently 2, the array still contains only 2 elements.
There's even an example how to shorten an array, using this on Mozilla Developer Network
if (statesUS.length > 50) {
statesUS.length = 50;
}

It's better to use slice because your intentions are clear.
var truncated = a.slice(0,2)
Slice does create a new array though.
If that is an issue for you then there is nothing wrong with modifying the length property.
It's actually the fastest way of truncation and supported in all browsers. jsperf

Is it an antipattern to set array length in JavaScript?
No. It would be OK to set it to truncate the array.
Does it have decent browser support?
Yes, see the compatiblity matrix here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/length
Do the removed values get garbage collected properly?
Yes, they should. Will they actually get gced - you will need to profile the JavaScript code if this is a real concern.

This way of truncating an array is mentioned in JavaScript: Good Parts, so it is not an anti-pattern.

It is not an anti pattern. It is supported in all major browsers.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/length

Is an array in JS actually filled with empty spaces or when printed is it filled with zeros?

I have an array:
mydata =[];
i am storing values in the array by using a numeric key that can be somewhat big
mydata[13525] = 1;
However if i only store one item as the above, when i print the array in the console it prints 13524 commas before the object, nevertheless the debugger tells me it has length 13526 and there is only one element in the array.
I am confused as i understood that JS arrays do not required every position to be filled and certainly this might consume a lot of memory on bigger numbers. Can you explain please?

The .length property of an array represents the highest index plus one in the array that has been assigned a value.
When you do not assign intermediate values (and thus have a sparse array), the intervening values that you have not assigned are undefined which is essentially the "nothing here" value in Javascript. The console chooses to display arrays by trying to show you every value between 0 and the end of the array which is simply not a very good way to display a sparse array that is mostly empty. That's more an artifact of a design choice in the console than anything else. One could design a different way to display contents of an array that would handle sparse arrays differently.
Arrays are most efficient if you use consecutive indexes started from 0. That's what they are mostly designed for and what many implementations are optimized for since a runtime can do some optimized things if it knows there is a sequential set of values.
If you know you're going to mostly not be using sequences of numeric indexes starting from 0 and as such the .length property is of little use to you, then perhaps a plain object with arbitrary properties is a better choice.
var mydata = {};
mydata[13525] = 1;
console.log(mydata[13525]); // 1
console.log(mydata.length); // undefined - no .length property on an object
console.log(myData); // {1: 13525}

If you only want to print any non-null value in the array, instead of printing the whole array I'd use a for loop that only prints non-null values. It'd be something like this...
for (i = 0; i < mydata.length; i++) {
if(mydata[i]!= null){
console.log(mydata[i]);
}
}

javascript array is bit different from others,
var foo = [1, 2, 3, 4, 5, 6];
foo.length = 3;
foo; // [1, 2, 3]
foo.length = 6;
foo.push(4);
foo; // [1, 2, 3, undefined, undefined, undefined, 4]
While the getter of the length property simply returns the number of elements that are contained in the array, in setter, a smaller value truncates the array, larger value creates a sparse array. Guess what the setter mydata[13525] = 1; would do.
src: Javascript Garden
Edit:
to print/use only the values present, you can do
mydata.forEach(function(v){console.log(v);});

Is a JavaScript array index a string or an integer?

I had a generic question about JavaScript arrays. Are array indices in JavaScript internally handled as strings?
I read somewhere that because arrays are objects in JavaScript, the index is actually a string. I am a bit confused about this, and would be glad for any explanation.

Formally, all property names are strings. That means that array-like numeric property names really aren't any different from any other property names.
If you check step 6 in the relevant part of the spec, you'll see that property accessor expressions are always coerced to strings before looking up the property. That process is followed (formally) regardless of whether the object is an array instance or another sort of object. (Again, it just has to seem like that's what's happening.)
Now, internally, the JavaScript runtime is free to implement array functionality any way it wants.
edit — I had the idea of playing with Number.toString to demonstrate that a number-to-string conversion happens, but it turns out that the spec explicitly describes that specific type conversion as taking place via an internal process, and not by an implicit cast followed by a call to .toString() (which probably is a good thing for performance reasons).

That is correct so:
> var a = ['a','b','c']
undefined
> a
[ 'a', 'b', 'c' ]
> a[0]
'a'
> a['0']
'a'
> a['4'] = 'e'
'e'
> a[3] = 'd'
'd'
> a
[ 'a', 'b', 'c', 'd', 'e' ]

Yes, technically array-indexes are strings, but as Flanagan elegantly put it in his 'Definitive guide':
"It is helpful to clearly distinguish an array index from an object property name. All indexes are property names, but only property names that are integers between 0 and 232-1 are indexes."
Usually you should not care what the browser (or more in general 'script-host') does internally as long as the outcome conforms to a predictable and (usually/hopefully) specified result. In fact, in case of JavaScript (or ECMAScript 262) is only described in terms of what conceptual steps are needed. That (intentionally) leaves room for script-host (and browsers) to come up with clever smaller and faster way's to implement that specified behavior.
In fact, modern browsers use a number of different algorithms for different types of arrays internally: it matters what they contain, how big they are, if they are in order, if they are fixed and optimizable upon (JIT) compile-time or if they are sparse or dense (yes it often pays to do new Array(length_val) instead of ninja []).
In your thinking-concept (when learning JavaScript) it might help to know that arrays are just special kind of objects. But they are not always the same thing one might expect, for example:
var a=[];
a['4294967295']="I'm not the only one..";
a['4294967296']="Yes you are..";
alert(a); // === I'm not the only one..
although it is easy and pretty transparent to the uninformed programmer to have an array (with indexes) and attach properties to the array-object.
The best answer (I think) is from the specification (15.4) itself:
Array Objects
Array objects give special treatment to a certain class of property
names. A property name P (in the form of a String value) is an array
index if and only if ToString(ToUint32(P)) is equal to P and
ToUint32(P) is not equal to 232−1. A property whose property name is
an array index is also called an element. Every Array object has a
length property whose value is always a nonnegative integer less than
232. The value of the length property is numerically greater than the name of every property whose name is an array index; whenever a
property of an Array object is created or changed, other properties
are adjusted as necessary to maintain this invariant. Specifically,
whenever a property is added whose name is an array index, the length
property is changed, if necessary, to be one more than the numeric
value of that array index; and whenever the length property is
changed, every property whose name is an array index whose value is
not smaller than the new length is automatically deleted. This
constraint applies only to own properties of an Array object and is
unaffected by length or array index properties that may be inherited
from its prototypes.
An object, O, is said to be sparse if the following algorithm returns
true:
Let len be the result of calling the [[Get]] internal method of O with argument "length".
For each integer i in the range 0≤i<ToUint32(len)
a. Let elem be the result of calling the [[GetOwnProperty]] internal method of O with argument ToString(i).
b. If elem is undefined, return true.
Return false.
Effectively the ECMAScript 262 spec just ensures to the JavaScript-programmer unambiguous array-references regardless of getting/setting arr['42'] or arr[42] up to 32-bit unsigned.
The main difference is for example (auto-updating of) array.length, array.push and other array-sugar like array.concat, etc.
While, yes, JavaScript also lets one loop over the properties one has set to an object, we can not read how much we have set (without a loop). And yes, to the best of my knowledge, modern browsers (especially chrome in what they call (but don't exactly specify)) 'small integers' are wicked fast with true (pre-initialized) small-int arrays.
Also see for example this related question.
Edit: as per #Felix Kling's test (from his comment above):
After arr[4294967294] = 42;, arr.length correctly shows 4294967295. However, calling arr.push(21); throws a RangeError: Invalid array length. arr[arr.length] = 21 works, but doesn't change length.
The explanation for this (predictable and intended) behavior should be clear after this answer.
Edit2:
Now, someone gave the comment:
for (var i in a) console.log(typeof i) shows 'string' for all indexes.
Since for in is the (unordered I must add) property iterator in JavaScript, it is kind of obvious it returns a string (I'd be pretty darned if it didn't).
From MDN:
for..in should not be used to iterate over an Array where index order
is important.
Array indexes are just enumerable properties with integer names and
are otherwise identical to general Object properties. There is no
guarantee that for...in will return the indexes in any particular
order and it will return all enumerable properties, including those
with non–integer names and those that are inherited.
Because the order of iteration is implementation dependent, iterating
over an array may not visit elements in a consistent order. Therefore
it is better to use a for loop with a numeric index (or Array.forEach
or the for...of loop) when iterating over arrays where the order of
access is important.
So.. what have we learned? If order is important to us (often is with arrays), then we need this quirky array in JavaScript, and having a 'length' is rather useful for looping in numerical order.
Now think of the alternative: Give your objects an id/order, but then you'd need to loop over your objects for every next id/order (property) once again...
Edit 3:
Someone answered along the lines of:
var a = ['a','b','c'];
a['4'] = 'e';
a[3] = 'd';
alert(a); // returns a,b,c,d,e
Now using the explanation in my answer: what happened is that '4' is coercible to integer 4 and that is in the range [0, 4294967295] making it into a valid array index also called element. Since var a is an array ([]), the array element 4 gets added as array element, not as property (what would have happened if var a was an object ({}).
An example to further outline the difference between array and object:
var a = ['a','b','c'];
a['prop']='d';
alert(a);
see how it returns a,b,c with no 'd' to be seen.
Edit 4:
You commented: "In that case, an integer index should be handled as a string, as it is a property of the array, which is a special type of JavaScript object."
That is wrong in terms of terminology because: (strings representing) integer indexes (between [0, 4294967295]) create array indexes or elements; not properties.
It's better to say: Both an actual integer and a string representing an integer (both between [0, 4294967295]) is a valid array index (and should conceptually be regarded as integer) and creates/changes array elements (the 'things'/values (only) that get returned when you do arr.join() or arr.concat() for example).
Everything else creates/changes a property (and should conceptually be regarded as string).
What the browser really does, usually shouldn't interest you, noting that the simpler and clearer specified you code, the better chance the browser has to recognize: 'oh, let’s optimize this to an actual array under the hood'.

Let's see:
[1]["0"] === 1 // true
Oh, but that's not conclusive, since the runtime could be coercing "0" to +"0" and +"0" === 0.
[1][false] === undefined // true
Now, +false === 0, so no, the runtime isn't coercing the value to a number.
var arr = [];
arr.false = "foobar";
arr[false] === "foobar" // true
So actually, the runtime is coercing the value to a string. So yep, it's a hash table lookup (externally).

In JavaScript there are two type of arrays: standard arrays and associative arrays (or an object with properies)
[ ] - standard array - 0 based integer indexes only
{ } - associative array - JavaScript objects where keys can be any strings
So ...
var arr = [ 0, 1, 2, 3 ];
... is defined as a standard array where indexes can only be integers. When you do arr["something"] since something (which is what you use as index) is not an integer you are basically defining a property to the arr object (everything is object in JavaScript). But you are not adding an element to the standard array.

We Keep Coding

JavaScript is the programming language of the Web.