While randomly experimenting with JavaScript (ES6) and reading its documentation, I found out that Set maintains the insertion order of its elements.
I wonder what the rationale behind this decision was. I always thought of a set as an unordered collection; requiring anything more leads to a more costly implementation, and, in my opinion, this feature is mostly unused.
Ordered sets are very useful; consider, for example:
const unique_elements_in_order = [...new Set(some_array)];
In an "unordered set" language like Python, you'd need a separate OrderedSet implementation for this to work.
Yes, in theory sets are unordered, but mathematical abstractions like sets, functions, numbers, etc. are only tangentially related to the similarly named objects we use in programming. A Set is just a particular kind of data structure, and it's up to the language designers to define its specific properties, such as "Sets iterate in insertion order" or "Sets can only contain hashable objects".
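For instance, iteration visits elements in the order they were first added; here is a quick sketch:
const s = new Set();
s.add("b");
s.add("a");
s.add("b"); // duplicate: ignored, and "b" keeps its original position
for (const x of s) console.log(x); // logs "b", then "a" - insertion order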
As to the committee's motivation, some googling brought up this from Mark S. Miller (Mon Feb 13 22:31:28 PST 2012):
There are many benefits to determinism. E started with non-deterministic iteration order, which opens a covert channel hazard. I initially changed to deterministic order merely to plug this leak. Having done so, I found it had many software engineering benefits. For example, it becomes much easier to write regression tests and to reproduce bugs by re-execution. In my implementation, it also had a minor additional space and time cost. Tyler's Waterken tables show that even the minor runtime costs I was paying were unnecessary.
Let's not introduce yet another source of non-determinism for the sake of an unmeasured efficiency. Let's measure, and if the cost turns out to be high after all, then let's reconsider determinism.
Take, for instance, a JS web engine that implements associative arrays (objects) using hash tables. As we know, hash tables have a worst case of O(n), because collisions are inevitable.
Suppose I develop a new data structure in JavaScript, such as a linked list, which has a worst case of O(1) for insert/delete. Since I implement it with objects/arrays, it must then be true that my implementation is also at minimum O(n) in the worst case.
I'm aware that engines optimize very well, and that a good hash function gives O(1) on average. However, I just want to confirm my realization that this isn't all as straightforward as the textbook says. Or is it?
I suppose that at the root, all data structures are implemented with arrays; since array access is always O(1), shouldn't all data structures be built on arrays without intermediary structures? Also, a dynamic array still has O(n) deletion; can't that same problem trickle down, just as in my earlier example?
Is this where a low-level programming language has an advantage over a high-level one? At a low level there isn't so much abstraction, so the textbook complexity numbers can actually match?
Apologies if my ideas are all over the place.
Since I implement [my custom data structure] with objects/arrays, it must then be true that my implementation is also at minimum O(n) in the worst case.
No. Your linked list does not use an object with n keys anywhere. You'll have a
const linkedListNode = {
    value: …,
    next: null,
};
but even if this were implemented using a hash table with O(n) worst-case member access, in your case n = 2. There are not arbitrarily many properties in your object, just two. That's how you get back to O(1).
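To make that concrete, here is a minimal sketch of such a list (the helper name is made up):
// Each node is a tiny fixed-shape object: two properties, regardless of list length.
function prepend(head, value) {
    // O(1): build one node and point it at the old head.
    return { value: value, next: head };
}

let head = null;
head = prepend(head, 3);
head = prepend(head, 2);
head = prepend(head, 1);

// Traversal is O(n), but each step only reads a two-property object.
for (let node = head; node !== null; node = node.next) {
    console.log(node.value); // 1, 2, 3
}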
Is this where a low-level programming language has an advantage over a high-level one? At a low level there isn't so much abstraction, so the textbook complexity numbers can actually match?
No. Even in a lower-level programming language, you can begin to question the underlying abstraction. You think that in C, an indexed array access in memory is constant time? It isn't: page faults and other caching shenanigans come into play.
This is why textbook complexity is always defined in terms of a machine model. As long as you define object property access as constant-time in your JavaScript execution model (and that's a very reasonable assumption; it closely resembles the real world), your numbers apply to JavaScript code as well. Sure, you can try unravelling the abstractions and analysing your high-level algorithms in terms of the primitives of a lower level, but there's no point in doing that; it's precisely why we have these abstractions in the first place.
"associative arrays (objects) by using hash tables" -> Javascript objects are complicated and are much more than just "it's a hash map". I don't know the exact technical details but I think they change to hash map after a certain amount of values are stored in them, along with that they also store metadata which is used on other algorithms like Object.keys to automatically sort the keys after they've been pulled out of the hash map. Again I don't know the technical details but I do know that, it's not straight forward.
"as we know hash tables has worse case O(n) because collisions are inevitable" -> It depends on what hashing you're using, but more than that even if collisions are inevitable it's not correct to just claim "it's worse case O(n)" and leave it at that because the probability of it being O(n) logarithmically declines to 0, the chances of it finding a collision time after time again and again is extremely unlikely, so while it can perhaps find a collision that doesn't effectively describe the time complexity.
"it must be true that my implementation is also at minimum worse case O(n) as well" -> Not correct, you're speaking about two different things. If you build a linked list each node will be connected to the next node using a heap reference, which has nothing to do with javascript objects. Iterating through an entire linked list will be O(n) but that's because of having to iterate over every next node, not because of anything with objects or hashes.
"worse case O(1) for insert/delete" -> This is only true if you have the reference to the node where you want to insert/delete it, otherwise you'll have to search through it before insertion/deletion. But that's exactly the same in javascript.
"then shouldn't all data structure be built with array without intermediary structures" -> Most data structures I know (like a list, stack, queue) are implemented on top of a normal array. The ones that aren't (like a binary tree, dictionary/map or a linked list) are not implemented on an array because it wouldn't really make sense. For example the whole point of using object references with a linked list is so that you can directly insert/delete something, using an array under the hood would just defeat the entire point of using a linked list when you're specifically trying to take advance of the object references.
"also dynamic array still has delete O(n) cant that be the same problem which can trickle down just like my earlier example" -> Not necessarily because when you wrap things inside an object and use an internal array inside, you can add metadata, indexes, hashes, things stored outside of the array (private to the object) and all sorts of other things to speed up and keep track of things on that array. So the complexity of what's used internally doesn't just automatically spill over to using it in another object. But you do need to be careful, like if you use a list then the inner workings of it in languages like C# is that it will double the internal array when you try to add more elements to it after it's full, this can result in a lot of memory waste.
That being said, the use case for JavaScript is, in 99% of cases, not "optimize this by another 10ms". JavaScript is used for its non-blocking I/O, streaming, async/await reactive programming, and rapid speed of development, and it's mostly used for web communication, not for some highly optimized graphics engine. There are very few edge cases where you need extreme complexity optimizations; feature development, code readability, and maintainability are a much bigger deal in JS in general.
On top of that, in most use cases you aren't going to request a million records from your DB in JS; usually it's more like the 50 you want to display on a page, and for that you'll let the DB optimize the query. There's hardly ever a need for such large data structures in JS development (or web development in general); it's much better to pull only what you need and request or stream more to the client as needed. So a lot of data structures (like a binary tree) aren't really relevant in JS unless you have a very specific use case.
If I had a Python script that created a lookup table to be read by a webpage (JavaScript, and maybe Ajax), what would be the most efficient format to use, in terms of speed and, if possible, size?
The lookup table could have 2000 rows.
Here is a data example:
Apple: 3fd4
Orange: 1230
Banana: 942a
...
Even though this is primarily opinion-based, I'd like to roughly explain what your options are.
If size is truly critical, consider a binary format. You could even write your own!
With the data you are presenting (around 2000 short rows), we are probably talking tens of kilobytes rather than megabytes (depending on the field values and number of columns), but the format still matters. A simple CSV or plain-text file - provided it can be read by the webpage - is very efficient in terms of additional overhead: simply separating the values by a comma and putting the table headers on line 1 is very, very concise.
JSON would work too, but it carries somewhat more overhead than a raw text dump such as CSV. JavaScript Object Notation is often used for data transfer, but for plain tabular data there is little reason to coerce it into such a format.
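For illustration, if you did go the JSON route, a minimal sketch (the file name and the fetch-based loading are my assumptions, not part of the question):
// lookup.json, as the Python script might write it:
// { "Apple": "3fd4", "Orange": "1230", "Banana": "942a" }

fetch("lookup.json")
    .then(function (response) { return response.json(); })
    .then(function (table) {
        console.log(table["Apple"]); // "3fd4" - a plain O(1) key lookup
    });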
Final thoughts: put it into a relational database and do not worry about it any more. That is the tried and tested approach to any relational data set, and I do not really see a reason you should deviate from that format.
Is there a good classification of standard JavaScript errors? For example, in Java-like languages there are errors such as ArrayIndexOutOfRange, resource leaks, race conditions, etc.
Also, in JavaScript some errors are not reported as exceptions (e.g., divide by zero). Are there any other similar behaviors that are not reported as runtime exceptions in JavaScript?
MDN has a great article about this; they put it better than I ever could:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Error#Error_types
There is also a good classification in the ECMA standard:
ECMAScript 5.1 (Current): http://www.ecma-international.org/ecma-262/5.1/#sec-15.11.6
ECMAScript 6 (Coming soon, some features already here in certain browsers): http://www.ecma-international.org/ecma-262/6.0/
In terms of "not being reported as runtime errors", there are some expressions whose evaluation does not halt execution but instead returns an indicator value like NaN, e.g.:
var a = "Hello";
var b = 3;
var c = a / b; // c is NaN
You can use the isNaN() function to check for this. Unfortunately, I don't know of an official, definitive list of these scenarios (if there are more) or even how you would classify them. I guess it comes down to experience and learning the features (or quirks, depending on your perspective!) of the language.
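A few more examples of expressions that fail silently rather than throwing (a non-exhaustive sketch):
console.log(1 / 0);               // Infinity - no exception for division by zero
console.log(0 / 0);               // NaN
console.log(Math.sqrt(-1));       // NaN
console.log([1, 2, 3][10]);       // undefined - no out-of-range error
console.log(parseInt("abc", 10)); // NaN

// Note that isNaN() coerces its argument; ES6's Number.isNaN() does not:
console.log(isNaN("abc"));        // true ("abc" coerces to NaN)
console.log(Number.isNaN("abc")); // false (the string is not the NaN value)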
Languages such as C++ will not compile if semicolons are missing, but other languages, such as JavaScript, will automatically insert them for you.
I know from the question Do you recommend using semicolons after every statement in JavaScript? that using semicolons is recommended, and that there are scenarios that can create unwanted ambiguities (such as the dangling else in C++ when braces aren't used).
At some point in time there must have been a decision to make them optional (e.g., when the creators of JavaScript made the conscious choice to allow it).
I would like to know why this decision was made and how it is beneficial to users of these languages.
Background: I am a novice coder and have only recently begun learning JavaScript.
EDIT: To the comments saying it is bad practice in JavaScript: I know. I'm asking why it is allowed in the first place, if most people consider it bad practice.
Regarding JavaScript, Douglas Crockford explains the origins of the idea in this video. (It's a great talk and it's really worth your time to watch it if you intend to continue pursuing JavaScript.)
This is a direct quote from the talk:
Semicolon insertion was something intended to make the C syntax easier for beginners.
As for how it's beneficial to users of the language, Crockford explains in detail several reasons why it's not beneficial, but rather introduces very serious ambiguities and gotchas into the syntax. One of the most notable cases is attempting to return an object literal with a coding style that puts the opening brace on the next line (example from the video):
return
{
    ok: false
};
Which actually returns undefined, because semicolon insertion adds one after return, and the remaining intended object literal gets parsed as a code block, equivalent to this:
return;
{
    ok: false;
}
Trying to make a language easier for beginners can be a great source of well-intentioned blunders.
The author of the JavaScript language, Brendan Eich, has a blog post on the subject of Automatic Semicolon Insertion (ASI), called The infernal semicolon.
Relevant quotes:
ASI is (formally speaking) a syntactic error correction procedure.
I wish I had made newlines more significant in JS back in those ten days in May, 1995. Then instead of ASI, we would be cursing the need to use infix operators at the ends of continued lines, or perhaps brute-force parentheses, to force continuation onto a successive line. But that ship sailed almost 17 years ago.
My two cents: be careful not to use ASI as if it gave JS significant newlines.
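A classic illustration of that warning (my own example, not from Eich's post): a line that begins with a bracket continues the previous statement.
var a = [1, 2, 3]
var b = a
[0]             // no semicolon is inserted: the parser reads "var b = a[0]"
console.log(b)  // logs 1, not the array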
Long ago, in the distant, dusty past, things like this were done primarily to make up for the fact that compile/link/run cycles were measured in hours at a minimum, and often ran more than a day. It could be (okay: was) extremely frustrating to wait hours for a result, only to find that the compiler had stopped at line 3 (or whatever) because of some silly typo.
To try to combat that, some compilers of the time tried to second-guess your intended meaning, so if a typo was minor enough (for some definition of "minor enough") it would assume it knew what you really intended, and continue compiling (and potentially even executing) despite an error.
Those who fail to study history are doomed to repeat it. A few who are just too arrogant to learn from history repeat it as well. There's probably room for considerable debate about the exact sort of character defect that would lead a language designer to make this mistake at the present time. There is much less room (none at all, really) for argument about whether it is a mistake, though: it clearly is, and an inexcusable one at that.
In JavaScript, the semicolon is a statement separator, but so is a newline, so you don't need semicolons if you write one statement per line.
Other languages, like C++, only have ; as a separator; whitespace such as newlines does nothing. There are pros and cons to each.
In C++ it means the syntax is consistent. If you write
int x = 0;
x++;
and then compress it to one line, it's the same general syntax:
int x = 0; x++;
In JavaScript, if you write
var x = 0
x++
then compressing it to one line gives
var x = 0 x++
which is a problem; you'd need to write var x = 0; x++
So the big question is whether whitespace is significant or not. Ideally a language would consistently use one mechanism, but JavaScript mixes the two, which leaves a bit of ambiguity about when to use ;
JavaScript is said to be a "loosely typed" language. This is because the runtime allows operations to be performed on operands of different types (via coercion):
var number = 6;
var bool = true;
var result = number + bool; //result is 7
Coming from a mostly statically typed, strongly typed background, I am having a hard time reasoning about the benefits of this approach. Sure, it can make for some pretty concise syntax, but it also seems like it could cause a nightmare when trying to track down bugs. So, besides conciseness, what are some of the benefits of loose typing and implicit type conversions?
Loosely typed languages have a number of differences that can be taken as advantages:
There is no need for interfaces. As long as an object has the method name that you need, you can call that method (see the duck-typing sketch after this list). Not using interfaces can simplify coding and reduce code size.
There is no need for generics, for very similar reasons.
"by type" function overloads are handled more simply If a function needs a string parameter, then just cast the incoming value to a string. If type checking is needed, it can be added there.
We don't have, or need, classes. The fact that [almost] everything is an object makes passing values around much easier: no need to auto-box, no need to cast values coming out.
Objects are easily extended without breaking code. You can create an array and then replace its indexOf method with one that uses binary search. The end result is smaller and, IMHO, cleaner code.
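To illustrate the first point, a minimal duck-typing sketch (all names here are made up):
// No interface declaration needed: anything with a speak() method works.
function makeItSpeak(thing) {
    return thing.speak();
}

var dog = { speak: function () { return "Woof"; } };
var robot = { speak: function () { return "Beep"; } };

console.log(makeItSpeak(dog));   // "Woof"
console.log(makeItSpeak(robot)); // "Beep"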