Understanding ES6 Symbols - javascript

I've been around the block when it comes to languages, having worked with everything from C# to Lisp to Scala to Haskell, and in every language that supported them, symbols have acted pretty much the same; that is, any two symbols with the same name, are guaranteed to be identical because they're singleton objects.
Racket: (equal? 'foo 'foo) true
Common Lisp: (eq 'foo 'foo) true
Ruby: :foo == :foo true
Scala: 'foo == 'foo true
ES6: Symbol('foo') === Symbol('foo') false
The benefit of symbols being singletons is obvious: You can use them in maps/dictionaries without risking your key not being equal to your input because the language suddenly decided to hash it differently (looking at you, Ruby)
So why is it that ECMAScript 6 takes a different approach on this, and how can I get around it?

You can (sort-of) get the effect of symbols being "knowable" by name by using registered (global) symbols:
var s1 = Symbol.for("foo");
var s2 = Symbol.for("foo");
s1 === s2; // true
Of course you can create your own Symbol registry too, with a Map instance or a plain object.
edit — I'll add that the intention of the optional string parameter when making a new Symbol instance is to provide a way of identifying the meaning and purpose of a symbol to the programmer. Without that string, a Symbol works just fine as a Symbol, but if you dump out an object in a debugger the properties keyed by such anonymous Symbol instances are just values. If you're keeping numeric properties on an object with Symbol keys, then you'll just see some numbers, and that would be confusing. The description string associated with a Symbol instances gives the programmer reference information without compromising the uniqueness of the Symbol as a key value.
Finally, you can always compare the result of calling .toString() on two similarly-constructed Symbol instances. I suspect that that practice would be considered questionable, but you can certainly do it.
edit more — it occurs to me that the fact that the default behavior of Symbol creation in JavaScript makes the type more useful than, say, atoms in Erlang or keys in Clojure. Because the language provides by default a value guaranteed to be unique, the fundamental problem of namespace collision is solved pretty nicely. It's still possible to work with "well-known" symbols, but having unique values available without having to worry about conflicts with other code that might also want to avoid collisions is nice. JavaScript has the somewhat unique, and certainly uniquely pervasive and uncontrollable, problem of a global namespace that may be polluted by code that a programmer does not even know exists, because code can collide in the browser enviroment due to the actions of a party different from, and unknown to, an arbitrary number of software architects.

Related

Why is the Object Referential Equality not significantly faster than String Equality?

Whenever there are tab switches or similar structures, I see this pattern:
const tabs = {
FIRST: 'FIRST',
SECOND: 'SECOND',
}
const getActiveClassName = current => activeTab === current ? 'active' : ''
...
const activeTab = tabs.FIRST
<button className={getActiveClassName(tabs.FIRST)}/>
<button className={getActiveClassName(tabs.SECOND)}/>
I thought that going letter by letter in String Comparison must be inefficient, so I wrote a test and compared it to Object Equality in hope that comparing references would be much faster:
const tabs = {
FIRST: {},
SECOND: {},
}
The result is that there is almost no difference. Why?
The JSPerf test is here.
String comparison does not always need to go letter by letter.
Strings are not implemented as raw data values (like the other primitive types), but are actually references to their (immutable) contents. This, and the fact that they are immutable, allows some optimisations that might occur in your example:
Two strings referencing the same content memory are know to be equal. If you assign activeTab = tabs.FIRST and then compare activeTab === tabs.FIRST I'd bet that only the reference will be compared.
Two strings that are unequal are only compared until the first letter that distinguishes them. Comparing "First" === "Second" will need to access only one letter.
Since string literals are typically interned, when the engines knows that it does compare two interned strings (not dynamically built ones) it will only need to compare their references. Two different references to interned string contents mean two different contents, as interned strings with the same contents would be share their memory. So even your activeLongString can be distinguished from the other constants in longStrings by a constant comparison.
Deep down in the belly of the computer, string comparisons rely on the C library and its strcmp(3) function.
As this is one of the most used functions, C library developers have optimized the heck out of it. In particular, when two strings are found to differ by at least one byte, the strings may be considered different and the comparison short-circuits.
For s*hit and giggles, you may actually find how strcmp has been implemented in (some very old version of) macOS in x86_64 assembly:
https://opensource.apple.com/source/Libc/Libc-825.26/x86_64/string/strcmp.s.auto.html
Note however two things:
These type of things are close to the metal and JS is not
Things like string comparison implementation depends on the OS
You are therefore bound to get weird results since the effects are so tiny and depend on the OS version to the next. Mind you that a JS runtime has to go through many hoops to get to the string comparison and the associated noise completely overshadows the cost of the comparison operator itself.
My advice for all JS developers would be to focus solely on code correctness and UX.

Can a pure function return a Symbol?

This may border on philosophical, but I thought it would be the right place to ask.
Suppose I have a function that creates a list of IDs. These identifiers are only used internally to the application, so it is acceptable to use ES2015 Symbol() here.
My problem is that, technically, when you ask for a Symbol, I'd imagine the JS runtime creates a unique identifier (random number? memory address? unsure) which, to prevent collisions, would require accessing global state. The reason I'm unsure is because of that word, "technically". I'm not sure (again, from a philosophical standpoint) if this ought to be enough to break the mathematical abstraction that the API presents.
tl;dr: here's an example--
function sentinelToSymbol(x) {
if (x === -1) return Symbol();
return x;
}
Is this function pure?
Not really, no, but it might not actually matter.
On the surface, (foo) => Symbol(foo) appears pure. While the runtime may do some operations with side effects, you will never see them, even if you call Symbol() at the same time with the same parameters. However, calling Symbol with the same arguments will never return the same value, which is one of the main criteria (#2, below).
From the MDN page:
Note that Symbol("foo") does not coerce the string "foo" into a symbol. It creates a new symbol each time:
Symbol("foo") === Symbol("foo"); // false
Looking solely at side effects, (foo) => Symbol(foo) is pure (above the runtime).
However, a pure function must meet more criteria. From Wikipedia:
Purely functional functions (or expressions) have no side effects (memory or I/O). This means that pure functions have several useful properties, many of which can be used to optimize the code:
If the result of a pure expression is not used, it can be removed without affecting other expressions.
If a pure function is called with arguments that cause no side-effects, the result is constant with respect to that argument list (sometimes called referential transparency), i.e. if the pure function is again called with the same arguments, the same result will be returned (this can enable caching optimizations such as memoization).
If there is no data dependency between two pure expressions, then their order can be reversed, or they can be performed in parallel and they cannot interfere with one another (in other terms, the evaluation of any pure expression is thread-safe).
If the entire language does not allow side-effects, then any evaluation strategy can be used; this gives the compiler freedom to reorder or combine the evaluation of expressions in a program (for example, using deforestation).
You could argue the preface to that list rules out everything in JavaScript, since any operation could result in memory being allocated, internal structures updated, etc. In the strictest possible interpretation, JS is never pure. That's not very interesting or useful, so...
This function meets criteria #1. Disregarding the result, (foo) => Symbol(foo) and (foo) => () are identical to any outside observer.
Criteria #2 gives us more trouble. Given bar = (foo) => Symbol(foo), bar('xyz') !== bar('xyz'), so Symbol does not meet that requirement at all. You are guaranteed to get a unique instance back every time you call Symbol.
Moving on, criteria #3 causes no problems. You can call Symbol from different threads without them conflicting (parallel) and it doesn't matter what order they are called in.
Finally, criteria #4 is more of a note than direct requirement, and is easily met (the JS runtimes shuffle everything around as they go).
Therefore:
strictly speaking, nothing in JS can be pure.
Symbol() is definitely not pure, thus the example is not either.
If all you care about is side effects rather than memoization, the example does meet those criteria.
Yes, this function is impure: sentinelToSymbol(-1) !== sentinelToSymbol(-1). We would expect equality here for a pure function.
However, if we use the concept of referential transparency in a language with object identities, we might want to loosen our definition a bit. If you consider function x() { return []; }, is it pure? Obviously x() !== x(), but still the function always returns an empty array regardless of the input, like a constant function. So what we do have to define here is the equality of values in our language. The === operator might not be the best fit here (just consider NaN). Are arrays equal to each other if the contain the same elements? Probably yes, unless they are mutated somewhere.
So you will have to answer the same question for your symbols now. Symbols are immutable, which makes that part easy. Now we could consider them equal by their [[Description]] value (or .toString()), so sentinelToSymbol would be pure by that definition.
But most languages do have functions that allow to break referential transparency - for example see How to print memory address of a list in Haskell. In JavaScript, this would be using === on otherwise equal objects. And it would be using symbols as properties, as that inspects their identity. So if you do not use such operations (or at least without being observable to the outside) in your programs, you can claim purity for your functions and use it for reasoing about your program.

ES6 iterators and ##iterator [duplicate]

I've noticed ## used in a few pages about new ES6 features, but I don't know what exactly it means (whether it's actually syntax or just some kind of documentation convention). And it's hard to google. Can someone explain it?
## describes what's called a well-known symbol. (Note that it isn't actually valid syntax in JS.) According to the ES6/ES20151 specification:
Well-known symbols are built-in Symbol values that are explicitly referenced by algorithms of this specification. They are typically used as the keys of properties whose values serve as extension points of a specification algorithm. Unless otherwise specified, well-known symbols values are shared by all Code Realms (8.2).
Code Realms refer to different instances of a JavaScript environment. For example, the Code Realm of the root document would be different to that of JavaScript running in an <iframe>.
An example of where it matter what code realm an object comes from is when trying to use instanceof to determine whether an object is an array (hint: it won't work if it's from another frame). To avoid these kinds of issues from popping up with symbols, they are shared so that references to (say) ##toString will work no matter where the object came from.
Some of these are exposed directly through the Symbol constructor, for example, ##toPrimitive is exposed as Symbol.toPrimitive. That can be used to override the value produced when attempting to convert an object to a primitive value, for example:
let a = { [Symbol.toPrimitive]: () => 1 };
console.log(+a); // 1
console.log(a.valueOf()); // (the same object)
console.log(a.toString()); // "[object Object]"
In general, symbols are used to provide unique properties on an object which cannot collide with a random property name, for example:
let a = Symbol();
let foo = { [a]: 1 };
foo[a]; // 1
There is no way to access the value except by getting the symbol from somewhere (though you can get all symbols for an object by calling Object.getOwnPropertySymbols, so they cannot be used to implement private properties or methods).
1: See this es-discuss topic for some discussion about the different names.

Is there a convention for naming symbols in ES6?

I'm toying around with ES6, looking at symbols. Unlike ruby for example where you'd write :symbol, ES6 symbols seem to be allowed any "standard" variable name. To be honest I am finding this rather confusing:
var privateProperty = Symbol();
var obj = {};
obj[privateProperty] = 'some value goes here';
as it tends to make me think that privateProperty is probably a plain string like the years before. Using :privateProperty is not valid.
So just like using $bar for a jQuery object or bar$ for a RxJS/Bacon stream, I was wondering if there is already an established convention for naming symbols in ES6?
obj[privateProperty] = …; tends to make me think that privateProperty is probably a plain string like the years before.
Well, that's the old thinking that we will need to forget. privateProperty is a key, which can be either a string (property name) or symbol. Similarly, we already have learned to distinguish (integer) indices from "normal" properties.
ES6 seems to allow any "standard" variable name
Just as does Ruby. The difference is that ES6 doesn't introduce a literal notation for symbols (which resolve to the same thing), but allows them to be created only by using Symbol (locally) or Symbol.for (globally).
There is no standard convention for naming variables that hold symbol values.
Of course, you can always use hungarian notation if you tend to want such type annotations. If I had to coin a standard, I'd propose to use a more subtle leading underscore (var _myPrivate). Underscores in property names had always implied something special, and it would look similar for computed property keys (obj[_myPrivate]) then.
Is there a convention for naming symbols in ES6?
While there is no convention for naming symbol-holding variables (yet), there certainly is a convention for naming symbols themselves:
local symbols should have a very descriptive descriptor string (given as an argument to Symbol). Of course, they're still unique anyway, so you basically can use whatever name you want.
global symbols need to avoid collisions. The convention is to namespace them appropriately, and connect those namespace names with dots - like a Java package name. If the symbol is available as a global variable, its descriptor should have the same name (similar to the builtin, native symbols, which are like ##iterator = Symbol.for("Symbol.iterator");). If the symbol is exported by a module or library, its descriptor should be prefixed with the module/library name (to avoid collisions).
And it would probably be a best practise to use the same name for the variable (like the native symbols already do, Symbol.iterator.toString() == "Symbol(Symbol.iterator)").

Literal vs Constructor notation for primitives, which is more proper for starters? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
So I am a TA for a class at my University, and I have a bit of a disagreement about how to present datatypes for absolute beginner programmers (In which most never programmed before). My teacher tells students they must strictly use constructors to create primitive datatypes such as Numbers and Strings, her reasoning is to treat JavaScript as if its strongly typed so students will be used to the languages that are in the future. I understand why, but I think it has bad trade-offs.
var num = new Number(10); // This is encouraged.
var num = 10; // This is discouraged (students will lose points for doing this).
My instructor does not make a distinction between these and students are told to treat them as if they are primitive Numbers, Strings, etc. Although I believe at least for starters who don't know better to use datatype.valueOf() when necessary, and don't know much at all what objects are yet. Literal notation would be (and I consider it to be) more proper and standard, the other way would cause confusion. Since there a consistency issues with constructor notation because they are objects (and I don't want students to worry about that). For example these don't make sense to beginners:
var num1 = new Number(1);
var num2 = new Number(1);
if(num1 === num2) ... ; // Does not run.
if(num1 == num2) ... ; // Does not run.
if(num1 == 1) ... ; // But this does.
var num2 = new Number(2);
if(num1 < num2) ... ; // So does this.
switch(num1){
case 1:
... // Does not run.
break;
default:
... // This runs
break;
}
As you can see this would be confusing for someone just learning what an if statement is.I feel as if she is encouraging bad practice and discouraging good practice. So what do you think between literal and constructor notation for primitive values, which is considered more standard/proper and which is better for beginners to use?
As someone who spent extra time creating a main function in every Python program so that we'd be more prepared for Java's public static void main when the time came, there is a place for slightly-less-than-best practices when it comes to learning how to programming.
Now a teacher myself, using constructor functions in JavaScript is not that place. First, it results in misbehavior with control flow, an essential part of the beginning steps of programming. Secondly, it does the students a disservice by mis-teaching an essential language in the web developer's toolkit. Finally, it does not prepare one for constructors! As a side note, JavaScript does not easily fall into any typical programming paradigm, and as such is unsuitable as a first language in a four-year college curriculum (in this author's opinion).
Constructor functions block understanding of control flow
Firstly, let's look at control flow. Without control flow, a beginner may never be able to construct anything more complex than Hello world. It is absolutely essential for a beginner to have a solid understanding of each item in the control flow basket. A good instructor will provide several examples for each type, as well as a full explanation of what the difference is between true and truthy.
Constructor functions completely ruin a beginner's understanding of control flow, as in the following example:
var myBool = new Boolean(false);
if (myBool) { console.log('yes'); } // 'yes' is logged to the console.
Yes, this is one of JavaScript's 'wat' moments. Should the language act like this? Probably not. Does it? Absolutely. Beginners need to see simple examples that make sense to the uninitiated mind. There is no brain space for edge cases when first starting out. Examples like this only do a disservice to those the professor is supposed to be teaching.
Constructor functions have no place in JavaScript
This one's a bit more opinion-based. A caveat: JavaScripters may find some use for constructor functions in using them to convert between types (i.e. Number('10')) but there are usually better ways.
Virtually every time a constructor function is used in JavaScript, it's a misuse. I am of the mind that if one wants a literal number, just write that out. No need for all the extra typing, not to mention the various type conflicts and hidden gotchas that come from using constructors](https://stackoverflow.com/a/369450/1216976).
It's worth noting that both JSLint and JSHint (written by people much smarter than I) discourage the use of constructors in this fashion.
new Number(1) and 1 are not the same. The first one is an Object, and the latter one is a Number.
var num = new Number(1);
typeof num; //object
var num = 1;
typeof num; //number
In order to get the value of the object created by the constructor, one must use valueOf every single time in order to get a correct result.
if( new Boolean(false) ){ if( new Boolean(false).valueOf() ){
console.log("True"); console.log("True");
}else{ }else{
console.log("False"); console.log("False");
} }
//> True //> False
//Student Be Like: //Student Be Like:
// What?! That doesn't even make //Okay that makes sense.
// sense!
Because of the fact that using constructors will only confuse more learners and will cause more problems in further development of your program, IMO primitive constructors should never be used. (except in certain cases.)
Furthermore, using Constructors every time you declare a variable will greatly impact your execution speed because of the fact that allocating spaces for a new object takes longer than just a simple 64-bit floating point number. Here is the comparison: http://jsperf.com/constructors-vs-literalss
tl;dr
var num = new Number(10); // bad, never do this, prone to errors
var num = 10; // do this instead
Every time when there's a literal and your don't use it: Very likely to be bad code.
new Boolean(false); // bad
new String("foo"); // bad
new Object(); // not as bad as the ones above, but still bad
new Array(10); // bad, it initializes the array with
// not-your-normal `undefined`
First of all if you're a teaching assistant at a university then you should tell your professor not to teach JavaScript as an introductory language to students. The reason is that JavaScript is:
Ambivalent about the kind of a language it wants to be: functional vs object-oriented.
Too unintuitive: many a times it doesn't behave the way you would expect it to.
Too verbose: there are too many syntactical elements diverting your attention.
JavaScript is not a suitable introductory language. It's not even a suitable production language which is why we have LiveScript and other compile-to-javascript languages.
Since you're a teaching assistant I would highly recommend that you read Bret Victor's amazing blog post on Learnable Programming. It describes the constituents of a good introductory programming language.
Anyway, I digress. Coming back to the original question, "Literal vs Constructor notation for primitives, which is more proper for starters?" This question is a no-brainer. You know the answer: literals.
I surmise that your professor is trying to teach the following point to the students:
Everything in JavaScript is an object.
Now while it's ideal to think about everything in JavaScript as an object this ideal is too far-fetched. Primitives are not objects. Primitives consist of:
Undefined.
Null.
Booleans.
Numbers.
Strings.
I think your professor is confusing JavaScript with Java which advocates that everything is an object (which is actually not true because you still do have primitives in addition to classes, interfaces, etc.)
JavaScript is more object-oriented than Java simply because everything (except for primitives) is an object. You don't have classes in JavaScript. Objects simply inherit from other objects.
On a sticky side note here's a good analogy to explain the difference between primitive vs reference values to students: Primitive value vs Reference value.
Now although you do have primitives in JavaScript you also have wrapper objects for these primitives (i.e. Boolean, Number and String constructors) so as to treat them as objects.
These wrappers expose several useful utility methods. For example you could do: 42..toString(2) to get the value of 42 in binary as a string. As you can see JavaScript coerces the primitive value 42 into a Number object automatically when you call a method on the primitive. Hence it makes no sense to create wrapper objects for primitives explicitly. Read The Secret Life of JavaScript Primitives.
In fact if you really wanted a wrapper object instead of a primitive (for performance reasons) the I would do it as follows:
Object.prototype.getObject = function () {
return this;
};
var yes = true.getObject();
var answer = 42..getObject();
var greet = "hello".getObject();
This actually makes more sense to me. You're taking a primitive, coercing in into an object and returning the coerced reference value. By the way, 42 is an allusion to The Hitchhiker's Guide to the Galaxy.
Beside being more succinct another good reason to use literals over wrapper objects is mathematical equivalence. As you described in your question it's desirable to be able to compare two values for mathematical equivalence. Objects break mathematical equivalence. Since objects have a distinct identity two mathematically equivalent objects may not be logically equivalent. For example:
console.log(new Number(42) === new Number(42)); // false
To solve this problem you need to convert the numbers to primitives using valueOf:
console.log(new Number(42).valueOf() === new Number(42).valueOf()); // true
It's really asinine to write code like this. Why circumscribe yourself to such unnecessary hacks when you could write clean understandable code?
console.log(42 === 42); // true
Everyone know thats 42 is indeed 42 but not many beginners would understand why two objects which are mathematically equivalent are logically distinct.
To summarize, when you're teaching somebody how to program you need to make sure that your code is intuitive (i.e. it shouldn't work in unexpected ways). If it doesn't work the way you would expect it to work then you're doing something wrong. You'll end up struggling with the language instead of writing useful programs.
Starters or not, prefer primitives.
I think this question is no different than the primitive or boxed primitive question in Java.
Using primitives is simpler and more expressive: Number, Boolean or String although available objects in JavaScript are Objects and not just the values. To quote Joshua Bloch from effective Java: 'Objects have identities apart from their values.' When you refer to a Number object its not the same as number value. When we are dealing with a numbers its the value we usually are bothered about. Primitives save the additional effort of understanding it as object for the developer as well as on the readers' part.
Equals comparison: equals comparison of objects with = operator is actually the reference comparison, which is always misunderstood and creates scope for errors.
Primitives are faster: (proof needed?) Why bother with object creations when it's just the value that matters to us?
Type validation in exposed functions: JavaScript being loosely typed we at times require a failsafe type check in exposed functions. Say a general check on type: typeof param === "string" or "number". You give me an "object" m not gonna spend time in guessing what object you might be passing..
The only places I have used a Number constructor is when parsing strings to numbers. My work has mostly been development on a banking product and I would rather take a NaN than guess a number using parseInt (different question altogether).

Categories