When does mutation become a side effect? - javascript

I came across the following question, How to break on reduce, which was not tagged functional yet contained a lot of discussion about mutating the array being a functional no-no.
The main answer mutated the array to break out of the iterator early, though the array could easily be restored to its original state by pushing the spliced items back - a somewhat dubious solution, and arguably not at all functional.
However, many algorithms gain a significant advantage if items can be modified in place (mutated).
In regard to JavaScript (single threaded, no workers, and no proxies), is it considered mutation if the modification only exists temporarily? Or is mutation only a side effect after the function has returned?
Is the following function a mutator?
function mutateAndRepair(arr) { // arr is an array of numbers
    arr[0]++;
    arr[0]--;
}
The array contains 1 or more items.
The array's first item (index 0) is a number within the max safe integer range.
The array is not a shared buffer
The array is not being watched by a proxy
I consider this as not mutating, since the mutation only exists while the function is executing, and as JS is blocking, no other code will ever be able to see the mutation; hence there are no side effects.
Considering these constraints, does this comply with the common functional paradigm used by JavaScript coders?

The ++ and -- operators are mutating and they do not exactly reverse each other. Quoting the 2017 standard:
12.4.4.1 Runtime Semantics: Evaluation
UpdateExpression : LeftHandSideExpression ++
1. Let lhs be the result of evaluating LeftHandSideExpression.
2. Let oldValue be ? ToNumber(? GetValue(lhs)).
3. Let newValue be the result of adding the value 1 to oldValue, using the same rules as for the + operator (see 12.8.5).
4. Perform ? PutValue(lhs, newValue).
5. Return oldValue.
It's that second step that's important: it converts the value to a number primitive, and there's a subtle difference between a number primitive and a Number object as returned by the Number constructor.
var arr = [new Number(1234)];
function mutateAndRepair(arr) {
    console.log(`the value before is ${arr[0]}`);
    arr[0]++;
    arr[0]--;
    console.log(`the value after is ${arr[0]}`);
}
arr[0].foo = 'bar';
console.log(`foo before is ${arr[0].foo}`);
mutateAndRepair(arr);
console.log(`foo after is ${arr[0].foo}`);
Now, I'm being a little cheeky here by loosely interpreting your requirement that the first item of arr is a "number". And for sure, you can add another stipulation that the values of arr must be "number primitives" to exclude this exact form of mutation.
How about another, more subtle point. -0 and 0 are treated as the same value in virtually all ways except Object.is:
var arr = [-0];
function mutateAndRepair(arr) {
    console.log(`the value before is ${arr[0]}`);
    arr[0]++;
    arr[0]--;
    console.log(`the value after is ${arr[0]}`);
}
console.log(`is zero before ${Object.is(0, arr[0])}`);
mutateAndRepair(arr);
console.log(`is zero after ${Object.is(0, arr[0])}`);
Okay, you can add a requirement that the first item of arr is not -0. But all of that kind of misses the point. You could argue that virtually any method is non-mutating if you simply declare that you're going to ignore any case in which mutation would be observed.
Considering these constraints, does this comply with the common functional paradigm used by JavaScript coders?
I would not consider this code to follow functional coding principles, and would perhaps even reject it in a code review if that were a goal of the project. It's not even so much about the nitty-gritty of how or whether immutability is assured by all code paths, but the fact that it depends upon mutation internally that makes this code non-functional in my view. I've seen a number of bugs arise in pseudo-functional code where an exception occurs between the mutate and repair steps, which of course leads to clear and unexpected side-effects, and even if you have a catch/finally block to try to restore the state, an exception could also occur there. This is perhaps just my opinion, but I think of immutability as a part of a larger functional style, rather than just a technical feature of a given function.
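To make that failure mode concrete, here is a minimal sketch (the process callback and its throwing behaviour are invented for illustration):
function sumThenRepair(arr, process) {
    const original = arr[0];
    arr[0] = arr[0] + arr[1];       // "temporary" mutation used as scratch space
    const result = process(arr[0]); // if this throws, the repair below never runs
    arr[0] = original;              // repair step
    return result;
}
const data = [1, 2, 3];
try {
    sumThenRepair(data, () => { throw new Error('boom'); });
} catch (e) {
    console.log(data[0]); // 3, not 1: the "internal" mutation has leaked out
}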

Related

Why is the Object Referential Equality not significantly faster than String Equality?

Whenever there are tab switches or similar structures, I see this pattern:
const tabs = {
    FIRST: 'FIRST',
    SECOND: 'SECOND',
}
const getActiveClassName = current => activeTab === current ? 'active' : ''
...
const activeTab = tabs.FIRST
<button className={getActiveClassName(tabs.FIRST)}/>
<button className={getActiveClassName(tabs.SECOND)}/>
I thought that going letter by letter in string comparison must be inefficient, so I wrote a test and compared it to object equality, in the hope that comparing references would be much faster:
const tabs = {
    FIRST: {},
    SECOND: {},
}
The result is that there is almost no difference. Why?
The JSPerf test is here.
String comparison does not always need to go letter by letter.
Strings are not implemented as raw data values (like the other primitive types), but are actually references to their (immutable) contents. This, and the fact that they are immutable, allows some optimisations that might occur in your example:
Two strings referencing the same content memory are known to be equal. If you assign activeTab = tabs.FIRST and then compare activeTab === tabs.FIRST, I'd bet that only the reference will be compared.
Two strings that are unequal are only compared until the first letter that distinguishes them. Comparing "First" === "Second" will need to access only one letter.
Since string literals are typically interned, when the engine knows that it is comparing two interned strings (not dynamically built ones) it will only need to compare their references. Two different references to interned string contents mean two different contents, as interned strings with the same contents would share their memory. So even your activeLongString can be distinguished from the other constants in longStrings by a constant comparison.
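A quick sketch of the behaviour that is observable from JavaScript (the interning itself is an engine detail you cannot inspect directly):
const s1 = 'FIRST';
const s2 = 'FIR' + 'ST';   // same contents, possibly a distinct string internally
console.log(s1 === s2);    // true: strings compare by contents
const o1 = {};
const o2 = {};
console.log(o1 === o2);    // false: objects compare by reference
console.log(o1 === o1);    // true: same reference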
Deep down in the belly of the computer, string comparisons rely on the C library and its strcmp(3) function.
As this is one of the most used functions, C library developers have optimized the heck out of it. In particular, when two strings are found to differ by at least one byte, the strings may be considered different and the comparison short-circuits.
For sh*ts and giggles, you may actually find how strcmp has been implemented in (some very old version of) macOS in x86_64 assembly:
https://opensource.apple.com/source/Libc/Libc-825.26/x86_64/string/strcmp.s.auto.html
Note however two things:
These kinds of things are close to the metal, and JS is not.
Things like the string comparison implementation depend on the OS.
You are therefore bound to get weird results, since the effects are so tiny and vary from one OS version to the next. Mind you, a JS runtime has to jump through many hoops to get to the string comparison, and the associated noise completely overshadows the cost of the comparison operator itself.
My advice for all JS developers would be to focus solely on code correctness and UX.

Is it possible to cause endless-loop in a sorting function in javascript?

Array.prototype.sort sorts the elements of the array in place and returns the sorted array.
The requirements for the compareFunction(a, b) are:
get two elements (to compare)
return <0 in order to place a before b
return 0 in order to keep the original position of a and b (relative to each other)
return >0 in order to place b before a.
the returned value must always be the same for each pair of elements.
Since every browser vendor might implement the sorting algorithm differently, my question is: is it possible to provide a compareFunction that will cause the sort function to get into an infinite loop while trying to sort the elements?
If so, would it be considered a bug in the implementation, or, given that the compareFunction did not follow the above instructions, is it acceptable to get unexpected results?
To be clear - I'm not asking if it's possible to add while (true); inside the compareFunction.
No.
Proper sort algorithms (quicksort, merge-sort, Tim-sort, bubble-sort, etc.) always make progress on each iteration and are thus free from the possibility of endless loops like this. While it is possible to craft a function to attack the performance of specific sort implementations, this will not prevent termination of the algorithm.
Hypothetically, it is possible that a "custom" sort implementation could be written in such a way that an unstable comparison function (which makes the call invalid and the result unpredictable) could "hang", but that would simply be a serious defect in the sorting implementation; I give much more credit to the authors/contributors of the widely used and thoroughly vetted sort implementations in browsers.
The sorting algorithms used assume that the items to be sorted form a total order. This means that, given say 'a', 'b', 'c', there is exactly one order, perhaps 'b', 'a', 'c'. Assuming that your compare function only compares the two items (and does not call some function that never terminates), if the comparison function is not well defined then the total order is not well defined either.
i.e. if b < c and c < b, then which one comes first, b or c?
So if you had a compare function like this:
compare = (a, b) => [-1, 0, 1][Math.floor(Math.random() * 3)]
then calling it with the same values might give different results. It would also mean there was no well-defined ordering. While this would not guarantee an infinite loop, the sort might go on for a while; then again, it might also stop quickly. But in general I would say no, as this is not a well-defined comparison.
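For what it's worth, a self-contained sketch shows that sorting with such a contract-violating comparator still terminates; the resulting order is simply unspecified:
const arr = Array.from({ length: 1000 }, (_, i) => i);
arr.sort(() => [-1, 0, 1][Math.floor(Math.random() * 3)]); // violates the comparator contract
console.log(arr.length); // 1000, and sort() has returned; the order is just not meaningful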

Can a pure function return a Symbol?

This may border on philosophical, but I thought it would be the right place to ask.
Suppose I have a function that creates a list of IDs. These identifiers are only used internally to the application, so it is acceptable to use ES2015 Symbol() here.
My problem is that, technically, when you ask for a Symbol, I'd imagine the JS runtime creates a unique identifier (random number? memory address? unsure) which, to prevent collisions, would require accessing global state. The reason I'm unsure is because of that word, "technically". I'm not sure (again, from a philosophical standpoint) if this ought to be enough to break the mathematical abstraction that the API presents.
tl;dr: here's an example--
function sentinelToSymbol(x) {
    if (x === -1) return Symbol();
    return x;
}
Is this function pure?
Not really, no, but it might not actually matter.
On the surface, (foo) => Symbol(foo) appears pure. While the runtime may do some operations with side effects, you will never see them, even if you call Symbol() at the same time with the same parameters. However, calling Symbol with the same arguments will never return the same value, which violates one of the main criteria (#2, below).
From the MDN page:
Note that Symbol("foo") does not coerce the string "foo" into a symbol. It creates a new symbol each time:
Symbol("foo") === Symbol("foo"); // false
Looking solely at side effects, (foo) => Symbol(foo) is pure (above the runtime).
However, a pure function must meet more criteria. From Wikipedia:
Purely functional functions (or expressions) have no side effects (memory or I/O). This means that pure functions have several useful properties, many of which can be used to optimize the code:
If the result of a pure expression is not used, it can be removed without affecting other expressions.
If a pure function is called with arguments that cause no side-effects, the result is constant with respect to that argument list (sometimes called referential transparency), i.e. if the pure function is again called with the same arguments, the same result will be returned (this can enable caching optimizations such as memoization).
If there is no data dependency between two pure expressions, then their order can be reversed, or they can be performed in parallel and they cannot interfere with one another (in other terms, the evaluation of any pure expression is thread-safe).
If the entire language does not allow side-effects, then any evaluation strategy can be used; this gives the compiler freedom to reorder or combine the evaluation of expressions in a program (for example, using deforestation).
You could argue the preface to that list rules out everything in JavaScript, since any operation could result in memory being allocated, internal structures updated, etc. In the strictest possible interpretation, JS is never pure. That's not very interesting or useful, so...
This function meets criteria #1. Disregarding the result, (foo) => Symbol(foo) and (foo) => {} are identical to any outside observer.
Criteria #2 gives us more trouble. Given bar = (foo) => Symbol(foo), bar('xyz') !== bar('xyz'), so Symbol does not meet that requirement at all. You are guaranteed to get a unique instance back every time you call Symbol.
Moving on, criteria #3 causes no problems. You can call Symbol from different threads without them conflicting (parallel) and it doesn't matter what order they are called in.
Finally, criteria #4 is more of a note than direct requirement, and is easily met (the JS runtimes shuffle everything around as they go).
Therefore:
strictly speaking, nothing in JS can be pure.
Symbol() is definitely not pure, thus the example is not either.
If all you care about is side effects rather than memoization, the example does meet those criteria.
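As a small sketch of why criterion #2 matters in practice: if sentinelToSymbol were pure, memoizing it would be a safe optimization, but here it changes observable behaviour (the memoized wrapper below is invented for illustration).
const memo = new Map();
function memoizedSentinelToSymbol(x) {
    if (!memo.has(x)) memo.set(x, sentinelToSymbol(x));
    return memo.get(x);
}
console.log(sentinelToSymbol(-1) === sentinelToSymbol(-1));                 // false: a fresh Symbol each call
console.log(memoizedSentinelToSymbol(-1) === memoizedSentinelToSymbol(-1)); // true: memoization changed the result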
Yes, this function is impure: sentinelToSymbol(-1) !== sentinelToSymbol(-1). We would expect equality here for a pure function.
However, if we use the concept of referential transparency in a language with object identities, we might want to loosen our definition a bit. If you consider function x() { return []; }, is it pure? Obviously x() !== x(), but the function still always returns an empty array regardless of the input, like a constant function. So what we have to define here is the equality of values in our language. The === operator might not be the best fit (just consider NaN). Are arrays equal to each other if they contain the same elements? Probably yes, unless they are mutated somewhere.
So you will have to answer the same question for your symbols now. Symbols are immutable, which makes that part easy. Now we could consider them equal by their [[Description]] value (or .toString()), so sentinelToSymbol would be pure by that definition.
But most languages do have functions that allow you to break referential transparency - for example, see How to print memory address of a list in Haskell. In JavaScript, this would be using === on otherwise equal objects, and it would be using symbols as properties, as that inspects their identity. So if you do not use such operations in your programs (or at least not in a way that is observable from the outside), you can claim purity for your functions and use it for reasoning about your program.
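As a small sketch of that last point, using the symbols as property keys makes their identity observable:
const s1 = sentinelToSymbol(-1);
const s2 = sentinelToSymbol(-1);
const obj = { [s1]: 'found' };
console.log(obj[s1]); // 'found'
console.log(obj[s2]); // undefined: the lookup distinguishes the two otherwise "equal-looking" symbols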

Is Underscore.js functional programming a fake?

According to my understanding of functional programming, you should be able to chain multiple functions and then execute the whole chain by going through the input data once.
In other words, when I do the following (pseudo-code):
list = [1, 2, 3];
sum_squares = list
    .map(function(item) { return item * item; })
    .reduce(function(total, item) { return total + item; }, 0);
I expect that the list will be traversed once, when each value will be squared and then everything will be added up (hence, the map operation would be called as needed by the reduce operation).
However, when I look at the source code of Underscore.js, I see that all the "functional programming" functions actually produce intermediate collections, for example:
// Return the results of applying the iteratee to each element.
_.map = _.collect = function(obj, iteratee, context) {
    iteratee = cb(iteratee, context);
    var keys = !isArrayLike(obj) && _.keys(obj),
        length = (keys || obj).length,
        results = Array(length);
    for (var index = 0; index < length; index++) {
        var currentKey = keys ? keys[index] : index;
        results[index] = iteratee(obj[currentKey], currentKey, obj);
    }
    return results;
};
So the question is, as stated in the title, are we fooling ourselves that we do functional programming when we use Underscore.js?
What we actually do is make the program look like functional programming without it actually being functional programming. Imagine I build a long chain of K filter() calls on a list of length N; in Underscore.js my computational complexity will be O(K*N) instead of the O(N) that would be expected in functional programming.
P.S. I've heard a lot about functional programming in JavaScript, and I was expecting to see some functions, generators, binding... Am I missing something?
Is Underscore.js functional programming a fake?
No, Underscore does have lots of useful functional helper functions. But yes, they're doing it wrong. You may want to have a look at Ramda instead.
I expect that the list will be traversed once
Yes, list will only be traversed once. It won't be mutated, it won't be held in memory (if you had not a variable reference to it). What reduce traverses is a different list, the one produced by map.
All the functions actually produce intermediate collections
Yes, that's the simplest way to implement this in a language like JavaScript. Many people rely on map executing all its callbacks before reduce is called, as they use side effects. JS does not enforce pure functions, and library authors don't want to confuse people.
Notice that even in pure languages like Haskell an intermediate structure is built [1], though it would be consumed lazily so that it is never allocated as a whole.
There are libraries that implement this kind of optimisation in strict languages, using the concept of transducers as known from Clojure. Examples in JS are transduce, transducers-js, transducers.js or underarm. Underscore and Ramda have been looking into them [2] too.
I was expecting to see some […] generators
Yes, generators/iterators that can be consumed lazily are another choice. You'll want to have a look at Lazy.js, highland, or immutable-js.
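For instance, a minimal hand-rolled sketch of the generator approach (mapLazy and reduceIterable are made-up names, not part of any of the libraries above) traverses the input once and never materializes an intermediate array:
function* mapLazy(iterable, fn) {
    for (const x of iterable) yield fn(x);
}
function reduceIterable(iterable, fn, acc) {
    for (const x of iterable) acc = fn(acc, x);
    return acc;
}
const sumSquares = reduceIterable(mapLazy([1, 2, 3], x => x * x), (total, x) => total + x, 0);
console.log(sumSquares); // 14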
[1]: Well, not really: it's too easy an optimisation
[2]: https://github.com/jashkenas/underscore/issues/1896, https://github.com/ramda/ramda/pull/865
Functional programming has nothing to do with traversing a sequence once; even Haskell, which is as pure as you're going to get, will traverse the length of a strict list twice if you ask it to filter pred (map f x).
Functional programming is a simpler model of computation where the only things that are allowed to happen do not include side effects. For example, in Haskell basically only the following things are allowed to happen:
You can apply a value f to another value x, producing a new value f x with no side-effects. The first value f is called a "function". It must be the case that any time you apply the same f to the same x you get the same answer for f x.
You can give a name to a value, which might be a function or a simple value or whatever.
You can define a new structure for data with a new type signature, and/or structure some data with those "constructors."
You can define a new type-class or show how an existing data structure instantiates a type-class.
You can "pattern match" a data structure, which is a combination of a case dispatch with naming the parts of the data structure for the rest of your project.
Notice how "print something to the console" is not doable in Haskell, nor is "alter an existing data structure." To print something to the console, you construct a value which represents the action of printing something to the console, and then give it a special name, main. (When you're compiling Haskell, you compute the action named main and then write it to disk as an executable program; when you run the program, that action is actually completed.) If there is already a main program, you figure out where you want to include the new action in the existing actions of that program, then use a function to sequence the console logging with the existing actions. The Haskell program never does anything; it just represents doing something.
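A rough JavaScript analogy of "representing an action rather than performing it" (console.log stands in for any side effect here):
const logAction = message => () => console.log(message); // builds a description of the effect
const main = logAction('hello world');                    // nothing has been printed yet
main();                                                    // only here is the effect actually performed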
That is the essence of functional programming. It is weaker than normal programming languages where the language itself does stuff, like JavaScript's console.log() function which immediately performs its side effect whenever the JS interpreter runs through it. In particular, there are some things which are (or seem to be) O(1) or O(log(log(n))) in normal programs where our best functional equivalent is O(log(n)).

javascript cost of comparison with undefined

While messing around with JavaScript I found that comparing an array element with undefined was very interesting. Considering:
L = [1, 2, 3];
if (L[1] == undefined)
    console.log('no element for key 1');
else
    console.log('Value for key 1: ' + L[1]);
I think that's an awesome way to check for values in sequences in JavaScript, instead of iterating over sequences or other containers, but my question is: is that error prone or inefficient? What's the cost of such a comparison?
The code does not test if a particular value exists; it tests if an [Array] index was assigned a non-undefined value. (It will also incorrectly detect some false-positive values like null due to using ==, but that's for another question ..)
Consider this:
L = ["hello","world","bye"]
a = L["bye"]
b = L[1]
What is the value of a and what does it say about "bye"? What is the value of b and how does 1 relate to any of the values which may (or may not) exist as elements of L?
That is, iterating an Array - to find a value of unknown index? to perform an operation on multiple values? - and accessing an element by index are two different operations and cannot be generally interchanged.
On the flip side, object properties can be used to achieve a similar (but useful) effect:
M = {hello: 1, world: 1, bye: 1}
c = M["hello"]
What is the value of c now? How does the value used as the key relate to the data?
In this case the property name (used as a lookup key) relates to the data being checked and can say something useful about it - yes, there is a "hello"! (This can detect some false-positives without using hasOwnProperty, but that's for another question ..)
And, of course .. for a small sequence, or an infrequent operation, iterating (or using a handy method like Array.indexOf or Array.some) to find existence of a value is "just fine" and will not result in a "performance impact".
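For completeness, a quick sketch of the value-lookup alternatives just mentioned:
const L = ['hello', 'world', 'bye'];
console.log(L.indexOf('bye') !== -1);   // true: 'bye' exists somewhere in the array
console.log(L.some(v => v === 'nope')); // false: no element matches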
In V8, accessing an array out of bounds to elicit undefined is unspeakably slow: if it happens in optimized code, for instance, the optimized code is thrown away and deoptimized. In other languages an exception would be thrown, which is very slow too, or in an unmanaged language your program would have undefined behavior and, for example, crash if you're lucky.
So always check the .length of the collection to ensure you don't do out of bounds accesses.
Also, for performance, prefer void 0 over undefined, as it is a compile-time constant rather than a runtime variable lookup.
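A minimal sketch combining both suggestions (the helper name hasElementAt is made up):
function hasElementAt(arr, i) {
    return i < arr.length && arr[i] !== void 0; // no out-of-bounds read when i >= arr.length
}
console.log(hasElementAt([1, 2, 3], 1)); // true
console.log(hasElementAt([1, 2, 3], 5)); // false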
