An article I was reading gives this as an example of an impure function (in JavaScript):
const tipPercentage = 0.15;
const calculateTip = cost => cost * tipPercentage;
That struck me as a bit of an odd example, since tipPercentage is a constant with an immutable value. Common examples of pure functions allow dependence on immutable constants when those constants are functions.
const mul = (x, y) => x * y
const calculateTip = (cost, tipPercentage) => mul(cost, tipPercentage);
In the above example, correct me if I'm wrong, calculateTip would usually be categorised as a pure function.
So, my question is: In functional programming, is a function still considered pure if it relies on an externally defined constant with an immutable value, when that value is not a function?
Yes, it is a pure function. Pure functions are referentially transparent, i.e. one can replace the function call with its result without changing the behaviour of the program.
In your example, it is always valid to replace e.g. calculateTip (100) anywhere in your program with its result of 15 without any change in behaviour, hence the function is pure.
Yes, theoretically, a function that abides by these two rules:
not causing side effects
not depending on "any mutable state"
can be considered pure, and as #TheInnerLight has put it well, provides referential transparency among other benefits. That makes the function in your example pure.
However, as the question is about & tagged with javascript, it is important to note that a function that depends on a const defined in the outer scope cannot always be considered pure:
const state = {}
function readState (key) {
return state[key]
}
// A hypothetical setState() can mutate `state` and
// therefore change the outcome of `readState` calls.
That is obviously because state value is not immutable (only the assignment is).
The rule of thumb is to prefer local over global when it comes to scopes. That provides better isolation, debuggability, and less cognitive load for the next reader.
Related
What I need to do
I need to detect whether two objects are the same. By same I mean deep-equal: different objects that look and behave the same are the same to me. For instance {} is the same as {}, even though {} != {}.
I need to do this on Node.js.
Problem
This has been easy with most types I handle (undefined, null, numbers, NaN, strings, objects, arrays), but it's proving really hard with functions.
For function I would consider two functions to be the same if their name and code is identical. In case the functions are closures, then the variables they capture need to be the same instance.
I don't know how to implement that.
Attempted solutions
These are all the approaches I could think of to compare functions, but they all have issues:
Comparing the function with == or === doesn't work, of course. Example ()=>0 != ()=>0, but they should be the same.
Comparing the function name doesn't work, of course. Example: ()=>0 === ()=>1 when they shouldn't be the same.
Comparing the function's code (as reported by Function.prototype.toString) doesn't work:
const fnFactory=(n)=>{ return ()=>n; };
const fn1=fnFactory(0);
const fn2=fnFactory(1);
fn1.toString() === fn2.toString()
But they shouldn't be the same.
Compare the functions' code. If it's the same, parse it and detect whether the function has any captured variables, if it doesn't, the functions are the same.
This solution however would be unbearably slow. Besides it still doesn't work for different functions that capture the same instances of variables.
What I need this for (example)
I need this in order to implement a factory function like this (this is just a minimal example, of course):
// Precondition:
// "factoryFn" is called only once for every value of "fn"
// the result is then reused for every future invocation.
const storage=new MagicStorage();
function createOrReuse(fn, ...opts)
{
if(storage.has(fn, ...opts)){
return storage.get(fn, ...opts);
}
else{
const data=factoryFn(...opts);
storage.set(data, fn, ...opts);
return data;
}
}
Then I want to use it in several places around my code. For instance:
function f1(n) {
// ...
const buffer=createOrReuse((n)=>new Buffer.alloc(n*1024*1024), n);
//...
}
function f2(){
//...
emitter.on('ev', async()=>{
const buffer=await createOrReuse(()=>fs.readFile('file.txt'));
// ...
});
//...
}
Of course there are other ways to achieve the same result: For instance I could store the allocated values in variables with a lifetime long enough.
However similar solutions are much less ergonomic. I wish to have that small createOrReuse function. In other languages (like C++) such a createOrReuse could be implemented. Can I not implement it in Javascript?
Question
Is it possible to implement the function-comparison logic I need in pure JavaScript? I can use any ES version.
Otherwise is it possible to implement it as a native module for Node.js?
If yes, Is there any Node.js native module that can be used to achieve what I need?
Otherwise, where can I start to develop one?
I am reading a book where it says one way to handle impure functions is to inject them into the function instead of calling it like the example below.
normal function call:
const getRandomFileName = (fileExtension = "") => {
...
for (let i = 0; i < NAME_LENGTH; i++) {
namePart[i] = getRandomLetter();
}
...
};
inject and then function call:
const getRandomFileName2 = (fileExtension = "", randomLetterFunc = getRandomLetter) => {
const NAME_LENGTH = 12;
let namePart = new Array(NAME_LENGTH);
for (let i = 0; i < NAME_LENGTH; i++) {
namePart[i] = randomLetterFunc();
}
return namePart.join("") + fileExtension;
};
The author says such injections could be helpful when we are trying to test the function, as we can pass a function we know the result of, to the original function to get a more predictable solution.
Is there any difference between the above two functions in terms of being pure as I understand the second function is still impure even after getting injected?
An impure function is just a function that contains one or more side effects that are not disenable from the given inputs.
That is if it mutates data outside of its scope and does not predictably produce the same output for the same input.
In the first example NAME_LENGTH is defined outside the scope of the function - so if that value changes the behaviour of getRandomFileName also changes - even if we supply the same fileExtension each time. Likewise, getRandomLetter is defined outside the scope - and almost certainly produces random output - so would be inherently impure.
In second example everything is referenced in the scope of the function or is passed to it or defined in it. This means that it could be pure - but isn't necessarily. Again this is because some functions are inherently impure - so it would depend on how randomLetterFunc is defined.
If we called it with
getRandomFileName2('test', () => 'a');
...then it would be pure - because every time we called it we would get the same result.
On the other hand if we called it with
getRandomFileName2(
'test',
() => 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.charAt(Math.floor(25 * Math.random()))
);
It would be impure, because calling it each time would give a different result.
There's more than one thing at stake here. At one level, as Fraser's answer explains, assuming that getRandomLetter is impure (by being nondeterministic), then getRandomFileName also is.
At least, by making getRandomFileName2 a higher-order function, you at least give it the opportunity to be a pure function. Assuming that getRandomFileName2 performs no other impure action, if you pass it a pure function, it will, itself, transitively, be pure.
If you pass it an impure function, it will also, transitively, be impure.
Giving a function an opportunity to be pure can be useful for testing, but doesn't imply that the design is functional. You can also use dependency injection and Test Doubles to make objects deterministic, but that doesn't make them functional.
In most languages, including JavaScript, you can't get any guarantees from functions as first-class values. A function of a particular 'shape' (type) can be pure or impure, and you can check this neither at compile time nor at run time.
In Haskell, on the other hand, you can explicitly declare whether a function is pure or impure. In order to even have the option of calling an impure action, a function must itself be declared impure.
Thus, the opportunity to be impure must be declared at compile time. Even if you pass a pure 'implementation' of an impure type, the receiving, higher-order function still looks impure.
While something like described in the OP would be technically possible in Haskell, it would make everything impure, so it wouldn't be the way you go about it.
What you do instead depends on circumstances and requirements. In the OP, it looks as though you need exactly 12 random values. Instead of passing an impure action as an argument, you might instead generate 12 random values in the 'impure shell' of the program, and pass those values to a function that can then remain pure.
There's more at stake than just testing. While testability is nice, the design suggested in the OP will most certainly be impure 'in production' (i.e. when composed with a proper random value generator).
Impure actions are harder to understand, and their interactions can be surprising. Pure functions, on the other hand, are referentially transparent, and referential transparency fits in your head.
It'd be a good idea to have as a goal pure functions whenever possible. The proposed getRandomFileName2 is unlikely to be pure when composed with a 'real' random value generator, so a more functional design is warranted.
Anything that contains random (or Date or stuff like that). Will be considered impure and hard to test because what it returns doesn't strictly depends on its inputs (always different). However, if the random part of the function is injected, the function can be made "pure" in the test suite by replacing whatever injected randomness with something predictable.
function getRandomFileName(fileExtension = "", randomLetterFunc = getRandomLetter) {}
can be tested by calling it with a predictable "getLetter" function instead of a random one:
getRandomFileName("", predictableLetterFunc)
Let's say I have a few primitives defined, here using javascript:
const TRUE = x => y => x;
const FALSE = x => y => y;
const ZERO = f => a => a;
const ONE = f => a => f(a);
const TWO = f => a => f(f(a));
If a language is purely function, how would it translate these primitives to something physical? For example, usually I see something like a function that is not a pure function, such as:
const TWO = f => a => f(f(a));
const inc = x => x+1;
console.log(TWO(inc)(0));
// 2
But again this is sort of a 'trick' to print something, in this case a number. But how is the pure-functional stuff translated into something that can actually do something?
A function is pure if its result (the return value) only depends on the inputs you give to it.
A language is purely functional if all its functions are pure¹.
Therefore it's clear that "utilities" like getchar, which are fairly common functions in many ordinary, non-functional languages, pose a problem in functional languages, because they take no input², and still they give different outputs everytime.
It looks like a functional language needs to give up on purity at least for doing I/O, doesn't it?
Not quite. If a language wants to be purely functional, it can't ever break function purity, not even for doing I/O. Still it needs to be useful. You do need to get things done with it, or, as you say, you need
something that can actually do something
If that's the case, how can a purely functional language, like Haskell, stay pure and yet provide you with utilities to interact with keyboard, terminal, and so on? Or, in other words, how can purely functional languages provide you with entities that have the same "read the user input" role of those impure functions of ordinary languages?
The trick is that those functions are secretly (and platonically) taking one more argument, in addition to the 0 or more arguments they'd have in other languages, and spitting out an additional return value: these two guys are the "real world" before and after the "action" that function performs. It's a bit like saying that the signatures of getchar and putchar are not
char getchar()
void putchar(char)
but
[char, newWorld] = getchar(oldWorld)
[newWorld] = putchar(char, oldWorld)
This way you can give to your program the "initial" world, and all those functions which are impure in ordinary languages will, in functional languages, pass the evolving world to each other, like Olympic torch.
Now you could ask: what's the advantage of doing so?
The point of a pure functional language like Haskell, is that it abstracts this mechanism away from you, hiding that *word stuff from you, so that you can't do silly things like the following
[firstChar, newWorld] = getchar(oldWorld)
[secondChar, newerWorld] = getchar(oldWorld) // oops, I'm mistakenly passing
// oldWorld instead of newWorld
The language just doesn't give you tools to put you hands on the "real world". If it did, you'd have the same degrees of freedom you have in languages like C, and you'd end up with the type of bugs which are common in thos languages.
A purely functional language, instead, hides that stuff from you. It basically constrains and limits your freedom inside a smaller set than non-functional languages allow, letting the runtime machinary that actually runs the program (and on which you have no control whatsoever), take care of the plumbing on your behalf.
(A good reference)
¹ Haskell is such a language (and there isn't any other mainstream purely functional language around, as far as I know); JavaScript is not, even if it provides several tools for doing some functional programming (think of arrow functions, which are essentially lambdas, and the Lodash library).
² No, what you enter from the keyboard is not an input to the getchar function; you call getchar without arguments, and assign its return value to some variable, e.g. char c = getchar() or let c = getchar(), or whatever the language syntax is.
Having read this article https://www.toptal.com/javascript/es6-class-chaos-keeps-js-developer-up and subsequently "JavaScript: The Good Parts", I shall henceforth commit to becoming a better JavaScript developer. However, one question remained for me. I usually implemented methods like this:
function MyClass(){
this.myData = 43;
this.getDataFromObject = function(){
return this.myData;
}
}
MyClass.prototype.getDataFromPrototype = function(){
return this.myData;
}
var myObject = new MyClass();
console.log(myObject.getDataFromObject());
console.log(myObject.getDataFromPrototype());
My assumption that underlies this whole post is that getDataFromObject is faster (during call, not during object creation) because it saves an indirection to the prototype but it is also less memory-efficient because every object gets an own instance of the function object. If that is already wrong, please correct me and you can probably stop reading here.
Else: Both article and book recommend a style like this:
function secretFactory() {
const secret = "Favor composition over inheritance [...]!"
const spillTheBeans = () => console.log(secret)
return {
spillTheBeans
}
}
const leaker = secretFactory()
leaker.spillTheBeans()
(quote from the article, the book didn't have ES6 yet but the ideas are similar)
My issue is this:
const leaker1 = secretFactory()
const leaker2 = secretFactory()
console.log(leaker1.spillTheBeans === leaker2.spillTheBeans) // false
Do I not mostly want to avoid that every object gets an own instance of every method? It might be insignificant here but if spillTheBeans is more complicated and I create a bazillion objects, each with twelvetythousand other methods?
If so, what is the "goot parts"-solution? My assumption would be:
const spillStaticBeans = () => console.log("Tabs rule!")
const spillInstanceBeans = (beans) => console.log(beans)
function secretFactory() {
const secret = "Favor composition over inheritance [...]!"
return{
spillStaticBeans,
spillInstanceBeans: () => spillInstanceBeans(secret)
}
}
const leaker1 = secretFactory()
const leaker2 = secretFactory()
leaker1.spillStaticBeans()
leaker2.spillInstanceBeans()
console.log(leaker1.spillStaticBeans === leaker2.spillStaticBeans) // true
console.log(leaker1.spillInstanceBeans === leaker2.spillInstanceBeans) // false
The spillInstanceBeans method is still different because each instance needs its own closure but at least they just wrap a reference to the same function object which contains all the expensiveness.
But now I have to write every method name two to three times. Worse, I clutter the namespace with public spillStaticBeans and spillInstanceBeans functions. In order to mitigate the latter, I could write a meta factory module:
const secretFactory = (function(){
const spillStaticBeans = () => console.log("Tabs rule!")
const spillInstanceBeans = (beans) => console.log(beans)
return function() {
const secret = "Favor composition over inheritance [...]!"
return{
spillStaticBeans,
spillInstanceBeans: () => spillInstanceBeans(secret)
}
}
}())
This can be used the same way as before but now the methods are hidden in a closure. However, it gets a bit confusing. Using ES6 modules, I could also leave them in module scope and just not export them. But is this the way to go?
Or am I mistaken in general and JavaScript's internal function representation takes care of all this and there is not actually a problem?
My assumption that underlies this whole post is that getDataFromObject is faster to call than getDataFromPrototype because it saves an indirection to the prototype
No. Engines are very good at optimising the prototype indirection. The instance.getDataFromPrototype always resolves to the same method for instances of the same class, and engines can take advantage of that. See this article for details.
Do I not mostly want to avoid that every object gets an own instance of every method? It might be insignificant here
Yes. In most of the cases, it actually is insignificant. So write your objects with methods using whatever style you prefer. Only if you actually measure a performance bottleneck, reconsider the cases where you are creating many instances.
Using ES6 modules, I could also leave them in module scope and just not export them. But is this the way to go?
Yes, that's a sensible solution. However, there's no good reason to extract spillInstanceBeans to the static scope, just leave it where it was - you have to create a closure over the secret anyway.
The spillInstanceBeans method is still different because each instance needs its own closure but at least they just wrap a reference to the same function object which contains all the expensiveness.
It should be noted that you're just replicating the way the JavaScript VM works internally: a function like spillTheBeans is compiled only once where it occurs in the source code even if it has free variables like secret. In SpiderMonkey for example, the result is called a »proto-function« (not to be confused with prototype). These are internal to the VM and cannot be accessed from JavaScript.
At runtime, function objects are created by binding the free variables of proto-functions to (a part of) the current scope, pretty much like your spillInstanceBeans example.
Saying that, it's true that using closures instead of prototype methods and this creates more function objects overall – the robustness gained from true privacy and read-only properties might make it worthwhile. The proposed style focuses more on objects rather than classes, so a different design could emerge that cannot be compared directly to a class-based design.
As Bergi says, measure and reconsider if performance is more important in (some part of) your code.
This may border on philosophical, but I thought it would be the right place to ask.
Suppose I have a function that creates a list of IDs. These identifiers are only used internally to the application, so it is acceptable to use ES2015 Symbol() here.
My problem is that, technically, when you ask for a Symbol, I'd imagine the JS runtime creates a unique identifier (random number? memory address? unsure) which, to prevent collisions, would require accessing global state. The reason I'm unsure is because of that word, "technically". I'm not sure (again, from a philosophical standpoint) if this ought to be enough to break the mathematical abstraction that the API presents.
tl;dr: here's an example--
function sentinelToSymbol(x) {
if (x === -1) return Symbol();
return x;
}
Is this function pure?
Not really, no, but it might not actually matter.
On the surface, (foo) => Symbol(foo) appears pure. While the runtime may do some operations with side effects, you will never see them, even if you call Symbol() at the same time with the same parameters. However, calling Symbol with the same arguments will never return the same value, which is one of the main criteria (#2, below).
From the MDN page:
Note that Symbol("foo") does not coerce the string "foo" into a symbol. It creates a new symbol each time:
Symbol("foo") === Symbol("foo"); // false
Looking solely at side effects, (foo) => Symbol(foo) is pure (above the runtime).
However, a pure function must meet more criteria. From Wikipedia:
Purely functional functions (or expressions) have no side effects (memory or I/O). This means that pure functions have several useful properties, many of which can be used to optimize the code:
If the result of a pure expression is not used, it can be removed without affecting other expressions.
If a pure function is called with arguments that cause no side-effects, the result is constant with respect to that argument list (sometimes called referential transparency), i.e. if the pure function is again called with the same arguments, the same result will be returned (this can enable caching optimizations such as memoization).
If there is no data dependency between two pure expressions, then their order can be reversed, or they can be performed in parallel and they cannot interfere with one another (in other terms, the evaluation of any pure expression is thread-safe).
If the entire language does not allow side-effects, then any evaluation strategy can be used; this gives the compiler freedom to reorder or combine the evaluation of expressions in a program (for example, using deforestation).
You could argue the preface to that list rules out everything in JavaScript, since any operation could result in memory being allocated, internal structures updated, etc. In the strictest possible interpretation, JS is never pure. That's not very interesting or useful, so...
This function meets criteria #1. Disregarding the result, (foo) => Symbol(foo) and (foo) => () are identical to any outside observer.
Criteria #2 gives us more trouble. Given bar = (foo) => Symbol(foo), bar('xyz') !== bar('xyz'), so Symbol does not meet that requirement at all. You are guaranteed to get a unique instance back every time you call Symbol.
Moving on, criteria #3 causes no problems. You can call Symbol from different threads without them conflicting (parallel) and it doesn't matter what order they are called in.
Finally, criteria #4 is more of a note than direct requirement, and is easily met (the JS runtimes shuffle everything around as they go).
Therefore:
strictly speaking, nothing in JS can be pure.
Symbol() is definitely not pure, thus the example is not either.
If all you care about is side effects rather than memoization, the example does meet those criteria.
Yes, this function is impure: sentinelToSymbol(-1) !== sentinelToSymbol(-1). We would expect equality here for a pure function.
However, if we use the concept of referential transparency in a language with object identities, we might want to loosen our definition a bit. If you consider function x() { return []; }, is it pure? Obviously x() !== x(), but still the function always returns an empty array regardless of the input, like a constant function. So what we do have to define here is the equality of values in our language. The === operator might not be the best fit here (just consider NaN). Are arrays equal to each other if the contain the same elements? Probably yes, unless they are mutated somewhere.
So you will have to answer the same question for your symbols now. Symbols are immutable, which makes that part easy. Now we could consider them equal by their [[Description]] value (or .toString()), so sentinelToSymbol would be pure by that definition.
But most languages do have functions that allow to break referential transparency - for example see How to print memory address of a list in Haskell. In JavaScript, this would be using === on otherwise equal objects. And it would be using symbols as properties, as that inspects their identity. So if you do not use such operations (or at least without being observable to the outside) in your programs, you can claim purity for your functions and use it for reasoing about your program.