According to my understanding of functional programming, you should be able to chain multiple functions and then execute the whole chain by going through the input data once.
In other words, when I do the following (pseudo-code):
list = [1, 2, 3];
sum_squares = list
.map(function(item) { return item * item; })
.reduce(function(total, item) { return total + item; }, 0);
I expect that the list will be traversed once, with each value being squared and then added to the running total (hence, the map operation would be performed lazily, as needed by the reduce operation).
However, when I look at the source code of Underscore.js, I see that all the "functional programming" functions actually produce intermediate collections like, for example, so:
// Return the results of applying the iteratee to each element.
_.map = _.collect = function(obj, iteratee, context) {
  iteratee = cb(iteratee, context);
  var keys = !isArrayLike(obj) && _.keys(obj),
      length = (keys || obj).length,
      results = Array(length);
  for (var index = 0; index < length; index++) {
    var currentKey = keys ? keys[index] : index;
    results[index] = iteratee(obj[currentKey], currentKey, obj);
  }
  return results;
};
So the question is, as stated in the title, are we fooling ourselves that we do functional programming when we use Underscore.js?
What we actually do is make the program look like functional programming without it actually being functional programming. Imagine I build a long chain of K filter() functions on a list of length N; then in Underscore.js my computational complexity will be O(K*N) instead of the O(N) I would expect from functional programming.
P.S. I've heard a lot about functional programming in JavaScript, and I was expecting to see some functions, generators, binding... Am I missing something?
Is Underscore.js functional programming a fake?
No, Underscore does have lots of useful functional helper functions. But yes, they're doing it wrong. You may want to have a look at Ramda instead.
I expect that the list will be traversed once
Yes, list will only be traversed once. It won't be mutated, and it won't be held in memory (if you didn't keep a variable reference to it). What reduce traverses is a different list: the one produced by map.
All the functions actually produce intermediate collections
Yes, that's the simplest way to implement this in a language like JavaScript. Many people rely on map executing all its callbacks before reduce is called, as they use side effects. JS does not enforce pure functions, and library authors don't want to confuse people.
Notice that even in pure languages like Haskell an intermediate structure is built [1], though it would be consumed lazily so that it is never allocated as a whole.
There are libraries that implement this kind of optimisation in strict languages with the concept of transducers as known from Clojure. Examples in JS are transduce, transducers-js, transducers.js or underarm. Underscore and Ramda have been looking into them [2] too.
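To make the idea concrete, here is a hand-rolled sketch of a transducer (my own illustration, not any of those libraries' actual APIs): each stage transforms the reducing function, so a whole chain runs in one traversal without building intermediate arrays.
// each combinator takes a reducing function and returns a new one
const mapping = (f) => (reducer) => (acc, x) => reducer(acc, f(x));
// a second stage one could compose in, unused below
const filtering = (pred) => (reducer) => (acc, x) => pred(x) ? reducer(acc, x) : acc;

const sum = (total, x) => total + x;
const squares = mapping((x) => x * x);

console.log([1, 2, 3].reduce(squares(sum), 0)); // 14, in a single pass, no intermediate array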
I was expecting to see some […] generators
Yes, generators/iterators that can be consumed lazily are another choice. You'll want to have a look at Lazy.js, highland, or immutable-js.
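For illustration, a minimal generator-based sketch (again my own, not these libraries' actual APIs): map becomes a generator, so each square is computed only when reduce pulls it, and the input is traversed exactly once with no intermediate array.
// lazy map: yields transformed values one at a time
function* lazyMap(iterable, f) {
  for (const x of iterable) yield f(x);
}

// an eager reduce over any iterable
function reduceIterable(iterable, step, initial) {
  let acc = initial;
  for (const x of iterable) acc = step(acc, x);
  return acc;
}

const sumSquares = reduceIterable(lazyMap([1, 2, 3], (x) => x * x),
                                  (total, x) => total + x, 0);
console.log(sumSquares); // 14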
[1]: Well, not really - it's too easy an optimisation
[2]: https://github.com/jashkenas/underscore/issues/1896, https://github.com/ramda/ramda/pull/865
Functional programming has nothing to do with traversing a sequence once; even Haskell, which is as pure as you're going to get, will traverse the length of a strict list twice if you ask it to filter pred (map f x).
Functional programming is a simpler model of computation where the only things that are allowed to happen do not include side effects. For example, in Haskell basically only the following things are allowed to happen:
You can apply a value f to another value x, producing a new value f x with no side-effects. The first value f is called a "function". It must be the case that any time you apply the same f to the same x you get the same answer for f x.
You can give a name to a value, which might be a function or a simple value or whatever.
You can define a new structure for data with a new type signature, and/or structure some data with those "constructors."
You can define a new type-class or show how an existing data structure instantiates a type-class.
You can "pattern match" a data structure, which is a combination of a case dispatch with naming the parts of the data structure for the rest of your project.
Notice how "print something to the console" is not doable in Haskell, nor is "alter an existing data structure." To print something to the console, you construct a value which represents the action of printing something to the console, and then give it a special name, main. (When you're compiling Haskell, you compute the action named main and then write it to disk as an executable program; when you run the program, that action is actually completed.) If there is already a main program, you figure out where you want to include the new action in the existing actions of that program, then use a function to sequence the console logging with the existing actions. The Haskell program never does anything; it just represents doing something.
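To make that tangible, here is a toy JavaScript rendering of the idea (entirely hypothetical, just for illustration): the "program" is an ordinary value describing actions, and only a separate interpreter ever performs the effects.
// actions are plain values; nothing happens when they are constructed
const print = (text) => ({ kind: "print", text });
const sequence = (...actions) => ({ kind: "sequence", actions });

// main *represents* printing; no side effects have occurred yet
const main = sequence(print("hello"), print("world"));

// the impure interpreter, playing the role of the Haskell runtime
function run(action) {
  if (action.kind === "print") console.log(action.text);
  else action.actions.forEach(run);
}

run(main); // only here do the side effects actually happen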
That is the essence of functional programming. It is weaker than normal programming languages where the language itself does stuff, like JavaScript's console.log() function which immediately performs its side effect whenever the JS interpreter runs through it. In particular, there are some things which are (or seem to be) O(1) or O(log(log(n))) in normal programs where our best functional equivalent is O(log(n)).
I'm relatively new to studying functional programming, and things were going well until I had to handle errors and promises. Trying to do it the "right" way, I found many references to Monads as the better solution, but while studying them I ended up in what I'd honestly call a "reference hell": lots of references and sub-references, both mathematical and programming-related, for the same thing, where the same concepts go by different names, which was really confusing. So after persisting with the subject, I'm now trying to summarize and clarify it, and this is what I have so far:
For the sake of understanding I'll oversimplify it.
Monoids: are anything that concatenate/sum two things returning a thing of the same group so in JS any math addition or just concatenation from a string are Monoids for definition as well the composition of functions.
Maps: Maps are just methods that apply a function to each element of a group, without changing the category of the group itself or its length.
Functors: Functors are just objects that have a Map method that return the Functor itself.
Monads: Monads are Functors that use FlatMaps.
FlatMaps: FlatMaps are maps that have the capability of treating promises/fetches or summarizing the received value.
Either, Maybe, Bind, Then: are all FlatMaps but with different names depending on the context in which you use them.
(I think they are all FlatMaps by definition, but there is a difference in the way they are used, since a library like Monet.js has both a Maybe and an Either function, and I don't get the use-case difference).
So my question is: are these concepts right?
If anyone can confirm what I've gotten right so far, correct what I got wrong, or even expand on what I've missed, I would be very grateful.
Thanks to anyone who takes the time.
//=============================================================//
EDIT:
I should've emphasized this more in the post, but these statements and simplified definitions are only from the "practical perspective in JavaScript" (I'm aware of the impossibility of compressing a huge and complex subject like this so drastically, especially if you add another field like mathematics).
//=============================================================//
Monoids: are anything that concatenate/sum two things returning a thing of the same group...
First thing that I don't like here is the word “group”. I know, you're just trying to use simple language and all, but the problem is that group has a very specific mathematical meaning, and we can't just ignore this because groups and monoids are very closely related. (A group is basically a monoid with inverse elements.) So definitely don't use this word in any definition of monoids, however informal. You could say “underlying set” there, but I'd just say type. That may not match the semantics in all programming languages, but certainly in Haskell.
So,
Monoids: are anything that concatenate/sum two things returning a thing of the same type so in JS any math addition or just concatenation from a string are Monoids for definition as well the composition of functions.
Ok. Specifically, concatenation of endofunctions is a monoid. In JavaScript, all functions are in a sense endofunctions, so you can get away with this.
But that's actually describing only a semigroup, not a monoid. (See, there are our groups again... confusingly, monoids are in between semigroups and groups.) A monoid is a semigroup that also has a unit element, which can be concatenated with any other element without making a difference.
For the addition-monoid, this is the number zero.
For the string-monoid, it is the empty string.
For the function monoid, it is the identity function.
Even with unit elements, your characterization is missing the crucial feature of a semigroup/monoid: the concatenation operation must be associative. Associativity is a strangely un-intuitive property, perhaps because in most examples it seems stupidly obvious. But it's actually crucial for much of the maths that is built on those definitions.
To make the importance of associativity clear, it helps to look at some things that are not semigroups because they aren't associative. The type of integers with subtraction is such an example. The result is that you need to watch out where to put your parentheses in maths expressions, to avoid incurring sign errors. Whereas strings can be concatenated with either grouping – ("Hello"+", ")+"World" is the same as "Hello"+(", "+"World").
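If it helps to see this in code, here is a sketch of the three monoids mentioned above, written as {empty, concat} pairs (my own encoding, not any particular library's):
const Sum  = { empty: 0,        concat: (a, b) => a + b };
const Str  = { empty: "",       concat: (a, b) => a + b };
const Endo = { empty: (x) => x, concat: (f, g) => (x) => f(g(x)) };

// unit laws: concatenating the unit element makes no difference
console.log(Sum.concat(5, Sum.empty));                    // 5
console.log(Str.concat("Hi", Str.empty));                 // "Hi"
console.log(Endo.concat(Endo.empty, (n) => n + 1)(41));   // 42

// associativity: the grouping of concatenations doesn't matter
console.log(Str.concat(Str.concat("Hello", ", "), "World")
         === Str.concat("Hello", Str.concat(", ", "World"))); // true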
Maps: Maps are just methods that apply a function to each element of a group, without changing the category of the group itself or its length.
Here we have the next badly chosen word: categories are again a specific maths thing that's very closely related to all we're talking about here, so please don't use the term with any other meaning.
IMO your definitions of “maps” and functors are unnecessary. Just define functors, using the already known concept of functions.
But before we can do that –
Functors: Functors are just objects...
here we go again, with the conflict between mathematical terminology and natural language. Mathematically, objects are the things that live in a category. Functors are not objects per se (although you can construct a specific category in which they are, by construction, objects). And the word is already conflicted by itself: in programming, “object” usually means “value with associated methods”, most often realized via a class.
Your usage of the terms seems to match neither of these established meanings, so I suggest you avoid it.
Mathematically, a functor is a mapping between two categories. That's hardly intuitive, but if you consider the category as a collection of types then a functor simply maps types to types. For example, the list functor maps some type (say, the type of integers) to the type of lists containing values of that type (the type of lists of integers).
Here of course we're running a bit into trouble when considering it all with respect to JS. In dynamic languages, you can easily have lists containing elements of multiple different types. But it's actually ok if we just treat the language as having only one big type that all values are members of. The list functor in JavaScript then maps this universal type to itself.
Blablatheory, what's the point of this all? The actual feature of a functor is not the type-mapping, but instead that it lifts a function on the contained values (i.e. on the values of the type you started with, in my example integers) to a function on the container-values (on lists of integers). More generally, the functor F lifts a function a -> b to a function F(a) -> F(b), for any types a and b. What you called “category of the group itself” means that you really are mapping lists to lists. Even in a dynamically typed language, the list functor's map method won't take a list and produce a dictionary as the result.
I suggest a different understandable-definition:
Functors: Functors wrap types as container-types, which have a mapping method that applies functions on contained values to functions on the whole container.
What you said about length is true of the list functor in particular, but it doesn't really make sense for functors in general. In Haskell we often talk about the fact that functor mapping preserves the “shape” of the container, but that too isn't actually part of the mathematical definition.
What is part of the definition is that a functor should be compatible with composition of the functions. This boils down to being able to map as often as you like. You can always map the identity function without changing the structure, and if you map two functions separately it has the same effect as mapping their composition in one go. It's kind of intuitive that this amounts to the mapping being “shape-preserving”.
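For instance, both functor laws can be checked informally for the array functor in JavaScript:
const id = (x) => x;
const f = (x) => x + 1;
const g = (x) => x * 2;
const xs = [1, 2, 3];

console.log(xs.map(id));             // [1, 2, 3] -- mapping identity changes nothing
console.log(xs.map(f).map(g));       // [4, 6, 8] -- two separate maps...
console.log(xs.map((x) => g(f(x)))); // [4, 6, 8] -- ...same as mapping the composition in one go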
Monads: Monads are Functors that use FlatMaps.
Fair enough, but of course this is just shifting everything to: what's a FlatMap?
Mathematically it's actually easier to not consider the FlatMap / >>= operation at first, but just consider the flattening operation, as well as the singleton injector. Going by example: the list monad is the list functor, equipped with
The operation that creates a list of just a plain contained value. (This is analogous to the unit value of a monoid.)
The operation that takes a nested list and flattens it out to a plain list, by gathering all the values in each of the inner lists. (This is analogous to the sum operation in a monoid.)
Again, it is important that these operations obey laws. These are also analogous to the monoid laws, but unfortunately even less intuitive, because they're simultaneously hard to think about and yet again so almost-trivial that they can seem a bit useless. But specifically the associativity law for lists can be phrased quite nicely:
Flattening the inner lists in a doubly nested list and then flattening the outer ones has the same effect as first flattening the outer ones and then the inner ones.
[[[1,2,3],[4,5]],[[6],[7,8,9]]] ⟼ [[1,2,3,4,5],[6,7,8,9]]     ⟼ [1,2,3,4,5,6,7,8,9]
[[[1,2,3],[4,5]],[[6],[7,8,9]]] ⟼ [[1,2,3],[4,5],[6],[7,8,9]] ⟼ [1,2,3,4,5,6,7,8,9]
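The same law can be demonstrated in JavaScript, using Array.prototype.flat as the flattening operation:
const nested = [[[1, 2, 3], [4, 5]], [[6], [7, 8, 9]]];

// flatten the inner lists first, then the outer ones
console.log(nested.map((xs) => xs.flat()).flat()); // [1,2,3,4,5,6,7,8,9]

// flatten the outer lists first, then the inner ones
console.log(nested.flat().flat());                 // [1,2,3,4,5,6,7,8,9]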
A Monoid is a set and an operator, such that:
The operator is associative for that set
The operator has an identity within that set
So, addition is associative for the set of real numbers, and in the set of real numbers has the identity zero.
a+(b+c) = (a+b)+c -- associative
a+0 = a -- identity
A Map is a transformation between two sets. For every element in the first set, there is a matching element in the second set. As an example, the transformation could be 'take a number and double it'.
The transformation is called a Functor. If the set is mapped back to itself, it is called an Endofunctor.
If an operator and a set form a Monoid, and can also be considered an Endofunctor, then we call that a Monad.
Monoids, Functors, Endofunctors, and Monads are not things in themselves but rather properties of a thing: that an operator and a set have these properties. We can declare this in Haskell by creating instances of the appropriate Monoid, Functor and Monad type-classes.
A FlatMap is a Map combined with a flattening operation. I can declare a map to be from a list to a list of lists. For a Monad we want to go from a list to a list, and so we flatten the list at the end to make it so.
To be blunt, I think all of your definitions are pretty terrible, except maybe the "monoid" one.
Here's another way of thinking about these concepts. It's not "practical" in the sense that it will tell you exactly why a flatmap over the list monad should flatten nested lists, but I think it's "practical" in the sense that it should tell you why we care about programming with monads in the first place, and what monads in general are supposed to accomplish from a practical perspective within functional programs, whether they are written in JavaScript or Haskell or whatever.
In functional programming, we write functions that take certain types as input and produce certain types as output, and we build programs by composing functions whose input and output types match. This is an elegant approach that results in beautiful programs, and it's one of the main reasons functional programmers like functional programming.
Functors provide a way to systematically transform types in a way that adds useful functionality to the original types. For example, we can use a functor to add functionality to a "normal type" that allows it to be missing or absent or "null" (Maybe) or represent either a successfully computed result or an error condition (Either) or that allows it to represent multiple possible values instead of only one (list) or that allows it to be computed at some time in the future (promise) or that requires a context for evaluation (Reader), or that allows a combination of these things.
A map allows us to reuse the functions we've defined for normal types on these new types that have been transformed by a functor, in some natural way. If we already have a function that doubles an integer, we can re-use that function on various functor transformations of integers, like doubling an integer that might be missing (mapping over a Maybe) or doubling an integer that hasn't been computed yet (mapping over a promise) or doubling every element of a list (mapping over a list).
A monad involves applying the functor concept to the output types of functions to produce "operations" that have additional functionality. With monad-less functional programming, we write functions that take "normal types" of inputs and produce "normal types" of outputs, but monads allow us to take "normal types" of inputs and produce transformed types of outputs, like the ones above. Such a monadic operation can represent a function that takes an input and Maybe produces an output, or one that takes an input and promises to produce an output later, or that takes an input and produces a list of outputs.
A flatmap generalizes the composition of functions on normal types (i.e., the way we build monad-less functional programs) to composition of monadic operations, appropriately "chaining" or combining the extra functionality provided by the transformed output types of the monadic operations. So, flatmaps over the maybe monad will compose functions as long as they keep producing outputs and give up when one of those functions has a missing output; flatmaps over the promise monad will turn a chain of operations that each take an input and promise an output into a single composed operation that takes an input and promises a final output; flatmaps over the list monad will turn a chain of operations that each take a single input and produce multiple outputs into a single composed operation that takes an input and produces multiple outputs.
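To make the Maybe case concrete, here is a minimal hypothetical Maybe (my own sketch, not Monet.js's or any other library's API), showing how flatMap composition gives up as soon as an output goes missing:
const Just = (value) => ({ flatMap: (f) => f(value), toString: () => `Just(${value})` });
const Nothing = { flatMap: (_) => Nothing, toString: () => "Nothing" };

// two monadic operations: plain input in, Maybe-transformed output out
const parseNumber = (s) => /^\d+$/.test(s) ? Just(Number(s)) : Nothing;
const reciprocal = (n) => n === 0 ? Nothing : Just(1 / n);

// composition keeps going while outputs keep appearing...
console.log(String(parseNumber("4").flatMap(reciprocal)));   // Just(0.25)
// ...and gives up as soon as one function has a missing output
console.log(String(parseNumber("abc").flatMap(reciprocal))); // Nothing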
Note that these concepts are useful because of their convenience and the systematic approach they take, not because they add magic functionality to functional programs that we wouldn't otherwise have. Of course we don't need a functor to create a list data type, and we don't need a monad to write a function that takes a single input and produces a list of outputs. It just ends up being useful to think in terms of "operations that take an input and promise to produce either an error message or a list of outputs", to compose 50 of those operations together, and to end up with a single composed operation that takes an input and promises either an error message or a list of outputs (without requiring deeply nested lists of nested promises to be manually resolved -- hence the value of "flattening").
(In practical programming terms, monoids don't have that much to do with the rest of these, except to make hilarious in-jokes about the category of endofunctors. Monoids are just a systematic way of combining or "reducing" a bunch of values of a particular type into a single value of that type, in a manner that doesn't depend on which values are combined first or last.)
In a nutshell, functors and their maps allow us to add functionality to our types, while monads and their flatmaps provide a mechanism to use functors while retaining some semblance of the elegance of simple functional composition that makes functional programming so enjoyable in the first place.
An example might help. Consider the problem of performing a depth-first traversal of a file tree. In some sense, this is a simple recursive composition of functions. To generate a filetree() rooted at pathname, we need to call a function on the pathname to fetch its children(), and then we need to recursively call filetree() on those children. In pseudo-JavaScript:
// to generate a filetree rooted at a pathname...
function filetree(pathname) {
  // we need to get the children and generate filetrees rooted at their pathnames
  filetree(children(pathname))
}
Obviously, though, this won't work as real code. For one thing, the types don't match. The filetree function should be called on a single pathname, but children(pathname) will return multiple pathnames. There are also some additional problems -- it's unclear how the recursion is supposed to stop, and there's also the issue that the original pathname appears to get lost in the shuffle as we jump right to its children and their filetrees. Plus, if we're trying to integrate this into an existing Node application with a promise-based architecture, it's unclear how this version of filetree could support the promise-based filesystem API.
But, what if there was a way to add functionality to the types involved while maintaining the elegance of this simple composition? For example, what if we had a functor that allowed us to promise to return multiple values (e.g., multiple child pathnames) while logging strings (e.g., parent pathnames) as a side effect of the processing?
Such a functor would, as I've said above, be a transformation of types. That means that it would transform a "normal" type, like an "integer", into a "promise for a list of integers together with a log of strings". Suppose we implement this as an object containing a single promise:
function M(promise) {
  this.promise = promise
}
which when resolved will yield an object of the form:
{
  "data": [1,2,3,4],  // a list of integers
  "log": ["strings","that","have","been","logged"]
}
As a functor, M would have the following map function:
M.prototype = {
  map: function(f) {
    // wrap the result in a new M so that mapping yields another M value
    return new M(this.promise.then((obj) => ({
      data: obj.data.map(f),
      log: obj.log
    })))
  }
}
which would apply a plain function to the promised data (without affecting the log).
More importantly, as a monad, M would have the following flatMap function:
M.prototype = {
  ...
  flatMap: function(f) {
    // when the promised data is ready
    return new M(this.promise.then(function(obj) {
      // map the function f across the data, generating promises
      var promises = obj.data.map((d) => f(d).promise)
      // wait on all promises
      return Promise.all(promises).then((results) => ({
        // flatten all the outputs
        data: results.flatMap((result) => result.data),
        // add to the existing log
        log: obj.log.concat(results.flatMap((result) => result.log))
      }))
    }))
  }
}
I won't explain in detail, but the idea is that if I have two monadic operations in the M monad, that take a "plain" input and produce an M-transformed output, representing a promise to provide a list of values together with a log, I can use the flatMap method on the output of the first operation to compose it with the second operation, yielding a composite operation that takes a single "plain" input and produces an M-transformed output.
By defining children as a monadic operation in the M monad that promises to take a parent pathname, write it to the log, and produce a list of the children of this pathname as its output data:
function children(parent) {
  return new M(fsPromises.lstat(parent)
    .then((stat) => stat.isDirectory() ? fsPromises.readdir(parent) : [])
    .then((names) => ({
      data: names.map((x) => path.join(parent, x)),
      log: [parent]
    })))
}
I can write the recursive filetree function almost as elegantly as the original above, as a flatMap-assisted composition of the children and recursively invoked filetree functions:
function filetree(pathname) {
  return children(pathname).flatMap(filetree)
}
In order to use filetree, I need to "run" it to extract the log and, say, print it to the console.
// recursively list files starting at current directory
filetree(".").promise.then((x) => console.log(x.log))
The full code is below. Admittedly, there's a fair bit of it, and some of it is pretty complicated, so the elegance of the filetree function appears to have come at a fairly big cost, as we've apparently just moved all the complexity (and then some) into the M monad. However, the M monad is a general tool, not specific to performing depth-first traversals of file trees. Also, in an ideal world, a sophisticated JavaScript monad library would allow you to build the M monad from monadic pieces (promise, list, and log) with a couple of lines of code.
var path = require('path')
var fsPromises = require('fs').promises
function M(promise) {
  this.promise = promise
}

M.prototype = {
  map: function(f) {
    // wrap the result in a new M so that mapping yields another M value
    return new M(this.promise.then((obj) => ({
      data: obj.data.map(f),
      log: obj.log
    })))
  },
  flatMap: function(f) {
    // when the promised data is ready
    return new M(this.promise.then(function(obj) {
      // map the function f across the data, generating promises
      var promises = obj.data.map((d) => f(d).promise)
      // wait on all promises
      return Promise.all(promises).then((results) => ({
        // flatten all the outputs
        data: results.flatMap((result) => result.data),
        // add to the existing log
        log: obj.log.concat(results.flatMap((result) => result.log))
      }))
    }))
  }
}
// not used in this example, but this embeds a single value of a "normal" type into the M monad
M.of = (x) => new M(Promise.resolve({ data: [x], log: [] }))
function filetree(pathname) {
  return children(pathname).flatMap(filetree)
}

function children(parent) {
  return new M(fsPromises.lstat(parent)
    .then((stat) => stat.isDirectory() ? fsPromises.readdir(parent) : [])
    .then((names) => ({
      data: names.map((x) => path.join(parent, x)),
      log: [parent]
    })))
}
// recursively list files starting at current directory
filetree(".").promise.then((x) => console.log(x.log))
Does it influence performance if I cache the result of Object.keys() in a variable when repeatedly checking its contents inside every(), filter(), some(), etc.? Does the JS engine perform inline caching in this case?
X = {/*...*/};
t = [/*...*/];
f = key => {/*...*/}; // somehow transforms keys of X
// we want to check if all elements of t are the transformed keys of X
keys = Object.keys(X).map(f);
result = t.every(a=>keys.includes(a)); // <──────── is this faster than this ?
result = t.every(a=>Object.keys(X).map(f).includes(a)); // <──────────────┘
// same with filtering etc.
t = t.filter(a => !Object.keys(X).includes(a)); // discarding elements from t that are not keys in X
The answer will influence my coding practice. I'd much rather use the second, more compact version, because it can be used in one-line lambdas and is easier to read without jumping to definitions.
Unless your code performs the operations many thousands of times, it won't matter. However making a function call (to Object.keys()) and in the process instantiating a new array will always use more resources than not doing that.
It's more important for your code to be understandable and maintainable than optimally composed in almost all cases.
It's also a really good idea to not rely on supposed optimization behaviors inside the JavaScript runtime(s). Those may change. It's OK to simply trust that somebody maintaining the runtime has performance as their full-time job.
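If you do want to convince yourself, a rough measurement along these lines will show the difference once the iteration counts get large (absolute numbers vary by engine; the f mapping from the question is omitted for simplicity):
// some throwaway test data
const X = Object.fromEntries(Array.from({ length: 100 }, (_, i) => [i, i]));
const t = Array.from({ length: 100 }, (_, i) => String(i));

console.time("cached");
const keys = Object.keys(X); // computed once, outside the loop
for (let i = 0; i < 10000; i++) t.every((a) => keys.includes(a));
console.timeEnd("cached");

console.time("uncached");
// Object.keys(X) recomputed for every element of t, on every iteration
for (let i = 0; i < 10000; i++) t.every((a) => Object.keys(X).includes(a));
console.timeEnd("uncached");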
This may border on philosophical, but I thought it would be the right place to ask.
Suppose I have a function that creates a list of IDs. These identifiers are only used internally to the application, so it is acceptable to use ES2015 Symbol() here.
My problem is that, technically, when you ask for a Symbol, I'd imagine the JS runtime creates a unique identifier (random number? memory address? unsure) which, to prevent collisions, would require accessing global state. The reason I'm unsure is because of that word, "technically". I'm not sure (again, from a philosophical standpoint) if this ought to be enough to break the mathematical abstraction that the API presents.
tl;dr: here's an example--
function sentinelToSymbol(x) {
  if (x === -1) return Symbol();
  return x;
}
Is this function pure?
Not really, no, but it might not actually matter.
On the surface, (foo) => Symbol(foo) appears pure. While the runtime may do some operations with side effects, you will never see them, even if you call Symbol() at the same time with the same parameters. However, calling Symbol with the same arguments will never return the same value, which violates one of the main criteria (#2, below).
From the MDN page:
Note that Symbol("foo") does not coerce the string "foo" into a symbol. It creates a new symbol each time:
Symbol("foo") === Symbol("foo"); // false
Looking solely at side effects, (foo) => Symbol(foo) is pure (above the runtime).
However, a pure function must meet more criteria. From Wikipedia:
Purely functional functions (or expressions) have no side effects (memory or I/O). This means that pure functions have several useful properties, many of which can be used to optimize the code:
If the result of a pure expression is not used, it can be removed without affecting other expressions.
If a pure function is called with arguments that cause no side-effects, the result is constant with respect to that argument list (sometimes called referential transparency), i.e. if the pure function is again called with the same arguments, the same result will be returned (this can enable caching optimizations such as memoization).
If there is no data dependency between two pure expressions, then their order can be reversed, or they can be performed in parallel and they cannot interfere with one another (in other terms, the evaluation of any pure expression is thread-safe).
If the entire language does not allow side-effects, then any evaluation strategy can be used; this gives the compiler freedom to reorder or combine the evaluation of expressions in a program (for example, using deforestation).
You could argue the preface to that list rules out everything in JavaScript, since any operation could result in memory being allocated, internal structures updated, etc. In the strictest possible interpretation, JS is never pure. That's not very interesting or useful, so...
This function meets criterion #1. Disregarding the result, (foo) => Symbol(foo) and (foo) => {} are identical to any outside observer.
Criterion #2 gives us more trouble. Given bar = (foo) => Symbol(foo), bar('xyz') !== bar('xyz'), so Symbol does not meet that requirement at all. You are guaranteed to get a unique instance back every time you call Symbol.
Moving on, criterion #3 causes no problems. You can call Symbol from different threads without them conflicting (parallel) and it doesn't matter what order they are called in.
Finally, criterion #4 is more of a note than a direct requirement, and it is easily met (the JS runtimes shuffle everything around as they go).
Therefore:
strictly speaking, nothing in JS can be pure.
Symbol() is definitely not pure, thus the example is not either.
If all you care about is side effects rather than memoization, the example does meet those criteria.
Yes, this function is impure: sentinelToSymbol(-1) !== sentinelToSymbol(-1). We would expect equality here for a pure function.
However, if we use the concept of referential transparency in a language with object identities, we might want to loosen our definition a bit. If you consider function x() { return []; }, is it pure? Obviously x() !== x(), but still the function always returns an empty array regardless of the input, like a constant function. So what we do have to define here is the equality of values in our language. The === operator might not be the best fit here (just consider NaN). Are arrays equal to each other if they contain the same elements? Probably yes, unless they are mutated somewhere.
So you will have to answer the same question for your symbols now. Symbols are immutable, which makes that part easy. Now we could consider them equal by their [[Description]] value (or .toString()), so sentinelToSymbol would be pure by that definition.
But most languages do have functions that allow one to break referential transparency - for example, see How to print memory address of a list in Haskell. In JavaScript, this would be using === on otherwise equal objects. And it would be using symbols as properties, as that inspects their identity. So if you do not use such operations (or at least not observably from the outside) in your programs, you can claim purity for your functions and use it for reasoning about your program.
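To make those identity-observing operations concrete:
const s1 = Symbol("id");
const s2 = Symbol("id");

console.log(s1.toString() === s2.toString()); // true  -- equal by description
console.log(s1 === s2);                       // false -- distinguishable by identity

const obj = {};
obj[s1] = 1;
console.log(obj[s2]); // undefined -- property access also observes identity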
This example is from http://eloquentjavascript.net/code/#5.1.
My question is the first bullet-pointed detail; the other details may be helpful, but are additional; also see the first short program to see my question in context.
- Why is arrays.reduce() used instead of reduce(arrays, …)? I know that theirs works with arrays.reduce, but why?
This is an answer to a comment; it is useful, but additional to the original question:
My question is with this first program. Since it uses arrays.reduce, reduce would be a method of arrays; I am not sure why reduce is a method of arrays. The reason might be in the design decisions of JavaScript? Thanks, #cookie monster, for that comment!
This is the program with the context of my question-
var arrays = [[1, 2, 3], [4, 5], [6]];
console.log(arrays.reduce(function(flat, current) {
return flat.concat(current);
}, []));
// → [1, 2, 3, 4, 5, 6]
These next details are additional, but may(or may not) be of use:
I know that the [] at the end is used because it is the start parameter in the reduce function, so that the other arrays are added onto that empty array. I know that .concat puts two arrays together, like + does with strings. Here is what the reduce function looks like, even though it is standard:
function reduce(array, combine, start) {
  var current = start;
  for (var i = 0; i < array.length; i++)
    current = combine(current, array[i]);
  return current;
}
One of their other examples showed my way with a single array, if that helps. It looked like:
console.log(reduce([1, 2, 3, 4], function(a, b) {
  return a + b;
}, 0));
// → 10
Thanks! :)
In object oriented design, a guiding principle is that you create or declare objects and those objects have a series of methods on them that operate on that particular type of object. As Javascript is an object oriented language, the built-in functions follow many of the object oriented principles.
.reduce() is a function that operates only on Arrays. It is of no use without an Array to operate on. Thus, in an object oriented design paradigm, it makes complete sense to place the .reduce() function as a method on an Array object.
This object-oriented approach offers the following advantages vs. a global reduce() function:
It is consistent with the object oriented principles used elsewhere in the language.
It is convenient to invoke the .reduce() function on a particular array by using array.reduce() and it is obvious from the syntax which array it is operating on.
All array operations are grouped as methods on the Array object making the API definition more self documenting.
If you attempt to do obj.reduce() (invoke it on a non-array), you will immediately get a runtime error about an undefined .reduce() method.
No additional space is taken in the global scope (e.g. no additional global symbol is defined - lessening the chance for accidental overwrite or conflict with other code).
If you want to know why anyone ever used a global reduce() instead, you need to understand a little bit about the history of Javascript's evolution. Before ES5 (or for users running browsers that hadn't yet implemented ES5, like IE8), Javascript did not have a built-in .reduce() method on the Array object. Yet some developers who were familiar with this type of useful iteration capability from other languages wanted to use it in their own Javascript or in their own Javascript framework.
When you want to add some functionality to an existing built-in object like Array in Javascript, you generally have two choices. You can add a method to the prototype (in object-oriented fashion) or you can create a global function that takes the Array as its first argument.
For the Array object in particular, there are some potential issues with adding iterable methods to the Array prototype. This is because if any code does this type of iteration of an array (a bad way to iterate arrays, but nevertheless done by many):
for (var prop in array)
they will end up iterating not only the elements of the array, but also any iterable properties of the array. If a new method is assigned to the prototype with this type of syntax:
Array.prototype.myMethod = function() {}
Then, this new method will end up getting iterated with the for (var prop in array) syntax and it will often cause problems.
So, rather than adding a new method to the prototype, a safer alternative was to just use a global function and avoid this issue.
Starting in ES 5.1, there is a way, using Object.defineProperty(), to add non-enumerable methods to an object or prototype, so for newer browsers it is now possible to add methods to a prototype that are not subject to the problem mentioned above. But if you want to support older browsers (like IE8) and use reduce() type functionality, you're still stuck with these ancient limitations.
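For the record, a sketch of that Object.defineProperty() approach: the method is defined as non-enumerable, so a for (var prop in array) loop will not see it.
Object.defineProperty(Array.prototype, "myMethod", {
  value: function () { /* ... */ },
  enumerable: false, // the crucial part: hidden from for...in
  writable: true,
  configurable: true
});

var arr = [1, 2, 3];
for (var prop in arr) console.log(prop); // logs 0, 1, 2 -- no "myMethod"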
So ... even though a global reduce() is less object oriented and generally not as desirable in Javascript, some older code went that route for legitimate safety/interoperability reasons. Fortunately, we are putting that road behind us as old versions of IE drop off in usage (thank god for Microsoft dropping XP support to finally accelerate the demise of old versions of IE). And newer browsers already have array.reduce() built in.
JavaScript was/is influenced by the language Scheme, a dialect of Lisp. In Scheme higher order functions are a key component/feature of the language. In fact the reduce function is pretty much equivalent to the fold function in Scheme. In the case of the reduce function in JavaScript, the creators of the language noticed that programmers often need to traverse arrays in a certain fashion, and gave programmers a higher order function where they can pass in a function to specify how they want to manipulate the data. Having higher order functions allows programmers to abstract redundant code therefore creating shorter, cleaner, more readable code.
You can use the reduce function in JavaScript to do many things other than flatten lists; see here for an example.
In Douglas Crockford's book "Javascript: The Good Parts" he provides code for a curry method which takes a function and arguments and returns that function with the arguments already added (apparently, this is not really what "curry" means, but is an example of "partial application"). Here's the code, which I have modified so that it works without some other custom code he made:
Function.prototype.curry = function() {
  var slice = Array.prototype.slice,
      args = slice.apply(arguments),
      that = this;
  return function() {
    // context set to null, which will cause `this` to refer to the window
    return that.apply(null, args.concat(slice.apply(arguments)));
  };
};
So if you have an add function:
var add = function(num1, num2) {
  return num1 + num2;
};
add(2, 4); // returns 6
You can make a new function that already has one argument:
var add1 = add.curry(1);
add1(2); // returns 3
That works fine. But what I want to know is why does he set this to null? Wouldn't the expected behavior be that the curried method is the same as the original, including the same this?
My version of curry would look like this:
Function.prototype.myCurry = function() {
  var slice = [].slice,
      args = slice.apply(arguments),
      that = this;
  return function() {
    // context set to whatever `this` is when the new function is called
    return that.apply(this, args.concat(slice.apply(arguments)));
  };
};
Example
(Here is a jsfiddle of the example)
var calculator = {
  history: [],
  multiply: function(num1, num2) {
    this.history = this.history.concat([num1 + " * " + num2]);
    return num1 * num2;
  },
  back: function() {
    return this.history.pop();
  }
};
var myCalc = Object.create(calculator);
myCalc.multiply(2, 3); // returns 6
myCalc.back(); // returns "2 * 3"
If I try to do it Douglas Crockford's way:
myCalc.multiplyPi = myCalc.multiply.curry(Math.PI);
myCalc.multiplyPi(1); // TypeError: Cannot call method 'concat' of undefined
If I do it my way:
myCalc.multiplyPi = myCalc.multiply.myCurry(Math.PI);
myCalc.multiplyPi(1); // returns 3.141592653589793
myCalc.back(); // returns "3.141592653589793 * 1"
However, I feel like if Douglas Crockford did it his way, he probably has a good reason. What am I missing?
Reader beware, you're in for a scare.
There's a lot to talk about when it comes to currying, functions, partial application and object-orientation in JavaScript. I'll try to keep this answer as short as possible but there's a lot to discuss. Hence I have structured my article into several sections and at the end of each I have summarized each section for those of you who are too impatient to read it all.
1. To curry or not to curry
Let's talk about Haskell. In Haskell every function is curried by default. For example we could create an add function in Haskell as follows:
add :: Int -> Int -> Int
add a b = a + b
Notice the type signature Int -> Int -> Int? It means that add takes an Int and returns a function of type Int -> Int which in turn takes an Int and returns an Int. This allows you to partially apply functions in Haskell easily:
add2 :: Int -> Int
add2 = add 2
The same function in JavaScript would look ugly:
function add(a) {
  return function (b) {
    return a + b;
  };
}
var add2 = add(2);
var add2 = add(2);
The problem here is that functions in JavaScript are not curried by default. You need to manually curry them and that's a pain. Hence we use partial application (aka bind) instead.
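For comparison, partial application with the built-in Function.prototype.bind looks like this (note the null passed for this, a wart discussed later in this answer):
function add(a, b) {
  return a + b;
}

// pin the first argument; null fills the unused this slot
var add2 = add.bind(null, 2);
console.log(add2(3)); // 5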
Lesson 1: Currying is used to make it easier to partially apply functions. However it's only effective in languages in which functions are curried by default (e.g. Haskell). If you have to manually curry functions then it's better to use partial application instead.
2. The structure of a function
Uncurried functions also exist in Haskell. They look like functions in "normal" programming languages:
main = print $ add(2, 3)
add :: (Int, Int) -> Int
add(a, b) = a + b
You can convert a function in its curried form to its uncurried form and vice versa using the uncurry and curry functions in Haskell respectively. An uncurried function in Haskell still takes only one argument. However that argument is a product of multiple values (i.e. a product type).
In the same vein functions in JavaScript also take only a single argument (it just doesn't know it yet). That argument is a product type. The arguments value inside a function is a manifestation of that product type. This is exemplified by the apply method in JavaScript which takes a product type and applies a function to it. For example:
print(add.apply(null, [2, 3]));
Can you see the similarity between the above line in JavaScript and the following line in Haskell?
main = print $ add(2, 3)
Ignore the assignment to main if you don't know what it's for. It's irrelevant to the topic at hand. The important thing is that the tuple (2, 3) in Haskell is isomorphic to the array [2, 3] in JavaScript. What do we learn from this?
The apply function in JavaScript is the same as function application (or $) in Haskell:
($) :: (a -> b) -> a -> b
f $ a = f a
We take a function of type a -> b and apply it to a value of type a to get a value of type b. However since all functions in JavaScript are uncurried by default the apply function always takes a product type (i.e. an array) as its second argument. That is to say that the value of type a is actually a product type in JavaScript.
Lesson 2: All functions in JavaScript only take a single argument which is a product type (i.e. the arguments value). Whether this was intended or happenstance is a matter of speculation. However the important point is that you understand that mathematically every function only takes a single argument.
Mathematically a function is defined as a morphism: a -> b. It takes a value of type a and returns a value of type b. A morphism can only have one argument. If you want multiple arguments then you could either:
Return another morphism (i.e. b is another morphism). This is currying. Haskell does this.
Define a to be a product of multiple types (i.e. a is a product type). JavaScript does this.
Out of the two I prefer curried functions as they make partial application trivial. Partial application of "uncurried" functions is more complicated. Not difficult, mind you, but just more complicated. This is one of the reasons why I like Haskell more than JavaScript: functions are curried by default.
3. Why OOP matters not
Let's take a look at some object-oriented code in JavaScript. For example:
var oddities = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].filter(odd).length;

function odd(n) {
  return n % 2 !== 0;
}
Now you might wonder how is this object-oriented. It looks more like functional code. After all you could do the same thing in Haskell:
oddities = length . filter odd $ [0..9]
Nevertheless the above code is object-oriented. The array literal is an object which has a method filter which returns a new array object. Then we simply access the length of the new array object.
What do we learn from this? Chaining operations in object-oriented languages is the same as composing functions in functional languages. The only difference is that the functional code reads backwards. Let's see why.
In JavaScript the this parameter is special. It's separate from the formal parameters of the function which is why you need to specify a value for it separately in the apply method. Because this comes before the formal parameters, methods are chained from left-to-right.
add.apply(null, [2, 3]); // this comes before the formal parameters
If this were to come after the formal parameters the above code would probably read as:
var oddities = length.filter(odd).[0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
apply([2, 3], null).add; // this comes after the formal parameters
Not very nice is it? Then why do functions in Haskell read backwards? The answer is currying. You see functions in Haskell also have a "this" parameter. However unlike in JavaScript the this parameter in Haskell is not special. In addition it comes at the end of the argument list. For example:
filter :: (a -> Bool) -> [a] -> [a]
The filter function takes a predicate function and a this list and returns a new list with only the filtered elements. So why is the this parameter last? It makes partial application easier. For example:
filterOdd = filter odd
oddities = length . filterOdd $ [0..9]
In JavaScript you would write:
Array.prototype.filterOdd = [].filter.myCurry(odd);
var oddities = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].filterOdd().length;
Now which one would you choose? If you're still complaining about reading backwards then I have news for you. You can make Haskell code read forwards using "backward application" and "backward composition" as follows:
($>) :: a -> (a -> b) -> b
a $> f = f a
(>>>) :: (a -> b) -> (b -> c) -> (a -> c)
f >>> g = g . f
oddities = [0..9] $> filter odd >>> length
Now you have the best of both worlds. Your code reads forwards and you get all the benefits of currying.
In JavaScript, there are a lot of problems with this that don't occur in functional languages:
The this parameter is specialized. Unlike other parameters you can't simply set it to an arbitrary object. Hence you need to use call to specify a different value for this.
If you want to partially apply functions in JavaScript then you need to specify null as the first parameter of bind. Similarly for call and apply.
Object-oriented programming has nothing to do with this. In fact you can write object-oriented code in Haskell as well. I would go as far as to say that Haskell is in fact an object-oriented programming language, and a far better one at that than Java or C++.
Lesson 3: Functional programming languages are more object-oriented than most mainstream object-oriented programming languages. In fact object-oriented code in JavaScript would be better (although admittedly less readable) if written in a functional style.
The problem with object-oriented code in JavaScript is the this parameter. In my humble opinion the this parameter shouldn't be treated any differently than formal parameters (Lua got this right). The problem with this is that:
There's no way to set this like other formal parameters. You have to use call instead.
You have to set this to null in bind if you wish to only partially apply a function.
On a side note I just realized that every section of this article is becoming longer than the preceding section. Hence I promise to keep the next (and final) section as short as possible.
4. In defense of Douglas Crockford
By now you must have picked up that I think that most of JavaScript is broken and that you should shift to Haskell instead. I like to believe that Douglas Crockford is a functional programmer too and that he is trying to fix JavaScript.
How do I know that he's a functional programmer? He's the guy that:
Popularized the functional equivalent of the new keyword (a.k.a. Object.create). If you don't already use Object.create, then you should stop using the new keyword.
Attempted to explain the concept of monads and gonads to the JavaScript community.
Anyway, I think Crockford nullified this in the curry function because he knows how bad this is. It would be sacrilege to set it to anything other than null in a book entitled "JavaScript: The Good Parts". I think he's making the world a better place one feature at a time.
By nullifying this Crockford is forcing you to stop relying on it.
Edit: As Bergi requested I'll describe a more functional way to write your object-oriented Calculator code. We will use Crockford's curry method. Let's start with the multiply and back functions:
function multiply(a, b, history) {
  return [a * b, [a + " * " + b].concat(history)];
}

function back(history) {
  return [history[0], history.slice(1)];
}
As you can see, the multiply and back functions don't belong to any object. Hence you can use them on any array. In particular, your Calculator class is just a wrapper for a list of strings, so you don't even need to create a different data type for it. Hence:
var myCalc = [];
Now you can use Crockford's curry method for partial application:
var multiplyPi = multiply.curry(Math.PI);
Next we'll create a test function that applies multiplyPi to one and then goes back to the previous state:
var test = bindState(multiplyPi.curry(1), function (prod) {
  alert(prod);
  return back;
});
If you don't like the syntax then you could switch to LiveScript:
test = do
    prod <- bindState multiplyPi.curry 1
    alert prod
    back
The bindState function is the bind function of the state monad. It's defined as follows:
function bindState(g, f) {
  return function (s) {
    var a = g(s);
    return f(a[0])(a[1]);
  };
}
So let's put it to the test:
alert(test(myCalc)[0]);
See the demo here: http://jsfiddle.net/5h5R9/
BTW this entire program would have been more succinct if written in LiveScript as follows:
multiply = (a, b, history) --> [a * b, [a + " * " + b] ++ history]
back = ([top, ...history]) -> [top, history]

myCalc = []

multiplyPi = multiply Math.PI

bindState = (g, f, s) -->
    [a, t] = g s
    (f a) t

test = do
    prod <- bindState multiplyPi 1
    alert prod
    back

alert (test myCalc .0)
See the demo of the compiled LiveScript code: http://jsfiddle.net/5h5R9/1/
So how is this code object oriented? Wikipedia defines object-oriented programming as:
Object-oriented programming (OOP) is a programming paradigm that represents concepts as "objects" that have data fields (attributes that describe the object) and associated procedures known as methods. Objects, which are usually instances of classes, are used to interact with one another to design applications and computer programs.
According to this definition functional programming languages like Haskell are object-oriented because:
In Haskell we represent concepts as algebraic data types which are essentially "objects on steroids". An ADT has one or more constructors which may have zero or more data fields.
ADTs in Haskell have associated functions. However unlike in mainstream object-oriented programming languages ADTs don't own the functions. Instead the functions specialize upon the ADTs. This is actually a good thing as ADTs are open to adding more methods. In traditional OOP languages like Java and C++ they are closed.
ADTs can be made instances of typeclasses which are similar to interfaces in Java. Hence you still do have inheritance, variance and subtype polymorphism but in a much less intrusive form. For example Functor is a superclass of Applicative.
The above code is also object-oriented. The object in this case is myCalc which is simply an array. It has two functions associated with it: multiply and back. However it doesn't own these functions. As you can see the "functional" object-oriented code has the following advantages:
Objects don't own methods. Hence it's easy to associate new functions to objects.
Partial application is made simple via currying.
It promotes generic programming.
So I hope that helped.
Reason 1 - not easy to provide a general solution
The problem is that your solution is not general. If the caller doesn't assign the new function to any object, or assigns it to a completely different object, your multiplyPi function will stop working:
var multiplyPi = myCalc.multiply.myCurry(Math.PI);
multiplyPi(1); // TypeError: this.history.concat is not a function
So, neither Crockford's nor your solution can ensure that the function will be used correctly. It may then be easier to say that the curry function works only on "functions", not "methods", and to set this to null to enforce that. We can only speculate, though, since Crockford doesn't mention it in the book.
Reason 2 - functions are being explained
If you are asking "why didn't Crockford use this or that" - the most likely answer is: "It wasn't important with regard to the matter being demonstrated." Crockford uses this example in the chapter Functions. The purpose of the sub-chapter curry was:
to show that functions are objects you can create and manipulate
to demonstrate another usage of closures
to show how arguments can be manipulated.
Fine-tuning this for general usage with objects was not the purpose of this chapter. As that is problematic, if not outright impossible (see Reason 1), it was more educational to just put null there instead of putting something which could raise questions about whether it actually works or not (that didn't help in your case, though :-)).
Conclusion
That said, I think you can be perfectly confident in your solution! There's no particular reason in your case to follow Crockford's decision to reset this to null. You must be aware, though, that your solution only works under certain circumstances and is not 100% clean. A clean "object oriented" solution would be to ask the object to create a clone of its method inside itself, ensuring that the resulting method stays within the same object.
But what I want to know is why does he set this to null?
There is not really a reason. Probably he wanted to simplify, and most functions that make sense to be curried or partially applied are not OOP-methods that use this. In a more functional style the history array that is appended to would be another parameter of the function (and maybe even a return value).
Wouldn't the expected behavior be that the curried method is the same as the original, including the same this?
Yes, your implementation makes much more sense, however one might not expect that a partially applied function still needs to be called in the correct context (as you do by re-assigning it to your object) if it uses one.
For those, you might have a look at the bind method of Function objects for partial application including a specific this-value.
From MDN:
thisArg: The value of this provided for the call to fun. Note that this may not be the actual value seen by the method: if the method is a function in non-strict mode code, null and undefined will be replaced with the global object, and primitive values will be boxed.
Hence, if the method is in non-strict mode and the first argument is null or undefined, this inside of that method will reference Window. In strict mode, this is null or undefined. I've added a live example on this Fiddle.
Furthermore, passing in null or undefined does not do any harm in case the function does not reference this at all. That's probably why Crockford used null in his example: to not overcomplicate things.
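For completeness, a sketch of that bind-based alternative, reusing the calculator and myCalc objects from the question: bind pins both this and the first argument at once.
// partial application that also fixes the context
myCalc.multiplyPi = calculator.multiply.bind(myCalc, Math.PI);
myCalc.multiplyPi(1); // returns 3.141592653589793
myCalc.back();        // returns "3.141592653589793 * 1"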