Is it accurate to say JavaScript is a "single-thread" language?

I've heard this kind of statement many times, but personally I think it doesn't quite make sense. I think people are conflating JavaScript as a language specification with JavaScript in practice (browsers, Node, etc.). Of course, in most cases JavaScript is executed in a single-threaded environment, but AFAIK nothing in the language specification requires it to be so. I think this is just like saying Python is "interpreted", when that is in fact entirely a matter of implementation.
So, is it accurate to say JavaScript is a "single-thread" language?

By JavaScript you seem to mean ECMAScript.
There's already multithreading in the browser, built with webworkers, and based on a strong isolation of data : workers only communicate by message passing, nothing is shared.
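To make the isolation concrete: postMessage copies its payload using the structured clone algorithm rather than sharing it. The sketch below uses the global structuredClone function (available in current browsers and in Node.js 17+) as a stand-in for what happens to a message in transit. The payload and variable names are made up for illustration, and no actual worker is started.

```javascript
// A message as the sending side sees it (hypothetical payload).
const outgoing = { job: "resize", pixels: [1, 2, 3] };

// postMessage serializes its argument with the structured clone algorithm;
// structuredClone exposes the same copy semantics without needing a worker.
const received = structuredClone(outgoing);

// The receiver mutates its copy...
received.pixels.push(4);

// ...and the sender's data is unaffected, so no locking is needed.
console.log(outgoing.pixels.length); // 3
console.log(received.pixels.length); // 4
```

Because each side only ever touches its own copy, there is nothing to synchronize, which is exactly the limitation (and the safety) described above.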
If you want more intricate multithreading, with data sharing, then that doesn't look possible now. Nothing in ECMAScript explicitly forbids multithreading, but you can't do multithreading without:
facilities to create "threads" (in a general sense, which could include coroutines);
mutexes and other facilities to synchronize access;
low-level support to ensure, for example, that a property change won't corrupt the data under simultaneous access. None of the current engines has been designed with this kind of robustness (yes, some of them support multiple threads, but in isolation).
The fact that ECMAScript wasn't designed to include multithreading is enough, currently, to prevent engines from supporting it (other than the isolated, message-passing multithreading that already exists, which is a very limited kind of multithreading).
You have to realize that:
data-sharing multithreading is very expensive (not to mention simultaneous actions on the DOM);
you would rarely use it in JavaScript.
Why do I say you would rarely use it? Because most of the blocking I/O tasks (file reading, requests, DB queries, etc.), most of the low-level tasks (for example image decoding or page rendering), most of the UI management (with the event queue), and most of the scheduling (timeouts and intervals) are done outside your code, for you.
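A small sketch of the scheduling point: the timer below is handed to the host (browser or Node.js), and its callback can only run after the currently executing code has finished. The log array exists only for illustration.

```javascript
const log = [];

// Ask the host to schedule a callback; the host's timer machinery,
// not our code, decides when the callback becomes ready to run.
setTimeout(() => log.push("timer fired"), 0);

// Synchronous work runs to completion first, even with a 0 ms delay.
for (let i = 0; i < 3; i++) {
  log.push("sync step " + i);
}

// At this point only the synchronous entries are present.
console.log(log.join(", "));
```

The timer entry is appended only once this script has returned control to the event loop, which is why the script itself never needs a thread to "wait".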

Multi-threading behavior is available in both HTML5 and node.js, BUT there is no native threading API in the Javascript language, so I guess the answer to your contrived question (I mean that in the nicest possible way, of course) is "yes, Javascript is a single-threaded language."

AFAIK nothing in the language specification requires it to be so.
The ECMAScript specification (ECMA-262, maintained by TC39) says:
At any point in time, there is at most one execution context per agent that is actually executing code.
That appears to me to be the crucial guarantee: you never have to synchronize access to variables in ECMAScript, which is what I expect is meant when someone says a language is single-threaded.
Of course, most ECMAScript host environments use more than one thread in their host environment implementation for things like garbage collection, etc. And, ECMAScript itself allows for multiple separate contexts of execution that could each be driven by their own thread--though the standard makes it clear you could also just drive all of them with the same thread.
The point, again, is you never have to protect any ECMAScript variable with a mutex, semaphore, or the like (which is why ECMAScript provides no such facilities) since the language promises there will never be two threads of control with simultaneous access to the same context.
I don't know of any JavaScript implementation that violates this promise either, though there certainly could be.

Related

Thread-safety when passing array as argument in WebAssembly? [Emscripten]

AFAIK, when passing arrays from JS to Emscripten-compiled C/C++ functions, we are essentially putting the array into a JS-simulated heap (like Module.HEAPU8), which is shared by JS code and C/C++ code.
This works fine in a single-threaded environment, but how about a multi-threaded environment, like worker threads? Is there some built-in mechanism to guarantee the thread safety for this simulated HEAP?
If not, does it mean we need to call Module._malloc() & Module._free() to dynamically manage heap space for each thread? If so, this sounds like a potential performance bottleneck, given the effort for array copy and space allocation/free might compromise the benefit we gain from using worker threads.
Reference: ref1 ref2

Your understanding is correct, but it is currently impossible to share a WebAssembly.Memory across workers. JavaScript has SharedArrayBuffer but WebAssembly doesn't yet support the equivalent (and compatible) WebAssembly.Memory with shared=true attribute.
Once it is supported you'll be able to postMessage a WebAssembly.Memory and use it to instantiate multiple modules with it across workers. You'll also be able to postMessage the underlying SharedArrayBuffer, and read / write to and from it using JavaScript, concurrently with WebAssembly.
In all these cases the memory won't be copied. The WebAssembly malloc / free implementation isn't specified, but what you'll get from e.g. Emscripten will be thread safe. It won't use grow_memory initially (the design currently disallows growing a shared memory), but will rather pre-allocate and make sure that's thread safe for you (like any multi-threaded C implementation does).
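For engines that implement the WebAssembly threads proposal, the shared memory described here can be sketched roughly as follows. This is a single-threaded illustration only: heapU8 is a stand-in for Emscripten's Module.HEAPU8, otherView stands in for the view a second worker would build, and the offset is picked by hand rather than obtained from Module._malloc.

```javascript
// Shared memories must declare a maximum size (in 64 KiB pages).
const memory = new WebAssembly.Memory({ initial: 1, maximum: 1, shared: true });

// With shared: true the backing store is a SharedArrayBuffer, which can be
// sent to a worker with postMessage without being copied or detached.
const isShared = memory.buffer instanceof SharedArrayBuffer;

// Stand-in for Module.HEAPU8: a byte view over the module's linear memory.
const heapU8 = new Uint8Array(memory.buffer);

// Copy a JS array into the heap at a hand-picked offset (a real program
// would get this offset from its allocator).
const offset = 16;
heapU8.set([10, 20, 30], offset);

// A second view (as another worker would create) sees the same bytes.
const otherView = new Uint8Array(memory.buffer);
console.log(isShared, otherView[offset + 1]);
```

Nothing here makes concurrent access safe by itself; it only shows that the views alias one block of memory instead of copying it.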

Is web worker heavier or lighter than a native thread

Is a web worker just a normal native thread created by the browser, which runs and communicates with other browser threads through a message queue? Or does it contain other things when created?
I'm looking at the experimental support of pthread in emscripten, multiple threads in C++ will be translated to web workers once compiled. But will it have the same level of performance as native code? After all fine grained multithreading is a key feature in C++.

At the moment, WebWorkers are pretty heavyweight because the VM duplicates a bunch of its internal state (sometimes even re-JITs code for a worker). That state is much bigger than a native thread's few MiBs of initial stack space and associated state.
Some of this can be fixed by implementations, and I expect that if SharedArrayBuffer or WebAssembly + threads become popular then browser engines will want to optimize things.
That being said, the end of your question hints at a misunderstanding of what thread overheads are, and of how the proposal for SharedArrayBuffer (which Emscripten relies on to support pthreads) works. WebWorkers are heavy at the moment, but they can communicate through SABs in exactly the same way native code such as C++ can: by accessing exactly the same memory, at the same virtual address. SAB adds a new kind of ArrayBuffer to JavaScript which doesn't get neutered when you postMessage it to another worker. Multiple workers can then see each other's updates to the buffer in exactly the same way C++ code does when you use std::atomic.
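A minimal sketch of that sharing, assuming an environment where SharedArrayBuffer is available (Node.js, or a cross-origin-isolated page). To keep it runnable without a worker file, workerView is a stand-in for the view a worker would create after receiving the buffer via postMessage.

```javascript
// One shared buffer; in a real program it would be sent to a worker with
// postMessage, and the worker would build its own view over it.
const sab = new SharedArrayBuffer(4);
const mainView = new Int32Array(sab);
const workerView = new Int32Array(sab); // stand-in for the worker's view

// Atomics operations are sequentially consistent, like std::atomic's default.
Atomics.store(mainView, 0, 40);
Atomics.add(workerView, 0, 2);

console.log(Atomics.load(mainView, 0)); // 42
```

Both views alias the same four bytes, so an update made through one is visible through the other without any message being sent.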
At the same time, workers can't block the main thread by definition and therefore have a more "native" feel. Some web APIs aren't available to all workers, but that's been changing. This becomes relevant if you e.g. write a game and have network / audio / render / AI / input in different threads. The web is slowly finding its own way of doing these things.
The details are a bit trickier:
SAB currently only supports non-atomic accesses, and sequentially-consistent accesses (i.e. the only available Atomic access at the moment is the same as C++'s std::memory_order_seq_cst). Doing non-atomic accesses should be about as performant as C++'s non-atomics (big compiler caveats here which I won't get into), and using Atomic should be about as performant as C++'s std::atomic default (which is std::memory_order_seq_cst). C++ has 5 other memory orders (relaxed, consume, acquire, release, acq_rel) which SAB doesn't support at the moment. These other memory orders allow native code to be faster in some circumstances, but are harder to use correctly and portably. They might be added to future updates to SAB, e.g. through this issue.
SAB also supports Futex which native programs rely on under the hood to implement efficient mutex.
There are even trickier details when comparing to C++. I've detailed some of them, but there are even more.
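In today's JavaScript the futex-like support is exposed as Atomics.wait and Atomics.notify. As a rough sketch of how a mutex can be layered on top, here is a non-blocking "try lock" built from Atomics.compareExchange. The names tryLock, unlock, UNLOCKED and LOCKED are my own; a blocking lock would additionally call Atomics.wait inside a worker (browsers do not allow it on the main thread), which this single-threaded sketch leaves out.

```javascript
const UNLOCKED = 0;
const LOCKED = 1;

// One 32-bit lock word in shared memory.
const lockWord = new Int32Array(new SharedArrayBuffer(4));

// Try to take the lock without blocking: swap UNLOCKED -> LOCKED atomically.
// compareExchange returns the previous value, so UNLOCKED means we won.
function tryLock(word) {
  return Atomics.compareExchange(word, 0, UNLOCKED, LOCKED) === UNLOCKED;
}

// Release the lock and wake at most one agent blocked in Atomics.wait.
// Returns how many waiters were woken (0 here, since nobody is waiting).
function unlock(word) {
  Atomics.store(word, 0, UNLOCKED);
  return Atomics.notify(word, 0, 1);
}

const first = tryLock(lockWord);  // true: the lock was free
const second = tryLock(lockWord); // false: already held
const woken = unlock(lockWord);   // 0: no waiters in this sketch
const third = tryLock(lockWord);  // true: free again
console.log(first, second, woken, third);
```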

Is JavaScript a concurrent language, or is it the JavaScript engines that make the language concurrent?

Is it correct to say that javascript is a concurrent programming language or is it rather the different javascript engines that make javascript concurrent?
Javascript is not listed as a concurrent language on Wikipedia, but node.js is:
http://en.wikipedia.org/wiki/Concurrent_computing#Concurrent_programming_languages.
I would appreciate some more information about where the concurrent behaviour of javascript comes from.

To best answer this, it's important to understand what javascript is.
From the ECMAScript language specification
http://www.ecma-international.org/ecma-262/5.1/
ECMAScript is an object-oriented programming language for performing
computations and manipulating computational objects within a host
environment. ECMAScript as defined here is not intended to be
computationally self-sufficient; indeed, there are no provisions in
this specification for input of external data or output of computed
results. Instead, it is expected that the computational environment of
an ECMAScript program will provide not only the objects and other
facilities described in this specification but also certain
environment-specific host objects, whose description and behaviour are
beyond the scope of this specification except to indicate that they
may provide certain properties that can be accessed and certain
functions that can be called from an ECMAScript program.
It's up to the host to determine the implementation. Node.js is one such host, browsers are another such host. Any host can choose to implement the language as per specification, and as a host can provide its own environment by which information is processed.
So, to answer the question
Is it correct to say that javascript is a concurrent programming
language or is it rather the different javascript engines that make
javascript concurrent?
I would say no, it is not correct to say javascript is a concurrent programming language, because the answer to that depends on the host environment (or engine); however, concurrency can be made possible through a host environment (engine) that enables it.

What optimizations do modern JavaScript engines perform?

By now, most mainstream browsers have started integrating optimizing JIT compilers into their JavaScript interpreters/virtual machines. That's good for everyone. Now, I'd be hard-pressed to say exactly which optimizations they perform and how best to take advantage of them. What are good references on the optimizations in each of the major JavaScript engines?
Background:
I'm working on a compiler that generates JavaScript from a higher-level & safer language (shameless plug: it's called OPA and it's very cool) and, given the size of applications I'm generating, I'd like my JavaScript code to be as fast and as memory-efficient as possible. I can handle high-level optimizations, but I need to know more about which runtime transformations are performed, so as to know which low-level code will produce best results.
One example, off the top of my head: the language I'm compiling will soon integrate support for laziness. Do JIT engines behave well with lazy function definitions?

This article series discusses the optimisations of V8. In summary:
It generates native machine code - not bytecode (V8 Design Elements)
Precise garbage collection (Wikipedia)
Inline caching of called methods (Wikipedia)
Storing class transition information so that objects with the same properties are grouped together (V8 Design Elements)
The first two points might not help you very much in this situation. The third might show insight into getting things cached together. The last might help you create objects with same properties so they use the same hidden classes.
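As a rough illustration of the hidden-class advice, a code generator can emit a single factory per "type" that always initializes the same properties in the same order. Hidden classes themselves are not observable from JavaScript, so the key-order comparison below is only a proxy for "same shape"; makePoint and the property names are hypothetical.

```javascript
// Initialize every property, in the same order, for every object.
function makePoint(x, y) {
  return { x: x, y: y, label: null };
}

// Same properties in the same order: candidates for one hidden class.
const a = makePoint(1, 2);
const b = makePoint(3, 4);

// Pattern to avoid: properties added later, in a different order,
// give the object a different shape.
const c = {};
c.y = 4;
c.x = 3;

console.log(Object.keys(a).join() === Object.keys(b).join()); // true
console.log(Object.keys(a).join() === Object.keys(c).join()); // false
```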
This blog post discusses some of the optimisations of SquirrelFish Extreme:
Bytecode optimizations
Polymorphic inline cache (like V8)
Context threaded JIT (introduction of native machine code generation, like V8)
Regular expression JIT
TraceMonkey is optimised via tracing. I don't know much about it but it looks like it detects the type of a variable in some "hot code" (code run in loops often) and creates optimised code based on what the type of that variable is. If the type of the variable changes, it must recompile the code - based off of this, I'd say you should stay away from changing the type of a variable within a loop.
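A sketch of what "don't change the type inside the loop" could look like in generated code. Both functions below compute the same sum for non-empty inputs; the second is the pattern the advice suggests avoiding. Whether a particular engine actually deoptimizes on it is an assumption that would need measuring.

```javascript
// Type-stable: `total` is a number on every iteration.
function sumStable(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Type-unstable: `total` starts as a string and only becomes a number
// after the first iteration, so the loop sees two different types.
function sumUnstable(values) {
  let total = "";
  for (let i = 0; i < values.length; i++) {
    total = Number(total) + values[i];
  }
  return total;
}

console.log(sumStable([1, 2, 3]));   // 6
console.log(sumUnstable([1, 2, 3])); // 6
```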

I found an additional resource:
What does V8 do with that loop?

Reinventing the wheel: Node.JS/event-driven programming vs. functional programming?

Now there's all the hype lately about Node.JS, an event driven framework using Javascript callbacks. To my limited understanding, its primary advantage seems to be that you don't have to wait step by step sequentially (for example, you can fetch the SQL results, while calling other functions too).
So my question is: how is this different from, or better than, just using functional languages like CL, Haskell, Clojure, etc.? If it's not better, then why don't people just use functional languages (instead of reinventing the wheel with Javascript)?
Please note that I have no experience in either Node.JS or functional programming, so some basic explanation would be helpful.

Having read through the Node.JS docs (and the nice slide deck), I think the other answers here are missing the point about it: Node.JS is based on the idea that the style of writing programs where we expect them to block on I/O is wrong, and that instead we should initiate I/O (such as reading a database, or reading a socket) and pass along a function to handle the result of the I/O along with the request.
So rather than do this:
var result = db.query("select.."); // blocking
// use result
Node.JS is based on the idea of doing this:
db.query("select..", function (result) {
    // use result
});
The authors (of Node.JS) point out that this way of programming is very awkward in many systems because the languages don't have closures or anonymous functions and the libraries are primarily built around blocking I/O. However, Javascript supplies the former (and programmers are used to it given how JS is used in an event-like manner in the browser), and Node.JS fills in the latter by being an entirely event-driven I/O library with no blocking calls at all.
How does this relate to functional programming? All functional programming languages provide closure constructs powerful enough to do what Node.JS is trying to do with Javascript. Most make it even easier to code, as passing closures is generally fundamental to the language.
As for Haskell, using Monads, this sort of thing could be very easy to construct. For example:
doQuery :: DBConnection -> IO ()
doQuery db = do
    rows <- query db "select..."
    doSomething rows
    doSomethingElse rows
These very sequential, imperative lines of code are actually a sequence of closures under control of the IO monad. It is as if in JavaScript you had written:
db.query("select...", function (rows) {
    doSomething(rows, function () {
        doSomethingElse(rows, function () { /* done */ });
    });
});
In essence, when writing monadic code in a functional language, you already are writing it in the form the Node.JS authors want us to write: Where each step of the sequential computation is passed as a closure to the prior one. However, look how much nicer that code looks in Haskell!
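For comparison, the nesting on the JavaScript side can also be hidden behind a small helper that plays a role loosely similar to monadic sequencing. This is only a sketch under my own assumptions: sequence is a hypothetical helper, and the db object is a stub that calls back immediately instead of doing real I/O.

```javascript
// Hypothetical stand-in for a database handle; a real db.query would do I/O
// and invoke the callback later, whereas this stub calls back immediately.
const db = {
  query(sql, callback) {
    callback([{ id: 1 }, { id: 2 }]);
  },
};

// Run continuation-passing steps in order: each step gets the previous
// result plus a `next` callback, so the nesting is built for us.
function sequence(steps, done) {
  function run(index, value) {
    if (index === steps.length) return done(value);
    steps[index](value, (result) => run(index + 1, result));
  }
  run(0, undefined);
}

let finalResult;
sequence(
  [
    (_, next) => db.query("select...", next),
    (rows, next) => next(rows.map((row) => row.id)),
    (ids, next) => next(ids.length),
  ],
  (count) => { finalResult = count; }
);
console.log(finalResult); // 2
```

Each step is still a closure handed to the previous one; the helper just writes the plumbing once instead of at every call site.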
Furthermore, you can use Concurrent Haskell features to achieve this non-blocking operation easily:
forkQuery :: DBConnection -> IO ThreadId
forkQuery db = forkIO $ do
    rows <- query db "select..."
    doSomething rows
    doSomethingElse rows
Don't confuse that forkIO with the expensive process fork you're used to, or even with OS threads. It is basically the same lightweight thread of execution that Node.JS is using (only with somewhat richer semantics). You can have 1,000s of 'em, just like Node.JS aims for.
So, in short - I think Node.JS is building on a facility that exists in JavaScript, but that is much more natural in functional languages. Furthermore, I think all the elements that are in Node.JS already exist in Haskell and its packages, and then some. For me, I'd just use Haskell!

I don't really know about Node.JS either, but I don't really see any striking similarity between it (from your description) and functional programming. From your description, Node.JS seems to be aimed at aiding asynchronous programming -- as you state "you don't have to wait step by step sequentially", you can do other tasks as one long-running task does its thing.
Functional programming is completely orthogonal to this -- i.e. it doesn't really have any link to asynchronicity. You can have one without the other, or both together, or neither of them. Functional programming is about eliminating side-effects in your programs, and about allowing functions as first-class members of the language, to be manipulated and composed similarly to other values.
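To illustrate the "functions as first-class values" part in JavaScript terms, here is a minimal sketch with no asynchronous code at all; compose, double and increment are illustrative names, not part of any library.

```javascript
// Functions are ordinary values: they can be stored, passed, and returned.
const compose = (...fns) => (input) =>
  fns.reduceRight((value, fn) => fn(value), input);

const double = (n) => n * 2;
const increment = (n) => n + 1;

// Build a new function out of existing ones; no shared state is modified.
const doubleThenIncrement = compose(increment, double);

console.log(doubleThenIncrement(5)); // 11
console.log([1, 2, 3].map(doubleThenIncrement).join()); // 3,5,7
```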

This isn't really "reinventing the wheel." Javascript isn't really a functional language per se, but it was heavily influenced by Lisp (Scheme in particular) and this is the sort of thing it was designed to do. Javascript is really stronger as a Lisp-ish functional language than it is as an OO language, in my opinion. That's why frameworks with strongly functional* designs like jQuery fit the language so well.
(* Note: Not pure, obviously, but functional in much the same way as Scheme.)

I haven't used node.js yet, but I'm definitely interested in it and will be trying it out soon. There are already lots of great answers about functional programming and such here, so I won't go into that.
You ask why not use some other language on the server, such as Haskell, Clojure, etc. For me, the attraction of node.js over the others is that it IS javascript. My applications are already heavy in javascript on the client, so it means I could work in one language on both the server and the client.
My hope is that this will streamline and simplify development because I won't need to be switching contexts so drastically. There might even be some reduction in work if some logic that is used both on client and server could be shared (perhaps form validation code and the like).
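As a sketch of the kind of logic that could be shared, here is a validation function with no DOM or Node.js dependencies. The function name, field names and rules are hypothetical; the point is only that a pure function like this can run unchanged on both sides.

```javascript
// Pure function with no DOM or Node.js dependencies, so the same file could
// be loaded by a browser page and by a server. Field names are hypothetical.
function validateSignup(form) {
  const errors = [];
  if (typeof form.username !== "string" || form.username.trim() === "") {
    errors.push("username is required");
  }
  if (typeof form.password !== "string" || form.password.length < 8) {
    errors.push("password must be at least 8 characters");
  }
  return errors;
}

console.log(validateSignup({ username: "ada", password: "correct horse" }).length); // 0
console.log(validateSignup({ username: " ", password: "short" }).length); // 2
```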

Javascript's key role in node as a webserver is that it is largely event-driven.
I believe functional programming has advantages for concurrency, due to immutability among other things.
I'm not sure how event-driven other functional languages are; I just wanted to highlight it as one of the advantages of node.

One of the main benefits of the gap between the client and the server shrinking is the ability to reuse code on the client and the server. For instance, if you want to have a rich and dynamic AJAX website for modern browsers and a stripped-down version for older browsers, you can use the same display code to format incoming data on both the client and the server.
Other benefits include the ability, with HTML5/Google Gears/Adobe Air, to have a local storage DB and server to run web apps offline: code that would traditionally be on the server can be stored locally for when the server is not available.
