There is quite a bit of excitement on asm.js and how it will be able to run some very heavy applications. However, it is compiled from C++ code. Is it still possible to get the benefit of current improvements without knowing C++ or other low level languages?
Here is the thought that I had: Is it at all possible that we can write the code in Js, have it recompiled for asm.js for optimisation?
If you have small function that is very computation-heavy (crunching numbers rather than manipulating DOM) you could rewrite it in asm.js style yourself, manually. It's possible (I've done it), but tedious.
There are other asm.js compilers, e.g. LLJS that you may use instead of C++.
However, asm.js is not magic. You will only get performance benefit when you use language that is much better suited for ahead-of-time optimization than JS. You can't take fully-featured JS and make it faster by running JS VM on top of JS VM, just like you can't make ZIP files smaller by zipping them.
However, it is compiled from C++ code.
It is not. It's a language. Any program can emit text files which contain asm.js code. Emscripten compiles LLVM IR into asm.js, and there are compilers from C and C++ to LLVM IR, but this is only one possible way of getting asm.js code. Admittedly, it's currently the most mature, practical and popular way, but I would not be surprised at all if other asm.js compilers for other languages pop up some time in the future.
Is it still possible to get the benefit of current improvements without knowing C++ or other low level languages?
Well, in theory any language that can efficiently be compiled to machine code ahead-of-time can be implemented efficiently using asm.js, and that includes some rather high-level ones (e.g. Haskell).
But currently, nobody has a working implementation, and I don't expect this to ever become very popular. Right now, if you want asm.js performance, you'd probably write C or C++ code and compile it to asm.js, yes.
Note that the above excludes (among many others) Javascript.The fact that asm.js is a subset of Javascript is convenient in that asm.js code will run on unmodified browsers, but it's not of much use for anyone writing Javascript. asm.js is basically just a thin layer above machine code, with some amends for security and JS interoperability. Compiling JS to asm.js is as hard as compiling it to machine code: Easy if you don't give a damn about performance (just always used boxed dynamically-typed values like an interpreter, and emit calls to runtime library functions), very hard when you do.
In fact, after decades of research into the subject, there's still no example of a highly dynamic language like Javascript, Ruby or Python being compiled into machine code ahead of time and running much faster than a clever interpreter. Just-in-time compilation, on the other hand, is very much practical -- but the major JS engines already do that, in a less roundabout way than compiling to asm.js, then parsing it again and compiling it to machine code.
Asm.js isn't a separate language, but a subset of Javascript. It's just Javascript with a lot stripped out for performance. This means that you don't need to learn another language, though in this case knowing C/C++ might be useful to understand it.
Asm.js is a very strict subset of JavaScript, that can be generated relatively easily when compiling from C/C++ to JavaScript. Asm.js code is much closer to machine code than ordinary JavaScript code, which allowed browsers to heavily optimise for any code written in asm.js. In browsers that implemented those optimizations, your code will typically run about 50% of the speed of a C/C++ program that is compiled to machine code... which may seem slow, but is a hell of a lot faster than any ordinary JavaScript!
However, precisely because it's optimized for machines rather than humans, asm.js is practically impossible to handcode by any human developer... even though it's just JavaScript. While it is technically possible to convert - at least a subset of - ordinary JavaScript to an asm.js equivalent, such a conversion is not an easy task and I haven't encountered any project yet that made any attempt to achieve this.
Until someone achieves such a Herculean task, the best approach to producing asm.js code remains writing your code in C/C++ and converting it to JavaScript.
For more info on asm.js, see eg. John Resig's article from 2013 or the official specs.
Related
I have recently been working on a small to medium sized game in JavaScript to familiarize myself with it and come from a background in C and Java. In my game I am using a few constant numbers to render all of my objects and I was wondering about the performance impact those would have. Both C and Java are languages with compilers that will automatically simplify simple statements such as 2 + 3 into 5 before the code is actually run. This is great because it means I can make my code more readable by having something like WINDOW_MAX_Y - SHIP_Y in my code and understand what it does later. Looking over my code recently however, I found a few lines of code that become very long when written in this style, and because JavaScript does not have a compiler I was wondering whether it would improve performance if I simplified everything into "magic numbers" or if JavaScript had some function that would automatically simplify the lines before execution like the C compiler. I would like to keep my code readable, however, if JavaScript does not simplify those statements I would love to know as they are run thousands of times per second.
Not necessarily before execution but when the interpreter detects that the code is being used heavily.
You mention Java and this is good because I think you may be familiar with the concept since Java was one of the first languages to implement it. It's called just-in-time compilation (or JIT for short).
There is no standard that covers how JIT is to be implemented in Javascript. And there is no standard that mandates that a javascript interpreter must implement JIT (this is exactly the same for Java by the way, it is perfectly legal to implement a JVM that does not implement a JIT compiler). However, market forces have ensured that all current major javascript interpreters have implemented JIT (which may seem bit odd since all major javascript interpreters are free but they are competing for market share, not profit).
Similarly you mentioned C, which is nice because it also means you should be familiar with another concept. In C there is no standard defining how a = 1 + 2 should be compiled to. There is a concept in the C standard that specifies the behaviour of such optimisations though: the as-if rule. It basically states that if a complier was to implement optimisations, the end result must be exactly the same as if there was no optimisation.
All major javascript interpreters currently have most of the common optimisations that compilers typically have. They are all slightly different (just as all C compilers are slightly different) but they all compile code just-in-time. In addition, most interpreters also perform optimisations during compilation to byte code.
Yes, there is byte code, just like Java. While there is no compilation step performed by the programmer, all current javascript interpreters compile to bytecode before execution just like Java. This is true for all modern "scripting" languages such as Python and Ruby.
My advice is basically if you don't worry about such things in C or Java then you should not need to worry about it in javascript. For things that you need to worry about in C (such as micro-optimisations) there is a subset of javascript called asm.js that is guaranteed to be compiled down to machine code instead of bytecode (if set up correctly).
Javascript engine is usually used to transform bytecode from source code.then, the bytecode transforms to native code.
1) Why transformed bytecode ?? source code directly transforming native code is poor performance ?
2) If source code is very simple (ex. a+b function), source code directly transforming native code is good ?
Complexity and portability.
Transforming from source code to and kind of object code, whether it's bytecode for a virtual machine or machine code for a real machine, is a complex process. Bytecode more closely mimics what most real machines do, and so it's easier to work with: better for optimizing the code to run faster, transforming to machine code for an even bigger boost, or even turning into other formats if the situation calls for it.
Because of this, it usually turns out to be easier to write a front end whose only job is to transform the source code to bytecode (or some other intermediate language), and then a back end that works on the intermediate language: optimizes it, outputs machine code, and all that jazz. More traditional compilers for languages like C have done this for a long time. Java could be considered an unusual application of this principle: its build process usually stops with the intermediate representation (i.e. Java bytecode), and then developers ship that out, so that the JVM can "finish the job" when the user runs it.
There are two big benefits to working this way, aside from making the code easier to work with. The first big advantage is that you can reuse the backend to work with other languages. This doesn't matter so much for JavaScript (which doesn't have a standardized backend), but it's how projects like LLVM and GCC eventually grow to cover so many different languages. Writing the frontend is hard work, but let's say I made, for example, a Lua frontend for Mozilla's JavaScript backend. Then I could tap into all of the optimization work that Mozilla had put into that backend. This saves me a lot of work.
The other big advantage is that you can reuse the frontend to work with more machines. This one does have practical implications for JavaScript. If I were to write a JavaScript interpreter, I'd probably write my first backend for x86 -the architecture most PCs use- because that's where I'd probably be doing the development work. But most cell phones don't use an x86-based architecture -ARM is more common these days- so if I wanted to run fast on cell phones, I'd need to add an ARM backend. But I could do that, without having to rewrite the whole frontend, so once again, I've saved myself a lot of work. If I wanted to run on the Wii U (or the previous generation of game consoles, or older Macs) then I'd need a POWER backend, but again, I could do that without rewriting the frontend.
The bottom line is that while it seems more complex to do two transformations, in the long run it actually turns out to be easier. This is one of those strange and unintuitive things that pops up sometimes in software design, but the benefits are real.
I'm trying to better understand dart's effect on performance. On the dart website, their benchmarks show that Dart code compiled to Javascript is faster than just Javascript. How is this possible?
I understand how the Dart VM is faster than v8, but what I don't get is how dart2js-generated javascript is faster than plain old javascript, when both are running in the same environment, v8.
dart2js is able to perform optimizations that would not normally be manually added in JavaScript code.
There is nothing particularly special about Dart being the source language in this case: any automated tool that generates JavaScript should be able to do this, for instance the GWT compiler (Java to JavaScript) does this as well. Of course, you can run automated tools on JavaScript source to generate better JavaScript as well, this is what the Closure compiler does.
Technically, you can manually achieve the same speed with handwritten JavaScript if you know all the tricks.
One example is function inlining. If you need a code fragment called repeatedly you would refactor it out in a function/method. Dart2js often does the opposite. Method calls often get replaced with the code fragment contained by the called function/method which is called inlining. If you would do this manually this would lead to unmaintainable code.
I think many of the optimizations go in that direction.
Writing the code that way would just be unreadable and thus unmaintainable.
This doesn't mean it's sloppy.
There is a great presentation by Vyacheslav Egorov from the dart team where he explains in detail some of the optimizations including in lining..
http://www.infoq.com/presentations/dart-compiler
Summary Vyacheslav Egorov details how some of Dart's language features affected the design of a new JIT Dart compiler and how the V8
JavaScript engine influenced the overall design.
There is an interesting video by Seth Ladd and Kasper Lund.
Kasper is involved in creating the Dart2js compiler and gives some code examples on how the compiler optimizes the Javascript code.
Is it possible to create a simpler language by restricting the Javascript support in Google's V8? I'd like to embed the V8 engine in my own tool to run dynamic scripts, and like the idea of V8 precomiling the source for speed. However I need to drastically restrict what is possible within the language.
That means no dynamic allocation of data containers (e.g. arrays), no imported libraries, no recursion, no threads. It's more similar in philosophy to Renderman Shading Language than a general purpose language. The 'new' language is thus much simpler, and I'm only considering JS due to familiar syntax and the fact there's a good 'compiler' already (V8). I might also want it to run script code from within Chrome's native code (NaCl) environment, which Google seems to be working to support in V8.
How easy is it to redefine the JS 'grammar', or whatever other code define the language?
My other option is to create a new compiled language from scratch (maybe using LLVM stuff).
For all the features restriction you want, you would need to carry out a major surgery on V8 as V8 is never designed for such a radical modification.
An alternative solution is to invent a JavaScript-like language (with all the limitations you can impose) and compile it into normal JavaScript which then you can run with V8 (or any other JavaScript engine, for that matter). Well-known examples of such an approach are GWT (from Java), Dart, and TypeScript.
Take a closer look at squirrel language :
http://squirrel-lang.org
from description overview :
"both compiler and virtual machine fit together in about 7k lines of C++ code and add only around 100kb-150kb the executable size."
Enjoy!
I've read the article about Google's upcoming DASH/DART language, which I found quite interesting.
One thing I stumbled upon is that they say they will remove the inherent performance problems of JavaScript. But what are these performance problems exactly? There aren't any examples in the text. This is all it says:
Performance -- Dash is designed with performance characteristics in
mind, so that it is possible to create VMs that do not have the performance
problems that all EcmaScript VMs must have.
Do you have any ideas about what those inherent performance problems are?
This thread is a must read for anyone interested in dynamic language just in time compilers:
http://lambda-the-ultimate.org/node/3851
The participants of this thread are the creator of luajit, the pypy folks, Mozilla's javascript developers and many more.
Pay special attention to Mike Pall's comments (he is the creator of luajit) and his opinions about javascript and python in particular.
He says that language design affects performance. He gives importance to simplicity and orthogonality, while avoiding the crazy corner cases that plague javascript, for example.
Many different techiques and approaches are discussed there (tracing jits, method jits, interpreters, etc).
Check it out!
Luis
The article is referring to the optimization difficulties that come from extremely dynamic languages such as JavaScript, plus prototypal inheritance.
In languages such as Ruby or JavaScript, the program structure can change at runtime. Classes can get a new method, functions can be eval()'ed into existence, and more. This makes it harder for runtimes to optimize their code, because the structure is never guaranteed to be set.
Prototypal inheritance is harder to optimize than more traditional class-based languages. I suspect this is because there are many years of research and implementation experience for class-based VMs.
Interestingly, V8 (Chrome's JavaScript engine) uses hidden classes as part of its optimization strategy. Of course, JS doesn't have classes, so object layout is more complicated in V8.
Object layout in V8 requires a minimum of 3 words in the header. In contrast, the Dart VM requires just 1 word in the header. The size and structure of a Dart object is known at compile time. This is very useful for VM designers.
Another example: in Dart, there are real lists (aka arrays). You can have a fixed length list, which is easier to optimize than JavaScript's not-really-arrays and always variable lengths.
Read more about compiling Dart (and JavaScript) to efficient code with this presentation: http://www.dartlang.org/slides/2013/04/compiling-dart-to-efficient-machine-code.pdf
Another performance dimension is start-up time. As web apps get more complex, the number of lines of code goes up. The design of JavaScript makes it harder to optimize startup, because parsing and loading the code also executes the code. In Dart, the language has been carefully designed to make it quick to parse. Dart does not execute code as it loads and parses the files.
This also means Dart VMs can cache a binary representation of the parsed files (known as a snapshot) for even quicker startup.
One example is tail call elimination (I'm sure some consider it required for high-performance functional programming). A feature request was put in for Google's V8 Javascript VM, but this was the response:
Tail call elimination isn't compatible with JavaScript as it is used in the real
world.