I'm currently writing my own interpreter and benchmarked a few languages for comparison.
The following recursive Fibonacci function, in pseudo-code,
fib(n) => n < 2 ? n : fib(n-1) + fib(n-2)
called on my machine as fib(32), resulted in:
- Py: ~500ms (Python 3)
- PHP: ~300ms (PHP 7)
- Java: ~72ms (Java 17)
- C#: ~30ms (.NET Core 5)
- JS: ~24ms (Chrome)
- C/C++: ~8ms (current GCC)
How is this possible?
I ran the function inside Chrome's JS interpreter (dev console).
My own language, written in C as an interpreter compiling into bytecode, takes 120ms.
My first take was that JS looks at function calls and asynchronously starts them in parallel, waiting until both return - which would reduce the time needed (just for the first call) by almost half.
But I could be completely wrong. How is JavaScript, in a browser, in this scenario, faster than or nearly as fast as C#, Java, etc.?
I am aware that a looped approach is a quintillion times faster in every case, but this is important to me and to my understanding of language design.
Thanks in advance to all you clever minds out there!
JavaScript is JIT-compiled (just-in-time compiled; at least most implementations are). If the engine sees that a function is called a lot with arguments of the same data types, it optimizes the code to use native data types instead of treating every value as an abstract type.
There's a talk by Franziska Hinkelmann called "JavaScript engines - how do they even?" that includes a lot more detail about how engines detect when to optimize and a bit more on how they do the optimization.
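A minimal sketch for observing the warm-up effect described above, assuming an environment that exposes the global performance.now() (current browsers and Node.js do). The timeFib helper and the number of runs are made up for illustration. Timings vary by machine and engine, so none are shown.

```javascript
// Naive recursive Fibonacci, same shape as the pseudo-code in the question.
function fib(n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// Hypothetical helper: time one call to fib(n) in milliseconds.
function timeFib(n) {
  const start = performance.now();
  const result = fib(n);
  return { result, ms: performance.now() - start };
}

// Run several times: the first run may include interpreter and
// compilation time, later runs typically execute optimized machine code.
const runs = [];
for (let i = 0; i < 5; i++) {
  runs.push(timeFib(32));
}
runs.forEach((r, i) => console.log(`run ${i + 1}: ${r.ms.toFixed(1)} ms`));
```

Because fib is always called with small integers here, it is the kind of type-stable hot function an optimizing compiler can specialize.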
I'm trying to learn whether, at which point, and to what extent the following TypeScript function is optimized on its way from JavaScript to machine code, for example in a V8 environment:
function foo(): number {
  const a = 1;
  const b = a;
  const c = b + 1;
  return c;
}
The function operates with constants with no parameters so it's always equivalent to the following function:
function foo(): number {
  return 1 + 1;
}
And eventually, in whatever bytecode or machine code results, it could just assign the number 2 to some register, without intermediate copies of values or pointers from one register to another.
Assuming optimizing such simple logic is trivial, I could imagine a few potential steps where it could happen:
1. Compiling TypeScript to JavaScript
2. Generating abstract syntax tree from the JavaScript
3. Generating bytecode from the AST
4. Generating machine code from the bytecode
5. Repeating step 4 as per just-in-time compilation
Does this optimization happen, or is it a bad practice from the performance point of view to assign expressions to constants for better readability?
(V8 developer here.)
Does this optimization happen
Yes, it happens if and when the function runs hot enough to get optimized. Optimized code will indeed just write 2 into the return register.
Parsers, interpreters, and baseline compilers typically don't apply any optimizations. The reason is that identifying opportunities for optimizations tends to take more time than simply executing a few operations, so doing that work only pays off when the function is executed a lot.
Also, if you were to set a breakpoint in the middle of that function and inspect the local state in a debugger, you would want to see all three local variables and how they're being assigned step by step, so engines have to account for that possibility as well.
is it a bad practice from the performance point of view to assign expressions to constants for better readability?
No, it's totally fine to do that. Even if it did cost you a machine instruction or two, having readable code is usually more important.
This is true in particular when a more readable implementation lets you realize that you could simplify the whole algorithm. CPUs can execute billions of instructions per second, so saving a handful of instructions won't change much; but if you have an opportunity to, say, replace a linear scan with a constant-time lookup, that can save enough computational work (once your data becomes big enough) to make a huge difference.
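To make the last point concrete, here is a small sketch contrasting a linear scan with a constant-time (on average) lookup. The data set and the function names hasIdLinear and hasIdFast are hypothetical, chosen only for illustration.

```javascript
// Hypothetical data set: 100,000 even numeric ids (0, 2, 4, ..., 199998).
const ids = Array.from({ length: 100000 }, (_, i) => i * 2);

// Linear scan: Array.prototype.includes checks elements one by one.
function hasIdLinear(id) {
  return ids.includes(id);
}

// Constant-time lookup on average: build a Set once, then query it.
const idSet = new Set(ids);
function hasIdFast(id) {
  return idSet.has(id);
}

console.log(hasIdLinear(199998), hasIdFast(199998)); // true true
console.log(hasIdLinear(7), hasIdFast(7));           // false false
```

Both functions give the same answers; the difference is how the amount of work grows with the size of the data, which matters far more than a few saved instructions.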
I wrote two similar programs, one in C++ and one in Node.js, that just work with strings. I have this in a .js file:
//some code, including f
console.time('c++');
console.log(addon.sum("123321", s));
console.timeEnd('c++');
console.time('js');
console.log(f("123321", s));
console.timeEnd('js');
and my C++ addon looks like this:
//some code, including f
void Sum(const v8::FunctionCallbackInfo<v8::Value>& args)
{
    v8::Isolate* isolate = args.GetIsolate();
    v8::String::Utf8Value str0(isolate, args[0]);
    v8::String::Utf8Value str1(isolate, args[1]);
    string a(*str0), b(*str1);
    string s2 = f(a, b);
    args.GetReturnValue().Set(v8::String::NewFromUtf8(isolate, s2.c_str()).ToLocalChecked());
}
but the problem is that the C++ version runs nearly 1.5 times slower than the JS version, even though the JS function has some parts that could be optimised (I did not write it very carefully).
In the console I get
#uTMTahdZ22d!a_ah(3(_Zd_]Zc(tJT[263mca!(jcT[20_]h0h_06q(0jJ(T]!&]qZM]d_30j&Tuj2hm[Z0d#!32ccT2(!dud#6]0MdJc]mta!3]j]_(hhJqha(([
c++: 7.970s
#uTMTahdZ22d!a_ah(3(_Zd_]Zc(tJT[263mca!(jcT[20_]h0h_06q(0jJ(T]!&]qZM]d_30j&Tuj2hm[Z0d#!32ccT2(!dud#6]0MdJc]mta!3]j]_(hhJqha(([
js: 5.062s
So, the results of the two functions are the same, but the JS program ran a lot faster. How can that be? Shouldn't C++ be faster than JS (or at least not so much slower)? Maybe I did not take into account an important detail that slows C++ down so much, or is working with strings really that slow in C++?
First off, the JavaScript engine is pretty advanced in the optimizations it can do (it actually compiles JavaScript code to native code in some cases), which significantly reduces the difference between JavaScript and C++ compared to what most people would expect.
Second, crossing the C++/JavaScript boundary has some overhead, because the function arguments have to be marshalled between the JavaScript world and the C++ world (creating copies, doing heap operations, etc.). So, if that overhead is significant relative to the execution of the operation itself, it can negate the advantage of going to C++ in the first place.
For more detailed comments, we would need to see the actual implementation of f() in both Javascript and C++.
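The boundary cost can be pictured in plain JavaScript. The sketch below is only an analogy, not what the addon machinery literally does: it assumes that each call copies both arguments out to UTF-8 bytes and copies the result back into a new string, loosely mirroring the Utf8Value and NewFromUtf8 steps in the question's Sum function. The name crossBoundary and the concat stand-in for f are hypothetical.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Hypothetical stand-in for a native call: copy the arguments to UTF-8
// bytes, run the work on decoded copies, then copy the result into a new string.
function crossBoundary(work, a, b) {
  const bytesA = encoder.encode(a); // copy of the first argument
  const bytesB = encoder.encode(b); // copy of the second argument
  const result = work(decoder.decode(bytesA), decoder.decode(bytesB));
  return decoder.decode(encoder.encode(result)); // copy of the return value
}

// Hypothetical stand-in for f(): plain concatenation.
const concat = (x, y) => x + y;
console.log(crossBoundary(concat, "123", "321")); // 123321
```

If the work itself is cheap compared with the copies, or the strings are large, the copying can dominate the total time.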
A simplified example:
function shorten(string) {
  return string.slice(0, 3);
}
const today = "Friday";
if (shorten(today) === "Fri") {
  console.log("Oh yeah it's " + shorten(today));
}
shorten(today) is called twice here, which makes me feel bad. I believe we all run into this situation every day, and what we do is store the value of shorten(today) in a variable first, then use that variable twice.
My question is: are modern JS engines smart enough so that I actually don't need to worry about it?
If you run shorten multiple times, the V8 engine has a JIT compiler that will optimize that piece of code so it runs faster the next time.
When it runs into the same function call for the 2nd time, maybe it's able to realize it has just done the same calculation, and still has the result in memory
What you described is known as memoization, and V8 doesn't do that. However, there are libraries out there (e.g. fast-memoize) that do.
But your best bet is still to store the result of the computation in a variable and reference it.
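Both options can be sketched in a few lines. The memoize helper below is hand-rolled for illustration and handles only single-argument functions; it is not the fast-memoize API.

```javascript
function shorten(string) {
  return string.slice(0, 3);
}

const today = "Friday";

// Option 1: compute once and reuse the variable.
const short = shorten(today);
if (short === "Fri") {
  console.log("Oh yeah it's " + short); // Oh yeah it's Fri
}

// Option 2: a hand-rolled memoizer for single-argument functions
// (hypothetical helper, not the fast-memoize API).
function memoize(fn) {
  const cache = new Map();
  return function (arg) {
    if (!cache.has(arg)) {
      cache.set(arg, fn(arg));
    }
    return cache.get(arg);
  };
}

// Count how often the underlying function really runs.
let calls = 0;
const memoShorten = memoize((s) => {
  calls++;
  return shorten(s);
});
memoShorten(today);
memoShorten(today);
console.log(calls); // 1
```

For something as cheap as slice, the variable is the simpler choice; memoization only pays off when the computation costs more than the cache lookup.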
When I execute a simple JS function twice in a row, does it cost twice the computing power?
Yes. Consider the question "Why is using a loop to iterate from start of array to end faster than iterating both start to end and end to start?"
are the modern JS engines smart enough so that I actually don't need to worry about it?
No. No engine can reliably predict the return value of a JavaScript function call. See "Has it been mathematically proven that antivirus can't detect all viruses?" and "Can a regular expression be crafted which determines the return type of a function?"
Ques 1: Does it take significantly more time to initialize a JavaScript Object if it has a large number of variables and functions?
Ques 2: Is a large JavaScript (.js) file size a performance issue?
For Instance:
I am creating a JavaScript Object using Prototype, my sample code is below:
function SimpleObject(){
  // no variables and functions attached to it
}
function RootObject(){
  var one = "one";
  var two = "two";
  .
  .
  var one_thousand = "One Thousand";
}
function Child_1_Object(){
  // n number of variables
}
.
.
function Child_1000_Object(){
  // n number of variables
}
RootObject.prototype.get_Child_1_Object = function(){
  return new Child_1_Object();
}
.
.
RootObject.prototype.get_Child_1000_Object = function(){
  return new Child_1000_Object();
}
All the above code is in one .js file which has 10k lines of code (10 KLOC).
My question is: when I create a RootObject, will it take significantly more time than creating a SimpleObject?
Question one:
Making an object more complicated, with members, functions etc. will increase the amount of time needed to instantiate it. I wrote a simple jsPerf test here to prove the point: http://jsperf.com/simple-vs-complex-object
One thing to note though, you're still creating hundreds of thousands of objects a second - it's hardly slow, even for very complex objects.
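A rough sketch in the same spirit as that comparison (it is not the linked test itself). ComplexObject, its 20 fields, and the timeCreation helper are hypothetical, and the sketch assumes the global performance.now() is available. Timings depend on the machine and engine, so none are shown.

```javascript
function SimpleObject() {
  // no fields
}

// Hypothetical "complex" constructor: assigns 20 fields per instance.
function ComplexObject() {
  for (let i = 0; i < 20; i++) {
    this["field" + i] = "value " + i;
  }
}

// Hypothetical helper: time `count` instantiations, in milliseconds,
// and keep the last instance so the result can be inspected.
function timeCreation(Ctor, count) {
  const start = performance.now();
  let last = null;
  for (let i = 0; i < count; i++) {
    last = new Ctor();
  }
  return { ms: performance.now() - start, last };
}

const simple = timeCreation(SimpleObject, 100000);
const complex = timeCreation(ComplexObject, 100000);
console.log(`simple: ${simple.ms.toFixed(1)} ms, complex: ${complex.ms.toFixed(1)} ms`);
```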
Question two:
Large files are problematic only because of their size in terms of a client having to download them. Minifying your code will help with this. e.g. http://javascript-minifier.com/
Definitely. In fast, modern browsers this takes milliseconds (or less), but every variable must be initialised, so 10k is always worse than 1.
It is, as the file needs to be downloaded by the browser (so the bigger, the slower) and then parsed by the JS engine (again, the bigger, the slower).
It's simple math, although, like I said before, if you're just initialising variables the delays are negligible, so you don't have to worry about that.
No. Most of the amount of time required to instantiate a new object will be determined by what is done in the constructor.
The main issue with having a lot of JavaScript (10k lines is not even remotely a large JavaScript file) is still the question of what it is really doing. Sure, some JavaScript VMs may run into performance issues if you have 10 MB of JavaScript, but I've never seen even Internet Explorer 7 choke on something like ExtJS 3.4, which has about 2.5 MB of uncompressed JavaScript.
Now, download speed may be an issue, but parsing JavaScript is not. All of that is written in C or C++ and runs very quickly. When you are just declaring object types in JavaScript, much of that is just code parsing and assigning the prototype as a known type within the JavaScript VM. It doesn't have to actually execute the constructor at that point so the code you have above will run quickly until, perhaps, you start initializing objects.
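The distinction between declaring a type and initializing an object can be checked directly. In this sketch the counter constructorRuns and the describe method are hypothetical additions; RootObject here is a cut-down stand-in for the one in the question.

```javascript
let constructorRuns = 0;

// Declaring the constructor and a prototype method does not run the body.
function RootObject() {
  constructorRuns++;
  this.one = "one";
  this.two = "two";
}

RootObject.prototype.describe = function () {
  return this.one + ", " + this.two;
};

console.log(constructorRuns); // 0

// The constructor body only runs when an instance is created.
const root = new RootObject();
console.log(constructorRuns); // 1
console.log(root.describe()); // one, two
```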
Another thing about parsing JavaScript is that parsing is only one of the steps that other languages take. Java, C#, C, C++, etc. also have at least a stage of converting the parse tree into some form of object code. Java and C# stop there because their runtime has to do additional JIT compilation and optimization on the fly; C/C++ and some others have to do linking and optimization before generating usable machine code. Not that "parsing is easy" by any means on a programming language, but it is not the most performance-intensive part.
I know you should tread lightly when making recursive calls to functions in JavaScript because your second call could be up to 10 times slower.
Eloquent JavaScript states:
There is one important problem: In most JavaScript implementations, this second version is about 10 times slower than the first one. In JavaScript, running a simple loop is a lot cheaper than calling a function multiple times.
John Resig even says this is a problem in this post.
My question is: Why is it so inefficient to use recursion? Is it just the way a particular engine is built? Will we ever see a time in JavaScript where this isn't the case?
Function calls are just more expensive than a simple loop due to all the overhead of changing the stack and setting up a new context and so on. In order for recursion to be very efficient, a language has to support some form of tail-call elimination, which basically means transforming certain kinds of recursive functions into loops. Functional languages like OCaml, Haskell and Scheme do this, but no JavaScript implementation I'm aware of does so (it would only be marginally useful unless they all did, so maybe we have a dining philosophers problem).
This is just the way the particular JS engines that browsers use are built, yes. Without tail-call elimination, you have to create a new stack frame every time you recurse, whereas with a loop it's just setting the program counter back to the start of it. Scheme, for example, has this as part of the language specification, so you can use recursion in this manner without worrying about performance.
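To illustrate the transformation, here is a sketch with three hypothetical functions computing the sum 1 + 2 + ... + n: plain recursion, a tail-recursive form, and the loop that tail-call elimination would effectively produce from the tail-recursive form.

```javascript
// Plain recursion: one stack frame per step, and the addition
// happens after the recursive call returns.
function sumRecursive(n) {
  return n === 0 ? 0 : n + sumRecursive(n - 1);
}

// Tail-recursive form: the recursive call is the last thing done,
// so an engine with tail-call elimination could reuse the frame.
function sumTail(n, acc = 0) {
  return n === 0 ? acc : sumTail(n - 1, acc + n);
}

// What tail-call elimination effectively turns sumTail into.
function sumLoop(n) {
  let acc = 0;
  while (n !== 0) {
    acc += n;
    n -= 1;
  }
  return acc;
}

console.log(sumRecursive(1000), sumTail(1000), sumLoop(1000)); // 500500 500500 500500
```

In an engine without tail-call elimination, the first two versions still allocate a frame per step and can overflow the stack for large n, while the loop cannot.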
https://bugzilla.mozilla.org/show_bug.cgi?id=445363 indicates progress being made in Firefox (and Brendan Eich speaks in here about it possibly being made a part of the ECMAScript spec), but I don't think any of the current browsers have this implemented quite yet.