JavaScript BigInt with a gmp-style API, specifically mpfr - javascript

I am writing a transpiler from my desktop programming language to JavaScript.
I use gmp on the desktop, so am writing a thin wrapper to mimic the same entry points but use BigInt under the hood.
(NB Emscripten etc NOT involved) So far mpz and mpq are working pretty well, ~30 entry points each, done by hand, so now I'm wondering about mpfr.
Could mpfr be done as mpq with implied/capped denominator of 10^k (where k can be negative), and
accordingly truncated/BigInt numerator? I expect a bit of a struggle with mpfr_const_pi(), mpfr_sin/log/exp(), etc. I say 10^k but am not even certain of that vs 2^k.
I have studied https://github.com/MikeMcl/big.js and friends but, no offence meant, all of that seems to pre-date BigInt, and I simply cannot find anything that implements floats via BigInt.
In short, what code needs to be in mpfr.js so that the following will work (ideally unaltered)? Obviously any partial ideas, hints, or tips are just as welcome as a full-blown working example. You can assume (eg) mpz_get_str() is available, or of course you can use (say) BigInt.toString() etc directly, and need not overly panic about precisely where the decimal point has to go, or any "%.75Rf" related nuances. I just need something to get the ball rolling.
<script src="mpfr.js"></script>
<script>
mpfr_set_default_prec(252); // (enough for 75 decimal places)
let one_third = mpfr_init(1); // (ok, non-std syntax, anyway init to 1)
mpfr_div_si(one_third,one_third,3);
console.log(mpfr_sprintf("%.75Rf",one_third));
</script>

I finally found this https://jrsinclair.com/articles/2020/sick-of-the-jokes-write-your-own-arbitrary-precision-javascript-math-library/ and I've now got pretty much everything I needed working.
While it is exactly what I was looking for, I should point out that it is deeply flawed: for instance, there is a frankly outrageous memoize() function liberally applied, which no doubt vastly improved some pointless benchmark but would totally cripple real-world use, and there are other gross inefficiencies, such as exp10(n) returning BigInt(`1${[...new Array(n)].map(() => 0).join("")}`) instead of the much saner 10n**BigInt(n). Nevertheless it is quite spirited and undeniably well meant, with plenty of good ideas.
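To make the exp10 point concrete, here is a small side-by-side sketch. The names exp10ViaString and exp10ViaPower are made up for this comparison and are not taken from the linked article; both should return the same BigInt for a non-negative integer n.

```javascript
// Hypothetical names; both compute 10^n as a BigInt for a non-negative integer n.

// String-building approach, similar in spirit to the one criticised above.
function exp10ViaString(n) {
  return BigInt(`1${"0".repeat(n)}`);
}

// Direct approach using BigInt exponentiation.
function exp10ViaPower(n) {
  return 10n ** BigInt(n);
}

console.log(exp10ViaString(5) === exp10ViaPower(5)); // true
```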
Should anyone wish to see the results of my efforts I have uploaded the latest version: https://github.com/petelomax/Phix/blob/master/pwa/builtins/mpfr.js
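For readers who only want the general shape, below is a minimal sketch of the fixed-point idea from the question, and not the code in the repository linked above. It assumes a value is stored as a BigInt numerator over an implied denominator of 2^prec. The { prec, num } object layout, the one-argument mpfr_init, and the two-argument mpfr_sprintf that understands only "%.<digits>Rf" are all assumptions made for illustration; real MPFR also takes rounding modes, which are ignored here.

```javascript
// Sketch only: a value is stored as num / 2^prec, with num a BigInt.
// The { prec, num } layout is an assumption of this sketch, not MPFR's.
let defaultPrec = 53n;

function mpfr_set_default_prec(bits) {
  defaultPrec = BigInt(bits);
}

// Non-standard, as in the question: allocate and set to an integer in one call.
function mpfr_init(v = 0) {
  return { prec: defaultPrec, num: BigInt(v) << defaultPrec };
}

// rop = op / si, truncated toward zero at a resolution of 2^-prec.
function mpfr_div_si(rop, op, si) {
  rop.prec = op.prec;
  rop.num = op.num / BigInt(si);
}

// Handles only "%.<digits>Rf"; any other format throws.
function mpfr_sprintf(fmt, x) {
  const m = /^%\.(\d+)Rf$/.exec(fmt);
  if (!m) throw new Error(`unsupported format: ${fmt}`);
  const places = Number(m[1]);
  const denom = 1n << x.prec;
  const neg = x.num < 0n;
  const abs = neg ? -x.num : x.num;
  // Scale to an integer count of 10^-places units, rounding half up.
  const scaled = (abs * 10n ** BigInt(places) * 2n + denom) / (2n * denom);
  const digits = scaled.toString().padStart(places + 1, "0");
  const intPart = digits.slice(0, digits.length - places);
  const fracPart = digits.slice(digits.length - places);
  return (neg ? "-" : "") + intPart + (places > 0 ? "." + fracPart : "");
}

mpfr_set_default_prec(252);
const one_third = mpfr_init(1);
mpfr_div_si(one_third, one_third, 3);
console.log(mpfr_sprintf("%.75Rf", one_third)); // 0. followed by 75 threes
```

A denominator of 10^k would work the same way and would make decimal output exact, while 2^k is closer to how MPFR counts precision in bits; which one suits a given transpiler is a judgment call.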

Related

JavaScript Iterative Variable Filtering

Is there a tool that allows us to search for a javascript variable just like a memory editor does: through iterative filtering either by exact value or change?
(Sorry for the long intro, but it's the best way I found to describe my use case.)
When I was about 14 yrs old, I would use a memory editor to find, monitor, and edit variables in games.
This allowed me to understand a bit better how computers work but also allowed me to have fun, changing the variables of the games to whatever I liked (offline, of course ;) )
The program would show me all variables.
I would then reduce the list of variables by repeatedly filtering: either by searching for its exact value (if it was known) or by change (increase, decrease).
Now I find myself wanting to do the same for Javascript. I've searched for a while, and I've tried different things, including searching for the variables in the window variable in console (please keep in mind I'm not a javascript developer) or using the debug function (which works great if you know where the variable is) but I haven't found a similar solution.
Is it true that no tool exists like this?
There are so many use cases:
debugging: finding where that number with that weird value is;
fun: edit variables just for plain fun, in games, etc;
learning how to code: I learned how to program by "hacking" around, and I know I'm not the only one ;)
probably many others I can't think about.
Does anyone know anything like this?

Writing high-performance Javascript code without getting deoptimised

When writing performance-sensitive code in Javascript which operates on large numeric arrays (think a linear algebra package, operating on integers or floating-point numbers), one always wants the JIT to help out as much as possible. Roughly this means:
We always want our arrays to be packed SMIs (small integers) or packed Doubles, depending on whether we're doing integer or floating-point calculations.
We always want to be passing the same type of thing to functions, so that they don't get labelled "megamorphic" and deoptimised. For instance, we always want to be calling vec.add(x, y) with both x and y being packed SMI arrays, or both packed Double arrays.
We want functions to be inlined as much as possible.
When one strays outside of these cases, a sudden and drastic performance dropoff occurs. This can happen for various innocuous reasons:
You might turn a packed SMI array into a packed Double array via a seemingly innocuous operation, like the equivalent of myArray.map(x => -x). This is actually the "best" bad case, since packed Double arrays are still very fast.
You might turn a packed array into a generic boxed array, for example by mapping the array over a function which (unexpectedly) returned null or undefined. This bad case is fairly easy to avoid.
You might deoptimise a whole function such as vec.add() by passing in too many types of things and turning it megamorphic. This could happen if you want to do "generic programming", where vec.add() is used both in cases where you're not being careful about types (so it sees a lot of types come in) and in cases where you want to eke out maximum performance (it should only ever receive boxed doubles, for instance).
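As a rough illustration of the first two bad cases, the snippet below produces values that can no longer be represented as small integers. What the engine does with the backing store is not observable from plain JavaScript, so the comments about representation describe assumed V8 behaviour and are not something the code verifies.

```javascript
const ints = [0, 1, 2]; // all small integers

// Negating 0 produces -0, which is not a small integer, so the result
// array presumably needs a double representation.
const negated = ints.map(x => -x);
console.log(Object.is(negated[0], -0)); // true

// A callback that sometimes returns undefined yields a mixed array,
// which can no longer be stored as raw numbers.
const mixed = ints.map(x => (x > 0 ? x : undefined));
console.log(mixed.every(v => typeof v === "number")); // false
```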
My question is more of a soft question, about how one writes high-performance Javascript code in light of the considerations above, while still keeping the code nice and readable. Some specific sub-questions so that you know what kind of answer I'm aiming for:
Is there a set of guidelines somewhere on how to program while staying in the world of packed SMI arrays (for instance)?
Is it possible to do generic high-performance programming in Javascript without using something like a macro system to inline things like vec.add() into callsites?
How does one modularise high-performance code into libraries in light of things like megamorphic call sites and deoptimisations? For instance, if I am happily using Linear Algebra package A at high speed, and then I import a package B that depends on A, but B calls it with other types and deoptimises it, suddenly (without my code changing) my code runs slower.
Are there any good easy to use measurement tools for checking what the Javascript engine is doing internally with types?
V8 developer here. Given the amount of interest in this question, and the lack of other answers, I can give this a shot; I'm afraid it won't be the answer you were hoping for though.
Is there a set of guidelines somewhere on how to program while staying in the world of packed SMI arrays (for instance)?
Short answer: it's right here: const guidelines = ["keep your integers small enough"].
Longer answer: giving a comprehensive set of guidelines is difficult for various reasons. In general, our opinion is that JavaScript developers should write code that makes sense to them and their use case, and JavaScript engine developers should figure out how to run that code fast on their engines. On the flip side, there are obviously some limitations to that ideal, in the sense that some coding patterns will always have higher performance costs than others, regardless of engine implementation choices and optimization efforts.
When we talk about performance advice, we try to keep that in mind, and carefully estimate what recommendations have a high likelihood of remaining valid across many engines and many years, and also are reasonably idiomatic/non-intrusive.
Getting back to the example at hand: using Smis internally is supposed to be an implementation detail that user code doesn't need to know about. It'll make some cases more efficient, and shouldn't hurt in other cases. Not all engines use Smis (for example, AFAIK Firefox/Spidermonkey historically hasn't; I've heard that for some cases they do use Smis these days; but I don't know any details and can't speak with any authority on the matter). In V8, the size of Smis is an internal detail, and has actually been changing over time and over versions. On 32-bit platforms, which used to be the majority use case, Smis have always been 31-bit signed integers; on 64-bit platforms they used to be 32-bit signed integers, which recently seemed like the most common case, until in Chrome 80 we shipped "pointer compression" for 64-bit architectures, which required lowering Smi size to the 31 bits known from 32-bit platforms. If you happened to have based an implementation on the assumption that Smis are typically 32 bits, you'd get unfortunate situations like this.
Thankfully, as you noted, double arrays are still very fast. For numerics-heavy code, it probably makes sense to assume/target double arrays. Given the prevalence of doubles in JavaScript, it is reasonable to assume that all engines have good support for doubles and double arrays.
Is it possible to do generic high-performance programming in Javascript without using something like a macro system to inline things like vec.add() into callsites?
"generic" is generally at odds with "high-performance". This is unrelated to JavaScript, or to specific engine implementations.
"Generic" code means that decisions have to be made at runtime. Every time you execute a function, code has to run to determine, say, "is x an integer? If so, take that code path. Is x a string? Then jump over here. Is it an object? Does it have .valueOf? No? Then maybe .toString()? Maybe on its prototype chain? Call that, and restart from the beginning with its result". "High-performance" optimized code is essentially built on the idea to drop all these dynamic checks; that's only possible when the engine/compiler has some way to infer types ahead of time: if it can prove (or assume with high enough probability) that x is always going to be an integer, then it only needs to generate code for that case (guarded by a type check if unproven assumptions were involved).
Inlining is orthogonal to all this. A "generic" function can still get inlined. In some cases, the compiler might be able to propagate type information into the inlined function to reduce polymorphism there.
(For comparison: C++, being a statically compiled language, has templates to solve a related problem. In short, they let the programmer explicitly instruct the compiler to create specialized copies of functions (or entire classes), parameterized on given types. That's a nice solution for some cases, but not without its own set of drawbacks, for example long compile times and large binaries. JavaScript, of course, has no such thing as templates. You could use eval to build a system that's somewhat similar, but then you'd run into similar drawbacks: you'd have to do the equivalent of the C++ compiler's work at runtime, and you'd have to worry about the sheer amount of code you're generating.)
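A minimal sketch of the eval-style idea mentioned in the parenthesis, using the Function constructor so that each call yields a separate function object with its own source text. makeVecAdd is a hypothetical helper, and whether an engine actually keeps separate type feedback for each copy is an assumption, not something this code demonstrates. Note also that pages with a strict Content Security Policy may forbid the Function constructor.

```javascript
// Hypothetical helper: builds a fresh function from source text on each call,
// loosely analogous to instantiating a C++ template once per element type.
function makeVecAdd() {
  return new Function("a", "b", "out", `
    for (let i = 0; i < a.length; i++) out[i] = a[i] + b[i];
    return out;
  `);
}

const addInts = makeVecAdd();    // intended for integer arrays only
const addDoubles = makeVecAdd(); // intended for double arrays only

console.log(addInts([1, 2], [3, 4], [0, 0]));              // [ 4, 6 ]
console.log(addDoubles([0.5, 1.5], [0.25, 0.25], [0, 0])); // [ 0.75, 1.75 ]
```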
How does one modularise high-performance code into libraries in light of things like megamorphic call sites and deoptimisations? For instance, if I am happily using Linear Algebra package A at high speed, and then I import a package B that depends on A, but B calls it with other types and deoptimises it, suddenly (without my code changing) my code runs slower.
Yes, that's a general problem with JavaScript. V8 used to implement certain builtins (things like Array.sort) in JavaScript internally, and this problem (which we call "type feedback pollution") was one of the primary reasons why we have entirely moved away from that technique.
That said, for numerical code, there aren't all that many types (only Smis and doubles), and as you noted they should have similar performance in practice, so while type feedback pollution is indeed a theoretical concern, and in some cases can have significant impact, it's also fairly likely that in linear algebra scenarios you won't see a measurable difference.
Also, inside the engine there are many more situations than "one type == fast" and "more than one type == slow". If a given operation has seen both Smis and doubles, that's totally fine. Loading elements from two kinds of arrays is fine too. We use the term "megamorphic" for the situation when a load has seen so many different types that it's given up on tracking them individually and instead uses a more generic mechanism that scales better to large numbers of types -- a function containing such loads can still get optimized. A "deoptimization" is the very specific act of having to throw away optimized code for a function because a new type is seen that hasn't been seen previously, and that the optimized code therefore isn't equipped to handle. But even that is fine: just go back to unoptimized code to collect more type feedback, and optimize again later. If this happens a couple of times, then it's nothing to worry about; it only becomes a problem in pathologically bad cases.
So the summary of all that is: don't worry about it. Just write reasonable code, let the engine deal with it. And by "reasonable", I mean: what makes sense for your use case, is readable, maintainable, uses efficient algorithms, doesn't contain bugs like reading beyond the length of arrays. Ideally, that's all there is to it, and you don't need to do anything else. If it makes you feel better to do something, and/or if you're actually observing performance issues, I can offer two ideas:
Using TypeScript can help. Big fat warning: TypeScript's types are aimed at developer productivity, not execution performance (and as it turns out, those two perspectives have very different requirements from a type system). That said, there is some overlap: e.g. if you consistently annotate things as number, then the TS compiler will warn you if you accidentally put null into an array or function that's supposed to only contain/operate on numbers. Of course, discipline is still required: a single number_func(random_object as number) escape hatch can silently undermine everything, because the correctness of the type annotations is not enforced anywhere.
Using TypedArrays can also help. They have a little more overhead (memory consumption and allocation speed) per array compared to regular JavaScript arrays (so if you need many small arrays, then regular arrays are probably more efficient), and they're less flexible because they can't grow or shrink after allocation, but they do provide the guarantee that all elements have exactly one type.
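To show what the TypedArray guarantee looks like in practice, here is a small sketch with a hypothetical vecAdd helper over Float64Array. Every element is always a double, and stray non-numeric values are coerced to numbers on assignment instead of changing the array's element type.

```javascript
// Hypothetical helper: element-wise sum of two equally long Float64Arrays.
function vecAdd(x, y) {
  const out = new Float64Array(x.length);
  for (let i = 0; i < x.length; i++) out[i] = x[i] + y[i];
  return out;
}

const a = Float64Array.of(1, 2, 3);
const b = Float64Array.of(0.5, 0.5, 0.5);
console.log(Array.from(vecAdd(a, b))); // [ 1.5, 2.5, 3.5 ]

// Assigning a non-number coerces it; the array still holds only doubles.
a[0] = undefined;
console.log(Number.isNaN(a[0])); // true
```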
Are there any good easy to use measurement tools for checking what the Javascript engine is doing internally with types?
No, and that's intentional. As explained above, we don't want you to specifically tailor your code to whatever patterns V8 can optimize particularly well today, and we don't believe that you really want to do that either. That set of things can change in either direction: if there's a pattern you'd love to use, we might optimize for that in a future version (we have previously toyed with the idea of storing unboxed 32-bit integers as array elements... but work on that hasn't started yet, so no promises); and sometimes if there's a pattern we used to optimize for in the past, we might decide to drop that if it gets in the way of other, more important/impactful optimizations. Also, things like inlining heuristics are notoriously difficult to get right, so making the right inlining decision at the right time is an area of ongoing research and corresponding changes to engine/compiler behavior; which makes this another case where it would be unfortunate for everyone (you and us) if you spent a lot of time tweaking your code until some set of current browser versions does approximately the inlining decisions you think (or know?) are best, only to come back half a year later to realize that then-current browsers have changed their heuristics.
You can, of course, always measure performance of your application as a whole -- that's what ultimately matters, not what choices specifically the engine made internally. Beware of microbenchmarks, for they are misleading: if you only extract two lines of code and benchmark those, then chances are that the scenario will be sufficiently different (e.g., different type feedback) that the engine will make very different decisions.

implement infinite lists in v8

I'm a computer science student and as part of a school project I have been asked to either find an exploit in the v8 engine, make some really good optimisation or add a new feature.
I chose to add a new feature and here it is:
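function* numbers() {
  let i = 1; // declared locally; a bare `i = 1` would create an implicit global
  while (true) {
    yield i++;
  }
}
var gen = numbers();
var l = [...gen]; // desired: a lazy, infinite list (today this spread never finishes)
var n = l[42];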
Putting it in words I want to have the possibility to use the destructuring syntax to create a list that can hold an infinite number of objects and access them.
It's possible to do it in Haskell and I want to try and do the same with JavaScript.
If one of the developers at v8 could point me in the right direction it would be so great.
I already have a working environment, can compile the engine, read the source code, and run the debugger on the d8 binary file with symbols.
V8 developer here.
First off: just to be clear, stackoverflow is not a machine that does your homework. (You're only asking for "the right direction", that's okay.)
Secondly: V8 implements JavaScript as spec'ed, so any arbitrary "new feature" is not going to land in our repository, please be aware of that.
Thirdly: Keith has several good points. In particular, the syntax you propose is already valid JavaScript and eagerly evaluates the generator. Was your idea to switch to lazy evaluation iff the generator produces an infinite stream of values? Take a step back and think about the implications of that idea for a minute.
Finally, if you come up with workable syntax/semantics, then it'll still be a chunk of work to do this in V8, because there's no precedent of something similar. You'd probably want to use an elements interceptor, and store the generator in a private property. I think it would be much easier to polyfill the whole thing in pure JavaScript using a Proxy.
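The Proxy-based polyfill suggested above could look roughly like the sketch below. lazyList is a hypothetical name, and this is only one possible shape: it pulls values from the generator on demand when a numeric index is read and caches them in an ordinary array.

```javascript
// Hypothetical helper: wraps a generator in an array-like object that
// produces elements lazily when an index is read.
function lazyList(gen) {
  const cache = [];
  return new Proxy(cache, {
    get(target, prop, receiver) {
      // Only canonical non-negative integer keys trigger generation.
      if (typeof prop === "string" && /^(0|[1-9]\d*)$/.test(prop)) {
        const index = Number(prop);
        while (target.length <= index) {
          const { value, done } = gen.next();
          if (done) break;
          target.push(value);
        }
      }
      return Reflect.get(target, prop, receiver);
    },
  });
}

function* numbers() {
  let i = 1;
  while (true) {
    yield i++;
  }
}

const lazy = lazyList(numbers());
console.log(lazy[42]); // 43
```

Note that length on such an object only reflects how many elements have been generated so far, so iterating it with methods that rely on length will not see the infinite tail.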
(It might be a good idea to reconsider your choice of project, but that's up to you. It's also quite a funky project description to begin with... what do they think how hard it is to "find an exploit or make some really good optimisation"? Do let us know if you find an exploit though!)

ES8 - Why have padStart/padEnd methods?

Trying to understand the reasoning behind the support for these 2 methods in ES8. padEnd for example - this can be achieved either using concat, replace, repeat.
So is it just to have a cleaner way of achieving this because this could be a common use-case or this is more efficient than current alternatives?
Edit: It would help to know why a question is down voted - was the question too opinionated/broad to ask?
It's just for convenience. There are a huge amount of functions that could be done using other low level means - but when written poorly they result in bugs, or inefficient code. Everyone wins when the language adds support for something people often do.
To exaggerate your example - languages don't need for loops either. You can generally write the same sort of code with a while loop. People don't need ternaries - they can be done with a standard if statement. In both of the examples people would generally need to write more code to achieve the same effect - but why make the coder do that?
I would reverse the question - why do you think they shouldn't include padEnd?
I think your question is asking for use cases of the padStart and padEnd functions, i.e. what prompted their inclusion in ECMAScript.
As pointed out above, they are helper functions that let you achieve more with less code. Typical uses include:
Displaying tabular data in a monospaced font.
Adding a count or an ID to a file name or a URL: 'file 001.txt'.
Aligning console output: 'Test 001: ✓'.
Printing hexadecimal or binary numbers that have a fixed number of digits: '0x00FF'
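A few of the use cases above, written out as a quick sketch. The helper names fileName and hexLabel are made up for this example.

```javascript
// Hypothetical helpers illustrating the listed use cases.
function fileName(n) {
  return "file " + String(n).padStart(3, "0") + ".txt";
}

function hexLabel(n) {
  return "0x" + n.toString(16).toUpperCase().padStart(4, "0");
}

console.log(fileName(1));           // file 001.txt
console.log(hexLabel(255));         // 0x00FF
console.log("Name".padEnd(8) + "|"); // Name    |
```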
You can read more about their use cases and applications here:
http://exploringjs.com/es2016-es2017/ch_string-padding.html
https://www.theregister.co.uk/2017/07/12/javascript_spec_straps_on_padding/

How inefficient is String concatenation in Javascript?

Such as
var myName = 'Bob';
myName += ' is a good name';
For long operations of this kind, is there a better way to do it? Maybe with a StringBuffer type of structure?
Thanks! :)
The ‘better’ way would be:
var nameparts= ['Bob'];
nameparts.push(' is a good name');
...
nameparts.join('');
However, most modern JavaScript implementations do now detect naïve concatenation and can in many cases optimise it away, because so many people have (alas) written code this way. So in practice the ‘good’ method won't today be as much faster as it once was.
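The two approaches side by side, as a small sketch. buildWithConcat and buildWithJoin are hypothetical names, and which one is faster depends on the engine, so no timing claim is made here; the point is only that they produce the same string.

```javascript
// Hypothetical helpers: build one string from many parts in two ways.
function buildWithConcat(parts) {
  let s = "";
  for (const p of parts) s += p; // naive repeated concatenation
  return s;
}

function buildWithJoin(parts) {
  return parts.join(""); // collect first, join once
}

const parts = ["Bob", " is", " a", " good", " name"];
console.log(buildWithConcat(parts) === buildWithJoin(parts)); // true
```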
A huge performance boost can be obtained by simply using intermediate strings! It is also possible to create a StringBuffer-like class in JavaScript to gain an even bigger performance boost.
See the complete article and graphs here.
The efficiency of string concatenation depends on the browser you are using. You can google for statistics; there is also a Google talk available on YouTube. From what I can remember, most browsers deal with string concatenation efficiently when the number of elements is below a few thousand. After that IE slows down at an exponential rate, while Firefox, Chrome and Safari do much better. This may change, since IE9 isn't that far away now.
I once read an article about this subject which offered some code to build buffered strings with arrays:
http://www.softwaresecretweapons.com/jspwiki/javascriptstringconcatenation
I tested it myself and it was way faster in IE... and way slower in Firefox!
To sum up: there are many JavaScript engines out there and we can't really rely on this sort of implementation detail. If it's ever an issue, you'll notice. Until then, don't worry about it too much.
