How v8 Javascript engine performs bit operations on Int32Array values - javascript

As far as I know, the V8 JavaScript engine performs a double-number conversion (to i32 and back) for bit operations.
Let's consider the following example:
const int32 = new Int32Array(2);
int32[0] = 42;
int32[1] = 2;
const y = int32[0] | int32[1];
Does V8 perform that double conversion in this case?
If not, does it mean bit operations are faster on Int32Array values?
UPDATE:
By double conversion I mean:
from 64-bit (53 bits of precision) -> to 32-bit -> and back to 64-bit

V8 developer here. The short answer is: there are no conversions here, bitwise operations are always performed on 32-bit integers, and an Int32Array also stores its elements as 32-bit integers.
The longer answer is "it depends". In unoptimized code, all values use a unified representation. In V8, that means number values are either represented as a "Smi" ("small integer") if they are in range (31 bits including sign, i.e. about -1 billion to +1 billion), or as "HeapNumbers" (64-bit double with a small object header) otherwise. So for an element load like int32[0], the 32-bit value is loaded from the array, inspected for range, and then either tagged as a Smi or boxed as a HeapNumber. The following | operation looks at the input, converts it to a 32-bit integer (which could be as simple as untagging a Smi, or as complex as invoking .valueOf on an object), and performs the bitwise "or". The result is then again either tagged as a Smi or boxed as a HeapNumber, so that the next operation can decide what to do with it.
If the function runs often enough to eventually get optimized, then the optimizing compiler makes use of type feedback (guarded by type checks where necessary) to elide all these conversions that are unnecessary in this case, and the simple answer will be true.

Related

Keep trailing or leading zeroes on number

Is it possible to keep trailing or leading zeroes on a number in javascript, without using e.g. a string instead?
const leading = 003; // literal, leading
const trailing = 0.10; // literal, trailing
const parsed = parseFloat('0.100'); // parsed or somehow converted
console.log(leading, trailing, parsed); // desired: 003 0.10 0.100
This question has been asked regularly (and still is), yet I don't have a place I'd feel comfortable linking to (did I miss it?).
A fully analogous question would be keeping any other aspect of the representation a number literal was entered as, although that is asked nowhere near as often:
console.log(0x10); // 16 instead of potentially desired 0x10
console.log(1e1); // 10 instead of potentially desired 1e1
For disambiguation, this is not about the following topics, for some of which I'll add links, as they might be of interest as well:
Padding to a set amount of digits, formatting to some specific string representation, e.g. How can I pad a value with leading zeroes?, How to output numbers with leading zeros in JavaScript?, How to add a trailing zero to a price
Why a certain string representation will be produced for some number by default, e.g. How does JavaScript determine the number of digits to produce when formatting floating-point values?
Floating point precision/accuracy problems, e.g. console.log(0.1 + 0.2) producing 0.30000000000000004, see Is floating point math broken?, and How to deal with floating point number precision in JavaScript?
No. A number stores no information about the representation it was entered as, or parsed from. It only carries its mathematical value. Perhaps reconsider using a string after all.
If I had to guess, much of the confusion comes from the thought that numbers and their textual representations are either the same thing, or at least tightly coupled, with some kind of bidirectional binding between them. This is not the case.
The representations like 0.1 and 0.10, which you enter in code, are only used to generate a number. They are convenient names for what you intend to produce, not the resulting value. In this case, they are names for the same number. It has a lot of other aliases, like 0.100, 1e-1, or 10e-2. The actual value contains no information about what it was produced from, or where it came from. The conversion is a one-way street.
When displaying a number as text, by default (Number.prototype.toString), JavaScript uses an algorithm to construct one of the possible representations from the number. This can only use what's available, namely the number value, which also means it will produce the same result for two equal numbers. This implies that 0.1 and 0.10 will produce the same output.
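These aliases can be checked directly (a quick sketch):

```javascript
// All of these literals name the same float64 value,
// so the default toString produces the same text for each.
console.log(0.1 === 0.10);        // true
console.log((0.10).toString());   // "0.1"
console.log((1e-1).toString());   // "0.1"
console.log((0x10).toString());   // "16" - hex-ness is not stored either
```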
Concerning the number[1] value, JavaScript uses IEEE754-2019 float64[2]. When source code is evaluated[3] and a number literal is encountered, the engine converts the mathematical value the literal represents to a 64-bit value, according to IEEE754-2019. This means any information about the original representation in code is lost[4].
There is another problem, somewhat unrelated to the main topic. JavaScript used to have an octal notation with a prefix of "0". This means that 003 is parsed as an octal literal, and throws in strict mode. Similarly, 010 === 8 (or an error in strict mode), see Why JavaScript treats a number as octal if it has a leading zero
In conclusion, when trying to keep information about some representation for a number (including leading or trailing zeroes, whether it was written as decimal, hexadecimal, and so on), a number is not a good choice. For how to achieve some specific representation other than the default, which doesn't need access to the originally entered text (e.g. pad to some amount of digits), there are many other questions/articles, some of which were already linked.
[1]: JavaScript also has BigInt, but while it uses a different format, the reasoning is completely analogous.
[2]: This is a simplification. Engines are allowed to use other formats internally (and do, e.g. to save space/time), as long as they are guaranteed to behave like an IEEE754-2019 float64 in every regard observable from JavaScript.
[3]: E.g. V8 converts to bytecode earlier than evaluation, already exchanging the literal. The only relevant thing is that the information is lost before we could do anything with it.
[4]: JavaScript gives the ability to operate on code itself (e.g. Function.prototype.toString), which I will not discuss much here. Parsing the code yourself and storing the representation is an option, but has nothing to do with how number works (you would be operating on code, a string). Also, I don't immediately see any sane reason to do so over the alternatives.

c# Bitwise Xor calculation that seems easy in R or nodeJS

In R if I do bitXor(2496638211, -1798328965) I get 120 returned.
In nodeJS 2496638211 ^ -1798328965; returns 120.
How do I do this in C#? (I think I'm struggling to understand C# type declarations and the way R & nodeJS presumably convert to 32 bits.)
In C# you would have to use unchecked and cast to int:
unchecked
{
int result = (int)2496638211 ^ -1798328965;
Console.WriteLine(result); // Prints 120
}
You could write a method to do this:
public static int XorAsInts(long a, long b)
{
unchecked
{
return (int)a ^ (int)b;
}
}
Then call it:
Console.WriteLine(XorAsInts(2496638211, -1798328965)); // Prints 120
Note that by casting to int, you are throwing away the top 32 bits of the long. This is what bitXor() is doing, but are you really sure this is the behaviour you want?
2496638211 is too large for a 32-bit signed integer, so C# gives the literal a wider type; its hex representation is 0x94CFAD03, where the top bit overlaps the sign bit of an int. As a 64-bit value, -1798328965 is 0xFFFFFFFF94CFAD7B, due to the two's-complement representation of negative numbers. So a 64-bit XOR in C# yields 0xFFFFFFFF00000078, which is the mathematically correct answer.
JavaScript's XOR uses 32-bit operands, so it truncates 2496638211 first. Thus JS gets 0x00000078, which is not technically the correct answer. My assumption is that R does the same or something similar.
In JS, use Number.isSafeInteger() as well as Number.MAX_SAFE_INTEGER to see where this starts to matter. In C#, compare int.MaxValue vs long.MaxValue to see the difference vs JS.
To recreate the JS behavior in C#, wrap the operation in an unchecked block as per Matthew Watson's post. The unchecked keyword suppresses overflow-checking for integral-type arithmetic operations and conversions.
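As a quick sketch of the JavaScript side (not part of the original answers), the truncation can be made explicit with | 0, which applies the same ToInt32 operation the XOR uses:

```javascript
// ToInt32 wraps 2496638211 into the signed 32-bit range before the XOR,
// so only the low 32 bits of each operand take part.
console.log(2496638211 | 0);            // -1798329085 (0x94CFAD03 as int32)
console.log(2496638211 ^ -1798328965);  // 120
```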

How JavaScript decides what size of memory to allocate for a numeric value?

Programming languages like Java and C have int, long, byte, etc., which tell the compiler exactly how much memory to allocate for a number at run time. This saves a lot of memory if you are dealing with a large number of variables.
I'm wondering how programming languages that don't have this primitive type declaration (JavaScript, Ruby) decide how much memory to allocate for, let's say, var a = 1. If the engine allocates, say, 1 byte, then in the next line, if I do a = 99999999999, it will have to throw out that allocation and reallocate. Won't that be an expensive operation?
Or do they allocate a very big memory space for every variable, so that one size fits all?
Here is a good explanation.
JavaScript values
The type JS::Value represents a JavaScript value.
The representation is 64 bits and uses NaN-boxing on all platforms,
although the exact NaN-boxing format depends on the platform.
NaN-boxing is a technique based on the fact that in IEEE-754 there are
2**53-2 different bit patterns that all represent NaN. Hence, we can
encode any floating-point value as a C++ double (noting that
JavaScript NaN must be represented as one canonical NaN format). Other
values are encoded as a value and a type tag:
On x86, ARM, and similar 32-bit platforms, we use what we call
"nunboxing", in which non-double values are a 32-bit type tag and a
32-bit payload, which is normally either a pointer or a signed 32-bit
integer. There are a few special values: NullValue(),
UndefinedValue(), TrueValue() and FalseValue(). On x64 and similar
64-bit platforms, pointers are longer than 32 bits, so we can't use
the nunboxing format. Instead, we use "punboxing", which has 17 bits
of tag and 47 bits of payload. Only JIT code really depends on the
layout--everything else in the engine interacts with values through
functions like val.isDouble(). Most parts of the JIT also avoid
depending directly on the layout: the files PunboxAssembler.h and
NunboxAssembler.h are used to generate native code that depends on the
value layout.
Objects consist of a possibly shared structural description, called
the map or scope; and unshared property values in a vector, called the
slots. Each property has an id, either a nonnegative integer or an
atom (unique string), with the same tagged-pointer encoding as a
jsval.
The atom manager consists of a hash table associating strings uniquely
with scanner/parser information such as keyword type, index in script
or function literal pool, etc. Atoms play three roles: as literals
referred to by unaligned 16-bit immediate bytecode operands, as unique
string descriptors for efficient property name hashing, and as members
of the root GC set for exact GC.
According to W3Schools:
This format stores numbers in 64 bits, where the number (the fraction)
is stored in bits 0 to 51, the exponent in bits 52 to 62, and the sign
in bit 63:
Value (aka Fraction/Mantissa): 52 bits (0 - 51)
Exponent: 11 bits (52 - 62)
Sign: 1 bit (63)
Also read this article here.
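The float64 layout quoted above can be inspected directly. A minimal sketch using DataView (big-endian reads; the variable names are mine):

```javascript
// Inspect the IEEE-754 float64 bit layout of a JS number using DataView.
const buf = new ArrayBuffer(8);
const view = new DataView(buf);
view.setFloat64(0, -2.5);              // big-endian by default
const hi = view.getUint32(0);          // sign + exponent + top mantissa bits
const sign = hi >>> 31;                // bit 63
const exponent = (hi >>> 20) & 0x7ff;  // bits 52-62 (biased by 1023)
console.log(sign, exponent);           // 1 1024  (-2.5 = -1.25 * 2^1)
```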

Understanding bitwise operations in javascript

I am currently storing data inside an XML doc as binary, 20 digits long, each representing a boolean value.
<matrix>
<resource type="single">
<map>10001010100011110000</map>
<name>Resource Title</name>
<url>http://www.yoursite.com</url>
</resource>
</matrix>
I am parsing this with jQuery and am currently using a for loop and charAt() to determine whether to do stuff if the value is == "1".
for (var i = 0; i < _mapLength; i++) {
if (map.charAt(i) == "1") {
//perform something here
}
}
This takes place a few times as part of a HUGE loop that has run sort of slow. Someone told me that I should use bitwise operators to process this and it would run faster.
My question is either:
Can someone offer me an example of how I could do this? I've tried to read tutorials online and they seem to be flying right over my head. (FYI: I am planning on creating a Ruby script that will convert my binary 0 & 1's into bits in my XML.)
Or does anyone know of a good, simple (maybe even dumbed down version) tutorial or something that could help me grasp these bitwise operator concepts?
Assuming you have no more than 32 bits, you can use JavaScript's built-in parseInt() function to convert your string of 1s and 0s into an integer, and then test the flags using the & (and) operator:
var flags = parseInt("10001010100011110000", 2); // base 2
if ( flags & 0x1 )
{
// do something
}
...
See also: How to check my byte flag?
(question is on the use in C, but applies to the same operators in JS as well)
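Putting this together, a minimal sketch of replacing the charAt() loop with shift-and-mask tests (the left-to-right indexing here is chosen to mirror charAt(i); this is my adaptation, not code from the question):

```javascript
// Convert the 20-character binary string to an integer once,
// then test each position with a shift and mask instead of charAt().
const map = parseInt("10001010100011110000", 2); // <= 32 bits is safe here
const mapLength = 20;
for (let i = 0; i < mapLength; i++) {
  // test the bit that corresponds to map.charAt(i)
  if ((map >> (mapLength - 1 - i)) & 1) {
    // perform something here
  }
}
```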
Single ampersand (&, as opposed to &&) does bit-wise comparison. But first you need to convert your strings to numbers using parseInt().
var map = parseInt("10010", 2); // the 2 tells it to treat the string as binary
var maskForOperation1 = parseInt("10000", 2);
var maskForOperation2 = parseInt("01000", 2);
// ...
if (map & maskForOperation1) { Operation1(); }
if (map & maskForOperation2) { Operation2(); }
// ...
Be extremely wary. JavaScript does not have integers -- numbers are stored as 64-bit floating-point. You get exact integer representation out to 53 bits. If you have more flags than that, bad things will happen as your "number" gets rounded to the nearest representable floating-point number. (ouch!)
Also, bitwise manipulation will not help performance, because the floating point number will be converted to an integer, tested, and then converted back.
If you have several places that you want to check the flags, I'd set the flags on an object, preferably with names, like so:
var flags = {};
flags.use_apples = map.charAt(4);
flags.use_bananas = map.charAt(10);
etc...
Then you can test those flags inside your loop:
if(flags.use_apples) {
do_apple_thing();
}
An object slot test will be faster than a bitwise check, since Javascript is not optimized for bitwise operators. However, if your loop is slow, I fear that decoding these flags is probably not the source of the slowness.
Bitwise operators will certainly be faster but only linearly and not by much. You'll probably save a few milliseconds (unless you're processing HUGE amounts of data in Javascript, which is most likely a bad idea anyway).
You should think about profiling other code in your loop to see what's slowing it down the most. What other algorithms, data structures and allocations do you have in there that could use refactoring?

What is the accepted way to send 64-bit values over JSON?

Some of my data are 64-bit integers. I would like to send these to a JavaScript program running on a page.
However, as far as I can tell, integers in most JavaScript implementations are 32-bit signed quantities.
My two options seem to be:
Send the values as strings
Send the values as 64-bit floating point numbers
Option (1) isn't perfect, but option (2) seems far less perfect (loss of data).
How have you handled this situation?
There is in fact a limitation at the JavaScript/ECMAScript level: precision for integers is limited to 53 bits (they are stored in the mantissa of a "double-like" 8-byte memory buffer). So big numbers transmitted as JSON won't be deserialized as expected by the JavaScript client, which would truncate them to its 53-bit resolution.
> parseInt("10765432100123456789")
10765432100123458000
See the Number.MAX_SAFE_INTEGER constant and Number.isSafeInteger() function:
The MAX_SAFE_INTEGER constant has a value of 9007199254740991. The
reasoning behind that number is that JavaScript uses double-precision
floating-point format numbers as specified in IEEE 754 and can only
safely represent numbers between -(2^53 - 1) and 2^53 - 1.
Safe in this context refers to the ability to represent integers
exactly and to correctly compare them. For example,
Number.MAX_SAFE_INTEGER + 1 === Number.MAX_SAFE_INTEGER + 2 will
evaluate to true, which is mathematically incorrect. See
Number.isSafeInteger() for more information.
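The quoted behaviour is easy to reproduce:

```javascript
console.log(Number.MAX_SAFE_INTEGER);       // 9007199254740991 (2^53 - 1)
console.log(Number.MAX_SAFE_INTEGER + 1
         === Number.MAX_SAFE_INTEGER + 2);  // true - both round to 2^53
console.log(Number.isSafeInteger(2 ** 53)); // false
```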
Due to the resolution of floats in JavaScript, using "64-bit floating point numbers" as you proposed would suffer from the very same restriction.
IMHO the best option is to transmit such values as text. It would still be perfectly readable JSON content, and would be easy to work with at the JavaScript level.
A "pure string" representation is what OData specifies, for its Edm.Int64 or Edm.Decimal types.
What the Twitter API does in this case, is to add a specific ".._str": field in the JSON, as such:
{
"id": 10765432100123456789, // for JSON compliant clients
"id_str": "10765432100123456789", // for JavaScript
...
}
I like this option very much, since it would be still compatible with int64 capable clients. In practice, such duplicated content in the JSON won't hurt much, if it is deflated/gzipped at HTTP level.
Once transmitted as string, you may use libraries like strint – a JavaScript library for string-encoded integers to handle such values.
Update: Newer versions of JavaScript engines include a BigInt object class, which is able to handle more than 53 bits. In fact, it can be used for arbitrarily large integers, so it is a good fit for 64-bit integer values. Note, however, that JSON.stringify() throws a TypeError when it encounters a BigInt, so you still have to convert the value to a string (e.g. in a replacer function) before serializing.
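A sketch of round-tripping 64-bit values through JSON with BigInt, converting to and from strings at the serialization boundary (the field name id is just an example; the replacer is needed because JSON.stringify does not serialize BigInt on its own):

```javascript
const payload = { id: 10765432100123456789n };

// A replacer turns every BigInt into a string before serialization:
const json = JSON.stringify(payload, (key, value) =>
  typeof value === "bigint" ? value.toString() : value
);
// json is '{"id":"10765432100123456789"}'

// On the receiving side, revive the known 64-bit fields back to BigInt:
const parsed = JSON.parse(json, (key, value) =>
  key === "id" ? BigInt(value) : value
);
console.log(parsed.id === 10765432100123456789n); // true
```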
This seems to be less a problem with JSON and more a problem with Javascript itself. What are you planning to do with these numbers? If it's just a magic token that you need to pass back to the website later on, by all means simply use a string containing the value. If you actually have to do arithmetic on the value, you could possibly write your own Javascript routines for 64-bit arithmetic.
One way that you could represent values in Javascript (and hence JSON) would be by splitting the numbers into two 32-bit values, eg.
[ 12345678, 12345678 ]
To split a 64-bit value into two 32-bit values, do something like this:
output_values[0] = (input_value >> 32) & 0xffffffff;
output_values[1] = input_value & 0xffffffff;
Then to recombine two 32-bit values to a 64-bit value:
input_value = (((int64_t) output_values[0]) << 32) | output_values[1];
JavaScript's Number type (64-bit IEEE 754) only has 53 bits of integer precision.
But if you don't need to do any addition or multiplication, then you could keep a 64-bit value as a 4-character string, since JavaScript strings use 16-bit (UTF-16) code units.
For example, 1 could be encoded as "\u0000\u0000\u0000\u0001". This has the advantage that value comparison (==, >, <) works on strings as expected. It also seems straightforward to write bit operations:
function and64(a,b) {
var r = "";
for (var i = 0; i < 4; i++)
r += String.fromCharCode(a.charCodeAt(i) & b.charCodeAt(i));
return r;
}
The JS number representation is a standard IEEE double, so you can't represent a full 64-bit integer. You get 53 bits of actual integer precision in a double, but all JS bitops reduce to 32-bit precision (that's what the spec requires. yay!), so if you really need a 64-bit int in JS you'll need to implement your own 64-bit int logic library.
JSON itself doesn't care about implementation limits.
your problem is that JS can't handle your data, not the protocol.
In other words, your JS client code has to use either of those non-perfect options.
This happened to me. All hell broke loose when sending large integers via JSON into JSON.parse. I spent days trying to debug. The problem was immediately solved when I transmitted the values as strings.
Use
{ "the_sequence_number": "20200707105904535" }
instead of
{ "the_sequence_number": 20200707105904535 }
To make it worse, it would seem that JSON.parse is implemented in some shared library between Firefox, Chrome and Opera, because they all behaved exactly the same. Opera error messages even have Chrome URL references in them, almost like a WebKit component shared by browsers.
console.log('event_listen[' + global_weird_counter + ']: to be sure, server responded with [' + aresponsetxt + ']');
var response = JSON.parse(aresponsetxt);
console.log('event_listen[' + global_weird_counter + ']: after json parse: ' + JSON.stringify(response));
The behaviour I got was the sort of stuff you see when pointer math goes horribly bad. Ghosts were flying out of my workstation, wreaking havoc in my sleep. They are all exorcised now that I've switched to strings.