I'm in the process of writing a Node.js-based application which talks via TCP to a C++-based server. The server speaks a binary protocol, quite similar to Protocol Buffers but not exactly the same.
One data type the server returns is an unsigned 64-bit integer (uint64_t), serialized as a varint, where the most significant bit of each byte indicates whether the next byte is also part of the int.
I'm currently unable to parse this out in JavaScript because of the 32-bit limitation on bitwise operations, and the fact that JS doesn't support 64-bit integers natively. Does anyone have any suggestions on how I could do this?
My varint reading code is very similar to that shown here: https://github.com/chrisdickinson/varint/blob/master/decode.js
I thought I could use node-bignum to represent the number, but I'm unsure how to turn a Buffer consisting of varint bytes into this.
I simply took the existing varint read module and modified it to yield a Bignum object instead of a regular number:
var Bignum = require('bignum');

module.exports = read;

var MSB = 0x80
  , REST = 0x7F;

function read(buf, offset) {
  var res = Bignum(0)
    , offset = offset || 0
    , counter = offset
    , b
    , shift = 0
    , l = buf.length;

  do {
    if (counter >= l) {
      // Ran off the end of the buffer before the varint terminated
      read.bytes = 0;
      return undefined;
    }
    b = buf[counter++];
    // Add the low 7 bits of this byte into the result at the current position
    res = res.add(Bignum(b & REST).shiftLeft(shift));
    shift += 7;
  } while (b >= MSB);

  read.bytes = counter - offset;
  return res;
}
Use it exactly the same way as you would have used the original decode module.
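For example, a minimal usage sketch (decode_bignum.js is a hypothetical filename for the module above):

var read = require('./decode_bignum');

// 300 encoded as a varint is the two bytes 0xAC 0x02
var buf = new Buffer([0xAC, 0x02]);
var n = read(buf);

console.log(n.toString()); // "300"
console.log(read.bytes);   // 2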
I'm trying to understand how the conversion of C code to WebAssembly and the JavaScript interop work in the background, and I'm having problems getting a simple string from a function parameter.
My program is a simple Hello World, and I'm trying to "emulate" a printf/puts.
More or less the C equivalent I want to build:
#include <stdio.h>

int main() {
    puts("Hello World\n");
}
You can see a working example here.
My best idea currently is to read 16-bit chunks of memory at a time (since wasm seems to allocate them in 16-bit intervals) and check for the null-termination.
function get_string(memory, addr) {
  var length = 0;
  while (true) {
    // Look at the next 16-byte window for the terminating null byte
    let buffer = new Uint8Array(memory.buffer, addr + length, 16);
    let term = buffer.indexOf(0);
    length += term == -1 ? 16 : term;
    if (term != -1) break;
  }
  const strBuf = new Uint8Array(memory.buffer, addr, length);
  return new TextDecoder().decode(strBuf);
}
But this seems really clumsy. Is there a better way to read a string from the memory if you only know the start address?
And is it really necessary that I only read 16-bit chunks at a time?
I couldn't find any information on whether creating a typed array over the memory counts as accessing the whole memory, or whether that only happens when I actually read data from the array.
WebAssembly allocates memory in 64k pages. Maybe this is where the 16-bit thing came from, because 16 bits can address 64 kbytes. However, this is irrelevant to the task at hand: since WebAssembly memory is just a contiguous address space, there isn't much difference between the memory object and an ArrayBuffer of the given size, if there's any at all.
The 16-byte-window-at-a-time approach isn't necessary either (somewhere along the way 16 bits became 16 bytes).
You can do it simply, without any performance penalty, by creating a view over the rest of the buffer in the following way:
function get_string(memory, addr) {
  // One view from addr to the end of linear memory; creating the view doesn't copy anything
  let buffer = new Uint8Array(memory.buffer, addr, memory.buffer.byteLength - addr);
  // Find the null terminator and decode only the bytes before it
  let term = buffer.indexOf(0);
  return new TextDecoder().decode(buffer.subarray(0, term));
}
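A minimal usage sketch, assuming the module exports its linear memory as memory and a hypothetical export get_hello_ptr that returns the address of a null-terminated string:

WebAssembly.instantiateStreaming(fetch('hello.wasm')).then(({ instance }) => {
  // get_hello_ptr is a hypothetical export; use whatever function hands you the pointer
  const addr = instance.exports.get_hello_ptr();
  console.log(get_string(instance.exports.memory, addr)); // "Hello World\n"
});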
I have an API for generating a random map tileset, served by an Erlang server. With safe integers JavaScript can call the API fine, but the problem starts when a value is larger than Number.MAX_SAFE_INTEGER.
In that case the server doesn't receive the integer, because JavaScript sends the number in scientific notation as 1e+21 rather than 999999999999999999999, and I need the number as a string, not in scientific notation.
How can I get a string like "999999999999999999999" in JavaScript to send to the API, instead of a string in scientific notation? Is there a small library for arbitrarily long numbers in JavaScript that handles just this? The number is an X and Y position and it has to be exact, and a big library is not an option because I need good performance to render the map in milliseconds; this is for a browser game, not for astronomy calculations.
You could use the BigNumber library (https://github.com/MikeMcl/bignumber.js/).
These 2 lines should do the trick:
BigNumber.config({EXPONENTIAL_AT: [-10, 30]});
var stringifiedLongNumber = new BigNumber('999999999999999999999').toString();
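// stringifiedLongNumber is now "999999999999999999999" (plain digits, no exponent)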
EXPONENTIAL_AT config parameter documentation: http://mikemcl.github.io/bignumber.js/#exponential-at
One possible solution in this case is to not send the new position to the server at all, and only send a movement event; the Erlang server can then calculate the new position with an arbitrarily long number (as long as there is enough memory for a very big number), and JavaScript never has to handle it.
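For example, the client could send only the delta and let the server keep the authoritative big-integer position (a hypothetical payload shape, not the actual API):

// Send a movement event instead of the absolute position;
// the Erlang side applies dx/dy to its own arbitrarily large X/Y
fetch('/api/move', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ player: 42, dx: 1, dy: 0 })
});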
Maybe a better option is a small hand-rolled solution, using asm.js-style optimizations and working with strings to add to and subtract from the number. This is a simple example; it could certainly be done better:
var a = "999999";
var b = "";
var acarreo = 1|0; // acarreo ("carry") starts at 1 so that 1 is added to the string
console.log(a);

for (var i = a.length - 1 | 0; i >= 0; i--) {
  // Add the carry to the current digit
  var j = (parseInt(a[i], 10) | 0) + acarreo | 0;
  if (j > 9) {
    j -= 10;
    acarreo = 1|0;
  } else {
    acarreo = 0|0;
  }
  b = j.toString() + b;
}

if (acarreo > 0) {
  a = acarreo.toString() + b;
} else {
  a = b;
}
console.log(a); // "1000000"
I am looking for a portable algorithm for creating a hashCode for binary data. None of the binary data is very long -- I am Avro-encoding keys for use in kafka.KeyedMessages -- we're probably talking anywhere from 2 to 100 bytes in length, but most of the keys are in the 4 to 8 byte range.
So far, my best solution is to convert the data to a hex string, and then do a hashCode of that. I'm able to make that work in both Scala and JavaScript. Assuming I have defined b: Array[Byte], the Scala looks like this:
b.map("%02X" format _).mkString.hashCode
It's a little more elaborate in JavaScript -- luckily someone has already ported the basic hashCode algorithm to JavaScript -- but the point is that by creating a hex string to represent the binary data, I can ensure the hashing algorithm works off the same inputs.
On the other hand, I have to create an object twice the size of the original just to create the hashCode. Luckily most of my data is tiny, but still -- there has to be a better way to do this.
Instead of expanding the data to its hex value, I presume you could just coerce the binary data into a String so the String has the same number of characters as the binary data has bytes. It would be all garbled, more control characters than printable characters, but it would be a string nonetheless. Do you run into portability issues though? Endianness, Unicode, etc.
Incidentally, if you got this far reading and don't already know this -- you can't just do:
val b: Array[Byte] = ...
b.hashCode // identity-based, so two equal arrays give different hash codes
Luckily I already knew that before I started, because I ran into that one early on.
Update
Based on the first answer given, it appears at first blush that java.util.Arrays.hashCode(Array[Byte]) would do the trick. However, if you follow the javadoc trail, you'll see that this is the algorithm behind it, which is based on the algorithm for List combined with the hashCode of a single Byte:
int hashCode = 1;
for (Byte e : list)
    hashCode = 31 * hashCode + (e == null ? 0 : e.intValue());
As you can see, all it's doing is creating a Long representing the value. At a certain point, the number gets too big and it wraps around. This is not very portable. I can get it to work for JavaScript, but you have to import the npm module long. If you do, it looks like this:
const Long = require('long');

function bufferHashCode(buffer) {
  var hashCode = new Long(1);
  for (var value of buffer.values()) {
    hashCode = hashCode.multiply(31).add(value);
  }
  return hashCode;
}
bufferHashCode(new Buffer([1,2,3]));
// hashCode = Long { low: 30817, high: 0, unsigned: false }
And you do get the same results when the data wraps around, sort of, though I'm not sure why. In Scala:
java.util.Arrays.hashCode(Array[Byte](1,2,3,4,5,6,7,8,9,10))
// res30: Int = -975991962
Note that the result is an Int. In JavaScript:
bufferHashCode(new Buffer([1,2,3,4,5,6,7,8,9,10]));
// hashCode = Long { low: -975991962, high: 197407, unsigned: false }
So I have to take the low 32 bits and ignore the high, but otherwise I get the same results.
This functionality is already available in the Java standard library; look at the Arrays.hashCode() method.
Because your binary data are Array[Byte], here is how you can verify it works:
println(java.util.Arrays.hashCode(Array[Byte](1,2,3))) // prints 30817
println(java.util.Arrays.hashCode(Array[Byte](1,2,3))) // prints 30817
println(java.util.Arrays.hashCode(Array[Byte](2,2,3))) // prints 31778
Update: It is not true that the Java implementation boxes the bytes. Of course, there is conversion to int, but there's no way around that. This is the Java implementation:
public static int hashCode(byte a[]) {
    if (a == null)
        return 0;

    int result = 1;
    for (byte element : a)
        result = 31 * result + element;

    return result;
}
Update 2
If what you need is a JavaScript implementation that gives the same results as the Scala/Java implementation, then you can extend the algorithm by, e.g., taking only the rightmost 31 bits:
def hashCode(a: Array[Byte]): Int = {
  if (a == null) {
    0
  } else {
    var hash = 1
    var i: Int = 0
    while (i < a.length) {
      hash = 31 * hash + a(i)
      hash = hash & Int.MaxValue // taking only the rightmost 31 bits
      i += 1
    }
    hash
  }
}
and JavaScript:
var hashCode = function(arr) {
  if (arr == null) return 0;
  var hash = 1;
  for (var i = 0; i < arr.length; i++) {
    hash = hash * 31 + arr[i];
    hash = hash % 0x80000000; // taking only the rightmost 31 bits in integer representation
  }
  return hash;
};
Why do the two implementations produce the same results? In Java, integer overflow behaves as if the addition was performed without loss of precision and then bits higher than 32 got thrown away, and & Int.MaxValue throws away the 32nd bit. In JavaScript, there is no loss of precision for integers up to 2^53, which is a limit the expression 31 * hash + a(i) never exceeds. % 0x80000000 then behaves as taking the rightmost 31 bits. The case without overflows is obvious.
This is the meat of the algorithm used in the Java library:
int result = 1;
for (byte element : a)
    result = 31 * result + element;
You comment:
this algorithm isn't very portable
Incorrect. If we are talking about Java, then provided we all agree on the type of the result, the algorithm is 100% portable.
Yes the computation overflows, but it overflows exactly the same way on all valid implementations of the Java language. A Java int is specified to be 32 bits signed two's complement, and the behavior of the operators when overflow occurs is well-defined ... and the same for all implementations. (The same goes for long ... though the size is different, obviously.)
I'm not an expert, but my understanding is that Scala's numeric types have the same properties as Java's. JavaScript is different, being based on IEEE 754 double-precision floating point. However, with care you should be able to code the Java algorithm portably in JavaScript. (I think #Mifeet's version is wrong ...)
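A sketch of what that could look like (my own illustration, not code from either answer): emulate Java's 32-bit wraparound with Math.imul and the | 0 truncation, and interpret the bytes as signed the way Java does:

// Reproduces java.util.Arrays.hashCode(byte[]) semantics in JavaScript
function javaByteArrayHashCode(bytes) {
  var hash = 1;
  for (var i = 0; i < bytes.length; i++) {
    // Treat the unsigned 0..255 value from a Buffer/Uint8Array as a signed Java byte (-128..127)
    var b = bytes[i] > 127 ? bytes[i] - 256 : bytes[i];
    // Math.imul multiplies with 32-bit wraparound; | 0 truncates the sum back to a signed 32-bit int
    hash = (Math.imul(hash, 31) + b) | 0;
  }
  return hash;
}

console.log(javaByteArrayHashCode([1, 2, 3])); // 30817, same as java.util.Arrays.hashCode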
I'm working on the Rosalind problem Mortal Fibonacci Rabbits, and the website keeps telling me my answer is wrong when I use my algorithm written in JavaScript. When I use the same algorithm in Python I get a different (and correct) answer.
The inconsistency only happens when the result gets large. For example fibd(90, 19) returns 2870048561233730600 in JavaScript but in Python I get 2870048561233731259.
Is there something about numbers in JavaScript that gives me a different answer, or am I making a subtle mistake in my JavaScript code?
The JavaScript solution:
function fibd(n, m) {
  // Create an array of length m and set all elements to 0
  // (Array.prototype.map skips the holes of a sparse array, so use fill instead)
  var rp = new Array(m).fill(0);
  rp[0] = 1;

  for (var i = 1; i < n; i++) {
    // Prepend the sum of all elements from index 1 to the end of the array
    rp.splice(0, 0, rp.reduce(function (sum, e) { return sum + e; }) - rp[0]);
    // Remove the final element
    rp.pop();
  }

  // Sum up all the elements
  return rp.reduce(function (sum, e) { return sum + e; });
}
The Python solution:
def fibd(n, m):
    # Create an array of length m and set all elements to 0
    rp = [0] * m
    rp[0] = 1
    for i in range(n-1):
        # Prepend the sum of all elements from index 1 to the end, dropping the final element
        rp = [sum(rp[1:])] + rp[:-1]
    return sum(rp)
I think JavaScript only has a "Number" datatype, and this is actually an IEEE double under the hood. 2,870,048,561,233,730,600 is too large to hold precisely in an IEEE double, so it is approximated. (Notice the trailing "00" - about 17 significant decimal digits is the limit for a double.)
Python, on the other hand, has bignum support, and will quite cheerfully deal with 4096-bit integers (for those that play around with cryptographic algorithms, this is a huge boon).
You might be able to find a JavaScript bignum library if you search - for example http://silentmatt.com/biginteger/
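These days you could also sidestep the precision problem entirely with native BigInt (which didn't exist when this was asked). A sketch of the same algorithm, my own adaptation rather than either poster's code:

function fibd(n, m) {
  // Same algorithm as above, but BigInt keeps the large sums exact
  var rp = new Array(m).fill(0n);
  rp[0] = 1n;
  for (var i = 1; i < n; i++) {
    rp.splice(0, 0, rp.reduce(function (sum, e) { return sum + e; }) - rp[0]);
    rp.pop();
  }
  return rp.reduce(function (sum, e) { return sum + e; });
}

console.log(fibd(90, 19).toString()); // "2870048561233731259", matching the Python result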
Just doing a bit of research, this article seems interesting. JavaScript only supports 53-bit integers.
The result given by Python is indeed out of the maximum safe range for JS. If you try to do
parseInt('2870048561233731259')
It will indeed return
2870048561233731000
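You can see the threshold directly in a console (just an illustration of the limit, not part of either solution):

console.log(Number.MAX_SAFE_INTEGER);                       // 9007199254740991 (2^53 - 1)
console.log(2870048561233731259 > Number.MAX_SAFE_INTEGER); // true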
I need to encode and decode IEEE 754 floats and doubles from binary in node.js to parse a network protocol.
Are there any existing libraries that do this, or do I have to read the spec and implement it myself? Or should I write a C module to do it?
Note that as of node 0.6 this functionality is included in the core library, so that is the new best way to do it.
See http://nodejs.org/docs/latest/api/buffer.html for details.
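For example, a minimal sketch using the core Buffer API (there are readFloatLE/BE, readDoubleLE/BE and matching write methods):

var buf = new Buffer(4);
buf.writeFloatBE(5.0, 0);        // encode an IEEE 754 single at offset 0, big-endian
console.log(buf);                // <Buffer 40 a0 00 00>
console.log(buf.readFloatBE(0)); // 5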
If you are reading/writing binary data structures you might consider using a friendly wrapper around this functionality to make things easier to read and maintain. Plug follows: https://github.com/dobesv/node-binstruct
I ported a C++ (made with GNU GMP) converter with float128 support to Emscripten so that it would run in the browser: https://github.com/ysangkok/ieee-754
Emscripten produces JavaScript that will run on Node.js too. You will get the float representation as a string of bits, though; I don't know if that's what you want.
In modern JavaScript (ECMAScript 2015) you can use ArrayBuffer and Float32Array/Float64Array. I solved it like this:
// 0x40a00000 is "5" in float/IEEE-754 32bit.
// You can check this here: https://www.h-schmidt.net/FloatConverter/IEEE754.html
// MSB (Most significant byte) is at highest index
const bytes = [0x00, 0x00, 0xa0, 0x40];
// The buffer is like a raw view into memory.
const buffer = new ArrayBuffer(bytes.length);
// The Uint8Array uses the buffer as its memory.
// This way we can store data byte by byte
const byteArray = new Uint8Array(buffer);
for (let i = 0; i < bytes.length; i++) {
  byteArray[i] = bytes[i];
}
// float array uses the same buffer as memory location
const floatArray = new Float32Array(buffer);
// floatValue is a "number", because a number in javascript is a
// double (IEEE-754 # 64bit) => it can hold f32 values
const floatValue = floatArray[0];
// prints out "5"
console.log(`${JSON.stringify(bytes)} as f32 is ${floatValue}`);
// double / f64 (note: this needs an 8-byte buffer, i.e. 8 input bytes, or it will throw)
// const doubleArray = new Float64Array(buffer);
// const doubleValue = doubleArray[0];
PS: This works in NodeJS but also in Chrome, Firefox, and Edge.
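If you need explicit control over byte order, a DataView does the same job (my addition, not part of the answer above):

const bytes = [0x00, 0x00, 0xa0, 0x40];
const view = new DataView(new Uint8Array(bytes).buffer);
// The second argument selects little-endian; DataView defaults to big-endian
console.log(view.getFloat32(0, true)); // 5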