I am developing a multiplayer game server.
In my case, every single byte really matters, both for the gaming experience and for saving bandwidth.
Client and server will send integer values to each other.
The integers will mostly have values lower than 100.
In some cases, the integers could have values between 0 and 100,000.
All of the integers will be sent in the same sequence. (Imagine they form an integer array.)
Using an 8-bit or 16-bit integer array is not an option for me, because of the possible values greater than 65535.
And I do not want to use a 32-bit integer array just for values that rarely occur.
So, I developed an algorithm for that (here is the JavaScript port):
function write(buffer, number) {
    while (number > 0x7f) {
        buffer.push(0x80 | (number & 0x7f));
        number >>= 7;
    }
    buffer.push(number);
}
function read(buffer) {
    // Packed stack of shift amounts (28, 21, 14, 7, 0) in 6-bit fields.
    // JavaScript masks shift counts to their low 5 bits, so each step
    // uses only the bottom field; shift >>= 6 then pops to the next one.
    var cur, result = 0, shift = 0x1C54E1C0; // ((((((28 << 6) | 21) << 6) | 14) << 6) | 7) << 6
    while ((cur = buffer.shift()) > 0x7f) {
        result |= (cur & 0x7f) << shift;
        shift >>= 6;
    }
    return result | (cur << shift);
}
var d = [];
var number = 127;
write(d, number);
alert("value bytes: " + d);
var newResult = read(d);
alert("equals : " + (number === newResult));
My question is: is there a better way to solve this problem?
Thanks in advance
Related
Say you have two integers 10 and 20. That is 00001010 and 00010100. I would then like to just basically concat these as strings, but have the result be a new integer.
00001010 + 00010100 == 0000101000010100
That final number is 2580.
However, I am looking for a way to do this without actually converting them to string. Looking for something more efficient that just does some bit twiddling on the integers themselves. I'm not too familiar with that, but I imagine it would be along the lines of:
var a = 00001010 // == 10
var b = 00010100 // == 20
var c = a << b // == 2580
Note, I would like for this to work with any sequences of bits. So even:
var a = 010101
var b = 01110
var c = a + b == 01010101110
Your basic equation is:
c = b + (a << 8).
The trick here is that you need to always shift by 8. But since a and b do not always use all 8 bits of the byte, a plain number carries no record of its leading zeros. We need to recover the number of leading zeros of b (equivalently, trailing zeros to append to a) and put them back before adding, so all the bits stay in their proper positions. This requires an equation like this:
c = b + (a << s + r)
Where s is the highest set bit (going from right to left) in b, and r is the remaining number of bits such that s + r = 8.
Essentially, all you are doing is shifting the first operand a over by 8 bits, effectively adding trailing zeros to a or, equivalently, padding leading zeros onto the second operand b. Then you add normally. This can be accomplished using logarithms, shifting, and the bitwise OR operation, providing an O(1) solution for arbitrary positive integers a and b, where the number of bits in a and b does not exceed some positive integer n. In the case of a byte, n = 8.
// Bitwise log base 2 in O(1) time
function log2(n) {
    // Assumes n > 0
    let bits = 0;
    if (n > 0xffff) {
        n >>= 16;
        bits = 0x10;
    }
    if (n > 0xff) {
        n >>= 8;
        bits |= 0x8;
    }
    if (n > 0xf) {
        n >>= 4;
        bits |= 0x4;
    }
    if (n > 0x3) {
        n >>= 2;
        bits |= 0x2;
    }
    if (n > 0x1) {
        bits |= 0x1;
    }
    return bits;
}
// Computes the highest set bit,
// counting from right to left starting
// at 0. For 20 (10100) we get bit #4.
function msb(n) {
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    n = n + 1;
    // After the OR-folding, n + 1 is the next
    // power of 2 above the input (for 20, n is
    // now 32), so we halve it and take the log
    // base 2 to get the bit position.
    return log2(n >> 1);
}
// Operands
let a = 0b00001010 // 10
let b = 0b00010100 // 20
// Max number of bits in
// in binary number
let n = 8
// Max set bit is the 16 bit, which is in position
// 4. We will need to pad 4 more zeros
let s = msb(b)
// How many zeros to pad on the left
// 8 - 4 = 4
let r = Math.abs(n - s)
// Shift a over by the computed
// number of bits including padded zeros
let c = b + (a << s + r)
console.log(c)
Output:
2580
Notes:
This is NOT commutative.
Add error checking to log2() for negative numbers, and other edge cases.
References:
https://www.geeksforgeeks.org/find-significant-set-bit-number/
https://github.com/N02870941/java_data_structures/blob/master/src/main/java/util/misc/Mathematics.java
so the problem:
a is 10 (in binary 0000 1010)
b is 20 (in binary 0001 0100)
you want to get 2580 using bit shift somehow.
if you left-shift a by 8 using a <<= 8 (this is the same as multiplying a by 2^8) you get 1010 0000 0000, which is 10 * 2^8 = 2560. Since the lower bits of a are now all 0's (<< fills the new bits with 0), you can just add b on top of it: 1010 0000 0000 + 0001 0100 gives you 1010 0001 0100.
so in 1 line of code, it's var result = (a << 8) + b. (The parentheses matter: << has lower precedence than +, so a << 8 + b would shift by 8 + b.) Remember that most programming languages have no explicit built-in type for "binary", but everything is binary in its nature: an int is "binary", an object is "binary", etc. When you want to do binary operations on some data, you can just use the datatype you have as an operand.
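The one-liner above can be sketched as a runnable snippet (note the parentheses: in JavaScript, << binds more loosely than +, so a << 8 + b would actually shift by 8 + b):

```javascript
var a = 10; // 0000 1010
var b = 20; // 0001 0100

// Shift a into the high byte, then add b into the low byte.
var result = (a << 8) + b;

console.log(result); // 2580
console.log(result.toString(2)); // "101000010100"
```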
this is a more general version of how to concatenate two numbers' binary representations without any string operations:
/*
This function concatenates b to the end of a, with `padding` extra 0 bits in between.
b is treated as starting at its most significant 1 bit.
b needs to be bigger than 0; otherwise Math.log2 gives -Infinity for 0 and NaN for negative b.
*/
function concate_bits(a, b, padding) {
    // add the padding 0's to a
    a <<= padding;
    // the position of b's highest set bit
    var power_of_2 = Math.floor(Math.log2(b));
    var power_of_2_value;
    while (power_of_2 >= 0) {
        power_of_2_value = 2 ** power_of_2;
        a <<= 1;
        if (b >= power_of_2_value) {
            a += 1;
            b -= power_of_2_value;
        }
        power_of_2--;
    }
    return a;
}
//this will print 2580 as the result
let result = concate_bits(10, 20, 3);
console.log(result);
Note, I would like for this to work with any sequences of bits. So even:
var a = 010101
var b = 01110
var c = a + b == 01010101110
This isn't going to be possible unless you convert to a string or otherwise store the number of bits in each number. 10101 010101 0010101 etc are all the same number (21), and once this is converted to a number, there is no way to tell how many leading zeroes the number originally had.
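One way around this, sketched here with illustrative names, is to carry the bit width alongside each value so the leading zeros survive:

```javascript
// A "bit sequence" is a value plus an explicit width, so leading
// zeros are not lost. Concatenation shifts the first value left
// by the second value's width.
function concatBits(a, b, bWidth) {
  return (a << bWidth) | b;
}

// 010101 followed by 01110 (width 5) -> 01010101110
var c = concatBits(0b010101, 0b01110, 5);
console.log(c.toString(2)); // "1010101110" (the leading 0 is not printed)
```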
Looking at these implementations, I am wondering if one could explain the reason behind the specific operations. Not coming from computer science, I am not sure why these decisions were made.
function binb2rstr(input) {
    var str = []
    for (var i = 0, n = input.length * 32; i < n; i += 8) {
        var code = (input[i >> 5] >>> (24 - i % 32)) & 0xFF
        var val = String.fromCharCode(code)
        str.push(val)
    }
    return str.join('')
}
function rstr2binb(input) {
    var output = Array(input.length >> 2)
    for (var i = 0, n = output.length; i < n; i++) {
        output[i] = 0
    }
    for (var i = 0, n = input.length * 8; i < n; i += 8) {
        output[i >> 5] |= (input.charCodeAt(i / 8) & 0xFF) << (24 - i % 32)
    }
    return output
}
What I understand so far are:
i += 8 is for iterating through bytes.
0xFF is 255, which is 2^8 - 1, so 1 byte.
32 which is the size of a word, or 4 bytes
| is bitwise OR, <<, >>>, and & are likewise bit operators.
The % modulus keeps the value within that max value of x = x % max.
What I don't understand is:
i >> 5, how that was picked.
& 0xFF, how that was picked.
24 - i % 32, where the 24 came from.
var code = (input[i >> 5] >>> (24 - i % 32)) & 0xFF, how the character code is computed from that.
input.length >> 2
I'm wondering if this is a standard computer-science algorithm, because it's hard to tell where these values come from and how one would learn this. It seems like they must follow a standard pattern based on byte length, but I can't work it out from these open questions. Thank you for your help.
This code consists of some pretty clever bit-fiddling based on 32-bit values.
But let's work on your points:
i >> 5, how that was picked.
This divides i by 32 --- corresponding to the n = input.length * 32 overall length. Considering the whole algorithm this means that one value is processed four times (0,8,16,24) before selecting the next input value.
& 0xFF, how that was picked.
This simply selects the lowest 8-bit of a n-bit value.
24 - i % 32, where the 24 came from.
This relates to i += 8. The i % 32 indicates four different iterations (32/8=4) which are temp= (0, 8, 16, 24). So 24-temp results in (24,16,8,0).
var code = (input[i >> 5] >>> (24 - i % 32)) & 0xFF, how the character code is computed from that.
1. 1st iteration: i=0; 24-0=24; input[0] >>> 24 & 0xFF = highest byte of input[0], moved into the lowest position
2. 2nd iteration: i=8; 24-8=16; input[0] >>> 16 & 0xFF = 2nd-highest byte of input[0], moved into the lowest position
3. 3rd iteration: i=16; 24-16=8; input[0] >>> 8 & 0xFF = 2nd-lowest byte of input[0], moved into the lowest position
4. 4th iteration: i=24; 24-24=0; input[0] >>> 0 & 0xFF = lowest byte of input[0], unchanged
This is the big-endian conversion.
The next pass has i=32 and starts processing the next input value: input[32 >> 5] = input[1].
Overall this algorithm shifts the 32-bit code to the right and masks the lowest 8-bit to be used as a CharCode by String.fromCharCode(code).
The last one is from a different function, and so input.length >> 2 simply divides by 4 (one 32-bit word per 4 characters), discarding any remainder.
Concerning your last question:
It seems like these values must be a standard algorithm based on byte length but I can't tell how to get there with these open questions.
This is far from a standard algorithm. It is just a clever bit-manipulation based on bytes.
In assembler this code would be even easier to understand.
There is even one instruction called BSWAP to swap between 32-bit Big-Endian and Little-Endian values in a register.
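For comparison, modern JavaScript can make the byte order explicit with a DataView; a small sketch of the same big-endian extraction:

```javascript
// Store a 32-bit word big-endian, then pull its bytes out in order,
// mirroring the manual (input[i >> 5] >>> (24 - i % 32)) & 0xFF extraction.
var buf = new ArrayBuffer(4);
var view = new DataView(buf);
view.setUint32(0, 0x41424344, false); // false = big-endian

var chars = [];
for (var i = 0; i < 4; i++) {
  chars.push(String.fromCharCode(view.getUint8(i)));
}
console.log(chars.join('')); // "ABCD"
```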
The following Python code seems to work very well to get the two's complement of a number:
def twos_comp(self, val, bits):
    if (val & (1 << (bits - 1))) != 0:
        val -= 1 << bits
    return val
It is used as num = self.twos_comp(int(binStr, 2), len(binStr))
where binStr is a string that contains an arbitrary length binary number.
I need to do the exact same thing in javascript (for node.js). I've been fighting it all day and am about to resign from the human race. Clearly binary / bitwise math is not my strong suit.
Could someone please assist so I can go on to more productive time wasting :-)
function toTwosComplement(integer, numberBytes, dontCheckRange) {
    // #integer -> the integer to convert
    // #numberBytes -> the number of bytes representing the number (defaults to 1 if not specified)
    var numberBits = (numberBytes || 1) * 8;

    // make sure it's in range given the number of bits
    if (!dontCheckRange && (integer < (-(1 << (numberBits - 1))) || integer > ((1 << (numberBits - 1)) - 1)))
        throw "Integer out of range given " + numberBytes + " byte(s) to represent.";

    // if positive, return the positive value
    if (integer >= 0)
        return integer;

    // if negative, convert to two's complement representation
    return ~(((-integer) - 1) | ~((1 << numberBits) - 1));
}
I got following bit pattern:
1000 0001 (129)
I now want to set the last four bits to values of my choosing (1-10, 0x1-0xA):
1000 0010
or
1000 1000
I have actually no idea how I can accomplish this. I could read out the first four bits:
var buff = new Buffer(1);
buff[0] = 129;
var isFirstBitSet = (buff[0] & 128) === 128;
var isSecondBitSet = (buff[0] & 64) === 64;
var isThirdBitSet = (buff[0] & 32) === 32;
var isFourthBitSet = (buff[0] & 16) === 16;
buff[0] = 0xA;
if (isFirstBitSet) {
    buff[0] = buff[0] | 128;
}
and map then on a new one but I think it is self explained that this is crap.
You can set the low four bits of an integer by first ANDing the integer with 0xfffffff0 and then ORing it with your four-bit value.
function setLowFour(n, lowFour) {
return (n & 0xfffffff0) | (lowFour & 0xf);
}
Note that JavaScript doesn't really have an integer type. The bitwise operations force the values to be integers, but they're really still stored as floating point numbers.
edit — I think it actually works too :-) setLowFour(1025, 12) returns 1036. How's that for unit testing?
*"Efficient" here basically means in terms of smaller size (to reduce the IO waiting time), and speedy retrieval/deserialization times. Storing times are not as important.
I have to store a couple of dozen arrays of integers, each with 1800 values in the range 0-50, in the browser's localStorage -- that is, as a string.
Obviously, the simplest method is to just JSON.stringify it; however, that adds a lot of unnecessary information, considering that the range of the data is well known. An average size for one of these arrays is then ~5500 bytes.
Here are some other methods I've tried (resultant size, and time to deserialize it 1000 times at the end)
zero-padding the numbers so each was 2 characters long, eg:
[5, 27, 7, 38] ==> "05270738"
base 50 encoding it:
[5, 11, 7, 38] ==> "5b7C"
just using the value as a character code (adding 32 to avoid the weird control characters at the start):
[5, 11, 7, 38] ==> "%+'F" (String.fromCharCode(37), String.fromCharCode(43) ...)
Here are my results:
size Chrome 18 Firefox 11
-------------------------------------------------
JSON.stringify 5286 60ms 99ms
zero-padded 3600 354ms 703ms
base 50 1800 315ms 400ms
charCodes 1800 21ms 178ms
My question is if there is an even better method I haven't yet considered?
Update
MДΓΓБДLL suggested using compression on the data, so I combined this LZW implementation with the base-50 and charCode data. I also tested aroth's code (packing 4 integers into 3 bytes). I got these results:
size Chrome 18 Firefox 11
-------------------------------------------------
LZW base 50 1103 494ms 999ms
LZW charCodes 1103 194ms 882ms
bitpacking 1350 2395ms 331ms
If your range is 0-50, then you can pack 4 numbers into 3 bytes (6 bits per number). This would allow you to store 1800 numbers using ~1350 bytes. This code should do it:
window._firstChar = 48;

window.decodeArray = function(encodedText) {
    var result = [];
    var temp = [];
    for (var index = 0; index < encodedText.length; index += 3) {
        // skipping bounds checking because the encoded text is assumed to be valid
        var firstChar = encodedText.charAt(index).charCodeAt() - _firstChar;
        var secondChar = encodedText.charAt(index + 1).charCodeAt() - _firstChar;
        var thirdChar = encodedText.charAt(index + 2).charCodeAt() - _firstChar;
        temp.push((firstChar >> 2) & 0x3F); // 6 bits, 'a'
        temp.push(((firstChar & 0x03) << 4) | ((secondChar >> 4) & 0xF)); // 2 bits + 4 bits, 'b'
        temp.push(((secondChar & 0x0F) << 2) | ((thirdChar >> 6) & 0x3)); // 4 bits + 2 bits, 'c'
        temp.push(thirdChar & 0x3F); // 6 bits, 'd'
    }
    // filter out 'padding' numbers, if present; this is an extremely inefficient way to do it
    for (var index = 0; index < temp.length; index++) {
        if (temp[index] != 63) {
            result.push(temp[index]);
        }
    }
    return result;
};

window.encodeArray = function(array) {
    var encodedData = [];
    for (var index = 0; index < array.length; index += 4) {
        var num1 = array[index];
        var num2 = index + 1 < array.length ? array[index + 1] : 63;
        var num3 = index + 2 < array.length ? array[index + 2] : 63;
        var num4 = index + 3 < array.length ? array[index + 3] : 63;
        encodeSet(num1, num2, num3, num4, encodedData);
    }
    return encodedData.join('');
};

window.encodeSet = function(a, b, c, d, outArray) {
    // we can encode 4 numbers in 3 bytes
    var firstChar = ((a & 0x3F) << 2) | ((b >> 4) & 0x03); // 6 bits for 'a', 2 from 'b'
    var secondChar = ((b & 0x0F) << 4) | ((c >> 2) & 0x0F); // remaining 4 bits from 'b', 4 from 'c'
    var thirdChar = ((c & 0x03) << 6) | (d & 0x3F); // remaining 2 bits from 'c', 6 bits for 'd'
    // add _firstChar so that all values map to a printable character
    outArray.push(String.fromCharCode(firstChar + _firstChar));
    outArray.push(String.fromCharCode(secondChar + _firstChar));
    outArray.push(String.fromCharCode(thirdChar + _firstChar));
};
Here's a quick example: http://jsfiddle.net/NWyBx/1
Note that storage size can likely be further reduced by applying gzip compression to the resulting string.
Alternately, if the ordering of your numbers is not significant, then you can simply do a bucket-sort using 51 buckets (assuming 0-50 includes both 0 and 50 as valid numbers) and store the counts for each bucket instead of the numbers themselves. That would likely give you better compression and efficiency than any other approach.
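A sketch of that bucket-count idea, assuming values 0-50 and that order can be discarded (function names are illustrative):

```javascript
// Store counts per value (51 buckets for 0-50) instead of the values.
function toCounts(values) {
  var counts = new Array(51).fill(0);
  for (var i = 0; i < values.length; i++) {
    counts[values[i]]++;
  }
  return counts;
}

function fromCounts(counts) {
  var values = [];
  for (var v = 0; v < counts.length; v++) {
    for (var n = 0; n < counts[v]; n++) {
      values.push(v);
    }
  }
  return values; // sorted; the original order is lost
}

var counts = toCounts([5, 27, 7, 38, 5]);
console.log(counts[5]); // 2
console.log(fromCounts(counts)); // [5, 5, 7, 27, 38]
```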
Assuming (as in your test) that compression takes more time than the size reduction saves you, your char encoding is the smallest you'll get without bitshifting. You're currently using one byte for each number, but if they're guaranteed to be small enough you could put two numbers in each byte. That would probably be an over-optimization, unless this is a very hot piece of your code.
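If the values were guaranteed to fit in 4 bits (0-15), which these 0-50 values are not, the two-numbers-per-byte packing mentioned above could look like this sketch:

```javascript
// Pack two 4-bit values into one byte: a in the high nibble, b in the low.
function packPair(a, b) {
  return ((a & 0xF) << 4) | (b & 0xF);
}

function unpackPair(byte) {
  return [(byte >> 4) & 0xF, byte & 0xF];
}

var packed = packPair(5, 11); // 0x5B = 91
console.log(unpackPair(packed)); // [5, 11]
```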
You might want to consider using Uint8Array or ArrayBuffer. This blogpost shows how it's done. Copying his logic, here's an example, assuming you have an existing Uint8Array named arr.
function arrayBufferToBinaryString(buffer, cb) {
    // BlobBuilder from the original post has since been removed from
    // browsers; the Blob constructor is the replacement.
    var blob = new Blob([buffer]);
    var reader = new FileReader();
    reader.onload = function (e) {
        cb(reader.result);
    };
    reader.readAsBinaryString(blob);
}
arrayBufferToBinaryString(arr.buffer, function(s) {
// do something with s
});