Concatenating hex bytes and strings in JavaScript while preserving bytes - javascript

I would like to concatenate a hex value and a string using JavaScript.
var hexValue = 0x89;
var png = "PNG";
The string "PNG" is equivalent to the concatenation of 0x50, 0x4E, and 0x47.
Concatenating hexValue and png via
var concatHex = String.fromCharCode(0x89) + String.fromCharCode(0x50)
+ String.fromCharCode(0x4E) + String.fromCharCode(0x47);
...give a result with a byte count of 5 because of the first hex value needing a control character:
C2 89 50 4E 47
I am working with raw image data where I have hexValue and png and need to concatenate them without this control character being included.
Is there a way to trim off the control character?
Given I have an array of bytes, is there a better way to concatenate them and a string while preserving the bytes?

Well i was investigating and i found that in javascript to achieve this eficienly JavaScript typed arrays is used.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays
http://msdn.microsoft.com/en-us/library/br212485(v=vs.94).aspx
Here i wrote a code (not tested) to perform what you want:
var png = "PNG";
var hexValue = 0x89;
var lenInBytes = (png.Length + 1) * 8; //left an extra space to concat hexValue
var buffer = new ArrayBuffer(lenInBytes); //create chunk of memory whose bytes are all pre-initialized to 0
var int8View = new Int8Array(buffer); //treat this memory like int8
for(int i = 0; i < png.Length ; i++)
int8View[i] = png[i] //here convert the png[i] to bytes
//at this point we have the string png as array of bytes
int8View[png.Length] = hexValue //here the concatenation is performed
Well hope it helps.

Related

How to use javascript (in Angular) to get bytes encoded by java.util.Base64? [duplicate]

I need to convert a base64 encode string into an ArrayBuffer.
The base64 strings are user input, they will be copy and pasted from an email, so they're not there when the page is loaded.
I would like to do this in javascript without making an ajax call to the server if possible.
I found those links interesting, but they didt'n help me:
ArrayBuffer to base64 encoded string
this is about the opposite conversion, from ArrayBuffer to base64, not the other way round
http://jsperf.com/json-vs-base64/2
this looks good but i can't figure out how to use the code.
Is there an easy (maybe native) way to do the conversion? thanks
Try this:
function _base64ToArrayBuffer(base64) {
var binary_string = window.atob(base64);
var len = binary_string.length;
var bytes = new Uint8Array(len);
for (var i = 0; i < len; i++) {
bytes[i] = binary_string.charCodeAt(i);
}
return bytes.buffer;
}
Using TypedArray.from:
Uint8Array.from(atob(base64_string), c => c.charCodeAt(0))
Performance to be compared with the for loop version of Goran.it answer.
For Node.js users:
const myBuffer = Buffer.from(someBase64String, 'base64');
myBuffer will be of type Buffer which is a subclass of Uint8Array. Unfortunately, Uint8Array is NOT an ArrayBuffer as the OP was asking for. But when manipulating an ArrayBuffer I almost always wrap it with Uint8Array or something similar, so it should be close to what's being asked for.
Goran.it's answer does not work because of unicode problem in javascript - https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding.
I ended up using the function given on Daniel Guerrero's blog: http://blog.danguer.com/2011/10/24/base64-binary-decoding-in-javascript/
Function is listed on github link: https://github.com/danguer/blog-examples/blob/master/js/base64-binary.js
Use these lines
var uintArray = Base64Binary.decode(base64_string);
var byteArray = Base64Binary.decodeArrayBuffer(base64_string);
Async solution, it's better when the data is big:
// base64 to buffer
function base64ToBufferAsync(base64) {
var dataUrl = "data:application/octet-binary;base64," + base64;
fetch(dataUrl)
.then(res => res.arrayBuffer())
.then(buffer => {
console.log("base64 to buffer: " + new Uint8Array(buffer));
})
}
// buffer to base64
function bufferToBase64Async( buffer ) {
var blob = new Blob([buffer], {type:'application/octet-binary'});
console.log("buffer to blob:" + blob)
var fileReader = new FileReader();
fileReader.onload = function() {
var dataUrl = fileReader.result;
console.log("blob to dataUrl: " + dataUrl);
var base64 = dataUrl.substr(dataUrl.indexOf(',')+1)
console.log("dataUrl to base64: " + base64);
};
fileReader.readAsDataURL(blob);
}
Javascript is a fine development environment so it seems odd than it doesn't provide a solution to this small problem. The solutions offered elsewhere on this page are potentially slow. Here is my solution. It employs the inbuilt functionality that decodes base64 image and sound data urls.
var req = new XMLHttpRequest;
req.open('GET', "data:application/octet;base64," + base64Data);
req.responseType = 'arraybuffer';
req.onload = function fileLoaded(e)
{
var byteArray = new Uint8Array(e.target.response);
// var shortArray = new Int16Array(e.target.response);
// var unsignedShortArray = new Int16Array(e.target.response);
// etc.
}
req.send();
The send request fails if the base 64 string is badly formed.
The mime type (application/octet) is probably unnecessary.
Tested in chrome. Should work in other browsers.
Pure JS - no string middlestep (no atob)
I write following function which convert base64 in direct way (without conversion to string at the middlestep). IDEA
get 4 base64 characters chunk
find index of each character in base64 alphabet
convert index to 6-bit number (binary string)
join four 6 bit numbers which gives 24-bit numer (stored as binary string)
split 24-bit string to three 8-bit and covert each to number and store them in output array
corner case: if input base64 string ends with one/two = char, remove one/two numbers from output array
Below solution allows to process large input base64 strings. Similar function for convert bytes to base64 without btoa is HERE
function base64ToBytesArr(str) {
const abc = [..."ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"]; // base64 alphabet
let result = [];
for(let i=0; i<str.length/4; i++) {
let chunk = [...str.slice(4*i,4*i+4)]
let bin = chunk.map(x=> abc.indexOf(x).toString(2).padStart(6,0)).join('');
let bytes = bin.match(/.{1,8}/g).map(x=> +('0b'+x));
result.push(...bytes.slice(0,3 - (str[4*i+2]=="=") - (str[4*i+3]=="=")));
}
return result;
}
// --------
// TEST
// --------
let test = "Alice's Adventure in Wonderland.";
console.log('test string:', test.length, test);
let b64_btoa = btoa(test);
console.log('encoded string:', b64_btoa);
let decodedBytes = base64ToBytesArr(b64_btoa); // decode base64 to array of bytes
console.log('decoded bytes:', JSON.stringify(decodedBytes));
let decodedTest = decodedBytes.map(b => String.fromCharCode(b) ).join``;
console.log('Uint8Array', JSON.stringify(new Uint8Array(decodedBytes)));
console.log('decoded string:', decodedTest.length, decodedTest);
Caution!
If you want to decode base64 to STRING (not bytes array) and you know that result contains utf8 characters then atob will fail in general e.g. for character 💩 the atob("8J+SqQ==") will give wrong result . In this case you can use above solution and convert result bytes array to string in proper way e.g. :
function base64ToBytesArr(str) {
const abc = [..."ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"]; // base64 alphabet
let result = [];
for(let i=0; i<str.length/4; i++) {
let chunk = [...str.slice(4*i,4*i+4)]
let bin = chunk.map(x=> abc.indexOf(x).toString(2).padStart(6,0)).join('');
let bytes = bin.match(/.{1,8}/g).map(x=> +('0b'+x));
result.push(...bytes.slice(0,3 - (str[4*i+2]=="=") - (str[4*i+3]=="=")));
}
return result;
}
// --------
// TEST
// --------
let testB64 = "8J+SqQ=="; // for string: "💩";
console.log('input base64 :', testB64);
let decodedBytes = base64ToBytesArr(testB64); // decode base64 to array of bytes
console.log('decoded bytes :', JSON.stringify(decodedBytes));
let result = new TextDecoder("utf-8").decode(new Uint8Array(decodedBytes));
console.log('properly decoded string :', result);
let result_atob = atob(testB64);
console.log('decoded by atob :', result_atob);
Snippets tested 2022-08-04 on: chrome 103.0.5060.134 (arm64), safari 15.2, firefox 103.0.1 (64 bit), edge 103.0.1264.77 (arm64), and node-js v12.16.1
I would strongly suggest using an npm package implementing correctly the base64 specification.
The best one I know is rfc4648
The problem is that btoa and atob use binary strings instead of Uint8Array and trying to convert to and from it is cumbersome. Also there is a lot of bad packages in npm for that. I lose a lot of time before finding that one.
The creators of that specific package did a simple thing: they took the specification of Base64 (which is here by the way) and implemented it correctly from the beginning to the end. (Including other formats in the specification that are also useful like Base64-url, Base32, etc ...) That doesn't seem a lot but apparently that was too much to ask to the bunch of other libraries.
So yeah, I know I'm doing a bit of proselytism but if you want to avoid losing your time too just use rfc4648.
I used the accepted answer to this question to create base64Url string <-> arrayBuffer conversions in the realm of base64Url data transmitted via ASCII-cookie [atob, btoa are base64[with +/]<->js binary string], so I decided to post the code.
Many of us may want both conversions and client-server communication may use the base64Url version (though a cookie may contain +/ as well as -_ characters if I understand well, only ",;\ characters and some wicked characters from the 128 ASCII are disallowed). But a url cannot contain / character, hence the wider use of b64 url version which of course not what atob-btoa supports...
Seeing other comments, I would like to stress that my use case here is base64Url data transmission via url/cookie and trying to use this crypto data with the js crypto api (2017) hence the need for ArrayBuffer representation and b64u <-> arrBuff conversions... if array buffers represent other than base64 (part of ascii) this conversion wont work since atob, btoa is limited to ascii(128). Check out an appropriate converter like below:
The buff -> b64u version is from a tweet from Mathias Bynens, thanks for that one (too)! He also wrote a base64 encoder/decoder:
https://github.com/mathiasbynens/base64
Coming from java, it may help when trying to understand the code that java byte[] is practically js Int8Array (signed int) but we use here the unsigned version Uint8Array since js conversions work with them. They are both 256bit, so we call it byte[] in js now...
The code is from a module class, that is why static.
//utility
/**
* Array buffer to base64Url string
* - arrBuff->byte[]->biStr->b64->b64u
* #param arrayBuffer
* #returns {string}
* #private
*/
static _arrayBufferToBase64Url(arrayBuffer) {
console.log('base64Url from array buffer:', arrayBuffer);
let base64Url = window.btoa(String.fromCodePoint(...new Uint8Array(arrayBuffer)));
base64Url = base64Url.replaceAll('+', '-');
base64Url = base64Url.replaceAll('/', '_');
console.log('base64Url:', base64Url);
return base64Url;
}
/**
* Base64Url string to array buffer
* - b64u->b64->biStr->byte[]->arrBuff
* #param base64Url
* #returns {ArrayBufferLike}
* #private
*/
static _base64UrlToArrayBuffer(base64Url) {
console.log('array buffer from base64Url:', base64Url);
let base64 = base64Url.replaceAll('-', '+');
base64 = base64.replaceAll('_', '/');
const binaryString = window.atob(base64);
const length = binaryString.length;
const bytes = new Uint8Array(length);
for (let i = 0; i < length; i++) {
bytes[i] = binaryString.charCodeAt(i);
}
console.log('array buffer:', bytes.buffer);
return bytes.buffer;
}
made a ArrayBuffer from a base64:
function base64ToArrayBuffer(base64) {
var binary_string = window.atob(base64);
var len = binary_string.length;
var bytes = new Uint8Array(len);
for (var i = 0; i < len; i++) {
bytes[i] = binary_string.charCodeAt(i);
}
return bytes.buffer;
}
I was trying to use above code and It's working fine.
The result of atob is a string that is separated with some comma
,
A simpler way is to convert this string to a json array string and after that parse it to a byteArray
below code can simply be used to convert base64 to an array of number
let byteArray = JSON.parse('['+atob(base64)+']');
let buffer = new Uint8Array(byteArray);
Solution without atob
I've seen many people complaining about using atob and btoa in the replies. There are some issues to take into account when using them.
There's a solution without using them in the MDN page about Base64. Below you can find the code to convert a base64 string into a Uint8Array copied from the docs.
Note that the function below returns a Uint8Array. To get the ArrayBuffer version you just need to do uintArray.buffer.
function b64ToUint6(nChr) {
return nChr > 64 && nChr < 91
? nChr - 65
: nChr > 96 && nChr < 123
? nChr - 71
: nChr > 47 && nChr < 58
? nChr + 4
: nChr === 43
? 62
: nChr === 47
? 63
: 0;
}
function base64DecToArr(sBase64, nBlocksSize) {
const sB64Enc = sBase64.replace(/[^A-Za-z0-9+/]/g, "");
const nInLen = sB64Enc.length;
const nOutLen = nBlocksSize
? Math.ceil(((nInLen * 3 + 1) >> 2) / nBlocksSize) * nBlocksSize
: (nInLen * 3 + 1) >> 2;
const taBytes = new Uint8Array(nOutLen);
let nMod3;
let nMod4;
let nUint24 = 0;
let nOutIdx = 0;
for (let nInIdx = 0; nInIdx < nInLen; nInIdx++) {
nMod4 = nInIdx & 3;
nUint24 |= b64ToUint6(sB64Enc.charCodeAt(nInIdx)) << (6 * (3 - nMod4));
if (nMod4 === 3 || nInLen - nInIdx === 1) {
nMod3 = 0;
while (nMod3 < 3 && nOutIdx < nOutLen) {
taBytes[nOutIdx] = (nUint24 >>> ((16 >>> nMod3) & 24)) & 255;
nMod3++;
nOutIdx++;
}
nUint24 = 0;
}
}
return taBytes;
}
If you're interested in the reverse operation, ArrayBuffer to base64, you can find how to do it in the same link.

How can I convert an image file to base 2 binary?

I'm trying to do it like this:
function handleFiles(files) {
var selectedFile = files[0];
var reader = new FileReader();
reader.readAsBinaryString(selectedFile);
reader.onloadend = function() {
var result = reader.result;
convert(result);
};
}
function convert(data) {
var img = document.createElement("img");
var result = data.toString(2);
document.body.appendChild(img);
img.src = 'data:image/jpeg;base64,' + btoa(data); // this works
console.log("Base ?: " + data); // not sure, I think 16 or more likely 64,
// the MDN documentation doesn't specify what base of binary is produced by the readAsBinaryString method
console.log("Base 2: " + result); // not base 2 binary data as expected
}
<input type="file" id="input" onchange="handleFiles(this.files)">
This code will convert a jpeg into binary data, then render it, but it is not first converting to base 2 binary. If you run that code and look at the logs, it's something more (I'm naive on the subject of bases and binary, but my guess is base 16 or 64). I think toString(2) is supposed to convert to base 2, but it doesn't seem to be doing that. Before converting it back to base 64, I want to get the binary data to base 2 for experimentation.
Can't you just step through your binary string and convert each character to a binary representation?
function convert(data) {
var result = ''
for (let i = 0; i < data.length; i++){
result += data.charCodeAt(i).toString(2)
}
console.log(result)
}
[ I would test this with a small file first. It writes a lot to the console, as you might imagine ]
EDIT -- sorry, didn't realize you wanted to go back and forth. If you want to get from a binary string back to something else, you'll probably want to pad the binary with zeros so you can pull out 8 bit (or whatever) at a time:
function convert(data) {
var result = ''
for (let i = 0; i < data.length; i++){
// pad to make all binary string parts 8 bits long
result += data.charCodeAt(i).toString(2).padStart(8, 0)
}
return result
}
function makeBuffer(string){
buffer = ''
for(let i = 0; i < string.length/8; i++){
let stIdx = i*8
// maybe a way to do this without the slice? dunno.
buffer += String.fromCharCode(parseInt(string.slice(stIdx,stIdx+8), 2))
}
return buffer
}
Binary is always base 2, that's what the "bi" in "binary" means.
When you call fileReader.readAsBinaryString() it's returning binary; Strings can store binary because they're really just an array of characters.
Displaying the string will not be a series of zeros and ones, it will be a jumble of different ASCII characters (letters, numbers, symbols, etc.) because those are the equivalent characters that match the binary values.
Additionally, you don't need the following line since your data you're passing in is already a string.
var result = data.toString(2);

Convert little endian binary to int in javascript on client (wscript)

I have an array of binary data that was converted using node js on server:
buffer.toString('hex');
And there are 2 integer added in the beginning of the buffer like so:
buffer.WriteInt32LE(index,0);
buffer.WriteInt32LE(id,0);
in the end i got a string like this:
var str = "1100000050000000fd2595200380024"
I need the parse this string on client, using windows scripting environment ( javascript + ActiveX)
How can i convert the first two values '11000000' and '50000000' from little endian string to integer, and the rest of the string to binary bytes represented by hex codes? In browser ArrayBuffer is avaliable, but the js executed from windows scripting environment.
var str = "1100000050000000fd25952003800245";
var int1 = parseInt(str.substring(0,7));
var int2 = parseInt(str.substring(8,15));
alert(int1);
alert(int2);
will get you the integers..
the rest of the str has 7 bytes... and a nibble... fd 25 95 20 03 80 02 4
assuming you are missing a nibble to make it 8 bytes...
var hex = [str.substring(16,str.length).length/2];
this will get you the byte array with 8 positions
var hex = [];
hex.push(str2.substring(0,2));
hex.push(str2.substring(2,4));
hex.push(str2.substring(4,6));
hex.push(str2.substring(6,8));
hex.push(str2.substring(8,10));
hex.push(str2.substring(10,12));
hex.push(str2.substring(12,14));
hex.push(str2.substring(14,16));
alert(hex);
just filled the array with the bytes in string, but you just need to convert them now :)
fullcode:
var str = "1100000050000000fd25952003800245";
var int1 = parseInt(str.substring(0,7));
var int2 = parseInt(str.substring(8,15));
var str2 = str.substring(16,str.length);
alert(int1);
alert(int2);
var hex = [];
hex.push(str2.substring(0,2));
hex.push(str2.substring(2,4));
hex.push(str2.substring(4,6));
hex.push(str2.substring(6,8));
hex.push(str2.substring(8,10));
hex.push(str2.substring(10,12));
hex.push(str2.substring(12,14));
hex.push(str2.substring(14,16));
alert(hex);

JavaScript: reading 3 bytes Buffer as an integer

Let's say I have a hex data stream, which I want to divide into 3-bytes blocks which I need to read as an integer.
For example: given a hex string 01be638119704d4b9a I need to read the first three bytes 01be63 and read it as integer 114275. This is what I got:
var sample = '01be638119704d4b9a';
var buffer = new Buffer(sample, 'hex');
var bufferChunk = buffer.slice(0, 3);
var decimal = bufferChunk.readUInt32BE(0);
The readUInt32BE works perfectly for 4-bytes data, but here I obviously get:
RangeError: index out of range
at checkOffset (buffer.js:494:11)
at Buffer.readUInt32BE (buffer.js:568:5)
How do I read 3-bytes as integer correctly?
If you are using node.js v0.12+ or io.js, there is buffer.readUIntBE() which allows a variable number of bytes:
var decimal = buffer.readUIntBE(0, 3);
(Note that it's readUIntBE for Big Endian and readUIntLE for Little Endian).
Otherwise if you're on an older version of node, you will have to do it manually (check bounds first of course):
var decimal = (buffer[0] << 16) + (buffer[1] << 8) + buffer[2];
I'm using this, if someone knows something wrong with it, please advise;
const integer = parseInt(buffer.toString("hex"), 16)
you should convert three byte to four byte.
function three(var sample){
var buffer = new Buffer(sample, 'hex');
var buf = new Buffer(1);
buf[0] = 0x0;
return Buffer.concat([buf, buffer.slice(0, 3)]).readUInt32BE();
}
You can try this function.

Javascript Blob object saved as file contains extra bytes

I have a program which reads an array of bytes. Those bytes are supposed to be ISO-8859-2 decimal codes of characters. My test array has two elements: 103 which is letter g and 179 which is letter Å‚ (l with tail). I then create a Blob object from it and check its content using two methods:
FileReader
objectURL
The first method gives correct results but the second method gives an extra character in the saved blob file.
Here is the code:
var bytes = [103, 179];
var chr1 = String.fromCharCode(bytes[0]);
var chr2 = String.fromCharCode(bytes[1]);
var str = '';
str += chr1;
str += chr2;
console.log(str.charCodeAt(0)); //103
console.log(str.charCodeAt(1)); //179
console.log(str.charCodeAt(2)); //NaN
var blob = new Blob([str]);
console.log(blob.size); //3
//Checking Blob contents using first method - FileReader
var reader = new FileReader();
reader.addEventListener("loadend", function() {
var str1 = this.result;
console.log(str1); //g³
console.log(str1.charCodeAt(0)); //103
console.log(str1.charCodeAt(1)); //179
console.log(str1.charCodeAt(2)); //NaN
});
reader.readAsText(blob);
//Checking Blob contents using second method - objectURL
var url = URL.createObjectURL(blob);
$('<a>',{
text: 'Download the blob',
title: 'Download',
href: url
}).appendTo('#my');
In order to use the second method I created a fiddle. In the fiddle, when you click the "Download" link and save and then open the file in a binary editor, it consists of the following bytes: 103, 194, 179.
My question is, where does the 194 come from and how to create a blob file (using the createobjectURL method) containing only bytes given in the original array ([103, 179] in this case).
The extra 194 comes from an encoding issue :
179 is the unicode code point of "SUPERCRIPT THREE" so the string str will contains "g³". After creating the blob, you will get this string encoded in utf8 : 0x67 for g, 0xC2 0xB3 for ³ (194, 179 in decimal) and it takes 3 bytes. Of course, if you use a FileReader, you will get back 2 characters, "g³".
To avoid that situation (and if you don't want to put everything in utf8), you can use a typed array to construct the blob :
var u8 = new Uint8Array(bytes);
var blob = new Blob([u8]);
That way, you will keep exactly the bytes you want.

Categories