Can’t access unicode keys. Possibly, Javascript JSON parser bug? - javascript

I have a json string, that has Unicode key:
{"Текст":"Text"}
I can successfully JSON.parse it and then JSON.stringify – it will be the same.
I can’t access Текст key; it is undefined:
JSON.parse('{"Текст":"Text"}')["Текст"] // undefined
I can’t get keys by Object.keys(), it returns 0.
I tried to escape Unicode in string before parsing it, and even escape Unicode in brackets when trying to access a key, but it doesn’t help.
Page has
<meta charset="utf-8">
Json string is returned by php’s echo. Php code has at the beginning
header('Content-Type: text/html; charset=utf-8');
String by itself displays correctly.
Update
I changed content type to plain text to all script (can’t use pc and devtools, unfortunately), to see how the json string is echo. It is escaped:
...,"\u041d\u0430\u0447\u0430\u043b\u043e":"some value",...
Update 2
I changed escaped unicode to just unicode, but still facing the issues
Update 3
I think it is indeed a bug. As I wrote above, json object parsed from unicode string is absolutely valid. I can’t access it’s keys, but if I stringify it, it will return a valid string.
I thought: “what if to parse and then set key, that should be there?”. I did it – nothing is changed. And more – adding any key is not affecting a json object. When I stringify it, it returns an original string, without any changes.

Since native JSON stuff doesn’t work, I’ve made my own solution to set/get json keys:
function JSONget(j, k)
{
var i = j.search('"'+k+'"');
if(i == -1)
return j;
i+=k.length;
while(j[i] != ':' && i < j.length)
i++;
i++;
if(j[i]=='"')
i++;
var buffer = [];
for(var a=0; i < j.length; i++, a++)
{
if((j[i]=='"'&&j[i-1]!='\\')||(j[i]==','&&j[i+1]=='"'))
break;
buffer[a] = j[i];
}
return buffer.join('');
}
function JSONset(j, k, v)
{
var i = j.search('"'+k+'"');
if(i == -1)
return j;
i+=k.length;
while(j[i] != ':' && i < j.length)
i++;
i++;
if(j[i]=='"')
i++;
var buffer = [];
for(var a=0; i < j.length; i++, a++)
{
if((j[i]=='"'&&j[i-1]!='\\')||(j[i]==','&&j[i+1]=='"'))
break;
buffer[a] = j[i];
}
var t = j.split('"'+k+'"');
t[1] = t[1].replace(buffer.join(''), v);
return t[0]+('"'+k+'"')+t[1];
}
Use
JSONget('{"Текст":"Text"}', "Текст") // Text
JSONset('{"Текст":"Text"}', "Текст", "Latin vs Cyrillic") // "Текст":"Latin vs Cyrillic"

Related

variable string appearing as undefined, new string not appearing in array, javascript

I'm attempting to write some code that puts a single string of emails into an array of emails. Splitting the string wherever there's a comma(,). The initial problem i'm having is the string that is being passed as a variable is not being recognized. I'm getting the error message "Cannot read property 'length' of undefined" of the conditional part of the for loop. Odd, as I'm definitely passing a string or trying to ?
When I pass in a string directly to the function parameter(to avoid the above problem for testing the rest of the function) only the first 2 email addresses appear the final email address is lost ?
I'm learning programming and this is an exercise as such I'm trying to avoid using the split() method or regEx. Daft i know.
Any help in overcoming these 2 issues greatly appreciated.
function separateCommaValues(text)
{
var input = [];
var val = '';
for(var i = 0; i < text.length; i++) {
if(text[i] == ',') {
if(val.length == 0){
continue;
}
input.push(val);
val = '';
} else {
val += text[i];
}
}
document.write( input );
}
separateCommaValues(str);
var str = "john#google.com, jake#yahoo.com, andrew#hotmail.com";
var str = "john#google.com, jake#yahoo.com, andrew#hotmail.com";
separateCommaValues(str);
This is the correct order. Your variable can be declared before it is used via hoisting, but you can't define it before it is used (undefined error).
And the last email address isn't pushed into the array because it doesn't have a comma after it. So after the loop, before document.write( input );, add something like this:
if(val.length > 0){
input.push(val);
val = '';
}

Change JavaScript string encoding

At the moment I have a large JavaScript string I'm attempting to write to a file, but in a different encoding (ISO-8859-1). I was hoping to use something like downloadify. Downloadify only accepts normal JavaScript strings or base64 encoded strings.
Because of this, I've decided to compress my string using JSZip which generates a nicely base64 encoded string that can be passed to downloadify, and downloaded to my desktop. Huzzah! The issue is that the string I compressed, of course, is still the wrong encoding.
Luckily JSZip can take a Uint8Array as data, instead of a string. So is there any way to convert a JavaScript string into a ISO-8859-1 encoded string and store it in a Uint8Array?
Alternatively, if I'm approaching this all wrong, is there a better solution all together? Is there a fancy JavaScript string class that can use different internal encodings?
Edit: To clarify, I'm not pushing this string to a webpage so it won't automatically convert it for me. I'm doing something like this:
var zip = new JSZip();
zip.file("genSave.txt", result);
return zip.generate({compression:"DEFLATE"});
And for this to make sense, I would need result to be in the proper encoding (and JSZip only takes strings, arraybuffers, or uint8arrays).
Final Edit (This was -not- a duplicate question because the result wasn't being displayed in the browser or transmitted to a server where the encoding could be changed):
This turned out to be a little more obscure than I had thought, so I ended up rolling my own solution. It's not nearly as robust as a proper solution would be, but it'll convert a JavaScript string into windows-1252 encoding, and stick it in a Uint8Array:
var enc = new string_transcoder("windows-1252");
var tenc = enc.transcode(result); //This is now a Uint8Array
You can then either use it in the array like I did:
//Make this into a zip
var zip = new JSZip();
zip.file("genSave.txt", tenc);
return zip.generate({compression:"DEFLATE"});
Or convert it into a windows-1252 encoded string using this string encoding library:
var string = TextDecoder("windows-1252").decode(tenc);
To use this function, either use:
<script src="//www.eu4editor.com/string_transcoder.js"></script>
Or include this:
function string_transcoder (target) {
this.encodeList = encodings[target];
if (this.encodeList === undefined) {
return undefined;
}
//Initialize the easy encodings
if (target === "windows-1252") {
var i;
for (i = 0x0; i <= 0x7F; i++) {
this.encodeList[i] = i;
}
for (i = 0xA0; i <= 0xFF; i++) {
this.encodeList[i] = i;
}
}
}
string_transcoder.prototype.transcode = function (inString) {
var res = new Uint8Array(inString.length), i;
for (i = 0; i < inString.length; i++) {
var temp = inString.charCodeAt(i);
var tempEncode = (this.encodeList)[temp];
if (tempEncode === undefined) {
return undefined; //This encoding is messed up
} else {
res[i] = tempEncode;
}
}
return res;
};
encodings = {
"windows-1252": {0x20AC:0x80, 0x201A:0x82, 0x0192:0x83, 0x201E:0x84, 0x2026:0x85, 0x2020:0x86, 0x2021:0x87, 0x02C6:0x88, 0x2030:0x89, 0x0160:0x8A, 0x2039:0x8B, 0x0152:0x8C, 0x017D:0x8E, 0x2018:0x91, 0x2019:0x92, 0x201C:0x93, 0x201D:0x94, 0x2022:0x95, 0x2013:0x96, 0x2014:0x97, 0x02DC:0x98, 0x2122:0x99, 0x0161:0x9A, 0x203A:0x9B, 0x0153:0x9C, 0x017E:0x9E, 0x0178:0x9F}
};
This turned out to be a little more obscure than [the author] had thought, so [the author] ended up rolling [his] own solution. It's not nearly as robust as a proper solution would be, but it'll convert a JavaScript string into windows-1252 encoding, and stick it in a Uint8Array:
var enc = new string_transcoder("windows-1252");
var tenc = enc.transcode(result); //This is now a Uint8Array
You can then either use it in the array like [the author] did:
//Make this into a zip
var zip = new JSZip();
zip.file("genSave.txt", tenc);
return zip.generate({compression:"DEFLATE"});
Or convert it into a windows-1252 encoded string using this string encoding library:
var string = TextDecoder("windows-1252").decode(tenc);
To use this function, either use:
<script src="//www.eu4editor.com/string_transcoder.js"></script>
Or include this:
function string_transcoder (target) {
this.encodeList = encodings[target];
if (this.encodeList === undefined) {
return undefined;
}
//Initialize the easy encodings
if (target === "windows-1252") {
var i;
for (i = 0x0; i <= 0x7F; i++) {
this.encodeList[i] = i;
}
for (i = 0xA0; i <= 0xFF; i++) {
this.encodeList[i] = i;
}
}
}
string_transcoder.prototype.transcode = function (inString) {
var res = new Uint8Array(inString.length), i;
for (i = 0; i < inString.length; i++) {
var temp = inString.charCodeAt(i);
var tempEncode = (this.encodeList)[temp];
if (tempEncode === undefined) {
return undefined; //This encoding is messed up
} else {
res[i] = tempEncode;
}
}
return res;
};
encodings = {
"windows-1252": {0x20AC:0x80, 0x201A:0x82, 0x0192:0x83, 0x201E:0x84, 0x2026:0x85, 0x2020:0x86, 0x2021:0x87, 0x02C6:0x88, 0x2030:0x89, 0x0160:0x8A, 0x2039:0x8B, 0x0152:0x8C, 0x017D:0x8E, 0x2018:0x91, 0x2019:0x92, 0x201C:0x93, 0x201D:0x94, 0x2022:0x95, 0x2013:0x96, 0x2014:0x97, 0x02DC:0x98, 0x2122:0x99, 0x0161:0x9A, 0x203A:0x9B, 0x0153:0x9C, 0x017E:0x9E, 0x0178:0x9F}
};
Test the following script:
<script type="text/javascript" charset="utf-8">
The best solution for me was posted here and this is my one-liner:
<!-- Required for non-UTF encodings (quite big) -->
<script src="encoding-indexes.js"></script>
<script src="encoding.js"></script>
...
// windows-1252 is just one typical example encoding/transcoding
let transcodedString = new TextDecoder( 'windows-1252' ).decode(
new TextEncoder().encode( someUtf8String ))
or this if the transcoding has to be applied on multiple inputs reusing the encoder and decoder:
let srcArr = [ ... ] // some UTF-8 string array
let encoder = new TextEncoder()
let decoder = new TextDecoder( 'windows-1252' )
let transcodedArr = srcArr.forEach( (s,i) => {
srcArr[i] = decoder.decode( encoder.encode( s )) })
(The slightly modified other answer from related question:)
This is what I found after a more specific Google search than just
UTF-8 encode/decode. so for those who are looking for a converting
library to convert between encodings, here you go.
github.com/inexorabletash/text-encoding
var uint8array = new TextEncoder().encode(str);
var str = new TextDecoder(encoding).decode(uint8array);
Paste from repo readme
All encodings from the Encoding specification are supported:
utf-8 ibm866 iso-8859-2 iso-8859-3 iso-8859-4 iso-8859-5 iso-8859-6
iso-8859-7 iso-8859-8 iso-8859-8-i iso-8859-10 iso-8859-13 iso-8859-14
iso-8859-15 iso-8859-16 koi8-r koi8-u macintosh windows-874 windows-1250
windows-1251 windows-1252 windows-1253 windows-1254 windows-1255
windows-1256 windows-1257 windows-1258 x-mac-cyrillic gb18030 hz-gb-2312
big5 euc-jp iso-2022-jp shift_jis euc-kr replacement utf-16be utf-16le
x-user-defined
(Some encodings may be supported under other names, e.g. ascii,
iso-8859-1, etc. See Encoding for additional labels for each
encoding.)

jQuery / JavaScript Parsing strings the proper way

Recently, I've been attempting to emulate a small language in jQuery and JavaScript, yet I've come across what I believe is an issue. I think that I may be parsing everything completely wrong.
In the code:
#name Testing
#inputs
#outputs
#persist
#trigger
print("Test")
The current way I am separating and parsing the string is by splitting all of the code into lines, and then reading through this lines array using searches and splits. For example, I would find the name using something like:
if(typeof lines[line] === 'undefined')
{
}
else
{
if(lines[line].search('#name') == 0)
{
name = lines[line].split(' ')[1];
}
}
But I think that I may be largely wrong on how I am handling parsing.
While reading through examples on how other people are handling parsing of code blocks like this, it appeared that people parsed the entire block, instead of splitting it into lines as I do. I suppose the question of the matter is, what is the proper and conventional way of parsing things like this, and how do you suggest I use it to parse something such as this?
In simple cases like this regular expressions is your tool of choice:
matches = code.match(/#name\s+(\w+)/)
name = matches[1]
To parse "real" programming languages regexps are not powerful enough, you'll need a parser, either hand-written or automatically generated with a tool like PEG.
A general approach to parsing, that I like to take often is the following:
loop through the complete block of text, character by character.
if you find a character that signalizes the start of one unit, call a specialized subfunction to parse the next characters.
within each subfunction, call additional subfunctions if you find certain characters
return from every subfunction when a character is found, that signalizes, that the unit has ended.
Here is a small example:
var text = "#func(arg1,arg2)"
function parse(text) {
var i, max_i, ch, funcRes;
for (i = 0, max_i = text.length; i < max_i; i++) {
ch = text.charAt(i);
if (ch === "#") {
funcRes = parseFunction(text, i + 1);
i = funcRes.index;
}
}
console.log(funcRes);
}
function parseFunction(text, i) {
var max_i, ch, name, argsRes;
name = [];
for (max_i = text.length; i < max_i; i++) {
ch = text.charAt(i);
if (ch === "(") {
argsRes = parseArguments(text, i + 1);
return {
name: name.join(""),
args: argsRes.arr,
index: argsRes.index
};
}
name.push(ch);
}
}
function parseArguments(text, i) {
var max_i, ch, args, arg;
arg = [];
args = [];
for (max_i = text.length; i < max_i; i++) {
ch = text.charAt(i);
if (ch === ",") {
args.push(arg.join(""));
arg = [];
continue;
} else if (ch === ")") {
args.push(arg.join(""));
return {
arr: args,
index: i
};
}
arg.push(ch);
}
}
FIDDLE
this example just parses function expressions, that follow the syntax "#functionName(argumentName1, argumentName2, ...)". The general idea is to visit every character exactly once without the need to save current states like "hasSeenAtCharacter" or "hasSeenOpeningParentheses", which can get pretty messy when you parse large structures.
Please note that this is a very simplified example and it misses all the error handling and stuff like that, but I hope the general idea can be seen. Note also that I'm not saying that you should use this approach all the time. It's a very general approach, that can be used in many scenerios. But that doesn't mean that it can't be combined with regular expressions for instance, if it, at some part of your text, makes more sense than parsing each individual character.
And one last remark: you can save yourself the trouble if you put the specialized parsing function inside the main parsing function, so that all functions have access to the same variable i.

Letters to predefined numbers conversion in one window

)
I have searched high and low, but i can´t find what i need. Or i´m to stupid to get it right ;-)
I need a page with several input boxes where i can type some text, and then an output area below each input, that shows the text converted to some predefined numbers.
example:
input: abcde fghi æøå (i need all kinds of characters like .,/: etc.)
output: 064 065 066 067 068 032
So it needs to convert like this:
"a"="064 "
"b"="065 "
"space"="032 "
(and yes, each number in output needs to be separated, or a space added after each number)
I have tried some different cipher guides in both php and javascript, but can´t get it to work. I did do an Excel document that could do some of it, but it had a limited amount of characters it could convert, then it started behaving weird. So i thought maybe PHP was the answer!
Any help is very appreciated
/Rasmus
In the spirit of elclanrs deleted answer, and for posterity:
<script>
// Using standard for loop
function stringToCharcodes(s) {
var result = [];
function pad(n){ return (n<10? '00' : n<100? '0' : 0) + n;}
for (var i=0, iLen=s.length; i<iLen; i++) {
result.push(pad(s.charCodeAt(i)));
}
return result.join(' ');
}
// Using ES5 forEach
function stringToCharcodes2(s) {
var result = [];
function pad(n){ return (n<10? '00' : n<100? '0' : 0) + n;}
s.split('').forEach(function(a){result.push(pad(a.charCodeAt(0)))});
return result.join(' ');
}
</script>
<input onkeyup="document.getElementById('s0').innerHTML = stringToCharcodes(this.value);"><br>
<span id="s0"></span>
Edit
If you want a custom mapping, use an object (I've only included 2 characters, you can add as many as you want):
var mapChars = (function() {
var mapping = {'198':'019', '230':'018'};
return function (s) {
var c, result = [];
for (var i=0, iLen=s.length; i<iLen; i++) {
c = s.charCodeAt(i);
result.push(c in mapping? mapping[c] : c);
}
return result.join(' ');
}
}());
alert(mapChars('Ææ')); //
Using the character code for mapping seems to be a reasonable solution, using the actual character may be subject to different page character encoding.

Base64 encoding and decoding in client-side Javascript [duplicate]

This question already has answers here:
How can you encode a string to Base64 in JavaScript?
(33 answers)
Closed 1 year ago.
Are there any methods in JavaScript that could be used to encode and decode a string using base64 encoding?
Some browsers such as Firefox, Chrome, Safari, Opera and IE10+ can handle Base64 natively. Take a look at this Stackoverflow question. It's using btoa() and atob() functions.
For server-side JavaScript (Node), you can use Buffers to decode.
If you are going for a cross-browser solution, there are existing libraries like CryptoJS or code like:
http://ntt.cc/2008/01/19/base64-encoder-decoder-with-javascript.html (Archive)
With the latter, you need to thoroughly test the function for cross browser compatibility. And error has already been reported.
Internet Explorer 10+
// Define the string
var string = 'Hello World!';
// Encode the String
var encodedString = btoa(string);
console.log(encodedString); // Outputs: "SGVsbG8gV29ybGQh"
// Decode the String
var decodedString = atob(encodedString);
console.log(decodedString); // Outputs: "Hello World!"
Cross-Browser
Re-written and modularized UTF-8 and Base64 Javascript Encoding and Decoding Libraries / Modules for AMD, CommonJS, Nodejs and Browsers. Cross-browser compatible.
with Node.js
Here is how you encode normal text to base64 in Node.js:
//Buffer() requires a number, array or string as the first parameter, and an optional encoding type as the second parameter.
// Default is utf8, possible encoding types are ascii, utf8, ucs2, base64, binary, and hex
var b = Buffer.from('JavaScript');
// If we don't use toString(), JavaScript assumes we want to convert the object to utf8.
// We can make it convert to other formats by passing the encoding type to toString().
var s = b.toString('base64');
And here is how you decode base64 encoded strings:
var b = Buffer.from('SmF2YVNjcmlwdA==', 'base64')
var s = b.toString();
with Dojo.js
To encode an array of bytes using dojox.encoding.base64:
var str = dojox.encoding.base64.encode(myByteArray);
To decode a base64-encoded string:
var bytes = dojox.encoding.base64.decode(str)
bower install angular-base64
<script src="bower_components/angular-base64/angular-base64.js"></script>
angular
.module('myApp', ['base64'])
.controller('myController', [
'$base64', '$scope',
function($base64, $scope) {
$scope.encoded = $base64.encode('a string');
$scope.decoded = $base64.decode('YSBzdHJpbmc=');
}]);
But How?
If you would like to learn more about how base64 is encoded in general, and in JavaScript in-particular, I would recommend this article: Computer science in JavaScript: Base64 encoding
In Gecko/WebKit-based browsers (Firefox, Chrome and Safari) and Opera, you can use btoa() and atob().
Original answer: How can you encode a string to Base64 in JavaScript?
Here is a tightened up version of Sniper's post. It presumes well formed base64 string with no carriage returns. This version eliminates a couple of loops, adds the &0xff fix from Yaroslav, eliminates trailing nulls, plus a bit of code golf.
decodeBase64 = function(s) {
var e={},i,b=0,c,x,l=0,a,r='',w=String.fromCharCode,L=s.length;
var A="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
for(i=0;i<64;i++){e[A.charAt(i)]=i;}
for(x=0;x<L;x++){
c=e[s.charAt(x)];b=(b<<6)+c;l+=6;
while(l>=8){((a=(b>>>(l-=8))&0xff)||(x<(L-2)))&&(r+=w(a));}
}
return r;
};
Short and fast Base64 JavaScript Decode Function without Failsafe:
function decode_base64 (s)
{
var e = {}, i, k, v = [], r = '', w = String.fromCharCode;
var n = [[65, 91], [97, 123], [48, 58], [43, 44], [47, 48]];
for (z in n)
{
for (i = n[z][0]; i < n[z][1]; i++)
{
v.push(w(i));
}
}
for (i = 0; i < 64; i++)
{
e[v[i]] = i;
}
for (i = 0; i < s.length; i+=72)
{
var b = 0, c, x, l = 0, o = s.substring(i, i+72);
for (x = 0; x < o.length; x++)
{
c = e[o.charAt(x)];
b = (b << 6) + c;
l += 6;
while (l >= 8)
{
r += w((b >>> (l -= 8)) % 256);
}
}
}
return r;
}
function b64_to_utf8( str ) {
return decodeURIComponent(escape(window.atob( str )));
}
https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding#The_.22Unicode_Problem.22
Modern browsers have built-in javascript functions for Base64 encoding btoa() and decoding atob(). More info about support in older browser versions: https://caniuse.com/?search=atob
However, be aware that atob and btoa functions work only for ASCII charset.
If you need Base64 functions for UTF-8 charset, you can do it with:
function base64_encode(s) {
return btoa(unescape(encodeURIComponent(s)));
}
function base64_decode(s) {
return decodeURIComponent(escape(atob(s)));
}
Did someone say code golf? =)
The following is my attempt at improving my handicap while catching up with the times. Supplied for your convenience.
function decode_base64(s) {
var b=l=0, r='',
m='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
s.split('').forEach(function (v) {
b=(b<<6)+m.indexOf(v); l+=6;
if (l>=8) r+=String.fromCharCode((b>>>(l-=8))&0xff);
});
return r;
}
What I was actually after was an asynchronous implementation and to my surprise it turns out forEach as opposed to JQuery's $([]).each method implementation is very much synchronous.
If you also had such crazy notions in mind a 0 delay window.setTimeout will run the base64 decode asynchronously and execute the callback function with the result when done.
function decode_base64_async(s, cb) {
setTimeout(function () { cb(decode_base64(s)); }, 0);
}
#Toothbrush suggested "index a string like an array", and get rid of the split. This routine seems really odd and not sure how compatible it will be, but it does hit another birdie so lets have it.
function decode_base64(s) {
var b=l=0, r='',
m='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
[].forEach.call(s, function (v) {
b=(b<<6)+m.indexOf(v); l+=6;
if (l>=8) r+=String.fromCharCode((b>>>(l-=8))&0xff);
});
return r;
}
While trying to find more information on JavaScript string as array I stumbled on this pro tip using a /./g regex to step through a string. This reduces the code size even more by replacing the string in place and eliminating the need of keeping a return variable.
function decode_base64(s) {
var b=l=0,
m='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
return s.replace(/./g, function (v) {
b=(b<<6)+m.indexOf(v); l+=6;
return l<8?'':String.fromCharCode((b>>>(l-=8))&0xff);
});
}
If however you were looking for something a little more traditional perhaps the following is more to your taste.
function decode_base64(s) {
var b=l=0, r='', s=s.split(''), i,
m='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
for (i in s) {
b=(b<<6)+m.indexOf(s[i]); l+=6;
if (l>=8) r+=String.fromCharCode((b>>>(l-=8))&0xff);
}
return r;
}
I didn't have the trailing null issue so this was removed to remain under par but it should easily be resolved with a trim() or a trimRight() if you'd prefer, should this pose a problem for you.
ie.
return r.trimRight();
Note:
The result is an ascii byte string, if you need unicode the easiest is to escape the byte string which can then be decoded with decodeURIComponent to produce the unicode string.
function decode_base64_usc(s) {
return decodeURIComponent(escape(decode_base64(s)));
}
Since escape is being deprecated we could change our function to support unicode directly without the need for escape or String.fromCharCode we can produce a % escaped string ready for URI decoding.
function decode_base64(s) {
var b=l=0,
m='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
return decodeURIComponent(s.replace(/./g, function (v) {
b=(b<<6)+m.indexOf(v); l+=6;
return l<8?'':'%'+(0x100+((b>>>(l-=8))&0xff)).toString(16).slice(-2);
}));
}
Edit for #Charles Byrne:
Can't remember why we didn't ignore the '=' padding characters, might've worked with a specification that didn't require them at the time. If we were to modify the decodeURIComponent routine to ignore these, as we should since they do not represent any data, the result decodes the example correctly.
function decode_base64(s) {
var b=l=0,
m='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
return decodeURIComponent(s.replace(/=*$/,'').replace(/./g, function (v) {
b=(b<<6)+m.indexOf(v); l+=6;
return l<8?'':'%'+(0x100+((b>>>(l-=8))&0xff)).toString(16).slice(-2);
}));
}
Now calling decode_base64('4pyTIMOgIGxhIG1vZGU=') will return the encoded string '✓ à la mode', without any errors.
Since '=' is reserved as padding character I can reduce my code golf handicap, if I may:
function decode_base64(s) {
var b=l=0,
m='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
return decodeURIComponent(s.replace(/./g, function (v) {
b=(b<<6)+m.indexOf(v); l+=6;
return l<8||'='==v?'':'%'+(0x100+((b>>>(l-=8))&0xff)).toString(16).slice(-2);
}));
}
nJoy!
The php.js project has JavaScript implementations of many of PHP's functions. base64_encode and base64_decode are included.
For what it's worth, I got inspired by the other answers and wrote a small utility which calls the platform specific APIs to be used universally from either Node.js or a browser:
/**
* Encode a string of text as base64
*
* #param data The string of text.
* #returns The base64 encoded string.
*/
function encodeBase64(data: string) {
if (typeof btoa === "function") {
return btoa(data);
} else if (typeof Buffer === "function") {
return Buffer.from(data, "utf-8").toString("base64");
} else {
throw new Error("Failed to determine the platform specific encoder");
}
}
/**
* Decode a string of base64 as text
*
* #param data The string of base64 encoded text
* #returns The decoded text.
*/
function decodeBase64(data: string) {
if (typeof atob === "function") {
return atob(data);
} else if (typeof Buffer === "function") {
return Buffer.from(data, "base64").toString("utf-8");
} else {
throw new Error("Failed to determine the platform specific decoder");
}
}
I have tried the Javascript routines at phpjs.org and they have worked well.
I first tried the routines suggested in the chosen answer by Ranhiru Cooray - http://ntt.cc/2008/01/19/base64-encoder-decoder-with-javascript.html
I found that they did not work in all circumstances. I wrote up a test case where these routines fail and posted them to GitHub at:
https://github.com/scottcarter/base64_javascript_test_data.git
I also posted a comment to the blog post at ntt.cc to alert the author (awaiting moderation - the article is old so not sure if comment will get posted).
Frontend: Good solutions above, but quickly for the backend...
NodeJS - no deprecation
Use Buffer.from.
> inBase64 = Buffer.from('plain').toString('base64')
'cGxhaW4='
> // DEPRECATED //
> new Buffer(inBase64, 'base64').toString()
'plain'
> (node:1188987) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
// Works //
> Buffer.from(inBase64, 'base64').toString()
'plain'
In Node.js we can do it in simple way
var base64 = 'SGVsbG8gV29ybGQ='
var base64_decode = new Buffer(base64, 'base64').toString('ascii');
console.log(base64_decode); // "Hello World"
I'd rather use the bas64 encode/decode methods from CryptoJS, the most popular library for standard and secure cryptographic algorithms implemented in JavaScript using best practices and patterns.
For JavaScripts frameworks where there is no atob method and in case you do not want to import external libraries, this is short function that does it.
It would get a string that contains Base64 encoded value and will return a decoded array of bytes (where the array of bytes is represented as array of numbers where each number is an integer between 0 and 255 inclusive).
function fromBase64String(str) {
var alpha =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
var value = [];
var index = 0;
var destIndex = 0;
var padding = false;
while (true) {
var first = getNextChr(str, index, padding, alpha);
var second = getNextChr(str, first .nextIndex, first .padding, alpha);
var third = getNextChr(str, second.nextIndex, second.padding, alpha);
var fourth = getNextChr(str, third .nextIndex, third .padding, alpha);
index = fourth.nextIndex;
padding = fourth.padding;
// ffffffss sssstttt ttffffff
var base64_first = first.code == null ? 0 : first.code;
var base64_second = second.code == null ? 0 : second.code;
var base64_third = third.code == null ? 0 : third.code;
var base64_fourth = fourth.code == null ? 0 : fourth.code;
var a = (( base64_first << 2) & 0xFC ) | ((base64_second>>4) & 0x03);
var b = (( base64_second<< 4) & 0xF0 ) | ((base64_third >>2) & 0x0F);
var c = (( base64_third << 6) & 0xC0 ) | ((base64_fourth>>0) & 0x3F);
value [destIndex++] = a;
if (!third.padding) {
value [destIndex++] = b;
} else {
break;
}
if (!fourth.padding) {
value [destIndex++] = c;
} else {
break;
}
if (index >= str.length) {
break;
}
}
return value;
}
function getNextChr(str, index, equalSignReceived, alpha) {
var chr = null;
var code = 0;
var padding = equalSignReceived;
while (index < str.length) {
chr = str.charAt(index);
if (chr == " " || chr == "\r" || chr == "\n" || chr == "\t") {
index++;
continue;
}
if (chr == "=") {
padding = true;
} else {
if (equalSignReceived) {
throw new Error("Invalid Base64 Endcoding character \""
+ chr + "\" with code " + str.charCodeAt(index)
+ " on position " + index
+ " received afer an equal sign (=) padding "
+ "character has already been received. "
+ "The equal sign padding character is the only "
+ "possible padding character at the end.");
}
code = alpha.indexOf(chr);
if (code == -1) {
throw new Error("Invalid Base64 Encoding character \""
+ chr + "\" with code " + str.charCodeAt(index)
+ " on position " + index + ".");
}
}
break;
}
return { character: chr, code: code, padding: padding, nextIndex: ++index};
}
Resources used: RFC-4648 Section 4
Base64 Win-1251 decoding for encodings other than acsi or iso-8859-1.
As it turned out, all the scripts I saw here convert Cyrillic Base64 to iso-8859-1 encoding. It is strange that no one noticed this.
Thus, to restore the Cyrillic alphabet, it is enough to do an additional transcoding of the text from iso-8859-1 to windows-1251.
I think that with other languages, it will be the same. Just change Cyrilic windows-1251 to yours.
... and Thanks to Der Hochstapler for his code i'm take from his comment ... of over comment, which is somewhat unusual.
code for JScript (for Windows desktop only) (ActiveXObject) - 1251 file encoding
decode_base64=function(f){var g={},b=65,d=0,a,c=0,h,e="",k=String.fromCharCode,l=f.length;for(a="";91>b;)a+=k(b++);a+=a.toLowerCase()+"0123456789+/";for(b=0;64>b;b++)g[a.charAt(b)]=b;for(a=0;a<l;a++)for(b=g[f.charAt(a)],d=(d<<6)+b,c+=6;8<=c;)((h=d>>>(c-=8)&255)||a<l-2)&&(e+=k(h));return e};
sDOS2Win = function(sText, bInsideOut) {
var aCharsets = ["iso-8859-1", "windows-1251"];
sText += "";
bInsideOut = bInsideOut ? 1 : 0;
with (new ActiveXObject("ADODB.Stream")) { //http://www.w3schools.com/ado/ado_ref_stream.asp
type = 2; //Binary 1, Text 2 (default)
mode = 3; //Permissions have not been set 0, Read-only 1, Write-only 2, Read-write 3,
//Prevent other read 4, Prevent other write 8, Prevent other open 12, Allow others all 16
charset = aCharsets[bInsideOut];
open();
writeText(sText);
position = 0;
charset = aCharsets[1 - bInsideOut];
return readText();
}
}
var base64='0PPx8ero5SDh8+ru4uroIQ=='
text = sDOS2Win(decode_base64(base64), false );
WScript.Echo(text)
var x=WScript.StdIn.ReadLine();

Categories