Why does unescape work but decodeURI doesn't? - javascript

I have the following variable:
var string="Mazatl%E1n";
The string is returned like that by the server. All I want to do is decode that into: Mazatlán. I've tried the following:
var string="Mazatl%E1n";
alert(unescape(string));
alert(decodeURI(string));
unescape works fine, but I don't want to use it because I understand it is deprecated. Instead I tried decodeURI, which fails with the following error:
Uncaught URIError: URI malformed
Why? Any help is appreciated.

You get the error because %E1 is the escape()-style encoding of the Unicode code point U+00E1 (a single Latin-1 byte), but decodeURI() expects percent-encoded UTF-8 bytes.
You'll either have to create your own unescape function, for example:
function unicodeUnEscape(string) {
  // Matches %uXXXX (UTF-16 code unit) and %XX (single byte) escapes, in either hex case.
  return string.replace(/%u([\dA-F]{4})|%([\dA-F]{2})/gi, function(_, m1, m2) {
    return String.fromCharCode(parseInt(m1 || m2, 16));
  });
}
var string = "Mazatl%E1n";
document.body.innerHTML = unicodeUnEscape(string);
Or you could change the server to send the string percent-encoded as UTF-8 instead, in which case you can use decodeURI():
var string = "Mazatl%C3%A1n";
document.body.innerHTML = decodeURI(string);
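The mismatch between the two encodings can be checked directly. This is a minimal sketch; the helper name tryDecode is hypothetical and only introduced for illustration:

```javascript
// Hypothetical helper: returns the decoded string, or the error name if decoding throws.
function tryDecode(s) {
  try {
    return decodeURIComponent(s);
  } catch (e) {
    return e.name;
  }
}

// '\u00e1' is 'á'; its UTF-8 form is the two bytes C3 A1.
const utf8Form = encodeURIComponent('\u00e1');    // '%C3%A1'
const decodedUtf8 = tryDecode('Mazatl%C3%A1n');   // 'Mazatlán'
// A lone %E1 is an incomplete UTF-8 sequence, so decoding throws.
const decodedLegacy = tryDecode('Mazatl%E1n');    // 'URIError'
console.log(utf8Form, decodedUtf8, decodedLegacy);
```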

A URI may contain only ASCII characters, so anything else has to be percent-encoded. The correct encoding for á is %C3%A1 (its UTF-8 bytes).
escape and unescape use a different scheme: hexadecimal escape sequences built from the character code itself (%XX for codes up to 0xFF, %uXXXX above that).
So the value you're getting from the server looks like it has been encoded with the equivalent of escape(string).

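The decodeURI() function expects a complete URI as its parameter and leaves the escapes of reserved characters (such as %2F) untouched. If you are only trying to decode a string instead of a full URI, use decodeURIComponent(). Note that both functions expect UTF-8 and throw a URIError on a lone %E1.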
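The difference between the two decoders shows up on a string that contains escaped reserved characters. A small sketch, with illustrative variable names:

```javascript
// %2F is '/', %3F is '?', %20 is a space.
const encoded = 'a%2Fb%3Fc%20d';

// decodeURI keeps escapes of reserved URI characters, since decoding them
// could change how the URI is split into parts.
const viaDecodeURI = decodeURI(encoded);            // 'a%2Fb%3Fc d'

// decodeURIComponent decodes every escape.
const viaComponent = decodeURIComponent(encoded);   // 'a/b?c d'

console.log(viaDecodeURI, viaComponent);
```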

Related

Decode UTF16 encoded string (URL) in Java Script

I have a string that is encoded in UTF-16 and I want to decode it using JS. When I use a simple decodeURI() call I get the desired result, but when the string contains special characters like á, ó, etc., it does not decode.
On further analysis I found that these characters appear in the encoded string as their single-byte (extended ASCII) values.
Say I have the string "Acesse já"; the encoded version is "Acesse%20j%E1". How can I get the original string back from the encoded version using JavaScript?
EDIT:
The string is part of a URL.
OK, your string seems to have been encoded using escape, so use unescape to decode it:
unescape('Acesse%20j%E1'); // => 'Acesse já'
However, escape and unescape are deprecated; where you control the encoding side, prefer encodeURIComponent and decodeURIComponent:
encodeURIComponent('Acesse já'); // => 'Acesse%20j%C3%A1'
decodeURIComponent('Acesse%20j%C3%A1'); // => 'Acesse já'
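If the encoding side cannot be changed, one option is to convert the legacy form into UTF-8 percent-encoding without calling unescape. The sketch below assumes the input contains only %XX escapes (no %uXXXX sequences and no stray % signs); the function name is hypothetical:

```javascript
// Hypothetical helper: turns escape()-style %XX sequences (one byte per
// character, Latin-1 range) into UTF-8 percent-encoding.
function latin1PercentToUtf8Percent(s) {
  // Step 1: replace each %XX with the character that has that code.
  const decoded = s.replace(/%([0-9a-f]{2})/gi, function (_, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  });
  // Step 2: re-encode the plain string as UTF-8 percent-encoding.
  return encodeURIComponent(decoded);
}

const converted = latin1PercentToUtf8Percent('Acesse%20j%E1');
console.log(converted);                     // 'Acesse%20j%C3%A1'
console.log(decodeURIComponent(converted)); // 'Acesse já'
```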

base64 encoding in javascript decoding in php

I am trying to encode a string in JavaScript and decode it in PHP.
I use this code to put the string in an input box and then send it via a form PUT.
document.getElementById('signature').value= b64EncodeUnicode(ab2str(signature));
And this code to decode it:
$signature=base64_decode($signature);
Here is a jsfiddle for the encoding page:
https://jsfiddle.net/okaea662/
The problem is that I always get a string that is about 98% correct but with a few different characters.
For example (the first string is the one printed in the input box):
¦S÷ä½m0×C|u>£áWÅàUù»¥ïs7Dþ1Ji%ýÊ{\ö°(úýýÁñxçO9Ù¡ö}XÇIWçβÆü8ú²ðÑOA¤nì6S+̽ i¼?¼ºNËÒo·a©8»eO|PPþBE=HèÑqaX©$Ì磰©b2(Ðç.$nÈR,ä_OX¾xè¥3éÂòkå¾ N,sáW§ÝáV:ö~Å×à<4)íÇKo¡L¤<Í»äA(!xón#WÙÕGù¾g!)ùC)]Q(*}?­Ìp
¦S÷ ä½m0×C|u>£áWÅàUù»¥ïs7Dþ1Ji%ýÊ{\ö°(úýýÁñxçO9Ù¡ö}XÇIWçβÆü8ú²ðÑOA¤nì6S+̽ i¼?¼ºNËÒo·a©8»eO|PPþBE=HèÑ qaX©$Ì磰©b2(Ðç.$nÈR,ä_OX¾xè¥3éÂòkå¾ N ,sá W§ÝáV:ö~Å×à<4)íÇKo¡L¤<Í»äA(!xón#WÙÕGù¾g!)ùC)]Q(*}?­Ìp
Note that the 4th character is different, and there are one or two more differences further on.
The string is a digital signature, so these characters make the signature invalid.
I have no idea what is happening here. Any ideas? I use Chrome, with UTF-8 declared in the headers and meta tags (Firefox seems to use a different encoding in the input box, but I will look into that problem later).
EDIT:
Encoding to base64 is apparently not the problem. The base64-encoded string is the same in the browser as on the server. If I base64-decode it in JavaScript I get the original string, but if I decode it in PHP I get a slightly different string.
EDIT2:
I still don't know what the problem is, but I have worked around it by sending the data as a blob with Ajax.
Try encoding your string with the built-in btoa function:
var signature = document.getElementById('signature').value;
var base64 = window.btoa(signature);
Then in PHP you simply use: base64_decode($signature)
If that doesn't work (I haven't tested it), keep in mind that btoa only accepts characters with codes up to 0xFF and throws on anything else. So check out this link:
https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding
There is a function there that should work if the above does not:
function b64EncodeUnicode(str) {
  // Percent-encode as UTF-8, then turn each %XX back into one byte-valued character for btoa.
  return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) {
    return String.fromCharCode(parseInt(p1, 16));
  }));
}
b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
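Since a signature is binary data, another option is to base64-encode the bytes directly instead of passing them through a text string first. The sketch below assumes the signature is available as a Uint8Array (the question's ab2str helper hints at an ArrayBuffer, but that is an assumption); bytesToBase64 is a hypothetical name:

```javascript
// Hypothetical helper: base64-encodes raw bytes without any text encoding step.
function bytesToBase64(bytes) {
  let binary = '';
  for (const b of bytes) {
    // Each byte becomes one character with the same code (0..255),
    // which is the only kind of input btoa accepts.
    binary += String.fromCharCode(b);
  }
  return btoa(binary);
}

// Three sample bytes, chosen for illustration only.
const sample = new Uint8Array([0xa6, 0x53, 0xf7]);
const sampleBase64 = bytesToBase64(sample);
console.log(sampleBase64); // 'plP3'
```

On the PHP side, base64_decode then returns the same bytes, with no character encoding involved at any step.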

js escape => android decode

I encoded a string using the JS escape function:
var result = escape('Вася');
and got "%u0412%u0430%u0441%u044F" as the result.
How can I decode this string in Java?
This doesn't work:
URLDecoder.decode(text, "UTF-8");
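Use encodeURI() on the JavaScript side instead. It is more or less equivalent to Java's URLEncoder.encode(), which in turn is the reverse of URLDecoder.decode(). You may also try encodeURIComponent(), which additionally escapes reserved characters such as & and =. Both emit non-ASCII characters as percent-encoded UTF-8, which is what URLDecoder.decode(text, "UTF-8") expects.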
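For comparison, here is what the two JavaScript encoders produce for the string from the question. This is only a sketch of the sending side; the variable names are illustrative:

```javascript
// 'Вася' written with escapes so the source file encoding does not matter.
const text = '\u0412\u0430\u0441\u044f';

// escape() emits %uXXXX code units, which URLDecoder does not understand.
const legacyForm = escape(text);            // '%u0412%u0430%u0441%u044F'

// encodeURIComponent() emits UTF-8 bytes, two per Cyrillic letter here.
const utf8Form = encodeURIComponent(text);  // '%D0%92%D0%B0%D1%81%D1%8F'

console.log(legacyForm, utf8Form);
```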

How to parse Json with jquery having special chars in it?

I have a JSON structure like the one below. How can I parse it with jQuery so that I can use it like myJson[0].name and then alert it, so that "M\\xe9t\\xe9o" becomes Météo?
jQuery tells me this is invalid JSON. Why?
The JSON uses a double backslash; if I use a single backslash ("M\xe9t\xe9o"), jQuery is OK with the syntax.
var jsonObj = '{"title":[{"id":"1","name": "M\\xe9t\\xe9o"},{"id":"2","name": "Meteo"}]}';
var myJson = jQuery.parseJSON(jsonObj);
JSON syntax has no \xNN escape; for arbitrary characters it only allows \uXXXX escapes.
Change it to "M\\u00e9t\\u00e9o".
If you use a single backslash, it gets parsed by the Javascript string literal, so the actual string value contains the real Unicode character, not an escape. In other words, "M\xe9t\xe9o" === "Météo"
It looks like the JSON was encoded incorrectly (manually?). When it is encoded properly, e.g. with PHP, you get:
{"title":[{"id":"1","name": "M\u00e9t\u00e9o"},{"id":"2","name": "Meteo"}]}
which is parsed correctly by JS. But \xe9 is not recognized by the JSON parser.
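The same behaviour can be reproduced with the built-in JSON.parse, used here in place of jQuery.parseJSON so that the sketch runs without jQuery. Variable names are illustrative:

```javascript
// Inside a JS string literal, '\\u00e9' yields the six characters \u00e9,
// which the JSON parser then turns into 'é'.
const validJson = '{"name": "M\\u00e9t\\u00e9o"}';
const parsedName = JSON.parse(validJson).name;   // 'Météo'

// '\\xe9' yields the four characters \xe9, which is not a JSON escape.
const invalidJson = '{"name": "M\\xe9t\\xe9o"}';
let parseError = null;
try {
  JSON.parse(invalidJson);
} catch (e) {
  parseError = e.name;                           // 'SyntaxError'
}

console.log(parsedName, parseError);
```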

How to detect if a string is encoded with escape() or encodeURIComponent()

I have a web service that receives data from various clients. Some of them send the data encoded using escape(), while others use encodeURIComponent(). Is there a way to detect the encoding used to escape the data?
This won't help on the server side, but on the client side I have used JavaScript exceptions to detect whether the URL encoding produced ISO Latin or UTF-8 output.
decodeURIComponent throws an exception on invalid UTF-8 sequences.
try {
result = decodeURIComponent(string);
}
catch (e) {
result = unescape(string);
}
For example, ISO Latin encoded umlaut 'ä' %E4 will throw an exception in Firefox, but UTF8-encoded 'ä' %C3%A4 will not.
See Also
decodeURIComponent vs unescape, what is wrong with unescape?
Comparing escape(), encodeURI(), and encodeURIComponent()
I realize this is an old question, but I am unaware of a better solution. So I do it like this (thanks to a comment by RobertPitt above):
function isEncoded(str) {
return typeof str == "string" && decodeURIComponent(str) !== str;
}
I have not yet encountered a case where this failed, which doesn't mean such a case doesn't exist. Maybe someone could shed some light on this.
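One case to watch for is input that decodeURIComponent rejects outright, such as a lone %E4 or a trailing % sign, because the check above then throws instead of returning a boolean. A hedged variant that reports three outcomes might look like this; the function name and the labels are hypothetical:

```javascript
// Hypothetical classifier built on the same decodeURIComponent check.
function classifyEncoding(str) {
  try {
    // Unchanged after decoding: nothing was percent-encoded.
    return decodeURIComponent(str) === str ? 'plain' : 'uri';
  } catch (e) {
    // URIError: either escape()-style output or a stray '%'.
    return 'not-utf8';
  }
}

console.log(classifyEncoding('abc'));     // 'plain'
console.log(classifyEncoding('%C3%A4'));  // 'uri'
console.log(classifyEncoding('%E4'));     // 'not-utf8'
console.log(classifyEncoding('100%'));    // 'not-utf8'
```

This remains a heuristic: a string such as %C3%A4 is also valid escape() output for two Latin-1 characters, so the two encodings cannot always be told apart.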
Encourage your clients to use encodeURIComponent(). See this page for an explanation: Comparing escape(), encodeURI(), and encodeURIComponent(). If you really want to try to figure out exactly how something was encoded, you can try to look for some of the characters that escape() and encodeURI() do not encode.
Thanks to @mika for the great answer. Maybe just one improvement, since the unescape function is considered deprecated:
declare function unescape(s: string): string;
function decodeURItoString(str: string): string {
  var resp = str;
  try {
    resp = decodeURI(str);
  } catch (e) {
    console.log('ERROR: Cannot decodeURI string!');
    // Fall back to unescape only if the runtime still provides it.
    if (typeof unescape === 'function') {
      resp = unescape(str);
    }
  }
  return resp;
}
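You often don't have to differentiate them. escape() is a form of percent-encoding, and for ASCII input it differs from URI encoding only in which characters get encoded. For example, escape() leaves + as it is while encodeURIComponent() encodes it as %2B (a space becomes %20 with both; it is form encoding that uses +). Once decoded, you get the same value. For non-ASCII characters the two do differ: escape() emits %XX or %uXXXX character codes, while encodeURIComponent() emits UTF-8 bytes, so the decoder has to match.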
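On the point about spaces: decodeURIComponent does not treat + as a space, so data that arrived form-encoded needs one extra step. A minimal sketch, assuming the input is application/x-www-form-urlencoded; the function name is hypothetical:

```javascript
// Hypothetical helper for form-encoded values, where '+' stands for a space
// and a literal plus sign is sent as %2B.
function decodeFormComponent(s) {
  // Replace '+' first, so that a decoded %2B is not turned into a space.
  return decodeURIComponent(s.replace(/\+/g, ' '));
}

console.log(decodeURIComponent('a+b%20c'));   // 'a+b c'
console.log(decodeFormComponent('a+b%20c'));  // 'a b c'
console.log(decodeFormComponent('1%2B1'));    // '1+1'
```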
