I have an application that takes text entered by a user and passes it to the server as part of a URL so that an image containing the text can be rendered. The URL parameter is encoded using the encodeURIComponent function.
The problem I have is that if the user enters text containing + or foreign characters I cannot get the string decoded correctly server side.
For example, if the string is "François + Anna"
The encoded URL is previewImage.ashx?id=1&text=Fran%25E7ois%2520%2B%2520Anna
On the server
Uri.UnescapeDataString( Context.Request.QueryString["text"] )
throws an "Invalid URI: There is an invalid sequence in the string." exception. If I remove the extended character from the string, it is decoded as "Francois + Anna".
However, if I use
HttpUtility.UrlDecode(
Context.Request.QueryString["text"], System.Text.UTF8Encoding.UTF7 )
the foreign characters are decoded correctly but the encoded + is changed to a space; "François Anna".
The URL wasn't encoded correctly to begin with: previewImage.ashx?id=1&text=Fran%25E7ois%2520%2B%2520Anna is not the correct URL encoding of François + Anna.
I believe the correct encoding should have been previewImage.ashx?id=1&text=Fran%E7ois+%2B+Anna or previewImage.ashx?id=1&text=Fran%E7ois%20%2B%20Anna
Once the encoding has been fixed, then you should be able to retrieve the result via a simple Context.Request.QueryString["text"] call. No need to do anything special.
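The URL in the question looks like the result of encoding the text twice. The following is a minimal sketch of how that could happen, assuming (this is not confirmed by the question) that the client passed the text through the legacy escape() function before calling encodeURIComponent. It also shows that encodeURIComponent on its own emits the UTF-8 form %C3%A7 for ç rather than %E7.

```javascript
// Sketch: compare single and double encoding of the sample text.
const text = "François + Anna";

// encodeURIComponent percent-encodes the UTF-8 bytes of each character.
const once = encodeURIComponent(text);

// Assumption: the value went through the legacy escape() first.
// escape() emits %E7 for "ç" and leaves "+" alone; encoding that result
// again turns every "%" into "%25" and the "+" into "%2B".
const twice = encodeURIComponent(escape(text));

console.log(once);  // Fran%C3%A7ois%20%2B%20Anna
console.log(twice); // Fran%25E7ois%2520%2B%2520Anna
```

If the client produces the first form, Context.Request.QueryString["text"] should already hold the decoded text, as the answer says.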
Related
I am setting a cookie and want to do some work with it in client-side JS.
While decoding the value with decodeURIComponent(), some unwanted characters appeared at the start of the decoded string. As a quick fix, I removed the first few characters before decoding it to get the JSON.
I would like to know why j: was added to the URI and how to deal with it.
Also: the cookie looks right in DevTools, i.e. nothing is prepended when viewing the decoded value there.
To set my cookie, named note, from a JS object:
res.cookie('note',note,{maxAge : 1000*60*60*24});
let decoded = decodeURIComponent(document.cookie.substring(9, ));
decoded = JSON.parse(decoded);
I did this to decode my cookie and convert the JSON returned by decodeURIComponent into the JS object I want to use.
I tried encoding my object with encodeURIComponent, but it seems it gets encoded automatically.
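For context, Express's res.cookie() serializes an object value as the string "j:" followed by its JSON and then URL-encodes the result, which would explain both the j: prefix and why substring(9) happened to work ("note=" plus "j%3A" is nine characters). Below is a minimal sketch of reading such a cookie without a hard-coded offset. The helper name readJsonCookie and the sample cookie contents are hypothetical, and the cookie string is passed in as a parameter so the sketch also runs outside a browser.

```javascript
// Hypothetical helper: read a cookie that Express wrote from an object.
// cookieString is expected to look like document.cookie ("a=1; note=...").
function readJsonCookie(cookieString, name) {
  const pair = cookieString
    .split("; ")
    .find((entry) => entry.startsWith(name + "="));
  if (pair === undefined) return undefined;

  // Undo the percent-encoding applied when the cookie was set.
  const value = decodeURIComponent(pair.slice(name.length + 1));

  // Express marks JSON-serialized objects with a leading "j:".
  const json = value.startsWith("j:") ? value.slice(2) : value;
  return JSON.parse(json);
}

// Simulated cookie string in the shape res.cookie('note', {...}) produces.
const sampleCookies =
  "theme=dark; note=" + encodeURIComponent('j:{"title":"todo","done":false}');
const note = readJsonCookie(sampleCookies, "note");
console.log(note.title); // todo
```

In the browser this would be called as readJsonCookie(document.cookie, "note").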
I have some legacy code (pre-React) that encodes part of a URL with encodeURIComponent before calling history.push() on the https://www.npmjs.com/package/history module in order to navigate to that URL.
history.push() inside history.js then uses decodeURI to partially decode the entire URL (decodeURI only decodes the same characters that encodeURI encodes).
This partially decoded location.pathname ends up in ReactRouter, where useParams() gives me the partially decoded URL component back again.
Now I'm stuck with a partially decoded URL component which I cannot use. I need it fully decoded.
I can't use decodeURIComponent on the partially decoded string, because the original string might contain a %. In that case the % will already be decoded in the partially decoded string, which causes decodeURIComponent to crash with an Uncaught URIError: URI malformed.
My options seem to be:
use unescape to fully decode the partially decoded string (it doesn't complain about the single %) even though its use is discouraged (why?)
manually re-encode any % (that isn't followed by a digit and a subsequent hex character) back to %25 and then run the result through decodeURIComponent
Are there any less ugly solutions that I haven't thought of yet?
EDIT: I was asked for examples of what I meant by a partially decoded string
const original = 'A-Za-z0-9;,/?:@&=+$-_.!~*()#%';
const encodedURIComponent = encodeURIComponent(original); // "A-Za-z0-9%3B%2C%2F%3F%3A%40%26%3D%2B%24-_.!~*()%23%25"
console.log(decodeURIComponent(encodedURIComponent)); // "A-Za-z0-9;,/?:@&=+$-_.!~*()#%"
const partiallyUnescaped = decodeURI(encodedURIComponent); // "A-Za-z0-9%3B%2C%2F%3F%3A%40%26%3D%2B%24-_.!~*()%23%" - notice the '%25' at the end was decoded back to '%'
console.log(unescape(partiallyUnescaped)); // "A-Za-z0-9;,/?:@&=+$-_.!~*()#%"
//console.log(decodeURIComponent(partiallyUnescaped)); // error
EDIT 2: In case it can be of any help, here's a more realistic example of some of the characters our URL might contain, but because it's user generated, it could be anything really:
console.log( encodeURIComponent('abcd+%;- efgh')) ; // "abcd%2B%25%3B-%20efgh"
console.log( decodeURI(encodeURIComponent('abcd+%;- efgh'))) ; // "abcd%2B%%3B- efgh"
//console.log(decodeURIComponent(decodeURI(encodeURIComponent('abcd+%;- efgh')))); // Error: URI malformed
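The second option from the list above can be sketched as follows. This is only an illustration under stated assumptions: the helper name decodePartiallyDecoded is hypothetical, and the lookahead treats a % as a real escape only when two hex digits follow it.

```javascript
// Hypothetical helper: re-encode stray "%" characters, then decode fully.
function decodePartiallyDecoded(value) {
  // A "%" that is not followed by two hex digits cannot start an escape
  // sequence, so turn it back into "%25" before decoding.
  const repaired = value.replace(/%(?![0-9A-Fa-f]{2})/g, "%25");
  return decodeURIComponent(repaired);
}

const userText = "abcd+%;- efgh";
const partiallyDecoded = decodeURI(encodeURIComponent(userText));
console.log(partiallyDecoded);                         // abcd%2B%%3B- efgh
console.log(decodePartiallyDecoded(partiallyDecoded)); // abcd+%;- efgh
```

The approach is lossy in one case: if the user's text itself contains a % followed by two hex digits (for example the literal text %41), the partially decoded string is indistinguishable from a real escape and comes back as A. Avoiding the intermediate decodeURI step is the only way to keep such input intact.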
I receive a Base64-encoded string of a CSV file from the frontend. In the backend I convert the Base64 string to binary and then try to convert it to a JSON object.
var csvDcaData = new Buffer(source, 'base64').toString('binary')//convert base64 to binary
The problem is that the UI is sending some illegal characters with one of the fields, which are not visible to the user in the plain CSV. "ï»¿" are the characters prepended to one of the CSV fields.
I want to remove these kinds of characters from the Base64 data, but I cannot recognize them in the buffer; they only appear after conversion.
Is there any way to detect such characters in the buffer?
The source is sending you a message. The message consists of metadata and text. The first few bytes of the message are identifiable as metadata because they are the Byte-Order Mark (BOM) encoded in UTF-8. That strongly suggests that the text is encoded in UTF-8. Nonetheless, to read the text you should find out from the sender which encoding is used.
Yes, the BOM "characters" should be stripped off when wanting to deal only in the text. They are not characters in the sense that they are not part of the text. (Though, if you decode the bytes as UTF-8, it matches the codepoint U+FEFF.)
So, though perhaps esoteric, the message does not contain illegal characters but actually has useful metadata.
Also, given that you are not stripping off the BOM, the fact that you are seeing "ï»¿" instead of "" (U+FEFF ZERO WIDTH NO-BREAK SPACE) means that you are not using UTF-8 to decode the text. That could result in data loss. There is no text but encoded text. You always have to know and use the correct encoding.
Now, source is a JavaScript string (which, by-the-way, uses the UTF-16 encoding of Unicode). The content of the string is a message encoded in Base64. The message is a sequence of bytes which are the UTF-8 encoding of a BOM and text. You want the text in a JavaScript string. (And the text happens to be some form of CSV. For that, you'll need to know the line ending, delimiter, and text-qualifier.) There is a lot for you and the sender to talk about. Perhaps the sender has documented all this.
const stripBom = require('strip-bom');
const original = "¡You win one million ₹! Now you can get a real 🚲";
const base64String = Buffer.from("\u{FEFF}" + original, "utf-8").toString("base64");
console.log(base64String);
const decodedString =
stripBom(Buffer.from(base64String, "base64").toString("utf-8"));
console.log(decodedString);
console.log(original === decodedString);
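To answer the "detect it in the buffer" part directly: the UTF-8 BOM is the three bytes EF BB BF at the start of the data, so it can be checked before decoding. A minimal sketch without the strip-bom dependency follows; the helper name decodeBase64Utf8 and the sample CSV content are made up for illustration, and it assumes the sender really uses UTF-8.

```javascript
// Hypothetical helper: decode Base64 to text, dropping a UTF-8 BOM if present.
function decodeBase64Utf8(base64String) {
  const bytes = Buffer.from(base64String, "base64");
  // The UTF-8 encoding of U+FEFF is the byte sequence EF BB BF.
  const hasBom =
    bytes.length >= 3 &&
    bytes[0] === 0xef &&
    bytes[1] === 0xbb &&
    bytes[2] === 0xbf;
  return (hasBom ? bytes.subarray(3) : bytes).toString("utf8");
}

// Made-up sample: a tiny CSV with a leading BOM, as the sender might produce.
const csvWithBom = Buffer.from("\u{FEFF}name,city\nAna,Goa", "utf8").toString("base64");
console.log(JSON.stringify(decodeBase64Utf8(csvWithBom))); // "name,city\nAna,Goa"
```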
I am trying to encode a string in JavaScript and decode it in PHP.
I use this code to put the string in an input box and then send it via form PUT.
document.getElementById('signature').value= b64EncodeUnicode(ab2str(signature));
And this code to decode
$signature=base64_decode($signature);
Here there is a jsfiddle for the encoding page:
https://jsfiddle.net/okaea662/
The problem is that I always get a string 98% correct but with some different characters.
For example: (the first string is the string printed in the inputbox)
¦S÷ä½m0×C|u>£áWÅàUù»¥ïs7Dþ1Ji%ýÊ{\ö°(úýýÁñxçO9Ù¡ö}XÇIWçβÆü8ú²ðÑOA¤nì6S+̽ i¼?¼ºNËÒo·a©8»eO|PPþBE=HèÑqaX©$Ì磰©b2(Ðç.$nÈR,ä_OX¾xè¥3éÂòkå¾ N,sáW§ÝáV:ö~Å×à<4)íÇKo¡L¤<Í»äA(!xón#WÙÕGù¾g!)ùC)]Q(*}?Ìp
¦S÷ ä½m0×C|u>£áWÅàUù»¥ïs7Dþ1Ji%ýÊ{\ö°(úýýÁñxçO9Ù¡ö}XÇIWçβÆü8ú²ðÑOA¤nì6S+̽ i¼?¼ºNËÒo·a©8»eO|PPþBE=HèÑ qaX©$Ì磰©b2(Ðç.$nÈR,ä_OX¾xè¥3éÂòkå¾ N ,sá W§ÝáV:ö~Å×à<4)íÇKo¡L¤<Í»äA(!xón#WÙÕGù¾g!)ùC)]Q(*}?Ìp
Note that the 4th character is different, and there are one or two more differences elsewhere.
The string is a digital signature, so these characters make the signature invalid.
I have no idea what is happening here. Any ideas? I use the Chrome browser and UTF-8 encoding in the header and meta tags (Firefox seems to use a different encoding in the input box, but I will look at that problem later).
EDIT:
The encoding to base64 apparently is not the problem. The base64-encoded string is the same in the browser as on the server. If I base64-decode it in JavaScript I get the original string, but if I decode it in PHP I get a slightly different string.
EDIT2:
I still don't know what the problem is, but I have avoided it by sending the data as a blob with ajax.
Try using this command to encode your string with js:
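// Read the field's text; passing the element itself to btoa would encode "[object HTMLInputElement]".
var signature = document.getElementById('signature').value;
var base64 = window.btoa(signature);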
Now with php, you simply use: base64_decode($signature)
If that doesn't work (I haven't tested it), there may be something wrong with the btoa function. So check out this link:
https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding
There is a function in there that should work (if the above does not)
function b64EncodeUnicode(str) {
    // Percent-encode as UTF-8, then turn each %XX escape into one byte-valued character for btoa.
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) {
        return String.fromCharCode('0x' + p1);
    }));
}
b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
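One possible explanation for the mismatch, not confirmed anywhere in this thread: b64EncodeUnicode encodes the UTF-8 form of the string, so PHP's base64_decode returns UTF-8 bytes instead of the original signature bytes. If the signature is available as an ArrayBuffer, it may be simpler to Base64-encode the raw bytes and skip the text step entirely. The sketch below assumes that; bytesToBase64 and the sample bytes are hypothetical.

```javascript
// Hypothetical helper: Base64-encode raw bytes without any text encoding step.
function bytesToBase64(arrayBuffer) {
  const bytes = new Uint8Array(arrayBuffer);
  let binary = "";
  for (const byte of bytes) {
    // One character per byte (code unit 0-255), which is what btoa expects.
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Made-up bytes standing in for a real signature.
const signatureBytes = new Uint8Array([0xa6, 0x53, 0xf7, 0x00, 0xe4, 0xff]).buffer;
const signatureBase64 = bytesToBase64(signatureBytes);
console.log(signatureBase64);
```

On the PHP side, base64_decode($signature) would then yield the same byte sequence that went in.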
I can't figure out why the "+" sign is converted to " " in an ajax POST. Please explain.
Use the encodeURIComponent() function to turn your data into valid encoded data for the request:
xhr.open("POST", url, true);
xhr.send(encodeURIComponent(postdata));
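Note that if postdata is already a key=value&key=value string, encoding the whole string would also escape the = and & separators. A minimal sketch of encoding each name and value separately is shown here; the helper name buildFormBody and the field names are hypothetical.

```javascript
// Hypothetical helper: build an application/x-www-form-urlencoded body.
function buildFormBody(fields) {
  return Object.entries(fields)
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
}

const formBody = buildFormBody({ expr: "a+b c", id: 1 });
console.log(formBody); // expr=a%2Bb%20c&id=1

// In the browser the body would then be sent like this:
// xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
// xhr.send(formBody);
```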
It's how URL encoding works. If you want a plus sign it's %2B, but you should really just escape or encode the data you're sending to the server. Type "a+b c" in here.
"+" is the url encoded symbol for space. As such, when your post data is decoded the "+" is converted to a space.
This is because URL Encoding converts spaces to + since spaces aren't valid in URLs.
Normally characters are converted to % followed by two hex digits, but having + instead of %20 makes URLs more readable.
If you encode your + as %2B that should work.
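The decoding rule can be observed with the standard URLSearchParams class, which parses form-encoded data the same way most servers do. This is just a small illustration; the parameter name q is arbitrary.

```javascript
// A raw "+" in form-encoded data is read back as a space.
const rawPlus = new URLSearchParams("q=a+b");
// "%2B" is read back as a literal plus sign.
const escapedPlus = new URLSearchParams("q=a%2Bb");

console.log(JSON.stringify(rawPlus.get("q")));     // "a b"
console.log(JSON.stringify(escapedPlus.get("q"))); // "a+b"

// Serializing does the reverse: "+" becomes %2B and a space becomes "+".
const serialized = new URLSearchParams({ q: "a+b c" }).toString();
console.log(serialized); // q=a%2Bb+c
```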
Chances are that you are using the + sign in an URL, where it is rightly converted into a space, as + is the URLEncoded representation of a space character.
Run encodeURIComponent() on whatever value you are putting into your URL to get it into URL-encoded form. (The older escape() function leaves + unencoded, so it does not help here.)
That's just standard url encoding. Plus signs are converted to spaces on the server. If you want to pass a plus sign you need to escape it as %2b.