How to send Korean characters in URL?

How to send Korean characters in URL? - javascript

I want to send some Korean values from page1.html on page sumbit to page2.html. But the Korean fonts are getting encoded. Can any one help me with it. I have this meta tag in both the screens.
window.location.href = "page2.html?value='풍경' these Korean character are getting encoded in few mobile devices.
for this page the values is encode as
page2.html?value= %EC%82%EB%AC%BC

Korean characters (and any other non-URL-safe characters) are %-encoded for the transaction. However, when received by the server and put into (for instance) PHP's $_GET array, they are decoded automatically so you don't have to worry about it.

I'm still not completely clear on what you actually asking, but if you correctly construct Url it should be much easier to reason on what should/should not be happening:
// to construct correctly encoded Url:
var encodedValue = encodeURIComponent("'풍경'");
window.location.href = "page2.html?value=" + encodedValue;
// to decode back from query parameter (if needed)
var decoded = decodeURIComponent(encodedValue);
Check out Encode URL in JavaScript? for guidance on encoding Urls with JavaScript.

Related

In URL `%` is replaced by `%25` when using `queryParams` while routing in Angular

I wanted to navigate to a URL using queryParams while Routing in Angular.
<a routerLink='/master' [queryParams]="{query:'%US',mode:'text'}"><li (click)="search()">Search</li></a>
The URL I wanted to navigate is:
http://localhost:4200/master?query=%US&mode=text
But when I click on search it navigates me to:
http://localhost:4200/master?query=%25US&mode=text
I do not know why 25 is appended after the % symbol. Can anyone tell me a cleaner way to navigate correctly.

In URLs, the percent sign has special meaning and is used to encode special characters. For example, = is encoded as %3D.
Certain special characters are not allowed in url. If you want to use those in url you have to encode them using encodeURIComponent javascript function.
%25 is actually encoded version of % character. Here browser is encoding them itself.
When trying to get queryParams from url , you can decode them using decodeURIComponent.
For more information check : https://support.microsoft.com/en-in/help/969869/certain-special-characters-are-not-allowed-in-the-url-entered-into-the
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/decodeURIComponent

Javascript encodeURIComponent and Java decode issue

I have a certain text I am encoding in JS using encodeURIComponent. The original text is
weoowuyteeeee !_.
Test could you please resubmit again?
I am doing the following in my JS code before sending it.
var text = encodeURIComponent($("#txt11").val());
Should I not be doing that?
Once I encode it using encodeURIComponent, it becomes
weoowuyteeeee%2520!_.%252C%250A%250ATest%252C%2520%2520could%2520you%2520please%2520resubmit%2520again%253F
I'm trying to decrypt the same on the Java side using
String decodedString1 = URLDecoder.decode(myObject.getText(), "UTF-8");
but I see this as the output, not the original text. What am I doing wrong?
weoowuyteeeee%20!_.%2C%0A%0ATest%2C%20%20could%20you%20please%20resubmit%20again%3F

You are encoding your data twice.
Initially, you have encoded your data and later it is encoded again.
Eg: Let your text be
Hello World
After encoding it becomes
Hello%20World
If you encode again it becomes
Hello%2520World
Reason
% from %20 is encoded to %25. So the space becomes %2520.
Normal AJAX can will automatically encode your data before sending to the server side. Check where the 2nd encoding is happening.

encode URL so it can be send through GET

What is the best way to attach URL to query string in javascript? I realize that it needs to be encoded.
I've come across encodeURIComponent() function, which looks like the thing that I want. I am just unsure if it is suitable for this kind of task.
Example usage:
var someURL = encodeURIComponent("http://stackoverflow.com/questions/ask?name=.hil#");
var firstURL = "www.stackoverflow.com/questions?someurl="
firstURL+someURL;

Your choices are encodeURI and encodeURIComponent.
encodeURIComponent is the right choice because you are encoding part of the URL (which happens to be URL-like but that doesn't matter here).
If you were to use encodeURI, it would not convert enough of the characters in the component.

How to encode and decode all special characters in javascript or jquery?

I want to be able to encode and decode all the following characters using javascript or jquery...
~!##$%^&*()_+|}{:"?><,./';[]\=-`
I tried to encode them using this...
var cT = encodeURI(oM); // oM holds the special characters
cT = cT.replace(/[!"#$%&'()*+,.\/:;<=>?#[\\\]^`{|}~]/g, "\\\\$&");
Which does encode them, or escape them rather, but then I am trying to do the reverse with this...
decodeURIComponent(data.convo.replace(/\+/g, ' '));
But, it's not coming out in any way desired.
I've built a chat plugin for jquery, but the script crashes if someone enters a special character. I want the special characters to get encoded, then when they get pulled out of the data base, they should be decoded. I tried using urldecode in PHP before the data is returned to the ajax request but it's coming out horribly wrong.
I would think that there exists some function to encode and decode all special characters.
Oh, one caveat for this is that I'm wrapping each message with html elements, so I think the decoding needs to be done server side, before the message is wrapped, or be able to know when to ignore valid html tags and decode the other characters that are just what the user wanted to type.
Am I encoding/escaping them wrong to begin with?
Is that why the results are horrible?

This is pretty simple in javascript
//Note that i have escaped the " in the string - this means it still gets processed
var exampleInput = "Hello there h4x0r ~!##$%^&*()_+|}{:\"?><,./';[]\=-`";
var encodedInput = encodeURI(exampleInput);
var decodedInput = decodeURI(encodedInput);
console.log(exampleInput);
console.log(encodedInput);
console.log(decodedInput);
Just encode and decode the input. If something else is breaking in your script it means you are not stripping away things that you are somehow processing. It's hard to provide an accurate answer as you can see encoding and decoding the URI standards does not crash things. Only the processing of this content improperly would cause issues.
When you output the content in HTML you should be encoding the HTML entities.
Reference this thread Encode html entities in javascript if you need to actually encode for display inside HTML safely.
An additional reference on how html entities work can be found here: W3 Schools - HTML Entities and W3 Schools - HTML Symbols

Handling unicode in the http response xml

I'm writing a Google Chrome extension that builds upon myanimelist.net REST api. Sometimes the XMLHttpRequest response text contains unicode.
For example:
<title>Onegai My Melody Sukkiriâ�ª</title>
If I create a HTML node from the text it looks like this:
Onegai My Melody Sukkiriâ�ª
The actual title, however, is this:
Onegai My Melody Sukkiri♪
Why is my text not correctly rendered and how can I fix it?
Update
Code: background.html
I think these are the crucial parts:
function htmlDecode(input){
var e = document.createElement('div');
e.innerHTML = input;
return e.childNodes.length === 0 ? "" : e.childNodes[0].nodeValue;
}
function xmlDecode(input){
var result = input;
result = result.replace(/</g, "<");
result = result.replace(/>/g, ">");
result = result.replace(/\n/g, "
");
return htmlDecode(result);
}
Further:
var parser = new DOMParser();
var xmlText = response.value;
var doc = parser.parseFromString(xmlDecode(xmlText), "text/xml");

<title>Onegai My Melody Sukkiriâ�ª</title>
Oh dear! Not only is that the wrong text, it's not even well-formed XML. acirc and ordf are HTML entities which are not predefined in XML, and then there's an invalid UTF-8 sequence (one high byte, presumably originally 0x99) between them.
The problem is that myanimelist are generating their output ‘XML’ (but “if it ain't well-formed, it ain't XML”) using the PHP function htmlentities(). This tries to HTML-escape not only the potentially-sensitive-in-HTML characters <&"', but also all non-ASCII characters.
This generates the wrong characters because PHP defaults to treating the input to htmlentities() as ISO-8859-1 instead of UTF-8 which is the encoding they're actually using. But it was the wrong thing to begin with because the HTML entity set doesn't exist in XML. What they really wanted to use was htmlspecialchars(), which leaves the non-ASCII characters alone, only escaping the really sensitive ones. Because those are the same ones that are sensitive in XML, htmlspecialchars() works just as well for XML as HTML.
htmlentities() is almost always the Wrong Thing; htmlspecialchars() should typically be used instead. The one place you might want to encode non-ASCII bytes to entity references would be when you're targeting pure ASCII output. But even then htmlentities() fails because it doesn't make character references (&#...;) for the characters that don't have a predefined entity names. Pretty useless.
Anyway, you can't really recover the mangled data from this. The � represents a byte sequence that was UTF-8-undecodable to the XMLHttpRequest, so that information is irretrievably lost. You will have to persuade myanimelist to fix their broken XML output as per the above couple of paragraphs before you can go any further.
Also they should be returning it as Content-Type: text/xml not text/html as at the moment. Then you could pick up the responseXML directly from the XMLHttpRequest object instead of messing about with DOMParsers.

So, I've come across something similar to what's going on here at work, and I did a bit more research to confirm my hypothesis.
If you take a look at the returned value you posted above, you'll notice the tell-tell entity "â". 99% of the time when you see this entity, if means you have a character encoding issue (typically UTF-8 characters are being encoded as ISO-8859-1).
The first thing I would test for is to force a character encoding in the API return. (It's a long shot, but you could look)
Second, I'd try to force a character encoding onto the data returned (I know there's a .htaccess override, but I don't know what's allowed in Chrome extensions so you'll have to research that).
What I believe is going on, is when you crate the node with the data, you don't have a character encoding set on the document, and browsers (typically, in my experience) default to ISO-8859-1. So, check to make sure it's not your document that's the problem.
Finally, if you can't find the source (or can't prevent it) of the character encoding, you'll have to write a conversation table to replace the malformed values you're getting with the ones you want { JS' "replace" should be fine (http://www.w3schools.com/jsref/jsref_replace.asp) }.

You can't just use a simple search and replace to fix encoding issue since they are unicode, not characters typed on a keyboard.
Your data must be stored on the server in UTF-8 format if you are planning on retrieving it via AJAX. This problem is probably due to someone pasting in characters from MS-Word which use a completely different encoding scheme (ISO-8859).
If you can't fix the data, you're kinda screwed.
For more details, see: UTF-8 vs. Unicode

We Keep Coding

JavaScript is the programming language of the Web.