I thought values entered in forms are properly encoded by browsers.
But this simple test file "test_get_vs_encodeuri.html" shows it's not true:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<title></title>
</head><body>
<form id="test" action="test_get_vs_encodeuri.html" method="GET" onsubmit="alert(encodeURIComponent(this.one.value));">
<input name="one" type="text" value="Euro-€">
<input type="submit" value="SUBMIT">
</form>
</body></html>
When I hit the submit button:
encodeURIComponent encodes the input value as "Euro-%E2%82%AC",
while the browser writes just "Euro-%80" into the GET query.
Could someone explain?
How do I encode everything the same way the browser's form does (windows-1252) using JavaScript? Neither the escape function nor encodeURIComponent works.
Or is encodeURIComponent doing unnecessary conversions?
This is a character encoding issue. Your document uses the charset Windows-1252, where the € is at position 128 and is encoded as the single byte 0x80. But encodeURIComponent always encodes to UTF-8, regardless of the document's charset: it uses the Unicode code point of the €, 8364 (U+20AC), which UTF-8 encodes as the three bytes 0xE2 0x82 0xAC.
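As a quick check of those numbers, here is a minimal sketch that uses only standard built-ins. The variable names are made up for illustration. It also shows why the legacy escape function does not help: it emits a non-standard %uXXXX form for characters above 0xFF.

```javascript
// The euro sign written as an escape, so this file's own encoding does not matter.
const euro = '\u20AC';

// Unicode code point of the character (decimal).
const codePoint = euro.codePointAt(0);

// encodeURIComponent always percent-encodes the UTF-8 bytes.
const utf8Escaped = encodeURIComponent(euro);

// The legacy escape() function uses a non-standard %uXXXX form instead.
const legacyEscaped = escape(euro);

console.log(codePoint, utf8Escaped, legacyEscaped);
```

Neither result is the %80 that a Windows-1252 form submission produces.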
A solution would be to use UTF-8 for your document as well. Or you can write a mapping that converts each character of the string to its Windows-1252 byte and percent-encodes that byte.
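Such a mapping could look roughly like the sketch below. The function name encodeCp1252Component and the table name are hypothetical. Two assumptions to be aware of: a real form submission writes spaces as "+" and browsers substitute something for characters that Windows-1252 cannot represent, whereas this sketch leaves ASCII to encodeURIComponent and simply throws on unmappable characters.

```javascript
// Unicode code points that Windows-1252 places in the 0x80-0x9F range.
// (0x81, 0x8D, 0x8F, 0x90 and 0x9D are unassigned in Windows-1252.)
const CP1252_HIGH = {
  0x20AC: 0x80, 0x201A: 0x82, 0x0192: 0x83, 0x201E: 0x84, 0x2026: 0x85,
  0x2020: 0x86, 0x2021: 0x87, 0x02C6: 0x88, 0x2030: 0x89, 0x0160: 0x8A,
  0x2039: 0x8B, 0x0152: 0x8C, 0x017D: 0x8E, 0x2018: 0x91, 0x2019: 0x92,
  0x201C: 0x93, 0x201D: 0x94, 0x2022: 0x95, 0x2013: 0x96, 0x2014: 0x97,
  0x02DC: 0x98, 0x2122: 0x99, 0x0161: 0x9A, 0x203A: 0x9B, 0x0153: 0x9C,
  0x017E: 0x9E, 0x0178: 0x9F
};

// Hypothetical helper: percent-encode a string using Windows-1252 bytes.
function encodeCp1252Component(text) {
  let out = '';
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80) {
      // ASCII is identical in Windows-1252 and UTF-8.
      out += encodeURIComponent(ch);
      continue;
    }
    let byte;
    if (cp >= 0xA0 && cp <= 0xFF) {
      byte = cp;                      // same position as in Latin-1
    } else if (cp in CP1252_HIGH) {
      byte = CP1252_HIGH[cp];         // the 0x80-0x9F specials
    } else {
      throw new RangeError('Not representable in Windows-1252: ' + ch);
    }
    out += '%' + byte.toString(16).toUpperCase();
  }
  return out;
}

console.log(encodeCp1252Component('Euro-\u20AC'));
```

With the form from the question, this could be called in the onsubmit handler as encodeCp1252Component(this.one.value).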
I think the root of the problem is character encodings. If I mess around with charset in the meta tag and save the file with different encodings I can get the page to render in the browser like this:
(source: boogdesign.com)
That € looks a lot like what you're getting from encodeURIComponent. However, I could find no combination of encodings that made any difference to what encodeURIComponent returned, whereas I can change what the GET query returns. This is your original page; submitting gives a URL like:
test-get-vs-encodeuri.html?one=Euro-%80
This is a UTF-8 version of the page; submitting gives a URL that looks like this (in Firefox):
http://www.boogdesign.com/examples/encode/test-get-vs-encodeuri-utf8.html?one=Euro-€
But if I copy and paste it I get:
http://www.boogdesign.com/examples/encode/test-get-vs-encodeuri-utf8.html?one=Euro-%E2%82%AC
So it looks like if the page is UTF-8 then the GET and encodeURIComponent match.
I am designing a web page in Slovak. To be able to use the language's special characters such as á or ž, I am using this HTML code:
<html lang="sk">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
Now it works as expected but only when I hard code that kind of text into html file.
As soon as I use jQuery to print them, it breaks and the characters are not shown correctly.
$("#myDiv").html("áž");
Am I supposed to specify something in jquery or is there another way to overcome this problem?
You can pass the numeric entity for that character to the html() function to achieve that.
Try a sample:
$('body').html('&#926;');
I think you could use a trick here.
Try this:
$("#myDiv").html($("<div>").html("áž").text());
Or simply try this
$("#myDiv").text("áž");
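If the encoding of the script file itself cannot be controlled, another option (a suggestion, not something stated in the answers here) is to write the characters as JavaScript \uXXXX escapes, which are plain ASCII and survive any file encoding. The helper below is a hypothetical sketch that produces such a literal from a string.

```javascript
// Hypothetical helper: turn every non-ASCII UTF-16 code unit into a \uXXXX escape.
function toAsciiLiteral(text) {
  return text.replace(/[^\x00-\x7F]/g, function (ch) {
    return '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0');
  });
}

// Prints the ASCII-only form that can be pasted into a string literal.
console.log(toAsciiLiteral('\u00e1\u017e'));
```

The result can then be used in the jQuery call, for example $("#myDiv").text("\u00e1\u017e");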
It is quite easy. You can use any special character you want:
$("#mydiv").text("*&^&*^*&^*");
For some reason this character is getting generated in my HTML email when it is being sent: –. I have tried replacing it with nothing in my PHP using preg_replace('/–/', '', $var), but that is not working. For some reason when I get an email containing HTML this character shows up. I am guessing it is generated from this JavaScript in my code somehow:
$('.comments0').click(function(){
$('.comments').val($('.comments').val() + 'Our warranties are:\nNew – 1 year\nRemanufactured - 6 months\nRepair - 6 months');
});
If it is not being generated with JavaScript, I am not sure how this character keeps getting created in the middle of my HTML. It gets generated right after New, just like this: New – 1 Year. I have no idea why this character is coming up randomly like this.
By the way, here is the HTML directly related to that JavaScript:
<form action="?AddToQuote" method="POST" id="myForm" name="myForm">
<input type="checkbox" name="comments[0]" class="comments0" id="comments0" /><label>6 Months Warranty</label>
<textarea cols="75" rows="6" name="comments" class="comments" id="comments"><?php if(isset($_SESSION['comments'])) { echo $_SESSION['comments']; } ?></textarea>
</form>
Apparently the characters in your message were copied/pasted from somewhere else. If you delete them and manually retype directly in the JS source that should do the trick.
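To find out which pasted characters are affected, a small check like the following can help. It is only a sketch, and the function name findNonAscii is made up; it reports every character outside ASCII with its position and code point.

```javascript
// Hypothetical helper: list every non-ASCII character in a string.
function findNonAscii(text) {
  const hits = [];
  let index = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp > 0x7F) {
      hits.push({
        index: index,
        char: ch,
        codePoint: 'U+' + cp.toString(16).toUpperCase().padStart(4, '0')
      });
    }
    index += ch.length; // advance by UTF-16 code units
  }
  return hits;
}

// The warranty text from the question, with the dash written as an escape.
const warranty = 'Our warranties are:\nNew \u2013 1 year\nRemanufactured - 6 months';
console.log(findNonAscii(warranty));
```

Any character it reports can be retyped as a plain hyphen or written as an escape such as '\u2013' in the script.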
This is an en dash:
New – 1 year
If you don’t serve the script with the same encoding as it was written in, there will be errors. So make sure it’s saved as UTF-8 and serve it as UTF-8. If the JavaScript is part of your HTML, add this at the top of the <head> (HTML5):
<meta charset="utf-8">
You can test it:
$ echo '–' > test.html
$ firefox test.html
(– shows up in a browser)
Be sure that your text editor / IDE is set to save files as UTF-8 with NO BOM.
Also, be sure you are using <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> in your page <head> and setting your emails up as UTF-8.
I'm using the file jquery.fileuploader.min. When I change a string in the file from English text to Hebrew, it is displayed with the wrong encoding.
I changed to:
text:{uploadButton:"עיין",cancelButton:"Cancel",......
instead of:
text:{uploadButton:"Upload A File",cancelButton:"Cancel",......
Then the HTML button shows "����" instead of "עיין".
I have a meta tag for the encoding in the body:
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
So I don't know why it happens.
As adeneo suggested, I opened the jquery.fileuploader.min file in Notepad and saved it as UTF-8 instead of ANSI.
I have a form on my page which runs a JavaScript function on submit. This function opens a new window with window.open(uri,…)
Since this is a German form, there are umlauts and other characters like ß, ä, ö, ü.
So I'm passing the values of the inputs with escape(input.value) to my uri variable.
In Chrome this works perfectly fine and the passed url looks like this
index.php?PLZ=&Ort=Ha%DFloch
but when I open the site in Firefox it looks like this:
index.php?PLZ=&Ort=Ha�loch
So how can I achieve the right result in both browsers?
I tried nearly everything, from encodeURI to encodeURIComponent etc…
Use the form to set the character encoding to UTF-8:
<form accept-charset="utf-8">
And set it in the head as well:
<head>
<meta charset="utf-8">
</head>
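Once everything is UTF-8, the uri for window.open can be built with encodeURIComponent instead of escape. The following is a minimal sketch; the helper name buildUrl is hypothetical, and it assumes the server side (index.php here) decodes the query string as UTF-8.

```javascript
// Hypothetical helper: build a query string with UTF-8 percent-encoding.
function buildUrl(page, params) {
  const pairs = Object.keys(params).map(function (name) {
    return encodeURIComponent(name) + '=' + encodeURIComponent(params[name]);
  });
  return pairs.length ? page + '?' + pairs.join('&') : page;
}

// 'Ha\u00DFloch' is "Haßloch" written with an escape for the ß.
const uri = buildUrl('index.php', { PLZ: '', Ort: 'Ha\u00DFloch' });
console.log(uri);
```

Because the resulting URL contains only ASCII characters, it should look the same in every browser.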
I have a url like this : http://www.refskou.dk/safari-%F8.html
The file is named like this: safari-ø.html
The file consists of this:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<script>
alert(this.location);
</script>
</head>
<body>
</body>
</html>
But it prints out neither /safari-%F8.html nor safari-ø.html.
It prints a question mark instead, indicating that it does not know the character "ø".
All I want is to print out the URL as I see it in the address bar.
Please give me a hint. As far as I have tested, this is only a problem in Safari.
I should add that I do not have control over which charset the page uses. I can only execute JavaScript :-)
In response to this answer.
The reason for the lack of control is that I am writing a script that can hopefully be included in any web page, so I have no control over which charset is used. The included script can of course have its own charset, set with the charset attribute on the "script" tag, but I cannot get that to work.
unescape('/safari-%F8.html') == '/safari-ø.html'
Note that Safari still gives you a ?, but Chrome shows either a %F8 or ø
In Safari (nevermind):
var str = '/safari-%F8.html';
alert(str.replace(/%[A-F0-9]{2}/gi, function(v){ return String.fromCharCode(parseInt(v.substr(1), 16)); }));
The above works on normal strings, but Safari sees that character as Unicode 65533 (the replacement character), and I'm not sure how to convert that back to Latin-1 248...
Try the unescape javascript function:
alert(unescape(this.location));
I believe you'll need to specify a character set.
The first thing in your Head section...
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
EDIT: I missed the part where the OP states he has no control over the character set on the page. I believe this is the root of the problem and wonder why he has no control over this.
Well I finally got it working. For some reason Safari cannot understand the strange characters when asking from this/window.location. But moving down a level to the document object and asking for the URL gives me just what I need. Why this is, I cannot tell you, but it solves the problem.
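For anyone who needs the readable form of such a URL, here is a sketch of a tolerant decoder. It assumes that "asking for the URL" above means reading document.URL, and the helper name decodePathLenient is hypothetical. decodeURIComponent throws a URIError on %F8 because that byte is not valid UTF-8, so the helper falls back to treating each escape as a Latin-1 byte.

```javascript
// Hypothetical helper: decode percent-escapes as UTF-8, falling back to Latin-1.
function decodePathLenient(path) {
  try {
    return decodeURIComponent(path);
  } catch (err) {
    if (!(err instanceof URIError)) throw err;
    // Not valid UTF-8: interpret each %XX as a single Latin-1 character.
    return path.replace(/%([0-9A-Fa-f]{2})/g, function (match, hex) {
      return String.fromCharCode(parseInt(hex, 16));
    });
  }
}

console.log(decodePathLenient('/safari-%F8.html'));
```

In a page, it could be applied as decodePathLenient(document.URL).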