I have an alert box which I want to show some Icelandic text, but it's not showing it correctly
<script>
function check() {
alert("Þú verður að vera skráð/ur inn til þess að senda skilaboð");
}
</script>
it is showing the alert box but the text is messed up :(
Þú verður að vera skráð/ur inn til þess að senda skilaboð
any help please :(
Today the web uses many international languages and has largely settled on UTF-8 (an encoding of Unicode) for character encoding. This is important.
You are using ISO-8859-1, which is closely related to windows-1252, the MS Windows character set. If you have Word 2007 or 2010 you have the option of re-saving your text as UTF-8. If you've ever seen ????? or � instead of text on someone's web site, it's usually due to the wrong encoding type.
<meta http-equiv="Content-type" content="text/html; charset=UTF-8"/>
Always use UTF-8 end-to-end. Do not use ISO-8859-1 or windows-1252 encoding.
See:
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
ISO-8859-1 vs UTF-8?
Character encodings and the beauty of UTF-8
Let's do it in html5 manner :)
<!doctype html>
<html>
<head>
<meta charset = "utf-8" />
</head>
encoding hazards my best guesses:
check if...
the js-file is stored properly encoded (UTF8) on your server
the server delivers the JS file with proper encoding header (HTTP/1.1 about encoding)
... @Diodeus is right
What you see displayed by the alert is UTF-8 encoded text misinterpreted as windows-1252 encoded. (Windows-1252 is a Microsoft extension to ISO-8859-1.)
If your pages are ISO-8859-1 encoded, as they apparently are, then this applies to the script element content too. There is something odd going on if the code you posted does not work. Are you sure the element is really inside a normal page of yours where Icelandic characters work OK? You should not try to fix the situation with a shot in the dark, like changing encodings without knowing what is going on.
I’m just making a guess: the alert() invocation is really in an external .js file, which is UTF-8 encoded but treated by browsers as windows-1252 encoded. Then there are two alternative fixes: 1) open that file in an editor and save it as windows-1252 or ISO-8859-1 encoded; or 2) modify server settings to declare UTF-8 for .js files or (less reliably) add a charset=utf-8 attribute to the script element.
Alternatively, if the alert() invocation is really inside a script element in an HTML file, then perhaps this file is really UTF-8 encoded but you don’t observe other problems because the content of the file does not otherwise contain Icelandic characters. In this case, it is best to open the HTML file in your authoring program and change its encoding to windows-1252 or ISO-8859-1.
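The misinterpretation described above can be reproduced in a few lines. This is only a sketch: the function name misreadAsWindows1252 is made up for illustration, and it assumes a runtime where TextDecoder accepts the "windows-1252" label (current browsers, or a Node.js build with full ICU data).

```javascript
// Sketch: encode text as UTF-8, then decode the same bytes as
// windows-1252, which is what a browser does when it guesses wrong.
// misreadAsWindows1252 is a hypothetical helper name.
function misreadAsWindows1252(text) {
  const utf8Bytes = new TextEncoder().encode(text); // TextEncoder always emits UTF-8
  return new TextDecoder("windows-1252").decode(utf8Bytes);
}

// "\u00FA" is "ú"; escapes keep this snippet independent of how the
// file itself is saved.
console.log(misreadAsWindows1252("\u00FA")); // two characters: "\u00C3\u00BA"
console.log(misreadAsWindows1252("abc"));    // ASCII is unaffected: "abc"
```

Each non-ASCII character turns into two characters, the first usually being Ã, which matches the garbled alert text in the question.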
I've done some research and it turns out that to encode special characters we use encodeURI (or encodeURIComponent) and decodeURI.
However when I try do something like:
var my_special_char = 'ñ';
my_div.innerHTML = decodeURI(encodeURI(my_special_char))
A "question mark" is printed.
I found this (non-complete) table about special characters: http://www.javascripter.net/faq/accentedcharacters.htm
Effectively when I do
decodeURI("%C3%B1"); // ñ
it prints ñ.
But if I try with:
decodeURI(encodeURI('ñ'))
I still get a "question mark".
How does character encoding work in JS? And where can I find a really comprehensive list of special characters in encodeURI format (ready out-of-the-box to be decoded via decodeURI)?
EDIT:
in my (the application is an AngularJS application) I have meta charset=utf-8 (written in the right HTML syntax as proposed in the answer, it actually comes from AngularJS' starter project)
I'm using WebStorm IDE: I checked the settings and the encoding used is UTF-8
I'm serving the page locally in Apache (XAMPP)
EDIT 2:
as advised in the answers, I created a .htaccess file in /htdocs whose content is:
AddDefaultCharset UTF-8
as well as renaming both index.html and the view's file by adding .utf8 before .html file extension.
then I restarted Apache (from XAMPP console).
But the issue is not gone. Any clue?
EDIT 3: I finally even tried to open the file in Sublime Text 3 and save as UTF-8 file, nothing changes
You don't have to do any special encoding in your JS strings (apart from the special case of strings which may be seen as closing the script element).
If your JS file encoding matches the HTTP header (most commonly UTF-8), it's decoded correctly if you just do
var my_special_char = 'ñ';
my_div.innerHTML = my_special_char;
To help the browser, and assuming you're correctly serving the files with the relevant HTTP header (the way it's set up highly depends on your server), you should have this meta tag in your HTML header:
<meta charset='utf-8'>
If your script is in a separate file, you should also declare the encoding in the script element:
<script charset="UTF-8" src="yourFile.js"></script>
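A quick way to convince yourself that encodeURI and decodeURI are not the culprit is to write the character as a \u escape, so the encoding of the source file cannot interfere. The variable names below are only illustrative.

```javascript
// Sketch: round-trip a non-ASCII character through encodeURI/decodeURI.
// "\u00F1" is "ñ", written as an escape so the file's own encoding
// plays no role.
const original = "\u00F1";
const encoded = encodeURI(original);  // percent-encoded UTF-8 bytes
const decoded = decodeURI(encoded);

console.log(encoded);                 // "%C3%B1"
console.log(decoded === original);    // true
```

If this prints true while a literal 'ñ' in your own file still shows a question mark, the file is most likely being read in a different encoding than the one it was saved in.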
You should add <meta charset="utf-8" /> inside your head tag. In this way the browser knows which charset to use and no more question marks will appear :)
In classic Notepad this is solved by choosing
File > Save As > Encoding dropdown menu > UTF-8
In Notepad++, choose
Encoding > Encode in UTF-8
or add the charset attribute to the meta tag, charset='utf-8':
<meta charset='utf-8'>
In an external javascript file I have a function that is used to append text to table cells (within the HTML doc that the javascript file is added to), text that can sometimes have Finnish characters (such as ä). That text is passed as an argument to my function:
content += addTableField(XML, 'Käyttötarkoitus', 'purpose', 255);
The problem is that diacritics such as "ä" get converted to some other bogus characters, such as "�". I see this when viewing the HTML doc in a browser. This is obviously not desirable, and is quite strange as well since the character encoding for the HTML doc is UTF-8.
How can I solve this problem?
Thanks in advance for helping out!
The file that contains content += addTableField(XML, 'Käyttötarkoitus', 'purpose', 255); is not saved in UTF-8 encoding.
I don't know what editor you are using but you can find it in settings or in the save dialog.
If you can't get this to work you could always write out the literal code points in javascript:
content += addTableField(XML, 'K\u00E4ytt\u00f6tarkoitus', 'purpose', 255);
credit: triplee
To check out the character encoding announced by a server, you can use Firebug (in the Info menu, there’s a command for viewing HTTP headers). Alternatively, you can use online services like Web-Sniffer.
If the headers for the external .js file specify a charset parameter, you need to use that encoding, unless you can change the relevant server settings (perhaps a .htaccess file).
If they lack a charset parameter, you can specify the encoding in the script element, e.g. <script src="foo.js" charset="utf-8">.
The declared encoding should of course match the actual encoding, which you can normally select when you save a file (using “Save As” command if needed).
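Once you have the Content-Type header value for the .js file, the charset parameter can be pulled out with a small helper. This is a minimal sketch; charsetFromContentType is a hypothetical name, and the regular expression only covers the common `type/subtype; charset=value` shape.

```javascript
// Sketch: extract the charset parameter from a Content-Type header value.
// Returns the charset in lower case, or null if none is declared.
// charsetFromContentType is a hypothetical helper name.
function charsetFromContentType(headerValue) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(headerValue || "");
  return match ? match[1].toLowerCase() : null;
}

console.log(charsetFromContentType("text/plain;charset=iso-8859-1")); // "iso-8859-1"
console.log(charsetFromContentType("application/javascript"));        // null
```

A null result corresponds to the case where the charset attribute on the script element (or the page's own encoding) decides how the file is read.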
The character encoding of the HTML file / doc does not matter for any external resource.
You will need to deliver the script file with UTF-8 character encoding. If it was saved as such, your server config is bogus.
I'm trying to display the pound symbol in HTML (from PHP) but all I get is a symbol with a question mark.
The following are things that I've tried.
In PHP:
header('Content-type: text/html; charset=utf-8');
In HTML, put this in the head tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
I tried displaying it using a javascript function which converts it to:
�
I suppose it would help if I knew what I was doing... but I guess that's why I'm asking this question :)
Educated guess: You have an ISO-8859-1 encoded pound sign in a UTF-8 encoded page.
Make sure your data is in the right encoding and everything will work fine.
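To see why that guess would produce a question-mark symbol, here is a small sketch that decodes both byte sequences as UTF-8. The helper name decodeAsUtf8 is made up for illustration.

```javascript
// Sketch: the pound sign is the single byte 0xA3 in ISO-8859-1, but the
// two bytes 0xC2 0xA3 in UTF-8. decodeAsUtf8 is a hypothetical helper.
function decodeAsUtf8(byteArray) {
  return new TextDecoder("utf-8").decode(new Uint8Array(byteArray));
}

// A lone 0xA3 is not valid UTF-8, so the decoder substitutes U+FFFD,
// the replacement character that is drawn as a question-mark symbol.
console.log(decodeAsUtf8([0xA3]) === "\uFFFD");       // true
// The proper UTF-8 sequence decodes to the pound sign U+00A3.
console.log(decodeAsUtf8([0xC2, 0xA3]) === "\u00A3"); // true
```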
Use &pound;. I had the same problem and solved it using jQuery:
$(this).text('£');
If you try this and it does not work, just change the jQuery method:
$(this).html('&pound;');
This always works in all contexts...
1st: the pound symbol is a "special" char in UTF-8 encoding (try saving £$ in an ISO-8859-1 (or ISO-8859-15) file and you will get ä when the encoding is set using the header)
2nd: change the file's encoding to UTF-8.
There are plenty of ways to do it.
Notepad and Notepad++ are great suggestions.
3rd: use ob_start(); (in PHP) BEFORE YOU MAKE ANY OUTPUT if you are getting weird encoding errors, such as the encoding sometimes going missing.
And YES, this solves it!
This kind of error occurs when a page is encoded in windows-1252 (ANSI), ASCII, or ISO-8859-1(5) while everything else is in UTF-8.
This is a terrible error and can cause weird things like session_start(); not working.
4th: other php solutions:
utf8_encode('£');
htmlentities('£');
echo '£';
5th: javascript solutions:
document.getElementById('id_goes_here').innerText.replace('£','£');
document.getElementById('id_goes_here').innerText.replace('£',"\u00A3");
$(this).html().replace('£','£'); //jquery
$(this).html().replace('£',"\u00A3"); //jquery
String.fromCharCode('163');
you MUST send £, so it will repair the broken encoded code point.
please, avoid these solutions!
use php!
these solutions only show how to 'fix' the error, and the last one only to create the well-encoded char.
Have you tried displaying a &pound; ?
Here is an overwhelming list.
You could try using &pound; or &#163; instead of embedding the character directly; if you embed it directly, you're more likely to run into encoding issues in which your editor saves the file as ISO-8859-1 but it's interpreted as UTF-8, or vice versa.
If you want to embed it (or other Unicode characters) directly, make sure you actually save your file as UTF-8, and set the encoding as you did with the Content-Type header. Make sure when you get the file from the server that the header is present and correct, and that the file hasn't been transcoded by the web server.
Or for other code equivalents try:
&#163;
&#xA3;
You need to save your PHP script file in UTF-8 encoding, and leave the <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> in the HTML.
For text editor, I recommend Notepad++, because it can detect and display the actual encoding of the file (in the lower right corner of the editor), and you can convert it as well.
This works in all chrome, IE, Firefox.
In Database > table > field type, for example, set the symbol column to varchar(2) utf8_bin
php code:
$symbol = '&pound;';
echo mb_convert_encoding($symbol, 'UTF-8', 'HTML-ENTITIES');
or
html_entity_decode($symbol, ENT_NOQUOTES, 'UTF-8');
Also make sure to set the HTML or XML encoding to encoding="UTF-8"
Note: You should make sure that the database, document type and PHP code all use the same encoding
However, the better solution would be using &pound;
I would like to change string encoding from UTF-8 to ISO-8859-2 in Javascript. How can I do it?
I need it because I've designed a widget. The user just copies a <script> tag from my site and puts it on theirs. This script creates a div and puts the widget contents with text into it.
If the target website is in UTF-8 encoding, it works fine. But when it is in ISO-8859-2, then text that is encoded in UTF-8 is displayed on a site with ISO-8859-2, and as a result I see trash.
Instead of using e.g. "ĉ" in your JavaScript code, use Unicode escapes such as "\u0109".
If you're in control of the output, you can replace all special characters with unicode escapes (e.g. \u00e4 for ä). The browser can interpret it correctly regardless of document encoding.
The easiest way to do this would be to put the string into a JSON encoder. Both PHP's and Ruby's do that. I don't know about other implementations though.
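If the widget text is generated with JavaScript (for example in a Node.js build step, which is an assumption here), note that JSON.stringify leaves most non-ASCII characters as they are, so the escaping has to be added on top. A minimal sketch, with toAsciiJson as a made-up helper name:

```javascript
// Sketch: serialise a value as JSON and replace every non-ASCII UTF-16
// code unit with a \uXXXX escape, so the output is pure ASCII and reads
// the same under UTF-8, ISO-8859-2 or windows-1252.
// toAsciiJson is a hypothetical helper name.
function toAsciiJson(value) {
  return JSON.stringify(value).replace(/[\u0080-\uffff]/g, (ch) =>
    "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

const literal = toAsciiJson("K\u00e4ytt\u00f6tarkoitus");
console.log(literal);                   // "K\u00e4ytt\u00f6tarkoitus" (with quotes, escapes kept)
console.log(JSON.parse(literal));       // the original string again
```

The resulting literal can be pasted into the script as-is, since JavaScript string literals understand the same \uXXXX escapes.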
Another solution that might work is to add charset="utf-8" to the <script> tag.
I suppose you just need to convert your widget from UTF-8 to ISO-8859-2 and provide two versions of the script.
I'm running into a character encoding issue when I load a dropdown using jQuery from an external js file. This only seems to happen when the JavaScript object is not within the page.
For example the below is the JavaScript object.
var langs = [
{value:'zh-CN', text:'中文 (简体) Chinese Simplified'},
{value:'en', text:'English'},
{value:'eo', text:'EsperAnt'},
{value:'es', text:'Español'},
{value:'ja', text:'日本語 (Japanese)'},
{value:'pt-PT', text:'Português'},
{value:'ru', text:'Русский (Russian)'},
];
If this is in my page with the proper meta tags <meta http-equiv="content-type" content="text/html; charset=utf-8" /> the below code works.
$(document).ready(function() {
  // Fill language select
  $.each(langs, function(i, j){
    $('#LangSelect').append($("<option></option>").attr("value",j.value).text(j.text));
  });
});
But, since I need languages on more than one page, I've moved the langs object to an external js file and reference it. After doing this, I run into encoding issues such as Russian characters becoming РуÑÑкий (Russian).
These encoding issues still appear even when the reference to the external js file is set as below:
<script type="text/javascript" charset="UTF-8" src="externalJS.js"></script>
Is there any way to force the JavaScript object to be loaded with the proper encoding from an external file?
Please note I am experiencing these issues when viewing content on the iPhone Mobile Safari browser. Additionally these pages are simply html and JavaScript without any server side components.
Thanks in advance,
Ben
Is there any way to force the JavaScript object to be loaded with the proper encoding from an external file?
Yes, the script charset attribute as you quoted. However it historically didn't work everywhere and was best not relied on. Where this is not supported, the browser will always use the charset of the main page as the charset in the script. So as long as you include the UTF-8 charset parameter in the main page you should be fine either way.
I am surprised if a modern browser like Mobile Safari doesn't understand it, though.
Is it possible your server might be serving .js files with a bad Content-Type header containing a wrong charset? A combination of unset mime-types for JS plus AddDefaultCharset in Apache could leave you with:
Content-Type: text/plain;charset=iso-8859-1
Which might maybe have the effect of mucking it up.
Make sure you save the javascript file using UTF-8 encoding. If you open the file in Notepad++, then you can click Format>Encode in UTF-8 (If you try Format>Convert to UTF-8, then have a look at the page using a hex editor. Sometimes you end up with some strange characters at the beginning of the file).
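The strange characters at the start of the file are probably a UTF-8 byte order mark (the bytes EF BB BF), which shows up as ï»¿ when read as windows-1252. If you can get at the raw bytes of the file, a check could look like the sketch below; hasUtf8Bom is a hypothetical helper name, and how you obtain the bytes (hex editor, Node.js fs.readFileSync, and so on) is left open.

```javascript
// Sketch: detect a UTF-8 byte order mark at the start of a byte array.
// hasUtf8Bom is a hypothetical helper name.
function hasUtf8Bom(bytes) {
  return bytes.length >= 3 &&
    bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF;
}

// "var" preceded by a BOM, and the same text without one.
console.log(hasUtf8Bom(new Uint8Array([0xEF, 0xBB, 0xBF, 0x76, 0x61, 0x72]))); // true
console.log(hasUtf8Bom(new Uint8Array([0x76, 0x61, 0x72])));                   // false
```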