I'm trying to translate the text in a JavaScript file, originally written in English, into Japanese. When I just change the strings to Japanese characters using copy & paste, the text in Pavlovia is displayed as '?????'. From what I could gather from other posts, this has something to do with the fact that Japanese characters are encoded in UTF-8, which is not the standard encoding used in my JavaScript file. I have tried saving the whole file with UTF-8 encoding, but Pavlovia seemingly cannot read it. So I assume I need some local UTF-8 encoding of the individual Japanese text bits.
Unfortunately I have zero JavaScript experience, so any help with this would be greatly appreciated.
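One workaround that avoids the file-encoding question entirely is to write the Japanese strings as Unicode escapes, which are plain ASCII. A minimal sketch with an illustrative string (the variable names are not taken from any generated PsychoPy code):
// Pasting the characters directly only displays correctly if the file is read as UTF-8:
var direct = 'ようこそ';
// Unicode escapes survive any file encoding:
var escaped = '\u3088\u3046\u3053\u305D';
console.log(direct === escaped); // true when this file itself is saved and read as UTF-8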
Related
I am trying to load a SQL table from a text file using the Text File Input step. All columns contain English text except one (Comment), which may contain Japanese characters. How can I pass both the English text and the Japanese text through as readable characters?
Is there a regex or JavaScript code that will pass all of these characters, or some other step that Pentaho offers? I would prefer not to alter the server if possible.
I am at a loss.
Thanks in advance.
I found that it is as simple as setting the encoding to Shift_JIS.
I'm working on something that will read a user's text messages and export them to a CSV file, which they can then download. The messages are being retrieved from a third-party web interface; I am essentially using JS to grab the HTML of each message and compiling it as needed. The content of each message is added to a variable which, once all messages are gathered, is given to a new Blob, which is then downloaded.
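For context, the export step itself can look roughly like this; a sketch assuming the gathered messages already sit in a string (csvContent and the filename are illustrative):
// Sketch: wrap the gathered text in a Blob and trigger a download.
var csvContent = 'sender,message\nAlice,Blah blah blah';   // illustrative data
var blob = new Blob([csvContent], { type: 'text/csv' });
var link = document.createElement('a');
link.href = URL.createObjectURL(blob);
link.download = 'messages.csv';
document.body.appendChild(link);
link.click();
document.body.removeChild(link);
URL.revokeObjectURL(link.href);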
The problem I am having is that, in this web interface, emoji are represented as images rather than characters. Thus, when a message containing an emoji is written to the file, the result looks like this:
"Blah blah blah <img height="18px" width="18px" class="emoji adjustedSpriteForMessageDisplay spriteEMOJI sprite-1f612" data-textvalue="%F0%9F%98%92" src="assets/blank.gif">"
Now, from this image, we can get 2 workable values:
The UTF-8 hex value
F09F9892
and the Unicode codepoint (I may be referring to this wrong, I don't know much about encoding).
U+1f612
Now, what I want to do is take either of these values (whichever works better) and write it to the CSV file as the character itself, so that, when viewing the CSV file in a text editor or what have you, it would appear as 😒.
Though I have no idea where to even start with this. Maybe it's as simple as wrapping some syntax around the character values, but I haven't been able to get anywhere with Google, because I'm not familiar enough with encoding to know what to search for.
I suggest preprocessing the data as you grab it from the webpage instead of extracting it from the string afterwards.
You can then use decodeURIComponent() to decode the percent-encoded string:
decodeURIComponent('%F0%9F%98%92')
Combine that with jQuery to access the data-textvalue attribute:
decodeURIComponent($(element).data('textvalue'))
I created a simple example on JSFiddle.
For some reason the emoji doesn't render correctly in the result screen in my browser, but that is a font issue. When looking at the result using a DOM inspector (or copying the text into a different application), the result is shown with a smiley.
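Putting those pieces together, a rough sketch of the preprocessing; the class and data-textvalue attribute come from the markup shown in the question, while the function name and the use of a clone are just illustrative choices:
// Replace each emoji <img> with its decoded character before reading the message text.
function messageText(messageElement) {
  var $clone = $(messageElement).clone();
  $clone.find('img.emoji').each(function () {
    var encoded = $(this).data('textvalue');   // e.g. "%F0%9F%98%92"
    $(this).replaceWith(document.createTextNode(decodeURIComponent(encoded)));
  });
  return $clone.text();
}
// The code point from the sprite class name (sprite-1f612) would work too:
// String.fromCodePoint(0x1F612) === '😒'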
The CSV file format carries no character-encoding information, so Excel usually assumes a legacy ANSI code page (e.g. Windows-1252) rather than UTF-8.
https://en.wikipedia.org/wiki/Comma-separated_values#General_functionality
Microsoft Excel mangles Diacritics in .csv files?
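A common workaround (and what the linked question largely discusses) is to prepend a UTF-8 byte-order mark so Excel detects the encoding; a sketch on top of the Blob approach above, with csvContent again illustrative:
// '\uFEFF' is the byte-order mark; Excel uses it to recognise the file as UTF-8.
var csvContent = 'sender,message\nAlice,😒';
var blob = new Blob(['\uFEFF' + csvContent], { type: 'text/csv;charset=utf-8' });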
I'm trying to use ☰ in an external JavaScript file:
$('<div />',{
text: '☰',
......
But I can't save the file; the editor says:
The document's current encoding can not correctly save all of the characters within the document. You may want to change to UTF-8 or an encoding that supports the special characters in this document.
What should I do?
You should convert the file to UTF-8 and then paste the character in again, once the file has been converted and saved.
Your file could be in one of many, many encodings, depending on your editor, but if you're just using a text editor like Notepad, anything that doesn't fit happily into ASCII is going to cause you problems.
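Alternatively, the character can be written as a Unicode escape so the file stays plain ASCII no matter how the editor saves it; ☰ is U+2630, so the snippet above becomes roughly:
$('<div />', {
  text: '\u2630'   // ☰ written as an escape, so no special file encoding is needed
});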
I'm trying to convert Chinese characters in an XML file to a readable Chinese string using JavaScript, but I'm not sure how. I have checked other SO posts and tried the following:
unescape(encodeURIComponent('&#19992;'))
but still can't get it to work, and wondering if someone could help?
<utf8>&#19992;</utf8>
Neither unescape nor encodeURIComponent (which deal with percent-encoding) will help you with an XML character entity. You just want to parse the XML file! Accessing the DOM will then yield the expected string.
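A minimal sketch of that in the browser, assuming the XML is available as a string (the variable names are illustrative):
// Parsing the XML decodes the numeric character reference automatically.
var xml = '<utf8>&#19992;</utf8>';
var doc = new DOMParser().parseFromString(xml, 'application/xml');
console.log(doc.documentElement.textContent); // 丘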
I would like to change string encoding from UTF-8 to ISO-8859-2 in Javascript. How can I do it?
I need it because I've designed a widget. The user just copies a <script> tag from my site and puts it on his own site. This script creates a div and puts the widget's content, including its text, into that div.
If the target website uses UTF-8 encoding, it works fine. But when it uses ISO-8859-2, the text that is encoded in UTF-8 is displayed as ISO-8859-2, and as a result I see garbage.
Instead of using e.g. "ĉ" in your JavaScript code, use Unicode escapes such as "\u0109".
If you're in control of the output, you can replace all special characters with Unicode escapes (e.g. \u00e4 for ä). The browser will interpret them correctly regardless of the document encoding.
The easiest way to do this would be to run the string through a JSON encoder. Both PHP's and Ruby's do that. I don't know about other implementations, though.
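If no JSON encoder is at hand, a rough sketch of generating such escapes in JavaScript itself (it escapes every character outside printable ASCII; characters above U+FFFF come out as two escapes, which is still a valid string literal):
// Replace every character outside printable ASCII with a \uXXXX escape.
function toUnicodeEscapes(str) {
  return str.replace(/[^\x20-\x7E]/g, function (ch) {
    return '\\u' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
  });
}
console.log(toUnicodeEscapes('ä')); // \u00E4
console.log(toUnicodeEscapes('ĉ')); // \u0109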
Another solution that might work is to add charset="utf-8" to the <script> tag.
I suppose you just need to convert your widget from UTF-8 to ISO-8859-2 and provide two versions of the script.