How to convert non-Latin numerals to their Latin counterparts - javascript

I have a system in a different language, implemented in Unicode. The requirement is that the system must also accept Unicode characters for digits and process them accordingly. Is it possible to convert any Unicode character that represents a number to its sensible English (Latin) digit equivalent?
How can I implement that in JavaScript?
EDIT: I searched the web and found a chart on unicode.org. It lists the codes corresponding to the characters I want. Now, how do I read the code of each character from the input Unicode string?

The Unicode Character Database file UnicodeData.txt holds, in fields 6 to 8 (counting from 0), the decimal digit value, the digit value, and the numeric value of each character (for example, U+216E ROMAN NUMERAL FIVE HUNDRED has a numeric value of 500).
To use this in JavaScript, you might parse that file with some other language, dump the information you need as JSON or similar, and then just look up the value in that JSON from JavaScript.
Documentation of the Unicode Database file format
Either you dump the Unicode code points into your JSON as escapes like "\u20ac" for U+20AC, in which case you can just compare the characters directly, or you use someString.charCodeAt(somePosition).toString(16) to convert a character to a hex string (like 20ac) and compare from there. Note that charCodeAt returns a UTF-16 code unit; for characters above U+FFFF, use codePointAt instead.
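As a rough sketch of the lookup idea above: the small ZEROS table below is a hand-written stand-in for the data you would dump from the Unicode Database, and it covers only a few scripts. The names ZEROS and toLatinDigits are made up for this example. It relies on each of these scripts encoding its digits 0 to 9 as ten consecutive code points.

```javascript
// Code points of the digit ZERO in a few scripts (illustrative subset only;
// a real table would be generated from the Unicode Database).
const ZEROS = [
  0x0660, // Arabic-Indic
  0x06f0, // Extended Arabic-Indic (Persian, Urdu)
  0x0966, // Devanagari
  0x09e6, // Bengali
  0x0e50, // Thai
  0xff10, // Fullwidth
];

// Replace every digit from the scripts above with its ASCII counterpart.
function toLatinDigits(input) {
  let out = "";
  for (const ch of input) {            // iterates by code point
    const cp = ch.codePointAt(0);
    const zero = ZEROS.find((z) => cp >= z && cp <= z + 9);
    out += zero === undefined ? ch : String(cp - zero);
  }
  return out;
}

console.log(toLatinDigits("\u0661\u0662\u0663")); // "123"
```

Characters such as Roman numerals have a numeric value but are not decimal digits, so they would need the numeric-value field rather than this digit table.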

Related

encode ASCII symbols into UTF-8 presentation

I have a string that I know for sure contains only ASCII letters.
JS treats strings as UTF-8 by default,
so every character takes up to 4 bytes,
which is 4 times the size of ASCII.
I'm trying to compress the string (save space, get the shortest string possible)
by writing encode and decode functions.
I thought about packing 4 ASCII characters into each character of a UTF-8 string to achieve that. Is there anything like that?
If not, what is the best way to compress ASCII strings so that encoding and then decoding gives back the same string?
Actually, JavaScript encodes program strings in UTF-16, which uses 2 octets (16 bits) for Unicode characters in the BMP (Basic Multilingual Plane) and 4 octets (32 bits) for characters outside it. So internally, at least, ASCII characters use 2 bytes each.
There is room to pack two ASCII characters into 16 bits, since they use only 7 bits each. Two packed characters need 14 bits, so the packed values stay below 2**14 = 16384, which leaves 2**16 - 2**14 = 49152 code units unused. The 2048 code units that UTF-16 reserves for surrogates (0xD800 to 0xDFFF) fall in that unused part, so you should be able to devise an encoding scheme that avoids them.
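A minimal sketch of such a scheme, under the assumption that the input really is 7-bit ASCII. The function names pack and unpack, and the choice of 0x4000 as the marker for a leftover single character, are invented for this example. All produced code units stay below 0x4080, well clear of the surrogate range.

```javascript
// Pack two 7-bit ASCII characters into one UTF-16 code unit.
function pack(ascii) {
  let out = "";
  for (let i = 0; i < ascii.length; i += 2) {
    const a = ascii.charCodeAt(i);
    const hasPair = i + 1 < ascii.length;
    const b = hasPair ? ascii.charCodeAt(i + 1) : 0;
    if (a > 0x7f || b > 0x7f) throw new RangeError("input is not ASCII");
    // Pairs map to 0x0000..0x3FFF; a trailing single character maps to 0x4000..0x407F.
    out += String.fromCharCode(hasPair ? (a << 7) | b : 0x4000 + a);
  }
  return out;
}

// Reverse of pack().
function unpack(packed) {
  let out = "";
  for (let i = 0; i < packed.length; i++) {
    const u = packed.charCodeAt(i);
    out += u >= 0x4000
      ? String.fromCharCode(u - 0x4000)
      : String.fromCharCode(u >> 7, u & 0x7f);
  }
  return out;
}

console.log(pack("hello").length);  // 3
console.log(unpack(pack("hello"))); // "hello"
```

This halves the number of code units, but the packed string is no longer readable text, so it only helps where string length itself is the constraint.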
You could also use 8 bit typed arrays to hold ASCII characters while avoiding the complexity of a custom compression algorithm.
The purpose of compressing 7 bit ASCII for use within JavaScript is largely (entirely?) academic these days and not something there is a demand for. Note that encoding 7 bit ASCII content into UTF-8 (for transmission or file encoding) only uses one byte for ASCII characters due to the design of UTF-8.
If you want to use 1 byte per character, you can simply store the text as bytes, for example in a Uint8Array. There are already built-in functions to convert bytes back to a string.
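One way to do this, sketched with the standard TextEncoder and TextDecoder classes (available in browsers and in current Node.js). The wrapper names asciiToBytes and bytesToAscii are invented for this example. TextEncoder always produces UTF-8, in which every ASCII character takes exactly one byte.

```javascript
// Encode a string as UTF-8 bytes; ASCII input yields one byte per character.
function asciiToBytes(text) {
  return new TextEncoder().encode(text); // Uint8Array
}

// Decode UTF-8 bytes back into a JavaScript string.
function bytesToAscii(bytes) {
  return new TextDecoder().decode(bytes);
}

const bytes = asciiToBytes("hello");
console.log(bytes.length);        // 5
console.log(bytesToAscii(bytes)); // "hello"
```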

How to convert a string of hex values to ASCII

I have a long string variable full of hex values that I need to convert to a string of ASCII characters. How do I do it?
JavaScript code
var _0x4697=["\x5A\x20\x31\x36\x28\x61\x2C\x73\x29\x7B\x4F\x20\x70\x3D\x31\x31\x2E\x31\x34\x28\x61\x29\x3B\x67\x3D\x22\x22\x3B\x43\x3D\x22\x22\x3B\x68\x3D\x22\x22\x3B\x6C\x3D\x2D\x31\x3B\x64\x3D\x70\x2E\x58\x28\x22\x64\x22\x29\x3B\x4A\x3D\x70\x2E\x58\x28\x22\x4D\x22\x29\x3B\x31\x76\x28\x4F\x20\x69\x3D\x30\x3B\x69\x3C\x4A\x2E\x44\x3B\x69\x2B\x2B\x29\x7B\x68\x3D\x4A\x5B\x69\x5D\x2E\x71\x3B\x39\x28\x68\x2E\x4C\x28\x22\x2F\x2F\x6F\x2E\x31\x75\x2E\x6D\x2F\x31\x30\x2F\x22\x29\x21\x3D\x2D\x31\x29\x7B\x6C\x3D\x69\x3B\x42\x7D\x6E\x20\x39\x28\x68\x2E\x4C\x28\x22\x2F\x2F\x31\x71\x2E\x31\x6F\x2E\x6D\x2F\x46\x2F\x22\x29\x21\x3D\x2D\x31\x29\x7B\x6C\x3D\x69\x3B\x42\x7D\x6E\x20\x39\x28\x68\x2E\x4C\x28\x22\x2F\x2F\x6F\x2E\x31\x64\x2E\x6D\x2F\x31\x30\x2F\x46\x2F\x22\x29\x21\x3D\x2D\x31\x29\x7B\x6C\x3D\x69\x3B\x42\x7D\x7D\x39\x28\x6C\x21\x3D\x2D\x31\x29\x7B\x43\x3D\x27\x3C\x34\x20\x32\x3D\x22\x35\x2D\x46\x20\x35\x2D\x72\x22\x3E\x3C\x4D\x20\x20\x31\x33\x3D\x22\x31\x63\x22\x20\x71\x3D\x22\x27\x2B\x68\x2B\x27\x3F\x31\x62\x3D\x31\x61\x26\x31\x39\x3D\x30\x22\x20\x31\x38\x3D\x22\x30\x22\x20\x31\x37\x3E\x3C\x2F\x4D\x3E\x3C\x2F\x34\x3E\x27\x3B\x70\x2E\x38\x3D\x43\x2B\x27\x3C\x34\x20\x32\x3D\x22\x35\x2D\x31\x6B\x20\x35\x2D\x72\x22\x3E\x3C\x33\x20\x32\x3D\x22\x48\x22\x3E\x27\x2B\x73\x2B\x27\x3C\x2F\x33\x3E\x3C\x47\x20\x32\x3D\x22\x66\x2D\x49\x22\x3E\x3C\x62\x20\x32\x3D\x22\x66\x2D\x4B\x22\x3E\x3C\x61\x20\x36\x3D\x22\x27\x2B\x79\x2B\x27\x22\x3E\x27\x2B\x78\x2B\x27\x3C\x2F\x61\x3E\x3C\x2F\x62\x3E\x3C\x33\x20\x32\x3D\x22\x76\x2D\x77\x22\x3E\x20\x27\x2B\x74\x2B\x27\x3C\x2F\x33\x3E\x3C\x34\x20\x32\x3D\x22\x35\x2D\x50\x20\x51\x22\x3E\x27\x2B\x52\x28\x70\x2E\x38\x2C\x53\x29\x2B\x27\x3C\x41\x2F\x3E\x3C\x33\x20\x32\x3D\x22\x55\x2D\x56\x22\x3E\x3C\x6A\x20\x32\x3D\x22\x35\x2D\x45\x22\x3E\x3C\x6B\x20\x32\x3D\x22\x37\x2D\x54\x22\x3E\x3C\x61\x20\x36\x3D\x22\x27\x2B\x79\x2B\x27\x23\x37\x22\x3E\x3C\x69\x20\x32\x3D\x22\x63\x20\x63\x2D\x37\x22\x3E\x3C\x2F\x69\x3E\x20\x27\x2B\x75\x2B\x27\x3C\x2F\x61\x3E\x20\x3C\x
2F\x6B\x3E\x3C\x2F\x6A\x3E\x3C\x33\x20\x32\x3D\x22\x4E\x22\x3E\x3C\x61\x20\x36\x3D\x22\x27\x2B\x79\x2B\x27\x22\x3E\x27\x2B\x7A\x2B\x27\x3C\x2F\x61\x3E\x20\x3C\x2F\x33\x3E\x3C\x2F\x33\x3E\x3C\x2F\x34\x3E\x27\x7D\x6E\x20\x39\x28\x64\x2E\x44\x3E\x3D\x31\x29\x7B\x67\x3D\x27\x3C\x34\x20\x32\x3D\x22\x31\x65\x2D\x31\x66\x22\x20\x31\x33\x3D\x22\x31\x67\x22\x3E\x3C\x61\x20\x36\x3D\x22\x27\x2B\x79\x2B\x27\x22\x3E\x3C\x64\x20\x31\x68\x3D\x22\x31\x69\x22\x20\x31\x6A\x3D\x22\x31\x35\x22\x20\x20\x32\x3D\x22\x31\x6C\x22\x20\x71\x3D\x22\x27\x2B\x64\x5B\x30\x5D\x2E\x71\x2B\x27\x22\x20\x2F\x3E\x3C\x2F\x61\x3E\x3C\x2F\x34\x3E\x27\x3B\x70\x2E\x38\x3D\x67\x2B\x27\x3C\x34\x20\x32\x3D\x22\x35\x2D\x72\x22\x3E\x3C\x33\x20\x32\x3D\x22\x48\x22\x3E\x27\x2B\x73\x2B\x27\x3C\x2F\x33\x3E\x3C\x47\x20\x32\x3D\x22\x66\x2D\x49\x22\x3E\x3C\x62\x20\x32\x3D\x22\x66\x2D\x4B\x22\x3E\x3C\x61\x20\x36\x3D\x22\x27\x2B\x79\x2B\x27\x22\x3E\x27\x2B\x78\x2B\x27\x3C\x2F\x61\x3E\x3C\x2F\x62\x3E\x3C\x33\x20\x32\x3D\x22\x76\x2D\x77\x22\x3E\x20\x27\x2B\x74\x2B\x27\x3C\x2F\x33\x3E\x3C\x34\x20\x32\x3D\x22\x35\x2D\x50\x20\x51\x22\x3E\x27\x2B\x52\x28\x70\x2E\x38\x2C\x53\x29\x2B\x27\x3C\x41\x2F\x3E\x3C\x33\x20\x32\x3D\x22\x55\x2D\x56\x22\x3E\x3C\x6A\x20\x32\x3D\x22\x35\x2D\x45\x22\x3E\x3C\x6B\x20\x32\x3D\x22\x37\x2D\x54\x22\x3E\x3C\x61\x20\x36\x3D\x22\x27\x2B\x79\x2B\x27\x23\x37\x22\x3E\x3C\x69\x20\x32\x3D\x22\x63\x20\x63\x2D\x37\x22\x3E\x3C\x2F\x69\x3E\x20\x27\x2B\x75\x2B\x27\x3C\x2F\x61\x3E\x20\x3C\x2F\x6B\x3E\x3C\x2F\x6A\x3E\x3C\x33\x20\x32\x3D\x22\x4E\x22\x3E\x3C\x61\x20\x36\x3D\x22\x27\x2B\x79\x2B\x27\x22\x3E\x27\x2B\x7A\x2B\x27\x3C\x2F\x61\x3E\x20\x3C\x2F\x33\x3E\x3C\x2F\x33\x3E\x3C\x2F\x34\x3E\x27\x7D\x6E\x20\x39\x28\x64\x2E\x44\x3C\x31\x29\x7B\x67\x3D\x27\x3C\x2F\x41\x3E\x27\x3B\x70\x2E\x38\x3D\x67\x2B\x27\x3C\x34\x20\x32\x3D\x22\x35\x2D\x72\x22\x3E\x3C\x33\x20\x32\x3D\x22\x48\x22\x3E\x27\x2B\x73\x2B\x27\x3C\x2F\x33\x3E\x3C\x47\x20\x32\x3D\x22\x66\x2D\x49\x22\x3E\x3C\x62\x20\x32\x3D\x22\x66\x2D\x4B\x22\x3E\x3C\x61\x
20\x36\x3D\x22\x27\x2B\x79\x2B\x27\x22\x3E\x27\x2B\x78\x2B\x27\x3C\x2F\x61\x3E\x3C\x2F\x62\x3E\x3C\x33\x20\x32\x3D\x22\x76\x2D\x77\x22\x3E\x20\x27\x2B\x74\x2B\x27\x3C\x2F\x33\x3E\x3C\x34\x20\x32\x3D\x22\x35\x2D\x50\x20\x51\x22\x3E\x27\x2B\x52\x28\x70\x2E\x38\x2C\x53\x29\x2B\x27\x3C\x41\x2F\x3E\x3C\x33\x20\x32\x3D\x22\x76\x2D\x77\x22\x3E\x31\x6D\x20\x31\x6E\x20\x27\x2B\x74\x2B\x27\x3C\x2F\x33\x3E\x3C\x33\x20\x32\x3D\x22\x55\x2D\x56\x22\x3E\x3C\x6A\x20\x32\x3D\x22\x35\x2D\x45\x22\x3E\x3C\x6B\x20\x32\x3D\x22\x37\x2D\x54\x22\x3E\x3C\x61\x20\x36\x3D\x22\x27\x2B\x79\x2B\x27\x23\x37\x22\x3E\x3C\x69\x20\x32\x3D\x22\x63\x20\x63\x2D\x37\x22\x3E\x3C\x2F\x69\x3E\x20\x27\x2B\x75\x2B\x27\x3C\x2F\x6A\x3E\x3C\x33\x20\x32\x3D\x22\x4E\x22\x3E\x3C\x61\x20\x36\x3D\x22\x27\x2B\x79\x2B\x27\x22\x3E\x27\x2B\x7A\x2B\x27\x3C\x2F\x61\x3E\x20\x3C\x2F\x33\x3E\x3C\x2F\x33\x3E\x3C\x2F\x34\x3E\x27\x7D\x6E\x20\x67\x3D\x27\x27\x7D\x57\x2E\x31\x70\x3D\x5A\x28\x29\x7B\x4F\x20\x65\x3D\x31\x31\x2E\x31\x34\x28\x22\x31\x72\x22\x29\x3B\x39\x28\x65\x3D\x3D\x31\x73\x29\x7B\x57\x2E\x31\x74\x2E\x36\x3D\x22\x59\x3A\x2F\x2F\x6F\x2E\x31\x32\x2E\x6D\x22\x7D\x65\x2E\x31\x77\x28\x22\x36\x22\x2C\x22\x59\x3A\x2F\x2F\x6F\x2E\x31\x32\x2E\x6D\x2F\x22\x29\x3B\x65\x2E\x38\x3D\x22\x31\x78\x2E\x2E\x21\x31\x79\x22\x7D","\x7C","\x73\x70\x6C\x69\x74","\x7C\x7C\x63\x6C\x61\x73\x73\x7C\x73\x70\x61\x6E\x7C\x64\x69\x76\x7C\x65\x6E\x74\x72\x79\x7C\x68\x72\x65\x66\x7C\x63\x6F\x6D\x6D\x65\x6E\x74\x73\x7C\x69\x6E\x6E\x65\x72\x48\x54\x4D\x4C\x7C\x69\x66\x7C\x7C\x68\x32\x7C\x66\x61\x7C\x69\x6D\x67\x7C\x7C\x70\x61\x67\x65\x7C\x69\x6D\x67\x74\x61\x67\x7C\x69\x66\x72\x73\x72\x63\x7C\x7C\x75\x6C\x7C\x6C\x69\x7C\x69\x66\x72\x74\x62\x7C\x63\x6F\x6D\x7C\x65\x6C\x73\x65\x7C\x77\x77\x77\x7C\x7C\x73\x72\x63\x7C\x62\x6C\x6F\x67\x7C\x79\x6F\x7C\x7C\x7C\x70\x75\x62\x6C\x69\x73\x68\x7C\x64\x61\x74\x65\x7C\x7C\x7C\x7C\x62\x72\x7C\x62\x72\x65\x61\x6B\x7C\x69\x66\x72\x74\x61\x67\x7C\x6C\x65\x6E\x67\x74\x68\x7C\x6D\x65\x74\x61\x7C\x76\x69\x64\x65\x6F\x7C\
x68\x65\x61\x64\x65\x72\x7C\x6C\x61\x62\x65\x6C\x32\x31\x32\x7C\x68\x65\x61\x64\x65\x72\x31\x7C\x69\x66\x72\x7C\x74\x69\x74\x6C\x65\x7C\x69\x6E\x64\x65\x78\x4F\x66\x7C\x69\x66\x72\x61\x6D\x65\x7C\x61\x75\x74\x68\x6F\x72\x7C\x76\x61\x72\x7C\x63\x6F\x6E\x74\x65\x6E\x74\x7C\x63\x66\x7C\x73\x74\x72\x69\x70\x54\x61\x67\x73\x7C\x32\x30\x7C\x6C\x69\x6E\x6B\x7C\x66\x69\x78\x65\x64\x7C\x63\x68\x61\x72\x7C\x77\x69\x6E\x64\x6F\x77\x7C\x67\x65\x74\x45\x6C\x65\x6D\x65\x6E\x74\x73\x42\x79\x54\x61\x67\x4E\x61\x6D\x65\x7C\x68\x74\x74\x70\x7C\x66\x75\x6E\x63\x74\x69\x6F\x6E\x7C\x65\x6D\x62\x65\x64\x7C\x64\x6F\x63\x75\x6D\x65\x6E\x74\x7C\x79\x6F\x74\x65\x6D\x70\x6C\x61\x74\x65\x73\x7C\x69\x64\x7C\x67\x65\x74\x45\x6C\x65\x6D\x65\x6E\x74\x42\x79\x49\x64\x7C\x32\x32\x30\x7C\x72\x6D\x7C\x61\x6C\x6C\x6F\x77\x66\x75\x6C\x6C\x73\x63\x72\x65\x65\x6E\x7C\x66\x72\x61\x6D\x65\x62\x6F\x72\x64\x65\x72\x7C\x72\x65\x6C\x7C\x6D\x65\x64\x69\x75\x6D\x7C\x76\x71\x7C\x69\x66\x72\x61\x6D\x65\x31\x7C\x64\x61\x69\x6C\x79\x6D\x6F\x74\x69\x6F\x6E\x7C\x70\x6F\x73\x74\x7C\x69\x6D\x61\x67\x65\x7C\x69\x6D\x61\x67\x65\x31\x7C\x77\x69\x64\x74\x68\x7C\x33\x38\x30\x7C\x68\x65\x69\x67\x68\x74\x7C\x79\x6F\x74\x65\x6D\x7C\x74\x68\x75\x6D\x62\x7C\x50\x6F\x73\x74\x65\x64\x7C\x6F\x6E\x7C\x76\x69\x6D\x65\x6F\x7C\x6F\x6E\x6C\x6F\x61\x64\x7C\x70\x6C\x61\x79\x65\x72\x7C\x6D\x79\x63\x6F\x6E\x74\x65\x6E\x74\x7C\x6E\x75\x6C\x6C\x7C\x6C\x6F\x63\x61\x74\x69\x6F\x6E\x7C\x79\x6F\x75\x74\x75\x62\x65\x7C\x66\x6F\x72\x7C\x73\x65\x74\x41\x74\x74\x72\x69\x62\x75\x74\x65\x7C\x59\x6F\x7C\x54\x65\x6D\x70\x6C\x61\x74\x65\x73","","\x66\x72\x6F\x6D\x43\x68\x61\x72\x43\x6F\x64\x65","\x72\x65\x70\x6C\x61\x63\x65","\x5C\x77\x2B","\x5C\x62","\x67"];
A bit more clarification would help. Do you want to convert 0x4697 to decimal? If so, you can convert it manually by converting it to binary and then to decimal (or use any other hex-to-decimal method).
Or you can try this online tool that takes hex and returns decimal. If you want to do this automatically a large number of times, you need a short program; in JavaScript, parseInt("4697", 16) returns 18071.
If that's not what you want, please clarify.
EDIT: If you want to convert this hex code to ASCII characters, just copy the variable declaration and initialization into your JavaScript console, then type the name of the variable in the console. It will display the decoded ASCII text for the hex escapes (at least it does in the Chrome JS console).
As Saif says, you can do this directly in the console. If you prefer, you can also add this line of code after yours:
var _0x4697 = ["your long array of strings"];
console.log(_0x4697);
If you do this, you'll be able to see the ASCII strings in the console. For more information on how to use the console with Chrome, see this.
Your assumptions are incorrect. Your code contains several JavaScript string literals that use \xXX escapes. In JavaScript, a \xXX escape denotes the character with that code in the range 0x00 to 0xFF, which matches the ISO 8859-1 character encoding (aka "Latin-1").
JavaScript (as well as Java, .NET, VB4/5/6, …) strings are counted sequences of UTF-16 code units. UTF-16 is one of several character encodings for the Unicode character set. Unicode is a superset of ISO 8859-1, so there is nothing to be gained by using \xXX escapes.
JavaScript offers several types of escapes, one of which is \xXX, kept for historical reasons. Since a string is Unicode, there is no reason in modern JavaScript not to keep it simple and use the \u{XXXXXX} form.
It looks like the strings are JavaScript code themselves. JavaScript code doesn't use ASCII. It uses Unicode, with some rules about what a valid identifier is (and no restrictions on which characters are valid in strings and comments).
Since the code contains a literal, it is the compiler that does the conversion; you don't get a chance to do it yourself. You can see that if you print the value of the variable to the console.
console.log( _0x4697);
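To illustrate the point with something shorter, here is a small sketch using one of the short literals from the question's array. The helper hexToText is hypothetical and only matters for the different case where the hex digits arrive as plain text at run time (for example "73706C6974") rather than as escapes in source code.

```javascript
// Escapes inside a string literal are resolved when the source is parsed,
// so this literal already is the plain text.
const literal = "\x73\x70\x6C\x69\x74";
console.log(literal === "split"); // true

// Hypothetical helper: decode a run-time string of hex digit pairs.
function hexToText(hex) {
  let out = "";
  for (let i = 0; i < hex.length; i += 2) {
    // Each pair of hex digits is one character code.
    out += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16));
  }
  return out;
}

console.log(hexToText("73706C6974")); // "split"
```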

Which encoding to use for many international languages

I am setting up a little website and would like to make it international. All the content will be stored in an external XML file in different languages and parsed into the HTML via JavaScript.
Now the problem is that there are also German umlauts, Russian, Chinese and Japanese characters, and right-to-left languages like Arabic and Farsi.
What would be the best way/solution? Is there an "international encoding" which can display all languages properly? Or is there any other solution you would suggest?
Thanks in advance!
All of the Unicode transformations (UTF-8, UTF-16, UTF-32) can encode all Unicode characters. You pick which one to use based on size: if most of your text is in Western scripts, probably UTF-8, as it uses only one byte for most of those characters, and 2, 3, or 4 bytes where needed. If you're mostly encoding East Asian scripts, UTF-16 can be more compact, since most of those characters take 2 bytes in UTF-16 but 3 in UTF-8.
The fundamental thing here is that it's all Unicode; the transformations are just different ways of representing the same characters.
The co-founder of Stack Overflow had a good article on this topic: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
Regardless of what encoding you use for your document, note that if you're processing these strings in JavaScript, JavaScript strings are UTF-16 (except that invalid values are tolerated), even if the document is in UTF-8 or UTF-32. This means that, for instance, each of those emojis people are so excited about these days looks like two "characters" to JavaScript, because it takes two UTF-16 code units to represent. Like 😎, for instance:
console.log("😎".length); // 2
So you'll need to be careful not to split up the two halves of characters that are encoded as two UTF-16 code units.
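A short sketch of the difference between code units and code points, using the same emoji (U+1F60E) written as an escape. The helper name countCodePoints is invented for this example.

```javascript
const face = "\u{1F60E}"; // the sunglasses emoji from above

// length counts UTF-16 code units; string iteration goes by code point.
function countCodePoints(text) {
  return Array.from(text).length;
}

console.log(face.length);                      // 2
console.log(countCodePoints(face));            // 1
console.log(face.codePointAt(0).toString(16)); // "1f60e"

// Slicing by code unit can cut the pair in half, leaving a lone surrogate.
const half = face.slice(0, 1);
console.log(half.charCodeAt(0).toString(16));  // "d83d"
```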
The normal (and recommended) solution for multilingual sites is to use UTF-8. It can deal with any character that has been assigned a Unicode code point, with a couple of caveats:
Unicode is a versioned standard, and different JavaScript implementations may support different Unicode versions.
If your text includes characters outside of the Unicode Basic Multilingual Plane (BMP), then you need to do your text processing (in JavaScript) in a way that is Unicode-aware. For instance, if you use the JavaScript String class, you need to take proper account of surrogate pairs when manipulating text.
(A JavaScript String is actually encoded as UTF-16. It has methods that allow you to manipulate it as Unicode code points, but methods and attributes such as substring and length use code unit rather than code point indexing. If you are not careful, you can end up splitting a string between the high and low parts of a surrogate pair. The result will be something that cannot be displayed properly. This only affects code points in the higher planes, but that includes the new emoji code points.)
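As one possible way to avoid that kind of split, here is a minimal sketch of a truncation helper. The name truncateSafely and its maxUnits parameter are invented for this example. It still measures length in code units, but backs off by one unit when the cut would fall between the two halves of a surrogate pair.

```javascript
// Truncate to at most maxUnits UTF-16 code units without splitting a surrogate pair.
function truncateSafely(text, maxUnits) {
  if (text.length <= maxUnits) return text;
  let end = maxUnits;
  const last = text.charCodeAt(end - 1);
  // A high surrogate (0xD800..0xDBFF) as the last kept unit means its
  // low half would be cut off, so drop the high half as well.
  if (last >= 0xd800 && last <= 0xdbff) end -= 1;
  return text.slice(0, end);
}

const sample = "a\u{1F60E}b";
console.log(truncateSafely(sample, 2) === "a");          // true
console.log(truncateSafely(sample, 3) === "a\u{1F60E}"); // true
```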

Fastest method to encode characters beyond the ASCII range to respective %uxxxx in Javascript

In JavaScript, what would be the fastest method to encode Unicode characters outside the ASCII range to their respective %uxxxx form? I need to use this method to encode hundreds of KBs of data (the number of Unicode characters outside the ASCII range within this data is fairly low). I am currently using 'escape', but that's very slow, given that it also encodes many other characters than just the non-ASCII ones.
escape is native code, so a hand-written JavaScript loop that does the same work is unlikely to beat it.
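Because escape does more than the question needs, one alternative worth measuring is a single regular-expression replace that touches only non-ASCII code units. This is a sketch, not a benchmark result; the name encodeNonAscii is invented for this example, and whether it is faster than escape on a given engine is an assumption to test.

```javascript
// Replace every code unit above 0x7F with %uXXXX; ASCII is left untouched.
function encodeNonAscii(text) {
  return text.replace(/[^\x00-\x7F]/g, (unit) =>
    "%u" + unit.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

console.log(encodeNonAscii("a\u20ACb")); // "a%u20ACb"
```

Characters outside the BMP come out as two %uXXXX sequences, one per surrogate. A literal "%" in the input is not escaped here, so the output is only safely reversible if the data cannot already contain "%u" sequences.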

Valid character subset of extended ASCII for javascript strings

I am doing some experiments with data encoding. I know there is already a base64 format, but I would like something that takes less space. Please note that I am asking for the particular characters, not just their counts.
1. What character subset of extended ASCII can be represented by a JavaScript string?
2. What character subset of extended ASCII can be represented by a JavaScript string without the need for escaping, assuming " characters are used around the string data?
