VBScript chr() appears to return wrong value - javascript

I'm trying to convert a character code to a character with chr(), but VBScript isn't giving me the value I expect. According to VBScript, character code 199 is:
�
However, when using something like Javascript's String.fromCharCode, 199 is:
Ç
The second result is what I need to get out of VBScript's chr() function. Any idea what the problem is?

Edited to reflect comments
Chr(199) returns a 2-byte character, which is being interpreted as 2 separate characters.
Use ChrW(199) to return a Unicode character.
Use ChrB(199) to return it as a single-byte character.

Encoding is the problem. Javascript may be interpreting as latin-1; VBScript may be using a different encoding and getting confused.

The String.fromCharCode() method takes the specified Unicode values and returns a string.
The Chr() function converts the specified ANSI character code to a character.
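A minimal sketch of the difference, written in JavaScript. It assumes (this is not confirmed by the question) that the � appears because a single ANSI byte 0xC7 ends up being read as UTF-8; the variable names are only illustrative.

```javascript
// String.fromCharCode treats 199 as a UTF-16 code unit, i.e. U+00C7 (Ç).
const fromJs = String.fromCharCode(199);

// Encoded as UTF-8, U+00C7 takes two bytes: 0xC3 0x87.
const utf8Bytes = Array.from(new TextEncoder().encode(fromJs));

// A lone 0xC7 byte (what a single-byte ANSI code page would emit for Ç)
// is an incomplete UTF-8 sequence, so a UTF-8 decoder substitutes U+FFFD.
const misread = new TextDecoder("utf-8").decode(new Uint8Array([0xC7]));

console.log(fromJs, utf8Bytes, misread === "\uFFFD");
```

If that assumption holds, the fix is to make the producer and the consumer agree on one encoding rather than to change the character code.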

Related

UTF8 String not allowing charAt() or substring to pull out specific characters

In my code I'm trying to isolate out the first character of a variable, it is the UTF8 symbol: 🌈
The code and its output are as follows:
Code:
console.log(login_name);
console.log(login_name.charAt(0));
console.log(login_name.substring(0,1));
Output:
🌈 ✨✨✨UTF8MB4
�
�
Obviously, I want .charAt() to print 🌈 and not �. Any known oddities with utf8mb4 that I'm missing? My main problem is I don't know how to word this specific problem.
Also, if I swap the rainbow for the ✨ (or target the ✨ instead), it works as it should and prints properly.
JavaScript can't handle Unicode properly. charAt() operates on code units instead of code points.
Luckily JavaScript has workarounds. To get the characters in a string instead of UTF-16/UCS-2 code units you need to call Array.from(yourstring), which will get you an array of characters. From there on you can get the first element in the usual way.
let characters = Array.from(login_name);
console.log(characters.shift());
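To see why the rainbow breaks while the ✨ does not, here is a small sketch comparing code units with code points. The sample string is copied from the question; the variable names are illustrative.

```javascript
const loginName = "🌈 ✨✨✨UTF8MB4";

// 🌈 is U+1F308, outside the Basic Multilingual Plane, so UTF-16 stores it
// as a surrogate pair. charAt(0) returns only the first half of that pair.
const firstCodeUnit = loginName.charAt(0);

// Array.from iterates by code point, so the pair stays together.
const firstCodePoint = Array.from(loginName)[0];

// codePointAt + String.fromCodePoint is an alternative that avoids
// building the whole array.
const viaCodePointAt = String.fromCodePoint(loginName.codePointAt(0));

// ✨ is U+2728, inside the BMP, so it fits in a single code unit.
const sparkle = "✨".charAt(0);

console.log(firstCodeUnit.length, firstCodePoint, viaCodePointAt, sparkle);
```

Note that a code point is still not always a user-perceived character: sequences built from several code points (for example emoji joined with U+200D) would still be split by this approach.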

JavaScript not decoding parameter

From the textarea I take the shortcode %91samurai id="19"%93, which should be [samurai id="19"]:
var not_decoded_content = jQuery('[data-module_type="et_pb_text_forms_00132547"]')
.find('#et_pb_et_pb_text_form_content').html();
But when I try to decode the %91 and %93
self.content = decodeURI(not_decoded_content);
I get the error:
Uncaught URIError: URI malformed
How can I solve this problem?
The encodings are invalid. If you can't fix the whatever-system-produces-them to correctly produce %5B and %5D, then your only option is to do a replacement yourself: replace all %91 with character 91 which is '[', then replace all %93 with character 93 which is ']'.
Note that JavaScript's String replace with a plain string pattern only replaces the first occurrence, not all of them. If you need to replace all occurrences, use a regular expression with the g flag (or replaceAll where it is available), or loop while the string still contains the pattern.
And a final note, I am used to using decodeURIComponent(...). If you can make the whatever-system-produces-them to correctly produce %5B and %5D, and you still get that error, then try using decodeURIComponent(...) instead of decodeURI(...).
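The replacement approach described above could be sketched as follows. The helper names canDecode and decodeShortcode are hypothetical, and the sketch assumes %91 and %93 are the only malformed escapes in the input.

```javascript
// Returns false when decodeURIComponent rejects the input.
function canDecode(text) {
  try {
    decodeURIComponent(text);
    return true;
  } catch (err) {
    return false;
  }
}

// Rewrite the decimal-style escapes to the correct hexadecimal ones,
// then decode normally. The g flag makes replace() hit every occurrence.
function decodeShortcode(raw) {
  const repaired = raw.replace(/%91/g, "%5B").replace(/%93/g, "%5D");
  return decodeURIComponent(repaired);
}

const raw = '%91samurai id="19"%93';
console.log(canDecode(raw));       // the original input is rejected
console.log(decodeShortcode(raw)); // the repaired input decodes
```

Fixing the system that produces the escapes remains the better option, since a legitimate %91 or %93 elsewhere in the text would also be rewritten by this workaround.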
The string you're trying to decode is not a URI. Use decodeURIComponent() instead.
UPDATE
Hmm, that's not actually the issue, the issues are the %91 and %93.
encodeURI('[]')
gives %5B%5D, so it looks like whatever encoded this string used the decimal rather than the hexadecimal value.
Decimal 91 = hex 5b
Decimal 93 = hex 5d
Trying again with the hex values
decodeURI('%5bsamurai id="19"%5d') == '[samurai id="19"]'
I know this is not the solution you want to see, but can you try using "%E2%80%98" for %91 and "%E2%80%9C" for %93? Those are the UTF-8 encodings of the curly quotes ‘ and “, which is what the bytes 0x91 and 0x93 mean in Windows-1252.
On their own, %91 and %93 fall in the C1 control range (0x80 to 0x9F) and are not valid UTF-8, which is why decodeURI() throws URIError instead of decoding them. Simply put, they're not ordinary ASCII characters.

Javascript "".length returning 1 rather than 0

Ok so I am rather stumped by this one.
I get a string value from a JavaScript library. I call myStringVar = myStringVar.trim(), but when I do myStringVar.substring(0,1) it gives me an empty string. When I call var arr = myStringVar.split('') the first element in the array is an empty string, and when I call arr[0].trim().length it returns 1 instead of zero.
Am I missing something?
EDIT
Following the comments and responses I have been able to isolate the problem down to the existence of a non-visual unicode character at the beginning of the string. I will now try to find a way to remove those characters from the string....or better yet extract the portions of the string that are of interest.
Thanks for the help.
The most likely answer for this is that you have some invisible Unicode character in your string (for instance, "⁣", U+2063 INVISIBLE SEPARATOR).
A string containing only such a character would look to a user (or programmer) like an empty string, but would in fact have length 1 since it does contain a character.
One simple way to test whether this is the case is to get the character code of the first character in the string with string.charCodeAt(0). You can then look this value up in a Unicode table, which should tell you whether you have an invisible character in your string.
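A short sketch of that check, plus one possible way to strip such characters from the start of the string. The helper name listCodePoints is hypothetical, and the character ranges in the regular expression are an illustrative selection of common invisible characters, not a complete list.

```javascript
// List every code point in the string as U+XXXX so hidden ones show up.
function listCodePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"));
}

// U+2063 INVISIBLE SEPARATOR is not whitespace, so trim() leaves it alone.
const suspicious = "\u2063hello";
console.log(suspicious.trim().length);   // still counts the hidden character
console.log(listCodePoints(suspicious)); // first entry exposes it

// Strip a leading run of zero-width and invisible formatting characters.
const cleaned = suspicious.replace(/^[\u200B-\u200D\u2060-\u2064\uFEFF]+/, "");
console.log(cleaned.length);
```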

Javascript encodeURI() vs. PHP rawurldecode() and special characters

Encoding a string with German umlauts like ä,ü,ö,ß with Javascript encodeURI() causes a weird bug after decoding it in PHP with rawurldecode(). Although the string seems to be correctly decoded it isn't. See below example screenshots from my IDE
Also the strlen() of the - with rawurldecode() - decoded string gives more characters than it really has!
Problems occur when I need to process the decoded string, for example if I want to replace the German characters ä,ü,ö with ae, ue and oe. This can be seen in the example provided here.
I have also made a PHP fiddle where this whole weirdness can be seen.
What I've tried so far:
- utf8_decode
- iconv
- and also first two suggestions from here
This is a Unicode equivalence issue, and it looks like your IDE doesn't handle multibyte strings very well.
In Unicode you can represent Ü in either of two ways:
as the single code point U+00DC, which is %C3%9C in UTF-8
or as a capital U (U+0055) followed by a combining diaeresis (U+0308), which is %55%CC%88 in UTF-8
Your GWT string uses the latter method called NFD while your one from PHP uses the first method called NFC. That's why your GWT string is 3 characters longer even though they are both valid encodings of logically identical unicode strings. Your problem is that they are not identical byte for byte in PHP.
More details about utf-8 normalisation.
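The two forms can be reproduced on the JavaScript side as a quick illustration. This sketch uses the standard String.prototype.normalize method; normalising before calling encodeURI() is one possible fix, offered here as a suggestion and not as what the original code does.

```javascript
const composed = "\u00DC";    // NFC: single code point Ü
const decomposed = "U\u0308"; // NFD: U followed by combining diaeresis

// They render identically but are different strings with different lengths.
console.log(composed === decomposed, composed.length, decomposed.length);

// encodeURI leaves the plain letter U unescaped, so the NFD form
// shows up as U%CC%88 on the wire.
console.log(encodeURI(composed));   // %C3%9C
console.log(encodeURI(decomposed)); // U%CC%88

// Normalising to NFC before encoding makes both inputs produce the same bytes.
const normalized = decomposed.normalize("NFC");
console.log(normalized === composed, encodeURI(normalized));
```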
If you want to do preg replacements on the strings you need to normalise them to the same form first. From your example I can see your IDE is using NFC since it's the PHP string that works. So I suggest normalising to NFC form in PHP (the default), then doing the preg_replace.
http://php.net/manual/en/normalizer.normalize.php
function cleanImageName($name)
{
$name = Normalizer::normalize( $name, Normalizer::FORM_C );
$clean = preg_replace(
Otherwise you have to do something like this which is based on this article.

Some encoded Javascript that I need in plain text

I'm having some issues trying to decode some javascript.. I have no idea what kind of encoding this is.. i tried base 64 decoders etc. If you can please help me out with this, here's a fragment of the code:
\x69\x6E\x6E\x65\x72\x48\x54\x4D\x4C","\x61\x70\x70\x34\x39\x34\x39\x3
Any ways I can get plain text from that?
Thanks!
\xNN is an escape sequence. NN is a hexadecimal number (00 to FF) that represents a Latin-1 character.
Escape sequences are resolved when the string literal is parsed, so the string already contains the decoded character:
"\x69" === "i" // true
The escape() function encodes a string.
This function makes a string portable, so it can be transmitted across any network to any computer that supports ASCII characters.
This function encodes special characters, with the exception of: * @ - _ + . /
The reverse of escape() is the unescape() function.
Try this:
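alert(unescape("\x69\x6E\x6E\x65\x72\x48\x54\x4D\x4C\x61\x70\x70\x34\x39\x34\x39")); // the truncated trailing \x3 from the fragment is left out because an incomplete escape is a syntax error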
Edit: As J-P mentioned unescape isn't really needed here after all.
These are simply hex-values of symbols.
\x69 = i, etc. First several letters: "innerHTML", "ap…"
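If the obfuscated code is only available as text (for example read from a file), the backslashes are literal characters and nothing has been parsed yet. A minimal sketch for that case follows; the helper name decodeHexEscapes is hypothetical, and it only handles complete \xNN sequences, leaving anything else untouched.

```javascript
// Replace each literal backslash-x-hex-hex sequence with its character.
function decodeHexEscapes(source) {
  return source.replace(/\\x([0-9A-Fa-f]{2})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16)));
}

// Doubled backslashes here produce the literal text \x69\x6E... at runtime.
const first = "\\x69\\x6E\\x6E\\x65\\x72\\x48\\x54\\x4D\\x4C";
const second = "\\x61\\x70\\x70\\x34\\x39\\x34\\x39";

console.log(decodeHexEscapes(first), decodeHexEscapes(second));
```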
I think you should use window.unescape(), or unescape()
