JavaScript toLocaleDateString Producing Foreign Character in String [duplicate] - javascript

We use the JavaScript function toLocaleTimeString() to format date/times. The newest version of Chrome has suddenly started returning byte 226 between the seconds and the AM/PM part of the time. Edge is not having any issues, nor are older versions of Chrome: 110+ has the issue, 109 and earlier do not.
For example, if the last couple of characters returned are:
00 AM
The byte values of that string are:
48 48 226 128 175
That 226 (together with the 128 and 175 after it) used to be a single 32 (space).
Anyone else seeing this behavior as well?

This is apparently caused by this V8 CL
Here is the summary of this ChangeLog:
[intl] Enhance Date parser to take Unicode SPACE
This is needed to prepare for the landing of ICU72.
Allow U+202F in the Date String, which the toLocaleString("en-US")
will generate w/ ICU72.
So it's done on purpose, to support the next ICU version (ICU 72). We can thus assume that other browsers will follow suit.
[Update]
Since this change caused [too many web-compat issues](https://crbug.com/1414292), Chrome patched its Intl implementation against this ICU-72 change and converts these U+202F characters back to U+0020 (regular space) characters. Apparently, Firefox had already done the same.
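If you still need to handle output from the affected Chrome versions, a defensive sketch (my addition, not from the original answer) is to normalize the Unicode space variants back to an ASCII space before comparing:

// Replace narrow no-break (U+202F) and no-break (U+00A0) spaces with ' '
const localTime = new Date().toLocaleTimeString('en-US');
const normalized = localTime.replace(/[\u202F\u00A0]/g, ' ');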

I think it's a narrow no-break space (U+202F).
Since it also occurs in Edge 110, I think it is derived from Chromium.
const event = new Date('August 19, 1975 23:15:30 GMT+00:00');
const localTime = event.toLocaleTimeString('en-US');
console.log(localTime);
console.log(localTime.indexOf(" "));        // -1 when the separator is no longer a plain space
console.log(localTime.indexOf("\u{202F}")); // index of the narrow no-break space, if present
for (let i = 0; i < localTime.length; i++) {
  console.log(localTime.charCodeAt(i));     // dump each UTF-16 code unit
}
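One related note (mine, not from the answer above): the \s character class matches U+202F as well as a plain space, so whitespace-tolerant matching keeps working across this change:

console.log(/\s/.test('\u202F')); // true: \s covers all Unicode Space_Separator characters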

Related

For char golfing in javascript, encode with an encoding, decode with another

I'm doing char golfing these days in different languages and I was skeptical at first because it's totally disconnected from 'real world' practices, but I ended up loving it for its educational purpose: I learned a LOT about my languages in the process.
And let's admit it, it's fun.
I'm currently trying to learn tricks in JS and here's the last I found:
Say, you have this script:
for(i=5;i--;)print(i*i) (23 chars)
The script is made of ASCII chars, and each of them is basically a pair of hex digits.
For example, 'f' is 66 and 'o' is 6f.
So if you group the information of these two chars you get 666f, which is the UTF-16 code of one char: 景
My script has an odd number of chars so let's add a space somewhere to make it even:
for(i=5;i--;) print(i*i) (24 chars)
and now by applying the previous idea to the whole script we get:
景爨椽㔻椭ⴻ⤠灲楮琨椪椩 (12 chars)
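For reference, a minimal sketch of that packing step (my illustration, pairing the chars big-endian with charCodeAt):

const pack = s => s.replace(/../g, p =>
  String.fromCharCode((p.charCodeAt(0) << 8) | p.charCodeAt(1)));
console.log(pack('for(i=5;i--;) print(i*i)')); // => '景爨椽㔻椭ⴻ⤠灲楮琨椪椩'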
So now my question is: how can I reconstruct the script back from the 12 chars with as few chars as possible?
I came up with that:
eval(unescape(escape`景爨椽㔻椭ⴻ⤠灲楮琨椪椩`.replace(/%u(..)/g,'%$1%')))
but it adds a constant cost of 50 chars, which makes this method useless if your script has fewer than 100 chars.
It's great for long scripts (e.g. 600 chars becomes 350 chars), but in golfing problems the script is rarely long; usually it's less than 100 chars.
I'm not an encoding specialist at all, which is why I came here: I'm pretty sure there's a shorter method.
A constant cost of 30 chars would already be amazing, since it would make the threshold drop from 100 to 60 chars.
Note that I used UTF-16 here, but it could be another encoding; as long as it shortens the script, I'm happy with it.
My Version of JS is: Node 12.13.0
The standard way to switch between string decodings in node.js is to use the Buffer api:
Buffer.from(…, "utf16le").toString("ascii")
To golf this a bit, you can take advantage of some legacy options and defaults:
''+new Buffer(…,"ucs2")
(The .toString() without arguments actually does use UTF-8 but it doesn't matter for ASCII data)
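A tiny round-trip illustration (my example) of the byte order involved:

Buffer.from('\u6F66', 'utf16le').toString('ascii'); // => 'fo' (code unit 0x6F66 is stored as bytes 0x66, 0x6F)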
Since Node only supports UTF-16LE, not UTF-16BE, your string won't work as-is; you'll need to swap the bytes and use different characters (the apparent line break in the string below is actually U+2029 PARAGRAPH SEPARATOR, the byte-swapped ") " pair):
global.print = console.log;
eval(''+new Buffer("潦⡲㵩㬵⵩㬭
牰湩⡴⩩⥩","ucs2"))
(online demo)

How do I reverse an array in JavaScript in 16 characters or less without .reverse()?

I'm trying to solve a challenge on Codewars which requires you to reverse an array in JavaScript, in 16 characters or less. Using .reverse() is not an option.
The maximum number of characters allowed in your code is 28, which includes the function name weirdReverse, so that leaves you with just 16 characters to solve it in. The constraint -
Your code needs to be as short as possible, in fact not longer than 28 characters
Sample input and output -
Input: an array containing data of any types. Ex: [1,2,3,'a','b','c',[]]
Output: [[],'c','b','a',3,2,1]
The starter code given is -
weirdReverse=a=>
My solution (29 characters) is -
weirdReverse=a=>a.sort(()=>1)
which of course fails -
Code length should less or equal to 28 characters.
your code length = 29 - Expected: 'code length <= 28', instead got: 'code length > 28'
I'm not sure what else to truncate here.
Note - I did think about posting this question on CodeGolf SE, but I felt it wouldn't be a good fit there, due to the limited scope.
I'd like to give you a hint, without giving you the answer:
You're close, but you can save characters by not using something you need to add in your code.
By adding the thing you won't use, you can remove ().
Spoiler (answer):
// Note: this only really works for this specific case, on the engine's
// sort of that era: a comparator that always returns a positive number
// made V8's insertion sort shift every element to the front, reversing
// the array. The spec leaves inconsistent comparators implementation-
// defined, so never EVER use this in a real-life scenario.
var a = [1,2,3,'a','b','c',[]]
weirdReverse=a=>a.sort(x=>1)
// ^ x=> is 1 character shorter than ()=>
console.log(weirdReverse(a))

String.fromCharCode() does not work after the value "126"

I have been trying the following code to get the ASCII-equivalent character:
String.fromCharCode("149")
but it seems to work only until 126 is passed as the parameter. For 149, the symbol generated should be:
•
128 and beyond is not standard ASCII.
var s = "•";
alert(s.charCodeAt(0))
gives 8226
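The bullet is U+2022 (8226) in Unicode; 149 (0x95) is only its byte value in the Windows-1252 encoding, not a Unicode code point. So to produce it (my example, not from the original answer), use the Unicode value:

alert(String.fromCharCode(0x2022)); // '•' (same as String.fromCharCode(8226))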
https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/fromCharCode
Getting it to work with higher values: Although most common Unicode values can be represented with one 16-bit number (as expected early on during JavaScript standardization) and fromCharCode() can be used to return a single character for the most common values (i.e., UCS-2 values which are the subset of UTF-16 with the most common characters), in order to deal with ALL legal Unicode values (up to 21 bits), fromCharCode() alone is inadequate. Since the higher code point characters use two (lower value) "surrogate" numbers to form a single character, String.fromCodePoint() (part of the ES6 draft) can be used to return such a pair and thus adequately represent these higher valued characters.
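For example (my illustration of the surrogate-pair mechanism the quote describes):

String.fromCodePoint(0x1F600);       // '😀' (one astral code point)
String.fromCharCode(0xD83D, 0xDE00); // '😀' (same character, built from its surrogate pair)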
The fromCharCode() method converts Unicode values into characters. To look up Unicode values, see the Unicode table:
http://unicode-table.com/en/
I got String.fromCodePoint(149) to show inside an alert in Firefox but not in IE & Chrome. Note, though, that code point 149 (U+0095) is a C1 control character, so whether anything visible appears depends on the browser and font.
It only looks correct according to the extended (Windows-1252-style) ASCII table:
http://www.asciitable.com/
This is the code I used
alert(String.fromCodePoint(149));

Numbers with leading zeros in Javascript

I receive a long XML from the backend. To use the XML further, I convert it to a JSON object using one of the standard XML-to-JSON JavaScript libraries. The issue is, some of the XML values contain numbers with leading zeros, e.g. 001072.
The problem is, when the JavaScript library converts the XML to JSON, numbers with leading zeros give a completely different value.
For example:
“001072” converts to 570
Other times it parses correctly. For example:
“0045678” converts to 45678
The problem is how JavaScript handles numbers with leading zeros. I don't know the reason for this strange behavior!
Please suggest a solution which can parse numbers with leading zeros consistently, and how I can use it with the xmltojson library.
This is most likely a problem with octal literals. If a number starts with a leading 0, JavaScript by default will try to parse it as an octal literal.
For this reason, you should always specify the radix parameter when calling parseInt. The library probably does not do that.
parseInt("012", 8); // 10
parseInt("012", 10); // 12
I think this is the offending line in the library, probably. Either edit the library, or edit your XML.
Octal numbers with a leading zero are on the way out. In ECMAScript 5 they can still cause problems; they are not allowed in strict mode, where they throw a SyntaxError. You really should not be using 3rd-party scripts that are not in strict mode; they are dangerous for way too many reasons, as you can see with the handling of octal numbers here.
As ECMAScript 6 becomes more widespread, the use of the leading zero will be pushed out altogether.
Octal literals now have a '0o' prefix: 0o10 === 8. The 'o' can be uppercase, but I am sure you can see that would be a hassle. ES6 also formalises the binary format with the prefix 0b (0b1000 === 8), though most browsers have supported it for some time. Hex has also been around for a while: 0x08 === 8.
The reason some numbers with leading zeros are decimal and some octal depends on which digits are in the number. Octal does not use the digits 8 and 9, so any number containing those digits cannot be octal.
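Given all of the above, a safe pattern for the asker's situation (a sketch; it assumes you can intercept the values while they are still strings) is to parse with an explicit decimal radix, or keep the values as strings if the leading zeros are significant:

parseInt('001072', 10); // => 1072, regardless of engine
Number('001072');       // => 1072 (string-to-number conversion never applies legacy octal)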

How to correctly use Unicode and UTF-8 special characters in Node and browser js?

So i have this character:
🀀
MAHJONG TILE EAST WIND, which has the Unicode code point U+1F000 (UTF-16 surrogate pair D83C DC00) and the UTF-8 encoding F0 9F 80 80.
My question is: how do I escape this in JavaScript?
I see \uXXXX escapes all the time, but four hex digits only take you up to U+FFFF. Just putting '\u1F000' returns the (incorrect) 'ἀ0', and trying to fill in the extra digits with 0s ('\u0001F000') doesn't work either. How do I escape higher values, such as my character above?
And how do I escape not just the Unicode point but also the UTF-8 encoding?
On top of this, I have noticed that the Node REPL is able to show many Unicode values but not some (such as emoji), even when my terminal window (Mac) normally can. Is there any rhyme or reason to this?
You can escape the char using two \uXXXX escapes (a UTF-16 surrogate pair; this covers all code points up to U+10FFFF).
To use UTF-8 strings look into typed arrays and TextEncoder / TextDecoder. They are fairly new so you may need to use polyfill in some browsers.
Example
document.write('<h1>\uD83C\uDC00</h1>');
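And for the UTF-8 bytes the question mentions, the TextEncoder mentioned above produces exactly that sequence (my example; requires an environment that supports TextEncoder):

console.log(new TextEncoder().encode('\uD83C\uDC00')); // Uint8Array [240, 159, 128, 128] = F0 9F 80 80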
JavaScript does not support UTF-8 strings. All JavaScript strings are UCS-2 (but it supports UTF-16-style surrogate pairs). You can escape astral plane characters with two 16-bit characters: "\ud83c\udc00".
"🀀".charCodeAt(0).toString(16)
// => "d83c"
"🀀".charCodeAt(1).toString(16)
// => "dc00"
console.log("\ud83c\udc00")
// => 🀀
This also means that JavaScript doesn't know how to get the correct length of strings containing astrals, and that any indexing or substringing has a chance of being wrong:
"🀀".length
// => 2
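Worth adding for engines newer than this answer (ES2015+): code point escapes and code-point-aware APIs handle astral characters directly, and the string iterator counts by code point rather than code unit:

'\u{1F000}' === '\uD83C\uDC00'  // => true (ES6 code point escape)
String.fromCodePoint(0x1F000)   // => '🀀'
[...'🀀'].length                // => 1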
