I'm using macOS Big Sur 11.4.
A file name on my mac called: второй
If I copy and paste the file name into the Chrome console and print "второй".charCodeAt(5), I get 1080.
In Safari, "второй".charCodeAt(5) gives 1081.
This causes some discrepancies in my app.
Is there a way to handle this so both browsers will act the same?
There are (at least) two ways to write that word in Unicode: второй (as in your question), which uses a и (U+0438) followed by the combining character for the mark (U+0306); and второй, which uses a single code point (U+0439) that is the combination of those (й). The one using a separate letter and combining mark is in normalization form D ("canonical decomposed," in which separate code points with combining marks are used where possible), and the one using the combined code point is in normalization form C ("canonical composed," in which combined code points are used where possible).
So for whatever reason, on Safari your string (in form D) is getting normalized to form C, but not on Chrome.
To ensure you're dealing with the same sequence of code points, you can normalize the string using the normalize method (ES2015+). It defaults to NFC, but you can pass it "NFD" if you want NFD:
const original = "второй";
console.log("original:", original.length, original.charCodeAt(5));
const nfc = original.normalize(); // Defaults to "NFC"
console.log("NFC:", nfc.length, nfc.charCodeAt(5));
const nfd = nfc.normalize("NFD");
console.log("NFD:", nfd.length, nfd.charCodeAt(5));
Note that charCodeAt works in UTF-16 code units, not code points (post on my blog about the difference), although it happens that in your example all of the code points are represented by a single code unit. You can use codePointAt to look at code points instead, although (again) in this particular case it doesn't make a difference.
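As a minimal sketch of the normalization step described above, the snippet below builds both forms of the word from explicit escapes (so copy-pasting cannot silently change them) and compares them after normalizing. The helper name sameText is hypothetical, not part of any API.

```javascript
// Explicit escapes keep the two forms distinct regardless of editor or browser.
const decomposed = "\u0432\u0442\u043E\u0440\u043E\u0438\u0306"; // NFD: U+0438 + combining U+0306
const composed = "\u0432\u0442\u043E\u0440\u043E\u0439";         // NFC: single U+0439

// Hypothetical helper: compare two strings by their NFC form.
function sameText(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}

console.log(decomposed === composed);                    // false
console.log(sameText(decomposed, composed));             // true
console.log(decomposed.charCodeAt(5));                   // 1080
console.log(composed.charCodeAt(5));                     // 1081
console.log(decomposed.normalize("NFC").charCodeAt(5));  // 1081
```

Normalizing file names once, at the point where they enter the app, should make both browsers see the same code units afterwards.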
I have never seen this error before, and I find it very odd.
My code looks like this:
const no = require('noparenth');
(at index.js:1)
The library in question just lets you invoke a function without parentheses. I wrote it to make coding easier and faster.
Its code is:
Function.prototype.valueOf = function() {
    this.call(this);
    return 0;
};
That's it.
Actual error:
C:\Users\summit\testing\index.js:1
��c
SyntaxError: Invalid or unexpected token
One possible way to get that type of characters is to save a file as UTF-16 LE and load it as if it were UTF-8.
If you use PowerShell, it's easy to produce a file in UTF-16 LE encoding, for example if you run echo "" > index.js. If you open the resulting file in VSCode, you can see at the bottom right that it's UTF-16 LE.
Note that the resulting file is not really empty, and you can verify that by inputting require('fs').readFileSync('index.js') in the Node.js REPL.
You'll get <Buffer ff fe 0d 00 0a 00> which, when interpreted in UTF-16 LE, consists of byte order mark (U+FEFF), carriage return (U+000D) and new line character (U+000A).
Even if you open that file in a text editor and replace everything with your own text, the byte order mark will still be there (as long as you keep saving in UTF-16 LE). For example, if you save const x = 0, the buffer becomes ff fe 63 00 ...; note that the ff fe is still there.
When a program attempts to load it as UTF-8, it will get <invalid character> <invalid character> c <null character> and so on.
What you're seeing in the output is exactly that: <invalid character> <invalid character> c (Unicode: U+FFFD U+FFFD U+0063), which is not a valid token in Node.js.
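The misdecoding described above can be reproduced in Node.js without touching the file system. This is only a sketch: it builds the bytes of a UTF-16 LE file by hand (BOM included) and decodes them both ways.

```javascript
// Bytes of a UTF-16 LE file containing "const x = 0", including the BOM (ff fe).
const bytes = Buffer.concat([
  Buffer.from([0xff, 0xfe]),
  Buffer.from("const x = 0", "utf16le"),
]);

const asUtf8 = bytes.toString("utf8");     // wrong decoding
const asUtf16 = bytes.toString("utf16le"); // right decoding (BOM is kept as U+FEFF)

// Show the first four characters of the wrong decoding as hex code points.
const codePoints = Array.from(asUtf8.slice(0, 4), ch => ch.codePointAt(0).toString(16));
console.log(codePoints);                       // [ 'fffd', 'fffd', '63', '0' ]
console.log(asUtf16 === "\uFEFFconst x = 0");  // true
```

Re-saving the file as UTF-8 in the editor is the usual way out.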
As for your actual function, it will only be invoked when you have type coercions (like +myFunction or myFunction + "") or if you directly call it like myFunction.valueOf().
And I expect it to make it (slightly) slower, because it calls a user-defined function rather than just native code.
And apart from the bad idea of modifying prototypes of objects you don't own, there is limited usefulness (if any) to this.
The this.call(this) means that you can't add more arguments to the function call.
Also, it's equivalent to myFunction.call(myFunction), and I'm not sure why you would want that. When used on a method of an object, it's like myObject.myMethod.call(myObject.myMethod) (with this equal to the function itself) rather than myObject.myMethod.call(myObject).
And it will break things that depend on the behavior of myFunction + "" etc.
It doesn't seem that easy to find an example, but I did find this one:
isNative(Math.sin) was originally supposed to return true, but after modifying the prototype as shown above, it became 3.
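The coercion behaviour discussed above can be seen in a small sketch. To avoid changing every function in the process, it installs the same valueOf body on a single function rather than on Function.prototype; the names greet and plain are made up for illustration.

```javascript
let calls = 0;
function greet() { calls += 1; }
function plain() {}

// Same body as the library, but set on one function only (an assumption made for safety).
greet.valueOf = function () {
  this.call(this);
  return 0;
};

const asNumber = +greet;      // numeric coercion calls valueOf, which calls greet
const asString = greet + "";  // valueOf is tried before toString, so the source text is lost

console.log(calls);                                      // 2
console.log(asNumber, JSON.stringify(asString));         // 0 "0"
console.log((plain + "").startsWith("function plain"));  // true
```

Anything that relies on myFunction + "" returning the function's source text would get "0" instead.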
I'm generating an "Up-Time" Win7 Gadget and trying to reproduce the .vbs code found in similar Gadgets.
I'm a .js coder.
Relevant JS:
vbStr=GetUpTime();
Relevant VBS:
Function GetUpTime
    Set loc=CreateObject("WbemScripting.SWbemLocator")
    Set svc=loc.ConnectServer(MachineName, "root\cimv2")
    Set oss=svc.ExecQuery("SELECT * FROM Win32_OperatingSystem")
    For Each os in oss
        tim=os.LastBootUpTime
    Next
    GetUpTime=tim
End Function
Essentially this .vbs does the trick, as currently there is only one OS running. I would like to expand on this by learning:
1) What is the relevance of MachineName?
If I return MachineName instead of tim, I get an undefined value.
2) How to extract an individual os without the For Each loop, equivalent to the .js:
os=oss[n];
3) How to return an array of tim values, one for each os.
The .vbs code loops through the available OSes and gets their respective up-times, but the developer only planned for one OS, so there is no code to return an array of tim values. After researching .vbs arrays I've found how to create a fixed-length array, but that doesn't help here.
MachineName is undefined, so it is treated as a zero-length string, which means it is ignored. Normally it names the computer on the network that you wish to query. Because the argument is optional, its being undeclared doesn't raise an error.
Using COM abbreviation (where Items is a default property)
os=oss(1)
or in full
os=oss.items(1)
A dictionary is easier (from Help).
Set d = CreateObject("Scripting.Dictionary")
d.Add "a", "Athens" ' Add some keys and items.
d.Add "b", "Belgrade"
d.Add "c", "Cairo"
NB: JScript uses COM just like VBScript. The code would be similar.
I am having trouble understanding how to properly use Visitors in ANTLR4, Javascript target.
I have prepared a very basic grammar, it accepts INT + INT or INT - INT operations.
grammar PlusMinus;
INT : [0-9]+;
WS : [ \t\r]+ -> skip;
PLUS : '+';
MINUS : '-';
input : plusOrMinus
;
plusOrMinus
: numberLeft PLUS numberRight # Plus
| numberLeft MINUS numberRight # Minus
;
numberLeft : INT;
numberRight : INT;
From this grammar ANTLR will generate a Visitor that has these three functions, visitInput, visitPlus and visitMinus. I start from visitInput where I will be able to fetch the operation ctx by doing this operation = ctx.plusOrMinus().
This is where I get stuck: how do I know if operation is of type plus or minus? In other words, where do I pass ctx.plusOrMinus(): to visitPlus() or visitMinus()?
I managed to create a visitor that does work, but it's very ugly, I am posting it here because perhaps it will help to better understand my question. Lines 20-29 is where the problem is.
First of all, PLUS and MINUS are lexer rules. You don't visit tokens (the result of lexer rules).
It looks like you're expecting this to work like a listener, where you set up a function that gets called when the tree walker reaches a node. You can be called on entry to or exit from the node, depending on whether you want the node before or after its children have been processed. Visitors expect you to handle your own tree navigation, which is sometimes useful, but listeners are cleaner where they suit the purpose. With nesting, you'll probably want to listen after the child nodes are processed, so you'll want to implement an exitPlusOrMinus() function on your listener. I'd suggest stopping your code in the debugger inside this function to look at the objects available to you in the ctx object.
You also need to rethink your numberLeft and numberRight parser rules. Something more like:
plusOrMinus: lexpr=INT (op=PLUS | op=MINUS) rexpr=INT;
would give you a pretty close equivalent to what you have so far. What you have will work with a recursive descent parser like ANTLR (so far as this example goes), but you're headed in the wrong direction by making them two alternative parse rules. When you place alternatives like this in a parser rule, you're also establishing precedence, so be careful about their order: with separate alternatives you're giving PLUS a higher precedence than MINUS, whereas PLUS and MINUS should have the same precedence in order of evaluation. As a result, they need to be the same parse rule.
To get further than adding or subtracting integers, though, you'll need lexpr and rexpr to actually be expressions themselves (you should read up on expression parsing in the ANTLR book; it's covered very nicely).
With that rule, your exitPlusOrMinus can parse the int values of lexpr and rexpr and then evaluate the value of the op to determine whether to add or subtract.
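As a rough sketch of what such an exit function might compute, the snippet below evaluates a context for the labelled rule plusOrMinus: lexpr=INT (op=PLUS | op=MINUS) rexpr=INT;. It does not use the ANTLR runtime: the ctx objects are hand-built stand-ins, and the assumption that ctx.lexpr, ctx.op and ctx.rexpr are token-like objects with a text field should be checked in the debugger against the generated parser.

```javascript
// Hypothetical helper that a listener's exitPlusOrMinus(ctx) could delegate to.
// Assumption: each labelled token exposes its matched text as `text`.
function evaluatePlusOrMinus(ctx) {
  const left = parseInt(ctx.lexpr.text, 10);
  const right = parseInt(ctx.rexpr.text, 10);
  switch (ctx.op.text) {
    case "+": return left + right;
    case "-": return left - right;
    default: throw new Error("Unexpected operator: " + ctx.op.text);
  }
}

// Hand-built contexts standing in for what the parser would supply.
const plusCtx = { lexpr: { text: "7" }, op: { text: "+" }, rexpr: { text: "5" } };
const minusCtx = { lexpr: { text: "7" }, op: { text: "-" }, rexpr: { text: "5" } };

console.log(evaluatePlusOrMinus(plusCtx));   // 12
console.log(evaluatePlusOrMinus(minusCtx));  // 2
```

Because both operators go through one function, neither gets its own rule alternative, which matches the single-rule grammar suggested above.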
Sometimes, when viewing a page's HTML source, I see this code:
if (JSON.stringify(["\u2028\u2029"]) === '["\u2028\u2029"]') JSON.stringify = function (a) {
    var b = /\u2028/g,
        c = /\u2029/g;
    return function (d, e, f) {
        var g = a.call(this, d, e, f);
        if (g) {
            if (-1 < g.indexOf('\u2028')) g = g.replace(b, '\\u2028');
            if (-1 < g.indexOf('\u2029')) g = g.replace(c, '\\u2029');
        }
        return g;
    };
}(JSON.stringify);
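What is the problem with JSON.stringify(["\u2028\u2029"]) that it needs to be checked?
Additional info:
The value of JSON.stringify(["\u2028\u2029"]) is displayed as "[" with "]" on the next line, because the two separators are output as raw characters that render as a line break.
The value of '["\u2028\u2029"]' is displayed the same way.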
I thought it might be a security feature. FileFormat.info for 2028 and 2029 have a banner stating
Do not use this character in domain names. Browsers are blacklisting it because of the potential for phishing.
But it turns out that the line and paragraph separators \u2028 and \u2029 respectively are treated as a new line in ES5 JavaScript.
From http://www.thespanner.co.uk/2011/07/25/the-json-specification-is-now-wrong/
\u2028 and \u2029 characters that can break entire JSON feeds since the string will contain a new line and the JavaScript parser will bail out
So you are seeing a patch for JSON.stringify. Also see Node.js JavaScript-stringify
Edit: Yes, modern browsers' built-in JSON object should take care of this correctly. I can't find any links to the actual source to back this up though. The Chromium code search doesn't mention any bugs that would warrant adding this workaround manually. It looks like Firefox 3.5 was the first version to have native JSON support, not entirely bug-free though. IE8 supports it too. So it is likely a now unnecessary patch, assuming browsers have implemented the specification correctly.
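For output that will be embedded in a script, the same escaping can be done without replacing JSON.stringify globally. The following is a sketch only; the wrapper name stringifyForScript is made up.

```javascript
// Hypothetical wrapper: stringify, then escape the two separator characters.
function stringifyForScript(value) {
  const json = JSON.stringify(value);
  if (json === undefined) return json; // e.g. for functions or undefined
  return json.replace(/\u2028/g, "\\u2028").replace(/\u2029/g, "\\u2029");
}

const raw = JSON.stringify(["\u2028\u2029"]);
const escaped = stringifyForScript(["\u2028\u2029"]);

console.log(raw.length);                                 // 6 (the separators are emitted raw)
console.log(escaped);                                    // ["\u2028\u2029"]
console.log(JSON.parse(escaped)[0] === "\u2028\u2029");  // true
```

The escaped text is still valid JSON and parses back to the same string, so the escaping is lossless.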
After reading both answers, here is a simple visual explanation.
Doing this:
alert(JSON.stringify({"a":"sddd\u2028sssss"})) // can cause problems
will alert:
While changing the troublemaker to something else (for example, from \u to \1u)
will alert:
Now let's invoke the function from my original question,
and try alert(JSON.stringify({"a":"sddd\u2028sssss"})) again:
Result:
And now everybody's happy.
\u2028 and \u2029 are invisible Unicode line and paragraph separator characters. The native JSON.stringify method emits these characters literally (just as JavaScript converts the escapes to the actual characters inside a string), so the result is "[" and "]" with the two raw separators in between, which render as a line break. The code you have provided stops the separators from being emitted raw and preserves their \uXXXX representation in the output string, i.e. it returns "["\u2028\u2029"]".
I want to format numbers using JavaScript as shown below:
10.00=10,00
1,000.00=1.000,00
Every browser supports Number.prototype.toLocaleString(), a method intended to return a localized string from a number. However, the specification defines it as follows:
Produces a string value that represents the value of the Number formatted according to the conventions of the host environment's current locale. This function is implementation-dependent, and it is permissible, but not encouraged, for it to return the same thing as toString.
Implementation-dependent means that it's up to the vendor how the result will look, which results in interoperability issues.
Internet Explorer (IE 5.5 to IE 9) comes closest to what you want and formats the number in a currency style - thousands separator and fixed at 2 decimal places.
Firefox (2+) formats the number with a thousands separator and decimal places but only if applicable.
Opera, Chrome & Safari output the same as toString() -- no thousands separator, decimal place only if required.
Solution
I came up with the following code (based on an old answer of mine) to try and normalize the results to work like Internet Explorer's method:
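(function (old) {
    // Detect the locale's decimal separator, then use the other character for thousands.
    var dec = 0.12 .toLocaleString().charAt(1),
        tho = dec === "." ? "," : ".";

    // Only override when the native method doesn't already produce the IE-style format.
    if (1000 .toLocaleString() !== "1,000.00") {
        Number.prototype.toLocaleString = function () {
            var neg = this < 0,
                f = this.toFixed(2).slice(+neg); // digits without the sign

            return (neg ? "-" : "")
                + f.slice(0,-3).replace(/(?=(?!^)(?:\d{3})+(?!\d))/g, tho)
                + dec + f.slice(-2);
        }
    }
})(Number.prototype.toLocaleString);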
This will use the browser's built-in localization if it's available, whilst gracefully degrading to the browser's default locale in other cases.
Working demo: http://jsfiddle.net/R4DKn/49/
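If the goal is only the fixed format from the question (dot for thousands, comma for decimals, two decimal places), a standalone function avoids touching Number.prototype and does not depend on the browser's locale. This is a sketch under that assumption; the name formatEuropean is made up.

```javascript
// Hypothetical helper: always formats as 1.234.567,89 regardless of locale.
function formatEuropean(value) {
  const negative = value < 0;
  const [intPart, fracPart] = Math.abs(value).toFixed(2).split(".");
  // Insert a dot before every group of three digits, counted from the right.
  const grouped = intPart.replace(/\B(?=(\d{3})+(?!\d))/g, ".");
  return (negative ? "-" : "") + grouped + "," + fracPart;
}

console.log(formatEuropean(10));            // 10,00
console.log(formatEuropean(1000));          // 1.000,00
console.log(formatEuropean(-1234567.891));  // -1.234.567,89
```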
I know this solution using NumberFormat but it is necessary to convert the values to string.
https://developer.mozilla.org/es/docs/Web/JavaScript/Referencia/Objetos_globales/NumberFormat
// Remove commas
var number = "10,000.00".replace(/,/g, '');
// Create a NumberFormat type object
var formatter = new Intl.NumberFormat('de-DE', {
    minimumFractionDigits: 2
});
// Apply format
console.log(formatter.format(number));
Output: 10.000,00
Javascript doesn't provide this functionality itself, but there are a number of third-party functions around which can do what you want.
Note, whichever method you use, you should be careful to only use the resulting string for display purposes -- it won't be a valid number value in Javascript after you've converted the decimal point to a comma.
The quickest solution I can offer is to use the number_format() function written by the phpJS people. They've implemented Javascript versions of a load of commonly-used PHP functions, including number_format(), and this function will do exactly what you want.
See the link here: http://phpjs.org/functions/number_format
I wouldn't bother about taking the whole phpJS library (a lot of it is of questionable value anyway), but just grab the 20-odd line function shown on the page linked above and paste it into your app.
If you want a more flexible solution, there are a number of JS functions around which simulate the printf() function from C. There is already a good question on SO that covers this. See here: JavaScript equivalent to printf/string.format
Hope that helps.