How does string comparison work in JavaScript? [duplicate] - javascript

This question already has answers here:
Why is one string greater than the other when comparing strings in JavaScript?
(5 answers)
Closed 1 year ago.
What algorithm is behind string comparison in JavaScript? For example:
'bc' > 'ac' = true/false ?
'ac' > 'bc' = true/false ?

This is calculated using the Abstract Relational Comparison Algorithm in ECMAScript 5. The relevant part is quoted below.
4. Else, both px and py are Strings
a) If py is a prefix of px, return false. (A String value p is a prefix
of String value q if q can be the result of concatenating p and some
other String r. Note that any String is a prefix of itself, because
r may be the empty String.)
b) If px is a prefix of py, return true.
c) Let k be the smallest nonnegative integer such that the character
at position k within px is different from the character at position
k within py. (There must be such a k, for neither String is a prefix
of the other.)
d) Let m be the integer that is the code unit value for the character
at position k within px.
e) Let n be the integer that is the code unit value for the character
at position k within py.
f) If m < n, return true. Otherwise, return false.
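The quoted steps can be sketched as a small function. This is only an illustration: the name stringLessThan is hypothetical, and real engines implement the comparison natively.

```javascript
// Sketch of step 4 of the quoted algorithm for two string operands.
// stringLessThan is a hypothetical helper name, not part of the language.
function stringLessThan(px, py) {
  // a) If py is a prefix of px (including px === py), px is not smaller.
  if (px.startsWith(py)) return false;
  // b) If px is a prefix of py, px is smaller.
  if (py.startsWith(px)) return true;
  // c) Find the first position k where the two strings differ.
  let k = 0;
  while (px.charCodeAt(k) === py.charCodeAt(k)) k++;
  // d), e), f) Compare the code unit values found at position k.
  return px.charCodeAt(k) < py.charCodeAt(k);
}

console.log(stringLessThan('ac', 'bc')); // true, same as 'ac' < 'bc'
console.log(stringLessThan('bc', 'ac')); // false, same as 'bc' < 'ac'
console.log(stringLessThan('ab', 'abc')); // true, 'ab' is a prefix of 'abc'
```

Because 'bc' > 'ac' is evaluated as 'ac' < 'bc', the first call also explains why 'bc' > 'ac' is true.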

This is not specific to JavaScript. Put simply, string comparison in most languages follows lexicographical (dictionary) order: the strings are checked character by character, and the first mismatch decides the result.
All of the following are true:
ant < any
aaa < aab
aab < b
b < baa

JavaScript compares strings lexicographically by their UTF-16 code units. For characters in the Basic Multilingual Plane this is the same as the order of their Unicode code points.
From your question:
'bc' > 'ac' // Because 'b' comes after 'a'.
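Code-unit order and code-point order can disagree for characters stored as surrogate pairs. The following sketch shows one such case; the variable names are illustrative only.

```javascript
// U+1F600 lies outside the Basic Multilingual Plane, so it is stored
// as the surrogate pair D83D DE00 (two UTF-16 code units).
const emoji = '\u{1F600}';
// U+FF5E fits in a single code unit, FF5E.
const tilde = '\uFF5E';

console.log(emoji.codePointAt(0) > tilde.codePointAt(0)); // true: 0x1F600 > 0xFF5E
console.log(emoji.charCodeAt(0) < tilde.charCodeAt(0));   // true: 0xD83D < 0xFF5E
console.log(emoji < tilde);                               // true: the operator looks at code units
```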

Strings are compared character by character, so the very first letter is the most significant. Hence:
'bc' > 'ac' = true/false ? => true
'ac' > 'bc' = true/false ? => false
If the first letters are equal, the second letters are compared, and so on, until a difference is found or one of the strings runs out of characters.

Related

JavaScript string comparison [duplicate]

This question already has answers here:
Why is one string greater than the other when comparing strings in JavaScript?
(5 answers)
Closed 6 years ago.
I took a JS course on a website, and in one of the lessons there was a piece of code that did not make sense to me.
The code is in the picture. Why is str1 less than str2?
Strings are compared based on standard lexicographical ordering, using Unicode values. That means "a" < "b" and "c" > "b"
Two strings are strictly equal when they have the same sequence of
characters, same length, and same characters in corresponding
positions. source
var str1 = "aardvark";
var str2 = "beluga";
console.log(str1 < str2); // true
console.log(str1.length < str2.length); // false
This compares the characters one index at a time, starting from index 0. For example, "a" < "b" is true. If the characters are equal, it compares the next index, and so on.
"aad" > "aac" because the first two characters are equal ("a" = "a" twice) and then "d" > "c".
In this case JavaScript compares the strings lexicographically, character by character, where the letter 'a' is lower than the letter 'b' and so on. It works for digit characters too, and uppercase letters are considered lower than lowercase letters (for example, 'Z' < 'a').
So, in your example, 'a' < 'b' and therefore the statement is true.
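A few checks illustrate how case and digits behave under this ordering. The constant names are made up for the example.

```javascript
const upperVsLower = 'Z' < 'a';   // true: 'Z' is code unit 90, 'a' is 97
const lowerVsUpper = 'a' < 'B';   // false: 97 is not less than 66
const digitStrings = '10' < '9';  // true: '1' (49) is compared with '9' (57)
const realNumbers = 10 < 9;       // false: numbers are compared numerically

console.log(upperVsLower, lowerVsUpper, digitStrings, realNumbers); // true false true false
```

So strings that hold numbers should be converted with Number() before they are compared as quantities.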

Why does Number < String returns true in JavaScript?

EDIT: I will rephrase my question. I type Number < String and it returns true; the same happens when I do typeof(2) < typeof("2").
Number < String => true
typeof(2) < typeof("2") => true
I'm guessing it is due to the ASCII values of the letters in Number and String, but I am not sure whether that is the reason this returns true. I want to know why this happens: what process does the interpreter follow to get this result?
First answer:
The charCodeAt() method returns the numeric UTF-16 code unit value of the character at the given index. Read here
If you do not specify an index, the character at index 0 is used. The ASCII value of S is 83 and the ASCII value of N is 78, so those are the numbers you get. Check here.
And 78 < 83 => true is obvious.
Try "String".charCodeAt(1) and you will get 116, which is the ASCII value of t.
Second answer based on OP's edited question:
Frankly speaking, your comparison Number < String is unusual: the less-than operator < is meant for values such as numbers and strings, whereas Number and String are function objects that must first be converted to primitives. However, @Pointy explained how Number < String worked and gave you that result.
More insight on comparison operators
Comparison operators like < work on expressions, read here. Typically, both the LHS and the RHS should be a valid expression or a resolved value.
Now this is the definition of expression, read more here - "An expression is any valid unit of code that resolves to a value. Conceptually, there are two types of expressions: those that assign a value to a variable and those that simply have a value."
So (x = 7) < (x = 2) or new Number() < new String() is a "technically" valid comparison, and so is Object.toString < Number.toString(), but Object < Function is not really one.
Below are rules/features for comparisons, read more here
Two strings are strictly equal when they have the same sequence of characters, same length, and same characters in corresponding positions.
Two numbers are strictly equal when they are numerically equal (have the same number value). NaN is not equal to anything, including NaN. Positive and negative zeros are equal to one another.
Two Boolean operands are strictly equal if both are true or both are false.
Two distinct objects are never equal for either strict or abstract comparisons.
An expression comparing Objects is only true if the operands reference the same Object.
Null and Undefined Types are strictly equal to themselves and abstractly equal to each other.
The result of
Number < String
is not the result of comparing the strings "Number" and "String", or not exactly that. It's the result of comparing the strings returned from Number.toString() and String.toString(). Those strings will (in all the runtimes I know of) have more stuff in them than just the strings "Number" and "String", but those two substrings will be the first place that they're different.
You can see what those actual strings are by typing
Number.toString()
in your browser console.
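A quick way to check this outside the console is sketched below. The exact text of the two strings depends on the runtime, so no specific output is assumed for them; the variable names are illustrative.

```javascript
// String(fn) produces the same text as fn.toString().
const numberSource = String(Number);
const stringSource = String(String);

console.log(numberSource); // runtime-dependent text that contains "Number"
console.log(stringSource); // runtime-dependent text that contains "String"

// The operator compares exactly these two strings.
console.log((Number < String) === (numberSource < stringSource)); // true
```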
JavaScript does the following thing:
"String".charCodeAt(); => 83
"S".charCodeAt(); => 83
"String".charCodeAt(0); => 83
The method charCodeAt(a) gets the char code from position a. The default value is 0.
If you compare N < S you will get 78 < 83 => true.
For complete strings, JavaScript compares the char codes position by position and stops at the first position where they differ; it does not sum them.
So I can answer your question with yes.
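A small check shows that the result is decided at the first differing position and not by any total over the characters. The helper sumOfCodes is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: adds up the code units of a string.
function sumOfCodes(s) {
  let total = 0;
  for (let i = 0; i < s.length; i++) {
    total += s.charCodeAt(i);
  }
  return total;
}

console.log(sumOfCodes('az')); // 219 (97 + 122)
console.log(sumOfCodes('b'));  // 98
console.log('az' < 'b');       // true: 'a' (97) < 'b' (98) decides it at position 0
```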

Why does '' (empty string) permeate all strings?

I just ran into a bit of confusion today: "string".indexOf(''); always returns 0, but I would expect -1 (for false); conversely, "string".lastIndexOf(''); always returns 6.
lastIndexOf is easier to understand, since string is 6 letters long ("string".length returns 6, while the last zero-based index is 5), but I don't see anywhere in the ECMAScript spec (5.1 or 6.0) that describes why '' would be treated like a word/character boundary.
What, exactly, is going on here?
The spec says:
Return the smallest possible integer k not smaller than start such
that k+searchLen is not greater than len, and for all
nonnegative integers j less than searchLen, the character at
position k+j of S is the same as the character at position j
of searchStr; but if there is no such integer k, then return the
value -1.
That condition is fulfilled at position 0 because of vacuous truth: since you are searching for the empty string, any statement you can think of holds for every one of its characters, because it has no characters.
More formally, for any statement P, if S = ∅, P(x) holds ∀ x ∈ S.
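The quoted condition can be followed literally in a short sketch. naiveIndexOf is a hypothetical helper, and it assumes start is already between 0 and the string length (the built-in method clamps it first).

```javascript
// Hypothetical helper that mirrors the wording of the quoted spec step.
function naiveIndexOf(S, searchStr, start = 0) {
  const len = S.length;
  const searchLen = searchStr.length;
  for (let k = start; k + searchLen <= len; k++) {
    let allMatch = true; // stays true when searchLen is 0: nothing to check
    for (let j = 0; j < searchLen; j++) {
      if (S[k + j] !== searchStr[j]) {
        allMatch = false;
        break;
      }
    }
    if (allMatch) return k;
  }
  return -1;
}

console.log(naiveIndexOf('string', ''));    // 0
console.log('string'.indexOf(''));          // 0
console.log('string'.lastIndexOf(''));      // 6
console.log(naiveIndexOf('string', 'ing')); // 3
```

With an empty search string the inner loop never runs, so the very first candidate position is accepted.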

How does the Javascript '>' operator compare characters with a space?

I am trying to understand this expression:
((ch = stream.getChar()) > ' ')
Here, getChar() gets a character. How does this greater-than comparison operator check whether a character is greater than a space?
Is this possible?
An empty space has a character code. Even though it doesn't look like much, it still has a value. So does the character taken from the stream. Comparing the character codes of these values is what produces the output.
Let's take a gander at the language specification (the algorithm itself is described in here). Do note that it defines <; the > operator uses the same algorithm with the operands swapped.
What the operator does is try to convert both operands to primitive types, with a preference for numbers:
2. a. Let py be the result of calling ToPrimitive(y, hint Number).
2. b. Let px be the result of calling ToPrimitive(x, hint Number).
In our case, x === stream.getChar() and y === ' '. Since both of the operands are primitive strings already, that results in the original values (px = x, py = y), and we move on to:
4. Else, both px and py are Strings
Now it checks whether either operand is a prefix of the other, for example:
'abc' > 'abcd' // false
'foo' > 'foobar' // false
Which is relevant if getChar() results in a space, since the space is a prefix of itself:
' ' > ' ' // false
We move on to finding the first position at which x and y have different characters:
Let k be the smallest nonnegative integer such that the character at position k within px is different from the character at position k within py. (There must be such a k, for neither String is a prefix of the other.)
(e.g., 'efg' and 'efh', we want g and h)
The characters we've found are then converted to their integer values:
Let m be the integer that is the code unit value for the character at position k within px.
Let n be the integer that is the code unit value for the character at position k within py.
And finally, a comparison is made:
If m < n, return true. Otherwise, return false.
And that's how it's compared to the space.
tl;dr It finds the first position where the two strings differ, converts the characters there to their code-unit integer values, and compares those.
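As a rough illustration of what such a check accepts and rejects, consider the sketch below. isAfterSpace is a hypothetical stand-in for the expression in the question, since stream.getChar() is not available here.

```javascript
// Hypothetical stand-in for (ch = stream.getChar()) > ' '.
function isAfterSpace(ch) {
  return ch > ' ';
}

console.log(' '.charCodeAt(0));  // 32
console.log(isAfterSpace('a'));  // true: 97 > 32
console.log(isAfterSpace('!'));  // true: 33 > 32
console.log(isAfterSpace('\t')); // false: 9 is below 32
console.log(isAfterSpace('\n')); // false: 10 is below 32
console.log(isAfterSpace(' '));  // false: a string is a prefix of itself
console.log(isAfterSpace(''));   // false: the empty string is a prefix of ' '
```

One plausible reason to write the test this way is that it rejects the space, tab, newline, and other ASCII control characters in a single comparison.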
In JavaScript, strings are compared lexicographically, which for letters of the same case matches alphabetical order. These expressions are true:
'abacus' <= 'calculator'
'abacus' < 'abate'
In most (if not all) programming languages, characters are represented internally by a number. When you do equality/greater-than/less-than checks what you're actually checking is the underlying number.
hence in JS:
alert('c' > 'b'); // alerts true
alert('a' > 'b'); // alerts false
A space character also has a numeric representation, therefore the check is a valid one.
[string] > [string] will compare the character(s) by their representative values (see ASCII Table)
Characters are stored in the computer's memory as a number (usually a byte or two).
Each character has a unique identifying number.
By checking whether a character is greater than a space, you actually compare their places in that table.
See http://en.wikipedia.org/wiki/ASCII for more.
Check out this link, it'll explain how the comparison works on JS: http://javascript.about.com/od/decisionmaking/a/des02.htm
Basically, you're comparing the ASCII value of each character to the ASCII value of the blank space, which is also a character and therefore has a corresponding ASCII value.

Javascript Compiler behavior - double plus for array of empty array and array of zero is.. ONE

My question may already have been answered, but I could not find it: search engines such as Google or Bing don't handle the '+' (plus) sign well in a search request.
Anyway, why is this zero
+[[]][0] // = 0
and this is one
++[[]][0] // = 1
UPD:
Michael Berkowski has a good answer, but I still don't understand one thing:
if [[]][0] evaluates to an empty array, then why is ++[] a ReferenceError: Invalid left-hand side expression in prefix operation?
UPD2:
now I get it.. it seems I was trying to type ++0 in console and getting an Error, but I should be using var a = 0; ++a
This is best explored by breaking down the way its components evaluate.
[[]][0] alone evaluates to the empty array []. By adding + in front, you convert its string representation (the empty string) to the number 0 via the unary plus operator (like saying +4 or -3). +0 is just 0.
++, as a numeric operator, also converts the empty string to the number 0, but then applies its operation (the prefix increment), resulting in 1.
[[]][0]
// [] empty array
[[]][0].toString()
// ""
// Unary + casts the empty string to an integer
+("")
// 0
// Prefix increment on an empty string results in 1 (increments the 0)
var emptyString = "";
++emptyString;
// 1
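The same steps can be reproduced on a named array, which also shows why the increment needs something assignable. The names unaryPlus, holder, and incremented are illustrative only.

```javascript
// Unary plus only reads the value: [] -> "" -> 0.
const unaryPlus = +[[]][0];

// Prefix increment reads holder[0], converts it to 0, adds 1,
// and writes the result back into holder[0].
const holder = [[]];
const incremented = ++holder[0];

console.log(unaryPlus);   // 0
console.log(incremented); // 1
console.log(holder[0]);   // 1: the inner array was replaced by the number
```

This is why ++[[]][0] works while ++[] does not: [[]][0] is a property access, which can be assigned to, whereas a bare array literal cannot.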
