Why does '' (empty string) permeate all strings? - javascript

I ran into a bit of confusion today: "string".indexOf('') always returns 0, but I would expect -1 (for not found); conversely, "string".lastIndexOf('') always returns 6.
lastIndexOf is easier to understand, since "string" is 6 characters long ("string".length is 6, while the last zero-based index is 5), but I don't see anywhere in the ECMAScript spec (5.1 or 6.0) that describes why '' would be treated like a word/character boundary.
What, exactly, is going on here?

The spec says:
Return the smallest possible integer k not smaller than start such
that k+searchLen is not greater than len, and for all
nonnegative integers j less than searchLen, the character at
position k+j of S is the same as the character at position j
of searchStr; but if there is no such integer k, then return the
value -1.
That condition is fulfilled at position 0 because of vacuous truth: since you are searching the empty string, any statement you can think of will hold for every character, because it has no characters.
More formally, for any statement P, if S = ∅, P(x) holds ∀ x ∈ S.
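The quoted rule can be read almost literally as code. Here is a minimal sketch, assuming a hypothetical helper named indexOfSketch (it is not a built-in, just an illustration of the spec wording). With an empty search string the inner loop never runs, so the very first candidate k is accepted.

```javascript
// Hypothetical helper: a literal reading of the quoted spec text.
function indexOfSketch(S, searchStr, start = 0) {
  const len = S.length;
  const searchLen = searchStr.length;
  for (let k = Math.min(Math.max(start, 0), len); k + searchLen <= len; k++) {
    let matches = true;
    // For an empty searchStr this loop body never executes,
    // so `matches` stays true: the condition holds vacuously.
    for (let j = 0; j < searchLen; j++) {
      if (S[k + j] !== searchStr[j]) {
        matches = false;
        break;
      }
    }
    if (matches) return k;
  }
  return -1;
}

console.log(indexOfSketch("string", ""));  // 0
console.log("string".indexOf(""));         // 0
console.log("string".lastIndexOf(""));     // 6
console.log(indexOfSketch("string", "x")); // -1
```

lastIndexOf applies the same rule but looks for the largest k, which is why it lands on the string's length.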

Related

Why does Number < String returns true in JavaScript?

EDIT: I will rephrase my question, I type Number < String and it returns true, also works when I do typeof(2) < typeof("2").
Number < String => true
typeof(2) < typeof("2") => true
I'm guessing it compares the ASCII values of the letters in Number and String, but I am not sure that is why this returns true. Why does this happen, and how does the interpreter arrive at this result?
First answer:
The charCodeAt() method returns the numeric Unicode value of the character at the given index.
If you do not specify an index, the character at index 0 is used. The ASCII value of S is 83 and that of N is 78, so those are the numbers you get.
And 78 < 83 => true is obvious.
Try "String".charCodeAt(1) and you will get 116, which is the ASCII value of t.
Second answer based on OP's edited question:
Frankly speaking, your comparison Number < String is "technically" incorrect because the less-than operator < (or any similar operator) is for expressions, and Number and String are functions and not expressions. However, @Pointy explained how Number < String worked and gave you results.
More insight on comparison operators
Comparison operators like < work on expressions. Typically, you should have a valid expression or resolved value for RHS and LHS.
Now this is the definition of an expression: "An expression is any valid unit of code that resolves to a value. Conceptually, there are two types of expressions: those that assign a value to a variable and those that simply have a value."
So, (x = 7) < (x = 2) or new Number() < new String() is a "technically" valid/good comparison, as is even Object.toString < Number.toString(), but not really Object < Function.
Below are rules/features for comparisons:
Two strings are strictly equal when they have the same sequence of characters, same length, and same characters in corresponding positions.
Two numbers are strictly equal when they are numerically equal (have the same number value). NaN is not equal to anything, including NaN. Positive and negative zeros are equal to one another.
Two Boolean operands are strictly equal if both are true or both are false.
Two distinct objects are never equal for either strict or abstract comparisons.
An expression comparing Objects is only true if the operands reference the same Object.
Null and Undefined Types are strictly equal to themselves and abstractly equal to each other.
The result of
Number < String
is not the result of comparing the strings "Number" and "String", or not exactly that. It's the result of comparing the strings returned from Number.toString() and String.toString(). Those strings will (in all the runtimes I know of) have more stuff in them than just the strings "Number" and "String", but those two substrings will be the first place that they're different.
You can see what those actual strings are by typing
Number.toString()
in your browser console.
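To see this in action, here is a small sketch that converts both functions to strings and looks for the first position where they differ. The exact text returned by toString() is implementation-defined, so the sample strings in the comments are an assumption about typical engines; the variable names are only illustrative.

```javascript
// Relational comparison converts both functions to strings first.
const left = String(Number);  // typically "function Number() { [native code] }"
const right = String(String); // typically "function String() { [native code] }"

// Find the first position where the two strings differ.
let firstDiff = 0;
while (firstDiff < left.length && left[firstDiff] === right[firstDiff]) {
  firstDiff++;
}

// The differing characters (typically "N" and "S").
console.log(left[firstDiff], right[firstDiff]);

// Comparing the functions gives the same answer as comparing their strings.
console.log((Number < String) === (left < right));
```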
JavaScript does the following thing:
"String".charCodeAt(); => 83
"S".charCodeAt(); => 83
"String".charCodeAt(0); => 83
The method charCodeAt(a) gets the char code from position a. The default value is 0
If you compare N < S you will get 78 < 83 => true
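For complete strings, JavaScript does not sum the char codes: it compares the strings position by position, and the first position where the code units differ decides the result.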
So I can answer your question with yes.
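A quick way to check how whole strings compare is to pick two strings where a sum of char codes and a position-by-position comparison would disagree. This is only a sketch; codeUnitSum is a hypothetical helper written for this illustration.

```javascript
// Hypothetical helper: sums the UTF-16 code units of a string.
function codeUnitSum(s) {
  let sum = 0;
  for (let i = 0; i < s.length; i++) sum += s.charCodeAt(i);
  return sum;
}

console.log("N".charCodeAt(0), "S".charCodeAt(0)); // 78 83
console.log(codeUnitSum("aa"), codeUnitSum("b"));  // 194 98

// "aa" has the larger sum, yet it sorts before "b",
// because the first code units (97 vs 98) already differ.
console.log("aa" < "b"); // true
```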

Algorithm for javascript pre-defined functions (parseInt, parseFloat, isNaN, etc.)

What is the best way, if even possible, to see the underlying code for the predefined functions in Javascript. Is there documentation that shows how these were coded, or an easy way to actually view the underlying code?
parseInt
parseFloat
isNaN
They are native functions, and may be coded in the language your JS engine was written in; you'd need to consult its source.
However, you are probably more interested in the ECMAScript specification that describes how the algorithms work.
And if you're lucky, for some of the functions you might even find a JS equivalent. You'll find them mostly on pages that test ES implementations against the standard.
After looking further I found this in the ECMAScript specification.
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf
When the parseInt function is called, the following steps are taken:
Let inputString be ToString(string).
Let S be a newly created substring of inputString consisting of the first character that is not a
StrWhiteSpaceChar and all characters following that character. (In other words, remove leading white
space.) If inputString does not contain any such characters, let S be the empty string.
Let sign be 1.
If S is not empty and the first character of S is a minus sign -, let sign be −1.
If S is not empty and the first character of S is a plus sign + or a minus sign -, then remove the first character
from S.
Let R = ToInt32(radix).
Let stripPrefix be true.
If R ≠ 0, then
a. If R < 2 or R > 36, then return NaN.
b. If R ≠ 16, let stripPrefix be false.
Else, R = 0
a. Let R = 10.
If stripPrefix is true, then
a. If the length of S is at least 2 and the first two characters of S are either "0x" or "0X", then remove
the first two characters from S and let R = 16.
If S contains any character that is not a radix-R digit, then let Z be the substring of S consisting of all
characters before the first such character; otherwise, let Z be S.
If Z is empty, return NaN.
Let mathInt be the mathematical integer value that is represented by Z in radix-R notation, using the letters
A-Z and a-z for digits with values 10 through 35. (However, if R is 10 and Z contains more than 20
significant digits, every significant digit after the 20th may be replaced by a 0 digit, at the option of the
implementation; and if R is not 2, 4, 8, 10, 16, or 32, then mathInt may be an implementation-dependent
approximation to the mathematical integer value that is represented by Z in radix-R notation.)
Let number be the Number value for mathInt.
Return sign × number.
NOTE parseInt may interpret only a leading portion of string as an integer value; it ignores any characters that
cannot be interpreted as part of the notation of an integer, and no indication is given that any such characters were
ignored.
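The steps above can be observed from the outside without reading any engine source. A few calls, each annotated with the step that I believe produces the result:

```javascript
console.log(parseInt("   42px"));  // 42  (leading white space stripped, "px" ignored)
console.log(parseInt("-0x1A"));    // -26 (sign removed, then the "0x" prefix selects radix 16)
console.log(parseInt("0x1A", 16)); // 26  (radix 16 leaves stripPrefix true)
console.log(parseInt("0x1A", 10)); // 0   (prefix not stripped, so Z is just "0")
console.log(parseInt("z", 36));    // 35  (letters stand for digit values 10 through 35)
console.log(parseInt("101", 2));   // 5
console.log(parseInt("7", 1));     // NaN (radix outside 2..36)
console.log(parseInt(""));         // NaN (Z is empty)
```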
When the parseFloat function is called, the following steps are taken:
Let inputString be ToString(string).
Let trimmedString be a substring of inputString consisting of the leftmost character that is not a
StrWhiteSpaceChar and all characters to the right of that character. (In other words, remove leading white
space.) If inputString does not contain any such characters, let trimmedString be the empty string.
If neither trimmedString nor any prefix of trimmedString satisfies the syntax of a StrDecimalLiteral (see
9.3.1), return NaN.
Let numberString be the longest prefix of trimmedString, which might be trimmedString itself, that satisfies
the syntax of a StrDecimalLiteral.
Return the Number value for the MV of numberString.
NOTE parseFloat may interpret only a leading portion of string as a Number value; it ignores any characters that
cannot be interpreted as part of the notation of a decimal literal, and no indication is given that any such characters were
ignored.
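The "longest prefix" rule is easy to see with a few inputs. A short illustration of the parseFloat steps quoted above:

```javascript
console.log(parseFloat("3.14abc"));     // 3.14 (longest valid prefix is "3.14")
console.log(parseFloat("  .5"));        // 0.5  (leading white space removed)
console.log(parseFloat("1e3x"));        // 1000 (exponent is part of the decimal literal)
console.log(parseFloat("-Infinityyy")); // -Infinity
console.log(parseFloat("0x10"));        // 0    (no hex support; the prefix is "0")
console.log(parseFloat("abc"));         // NaN  (no prefix is a decimal literal)
```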
The isNaN function returns true if the argument coerces to NaN, and otherwise returns false.
If ToNumber(number) is NaN, return true.
Otherwise, return false.
NOTE A reliable way for ECMAScript code to test if a value X is a NaN is an expression of the form X !== X. The
result will be true if and only if X is a NaN.
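The difference between isNaN (which coerces with ToNumber first) and the X !== X test from the NOTE shows up with non-number arguments. A small sketch; isReallyNaN is a hypothetical name for the X !== X check:

```javascript
// Hypothetical helper: the X !== X test from the NOTE, with no coercion.
function isReallyNaN(x) {
  return x !== x;
}

console.log(isNaN(NaN));         // true
console.log(isNaN("abc"));       // true  (ToNumber("abc") is NaN)
console.log(isNaN(undefined));   // true  (ToNumber(undefined) is NaN)
console.log(isNaN(""));          // false (ToNumber("") is 0)
console.log(isNaN("12"));        // false

console.log(isReallyNaN(NaN));   // true
console.log(isReallyNaN("abc")); // false (a string is equal to itself)
```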
Those functions are implementation specific depending on browser, and are not written in JS (unless somebody's decided to write a browser engine in JS). The code is not guaranteed to be the same across environments, though they do have to (in theory) adhere to the ECMAScript specification for their behavior.

How does the Javascript '>' operator compare characters with a space?

I am trying to understand this expression:
((ch = stream.getChar()) > ' ')
Here, getChar() gets a character. How does this greater-than comparison operator check whether a char is greater than a space?
Is this possible?
An empty space has a character code. Even though it doesn't look like much, it still has a value. So does the character taken from the stream. Comparing the character codes of these values is what produces the output.
Let's take a look at the language specification (it defines the algorithm for <; the > operator runs the same algorithm with the operands swapped, so a > b is evaluated as b < a).
What the operator does is try to convert both operands to primitive types, with a preference for numbers:
2. a. Let py be the result of calling ToPrimitive(y, hint Number).
2. b. Let px be the result of calling ToPrimitive(x, hint Number).
In our case, because of the swap, x === ' ' and y === stream.getChar(). Since both of the operands are primitive strings already, that results in the original values (px = x, py = y), and we move on to:
4. Else, both px and py are Strings
Now it checks whether either operand is a prefix of the other, for example:
'abc' > 'abcd' // false
'foo' > 'foobar' // false
Which is relevant if getChar() results in a space, since the space is a prefix of itself:
' ' > ' ' // false
We move on to finding the first position at which x and y have different characters:
Let k be the smallest nonnegative integer such that the character at position k within px is different from the character at position k within py. (There must be such a k, for neither String is a prefix of the other.)
(e.g., 'efg' and 'efh', we want g and h)
The characters we've found are then converted to their integer values:
Let m be the integer that is the code unit value for the character at position k within px.
Let n be the integer that is the code unit value for the character at position k within py.
And finally, a comparison is made:
If m < n, return true. Otherwise, return false.
And that's how it's compared to the space.
tl;dr It converts both arguments to their code-unit integer representations, and compares that.
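The steps above can be sketched as a small function. This is only an illustration for two primitive strings, assuming a hypothetical helper called stringLessThan; engines implement this natively. Note that ch > ' ' gives the same result as ' ' < ch.

```javascript
// Hypothetical helper mirroring the quoted steps for two primitive strings.
function stringLessThan(px, py) {
  if (px.startsWith(py)) return false; // py is a prefix of px (covers px === py)
  if (py.startsWith(px)) return true;  // px is a proper prefix of py
  // Neither is a prefix of the other, so a differing position must exist.
  let k = 0;
  while (px.charCodeAt(k) === py.charCodeAt(k)) k++;
  const m = px.charCodeAt(k); // code unit at position k within px
  const n = py.charCodeAt(k); // code unit at position k within py
  return m < n;
}

console.log(stringLessThan(" ", "a"));   // true:  32 < 97, so 'a' > ' '
console.log(stringLessThan(" ", "\n"));  // false: 32 < 10 fails, so '\n' > ' ' is false
console.log(stringLessThan(" ", " "));   // false: a string is a prefix of itself
console.log(stringLessThan("efg", "efh")); // true: 'g' (103) < 'h' (104)
```

So a check like ch > ' ' is true for printable ASCII characters other than the space itself, and false for control characters such as tab or newline.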
In JavaScript, strings are compared in dictionary order by character code (so uppercase letters sort before lowercase ones). These expressions are true:
'abacus' <= 'calculator'
'abacus' < 'abate'
In most (if not all) programming languages, characters are represented internally by a number. When you do equality/greater-than/less-than checks what you're actually checking is the underlying number.
hence in JS:
alert('c' > 'b'); // alerts true
alert('a' > 'b'); // alerts false
A space character also has a numeric representation, therefore the check is a valid one.
[string] > [string] will compare the character(s) by their representative values (see ASCII Table)
Characters are stored in the computer's memory as a number (usually a byte or two).
Each character has a unique identifying number.
By checking if a character is greater than space, you actually compare their places in a table.
See http://en.wikipedia.org/wiki/ASCII for more.
Check out this link, it'll explain how the comparison works on JS: http://javascript.about.com/od/decisionmaking/a/des02.htm
Basically, you're comparing the ASCII value of each character to the ASCII value of the blank space, which is also, a character and therefore, has a corresponding ASCII value.

Javascript Compiler behavior - double plus for array of empty array and array of zero is.. ONE

My question may have already been answered, but I could not find it, since search engines like Google or Bing don't handle the '+' (plus) sign in a search request.
Anyway, why is this zero
+[[]][0] // = 0
and this is one
++[[]][0] // = 1
UPD:
Michael Berkowski has a good answer, but I still don't understand one thing:
if [[]][0] evaluates to an empty array, then why is ++[] a ReferenceError: Invalid left-hand side expression in prefix operation?
UPD2:
now I get it.. it seems I was trying to type ++0 in console and getting an Error, but I should be using var a = 0; ++a
This is best explored by breaking down the way its components evaluate.
[[]][0] alone evaluates to the empty array []. By adding + in front, you cast its string representation to an integer 0 (like saying +4 or -3) via a unary positive operator. +0 is just 0.
++, as a numeric operator, also casts the empty string to the integer 0, but then applies its operation (the prefix increment), resulting in 1.
[[]][0]
// [] empty array
[[]][0].toString()
// ""
// Unary + casts the empty string to an integer
+("")
// 0
// Prefix increment on an empty string results in 1 (increments the 0)
var emptyString = "";
++emptyString;
// 1
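As for why ++[[]][0] works while ++[] is rejected: the increment needs something it can assign back to. [[]][0] is a property access (element 0 of the outer array), which is assignable, while a bare array literal is not. A small sketch, where outer and rejected are just illustrative names:

```javascript
const outer = [[]];
console.log(+outer[0]);    // 0: [] -> "" -> 0

// outer[0] is a property reference, so the result can be stored back into it.
const result = ++outer[0];
console.log(result);       // 1
console.log(outer[0]);     // 1: the inner array was replaced by the number

let rejected = false;
try {
  // An array literal is not an assignable target.
  new Function("++[]")();
} catch (e) {
  // SyntaxError or ReferenceError, depending on the engine.
  rejected = true;
}
console.log(rejected);     // true
```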

What is the default value of lastIndexOf?

string.lastIndexOf(searchValue[, fromIndex])
MDN says that fromIndex default value is equal to the string.length, however, I really think it is string.length-1
But it doesn't matter what I think... I need someone to confirm what is the default value of fromIndex
Here is what they say:
"It can be any integer between 0 and the length of the string. The default value is the length of the string."
According to ECMAScript 5, it will be the length of the String.
15.5.4.8 String.prototype.lastIndexOf (searchString, position)
If position is undefined, the length of the String value is assumed, so as to search all of the String.
...
Call CheckObjectCoercible passing the this value as its argument.
Let S be the result of calling ToString, giving it the this value as its argument.
Let searchStr be ToString(searchString).
Let numPos be ToNumber(position). (If position is undefined, this step produces the value NaN).
If numPos is NaN, let pos be +∞; otherwise, let pos be ToInteger(numPos).
Let len be the number of characters in S.
Let start be min(max(pos, 0), len).
Let searchLen be the number of characters in searchStr.
Return the largest possible nonnegative integer k not larger than start such that k+ searchLen is not greater than len, and for all nonnegative integers j less than searchLen, the character at position k+j of S is the same as the character at position j of searchStr; but if there is no such integer k, then return the value -1.
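The steps above can be sketched in plain JavaScript. This is a minimal illustration, assuming a hypothetical function named lastIndexOfSketch; it is not how any engine actually implements it. It shows that an undefined position becomes +∞ and is then clamped to the length of the string.

```javascript
// Hypothetical helper following the quoted ES5 steps.
function lastIndexOfSketch(S, searchStr, position) {
  const numPos = Number(position); // undefined -> NaN
  const pos = Number.isNaN(numPos) ? Infinity : Math.trunc(numPos);
  const len = S.length;
  const start = Math.min(Math.max(pos, 0), len); // clamp into [0, len]
  const searchLen = searchStr.length;
  // Look for the largest k not larger than start that matches.
  for (let k = start; k >= 0; k--) {
    if (k + searchLen <= len && S.startsWith(searchStr, k)) return k;
  }
  return -1;
}

const sample = "01923456789abcdef";
console.log(lastIndexOfSketch(sample, "f"));                    // 16
console.log(lastIndexOfSketch(sample, "f", sample.length - 1)); // 16
console.log(lastIndexOfSketch(sample, "f", sample.length - 2)); // -1
console.log(lastIndexOfSketch("string", ""));                   // 6
console.log(lastIndexOfSketch("string", "", 5));                // 5
```

The last two lines show the one case where a default of length and a default of length-1 would give different answers: searching for the empty string.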
For a non-empty search string, it doesn't matter at all. Since the index is zero-based, both string.length and string.length-1 will include the entire string.
EDIT
You can test for differences in the result pretty simply:
var s = '01923456789abcdef';
alert(s.lastIndexOf('f',s.length+1));
alert(s.lastIndexOf('f',s.length));
alert(s.lastIndexOf('f',s.length-1));
alert(s.lastIndexOf('f',s.length-2));
That alerts 16, 16, 16, -1. Thus, if you are very concerned with an extra few cycles being used when a useragent runs .lastIndexOf(), you can pass .length-1 and have it spend a few extra cycles parsing the extra parameter.
If fromIndex is as large as or larger than the string length, the whole string is searched.
If not, the search runs backward from fromIndex, so only matches that start at or before fromIndex are found.
