Escape a character knowing only its index - javascript

I'm looking to escape a \n character knowing only its index. I am NOT looking to replace the character itself, but rather "prepend" an escape character to the existing \n, effectively escaping it.
For example, in JavaScript: '\\' + '\n' === '\\\n'
What I'm looking for is \\n not \\\n.
To reiterate, I do NOT want to replace \n entirely. I am well aware '\n'.replace('\n', '\\n') would do what I'm looking for. I simply want to prepend an escape character to the existing newline. Is there a reason that JS prepends a literal backslash and does not escape the newline?
'\\' + '\'' === "\\'"
I guess I'm wondering why the newlines behave differently. Thanks for any ideas!

It's not possible to prepend a backslash to a character in order to escape it. Take the newline character, '\n': to JavaScript, this is a string containing a single character, with character code 10.
for (const char of '\n') {
  console.log(char.charCodeAt(0)); // 10
}
There is not actually an n (nor a literal backslash) anywhere in there - \n is simply the convention programmers use and understand to refer to a newline. If a string is composed of a literal backslash and a literal n, the character codes are completely different: there's a character code of 92 (for the backslash) and a character code of 110 (for the n).
for (const char of '\\n') {
  console.log(char.charCodeAt(0)); // 92, then 110
}
Your only option is to replace the literal newline character entirely with a backslash and an n, using '\n'.replace('\n', '\\n'). For a more general solution, construct a Map or object of literal characters and their escape sequences.
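A minimal sketch of that more general approach, assuming you know the index of the character to escape. The names ESCAPES and escapeAt are hypothetical, and the table only covers a few characters; extend it as needed.

```javascript
// Hypothetical lookup table: literal character -> text of its escape sequence.
const ESCAPES = new Map([
  ['\n', '\\n'],
  ['\r', '\\r'],
  ['\t', '\\t'],
  ['\\', '\\\\'],
]);

// Hypothetical helper: replace the character at `index` with its escape
// sequence. The string is returned unchanged if the character at that
// index is not in the table (or the index is out of range).
function escapeAt(str, index) {
  const ch = str[index];
  if (!ESCAPES.has(ch)) return str;
  return str.slice(0, index) + ESCAPES.get(ch) + str.slice(index + 1);
}

const input = 'a\nb';               // 3 characters: a, newline, b
const escaped = escapeAt(input, 1);
console.log(escaped);               // prints a\nb on a single line
console.log(escaped.length);        // 4
```

Note that this still replaces the newline rather than prepending to it: the result holds a backslash (92) and an n (110) where the newline (10) used to be.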

Related

How can I avoid control character interpretation in Javascript template string interpolation?

I have a string variable containing a control character (newline). I want to output that variable but with the control character as its literal representation and not interpreted:
console.log(`Using "${nl}" as newline`)
where the nl variable may contain one of \n, \r, or \r\n.
The output of this is of course
Using "
" as newline
but it should be e.g.
Using "\r\n" as newline
I guess I somehow need to construct a string like this
console.log('Using "\\r\\n" as newline')
So I've been trying to escape the backslashes in nl by prepending another backslash using nl.replace() but that doesn't seem to work because the nl variable doesn't actually contain any backslash characters.
Is there a way to do this generically, i.e. without coding explicitly for \n, \r, and \r\n?
You could remove the double quotes, and use JSON.stringify which also produces those quotes, and which encodes newline characters (and more, such as TAB, BS, FF, and escapes double quote and backslash):
let nl = "\r\n";
console.log(`Using ${JSON.stringify(nl)} as newline`);
// Or with a newline in a template literal:
nl = `
`;
console.log(`Using ${JSON.stringify(nl)} as newline`);
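If you would rather keep your own quotes in the template, one possible variation is to strip the quotes that JSON.stringify adds. This is only a sketch; the helper name visible is hypothetical and not part of the answer above.

```javascript
// Hypothetical helper: show control characters in their escaped form.
// JSON.stringify wraps its result in double quotes; slice(1, -1) drops them.
function visible(str) {
  return JSON.stringify(str).slice(1, -1);
}

for (const nl of ['\n', '\r', '\r\n']) {
  console.log(`Using "${visible(nl)}" as newline`);
}
// Using "\n" as newline
// Using "\r" as newline
// Using "\r\n" as newline
```

Keep in mind that JSON.stringify also escapes double quotes and backslashes, so those characters come out prefixed with a backslash as well.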

Why are "\a" and "a" the same thing in JavaScript?

You know that "n" and "\n" are not the same in JavaScript, because the second one is an escape sequence. But why are "\a" and "a" the same? If you check charCodeAt for the two strings, you will see it.
Can someone explain this to me?
Which escape sequences exactly are defined in JavaScript?
\a is not a special sequence (like \n or \t), so the \ falls back to being an escape character, meaning that the character following it will be used literally (even if it were a quote, or a special character).
Hence, '\a' === 'a'.
The second purpose of the backslash (the first is producing special characters, like a newline with \n or a TAB with \t) is to escape characters that are special to JavaScript. For example, to have a string containing a quote, you can either delimit the string with double quotes, "'", or, if you use single quotes, escape the quote with a backslash, like so: '\'', to prevent the literal ' from terminating the string.
As you can see in this answer, not every letter has an associated escape sequence. 'a' is one of the letters that does NOT have one, so to JavaScript the sequence has no special meaning: the backslash is dropped and what remains is just the letter 'a'.
Only a few letters form an escape sequence in combination with a backslash (like \n, \f, \r, \b, \t, \v), and \a is not in the list.
Please refer to the following link:
https://www.w3schools.com/js/js_strings.asp
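The difference is easy to check in a console. A small sketch (the variable names are only illustrative):

```javascript
const escapedA = '\a'; // \a is not a defined escape, so the backslash is dropped
const plainA = 'a';
const newline = '\n';  // \n is a defined escape: a single newline character

console.log(escapedA === plainA);    // true
console.log(escapedA.length);        // 1
console.log(escapedA.charCodeAt(0)); // 97
console.log(newline === 'n');        // false
console.log(newline.charCodeAt(0));  // 10
```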

Why does '\\\[' equal '\\['? How does the backslash work in a string?

As the title says:
console.log('\\\[' === '\\[');
returns true.
Can anyone explain in detail what's the difference?
A backslash before most characters will only be parsed as an unnecessary escape character - the backslash will be ignored. This is what's happening in the second part of the first string. Before a certain few characters though, such as another backslash in \\, or \n, it will be parsed as an escape sequence. \\ is the escape sequence for a single literal backslash:
console.log('\\');
and is only one character.
A backslash before a [ will resolve to just the [, though:
console.log('\[');
So:
'\\\[' - A literal backslash, followed by an (unnecessarily escaped) [
'\\[' - A literal backslash, followed by a plain [
See MDN for a list of escape sequences.
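The breakdown above can be verified by looking at the character codes of both strings. A short sketch (the helper name codes is hypothetical):

```javascript
const first = '\\\[';  // escaped backslash, then an unnecessarily escaped [
const second = '\\[';  // escaped backslash, then a plain [

// Hypothetical helper: list the character codes of a string.
const codes = (s) => Array.from(s, (c) => c.charCodeAt(0));

console.log(first === second); // true
console.log(first.length);     // 2
console.log(codes(first));     // [ 92, 91 ]
console.log(codes(second));    // [ 92, 91 ]
```

Both literals produce the same two characters: a backslash (92) and an opening square bracket (91).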
In strings, the backslash (\) is a special character used to encode other special characters, including the backslash.
'\\[' is a JavaScript string literal that contains an encoded backslash (\\) and an open square bracket ([). At runtime, the value of the string is \[.
'\\\[' is a JavaScript string literal that contains a correctly encoded backslash (\\) followed by the combination of characters \[ that looks like an escape sequence but doesn't mean anything. Because this combination is not defined and \ by itself does not mean anything, the JavaScript interpreter ignores the backslash and corrects the string; it becomes identical to the first one (\[).
The behaviour is documented:
For characters not listed in the table, a preceding backslash is ignored, but this usage is deprecated and should be avoided.
The backslash is a special character. It literally tells JavaScript to take the symbol after \ as is. This is sometimes called screening or shielding (more commonly, escaping).
That is why we can write something like this: console.log("Double \"quotes\" inside another one."); with the result Double "quotes" inside another one. and without any error, although you rarely need to write it this way.
"\\\[" separates into 2 parts: \\ and \[. First returns \ and the second returns [. Finally it is \[.
"\\[" separates into 2 parts: \\ and [. First returns \ and the second returns [. Finally it is \[.

Regular Expression in JS: \\. does not match \n

I am getting a string containing newlines (\n), tabs (\t) and lowercase letters [a-z]. It is possible to match the newlines and tabs with /\n|\t/. AFAIK the dot represents the wildcard.
Therefore I was wondering, why /\n|\t/ doesn't match the same things as /\\./
var text = 'test1 \ntest2';
text.split(/\n/) //['test1 ', 'test2']
text.split(/\./) //['test1 \ntest2']
text.split(/\\./) //['test1 \ntest2']
Shouldn't the \\. match the \n (newline)?
Let me try and answer all the points:
AFAIK the dot represents the wildcard.
No, in regex, we do not use the term "wildcard". It is a special regex (meta)character. A dot in a JavaScript regex matches any character except a line terminator (such as \n or \r), unless the s flag is set.
I was wondering, why /\n|\t/ doesn't match the same things as /\\./
Because /\n|\t/ matches 1 symbol, either a newline or tab, while the regex /\\./ matches a literal \ and a character other than a newline.
The \n and \t are escape sequences. That means the \ is not a literal backslash here: together with the following symbol it forms a single unit that stands for a character that cannot be written otherwise. Indeed, how can we write a line break on paper with a pen? No way!
See more about JavaScript character escape sequences here.
Now,
text.split(/\n/) //['test1 ', 'test2']
True, your input string contains a line break, thus you get two elements in the resulting array.
text.split(/\./) //['test1 \ntest2']
No match was found because \. matches a literal dot. A dot that is escaped (that has a literal \ before it) in the regex stops being a special regex metacharacter, and just matches its literal representation. Your string has no dot, thus, no matches.
text.split(/\\./) //['test1 \ntest2']
Again, no match is found, as /\\./ looks for a literal \ followed by any character but a newline.
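The three cases can be reproduced side by side. The sketch below also adds a fourth, hypothetical input (withBackslash) that really does contain a literal backslash followed by an n, to show what /\\./ is actually looking for:

```javascript
const text = 'test1 \ntest2';        // contains a real newline character

const byNewline = text.split(/\n/);    // splits at the newline
const byDot = text.split(/\./);        // looks for a literal dot: none present
const byBackslash = text.split(/\\./); // looks for a literal backslash: none present

console.log(byNewline);          // [ 'test1 ', 'test2' ]
console.log(byDot.length);       // 1 (no split happened)
console.log(byBackslash.length); // 1 (no split happened)

// Hypothetical input with a literal backslash followed by the letter n:
const withBackslash = 'test1 \\ntest2';
console.log(withBackslash.split(/\\./)); // [ 'test1 ', 'test2' ]
```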
A hint: use your expressions at regex101.com, it will tell you what your regex can match on the right.
Here, with regex, you have a literal notation (/.../). In literal notation, the pattern is not processed as a string literal first, so each regex backslash is written only once. If you used constructor notation (i.e. RegExp(....)) with a string, you would have to double every backslash, because the string literal consumes one level of escaping. E.g.
var re = /\\./; // is equal to
var re = new RegExp("\\\\.");
See more about constructor and literal notations at MDN RegExp help page.
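One way to convince yourself that the two notations are equivalent is to compare the source property of the resulting regular expressions. A brief sketch (the variable names are illustrative):

```javascript
const fromLiteral = /\\./;
const fromString = new RegExp('\\\\.'); // the string value is \\. (three characters)

console.log(fromLiteral.source); // \\.
console.log(fromString.source);  // \\.
console.log(fromLiteral.source === fromString.source); // true

// With only two backslashes, the string value is \. and the regex
// matches a literal dot instead:
const literalDot = new RegExp('\\.');
console.log(literalDot.source);  // \.
```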
\n gets evaluated to an actual newline character, so the string contains no literal backslash for /\\./ to match. If you do a quick console.log('\n'); you can see the output of that.

How does the regular expression `new RegExp("user=" + un + "[\"'\\s>]", "i")` work?

I am a JavaScript junkie and want to test out some code.
new RegExp("user=" + un + "[\"'\\s>]", "i")
What does this actually mean?
It was from a site, and it actually works!
I especially don't get the [\"'\\s>] part.
[ and ] signify a character class: it matches any one of the characters found within. The backslashes here belong to the JavaScript string literal, not to the regular expression. \" is an escaped double quote, needed because the string itself is delimited by double quotes, so the class contains a plain ". Then comes the single quote '. \\ is an escaped backslash, so the regular expression receives \s, which matches a whitespace character. And then the greater-than symbol.
[\"'\s>] means any one of the following characters: " ' whitespace >
So, assuming un = "abc" your regex would match any of the following:
user=abc"
user=abc'
user=abc // There is a space after abc.
user=abc>
[\"'\\s>]
represents a character class containing four things: a double quote, a single quote, \s, and >. It matches exactly one of these four. The \s in turn matches a single whitespace character, such as a space, tab, newline, or carriage return.
The extra \ you see are for escaping.
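To see what the regular expression engine actually receives, you can print the string and the resulting pattern. A small sketch, assuming un = "abc" as in the example above (the variable classText is only there for illustration):

```javascript
const un = 'abc';                 // assumed value, as in the example above
const classText = "[\"'\\s>]";    // the string literal from the question

console.log(classText);           // ["'\s>]

const re = new RegExp("user=" + un + classText, "i");
console.log(re.source);           // user=abc["'\s>]

console.log(re.test('user=abc"'));  // true
console.log(re.test("user=abc'"));  // true
console.log(re.test('user=abc '));  // true
console.log(re.test('USER=ABC>'));  // true (the i flag ignores case)
console.log(re.test('user=abcd'));  // false
```

Note that un is inserted into the pattern as is, so any regex metacharacters it contains would be interpreted by the regular expression.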
