console.log("%%%","\n");
produces only two "%" characters and a newline (one of the percent signs is getting removed)
console.log("%%%"+"\n");
produces all 3 characters as expected....
If I replace the "%" character with any other character, both examples output 3 characters and a newline character. It is only with the "%" character that one of them gets removed.
https://replit.com/#JustJamie/PercentSignConfusion#index.js
I've tried searching for an explanation for this phenomenon but couldn't find any previous mention of this. I've tried replacing the "%" character with many other characters, including all special characters, and only get this result using the % character.
While typing this question, I may have discovered the answer. I believe what is happening is that JavaScript is interpreting the % sign as a placeholder, and then replacing the last instance of the placeholder with the newline character from the second argument passed to console.log. Can anyone find the JavaScript reference that explains this?
% may be used for formatting in console.log. For instance, %d may be used to output a number.
console.log('%d', 42);
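%% is used to escape the percent sign, i.e. output a single percent symbol on its own. So when "%%%" is passed as the format string, the first two percent signs collapse into one and the third is printed as-is, which leaves two. In Node.js, a single argument (the concatenated version) is printed without any formatting, so all three survive.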
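A minimal sketch of the behaviour described above, assuming Node.js, where console.log formats its arguments the same way util.format does (browser consoles may differ in the details). The variable names are only illustrative.

```javascript
const util = require("util");

// Two arguments: the first is treated as a format string, so "%%" collapses
// to a single "%" and the trailing lone "%" is kept as-is.
const collapsed = util.format("%%%", "\n");

// One argument: the strings are concatenated first, so nothing is formatted.
const concatenated = util.format("%%%" + "\n");

// A placeholder such as %d consumes the next argument.
const formattedNumber = util.format("%d", 42);

console.log(JSON.stringify(collapsed));    // begins with two percent signs, not three
console.log(JSON.stringify(concatenated)); // "%%%\n"
console.log(formattedNumber);              // 42
```

Going through util.format instead of console.log makes the result inspectable as a string, which is easier to compare than console output.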
I'm having trouble with a certain RegEx replacement string for later use in JavaScript.
We have quite a bit of text that was stored in a rather odd format that we aren't allowed to fix.
But we do need to find all the "network path" strings inside it, following these rules:
A. The matches always start with 2 backslashes.
B. The matching characters should stop as soon as it hits a first occurrence of any 1 of these:
A < character
A space
A line feed
A carriage return
A & character
A literal "\r" or "\n" string (but only if occurring at end of line)
We "almost" have it working with /\\\\[^ &<\s]*/gi as shown in this RegEx Tester page:
https://regex101.com/r/T4cDOL/5
Even if we get it working, the RegEx has to be even further "escape escaped" before we put it in
our JavaScript code, but that's also not working as expected.
From your example, it seems you literally have a backslash followed by an n and a backslash followed by an r (as opposed to a newline or carriage return), which means you can't only use a negated character class (since you need to handle a sequence of two characters). I'd use a positive lookahead to know where to stop, so I can use an alternation for that part.
You haven't said what parts of those strings should match, so I've had to guess a bit, but here's my best guess (with useful input from Niet the Dark Absol):
const rex = /\\\\.*?(?=[ &<\r\n]|\\[rn](?:$| ))/gmi;
That says:
Match starting with \\
Take everything prior to the lookahead (non-greedy)
Lookahead: An alternation of:
A space, &, <, carriage return (\r, character 13), or a newline (\n, character 10); or
A backslash followed by r or n if that's either at the end of a line or followed by a space (so we get the \nancy but not the \n after it).
Updated regex101
You might want to have more characters than just a space after the \r/\n. If so, make it a character class (and/or use \s for "whitespace" if that applies):
const rex = /\\\\.*?(?=[ &<\r\n]|\\[rn](?:$|[ others]))/gmi;
// −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−^^^^^^^^^
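As a rough check of the first pattern above, here is a small sketch run against a made-up sample. The sample text and variable names are assumptions for illustration, not the asker's real data.

```javascript
// Lookahead-based pattern from the answer above.
const rex = /\\\\.*?(?=[ &<\r\n]|\\[rn](?:$| ))/gmi;

// Hypothetical sample: three paths, ended by a literal "\n" at end of line,
// a "<" and a "&" respectively.
const sample =
  "Map \\\\server\\share\\nancy\\n\n" +
  "then \\\\host\\docs<br> and \\\\nas\\pub&amp;";

const paths = sample.match(rex);
console.log(paths);
// [ '\\\\server\\share\\nancy', '\\\\host\\docs', '\\\\nas\\pub' ]
```

Note that the lookahead requires one of the terminators to be present, so a path that runs to the very end of the input with nothing after it would not be matched by this pattern.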
I am facing an issue with a regular expression while trying to block any string that starts with a minus (-), while otherwise allowing only a set of whitelisted characters.
^(?!-.*$).([a-zA-Z0-9-:#\\,()\\/\\.]+)$
It blocks a minus (-) in the first position and allows it anywhere else in the character sequence, but the regex does not work if the passed string is a single character.
For example, A or 9.
Please help me out with this or give me a good regex to do the task.
Your pattern requires at least 2 chars in the input string because there is a dot after the first lookahead and then a character class follows that has + after it (that is, at least 1 occurrence must be present in the string).
So, you need to remove the dot. Also, you do not need to escape any of these special chars inside a character class, as long as the - is placed at the end. Besides, to avoid matching strings starting with -, a mere (?!-) will suffice; there is no need to add .*$ there. You may use
^(?!-)[a-zA-Z0-9:#,()/.-]+$
See the regex demo. Remember to escape / if the pattern is used in regex literal notation in JavaScript; there is no need to escape it in constructor notation or in a Java regex pattern.
Details
^ - start of a string
(?!-) - cannot start with -
[a-zA-Z0-9:#,()/.-]+ - 1 or more ASCII letters, digits and special chars defined in the character class (:, #, ,, (, ), /, ., -)
$ - end of string.
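The pattern broken down above can be tried out with a short sketch like the following. The sample strings are made up for illustration.

```javascript
// The "/" inside the class is escaped here to be safe in a regex literal.
const allowed = /^(?!-)[a-zA-Z0-9:#,()\/.-]+$/;

// Hypothetical inputs: single characters, an inner minus, a leading minus,
// and a string with a character outside the whitelist (a space).
const samples = ["A", "9", "a-b", "-abc", "a b"];
const results = samples.map((s) => allowed.test(s));

console.log(results); // [ true, true, true, false, false ]
```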
If I understand correctly, and you don't want a minus at the beginning, does ^[^-].* work as a regex for you? Java's "matches" would return false if the string starts with a minus.
There is a method in a String class that provides you exactly what you are asking for - it's a startsWith() method - you could use this method in your code like this (you can translate it as "If the given String doesn't start with -, doSomething, in other case do the else part, that can contain some code or might be empty if you want nothing to be done if the given String starts with - ") :
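// Runs doSomething() only when the string does not begin with "-".
if (!yourString.startsWith("-")) {
    doSomething();
} else {
    // May be left empty if nothing should happen for invalid input.
    doNothingOrProvideAnyInformationAboutWrongInput();
}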
I think that it can help you.
^(?!-).*[a-zA-Z0-9-:#\\,()\/\\.]+$
I want to match a string pattern which has first 4 characters, then the "|" symbol, then 4 characters, then the "|" symbol again and then a minimum of 7 characters.
For example, "test|test|test123" should be matched.
I tried RegExp("^([a-za-z0-9-|](4)[a-za-z0-9-|](5)[a-za-z0-9-|](3)+)$") for this, but it didn't match my test case.
test|test|test1234
Ramesh, does this do what you want?
^[a-zA-Z0-9-]{4}\|[a-zA-Z0-9-]{4}\|[a-zA-Z0-9-]{7,}$
You can try it at https://regex101.com/r/jilO6O/1
For example, the following will be matched:
test|test|test123
a1-0|b100|c10-200
a100|b100|c100200
But the following will not:
a10|b100|c100200
a100|b1002|c100200
a100|b100|c10020
Tips on modifying your original code.
You have "a-za-z" where you probably intended "a-zA-Z", to allow either upper or lower case.
To specify the number of characters to be exactly 4, use "{4}". You were nearly there with your round brackets, but they need to be curly, to specify a count.
To specify a range of number of characters, use "{lowerLimit,upperLimit}". Leaving the upper limit blank allows unlimited repeats.
We need to escape the "|" character because it has the special meaning of "alternate", in regular expressions, i.e. "a|b" matches either "a" or "b". By writing it as "\|" the regex interpreter knows we want to match the "|" character itself.
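The tips above can be put together in a short sketch using the RegExp constructor, as in the question. One extra detail applies there: because the pattern is written as a string, each backslash has to be doubled.

```javascript
// With the RegExp constructor the pattern is a string, so "\\|" in the
// string becomes \| in the regex (a literal pipe character).
const pattern = new RegExp("^[a-zA-Z0-9-]{4}\\|[a-zA-Z0-9-]{4}\\|[a-zA-Z0-9-]{7,}$");

// The examples listed in the answer above.
const shouldMatch = ["test|test|test123", "a1-0|b100|c10-200", "a100|b100|c100200"];
const shouldFail = ["a10|b100|c100200", "a100|b1002|c100200", "a100|b100|c10020"];

const matched = shouldMatch.map((s) => pattern.test(s));
const failed = shouldFail.map((s) => pattern.test(s));

console.log(matched); // [ true, true, true ]
console.log(failed);  // [ false, false, false ]
```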
I use this regex code to parse urls:
/^(((http|https):\/\/)+[www.])?+\s*\S+\s*+(.com|.es|.net|.org|.co)$/ig
It works perfectly on https://regex101.com/r/bX5oM4/1
But on my console I keep getting the:
SyntaxError: Invalid regular expression: /^(((http|https):\/\/)+[www\.])?+\s*\S+\s*+(\.com|\.es|\.net|\.org|\.co)$/: Nothing to repeat
I tried escaping the + but it doesn't work. I'm kinda new to regex so it could be anything.
Here is your fixed regex:
^(?:https?:\/\/www\.)?[a-zA-Z0-9]\S+(\.(?:com|es|net|org|co))$
See demo
Or, to match the strings inside larger strings:
\b(?:https?:\/\/www\.)?[a-zA-Z0-9]\S+(?:\.(?:com|es|net|org|co))\b
See another demo
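In JavaScript, you cannot apply a + quantifier to a ? (or *) quantifier: ?+ and *+ are possessive quantifiers in some other regex flavors, but JavaScript does not support them, hence the "Nothing to repeat" error.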
Also, note that [www.] matches 1 character, either w or . since it is a character class. You must have meant a group, and thus you need round brackets, not square ones.
I removed unnecessary groups, regrouped them a bit and escaped the dots. Note that unescaped dot matches any character but a newline.
So, the regex:
^ - Asserts the position at the start of the string
(?:https?:\/\/www\.)? - Optionally matches http or https then //www. literally
[a-zA-Z0-9]\S+ - 1 alphanumeric character followed by 1 or more non-whitespace characters
(\.(?:com|es|net|org|co)) - Matches a dot and then any of the alternatives in the round brackets
$ - Asserts end of string
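A quick sketch of the fixed pattern in use, together with a check that a ?+ sequence is rejected by the JavaScript engine. The sample URLs and the tiny "^(a)?+" pattern are made up for illustration.

```javascript
// Fixed pattern from the answer above, as a regex literal.
const urlRex = /^(?:https?:\/\/www\.)?[a-zA-Z0-9]\S+(\.(?:com|es|net|org|co))$/i;

const withPrefix = urlRex.test("http://www.example.com");
const bareDomain = urlRex.test("example.org");
const otherTld = urlRex.test("example.xyz");

console.log(withPrefix); // true
console.log(bareDomain); // true
console.log(otherTld);   // false

// "?+" is a possessive quantifier in some other flavors; JavaScript rejects it.
let rejected = false;
try {
  new RegExp("^(a)?+");
} catch (err) {
  rejected = err instanceof SyntaxError;
}
console.log(rejected); // true
```

Building the invalid pattern with the RegExp constructor lets the SyntaxError be caught at run time; written as a regex literal, it would stop the whole script from parsing.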
Try this (update!)
^((http|https):\/\/)?([\w]+[.-]?)+\.(com|es|net|org|co|uk|de)$
instead of
/^(((http|https):\/\/)+[www.])?+\s*\S+\s*+(.com|.es|.net|.org|.co)$/ig
You had an extra + behind a ? and another one behind a *. And several other things were not quite OK, as stribizhev pointed out quite rightly!
This regex is looking for a limited range of TLDs ... (e.g. French pages would not pass). The [www.] was wrong (it is a character class, not a group) and also superfluous, as any domain name can have subdomains (expressed by ([\w]+[.-]?)+) and 'www.' is just one of the possible ones.
I am trying to trim leading and trailing whitespace and newlines from a string. The newlines are written as \n (two separate characters, backslash and n). In other words, they are literal text in the string, not actual CR/LF control characters.
For example, this:
\n \nRight after this is a perfectly valid newline:\nAnd here is the second line. \n
Should become this:
Right after this is a perfectly valid newline:\nAnd here is the second line.
I came up with this solution:
text = text
.replace(/^(\s*(\\n)*)*/, '') // Beginning
.replace(/(\s*(\\n)*)*$/, '') // End
These patterns match just fine according to RegexPal.
However, the second pattern (matching the end of the string) takes a very long time — about 32 seconds in Chrome on a string with only a couple of paragraphs and a few trailing spaces. The first pattern is quite fast (milliseconds) on the same string.
Here is a CodePen to demonstrate it.
Why is it so slow? Is there a better way to go about this?
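The reason it takes so long is that you have a * quantifying two more *. When an attempt fails, the engine can divide the same run of whitespace between the inner and outer repetitions in a huge number of ways, and it tries them all before giving up (catastrophic backtracking).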
A good explanation can be found in the PHP manual, but I don't think JavaScript supports once-only subpatterns.
I would suggest this regex instead:
text = text.replace(/^(?:\s|\\n)+|(?:\s|\\n)+$/g,"");
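A small sketch of the suggested replacement applied to the example from the question. The variable names are illustrative, and the input uses literal backslash-n sequences rather than real line feeds.

```javascript
// Hypothetical input: literal backslash-n sequences, not real line feeds.
const raw = "\\n \\nRight after this is a perfectly valid newline:\\nAnd here is the second line. \\n ";

// One alternation per end of the string, with no quantifier nested
// inside another quantifier.
const trimmed = raw.replace(/^(?:\s|\\n)+|(?:\s|\\n)+$/g, "");

console.log(trimmed);
// Right after this is a perfectly valid newline:\nAnd here is the second line.
```

The literal \n in the middle of the text is left alone because it is neither at the start nor followed only by whitespace up to the end.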
Not a good answer but one workaround would be to reverse the string and also reverse \n to n\ in the regular expression (for beginning), apply it, then reverse the string back.