I recently found the JavaScript below, and I believe I understand what it does, but I cannot figure out what appears to be a regex string class ("/\W/.test"):
function AlphaNumericStringCheck(text)
{
if (/\W/.test(text.replace(/^\s+|\s+$/g,""))) return false;
return true;
}
Can someone put a name to this technique, so I can research it more?
The /\W/ in your source code is a regular expression literal (MDC link, as MDC is about 18X clearer than the specification). Just as with a string literal ("foo"), a regular expression literal is a way of writing regular expressions in the code. The / characters in a regular expression literal are analogous to the quote characters in a string literal. In a string literal, what's inside the quotes is the content of the string; in a regular expression literal, what's inside the / characters is the regular expression. (There can also be flags following the ending /.)
So this:
var rex = /\W/;
...creates a regular expression object for the regular expression \W (match one non-word character, i.e. anything other than a letter, digit, or underscore). It's (essentially) equivalent to:
var rex = new RegExp("\\W");
Note that in the long form, I had to escape the backslash in the string, since backslashes are special in string literals. This is one of the reasons we have regular expression literals: Because it gets very confusing, very quickly, when you have to escape all of your backslashes (backslashes being a significant part of many regular expressions).
Regular expressions are objects, which have properties with functions attached to them (effectively, methods, although JavaScript doesn't technically have methods per se). So /\W/.test(...) calls the test function on the regular expression object defined by the literal /\W/.
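As a quick check of the above, here is a minimal sketch comparing the literal and constructor forms. The variable names are only illustrative.

```javascript
// Two ways to build the same pattern: a literal and the constructor form.
const fromLiteral = /\W/;
const fromString = new RegExp("\\W"); // backslash doubled for the string literal

// Both objects hold the same pattern text.
console.log(fromLiteral.source === fromString.source); // true

// test() returns true when the pattern matches somewhere in the string.
console.log(fromLiteral.test("abc123"));  // false (only word characters)
console.log(fromLiteral.test("abc 123")); // true  (the space is a non-word character)
```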
\W is a shorthand character class: it matches any non-word character, and is the negation of \w (word characters). Similar shorthands exist, such as \d for digits and \s for whitespace, including newlines. Exactly what each one matches depends on the regex implementation you're using.
These shorthands stand in for longer character classes. In JavaScript, \d is equivalent to [0-9], \w to [A-Za-z0-9_], and \W to [^A-Za-z0-9_]. They are designed to make your expressions more readable.
You can see some more of them here: http://www.zytrax.com/tech/web/regex.htm#special
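To tie this back to the function in the question, here is a sketch of the same check written more directly. The name isAlphaNumeric is made up for this example. Note that, because \W treats the underscore as a word character, underscores pass the check, and so does an empty string.

```javascript
// Returns true when the trimmed text contains no non-word characters.
// Behaves like the question's AlphaNumericStringCheck.
function isAlphaNumeric(text) {
  // Strip leading and trailing whitespace, as the original does.
  const trimmed = text.replace(/^\s+|\s+$/g, "");
  // \W matches anything outside [A-Za-z0-9_].
  return !/\W/.test(trimmed);
}

console.log(isAlphaNumeric("  abc_123  ")); // true
console.log(isAlphaNumeric("abc-123"));     // false (the hyphen is a non-word character)
console.log(isAlphaNumeric(""));            // true  (nothing for \W to match)
```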
I have the following in JavaScript:
function escape(text)
{
var tx = text.replace(/[&<>"']/g);
}
I'm having problems trying to do the same in Dart:
var reg = new RegExp("/[&<>"']/g"); -->this throws error.
How can I get an equivalent expression?
Dart RegExp source does not use / to delimit regular expressions; patterns are just strings passed to the RegExp constructor.
It's usually recommended that you use a "raw string" because backslashes mean something in RegExps as well as in non-raw string literals, and the JavaScript RegExp /\r\n/ would be RegExp("\\r\\n") in Dart without raw strings, but RegExp(r"\r\n") with a raw string, much more readable.
In this particular case, where the string contains both ' and ", that becomes harder, but you can use a "multiline string" instead. It uses triple quote characters as delimiters, so it can contain single quote characters unescaped (it doesn't have to actually span multiple lines).
Dart doesn't have something similar to the g flag of JavaScript regexps. Dart regexps are stateless, it's the functions using them which need to care about remembering where it matched, not the RegExp itself. So, no need for the g.
So:
RegExp(r"""[&<>"']""");
// or
RegExp(r'''[&<>"']''');
That gets a little crowded with all those quotes, and you can choose to use a non-raw string instead so you can escape the quote which matches the string (which is easier because your RegExp does not contain any backslashes itself):
RegExp("[&<>\"']");
// or
RegExp('[&<>"\']');
If you do that when your regexp uses a RegExp backslash, then you'll need to double the backslash, something which is easy to forget, which is why raw strings are recommended.
You forgot to escape the double quote inside the string. Also drop the / delimiters and the g flag, which Dart's RegExp does not use:
new RegExp("[&<>\"']");
Say I want to match any of these files:
/foo/bar/baz/x
/foo/bar/baz/y/z
so I create a regex like so:
new RegExp('^/foo/bar/baz/.*')
however my question is - how do I tell the RegExp constructor to view . * and ^ as regular expression characters, not literal characters?
As this is node.js related, my solution would be to use quotemeta
https://github.com/substack/quotemeta
If it were Perl related, my solution would be to use \Q \E ;-)
new RegExp('^/foo/bar/baz/.*')
...and
/^\/foo\/bar\/baz\/.*/
...are equivalent. Everything in the new RegExp string is treated as a regular expression character, not a literal character (luckily / does not need to be escaped).
If you do want to make something a literal character using the constructor syntax, make sure you use a double backslash since \ within a string literal has its own meaning. For example, if you want to capture a literal . character, this:
new RegExp('\.')
...won't work right because that is interpreted the same as '.', making the RegExp the same as /./. You'd need to do this instead:
new RegExp('\\.')
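A small sketch of the difference, using made-up variable names, for anyone who wants to see it run:

```javascript
// '\.' is just "." once the string literal is parsed, so this matches any character.
const anyChar = new RegExp('\.');
// '\\.' is the two characters \ and . which the regex reads as a literal dot.
const literalDot = new RegExp('\\.');

console.log(anyChar.test("a"));      // true  (any character matches)
console.log(literalDot.test("a"));   // false (no dot in the input)
console.log(literalDot.test("a.b")); // true

// The path pattern from the question works as-is with the constructor.
const underBaz = new RegExp('^/foo/bar/baz/.*');
console.log(underBaz.test("/foo/bar/baz/y/z")); // true
console.log(underBaz.test("/foo/bar/qux/x"));   // false
```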
I want the following regular expression:
/(ending)$/
Where ending is a variable. I discovered that to use variables with regular expressions I must use regular expression constructors. So I tried:
var pattern = new RegExp((ending)$);
But this does not work either! This works if I do not include the grouping parenthesis and dollar sign, but I need those special characters as part of my pattern!
I tried to wrap the special characters in quotations, and also cancel them out by using a backslash, but nothing seems to work! What should I do to include special characters in my regular expression constructor?!
it takes a string...
var pattern = new RegExp("(" + ending+ ")$");
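One caveat with plain concatenation: any regex metacharacters inside the variable are interpreted as pattern syntax. If the variable is meant to be matched literally (an assumption, as the question does not say), a sketch like the following escapes it first. Both escapeRegExp and endsWithPattern are hypothetical helper names, not built-ins.

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter in text.
function escapeRegExp(text) {
  // "$&" in the replacement stands for the matched character.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build /(ending)$/ with the variable treated literally.
function endsWithPattern(ending) {
  return new RegExp("(" + escapeRegExp(ending) + ")$");
}

console.log(endsWithPattern("ing").test("testing")); // true
console.log(endsWithPattern(".js").test("app.js"));  // true
console.log(endsWithPattern(".js").test("appxjs"));  // false (the dot is literal)
```

Without the escaping step, the last call would return true, because an unescaped . matches any character.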
I tried many ways to get a single backslash from an executed (I don't mean an input from html).
I can get special characters as tab, new line and many others then escape them to \\t or \\n or \\(someother character) but I cannot get a single backslash when a non-special character is next to it.
I don't want something like:
str = "\apple"; // I want this, to return:
console.log(str); // \apple
and if I try to get character at 0 then I get a instead of \.
(See ES2015 update at the end of the answer.)
You've tagged your question both string and regex.
In JavaScript, the backslash has special meaning both in string literals and in regular expressions. If you want an actual backslash in the string or regex, you have to write two: \\.
The following string starts with one backslash, the first one you see in the literal is an escape character starting an escape sequence. The \\ escape sequence tells the parser to put a single backslash in the string:
var str = "\\I have one backslash";
The following regular expression will match a single backslash (not two); again, the first one you see in the literal is an escape character starting an escape sequence. The \\ escape sequence tells the parser to put a single backslash character in the regular expression pattern:
var rex = /\\/;
If you're using a string to create a regular expression (rather than using a regular expression literal as I did above), note that you're dealing with two levels: The string level, and the regular expression level. So to create a regular expression using a string that matches a single backslash, you end up using four:
// Matches *one* backslash
var rex = new RegExp("\\\\");
That's because first, you're writing a string literal, but you want to actually put backslashes in the resulting string, so you do that with \\ for each one backslash you want. But your regex also requires two \\ for every one real backslash you want, and so it needs to see two backslashes in the string. Hence, a total of four. This is one of the reasons I avoid using new RegExp(string) whenever I can; I get confused easily. :-)
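The two levels can be checked directly. This is only a sketch with illustrative names:

```javascript
// String level: "\\" in a literal is one backslash character in the string.
const str = "\\I have one backslash";
console.log(str[0] === "\\"); // true (the first character is a backslash)

// Four backslashes in the literal leave two in the string itself...
console.log("\\\\".length); // 2

// ...and the regex engine reads those two as "match one backslash".
const fromLiteral = /\\/;
const fromString = new RegExp("\\\\");
console.log(fromLiteral.source === fromString.source); // true

// Splitting on a single backslash shows that it matches exactly one.
console.log("C:\\temp".split(fromString)); // [ 'C:', 'temp' ]
```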
ES2015 and ES2018 update
Fast-forward to 2015, and as Dolphin_Wood points out the new ES2015 standard gives us template literals, tag functions, and the String.raw function:
// Yes, this unlikely-looking syntax is actually valid ES2015
let str = String.raw`\apple`;
str ends up having the characters \, a, p, p, l, and e in it. Just be careful there are no ${ in your template literal, since ${ starts a substitution in a template literal. E.g.:
let foo = "bar";
let str = String.raw`\apple${foo}`;
...ends up being \applebar.
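One place String.raw can help is with the RegExp constructor, since the pattern no longer needs its backslashes doubled. The following is a sketch of that idea rather than anything the answer itself prescribes, and the pattern is just an example.

```javascript
// With a normal string literal, every regex backslash must be doubled.
const doubled = new RegExp("^\\d+\\.\\d+$");

// With String.raw, the pattern can be written as it would appear in a literal.
const raw = new RegExp(String.raw`^\d+\.\d+$`);

console.log(doubled.source === raw.source); // true
console.log(raw.test("3.14")); // true
console.log(raw.test("314"));  // false (no literal dot)
```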
Try String.raw method:
str = String.raw`\apple` // "\apple"
Reference here: String.raw()
\ is an escape character; when followed by a non-special character, it doesn't become a literal \. Instead, you have to double it: \\.
console.log("\apple"); //-> "apple"
console.log("\\apple"); //-> "\apple"
With an ordinary (quoted) string literal, there is no way to recover the original, raw text of the definition or to switch off escape processing.
Please try the following; it works for me, and the output includes the backslash:
String sss="dfsdf\\dfds";
System.out.println(sss);
I'm writing a function that takes a prospective filename and validates it in order to ensure that no system disallowed characters are in the filename. These are the disallowed characters: / \ | * ? " < >
I could obviously just use string.indexOf() to search for each special char one by one, but that's a lot longer than it would be to just use string.search() using a regular expression to find any of those characters in the filename.
The problem is that most of these characters are considered to be part of describing a regular expression, so I'm unsure how to include those characters as actually being part of the regex itself. For example, the / character in a JavaScript regex tells JavaScript that it is the beginning or end of the regex. How would one write a JS regex that functionally behaves like so: filename.search(\ OR / OR | OR * OR ? OR " OR < OR >)
Put your stuff in a character class like so:
[/\\|*?"<>]
You're gonna have to escape the backslash, but the other characters lose their special meaning. Also, RegExp's test() method is more appropriate than String.search in this case.
filenameIsInvalid = /[/\\|*?"<>]/.test(filename);
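Wrapped in a function, the check might look like the sketch below. The name hasDisallowedChars is made up, and the / is escaped inside the class only as a precaution for tools that dislike a bare / in a regex literal.

```javascript
// Returns true if the filename contains any of: / \ | * ? " < >
function hasDisallowedChars(filename) {
  return /[\/\\|*?"<>]/.test(filename);
}

console.log(hasDisallowedChars("report.txt"));  // false
console.log(hasDisallowedChars("what?.txt"));   // true
console.log(hasDisallowedChars("a/b.txt"));     // true
console.log(hasDisallowedChars("back\\slash")); // true (the string holds one backslash)
```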
Include a backslash before the special characters [\^$.|?*+(){}, for instance, like \$
You can also search for a character by specified ASCII/ANSI value. Use \xFF where FF are 2 hexadecimal digits. Here is a hex table reference. http://www.asciitable.com/ Here is a regex reference http://www.regular-expressions.info/reference.html
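For instance, 0x2A is the code for * and 0x7C is the code for |, so a hex escape matches those characters literally without any extra backslash. A brief sketch:

```javascript
// \x2A matches a literal "*" (character code 0x2A), not the quantifier.
const star = /\x2A/;
console.log(star.test("a*b")); // true
console.log(star.test("ab"));  // false

// Hex escapes also work inside a character class: "*" or "|".
const starOrPipe = /[\x2A\x7C]/;
console.log(starOrPipe.test("a|b")); // true
```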
The correct syntax of the regex is:
/^[^\/\\|\*\?"<>]+$/
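The negated class [^...] matches any single character that is not listed inside it, so the whole pattern matches only a string made up entirely of allowed characters. If the filename contains any of the disallowed characters, match() returns null. So to validate, compare the result against null.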
Demo: jsFiddle.
Demo #2: Comparing against null.
The first string is valid; the second is invalid, hence null.
But obviously, you need to escape regex characters that are used in the matching. To escape a character that has special meaning in a regex, put a backslash before it, e.g. \*, \/, \$, \?.
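The null comparison can be sketched as follows. The names validName and isValidFilename are illustrative, and the sample filenames are invented for the example.

```javascript
// Matches only strings made up entirely of allowed characters.
const validName = /^[^\/\\|*?"<>]+$/;

// Hypothetical helper: a filename is valid when match() does not return null.
function isValidFilename(filename) {
  return filename.match(validName) !== null;
}

console.log("notes.txt".match(validName) !== null); // true
console.log("no*tes.txt".match(validName));         // null
console.log(isValidFilename("notes.txt"));          // true
console.log(isValidFilename("no*tes.txt"));         // false
console.log(isValidFilename(""));                   // false (+ needs at least one character)
```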
You'll need to escape the special characters. In JavaScript this is done by using the \ (backslash) character.
However, I'd recommend using something like xregexp, which will handle the escaping for you if you wish to match a string literal (something that is lacking in JavaScript's native regex support).