Javascript Regex to replace any non-alphanumeric characters, including brackets - javascript

I have this regex /[\W_]+/g that I use to remove any non-alphanumeric characters. However it does not remove brackets.
What I need is for it to remove any kind of bracket/parenthesis so that a string like Hello (world) becomes helloworld.
A string like Hello(world) becomes helloworld, but it does not work if there is a space between them.
Is this possible?

You should be able to use this Java / JavaScript compliant regex according to RegexBuddy 4.x:
([\W\s_]+)
And just replace anything it matches with '' or ""
Following the documentation here, something like this:
#set($mystring = "Hello (world)! It's _{now}_ or -- [never]...;")
$mystring.replaceAll("</?([\W\s_]+)/?>", "");
=>
HelloworldItsnowornever
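For plain JavaScript, a minimal sketch of the same idea follows. The helper name toAlphanumeric is made up for illustration, and the toLowerCase() call is an assumption based on the lowercase helloworld in the question, since the regex itself never changes case.

```javascript
// Hypothetical helper: strips every run of non-alphanumeric characters.
function toAlphanumeric(input) {
  // \W matches anything outside [A-Za-z0-9_]; listing _ as well removes underscores.
  // Whitespace, brackets and parentheses are all covered by \W.
  return input.replace(/[\W_]+/g, "").toLowerCase();
}

console.log(toAlphanumeric("Hello (world)")); // helloworld
console.log(toAlphanumeric("Hello (world)! It's _{now}_ or -- [never]...;")); // helloworlditsnowornever
```

Since \W already covers whitespace, the space between Hello and (world) is removed together with the parentheses.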

Related

Regex JS: Matching string between two strings including newlines

I want to match string between two string with Regex including newlines.
For example, I have the next string:
{count, plural,
one {apple}
other {apples}
}
I need to get string between plural, and one. It will be \n*space**space*.
I tried this Regex:
/(?:plural,)(.*?)(?:one)/gs
It works, but not in JS. How to do that with JavaScript?
To match everything, including newline characters, you can use [\s\S] or [^]
var str = `{count, plural,
one {apple}
other {apples}
} `;
console.log(str.match(/(?:plural,)([\s\S]*?)(?:one)/g));
console.log(str.match(/(?:plural,)([^]*?)(?:one)/g));
It doesn't work because you're testing with the wrong regex engine.
The `s` flag did not exist in the JS regex engine before ES2018, only in engines such as PCRE.
It must be something like:
/(?:plural,)((.|\n)*?)(?:one)/g
Hope it helps.
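To pull out only the text between the two markers, a capture group with exec() may be more convenient than match() with the g flag. The sketch below assumes a sample string with two spaces of indentation, and shows both the portable [\s\S] form and the s (dotAll) flag, which is only available in engines that implement ES2018 or later.

```javascript
// Sample input; the two-space indentation is an assumption for illustration.
const message = "{count, plural,\n  one {apple}\n  other {apples}\n}";

// Portable form: [\s\S] matches any character, including newlines.
const portable = /plural,([\s\S]*?)one/.exec(message);

// ES2018+ form: the s flag lets . match newlines as well.
const dotAll = /plural,(.*?)one/s.exec(message);

// JSON.stringify makes the captured whitespace visible.
console.log(JSON.stringify(portable[1])); // "\n  "
console.log(JSON.stringify(dotAll[1]));   // "\n  "
```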

matching character when preceded by specific string

I'm trying to match a character only when it is preceded by an exact string, without matching the preceding string itself. How can I do that?
I have the string
+1545464454<+440545545454<+210544455454<+75455454545
The above string contains phone numbers with international prefixes, but some of them have a 0 between the prefix and the number, and I need to take it out.
I have /(\+4|\+44|\+7 ... allprefixes here...)0/g but this selects the prefix as well and I need to select only 0
It's in JavaScript.
You're almost there. Just use a capturing group and the replace function like below. Most languages support capturing groups.
/(\+(?:4|44|7 ... allprefixes here without `+`...))0/g
Replacement string:
$1
or \1
If you're on PHP, \K should work. \K discards the previously matched characters from the final match.
'~\+(?:4|44|7)\K0~'
In javascript.
> var str = "+1545464454<+440545545454<+210544455454<+75455454545"
> str.replace(/(\+(?:44|21|7|4))0/g, "$1")
'+1545464454<+44545545454<+21544455454<+75455454545'
If your engine supports lookbehinds, you can use one, as in the following regex:
/(?<=\+(4|44|7))0/g
JavaScript engines older than ES2018 don't support lookbehind. There you'll need to use something like this:
str.replace(/(\+(4|44|7))0/g, "$1");
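In engines that implement ES2018 or later, lookbehind is available, so the 0 can be matched on its own and replaced with an empty string. The following is a sketch using the sample string from the question; the prefix list (44, 21, 7, 4) is only the handful used in the examples above, not a complete list.

```javascript
const numbers = "+1545464454<+440545545454<+210544455454<+75455454545";

// (?<=...) asserts that the 0 is preceded by "+" and one of the prefixes,
// without including the prefix in the match.
const cleaned = numbers.replace(/(?<=\+(?:44|21|7|4))0/g, "");

console.log(cleaned); // +1545464454<+44545545454<+21544455454<+75455454545
```

Because the prefix is not part of the match, no $1 back-reference is needed in the replacement.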

Writing a Javascript regex that includes special reserved characters

I'm writing a function that takes a prospective filename and validates it in order to ensure that no system disallowed characters are in the filename. These are the disallowed characters: / \ | * ? " < >
I could obviously just use string.indexOf() to search for each special char one by one, but that's a lot longer than it would be to just use string.search() using a regular expression to find any of those characters in the filename.
The problem is that most of these characters are considered to be part of describing a regular expression, so I'm unsure how to include those characters as actually being part of the regex itself. For example, the / character in a Javascript regex tells Javascript that it is the beginning or end of the regex. How would one write a JS regex that functionally behaves like so: filename.search(\ OR / OR | OR * OR ? OR " OR < OR >)
Put your stuff in a character class like so:
[/\\|*?"<>]
You're gonna have to escape the backslash, but the other characters lose their special meaning. Also, RegExp's test() method is more appropriate than String.search in this case.
filenameIsInvalid = /[/\\|*?"<>]/.test(filename);
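Wrapped in a function, the check might look like the sketch below. The name isValidFilename is hypothetical, and the forward slash is escaped inside the class only as a precaution for tools that parse regex literals less carefully than the engine does.

```javascript
// Hypothetical helper: returns true when none of / \ | * ? " < > appears.
function isValidFilename(filename) {
  // Inside a character class only the backslash must be escaped;
  // | * ? " < > are treated literally there.
  return !/[\/\\|*?"<>]/.test(filename);
}

console.log(isValidFilename("report.txt"));    // true
console.log(isValidFilename("what?.txt"));     // false
console.log(isValidFilename("dir/file.txt"));  // false
```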
Include a backslash before the special characters [\^$.|?*+(){}, for instance, like \$
You can also search for a character by specified ASCII/ANSI value. Use \xFF where FF are 2 hexadecimal digits. Here is a hex table reference. http://www.asciitable.com/ Here is a regex reference http://www.regular-expressions.info/reference.html
The correct syntax of the regex is:
/^[^\/\\|\*\?"<>]+$/
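[^...] is a negated character class: it matches any single character that is not listed inside it. Because the pattern is anchored with ^ and $, it matches only when every character in the filename is allowed; if the filename contains a disallowed character, match() returns null. So validation is a matter of comparing the result against null.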
Demo: jsFiddle.
Demo #2: Comparing against null.
The first string is valid; the second is invalid, hence null.
But obviously, you need to escape regex characters that are used in the matching. To escape a character that has a special meaning in a regex, put a backslash before it, e.g. \*, \/, \$, \?.
You'll need to escape the special characters. In javascript this is done by using the \ (backslash) character.
I'd recommend however using something like xregexp which will handle the escaping for you if you wish to match a string literal (something that is lacking in javascript's native regex support).
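Without a library, the escaping can also be done with a small helper. The function below is a commonly used sketch, and the name escapeRegExp is not a built-in; it puts a backslash in front of every character that has a special meaning in a pattern, so the result can be passed to new RegExp().

```javascript
// Hypothetical helper: escapes regex metacharacters in a literal string.
function escapeRegExp(literal) {
  // $& in the replacement stands for the matched character itself.
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const pattern = new RegExp(escapeRegExp("a.b*c"));

console.log(escapeRegExp("a.b*c"));   // a\.b\*c
console.log(pattern.test("a.b*c"));   // true
console.log(pattern.test("aXb*c"));   // false, the dot is now literal
```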

Javascript string replace with regex to strip off illegal characters

Need a function to strip off a set of illegal character in javascript: |&;$%#"<>()+,
This is a classic problem to be solved with regexes, which means now I have 2 problems.
This is what I've got so far:
var cleanString = dirtyString.replace(/\|&;\$%#"<>\(\)\+,/g, "");
I am escaping the regex special chars with a backslash but I am having a hard time trying to understand what's going on.
If I try with single literals in isolation most of them seem to work, but once I put them together in the same regex depending on the order the replace is broken.
i.e. this won't work --> dirtyString.replace(/\|<>/g, "");
Help appreciated!
What you need are character classes. In that, you've only to worry about the ], \ and - characters (and ^ if you're placing it straight after the beginning of the character class "[" ).
Syntax: [characters] where characters is a list with characters.
Example:
var cleanString = dirtyString.replace(/[|&;$%#"<>()+,]/g, "");
I tend to look at it from the inverse perspective which may be what you intended:
What characters do I want to allow?
This is because there could be lots of characters that make it into a string somehow and blow stuff up in ways you wouldn't expect.
For example, this one allows only letters and numbers, replacing each group of invalid characters with a hyphen:
"This¢£«±Ÿ÷could&*()\/<>be!##$%^bad".replace(/([^a-z0-9]+)/gi, '-');
//Result: "This-could-be-bad"
You need to wrap them all in a character class. The current version means replace this sequence of characters with an empty string. When wrapped in square brackets it means replace any of these characters with an empty string.
var cleanString = dirtyString.replace(/[\|&;\$%#"<>\(\)\+,]/g, "");
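The difference between the two forms can be seen side by side in the sketch below; the sample input is made up for illustration and has each illegal character separated by a letter.

```javascript
// Made-up sample input containing every illegal character once.
const dirtyString = 'a|b&c;d$e%f#g"h<i>j(k)l+m,n';

// Without brackets the pattern only matches the exact sequence |&;$%#"<>()+,
const asSequence = dirtyString.replace(/\|&;\$%#"<>\(\)\+,/g, "");

// With brackets it matches any single one of those characters.
const asClass = dirtyString.replace(/[|&;$%#"<>()+,]/g, "");

console.log(asSequence === dirtyString); // true, nothing was removed
console.log(asClass);                    // abcdefghijklmn
```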
Put them in brackets []:
var cleanString = dirtyString.replace(/[\|&;\$%#"<>\(\)\+,]/g, "");

What does /[\[]/ do in JavaScript?

I am having trouble googling this. In some code I see
name = name.replace(/[\[]/,"\\\[").replace(/[\]]/,"\\\]");
/[\[]/ looks to be 1 parameter. What do the symbols do? It looks like it's replacing [] with \[\] but what specifically does /[\[]/ do?
The syntax /…/ is the regular expression literal syntax. The regular expression [\[] describes a character class ([…]) whose only member is the [ character. So /[\[]/ is a regular expression that matches a single [.
But since the global flag is not set (so only the first match will be replaced), the whole thing could be replaced with this (probably easier to read):
name.replace("[", "\\[").replace("]","\\]")
But if all matches should be replaced, I would probably use this:
name.replace(/([[\]])/g, "\\$1")
It's a regular expression that matches the left square bracket character.
It's a weird way to do it; overall it looks like the code is trying to put backslashes before square brackets in a string, which you could also do like this:
var s2 = s1.replace(/\[/g, '\\[').replace(/]/g, '\\]');
I think.
/[\[]/ defines a character class which includes only the '[' character (escaped). You are correct that it replaces [] with \[\].
The [] is used in regex itself to denote a set of to-be-matched characters. If you want to represent the actual [ or ] in regex, then you need to escape it with \, hence the [\[] and [\]]. The leading and trailing / are just part of the standard JS syntax to denote a regex pattern.
After all, it replaces [ by \[ and then replaces ] by \].
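A short sketch with a made-up input shows what the original chain does when a string contains more than one pair of brackets, and how a single global pattern compares.

```javascript
// Made-up sample input with two bracket pairs.
const name = "a[1][2]";

// No g flag: each call replaces only the first match it finds.
const firstOnly = name.replace(/[\[]/, "\\[").replace(/[\]]/, "\\]");

// One character class for both brackets plus the g flag escapes all of them.
// $& in the replacement stands for the matched bracket.
const allMatches = name.replace(/[[\]]/g, "\\$&");

console.log(firstOnly);  // a\[1\][2]
console.log(allMatches); // a\[1\]\[2\]
```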
