regex for content between and including the parentheses - javascript

Can someone help me with a regex that will catch the following:
has to be at the end of the string
remove all characters between ( and ) including the parentheses
It's going to be done in javascript.
here's what i have so far -
var title = $(this).find('title').text().replace(/\w+\s+\(.*?\)/, "");
It seems to be catching some chars outside of the parentheses though.

This deals with matching between parens, and only at the end of the string: \([^(]*\)\s*$. If the parens might be nested, you need a parser, not a regular expression.

Where's the $? You need a dollar at the end and possibly catch optional whitespace.
var title = $(this).find('title').text().replace(/\s*\([^\)]*?\)\s*$/, "");
If brackets can also be angle brackets, then this can match those too:
var title = $(this).find('title').text().replace(/\s*(\([^\)]*?\)|\<[^\>]*?\>)\s*$/, "");

var title = $(this).find('title').text().replace(/\([^()]*\)\s*$/, "");
should work.
To remove < and > you don't really need regexes, but of course you can do a mystr.replace(/[<>]+/g, "");
This will match a (, any number of characters except parentheses (thereby ensuring that only the last parentheses will match) and a ), and then the end of the string.
Currently, it allows whitespace between the parentheses and the end of the string (and will remove it, too). If that's not desired, remove the \s* bit from the regex.
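A minimal sketch of the same idea as a standalone function, so it can be tried without jQuery. The name stripTrailingParens is made up for illustration, and it assumes the parentheses are not nested.

```javascript
// Hypothetical helper: removes one parenthesized group at the end of a string,
// together with the whitespace around it. Assumes no nested parentheses.
function stripTrailingParens(title) {
  return title.replace(/\s*\([^()]*\)\s*$/, "");
}

console.log(stripTrailingParens("Some Title (2010)"));
console.log(stripTrailingParens("Some (kept) Title"));
```

Because of the $ anchor, a group in the middle of the string is left alone; only a group that ends the string is removed.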

Related

how to replace or remove all text strings outside of parenthesis in javascript

I am very new to Regex and I am trying to remove all text outside the parenthesis and only keep everything inside the parenthesis.
For example 1,
Hello,this_isLuxy.(example)
to this:
(example)
Example 2:remove everything after the period
luxySO_i.example
to this:
luxySO_i
Using JS + Regex? Thanks so much!
For this simple string, you can use indexOf and substring functions:
var openParenthesisIndex = str.indexOf('(');
var closedParenthesisIndex = str.indexOf(')', openParenthesisIndex);
var result = str.substring(openParenthesisIndex, closedParenthesisIndex + 1);
Ok, if you want to use regex, then it's going to be a bit complicated. Anyways, here you go:
var str = "Hello,this_(isL)uxy.(example) asd (todo)";
var result = str.replace(/[^()](?=([^()]*\([^()]*\))*[^()]*$)/g, '');
console.log(result); // "(isL)(example)(todo)"
In short, this replaces any non-() character that is followed, up to the end of the string, by zero or more balanced pairs of parentheses. It will fail for nested or unbalanced parentheses, though.
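If the lookahead version is hard to read, a different technique that may be easier to follow is to collect the parenthesized groups with match() and join them. This is only a sketch: keepParenGroups is a made-up name, and like the regex above it does not handle nested parentheses.

```javascript
// Hypothetical helper: keeps only the "(...)" groups of a string.
// match() with the g flag returns an array of all matches, or null if none.
function keepParenGroups(str) {
  const groups = str.match(/\([^()]*\)/g) || [];
  return groups.join("");
}

console.log(keepParenGroups("Hello,this_(isL)uxy.(example) asd (todo)"));
```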
To keep only things inside parenthesis you can use
s.replace(/.*?(\([^)]*\)).*?/g, "$1")
meaning is:
.*? any sequence of any char (but the shortest possible sequence)
\( an open parenthesis
[^)]* zero or more chars that are NOT a closed parenthesis
\) a close parenthesis
.*? any sequence of any char (but the shortest possible)
the three middle elements are what is kept using grouping (...) and $1.
To remove everything after the first period the expression is simply:
s.replace(/\..*/, "")
meaning:
\. the dot character (. is special and would otherwise mean "any char")
.* any sequence of any characters (i.e. everything until the end of the string)
replacing it with the empty string
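Here is a small sketch of both replacements wrapped in functions, using the two examples from the question. The function names are invented for illustration.

```javascript
// Hypothetical helper: keeps the parenthesized parts, using the regex above.
function keepParenthesized(s) {
  return s.replace(/.*?(\([^)]*\)).*?/g, "$1");
}

// Hypothetical helper: drops the first period and everything after it.
function beforeFirstPeriod(s) {
  return s.replace(/\..*/, "");
}

console.log(keepParenthesized("Hello,this_isLuxy.(example)"));
console.log(beforeFirstPeriod("luxySO_i.example"));
```

One caveat with the first regex: text that comes after the last closing parenthesis is not removed, because the trailing .*? matches as little as possible (nothing). If that matters, the trailing text has to be stripped in a separate step.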

Creating javascript regex to replace characters using whitelist

I'm trying to create a regex which will replace all the characters which are not in the specified white list (letters, digits, whitespace, brackets, question mark and exclamation mark)
This is the code :
var regEx = /^[^(\s|\w|\d|()|?|!|<br>)]*?$/;
qstr += tempStr.replace(regEx, '');
What is wrong with it ?
Thank you
The anchors are wrong - they only allow the regex to match the entire string
The lazy quantifier is wrong - you wouldn't want the regex to match 0 characters (if you have removed the anchors)
The parentheses and pipe characters are wrong - you don't need them in a character class.
The <br> is wrong - you can't match specific substrings in a character class.
The \d is superfluous since it's already contained in \w (thanks Alex K.!)
You're missing the global modifier to make sure you can do more than one replace.
You should be using + instead of * in order not to replace lots of empty strings with themselves.
Try
var regEx = /[^\s\w()?!]+/g;
and handle the <br>s independently (before that regex is applied, or the brackets will be removed).
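One possible way to handle the <br>s independently, as a rough sketch: split the string on the tags, clean each piece with the whitelist regex, and join the pieces back together. The function name is made up, and it assumes the tags look like <br>, <br/> or <br /> (they are all written back as <br>).

```javascript
// Hypothetical helper: strips non-whitelisted characters but keeps <br> tags.
function cleanKeepingBreaks(input) {
  return input
    .split(/<br\s*\/?>/i)                                // cut at each <br> variant
    .map((part) => part.replace(/[^\s\w()?!]+/g, ""))    // whitelist filter per piece
    .join("<br>");                                       // tags are normalized to <br>
}

console.log(cleanKeepingBreaks("Hi <br> there; ok? (yes)!"));
```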
You'll want to use the g (global) modifier:
var regEx = /^[^(\s|\w|\d|()|?|!|<br>)]*?$/g; // <-- `g` goes there
qstr += tempStr.replace(regEx, '');
This allows your expression to match multiple times.

Javascript regex parsing

I'm looking to parse some formatting out of a field using javascript. My rule is catching some extra things which I need to fix. The regex is:
/[\((\)\s)-]/g
This regex properly cleans up (123) 456-7890. The problem I'm having is that it also removes all spaces rather than just spaces following a closing parenthesis. I'm no expert in regex, but it was my understanding that (\)\s) would only remove the closing parenthesis and space combo. What would the correct regex look like? It needs to remove all parentheses and dashes, and remove spaces only when they immediately follow a closing parenthesis.
The outcomes I would like are such.
The replace method i am using should work as such
var str = mystring.replace(/[\((\)\s)-]/g, '');
(123) 456-7890 should become 1234567890, which is working.
leave me alone should stay leave me alone, but the issue is that it is becoming leavemealone.
This will do the job:
var str = mystring.replace(/\)\s*|\(\s*|-/g, '');
Explanation of the regex:
\)\s* : Close parenthesis followed by any number of whitespace
| : OR
\(\s* : Open parenthesis followed by any number of whitespace
| : OR
- : Hyphen
Since parentheses are regex metacharacters used for grouping, they need to be escaped when you want to match them literally.
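For comparison, here is a sketch of a slightly stricter variant that removes whitespace only after a closing parenthesis, which is how the question words the requirement. The function name is hypothetical.

```javascript
// Hypothetical helper: removes "(", ")" and "-" everywhere, and whitespace
// only when it directly follows a closing parenthesis.
function stripPhoneFormatting(input) {
  return input.replace(/\)\s*|[(-]/g, "");
}

console.log(stripPhoneFormatting("(123) 456-7890"));
console.log(stripPhoneFormatting("leave me alone"));
```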
Placing everything in brackets ([]) creates a class of characters to match anywhere in the input. Taking your requirements literally ("remove all parentheses, dashes and spaces immediately following a closing parentheses"):
"(123) 456-789 0".replace(/\)[\(\)\s-]+/g, ")")
Output:
"(123)456-789 0"
This matches (essentially) the same character class, but specifies that these characters immediately follow a closing parenthesis.
You could use lookbehind to ensure that there is a parenthesis or something else preceding the space:
(?<=\))\s
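A sketch of how the lookbehind could be combined with the other characters in one replace. Note that lookbehind needs a JavaScript engine with ES2018 regex support; the function name is made up for illustration.

```javascript
// Hypothetical helper: removes parentheses and dashes, plus whitespace that is
// preceded by ")" in the original input (checked with a lookbehind).
function stripWithLookbehind(input) {
  return input.replace(/[()-]|(?<=\))\s+/g, "");
}

console.log(stripWithLookbehind("(123) 456-7890"));
console.log(stripWithLookbehind("leave me alone"));
```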
------------ OLD ANSWER ----------
If you want to remove all parentheses, dashes and spaces, you would go with something like this:
/[\s-\(\)]+/g
[something] - would look for any character that is in the brackets (letters s, o, m, e, t, h, i, n, g).
\s = white space
- = dash
( = parenthesis
) = parenthesis
+ = one or more occurrences of what precedes it (which would be parentheses, white space and dashes)

Javascript string replace with regex to strip off illegal characters

Need a function to strip off a set of illegal character in javascript: |&;$%#"<>()+,
This is a classic problem to be solved with regexes, which means now I have 2 problems.
This is what I've got so far:
var cleanString = dirtyString.replace(/\|&;\$%#"<>\(\)\+,/g, "");
I am escaping the regex special chars with a backslash but I am having a hard time trying to understand what's going on.
If I try with single literals in isolation most of them seem to work, but once I put them together in the same regex depending on the order the replace is broken.
i.e. this won't work --> dirtyString.replace(/\|<>/g, "");
Help appreciated!
What you need are character classes. In that, you've only to worry about the ], \ and - characters (and ^ if you're placing it straight after the beginning of the character class "[" ).
Syntax: [characters] where characters is a list with characters.
Example:
var cleanString = dirtyString.replace(/[|&;$%#"<>()+,]/g, "");
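If the list of illegal characters is not fixed, one option is to build the character class from a string and escape only the characters that are special inside a class. This is a sketch under that assumption; buildStripper is an invented name, not a built-in.

```javascript
// Hypothetical helper: returns a function that strips every character in `chars`.
// Inside a character class only ], \, ^ and - need escaping.
function buildStripper(chars) {
  const escaped = chars.replace(/[\]\\^-]/g, "\\$&"); // prefix each with a backslash
  const re = new RegExp("[" + escaped + "]", "g");
  return (input) => input.replace(re, "");
}

const stripIllegal = buildStripper('|&;$%#"<>()+,');
console.log(stripIllegal("a|b&c;d$e(f)"));
```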
I tend to look at it from the inverse perspective which may be what you intended:
What characters do I want to allow?
This is because there could be lots of characters that make it into a string somehow and blow stuff up in ways you wouldn't expect.
For example, this one only allows letters and numbers, replacing each group of invalid characters with a hyphen:
"This¢£«±Ÿ÷could&*()\/<>be!##$%^bad".replace(/([^a-z0-9]+)/gi, '-');
//Result: "This-could-be-bad"
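The same whitelist idea as a reusable function, as a minimal sketch. The name replaceDisallowed and the optional replacement parameter are made up here.

```javascript
// Hypothetical helper: replaces each run of characters that are not ASCII
// letters or digits with `replacement` (a hyphen by default).
function replaceDisallowed(input, replacement = "-") {
  return input.replace(/[^a-z0-9]+/gi, replacement);
}

console.log(replaceDisallowed("This¢£«±Ÿ÷could&*()/<>be!##$%^bad"));
```

Keep in mind that a run of disallowed characters at the very start or end of the string is also replaced, so the result can begin or end with the replacement character.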
You need to wrap them all in a character class. The current version means replace this sequence of characters with an empty string. When wrapped in square brackets it means replace any of these characters with an empty string.
var cleanString = dirtyString.replace(/[\|&;\$%#"<>\(\)\+,]/g, "");
Put them in brackets []:
var cleanString = dirtyString.replace(/[\|&;\$%#"<>\(\)\+,]/g, "");

regex and javascript

using http://www.regular-expressions.info/javascriptexample.html I tested the following regex
^\\{1}([0-9])+
this is designed to match a backslash and then a number.
It works there
If I then try this directly in code
var reg = /^\\{1}([0-9])+/;
reg.exec("/123")
I get no matches!
What am I doing wrong?
Update:
Regarding the update of your question. Then the regex has to be:
var reg = /^\/(\d+)/;
You have to escape the slash inside the regex with \/.
The backslash needs to be escaped in the string too:
reg.exec("\\123")
Otherwise \1 will be treated as special character.
Btw, the regular expression can be simplified:
var reg = /^\\(\d+)/;
Note that I moved the quantifier + inside the capture group, otherwise it will only capture a single digit (namely 3) and not the whole number 123.
You need to escape the backslash in your string:
"\\123"
Also, if you reuse a regex that has the g flag, exec() resumes searching from reg.lastIndex, so you may want to set reg.lastIndex = 0; before each use.
In addition, {1} is completely redundant, you can simplify your regex to /^\\(\d)+/.
One last note: (\d)+ will only capture the last digit, you may want (\d+).
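To see the difference between the two capture groups, here is a small sketch. The function name is hypothetical; the input string is written as "\\123" so that it really contains a backslash followed by 123.

```javascript
// Hypothetical helper: runs both patterns and returns what group 1 captured.
function captureDigits(input) {
  const repeatedGroup = /^\\(\d)+/.exec(input); // group repeated: keeps the last digit
  const wholeGroup = /^\\(\d+)/.exec(input);    // quantifier inside: keeps all digits
  return {
    repeatedGroup: repeatedGroup && repeatedGroup[1],
    wholeGroup: wholeGroup && wholeGroup[1],
  };
}

console.log(captureDigits("\\123"));
```

With a forward slash in the input ("/123"), neither pattern matches and exec() returns null, which is what the question ran into.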
