How to remove a substring between two specific characters - javascript

So I have a string:
"this is the beginning, this is what i want to remove/ and this is the end"
How do I use Javascript to target the string between the comma and the forward slash?
(I also want to remove the comma and the slash)

string = string.replace( /,.*?\//, '' );
In the regexp, , matches itself, .*? matches any sequence of characters (preferring the shortest match, due to the ? modifier) and \/ matches a slash. (The slash needs to be escaped with a backslash because it's also used as the regexp delimiter. Unfortunately, JavaScript doesn't support alternative regexp literal delimiters, so you'll just have to cope with the leaning toothpick syndrome.)
Note that the code above removes everything from the first comma to the first following slash in the string. Depending on what you want to happen if there might be multiple commas or slashes, there are various ways to modify the behavior of the regexp. For example:
.replace( /,.*\//, '' ) removes everything from the first comma to the last slash;
.replace( /^(.*),.*?\//, '$1' ) removes everything from the last comma followed by a slash up to the next slash;
.replace( /^(.*),.*\//, '$1' ) removes everything from the last comma followed by a slash up to the last slash in the string;
.replace( /,[^\/]*\//, '' ) does the same as the original regexp, i.e. removes everything from the first comma to the first following slash (even if there are other commas in between);
.replace( /,[^,\/]*\//, '' ) removes the first substring that begins with a comma and ends with a slash, and does not contain any commas or slashes in between;
.replace( /^(.*),[^,\/]*\//, '$1' ) removes the last substring that begins with a comma and ends with a slash, and has no commas or slashes in between; and
.replace( /,[^,\/]*\//g, '' ) removes all substrings that begin with a comma and end with a slash, and have no commas or slashes in between.
Also note that, due to a historical quirk of regexp syntax, the . metacharacter will actually match any character except a newline ("\n"). Most regexp implementations provide some way to turn off this quirk and make . really match all characters; JavaScript only gained one with the s (dotAll) flag in ES2018. If your string might contain newlines and you cannot rely on that flag, you should replace all occurrences of . in the regexps above with some workaround such as [\s\S] (works in most PCRE-style regexp engines) or [^] (shorter, but specific to JavaScript's regexp flavor).
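To make the difference between the variants listed above concrete, here is a small sketch that runs a few of them against a made-up sample string. The sample and the variable names are illustrative only and do not come from the question.

```javascript
// Made-up sample with several commas and slashes (not from the question).
const sample = "a,b/c,d,e/f/g";

// First comma to the first following slash.
const firstToFirst = sample.replace(/,.*?\//, "");
// First comma to the last slash.
const firstToLast = sample.replace(/,.*\//, "");
// Last comma that is followed by a slash, up to the next slash.
const lastToNext = sample.replace(/^(.*),.*?\//, "$1");
// Every comma-to-slash run with no commas or slashes in between.
const allClean = sample.replace(/,[^,\/]*\//g, "");

console.log(firstToFirst); // ac,d,e/f/g
console.log(firstToLast);  // ag
console.log(lastToNext);   // a,b/c,df/g
console.log(allClean);     // ac,df/g
```

On the string from the question there is only one comma and one slash, so all of these give the same result.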

As your question is tagged with regex, you seem to want .replace:
return str.replace(/, this is what i want to remove\//, "");
If you don't know the string in between, use
/,(.+?)\//
or with a * instead of the + if the string could also be empty
With string functions, it would be
var cpos = str.indexOf(","),
spos = str.indexOf("/");
if (cpos > -1 && spos > cpos)
return str.substr(0, cpos)+str.substr(spos+1);

return str.split(',')[0] + str.split('/')[1];

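var commaIndex = str.indexOf(',');
var slashIndex = str.indexOf('/', commaIndex);
// keep the text before the comma and the text after the slash
var finalString = str.substring(0, commaIndex) + str.substring(slashIndex + 1);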
Hope it will work

Related

Regex keeps finding character I want matched along with previous character

I have the following regex in JavaScript for a split operation, since I can't use a negative lookbehind to find any delimiter , in a string that is not preceded by one or more escape characters \.
[^\\],
The regex works fine for finding the commas that are not preceded by \, but it also includes the character that precedes the comma in the match and thus splits the string incorrectly.
For example if I had the string
hello\,there,are
The result would be that e, matches my regex, not just the comma, making the split array read
[hello\,ther] [are]
Why does the regex I am using keep matching the comma and the preceding character instead of only the comma?
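[^\\] consumes the character before the comma, so split treats both characters as the delimiter. Splitting correctly here needs a lookbehind, which JS regex did not support before ES2018. Without it, use a match with an appropriate regex, like the one below: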
/(?:[^\\,]|\\.)+/g
See the regex demo.
The pattern matches one or more (+) repetitions of either any char other than , and \ ([^\\,]) or (|) an escaped character, i.e. a backslash followed by any char except a line break (\\.).
JS demo:
var regex = /(?:[^\\,]|\\.)+/g;
var str = "hello\\,there,are";
var res = str.match(regex);
console.log(res);
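As a rough sketch, the match-based approach above can be wrapped in a helper. The function name splitOnUnescapedCommas is made up for illustration, and the fallback to an empty array is an added assumption for inputs where nothing matches.

```javascript
// Hypothetical helper: return the fields separated by commas that are
// not escaped with a backslash, using the match-based regex above.
function splitOnUnescapedCommas(input) {
  // match() returns null when nothing matches, so fall back to [].
  return input.match(/(?:[^\\,]|\\.)+/g) || [];
}

const fields = splitOnUnescapedCommas("hello\\,there,are");
console.log(fields); // [ 'hello\\,there', 'are' ]

// Because of the +, empty fields are dropped rather than kept as "".
const sparse = splitOnUnescapedCommas("a,,b");
console.log(sparse); // [ 'a', 'b' ]
```

Unlike split, this drops empty fields, so it only fits if consecutive commas carry no meaning in your data.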

Replace multiline text between two strings

I need to replace old value between foo{ and }bar using Javascript regex.
foo{old}bar
This works if old is a single line:
replace(
/(foo{).*(}bar)/,
'$1' + 'new' + '$2'
)
I need to make it work with:
foo{old value
which takes more
than one line}bar
How should I change my regex?
Change your regex to,
/(foo{)[^{}]*(}bar)/
OR
/(foo{)[\s\S]*?(}bar)/
so that it also matches newline characters. [^{}]* matches any character except { or }, zero or more times. [\s\S]*? matches any whitespace or non-whitespace character (that is, any character at all), zero or more times, non-greedily.
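A quick way to check the behaviour is to run the original pattern and both suggested patterns on the multi-line input from the question. The variable names below are illustrative only.

```javascript
// The multi-line input from the question.
const text = "foo{old value\nwhich takes more\nthan one line}bar";

// Original pattern: . does not match "\n", so nothing is replaced.
const withDot = text.replace(/(foo{).*(}bar)/, "$1new$2");

// Negated class: anything except braces, newlines included.
const withNegatedClass = text.replace(/(foo{)[^{}]*(}bar)/, "$1new$2");

// [\s\S] matches any character; *? stops at the first "}bar".
const withAnyChar = text.replace(/(foo{)[\s\S]*?(}bar)/, "$1new$2");

console.log(withDot === text);   // true
console.log(withNegatedClass);   // foo{new}bar
console.log(withAnyChar);        // foo{new}bar
```

The two fixes differ when the old value itself contains braces: [^{}]* refuses to match in that case, while [\s\S]*? still runs up to the first }bar.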

Capturing consecutive non-alphanumeric characters in a string in a single group

I want to replace all special characters in a string with dashes. I use the following regex to replace the characters.
var x = "Querty(&)keypad";
alert(x.replace(/[^A-Za-z0-9]/g, "-"));
However, this causes each character to be replaced by a dash, rather than replacing consecutive characters with a single dash. This example gives me the output Querty---keypad. My desired output is Querty-keypad.
You can see the issue in this jsfiddle.
Use + to match 1 or more repetitions:
> "Querty(&)keypad".replace(/[^A-Za-z0-9]+/g, "-")
"Querty-keypad"

Why does my regexp not work when the strings end with spaces?

I am using this regexp - [^\s\da-zA-ZåäöÅÄÖ]+$ to filter out anything but A-Z, 0-9 plus the Swedish characters ÅÄÖ. It works as expected as long as the string doesn't end with whitespace, and I am a bit confused about what I need to correct to make it accept strings even if they end with whitespace. The \s is there but is apparently not enough.
What is wrong in my regexp?
"something #¤%&/()=?".replace(/[^\s\da-zA-ZåäöÅÄÖ]+$/, '') // => "something "
"something ending with whitespace #¤%&/()=? ".replace(/[^\s\da-zA-ZåäöÅÄÖ]+$/, '') // => unchanged: "something ending with whitespace #¤%&/()=? "
You're using a negated character class ("anything that is not a space, a digit, a letter etc."), therefore your regex fails to match.
Drop the \s from it, and also the $ (which ties the match to the end of the string), and it should work.
If you do want to keep spaces inside the string and only remove them at the end, use
"something with whitespace #¤%&/()=? ".replace(/[^\s\da-zA-ZåäöÅÄÖ]+|\s+$/g, '')
Result:
something with whitespace
Your regex says: "match one or more instances of the characters not in the following range, followed by end-of-string". This essentially means that your regex will match only sequences of not-allowed characters appearing at the end of the string. Since your test string ends with a whitespace, which is allowed by your logic, there's no 'sequence of not-allowed characters appearing at the end of the string' and so the regex doesn't match anything.
You can achieve your desired filtering if you remove the $ from the end of the regex and instead use the g flag to make it globally replace anything not in the specified character range with the empty string.
If you additionally want to trim trailing whitespace, it'd be better to do so using another regex, or a simpler trimEnd call (trimRight is its legacy alias).
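Put together, the advice above (drop the $, add the g flag, trim separately) might look like the sketch below. The helper name keepAllowed is made up for illustration.

```javascript
// Hypothetical helper: strip every character that is not whitespace,
// a digit, A-Z/a-z or one of the Swedish letters, then trim the end.
function keepAllowed(input) {
  return input
    .replace(/[^\s\da-zA-ZåäöÅÄÖ]+/g, "") // no $ anchor, global flag
    .trimEnd();                           // drop trailing whitespace
}

const cleaned = keepAllowed("something ending with whitespace #¤%&/()=? ");
console.log(JSON.stringify(cleaned)); // "something ending with whitespace"
```

Whitespace inside the string is kept as-is; only the run at the very end is removed.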

Regex to replace single backslashes, excluding those followed by certain chars

I have a regex expression which removes any backslashes from a string if not followed by one of these characters: \ / or }.
It should turn this string:
foo\bar\\batz\/hi
Into this:
foobar\\batz\/hi
But the problem is that it is dealing with each backslash as it goes along. So it follows the rule in that it removes that first backslash, and ignores the 2nd one because it is followed by another backslash. But when it gets to the 3rd one, it removes it, because it isn't followed by another.
My current code looks like this: str.replace(/\\(?!\\|\/|\})/g,"")
But the resulting string looks like this: foobar\batz\/hi
How do I get it to skip the 3rd backslash? Or is it a case of doing some sort of explicit negative search & replace type thing? Eg. replace '\', but don't replace '\\', '\/' or '\}'?
Please help! :)
EDIT
Sorry, I should have explained - I am using javascript, so I don't think I can do negative lookbehinds...
You need to watch out for an escaped backslash, followed by a single backslash. Or better: an uneven number of successive backslashes. In that case, you need to keep the even number of backslashes intact, and only replace the last one (if not followed by a / or {).
You can do that with the following regex:
(?<!\\)(?:((\\\\)*)\\)(?![\\/{])
and replace it with:
$1
where the first match group is the first even number of backslashes that were matched.
A short explanation:
(?<!\\) # looking behind, there can't be a '\'
(?:((\\\\)*)\\) # match an uneven number of backslashes and store the even number in group 1
(?![\\/{]) # looking ahead, there can't be a '\', '/' or '{'
In plain English that would read:
match an uneven number of back-slashes, (?:((\\\\)*)\\), not followed by \\ or { or /, (?![\\/{]), and not preceded by a backslash (?<!\\).
A demo in Java (remember that the backslashes are double escaped!):
String s = "baz\\\\\\foo\\bar\\\\batz\\/hi";
System.out.println(s);
System.out.println(s.replaceAll("(?<!\\\\)(?:((\\\\\\\\)*)\\\\)(?![\\\\/{])", "$1"));
which will print:
baz\\\foo\bar\\batz\/hi
baz\\foobar\\batz\/hi
EDIT
And a solution that does not need look-behinds would look like:
([^\\])((\\\\)*)\\(?![\\/{])
and is replaced by:
$1$2
where $1 is the non-backslash char at the start, and $2 is the even (or zero) number of backslashes following that non-backslash char.
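Since the question is about JavaScript, here is a rough port of the lookbehind-free idea. Two things are assumptions of this sketch and not part of the answer above: the leading group is (^|[^\\]) so that a backslash at the very start of the string is also handled, and the lookahead uses the set of protected characters stated in the question (\, / and }). The function name is made up.

```javascript
// Hypothetical helper: remove a backslash unless it is followed by
// \, / or }, while keeping pairs of escaped backslashes intact.
function stripLoneBackslashes(input) {
  // $1 = start of string or the non-backslash char before the run
  // $2 = the even number (possibly zero) of backslashes to keep
  return input.replace(/(^|[^\\])((?:\\\\)*)\\(?![\\\/}])/g, "$1$2");
}

// String.raw keeps the backslashes exactly as typed.
const fromQuestion = stripLoneBackslashes(String.raw`foo\bar\\batz\/hi`);
const fromJavaDemo = stripLoneBackslashes(String.raw`baz\\\foo\bar\\batz\/hi`);

console.log(fromQuestion); // foobar\\batz\/hi
console.log(fromJavaDemo); // baz\\foobar\\batz\/hi
```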
The required regex is as simple as \\. (a backslash followed by any single character).
You need to know however, that the second argument to replace() can be a function like so:
result = string.replace(/\\./g, function (ab) { // ab is the matched portion of the input string
var b = ab.charAt(1);
switch (b) { // if char after backslash
case '\\': case '}': case '/': // ..is one of these
return ab; // keep original string
default: // else
return b; // replace by second char
}
});
You need a lookahead, like you have, and also a lookbehind, to ensure that you don't delete the second backslash (which clearly doesn't have a special character after it). Try this as your regex:
(?<![\\])[\\](?![\\\/\}])
