JavaScript regex replace - but only part of matched string?

I have the following replace function
myString.replace(/\s\w(?=\s)/,"$1\xA0");
The aim is to take single-letter words (e.g. prepositions) and add a non-breaking space after them, instead of standard space.
However the above $1 variable doesn't work for me. It inserts text "$1 " instead of a part of original matched string + nbsp.
What is the reason for the observed behaviour? Is there any other way to achieve it?

$1 doesn't work because you don't have any capturing subgroups.
The regular expression should be something like /\b(\w+)\s+/.

Seems you want to do something like this:
myString.replace(/\s(\w)\s/,"$1\xA0");
but that way you will lose the whitespace before your single-letter word. So you probably want to also include the first \s in the capturing group.
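A minimal sketch of that suggestion, with the leading \s moved inside the capturing group so it survives the replacement. The function name addNbsp and the g flag are my own additions for illustration, not part of the question.

```javascript
// Hypothetical helper: put a non-breaking space after single-letter words.
// The group (\s\w) captures the leading whitespace plus the letter, so
// "$1" puts both back and only the trailing space is swapped for \xA0.
function addNbsp(text) {
  return text.replace(/(\s\w)\s/g, "$1\xA0");
}

console.log(JSON.stringify(addNbsp("This is a test")));
```

Because each match consumes the trailing space, two single-letter words in a row ("a b c") will not both be handled, and a single-letter word at the very start of the string is skipped since nothing precedes it.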

Related

Understanding replacement using regex

I want to remove all trailing and leading dashes (-) and replace any repeating dashes with one dash otherwise in JavaScript. I've developed a regex to do it:
"----asdas----asd-as------q---".replace(/^-+()|()-+$|(-)+/g,'$3')
And it works:
asdas-asd-as-q
But I don't understand the $3 part (obtained through desperate experiment). Why not $1?
You can actually use this without any capturing groups:
"----asdas----asd-as------q---".replace(/^-+|-+$|-+(?=-)/g, '');
//=> "asdas-asd-as-q"
Here -+(?=-) uses a positive lookahead so that it matches one or more hyphens only when another hyphen follows, i.e. every hyphen in a run except the last one.
Because there are three capturing groups (two redundant empty ones and (-)), $3 is replaced with the string matched by the third group.
If you remove the first two empty capturing groups, you can use $1.
"----asdas----asd-as------q---".replace(/^-+|-+$|(-)+/g, '$1')
// => "asdas-asd-as-q"
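To see why $3 works and $1 does not, one option is to pass a function as the replacement and record what each group holds per match. This is only an inspection sketch; the variable names (seen, with1, with3) are made up for the example.

```javascript
const input = "----asdas----asd-as------q---";
const pattern = /^-+()|()-+$|(-)+/g;

// Record the contents of every capturing group for each match.
// Groups that did not take part in a match arrive as undefined.
const seen = [];
input.replace(pattern, (match, g1, g2, g3) => {
  seen.push({ match, g1, g2, g3 });
  return match; // leave the text unchanged, we only want to inspect
});
console.log(seen);

// Only the middle runs of dashes set group 3 (to "-"); group 1 is
// either empty or unset, so "$1" always inserts an empty string.
const with1 = input.replace(pattern, "$1");
const with3 = input.replace(pattern, "$3");
console.log(with1); // "asdasasdasq"
console.log(with3); // "asdas-asd-as-q"
```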
As other answers say, $3 refers to the third captured subpattern, i.e. the third set of parentheses.
Personally, however, I would see that as two operations, and do it as such:
Trim leading and trailing -s
Condense duplicate -s
Like so:
"----asdas----asd-as------q---".replace(/^-+|-+$/g,"").replace(/--+/g,"-");
This kind of concept may mean more code, but I believe it makes it much easier to read and understand what's going on here, because you're doing one thing at a time instead of trying to do everything at once.
$1, $2 and so on refer to the capturing groups formed by the pattern, in order.
See demo.
http://regex101.com/r/pP3pN1/25
On the right side you can see the groups being generated by ().
Replace and see. $1 is blank in your case.

Nice way to do this regex substitution

I'm writing a JavaScript function which takes a regex and some elements, and matches the regex against each element's name attribute.
Let's say I'm passed this regex
/cmw_step_attributes\]\[\d*\]/
and a string that is structured like this
"foo[bar][]chicken[123][cmw_step_attributes][456][name]"
where all the numbers could vary, or be missing. I want to match the regex against the string in order to swap out the 456 for another number (which will vary), e.g. 789. So, I want to end up with
"foo[bar][]chicken[123][cmw_step_attributes][789][name]"
The regex will match the string, but I can't swap out the whole match for 789 as that will wipe out the "[cmw_step_attributes][" bit. There must be a clean and simple way to do this but I can't get my head round it. Any ideas?
thanks, max
Capture the first part and put it back into the string.
.replace(/(cmw_step_attributes\]\[)\d*/, '$1789');
// note: I removed the closing \] from the end of the pattern - \d* is greedy, so it consumes all the digits
// alternatively:
.replace(/cmw_step_attributes\]\[\d*\]/, 'cmw_step_attributes][789]')
Either literally rewrite the part that must remain the same in the replacement string, or place it inside capturing parentheses and reference it in the replacement.
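Since the new number varies, a replacer function may be easier to reason about than building a replacement string where $1 is immediately followed by digits. A rough sketch, assuming a helper named replaceStepIndex (hypothetical, not from the question):

```javascript
// Hypothetical helper: swap the number after [cmw_step_attributes][
// for newIndex. The captured prefix is handed to the callback, so no
// "$1" + digits string has to be assembled.
function replaceStepIndex(name, newIndex) {
  return name.replace(
    /(cmw_step_attributes\]\[)\d*/,
    (match, prefix) => prefix + String(newIndex)
  );
}

const original = "foo[bar][]chicken[123][cmw_step_attributes][456][name]";
console.log(replaceStepIndex(original, 789));
// "foo[bar][]chicken[123][cmw_step_attributes][789][name]"
```

Because the pattern uses \d*, this also fills in the number when it is missing, e.g. "[cmw_step_attributes][][name]".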
See answer on: Regular Expression to match outer brackets.
Regular expressions are the wrong tool for the job because you are dealing with nested structures, i.e. recursion.
Have you tried:
var str = 'foo[bar][]chicken[123][cmw_step_attributes][456][name]';
str.replace(/cmw_step_attributes\]\[\d*?\]/gi, 'cmw_step_attributes][XXX]');

Alternation operator inside square brackets does not work

I'm creating a javascript regex to match queries in a search engine string. I am having a problem with alternation. I have the following regex:
.*baidu.com.*[/?].*wd{1}=
I want to be able to match strings that have the string 'word' or 'qw' in addition to 'wd', but everything I try is unsuccessful. I thought I would be able to do something like the following:
.*baidu.com.*[/?].*[wd|word|qw]{1}=
but it does not seem to work.
Replace [wd|word|qw] with (wd|word|qw) or (?:wd|word|qw).
[] denotes character sets, () denotes logical groupings.
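A quick way to see the difference is to test both forms on their own. The anchored patterns below are simplified stand-ins for the parameter-name part of the question's regex, not the full expression.

```javascript
// [wd|word|qw] is a character set: exactly ONE of w, d, |, o, r, q.
const charClass = /^[wd|word|qw]=$/;
// (?:wd|word|qw) is a group with alternation: one of the three words.
const group = /^(?:wd|word|qw)=$/;

console.log(charClass.test("w="));    // true  (a single character from the set)
console.log(charClass.test("|="));    // true  (the pipe is just another member)
console.log(charClass.test("wd="));   // false (two characters)
console.log(group.test("wd="));       // true
console.log(group.test("word="));     // true
console.log(group.test("qw="));       // true
```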
Your expression:
.*baidu.com.*[/?].*[wd|word|qw]{1}=
does need a few changes, including changing [wd|word|qw] to (wd|word|qw) and getting rid of the redundant {1}, like so:
.*baidu.com.*[/?].*(wd|word|qw)=
But you also need to understand that the first part of your expression (.*baidu.com.*[/?].*) will match baidu.com hello what spelling/handle????????? or hbaidu-com/ or even something like lkas----jhdf lkja$##!3hdsfbaidugcomlaksjhdf.[($?lakshf, because an unescaped dot (.) matches any character except a newline. To match a literal dot, you have to escape it with a backslash (\.).
There are several approaches you could take to match things in a URL, but we could help you more if you tell us what you are trying to do or accomplish - perhaps regex is not the best solution or (EDIT) only part of the best solution?
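The point about the unescaped dot can be checked directly. In this sketch, strict is one possible tightened pattern (escaped dot, non-capturing group, leading .* dropped); it is an illustration, not a complete URL matcher, and the sample URLs are invented.

```javascript
// The corrected pattern from the answer, dot still unescaped.
const loose = /.*baidu.com.*[\/?].*(wd|word|qw)=/;
// Assumed tightening: "\." only matches a literal dot.
const strict = /baidu\.com.*[\/?].*(?:wd|word|qw)=/;

const fake = "hbaidu-com/?wd=x";               // not a baidu.com URL
const real = "http://www.baidu.com/s?word=x";  // made-up example URL

console.log(loose.test(fake));   // true  ("." happily matches "-")
console.log(strict.test(fake));  // false
console.log(strict.test(real));  // true
```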

Regex to get the text after first "space"?

Here is a string I need to parse using regex.
http://carto1.wallonie.be/documents/terrils/fiche_terril.idc?TERRIL_id=1 Crachet 7/12
In fact this is a URL followed by one space and some text.
I need to extract the URL and the text separately.
To extract the url \S+ is working just fine.
But to extract the text after first space, it gets really hard to understand.
I am using Yahoo Pipes. (I don't know if this link to edit the code will work)
EDIT:
Using (\S+) (.+) gives me something weird:
According to the Pipes documentation, it looks like it uses fairly standard regex syntax. Try this:
^(\S+)\s(.+)$
Then the URL will be $1 and the comment will be $2. The . operator matches any character, which you will need since it looks like the comments may have spaces.
EDIT: changed from literal space to \s since you might be looking at some odd whitespace character(s). You might as well throw a ^ and $ in there too, so the match fails instead of doing something weird.
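The question is about Yahoo Pipes, but the same pattern can be tried out in JavaScript to confirm what ends up in each group. This is only a sketch for checking the regex, and the variable names are arbitrary.

```javascript
const line =
  "http://carto1.wallonie.be/documents/terrils/fiche_terril.idc?TERRIL_id=1 Crachet 7/12";

// Group 1: everything up to the first whitespace character (the URL).
// Group 2: everything after it (the text, which may contain spaces).
const m = line.match(/^(\S+)\s(.+)$/);

if (m) {
  const [, url, text] = m;
  console.log(url);  // "http://carto1.wallonie.be/documents/terrils/fiche_terril.idc?TERRIL_id=1"
  console.log(text); // "Crachet 7/12"
}
```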

Creating Slugs from Titles?

I have everything in place to create slugs from titles, but there is one issue. My RegEx replaces spaces with hyphens. But when a user types "Hi     there" (multiple spaces) the slug ends up as "Hi-----there", when really it should be "Hi-there".
Should I create the regular expression so that it only replaces a space when there is a character on either side?
Or is there an easier way to do this?
I use this:
yourslug.replace(/\W+/g, '-')
This replaces each run of one or more non-word characters (anything other than letters, digits and underscore) with a single dash.
Just match multiple whitespace characters.
s/\s+/-/g
Daniel's answer is correct.
However if somebody is looking for complete solution I like this function,
http://dense13.com/blog/2009/05/03/converting-string-to-slug-javascript/
Thanks to "dense13"!
It might be the easiest to fold repeated -s into one - as the last step:
replace /-{2,}/ by "-"
Or if you only want this to affect spaces, fold spaces instead (before the other steps, obviously)
I would replace [\s]+ with '-' and then replace [^\w-] with ''
You may want to trim the string first, to avoid leading and trailing hyphens.
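function hyphenSpace(s){
    // Trim with the native method when available, otherwise fall back to a regex.
    s = (s.trim) ? s.trim() : s.replace(/^\s+|\s+$/g, '');
    // Split on runs of whitespace and rejoin with single hyphens.
    return s.split(/\s+/).join('-');
}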
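The individual suggestions above (trim, turn whitespace runs into a hyphen, drop anything that is not a word character or hyphen, fold repeated hyphens) could be combined roughly like this. The name slugify and the order of the steps are my assumptions, and nothing here handles lowercasing or accented characters.

```javascript
// Hypothetical helper combining the steps suggested in the answers.
function slugify(title) {
  return title
    .trim()                    // avoid leading and trailing hyphens
    .replace(/\s+/g, "-")      // each run of whitespace becomes one hyphen
    .replace(/[^\w-]/g, "")    // drop anything that is not a word char or "-"
    .replace(/-{2,}/g, "-");   // fold repeated hyphens into one
}

console.log(slugify("Hi     there"));       // "Hi-there"
console.log(slugify("  Hi     there!  "));  // "Hi-there"
console.log(slugify("a - b"));              // "a-b"
```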
