Creating Slugs from Titles? - javascript

I have everything in place to create slugs from titles, but there is one issue. My RegEx replaces spaces with hyphens, so when a user types "Hi     there" (multiple spaces) the slug ends up as "Hi-----there", when it should really be "Hi-there".
Should I write the regular expression so that it only replaces a space when there is a character on either side?
Or is there an easier way to do this?

I use this:
yourslug.replace(/\W+/g, '-')
This replaces every run of one or more non-word characters (anything other than letters, digits, and underscore) with a single dash.
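A quick sketch of that replacement in action. The helper name and the sample titles are made up for illustration. Note that punctuation at either end of the title still leaves a trailing (or leading) dash, so some trimming may be wanted afterwards:

```javascript
// \W+ matches a run of one or more non-word characters
// (anything other than letters, digits and underscore).
const collapseNonWord = (title) => title.replace(/\W+/g, '-');

console.log(collapseNonWord('Hi     there'));    // "Hi-there"
console.log(collapseNonWord('Hello, world!'));   // "Hello-world-"
console.log(collapseNonWord('snake_case name')); // "snake_case-name"
```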

Just match multiple whitespace characters.
s/\s+/-/g

Daniel's answer is correct.
However, if somebody is looking for a complete solution, I like this function:
http://dense13.com/blog/2009/05/03/converting-string-to-slug-javascript/
Thanks to "dense13"!

It might be easiest to fold repeated -s into a single - as the last step:
replace /-{2,}/ by "-"
Or, if you only want this to affect spaces, fold spaces instead (before the other steps, obviously).

I would replace [\s]+ with '-' and then replace [^\w-] with ''
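Roughly, those two steps could look like the sketch below. The function name toSlug is hypothetical, and the trim() call is an extra assumption on my part so that surrounding whitespace does not turn into hyphens:

```javascript
// Hypothetical helper; the two replace() calls are the two steps described above.
function toSlug(title) {
  return title
    .trim()                   // assumption: drop surrounding whitespace first
    .replace(/\s+/g, '-')     // each run of whitespace becomes one hyphen
    .replace(/[^\w-]/g, '');  // remove anything that is not a word character or a hyphen
}

console.log(toSlug('  Hi     there, world!  ')); // "Hi-there-world"
console.log(toSlug('Rock & roll'));              // "Rock--roll"
```

The second call shows that removing a character that sat between two spaces can still leave a double hyphen, so folding repeated hyphens as a last step remains useful.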

You may want to trim the string first, to avoid leading and trailing hyphens.
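function hyphenSpace(s){
    // Use the native trim() where available; otherwise strip leading/trailing whitespace with a regex.
    s = (s.trim) ? s.trim() : s.replace(/^\s+|\s+$/g, '');
    // Split on runs of whitespace and rejoin the pieces with single hyphens.
    return s.split(/\s+/).join('-');
}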

Related

Understanding replacement using regex

I want to remove all trailing and leading dashes (-) and replace any repeating dashes with one dash otherwise in JavaScript. I've developed a regex to do it:
"----asdas----asd-as------q---".replace(/^-+()|()-+$|(-)+/g,'$3')
And it works:
asdas-asd-as-q
But I don't understand the $3 part (obtained through desperate experiment). Why not $1?
You can actually use this without any capturing groups:
"----asdas----asd-as------q---".replace(/^-+|-+$|-+(?=-)/g, '');
//=> "asdas-asd-as-q"
Here -+(?=-) uses a positive lookahead, so it matches one or more hyphens only when another hyphen follows; the last - of each run is left in place.
Because there are three capturing groups (two redundant empty ones and (-)), $3 is replaced with the string matched by the third group.
If you remove the first two empty capturing groups, you can use $1.
"----asdas----asd-as------q---".replace(/^-+|-+$|(-)+/g, '$1')
// => "asdas-asd-as-q"
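To see which group captured what, you can pass a function to replace() and record the groups for each match. This is only an illustrative sketch (the variable names are mine); a group that did not take part in a match comes back as undefined and is substituted as an empty string:

```javascript
const dashed = '----asdas----asd-as------q---';
const threeGroups = /^-+()|()-+$|(-)+/g;

// Collect the captured groups for every match without changing the string.
const seen = [];
dashed.replace(threeGroups, (match, g1, g2, g3) => {
  seen.push({ match, g1, g2, g3 });
  return match;
});
console.log(seen);
// Leading run:  g1 is "", g2 and g3 are undefined
// Middle runs:  g3 is "-", g1 and g2 are undefined
// Trailing run: g2 is "", g1 and g3 are undefined

console.log(dashed.replace(threeGroups, '$3')); // "asdas-asd-as-q"
console.log(dashed.replace(threeGroups, '$1')); // "asdasasdasq"
```

So $1 is never anything but empty, which is why it strips every dash, while $3 is "-" exactly for the runs in the middle.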
As other answers say, $3 refers to the third captured subpattern, i.e. the third set of parentheses.
Personally, however, I would see that as two operations, and do it as such:
Trim leading and trailing -s
Condense duplicate -s
Like so:
"----asdas----asd-as------q---".replace(/^-+|-+$/g,"").replace(/--+/g,"-");
This kind of concept may mean more code, but I believe it makes it much easier to read and understand what's going on here, because you're doing one thing at a time instead of trying to do everything at once.
$1, $2 and so on refer to the capture groups formed by the parentheses.
See the demo:
http://regex101.com/r/pP3pN1/25
On the right side you can see the groups generated by ().
Try the replacement and you will see that $1 is blank in your case.

Regex to replace string with word and characters

I've got three working regexp's,
string.replace(/catalogue/g, "") // remove the word catalogue
string.replace(/[/:]/g, "") // remove the characters / and :
string.replace(/20%/g, "") // remove '20%'
Instead of replacing the string three times, I want to combine my regexp's.
Wanted result = 'removethewordnow';
var string = 'rem:ove20%the/word:catalogue20%now';
My latest try was:
string.replace(/\catalogue\b|[/20%:]/g, ""); // runs, but catalogue is unaffected and 20% isn't treated as a unit
Off the top of my head:
string.replace(/(catalogue|[\/:]|20%)/g,"");
Just use an alternative, i.e. separate each of the regular expressions you had before by the alternation operator |:
catalogue|20%|[/:]
Also note that you cannot just merge character classes and literal strings the way you did there. The naïve combination above works; anything beyond that would be optimisation (not that this regex can be optimised further), and an optimisation is only valid if it doesn't change the language described by the regex.
You seem to have a typo there (\c). Also, you don't want 20% inside the character class, and you should escape the slash. You also need to remove the word boundary if you want catalogue20% to match: there is no boundary between catalogue and 20, so the \b fails:
string.replace(/catalogue|20%|[\/:]/g, "");
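For reference, here is that alternation run against the string from the question, plus a made-up second input showing what goes wrong when 20% sits inside the character class (a sketch, variable names are mine):

```javascript
const messy = 'rem:ove20%the/word:catalogue20%now';

// Three alternatives: the word, the literal "20%", or one of the characters / and :
const tidy = messy.replace(/catalogue|20%|[\/:]/g, '');
console.log(tidy); // "removethewordnow"

// Inside a character class, 2, 0 and % are matched as individual characters,
// so unrelated digits are removed too:
const overEager = 'room 2020'.replace(/[\/20%:]/g, '');
console.log(JSON.stringify(overEager)); // "room "
```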
var string = 'rem:ove20%the/word:catalogue20%now';
string.replace(/([:/]|20%|catalogue)/g, '');
\b matches a word boundary, but your word catalogue runs straight into other text, so the \b has to go. Also keep 20% out of the character class, where 2, 0 and % would be matched as individual characters, and escape the / as \/:
string.replace(/catalogue|20%|[\/:]/g, "");
string.replace(/catalogue|20%|[/:]/g, '')

Regex to find last space character

I am looking for a regex that will give me the index of the last space in a string using javascript.
I was using Google to find a suitable regex, but with no success.
Even the SO-Question Regex to match last space character does not hold a solution because the goal there was to remove more than one character in the end.
What is the correct regex?
As I commented I would just use lastIndexOf() but here is a regex solution:
The regex / [^ ]*$/ finds the last space character in a string. Use it like this:
// Alerts 9
alert("this is a str".search(/ [^ ]*$/));
The correct solution is not to use a regex at all but the built-in lastIndexOf method that strings have. Regexes are meant to match strings, not give you indexes (even though grouped matches may be returned as index+length instead of a string; C-based regex libraries usually do so to avoid unnecessary copying).
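A small side-by-side sketch of the two approaches on the same sample string (the second string is made up to show the no-space case):

```javascript
const sample = 'this is a str';

// Built-in method: index of the last occurrence, or -1 if there is none.
const viaLastIndexOf = sample.lastIndexOf(' ');
// Regex: a space followed only by non-space characters up to the end.
const viaSearch = sample.search(/ [^ ]*$/);

console.log(viaLastIndexOf, viaSearch);    // 9 9
console.log('nospaces'.lastIndexOf(' '));  // -1
console.log('nospaces'.search(/ [^ ]*$/)); // -1
```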

Struggling with regex to match only two of a character, not three

I need to match all occurrences of // in a string with a JavaScript regex.
It must not match /// or a single /.
So far I have (.*[^\/])\/{2}([^\/].*)
which is basically "something that isn't /, followed by // followed by something that isn't /"
The approach seems to work apart from when the string I want to match starts with //
This doesn't work:
//example
This does
stuff // example
How do I solve this problem?
Edit: A bit more context - I am trying to replace // with !, so I am then using:
result = result.replace(myRegex, "$1 ! $2");
Replace two slashes that either begin the string or do not follow a slash,
and are followed by anything not a slash or the end of the string.
s=s.replace(/(^|[^/])\/{2}([^/]|$)/g,'$1!$2');
It looks like it wouldn't work for example// either.
The problem is because you're matching // preceded and followed by at least one non-slash character. This can be solved by anchoring the regex, and then you can make the preceding/following text optional:
^(.*[^\/])?\/{2}([^\/].*)?$
Use negative lookahead/lookbehind assertions:
(.*)(?<!/)//(?!/)(.*)
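In JavaScript this needs an engine with lookbehind support (ES2018 or later), and the slashes have to be escaped inside a regex literal. A minimal sketch, assuming such an engine; the function name is made up, and the surrounding (.*) groups are dropped because the lookarounds do not consume the neighbouring characters:

```javascript
// Replace "//" only when it is neither preceded nor followed by another "/".
function replaceDoubleSlash(s) {
  return s.replace(/(?<!\/)\/\/(?!\/)/g, '!');
}

console.log(replaceDoubleSlash('//example'));        // "!example"
console.log(replaceDoubleSlash('stuff // example')); // "stuff ! example"
console.log(replaceDoubleSlash('example//'));        // "example!"
console.log(replaceDoubleSlash('a///b'));            // "a///b"
console.log(replaceDoubleSlash('a//b//c'));          // "a!b!c"
```

Because nothing around the slashes is consumed, two pairs separated by a single character (the last case) are both replaced.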
Use this:
/([^/]*)(\/{2})([^/]*)/g
e.g.
alert("///exam//ple".replace(/([^/]*)(\/{2})([^/]*)/g, "$1$3"));
EDIT: Updated the expression as per the comment.
/[/]{2}/
e.g:
alert("//example".replace(/[/]{2}/, ""));
This does not answer the OP's question about using a regex. However, some of the original comments suggested .replaceAll, not everyone who reads the question in the future wants to use a regex, people might mistakenly assume that a regex is the only option, and these details don't fit in a comment. So here's a poor man's non-regex approach:
Temporarily replace the three contiguous characters with something that would never naturally occur, which is really important when dealing with user-entered values.
Replace the remaining two contiguous characters using .replaceAll().
Restore the original three contiguous characters.
For instance, let's say you wanted to remove all instances of ".." without affecting occurrences of "...".
var cleansedText = $(this).text().toString()
    .replaceAll("...", "☰☸☧")
    .replaceAll("..", "")
    .replaceAll("☰☸☧", "...");
$(this).text(cleansedText);
Perhaps not as fast as regex for longer strings, but works great for short ones.
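The same three steps without jQuery, as a rough sketch. The function name and the default placeholder are assumptions for illustration, and split/join stands in for .replaceAll so it also runs on older engines:

```javascript
// Hypothetical helper: removes ".." but leaves "..." untouched.
// Assumption: the placeholder never occurs in the input text.
function removeDoubleDots(text, placeholder = '☰☸☧') {
  return text
    .split('...').join(placeholder)  // 1. hide the triples behind the placeholder
    .split('..').join('')            // 2. remove the remaining pairs
    .split(placeholder).join('...'); // 3. restore the triples
}

console.log(removeDoubleDots('wait.. what... ok..')); // "wait what... ok"
```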

JavaScript regex replace - but only part of matched string?

I have the following replace function
myString.replace(/\s\w(?=\s)/,"$1\xA0");
The aim is to take single-letter words (e.g. prepositions) and add a non-breaking space after them, instead of standard space.
However, the above $1 variable doesn't work for me. It inserts the literal text "$1 " instead of the matched part of the original string plus the nbsp.
What is the reason for the observed behaviour? Is there any other way to achieve it?
$1 doesn't work because you don't have any capturing subgroups.
The regular expression should be something like /\b(\w+)\s+/.
Seems you want to do something like this:
myString.replace(/\s(\w)\s/,"$1\xA0");
but that way you will lose the whitespace before your single-letter word. So you probably want to also include the first \s in the capturing group.
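One possible way to put this together, as a sketch rather than the definitive fix: the function name is hypothetical, and the lookbehind assumes an ES2018 or later engine. The first line reproduces the behaviour from the question, where "$1" has no group to refer to and is inserted literally:

```javascript
const NBSP = '\xA0';

// No capturing group in the pattern, so "$1" is copied into the result as-is.
const broken = 'x a y'.replace(/\s\w(?=\s)/, '$1' + NBSP);
console.log(broken === 'x$1' + NBSP + ' y'); // true

// Capture the single letter; the lookbehind checks for start-of-string or
// whitespace without consuming it, so consecutive one-letter words all match.
function bindSingleLetters(text) {
  return text.replace(/(?<=^|\s)(\w)\s/g, '$1' + NBSP);
}

console.log(bindSingleLetters('Walk in a park') === 'Walk in a' + NBSP + 'park'); // true
console.log(bindSingleLetters('a b c') === 'a' + NBSP + 'b' + NBSP + 'c');        // true
```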
