JavaScript regex: remove special characters except - and æøå

I got this:
var stringToReplace = 'æøasdasd\89-asdasd sse';
var desired = stringToReplace.replace(/[^\w\s]/gi, '');
alert(desired);
I found the replace rule from another SO question.
This works fine, it gives output:
asdasd89asdasd sse
However, I would like to set up additional rules:
1) Keep the æøå characters
2) Keep the - character
3) Turn whitespace into a - character
So the output would be:
æøåasdasd89-asdasd-sse
I know I can run an extra line:
stringToReplace.replace(' ', '-');
to accomplish goal 3), but I don't know what to do about 1) and 2), since I am not familiar with regular expressions.

This should work:
str = str.replace(/[^æøå\w -]+/g, '').replace(/ +/g, '-');
Live Demo: http://ideone.com/d60qrX

You can just add the characters you want to keep to the negated character class:
/[^\w\sæøå-]/gi
And, as you said, you can use another replace to turn spaces into dashes.

Your original regex [^\w\s] targets any character which isn't a word or whitespace character (-, for example). To include other characters in this regex's 'whitelist', simply add them to that character group:
stringToReplace.replace(/[^\w\sæøå-]/gi, '');
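As for replacing spaces with hyphens, you cannot do both with a single regex and a fixed replacement string. You can, however, chain a second replace afterwards. Use a regex with the g flag for it, because a plain string pattern such as " " only replaces the first occurrence:
stringToReplace.replace(/\s+/g, "-");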
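Putting the two steps from the answers above together, a minimal sketch might look like the following. The function name keepDanishAndHyphens and the sample input are made up for illustration; only the two patterns come from the answers.

```javascript
// Hypothetical helper (the name is an assumption, not from the thread).
function keepDanishAndHyphens(input) {
  return input
    // Step 1: drop everything except word characters, whitespace, æøå and "-".
    .replace(/[^\w\sæøå-]/gi, '')
    // Step 2: turn each run of whitespace into a single hyphen.
    .replace(/\s+/g, '-');
}

console.log(keepDanishAndHyphens('æøå asd!asd#89-asdasd sse'));
// æøå-asdasd89-asdasd-sse
```

Note that \w also lets underscores through, so add a further step if those should go as well.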

Related

Strip everything but letters and numbers and replace spaces that are in the sentence with hyphens

So I'm trying to parse a string similar to the way StackOverflow's tags work. So letters and numbers are allowed, but everything else should be stripped. Also spaces should be replaced with hyphens, but only if they are inside the word and not have disallowed characters before them.
This is what I have right now:
label = label.trim();
label = label.toLowerCase();
label = label.replace(/[^A-Za-z0-9\s]/g,'');
label = label.replace(/ /g, '-');
This works but with a few caveats, for example this:
/ this. is-a %&&66 test tag . <-- (4 spaces here, the arrow and this text is not part of the test string)
Becomes:
-this-is-a66-test-tag----
Expected:
this-is-a66-test-tag
I looked at this to get what I have now:
How to remove everything but letters, numbers, space, exclamation and question mark from string?
But like I said it doesn't fully give me what I'm looking for.
How do I tweak my code to give me what I want?
You need to make 2 changes:
Since you do not replace all whitespace with the first replace you need to replace all whitespace chars with the second regex (so, a plain space must be replaced with \s, and even better, with \s+ to replace multiple consecutive occurrences),
To get rid of leading/trailing hyphens in the end, use trim() after the first replace.
So, the actual fix will look like
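var label = " / this. is-a %&&66 test tag . ";
label = label.replace(/[^a-z0-9\s]/ig, '') // strip everything but letters, digits and whitespace
  .trim()                                  // drop leading/trailing whitespace
  .replace(/\s+/g, '-')                    // each run of whitespace becomes one hyphen
  .toLowerCase();
console.log(label); // => this-isa-66-test-tag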
Note that if you add - to the first regex, /[^a-z0-9\s-]/ig, you will also keep the original hyphens in the output and it will look like this-is-a-66-test-tag for the current test case.
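If you do keep the original hyphens, you may also want to collapse mixed runs of whitespace and hyphens and strip hyphens from both ends. The following is only a sketch of that variant; the function name toTag is hypothetical and the extra collapsing step is an addition, not part of the answer above.

```javascript
// Hypothetical helper illustrating the "keep hyphens" variant.
function toTag(label) {
  return label
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, '') // keep letters, digits, whitespace, hyphens
    .replace(/[\s-]+/g, '-')      // collapse runs of whitespace/hyphens
    .replace(/^-+|-+$/g, '');     // strip leading and trailing hyphens
}

console.log(toTag(' / this. is-a %&&66 test tag .    '));
// this-is-a-66-test-tag
```

Stripping hyphens at the end, rather than calling trim() in the middle, also covers input that already starts or ends with a hyphen.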
Use trim just before changing all spaces with hyphens.
You can use this function:
function tagit(label) {
  label = label.toLowerCase().replace(/[^A-Za-z0-9\s]/g, '');
  return label.trim().replace(/ /g, '-');
}
var str = 'this. is-a %&&66 test tag .';
console.log(tagit(str));
//=> "this-isa-66-test-tag"

Creating a JavaScript regex to replace characters using a whitelist

I'm trying to create a regex which will replace all the characters that are not in the specified whitelist (letters, digits, whitespace, brackets, question mark and exclamation mark).
This is the code :
var regEx = /^[^(\s|\w|\d|()|?|!|<br>)]*?$/;
qstr += tempStr.replace(regEx, '');
What is wrong with it ?
Thank you
The anchors are wrong - they only allow the regex to match the entire string
The lazy quantifier is wrong - you wouldn't want the regex to match 0 characters (if you have removed the anchors)
The parentheses and pipe characters are wrong - you don't need them in a character class.
The <br> is wrong - you can't match specific substrings in a character class.
The \d is superfluous since it's already contained in \w (thanks Alex K.!)
You're missing the global modifier to make sure you can do more than one replace.
You should be using + instead of * in order not to replace lots of empty strings with themselves.
Try
var regEx = /[^\s\w()?!]+/g;
and handle the <br>s independently (before that regex is applied, or the brackets will be removed).
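One possible way to handle the <br> tags independently is to split on them, clean each piece, and join the pieces back together. This is a sketch only: the function name cleanKeepingBreaks is hypothetical, and it assumes that normalising every break variant (<br/>, <BR>) to <br> is acceptable.

```javascript
// Hypothetical helper: clean text with the whitelist regex but preserve <br> tags.
function cleanKeepingBreaks(text) {
  return text
    .split(/<br\s*\/?>/i)                            // cut at each <br>, <br/>, <BR> ...
    .map(part => part.replace(/[^\s\w()?!]+/g, '')) // whitelist: whitespace, word chars, ( ) ? !
    .join('<br>');                                   // put normalised breaks back
}

console.log(cleanKeepingBreaks('Hi, there!<br>What #now?'));
// Hi there!<br>What now?
```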
You'll want to use the g (global) modifier:
var regEx = /^[^(\s|\w|\d|()|?|!|<br>)]*?$/g; // <-- `g` goes there
qstr += tempStr.replace(regEx, '');
This allows your expression to match multiple times.

Regex to replace string with word and characters

I've got three working regexp's,
string.replace(\catalogue\g, "") // replace the word catalogue
string.replace(/[/:]/g, "") // replace the characters /, :
string.replace(\20%\g, "") // replace '20%'
Instead of replacing the string three times, I want to combine my regexp's.
Wanted result = 'removethewordnow';
var string = 'rem:ove20%the/word:catalogue20%now';
My latest try was:
string.replace(/\catalogue\b|[/20%:]/g, ""); // works, but catalogue is unaffected and 20% isn't combined as a word
Off the top of my head:
string.replace(/(catalogue|[\/:]|20%)/g,"");
Just use an alternative, i.e. separate each of the regular expressions you had before by the alternation operator |:
catalogue|20%|[/:]
Also note that you cannot just combine character classes and literal strings the way you have done there. The naive combination above works; anything beyond it is optimisation (and nothing can be optimised further in this case), which is only valid if it doesn't change the language described by the regex.
You seem to have a typo there (\c). You also don't want 20% inside the character class, and you should escape the slash. Finally, you need to remove the word boundary if you want catalogue20% to match: there is no boundary between catalogue and 20, so the \b fails:
string.replace(/catalogue|20%|[\/:]/g, "");
var string = 'rem:ove20%the/word:catalogue20%now';
string.replace(/([:/]|20%|catalogue)/g, '');
\b refers to a word boundary, but your word catalogue is mixed with other words. So your regex should be:
string.replace(/catalogue|[\/20%:]/g, "");
Also do escape the / with \/.
string.replace(/catalogue|20%|[/:]/g, '')
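A quick check of the alternation approach against the sample string, next to the variant that leaves 20% inside the character class. The two function names are made up for this sketch; the patterns are the ones from the answers above.

```javascript
// Alternation: "20%" is removed only as a whole sequence.
function stripWithAlternation(s) {
  return s.replace(/catalogue|20%|[\/:]/g, '');
}

// Character class: "2", "0" and "%" are each removed wherever they occur.
function stripWithCharClass(s) {
  return s.replace(/catalogue|[\/20%:]/g, '');
}

const sample = 'rem:ove20%the/word:catalogue20%now';
console.log(stripWithAlternation(sample)); // removethewordnow
console.log(stripWithCharClass(sample));   // removethewordnow

// The difference shows up once other digits appear in the input:
console.log(stripWithAlternation('item 2 of 20%')); // "item 2 of "
console.log(stripWithCharClass('item 2 of 20%'));   // "item  of "
```

Both give the wanted result for the sample string, but only the alternation treats 20% as a unit.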

jQuery remove special characters from string and more

I have a string like this:
var str = "I'm a very^ we!rd* Str!ng.";
What I would like to do is remove all special characters from the above string and replace spaces (and underscores, in case they are typed) with a - character.
The above string would look like this after the "transformation":
var str = 'im-a-very-werd-strng';
replace(/[^a-z0-9\s]/gi, '') will filter the string down to just alphanumeric values and replace(/[_\s]/g, '-') will replace underscores and spaces with hyphens:
str.replace(/[^a-z0-9\s]/gi, '').replace(/[_\s]/g, '-')
Source for Regex: RegEx for Javascript to allow only alphanumeric
Here is a demo: http://jsfiddle.net/vNfrk/
Assuming by "special" you mean non-word characters, then that is pretty easy.
str = str.replace(/[_\W]+/g, "-")
str.toLowerCase().replace(/[\*\^\'\!]/g, '').split(' ').join('-')
Remove numbers, underscores and special characters from the string:
str.replace(/[0-9`~!@#$%^&*()_|+\-=?;:'",.<>\{\}\[\]\\\/]/gi,'');
This will remove all the special characters:
str.replace(/[_\W]+/g, "");
This was really helpful and solved my issue. You can run the code below to check that it works:
var str="hello world !#to&you%*()";
console.log(str.replace(/[_\W]+/g, ""));
Since I can't comment on Jasper's answer, I'd like to point out a small bug in his solution:
str.replace(/[^a-z0-9\s]/gi, '').replace(/[_\s]/g, '-');
The problem is that the first replace removes all the underscores (and any existing hyphens) before the second replace gets a chance to convert them :)
You should reverse the replace calls and also add the hyphen to the whitelist of the second regex. Like this:
str.replace(/[_\s]/g, '-').replace(/[^a-z0-9-\s]/gi, '');
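The effect of the ordering is easy to see on an input that contains an underscore. The function names and the sample inputs below are assumptions for this sketch, and the hyphen is placed last in the character class for readability.

```javascript
// Original order: the first replace strips "_" before it can become "-".
function stripThenHyphenate(str) {
  return str.replace(/[^a-z0-9\s]/gi, '').replace(/[_\s]/g, '-');
}

// Reversed order: convert "_" and whitespace first, then strip the rest
// while keeping the hyphens that were just inserted.
function hyphenateThenStrip(str) {
  return str.replace(/[_\s]/g, '-').replace(/[^a-z0-9\s-]/gi, '');
}

console.log(stripThenHyphenate('snake_case name')); // snakecase-name
console.log(hyphenateThenStrip('snake_case name')); // snake-case-name
console.log(hyphenateThenStrip("I'm a very^ we!rd* Str!ng.").toLowerCase());
// im-a-very-werd-strng
```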
Remove or replace all special characters in jQuery:
If
var str = 'My name is "Ghanshyam" and from "java" background';
and you want to remove all special characters (here, the " character), then use this:
str = str.replace(/"/g, '');
Result:
My name is Ghanshyam and from java background
Here g means global, so every match is replaced rather than only the first.
var str = "I'm a very^ we!rd* Str!ng.";
$('body').html(str.replace(/[^a-z0-9\s]/gi, " ").replace(/^\s+|\s+$|\s+(?=\s)/g, "").replace(/[_\s]/g, "-").toLowerCase());
The first regex replaces special characters with spaces, the second removes leading, trailing and repeated spaces, and the last one replaces each remaining space with "-".

Javascript string replace with regex to strip off illegal characters

Need a function to strip off a set of illegal character in javascript: |&;$%#"<>()+,
This is a classic problem to be solved with regexes, which means now I have 2 problems.
This is what I've got so far:
var cleanString = dirtyString.replace(/\|&;\$%#"<>\(\)\+,/g, "");
I am escaping the regex special chars with a backslash but I am having a hard time trying to understand what's going on.
If I try with single literals in isolation most of them seem to work, but once I put them together in the same regex depending on the order the replace is broken.
i.e. this won't work --> dirtyString.replace(/\|<>/g, ""):
Help appreciated!
What you need is a character class. Inside one, you only have to worry about the ], \ and - characters (and ^ if you place it straight after the opening "[" of the class).
Syntax: [characters] where characters is a list with characters.
Example:
var cleanString = dirtyString.replace(/[|&;$%#"<>()+,]/g, "");
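If the list of illegal characters comes from a variable, the same idea can be sketched by building the character class at run time and escaping only the characters that are special inside a class. The function name stripChars and its parameters are hypothetical.

```javascript
// Hypothetical helper: remove every character listed in `illegal` from `dirty`.
function stripChars(dirty, illegal) {
  // Inside [...] only ], \, ^ and - need escaping; prefix each with a backslash.
  const escaped = illegal.replace(/[\]\\^-]/g, '\\$&');
  return dirty.replace(new RegExp('[' + escaped + ']', 'g'), '');
}

console.log(stripChars('a|b&c;d$e%f#g"h<i>j(k)l+m,n', '|&;$%#"<>()+,'));
// abcdefghijklmn
```

Characters such as |, $, ( and + need no escaping here because they lose their special meaning inside a character class.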
I tend to look at it from the inverse perspective which may be what you intended:
What characters do I want to allow?
This is because lots of unexpected characters could make it into a string somehow and blow things up.
For example, this one only allows letters and numbers, replacing each group of invalid characters with a hyphen:
"This¢£«±Ÿ÷could&*()\/<>be!@#$%^bad".replace(/([^a-z0-9]+)/gi, '-');
//Result: "This-could-be-bad"
You need to wrap them all in a character class. The current version means replace this sequence of characters with an empty string. When wrapped in square brackets it means replace any of these characters with an empty string.
var cleanString = dirtyString.replace(/[\|&;\$%#"<>\(\)\+,]/g, "");
Put them in brackets []:
var cleanString = dirtyString.replace(/[\|&;\$%#"<>\(\)\+,]/g, "");
