I'm trying to create a function that removes all special characters (including periods) except apostrophes when they are naturally part of a word. The regex pattern I've made is supposed to remove anything that doesn't fit the schema of word either followed by an apostrophe ' and/or another word:
function removeSpecialCharacters(str) {
return str.toLowerCase().replace(/[^a-z?'?a-z ]/g, ``)
}
console.log(removeSpecialCharacters(`I'm a string.`))
console.log(removeSpecialCharacters(`I'm a string with random stuff.*/_- '`))
console.log(removeSpecialCharacters(`'''`))
As you can see from the snippet it works well except for removing the rogue apostrophes.
And if I add something like [\s'\s] or ['] to the pattern it breaks it completely. Why is it doing this and what am I missing here?
Alternate the pattern with '\B, which will match and remove apostrophes which are not followed by a word character, eg ab' or ab'#, while preserving strings like ab'c:
function removeSpecialCharacters(str) {
return str.toLowerCase().replace(/'\B|[^a-z' ]/g, ``)
}
console.log(removeSpecialCharacters(`I'm a string.`))
console.log(removeSpecialCharacters(`I'm a string with random stuff.*/_- '`))
console.log(removeSpecialCharacters(`'''`))
(you can also remove the duplicated characters from the character set)
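To see what '\B does on its own terms, here is a minimal sketch. The helper name stripStrayApostrophes is made up for illustration, and the character set is the one from the answer above with the stray ? dropped. Note that an apostrophe directly before a letter is kept even at the start of a word, which may or may not be what you want.

```javascript
// Hypothetical helper name; the pattern follows the answer above,
// with the "?" removed from the character set (an assumption on my part).
function stripStrayApostrophes(str) {
  return str.toLowerCase().replace(/'\B|[^a-z' ]/g, "");
}

console.log(stripStrayApostrophes("ab'c"));   // ab'c  (apostrophe followed by a letter is kept)
console.log(stripStrayApostrophes("ab' cd")); // ab cd (apostrophe followed by a space is removed)
console.log(stripStrayApostrophes("ab'#"));   // ab
console.log(stripStrayApostrophes("'''"));    // prints an empty line
console.log(stripStrayApostrophes("'til"));   // 'til  (leading apostrophe before a letter survives)
```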
Not sure what went wrong with yours as I can't see what you attempted. However, I got this to work.
function removeSpecialCharacters(str) {
str = str.toLowerCase();
// reduce duplicate apostrophes to single
str = str.replace(/'+/g,`'`);
// get rid of wacky chars
str = str.replace(/[^a-z'\s]/g,'');
// replace dangling apostrophes
str = str.replace(/(^|\s)'(\s|$)/g, ``);
return str;
}
console.log(removeSpecialCharacters(`I'm a string.`))
console.log(removeSpecialCharacters(`I'm a string with random stuff.*/_- '`))
console.log(removeSpecialCharacters(`'''`))
console.log(removeSpecialCharacters(`regex 'til i die`))
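Here's one very easy solution. To remove certain characters from a string, you can run a bunch of if-statements inside a while loop. This lets you choose exactly which symbols to remove.
var string = "I'm a string!";
var result = "";
var increment = 0;
while (increment < string.length) {
  // keep every character except the ones you want to drop
  if (string[increment] !== "!") {
    result += string[increment];
  }
  increment += 1;
}
console.log(result); // I'm a string
That's a simple rundown to give you a sense of what you're doing: strings can't be edited in place, so the loop builds a new string from the characters you want to keep.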
I'm trying to replace double quotes with curly quotes, except when the text is wrapped in certain tags, like [quote] and [code].
Sample input
[quote="Name"][b]Alice[/b] said, "Hello world!"[/quote]
<p>"Why no goodbye?" replied [b]Bob[/b]. "It's always Hello!"</p>
Expected output
[quote="Name"][b]Alice[/b] said, "Hello world!"[/quote]
<p>“Why no goodbye?” replied [b]Bob[/b]. “It's always Hello!”</p>
I figured out how to elegantly achieve what I want in PHP by using (*SKIP)(*F); however, my code will run in JavaScript, and the JavaScript solution is less than ideal.
Right now I'm splitting the string at those tags, running the replace, then putting the string together:
var o = 3;
a = a
.split(/(\[(?<first>(?:icode|quote|code))[^\]]*?\](?:[\s]*?.)*?[\s]*?\[\/(?:\k<first>)\])/i)
.map(function(x,i) {
if (i == o-1 && x) {
x = '';
}
else if (i == o && x)
{
x = x.replace(/(?![^<]*>|[^\[]*\])"([^"]*?)"/gi, '“$1”')
o = o+3;
}
return x;
}).join('');
Javascript Regex Breakdown
Inside split():
(\[(?<first>icode|quote|code)[^\]]*?\](?:.)*?\[\/(\k<first>)\]) - captures the pattern inside parentheses:
\[(?<first>quote|code|icode)[^\]]*?\] - a [quote], [code], or [icode] opening tag, with or without parameters like =html, eg [code=html]
(?:[\s]*?.)*? - any 0+ (as few as possible) occurrences of any char (.), preceded or not by whitespace, so it doesn't break if the opening tag is followed by a line break
[\s]*? - 0+ whitespaces
\[\/(\k<first>)\] - [/quote], [/code], or [/icode] closing tags. Matches the text captured in the (?<first>) group. E.g. if it's a quote opening tag, it'll be a quote closing tag
Inside replace():
(?![^<]*>|[^\[]*\])"([^"]*?)" - captures text inside double quotes:
(?![^<]*>|[^\[]*\]) - negative lookahead, looks for characters (that aren't < or [) followed by either > or ] and discards them, so it won't match anything inside bbcode and html tags. Eg: [spoiler="Name"] or <span style="color: #24c4f9">. Note that matches wrapped in tags are left untouched.
" - literal opening double quotes character.
([^"]*?) - any 0+ character, except double quotes.
" - literal closing double quotes character.
SPLIT() REGEX DEMO: https://regex101.com/r/Ugy3GG/1
That's awful, because the replace is executed multiple times.
Meanwhile, the same result can be achieved with a single PHP regex. The regex I wrote was based on the question "Match regex pattern that isn't within a bbcode tag".
(\[(?<first>quote|code|icode)[^\]]*?\](?:[\s]*?.)*?[\s]*?\[\/(\k<first>)\])(*SKIP)(*F)|(?![^<]*>|[^\[]*\])"([^"]*?)"
PHP Regex Breakdown
(\[(?<first>quote|code|icode)[^\]]*?\](?:[\s]*?.)*?[\s]*?\[\/(\k<first>)\])(*SKIP)(*F) - matches the pattern inside capturing parentheses just like javascript split() above, then (*SKIP)(*F) make the regex engine omit the matched text.
| - or
(?![^<]*>|[^\[]*\])"([^"]*?)" - captures text inside double quotes in the same way javascript replace() does
PHP DEMO: https://regex101.com/r/fB0lyI/1
The beauty of this regex is that it only needs to be run once. No splitting and joining of strings. Is there a way to implement it in javascript?
Because JS lacks backtracking verbs you will need to consume those bracketed chunks but later replace them as is. By obtaining the second side of the alternation from your own regex the final regex would be:
\[(quote|i?code)[^\]]*\][\s\S]*?\[\/\1\]|(?![^<]*>|[^\[]*\])"([^"]*)"
But the tricky part is using a callback function with replace() method:
str.replace(regex, function($0, $1, $2) {
return $1 ? $0 : '“' + $2 + '”';
})
Above ternary operator returns $0 (whole match) if first capturing group exists otherwise it encloses second capturing group value in curly quotes and returns it.
Note: this may fail in different cases.
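For reference, here is how the regex and the callback might be put together against the sample input from the question. This is only a sketch: the g flag and the wrapper name curlyQuotes are my assumptions, not something the answer spells out.

```javascript
// Assumption: the "g" flag, so that every occurrence is processed.
var regex = /\[(quote|i?code)[^\]]*\][\s\S]*?\[\/\1\]|(?![^<]*>|[^\[]*\])"([^"]*)"/g;

// Hypothetical wrapper name, not from the answer.
function curlyQuotes(str) {
  return str.replace(regex, function (match, tag, quoted) {
    // tag is set only when the first alternative matched,
    // i.e. a whole [quote]/[code]/[icode] block: return it untouched.
    return tag ? match : "“" + quoted + "”";
  });
}

var sample = '[quote="Name"][b]Alice[/b] said, "Hello world!"[/quote]\n' +
  '<p>"Why no goodbye?" replied [b]Bob[/b]. "It\'s always Hello!"</p>';

console.log(curlyQuotes(sample));
// [quote="Name"][b]Alice[/b] said, "Hello world!"[/quote]
// <p>“Why no goodbye?” replied [b]Bob[/b]. “It's always Hello!”</p>
```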
Nested markup is hard to parse with regular expressions, and with JS's RegExp in particular. Complex regular expressions are also hard to read, maintain, and debug. If your needs are simple (a tag content replacement with some banned tags excluded), consider a simple code-based alternative to run-on RegExps:
function curly(str) {
var excludes = {
quote: 1,
code: 1,
icode: 1
},
xpath = [];
return str.split(/(\[[^\]]+\])/) // breakup by tag markup
.map(x => { // for each tag and content:
if (x[0] === "[") { // tag markup:
if (x[1] === "/") { // close tag
xpath.pop(); // remove from current path
} else { // open tag
xpath.push(x.slice(1).split(/\W/)[0]); // add to current path
} //end if open/close tag
} else { // tag content
if (xpath.every(tag =>!excludes[tag])) x = x.replace(/"/g, function repr() {
return (repr.z = !repr.z) ? "“" : "”"; // flip flop return value (naive)
});
} //end if markup or content?
return x;
}) // end term map
.join("");
} /* end curly() */
var input = `[quote="Name"][b]Alice[/b] said, "Hello world!"[/quote]
<p>"Why no goodbye?" replied [b]Bob[/b]. "It's always Hello!"</p>`;
var wants = `[quote="Name"][b]Alice[/b] said, "Hello world!"[/quote]
<p>“Why no goodbye?” replied [b]Bob[/b]. “It's always Hello!”</p>`;
curly(input) == wants; // true
To my eyes, even though it's a bit longer, code allows documentation, indentation, and explicit naming that make these sorts of semi-complicated logical operations easier to understand.
If your needs are more complex, use a true BBCode parser for JavaScript and map/filter/reduce its model as needed.
I have a string from which I need to strip out all the spaces except those between double quotes. Here is the regex that I am using to strip out spaces.
str.replace(/\s/g, "");
I can't seem to figure out how to get it to ignore spaces between quotes.
Example
str = 'Here is my example "leave spaces here", ok im done'
Output = 'Hereismyexample"leave spaces here",okimdone'
Another way to do it. This has the assumption that no escaping is allowed within double quoted part of the string (e.g. no "leave \" space \" here"), but can be easily modified to allow it.
str.replace(/([^"]+)|("[^"]+")/g, function($0, $1, $2) {
if ($1) {
return $1.replace(/\s/g, '');
} else {
return $2;
}
});
Modified regex to allow escape of " within quoted string:
/([^"]+)|("(?:[^"\\]|\\.)+")/g
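As a rough sketch, the escape-aware pattern can be dropped into the same replace() callback. The function name stripSpacesOutsideQuotes is made up for this example, and the g flag is assumed so the whole string is processed.

```javascript
// Hypothetical function name; the regex is the escape-aware variant with "g" added.
function stripSpacesOutsideQuotes(str) {
  return str.replace(/([^"]+)|("(?:[^"\\]|\\.)+")/g, function (match, outside, quoted) {
    // outside: a run of text with no double quotes, so whitespace is stripped
    // quoted: a complete quoted string (possibly with \" inside), returned as is
    return outside ? outside.replace(/\s/g, "") : quoted;
  });
}

console.log(stripSpacesOutsideQuotes('Here is my example "leave spaces here", ok im done'));
// Hereismyexample"leave spaces here",okimdone

console.log(stripSpacesOutsideQuotes('a b "c \\" d" e f'));
// ab"c \" d"ef
```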
var output = input.split('"').map(function(v,i){
return i%2 ? v : v.replace(/\s/g, "");
}).join('"');
Note that I renamed the variables because I can't write code with a variable whose name starts with an uppercase and especially when it's a standard constructor of the language. I'd suggest you stick with those guidelines when in doubt.
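In case it isn't obvious why the split version works, here is the same idea spelled out step by step. It assumes the quotes are balanced and never escaped; the variable name parts is mine.

```javascript
var input = 'Here is my example "leave spaces here", ok im done';

// Splitting on the quote character puts the text outside quotes at even
// indexes and the text inside quotes at odd indexes.
var parts = input.split('"');
// parts[0] is 'Here is my example '
// parts[1] is 'leave spaces here'
// parts[2] is ', ok im done'

var output = parts.map(function (v, i) {
  return i % 2 ? v : v.replace(/\s/g, ""); // odd index: inside quotes, keep as is
}).join('"');

console.log(output); // Hereismyexample"leave spaces here",okimdone
```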
Rob, resurrecting this question because it had a simple solution that only required one replace call, not two. (Found your question while doing some research for a regex bounty quest.)
The regex is quite short:
"[^"]+"|( )
The left side of the alternation matches complete quoted strings. We will ignore these matches. The right side matches and captures spaces to Group 1, and we know they are the right spaces because they were not matched by the expression on the left.
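Here is working code:
var subject = 'Here is my example "leave spaces here", ok im done';
var regex = /"[^"]+"|( )/g;
var replaced = subject.replace(regex, function(m, group1) {
  // group1 is undefined when the quoted-string side matched: keep that match as is
  if (group1 === undefined) return m;
  // otherwise a space outside quotes was captured: remove it
  else return "";
});
console.log(replaced); // Hereismyexample"leave spaces here",okimdone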
Reference
How to match pattern except in situations s1, s2, s3
How to match a pattern unless...
I have strings with extra whitespace characters. Each time there's more than one whitespace character in a row, I'd like it to be only one. How can I do this using JavaScript?
Something like this:
var s = " a b c ";
console.log(
s.replace(/\s+/g, ' ')
)
You can augment String to implement these behaviors as methods, as in:
String.prototype.killWhiteSpace = function() {
return this.replace(/\s/g, '');
};
String.prototype.reduceWhiteSpace = function() {
return this.replace(/\s+/g, ' ');
};
This now enables you to use the following elegant forms to produce the strings you want:
"Get rid of my whitespaces.".killWhiteSpace();
"Get rid of my extra whitespaces".reduceWhiteSpace();
Here's a non-regex solution (just for fun):
var s = ' a b word word. word, wordword word ';
// with ES5:
s = s.split(' ').filter(function(n){ return n != '' }).join(' ');
console.log(s); // "a b word word. word, wordword word"
// or ES2015:
s = s.split(' ').filter(n => n).join(' ');
console.log(s); // "a b word word. word, wordword word"
Can even substitute filter(n => n) with .filter(String)
It splits the string on spaces, removes all the empty items from the array (the ones produced by runs of more than one space), and joins the words back into a string with a single space between them.
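To make the intermediate steps visible, here is a small sketch with a shorter input of my own. The last line is an assumption beyond the answer: splitting on /\s+/ instead of ' ' so that tabs and newlines are handled too.

```javascript
var s = "  a  b   c ";

// Every extra space produces an empty string in the array.
var pieces = s.split(" ");
// pieces is ["", "", "a", "", "b", "", "", "c", ""]

// Boolean drops the empty strings, just like n => n or String would.
var words = pieces.filter(Boolean);
var joined = words.join(" ");
console.log(joined); // a b c

// Assumption beyond the answer: split on any whitespace run to cover tabs as well.
var tabbed = "a\t\tb  c".split(/\s+/).filter(Boolean).join(" ");
console.log(tabbed); // a b c
```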
Using a regular expression with the replace function does the trick:
string.replace(/\s/g, "")
I presume you're looking to strip spaces from the beginning and/or end of the string (rather than removing all spaces)?
If that's the case, you'll need a regex like this:
mystring = mystring.replace(/(^\s+|\s+$)/g,'');
This will remove all whitespace from the beginning and end of the string. If you only want to trim whitespace from the end, then the regex would look like this instead:
mystring = mystring.replace(/\s+$/g,'');
Hope that helps.
jQuery.trim() works well.
http://api.jquery.com/jQuery.trim/
I know I shouldn't revive an old question, but given the details of the question, I usually expand it to mean:
I want to replace multiple occurrences of whitespace inside the string with a single space
...and... I do not want whitespace at the beginning or end of the string (trim)
For this, I use code like this (the parentheses in the first regexp are there just to make the code a bit more readable ... regexps can be a pain unless you are familiar with them):
s = s.replace(/^(\s*)|(\s*)$/g, '').replace(/\s+/g, ' ');
The reason this works is that the methods on String-object return a string object on which you can invoke another method (just like jQuery & some other libraries). Much more compact way to code if you want to execute multiple methods on a single object in succession.
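Wrapped in a function, the chained version might look like the sketch below. The name normalizeWhitespace is made up, and I've written the trim step as /^\s+|\s+$/g, which should behave the same as the pattern above without the extra groups.

```javascript
// Hypothetical helper name; trims both ends, then collapses inner whitespace runs.
function normalizeWhitespace(s) {
  return s
    .replace(/^\s+|\s+$/g, "") // step 1: drop leading and trailing whitespace
    .replace(/\s+/g, " ");     // step 2: each remaining run becomes one space
}

console.log(normalizeWhitespace("  a \t b\n\n c  ")); // a b c
console.log(normalizeWhitespace("already clean"));    // already clean
```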
var x = " Test Test Test ".split(" ").join("");
alert(x);
Try this.
var string = " string 1";
string = string.trim().replace(/\s+/g, ' ');
the result will be
string 1
What happens here is that trim() removes the outer spaces first, then .replace(/\s+/g, ' ') collapses each run of inner whitespace to a single space.
How about this one?
"my test string \t\t with crazy stuff is cool ".replace(/\s{2,9999}|\t/g, ' ')
outputs "my test string with crazy stuff is cool "
This one gets rid of any tabs as well
If you want to prevent the user from entering a blank space in the name, just create an if statement with the condition, like I did:
$j('#fragment_key').bind({
keypress: function(e){
var key = e.keyCode;
var character = String.fromCharCode(key);
if(character.match(/ /)) {
alert("Blank space is not allowed in the Name");
return false;
}
}
});
Create a jQuery handler.
This is the keypress event.
Initialize a variable with the typed character.
Give a condition to match the character.
Show an alert message when the condition matches.
I want to remove all unnecessary commas from the start/end of the string.
E.g. google, yahoo,, , should become google, yahoo.
If possible ,google,, , yahoo,, , should become google,yahoo.
I've tried the below code as a starting point, but it doesn't seem to work as desired.
trimCommas = function(s) {
s = s.replace(/,*$/, "");
s = s.replace(/^\,*/, "");
return s;
}
In your example you also want to trim the commas when there are spaces between them at the start or at the end. Use something like this:
str.replace(/^[,\s]+|[,\s]+$/g, '').replace(/,[,\s]*,/g, ',');
Note the use of the 'g' modifier for global replace.
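Here is a quick sketch of that one-liner run against both strings from the question. The wrapper name cleanCommas is made up for the example. Note that the space after a surviving comma is kept, so the second input gives google, yahoo rather than google,yahoo.

```javascript
// Hypothetical wrapper name around the two replace() calls from the answer.
function cleanCommas(str) {
  return str
    .replace(/^[,\s]+|[,\s]+$/g, "") // trim commas and whitespace at both ends
    .replace(/,[,\s]*,/g, ",");      // collapse inner runs that contain 2+ commas
}

console.log(cleanCommas("google, yahoo,, ,"));     // google, yahoo
console.log(cleanCommas(",google,, , yahoo,, ,")); // google, yahoo
```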
You need this:
s = s.replace(/\s*,[,\s]*/g,","); //Collapses each run of commas (and surrounding spaces) into one comma
s = s.replace(/^,*/,""); //Removes all commas from the beginning
s = s.replace(/,*$/,""); //Removes all commas from the end
EDIT: Made all the changes - should work now.
My take:
var cleanStr = str.replace(/^[\s,]+/,"")
.replace(/[\s,]+$/,"")
.replace(/\s*,+\s*(,+\s*)*/g,",")
This one will work with opera, internet explorer, whatever
Actually tested this last one, and it works!
What you need to do is replace all groups of "space and comma" with a single comma and then remove commas from the start and end:
trimCommas = function(str) {
str = str.replace(/[,\s]*,[,\s]*/g, ",");
str = str.replace(/^,/, "");
str = str.replace(/,$/, "");
return str;
}
The first one replaces every sequence of white space and commas with a single comma, provided there's at least one comma in there. This handles the edge case left in the comments for "Internet Explorer".
The second and third get rid of the comma at the start and end of string where necessary.
You can also add (to the end):
str = str.replace(/\s+/g, " ");
to collapse multi-spaces down to one space and
str = str.replace(/,/g, ", ");
if you want them to be formatted nicely (space after each comma).
A more generalized solution would be to pass parameters to indicate behaviour:
Passing true for collapse will collapse the spaces within a section (a section being defined as the characters between commas).
Passing true for addspace will use ", " to separate sections rather than just "," on its own.
That code follows. It may not be necessary for your particular case but it might be better for others in terms of code re-use.
trimCommas = function(str,collapse,addspace) {
str = str.replace(/[,\s]*,[,\s]*/g, ",").replace(/^,/, "").replace(/,$/, "");
if (collapse) {
str = str.replace(/\s+/g, " ");
}
if (addspace) {
str = str.replace(/,/g, ", ");
}
return str;
}
First hit on Google for "Javascript Trim": http://www.somacon.com/p355.php. You seem to have implemented this using commas, and I don't see why it would be a problem (though you escaped in the second one and not in the first).
Not quite as sophisticated, but simple with:
',google,, , yahoo,, ,'.replace(/\s/g, '').replace(/,+/g, ',');
You should be able to use only one replace call:
/^( *, *)+|(, *(?=,|$))+/g
Test:
'google, yahoo,, ,'.replace(/^( *, *)+|(, *(?=,|$))+/g, '');
"google, yahoo"
',google,, , yahoo,, ,'.replace(/^( *, *)+|(, *(?=,|$))+/g, '');
"google, yahoo"
Breakdown:
/
^( *, *)+ # Match start of string followed by zero or more spaces
# followed by , followed by zero or more spaces.
# Repeat one or more times
| # regex or
(, *(?=,|$))+ # Match , followed by zero or more spaces which have a comma
# after it or EOL. Repeat one or more times
/g # `g` modifier will run on until there are no more matches
(?=...) is a lookahead, which will not move the position of the match but only verifies that the given characters come after the match. In our case we look for , or EOL
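If the lookahead part is unclear, this small sketch (with inputs of my own) shows that the text inside (?=...) is checked but not included in the match, which is why only the redundant commas get removed.

```javascript
// The lookahead requires a following comma but does not consume it:
// the match is a single "," at index 1, not ",,".
var m = /,(?=,)/.exec("a,,b");
console.log(m[0], m.index); // , 1

// Remove every comma that is followed by another comma or by the end of the string.
var cleaned = "a,b,,".replace(/,(?=,|$)/g, "");
console.log(cleaned); // a,b
```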
match() is a much better tool for this than replace()
str = " aa, bb,, cc , dd,,,";
newStr = str.match(/[^\s,]+/g).join(",")
alert("[" + newStr + "]")
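Two things to watch for with the match() approach, shown in a sketch with a made-up helper name joinTokens: match() with the g flag returns null when nothing matches, so a fallback array avoids an exception, and entries containing spaces get split into separate items.

```javascript
// Hypothetical helper name; same pattern as above with a null guard added.
function joinTokens(str) {
  var tokens = str.match(/[^\s,]+/g) || []; // match() returns null when nothing matches
  return tokens.join(",");
}

console.log(joinTokens(" aa, bb,, cc , dd,,,")); // aa,bb,cc,dd
console.log(joinTokens(" , ,, "));               // prints an empty line instead of throwing
console.log(joinTokens("new york, paris"));      // new,york,paris (inner space becomes a separator)
```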
To collapse runs of commas such as ",,", ",,,", ",,,," and ",,,,," into a single ",", use the code below.
var abc = "46590,26.91667,75.81667,,,45346,27.18333,78.01667,,,45630,12.97194,77.59369,,,47413,19.07283,72.88261,,,45981,13.08784,80.27847,,";
var pqr = abc.replace(/,{2,}/g, ',');
alert(pqr);