I'd like to replace the "&" character, along with other characters that may interfere with URL syntax.
So far I tried:
myText = myText.replace(/[^a-zA-Z0-9-. ]/g,'');
That probably works for other characters (I didn't test it), but it didn't catch the "&", which is what I care about most. So I also added the following line, but it didn't get rid of the & either:
myText = myText.replace(/&/g,'');
But neither works. How can I replace this special character?
SOLUTION:
The code was receiving &amp; and not &, so I had to do:
myText = myText.replace(/&amp;/g,'');
and it works.
SNIPPET:
var text = "god & damn it";
console.log(text.replace(/&amp;|&/g,''));
According to your comments, what you are trying to replace is &amp;, the HTML encoding of the & character.
With lodash you can _.unescape the string before replacing:
myText = _.unescape(myText).replace(/&/g, '');
This way you handle both the & and &amp; cases. Then, if you have to append that text to the HTML, you should _.escape it back to prevent weird side effects: _.escape(myText);.
Without lodash you can just search both in your regex:
myText = myText.replace(/&amp;|&/g, '');
But this method can have its side effects when other HTML entities are present, because it removes the & character too. For example, the string "Three is &gt; than two &amp; one" would end up looking like "Three is gt; than two  one" (notice the ugly gt; in the middle).
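One way to avoid that side effect without lodash is to decode the common entities first and only then strip the ampersands. This is a minimal sketch, assuming the text contains only the five entities listed in the table. stripAmpersands and ENTITIES are hypothetical names, not part of any library:

```javascript
// Hypothetical lookup table: only these five entities are handled (an assumption).
var ENTITIES = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"', '&#39;': "'" };

// Hypothetical helper: decode the known entities in one pass, then drop every "&".
function stripAmpersands(text) {
  var decoded = text.replace(/&(?:amp|lt|gt|quot|#39);/g, function (entity) {
    return ENTITIES[entity];
  });
  return decoded.replace(/&/g, '');
}

console.log(stripAmpersands('Three is &gt; than two &amp; one'));
// Three is > than two  one
```

The double space is left where the ampersand used to be. Collapse it with a further replace if that matters.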
console.log("m&yText".replace(/\&/g,''))
You can put a backslash before the & to make sure it is treated as a literal character, although & has no special meaning in a JavaScript regex, so /&/g behaves the same way. Either form finds and replaces every literal & character.
How would I match the quotations around "text" in the string below and not around "TEST TEXT" using RegEx. I wanted just quotations only when they are by themselves. I tried a negative lookahead (for a second quote) but it still captured the second of the two quotes around TEST TEXT.
This is some "text". This is also some ""TEST TEXT""
Be aware that I need this to scale so sometimes it would be right in the middle of a string so something like this:
/(\s|\w)(\")(?!")/g (using $2...)
Would work in this example but not if the string was:
This is some^"text".This is also some ""TEST TEXT""
I just need quotation marks by themselves.
EDIT
FYI, this needs to be Javascript RegEx so lookbehind would not be an option for me for this one.
Since you have not tagged any particular flavor of regex, I am taking the liberty of using lookbehind as well. You can use:
(?<!")"(?!")[^"]*"
Update: For working with Javascript you can use this regex:
/""[^"]*""|(")([^"]*)(")/
And use captured group # 1 for your text.
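If the target engine supports ES2018 lookbehind assertions (an assumption, since older browsers do not), the lone quotation marks can be matched directly. A minimal sketch, where loneQuote is just an illustrative name:

```javascript
var subject = 'This is some "text". This is also some ""TEST TEXT""';

// A quote that is neither preceded nor followed by another quote.
var loneQuote = /(?<!")"(?!")/g;

var replaced = subject.replace(loneQuote, '-');
console.log(replaced);
// This is some -text-. This is also some ""TEST TEXT""
```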
I'm not sure if I really understood well your needs. I'll post this answer to check if it helps you but I can delete it if it doesn't.
So, is this what you want using this regex:
"\w+?"
By the way, if you just want to get the content within "..." you can use this regex:
"(\w+?)"
You can't do this with a pure JavaScript regexp. I am going to eat my words now however, as you can use the following solution using callback parameters:
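var subject = 'This is some "text". This is also some ""TEST TEXT""';
var regex = /""+|(")/g;
var replaced = subject.replace(regex, function ($0, $1) {
  // $1 is set only for a lone quote; runs of quotes match the first alternative
  if ($1 === "\"") return "-"; // What to replace to?
  else return $0;
});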
"This is some -text-. This is also some ""TEST TEXT"""
If you're needing the regex to split the string, then you can use the above to replace matches to something distinctive, then split by them:
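// subject is the same input string as in the example above
var regex = /""+|(")/g;
var replaced = subject.replace(regex, function ($0, $1) {
  if ($1 === "\"") return "☺"; // marker that does not occur in the input
  else return $0;
});
var splits = replaced.split("☺");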
["This is some ", "text", ". This is also some ""TEST TEXT"""]
Referenced by: http://www.rexegg.com/regex-best-trick.html
I've a string done like this: "http://something.org/dom/My_happy_dog_%28is%29cool!"
How can I remove all the initial domain, the multiple underscore and the percentage stuff?
For now I'm just doing some multiple replace, like
str = str.replace("http://something.org/dom/","");
str = str.replace("_%28"," ");
and go on, but it's really ugly.. any help?
Thanks!
EDIT:
the exact output would be "My happy dog is cool!", so I would like to get rid of the initial address, remove the underscores and percent escapes, and put the spaces in the right places!
The problem is that trying to put a regex on Chrome "something goes wrong". Is it a problem of Chrome or my regex?
I'd suggest:
var str = "http://something.org/dom/My_happy_dog_%28is%29cool!";
str.substring(str.lastIndexOf('/')+1).replace(/(_)|(%\d{2,})/g,' ');
The reason I took this approach is that RegEx is fairly expensive, and is often tricky to fine tune to the point where edge-cases become less troublesome; so I opted to use simple string manipulation to reduce the RegEx work.
Effectively the above creates a substring of the given str variable, from the index point of the lastIndexOf('/') (which does exactly what you'd expect) and adding 1 to that so the substring is from the point after the / not before it.
The regex: (_) matches the underscores, the | serves as an or operator, and (%\d{2,}) matches a % sign followed by two or more digit characters. Note that \d only covers 0-9, so an escape containing a hex letter, such as %2F, would not be matched.
The parentheses surrounding each part of the regex around the |, serve to identify matching groups, which are used to identify what parts should be replaced by the ' ' (single-space) string in the second of the arguments passed to replace().
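Percent escapes are two hexadecimal digits, so a value such as %2F would slip past \d. Here is a hedged variation on the same approach, matching hex digits and collapsing adjacent matches into a single space (the variable names are illustrative):

```javascript
var str = 'http://something.org/dom/My_happy_dog_%28is%29cool!';

var cleaned = str
  .substring(str.lastIndexOf('/') + 1)          // keep only the last path segment
  .replace(/(?:_|%[0-9A-Fa-f]{2})+/g, ' ');     // each run of "_" / "%XX" becomes one space

console.log(cleaned);
// My happy dog is cool!
```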
References:
lastIndexOf().
replace().
substring().
You can use unescape (now deprecated in favor of decodeURIComponent) to decode the percent-encoded characters:
str = unescape("http://something.org/dom/My_happy_dog_%28is%29cool!")
str = str.replace("http://something.org/dom/","");
Maybe you could use a regular expression to pull out what you need, rather than getting rid of what you don't want. What is it you are trying to keep?
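A sketch of the "pull out what you need" idea, using the standard decodeURIComponent rather than the deprecated unescape. titleFromUrl is a hypothetical helper name, and treating parentheses as separators is an assumption based on the desired output given in the question:

```javascript
// Hypothetical helper: take the last path segment, decode it, and tidy the separators.
function titleFromUrl(url) {
  var lastSegment = url.substring(url.lastIndexOf('/') + 1);
  return decodeURIComponent(lastSegment) // "%28" -> "(", "%29" -> ")"
    .replace(/[_()]/g, ' ')              // underscores and parentheses become spaces
    .replace(/\s+/g, ' ')                // collapse runs of whitespace
    .trim();
}

console.log(titleFromUrl('http://something.org/dom/My_happy_dog_%28is%29cool!'));
// My happy dog is cool!
```

Note that decodeURIComponent throws a URIError on malformed escapes, so untrusted input may need a try/catch around the call.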
You can also chain them together as in:
str.replace("http://something.org/dom/", "").replace("something else", "");
You haven't defined the problem very exactly. To get rid of all stretches of characters ending in %<digit><digit> you'd say
var re = /.*%\d\d/g;
var str = str.replace(re, "");
ok, if you want to replace all that stuff I think that you would need something like this:
/(http:\/\/.*\.[a-z]{3}\/.*\/)|(\%[a-z0-9][a-z0-9])|_/g
test
var string = "http://something.org/dom/My_happy_dog_%28is%29cool!";
string = string.replace(/(http:\/\/.*\.[a-z]{3}\/.*\/)|(\%[a-z0-9][a-z0-9])|_/g,"");
I have a string like that:
var str = 'aaaaaa, bbbbbb, ccccc, ddddddd, eeeeee ';
My goal is to delete the last space in the string. I would use,
str.slice(0, -1);
But if there is no space after the last character in the string, this will delete the last character of the string instead.
I would like to use
str.replace("regex",'');
I am beginner in RegEx, any help is appreciated.
Thank you very much.
Do a google search for "javascript trim" and you will find many different solutions.
Here is a simple one:
trimmedstr = str.replace(/\s+$/, '');
When you need to remove all spaces at the end:
str.replace(/\s*$/,'');
When you need to remove one space at the end:
str.replace(/\s?$/,'');
\s means not only space but space-like characters; for example tab.
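To make the difference between these patterns concrete, here is a small sketch on a string that ends with two spaces and a tab (the sample string and variable names are made up for illustration):

```javascript
var padded = 'abc  \t'; // "abc", two spaces, one tab

var allRemoved = padded.replace(/\s*$/, '');  // strips the whole trailing run
var oneRemoved = padded.replace(/\s?$/, '');  // strips only the final tab
var spaceOnly  = padded.replace(/ $/, '');    // no change: the last character is a tab, not a space

console.log(JSON.stringify(allRemoved)); // "abc"
console.log(JSON.stringify(oneRemoved)); // "abc  "
console.log(JSON.stringify(spaceOnly));  // "abc  \t"
```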
If you use jQuery, you can use the trim function also:
str = $.trim(str);
But trim removes spaces not only at the end of the string, at the beginning also.
It seems you need a trimRight function. It's not available until JavaScript 1.8.1; before that you can add it to the String prototype yourself.
String.prototype.trimRight=function(){return this.replace(/\s+$/,'');}
// Now call it on any string.
var a = "a string ";
a = a.trimRight();
See more on Trim string in JavaScript? And the compatibility list
You can use this code to remove a single trailing space:
.replace(/ $/, "");
To remove all trailing spaces:
.replace(/ +$/, "");
The $ matches the end of input in normal mode (it matches the end of a line in multiline mode).
Try the regex ( +)$, since $ in regex matches the end of the string. This will strip all spaces from the end of the string (use \s+ instead of a literal space to cover tabs and other whitespace too).
Some languages have a strip function to do the same; older versions of the standard JavaScript library did not have this functionality.
Regex Reference Sheet
Working example:
var str = "Hello World ";
var ans = str.replace(/(^[\s]+|[\s]+$)/g, '');
alert(str.length+" "+ ans.length);
Fast forward to 2021,
The trimEnd() function is meant exactly for this!
It will remove all whitespaces (including spaces, tabs, new line characters) from the end of the string.
According to the official docs, it is supported in every major browser; only IE does not support it. (And let's be honest, you shouldn't care about IE, given that Microsoft itself dropped support for IE in Aug 2021!)
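A minimal sketch of using trimEnd() with a regex fallback for engines that lack it. trimTrailing is a hypothetical wrapper name, not a built-in:

```javascript
// Hypothetical wrapper: prefer the built-in trimEnd, fall back to a regex otherwise.
function trimTrailing(str) {
  return typeof str.trimEnd === 'function'
    ? str.trimEnd()
    : str.replace(/\s+$/, '');
}

var str = 'aaaaaa, bbbbbb, ccccc, ddddddd, eeeeee ';
console.log(JSON.stringify(trimTrailing(str)));
// "aaaaaa, bbbbbb, ccccc, ddddddd, eeeeee"
```

Leading whitespace is left untouched, and a string without trailing whitespace is returned unchanged.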
I'm having some troubles with matching a regular expression in multi-line string.
<script>
var str="Welcome to Google!\n";
str = str + "We are proud to announce that Microsoft has \n";
str = str + "one of the worst Web Developers sites in the world.";
document.write(str.replace(/.*(microsoft).*/gmi, "$1"));
</script>
http://jsbin.com/osoli3/3/edit
As you may see on the link above, the output of the code looks like this:
Welcome to Google! Microsoft one of the worst Web Developers sites in the world.
Which means, that the replace() method goes line by line and if there's no match in that line, it returns just the whole line... Even if it has the "m" (multiline) modifier...
The multiline option only changes how the codes ^ and $ work, not how the code . works.
Use a pattern where you match any character with a set like [\w\W] instead of ., since . only matches non-linebreak characters.
document.write(str.replace(/[\w\W]*(microsoft)[\w\W]*/gmi, "$1"));
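For comparison, here is a sketch of two equivalent ways to let the pattern cross line breaks: the character-set trick, and the s (dotAll) flag, which assumes an engine with ES2018 support. The variable names are illustrative:

```javascript
var str = 'Welcome to Google!\n' +
  'We are proud to announce that Microsoft has \n' +
  'one of the worst Web Developers sites in the world.';

// A set that matches any character, line breaks included.
var viaSet = str.replace(/[\s\S]*(microsoft)[\s\S]*/i, '$1');

// With the s flag, "." also matches line breaks (ES2018 and later).
var viaDotAll = str.replace(/.*(microsoft).*/is, '$1');

console.log(viaSet);    // Microsoft
console.log(viaDotAll); // Microsoft
```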
I have a textbox where a user puts a string like this:
"hello world! I think that __i__ am awesome (yes I am!)"
I need to create a correct URL like this:
hello-world-i-think-that-i-am-awesome-yes-i-am
How can it be done using regular expressions?
Also, is it possible to do it with Greek (for example)?
"Γεια σου κόσμε"
turns to
geia-sou-kosme
In other programming languages (Python/Ruby) I am using a translation array. Should I do the same here?
Try this:
function doDashes(str) {
var re = /[^a-z0-9]+/gi; // global and case insensitive matching of non-char/non-numeric
var re2 = /^-*|-*$/g; // get rid of any leading/trailing dashes
str = str.replace(re, '-'); // perform the 1st regexp
return str.replace(re2, '').toLowerCase(); // ..aaand the second + return lowercased result
}
console.log(doDashes("hello world! I think that __i__ am awesome (yes I am!)"));
// => hello-world-i-think-that-i-am-awesome-yes-i-am
As for the greek characters, yeah I can't think of anything else than some sort of lookup table used by another regexp.
Edit, here's the oneliner version:
Edit, added toLowerCase():
Edit, embarrassing fix to the trailing regexp:
function doDashes2(str) {
return str.replace(/[^a-z0-9]+/gi, '-').replace(/^-*|-*$/g, '').toLowerCase();
}
A simple regex for doing this job is matching all "non-word" characters, and replace them with a -. But before matching this regex, convert the string to lowercase. This alone is not fool proof, since a dash on the end may be possible.
[^a-z]+
Thus, after the replacement; you can trim the dashes (from the front and the back) using this regex:
^-+|-+$
You'd have to create the Greek-to-Latin glyph translation yourself; regex can't help you there. Using a translation array is a good idea.
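A rough sketch of the translation-table idea combined with the dash regexes. GREEK_TO_LATIN and slugify are hypothetical names, and the table is a simplified, illustrative mapping (real transliteration schemes differ, for example in how they render υ or η):

```javascript
// Illustrative, simplified table (an assumption, not a standard transliteration).
var GREEK_TO_LATIN = {
  'α': 'a', 'β': 'v', 'γ': 'g', 'δ': 'd', 'ε': 'e', 'ζ': 'z', 'η': 'i', 'θ': 'th',
  'ι': 'i', 'κ': 'k', 'λ': 'l', 'μ': 'm', 'ν': 'n', 'ξ': 'x', 'ο': 'o', 'π': 'p',
  'ρ': 'r', 'σ': 's', 'ς': 's', 'τ': 't', 'υ': 'u', 'φ': 'f', 'χ': 'ch', 'ψ': 'ps',
  'ω': 'o'
};

function slugify(text) {
  return text
    .toLowerCase()
    .normalize('NFD')                     // split accented letters into base + combining mark
    .replace(/[\u0300-\u036f]/g, '')      // drop the combining marks (e.g. the Greek tonos)
    .replace(/[\u03b1-\u03c9]/g, function (ch) {
      return GREEK_TO_LATIN[ch] || ch;    // translate lowercase Greek letters
    })
    .replace(/[^a-z0-9]+/g, '-')          // every other run becomes a single dash
    .replace(/^-+|-+$/g, '');             // trim leading and trailing dashes
}

console.log(slugify('hello world! I think that __i__ am awesome (yes I am!)'));
// hello-world-i-think-that-i-am-awesome-yes-i-am
console.log(slugify('Γεια σου κόσμε'));
// geia-sou-kosme
```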
I can't really say for Greek characters, but for the first example, a simple:
/[^a-zA-Z]+/g
Will do the trick when using it as your pattern (the g flag is needed so that every match is replaced, not just the first) and replacing the matches with a "-".
As for the Greek characters, I'd suggest using an array with all the "character translations", and then adding its values to the regular expression.
To roughly build the url you would need something like this.
var textbox = "hello world! I think that __i__ am awesome (yes I am!)";
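var url = textbox.toLowerCase().replace(/[^a-z\s]/g, '').replace(/\s+/g, ' ').trim().replace(/\s/g, '-');
It removes all characters that are neither letters nor whitespace, collapses repeated spacing, and then replaces every space character with a dash. The g flag is required on each pattern; without it only the first match is replaced.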
You could use another regular expression to replace the greek characters with english characters.