Javascript regular expression to strip out content between double quotes - javascript

I'm looking for a javascript regex that will remove all content wrapped in quotes(and the qoutes too), in a string that is the outlook format for listing email addresses. Take a look at the sample below, I am a regex tard and really need some help with this one, any help/resources would be appreciated!
"Bill'sRestauraunt"BillsRestauraunt#comcast.net,"Rob&Julie"robjules#ntelos.net,"Foo&Bar"foobar#cstone.net

Assuming no nested quotes:
mystring.replace(/"[^"]*"/g, '')

Try this regular expression:
/(?:"(?:[^"\\]+|\\(?:\\\\)*.)*"|'(?:[^'\\]+|\\(?:\\\\)*.)*')/g

Here's a regex I use to find and decompose the quoted strings within a paragraph. It also isolates several attendant tokens, especially adjacent whitespace. You can string together whichever parts you want.
var re = new RegExp(/([^\s\(]?)"(\s*)([^\\]*?(\\.[^\\]*)*)(\s*)("|\n\n)([^\s\)\.\,;]?)/g);

Related

replacing special characters with a specific special character

I have a string which is exactly like this...
R%26B,Alternative,Rock,Classic Rock,Heavy Metal,Classical,Reggae%2fSka,
I have tried enough to remove the special characters before they reach the browser...but not going anywhere..so planing to rely on my old and trusted friend "javascript" I want it to read
R&B,Alternative,Rock,Classic Rock,Heavy Metal,Classical,Reggae&Ska,
I know this can be done through regular expression which I am just not able to figure it out. How would I write the expression?
Any help would be highly appreciated
You may try using:
decodeURIComponent("R%26B,Alternative,Rock,Classic Rock,Heavy Metal,Classical,Reggae%26Ska")
//^prints^ "R&B,Alternative,Rock,Classic Rock,Heavy Metal,Classical,Reggae&Ska"
Look at these answers:
Regex to remove all special characters from string?
They layout a regex that will remove everything EXCEPT those characters you want to allow, this is safer then removing a list of %26,%2f, etc.
For example...
[^0-9a-zA-Z, ]+ would allow all letters, numbers, commas and whitespace.
[^0-9a-zA-Z]+ would be only letters and numbers
The other answers are probably pointing you in a better direction... if it means fixing the string before it gets to the client.
Probably you need decodeURIComponent() function.
<script>
var decodedString = decodeURIComponent('R%26B,Alternative,Rock,Classic Rock,Heavy Metal,Classical,Reggae%2fSka');
</script>
http://www.w3schools.com/jsref/jsref_decodeuricomponent.asp

Extracting both the full match, and the last token match in a regexp

I have a little interesting issue here. I have a plaintext URL coming from Excel and I need to change it to an HTML URL with a unique body. Here is the regex code for javascript:
text = text.toString().replace(/=hyperlink\(([#\\\w\s\(\)-\.\/]+)\)/g, "<a href='file:///$1'>$1</a>");
This works perfectly fine for what it does. Example, text is:
=hyperlink("\\share\folder\log\2013\13-05-13\13-05-13.txt")
regex turns it into
\\share\folder\log\2013\13-05-13\13-05-13.txt
However, I need the inner HTML to be just the text file name:
13-05-13.txt
To further complicate the matter, the original text the regex is going through is not a single occurrence. It is an entire spreadsheet with 100's of rows that contain this. So the regex will be matching and replacing 100's of these strings in one operation.
Hopefully it is possible to get this all done in one regexp on the entire string, but I suppose I could loop through each line of the string first...
If there is no way to do this with one regex engine, what do you think the best approach is? (no PHP/Python/Server side. Just Javascript, HTML, Jquery, etc).
I guess you could use this regex:
=hyperlink\("([#\\\w\s\(\)\-\.\/]+\\([^"]+))"\)
And this new replace:
$2
I'm not sure how your regex was working, but I added the quotes in the regex and replaced the single quotes by double quotes in the replace. Revert those if need be.
Demo

getElementById replace HTML

<script type="text/javascript">
var haystackText = document.getElementById("navigation").innerHTML;
var matchText = 'Subscribe to RSS';
var replacementText = '<ul><li>Some Other Thing Here</li></ul>';
var replaced = haystackText.replace(matchText, replacementText);
document.getElementById("navigation").innerHTML = replaced;
</script>
I'm attempting to try and replace a string of HTML code to be something else. I cannot edit the code directly, so I'm using Javascript to alter the code.
If I use the above method Matching Text on a regular string, such as just 'Subscribe to RSS', I can replace it fine. However, once I try to replace an HTML string, the code 'fails'.
Also, what if the HTML I wish to replace contains line breaks? How would I search for that?
<ul><li>\n</li></ul>
??
What should I be using or doing instead of this? Or am I just missing a small step? I did search around here, but maybe my keywords for the search weren't optimal to find a result that fit my situation...
Edit: Gonna mention, I'm writing this script in the footer of my page, well after the text I wish to replace, so it's not an issue of the script being written before what I want to overwrite to appear. :)
Currently you are using String.replace(substring, replacement) that will search for an exact match of the substring and replace it with the replacement e.g.
"Hello world".replace("world", "Kojichan") => "Hello Kojichan"
The problem with exact matches is that it doesn't allow anything else but exact matches.
To solve the problem, you'll have to start to use regular expressions. When using regular expression you have to be aware of
special characters such as ?, /, and \ that need to escaped \?, \/, \\
multiline mode /regexp/m
global matching if you want to replace more than one instance of the expression /regexp/g
closures for allowing multiple instances of white space \s+ for [1..n] white-space characters and \s* for [0..n] white-space characters.
To use regular expression instead of substring matching you just need to change String.replace("substring", "replacement") to String.replace(/regexp/, "replacement") e.g.
"Hello world".replace(/world/, "Kojichan") => "Hello Kojichan"
From MDN:
Note: If a <div>, <span>, or <noembed> node has a child text node that
includes the characters (&), (<), or (>), innerHTML returns these
characters as &amp, &lt and &gt respectively. Use element.textContent
to get a correct copy of these text nodes' contents.
So since textContent (or innerText) won't get you the HTML, you'd have to modify your search string appropriately.
You can use Regular Expressions.
Recommend to use Regular Expression. Notice that ? and / are special characters in Regular Expression. And for global multi-line matching, you need g and m flags set in the regular expression.
Regular expression matching of HTML (other than plain text) that comes out of a web page is a bad idea and is troublesome to make work cross browser (particularly in IE). The HTML that comes out of a web page does not always look the same as what was put in because some browser reconstitute the HTML and don't actually store what went in. Attributes can change order, quote marks can change or disappear, entities can change, etc...
If you want to modify whole tags, then you should directly access the DOM and operate on the actual objects in the page.

Find uppercase substrings and wrap with acronym tags

For example replace the string Yangomo, Congo, DRC with Yangomo, Congo, <acronym>DRC</acronym>. There may potentially be mulitple uppercase substings in each string. I assume some form of regex?
Thanks.
Well, a really simple one might be:
var replaced = original.replace(/\b([A-Z]+)\b/g, '<acronym>$1</acronym>');
Doing this sort of thing always has complications, however; it depends on the source material. (The "\b" thing matches word boundaries, and is an invaluable trick for all sorts of occasions.)
edit — insightful user Buh Buh points out that it might be nice to only affect strings with more than two characters, which would look like /\b([A-Z]{2,})\b/.
Personally I would use PHP to explode the string, use a regex to find all uppercase letters /[A-Z]+/ and then use PHP to insert the tags (using str_replace).

Regex to match part of a string

Regex fun again...
Take for example http://something.com/en/page
I want to test for an exact match on /en/ including the forward slashes, otherwise it could match 'en' from other parts of the string.
I'm sure this is easy, for someone other than me!
EDIT:
I'm using it for a string.match() in javascript
Well it really depends on what programming language will be executing the regex, but the actual regex is simply
/en/
For .Net the following code works properly:
string url = "http://something.com/en/page";
bool MatchFound = Regex.Match(url, "/en/").Success;
Here is the JavaScript version:
var url = 'http://something.com/en/page';
if (url.match(/\/en\//)) {
alert('match found');
}
else {
alert('no match');
}
DUH
Thank you to Welbog and Chris Ballance to making what should have been the most obvious point. This does not require Regular Expressions to solve. It simply is a contains statement. Regex should only be used where it is needed and that should have been my first consideration and not the last.
If you're trying to match /en/ specifically, you don't need a regular expression at all. Just use your language's equivalent of contains to test for that substring.
If you're trying to match any two-letter part of the URL between two slashes, you need an expression like this:
/../
If you want to capture the two-letter code, enclose the periods in parentheses:
/(..)/
Depending on your language, you may need to escape the slashes:
\/..\/
\/(..)\/
And if you want to make sure you match letters instead of any character (including numbers and symbols), you might want to use an expression like this instead:
/[a-z]{2}/
Which will be recognized by most regex variations.
Again, you can escape the slashes and add a capturing group this way:
\/([a-z]{2})\/
And if you don't need to escape them:
/([a-z]{2})/
This expression will match any string in the form /xy/ where x and y are letters. So it will match /en/, /fr/, /de/, etc.
In JavaScript, you'll need the escaped version: \/([a-z]{2})\/.
You may need to escape the forward-slashes...
/\/en\//
Any reason /en/ would not work?
/\/en\// or perhaps /http\w*:\/\/[^\/]*\/en\//
You don't need a regex for this:
location.pathname.substr(0, 4) === "/en/"
Of course, if you insist on using a regex, use this:
/^\/en\//.test(location.pathname)

Categories