Text Replacement With RegEx - javascript

I am using Sublime Text to write some JavaScript and need to do a simple text replacement in the editor to set some code up. I could do it manually, but I figured there must be a way to have the replacement happen automatically with RegEx. I've used RegEx a bunch before, but never to grab data from one part of the code and use it to edit another part. For example, I have this:
var example_1 = 836;
var example_2 = 837;
var example_3 = 838;
var example_4 = 846;
And then I have this:
SELECT_122=836
SELECT_143=837
SELECT_144=838
SELECT_145=846
I want these to use the corresponding values and format them like this:
SELECT_122: example_1,
SELECT_143: example_2,
SELECT_144: example_3,
SELECT_145: example_4
Note that I'm also changing the equals signs to colons followed by a space, so I figured all of these changes could be done with some sort of search and replace. I have a large number of these, so it would be best to learn how to do this, if it's possible.

I don't have Sublime Text, but you said in a comment that you want to do it through a text editor. Here is what works for me in EditPad Pro; it may work in Sublime too.
Search:
(?s)(var (example_\d++) = (\d++).*?SELECT_\d++)=\3
Replace:
\1: \2,
Then I click "Replace". This will replace the first instance (SELECT_122=836) with "SELECT_122: example_1,"
Then I click "Replace Next" multiple times, and the SELECT_ strings are left looking like this:
SELECT_122: example_1,
SELECT_143: example_2,
SELECT_144: example_3,
SELECT_145: example_4,
Is this what you want?
Hope the regex and replacement string at least get you started. :)
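If the editor's regex flavor gives you trouble, the same idea can be scripted instead. Here is a minimal sketch in JavaScript, assuming the two blocks from the question are available as strings. The names declarations, selects and nameByValue are made up for the example. It first maps each number to its variable name, then rewrites every SELECT_ line.

```javascript
// Hypothetical inputs, copied from the question.
const declarations = [
  "var example_1 = 836;",
  "var example_2 = 837;",
  "var example_3 = 838;",
  "var example_4 = 846;",
].join("\n");

const selects = [
  "SELECT_122=836",
  "SELECT_143=837",
  "SELECT_144=838",
  "SELECT_145=846",
].join("\n");

// Map each numeric value to the variable that holds it, e.g. "836" -> "example_1".
const nameByValue = new Map();
for (const m of declarations.matchAll(/var (\w+) = (\d+);/g)) {
  nameByValue.set(m[2], m[1]);
}

// Rewrite "SELECT_n=value" as "SELECT_n: name"; lines with unknown values stay as they are.
const rewritten = selects
  .split("\n")
  .map((line) =>
    line.replace(/^(SELECT_\d+)=(\d+)$/, (whole, key, value) =>
      nameByValue.has(value) ? `${key}: ${nameByValue.get(value)}` : whole
    )
  )
  .join(",\n");

console.log(rewritten);
```

Joining with ",\n" puts a comma after every line except the last, which is the format asked for in the question.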

Related

Regex javascript code, but retaining middle text

I am trying to redo all of our JavaScript code for a page and make it consistent. I have a function that returns getElementById(Id) with just $ but there are a lot of document.form.elementID.value so I need to turn all of these into $("elementID").value
What would be the Regex to retain the ElementID when I am doing a mass replacement in Eclipse? I will also need to add the quotes.
Tested in Eclipse:
Find
document\.form\.(\w+)\.value
Replace
\$\("$1"\).value
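For anyone who wants to check the pattern outside Eclipse, here is a small sketch of the same find/replace with JavaScript's String.replace. The field names firstName and lastName are invented for the example. Note that in a JavaScript replacement string, $$ produces a literal $ and $1 is the first capture group.

```javascript
// Hypothetical line of legacy code to convert.
const before =
  "var x = document.form.firstName.value + document.form.lastName.value;";

// Escaped dots match literal dots only; (\w+) captures the element ID.
const after = before.replace(
  /document\.form\.(\w+)\.value/g,
  '$$("$1").value'
);

console.log(after);
```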

Matching Hrefs in plain text using regEx fails as soon as one link is touched

I am currently writing a code snippet that automatically links certain keywords and saves the links it created in an array called linked. I do this last step to prevent a word from being linked twice.
When the user types a keyword into a textbox, it gets linked. That works fine. My problem is handling the situation when they delete text from the textbox. This means I have to match all links in the text against the linked array and then remove from the array those that aren't in the text anymore. So much for the theory. Unfortunately I am stuck with the following error.
Assume we have a text like this:
Test <a href='link1'>Link1</a> <a href='link2'>Link2</a>
I use this regEx (/href='([^\'\"]+)'/g) to get all the hrefs in the text above like so:
var hrefs = $(textInput).val().match(/href='([^\'\"]+)'/g);
This gives me an array that contains the following:
href='link1'
href='link2'
If I start deleting text and end up with something like this:
Test <a href='link1'>Link1</a> <a href='link2
Notice that the closing ' is gone. Now the whole match turns out undefined, even though there is still a complete link in the string. Since I am not an expert with regEx, I can't see exactly why. Is there maybe a better regEx for this situation?
You can simplify your regex like this:
/href='[^']+'/g
Demo
http://regex101.com/r/tU2qL0
Use this regex /href=('|")\w+('|")/g like this;
var hrefs = $(textInput).val().match(/href=('|")\w+('|")/g);
This should give you the matches.
BTW, match() is correct. Don't do exec() as @tenub said
Mark it as answer if it helps :)
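To make the pruning step from the question concrete, here is a rough sketch, assuming single-quoted hrefs as in the sample text. The helpers extractHrefs and pruneLinked are made-up names, not part of any library. Keep in mind that match() with the g flag returns null when there is no match at all, so the sketch uses matchAll, which also gives direct access to the captured value.

```javascript
// Collect the values of all complete href='...' attributes in the text.
function extractHrefs(text) {
  const hrefs = [];
  for (const m of text.matchAll(/href='([^']+)'/g)) {
    hrefs.push(m[1]); // m[1] is the part between the quotes
  }
  return hrefs;
}

// Keep only those entries of `linked` that still appear in the text.
function pruneLinked(linked, text) {
  const present = new Set(extractHrefs(text));
  return linked.filter((href) => present.has(href));
}

const full = "Test <a href='link1'>Link1</a> <a href='link2'>Link2</a>";
const truncated = "Test <a href='link1'>Link1</a> <a href='link2";

console.log(extractHrefs(full));
console.log(pruneLinked(["link1", "link2"], truncated));
```

With the truncated text, the second href has no closing quote, so only link1 is still treated as present.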

trouble using string.replace with regex

Given a regex like this:
http://rubular.com/r/ai1LFT5jvK
I want to use string.replace to replace "subdir" with a string of my choosing.
Doing myStr.replace(/^.*\/\/.*\.net\/.*\/(.*)\/.*\z/,otherStr)
only returns the same string, as shown here: http://jsfiddle.net/nLmbV/
If you view the Rubular, it appears to capture what I want it to capture, but on the Fiddle, it doesn't replace it.
I'd like to know why this happens, and what I'm doing wrong. A correct regex or a correct implementation of the replace call would be nice, but most of all, I want to understand what I'm doing wrong so that I can avoid it in the future.
EDIT
I've updated the fiddle to change my regex from:
/^.*\/\/.*\.net\/.*\/(.*)\/.*\z/
to
/^.*\/\/.*\.net\/.*\/(.*)\/.*$/
And according to the fiddle, it just returns hello instead of https://xxxxxxxxxxx.cloudfront.net/dir/hello/Slide1_v2.PNG
It's that little \z in your regex.
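You probably forgot to replace it with a $ sign. JavaScript has no \z anchor (without the u flag, \z simply matches a literal z), so use $ for the end of the string. Ruby uses \A and \z for the start and end of the string.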
To answer your edit:
The match is always replaced as a whole. You'll want to group both the left side and the right side of the to-be-replaced part and reinsert it in the replacement:
url.replace(/^(.*\/\/.*\.net\/.*\/).*(\/.*)$/,"$1hello$2")
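Here is a small sketch that puts the three variants side by side. The URL is an assumption pieced together from the question: the host and file name come from the edit, and "subdir" stands in for the directory to be replaced.

```javascript
// Assumed sample URL based on the question.
const url = "https://xxxxxxxxxxx.cloudfront.net/dir/subdir/Slide1_v2.PNG";

// 1. With \z: in JavaScript this looks for a literal "z", finds none, and changes nothing.
const withRubyAnchor = url.replace(/^.*\/\/.*\.net\/.*\/(.*)\/.*\z/, "hello");

// 2. With $: the pattern matches the whole URL, so the whole URL is replaced.
const wholeMatch = url.replace(/^.*\/\/.*\.net\/.*\/(.*)\/.*$/, "hello");

// 3. Capture the parts to keep and put them back around the new text.
const fixed = url.replace(/^(.*\/\/.*\.net\/.*\/).*(\/.*)$/, "$1hello$2");

console.log(withRubyAnchor);
console.log(wholeMatch);
console.log(fixed);
```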
Before I get marked down, I know the question asks about regexp. The reason for this answer is that URLs are nearly impossible to process reliably with a regexp without writing fiendishly complex regexps. It can be done, but it makes your head hurt!
If you are doing this in a browser, you can use an A tag in your script to make things much simpler. The A tag knows how to parse them into pieces, and it lets you modify the pieces independently, so you only need to deal with the pathname:
//make a temporary a tag
var a = document.createElement('a');
//set the href property to the url you want to process
a.href = "scheme://host.domain/path/to/the/file?querystring";
//grab the path part of the url, and chop up into an array of directories
var dirs = a.pathname.split('/');
//set 2nd dir name - array is ['','path','to','the','file'], so dirs[2] is 'to'
dirs[2]='hello';
//put the path back together
a.pathname = dirs.join('/');
a.href now contains the URL you want.
More lines, but also more hair left when you come back to change the code later.
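The same approach also works without a DOM element. This sketch swaps the A tag for the standard URL class, which is not what the answer above uses, but it is available in current browsers and in Node. The sample URL is an assumption based on the question.

```javascript
// Assumed sample URL; "subdir" stands in for the directory being replaced.
const parsed = new URL("https://xxxxxxxxxxx.cloudfront.net/dir/subdir/Slide1_v2.PNG");

// Split the path into its parts: ['', 'dir', 'subdir', 'Slide1_v2.PNG']
const dirs = parsed.pathname.split("/");

// Replace the second directory and put the path back together.
dirs[2] = "hello";
parsed.pathname = dirs.join("/");

console.log(parsed.href);
```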

Confused with Regex JS pattern

ok i do have this following data in my div
<div id="mydiv">
<!--
what is your present
<code>alert("this is my present");</code>
where?
<code>alert("here at my left hand");</code>
oh thank you! i love you!! hehe
<code>alert("welcome my honey ^^");</code>
-->
</div>
What I need to do is get all the scripts inside the <code> blocks, and the text nodes between them, without removing the HTML comment around them. It's homework given by my professor and I can't modify that div block.
I need to use regular expressions for this and this is what i did
var block = $.trim($("div#mydiv").html()).replace("<!--","").replace("-->","");
var htmlRegex = new RegExp(""); //I don't know what to do here
var codeRegex = new RegExp("^<code(*n)</code>$","igm");
var code = codeRegex.exec(block);
var html = "";
it really doesn't work... please don't give the exact answer.. please teach me.. thank you
I need to have the following blocks for the variable code
alert("this is my present");
alert("here at my left hand");
alert("welcome my honey ^^");
and this is the blocks i need for variable html
what is your present
where?
oh thank you! i love you!! hehe
my question is what is the regex pattern to get the results above?
Parsing HTML with a regular expression is not something you should do.
I'm sure your professor thinks he/she was really clever and that there's no way to access the DOM API and can wave a banner around and justify some minor corner-case for using regex to parse the DOM and that sometimes it's okay.
Well, no, it isn't. If you have complex code in there, what happens? Your regex breaks, and perhaps becomes a security exploit if this is ever in production.
So, here:
http://jsfiddle.net/zfp6D/
Walk the dom, get the nodeType 8 (comment) text value out of the node.
Invoke the HTML parser (that thing that browsers use to parse HTML, rather than regex, why you wouldn't use the HTML parser to parse HTML is totally beyond me, it's like saying "Yeah, I could nail in this nail with a hammer, but I think I'm going to just stomp on the nail with my foot until it goes in").
Find all the CODE elements in the newly parsed HTML.
Log them to console, or whatever you want to do with them.
First of all, you should be aware that because HTML is not a regular language, you cannot do generic parsing using regular expressions that will work for all valid inputs (generic nesting in particular cannot be expressed with regular expressions). Many parsers do use regular expressions to match individual tokens, but other algorithms need to be built around them.
However, for a fixed input such as this, it's just a case of working through the structure you have (though it's still often easier to use different parsing methods than just regular expressions).
First let's get all the code:
var code = '', match = [];
var regex = new RegExp("<code>(.*?)</code>", "g");
while (match = regex.exec(content)) {
    code += match[1] + "\n";
}
I assume content contains the content of the div that you've already extracted. Here the "g" flag says this is for "global" matching, so we can reuse the regex to find every match. The brackets indicate a capturing group, . means any character except a line break, * means repeated 0 or more times, and ? means "non-greedy" (try it without the ? to see what it does).
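Now we can do a similar thing to get all the other bits, but this time the regex is slightly more complicated. It uses [\s\S] instead of . because the text and the tags sit on separate lines, and . does not match line breaks:
new RegExp("(<!--|</code>)([\\s\\S]*?)(-->|<code>)", "g")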
Here | means "or". So this matches all the bits that start with either "start comment" or "end code" and end with "end comment" or "start code". Note also that we now have 3 sets of brackets, so the part we want to extract is match[2] (the second set).
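Putting both loops together, a sketch under stated assumptions: content is a hypothetical string holding the inner HTML of the div from the question, and the text pattern uses [\s\S] so that it can span the line breaks between the text and the tags. Regex literals are used here so the backslashes do not need doubling.

```javascript
// Hypothetical stand-in for the div's inner HTML from the question.
const content = [
  "<!--",
  "what is your present",
  '<code>alert("this is my present");</code>',
  "where?",
  '<code>alert("here at my left hand");</code>',
  "oh thank you! i love you!! hehe",
  '<code>alert("welcome my honey ^^");</code>',
  "-->",
].join("\n");

let match;

// Everything between <code> and </code>.
const codeParts = [];
const codeRegex = /<code>(.*?)<\/code>/g;
while ((match = codeRegex.exec(content)) !== null) {
  codeParts.push(match[1]);
}

// Everything between (<!-- or </code>) and (--> or <code>).
const textParts = [];
const textRegex = /(<!--|<\/code>)([\s\S]*?)(-->|<code>)/g;
while ((match = textRegex.exec(content)) !== null) {
  const text = match[2].trim();
  if (text !== "") textParts.push(text); // skip the whitespace-only gap before -->
}

console.log(codeParts);
console.log(textParts);
```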
You're doing a lot of unnecessary stuff. .html() gives you the inner contents as a string. You should be able to use regEx to grab exactly what you need from there. Also, try to stick with regEx literals (e.g. /^regexstring$/). You have to escape escape characters using new RegExp which gets really messy. You generally only want to use new RegExp when you need to put a string var into a regEx.
The match function of strings accepts regEx and returns a collection of every match when you add the global flag (e.g. /^regexstring$/g <-- note the 'g'). I would do something like this:
var block = $('#mydiv').html(), //you can set multiple vars in one statement w/commas
    matches = block.match(/<code>[^<]*<\/code>/g);
//[^<]* <-- 0 or more characters that aren't '<' - google 'negated character class'
matches = matches.join('_') //lazy way of avoiding a loop - join into a string with a safe character
    .replace(/<\/*code>/g,'') //\/* 0 or more forward slashes
    .split('_');//return the matches string back to array (assigned back, or the result is lost)
//Now do what you want with matches. Eval (ew) or append in a script tag (ew).
//You have no control over the 'ew'. I just prefer data to scripts in strings

match text between two html custom tags but not other custom tags

I have something like the following;-
<--customMarker>Test1<--/customMarker>
<--customMarker key='myKEY'>Test2<--/customMarker>
<--customMarker>Test3 <--customInnerMarker>Test4<--/customInnerMarker> <--/customMarker>
I need to be able to replace text between the customMarker tags, I tried the following;-
str.replace(/<--customMarker>(.*?)<--\/customMarker>/g, 'item Replaced')
which works ok. I would like to also ignore custom inner tags and not match or replace them with text.
Also I need a separate expression to extract the value of the attribute key='myKEY' from the tag with Test2.
Many thanks
EDIT
Actually I am trying to find things between comment tags, but the comment tags were not displaying correctly so I had to remove the '!'. There's a unique situation that required comment tags... in any case, if anyone knows enough regex to help, it would be great. Thank you.
In the end, I did something like the following (in case anyone else needs this. enjoy!!! But note: Word about town is that using regex with html tags is not ideal, so do your own research and make up your mind. For me, it had to be done this way, mostly because I wanted to, but also because it simplified the job in this instance);-
var retVal = str.replace(/<--customMarker>(.*?)<--\/customMarker>/g, function(token, match){
//question 1: I would like to also ignore custom inner tags and not match or replace them with text.
//answer:
var replacePattern = /<--customInnerMarker*?(.*?)<--\/customInnerMarker-->/g;
//remove inner tags from match
match = $.trim(match.replace(replacePattern, ''));
//replace and return what is left with a required value
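//question 2: Also I need a separate expression to extract the value of the attribute key='myKEY' from the tag with Test2.
//answer (this has to run before the return, otherwise it is never reached)
var attrPattern = /\w+\s*=\s*(["']).*?\1/g;
var attrMatches = token.match(attrPattern);//returns a list of attributes as name/value pairs in an array, or null if there are none
//replace and return what is left with a required value
return token.replace(match, objParams[match]);
});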
Can't you use <customMarker> instead? Then you can just use getElementsByTagName('customMarker') and get the inner text and child elements from it.
A regex merely matches an item. Once you have said match, it is up to you what you do with it. This is part of the problem most people have with using regular expressions, they try and combine the three different steps. The regex match is just the first step.
What you are asking for will not be possible with a single regex. You're going to need a mini state machine if you want to use regular expressions. That is, a logic wrapper around the matches such that it moves through each logical portion.
I would advise you look in the standard api for a prebuilt engine to parse html, rather than rolling your own. If you do need to do so, read the flex manual to get a basic understanding of how regular expressions work, and the state machines you build with them. The best example would be the section on matching multiline c comments.
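To give an idea of what such a mini state machine could look like, here is a rough sketch, assuming the marker syntax exactly as written in the question (<--name attrs> and <--/name>). The names replaceOuterText and keyOf are made up for the example. One regex finds the tags, and a small state variable decides what happens to the text between them: text directly inside customMarker is replaced, while anything inside customInnerMarker is left alone.

```javascript
// Pull key='...' (or key="...") out of an attribute string; null when absent.
function keyOf(attrs) {
  const m = /\bkey\s*=\s*(['"])(.*?)\1/.exec(attrs);
  return m ? m[2] : null;
}

// Replace the text directly inside customMarker, leaving inner markers untouched.
// replacer(text, attrs) receives the trimmed text and the raw attribute string.
function replaceOuterText(input, replacer) {
  const tag = /<--(\/?)(\w+)([^>]*)>/g;
  let out = "";
  let pos = 0;
  let state = "outside"; // "outside" | "marker" | "inner"
  let attrs = "";
  let m;
  while ((m = tag.exec(input)) !== null) {
    const text = input.slice(pos, m.index);
    // Only non-blank text directly inside customMarker is replaced.
    out += state === "marker" && text.trim() !== ""
      ? replacer(text.trim(), attrs)
      : text;
    out += m[0]; // the tags themselves are always kept
    pos = tag.lastIndex;

    // State transitions driven by the tag just seen.
    const closing = m[1] === "/";
    if (m[2] === "customMarker") {
      state = closing ? "outside" : "marker";
      attrs = closing ? "" : m[3].trim();
    } else if (m[2] === "customInnerMarker") {
      state = closing ? "marker" : "inner";
    }
  }
  return out + input.slice(pos);
}

const sample = [
  "<--customMarker>Test1<--/customMarker>",
  "<--customMarker key='myKEY'>Test2<--/customMarker>",
  "<--customMarker>Test3 <--customInnerMarker>Test4<--/customInnerMarker> <--/customMarker>",
].join("\n");

// Use the key attribute as the replacement when there is one.
const result = replaceOuterText(sample, (text, attrs) => keyOf(attrs) || "item Replaced");
console.log(result);
```

This only handles one level of inner markers and trims the replaced text, so treat it as a starting point rather than a parser.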
