Replace Last Match in Regex

Replace Last Match in Regex - javascript

I have an autocomplete user-tagging system that fills in usernames that come after a # symbol. The problem arises when two users share a matching substring. For example:
Tagging #billy and #b
When a user fills in the #b tag with a user named (for example) #brendan, it replaces the #billy tag instead. How do I search backwards and replace only the last tag?
Edit: this is my current solution, but it feels kludgy. Is there a way to do this with just a regex?
function tagUser (chosenUsername) {
    var userRegex = new RegExp('(^|\\s)#([' + lastUserTag() + ']*)$', 'gi');
    var caption = $("#example").val();
    var match = caption.match(userRegex);
    var lastMatch = match[match.length - 1];
    $("#example").val(caption.replace(lastMatch, " #" + chosenUsername));
}

I'm not sure I understood your problem entirely. However, you can use a negative lookahead to replace only the last matched text, like this:
var str='#billy and #b';
// the lookahead rejects any #b that is followed by another #b later in the string
str = str.replace(/#b\b(?!.*?#b\b)/, '#brendan');
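The same negative-lookahead idea can be wrapped in a small helper that builds the pattern from whatever partial tag the user has typed. This is only a sketch: the names replaceLastTag and escapeRegExp are hypothetical, and it assumes tags consist of a # followed by word characters.

```javascript
// Escape regex metacharacters so the typed text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Replace only the last occurrence of "#<partial>" (as a whole tag) with "#<username>".
function replaceLastTag(caption, partial, username) {
  const tag = '#' + escapeRegExp(partial);
  // The negative lookahead rejects a match if the same tag appears again later.
  const pattern = new RegExp(tag + '\\b(?![\\s\\S]*' + tag + '\\b)');
  // A function replacement avoids special handling of "$" in the username.
  return caption.replace(pattern, () => '#' + username);
}

console.log(replaceLastTag('Tagging #billy and #b', 'b', 'brendan'));
```

Because \b requires a word boundary after the partial tag, #billy is never touched when the partial is b.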

Related

Ignore html tag (specifically a tag) from given string during JS replaceAll operation

I have a case where I loop through a URL array (e.g. [www.stackoverflow.com, www.ex.com]), match each URL against a given string, and replace it with an anchor tag to make it clickable.
I can do this with the JS replaceAll method, but when the same URL occurs multiple times in the string, it also matches the URL inside the anchor tag that was already inserted.
For example, if the given string is "Check it out at www.stack.com/abc and bookmark the www.stack.com, www.overflow.com" and the given URL array is [www.stack.com/abc, www.stack.com]
During the first replace iteration it will be "Check it out at www.stack.com/abc and bookmark the www.stack.com"
The problem occurs during the second iteration: it replaces the string even inside the tag. I want replaceAll to ignore the HTML tag. Can someone help me out with this?
I tried to ignore tags with the regex below, but it doesn't work for content between the anchor tags.
exString.replaceAll(new RegExp(url + "(?![^<>]*>)", "gi"), replaceText);

Let's split and join then
const div = document.getElementById("text");
const str = div.textContent;
const arr = str.split(/ /);
console.log(arr);
const urls = ["www.stack.com/abc", "www.stack.com"];
arr.forEach((word, i) => {
    // drop one trailing punctuation character (e.g. "," or ".") before comparing
    const punctuation = word.match(/(\W$)/);
    if (punctuation) word = word.slice(0, -1);
    const idx = urls.indexOf(word);
    // the anchor markup is a reconstruction; the https:// scheme is an assumption
    if (idx != -1) arr[i] = arr[i].replace(word, `<a href="https://${word}">${word}</a>`);
});
console.log(arr);
div.innerHTML = arr.join(" ");
<div id="text">Check it out at www.stack.com/abc and bookmark the www.stack.com, www.overflow.com.</div>

Although the solution provided by mplungjan is clever and works well, I wanted to post an alternative.
The algorithm in the accepted answer splits the input string into an array of words and then checks every word against every URL. It also has to check whether each word ends with a symbol and, if so, truncate it.
This can get expensive: 50 words x 5 possible URLs is already 250 comparisons, and the cost grows with the product of the two (roughly O(n^2)). Now imagine 20 possible URLs and 20 input texts, each containing 15+ words. The algorithm may also have issues with case sensitivity.
This solution borrows a lot from mplungjan's approach, but it first uses a regex to narrow down which URLs actually occur in the text, and then loops again to apply replacements only for those matches. The regex also takes care of the possible case-sensitivity issue.
let str = 'Check it out at www.stack.com/abc and bookmark the www.stack.com, www.overflow.com.';
let urls = ["www.stack.com", "www.stack.com/abc", "www.not-here.com"];
let arReplace = [];
// sort by longest URLs first (so a root domain is not replaced inside a longer URL)
urls = urls.sort((a, b) => b.length - a.length);
// find URLs and apply replacement tokens
urls.forEach((url) => {
    // escape regex metacharacters such as "." so the URL is matched literally
    const escaped = url.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    if (str.match(new RegExp('\\b' + escaped + '\\b', 'i'))) {
        arReplace.push(url);
        str = str.replace(new RegExp('\\b' + escaped + '\\b', 'gi'), '%ZZ' + (arReplace.length - 1) + 'ZZ%');
    }
});
// replace tokens with anchor tags (markup reconstructed; the https:// scheme is an assumption)
arReplace.forEach((url, n) => {
    str = str.replace(new RegExp('%ZZ' + n + 'ZZ%', 'g'), '<a href="https://' + url + '">' + url + '</a>');
});
document.body.innerHTML = str;
Fiddle link: https://jsfiddle.net/e05o9cra/
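For comparison, a different technique (not the one used in either answer above) is to combine all URLs into one alternation and replace in a single pass, so anchors that were just inserted are never scanned again. This is a rough sketch: linkify and escapeRegExp are hypothetical names, and the https:// scheme in the href is an assumption.

```javascript
// Escape regex metacharacters so each URL is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every listed URL found in the text in an anchor tag, in one pass.
function linkify(text, urls) {
  // Longest first, so "www.stack.com/abc" wins over "www.stack.com".
  const alternation = [...urls]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join('|');
  if (!alternation) return text;
  const pattern = new RegExp('\\b(?:' + alternation + ')\\b', 'gi');
  // The href scheme is an assumption for illustration.
  return text.replace(pattern, (match) => '<a href="https://' + match + '">' + match + '</a>');
}

console.log(linkify(
  'Check it out at www.stack.com/abc and bookmark the www.stack.com, www.overflow.com.',
  ['www.stack.com', 'www.stack.com/abc']
));
```

Since the replacement happens in one pass over the original text, no placeholder tokens are needed.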

Excluding Links from Javascript

I want to start off by saying I am very, very new to JavaScript, and I did a lot of googling to find this place and various other resources. I found a script and modified it into something I wanted (and managed to get it to work), but it interferes with any links inside the specified div. Is there a way to exclude the links from the JavaScript and have it affect only the text?
This is the JavaScript. I have no problem getting the first part to work (where I replace quoted text), but I can't seem to exclude links and images whose HTML contains quotation marks.
$("div.gam-posttem div").each(function(){
    this.innerHTML = this.innerHTML.replace(/\"([^\"]+)\"/g,
        '<div class="gam-quotetext">"$1"</div>');
});
$not('div.gam-posttem > div > a').each(function(){});
And here's the html I am using.
<div class="gam-posttem"><div class="gam-posttem1">
"quote here" and regular text here. "more quote here"
<br><br>Link is Here
</div></div>
Any help is greatly appreciated, and if you need any more info, such as CSS, please feel free to ask.

What makes this particularly tricky is that JavaScript regular expressions did not support look-behind when this was written (it was added in ES2018). What you want to do is match " pairs that have an equal number of < and > characters (outside of other " characters) before them. Even with look-behind, that would be a pretty nasty-looking regex...
However, you only care about characters outside of <> pairs, and matching those pairs with a regex (while not advisable) is possible:
<(?:[^><\"\']*?(?:([\"\']).*?\1)?[^><\'\"]*?)+(?:>|$) will match all <> pairs, ignoring closing angle brackets inside quotes within the tag. You then want to match everything outside these tags.
There's probably a better way to do this, but you can try the following idea:
Match all tags.
For each tag, get the start position and length, and from them calculate the end position.
Prepend the start index of the string (0) to the list of end positions.
Append the length of the string to the list of start positions.
For each (end, start) pair (as we are inverting the matches), run your replace method and modify the string.
String.prototype.spliceish = function(start, end, newSubStr) {
    return this.slice(0, start) + newSubStr + this.slice(end);
};
var tagMatch = /<(?:[^><\"\']*?(?:([\"\']).*?\1)?[^><\'\"]*?)+(?:>|$)/g;
var tokenMatch = /\"([^\"]+)\"/g;
function invertedMatch(htmlString) {
    var returnString = htmlString;
    var startIndexes = [],
        lengths = [],
        match = tagMatch.exec(htmlString);
    while (match !== null) {
        startIndexes.push(match.index);
        lengths.push(match[0].length);
        match = tagMatch.exec(htmlString);
    }
    var endIndexes = startIndexes.map(function(val, ix) { return val + lengths[ix]; });
    var invertedStarts = [0].concat(endIndexes); // we are inverting, so capture start text.
    var invertedEnds = startIndexes.concat(htmlString.length);
    // will need to go backwards
    for (var i = invertedStarts.length - 1; i >= 0; i--) {
        var start = invertedStarts[i],
            end = invertedEnds[i];
        var stringReplace = htmlString.substring(start, end);
        returnString = returnString.spliceish(start, end, stringReplace.replace(tokenMatch,
            '<div class="gam-quotetext">"$1"</div>'));
    }
    return returnString;
}
$("#root").html(invertedMatch($("#root").html()));
.gam-quotetext{
color: green
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id="root">
<div class="gam-posttem">
<div class="gam-posttem1">
"quote here" and regular text here. "more quote here"
<br>
<br>
Link is Here
</div>
</div>
</div>
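As a shorter alternative to the index-inversion approach above, the string can be split on tags with a capturing group, so that tags and text alternate in the resulting array and only the text parts are rewritten. This is a simplified sketch under one assumption: no attribute value contains a > character. The name wrapQuotesOutsideTags is hypothetical.

```javascript
// Wrap "quoted" text in a div, but only in text that lies outside HTML tags.
function wrapQuotesOutsideTags(html) {
  // Splitting with a capturing group keeps the tags: they land at odd indexes.
  return html
    .split(/(<[^>]*>)/)
    .map((part, index) => {
      if (index % 2 === 1) return part; // a tag: leave untouched
      return part.replace(/"([^"]+)"/g, '<div class="gam-quotetext">"$1"</div>');
    })
    .join('');
}

console.log(wrapQuotesOutsideTags('"quote here" text <a href="x.html">Link</a> "more"'));
```

The quotes inside the href attribute stay as they are because the whole tag is skipped.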

Regex match on string only, not substrings

I'm adding strings to a textarea when values in a table are clicked. It has to be possible to select and deselect values in the table, and they will add/remove themselves from the textarea. The textarea has to be a string, and the added values can't be wrapped in any other characters.
The values being added could potentially contain any characters, and one value may be a substring of another. Here are some examples: HOLE 1, HOLE 11, HOLE 17, HOLE (1), cutaway, cut, away, cut-away, Commentator (SMITH, John), (GOAL), GOAL
Once a value has been appended to the textarea, and it's clicked again to deselect it, I'm searching for the value and removing it like so:
var regex = new RegExp("(?![ .,]|^)?(" + mySelectedText + ")(?=[., ]|$)", 'g');
var newDescriptionText = myTextAreaText.replace(regex, '');
The regex matches correctly for strings/substrings of text, e.g. cutaway and away, but it won't work for anything beginning with a bracket, e.g. (GOAL). Adding the word boundary selector \b to the start of the expression makes the regex match strings that start with a bracket, but then it won't work for strings/substrings containing the same text.
Is there a way to achieve this using regex? Or some other method?
Here's a working CodePen example of the adding/removing from table.

You can use word boundaries (\b) to avoid the issue when you deselect away and cutaway is in the list. Just change the regex to:
regex = new RegExp("(?![ .,]|^)?(\\b" + cellText + "\\b)(?=[., ]|$)", 'g');
                                 ^^^                ^^^
Here's the code I changed to make it work:
removeFromDescription = function(cell) {
    cell.classList.remove(activeClass);
    // Remove from the active cells array
    var itemIndex = tempAnnotation.activeCells.indexOf(cell.textContent);
    tempAnnotation.activeCells.splice(itemIndex, 1);
    // Do the regex find/replace
    var annotationBoxText = annotation.value,
        cellText = regexEscape(cell.textContent), // Escape any funky characters from the string
        regex = new RegExp("(^| )" + cellText + "( |$)", 'g');
    var newDescription = annotationBoxText.replace(regex, ' ');
    setAnnotationBoxValue(newDescription);
    console.info('cellText: ', cellText);
    console.info('annotationBoxText:', annotationBoxText);
    console.info('newDescription: ', newDescription);
};
regexEscape = function(s) {
    return s.replace(/([-\/\\^$*+?.()|[\]{}])/g, `\\$&`);
};
setAnnotationBoxValue = function(newValue) {
    annotation.value = newValue;
};
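The core of the code above, stripped of the page-specific parts, might look like the following sketch. It assumes values in the textarea are separated by single spaces; removeValue and escapeRegExp are hypothetical names, and the trailing delimiter is checked with a lookahead so that two adjacent occurrences can both be removed.

```javascript
// Escape regex metacharacters so values like "(GOAL)" are matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Remove a value only where it is delimited by spaces or the string edges.
function removeValue(text, value) {
  const pattern = new RegExp('(^| )' + escapeRegExp(value) + '(?= |$)', 'g');
  // Collapse any doubled spaces left behind and trim the ends.
  return text.replace(pattern, '').replace(/ {2,}/g, ' ').trim();
}

console.log(removeValue('HOLE 11 HOLE 1', 'HOLE 1'));
console.log(removeValue('GOAL (GOAL)', '(GOAL)'));
console.log(removeValue('cutaway away cut', 'away'));
```

Space-or-edge delimiters are used instead of \b because \b does not behave as a boundary next to brackets.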

Get all words starting with X and ending with Y

I have got a textarea with keyup=validate()
I need a javascript function that gets all words starting with # and ending with a character that is not A-Za-z0-9
For example:
This is a text #user1 this is more text #user2. And this is even more #user3!
The function gives an array:
Array("#user1","#user2","#user3");
I am sure this must be written up somewhere on the internet, but I have no idea what to search for. I am very new to regular expressions.
Thank you very much!

The regular expression you want is:
/#[a-z\d]+/ig
This matches # followed by a sequence of letters and numbers. The i modifier makes it case-insensitive, so you don't have to put A-Z in the character class, and g makes it find all the matches.
var str = "This is a text #user1 this is more text #user2. And this is even more #user3!";
var matches = str.match(/#[a-z\d]+/ig);
console.log(matches);

JS
var str = "This is a text #user1 this is more text #user2. And this is even more #user3!";
var textArr = str.split(" ");
for (var i = 0; i < textArr.length; i++) {
    var test = textArr[i];
    // null for words that do not start with #; trailing punctuation is left out of the match
    var matches = test.match(/^#[A-Za-z0-9]+/);
    console.log(matches);
}
Explanation:
You should also read about regex (http://www.w3schools.com/jsref/jsref_obj_regexp.asp) and match (http://www.w3schools.com/jsref/jsref_match.asp) to get an idea of how they work.
Basically, ^# anchors the match to a # at the start of the word, and [A-Za-z0-9]+ takes the letters and digits that follow, so the match ends at the first character that is not A-Za-z0-9.
To Test: http://www.regular-expressions.info/javascriptexample.html

Thanks for the replies above, they helped me. I've written this method, which hopefully answers the question about having a start and end regex check.
In this example it looks for ##_ at the start and _## at the end,
e.g. ##_anyTokenYouNeedToFind_##.
Code:
const tokenSearchHelper = (inputText) => {
    // \w covers letters, digits and underscore
    let matches = inputText.match(/##_\w+_##/g);
    return matches;
}
const out = tokenSearchHelper("Hello ##_World_##");
console.log(out);

JS Regex to find href of several a tags

I need a regex to find the contents of the hrefs from these a tags :
<p class="bc_shirt_delete">
delete
</p>
Just the urls, not the href/ tags.
I'm parsing a plain text ajax request here, so I need a regex.

You can try this regex:
/href="([^\'\"]+)/g
Example at: http://regexr.com?333d1
Update: or easier via non greedy method:
/href="(.*?)"/g
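To get only the URLs rather than the full match, the capture group can be read from each match. A small sketch using String.prototype.matchAll (ES2020); the function name extractHrefs and the sample markup are made up for illustration.

```javascript
// Collect the contents of every double-quoted href attribute in a string.
function extractHrefs(html) {
  // matchAll yields one match object per occurrence; index 1 is the capture group.
  return Array.from(html.matchAll(/href="(.*?)"/g), (match) => match[1]);
}

// Hypothetical sample markup, not the asker's real response.
const sample =
  '<p class="bc_shirt_delete"><a href="/delete?id=1">delete</a></p>' +
  '<p class="bc_shirt_delete"><a href="/delete?id=2">delete</a></p>';
console.log(extractHrefs(sample));
```

In engines without matchAll, a loop over RegExp.prototype.exec with the g flag gives the same result.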

This will do it nicely. http://jsfiddle.net/grantk/cvBae/216/
Regex example: https://regex101.com/r/nLXheV/1
var str = '<p href="missme" class="test">delete</p>'
var patt = /<a[^>]*href=["']([^"']*)["']/g;
var match;
// exec() with the g flag returns the next match on each call, then null
while ((match = patt.exec(str)) !== null) {
    alert(match[1]);
}

Here is a robust solution:
let href_regex = /<a([^>]*?)href\s*=\s*(['"])([^\2]*?)\2\1*>/i,
link_text = 'another article link',
href = link_text.replace ( href_regex , '$3' );
What it does:
detects a tags
lazily skips over other HTML attributes and captures them as group (1), so you stay DRY
matches the href attribute
takes into consideration possible whitespace around =
captures the opening ' or " as group (2), so you stay DRY
lazily matches the attribute value up to the closing quote and captures it as group (3)
matches the closing quote via group (2)
matches the group (1) (other attributes)
matches whatever else is there until the closing of the tag
sets the i flag to ignore case

You may not need a regex to do that.
var o = document.getElementsByTagName('a');
var urls = [];
for (var i = 0; i < o.length; i++) {
    urls[i] = o[i].href;
}
If it is plain text, you can insert it into an element that is not displayed (e.g. display: none) and then process it as described above.

It might be easier to use jQuery
var html = '<li><h2 class="saved_shirt_name">new shirt 1</h2><button class="edit_shirt">Edit Shirt</button><button class="delete_shirt" data-eq="0" data-href="/CustomContentProcess.aspx?CCID=13524&OID=3936923&A=Delete">Delete Shirt</button></li><li><h2 class="saved_shirt_name">new shirt 2</h2><button class="edit_shirt">Edit Shirt</button><button class="delete_shirt" data-eq="0" data-href="/CustomContentProcess.aspx?CCID=13524&OID=3936924&A=Delete">Delete Shirt</button></li><li><h2 class="saved_shirt_name">new shirt 3</h2><button class="edit_shirt">Edit Shirt</button><button class="delete_shirt" data-eq="0" data-href="/CustomContentProcess.aspx?CCID=13524&OID=3936925&A=Delete">Delete Shirt</button></li>';
$(html).find('[data-href]');
And iterate each node
UPDATE (because post updated)
Let html be your raw response
var matches = $(html).find('[href]');
var hrefs = [];
$.each(matches, function(i, el){ hrefs.push($(el).attr('href'));});
//hrefs is an array of matches

I combined a few solutions around and came up with this (Tested in .NET):
(?<=href=[\'\"])([^\'\"]+)
Explanation:
(?<=) : look-behind, so the match won't include these characters
[\'\"] : matches either a single or a double quote
[^] : matches any character except those listed after '^'
+ : one or more occurrences of the preceding element
This works well and is not greedy with the quote, as it stops matching the moment it finds a quote.
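The same look-behind pattern also runs in JavaScript engines that implement ES2018 look-behind (for example current Node.js and Chromium-based browsers). A minimal sketch, with a hypothetical function name and sample input:

```javascript
// Return the value of every quoted href attribute, without the "href=" prefix.
function hrefValues(html) {
  // The look-behind asserts that href=' or href=" precedes the match but excludes it.
  return html.match(/(?<=href=['"])[^'"]+/g) || [];
}

console.log(hrefValues('<a href="/a?x=1">one</a> <a href=\'/b\'>two</a>'));
```

Note that this also picks up attributes that merely end in href, such as data-href.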

var str = "";
str += "<p class=\"bc_shirt_delete\">";
str += "delete";
str += "</p>";
var matches = [];
str.replace(/href=("|')(.*?)("|')/g, function(a, b, match) {
    matches.push(match);
});
console.log(matches);
or if you don't care about the href:
var matches = str.match(/href=("|')(.*?)("|')/);
console.log(matches);

How about spaces around =?
This code handles them:
var matches = str.match(/href( *)=( *)("|'*)(.*?)("|'*)( |>)/);
console.log(matches);

It's important to be non-greedy, and to cater for matching ' or ".
var test = `<a href="#" class="foo bar"> banana
<a href='http://google.de/foo?yes=1&no=2' data-href='foobar'/>`;
test.replace(/href=(?:\'.*?\'|\".*?\")/gi,'');
Disclaimer: the one thing it does not handle is HTML5 data-href attributes, which it matches as well.

In this specific case, this is probably the fastest pattern:
/f="([^"]*)/
It captures ALL characters (letters, numbers, newlines, etc.) from f=" up to, but excluding, the next ". Flags such as /is are unnecessary, and it returns null if empty.
But if the source contains lots of other links, you need to make sure you match exactly the one you are looking for. You can do that by including more of the surrounding source code in the pattern, for example (this depends on the site's source code, of course):
/bc_shirt_delete">\s*<a href="([^"]*)
