Regex match on string only, not substrings - javascript

I'm adding strings to a textarea when values in a table are clicked. It has to be possible to select and deselect values in the table, and they will add/remove themselves from the textarea. The textarea has to be a string, and the added values can't be wrapped in any other characters.
The values being added could potentially contain any characters, and one value may be a substring of another. Here are some examples: HOLE 1, HOLE 11, HOLE 17, HOLE (1), cutaway, cut, away, cut-away, Commentator (SMITH, John), (GOAL), GOAL
Once a value has been appended to the textarea, and it's clicked again to deselect it, I'm searching for the value and removing it like so:
var regex = new RegExp("(?![ .,]|^)?(" + mySelectedText + ")(?=[., ]|$)", 'g');
var newDescriptionText = myTextAreaText.replace(regex, '');
The regex matches correctly for strings/substrings of text, e.g. cutaway and away, but won't work for anything beginning with a bracket, e.g. (GOAL). Adding the word boundary selector \b to the start of the expression makes the regex match strings that start with a bracket, but then it won't work for strings/substrings containing the same text.
Is there a way to achieve this using regex? Or some other method?
Here's a working CodePen example of the adding/removing from table.

You can use word boundaries (\b) to avoid the issue when you deselect away and have cutaway in the list. Just change the regex to:
regex = new RegExp("(?![ .,]|^)?(\\b" + cellText + "\\b)(?=[., ]|$)", 'g');
^^^ ^^^
Here's the code I changed to make it work:
removeFromDescription = function(cell) {
cell.classList.remove(activeClass);
// Remove from the active cells array (skip if the cell is not in it)
var itemIndex = tempAnnotation.activeCells.indexOf(cell.textContent);
if (itemIndex !== -1) tempAnnotation.activeCells.splice(itemIndex, 1);
// Do the regex find/replace
var annotationBoxText = annotation.value,
cellText = regexEscape(cell.textContent), // Escape any funky characters from the string
regex = new RegExp("(^| )" + cellText + "( |$)", 'g');
var newDescription = annotationBoxText.replace(regex, ' ');
setAnnotationBoxValue(newDescription);
console.info('cellText: ', cellText);
console.info('annotationBoxText:', annotationBoxText);
console.info('newDescription: ', newDescription);
};
regexEscape = function(s) {
return s.replace(/([-\/\\^$*+?.()|[\]{}])/g, `\\$&`);
};
setAnnotationBoxValue = function(newValue) {
annotation.value = newValue;
};
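For reference, the same idea can be sketched as a standalone function that does not depend on the table or textarea. This is only a minimal sketch: the names escapeForRegExp and removeValue are made up for illustration, and it assumes values in the textarea are separated by a space, a full stop or a comma, as in the question's original pattern.

```javascript
// Escape every character that has a special meaning in a regular expression.
function escapeForRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\\/-]/g, '\\$&');
}

// Remove whole occurrences of `value` from `text`.
// A match must start at the beginning of the text or right after a separator,
// and must be followed by a separator or the end of the text.
function removeValue(text, value) {
  var pattern = new RegExp('(^|[ .,])' + escapeForRegExp(value) + '(?=[ .,]|$)', 'g');
  return text
    .replace(pattern, '$1')   // keep the leading separator, drop the value
    .replace(/ {2,}/g, ' ')   // collapse the double space left behind
    .trim();
}

var text = 'cutaway away (GOAL) GOAL';
console.log(removeValue(text, 'away'));   // "cutaway (GOAL) GOAL"
console.log(removeValue(text, 'GOAL'));   // "cutaway away (GOAL)"
console.log(removeValue(text, '(GOAL)')); // "cutaway away GOAL"
```

Because the boundaries are explicit separators rather than \b, a value that starts or ends with a bracket behaves the same as a plain word.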

Related

Remove (n)th space from string in JavaScript

I am trying to remove some spaces from a few dynamically generated strings. Which space I remove depends on the length of the string. The strings change all the time so in order to know how many spaces there are, I iterate over the string and increment a variable every time the iteration encounters a space. I can already remove all of a specific type of character with str.replace(' ',''); where 'str' is the name of my string, but I only need to remove a specific occurrence of a space, not all the spaces. So let's say my string is
var str = "Hello, this is a test.";
How can I remove ONLY the space after the word "is"? (Assuming that the next string will be different so I can't just write str.replace('is ','is'); because the word "is" might not be in the next string).
I checked documentation on .replace, but there are no other parameters that it accepts so I can't tell it just to replace the nth instance of a space.
If you want to go by indexes of the spaces:
var str = 'Hello, this is a test.';
function replace(str, indexes){
return str.split(' ').reduce(function(prev, curr, i){
var separator = ~indexes.indexOf(i) ? '' : ' ';
return prev + separator + curr;
});
}
console.log(replace(str, [2,3]));
http://jsfiddle.net/96Lvpcew/1/
As it is easy for you to get the index of the space (you are already iterating over the string), you can create a new string without the space by doing:
str = str.substr(0, index) + str.substr(index + 1);
where index is the index of the space you want to remove.
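A small sketch of how the index could be found and used, in case it helps. The helper names indexOfNthSpace and removeCharAt are hypothetical, and n is assumed to be 1-based (n = 3 means the third space).

```javascript
// Return the index of the n-th space (1-based) in str, or -1 if there are fewer.
function indexOfNthSpace(str, n) {
  var index = -1;
  for (var i = 0; i < n; i++) {
    index = str.indexOf(' ', index + 1);
    if (index === -1) return -1;
  }
  return index;
}

// Return str without the character at the given index.
function removeCharAt(str, index) {
  if (index < 0 || index >= str.length) return str;
  return str.slice(0, index) + str.slice(index + 1);
}

var str = 'Hello, this is a test.';
console.log(removeCharAt(str, indexOfNthSpace(str, 3))); // "Hello, this isa test."
```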
I came up with this for unknown indices
function removeNthSpace(str, n) {
var spacelessArray = str.split(' ');
return spacelessArray
.slice(0, n - 1) // left prefix part may be '', saves spaces
.concat([spacelessArray.slice(n - 1, n + 1).join('')]) // middle part: the one without the space
.concat(spacelessArray.slice(n + 1)).join(' '); // right part, saves spaces
}
Do you know which space you want to remove because of word count or chars count?
If char count, you can use Rafael Cardoso's answer.
If word count you can split them with space and join however you want:
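var wordArray = str.split(" ");
var newStr = "";
var wordIndex = 3; // index of the word whose trailing space is dropped
for (var i = 0; i < wordArray.length; i++) {
newStr += wordArray[i];
// Add a space after every word except the chosen one and the last one
if (i != wordIndex && i < wordArray.length - 1) {
newStr += ' ';
}
}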
I think your best bet is to split the string into an array based on placement of spaces in the string, splice off the space you don't want, and rejoin the array into a string.
Check this out:
var x = "Hello, this is a test.";
var n = 3; // we want to remove the third space
var arr = x.split(/([ ])/); // copy to an array based on space placement
// arr: ["Hello,"," ","this"," ","is"," ","a"," ","test."]
arr.splice(n*2-1,1); // Remove the third space
x = arr.join("");
alert(x); // "Hello, this isa test."
Further Notes
The first thing to note is that str.replace(' ',''); will actually only replace the first instance of a space character. String.replace() also accepts a regular expression as the first parameter, which you'll want to use for more complex replacements.
To actually replace all spaces in the string, you could do str.replace(/ /g,""); and to replace all whitespace (including spaces, tabs, and newlines), you could do str.replace(/\s/g,"");
To fiddle around with different regular expressions and see what they mean, I recommend using http://www.regexr.com
A lot of the functions on the JavaScript String object that seem to take strings as parameters can also take regular expressions, including .split() and .search().
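Since .replace() also accepts a function as its second argument, one more way to drop only the n-th space is to count matches inside that callback. This is just a sketch; the function name removeNthSpaceWithReplace is made up here, and n is assumed to be 1-based.

```javascript
// Remove only the n-th space (1-based) by counting matches in a replace callback.
function removeNthSpaceWithReplace(str, n) {
  var count = 0;
  return str.replace(/ /g, function (match) {
    count += 1;
    // Drop the n-th space, keep every other one unchanged
    return count === n ? '' : match;
  });
}

console.log(removeNthSpaceWithReplace('Hello, this is a test.', 3)); // "Hello, this isa test."
```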

Regex match quotes inside bracket regex

I'm working on a regex that must match only the text inside quotes but not inside a comment. My matches must be only the strings in bold:
<"love";>
>/*"love"*/<
<>'love'<>
"lo
more love
ve"
I'm stuck on this:
/(?:((\"|\')(.|\n)*?(\"|\')))(?=(?:\/\**\*\/))/gm
The first part, (?:((\"|\')(.|\n)*?(\"|\'))), matches all the strings.
The second part, (?=(?:\/\**\*\/)), is meant to exclude text inside quotes inside /* "mystring" */,
but my logic is clearly wrong.
Any suggestion?
Thanks
Maybe you just need to use a negative lookahead to check for the comment end */?
But first, I'd split the string into separate lines
var arrayOfLines = input_str.split(/\r?\n/);
or, without empty lines:
var arrayOfLines = input_str.match(/[^\r\n]+/g);
and then use this regex:
["']([^'"]+)["'](?!.*\*\/)
Sample code:
var rebuilt_string = '';
var re = /["']([^'"]+)["'](?!.*\*\/)/g;
var subst = '<b>$1</b>';
for (var i = 0; i < arrayOfLines.length; i++)
{
rebuilt_string = rebuilt_string + arrayOfLines[i].replace(re, subst) + "\r\n";
}
The way to avoid commented parts is to match them before. The global pattern looks like this:
/(capture parts to avoid)|target/
Then use a callback function for the replacement: when the capture group exists, return the match without change; otherwise, replace the match with what you want.
Example:
var result = text.replace(/(\/\*[^*]*(?:\*+(?!\/)[^*]*)*\*\/)|"[^"\\]*(?:\\[\s\S][^"\\]*)*"|'[^'\\]*(?:\\[\s\S][^'\\]*)*'/g,
function (m, g1) {
if (g1) return g1;
return '<b>' + m + '</b>';
});
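To try this out on a small input, the same pattern can be wrapped in a function. The wrapper name boldQuotedOutsideComments is hypothetical; the regex is the one from the example above.

```javascript
// Wrap quoted strings in <b>...</b>, leaving /* ... */ comments untouched.
function boldQuotedOutsideComments(text) {
  var pattern = /(\/\*[^*]*(?:\*+(?!\/)[^*]*)*\*\/)|"[^"\\]*(?:\\[\s\S][^"\\]*)*"|'[^'\\]*(?:\\[\s\S][^'\\]*)*'/g;
  return text.replace(pattern, function (m, g1) {
    // g1 is set only when the comment alternative matched: return it unchanged
    if (g1) return g1;
    return '<b>' + m + '</b>';
  });
}

console.log(boldQuotedOutsideComments('a "love" /* "hate" */ \'x\''));
// a <b>"love"</b> /* "hate" */ <b>'x'</b>
```

The comment alternative comes first, so a quote inside a comment is consumed as part of the comment and never reaches the string alternatives.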

JS - Get original value of string replace using regex

We have a string:
var dynamicString = "This isn't so dynamic, but it will be in real life.";
User types in some input:
var userInput = "REAL";
I want to match on this input, and wrap it with a span to highlight it:
var result = " ... but it will be in <span class='highlight'>real</span> life.";
So I use some RegExp magic to do that:
// Escapes user input,
var searchString = userInput.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
// Now we make a regex that matches *all* instances
// and (most important point) is case-insensitive.
var searchRegex = new RegExp(searchString , 'ig');
// Now we highlight the matches on the dynamic string:
dynamicString = dynamicString.replace(searchRegex, '<span class="highlight">' + userInput + '</span>');
This is all great, except here is the result:
console.log(dynamicString);
// -> " ... but it will be in <span class='highlight'>REAL</span> life.";
I replaced the content with the user's input, which means the text now gets the user's dirty case-insensitivity.
How do I wrap all matches with the span shown above, while maintaining the original value of the matches?
To be clear, the ideal result would be:
// user inputs 'REAL',
// We get:
console.log(dynamicString);
// -> " ... but it will be in <span class='highlight'>real</span> life.";
You'd use regex capturing groups and backreferences to capture the match and insert it in the string
var searchRegex = new RegExp('(' + searchString + ')', 'ig'); // searchString is the escaped user input
dynamicString = dynamicString.replace(searchRegex, '<span class="highlight">$1</span>');
You can use it without capturing groups too.
dynamicString = dynamicString.replace(new RegExp(searchString, 'ig'), '<span class="highlight">$&</span>');
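Putting the escaping and the $& replacement together, a self-contained version might look like the sketch below. The function names escapeRegExp and highlight are made up for illustration, and the sketch assumes the text itself is already safe to insert as HTML.

```javascript
// Escape characters that are special in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every case-insensitive occurrence of userInput in a highlight span,
// keeping the original casing of the matched text ($& is the whole match).
function highlight(text, userInput) {
  if (!userInput) return text; // an empty pattern would match at every position
  var re = new RegExp(escapeRegExp(userInput), 'ig');
  return text.replace(re, '<span class="highlight">$&</span>');
}

console.log(highlight('but it will be in real life.', 'REAL'));
// but it will be in <span class="highlight">real</span> life.
```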

Using js regex to replace simple markup styles like **bold** to <b>bold</b>

I'm trying to take a chunk of plain text and convert parts of it into html tags. I don't need a full rich editor, just these few tags:
**bold**
__underline__
~~italics~~
--strike--
<<http://www.link.com>>
This is the method I have attempted to write but my lack of regex/js seems to be holding it back:
function toMarkup($this) {
var text = $this.text();
text = text.replace("\*\*(.*)\*\*", "<b>$1</b>");
text = text.replace("__(.*)__", "<u>$1</u>");
text = text.replace("~~(.*)~~", "<i>$1</i>");
text = text.replace("--(.*)--", "<del>$1</del>");
text = text.replace("<<(.*)>>", "<a href='$1'>Link</a>");
$this.html(text);
}
Any glaring errors as to why these replaces are not working? Another issue I'm just now realizing is by converting this text to html I am unescaping any other potential tags that may be malicious. A bonus would be any advice on how to only escape these elements and nothing else.
First of all, those are just strings, not regexes. Second, you should use the non-greedy .*? instead of .*.
Also, you may want to use the g modifier to match every occurrence in the text.
function toMarkup($this) {
var text = $this.text();
text = text.replace(/\*\*(.*?)\*\*/g, "<b>$1</b>");
text = text.replace(/__(.*?)__/g, "<u>$1</u>");
text = text.replace(/~~(.*?)~~/g, "<i>$1</i>");
text = text.replace(/--(.*?)--/g, "<del>$1</del>");
text = text.replace(/<<(.*?)>>/g, "<a href='$1'>Link</a>");
$this.html(text);
}
Use a Regexp object as the first argument to text.replace() instead of a string:
function toMarkup($this) {
var text = $this.text();
text = text.replace(/\*\*(.*?)\*\*/g, "<b>$1</b>");
text = text.replace(/__(.*?)__/g, "<u>$1</u>");
text = text.replace(/~~(.*?)~~/g, "<i>$1</i>");
text = text.replace(/--(.*?)--/g, "<del>$1</del>");
text = text.replace(/<<(.*?)>>/g, "<a href='$1'>Link</a>");
$this.html(text);
}
Note that I also replaced all of the .* with .*? which will match as few characters as possible, otherwise your matches may be too long. For example you would match from the first ** to the very last ** instead of stopping at the next one. The regex also needs the g flag so that all matches will be replaced (thanks Aaron).
function toMarkup($this) {
$this.html($this.text().replace(/(__|~~|--|\*\*)(.*?)\1|<<(.*?)>>/g,
function (m, m1, m2, m3) {
// Map each delimiter to its tag name; m1 is undefined when the <<link>> alternative matched
var tag = {'**': 'b', '__': 'u', '--': 'del', '~~': 'i'}[m1];
return m3 ? '<a href="' + m3 + '">Link</a>'
: '<' + tag + '>' + m2 + '</' + tag + '>';
}));
}
Note that you cannot nest these, i.e. __--abc--__ will be converted to <u>--abc--</u>.
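On the bonus question about malicious tags: one possible approach is to HTML-escape the whole text first and only then apply the replacements, so the only tags in the output are the ones you generate. The sketch below is an assumption-laden illustration, not a vetted sanitizer. The names escapeHtml and toMarkupHtml are hypothetical, it works on a plain string instead of a jQuery object, and it only accepts links that start with http:// or https://.

```javascript
// Escape the characters that are significant in HTML.
function escapeHtml(s) {
  return s.replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Escape first, then convert the simple markup into tags.
function toMarkupHtml(text) {
  return escapeHtml(text)
    .replace(/\*\*(.*?)\*\*/g, '<b>$1</b>')
    .replace(/__(.*?)__/g, '<u>$1</u>')
    .replace(/~~(.*?)~~/g, '<i>$1</i>')
    .replace(/--(.*?)--/g, '<del>$1</del>')
    // After escaping, << and >> appear as &lt;&lt; and &gt;&gt;
    .replace(/&lt;&lt;(https?:\/\/.*?)&gt;&gt;/g, '<a href="$1">Link</a>');
}

console.log(toMarkupHtml('**bold** <script>'));
// <b>bold</b> &lt;script&gt;
console.log(toMarkupHtml('<<http://www.link.com>>'));
// <a href="http://www.link.com">Link</a>
```

One known limitation of this sketch: a URL that itself contains __ or -- would be altered by the earlier replacements, so real input may need the link step handled separately.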

Replace Last Match in Regex

I have an autocomplete user-tagging system that fills in usernames that come after an # symbol. I have this problem however, where I have two users with a matching substring. For example:
Tagging #billy and #b
When a user fills in the #b tag with a user named (for example) #brendan, it'll replace the #billy tag. How do I go backwards and replace only the last tag?
Edit: this is my current solution, but it feels kludgy. Is there a way to do this just with RegEx?:
function tagUser (chosenUsername) {
var userRegex = new RegExp('(^|\\s)#([' + lastUserTag() + ']*)$', 'gi');
var caption = $("#example").val();
var match = caption.match(userRegex);
var lastMatch = match[match.length - 1];
$("#example").val(caption.replace(lastMatch, " #" + chosenUsername));
}
Not sure if I understood your problem entirely. However just to let you know you can use negative lookahead to replace only last matched text like this:
var str='#billy and #b';
str = str.replace(/#b\b(?!.*?#b\b)/, '#brendan'); // "#billy and #brendan"
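If a regex is not a hard requirement, a plain lastIndexOf lookup is another way to touch only the last tag. Note this swaps the regex for string slicing; the function name replaceLastTag and its parameters are made up for this sketch.

```javascript
// Replace only the last occurrence of "#" + partialTag with "#" + chosenUsername.
function replaceLastTag(caption, partialTag, chosenUsername) {
  var index = caption.lastIndexOf('#' + partialTag);
  if (index === -1) return caption; // nothing to replace
  var end = index + 1 + partialTag.length; // skip "#" plus the partial tag
  return caption.slice(0, index) + '#' + chosenUsername + caption.slice(end);
}

console.log(replaceLastTag('Tagging #billy and #b', 'b', 'brendan'));
// Tagging #billy and #brendan
```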
