Regex .exec into array - javascript

I want to capture some values in a string, then return them to the page. Here is an example of the code. As I understand it, .exec should store the values it matches in the array, correct? This should return Savage, Betsy. Can someone enlighten me as to what's wrong?
var regex = /\b(Betsy)(Savage)\b/i;
var string = "My friend is Betsy Ann Savage";
var arrayMatch = null;
while(arrayMatch = regex.exec(string)){
document.getElementById("text").innerHTML = arrayMatch[1] + ", " + arrayMatch[0];
}

You don't get any matches like this. You could add .* between (Betsy) and (Savage)...

It sounds like you think \b(Betsy)(Savage)\b will match either Betsy or Savage, but that isn't the case. It looks for one string in which both parts are joined together; you might as well try to match \b(BetsySavage)\b. Yes, you do have two groups, each in its own parentheses, but they sit directly next to each other, so the regex engine looks for "Betsy" immediately followed by "Savage". I think what you really want is |, which represents an OR, as in \b(Betsy|Savage)\b.
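As a rough sketch of the two suggestions above (the variable names are made up for illustration): the first pattern allows arbitrary text between the two names, the second uses | together with the g flag. Keep in mind that a while loop over exec() with a regex that lacks the g flag always restarts from the beginning of the string, so a successful match would loop forever.

```javascript
const text = "My friend is Betsy Ann Savage";

// Option 1: one match, two capture groups, anything (lazily) in between.
const both = /\b(Betsy)\b.*?\b(Savage)\b/i.exec(text);
// Index 0 is the whole match; the groups start at index 1.
const lastFirst = both ? both[2] + ", " + both[1] : null;
console.log(lastFirst); // Savage, Betsy

// Option 2: match either name; the g flag makes exec() advance on each call.
const either = /\b(Betsy|Savage)\b/gi;
const found = [];
let match;
while ((match = either.exec(text)) !== null) {
  found.push(match[1]);
}
console.log(found); // [ 'Betsy', 'Savage' ]
```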

Related

Javascript regular expression to sanitize string with pipes

I need a little help in trying to sanitize a string. I have written a regular expression that is pretty close to giving me the results I want but I just can't quite get it right. The string I'm receiving is in this format.
||a|assa||asss||ssss
The pipe characters are basically placeholders for what would have been a separator for text. However, I'm trying to end up with something that looks like this:
|a|b|c|d
In other words, I'm just trying to remove consecutive pipes. I have put together a little example to illustrate what I have attempted, and I keep failing miserably.
const str1 = "||a||jump|fences||in the street";
const str2 = "im a wolf";
const hasPipe = /\|{1}\+/;//if the | is consecutively repeated more than once then delete it.
console.log(hasPipe.test(str1));
console.log(str1.replace(hasPipe, ""))
console.log(hasPipe.test(str2));
The expected result to the above code should simply be.
|a|jump|fences|in the street
Can someone please point me in the right direction or point out my silly mistake?
Given your test string const str1 = "||a||jump|fences||in the street"; you want to replace multiple occurrences of pipe | with a single pipe.
There are a couple of ways to match a non-empty sequence:
+ = match 1 or more of the previous expression
{n,m} = match at least n but not more than m occurrences.
{n,} = match at least n and unlimited times.
Simple:
str1.replace(/\|+/g, "|")
"|a|jump|fences|in the street"
Matches one or more pipes and replaces them with a single pipe. Note that this also replaces a lone pipe with an identical pipe, which is harmless but redundant.
More exact:
str1.replace(/\|{2,}/g, "|")
"|a|jump|fences|in the street"
Matches two or more (because there is no max after the comma) pipes and replaces with a single pipe. This does not bother replacing a single pipe with another single pipe.
There are also a couple of ways to match exactly two pipes, if you'll never have a run of 3 or more (the g flag is needed so that every pair is replaced, not just the first one):
str1.replace(/\|\|/g, "|");
str1.replace(/\|{2}/g, "|");
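To make the effect of the g flag and of the quantifier choice concrete, here is a small sketch. The variable names and the extra "a|||b" input are only illustrative and not part of the question.

```javascript
const input = "||a||jump|fences||in the street";

// Without the g flag only the first pair of pipes is collapsed.
const firstOnly = input.replace(/\|{2}/, "|");
console.log(firstOnly); // |a||jump|fences||in the street

// With the g flag every pair is collapsed.
const everyPair = input.replace(/\|{2}/g, "|");
console.log(everyPair); // |a|jump|fences|in the street

// {2} matches exactly two pipes, so a run of three leaves two behind...
const exactlyTwo = "a|||b".replace(/\|{2}/g, "|");
console.log(exactlyTwo); // a||b

// ...while {2,} collapses a run of any length.
const twoOrMore = "a|||b".replace(/\|{2,}/g, "|");
console.log(twoOrMore); // a|b
```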
Not much to it:
\|\|+ replace with |
https://regex101.com/r/vvkrI0/1/
You can use the + to find all the locations that have 1 or more pipes in a row, and replace them all with a single pipe. Your regex would simply be:
/\|+/g
Here is an example, with a variable number of pipes:
const str1 = "||a|||jump|fences||||in the street";
var filtered_str1 = str1.replace(/\|+/g,"|")
console.log(filtered_str1);
You could substitute consecutive pipe characters like this:
const pat = /\|{2,}/gm;
const str = `||a|||jump|fences||in the street`;
const sub = `|`;
const res = str.replace(pat, sub);
console.log('result: ', res);
Result:
|a|jump|fences|in the street

How to find any of the specific characters exists in a string

I'm looking for a way to check whether any of a set of given characters exists in a string. That is, if any of the given characters is present in the string, it should return true.
Right now I'm doing it with an array and a loop, but honestly I feel it's not a good way. Is there an easier way without an array or loop?
var special = ['$', '%', '#'];
var mystring = ' using it to replace VLOOKUP entirely.$ But there are still a few lookups that you are not sure how to perform. Most importantly, you would like to be able to look up a value based on multiple criteria within separate columns.';
var exists = false;
$.each(special, function(index, item) {
if (mystring.indexOf(item) >= 0) {
exists = true;
}
});
console.info(exists);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
try with regex
var patt = /[$%#]/;
console.log(patt.test("using it to replace VLOOKUP entirely.$ But there are still a few lookups that you are not sure how to perform. Most importantly, you would like to be able to look up a value based on multiple criteria within separate columns."));
Be aware that [x] in a regex is a character class: it matches a single character only.
If you wanted to search for, say, replace, then [replace] would match any single one of the characters r, e, p, l, a, c anywhere in the string, not the word itself.
Another thing to be aware of with regexes is escaping. Using a simple escape regex found here -> Is there a RegExp.escape function in Javascript? I've made a more generic find-in-string.
Of course, you asked about given characters in a string, so this is more of an addendum for anyone finding this post on SO. Looking at the array of strings in your original question, people might easily think they could pass just that to the regex. In other words, your question wasn't phrased as "how can I find out whether $, % or # exist in a string".
var mystring = ' using it to replace VLOOKUP entirely.$ But there are still a few lookups that you are not sure how to perform. Most importantly, you would like to be able to look up a value based on multiple criteria within separate columns.';
function makeStrSearchRegEx(findlist) {
return new RegExp('('+findlist.map(
s=>s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join('|')+')');
}
var re = makeStrSearchRegEx(['$', '%', '#', 'VLOOKUP']);
console.log(re.test(mystring)); //true
console.log(re.test('...VLOOKUP..')); //true
console.log(re.test('...LOOKUP..')); //false
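To illustrate the difference between a character class and a literal or alternation pattern mentioned in this answer, here is a short sketch. The sample inputs and variable names are made up for the example.

```javascript
// A character class matches ONE character out of the set.
const charClass = /[replace]/;
// Without brackets the pattern matches the whole word.
const literal = /replace/;

const classHitsCat = charClass.test("cat");   // true: "c" and "a" are in the set
const literalHitsCat = literal.test("cat");   // false: the word "replace" is absent
const literalHitsWord = literal.test("using it to replace VLOOKUP"); // true

// To look for whole strings as well as single characters, use alternation
// and escape characters that have a special meaning (here: $).
const alternation = /\$|%|#|VLOOKUP/;
const altHitsLookup = alternation.test("...LOOKUP..");  // false
const altHitsDollar = alternation.test("entirely.$ But"); // true

console.log(classHitsCat, literalHitsCat, literalHitsWord, altHitsLookup, altHitsDollar);
```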
The best way is to use regular expressions. You can read more about it here.
In your case you should do something like this:
const specialCharacters = /[$%#]/;
const myString = ' using it to replace VLOOKUP entirely.$ But there are still a few lookups that you are not sure how to perform. Most importantly, you would like to be able to look up a value based on multiple criteria within separate columns.';
if(specialCharacters.test(myString)) {
console.info("Exists...");
}
Please note that it is good practice to store a regular expression in a variable, so that it is not recreated (which is not the fastest operation) each time you use it.

How can I inverse matched result of the pattern?

Here is my string:
Organization 2
info#something.org.au more#something.com market#gmail.com single#noidea.com
Organization 3
headmistress#money.com head#skull.com
Also this is my pattern:
/^.*?#[^ ]+|^.*$/gm
As you see in the demo, the pattern matches this:
Organization 2
info#something.org.au
Organization 3
headmistress#money.com
My question: How can I make it inverse? I mean I want to match this:
more#something.com market#gmail.com single#noidea.com
head#skull.com
How can I do that? Actually I can write a new (and completely different) pattern to grab expected result, but I want to know, Is "inverting the result of a pattern" possible?
No, I don't believe there is a way to directly invert a regular expression while otherwise keeping it the same.
However, you could achieve something close to what you're after by using your existing RegExp to replace its matches with an empty string:
var everythingThatDidntMatchStr = str.replace(/^.*?#[^ ]+|^.*$/gm, '');
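Here is a small runnable sketch of that replace-with-empty-string idea, using the sample text from the question. The variable names and the final clean-up step (splitting into lines, trimming, dropping empty lines) are additions for illustration, since the raw result still contains the leftover line breaks and leading spaces.

```javascript
const str = [
  "Organization 2",
  "info#something.org.au more#something.com market#gmail.com single#noidea.com",
  "Organization 3",
  "headmistress#money.com head#skull.com"
].join("\n");

// Remove everything the original pattern matches...
const leftover = str.replace(/^.*?#[^ ]+|^.*$/gm, "");

// ...then tidy up what remains: one entry per non-empty line.
const remaining = leftover
  .split("\n")
  .map(line => line.trim())
  .filter(line => line.length > 0);

console.log(remaining);
// [ 'more#something.com market#gmail.com single#noidea.com', 'head#skull.com' ]
```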
You can also collect the matches of the first RegExp and strip each one from the string, using Array.prototype.forEach() and String.prototype.replace(). Pass each match to replace() as a plain string, so that characters such as . in the matched text are not interpreted as regex syntax:
var re = str.match(/^.*?#[^ ]+|^.*$/gm);
var res = str;
re.forEach(val => res = res.replace(val, ""));

Single regular expression for two different strings

I need to write a single regular expression that returns the color value and size values from the below two strings.
[{"id":"2","name":"Color","code":"COLOR","optionValue":{"value":"TANGERINE TANGO","priority":0,"altValue1":"ORANGE","altValue2":null}},{"id":"3","name":"Size","code":"SIZE","optionValue":{"value":"MEDIUM","priority":4,"altValue1":null,"altValue2":null}}]
[{"id":"3","name":"Size","code":"SIZE","optionValue":{"value":"MEDIUM","priority":4,"altValue1":null,"altValue2":null}},{"id":"2","name":"Color","code":"COLOR","optionValue":{"value":"PEACOCK BLUE","priority":0,"altValue1":"GREEN","altValue2":null}}]
Currently I have two different regexps for them respectively.
1) COLOR(?:.*?)value":"([^"]+)(?:.*?)SIZE(?:.*?)value":"([^"]+)"
2) SIZE(?:.*?)value":"([^"]+)(?:.*?)COLOR(?:.*?)value":"([^"]+)"
Is there a way I can achieve this using a single regex?
Use JSON.parse, it is safer and is more appropriate with JSON strings:
var strings = ['[{"id":"2","name":"Color","code":"COLOR","optionValue":{"value":"TANGERINE TANGO","priority":0,"altValue1":"ORANGE","altValue2":null}},{"id":"3","name":"Size","code":"SIZE","optionValue":{"value":"MEDIUM","priority":4,"altValue1":null,"altValue2":null}}]', '[{"id":"3","name":"Size","code":"SIZE","optionValue":{"value":"MEDIUM","priority":4,"altValue1":null,"altValue2":null}},{"id":"2","name":"Color","code":"COLOR","optionValue":{"value":"PEACOCK BLUE","priority":0,"altValue1":"GREEN","altValue2":null}}]'];
var cnt = 0;
strings.forEach(function(str) {
var array = JSON.parse(str);
cnt += 1;
document.getElementById("r").innerHTML += "<b>Match " + cnt + "</b><br/>";
array.forEach(function(object) {
document.getElementById("r").innerHTML += object.optionValue.value + "<br/>";
});
});
<div id="r"></div>
You can declare an array and push the results you get into the array for later use.
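As a sketch of collecting the parsed values for later use: the inputs below are shortened versions of the two strings from the question, and extractOptions is a hypothetical helper name. Keying the result by the code field means the order of the entries no longer matters.

```javascript
// Shortened versions of the two JSON strings from the question.
const inputs = [
  '[{"name":"Color","code":"COLOR","optionValue":{"value":"TANGERINE TANGO"}},' +
    '{"name":"Size","code":"SIZE","optionValue":{"value":"MEDIUM"}}]',
  '[{"name":"Size","code":"SIZE","optionValue":{"value":"MEDIUM"}},' +
    '{"name":"Color","code":"COLOR","optionValue":{"value":"PEACOCK BLUE"}}]'
];

// Hypothetical helper: maps each option's code to its value.
function extractOptions(jsonText) {
  const result = {};
  for (const option of JSON.parse(jsonText)) {
    result[option.code] = option.optionValue.value;
  }
  return result;
}

const parsed = inputs.map(extractOptions);
console.log(parsed);
// [ { COLOR: 'TANGERINE TANGO', SIZE: 'MEDIUM' },
//   { SIZE: 'MEDIUM', COLOR: 'PEACOCK BLUE' } ]
```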
Most compact way to do this in a single regex:
(COLOR|SIZE).*?value":"([^"]+).*?(?!\1)(?:COLOR|SIZE).*?value":"([^"]+)"
I might agree with @stribizhev and @Sirko about parsing the JSON, but I can see that if you just need to get a quick-and-dirty job done, a regex is sometimes useful.
Explanation:
You can use alternation, capturing, lookahead assertions, and backreferencing.
First, let's simplify your regex by removing unnecessary non-capturing groups, (?:...):
COLOR.*?value":"([^"]+).*?SIZE.*?value":"([^"]+)"
SIZE.*?value":"([^"]+).*?COLOR.*?value":"([^"]+)"
Now, here's what gets you halfway (similar to @MarcosPerezGude's suggestion):
(?:COLOR|SIZE).*?value":"([^"]+).*?(?:COLOR|SIZE).*?value":"([^"]+)"
^^^     ^^^^^^                     ^^^^^^^^^    ^
But the problem with this is it accepts COLOR COLOR and SIZE SIZE. Here's how to get around that:
(COLOR|SIZE).*?value":"([^"]+).*?(?!\1)(?:COLOR|SIZE).*?value":"([^"]+)"
^                                ^^^^^^
Let me explain this. The \1 is a backreference to whatever's captured in the first capturing group. Which in our case is now COLOR or SIZE because we've removed the non-capturing-ness. The (?!\1) is a negative lookahead assertion that says, "As long as what comes next isn't \1..." Therefore if the captured string was COLOR, the second half must be SIZE, or vice versa.
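A quick sketch of that pattern in use. The sample strings are shortened versions of the ones in the question, and colorAndSize is a hypothetical helper name. Since group 1 records which keyword came first, it can be used to put the two captured values in a fixed order.

```javascript
const pattern = /(COLOR|SIZE).*?value":"([^"]+).*?(?!\1)(?:COLOR|SIZE).*?value":"([^"]+)"/;

// Hypothetical helper: returns { color, size } regardless of input order.
function colorAndSize(text) {
  const m = pattern.exec(text);
  if (m === null) return null;
  // m[1] is whichever keyword matched first; m[2] belongs to it, m[3] to the other.
  return m[1] === "COLOR"
    ? { color: m[2], size: m[3] }
    : { color: m[3], size: m[2] };
}

const colorFirst = '[{"code":"COLOR","optionValue":{"value":"TANGERINE TANGO"}},{"code":"SIZE","optionValue":{"value":"MEDIUM"}}]';
const sizeFirst = '[{"code":"SIZE","optionValue":{"value":"MEDIUM"}},{"code":"COLOR","optionValue":{"value":"PEACOCK BLUE"}}]';
const colorTwice = '[{"code":"COLOR","optionValue":{"value":"RED"}},{"code":"COLOR","optionValue":{"value":"BLUE"}}]';

console.log(colorAndSize(colorFirst)); // { color: 'TANGERINE TANGO', size: 'MEDIUM' }
console.log(colorAndSize(sizeFirst));  // { color: 'PEACOCK BLUE', size: 'MEDIUM' }
console.log(colorAndSize(colorTwice)); // null, because the lookahead rejects COLOR ... COLOR
```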
You can try with the OR operator
((?:SIZE|COLOR)(?:.*?))
Good luck!
You should be able to have it select between two alternatives like this:
COLOR(?:.*?)value":"([^"]+)(?:.*?)SIZE(?:.*?)value":"([^"]+)|SIZE(?:.*?)value":"([^"]+)(?:.*?)COLOR(?:.*?)value":"([^"]+)

Javascript regex expression to replace multiple strings?

I have a string like this: "http://something.org/dom/My_happy_dog_%28is%29cool!"
How can I remove the leading domain part, the underscores and the percent-encoded stuff?
For now I'm just doing some multiple replace, like
str = str.replace("http://something.org/dom/","");
str = str.replace("_%28"," ");
and go on, but it's really ugly.. any help?
Thanks!
EDIT:
the exact output would be "My happy dog is cool!" so I would like to get rid of the initial address, remove the underscores and percent codes, and put the spaces in the right place!
The problem is that trying to put a regex on Chrome "something goes wrong". Is it a problem of Chrome or my regex?
I'd suggest:
var str = "http://something.org/dom/My_happy_dog_%28is%29cool!";
str.substring(str.lastIndexOf('/')+1).replace(/(_)|(%\d{2,})/g,' ');
JS Fiddle demo.
The reason I took this approach is that RegEx is fairly expensive, and is often tricky to fine tune to the point where edge-cases become less troublesome; so I opted to use simple string manipulation to reduce the RegEx work.
Effectively the above creates a substring of the given str variable, from the index point of the lastIndexOf('/') (which does exactly what you'd expect) and adding 1 to that so the substring is from the point after the / not before it.
The regex: (_) matches the underscores, the | serves as an OR operator, and (%\d{2,}) matches a % sign followed by two or more digit characters.
The parentheses surrounding each part of the regex around the |, serve to identify matching groups, which are used to identify what parts should be replaced by the ' ' (single-space) string in the second of the arguments passed to replace().
References:
lastIndexOf().
replace().
substring().
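As a variation on this approach, the sketch below treats any run of underscores and percent escapes as a single separator, so that a sequence like _%28 yields one space rather than two. The function name cleanTitle is made up for this example, and the hex-digit class is an assumption that the escapes are ordinary %XX URL escapes.

```javascript
// Hypothetical helper: keeps the last path segment and turns every run of
// underscores and %XX escapes into a single space.
function cleanTitle(url) {
  const lastSegment = url.substring(url.lastIndexOf("/") + 1);
  return lastSegment.replace(/(?:_|%[0-9A-Fa-f]{2})+/g, " ").trim();
}

const cleaned = cleanTitle("http://something.org/dom/My_happy_dog_%28is%29cool!");
console.log(cleaned); // My happy dog is cool!
```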
You can use decodeURIComponent to decode the percent escapes (the older unescape also works here, but it is deprecated):
str = decodeURIComponent("http://something.org/dom/My_happy_dog_%28is%29cool!");
str = str.replace("http://something.org/dom/","");
Maybe you could use a regular expression to pull out what you need, rather than getting rid of what you don't want. What is it you are trying to keep?
You can also chain them together as in:
str.replace("http://something.org/dom/", "").replace("something else", "");
You haven't defined the problem very exactly. To get rid of all stretches of characters ending in %<digit><digit> you'd say
var re = /.*%\d\d/g;
str = str.replace(re, "");
ok, if you want to replace all that stuff I think that you would need something like this:
/(http:\/\/.*\.[a-z]{3}\/.*\/)|(\%[a-z0-9][a-z0-9])|_/g
test
var string = "http://something.org/dom/My_happy_dog_%28is%29cool!";
string = string.replace(/(http:\/\/.*\.[a-z]{3}\/.*\/)|(\%[a-z0-9][a-z0-9])|_/g,"");
