I am trying to get the 10 most frequent words in the paragraph below, and I need to use a regular expression.
let paragraph = `I love teaching. If you do not love teaching what else can you love. I love Python if you do not love something which can give you all the capabilities to develop an application what else can you love.`;
I want an output like this
[{word:'love', count:6},
{word:'you', count:5},
{word:'can', count:3},
{word:'what', count:2},
{word:'teaching', count:2},
{word:'not', count:2},
{word:'else', count:2},
{word:'do', count:2},
{word:'I', count:2},
{word:'which', count:1},
{word:'to', count:1},
{word:'the', count:1},
{word:'something', count:1},
{word:'if', count:1},
{word:'give', count:1},
{word:'develop',count:1},
{word:'capabilities',count:1},
{word:'application', count:1},
{word:'an',count:1},
{word:'all',count:1},
{word:'Python',count:1},
{word:'If',count:1}]
This is a solution without regexp, but maybe it is also worth looking at?
const paragraph = `I love teaching. If you do not love teaching what else can you love. I love Python if you do not love something which can give you all the capabilities to develop an application what else can you love.`;
let res=Object.entries(
paragraph.toLowerCase()
.split(/[ .,;-]+/).filter(Boolean) // drop the empty string left after the final "."
.reduce((a,c)=>(a[c]=(a[c]||0)+1,a), {})
).map(([k,v])=>({word:k,count:v})).sort((a,b)=>b.count-a.count)
console.log(res.slice(0,10)) // only get the 10 most frequent words
I have something a bit messy, but it uses regex and displays the top 10 most frequently occurring results, which is what you asked for.
Test it and let me know if it works for you.
let paragraph = "I love teaching. If you do not love teaching what else can you love. I love Python if you do not love something which can give you all the capabilities to develop an application what else can you love.";
//remove periods, because otherwise "teaching" and "teaching." would be counted as different words
paragraph = paragraph.split(".").join("");
//results array where results will be stored
var results = []
//separate each string from the paragraph
paragraph.split(" ").forEach((word) => {
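//\b word boundaries make sure "I" is not also counted inside "If" or "application"
const wordCount = paragraph.match(new RegExp("\\b" + word + "\\b", "g")).length
//concatenate the word to its occurrence count, e.g. I : 2 means I has appeared 2 times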
const res = word + " : " + wordCount;
//check if the word has been added to results
if(!results.includes(res)){
//if not, push
results.push(res)
}
})
function sortResultsByOccurences(resArray) {
//we use a sort function to sort our results into order: highest occurrence to lowest
resArray.sort(function(a, b) {
//replace(/\D/g, "") removes anything that's not a digit, so that we sort by occurrences instead of by letters
//the 10 passed to parseInt means we are using the decimal number system
return(parseInt(b.replace(/\D/g, ""), 10) -
parseInt(a.replace(/\D/g, ""), 10));
});
return(resArray);
}
//reassign results as sorted
results = sortResultsByOccurences(results);
for(let i = 0; i < 10; i++){//for loop is used to display top 10
console.log(results[i])
}
To get all the words in a sentence, use a regular expression:
/(\w+)(?=\s)/g
Applied to your input string, this matches every word that is followed by whitespace, so it misses words that end with a full stop (.), i.e. it does not match the "love" in "love.".
So, in this case we have to modify the regex to:
/(\w+)(?=(\s|\.))/g
Similarly, add any other punctuation characters (, ; ? ...) that can follow a word:
paragraph.match(/(\w+)(?=(\s|\.|\,|\;|\?))/gi)
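To make the difference between the two lookaheads concrete, here is a minimal sketch. The sample string is an assumption (a shortened piece of the question's paragraph), not part of the original answer.

```javascript
// Illustrative sample text (an assumption, shortened from the question's paragraph)
const sample = "I love teaching. If you love.";

// Lookahead for whitespace only: words directly before "." are skipped
const spaceOnly = sample.match(/\w+(?=\s)/g);

// Lookahead for whitespace or a full stop: every word is found
const spaceOrDot = sample.match(/\w+(?=\s|\.)/g);

console.log(spaceOnly);
console.log(spaceOrDot);
```

The first call should leave out "teaching" and the final "love", while the second picks up all six words.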
This is your solution (please add the other special character if it's required).
let paragraph = `I love teaching. If you do not love teaching what else can you love. I love Python if you do not love something which can give you all the capabilities to develop an application what else can you love.`;
let objArr = [];
[...new Set(paragraph.match(/(\w+)(?=(\s|\.|\,|\;|\?))/gi))].forEach(ele => {
objArr.push({
'word': ele,
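// \b keeps "an" from also matching inside "can"; no i flag, so "If" and "if" are counted separately
'count': paragraph.match(new RegExp('\\b' + ele + '(?=(\\s|\\.|\\,|\\;|\\?))', 'g'))?.length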
})
});
objArr.sort((x,y) => y.count - x.count);
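For comparison, here is a compact sketch that does the counting in a single regex pass instead of building one RegExp per word. The helper name topWords and its limit parameter are my own assumptions, not something from the question.

```javascript
// Hypothetical helper: count words with one regex pass and return the most frequent ones.
function topWords(text, limit = 10) {
  const counts = new Map();
  // \b\w+\b matches each run of word characters; `|| []` covers text with no words at all
  for (const word of text.match(/\b\w+\b/g) || []) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return [...counts]
    .map(([word, count]) => ({ word, count }))
    .sort((a, b) => b.count - a.count) // highest count first
    .slice(0, limit);
}

const text = "I love teaching. If you do not love teaching what else can you love. I love Python if you do not love something which can give you all the capabilities to develop an application what else can you love.";
const top10 = topWords(text);
console.log(top10);
```

Matching is case-sensitive here, so "If" and "if" stay separate as in the expected output; lower-case the text first if you want them merged.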
I am trying to replace numbers in an array, but I am struggling to target only the one piece of data I actually need to change.
Here is an example to describe it more precisely.
Imagine my data array looks like this:
["data", "phone numbers", "address"]
I can change numbers with the script below, but my first problem is that it makes no distinction between the columns in which it finds a number, for example "phone numbers" versus "address". (At the moment I am not using the address, but if I included a ZIP code in it, this would be a real problem.)
My second, and current, problem is that a digit can obviously appear several times in the same "phone numbers" value, while I only want to affect the first block of the data, say to add or remove the country code (or even replace it with its country flag), which I normally have in the form "+1 0000000000" or "+54 0000000000".
So if a number is located in the EU, for example, the script becomes useless: Spain uses "+34" and France "+33", and it cannot succeed in either case because it only recognizes "+3" for both.
I have found other people who faced this problem and seem to have solved it by wrapping the values in word boundaries, for example "\b"constant"\b", but either I am getting the syntax wrong or it does not apply to my case. Others suggest using forEach or Array.prototype.every, but I could not work out how to apply them here.
If you have other ideas, I am open to trying them!
function phoneUPDATES(val)
{
var i= 0;
var array3 = val.value.split("\n");
for ( i = 0; i < array3.length; ++i) {
array3[i] = "+" + array3[i];
}
var arrayLINES = array3.join("\n");
const zero = "0";
const replaceZERO = "0";
const one = "1";
const replaceONE = "1";
const result0 = arrayLINES.replaceAll(zero, replaceZERO);
const result1 = result0.replaceAll(one, replaceONE);
const result2 = result1.replaceAll(two, replaceTWO);
const result3 = result2.replaceAll(thre, replaceTHREE);
const result4 = result3.replaceAll(four, replaceFOUR);
const result5 = result4.replaceAll(five, replaceFIVE);
const result6 = result5.replaceAll(six, replaceSIX);
const result7 = result6.replaceAll(seven, replaceSEVEN);
const result8 = result7.replaceAll(eight, replaceEIGHT);
const result9 = result8.replaceAll(nine, replaceNINE);
const result10 = result9.replaceAll(ten, replaceTEN);
const result11 = result10.replaceAll(eleven, replaceELEVEN);
Why not use a regex replace? You could use something like /(\+\d+ )/g, which finds a + followed by one or more digits followed by a space, and then strip out the match:
const phoneNumbers = ["+1 1234567890", "+54 9876543210"]
console.log(phoneNumbers.map((num) => num.replaceAll(/(\+\d+ )/g, '')))
If you only need to target the second element in each array, I'd imagine your data looks like this:
const data = [["data", "+1 1234567890, +1 5555555555", "address"], ["data", "+11 111111111, +23 23232323", "address"]];
console.log(data.map((el) => {
el[1] = el[1].replaceAll(/(\+\d+ )/g, '');
return el;
}))
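If the goal is to replace the country code rather than strip it, a sketch along the following lines might work. The lookup table, its labels, the sample rows and the function name relabelCountryCode are all assumptions made up for illustration.

```javascript
// Hypothetical lookup table; the codes and labels are illustrative assumptions.
const countryLabels = { "1": "US", "33": "FR", "34": "ES", "54": "AR" };

function relabelCountryCode(phone) {
  // ^ anchors to the start, so only the leading "+<digits> " block is touched,
  // and \d+ captures the whole code, so "+33" and "+34" are never reduced to "+3".
  return phone.replace(/^\+(\d+) /, (match, code) =>
    code in countryLabels ? countryLabels[code] + " " : match
  );
}

const rows = [
  ["data", "+34 0000000000", "12 Example St 34000"],
  ["data", "+33 0000000000", "address"],
];

// Only column 1 (the phone number) is rewritten; digits in the address are left alone.
const updated = rows.map(([data, phone, address]) => [data, relabelCountryCode(phone), address]);
console.log(updated);
```

Because the whole code is captured as one group and looked up as a key, the order of replacements no longer matters, and unknown codes are returned unchanged.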
OK, this is almost cheating, and I really hadn't thought of it before. It does not actually solve the problem, it just seems to work around it.
If I call the replacements in decreasing order, the problem does not show up, because the replacements involving the higher numbers are matched before the smaller ones.
Still, if someone can suggest a complete, proper solution, it is welcome.
I am writing some code to find words in paragraphs that begin with the letter "a". I was wondering if there was a shortcut that I could put inside a variable. I do know about the startsWith() function, but that does not work for what I'm trying to do. Here's what I have so far. I'm trying to use the match method and .innerText to read the paragraphs.
function processText() {
var totalNumberOfWords = document.getElementById('p')
var wordsBegginingWithA = 0;
var wordsEndingWithW = 0;
var wordsFourLettersLong = 0;
var hyphenatedWords = 0;
}
<p><button onClick="processText();">Process</button></p>
<p id="data"></p>
<p>The thousand injuries of Fortunato I had borne as I best could; but when he ventured upon insult, I vowed revenge. You, who so well know the nature of my soul, will not suppose, however, that I gave utterance to a threat.
<span style='font-style:italic;'>At
length</span> I would be avenged; this was a point definitely settled--but the very definitiveness with which it was resolved precluded the idea of risk. I must not only punish, but punish with impunity. A wrong is unredressed when retribution
overtakes its redresser. It is equally unredressed when the avenger fails to make himself felt as such to him who has done the wrong.</p>
You can get the inner text of the p element - split it at the spaces to get the words - pass the words through a function to see if the first letter is "a" and if so, increment a count.
processText();
function processText() {
var p = document.querySelector('p').innerText;
var totalWords = p.split(' ');
var wordsBegginingWithA = 0;
totalWords.forEach(function(word){
if ( beginsWithA(word) ) {
wordsBegginingWithA++
};
})
console.log(wordsBegginingWithA); // gives 5
}
function beginsWithA(word){
return word.toLowerCase().charAt(0) == 'a';
}
<p>Apples and oranges are fruit while red and blue are colors</p>
You can use:
a.match(/(?<!\w)a\w*/ig) != null ? a.match(/(?<!\w)a\w*/ig).length : 0
to count the words starting with a given letter (a in this example), where the variable a holds your text.
And:
a.match(/\S+/g) != null ? a.match(/\S+/g).length : 0
to get the word count.
function processText() {
var a = document.getElementById('p').innerText;
var b = a.match(/(?<!\w)a\w*/ig)!=null? a.match(/(?<!\w)a\w*/ig).length:0;
var word= a.match(/\S+/g)!=null? a.match(/\S+/g).length:0;
console.log('Text: ',a,'\nA starting word: ', b, '\nWord count: ',word);
}
processText();
<span id="p">Apple is super delicious. An ant is as good as my cat which favors a pear than fish. I'm going to test them all at once.</span>
Explanation: .match returns all the values that match the given expression, or null if there are none.
Notice that I also used the conditional (ternary) operator to check whether the regex returned null because nothing matched. If it is null, the result is 0 (: 0); otherwise the result is the number of matches (.length).
More info related to Regular expression: https://www.rexegg.com/regex-quickstart.html
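The same match-or-zero idea can be extended to the other counters in the question. This is only a sketch: the helper name countMatches, the sample sentence and the exact patterns are my assumptions about what the asker wants to count.

```javascript
// Hypothetical helper: number of regex matches, or 0 when match() returns null.
function countMatches(text, regex) {
  return (text.match(regex) || []).length;
}

// Illustrative sample sentence (an assumption, not the question's paragraph)
const sampleText = "A new so-called law: an owl saw a cow at dawn.";

const stats = {
  totalWords: countMatches(sampleText, /\S+/g),               // whitespace-separated tokens
  beginningWithA: countMatches(sampleText, /\ba\w*/gi),       // word boundary, then "a"
  endingWithW: countMatches(sampleText, /\w*w\b/gi),          // "w" right before a word boundary
  fourLettersLong: countMatches(sampleText, /\b[a-z]{4}\b/gi), // exactly four letters
  hyphenated: countMatches(sampleText, /\b\w+(?:-\w+)+\b/g),  // word-word, word-word-word, ...
};
console.log(stats);
```

One thing to keep in mind: \b also treats the hyphen as a boundary, so the parts of a hyphenated word are checked individually by the four-letter pattern.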
function processText() {
let pp = document.getElementById('root')
console.log(pp.innerHTML.match(/(?<!\w)a\w*/g))
return pp.innerHTML.match(/(?<!\w)a\w*/g);
}
processText()
<p id='root'>this is a apple</p>
With indexOf, a result of 0 means the string starts with the keyword, which is equivalent to startsWith:
var str = document.getElementById("myTextarea").value;
var keyword = document.getElementById("myInput").value;
var n = str.indexOf(keyword);
Working sample in this fiddle.
HTH
I'm loading a JSON file from a database with two fields, word and grade. Each word is graded; for example, true has 1 while lie has -1. Then I take input from a text field and need to grade it based on the grades from the JSON file, calculating the score by summing the grades, but I just can't seem to find a way to do that. Words that are not in the file are not counted.
I tried string.search and match, but it got too complicated and in the end I couldn't get the result I wanted. I tried array searches with the same outcome. I searched online for a solution, but I couldn't find anyone who has done anything similar.
JSON
[
{"word":"true","grade":1},
{"word":"hate","grade":-1},
{"word":"dog","grade":0.8},
{"word":"cat","grade":-0.8}
]
String
"Dogs are wonderful but i prefer cats, cats, i can not lie although dog is a true friend".
The first thing I'd do is turn your JSON data into a map which can easily be searched: the key would be the word, and the value the grade:
var json = [
{"word":"true","grade":1},
{"word":"hate","grade":-1},
{"word":"dog","grade":0.8},
{"word":"cat","grade":-0.8}
];
var map = json.reduce(function(p,c){
p.set(c.word.toLowerCase(),c.grade);
return p;
}, new Map());
console.log(...map);
Then it's just a case of splitting your string while also calculating the total score; again, reduce can be used:
var json = [
{"word":"true","grade":1},
{"word":"hate","grade":-1},
{"word":"dog","grade":0.8},
{"word":"cat","grade":-0.8}
];
var map = json.reduce(function(p,c){
p.set(c.word.toLowerCase(),c.grade);
return p;
}, new Map());
var input = "Dogs are wonderful but i prefer cats cats i can not lie although dog is a true friend";
var score = input.split(' ').reduce(function(p,c){
var wordScore = map.get(c.toLowerCase()) || 0;
return p + wordScore;
},0);
console.log(score);
Note that I have manually removed punctuation in the above input. I'll leave that as an exercise for you.
Also note that "cats" != "cat", so some of your words won't be found!
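For the punctuation and the plural problem, one possible sketch is below. The helper names tokenize and lookup are hypothetical, and dropping a trailing "s" is a deliberately naive assumption rather than real stemming.

```javascript
// Hypothetical helper: lower-case the text and keep only runs of letters,
// so "cats," becomes "cats" and punctuation never reaches the lookup.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z]+/g) || [];
}

// Naive plural handling (an assumption): try the word as-is, then without a trailing "s".
function lookup(map, word) {
  if (map.has(word)) return map.get(word);
  var singular = word.replace(/s$/, "");
  return map.has(singular) ? map.get(singular) : 0;
}

var gradeMap = new Map([["true", 1], ["hate", -1], ["dog", 0.8], ["cat", -0.8]]);
var rawInput = "Dogs are wonderful but i prefer cats, cats, i can not lie although dog is a true friend";

var total = tokenize(rawInput).reduce(function (p, c) {
  return p + lookup(gradeMap, c);
}, 0);
console.log(total);
```

With this, "Dogs" and both "cats" are graded as well, so the total differs from the split(' ') version above.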
Let's first think of the algorithm. Two options:
1. Search and count in the input string as many times as there are words in your JSON, or
2. Check each word in your input string against the JSON contents.
Since the JSON length is known and (I presume) shorter than the possible input string, I would tend to prefer option 2.
Now, after selecting option 2, you need to split the input string into words, creating an array with one word per entry.
You can achieve this using the mystring.split(" ") method. This, of course, does not take punctuation into account, but you can handle that using the same method.
Next, you can add a field to each entry in your JSON that counts how many times that entry's word appears in the string.
Finally, you sum the products of the counters and the grades.
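The steps above could be sketched roughly as follows. The function name gradeText and the count field are assumptions; only exact word matches are counted, so "cats" does not match "cat" here.

```javascript
const rules = [
  { word: "true", grade: 1 },
  { word: "hate", grade: -1 },
  { word: "dog", grade: 0.8 },
  { word: "cat", grade: -0.8 },
];

// Hypothetical function implementing option 2: walk the input words once,
// bump a counter on the matching rule, then sum count * grade.
function gradeText(rules, text) {
  // Split on anything that is not a letter, which also removes punctuation
  const words = text.toLowerCase().split(/[^a-z]+/).filter(Boolean);
  // Copy each rule and add a counter field (the originals are left untouched)
  const counted = rules.map((rule) => ({ ...rule, count: 0 }));
  const byWord = new Map(counted.map((rule) => [rule.word, rule]));
  for (const word of words) {
    const rule = byWord.get(word);
    if (rule) rule.count += 1;
  }
  const score = counted.reduce((sum, rule) => sum + rule.count * rule.grade, 0);
  return { counted, score };
}

const graded = gradeText(rules, "Dogs are wonderful but i prefer cats, cats, i can not lie although dog is a true friend");
console.log(graded.counted, graded.score);
```

The Map lookup keeps the work proportional to the number of input words rather than words times rules.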
console.log((function(rules, str) {
var sum = 0;
Array.prototype.forEach.call(rules, function(rule) {
var match = str.match(rule.regexp);
match && (sum += str.match(rule.regexp).length * rule.grade);
console.log([rule.regexp, match&&match.length, rule.grade, match&&match.length * rule.grade, sum]);
});
return sum;
})([{
"regexp": /true/g,
"grade": 1
}, {
"regexp": /hate/g,
"grade": -1
}, {
"regexp": /dog/g,
"grade": 0.8
}, {
"regexp": /cat/g,
"grade": -0.8
}], "Dogs are wonderful but i prefer cats, cats, i can not lie although dog is a true friend"));
I use regexps rather than strings here; you could use strings and convert them to regexes at run time. Hope this helps.
I would like to be able to change a list of words in my coding, e.g.
var words= {
AFK Away from the keyboard 4U For you B4N By for now BBL Be back later BDAY Birthday CBA Can't be asked }
changed to this format:
var words= {
'AFK': "Away from the keyboard", '4U': "For you", 'B4N': "By for now", 'BBL': "Be back later", 'BDAY': "Birthday", 'CBA': "Can't be asked",
}
without having to change each word into the format manually, using HTML/JavaScript. I understand that this might not be possible, but I thought I'd see if anyone had an idea of how to do it anyway. From what I have read, it looks as if I'll have to use Python and a database, but I don't know anything about Python, so I was hoping (probably vainly) that there is some HTML/JavaScript code I haven't seen that solves this!
I found a similar question, "turning data into a list", but it wasn't really what I wanted as it uses Python.
The thing I want to do is change all words in this format: e.g
AFK Away from the Keyboard
to a format with
'AFK': "Away from the keyboard",
The aim of this code is to translate text abbreviations into real English words, which it is already doing, but in order to get a decent number of words to translate I need them in the above format, which would take forever if I formatted each one individually. Here is the rest of the code, if that helps:
function replacer() {
var text= document.getElementById("textbox1").value;
for (var modifiers in translationwords){
text = text.replace(new RegExp('\\b' + modifiers + '\\b', 'gi'), translationwords[modifiers]); }
document.getElementById("textbox2").value=text;
document.getElementById("add").onclick= function storage() {
if(!document.cookie) document.cookie = "";
document.cookie = document.cookie +"<li>"+document.getElementById("textbox1").value+ ":"+"</li>";
document.cookie = document.cookie +"<li>" + document.getElementById("textbox2").value+ "</li>";
document.getElementById("array").innerHTML= document.cookie;
}
}
function textdelete(x) {
if (x.value=="ENTER TRANSLATION HERE"){
x.value="";
};
}
Thank You
If your words were in a string to begin with, then it would be arguably possible to translate it into an object the way you want it, but it might not be the most accurate translation.
Since it looks like all the abbreviations are all upper case and without spaces, we can look through the string, and set the all caps/numbers 'words' as properties, with the following string of text as the value. In javascript, something like this would work.
//set words as string
var string = "AFK Away from the keyboard 4U For you B4N By for now BBL Be back later BDAY Birthday CBA Can't be asked";
// create empty dictionary for storage
var dictionary = {};
// use regex to find all abbreviations and store them in an array
var abbreviations = string.match(/[A-Z0-9]+(?![a-z])\w/g);
// returns ["AFK", "4U", "B4N", "BBL", "BDAY", "CBA"]
// use regex to replace all abbreviations with commas...
var englishWords = string.replace(/[A-Z0-9]+(?![a-z])\w/g, ',');
// Edit (see below):
englishWords = englishWords.replace(/\W,\W/g,',');
// End edit
// then split string into array based on commas
englishWords = englishWords.split(',').slice(1);
// finally loop over all abbreviations and add them to the dictionary with their meaning.
for(var i = 0; i < abbreviations.length; i++){
dictionary[abbreviations[i]] = englishWords[i];
}
Edit: the above solution still might have white space at the beginning or end of each english string. You can add this line of code just before splitting the string to remove the white space.
englishWords = englishWords.replace(/\W,\W/g,',');
Assuming you have the data as a list in python, or as a string, or that you can get it into that format:
>>> lst = 'AFK Away from the keyboard 4U For you B4N By for now BBL Be back later BDAY Birthday CBA Can\'t be asked'.split()
>>> lst
['AFK', 'Away', 'from', 'the', 'keyboard', '4U', 'For', 'you', 'B4N', 'By', 'for', 'now', 'BBL', 'Be', 'back', 'later', 'BDAY', 'Birthday', 'CBA', "Can't", 'be', 'asked']
Further assuming that the keywords are always in all-uppercase and that the values are never in all uppercase, and that there are no duplicate keys, you can create the following dictionary:
>>> keys = [x for x in lst if x.upper() == x]
>>> bounds = [lst.index(k) for k in keys] + [len(lst)]
>>> {k: ' '.join(lst[bounds[i] + 1:bounds[i + 1]]) for i, k in enumerate(keys)}
{'AFK': 'Away from the keyboard', '4U': 'For you', 'B4N': 'By for now', 'BBL': 'Be back later', 'BDAY': 'Birthday', 'CBA': "Can't be asked"}
Let's say that this is your list of words:
["AFK","Away","from","the","keyboard","4U","For","you","B4N","By","for","now","BBL","Be","back","later","BDAY","Birthday","CBA","Can't","be","asked"]
Now, we can change that into an object simply by using regular JavaScript, without any database or back-end language like Python.
var words = ["AFK","Away","from","the","keyboard","4U","For","you","B4N","By","for","now","BBL","Be","back","later","BDAY","Birthday","CBA","Can't","be","asked"] /*What we have*/;
var code = {} /*What we want*/;
var currKey = null /*Our current key*/, hasBeenSet = false /*Whether or not the property at our current key has been set*/;
for (var i = 0; i < words.length; i++) {
//If the current word is an abbreviation...
if (words[i] === words[i].toUpperCase()) {
currKey = words[i]; //We set currKey to it
hasBeenSet = false; //We make hasBeenSet false
}
//Otherwise, if the property at current key has been set, we add on to it.
else if (hasBeenSet) code[currKey] += " "+words[i];
//Otherwise, the property at current key hasn't been set...
else {
code[currKey] = words[i]; //We set the property to the current word
hasBeenSet = true; //We make hasBeenSet true
}
}
code //{ AFK: "Away from the keyboard", 4U: "For you", B4N: "By for now", BBL: "Be back later", BDAY:"Birthday", CBA:"Can't be asked" }
I have an array of incidents that have happened. They are written in free text and therefore don't follow a pattern, except for some keywords, e.g. "robbery", "murderer", "housebreaking", "car accident" etc. Those keywords can be anywhere in the text, and I want to find them and add them to categories, e.g. "Robberies".
In the end, when I have checked all the incidents I want to have a list of categories like this:
Robberies: 14
Murder attempts: 2
Car accidents: 5
...
The array elements can look like this:
incidents[0] = "There was a robbery on Amest Ave last night...";
incidents[1] = "There has been a report of a murder attempt...";
incidents[2] = "Last night there was a housebreaking in...";
...
I guess the best approach here is to use regular expressions to find the keywords in the texts, but I'm really bad at regexp and therefore need some help.
The regular expressions below are not correct, but I guess this structure would work?
Is there a better way of doing this that stays DRY?
var trafficAccidents = 0,
robberies = 0,
...
function FindIncident(incident) {
if (incident.match(/car accident/g)) {
trafficAccidents += 1;
}
else if (incident.match(/robbery/g)) {
robberies += 1;
}
...
}
Thanks a lot in advance!
The following code shows an approach you can take. You can test it here
var INCIDENT_MATCHES = {
trafficAccidents: /(traffic|car) accident(?:s){0,1}/ig,
robberies: /robbery|robberies/ig,
murder: /murder(?:s){0,1}/ig
};
function FindIncidents(incidentReports) {
var incidentCounts = {};
var incidentTypes = Object.keys(INCIDENT_MATCHES);
incidentReports.forEach(function(incident) {
incidentTypes.forEach(function(type) {
if(typeof incidentCounts[type] === 'undefined') {
incidentCounts[type] = 0;
}
var matchFound = incident.match(INCIDENT_MATCHES[type]);
if(matchFound){
incidentCounts[type] += matchFound.length;
};
});
});
return incidentCounts;
}
Regular expressions make sense, since you'll have a number of strings that meet your 'match' criteria, even if you only consider the differences in plural and singular forms of 'robbery'. You also want to ensure that your matching is case-insensitive.
You need to use the 'global' modifier on your regexes so that you match strings like "Murder, Murder, murder" and increment your count by 3 instead of just 1.
This allows you to keep the relationship between your match criteria and incident counters together. It also avoids the need for global counters (granted, INCIDENT_MATCHES is a global variable here, but you can readily put that elsewhere and take it out of the global scope).
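A quick sketch of what the global flag changes, using the example string mentioned above. The variable names are just illustrative.

```javascript
const report = "Murder, Murder, murder";

// With g, match() returns every occurrence
const withGlobal = report.match(/murder/ig);

// Without g, match() returns only the first occurrence (plus index and capture info)
const withoutGlobal = report.match(/murder/i);

console.log(withGlobal.length, withoutGlobal.length);
```

So counting with .length only reflects the number of occurrences when the regex carries the g flag.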
Actually, I would kind of disagree with you here . . . I think string functions like indexOf will work perfectly fine.
I would use JavaScript's indexOf method which takes 2 inputs:
string.indexOf(value,startPos);
So one thing you can do is define a simple temporary variable as your cursor as such . . .
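function FindIncident(phrase, word) {
var cursor = 0;
var wordCount = 0;
var index;
// keep searching from the cursor until indexOf reports no further match
while ((index = phrase.indexOf(word, cursor)) > -1) {
// move the cursor past this match, otherwise the loop never ends
cursor = index + word.length;
++wordCount;
}
return wordCount;
}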
I have not tested the code but hopefully you get the idea . . .
Be particularly careful of the starting position if you do use it.
RegEx makes my head hurt too. ;) If you're looking for exact matches and aren't worried about typos and misspellings, I'd search the incident strings for substrings containing the keywords you're looking for.
incident = incident.toLowerCase();
// search() returns -1 when nothing is found; 0 is a valid match at the very start
if (incident.search("car accident") >= 0) {
trafficAccidents += 1;
}
else if (incident.search("robbery") >= 0) {
robberies += 1;
}
...
Use an array of objects to store all the different categories you're searching for, complete with an appropriate regular expression and a count member, and you can write the whole thing in four lines.
var categories = [
{
regexp: /\brobbery\b/i
, display: "Robberies"
, count: 0
}
, {
regexp: /\bcar accidents?\b/i
, display: "Car Accidents"
, count: 0
}
, {
regexp: /\bmurder\b/i
, display: "Murders"
, count: 0
}
];
var incidents = [
"There was a robbery on Amest Ave last night..."
, "There has been a report of an murder attempt..."
, "Last night there was a housebreaking in..."
];
for(var x = 0; x<incidents.length; x++)
for(var y = 0; y<categories.length; y++)
if (incidents[x].match(categories[y].regexp))
categories[y].count++;
Now, no matter what you need, you can simply edit one section of code, and it will propagate through your code.
This code has the potential to categorize each incident in multiple categories. To prevent that, just add a 'break' statement to the if block.
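Here is a sketch of the one-category-per-incident variant that also prints the list in the format from the question. The category patterns, the fourth sample incident and the function name countByCategory are assumptions for illustration.

```javascript
// Category definitions (assumed patterns, in priority order)
const categoryRules = [
  { regexp: /\brobber(?:y|ies)\b/i, display: "Robberies" },
  { regexp: /\bmurder/i, display: "Murder attempts" },
  { regexp: /\b(?:car|traffic) accidents?\b/i, display: "Car accidents" },
];

function countByCategory(incidentTexts) {
  const counts = new Map(categoryRules.map((c) => [c.display, 0]));
  for (const incident of incidentTexts) {
    // find() stops at the first matching category, which has the same effect as `break`
    const hit = categoryRules.find((c) => c.regexp.test(incident));
    if (hit) counts.set(hit.display, counts.get(hit.display) + 1);
  }
  return counts;
}

const sampleIncidents = [
  "There was a robbery on Amest Ave last night...",
  "There has been a report of a murder attempt...",
  "Last night there was a housebreaking in...",
  "Car accident after a robbery downtown", // hypothetical incident matching two categories
];

const totals = countByCategory(sampleIncidents);
for (const [name, count] of totals) {
  console.log(name + ": " + count);
}
```

None of the patterns use the g flag, so test() keeps no state between calls. The order of categoryRules decides which category wins when an incident matches more than one.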
You could do something like this which will grab all words found on each item in the array and it will return an object with the count:
var words = ['robbery', 'murderer', 'housebreaking', 'car accident'];
function getAllIncidents( incidents ) {
var re = new RegExp('('+ words.join('|') +')', 'i')
, result = {};
incidents.forEach(function( txt ) {
var match = ( re.exec( txt ) || [,0] )[1];
match && (result[ match ] = ++result[ match ] || 1);
});
return result;
}
console.log( getAllIncidents( incidents ) );
//^= { housebreaking: 1, car accident: 2, robbery: 1, murderer: 2 }
This is more of a quick prototype, but it could be improved with plurals and multiple keywords.
Demo: http://jsbin.com/idesoc/1/edit
Use an object to store your data.
var events = [
{ exp : /\b(?:robbery|robberies)\b/i,
// (?: ) groups the alternatives so both \b apply to each of them
// \b word boundary
// robbery singular
// | or
// robberies plural
// \b word boundary
// /i case insensitive
name : "robbery",
count: 0
},
// other objects here
]
var i = events.length;
while( i-- ) {
var j = incidents.length;
while( j-- ) {
// only checks a particular event exists in incident rather than no. of occurrences
if( events[i].exp.test( incidents[j] ) ) {
events[i].count++;
}
}
}
Yes, that's one way to do it, although matching plain words with regex is a bit of overkill; in that case, you should be using indexOf as rbtLong suggested.
You can refine it further by:
appending the i flag (match both lowercase and uppercase characters).
adding possible word variations to your expression. robbery could become robber(y|ies), thus matching both the singular and plural forms of the word. car accident could be (car|truck|vehicle|traffic) accident.
Word boundaries \b
Don't use these. They require non-alphanumeric characters around the matching word and will prevent matching typos. You should make your queries as comprehensive as possible.
if (incident.match(/(car|truck|vehicle|traffic) accident/i)) {
trafficAccidents += 1;
}
else if (incident.match(/robber(y|ies)/i)) {
robberies += 1;
}
Notice how I discarded the g flag; it stands for "global match" and makes the parser continue searching the string after the first match. This seems unnecessary as just one confirmed occurrence is enough for your needs.
This website offers an excellent introduction to regular expressions
http://www.regular-expressions.info/tutorial.html