I'm trying to write a twitter bot that replies to people with a random haiku, but I'm having trouble creating the structure for it in javascript. In the code I've attached you'll see that using p5.js I've loaded two text files, one with a bunch of nouns and the other with a bunch of adjectives. I then split them by syllable using some code I found, but I can't figure out how to re-organize my list into separate arrays per syllable amount.
like "oneSyllable = []", "twoSyllable = []" etc.
Any help would be greatly appreciated, even just an explanation of what the regex does would help. This one: (/(?=[^laeiouy]es|ed|[^laeiouy]e)$/, '')
Also, is there an easier way to do this within javascript? Using p5 means I'll have to run it to the twitter bot using the command line, something which I still have to learn. If you have any additional information on making a haiku twitter bot please let me know! I've done a bunch of research but I can't find any source code for the several that are out there.
This is for a code final due soon and I'm way out of my depth!! Hope someone can help.
function setup() {
createCanvas(600, 6000);
fill(0);
loadStrings("./nouns.txt", doText);
loadStrings("./adjectives.txt", doText2);
}
function doText(data) {
for (var i=0; i<data.length; i++) {
text("Nouns list:", 5, 20);
text(data[i]+ ": " + (new_count(data[i])), 5, 20*i+50);
}
}
function doText2(data) {
for (var j=0; j<data.length; j++) {
text("Adjectives list:", 100, 20);
text(data[j]+ ": " + (new_count(data[j])), 100, 20*j+50);
}
}
function new_count(word) {
word = word.toLowerCase();
if(word.length <= 3) { return 1; }
word = word.replace(/(?=[^laeiouy]es|ed|[^laeiouy]e)$/, '');
word = word.replace(/^y/, '');
return word.match(/[aeiouy]{1,2}/g).length;
}
I suggest storing the words in a dictionary whose keys are syllable counts and whose values are lists of the words with that syllable count.
Since any object in JavaScript is an associative array, which is just another term for "dictionary", you may end up with the following function to re-organize your lists:
function groupBySyllableCount(wordList) {
var wordsBySyllableCount = {};
for (var i = 0, len = wordList.length; i < len; i++) {
var slblCount = new_count(wordList[i]);
if (wordsBySyllableCount[slblCount] === undefined) {
wordsBySyllableCount[slblCount] = [wordList[i]];
} else {
wordsBySyllableCount[slblCount].push(wordList[i]);
}
}
return wordsBySyllableCount;
}
// TEST & DEMO:
var nouns = ['air', 'time', 'community', 'year', 'people', 'woman', 'house', 'research'];
var nounsBySyllableCount = groupBySyllableCount(nouns);
console.log(nounsBySyllableCount);
function new_count(word) {
word = word.toLowerCase();
if(word.length <= 3) { return 1; }
word = word.replace(/(?:[^laeiouy]es|ed|[^laeiouy]e)$/, '');
word = word.replace(/^y/, '');
return word.match(/[aeiouy]{1,2}/g).length;
}
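Since the goal is a haiku, here is a rough sketch of how the grouped object could be used to assemble a line with a fixed number of syllables. buildLine and its random parameter are hypothetical names, not part of p5.js or of the code above. It assumes an object shaped like the output of groupBySyllableCount and keeps picking a random word that still fits, so it can fail (and returns null) when no one-syllable words are available to fill the last gap.

```javascript
// Hypothetical helper: build one line with exactly `target` syllables
// from an object shaped like the output of groupBySyllableCount().
// `random` is an optional replacement for Math.random (useful for testing).
function buildLine(wordsBySyllableCount, target, random) {
  random = random || Math.random;
  var line = [];
  var remaining = target;
  while (remaining > 0) {
    // Syllable counts that exist in the object and still fit in the line.
    var fits = Object.keys(wordsBySyllableCount)
      .map(Number)
      .filter(function (n) { return n >= 1 && n <= remaining; });
    if (fits.length === 0) {
      return null; // no word fits the remaining syllables
    }
    var count = fits[Math.floor(random() * fits.length)];
    var bucket = wordsBySyllableCount[count];
    line.push(bucket[Math.floor(random() * bucket.length)]);
    remaining -= count;
  }
  return line.join(' ');
}

// Usage: a 5-7-5 haiku from a small, hand-written grouping.
var grouped = { 1: ['air', 'time'], 2: ['woman'], 4: ['community'] };
console.log([buildLine(grouped, 5), buildLine(grouped, 7), buildLine(grouped, 5)].join('\n'));
```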
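As for the (?=[^laeiouy]es|ed|[^laeiouy]e)$ regex, it is meant to match
either es preceded by a character other than l, a, e, i, o, u, or y,
or ed
or e preceded by a character other than l, a, e, i, o, u, or y
but only if they come just before the end of the string (a word in your case), which is denoted by the $ anchor. The (?=...) is a positive look-ahead, used here merely to group the [^laeiouy]es, ed and [^laeiouy]e patterns so that each of them must be followed by the end of the string.
In fact, a positive look-ahead is the wrong tool here. A look-ahead consumes no characters, so (?=...)$ asks for the suffix to start at the very position where the string ends, which can never happen, and the original replace() removes nothing. Grouping with a capturing ((...)) or a non-capturing ((?:...)) group construct is enough. Note that the preceding consonant is part of the match, so it is removed together with the suffix, which is harmless when counting vowel groups. See the amended regex in my demo above.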
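A quick way to see the difference between the two patterns is to run both against the same words. The stripSuffix helper below is a hypothetical name used only for this comparison.

```javascript
// The pattern from the question (look-ahead) and the amended one (non-capturing group).
var lookahead = /(?=[^laeiouy]es|ed|[^laeiouy]e)$/;
var grouped = /(?:[^laeiouy]es|ed|[^laeiouy]e)$/;

// Hypothetical helper: remove whatever the pattern matches.
function stripSuffix(word, pattern) {
  return word.replace(pattern, '');
}

console.log(stripSuffix('house', lookahead));  // "house" (nothing is removed)
console.log(stripSuffix('house', grouped));    // "hou"
console.log(stripSuffix('jumped', grouped));   // "jump"
```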
TL;DR
According to the google docs, getResponseText() should return a string... but I get a message that claims it is an object when I try to sort it.. huh?
TypeError: Cannot find function sort in object
I was under the impression that a javascript string sort of works like an array, and it seems to behave like one because string[0] returns the first letter of a string..
DETAILS:
here is the sheet I am working
Hello everyone, I have a very unique situation where I need to update dirty strings (courtesy of an undesirable OCR import).
I have created a function that does the job but needs additional functionality.
Currently, the process goes like this:
enter your desired string
each cell (in your selection) is checked for that string
cells are updated with the desired string if the match is over 50% alike
the check works like this:
compare the first letter of desired string (txtT[0])
against the first letter of target cell (valT[0])
compare additional letters [x] up to the length of the longest string
for example:
desired string = "testing"
target cell = "t3st1ng"
the loop goes like this:
create a point system to do the math
(total points = length of longest string)
compare t and t ... if matching, add one point (+1 in this case because it matches)
compare e and 3 ... if matching, add one point (+0 in this case because it does not match)
compare s and s ... if matching, add one point (+1 in this case because it matches)
compare t and t ... if matching, add one point (+1 in this case because it matches)
compare i and 1 ... if matching, add one point (+0 in this case because it does not match)
compare n and n ... if matching, add one point (+1 in this case because it matches)
compare g and g ... if matching, add one point (+1 in this case because it matches)
points earned/total points = % of alike
The problem with this system is that it is based on the position of the letters in each string.
This causes problems when comparing strings like "testing" and "t est ing"
I tried to update it so that the first thing it does is SORT the string alphabetically, ignoring all special characters and non alphabetical characters.
That's when I came across an error:
TypeError: Cannot find function sort in object testing.
This does not make sense because my desired string is a string. See code where it says "this is where i get my error":
According to the google docs, getResponseText() should return a string... but I cannot call the sort method on the string.. which makes no sense!
function sandboxFunction() {
try {
var ui = SpreadsheetApp.getUi();
var ss = SpreadsheetApp.getActiveSpreadsheet();
var as = ss.getActiveSheet();
var ar = as.getActiveRange();
var sv = ui.prompt('enter desired string');
var txt = sv.getResponseText();
var txtT = txt.trim();
txtT = txtT.replace(/ /g, ''); //this is the trimmed comparison string
txtT = txtT.sort(); //***this is where I get my error***
ui.alert(txtT);
var vals = ar.getValues();
for (var r = 0; r < vals.length; r++) {
var row = vals[r];
for (var c = 0; c < row.length; c++) {
var val = row[c];
var valT = val.trim();
valT = valT.replace(/ /g, ''); // this is the trimmed comparison cell
ui.alert(valT);
//this is where we test the two
//test length
var tl = txtT.length;
var vl = valT.length;
if (vl < tl) {
ui.alert("different lengths.. applying fix");
for (vl; vl < tl; vl++) {
valT = valT.concat("x");
ui.alert(valT);
}
}
else if (tl < vl) {
ui.alert("different lengths.. applying fix");
for (tl; tl < vl; tl++) {
txtT = txtT.concat("x");
ui.alert(txtT);
}
}
if (valT.toUpperCase() == txtT.toUpperCase()) {
ui.alert("your strings match");
}
else {
var total = txtT.length;
var pts = 0;
for (var x = 0; x < total; x++) {
if (valT[x] == txtT[x]) {
pts++;
}
}
if (pts / total >= 0.5) {
ui.alert("at least 50% match, fixing text");
vals[r][c] = txt;
}
}
}
}
ar.setValues(vals);
}
catch (err) {
ui.alert(err);
}
}
You can't sort a string that way; sort is a method of arrays. Strings support index access such as txtT[0], but they do not have array methods.
You can convert your string to an array and then sort it:
var txtT = "This is a string".trim();
txtT = txtT.replace(/ /g, ''); //this is the trimmed comparison string
var txtArray = txtT.split(''); // Convert to array
var txtSorted = txtArray.sort(); // Use sort method
console.log(txtSorted);
See sort() docs
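For the comparison described in the question (sort alphabetically, ignoring anything that is not a letter), a minimal sketch could look like this. sortedLetters is a hypothetical helper name, and the code is plain JavaScript that is assumed, not verified, to behave the same inside Apps Script.

```javascript
// Hypothetical helper: lower-case the string, drop everything that is not a-z,
// then sort the remaining letters and join them back into a string.
function sortedLetters(str) {
  return str
    .toLowerCase()
    .replace(/[^a-z]/g, '')
    .split('')
    .sort()
    .join('');
}

console.log(sortedLetters('testing'));    // "eginstt"
console.log(sortedLetters('t est ing'));  // "eginstt"
```

Because both inputs reduce to the same sorted string, stray spaces no longer affect the comparison. Digits that replaced letters (as in "t3st1ng") are simply dropped, so a similarity score is still needed for those cases.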
This is about a Chrome Extension.
Suppose a user selects any text on a page, then clicks a button to save it. Via window.getSelection() I can get that text without the underlying html markup.
I store that text. For demo purposes, let's say the text is:
"John was much more likely to buy if he knew the price beforehand"
The next time the user visits the page, I want to find that text on the page. The issue is, the html for that text is actually:
<b>John was much more likely to buy if he knew the price <span class="italic">beforehand</span></b>
The second issue is that this system needs to work even if the selection is dirty, i.e. it starts/ends mid DOM node.
What I've built is a bit of a fat solution, so I am curious how I can make it more efficient and/or smaller. This is the whole thing:
text.split("").map(function(el, i, arr){
if(specials.includes(el)){
return "\\"+el;
}
return el;
})
.join("(?:\\s*<[^>]+>\\s*)*\\s*");
where text is the saved text and specials is
var specials = [
'/', '.', '*', '+', '?', '|',
'(', ')', '[', ']', '{', '}', '\\'
];
The process is:
Split text into single characters
For each character, check if it's a special char and if so, prepend it with \
Join all characters together with a regex that allows any whitespace or HTML tags in between
My question is, can it be done in a better way? I get the "bruteforcing" feeling with this solution and I don't know if it would actually cause lag on larger sites/selection texts.
Plus, it doesn't work for SPAs where text may update a bit after the DOM is ready.
Thank you for any input.
EDIT:
So initially I was using mark.js, which didn't handle this at all, but not 12 hours after I posted this question the maintainer released v8.0.0, which uses NodeList and handles my use case. The feature is "acrossElements", located here.
1. create a Range object
2. set it so that it spans the entire document from start to end
3. check if the string of interest is in its toString()
4. clone the range twice
5. apply binary search by moving the start/end points of the subranges to roughly their midpoint. This can be approximated by finding the first descendant with > 1 child nodes and then splitting the child list
6. goto 3
This should take roughly n log m steps, where n is the document text length and m the number of nodes.
Build the entire text representation of the document manually from each node with nodeType of Node.TEXT_NODE, saving the node reference and its text's start/end positions relative to the overall string in an array. Do it just once as DOM is slow, and you might want to search for multiple strings. Otherwise the other answer might be much faster (without actual benchmarks it's a moot point).
Apply HTML whitespace coalescing rules.
Otherwise you'll end up with huge spans of spaces and newline characters.
For example, Range.toString() doesn't strip them, meaning you'd have to convert your string to a RegExp with [\s\n\r]+ instead of spaces and all other special characters like {}()[]|^$*.?+ escaped.
Anyway, it'd be wise to use the converted RegExp on document.body.textContent before proceeding (easy to implement, many examples on the net, thus not included below).
A simplified implementation for plain-string search follows.
function TextMap(baseElement) {
this.baseElement = baseElement || document.body;
var textArray = [], textNodes = [], textLen = 0, collapseSpace = true;
var walker = document.createTreeWalker(this.baseElement, NodeFilter.SHOW_TEXT);
while (walker.nextNode()) {
var node = walker.currentNode;
var nodeText = node.textContent;
var parentName = node.parentNode.localName;
if (parentName==='noscript' || parentName==='script' || parentName==='style') {
continue;
}
if (parentName==='textarea' || parentName==='pre') {
nodeText = nodeText.replace(/^(\r\n|[\r\n])/, '');
collapseSpace = false;
} else {
nodeText = nodeText.replace(/^[\s\r\n]+/, collapseSpace ? '' : ' ')
.replace(/[\s\r\n]+$/, ' ');
collapseSpace = nodeText.endsWith(' ');
}
if (nodeText) {
var len = nodeText.length;
textArray.push(nodeText);
textNodes.push({
node: node,
start: textLen,
end: textLen + len - 1,
});
textLen += len;
}
}
this.text = textArray.join('');
this.nodeMap = textNodes;
}
TextMap.prototype.indexOf = function(str) {
var pos = this.text.indexOf(str);
if (pos < 0) {
return [];
}
var index1 = this.bisectLeft(pos);
var index2 = this.bisectRight(pos + str.length - 1, index1);
return this.nodeMap.slice(index1, index2 + 1)
.map(function(info) { return info.node });
}
TextMap.prototype.bisect =
TextMap.prototype.bisectLeft = function(pos) {
var a = 0, b = this.nodeMap.length - 1;
while (a < b - 1) {
var c = (a + b) / 2 |0;
if (this.nodeMap[c].start > pos) {
b = c;
} else {
a = c;
}
}
return this.nodeMap[b].start > pos ? a : b;
}
TextMap.prototype.bisectRight = function(pos, startIndex) {
var a = startIndex |0, b = this.nodeMap.length - 1;
while (a < b - 1) {
var c = (a + b) / 2 |0;
if (this.nodeMap[c].end > pos) {
b = c;
} else {
a = c;
}
}
return this.nodeMap[a].end >= pos ? a : b;
}
Usage:
var textNodes = new TextMap().indexOf('<span class="italic">');
When executed on this question's page:
[text, text, text, text, text, text]
Those are text nodes, so to access corresponding DOM elements use the standard .parentNode:
var textElements = textNodes.map(function(n) { return n.parentNode });
Array[6]
0: span.tag
1: span.pln
2: span.atn
3: span.pun
4: span.atv
5: span.tag
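The string-to-RegExp conversion mentioned above (whitespace runs matched loosely, special characters escaped) was left out of the answer. A minimal sketch of it, with toWhitespaceTolerantRegExp as a hypothetical name, might look like this:

```javascript
// Hypothetical helper: turn saved selection text into a RegExp in which
// every run of whitespace matches any run of whitespace (\s+) and every
// regex metacharacter is escaped so it is matched literally.
function toWhitespaceTolerantRegExp(str) {
  var parts = str.trim().split(/\s+/).map(function (part) {
    return part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  });
  return new RegExp(parts.join('\\s+'));
}

var re = toWhitespaceTolerantRegExp('knew the price (beforehand)');
console.log(re.test('he knew   the\n price (beforehand)')); // true
```

In a page this would be tried against something like document.body.textContent first, as a cheap check before building the full text map.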
Recently, I've been attempting to emulate a small language in jQuery and JavaScript, yet I've come across what I believe is an issue. I think that I may be parsing everything completely wrong.
In the code:
#name Testing
#inputs
#outputs
#persist
#trigger
print("Test")
The current way I am separating and parsing the string is by splitting all of the code into lines, and then reading through this lines array using searches and splits. For example, I would find the name using something like:
if(typeof lines[line] === 'undefined')
{
}
else
{
if(lines[line].search('#name') == 0)
{
name = lines[line].split(' ')[1];
}
}
But I think that I may be largely wrong on how I am handling parsing.
While reading through examples on how other people are handling parsing of code blocks like this, it appeared that people parsed the entire block, instead of splitting it into lines as I do. I suppose the question of the matter is, what is the proper and conventional way of parsing things like this, and how do you suggest I use it to parse something such as this?
In simple cases like this, regular expressions are your tool of choice:
var matches = code.match(/#name\s+(\w+)/);
var name = matches ? matches[1] : null; // match() returns null when there is no #name line
To parse "real" programming languages regexps are not powerful enough, you'll need a parser, either hand-written or automatically generated with a tool like PEG.
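Applied to the whole sample from the question, the regex approach could be sketched as follows. parseDirectives is a hypothetical helper, and it assumes that every directive sits on its own line, starts with #, and is optionally followed by a value.

```javascript
// Hypothetical helper: collect "#key value" lines into an object and
// keep every other non-empty line as the script body.
function parseDirectives(code) {
  var directives = {};
  var body = [];
  code.split(/\r?\n/).forEach(function (line) {
    var m = /^#(\w+)\s*(.*)$/.exec(line);
    if (m) {
      directives[m[1]] = m[2].trim();
    } else if (line.trim() !== '') {
      body.push(line);
    }
  });
  return { directives: directives, body: body };
}

var sample = '#name Testing\n#inputs\n#outputs\n#persist\n#trigger\nprint("Test")';
var parsed = parseDirectives(sample);
console.log(parsed.directives.name); // "Testing"
console.log(parsed.body);            // [ 'print("Test")' ]
```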
A general approach to parsing, that I like to take often is the following:
loop through the complete block of text, character by character.
if you find a character that signals the start of a unit, call a specialized subfunction to parse the next characters.
within each subfunction, call additional subfunctions if you find certain characters
return from every subfunction when a character is found that signals that the unit has ended.
Here is a small example:
var text = "#func(arg1,arg2)"
function parse(text) {
var i, max_i, ch, funcRes;
for (i = 0, max_i = text.length; i < max_i; i++) {
ch = text.charAt(i);
if (ch === "#") {
funcRes = parseFunction(text, i + 1);
i = funcRes.index;
}
}
console.log(funcRes);
}
function parseFunction(text, i) {
var max_i, ch, name, argsRes;
name = [];
for (max_i = text.length; i < max_i; i++) {
ch = text.charAt(i);
if (ch === "(") {
argsRes = parseArguments(text, i + 1);
return {
name: name.join(""),
args: argsRes.arr,
index: argsRes.index
};
}
name.push(ch);
}
}
function parseArguments(text, i) {
var max_i, ch, args, arg;
arg = [];
args = [];
for (max_i = text.length; i < max_i; i++) {
ch = text.charAt(i);
if (ch === ",") {
args.push(arg.join(""));
arg = [];
continue;
} else if (ch === ")") {
args.push(arg.join(""));
return {
arr: args,
index: i
};
}
arg.push(ch);
}
}
This example just parses function expressions that follow the syntax "#functionName(argumentName1, argumentName2, ...)". The general idea is to visit every character exactly once without the need to save current states like "hasSeenAtCharacter" or "hasSeenOpeningParentheses", which can get pretty messy when you parse large structures.
Please note that this is a very simplified example and it lacks all the error handling and stuff like that, but I hope the general idea can be seen. Note also that I'm not saying that you should use this approach all the time. It's a very general approach that can be used in many scenarios. But that doesn't mean that it can't be combined with regular expressions, for instance, if that makes more sense for some part of your text than parsing each individual character.
And one last remark: you can save yourself the trouble if you put the specialized parsing function inside the main parsing function, so that all functions have access to the same variable i.
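As a rough illustration of that last remark, the same parser can be written with the sub-parsers nested inside the main function so that they all share one index i. The names parseCall and readUntil are hypothetical, and like the example above this sketch has no real error handling.

```javascript
// Sketch: nested parsing functions share the cursor `i` through the closure,
// so nothing has to pass or return the current index.
function parseCall(text) {
  var i = 0;

  // Read characters until one of the `stops` characters (or the end) is reached.
  function readUntil(stops) {
    var start = i;
    while (i < text.length && stops.indexOf(text.charAt(i)) === -1) {
      i++;
    }
    return text.slice(start, i);
  }

  function parseArguments() {
    var args = [];
    while (i < text.length) {
      args.push(readUntil(',)'));
      if (text.charAt(i) === ')') {
        i++; // consume ")" and stop
        break;
      }
      i++; // consume ","
    }
    return args;
  }

  function parseFunction() {
    var name = readUntil('(');
    i++; // consume "("
    return { name: name, args: parseArguments() };
  }

  while (i < text.length) {
    if (text.charAt(i) === '#') {
      i++; // consume "#"
      return parseFunction();
    }
    i++;
  }
  return null; // no "#" found
}

console.log(parseCall('#func(arg1,arg2)')); // { name: 'func', args: [ 'arg1', 'arg2' ] }
```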
My question is similar to THIS question that hasn't been answered yet.
How can I make my code (or any javascript code that might be suggested?) find all possible solutions of a known string length with multiple missing characters in variation with repetition?
I'm trying to take a string of known character lengths and find missing characters from that string. For example:
var missing_string = "ov!rf!ow"; //where "!" are the missing characters
I'm hoping to run a script with a specific array such as:
var r = new Array("A","B","C","D","E","F","G","H","I","J","K",
"L","M","N","O","P","Q","R","S","T","U","V",
"W","X","Y","Z",0,1,2,3,4,5,6,7,8,9);
To find all the possible variations with repetition of those missing characters to get a result of:
ovArfAow
ovBrfAow
ovCrfAow
...
ovBrfBow
ovBrfCow
...
etc //ignore the case insensitive, just to emphasize the example
and of course, eventually find ovErfLow within all the variations with repetition.
I've been able to make it work with a single missing character. However, when I put 2 missing characters into my code, it obviously repeats the same array character for both missing characters. That is GREAT for repetition, but I also need the results without repetition, and I might have 3-4 missing characters, which may or may not be repeated. Here's what I have so far:
var r = new Array("A","B","C","D","E","F","G","H","I","J","K",
"L","M","N","O","P","Q","R","S","T","U","V",
"W","X","Y","Z",0,1,2,3,4,5,6,7,8,9);
var missing_string = "he!!ow!r!d";
var bt_lng = missing_string.length;
var bruted="";
for (z=0; z<r.length; z++) {
for(var x=0;x<bt_lng;x++){
for(var y=0;y<r.length;y++){
if(missing_string.charAt(x) == "!"){
bruted += r[z];
break;
}
else if(missing_string.charAt(x) == r[y]){
bruted += r[y];
}
}
}
console.log("br: " + bruted);
bruted="";
}
This works GREAT with just ONE "!":
helloworAd
helloworBd
helloworCd
...
helloworLd
However with 2 or more "!", I get:
heAAowArAd
heBBowBrBd
heCCowCrCd
...
heLLowLrLd
which is good for the repetition part but I also need to test all possible array M characters in each missing character spot.
Maybe the following function in pure javascript is a possible solution for you. It uses Array.prototype.reduce to create the cartesian product c of the given alphabet x, whereby its power n depends on the count of the exclamation marks in your word w.
function combinations(w) {
var x = new Array(
"A","B","C","D","E","F","G","H","I","J","K",
"L","M","N","O","P","Q","R","S","T","U","V",
"W","X","Y","Z",0,1,2,3,4,5,6,7,8,9
),
n = (w.match(/!/g) || []).length, // match() returns null when w contains no "!"
x_n = new Array(),
r = new Array(),
c = null;
for (var i = n; i > 0; i--) {
x_n.push(x);
}
c = x_n.reduce(function(a, b) {
var c = [];
a.forEach(function(a) {
b.forEach(function(b) {
c.push(a.concat([b]));
});
});
return c;
}, [[]]);
for (var i = 0, j = 0; i < c.length; i++, j = 0) {
r.push(w.replace(/\!/g, function(s, k) {
return c[i][j++];
}));
}
return r;
}
Call it like this console.log(combinations("ov!rf!ow")) in your browser console.
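The question also asks for fillings in which no character is used twice. One possible way to cover both cases is a small recursive fill. fillMissing and its allowRepeats flag are hypothetical names for this sketch, not part of the answer above.

```javascript
// Sketch: replace every "!" in `pattern` with a character from `alphabet`.
// When `allowRepeats` is false, a character already used for an earlier "!"
// is skipped for the later ones.
function fillMissing(pattern, alphabet, allowRepeats) {
  var results = [];
  function fill(prefix, rest, used) {
    var pos = rest.indexOf('!');
    if (pos === -1) {
      results.push(prefix + rest); // nothing left to fill
      return;
    }
    alphabet.forEach(function (ch) {
      ch = String(ch); // the alphabet in the question mixes strings and numbers
      if (!allowRepeats && used.indexOf(ch) !== -1) {
        return;
      }
      fill(prefix + rest.slice(0, pos) + ch, rest.slice(pos + 1), used.concat(ch));
    });
  }
  fill('', pattern, []);
  return results;
}

var alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'.split('');
console.log(fillMissing('ov!rf!ow', alphabet, true).length);  // 1296 (36 * 36)
console.log(fillMissing('ov!rf!ow', alphabet, false).length); // 1260 (36 * 35)
```

The number of results grows as the alphabet size raised to the number of missing characters, so 3-4 gaps with 36 characters already means tens of thousands to over a million strings.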
I have an array with incidents that has happened, that are written in free text and therefore aren't following a pattern except for some keywords, eg. "robbery", "murderer", "housebreaking", "car accident" etc. Those keywords can be anywhere in the text, and I want to find those keywords and add those to categories, eg. "Robberies".
In the end, when I have checked all the incidents I want to have a list of categories like this:
Robberies: 14
Murder attempts: 2
Car accidents: 5
...
The array elements can look like this:
incidents[0] = "There was a robbery on Amest Ave last night...";
incidents[1] = "There has been a report of a murder attempt...";
incidents[2] = "Last night there was a housebreaking in...";
...
I guess the best here is to use regular expressions to find the keywords in the texts, but I really suck at regexp and therefore need some help here.
The regular expressions is not correct below, but I guess this structure would work?
Is there a better way of doing this that keeps the code DRY?
var trafficAccidents = 0,
robberies = 0,
...
function FindIncident(incident) {
if (incident.match(/car accident/g)) {
trafficAccidents += 1;
}
else if (incident.match(/robbery/g)) {
robberies += 1;
}
...
}
Thanks a lot in advance!
The following code shows an approach you can take. You can test it here
var INCIDENT_MATCHES = {
trafficAccidents: /(traffic|car) accident(?:s){0,1}/ig,
robberies: /robbery|robberies/ig,
murder: /murder(?:s){0,1}/ig
};
function FindIncidents(incidentReports) {
var incidentCounts = {};
var incidentTypes = Object.keys(INCIDENT_MATCHES);
incidentReports.forEach(function(incident) {
incidentTypes.forEach(function(type) {
if(typeof incidentCounts[type] === 'undefined') {
incidentCounts[type] = 0;
}
var matchFound = incident.match(INCIDENT_MATCHES[type]);
if(matchFound){
incidentCounts[type] += matchFound.length;
};
});
});
return incidentCounts;
}
Regular expressions make sense, since you'll have a number of strings that meet your 'match' criteria, even if you only consider the differences in plural and singular forms of 'robbery'. You also want to ensure that your matching is case-insensitive.
You need to use the 'global' modifier on your regexes so that you match strings like "Murder, Murder, murder" and increment your count by 3 instead of just 1.
This allows you to keep the relationship between your match criteria and incident counters together. It also avoids the need for global counters (granted, INCIDENT_MATCHES is a global variable here, but you can readily put that elsewhere and take it out of the global scope).
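The effect of the global flag on match() can be checked directly with the "Murder, Murder, murder" example:

```javascript
var text = 'Murder, Murder, murder';

// With the g flag, match() returns every match in the string.
var withGlobal = text.match(/murder(?:s){0,1}/ig);
// Without it, match() returns only the first match (plus any capture groups).
var withoutGlobal = text.match(/murder(?:s){0,1}/i);

console.log(withGlobal.length);    // 3
console.log(withoutGlobal.length); // 1
console.log(withoutGlobal[0]);     // "Murder"
```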
Actually, I would kind of disagree with you here . . . I think string functions like indexOf will work perfectly fine.
I would use JavaScript's indexOf method which takes 2 inputs:
string.indexOf(value,startPos);
So one thing you can do is define a simple temporary variable as your cursor as such . . .
function FindIncident(phrase, word) {
  var cursor = 0;
  var wordCount = 0;
  var pos;
  while ((pos = phrase.indexOf(word, cursor)) > -1) {
    cursor = pos + word.length; // continue after this match so the loop terminates
    ++wordCount;
  }
  return wordCount;
}
I have not tested the code but hopefully you get the idea . . .
Be particularly careful of the starting position if you do use it.
RegEx makes my head hurt too. ;) If you're looking for exact matches and aren't worried about typos and misspellings, I'd search the incident strings for substrings containing the keywords you're looking for.
incident = incident.toLowerCase();
if (incident.search("car accident") > -1) { // search() returns -1 when nothing is found
  trafficAccidents += 1;
}
else if (incident.search("robbery") > -1) {
  robberies += 1;
}
...
Use an array of objects to store all the many different categories you're searching for, complete with an appropriate regular expression and a count member, and you can write the whole thing in four lines.
var categories = [
{
regexp: /\brobbery\b/i
, display: "Robberies"
, count: 0
}
, {
regexp: /\bcar accidents?\b/i
, display: "Car Accidents"
, count: 0
}
, {
regexp: /\bmurder\b/i
, display: "Murders"
, count: 0
}
];
var incidents = [
"There was a robbery on Amest Ave last night..."
, "There has been a report of a murder attempt..."
, "Last night there was a housebreaking in..."
];
for(var x = 0; x<incidents.length; x++)
for(var y = 0; y<categories.length; y++)
if (incidents[x].match(categories[y].regexp))
categories[y].count++;
Now, no matter what you need, you can simply edit one section of code, and it will propagate through your code.
This code has the potential to categorize each incident in multiple categories. To prevent that, just add a 'break' statement to the if block.
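A minimal sketch of that variant, with the break in place and the totals collected under their display names, could look like this. countByCategory is a hypothetical function name, and the sample incidents are made up for the illustration.

```javascript
// Sketch: count each incident once, under the first category whose regexp matches.
function countByCategory(incidents, categories) {
  var counts = {};
  categories.forEach(function (c) { counts[c.display] = 0; });
  incidents.forEach(function (text) {
    for (var y = 0; y < categories.length; y++) {
      if (categories[y].regexp.test(text)) {
        counts[categories[y].display]++;
        break; // first matching category wins
      }
    }
  });
  return counts;
}

var sampleCategories = [
  { regexp: /\brobbery\b/i, display: 'Robberies' },
  { regexp: /\bmurder\b/i, display: 'Murders' }
];
var sampleIncidents = [
  'There was a robbery, and later a murder, on the same street...',
  'There has been a report of a murder attempt...',
  'Last night there was a housebreaking in...'
];
console.log(countByCategory(sampleIncidents, sampleCategories)); // { Robberies: 1, Murders: 1 }
```

The regexps deliberately have no g flag: a global regexp remembers its position between test() calls, which would make repeated use across incidents unreliable.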
You could do something like this which will grab all words found on each item in the array and it will return an object with the count:
var words = ['robbery', 'murderer', 'housebreaking', 'car accident'];
function getAllIncidents( incidents ) {
var re = new RegExp('('+ words.join('|') +')', 'i')
, result = {};
incidents.forEach(function( txt ) {
var match = ( re.exec( txt ) || [,0] )[1];
match && (result[ match ] = ++result[ match ] || 1);
});
return result;
}
console.log( getAllIncidents( incidents ) );
//^= { housebreaking: 1, car accident: 2, robbery: 1, murderer: 2 }
This is more of a quick prototype, but it could be improved with plurals and multiple keywords.
Demo: http://jsbin.com/idesoc/1/edit
Use an object to store your data.
events = [
{ exp : /\brobbery|robberies\b/i,
// \b word boundary
// robbery singular
// | or
// robberies plural
// \b word boundary
// /i case insensitive
name : "robbery",
count: 0
},
// other objects here
]
var i = events.length;
while( i-- ) {
var j = incidents.length;
while( j-- ) {
// only checks a particular event exists in incident rather than no. of occurrences
if( events[i].exp.test( incidents[j] ) ) {
events[i].count++;
}
}
}
Yes, that's one way to do it, although matching plain-words with regex is a bit of overkill — in which case, you should be using indexOf as rbtLong suggested.
You can refine it further by:
appending the i flag (match lowercase and uppercase characters).
adding possible word variations to your expression. robbery could be translated into robber(y|ies), thus matching both singular and plural variations of the word. car accident could be (car|truck|vehicle|traffic) accident.
Word boundaries \b
Don't use this. It requires a non-word character (or the start or end of the text) on both sides of your matching word, which prevents matching typos and run-together words. You should make your queries as broad as possible.
if (incident.match(/(car|truck|vehicle|traffic) accident/i)) {
trafficAccidents += 1;
}
else if (incident.match(/robber(y|ies)/i)) {
robberies += 1;
}
Notice how I discarded the g flag; it stands for "global match" and makes the parser continue searching the string after the first match. This seems unnecessary as just one confirmed occurrence is enough for your needs.
This website offers an excellent introduction to regular expressions
http://www.regular-expressions.info/tutorial.html