How to get total sum of matches from a loop? - javascript

I'm trying to loop through an array to check whether any of the words in the array are in a body of text:
for(var i = 0; i < wordArray.length; i++ ) {
if(textBody.indexOf(wordArray[i]) >= 1) {
console.log("One or two words.");
// do something
}
else if (textBody.indexOf(wordArray[i]) >= 3) {
console.log("Three or more words.");
// do something
}
else {
console.log("No words match.");
// do something
}
}
where >= 1 and >= 3 are supposed to determine the number of matched words (although it might just be determining their index position in the array? As, in its current state it will console.log hundreds of duplicate strings from the if / else statement).
How do I set the if / else statement to do actions based off of the amount of matched words?
Any help would be greatly appreciated!

Try this:
for (var i = 0; i < wordArray.length; i++) {
var regex = new RegExp('\\b' + wordArray[i] + '\\b', 'ig');
var matches = textBody.match(regex);
var numberOfMatches = matches ? matches.length : 0;
console.log(wordArray[i] + ' found ' + numberOfMatches + " times");
}
indexOf will do partial matches. For example "This is a bust".indexOf("bus") would match even though that is probably not what you want. It is better to use a regular expression with the word boundary token \b to eliminate partial word matches. In the RegExp constructor you need to escape the backslash, so \b becomes \\b. The regex uses the i flag to ignore case and the g flag to find all matches. Replace the console.log line with your if/else logic based on the numberOfMatches variable.
UPDATE: Per your clarification you would change the above to
var numberOfMatches = 0;
for (var i = 0; i < wordArray.length; i++) {
var regex = new RegExp('\\b' + wordArray[i] + '\\b', 'ig');
var matches = textBody.match(regex);
numberOfMatches += matches ? matches.length : 0;
}
console.log(numberOfMatches);
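Putting the pieces together, a minimal sketch of the total count combined with the if/else thresholds from the question might look like this. The helper names escapeRegExp and countMatches and the sample text are hypothetical. The escaping step is an addition, on the assumption that the words could contain regex metacharacters.

```javascript
// Hypothetical helper: escape regex metacharacters so each word is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: total number of whole-word, case-insensitive matches.
function countMatches(textBody, wordArray) {
  var total = 0;
  for (var i = 0; i < wordArray.length; i++) {
    var regex = new RegExp('\\b' + escapeRegExp(wordArray[i]) + '\\b', 'ig');
    var matches = textBody.match(regex);
    total += matches ? matches.length : 0;
  }
  return total;
}

// Made-up sample input for illustration.
var total = countMatches('This is a test. Check the test, then check again.', ['check', 'test']);

// Test the larger threshold first, otherwise the smaller one catches everything.
if (total >= 3) {
  console.log('Three or more words.');
} else if (total >= 1) {
  console.log('One or two words.');
} else {
  console.log('No words match.');
}
```

Note that the branches are ordered from the largest threshold down; with >= 1 first, the >= 3 branch could never run.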

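indexOf() provides the index of the first match, not the number of matches. So currently you're testing first whether the first match sits at index 1 or later, then whether it sits at index 3 or later; neither test counts the number of matches. (And since any index that is >= 3 is also >= 1, the else if branch can never run.)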
I can think of a couple different approaches off the top of my head that would work, but I'm not going to write them for you because this sounds like school work. One would be to use match: see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match and Count number of matches of a regex in Javascript
If you're scared of using regex, or can't be assed to spend the time learning how they work, you could get the index of the match, and if it matches make a substring excluding the portion up to that match, and test if it matches again, while incrementing a counter. indexOf() will return -1 if no matches are found.
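A minimal sketch of that indexOf-only idea, for what it's worth. The function name countOccurrences is hypothetical, and instead of building substrings it passes the optional second argument of indexOf (the position to start searching from), which amounts to the same thing. Like plain indexOf, it counts partial-word matches too.

```javascript
// Hypothetical helper: count non-overlapping occurrences of `word` in `text`
// using only indexOf.
function countOccurrences(text, word) {
  if (word.length === 0) return 0; // an empty search string would never advance
  var count = 0;
  var pos = text.indexOf(word);
  while (pos !== -1) {
    count++;
    // Resume the search just past the end of the previous match.
    pos = text.indexOf(word, pos + word.length);
  }
  return count;
}

console.log(countOccurrences('test a test of tests', 'test'));
```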

You can split the text into words with a RegExp and then find all occurrences of your word this way:
var text = "word1, word2, word word word word3"
var allWords = text.split(/\b/);
var getOccurrenceCount = function(word, allWords) {
return allWords.reduce(function(count, nextWord) {
count += word == nextWord ? 1 : 0;
return count;
}, 0);
};
getOccurrenceCount("word", allWords);

This may help you:
You have to use .match instead of .indexOf (which only gets the index of the first occurrence inside the string)
var textBody = document.getElementById('inside').innerHTML;
var wordArray = ['check','test'];
for(var i = 0; i < wordArray.length; i++ ) {
var regex = new RegExp( wordArray[i], 'g' );
var wordCount = (textBody.match(regex) || []).length;
console.log(wordCount + " times the word ["+ wordArray[i] +"]");
}
<body>
<p id="inside">
this is your test, check the test, how many test words check
</p>
</body>

I would first put the array into a hashmap, something like
_.each(array, function(a){map[a]=1})
Second split string into array by space and marks.
Loop through the new array to check if the word exist in the first map.
Make sure to compare string/words without cases.
This approach will help you improve the run time efficiency to linear.
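A rough sketch of those steps, assuming plain ASCII words. It uses a built-in Set rather than Underscore's _.each and an object map, and the function name countKnownWords is hypothetical.

```javascript
// Hypothetical helper: count how many words of `text` appear in `wordArray`,
// ignoring case. Building the Set is one pass over the words, and scanning
// the text is one pass over its tokens.
function countKnownWords(text, wordArray) {
  var lookup = new Set(wordArray.map(function (w) { return w.toLowerCase(); }));
  // Split on anything that is not a letter, digit or apostrophe (spaces and marks).
  var tokens = text.toLowerCase().split(/[^a-z0-9']+/);
  var count = 0;
  tokens.forEach(function (token) {
    if (lookup.has(token)) count++;
  });
  return count;
}

console.log(countKnownWords('This is your test, check the TEST.', ['check', 'test']));
```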

Yes .indexOf gives you the first position of the word in the string. Many methods available to count a word in a string, I'm sharing my crazy version :
function matchesCount(word, str) {
return (' ' + str.replace(/[^A-Za-z]+/gi,' ') + ' ')
.split(' '+word+' ').length - 1;
}
console.log(matchesCount('test', 'A test to test how many test in this'));

Related

Insert character at every occurrence of another character in a string javascript

I have a string 3-5,10-15,20 and I want to insert p before every number. I wanted to just find each '-' and ',' and insert a 'p' after each one, and then one at the beginning.
Looping over it requires manipulating the string you're looping over, which wasn't working quite well for me. I feel like this is such a simple task, but I'm getting stuck. Any help would be appreciated.
The final result should look like p3-p5,p10-p15,p20.
This is what I tried:
input = `p${input}`;
for (let i = 0; i < input.length; i++) {
if (input[i] === '-' || input[i] === ',') {
input = `${input.slice(0, i + 1)}p${input.slice(i + 1)}`;
}
}
You could search for digits and replace the digits with added value.
var string = '3-5,10-15,20',
result = string.replace(/\d+/g, 'p$&');
console.log(result);
You could use a regex to match the - and the , character and replace them using a group.
const input = '3-5,10-15,20';
console.log(input.replace(/([-,])/g, "$1p"));

Checking if combination of any amount of strings exists

I'm solving a puzzle and I have an idea of how to solve this problem, but I would like some guidance and hints.
Suppose I have the following, Given n amount of words to input, and m amount of word combos without spaces, I will have some functionality as the following.
4
this
is
my
dog
5
thisis // outputs 1
thisisacat // 0, since a or cat weren't in the four words
thisisaduck // 0, no a or duck
thisismy // 1 this,is,my is among the four words
thisismydog // 1
My thoughts
First What I was thinking of doing is storing those first words into an array. After that, I check if any of those words is the first word of those 5 words
Example: check if this is in the first word thisis. It is! Great, now remove that this, from thisis to get simply just is, now delete the original string that corresponded to that equality and keep iterating over the left overs (now is,my,dog are available). If we can keep doing this process, until we get an empty string. We return 1, else return 0!
Are my thoughts on the right track? I think this would be a good approach (By the way I would like to implement this in javascript)
Sorting words from long to short may in some cases help to find a solution quicker, but it is not a guarantee. Sentences that contain the longest word might only have a solution if that longest word is not used.
Take for instance this test case:
Words: toolbox, stool, boxer
Sentence: stoolboxer
If "toolbox" is taken as a word in that sentence, then the remaining characters cannot be matched with other valid words. Yet, there is a solution, but only if the word "toolbox" is not used.
Solution with a Regular Expression
When regular expressions are allowed as part of the solution, then it is quite simple. For the above example, the regular expression would be:
^(toolbox|stool|boxer)*$
If a sentence matches that expression, it is a solution. If not, then not. This is quite straightforward, and doesn't really require an algorithm. All is done by the regular expression interpreter. Here is a snippet:
var words = ['this','is','a','string'];
var sentences = ['thisis','thisisastring','thisisaduck','thisisastringg','stringg'];
var regex = new RegExp('^(' + words.join('|') + ')*$');
sentences.forEach(sentence => {
// search returns a position. It should be 0:
console.log(sentence + ': ' + (sentence.search(regex) ? 'No' : 'Yes'));
});
But using regular expressions in an algorithm-challenge feels like cheating: you don't really write the algorithm, but rely on the regular expression implementation to do the job for you.
Without Regular Expressions
You could use this algorithm: first check whether a word matches at the start of the input sentence, and if so, remove that first occurrence from it. Then repeat this for the remaining part of the sentence. If this can be repeated until no characters are left over, you have a solution.
If characters are left over which cannot be matched with any word... well, then you cannot really conclude there is no solution for that sentence. It might be that some earlier made word choice was the wrong one, and there was an alternative. So to cope with that, your algorithm could backtrack and try other words.
This principle can be implemented through recursion. To gain memory-efficiency, you could leave the original sentence intact, and work with an index in that sentence instead.
The algorithm is implemented in arrow-function testString:
var words = ['this','is','a','string'];
var sentences = ['thisis','thisisastring','thisisaduck','thisisastringg','stringg'];
var testString = (words, str, i = 0) =>
i >= str.length || words.some( word =>
str.substr(i, word.length) == word && testString(words, str, i + word.length)
);
sentences.forEach(sentence => {
console.log(sentence + ': ' + (testString(words, sentence) ? 'Yes' : 'No'));
});
Or, the same in non-arrow-function syntax:
var words = ['this','is','a','string'];
var sentences = ['thisis','thisisastring','thisisaduck','thisisastringg','stringg'];
var testString = function (words, str, i = 0) {
return i >= str.length || words.some(function (word) {
return str.substr(i, word.length) == word
&& testString(words, str, i + word.length);
});
}
sentences.forEach(function (sentence) {
console.log(sentence + ': ' + (testString(words, sentence) ? 'Yes' : 'No'));
});
... and without some(), forEach() or ternary operator:
var words = ['this','is','a','string'];
var sentences = ['thisis','thisisastring','thisisaduck','thisisastringg','stringg'];
function testString (words, str, i = 0) {
if (i >= str.length) return true;
for (var k = 0; k < words.length; k++) {
var word = words[k];
if (str.substr(i, word.length) == word
&& testString(words, str, i + word.length)) {
return true;
}
}
}
for (var n = 0; n < sentences.length; n++) {
var sentence = sentences[n];
if (testString(words, sentence)) {
console.log(sentence + ': Yes');
} else {
console.log(sentence + ': No');
}
}
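The backtracking versions above may re-examine the same position in the sentence many times when several word choices lead there. As a sketch only, a memoized variant could remember the positions already known to be dead ends. The name canSegment and the failed array are hypothetical, and startsWith is used in place of substr.

```javascript
// Hypothetical memoized variant of testString: failed[i] records that no
// combination of words can cover str from index i to the end.
function canSegment(words, str) {
  var failed = new Array(str.length + 1).fill(false);

  function from(i) {
    if (i === str.length) return true; // nothing left over: success
    if (failed[i]) return false;       // already known dead end
    for (var k = 0; k < words.length; k++) {
      var word = words[k];
      // Skip empty words so the recursion always advances.
      if (word.length > 0 && str.startsWith(word, i) && from(i + word.length)) {
        return true;
      }
    }
    failed[i] = true;
    return false;
  }

  return from(0);
}

console.log(canSegment(['toolbox', 'stool', 'boxer'], 'stoolboxer'));
console.log(canSegment(['this', 'is', 'a', 'string'], 'thisisaduck'));
```

Each index is marked as failed at most once, so the work is bounded by the sentence length times the number of words (times the cost of each comparison).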
Take the 4 words, put them into a regex.
Use that regex to split each string.
Take the length of the resulting array (subtract one for the initial length of one).
var size = 'thisis'.split(/this|is|my|dog/).length - 1
Or if your list of words is an array
var search = new RegExp(words.join('|'))
var size = 'thisis'.split(search).length - 1
Either way you are splitting up the string by the list of words you have defined.
You can sort the words by length to ensure that larger words are matched first by
words.sort(function (a, b) { return b.length - a.length })
Here is the solution for anyone interested
var input = ['this','is','a','string']; // This will work for any input, but this is a test case
var orderedInput = input.sort(function(a,b){
return b.length - a.length;
});
var inputRegex = new RegExp(orderedInput.join('|'));
// our combination of words can be any size in an array; an array is used here since prompt in js is spammy
var testStrings = ['thisis','thisisastring','thisisaduck','thisisastringg','stringg'];
var foundCombos = (regex,str) => !str.split(regex).filter(str => str.length).length;
var finalResult = testStrings.reduce((all,str)=>{
all[str] = foundCombos(inputRegex,str);
if (all[str] === true){
all[str] = 1;
}
else{
all[str] = 0;
}
return all;
},{});
console.log(finalResult);

Replace string between second set of [ and ]

I am learning regex, and I got a doubt. Let's consider
var s = "YYYN[1-20]N[]NYY";
Now, I want to replace/insert the '1-8' between [ and ] at its second occurrence.
Then output should be
YYYN[1-20]N[1-8]NYY
For that I had tried using replace and passing a function through it as shown below:
var nth = 0;
s = s.replace(/\[([^)]+)\]/g, function(match, i, original) {
nth++;
return (nth === 1) ? "1-8" : match;
});
alert(s); // But It wont work
I think that regex is not matching the string that I am using.
How can I fix it?
Your regex \[([^)]+)\] will not match empty square brackets since + requires at least 1 character other than ). I guess you wanted to write \[[^\]]*\].
Here is a fix for your solution:
var s = "YYYN[1-20]N[]NYY";
var nth = 0;
s = s.replace(/\[[^\]]*\]/g, function (match, i, original) {
nth++;
return (nth === 2) ? "[1-8]" : match;
});
alert(s);
Here is another way of doing it:
var s = "YYYN[1-20]N[]NYY";
var nth = 0;
s = s.replace(/(.*)\[\]/, "$1[1-8]");
alert(s);
The regex (.*)\[\] matches and captures into Group 1 greedily as much text as possible (thus we get the last set of empty []), and then matches empty square brackets. Then we restore the text before [] with the $1 backreference and add our string 1-8.
If there are only two occurrences of square brackets, then this will work:
/(.*\[.*?\].*\[).*?(\].*)/
This RegEx has “YYYN[1-20]N[” as the first capturing group and “]NYY” as the second.
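A short sketch of how that regex could be applied. It uses a replacer function instead of a replacement string, because writing $1 immediately followed by the literal 1-8 would produce "$11-8", which is easy to misread as a reference to group 11. The parameter names before and after are just illustrative.

```javascript
var s = "YYYN[1-20]N[]NYY";
var re = /(.*\[.*?\].*\[).*?(\].*)/;

// The replacer receives the whole match followed by the two captured groups.
var result = s.replace(re, function (match, before, after) {
  return before + "1-8" + after;
});

console.log(result);
```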
I suggest using simple split and join operations:
var s = "YYYN[1-20]N[]NYY";
var arr = s.split(/\[/)
arr[2] = '1-8' + arr[2]
var r = arr.join('[')
//=> YYYN[1-20]N[1-8]NYY
You can use following regex :
var s = "YYYN[1-20]N[]NYY";
var nth = 0;
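s = s.replace(/([^[]+\[(?:[^[]+)\][^[]+)\[[^[]*\](.+)/, "$1[1-8]$2");
alert(s);
The first part, ([^[]+\[(?:[^[]+)\][^[]+), matches the text up to and including the first substring between [] plus whatever follows it before the next [. Then \[[^[]*\] matches the second pair of brackets, which is the one you want to replace (the * lets it match even when the brackets are empty), and the last part, (.+), matches the rest of your string.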

Using regular expression to split a string

I have a string which I need to separate correctly:
self.view.frame.size.height = 44
I need to get only view, frame, size, and height. And I need to do it with a regular expression.
So far I've tried a lot of variants, none of them are even close to what I want to get. And my code now looks like this:
var testString = 'self.view.frame.size.height = 44'
var re = new RegExp('\\.(.*)\\.', "g")
var array = re.exec(testString);
console.log('Array length is ' + array.length)
for (var i = 0; i < array.length; i++) {
console.log('<' + array[i] + ">");
}
And it doesn't work at all:
Array length is 2
<.view.frame.size.>
<view.frame.size>
I'm new at Javascript, so maybe I want the impossible, let me know.
Thanks.
In Javascript, executing a regexp with the g modifier doesn't return all the matches at once. You have to execute it repeatedly on the same input string, and each one returns the next match.
You also need to change the regexp so it only returns one word at a time. .* is greedy, so it returns the longest possible match, so it was returning all the words between the first and last .. [^.]* will match a sequence of non-dot characters, so it will just return one word. You can't include the second . in the regexp, because that will interfere with the repetition -- each repetition starts searching after the end of the previous match, and there's no beginning . after the ending . of the word. Also, there's no . after height, so the last word won't match it.
EDIT: I've changed the regexp to use \w* instead of [^.]*, because it was grabbing the whole height = 44 string instead of just height.
var testString = 'self.view.frame.size.height = 44';
var re = /\.(\w*)/g;
var array = [];
var result;
while (result = re.exec(testString)) {
array.push(result[1]);
}
console.log('Array length is ' + array.length)
for (var i = 0; i < array.length; i++) {
console.log('<' + array[i] + ">");
}
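In environments that support String.prototype.matchAll (ES2020), the exec loop above can probably be shortened. This is only a sketch under that assumption, and it uses \w+ rather than \w* so that a dot with nothing after it is skipped.

```javascript
var testString = 'self.view.frame.size.height = 44';

// matchAll requires the g flag and returns an iterator of match arrays;
// Array.from collects it and maps each match to its first capture group.
var names = Array.from(testString.matchAll(/\.(\w+)/g), function (m) {
  return m[1]; // the word captured after each dot
});

console.log(names);
```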
If you're sure that your data will be always in the same format you can use this:
function parse (string) {
return string.split(" = ").shift().split(".").splice(1);
}
In your context, split is a MUCH better option:
var str = "self.view.frame.size.height = 44";
var bits1 = str.split(" ")[0];
var bits2 = bits1.split(".");
bits2.shift(); // get rid of the unwanted self
console.log(bits2);

Count number of words in string using JavaScript

I am trying to count the number of words in a given string using the following code:
var t = document.getElementById('MSO_ContentTable').textContent;
if (t == undefined) {
var total = document.getElementById('MSO_ContentTable').innerText;
} else {
var total = document.getElementById('MSO_ContentTable').textContent;
}
countTotal = cword(total);
function cword(w) {
var count = 0;
var words = w.split(" ");
for (i = 0; i < words.length; i++) {
// inner loop -- do the count
if (words[i] != "") {
count += 1;
}
}
return (count);
}
In that code I am getting data from a div tag and sending it to the cword() function for counting. However, the return value is different in IE and Firefox. Is there any change required in the regular expression? One thing I have seen is that both browsers send the same string, so the problem must be inside the cword() function.
[edit 2022, based on comment] Nowadays, one would not extend the native prototype this way. A way to extend the native prototype without the danger of naming conflicts is to use an ES20xx Symbol. Here is an example of a wordcounter using that.
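What follows is a minimal sketch of the Symbol idea, not the answer's original linked example, which is not reproduced here. The symbol name countWords and the whitespace-based definition of a word are assumptions made for illustration.

```javascript
// Hypothetical symbol key; because every symbol is unique, it cannot clash
// with a countWords property added to String.prototype by other code.
const countWords = Symbol('countWords');

Object.defineProperty(String.prototype, countWords, {
  value: function () {
    // Trim first so leading/trailing whitespace does not create empty entries.
    const words = this.trim().split(/\s+/);
    return words[0] === '' ? 0 : words.length;
  },
  configurable: true
});

console.log('this string has five words'[countWords]());
console.log(''[countWords]());
```

Only code that holds a reference to the countWords symbol can call the method, which is what keeps it out of the way of other libraries.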
Old answer: you can use split and add a wordcounter to the String prototype:
if (!String.prototype.countWords) {
String.prototype.countWords = function() {
return this.length && this.split(/\s+\b/).length || 0;
};
}
console.log(`'this string has five words'.countWords() => ${
'this string has five words'.countWords()}`);
console.log(`'this string has five words ... and counting'.countWords() => ${
'this string has five words ... and counting'.countWords()}`);
console.log(`''.countWords() => ${''.countWords()}`);
I would prefer a RegEx only solution:
var str = "your long string with many words.";
var wordCount = str.match(/(\w+)/g).length;
alert(wordCount); //6
The regex is
\w+ between one and unlimited word characters
/g global - don't stop after the first match
The brackets create a group around every match. So the length of all matched groups should match the word count.
This is the best solution I've found:
function wordCount(str) {
var m = str.match(/[^\s]+/g)
return m ? m.length : 0;
}
This inverts whitespace selection, which is better than \w+ because \w only matches ASCII letters, digits and _ (see http://www.ecma-international.org/ecma-262/5.1/#sec-15.10.2.6)
If you're not careful with whitespace matching you'll count empty strings, strings with leading and trailing whitespace, and all whitespace strings as matches while this solution handles strings like ' ', ' a\t\t!\r\n#$%() d ' correctly (if you define 'correct' as 0 and 4).
You can make a clever use of the replace() method although you are not replacing anything.
var str = "the very long text you have...";
var counter = 0;
// lets loop through the string and count the words
str.replace(/\w+/g,function (a) {
// for each word found increase the counter value by 1
counter++;
})
alert(counter);
the regex can be improved to exclude html tags for example
//Count words in a string or what appears as words :-)
function countWordsString(string){
var counter = 1;
// Collapse runs of whitespace into one space and trim both ends
string = string.replace(/\s+/g, ' ').trim();
// An empty or all-whitespace string has no words
if (string === '') return 0;
// Each remaining space separates two words, so count the spaces
string.replace(/ /g, function (a) {
counter++;
});
return counter;
}
var numberWords = countWordsString('the very long text you have...');
