I need to count whitespace in a string (newlines, tabs and spaces) using JavaScript. I have this code to count spaces:
function countThis() {
var string = document.getElementById("textt").value;
var spaceCount = (string.split(" ").length - 1);
document.getElementById("countRedundants").value = spaceCount;
}
This works fine and gives me the total number of spaces.
The problem is that I want a run of adjacent spaces, newlines or tabs to count only once. I can't solve this and would appreciate some help or a pointer in the right direction.
Thanks, Gustav
You can use a regular expression in your split:
var spaceCount = (string.split(/\s+/).length - 1);
Use regex in order to achieve this.
For instance, you could check how many matches of one or more tabs, spaces or newlines exist, and use their count.
The regex rule is [\t\s\n]+, meaning that a run of one or more tabs, spaces or newlines matches the rule. (\s already covers tabs and newlines, so \s+ alone is equivalent.)
For JavaScript:
var test = "Test Test Test\nTest\nTest\n\n";
var spacesCount = test.split(/[\t\s\n]+/g).length - 1;
console.log(spacesCount);
Regex is a pretty efficient way of doing this. Alternatively, you would have to iterate over the string manually and match the cases where one or more spaces, tabs or newlines occur in a row.
Note that what you are attempting is what a compiler does to recognize specific character sequences as specific elements, called tokens. This practice is called lexical analysis, or tokenization. Since regex exists, there is no need to perform this check manually, unless you want to do something very advanced or specific.
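As a minimal sketch of the "count the matches" idea described above: String.prototype.match with a global regex returns every run of whitespace, so the number of runs is the length of that array. The name countWhitespaceRuns is made up for illustration.

```javascript
// Hypothetical helper: counts runs of consecutive whitespace
// (spaces, tabs, newlines), so "a  \t b" contains one run.
function countWhitespaceRuns(text) {
  // match() returns null when nothing matches, hence the fallback to [].
  const runs = text.match(/\s+/g) || [];
  return runs.length;
}

console.log(countWhitespaceRuns("Test Test  Test\n\tTest")); // 3
```

Unlike the split-based count, this does not rely on subtracting one from the number of pieces.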
Here is an ugly solution without using any regex, performance wise it's optimal, but it could be made more pythonic.
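def countThis(s):
    count = 0
    i = 0
    while i < len(s):
        # skip over the current word
        while i < len(s) and not s[i].isspace():
            i += 1
        # a whitespace run starts here: count it once
        if i < len(s):
            count += 1
            i += 1
        # skip the rest of the whitespace run
        while i < len(s) and s[i].isspace():
            i += 1
    return count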
print(countThis("str"))
print(countThis(" str toto"))
print(countThis("Hello, world!"))
Stéphane Ammar's solution is probably the easiest on the eyes, but if you want something more performant:
function countGaps(str) {
let gaps = 0;
const isWhitespace = ch => ' \t\n\r\v'.indexOf(ch) > -1;
for (let i = 0; i < str.length; i++)
if (isWhitespace(str[i]) && !isWhitespace(str[i - 1]))
++gaps;
return gaps;
}
I'm using JavaScript, and my text is:
Dana's places, we're having people coming to us people wanna buy condos. They want to move quickly and we're just losing out on a lot of great places. Really what would you say this?
If I have an index position of 6, I want to get just the first sentence: Dana's places, we're having people coming to us people wanna buy condos.
If I have an index position of 80, I want to get just the second sentence: They want to move quickly and we're just losing out on a lot of great places.
How can I parse the sentence based on position?
If I understand correctly, you should be able to just
Split on periods.
Get the length of the strings.
Determine where the index lands based on sentence length.
Since you need to split on "?" and "!" as well, loop through the sentences and split them again on those characters.
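The steps above can be sketched roughly as follows. This is only an illustration under a simplifying assumption: it splits on periods alone, and the name sentenceAt is hypothetical.

```javascript
// Hypothetical helper following the steps above; it splits on periods only.
function sentenceAt(index, text) {
  let start = 0; // position in `text` where the current piece begins
  for (const piece of text.split('.')) {
    const end = start + piece.length; // position of the period that split() removed
    if (index <= end) {
      return piece.trim();
    }
    start = end + 1; // step over the period
  }
  return undefined; // index is past the end of the text
}

const sampleText = "One two. Three four. Five six.";
console.log(sentenceAt(3, sampleText));  // "One two"
console.log(sentenceAt(12, sampleText)); // "Three four"
```

Handling "?" and "!" would mean splitting each piece again, or switching to a regex.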
Honestly, probably cleaner to use a regex and group.
Here is the regex version
const paragraph = "Dana's places, we're having people coming to us people wanna buy condos. They want to move quickly and we're just losing out on a lot of great places. Really what would you say this?"
/**
 * Finds sentence by character index
 * @param index
 * @param paragraph
 */
function findSentenceByCharacterIndex(index, paragraph) {
const regex = /([^.!?]*[.!?])/gm
const matches = paragraph.match(regex);
let cursor = 0;
let sentenceFound;
for (const sentence of matches) {
sentenceFound = sentence;
cursor += sentence.length;
if( cursor > index )
{
break;
}
}
return sentenceFound;
}
const found = findSentenceByCharacterIndex(5, paragraph);
You can split on the periods. The String object has a prototype method called split that returns an array of the pieces of the split string. In the example below, str is a variable that holds your string.
const str = 'first sentence. Second sentence. third sentence';
const sentences = str.split('.');
sentences[0] // first sentence
sentences[1] // second sentence, etc
Instead of trying to use String.prototype.split, it may be best to do some traditional character-by-character parsing of the string. Since we know what index we're looking for, we can simply look around it for the beginning and end of a sentence.
How does a sentence end? Typically with a ., a ! or a ?. Knowing this, we can test for these characters and decide what part of the string to slice off and return to the program. If there is no sentence ender (i.e. ?, ! or .) before our chosen index, we assume that the current sentence begins at the start of the string (0). We do the same after our chosen index, except that we use str.length if there is no sentence ender after the index.
let str = "Dana's places, we're having people coming to us people wanna buy condos. They want to move quickly and we're just losing out on a lot of great places. Really what would you say this?";
let getSentence = (ind, str) => {
let beg, end, flag, sentenceEnder = ["!", ".", "?"];
Array.from(str).forEach((c, c_index) => {
if(c_index < ind && sentenceEnder.includes(c)) {
beg = c_index + 1;
}
if (flag) return;
if (c_index >= ind && sentenceEnder.includes(c)) {
end = c_index;
flag = true;
}
});
end = end || str.length;
beg = beg || 0;
return str.slice(beg, end);
}
console.log(getSentence(10, str));
console.log(getSentence(80, str));
I'm using RPG Maker MV which is a game creator that uses JavaScript to create plugins. I have a plugin in JavaScript already, however I'm trying to edit a part of the plugin so that it basically checks if a certain string exists in a character in the game and if it does, then sets specific variables to numbers within that string.
for (var i = 0; i < page.list.length; i++) {
if (page.list[i].code == 108 && page.list[i].parameters[0].contains("<post:" + (n) + "," + (n) + ">")) {
var post = page.list[i].parameters[0];
var array = post.split(',');
this._origMovement.x = Number(array[1]);
this._origMovement.y = Number(array[1]);
break;
};
};
So I know the first 2 lines work and contains works when I only put a specific string. However I can't figure out how to check for 2 numbers that are separated by a comma and wrapped in '<>' tags, without knowing what the numbers would be.
Then it needs to extract those numbers and assign one to this._origMovement.x and the other to this._origMovement.y.
Any help would be greatly appreciated.
This is one of those rare cases where I'd use a regular expression. If you haven't come across regular expressions before I suggest reading an introduction to them, such as this one: https://regexone.com/
In your case, you probably want something like this:
var myRegex = /<post:(\d+),(\d+)>/;
var matches = myParameter.match(myRegex);
this._origMovement.x = matches[1]; //the first number
this._origMovement.y = matches[2]; //the second number
The myRegex variable is a regular expression that looks for the pattern you describe, and has 2 capture groups which look for a string of one or more digits (\d+ means "one or more digits"). The result of the .match() call gives you an array containing the entire match and the results of the capture groups.
If you want to allow for decimal numbers, you'll need to use a different capture group that allows for a decimal point, such as ([\d\.]+), which means "a sequence of one or more digits and decimal points", or the more sophisticated (\d+\.?\d*), which is "a sequence of one or more digits, followed by an optional decimal point, followed by zero or more digits".
There are lots of good tutorials around to help you write good regular expressions, and sites that will help you live-test your expressions to make sure they work correctly. They're a powerful tool, but be careful not to over-use them!
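Two details are worth keeping in mind with the snippet above: .match() returns null when the pattern is not found, and capture groups come back as strings. A minimal sketch that handles both, where parsePostTag is a hypothetical helper and not part of RPG Maker MV:

```javascript
// Hypothetical helper: returns {x, y} as numbers, or null if the tag is absent.
function parsePostTag(comment) {
  const match = /<post:(\d+),(\d+)>/.exec(comment);
  if (match === null) {
    return null; // no <post:x,y> tag in this comment
  }
  // Capture groups are strings, so convert them explicitly.
  return { x: Number(match[1]), y: Number(match[2]) };
}

console.log(parsePostTag("<post:12,7>")); // { x: 12, y: 7 }
console.log(parsePostTag("no tag here")); // null
```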
Got it to work. For anyone who may ever be interested, the code is below.
for (var i = 0; i < page.list.length; i++) {
if (page.list[i].code == 108 && page.list[i].parameters[0].contains("<post:")) {
var myRegex = /<post:(\d+),(\d+)>/;
var matches = page.list[i].parameters[0].match(myRegex);
this._origMovement.x = matches[1]; //the first number
this._origMovement.y = matches[2]; //the second number
break;
}
};
I am a JavaScript/GoogleScript Rookie, so please bear with me. I am trying to create a Script in Google Docs that will be able to locate all instances of words having exactly 10 characters and append an element to them which would in turn give me a url.
Example : Here is my link pineapples
I would like to find the 10-character string, here pineapples, and add google.com/ in front of each string that has a length of 10.
Giving me "Here is my link google.com/pineapples."
function myFunction() {
var str = document.getElementById(str.length=10);
var res = str.replace("str.length=10", "br"+"str.length=10");
This seems completely wrong, but all I can come up with for now.
You can make it work by using a Regex and then using a backreference to refer to the matching group.
Regex: (\S{10})
it has 3 parts
\S matches anything other than a space, tab or newline.
{10} matches the above character exactly 10 times.
() is the capturing group, which is referenced later in the replacement string as $1.
You can get more information here, which explains the above regex in detail.
You may change it to fit your need.
var stringVal = "Here is my link pineapples";
var stringReplaced = stringVal.replace(/(\S{10})/, "google.com/$1");
console.log(stringReplaced);
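Note that (\S{10}) also matches the first 10 characters of a longer word, and without the g flag only the first match is replaced. If whole words of exactly 10 characters are wanted, one possible sketch follows. The helper name and its parameters are made up for illustration, and it assumes words are separated by whitespace.

```javascript
// Hypothetical helper: prefixes every whitespace-delimited word that is
// exactly `length` characters long.
function prefixWordsOfLength(text, length, prefix) {
  // (^|\s) captures what precedes the word; (?=\s|$) requires the word to end here.
  const pattern = new RegExp("(^|\\s)(\\S{" + length + "})(?=\\s|$)", "g");
  return text.replace(pattern, (whole, before, word) => before + prefix + word);
}

console.log(prefixWordsOfLength("Here is my link pineapples", 10, "google.com/"));
// "Here is my link google.com/pineapples"
```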
Here is a possible solution:
Split your string using space as a separator (this will give you an array)
Test the length of each part in a loop
Prepend google.com/ if a part has 10 characters
Join your array and enjoy your transformed string
var str = "Here is my link pineapples",
arr = str.split(' ');
for (var i = 0; i < arr.length; i++) {
if (arr[i].length === 10) {
arr[i] = 'google.com/' + arr[i];
}
}
console.log(arr.join(' '));
Okay so bear with me, but my idea is as follows:
Is the text that you want to replace always within elements of the same class? If so, you could do something like this (it uses jQuery, hope you don't mind):
function myFunction(){
  // '.myClass' selects elements by class name
  $('.myClass').each(function(){
    var innerText = $(this).text();
    var substring = innerText.substr(0,9);
    $(this).text(substring);
  });
}
This regular expression looks for words with 3 or less characters so that a non-breaking space can be placed in before them.
smallwords = /(\s|^)(([a-zA-Z-_(]{1,2}('|’)*[a-zA-Z-_,;]{0,1}?\s)+)/gi, // words with 3 or less characters
Is there a way, to make the expression only apply itself to 2 words in a row?
Example
Currently, the string:
Singapore, the USA and Vietnam.
will be turned into:
Singapore, the USA and Vietnam.
if the expression only applied to 2 words in a row it would show
Singapore, the USA and Vietnam.
here's the full script:
ragadjust = function (s, method) {
if (document.querySelectorAll) {
var eles = document.querySelectorAll(s),
elescount = eles.length,
smallwords = /(\s|^)(([a-zA-Z-_(]{1,2}('|’)*[a-zA-Z-_,;]{0,1}?\s)+)/gi; // words with 3 or less characters
while (elescount-- > 0) {
var ele = eles[elescount],
elehtml = ele.innerHTML;
if (method == 'small-words' || method == 'all')
// replace small words
elehtml = elehtml.replace(smallwords, function(contents, p1, p2) {
return p1 + p2.replace(/\s/g, '&nbsp;');
});
ele.innerHTML = elehtml;
}
}
};
This is from RagAdjust
I know that this is not what you are asking for, but I figured a code review wouldn't hurt:
I think the word boundary \b is better, in this case, than \s|^.
You have the A-Z and a-z characters in your match, yet you also use the i (case-insensitive) flag.
{0,1}? is redundant - either use the ? to make it optional, or use {0,1} to make it match zero or one times.
If you are going to have a dash in your character set, put it at the end so that the regex isn't ambiguous; for example, [a-z_-] is much better than [a-z-_].
If you don't need to capture a value, use the non-capturing parenthesis (?:).
So, here's your cleaned up regex:
/\b((?:[a-z_(-]{1,2}(?:'|’)*[a-z_,;-]?\s)+)/gi
I'm pretty sure the '|’ bit is some sort of typo when you pasted this in from your editor. Not sure what it is supposed to be.
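To see what the cleaned-up pattern does, here is a small sketch that applies it to the example sentence from the question. The helper name glueSmallWords and the '&nbsp;' replacement string are assumptions about the intent, not code taken from RagAdjust.

```javascript
// The cleaned-up pattern from the review above.
const smallWords = /\b((?:[a-z_(-]{1,2}(?:'|’)*[a-z_,;-]?\s)+)/gi;

// Hypothetical helper: replaces the whitespace inside each run of small
// words with a non-breaking space entity (an assumption about the intent).
function glueSmallWords(html) {
  return html.replace(smallWords, (run) => run.replace(/\s/g, '&nbsp;'));
}

console.log(glueSmallWords("Singapore, the USA and Vietnam."));
// "Singapore, the&nbsp;USA&nbsp;and&nbsp;Vietnam."
```

This is the same chaining behaviour the question describes, so the cleanup alone does not limit the match to two words in a row.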
This doesn't quite solve the issue the way you suggested, but it does reduce the number of non-breaking spaces that end up in the string, and it might give you some insight. Because you have the trailing g on both regex replacements, you're doing a global replace. If you instead loop with some maximum number of fixes, things work out a little differently.
Try changing the maximum number of replacements. I think the other thing that happens here (in my modified code) is that after you make one replacement, the spaces and small words are gone because you jammed in an nbsp, which may or may not solve the issue you're trying to get around.
Here's my replacement function (simplified from your original). The basic modification is to remove the g from the regexes and add the loop. You should check out the CodePen to see the full deal.
var new_ragadjust = function (contents) {
var MAX_NUMBER_OF_REPLACEMENTS = 5;
var smallwords = /(\s|^)(([a-zA-Z-_(]{1,2}('|’)*[a-zA-Z-_,;]{0,1}?\s)+)/i; // words with 3 or less characters
var ii = 0;
var c = contents;
for (;ii < MAX_NUMBER_OF_REPLACEMENTS; ++ii) {
c = c.replace(smallwords, function(contents, p1, p2) {
return p1 + p2.replace(/\s/, '&nbsp;');
});
}
return c;
};
Codepen
http://cdpn.io/DKLtc
Also, to see the difference, you need to inspect elements to actually see where the nbsps end up (as you probably already knew).
My string is: (as(dh(kshd)kj)ad)... ()()
How is it possible to count the parentheses with a regular expression? I would like to select the string which begins at the first opening bracket and ends before the ...
Applying that to the above example, that means I would like to get this string: (as(dh(kshd)kj)ad)
I tried to write it, but this doesn't work:
var str = "(as(dh(kshd)kj)ad)... ()()";
document.write(str.match(/(.*)/m));
As I said in the comments, contrary to popular belief (don't believe everything people say) matching nested brackets is possible with regex.
The downside of using it is that you can only do it up to a fixed level of nesting. And for every additional level you wish to support, your regex will be bigger and bigger.
But don't take my word for it. Let me show you. The regex \([^()]*\) matches one level. For up to two levels see the regex here. To match your case, you'd need:
\(([^()]*|\(([^()]*|\([^()]*\))*\))*\)
It would match the first parenthesized group, (as(dh(kshd)kj)ad), in: (as(dh(kshd)kj)ad)... ()()
Check the DEMO HERE and see what I mean by fixed level of nesting.
And so on. To keep adding levels, all you have to do is change the last [^()]* part to ([^()]*|\([^()]*\))* (check three levels here). As I said, it will get bigger and bigger.
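A quick way to check the fixed-depth behaviour in JavaScript is sketched below; the variable names are made up for illustration.

```javascript
// Pattern from the answer above: balances up to three levels of nesting.
const threeLevels = /\(([^()]*|\(([^()]*|\([^()]*\))*\))*\)/;

const sample = "(as(dh(kshd)kj)ad)... ()()";
const found = sample.match(threeLevels);
console.log(found ? found[0] : null); // "(as(dh(kshd)kj)ad)"

// A fourth level is more than this pattern can balance from the outermost
// bracket, so only an inner group is matched.
const deeper = "(a(b(c(d))))".match(threeLevels);
console.log(deeper ? deeper[0] : null); // "(b(c(d)))"
```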
See Tim's answer for why this won't work, but here's a function that'll do what you're after instead.
function getFirstBracket(str){
var pos = str.indexOf("("),
bracket = 0;
if(pos===-1) return false;
for(var x=pos; x<str.length; x++){
var char = str.substr(x, 1);
bracket = bracket + (char=="(" ? 1 : (char==")" ? -1 : 0));
if(bracket==0) return str.substr(pos, (x+1)-pos);
}
return false;
}
getFirstBracket("(as(dh(kshd)kj)ad)... ()(");
It is possible, and your approach was quite good:
match returns an array if there are any hits; if so, you can read the array's length.
var str = "(as(dh(kshd)kj)ad)... ()()",
match = str.match(new RegExp('.*?(?:\\(|\\)).*?', 'g')),
count = match ? match.length : 0;
This regular expression will get all parts of your text that include round brackets. See http://gskinner.com/RegExr/ for a nice online regex tester.
Now you can use count for all brackets.
match will deliver an array that looks like:
["(", "as(", "dh(", "kshd)", "kj)", "ad)", "... (", ")", "(", ")"]
Now you can start sorting your results:
var newStr = '', open = 0, close = 0;
for (var n = 0, m = match.length; n < m; n++) {
if (match[n].indexOf('(') !== -1) {
open++;
newStr += match[n];
} else {
if (open > close) newStr += match[n];
close++;
}
if (open === close) break;
}
... and newStr will be (as(dh(kshd)kj)ad)
This is probably not the nicest code but it will make it easier to understand what you're doing.
With this approach there is no limit of nesting levels.
This is not possible with a JavaScript regex. Generally, regular expressions can't handle arbitrary nesting because that can no longer be described by a regular language.
Several modern regex flavors do have extensions that allow for recursive matching (like PHP, Perl or .NET), but JavaScript is not among them.
No. Regular expressions express regular languages. Finite automata (FA) are the machines that recognise regular languages. An FA is, as its name implies, finite in memory. With finite memory, the FA cannot remember an arbitrary number of parentheses, which is what you would need in order to do what you want.
I suggest you use an algorithm involving a counter in order to solve your problem.
try this jsfiddle
var str = "(as(dh(kshd)kj)ad)... ()()";
document.write(str.match(/\((.*?)\.\.\./m)[1] );