JavaScript: How to find and retrieve numbers from a string

I'm using RPG Maker MV which is a game creator that uses JavaScript to create plugins. I have a plugin in JavaScript already, however I'm trying to edit a part of the plugin so that it basically checks if a certain string exists in a character in the game and if it does, then sets specific variables to numbers within that string.
for (var i = 0; i < page.list.length; i++) {
if (page.list[i].code == 108 && page.list[i].parameters[0].contains("<post:" + (n) + "," + (n) + ">")) {
var post = page.list[i].parameters[0];
var array = post.split(',');
this._origMovement.x = Number(array[1]);
this._origMovement.y = Number(array[1]);
break;
};
};
So I know the first 2 lines work and contains works when I only put a specific string. However I can't figure out how to check for 2 numbers that are separated by a comma and wrapped in '<>' tags, without knowing what the numbers would be.
Then it needs to extract those numbers and assign one to this._origMovement.x and the other to this._origMovement.y.
Any help would be greatly appreciated.

This is one of those rare cases where I'd use a regular expression. If you haven't come across regular expressions before I suggest reading an introduction to them, such as this one: https://regexone.com/
In your case, you probably want something like this:
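var myRegex = /<post:(\d+),(\d+)>/;
var matches = myParameter.match(myRegex);
if (matches) { // match() returns null when the pattern is not found
this._origMovement.x = Number(matches[1]); // the first number (captures are strings)
this._origMovement.y = Number(matches[2]); // the second number
}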
The myRegex variable is a regular expression that looks for the pattern you describe, and has 2 capture groups which look for a string of one or more digits (\d+ means "one or more digits"). The result of the .match() call gives you an array containing the entire match and the results of the capture groups.
If you want to allow for decimal numbers, you'll need to use a different capture group that allows for a decimal point, such as ([\d\.]+), which means "a sequence of one or more digits and decimal points", or the more sophisticated (\d+\.?\d*), which means "a sequence of one or more digits, followed by an optional decimal point, followed by zero or more digits".
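As a rough sketch of the decimal-aware variant, the snippet below wraps the pattern in a small helper. The name parsePostTag and the sample string are made up for illustration; they are not part of RPG Maker MV or of the original plugin.

```javascript
// Hypothetical helper (not part of RPG Maker MV): extracts x and y from "<post:x,y>".
function parsePostTag(text) {
  // (\d+\.?\d*) = one or more digits, an optional decimal point, then optional digits
  var match = /<post:(\d+\.?\d*),(\d+\.?\d*)>/.exec(text);
  if (!match) {
    return null; // no tag found in the text
  }
  // Capture groups are strings, so convert them explicitly.
  return { x: Number(match[1]), y: Number(match[2]) };
}

var parsed = parsePostTag("some note <post:12,7.5> more text");
console.log(parsed); // { x: 12, y: 7.5 }
```

Returning null when nothing matches lets the caller decide what to do with comments that don't carry the tag.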
There are lots of good tutorials around to help you write good regular expressions, and sites that will help you live-test your expressions to make sure they work correctly. They're a powerful tool, but be careful not to over-use them!

Got it to work. For anyone who may ever be interested, the code is below.
for (var i = 0; i < page.list.length; i++) {
if (page.list[i].code == 108 && page.list[i].parameters[0].contains("<post:")) {
var myRegex = /<post:(\d+),(\d+)>/;
var matches = page.list[i].parameters[0].match(myRegex);
this._origMovement.x = matches[1]; //the first number
this._origMovement.y = matches[2]; //the second number
break;
}
};

Related

Get id from url

I have the following example url: #/reports/12/expense/11.
I need to get the id just after reports, i.e. 12. What I am asking is the most suitable way to do this. I can search for reports in the url and take the content just after it, but if at some point I decide to change the url, I will have to change my algorithm.
What do you think is the best way here? Some code examples would also be very helpful.

It's hard to write code that is future-proof since it's hard to predict the crazy things we might do in the future!
However, if we assume that the id will always be the first string of consecutive digits in the URL, then you could simply look for that:
function getReportId(url) {
var match = url.match(/\d+/);
return (match) ? Number(match[0]) : null;
}
getReportId('#/reports/12/expense/11'); // => 12
getReportId('/some/new/url/report/12'); // => 12

You should use a regular expression to find the number inside the string. Passing the regular expression to the string's .match() method will return an array containing the matches based on the regular expression. In this case, the item of the returned array that you're interested in will be at the index of 1, assuming that the number will always be after reports/:
var text = "#/reports/12/expense/11";
var id = text.match(/reports\/(\d+)/);
alert(id[1]);
\d+ here means that you're looking for one or more consecutive digits.

var text = "#/reports/12/expense/11";
var id = text.match("#/[a-zA-Z]*/([0-9]*)/[a-zA-Z]*/")
console.log(id[1])
Regex explanation:
#/ matches the characters #/ literally
[a-zA-Z]* - matches a word
/ matches the character / literally
1st Capturing group - ([0-9]*) - this matches a number.
[a-zA-Z]* - matches a word
/ matches the character / literally

Regular expressions can be tricky (and expensive), so if you can efficiently do the same thing without them, you usually should. Looking at your URL format, you would probably want to put at least a few constraints on it; otherwise the problem becomes very complex. For instance, you probably want to assume the value always appears directly after its key, so in your sample reports=12 and expense=11, but reports and expense could be switched (e.g. expense/11/reports/12) and you would get the same result.
I would just use string split:
var parts = url.split("/");
for(var i = 0; i < parts.length; i++) {
if(parts[i] === "reports"){
this.reportValue = parts[i+1];
i++; // skip the value that was just read
} else if(parts[i] === "expense"){
this.expenseValue = parts[i+1];
i++; // skip the value that was just read
}
}
So this way your key/value parts can appear anywhere in the array
Note: you will also want to check that i+1 is in the range of the parts array. But that would just make this sample code ugly and it is pretty easy to add in. Depending on what values you are expecting (or not expecting) you might also want to check that values are numbers using isNaN
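For what it's worth, here is one way the range check and the isNaN check from the note might look. It is only a sketch: the function name readUrlValues and the idea of passing in a list of keys are assumptions made for this example, not something from the question.

```javascript
// Hypothetical helper: reads numeric "key/value" pairs out of a slash-separated URL fragment.
function readUrlValues(url, keys) {
  var parts = url.split("/");
  var values = {};
  // Stop one short of the end so that parts[i + 1] is always in range.
  for (var i = 0; i + 1 < parts.length; i++) {
    var value = Number(parts[i + 1]);
    // Number("") is 0, so empty segments are rejected explicitly.
    if (keys.indexOf(parts[i]) !== -1 && parts[i + 1] !== "" && !isNaN(value)) {
      values[parts[i]] = value;
      i++; // skip the value that was just consumed
    }
  }
  return values;
}

var a = readUrlValues("#/reports/12/expense/11", ["reports", "expense"]);
var b = readUrlValues("#/expense/11/reports/12", ["reports", "expense"]);
console.log(a.reports, a.expense); // 12 11
console.log(b.reports, b.expense); // 12 11
```

Both orderings give the same result, which is the property the split-based approach is after.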

Regular expression, specify a number of loops

This regular expression looks for words with 3 or fewer characters so that a non-breaking space can be placed before them.
smallwords = /(\s|^)(([a-zA-Z-_(]{1,2}('|’)*[a-zA-Z-_,;]{0,1}?\s)+)/gi, // words with 3 or less characters
Is there a way, to make the expression only apply itself to 2 words in a row?
Example
Currently, the string:
Singapore, the USA and Vietnam.
will be turned into:
Singapore, the&nbsp;USA&nbsp;and&nbsp;Vietnam.
if the expression only applied to 2 words in a row it would show
Singapore, the USA and Vietnam.
here's the full script:
ragadjust = function (s, method) {
if (document.querySelectorAll) {
var eles = document.querySelectorAll(s),
elescount = eles.length,
smallwords = /(\s|^)(([a-zA-Z-_(]{1,2}('|’)*[a-zA-Z-_,;]{0,1}?\s)+)/gi; // words with 3 or fewer characters
while (elescount-- > 0) {
var ele = eles[elescount],
elehtml = ele.innerHTML;
if (method == 'small-words' || method == 'all')
// replace small words
elehtml = elehtml.replace(smallwords, function(contents, p1, p2) {
return p1 + p2.replace(/\s/g, '&nbsp;');
});
ele.innerHTML = elehtml;
}
}
};
This is from RagAdjust

I know that this is not what you are asking for, but I figured a code review wouldn't hurt:
I think the word boundary \b is better, in this case, than \s|^.
You have the A-Z and a-z characters in your match, yet you use the i case-insensitive flag.
{0,1}? is redundant - either use the ? to make it optional, or use {0,1} to make it match zero or one times.
If you are going to have a dash in your character set, put it at the end so that you don't have an ambiguous regex; for example, [a-z_-] is much better than [a-z-_].
If you don't need to capture a value, use the non-capturing parenthesis (?:).
So, here's your cleaned up regex:
/\b((?:[a-z_(-]{1,2}(?:'|’)*[a-z_,;-]?\s)+)/gi
I'm pretty sure the '|’ bit is some sort of typo when you pasted this in from your editor. Not sure what it is supposed to be.

This doesn't quite solve the issue the way you suggested but it does reduce the number of non breaking spaces that end up in the string. But it might give you some insight. Because you have the trailing g on both regex replacements, you're doing global replace. If you instead loop it with some max number of fixes, things work out a little differently.
Try changing the max number of replacements. I think the other thing that happens here (in my modified code) is that after you make one replacement, the spaces and small words are gone because you jammed in a nbsp which may or may not solve the issue you're trying to get around.
Here's my replacement function (simplified from your original). The basic mod is to remove the g from the regexes and add the loop. You should check out the codepen to see the full deal.
var new_ragadjust = function (contents) {
var MAX_NUMBER_OF_REPLACEMENTS = 5;
var smallwords = /(\s|^)(([a-zA-Z-_(]{1,2}('|’)*[a-zA-Z-_,;]{0,1}?\s)+)/i; // words with 3 or fewer characters
var ii = 0;
var c = contents;
for (; ii < MAX_NUMBER_OF_REPLACEMENTS; ++ii) {
c = c.replace(smallwords, function(contents, p1, p2) {
return p1 + p2.replace(/\s/, '&nbsp;');
});
}
return c;
};
Codepen
http://cdpn.io/DKLtc
Also, to see the difference, you need to inspect elements to actually see where the nbsps end up (as you probably already knew).
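Coming back to the question as asked, one possible way to cap each run at two short words is to swap the + quantifier for {1,2}. This is only a sketch, under the assumption that "2 words in a row" means "at most two consecutive short words per match". The function name limitShortWordRuns is made up, and the replacement inserts the literal text &nbsp; so the result is visible in the console.

```javascript
// Hypothetical sketch: same pattern as in the question, but with {1,2} instead of +
// so that each match covers at most two consecutive short words.
function limitShortWordRuns(text) {
  var smallwords = /(\s|^)(([a-zA-Z-_(]{1,2}('|’)*[a-zA-Z-_,;]{0,1}?\s){1,2})/gi;
  return text.replace(smallwords, function (contents, p1, p2) {
    // p1 is the leading whitespace (or empty at the start), p2 is the run of short words
    return p1 + p2.replace(/\s/g, "&nbsp;");
  });
}

var adjusted = limitShortWordRuns("Singapore, the USA and Vietnam.");
console.log(adjusted); // Singapore, the&nbsp;USA&nbsp;and Vietnam.
```

Because the space after "USA" is consumed by the first match, "and" no longer has a leading whitespace character to match against and is left alone. Whether that is the desired behaviour depends on how the two-word limit is meant to work.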

Javascript Regex - 9 chars long, starting with 'SO-' and ending with 6 numbers

Regular expressions are simply evil in my mind and no matter how many times I read any documentation I just cannot seem to grasp even the simplest of expressions!
I am trying to write what must be a very simple expression to query a variable in javascript but I just cannot get it to work properly.
I am trying to validate the following:-
The string must be 9 characters long, starting with SO- (case insensitive eg So-, so-, sO- and SO-) followed by 6 numbers.
So the following should all match
SO-123456,
So-123456,
sO-456789,
so-789123
but the following should fail
SO-12d456,
SO-1234567
etc etc
I have only managed to get this far so far
var _reg = /(SO-)\d{6}/i;
var _tests = new Array();
_tests[0] = "So-123456";
_tests[1] = "SO-123456";
_tests[2] = "sO-456789";
_tests[3] = "so-789123";
_tests[4] = "QR-123456";
_tests[5] = "SO-1234567";
_tests[6] = "SO-45k789";
for(var i = 0; i < _tests.length; i++){
var _matches = _tests[i].match(_reg);
if(_matches && _matches.length > 0)
$('#matches').append(i+'. '+_matches[0] + '<br/>');
}
Please see http://jsfiddle.net/TzHKd/ for above example
Test number 5 is matching although it should fail as there are 7 numbers and not 6.
Any assistance would be greatly appreciated.
Cheers

Use this regexp instead:
/^(so-)\d{6}$/i;
Without ^ (start of string) and $ (end of string) you're looking for a generic substring match, which is why your regexp still matches when there are 7 digits.

By using the anchors ^ and $ (matching the beginning and end of the string respectively, or of each line when the m flag is set), you can make the regex match the whole string. Otherwise, the match will succeed as soon as the characters in the regex are found anywhere in the input.
So, you will apply it like this:
var _reg = /^(so-)\d{6}$/i;
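As a quick sketch of how the anchored pattern behaves on the test strings from the question, RegExp.test() can be used for a plain yes/no check. The variable names here are assumptions for this example, and the capture group is dropped because only the boolean result is needed.

```javascript
// Sketch: check whole strings against the anchored pattern with RegExp.test().
var orderRef = /^so-\d{6}$/i; // hypothetical name; no g flag, so test() keeps no state

var tests = ["So-123456", "SO-123456", "sO-456789", "so-789123",
             "QR-123456", "SO-1234567", "SO-45k789"];

var results = tests.map(function (value) {
  return orderRef.test(value);
});

console.log(results);
// [ true, true, true, true, false, false, false ]
```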

Count parentheses with regular expression

My string is: (as(dh(kshd)kj)ad)... ()()
How is it possible to count the parentheses with a regular expression? I would like to select the string which begins at the first opening bracket and ends before the ...
Applying that to the above example, that means I would like to get this string: (as(dh(kshd)kj)ad)
I tried to write it, but this doesn't work:
var str = "(as(dh(kshd)kj)ad)... ()()";
document.write(str.match(/(.*)/m));

As I said in the comments, contrary to popular belief (don't believe everything people say) matching nested brackets is possible with regex.
The downside of using it is that you can only do it up to a fixed level of nesting. And for every additional level you wish to support, your regex will be bigger and bigger.
But don't take my word for it. Let me show you. The regex \([^()]*\) matches one level. For up to two levels see the regex here. To match your case, you'd need:
\(([^()]*|\(([^()]*|\([^()]*\))*\))*\)
It would match the leading (as(dh(kshd)kj)ad) part of (as(dh(kshd)kj)ad)... ()()
Check the DEMO HERE and see what I mean by fixed level of nesting.
And so on. To keep adding levels, all you have to do is change the last [^()]* part to ([^()]*|\([^()]*\))* (check three levels here). As I said, it will get bigger and bigger.
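To make the fixed-depth limitation concrete, here is a small sketch that runs the three-level pattern above in JavaScript. The variable names and the four-level sample string are made up for illustration.

```javascript
// Three-level pattern from the answer above, written as a JavaScript regex literal.
var threeLevels = /\(([^()]*|\(([^()]*|\([^()]*\))*\))*\)/;

var str = "(as(dh(kshd)kj)ad)... ()()";
var first = str.match(threeLevels)[0];
console.log(first); // (as(dh(kshd)kj)ad)

// Four levels of nesting exceed what this pattern supports, so the outermost
// pair is skipped and only an inner three-level group is matched.
var deeper = "(a(b(c(d))))".match(threeLevels)[0];
console.log(deeper); // (b(c(d)))
```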

See Tim's answer for why this won't work, but here's a function that'll do what you're after instead.
function getFirstBracket(str){
var pos = str.indexOf("("),
bracket = 0;
if(pos===-1) return false;
for(var x=pos; x<str.length; x++){
var char = str.substr(x, 1);
bracket = bracket + (char=="(" ? 1 : (char==")" ? -1 : 0));
if(bracket==0) return str.substr(pos, (x+1)-pos);
}
return false;
}
getFirstBracket("(as(dh(kshd)kj)ad)... ()(");

There is a possibility and your approach was quite good:
Match will give you an array if you had some hits, if so you can look up the array length.
var str = "(as(dh(kshd)kj)ad)... ()()",
match = str.match(new RegExp('.*?(?:\\(|\\)).*?', 'g')),
count = match ? match.length : 0;
This regular expression will get all parts of your text that include round brackets. See http://gskinner.com/RegExr/ for a nice online regex tester.
Now you can use count for all brackets.
match will deliver an array that looks like:
["(", "as(", "dh(", "kshd)", "kj)", "ad)", "... (", ")", "(", ")"]
Now you can start sorting your results:
var newStr = '', open = 0, close = 0;
for (var n = 0, m = match.length; n < m; n++) {
if (match[n].indexOf('(') !== -1) {
open++;
newStr += match[n];
} else {
if (open > close) newStr += match[n];
close++;
}
if (open === close) break;
}
... and newStr will be (as(dh(kshd)kj)ad)
This is probably not the nicest code but it will make it easier to understand what you're doing.
With this approach there is no limit of nesting levels.

This is not possible with a JavaScript regex. Generally, regular expressions can't handle arbitrary nesting because that can no longer be described by a regular language.
Several modern regex flavors do have extensions that allow for recursive matching (like PHP, Perl or .NET), but JavaScript is not among them.

No. Regular expressions express regular languages. Finite automata (FA) are the machines which recognise regular languages. An FA is, as its name implies, finite in memory. With a finite memory, the FA cannot remember an arbitrary number of parentheses, which is what you would need in order to do what you want.
I suggest you use an algorithm involving a counter to solve your problem.

try this jsfiddle
var str = "(as(dh(kshd)kj)ad)... ()()";
document.write(str.match(/(\(.*?)\.\.\./m)[1]); // capture group includes the opening bracket

Regular Expression to match given word in last five words of pipe-delimited string

Say we have a string
blue|blue|green|blue|blue|yellow|yellow|blue|yellow|yellow|
And we want to figure out whether the word "yellow" occurs in the last 5 words of the string, specifically by returning a capture group containing these occurrences, if any.
Is there a way to do that with a regex?
Update: I'm feeding a regex engine some rules. For various reasons I'm trying to work with the engine rather than go outside it, which would be my last resort.

/\b(yellow)\|(?=(?:\w+\|){0,4}$)/g
This will return one hit for each yellow| that's followed by fewer than five words (per your definition of "word"). This assumes the sequence always ends with a pipe; if that's not the case, you might want to change it to:
/\b(yellow)(?=(?:\|\w+){0,4}\|?$)/g
EDIT (in response to comment): The definition of a "word" in this solution is arbitrary, and doesn't really correspond to real-world usage. To allow for hyphenated words like "real-world" you could use this:
/\b(yellow)\|(?=(?:\w+(?:-\w+)*\|){0,4}$)/g
...or, for this particular job, you could define a word as one or more of any characters except pipes:
/\b(yellow)\|(?=(?:[^|]+\|){0,4}$)/g
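As a quick sanity check, the first pattern can be run against the sample string from the question. This is just a sketch; the variable names are chosen for the example.

```javascript
// Sketch: apply the lookahead pattern to the sample string from the question.
var colours = "blue|blue|green|blue|blue|yellow|yellow|blue|yellow|yellow|";
var lastFive = /\b(yellow)\|(?=(?:\w+\|){0,4}$)/g;

var hits = [];
var m;
// With the g flag, exec() resumes from the end of the previous match.
while ((m = lastFive.exec(colours)) !== null) {
  hits.push(m[1]); // the captured word
}
console.log(hits.length); // 4
```

The last five words are yellow, yellow, blue, yellow, yellow, so four hits are expected.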

No need to use a Regex for such a simple thing.
Simply split on the pipe, and check with indexOf:
var group = 'blue|blue|green|blue|blue|yellow|yellow|blue|yellow|yellow';
if ( group.split('|').slice(-5).indexOf('yellow') == -1 ) {
alert('Not there :(');
} else {
alert('Found!!!');
}
Note: indexOf is not natively supported in IE < 9, but support for it can be added very easily.

Can't think of a way to do this with a single regular expression, but you can form one for each of the last five positions and sum the matches.
var string = "blue|blue|green|blue|blue|yellow|yellow|blue|yellow|yellow|";
var regexes = [];
regexes.push(/(yellow)\|[^|]+\|[^|]+\|[^|]+\|[^|]+\|$/);
regexes.push(/(yellow)\|[^|]+\|[^|]+\|[^|]+\|$/);
regexes.push(/(yellow)\|[^|]+\|[^|]+\|$/);
regexes.push(/(yellow)\|[^|]+\|$/);
regexes.push(/(yellow)\|$/);
var count = 0;
var regex;
while (regex = regexes.shift()) {
if (string.match(regex)) {
count++;
}
}
console.log(count);
Should find four matches.
