Design a regular expression for a sentence and its subset - javascript

For the following two sentences,
var first_sentence = 'My cat is sleeping';
var second_sentence = 'My cat is sleeping with a blanket';
I have tried to use the following regexp to get both the verb (sleeping) and the noun (a blanket).
var regex = /My cat is (.+?)\s+with.?(.+)?/gi.exec('My cat is sleeping with a blanket');
console.log(regex);
/*
[ 0 : 'My cat is sleeping with a blanket'
1 : 'sleeping'
2 : 'a blanket'
index : 0
input : 'My cat is sleeping with a blanket'
length : 3 ]
*/
This regular expression works for the second sentence, but when I apply it to the first sentence it returns null. Any idea why?
var regex = /My cat is (.+?)\s+with.?(.+)?/gi.exec('My cat is sleeping');
console.log(regex);
// null

The first sentence has nothing for the \s+with.?(.+)? part to match: that part requires some text to be present (1+ whitespace characters followed by with). You need to wrap the optional part of the pattern in an optional non-capturing group, (?:...)?:
/My cat is (\S+)(?:\s+with\s+(.*))?/gi
Details:
My cat is - a literal text
(\S+) - Group 1 capturing 1+ non-whitespace symbols
(?:\s+with\s+(.*))? - an optional sequence of:
\s+with\s+ - the word with, surrounded by 1+ whitespace characters on both sides
(.*) - Group 2 capturing any 0+ chars other than line break symbols
JS:
var ss = [ "My cat is sleeping", "My cat is sleeping with a blanket"];
var rx = /My cat is (\S+)(?:\s+with\s+(.*))?/i;
for (var s = 0; s < ss.length; s++) {
document.body.innerHTML += "Testing \"<i>" + ss[s] + "</i>\"... ";
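var m = ss[s].match(rx); // declare m instead of leaking an implicit global
if (m !== null) {
// Close the second <b> only when group 2 actually matched
document.body.innerHTML += "Found: <b>" + m[1] + "</b>" + (m[2] ? " and <b>" + m[2] + "</b>" : "") + "<br/>";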
} else {
document.body.innerHTML += "NOT Matched: <b>" + ss[s] + "</b><br/>";
}
}
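As a console-only variant of the demo above, here is a minimal sketch that wraps the pattern in a helper. The name parseCatSentence and the property names verb and object are assumptions made for illustration; they are not part of the answer. When the optional group does not take part in the match, exec leaves that capture as undefined.

```javascript
// Hypothetical helper: returns the captured parts, or null when there is no match.
function parseCatSentence(sentence) {
  var m = /My cat is (\S+)(?:\s+with\s+(.*))?/i.exec(sentence);
  if (m === null) {
    return null;
  }
  // m[2] is undefined when the optional "with ..." part is absent.
  return { verb: m[1], object: m[2] };
}

console.log(parseCatSentence('My cat is sleeping'));
console.log(parseCatSentence('My cat is sleeping with a blanket'));
```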

Related

Regular Expressions first Alphabetic rest alphanumeric

I'm trying to write a Regex
What I need is:
To start only with: A-z (Alphabetic)
Min. Length: 5
Max. Length: 10
The rest can be A-z0-9(Alphanumeric) but contain at least one number
What I have: ^[A-z][A-z0-9]{5,10}$
You can use
/^(?=.{5,10}$)[a-z][a-z]*\d[a-z\d]*$/i
Details:
^ - start of string
(?=.{5,10}$) - the string must consist of 5 to 10 chars other than line break chars, up to the end of string (the consuming pattern that follows restricts which chars are allowed)
[a-z] - the first char must be an ASCII letter (i modifier makes the pattern case insensitive)
[a-z]* - 0+ ASCII letters
\d - 1 digit
[a-z\d]* - 0+ ASCII letters or digits
$ - end of string.
var ss = [ "ABCABCABC1","ABCA1BCAB","A1BCABCA","A1BCAB","A1BCA","A1BC","1BCABCABC1","ABCABC","ABCABCABCD"]; // Test strings
var rx = /^(?=.{5,10}$)[a-z][a-z]*\d[a-z\d]*$/i; // Regex literal with the case-insensitive flag
document.body.innerHTML += "Pattern: <b>" + rx.source
+ "</b><br/>"; // Display resulting pattern
for (var s = 0; s < ss.length; s++) { // Demo
document.body.innerHTML += "Testing \"<i>" + ss[s] + "</i>\"... ";
document.body.innerHTML += "Matched: <b>" + rx.test(ss[s]) + "</b><br/>";
}
var pattern = /^[a-z]{1}\w{4,9}$/i;
/* PATTERN
^ : Start of string
[a-z]{1} : One letter between a and z
\w{4,9} : 4 to 9 word characters (letters, digits, or underscore)
$ : End of string
/i : Case-insensitive
*/
var tests = [
"1abcdefghijklmn", //false
"abcdefghijklmndfvdfvfdv", //false
"1abcde", //false
"abcd1", //true
];
for (var i = 0; i < tests.length; i++) {
console.log(
tests[i],
pattern.test(tests[i])
)
}
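To see how the two patterns above differ, here is a small sketch. The sample strings and variable names are made up for illustration. The lookahead pattern requires at least one digit, while the \w pattern also accepts letters-only strings and underscores, because \w is equivalent to [A-Za-z0-9_].

```javascript
var withDigit = /^(?=.{5,10}$)[a-z][a-z]*\d[a-z\d]*$/i; // first pattern above
var wordChars = /^[a-z]{1}\w{4,9}$/i;                   // second pattern above
var samples = ["abcd1", "abcde", "ab_c1"];
for (var i = 0; i < samples.length; i++) {
  // Prints the sample, then the result of each pattern
  console.log(samples[i], withDigit.test(samples[i]), wordChars.test(samples[i]));
}
```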

How to apply a regex just for a part of string?

I have a string like this:
var str = "this is test
1. this is test
2. this is test
3. this is test
this is test
1. this test
2. this is test
this is test";
Also I have this regex:
/^[\s\S]*(?:^|\r?\n)\s*(\d+)(?![\s\S]*(\r?\n){2})/m
For the string above, capturing group $1 returns 2.
Now I have a position number, 65, and I want to apply the regex only to this range of the string: [0 - 65], so that I get 3 instead of 2. In general, I want to limit the string from its start to a specific position and then apply the regex to that range. How can I do that?
The simplest way is to apply it to just that substring:
var match = /^[\s\S]*(?:^|\r?\n)\s*(\d+)(?![\s\S]*(\r?\n){2})/m.exec(str.substring(0, 65));
// Note ----------------------------------------------------------------^^^^^^^^^^^^^^^^^
Example:
var str = "this is test\n1. this is test\n2. this is test\n3. this is test\nthis is test\n1. this test \n2. this is test\nthis is test";
var match = /^[\s\S]*(?:^|\r?\n)\s*(\d+)(?![\s\S]*(\r?\n){2})/m.exec(str.substring(0, 65));
// Note ----------------------------------------------------------------^^^^^^^^^^^^^^^^^
document.body.innerHTML = match ? "First capture: [" + match[1] + "]" : "(no match)";
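The same idea can be wrapped in a small function. This is only a sketch: the name lastListNumberBefore and its parameters are hypothetical, and the regex is the one from the question, unchanged.

```javascript
// Hypothetical helper: applies the question's regex to str.substring(0, end).
function lastListNumberBefore(str, end) {
  var re = /^[\s\S]*(?:^|\r?\n)\s*(\d+)(?![\s\S]*(\r?\n){2})/m;
  var match = re.exec(str.substring(0, end));
  return match ? match[1] : null;
}

var str = "this is test\n1. this is test\n2. this is test\n3. this is test\nthis is test\n1. this test \n2. this is test\nthis is test";
console.log(lastListNumberBefore(str, 65));         // "3"
console.log(lastListNumberBefore(str, str.length)); // "2"
```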
Maybe a loop like this can help (source: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec)
var myRe = /ab*/g;
var str = 'abbcdefabh';
var myArray;
while ((myArray = myRe.exec(str)) !== null) {
var msg = 'Found ' + myArray[0] + '. ';
msg += 'Next match starts at ' + myRe.lastIndex;
console.log(msg);
}

Input field validation in JavaScript

I have an input field that should accept only these formats (d - digit, c - character):
d.d
d.d.d
d.d.c
My code:
var format1 = /[0-9]{1,}\.[0-9]{1,}/g; // d.d
var format2 = /[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}/g; // d.d.d
var format3 = /[0-9]{1,}\.[0-9]{1,}\.[a-zA-Z]{1,}/g; // d.d.c
if (format1.test(input)) {
format1Con = true;
}
if (format2.test(input)) {
format2Con = true;
}
if (format3.test(input)) {
format3Con = true;
}
This code accepts some invalid values, for example:
1.2.3.33
1.2.ccccc (only one character should be accepted here)
Please help me with an exact regular expression for my field format.
You can use
/^\d\.\d+(?:\.(?:\d|[a-z]))?$/i
The regex matches...
^ - Beginning of a string
\d - a digit
\. - a literal dot
\d+ - 1 or more digits
(?:\.(?:\d|[a-z]))? - an optional group matching...
\. - literal dot and...
(?:\d|[a-z]) - either a single digit or a single letter from the [a-zA-Z] range (since the i modifier is used)
$ - End of string
var re = /^\d\.\d+(?:\.(?:\d|[a-z]))?$/i;
document.write('1.2.3.33: ' + re.test('1.2.3.33') + "<br/>");
document.write('1.2.ccccc: ' + re.test('1.2.ccccc') + "<br/>");
document.write('1.2: ' + re.test('1.2') + "<br/>");
document.write('1.22: ' + re.test('1.22') + "<br/>");
document.write('1.22.3: ' + re.test('1.22.3') + "<br/>");
document.write('1.2.3: ' + re.test('1.2.3') + "<br/>");
document.write('1.2.x: ' + re.test('1.2.x') + "<br/>");
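You can modify your regexes like this:
var format1 = /^[0-9]{1,}\.[0-9]{1,}$/; // d.d
var format2 = /^[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}$/; // d.d.d
var format3 = /^[0-9]{1,}\.[0-9]{1,}\.[a-zA-Z]$/; // d.d.c
Here ^ matches the start of the string and $ matches the end of the string, so values with extra segments such as 1.2.3.33 are rejected. The g flag is dropped because it makes test() stateful, and the letter class in format3 has no quantifier so that only a single character is accepted.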
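One caveat worth illustrating, as a sketch: the patterns in the question carry the g flag, and a regex with that flag remembers its position in lastIndex between test() calls. Reusing such a regex for validation can therefore give alternating results. The variable names below are made up for the example.

```javascript
var withG = /^[0-9]+\.[0-9]+$/g;
var first = withG.test('1.2');  // matches; lastIndex moves to the end of the match
var second = withG.test('1.2'); // search resumes at lastIndex, so this fails and lastIndex resets to 0

var noG = /^[0-9]+\.[0-9]+$/;
var third = noG.test('1.2');
var fourth = noG.test('1.2');   // no g flag, so every call starts at index 0

console.log(first, second, third, fourth);
```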

javascript - count spaces before first character of a string

What is the best way to count how many spaces come before the first character of a string?
str0 = 'nospaces even with other spaces still bring back zero';
str1 = ' onespace do not care about other spaces';
str2 = '  twospaces';
Use String.prototype.search
'    foo'.search(/\S/); // 4, index of first non whitespace char
EDIT:
You can search for "Non whitespace characters, OR end of input" to avoid checking for -1.
' '.search(/\S|$/)
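A sketch of the \S|$ variant wrapped in a function; the name countLeadingWhitespace is an assumption for illustration. Because $ matches at the end of input, an empty or all-whitespace string yields its length instead of -1.

```javascript
// Hypothetical helper: index of the first non-whitespace char, or the string length if there is none.
function countLeadingWhitespace(str) {
  return str.search(/\S|$/);
}

console.log(countLeadingWhitespace('nospaces'));     // 0
console.log(countLeadingWhitespace('  twospaces'));  // 2
console.log(countLeadingWhitespace('   '));          // 3
console.log(countLeadingWhitespace(''));             // 0
```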
Using the following regex:
/^\s*/
in String.prototype.match() will result in an array with a single item, the length of which will tell you how many whitespace chars there were at the start of the string.
pttrn = /^\s*/;
str0 = 'nospaces';
len0 = str0.match(pttrn)[0].length;
str1 = ' onespace do not care about other spaces';
len1 = str1.match(pttrn)[0].length;
str2 = '  twospaces';
len2 = str2.match(pttrn)[0].length;
Remember that this will also match tab chars, each of which will count as one.
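If tabs and other whitespace should not be counted, a literal space can be used in place of \s. A minimal sketch, with a sample string and variable names chosen for illustration:

```javascript
var sample = '  \tindented'; // two spaces, then a tab
var anyWhitespace = sample.match(/^\s*/)[0].length; // counts the tab as well
var spacesOnly = sample.match(/^ */)[0].length;     // counts only the leading spaces
console.log(anyWhitespace, spacesOnly);
```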
You could use trimLeft() (a legacy alias of the standard trimStart()) as follows
myString.length - myString.trimLeft().length
Proof it works:
let myString = ' hello there '
let spacesAtStart = myString.length - myString.trimLeft().length
console.log(spacesAtStart)
See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/TrimLeft
str0 = 'nospaces';
str1 = ' onespace do not care about other spaces';
str2 = '  twospaces';
arr_str0 = str0.match(/^[\s]*/g);
count1 = arr_str0[0].length;
console.log(count1);
arr_str1 = str1.match(/^[\s]*/g);
count2 = arr_str1[0].length;
console.log(count2);
arr_str2 = str2.match(/^[\s]*/g);
count3 = arr_str2[0].length;
console.log(count3);
Here:
I have used a regular expression to count the number of spaces before the first character of a string.
^ : start of string.
\s : any whitespace character (space, tab, line break)
[ : beginning of character class
] : end of character class
* : zero or more of the preceding item
str.match(/^\s*/)[0].length
str is the string.

Regex - Match sets of words

In JavaScript, how would I get ['John Smith', 'Jane Doe'] from "John Smith - Jane Doe" where - can be any separator (\ / , + * : ;) and so on, using regex ?
Using new RegExp('[a-zA-Z]+[^\/|\-|\*|\+]', 'g') will just give me ["John ", "Smith ", "Jane ", "Doe"]
Try this, no regex:
var arr = str.split(' - ')
Edit
Multiple separators:
var arr = str.split(/ [-*+,] /)
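A sketch that extends the split idea to the separators listed in the question and tolerates missing or extra spaces around them. The helper name splitNames and the exact separator set are assumptions; note that a hyphen inside a name (a double-barrelled surname, for example) would also be treated as a separator.

```javascript
// Hypothetical helper: splits on any one separator, swallowing surrounding whitespace.
function splitNames(str) {
  return str.split(/\s*[-\\\/,+*:;]\s*/);
}

console.log(splitNames("John Smith - Jane Doe"));
console.log(splitNames("John Smith/Jane Doe;Max Power"));
```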
If you want to match multiple words, you need to have a space in your character class. I'd think something like /[ a-zA-Z]+/g would be a starting point, used repeatedly with exec or via String#match, like this:
var str = "John Smith - Jane Doe";
var index;
var matches = str.match(/[ a-zA-Z]+/g);
// console.log stands in for the display helper of the original live demo
if (matches) {
    console.log("Found " + matches.length + ":");
    for (index = 0; index < matches.length; ++index) {
        console.log("[" + index + "]: " + matches[index]);
    }
}
else {
    console.log("No matches found");
}
But it's very limited: a huge number of names have characters other than A-Z. You may want to invert your logic and use a negated class (/[^...]/g, where ... is a list of possible delimiter characters). You don't want to leave "Elizabeth Peña" or "Gerard 't Hooft" out in the cold! :-)
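A minimal sketch of the negated-class idea: match runs of anything that is not a delimiter, then trim each piece. The delimiter list and the helper name extractNames are assumptions for illustration.

```javascript
// Hypothetical helper: collects every run of non-delimiter chars and trims it.
function extractNames(str) {
  var parts = str.match(/[^-\\\/,+*:;]+/g) || [];
  var names = [];
  for (var i = 0; i < parts.length; i++) {
    var name = parts[i].trim();
    if (name) {
      names.push(name); // skip pieces that were only whitespace
    }
  }
  return names;
}

console.log(extractNames("Elizabeth Peña - Gerard 't Hooft"));
```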
