Regex - Match sets of words - javascript

In JavaScript, how would I get ['John Smith', 'Jane Doe'] from "John Smith - Jane Doe", where - can be any separator (\ / , + * : ; and so on), using regex?
Using new RegExp('[a-zA-Z]+[^\/|\-|\*|\+]', 'g') will just give me ["John ", "Smith ", "Jane ", "Doe"]

Try this, no regex:
var arr = str.split(' - ')
Edit: for multiple separators, split on a regex instead:
var arr = str.split(/ [-*+,] /)
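As a rough sketch of how the split approach could be extended, the version below covers the separators listed in the question and makes the surrounding whitespace optional. The function name splitOnSeparators and the exact separator list are assumptions for illustration, not part of the original answer.

```javascript
// Hypothetical helper: split on any one of the separators \ / , + * : ; -
// with optional whitespace on either side of the separator.
function splitOnSeparators(str) {
  return str
    .split(/\s*[-\\\/,+*:;]\s*/)
    .filter(function (part) { return part.length > 0; }); // drop empty pieces
}

console.log(splitOnSeparators("John Smith - Jane Doe"));
console.log(splitOnSeparators("John Smith/Jane Doe"));
```

Note that this also splits hyphenated names such as "Jean-Luc", so the separator list would need adjusting for real data.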

If you want to match multiple words, you need to have a space in your character class. I'd think something like /[ a-zA-Z]+/g would be a starting point, used repeatedly with exec or via String#match, like this:
var str = "John Smith - Jane Doe";
var index;
// Each match is a run of letters and spaces, so it may carry leading/trailing spaces
var matches = str.match(/[ a-zA-Z]+/g);
if (matches) {
  console.log("Found " + matches.length + ":");
  for (index = 0; index < matches.length; ++index) {
    console.log("[" + index + "]: " + matches[index]);
  }
}
else {
  console.log("No matches found");
}
But it's very limited: a huge number of names have characters other than A-Z. You may want to invert your logic and use a negated class (/[^...]/g, where ... is a list of possible delimiter characters). You don't want to leave "Elizabeth Peña" or "Gerard 't Hooft" out in the cold! :-)
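A minimal sketch of the negated-class idea, assuming the delimiters are the ones listed in the question (\ / , + * : ; -). The function name matchNames and the trimming step are assumptions added for illustration.

```javascript
// Hypothetical helper: match runs of anything that is NOT a delimiter,
// then trim the spaces left around each run.
function matchNames(str) {
  var matches = str.match(/[^\\\/,+*:;\-]+/g) || [];
  return matches
    .map(function (s) { return s.trim(); })
    .filter(function (s) { return s.length > 0; }); // drop whitespace-only runs
}

console.log(matchNames("John Smith - Jane Doe"));
console.log(matchNames("Elizabeth Peña / Gerard 't Hooft"));
```

Because the class lists what to exclude rather than what to allow, accented letters and apostrophes pass through untouched.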

Related

Design a regular expression for a sentence and its subset

For the following two sentences,
var first_sentence = 'My cat is sleeping';
var second_sentence = 'My cat is sleeping with a blanket';
I have tried to use the following regexp to get both the verb (sleeping) and the noun (a blanket).
var regex = /My cat is (.+?)\s+with.?(.+)?/gi.exec('My cat is sleeping with a blanket');
console.log(regex);
/*
[ 0 : 'My cat is sleeping with a blanket'
1 : 'sleeping'
2 : 'a blanket'
index : 0
input : 'My cat is sleeping with a blanket'
length : 3 ]
*/
This regular expression works here, but when I apply it to the first sentence it returns null. Any idea why?
var regex = /My cat is (.+?)\s+with.?(.+)?/gi.exec('My cat is sleeping');
console.log(regex);
// null
The first sentence contains nothing for \s+with.?(.+)? to match: that part of the pattern requires some text to be present (1+ whitespace characters followed by with). You need to wrap the optional part of the pattern in (?:....)?:
/My cat is (\S+)(?:\s+with\s+(.*))?/gi
Details:
My cat is - a literal text
(\S+) - Group 1 capturing 1+ non-whitespace symbols
(?:\s+with\s+(.*))? - an optional sequence of:
\s+with\s+ - the word with, enclosed by 1+ whitespace characters on both sides
(.*) - Group 2 capturing any 0+ chars other than line break symbols
JS:
var ss = ["My cat is sleeping", "My cat is sleeping with a blanket"];
var rx = /My cat is (\S+)(?:\s+with\s+(.*))?/i;
var m; // declared explicitly to avoid an implicit global
for (var s = 0; s < ss.length; s++) {
  document.body.innerHTML += "Testing \"<i>" + ss[s] + "</i>\"... ";
  if ((m = ss[s].match(rx)) !== null) {
    // Emit the second <b>...</b> pair only when group 2 actually matched
    document.body.innerHTML += "Found: <b>" + m[1] + "</b>" + (m[2] ? " and <b>" + m[2] + "</b>" : "") + "<br/>";
  } else {
    document.body.innerHTML += "NOT Matched: <b>" + ss[s] + "</b><br/>";
  }
}
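For readers without a browser page at hand, here is a small console-only sketch of the same pattern. The helper name parseCatSentence and the shape of the returned object are assumptions for illustration; the regex is the one from the answer above.

```javascript
// Hypothetical helper: returns the verb and, if present, the noun.
// When the optional group does not take part in the match, m[2] is undefined.
function parseCatSentence(sentence) {
  var m = /My cat is (\S+)(?:\s+with\s+(.*))?/i.exec(sentence);
  if (m === null) return null;
  return { verb: m[1], noun: m[2] === undefined ? null : m[2] };
}

console.log(parseCatSentence("My cat is sleeping"));
console.log(parseCatSentence("My cat is sleeping with a blanket"));
```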

How to replace a character at a specific indexOf with uppercase in a string?

I'm trying to replace the character at a specific index with its uppercase version.
My string is a first name plus the first letter of the last name,
looking like this: "lovisa t".
I check the position with this, and it gives me the right place in the string. So second gives me 8 (in this case).
first = texten.indexOf(" ");
second = texten.indexOf(" ", first + 1);
And with this I replace the first letter to uppercase.
var name = texten.substring(0, second);
name=name.replace(/^./, name[0].toUpperCase());
But how do I replace the character at "second" to uppercase?
I tested with
name=name.replace(/.$/, name[second].toUpperCase());
But it didn't work, so any input is really appreciated, thanks.
Your error is that the second letter isn't at position 8, but 7.
Also, second = texten.indexOf(" ", first + 1); gives -1, not 8, because you do not have two spaces in your string.
If you know that the string is always in the format name, space, one letter, and you want to capitalize the first letter and the last letter, you can simply do this:
var name = 'something s';
name = name[0].toUpperCase() + name.substring(1, name.length - 1) + name[name.length -1].toUpperCase();
console.log(name)
Here's a version that does exactly what your question title asks for: It uppercases a specific index in a string.
function upperCaseAt(str, i) {
  // slice is used in place of the legacy substr; the result is the same here
  return str.slice(0, i) + str.charAt(i).toUpperCase() + str.slice(i + 1);
}
var str = 'lovisa t';
var i = str.indexOf(' ');
console.log(upperCaseAt(str, i + 1));
However, if you want to look for specific patterns in the string, you don't need to deal with indices.
var str = 'lovisa t';
console.log(str.replace(/.$/, function (m0) { return m0.toUpperCase(); }));
This version uses a regex to find the last character in a string and a replacement function to uppercase the match.
var str = 'lovisa t';
console.log(str.replace(/ [a-z]/, function (m0) { return m0.toUpperCase(); }));
This version is similar but instead of looking for the last character, it looks for a space followed by a lowercase letter.
var str = 'lovisa t';
console.log(str.replace(/(?:^|\s)\S/g, function (m0) { return m0.toUpperCase(); }));
Finally, here we're looking for (and uppercasing) all non-space characters that are preceded by the beginning of the string or a space character; i.e. we're uppercasing the start of each (space-separated) word.
All can be done by regex replace.
"lovisa t".replace(/(^|\s)\w/g, s=>s.toUpperCase());
Try this one. If it helps, consider moving the regex constants out of the function so they are not recreated on every call:
function normalize(str){
var LOW_DASH = /\_/g;
var NORMAL_TEXT_REGEXP = /([a-z])([A-Z])/g;
if(!str)str = '';
if(str.indexOf('_') > -1) {
str = str.replace(LOW_DASH, ' ');
}
if(str.match(NORMAL_TEXT_REGEXP)) {
str = str.replace(NORMAL_TEXT_REGEXP, '$1 $2');
}
if(str.indexOf(' ') > -1) {
var p = str.split(' ');
var out = '';
for (var i = 0; i < p.length; i++) {
if (!p[i])continue;
out += p[i].charAt(0).toUpperCase() + p[i].substring(1) + (i !== p.length - 1 ? ' ' : '');
}
return out;
} else {
return str.charAt(0).toUpperCase() + str.substring(1);
}
}
console.log(normalize('firstLast'));//First Last
console.log(normalize('first last'));//First Last
console.log(normalize('first_last'));//First Last

JS Regex - Match each unescaped occurrence of a specific character

I'm trying to write a regex in JavaScript that matches each unescaped occurrence of a specific character.
Here I'm looking for all the ' characters. They can be at the beginning or the end of the string, and consecutive.
E.g.:
'abc''abc\'abc
I should get 3 matches: the 1st, 5th and 6th characters, but not the 11th, which is escaped.
You'll have to account for cases like \\' which should match, and \\\' which shouldn't. JavaScript had no lookbehinds before ES2018, so if the code has to run on older engines you'll have to use something else.
Use the following regex:
\\.|(')
This will match both all escaped characters and the ' characters you're looking for, but the quotes will be in a capture group.
Then, in JS, ignore each match object m where !m[1].
Example:
var input = "'abc''abc\\'abc \\\\' abc";
var re = /\\.|(')/g;
var m;
var positions = [];
while (m = re.exec(input)) {
if (m[1])
positions.push(m.index);
}
var pos = [];
for (var i = 0; i < input.length; ++i) {
pos.push(positions.indexOf(i) >= 0 ? "^" : " ");
}
document.getElementById("output").innerText = input + "\n" + pos.join("");
<pre id="output"></pre>
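The same loop can be wrapped in a function that returns the positions directly, which is easier to try in a console. This is only a sketch; the name unescapedQuotePositions is an assumption, and the regex is the one from the answer above.

```javascript
// Hypothetical helper: returns the 0-based indices of every unescaped '.
// The first alternative (\\.) consumes a backslash plus the character it
// escapes, so an escaped quote never reaches the capturing alternative.
function unescapedQuotePositions(input) {
  var re = /\\.|(')/g;
  var positions = [];
  var m;
  while ((m = re.exec(input)) !== null) {
    if (m[1]) positions.push(m.index); // group 1 is set only for a bare quote
  }
  return positions;
}

console.log(unescapedQuotePositions("'abc''abc\\'abc"));
```

With the question's input this yields the 0-based indices 0, 4 and 5, i.e. the 1st, 5th and 6th characters.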
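You can use the following, which counts quotes that are not directly preceded by a backslash (it does not handle an escaped backslash such as \\'):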
var s = "'abc''abc\\'abc";
var cnt=0;
s.replace(/\\?'/g, function($0) { if ($0[0] != '\\') cnt++; return $0;});
console.log(cnt);
//=> 3

javascript - count spaces before first character of a string

What is the best way to count how many spaces there are before the first character of a string?
str0 = 'nospaces even with other spaces still bring back zero';
str1 = ' onespace do not care about other spaces';
str2 = '  twospaces';
Use String.prototype.search
'    foo'.search(/\S/); // 4, index of first non-whitespace char
EDIT:
You can search for "Non whitespace characters, OR end of input" to avoid checking for -1.
' '.search(/\S|$/)
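A short sketch of the search-based approach as reusable functions. The names countLeadingWhitespace and countLeadingSpaces are assumptions for illustration; the second variant is an addition for the case where only literal space characters should count.

```javascript
// Hypothetical helper: index of the first non-whitespace character,
// or the string length when there is none (thanks to the |$ alternative).
function countLeadingWhitespace(str) {
  return str.search(/\S|$/);
}

// Variant that counts only literal spaces, so a leading tab stops the count.
function countLeadingSpaces(str) {
  return str.search(/[^ ]|$/);
}

console.log(countLeadingWhitespace("    foo"));
console.log(countLeadingWhitespace("   "));
console.log(countLeadingSpaces("\t x"));
```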
Using the following regex:
/^\s*/
in String.prototype.match() will result in an array with a single item, the length of which will tell you how many whitespace chars there were at the start of the string.
pttrn = /^\s*/;
str0 = 'nospaces';
len0 = str0.match(pttrn)[0].length;
str1 = ' onespace do not care about other spaces';
len1 = str1.match(pttrn)[0].length;
str2 = '  twospaces';
len2 = str2.match(pttrn)[0].length;
Remember that this will also match tab chars, each of which will count as one.
You could use trimLeft() (an alias of the standard trimStart()) as follows
myString.length - myString.trimLeft().length
Proof it works:
let myString = ' hello there '
let spacesAtStart = myString.length - myString.trimLeft().length
console.log(spacesAtStart)
See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/TrimLeft
str0 = 'nospaces';
str1 = ' onespace do not care about other spaces';
str2 = '  twospaces';
arr_str0 = str0.match(/^[\s]*/g);
count1 = arr_str0[0].length;
console.log(count1);
arr_str1 = str1.match(/^[\s]*/g);
count2 = arr_str1[0].length;
console.log(count2);
arr_str2 = str2.match(/^[\s]*/g);
count3 = arr_str2[0].length;
console.log(count3);
Here:
I have used a regular expression to count the number of spaces before the first character of a string.
^ : start of string
\s : any whitespace character (space, tab, line break)
[ : beginning of character group
] : end of character group
* : zero or more of the preceding group
str.match(/^\s*/)[0].length
str is the string.

JavaScript - How can I get the last different index in string but

Hi, I'm making a calculator and I want a +/- button. I want to find the last *, -, + or / in the string and extract the part that follows it.
For example:
str="2+3*13"
I want this to be split into:
strA="2+3*"
strB="13"
Another example:
str="3-2+8"
Should split into:
strA="3-2+"
strB="8"
Use lastIndexOf and one of the substring methods:
var strA, strB,
    // a generic solution for more operators might be useful
    index = Math.max(str.lastIndexOf("+"), str.lastIndexOf("-"), str.lastIndexOf("*"), str.lastIndexOf("/"));
if (index < 0) {
  strA = "";
  strB = str;
} else {
  strA = str.substring(0, index + 1);
  strB = str.substring(index + 1);
}
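The comment in the snippet above hints at a generic version for more operators. One possible sketch follows, assuming the operators are single characters passed as a string; the name splitAtLastOperator and its operators parameter are hypothetical.

```javascript
// Hypothetical helper: splits str just after the last occurrence of any
// single-character operator. Returns ["", str] when no operator is found.
function splitAtLastOperator(str, operators) {
  operators = operators || "+-*/";
  var index = -1;
  for (var i = 0; i < operators.length; i++) {
    index = Math.max(index, str.lastIndexOf(operators[i]));
  }
  // With index === -1 this yields slice(0, 0) === "" and slice(0) === str
  return [str.slice(0, index + 1), str.slice(index + 1)];
}

console.log(splitAtLastOperator("2+3*13"));
console.log(splitAtLastOperator("3-2+8"));
```

A leading minus sign on the last number (as in "3*-2") would be treated as the last operator here, which may or may not be what a +/- button needs.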
You can use replace, match, and Regular Expressions:
str="3-2+8"
strA = str.replace(/\d+$/, "")
strB = str.match(/\d+$/)[0]
console.log(str, strA, strB);
> 3-2+8 3-2+ 8
You can use regular expression and split method:
var parts = "2 + 4 + 12".split(/\b(?=\d+\s*$)/);
This will give you an array:
["2 + 4 + ", "12"]
Couple of tests:
"(2+4)*230" -> ["(2+4)*", "230"]
"(1232-74) / 123 " -> ["(1232-74) / ", "123 "]
"12 * 32" -> ["12 * ", "32"]
