How to get first 2 words? - javascript

Let data.title be ABC XYZ PQRS - www.aaa.tld.
The output needs to look like this: ABC+XYZ
I've tried this:
var t = data.title.split(' ').join('+');
t = t.replace(/(([^\s]+\s\s*){1})(.*)/,"Unknown");
$("#log").text(t);

Here is one way to do it without a regex. It grabs only the first two words and requires a single space between them.
First we split the string into an array, then we slice that array from index 0 up to (but not including) index 2, and finally we join the pieces with a '+':
var x = 'ABC XYZ PQRS';
var y = x.split(' ').slice(0,2).join('+');
// y = "ABC+XYZ"
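A slightly more defensive sketch of the same split/slice/join idea, assuming the words may be separated by runs of whitespace rather than exactly one space. The helper name firstWords is made up for illustration and is not part of the answer above:

```javascript
// Hypothetical helper: join the first n whitespace-separated words with '+'.
function firstWords(text, n) {
  return text
    .trim()        // drop leading/trailing whitespace so no empty pieces appear
    .split(/\s+/)  // split on runs of whitespace, not just a single space
    .slice(0, n)   // keep at most the first n pieces
    .join('+');
}

console.log(firstWords('ABC XYZ PQRS - www.aaa.tld', 2)); // "ABC+XYZ"
console.log(firstWords('  ABC   XYZ PQRS', 2));           // "ABC+XYZ"
```

If the string has fewer than n words, slice simply returns what is there, so no extra length check is needed.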

Try using .match() with the RegExp /([\w+]+)/g, then concatenate the first match, a + character, and the second match:
var matches = "ABC XYZ PQRS - www.aaa.tld".match(/([\w+]+)/g);
console.log(matches[0] + "+" + matches[1])

This is my general function for first n words. Haven't tested it extensively but it is fast even on long strings because it doesn't use a global regex or split every word. You can fine tune the regex for dealing with punctuation. I'm considering a hyphen as a delimiter but you can move that to the word portion instead if you prefer.
function regFirstWords(s, n) {
// ?: makes the subsequent space+word group non-capturing. Change the {} quantifier if you want to require n words instead of allowing fewer
var a = s.match(new RegExp('[\\w\\.]+' + '(?:[\\s-]*[\\w\\.]+){0,' + (n - 1) + '}'));
return (a === undefined || a === null) ? '' : a[0];
}
To satisfy the OP's request to replace with '+'
regFirstWords('ABC XYZ PQRS - www.aaa.tld',2).replace(/\s/g,'+')

Related

Regex expression to get numbers without parentheses ()

I'm trying to create a regex that selects the numbers (with or without their commas; I can trim the commas later) that are not followed by a parenthesis. Numbers inside the parentheses should not be selected either.
It will be used with JavaScript's String.match method.
Example strings
9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4
What I have so far:
/((^\d+[^\(])|(,\d+,)|(,*\d+$))/gm
I tried this in regex101, underlining the numbers I would like to match and putting an x on the ones that should not match.
You could start with a substitution that removes all the unwanted parts, replacing every match of this pattern with an empty string:
/\d*\(.*?\),?/gm
This leaves you with
5,10
10,2,5,
10,7,2,4
which makes the matching pretty straightforward:
/(\d+)/gm
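The two steps above (strip, then match) can be sketched in JavaScript roughly as follows. The variable names stripped and numbers are only illustrative:

```javascript
const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;

// Step 1: remove every "number(...)" group, plus the comma that follows it.
const stripped = text.replace(/\d*\(.*?\),?/gm, '');

// Step 2: whatever digits remain are the numbers we want.
// match() returns null when nothing matches, so fall back to an empty array.
const numbers = stripped.match(/\d+/g) ?? [];

console.log(stripped);
console.log(numbers); // ["5", "10", "10", "2", "5", "10", "7", "2", "4"]
```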
If you want it as a single match expression you could use a negative lookbehind:
/(?<!\([\d,]*)(\d+)(?:,|$)/gm
Here's the same matching expression as runnable JavaScript (skeleton code borrowed from Wiktor's answer):
const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;
const matches = Array.from(text.matchAll(/(?<!\([\d,]*)(\d+)(?:,|$)/gm), x=>x[1])
console.log(matches);
Here, I'd recommend the so-called "best regex trick ever": just match what you do not need (negative contexts) and then match and capture what you need, and grab the captured items only.
If you want to match integers that are not part of a \d+\([^()]*\) match (a number followed by a parenthesized substring), you can either match that pattern, or match and capture \d+ (one or more digits), and then simply grab the Group 1 values from the matches:
const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;
const matches = Array.from(text.matchAll(/\d+\([^()]*\)|(\d+)/g), x=> x[1] ?? "").filter(Boolean)
console.log(matches);
Details:
text.matchAll(/\d+\([^()]*\)|(\d+)/g) - matches one or more digits (\d+), then a ( (with \(), then zero or more chars other than ( and ) (with [^()]*), then a ) (with \)); or (|) one or more digits captured into Group 1 ((\d+))
Array.from(..., x=> x[1] ?? "") - gets Group 1 value, or, if not assigned, just adds an empty string
.filter(Boolean) - removes empty strings.
Using several replacement regexes
var textA = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4
`
console.log('A', textA)
var textB = textA.replace(/\(.*?\),?/g, ';')
console.log('B', textB)
var textC = textB.replace(/^\d+|\d+$|\d*;\d*/gm, '')
console.log('C', textC)
var textD = textC.replace(/,+/g, ' ').trim() // trim() takes no argument; it only strips surrounding whitespace
console.log('D', textD)
With a loop
Here is a solution which splits the lines on comma and loops over the pieces:
var inside = false;
var result = [];
`9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4
`.split("\n").map(line => {
let pieceArray = line.split(",")
pieceArray.forEach((piece, k) => {
if (piece.includes('(')) {
inside = true
} else if (piece.includes(')')) {
inside = false
} else if (!inside && k > 0 && k < pieceArray.length-1 && !pieceArray[k-1].includes(')')) {
result.push(piece)
}
})
})
console.log(result)
It does print the expected result: ["5", "7"]

Javascript string replace certain characters

I have this string:
var s = '/channels/mtb/videos?page=2&per_page=100&fields=uri%2Cname%2Cdescription%2Cduration%2Cwidth%2Cheight%2Cprivacy%2Cpictures.sizes&sort=date&direction=asc&filter=embeddable&filter_embeddable=true'
I want to replace the per_page number (in this case 100, but it can be any number from 1 to 100, maybe more).
I can select the first part of the string with:
var s1 = s.substr(0, s.lastIndexOf('per_page=')+9)
which gives me:
/channels/mtb/videos?page=2&per_page=
But how would I select the next '&' after that so I can replace the number?
Don't assume the parameters are always in the same order!
You can use the following regex to replace the content you want.
Regex: /per_page=[\d]*/g (this covers only your requirement)
var new_no=12; //change 100 to 12
var x='/channels/mtb/videos?page=2&per_page=100&fields=uri%2Cname%2Cdescription%2Cduration%2Cwidth%2Cheight%2Cprivacy%2Cpictures.sizes&sort=date&direction=asc&filter=embeddable&filter_embeddable=true';
var y=x.replace(/per_page=[\d]*/g,'per_page='+new_no);
console.log(y);
Explanation:
/per_page=[\d]*/g
/ ----> opens the regex literal (everything up to the closing / is the pattern)
per_page= ----> finds the literal text 'per_page=' in the string
[\d]* ----> matches zero or more digits (it stops at the first non-digit)
/g ----> the / closes the pattern, and the 'g' flag makes the search global (all occurrences, not only the first)
Use replace with a regular expression to find the numbers after the text per_page=. Like this:
s.replace(/per_page=\d+/,"per_page=" + 33)
Replace the 33 with the number you want.
Result:
"/channels/mtb/videos?page=2&per_page=33&fields=uri%2Cname%2Cdescription%2Cduration%2Cwidth%2Cheight%2Cprivacy%2Cpictures.sizes&sort=date&direction=asc&filter=embeddable&filter_embeddable=true"
Start from the index given by lastIndexOf('per_page=') instead of 0.
Get the index of the first & after it and take the substring s2 from there to the end.
Then concatenate s1 + nr + s2.
I would not use regex, because it is much slower for this simple stuff.
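The index-based steps could look roughly like the sketch below. The function name setPerPage is hypothetical, and the code assumes 'per_page=' does not also appear as the tail of some longer parameter name:

```javascript
// Hypothetical helper: replace the value of per_page using only index lookups.
function setPerPage(url, newValue) {
  const key = 'per_page=';
  const keyStart = url.lastIndexOf(key);
  if (keyStart === -1) return url;             // parameter not present, nothing to do

  const valueStart = keyStart + key.length;    // first character of the old value
  const amp = url.indexOf('&', valueStart);    // next '&' after the value, if any
  const valueEnd = amp === -1 ? url.length : amp;

  // s1 + nr + s2 from the description above
  return url.slice(0, valueStart) + newValue + url.slice(valueEnd);
}

console.log(setPerPage('/channels/mtb/videos?page=2&per_page=100&sort=date', 33));
// "/channels/mtb/videos?page=2&per_page=33&sort=date"
```

Because the end of the value is found with indexOf('&', valueStart), this also works when per_page is the last parameter in the string.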
With Array.filter you can do this: split the text into key/value pairs and pick out the one that starts with per_page=.
var s = '/channels/mtb/videos?page=2&per_page=100&fields=uri%2Cname%2Cdescription%2Cduration%2Cwidth%2Cheight%2Cprivacy%2Cpictures.sizes&sort=date&direction=asc&filter=embeddable&filter_embeddable=true'
var kv_pairs = s.split('&');
var s2 = s.replace((kv_pairs.filter(w => w.startsWith('per_page=')))[0],'per_page=' + 123);
//console.log(s2);
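var newValue = 33; // example value; the original snippet left newValue undefined
var matches = s.match(/(.*\bper_page=)(\d+)(.*)/);
if (matches) {
// matches[1] is everything up to and including "per_page=", matches[3] is the rest
s = matches[1] + newValue + matches[3];
}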
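Since the parameter order is not guaranteed, another option that none of the answers above use is the built-in URLSearchParams class. This is only a sketch: the helper name setQueryParam is made up, it assumes the string contains a single '?' and no '#' fragment, and toString() re-serializes the whole query, so the encoding of other values may be normalized.

```javascript
// Hypothetical helper: set one query parameter regardless of its position.
function setQueryParam(url, name, value) {
  const q = url.indexOf('?');
  if (q === -1) return url;                       // no query string at all

  const path = url.slice(0, q);
  const params = new URLSearchParams(url.slice(q + 1));
  params.set(name, String(value));                // overwrites the existing value in place

  return path + '?' + params.toString();
}

console.log(setQueryParam('/channels/mtb/videos?page=2&per_page=100&sort=date', 'per_page', 33));
// "/channels/mtb/videos?page=2&per_page=33&sort=date"
```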

javascript - regexp exec internal index doesn't progress if first char is not a match

I need to match numbers that are not preceded by "/" in a group.
In order to do this I made the following regex:
var reg = /(^|[^,\/])([0-9]*\.?[0-9]*)/g;
First part matches start of the string and anything else except "/", second part matches a number. Everything works ok regarding the regex (it matches what I need). I use https://regex101.com/ for testing. Example here: https://regex101.com/r/7UwEUn/1
The problem is that when I use it in js (script below) it goes into an infinite loop if first character of the string is not a number. At a closer look it seems to keep matching the start of the string, never progressing further.
var reg = /(^|[^,\/])([0-9]*\.?[0-9]*)/g;
var text = "a 1 b";
while (match = reg.exec(text)) {
if (typeof match[2] != 'undefined' && match[2] != '') {
numbers.push({'index': match.index + match[1].length, 'value': match[2]});
}
}
If the string starts with a number ("1 a b") all is fine.
The problem appears to be here (^|[^,/]) - removing ^| will fix the issue with infinite loop but it will not match what I need in strings starting with numbers.
Any idea why the internal index is not progressing?
The infinite loop is caused by the fact that your regex can match an empty string. You are not likely to need empty matches (even judging by your code), so make it match at least one digit by replacing the last * with +:
var reg = /(^|[^,\/])([0-9]*\.?[0-9]+)/g;
var text = "a 1 b a 2 ana 1/2 are mere (55";
var numbers=[];
while (match = reg.exec(text)) {
numbers.push({'index': match.index + match[1].length, 'value': match[2]});
}
console.log(numbers);
Note that this regex will not match numbers like 34. (with a trailing dot); to handle those you may use /(^|[^,\/])([0-9]*\.?[0-9]+|[0-9]*\.)/g.
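To see why the loop in the question never ends, here is a bounded sketch that calls exec only three times with the question's original regex and records where each match lands. The seen array is just for illustration:

```javascript
const reg = /(^|[^,\/])([0-9]*\.?[0-9]*)/g; // original regex: can match an empty string
const text = "a 1 b";
const seen = [];

// Call exec a fixed number of times instead of looping until null,
// because with this regex and input the loop would never terminate.
for (let i = 0; i < 3; i++) {
  const match = reg.exec(text);
  seen.push({ index: match.index, length: match[0].length, lastIndex: reg.lastIndex });
}

console.log(seen);
// Every entry is { index: 0, length: 0, lastIndex: 0 }:
// the empty match at the start leaves lastIndex at 0, so exec keeps retrying there.
```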
Alternatively, you may use another "trick": advance the regex lastIndex manually whenever the match is empty (that is, when match.index equals reg.lastIndex):
var reg = /(^|[^,\/])([0-9]*\.?[0-9]+)/g;
var text = "a 1 b a 2 ana 1/2 are mere (55";
var numbers=[];
while (match = reg.exec(text)) {
if (match.index === reg.lastIndex) {
reg.lastIndex++;
}
if (match[2]) numbers.push({'index': match.index + match[1].length, 'value': match[2]});
}
console.log(numbers);
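For comparison, here is a sketch of the same lastIndex guard applied to the question's original regex (the one that can match an empty string), using the question's own input "a 1 b". This is where the guard actually matters:

```javascript
const reg = /(^|[^,\/])([0-9]*\.?[0-9]*)/g; // original regex from the question
const text = "a 1 b";
const numbers = [];
let match;

while ((match = reg.exec(text)) !== null) {
  // An empty match leaves lastIndex where it was; bump it so the search moves on.
  if (match.index === reg.lastIndex) {
    reg.lastIndex++;
  }
  // Group 2 may be an empty string with this regex, so skip those matches.
  if (match[2]) {
    numbers.push({ index: match.index + match[1].length, value: match[2] });
  }
}

console.log(numbers); // [ { index: 2, value: "1" } ]
```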

Add colon (:) after every 2nd character using Javascript

I have a string and want to add a colon after every 2nd character (but not after the last set), eg:
12345678
becomes
12:34:56:78
I've been using .replace(), eg:
mystring = mystring.replace(/(.{2})/g, NOT SURE WHAT GOES HERE)
but none of the replacement patterns for : that I've tried work, and I haven't been able to find anything useful on Google.
Can anyone point me in the right direction?
Without the need to remove any trailing colons:
mystring = mystring.replace(/..\B/g, '$&:')
\B matches a zero-width non-word boundary; in other words, when it hits the end of the string, it won't match (for a string of word characters such as digits, the end of the string is a word boundary) and therefore won't perform the replacement (hence no trailing colon, either).
$& contains the matched substring (so you don't need to use a capture group).
mystring = mystring.replace(/(..)/g, '$1:').slice(0,-1)
This is what comes to mind immediately. I just strip off the final character to get rid of the colon at the end.
If you want to use this for odd length strings as well, you just need to make the second character optional. Like so:
mystring = mystring.replace(/(..?)/g, '$1:').slice(0,-1)
If you're looking for approach other than RegEx, try this:
var str = '12345678';
var output = '';
for(var i = 0; i < str.length; i++) {
output += str.charAt(i);
if(i % 2 == 1 && i > 0) {
output += ':';
}
}
alert(output.substring(0, output.length - 1));
A somewhat different approach without regex could be using Array.prototype.reduce:
Array.prototype.reduce.call('12345678', function(acc, item, index){
return acc += index && index % 2 === 0 ? ':' + item : item;
}, ''); //12:34:56:78
mystring = mystring.replace(/(.{2})/g, ':$1').slice(1)
Try this.
Easy, just match every group of up-to 2 characters and join the array with ':'
mystring.match(/.{1,2}/g).join(':')
var mystring = '12345678';
document.write(mystring.match(/.{1,2}/g).join(':'))
no string slicing / trimming required.
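The match-and-join idea generalizes to any chunk size and separator. A minimal sketch, with a made-up helper name insertEvery, assuming size is a positive integer and the input contains no line breaks (the . in the pattern does not match them):

```javascript
// Hypothetical helper: insert `separator` after every `size` characters.
function insertEvery(str, size, separator) {
  // Build /.{1,size}/g so the final chunk may be shorter than `size`.
  const chunks = str.match(new RegExp('.{1,' + size + '}', 'g'));
  // match() returns null for an empty string, so handle that case explicitly.
  return chunks === null ? '' : chunks.join(separator);
}

console.log(insertEvery('12345678', 2, ':')); // "12:34:56:78"
console.log(insertEvery('1234567', 2, ':'));  // "12:34:56:7"
```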
It's easier if you tweak what you're searching for to avoid an end-of-line colon (using a negative lookahead):
mystring = mystring.replace(/(.{2})(?!$)/g, '$1:');
mystring = mystring.replace(/(.{2})/g, '$1:')
Give that a try. Note that it leaves a trailing colon on even-length strings, which you would still need to strip.
I like my approach the best :)
function colonizer(strIn){
var rebuiltString = '';
strIn.split('').forEach(function(ltr, i){
(i % 2) ? rebuiltString += ltr + ':' : rebuiltString += ltr;
});
return rebuiltString;
}
alert(colonizer('Nicholas Abrams'));
Here is a demo
http://codepen.io/anon/pen/BjjNJj

Replace string between second set of [ and ]

I am learning regex, and I have a question. Consider
var s = "YYYN[1-20]N[]NYY";
Now I want to insert '1-8' between the [ and ] at their second occurrence.
Then output should be
YYYN[1-20]N[1-8]NYY
For that I had tried using replace and passing a function through it as shown below:
var nth = 0;
s = s.replace(/\[([^)]+)\]/g, function(match, i, original) {
nth++;
return (nth === 1) ? "1-8" : match;
});
alert(s); // But it won't work
I think that regex is not matching the string that I am using.
How can I fix it?
Your regex \[([^)]+)\] will not match empty square brackets since + requires at least 1 character other than ). I guess you wanted to write \[[^\]]*\].
Here is a fix for your solution:
var s = "YYYN[1-20]N[]NYY";
var nth = 0;
s = s.replace(/\[[^\]]*\]/g, function (match, i, original) {
nth++;
return (nth === 2) ? "[1-8]" : match; // replace only the second pair of brackets
});
alert(s);
Here is another way of doing it:
var s = "YYYN[1-20]N[]NYY";
s = s.replace(/(.*)\[\]/, "$1[1-8]");
alert(s);
The regex (.*)\[\] matches and captures into Group 1 greedily as much text as possible (thus we get the last set of empty []), and then matches empty square brackets. Then we restore the text before [] with the $1 backreference and add our string 1-8.
If there are only two occurrences of square brackets, then this will work:
/(.*\[.*?\].*\[).*?(\].*)/
This RegEx has “YYYN[1-20]N[” as the first capturing group and “]NYY” as the second.
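One way to use those two capturing groups is with a replacement function, sketched below. A function is used rather than a replacement string because a string like "$11-8$2" puts a digit right after $1, which is hard to read and easy to get wrong. The parameter names before and after are only illustrative:

```javascript
const s = 'YYYN[1-20]N[]NYY';
const re = /(.*\[.*?\].*\[).*?(\].*)/;

// `before` is everything up to and including the second "[",
// `after` is the closing "]" and the rest of the string.
const result = s.replace(re, (whole, before, after) => before + '1-8' + after);

console.log(result); // "YYYN[1-20]N[1-8]NYY"
```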
I suggest using simple split and join operations:
var s = "YYYN[1-20]N[]NYY";
var arr = s.split(/\[/)
arr[2] = '1-8' + arr[2]
var r = arr.join('[')
//=> YYYN[1-20]N[1-8]NYY
You can use the following regex:
var s = "YYYN[1-20]N[]NYY";
s = s.replace(/([^[]+\[[^\]]*\][^[]+)\[[^\]]*\](.+)/, "$1[1-8]$2");
alert(s);
The first part ([^[]+\[[^\]]*\][^[]+) matches everything up to and including the first [...] pair plus the text after it, \[[^\]]*\] matches the second pair (the one you want to replace, which may be empty), and the last part (.+) matches the rest of your string.
