Regex to find emoji names with colon and skintone - javascript

I'm using EmojiMart for my parser.
I've seen this related question, but it seems to be different from mine.
I need to return the emoji names, or :code:, so that they can be decoded.
For example, I have this text:
:+1::skin-tone-6::man-pouting:Hello world:skin-tone-6:lalalalla:person_with_pouting_face: :poop::skin-tone-11: mamamia :smile: :skin-tone-6:
It should match the whole :+1::skin-tone-6:
and not :+1: and :skin-tone-6: separately, but only if there's no space between them (notice the space between :smile: and :skin-tone-6:).
Conditions:
It should only match the :code::skintone: if skintone is 2-6
If I do str.split(regex) this is my expected result (array):
- :+1::skin-tone-6:
- :man-pouting:
- Hello world
- :skin-tone-6:
- lalalalla
- :person_with_pouting_face:
- :poop:
- :skin-tone-11:
- mamamia
- :smile:
- :skin-tone-6:

You may use String#split() with the
/(:[^\s:]+(?:::skin-tone-[2-6])?:)/
regex. See the regex demo.
Details
: - a colon
[^\s:]+ - 1+ chars other than whitespace and :
(?:::skin-tone-[2-6])? - an optional sequence of
::skin-tone- - a literal substring
[2-6] - a digit from 2 to 6
: - a colon.
JS demo:
var s = ":+1::skin-tone-6::man-pouting:Hello world:skin-tone-6:lalalalla:person_with_pouting_face: :poop::skin-tone-11: mamamia :smile: :skin-tone-6:";
var reg = /(:[^\s:]+(?:::skin-tone-[2-6])?:)/;
console.log(s.split(reg).filter(x => x.trim().length !=0 ));
The .filter(x => x.trim().length !=0 ) removes all blank items from the resulting array. For ES5 and older, use .filter(function(x) { return x.trim().length != 0; }).
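If the pieces need to be told apart after splitting, the same pattern can be reused to label each one. This is only a minimal sketch and not part of EmojiMart: the tokenize helper and the { type, value } shape are assumptions made for illustration.

```javascript
// Same pattern as in the answer above; the capturing group keeps the
// matched codes in the split() result.
const EMOJI_RE = /(:[^\s:]+(?:::skin-tone-[2-6])?:)/;

// Hypothetical helper: labels each non-blank piece as "emoji" or "text".
function tokenize(input) {
  return input
    .split(EMOJI_RE)
    // With a single capturing group, split() puts the matches at odd indices.
    .map((piece, i) => ({ type: i % 2 === 1 ? "emoji" : "text", value: piece.trim() }))
    .filter(token => token.value.length !== 0);
}

const tokens = tokenize(":+1::skin-tone-6: Hi :poop::skin-tone-11:");
console.log(tokens);
```

Note that :skin-tone-11: is still labelled as an emoji code here, because the pattern accepts any :code:; only tones 2-6 are merged with the preceding code.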

Related

Javascript To Return an Array of Words Over a Certain Length

I need to return the words from a string that are over a certain length, and do this in one line of code.
Say I need to return all words over 2 chars in length...
So far I have...
const wordsOver2Chars = str => str.match(/\w+\s+(.{2,})/g);
console.log(
wordsOver2Chars('w gh w qwe regh aerguh eriygarew hw whio wh w')
);
This does not work.
str.match(/\w+\s+/g) will return an array of words but I cannot figure out how to add in the length limiter as well.
Using split(' ').match(regExp) throws an error.
Use .split then .filter.
console.log('w gh w qwe regh aerguh eriygarew hw whio wh w'.split(" ").filter(word => word.length > 2))
The \w metacharacter matches a word character. Adding a + quantifier requires a chain of at least one word character; each extra \w placed in front raises the minimum length by one, so \w\w+ needs at least 2 characters and \w\w\w+ at least 3.
const wordsOver2Chars = str => str.match(/\w\w\w+/g);
console.log(wordsOver2Chars('w gh w qwe regh aerguh eriygarew hw whio wh w'));
This is probably the easiest approach to understand: you match a single word character, followed by another one, followed by a chain of one or more.
A more concise way is to use curly braces to set the count: {3,} means a minimum of 3 with no maximum (the empty slot after the comma).
const wordsOver2Chars = str => str.match(/\w{3,}/g);
console.log(wordsOver2Chars('w gh w qwe regh aerguh eriygarew hw whio wh w'));
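To make the minimum length configurable, the quantifier can be built at runtime. A minimal sketch, assuming a hypothetical wordsLongerThan helper; the name and the `|| []` fallback are not from the answers above.

```javascript
// Hypothetical helper: returns the words that have more than `n` word characters.
// `n` is assumed to be a non-negative integer.
const wordsLongerThan = (str, n) =>
  // \w{n+1,} means "at least n+1 word characters";
  // match() returns null when nothing matches, hence the fallback to [].
  str.match(new RegExp(`\\w{${n + 1},}`, "g")) || [];

const longWords = wordsLongerThan("w gh w qwe regh aerguh eriygarew hw whio wh w", 2);
console.log(longWords); // [ 'qwe', 'regh', 'aerguh', 'eriygarew', 'whio' ]
```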
Do you need to use regex? The easiest way would be to do
const wordsOver2Chars = s => s.split(' ').filter(w => w.length > 2).join(' ');
Split the string at ' ', then filter the resulting array to only contain words with length > 2 and join them again.

Regex to extract two numbers with spaces from string

I have a problem with a simple regex. I have example strings like:
Something1\sth2\n649 sth\n670 sth x
Sth1\n\something2\n42 036 sth\n42 896 sth y
I want to extract these numbers from the strings. From the first example I need two groups: 649 and 670. From the second example: 42 036 and 42 896. Then I will remove the space.
Currently I have something like this:
\d+ ?\d+
But it is not a good solution.
You can use
\n\d+(?: \d+)?
\n - Match a newline
\d+ - Match a digit from 0 to 9 one or more times
(?: \d+)? - Match a space followed by one or more digits ( ? makes the group optional )
let strs = ["Something1\sth2\n649 sth\n670 sth x","Sth1\n\something2\n42 036 sth\n42 896 sth y"]
let extractNumbers = str => {
return str.match(/\n\d+(?: \d+)?/g).map(m => m.replace(/\s+/g,''))
}
strs.forEach(str=> console.log(extractNumbers(str)))
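If you need to remove the spaces anyway, the easiest way is to remove them first and then scrape the numbers, using two different regexes. This assumes the string contains a literal backslash followed by n, not a real newline.
str.replace(/\s+/g, '').match(/\\n(\d+)/g)
First you remove the whitespace with replace, using the \s token with a + quantifier; the g flag is needed so that every run of whitespace is removed, not just the first one.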
Then you capture the numbers using \\n(\d+).
The first part of the regex helps us make sure we are not capturing numbers that are not following a new line, using \ to escape the \ from \n.
The second part (\d+) is the actual match group.
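The two-step approach above can be sketched as follows, assuming the input really contains a literal backslash followed by n (String.raw is used here to build such strings) and that the whitespace replacement carries the g flag. The helper name extractAfterLiteralNewline is hypothetical.

```javascript
// String.raw keeps the backslashes, so "\n" below is two characters:
// a backslash and the letter n.
const samples = [
  String.raw`Something1\sth2\n649 sth\n670 sth x`,
  String.raw`Sth1\n\something2\n42 036 sth\n42 896 sth y`,
];

// Hypothetical helper: strip all whitespace, then collect the digits
// that directly follow a literal "\n".
const extractAfterLiteralNewline = str =>
  (str.replace(/\s+/g, "").match(/\\n(\d+)/g) || [])
    .map(m => m.slice(2)); // drop the leading backslash and "n"

const results = samples.map(extractAfterLiteralNewline);
console.log(results); // [ [ '649', '670' ], [ '42036', '42896' ] ]
```

Because match() with the g flag returns whole matches rather than capture groups, the leading two characters are sliced off by hand.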
var str1 = "Something1\sth2\n649 sth\n670 sth x";
var str2 = "Sth1\n\something2\n42 036 sth\n42 896 sth y";
var reg = /(?<=\n)(\d+)(?: (\d+))?/g;
var d;
while(d = reg.exec(str1)){
console.log(d[2] ? d[1]+d[2] : d[1]);
}
console.log("****************************");
while(d = reg.exec(str2)){
console.log(d[2] ? d[1]+d[2] : d[1]);
}

How to get from the string substring after second hyphen?

If a string has three or more hyphens, I need to get the substring after the second hyphen.
For example, I have this string:
"someStr1 - someStr2 - someStr3 - someStr4"
As you can see it has 3 hyphens, I need to get from string above substring:
"someStr3 - someStr4"
I know that I need to get the index of the second hyphen and then I can use the substring function.
But I don't know how to check whether there are three or more hyphens, or how to find the position of the second hyphen.
You can use the RegEx (?<=([^-]*-){2}).*
(?<=([^-]*-){2}) makes sure there are two - before your match
(?<= ... ) is a positive lookbehind
[^-]* matches anything but a -, 0 or more times
- matches - literally
.* matches anything after those 2 dashes.
Demo.
const data = "someStr1 - someStr2 - someStr3 - someStr4";
console.log(/(?<=([^-]*-){2}).*/.exec(data)[0]);
Split string to array with - and check if array.length > 3 which means at least three - in the string. If true, join the array from index == 2 to the end with - and trim the string.
var text = "someStr1 - someStr2 - someStr3 - someStr4"
var textArray = text.split('-')
if(textArray.length>3){
console.log(textArray.slice(2).join('-').trim())
}
How about something like this:
var testStr = "someStr1 - someStr2 - someStr3 - someStr4";
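var hyphenCount = (testStr.match(/-/g) || []).length; // match() returns null when there is no hyphen
if(hyphenCount > 2){
var reqStr = testStr.split('-').slice(2).join('-').trim(); // slice(2) keeps everything after the second hyphen
console.log(reqStr) // logs "someStr3 - someStr4"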
}
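The index-based idea from the question (find the second hyphen, then take the substring) can also be written without a regex. A minimal sketch, assuming a hypothetical afterSecondHyphen helper that returns null when there are fewer than three hyphens:

```javascript
// Hypothetical helper: returns the trimmed text after the second "-",
// or null when the string has fewer than three hyphens.
function afterSecondHyphen(str) {
  const first = str.indexOf("-");
  const second = first === -1 ? -1 : str.indexOf("-", first + 1);
  const third = second === -1 ? -1 : str.indexOf("-", second + 1);
  if (third === -1) return null; // fewer than three hyphens
  return str.slice(second + 1).trim();
}

console.log(afterSecondHyphen("someStr1 - someStr2 - someStr3 - someStr4")); // "someStr3 - someStr4"
console.log(afterSecondHyphen("a - b - c")); // null
```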

validate text field using regular expression javaScript

I want to validate a text field to accept just text like this :
1,2;2,3;1-3
1-2;4;2,3;4;1-3
12
I don't want the types like this :
;1
,1
-1
1;;2
1,,2
1--2
1-2-3
1,2,3
1,2-3
So I wrote this regular expression, but it doesn't seem to work the way I want:
var reg = /^\d*(((?!.*--)(?!.*,,)(?!.*;;)(?!.*,;)(?!.*,-)(?!.*-;)(?!.*-,)(?!.*;,)(?!.*;-))[,-;])*\d$/
thanks for your help :)
You can simply use this regex:
function match(str){
return str.match(/^(?!.*([-,])\d+\1)(?!.*,\d+-)\d+(?:[-,;]\d+)*$/) != null
}
console.log(match(';1'));
console.log(match(',1'));
console.log(match('1;;2'));
console.log(match('1-3'));
console.log(match('12'));
console.log(match('1,2;2,3;1-3'));
console.log(match('1-2;4;2,3;4;1-3'));
console.log(match('1,2,3'));
Take a look at the regex demo.
Here's my attempt. Based on your examples I've assumed that semi-colons are used to separate 'ranges', where a 'range' can be a single number or a pair separated by either a comma or a hyphen.
var re = /^\d+([,\-]\d+)?(;\d+([,\-]\d+)?)*$/;
// Test cases
[
'1',
'1,2',
'1-2',
'1;2',
'1,2;2,3;1-3',
'1-2;4;2,3;4;1-3',
'12',
';1',
',1',
'-1',
'1;;2',
'1,,2',
'1--2',
'1-2-3',
'1,2,3',
'1,2-3'
].forEach(function(str) {
console.log(re.test(str));
});
The first part, \d+([,\-]\d+)? matches a 'range' and the second part (;\d+([,\-]\d+)?)* allows further 'ranges' to be added, each starting with a semi-colon.
You can add in ?: to make the groups non-capturing if you like. That's probably a good idea but I wanted to keep my example as simple as I could so I've left them out.
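For reference, this is what the same pattern could look like with every group made non-capturing. The accepted and rejected arrays are just the test cases from above split into two lists for illustration.

```javascript
// Same pattern as above, with every group made non-capturing via ?:
const re = /^\d+(?:[,-]\d+)?(?:;\d+(?:[,-]\d+)?)*$/;

const accepted = ["1", "1,2", "1-2", "1;2", "1,2;2,3;1-3", "1-2;4;2,3;4;1-3", "12"];
const rejected = [";1", ",1", "-1", "1;;2", "1,,2", "1--2", "1-2-3", "1,2,3", "1,2-3"];

const allAccepted = accepted.every(s => re.test(s));
const anyRejectedPasses = rejected.some(s => re.test(s));
console.log(allAccepted);       // true
console.log(anyRejectedPasses); // false
```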
You may use
/^\d+(?:(?:[-,;]\d+){3,})?$/
See the regex demo
Details
^ - start of string
\d+ - 1 or more digits
(?:(?:[-,;]\d+){3,})? - 1 or 0 sequences of:
(?:[-,;]\d+){3,} - 3 or more sequences of:
[-,;] - a -, , or ;
\d+ - 1 or more digits
$ - end of string
var ss = [ '1,2;2,3;1-3','1-2;4;2,3;4;1-3','12',';1',',1','-1','1;;2','1,,2','1--2','1-2-3','1,2,3','1,2-3',';1',',1','-1','1;;2','1,,2','1--2' ];
var rx = /^\d+(?:(?:[-,;]\d+){3,})?$/;
for (var s of ss) {
console.log(s, "=>", rx.test(s));
}
NOTE: the [,-;] creates a range between , and ; and matches much more than just ,, - or ; (see demo).
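A quick way to see the difference between the two character classes is to probe them with a few single characters. This is only an illustrative sketch; the probes list is arbitrary.

```javascript
const range = /^[,-;]$/;   // "," through ";" in code-point order (0x2C to 0x3B)
const literal = /^[-,;]$/; // exactly "-", "," or ";"

const probes = [",", "-", ";", ".", "/", "5", ":"];
const matchedByRange = probes.filter(c => range.test(c));
const matchedByLiteral = probes.filter(c => literal.test(c));

console.log(matchedByRange);   // [ ',', '-', ';', '.', '/', '5', ':' ]
console.log(matchedByLiteral); // [ ',', '-', ';' ]
```

Placing the hyphen first (or last) in the class keeps it from being read as a range operator.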

Retrieve BSR and category from string with RegExp

When I parse Amazon products I get a string like this:
"#19 in Home Improvements (See top 100)"
I figured out how to retrieve the BSR number, which is /#\d*/
But I have no idea how to retrieve the category, which comes after "in" and ends before the parenthesized part (See top 100).
I suggest
#(\d+)\s+in\s+([^(]+?)\s*\(
See the regex demo
var re = /#(\d+)\s+in\s+([^(]+?)\s*\(/;
var str = '#19 in Home Improvements (See top 100)';
var m = re.exec(str);
if (m) {
console.log(m[1]);
console.log(m[2]);
}
Pattern details:
# - a hash
(\d+) - Group 1 capturing 1 or more digits
\s+in\s+ - in enclosed with 1 or more whitespaces
([^(]+?) - Group 2 capturing 1 or more chars other than ( as few as possible before the first...
\s*\( - 0+ whitespaces and a literal (.
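The same pattern can also be written with named groups so that the result reads as an object. A minimal sketch: the parseBsr helper and the group names rank and category are chosen for illustration and are not from the answer above.

```javascript
// Hypothetical helper: returns { rank, category } or null when the text does not match.
function parseBsr(text) {
  // Same pattern as above, with the two capturing groups given names.
  const m = /#(?<rank>\d+)\s+in\s+(?<category>[^(]+?)\s*\(/.exec(text);
  return m ? { rank: Number(m.groups.rank), category: m.groups.category } : null;
}

const parsed = parseBsr("#19 in Home Improvements (See top 100)");
console.log(parsed); // { rank: 19, category: 'Home Improvements' }
```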
