javascript regular expression match special string - javascript

I have a string like this:
惊讶! 学会这些 625 单词就可以走遍天下!
(Animal) ​动物​:
Dog ​- 狗, ​Cat ​- 猫, ​Fish ​- 鱼, ​Bird ​- 鸟, ​Cow ​- 牛, ​Pig ​- 猪, ​Mouse ​- 老鼠,
Horse ​- 马, ​Wing ​- 翅膀, ​Animal ​- 动物.
(Transportation) ​交通​:
Train ​- 火车, ​Plane ​- 飞机, ​Car ​- 汽车, ​Truck ​- 卡车, ​Bicycle ​- 自行车​
,
I want to match pairs such as Dog - 狗, Fish - 鱼, and so on.
My regular expression is const reg = /(.*)\s*-\s*(.*),/g, but the result is not what I expect. How do I write a correct one?
The final result should be [[Dog,狗],[Cat,猫],...]. I know I should use a regular expression, but I am having trouble with it.

You can use
/(\S+)[\s\u200B]*-[\s\u200B]*([^,]*)/g
See the regex demo.
Details:
(\S+) - Group 1: one or more non-whitespace chars
[\s\u200B]* - zero or more whitespace chars or zero-width spaces (U+200B)
- - a hyphen
[\s\u200B]* - zero or more whitespace chars or zero-width spaces (U+200B)
([^,]*) - Group 2: zero or more chars other than a comma.
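As a rough sketch of how the pattern above could produce the [[Dog,狗],[Cat,猫],...] array, the snippet below runs it with String.prototype.matchAll on a shortened sample. The sample string and the stripZwsp helper are assumptions for illustration; zero-width spaces are written as \u200B escapes so they stay visible. Because \S also matches U+200B, a captured word can carry a stray zero-width space, which the helper removes.

```javascript
// Shortened sample; \u200B marks the zero-width spaces present in the original text.
const text = 'Dog \u200B- 狗, \u200BCat \u200B- 猫, \u200BFish \u200B- 鱼';
const reg = /(\S+)[\s\u200B]*-[\s\u200B]*([^,]*)/g;

// Hypothetical helper: drop zero-width spaces and surrounding whitespace.
const stripZwsp = (s) => s.replace(/\u200B/g, '').trim();

// Collect [word, translation] pairs from capture groups 1 and 2.
const pairs = Array.from(text.matchAll(reg), (m) => [stripZwsp(m[1]), stripZwsp(m[2])]);
console.log(pairs);
```

On the full text from the question, Group 2 runs up to the next comma, so the last entry before a heading such as (Transportation) would also swallow that heading; splitting the text into lines first is one way to avoid that.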

Related

Regex add space in string if the word is longer than 4 characters and have numbers

I am trying to create a regex with two conditions:
the word is longer than 4 characters
and the word contains numbers.
In that case I need to add a space.
For example, iph12 should return iph12, but iphone12 should return iphone 12.
I wrote this regex:
.replace(/\d+/gi, ' $& ').trim()
but it always returns a string like iphone 12. I also tried
.replace(/(?=[A-Z]+\d|\d+[A-Z])[A-Z\d]{,4}/i, ' $& ').trim()
but without the second argument in {,4} it does not work. Is this possible?
You can use
text.replace(/\b([a-zA-Z]{4,})(\d+)\b/g, '$1 $2')
See the regex demo. Details:
\b - word boundary
([a-zA-Z]{4,}) - Group 1: four or more ASCII letters
(\d+) - Group 2: one or more digits
\b - word boundary
See the JavaScript demo:
const texts = ['iphone12', 'iph12'];
const regex = /\b([a-zA-Z]{4,})(\d+)\b/g;
for (const text of texts) {
  console.log(text, '=>', text.replace(regex, '$1 $2'));
}
Output:
iphone12 => iphone 12
iph12 => iph12

Regex for first number after a number and period

I have an ordered list with names and addresses that is structured like:
1. Last, First 123 Main St buncha
buncha buncha
2. Lasta, Firsta 234 Lane St etc etc
So I need a regex that finds the number that immediately follows the number with a period. In this case the result should be an array containing [123, 234]. I have tried a couple of patterns. The one that I think is the closest is
/(?![0-9]+\.)[0-9]+/gim;
Unfortunately this just returns every number, but I think it's in the right area. Any help would be appreciated.
Use a positive lookbehind to explicitly match the number, period, and the text between that and the number in the address.
const string = `1. Last, First 123 Main St buncha
buncha buncha
2. Lasta, Firsta 234 Lane St etc etc`;
let regex = /(?<=^\d+\.\D*)\d+/gm;
console.log(string.match(regex));
Something like this
const source =
`1. Last, First 123 Main St buncha
buncha buncha
2. Lasta, Firsta 234 Lane St etc etc`;
const result = source.match(/(?<=(\d+\..+))\d+/gm);
console.log(result);
You could also use a capturing group
^\d+\.\D*(\d+)
Explanation
^ Start of string (or of each line when the m flag is set, as in the code below)
\d+\. Match 1+ digits and .
\D* Match 0+ any char except a digit
(\d+) Capture group 1, match 1+ digits
Regex demo
const regex = /^\d+\.\D*(\d+)/gm;
const str = `1. Last, First 123 Main St buncha
buncha buncha
2. Lasta, Firsta 234 Lane St etc etc`;
let res = Array.from(str.matchAll(regex), m => m[1]);
console.log(res);

Regex Conditional Lookahead JavaScript

I just used regex101 to create the following regex.
([^,]*?)=(.*?)(?(?=, )(?:, )|(?:$))(?(?=[^,]*?=)(?:(?=[^,]*?=))|(?:$))
It seems to work perfectly for my use case of getting keys and values that are comma separated while still preserving commas in the values.
Problem is, I want to use this Regex in Node.js (JavaScript), but while writing this entire Regex in regex101, I had it set to PCRE (PHP).
It looks like JavaScript doesn't support Conditional Lookaheads ((?(?=...)()|()).
Is there a way to get this working in JavaScript?
Examples:
2 matches
group 1: id, group 2: 1
group 1: name, group 2: bob
id=1, name=bob
3 matches
group 1: id, group 2: 2
group 1: type, group 2: store
group 1: description, group 2: Hardwood Store
id=2, type=store, description=Hardwood Store
4 matches
group 1: id, group 2: 4
group 1: type, group 2: road
group 1: name, group 2: The longest road name, in the entire world, and universe, forever
group 1: built, group 2: 20190714
id=4, type=road, name=The longest road name, in the entire world, and universe, forever, built=20190714
3 matches
group 1: id, group 2: 3
group 1: type, group 2: building
group 1: builder, group 2: Random Name, and Other Person, with help from Final Person
id=3, type=building, builder=Random Name, and Other Person, with help from Final Person
You may use
/([^,=\s][^,=]*)=(.*?)(?=(?:,\s*)?[^,=]*=|$)/g
See the regex demo.
Details
([^,=\s][^,=]*) - Group 1:
[^,=\s] - a char other than ,, = and whitespace
[^,=]* - zero or more chars other than , and =
= - a = char
(.*?) - Group 2: any zero or more chars other than line break chars, as few as possible
(?=(?:,\s*)?[^,=]*=|$) - a positive lookahead that requires an optional sequence of , and 0+ whitespaces and then 0+ chars other than , and = and then a = or end of string immediately to the right of the current location
JS demo:
var strs = ['id=1, name=bob','id=2, type=store, description=Hardwood Store', 'id=4, type=road, name=The longest road name, in the entire world, and universe, forever, built=20190714','id=3, type=building, builder=Random Name, and Other Person, with help from Final Person']
var rx = /([^,=\s][^,=]*)=(.*?)(?=(?:,\s*)?[^,=]*=|$)/g;
for (var s of strs) {
  console.log("STRING:", s);
  var m;
  while (m=rx.exec(s)) {
    console.log(m[1], m[2])
  }
}
Maybe, these expressions would be somewhat close to what you might want to design:
([^=\n\r]*)=\s*([^=\n\r]*)\s*(?:,|$)
or
\s*([^=\n\r]*)=\s*([^=\n\r]*)\s*(?:,|$)
not sure though.
DEMO
The expression is explained on the top right panel of this demo if you wish to explore/simplify/modify it.
const regex = /\s*([^=\n\r]*)=\s*([^=\n\r]*)\s*(?:,|$)/gm;
const str = `id=3, type=building, builder=Random Name, and Other Person, with help from Final Person
id=4, type=road, name=The longest road name, in the entire world, and universe, forever, built=20190714
id=2, type=store, description=Hardwood Store
id=1, name=bob
`;
let m;
while ((m = regex.exec(str)) !== null) {
  // This is necessary to avoid infinite loops with zero-width matches
  if (m.index === regex.lastIndex) {
    regex.lastIndex++;
  }
  // The result can be accessed through the `m`-variable.
  m.forEach((match, groupIndex) => {
    console.log(`Found match, group ${groupIndex}: ${match}`);
  });
}
Yet another way to do it
\s*([^,=]*?)\s*=\s*((?:(?![^,=]*=)[\S\s])*)(?=[=,]|$)
https://regex101.com/r/J6SSGr/1
Readable version
\s*
( [^,=]*? ) # (1), Key
\s* = \s* # =
( # (2 start), Value
(?:
(?! [^,=]* = )
[\S\s]
)*
) # (2 end)
(?= [=,] | $ )
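A minimal JavaScript sketch of the pattern above, assuming single-line input. The parsePairs name is hypothetical, and Object.fromEntries is just one way to collect the groups. JavaScript regex literals have no free-spacing mode, so the compact form of the pattern is used rather than the readable one.

```javascript
// Compact form of the pattern; group 1 is the key, group 2 is the value.
const rx = /\s*([^,=]*?)\s*=\s*((?:(?![^,=]*=)[\S\s])*)(?=[=,]|$)/g;

// Hypothetical helper: turn "k1=v1, k2=v2" into an object, keeping commas inside values.
function parsePairs(line) {
  return Object.fromEntries(Array.from(line.matchAll(rx), (m) => [m[1], m[2]]));
}

console.log(parsePairs('id=1, name=bob'));
console.log(parsePairs('id=3, type=building, builder=Random Name, and Other Person, with help from Final Person'));
```

Since [\S\s] also matches line breaks, a value could run across lines in multi-line input; feeding the function one line at a time avoids that.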
Ultimate PCRE version
\s*([^,=]*?)\s*=\s*((?:(?!\s*[^,=]*=)[\S\s])*(?<![,\s]))\s*(?=[=,\s]|$)
https://regex101.com/r/slfMR1/1
\s* # Wsp trim
( [^,=]*? ) # (1), Key
\s* = \s* # Wsp trim = Wsp trim
( # (2 start), Value
(?:
(?! \s* [^,=]* = )
[\S\s]
)*
(?<! [,\s] ) # Wsp trim
) # (2 end)
\s* # Wsp trim
(?= [=,\s] | $ ) # Field separator

Regex for comma separated 3 letter words

I want to create a regex that matches only words of exactly 3 letters, separated by commas. Each 3-letter word can be padded with spaces (at most 1 on each side).
Valid Examples:
ORD
JFK, LAX
ABC,DEF, GHK,REW, ASD
Invalid Examples:
ORDA
OR
ORD,
JFK, LA
I tried the following but couldn't get it to work.
^(?:[A-Z ]+,)*[A-Z ]{3} +$
Try this: ^([ ]?[A-Z]{3}[ ]?,)*([ ]?[A-Z]{3}[ ]?)+$
https://regex101.com/r/HFeN0D/2/
It matches at least one three-letter word (with optional spaces), preceded by any number of three-letter words with commas after them.
Try this pattern:
^[A-Z]{3}(?:[ ]?,[ ]?[A-Z]{3})*$
This pattern matches an initial three-letter word, followed by zero or more additional three-letter words, each preceded by a comma with an optional space on either side.
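A quick way to check this pattern against the examples from the question is sketched below; the samples array and variable names are assumptions for illustration. Note that, as written, the pattern does not accept a space before the first word or after the last one.

```javascript
const pattern = /^[A-Z]{3}(?:[ ]?,[ ]?[A-Z]{3})*$/;

// Valid and invalid examples taken from the question.
const samples = ['ORD', 'JFK, LAX', 'ABC,DEF, GHK,REW, ASD', 'ORDA', 'OR', 'ORD,', 'JFK, LA'];

// Keep only the strings that the pattern accepts in full.
const valid = samples.filter((s) => pattern.test(s));
console.log(valid);
```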
You can do this with the pattern: ^((?: ?[A-Z]{3} ?,)*(?: ?[A-Z]{3} ?))+$
var str = `ORD
JFK, LAX
ABC,DEF, GHK,REW, ASD
ORDA
OR
ORD,
JFK, LA`;
let result = str.match(/^((?: ?[A-Z]{3} ?,)*(?: ?[A-Z]{3} ?))+$/gm);
document.getElementById('match').innerHTML = result.join('<br>');
<p id="match"></p>

Using jQuery to find and replace wildcard text

I'm rather new to jQuery / JS and wondering how to do the following:
Big Sofa - Pink
Big Sofa - Blue
Small Sofa - Red
Small Sofa - Grey
What I need to do is remove all the text before and including the "-" so that only the colour is shown. It needs to work as a wildcard, so that whatever text comes before the "-" is matched and replaced with nothing.
Is this possible?
I would recommend using regex.
Here's an example that assumes each entry is in a span:
$(document).ready(function() {
  $("span").each(function() {
    var text = $(this).text();
    text = text.replace(/(\w+\s)+-\s/g, "");
    $(this).text(text);
  });
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<span>Sofa test - red</span>
<span>Table test - blue</span>
Try this simple one:
var a = "Big Sofa - Pink\nBig Sofa - Blue\nSmall Sofa - Red\nSmall Sofa - Grey"
var newA = a.split( "\n" ).map( function(value){ return value.split( "-" ).pop() } ).join( "\n" );
console.log( newA );
This regular expression will work on multiline strings:
var str = 'Big Sofa - Pink\nBig Sofa - Blue\nSmall Sofa - Red\nSmall Sofa - Grey';
str.replace(/.+\-\s*/g, '').split(/\n/);
You can do it like this:
var text = "Big Sofa - Pink";
var color = text.substring(text.indexOf("-") + 1).trim(); // trim() drops the space after the dash
Your best bet is to use regular expressions.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
var text = 'Big Sofa - Pink';
var color = text.replace(/^.+?- (\w+)$/i, '$1'); // Pink
^ matches the start of the string
. matches any character except line terminators (so it also matches white space)
+ indicates that the preceding token has to occur 1 or more times
? after a quantifier indicates that the matching is non-greedy (matches as little as possible)
"- " matches a literal dash followed by a space
(\w+) matches a word of 1 or more word characters (letters, digits and underscore) and captures it in a group (group 1 in this case)
$ matches the end of the string
i (at the end) indicates that the match is case insensitive
The replacement string ($1) replaces the entire match with the content of capture group 1, which in our case is the colour.
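To see the pattern applied to all four product names from the question, here is a small sketch in plain JavaScript (no jQuery); the names array and variable names are assumptions for illustration. If a name does not match the pattern, replace returns it unchanged.

```javascript
const names = ['Big Sofa - Pink', 'Big Sofa - Blue', 'Small Sofa - Red', 'Small Sofa - Grey'];

// Replace each full match with capture group 1, i.e. the colour after "- ".
const colours = names.map((name) => name.replace(/^.+?- (\w+)$/i, '$1'));
console.log(colours);
```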
