Match two equal characters in a word and replace them with one (regex)

I want to convert two equal adjacent characters into a single one, e.g. bannana should become banana (the "nn" collapses into a single "n"). The exception is "aa", which should stay as it is; every other doubled character should be converted as above.
i/p : khuddar >> o/p : khudar
i/p : maanas >> o/p : maanas
i/p : hello >> o/p : helo
i/p : apple >> o/p : aple
I need a regular expression to do this kind of work.

Use a capturing group and a backreference.
Here's a JavaScript example:
"khuddar".replace(/([^a])\1/g, "$1")
// => "khudar"
"maanas".replace(/([^a])\1/g, "$1")
// => "maanas"
[^a] - matches any single character that is not a.
(...) - matches the enclosed expression and saves the match to group 1 (or 2, 3, ... for each further pair of parentheses).
\1 - a backreference to group 1. If the group matched b, \1 also matches b.

If you only need to match letters other than a, you can use
.replace(/([b-z])\1/ig, "$1")
Regex explanation:
([b-z]) - Capture group 1, capturing any ASCII letter from b to z (and B to Z, because the /i modifier makes the pattern case-insensitive)
\1 - an inline backreference that matches the text captured by the group above (thus, the pattern matches 2 identical ASCII letters)
In the replacement pattern, the $1 numbered backreference replaces the 2 identical ASCII letters with 1 occurrence of that letter.
var re = /([b-z])\1/gi;
var str = 'khuddar<br/>maanas<br/>hello<br/>apple<br/>F11';
var subst = '$1';
var result = str.replace(re, subst);
document.body.innerHTML = result;
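To make the difference between the two patterns above concrete, here is a minimal sketch that wraps each one in a function and runs the inputs from the question. The function names are made up for illustration. Note that both patterns collapse exactly one pair at a time, so a run of three equal letters would still leave two; using \1+ instead of \1 is one way to collapse longer runs, if that is what you need.

```javascript
// Hypothetical helper names; the patterns are the ones from the answers above.

// Collapses any doubled character except "a" (also digits, spaces, punctuation).
function collapseExceptA(word) {
  return word.replace(/([^a])\1/g, "$1");
}

// Collapses doubled ASCII letters only (b-z, case-insensitive).
function collapseLettersExceptA(word) {
  return word.replace(/([b-z])\1/gi, "$1");
}

for (const w of ["khuddar", "maanas", "hello", "apple", "bannana"]) {
  console.log(w + " >> " + collapseExceptA(w));
}
// khuddar >> khudar
// maanas >> maanas
// hello >> helo
// apple >> aple
// bannana >> banana

// The two patterns differ on non-letters:
console.log(collapseExceptA("F11"));        // "F1"
console.log(collapseLettersExceptA("F11")); // "F11"
```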

Related

Replace not numbers or words to underscore but leave dash and remove spaces around it

So I got this string
'word word - word word 24/03/21'
And I would like to convert it to
'word_word-word_word_24_03_21'
I have tried this
replace(/[^aA-zZ0-9]/g, '_')
But I get this instead
word_word___word_word_24_03_21

You can use 2 .replace() calls:
const s = 'word word - word word 24/03/21'
var r = s.replace(/\s*-\s*/g, '-').replace(/[^-\w]+/g, '_')
console.log(r)
//=> "word_word-word_word_24_03_21"
Explanation:
.replace(/\s*-\s*/g, '-'): Remove the whitespace surrounding a hyphen
.replace(/[^-\w]+/g, '_'): Replace every run of characters that are neither a hyphen nor a word character with a single underscore
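The order of the two calls matters: the hyphen has to be tidied up first, otherwise the spaces around it would already have become underscores. A small sketch that prints the intermediate string (the function name is hypothetical):

```javascript
// Hypothetical helper; same two patterns as in the answer above.
function toUnderscored(s) {
  // Step 1: drop whitespace around each hyphen.
  const tidied = s.replace(/\s*-\s*/g, "-");
  console.log(tidied);
  // Step 2: turn every remaining run of non-word, non-hyphen characters into "_".
  return tidied.replace(/[^-\w]+/g, "_");
}

console.log(toUnderscored("word word - word word 24/03/21"));
// word word-word word 24/03/21
// word_word-word_word_24_03_21
```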

You can use
console.log(
'word word - word word 24/03/21'.replace(/\s*(-)\s*|[^\w-]+/g, (x,y) => y || "_")
)
Here,
/\s*(-)\s*|[^\w-]+/g - either matches a - surrounded by zero or more whitespace characters, capturing the - into Group 1, or matches one or more characters that are neither word characters nor -
(x,y) => y || "_" - returns Group 1 if it matched; otherwise the replacement is a _ character.

With a function for replace and an alternation in the pattern, you could also match:
(\s*-\s*) Match a - between optional whitespace chars and capture it in group 1
| Or
[^a-zA-Z0-9-]+ Match 1+ characters that are not in the listed ranges and are not a -
In the callback, check if group 1 exists. If it does, return only a -, else return _
Note that the notation [^aA-zZ0-9] is not the same as [^a-zA-Z0-9]: the range A-z also covers the six ASCII characters that sit between Z and a, namely [ \ ] ^ _ and `.
let s = "word word - word word 24/03/21";
s = s.replace(/(\s*-\s*)|[^a-zA-Z0-9-]+/g, (_, g1) => g1 ? "-" : "_");
console.log(s);
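The remark about A-z can be checked directly. The sketch below lists the ASCII characters between "Z" (code 90) and "a" (code 97) and tests them against both character classes; the variable names are only for illustration.

```javascript
// Characters with ASCII codes 91..96, i.e. between "Z" and "a".
const between = [];
for (let code = 91; code <= 96; code++) {
  between.push(String.fromCharCode(code));
}
console.log(between.join(" ")); // [ \ ] ^ _ `

// [A-z] accepts all of them, [A-Za-z] accepts none.
const loose = /^[A-z]$/;
const strict = /^[A-Za-z]$/;
console.log(between.every((ch) => loose.test(ch))); // true
console.log(between.some((ch) => strict.test(ch))); // false
```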

You can use the + regex operator to replace 1 or more continuous matches at once.
let s = 'word word - word word 24/03/21';
let r = s
.replace(/[^a-zA-Z0-9]*-[^a-zA-Z0-9]*/g, '-')
.replace(/[^a-zA-Z0-9-]+/g, '_');
console.log(r);
// 'word_word-word_word_24_03_21'

Regex to match many times

I'm trying to match a type definition
def euro : t1 -> t2 -> t3 (and this pattern may repeat further in other examples)
I came up with this regex
^def ([^\s]*)\s:\s([^\s]*)(\s->\s[^\s]*)*
But while it matches euro and t1 it
then matches -> t2 rather than t2
fails to match anything with t3
I can't see what I am doing wrong, and my goal is to capture
euro t1 t2 t3
as four separate items, and what I currently get is
0: "def euro : t1 -> t2 -> t3"
1: "euro"
2: "t1"
3: " -> t3"

A repeated capturing group won't give you every value in a JS regex: the group is overwritten on each iteration, so all values but the last are dropped.
When creating a regular expression that needs a capturing group to grab part of the text matched, a common mistake is to repeat the capturing group instead of capturing a repeated group. The difference is that the repeated capturing group will capture only the last iteration, while a group capturing another group that's repeated will capture all iterations.
One way out is to capture the whole substring and then split it. Here is an example:
var s = "def euro : t1 -> t2 -> t3";
var rx = /^def (\S*)\s:\s(\S*)((?:\s->\s\S*)*)/;
var res = [];
var m = s.match(rx);
if (m) {
res = [m[1], m[2]];
for (var part of m[3].split(" -> ").filter(Boolean)) { // "part" avoids overwriting the input string s
res.push(part);
}
}
console.log(res);
Pattern details
^ - start of string
def - a literal substring
(\S*) - Capturing group 1: 0+ non-whitespace chars
\s:\s - a : enclosed with single whitespaces
(\S*) - Capturing group 2: 0+ non-whitespace chars
((?:\s->\s\S*)*) - Capturing group 3: 0+ repetitions of the following pattern sequences:
\s->\s - whitespace, ->, whitespace
\S* - 0+ non-whitespace chars
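The difference between repeating a capturing group and capturing a repeated group shows up directly in the match array. The sketch below prints group 3 for both variants and then wraps the capture-and-split idea in a small function. The name parseDef and the use of + instead of * (so that empty names are rejected) are choices made for this illustration, not part of the answer above.

```javascript
const input = "def euro : t1 -> t2 -> t3";

// Repeated capturing group: group 3 keeps only the last iteration.
const repeated = input.match(/^def (\S*)\s:\s(\S*)(\s->\s\S*)*/);
console.log(JSON.stringify(repeated[3])); // " -> t3"

// Capturing a repeated (non-capturing) group: group 3 keeps the whole tail.
const whole = input.match(/^def (\S*)\s:\s(\S*)((?:\s->\s\S*)*)/);
console.log(JSON.stringify(whole[3])); // " -> t2 -> t3"

// Hypothetical helper: capture the full type chain once, then split it.
function parseDef(line) {
  const m = line.match(/^def (\S+)\s:\s(\S+(?:\s->\s\S+)*)$/);
  if (!m) return null; // line does not look like a definition
  return [m[1], ...m[2].split(/\s->\s/)];
}

console.log(parseDef(input)); // [ 'euro', 't1', 't2', 't3' ]
```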

Details:
(?:...) - creates a non-capturing group
$1 - receives the result of the first capturing group, i.e. (\w+)
\s[\:\-\>]+\s - matches " : " or " -> "
\w+ - matches one or more word characters (letters, digits, or underscore)
let str = 'def euro : t1 -> t2 -> t3';
let regex = /(?:def\s|\s[\:\-\>]+\s)(\w+)/g;
let match = str.replace(regex, '$1\n').trim().split('\n');
console.log(match);
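If the runtime supports String.prototype.matchAll (ES2020), the same global pattern can be read without the replace-and-split round trip. This is only a sketch of that alternative; extractNames is a made-up name.

```javascript
// Hypothetical helper; the pattern is the one used just above.
function extractNames(str) {
  const regex = /(?:def\s|\s[\:\-\>]+\s)(\w+)/g;
  // matchAll yields one match object per hit; index 1 is the captured name.
  return Array.from(str.matchAll(regex), (m) => m[1]);
}

console.log(extractNames("def euro : t1 -> t2 -> t3"));
// [ 'euro', 't1', 't2', 't3' ]
```

Like the original, this pattern does not check that the line as a whole is a well-formed definition; it only picks up each name that follows "def ", " : " or " -> ".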

What will be the regular expression for below requirement in javascript

Criteria:
Any word that starts with a and ends with b, with a single digit in the middle. The word must not be on a line that starts with the character '#'.
Given string:
a1b a2b a3b
#a4b a5b a6b
a7b a8b a9b
Expected output:
a1b
a2b
a3b
a7b
a8b
a9b
Regex: ? I need it for JavaScript.
So far I have tried the following:
var text_content =above_mention_content
var reg_exp = /^[^#]?a[0-9]b/gmi;
var matched_text = text_content.match(reg_exp);
console.log(matched_text);
I get the following output:
[ 'a1b', ' a7b' ]

Your /^[^#]?a[0-9]b/gmi matches, at the start of each line, an optional character other than #, followed by a, a digit and b. It does not check for whole words, and it cannot match words anywhere but at the beginning of a line.
You may use a regex that matches whole lines starting with # (so they are skipped) and, in all other contexts, matches and captures the words you need:
var s = "a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b";
var res = [];
s.replace(/^[^\S\r\n]*#.*|\b(a\db)\b/gm, function($0,$1) {
if ($1) res.push($1);
});
console.log(res);
Pattern details:
^ - start of a line (as m multiline modifier makes ^ match the line start)
[^\S\r\n]* - 0+ horizontal whitespaces
#.* - a # and any 0+ chars up to the end of a line
| - or
\b - a leading word boundary
(a\db) - Group 1 capturing a, a digit, a b
\b - a trailing word boundary.
Inside the replace() method, a callback is used where the res array is populated with the contents of Group 1 only.
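The same pattern can also be consumed with String.prototype.matchAll (ES2020) instead of using replace() purely for its callback. A minimal sketch, with collectWords as a hypothetical name:

```javascript
// Hypothetical helper; the regex is the one explained above.
function collectWords(text) {
  const re = /^[^\S\r\n]*#.*|\b(a\db)\b/gm;
  const res = [];
  for (const m of text.matchAll(re)) {
    // Group 1 is undefined when the "#..." alternative consumed a whole line.
    if (m[1] !== undefined) res.push(m[1]);
  }
  return res;
}

console.log(collectWords("a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b"));
// [ 'a1b', 'a2b', 'a3b', 'a7b', 'a8b', 'a9b' ]
```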

I would suggest using two regexes.
The first regex fetches the non-hashed lines:
^[^#][a\db\s]+
and then a second regex fetches the individual words from each line:
^a\db\s
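A two-step approach along these lines could be sketched as follows. This is not the exact pair of patterns given above: the lines are split and filtered first, and the words are then picked up with \ba\db\b and the g flag, because a pattern anchored with ^ would only find the first word of each line. Treating lines with leading whitespace before # as commented out is an assumption of this sketch, as is the function name.

```javascript
// Hypothetical helper illustrating the "filter lines, then match words" idea.
function wordsOutsideComments(text) {
  return text
    .split(/\r?\n/)
    // Step 1: keep only lines that do not start with "#" (leading whitespace allowed).
    .filter((line) => !/^\s*#/.test(line))
    // Step 2: collect every whole word of the form a<digit>b on the remaining lines.
    .flatMap((line) => line.match(/\ba\db\b/gi) || []);
}

console.log(wordsOutsideComments("a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b"));
// [ 'a1b', 'a2b', 'a3b', 'a7b', 'a8b', 'a9b' ]
```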

RegExp match word till space or character

I'm trying to match all the words starting with # and words between 2 # (see example)
var str = "#The test# rain in #SPAIN stays mainly in the #plain";
var res = str.match(/(#)[^\s]+/gi);
The result will be ["#The", "#SPAIN", "#plain"] but it should be ["#The test#", "#SPAIN", "#plain"]
Extra: it would be nice if the result came without the #.
Does anyone have a solution for this?

You can use
/#\w+(?:(?: +\w+)*#)?/g
The regex matches:
# - a hash symbol
\w+ - one or more alphanumeric and underscore characters
(?:(?: +\w+)*#)? - one or zero occurrences of:
(?: +\w+)* - zero or more occurrences of one or more spaces followed by one or more word characters, followed by
# - a hash symbol
NOTE: If there can be characters other than word characters (those in the [A-Za-z0-9_] range), you can replace \w with [^ #]:
/#[^ #]+(?:(?: +[^ #]+)*#)?/g
var re = /#[^ #]+(?:(?: +[^ #]+)*#)?/g;
var str = '#The test-mode# rain in #SPAIN stays mainly in the #plain #SPAIN has #the test# and more #here';
var m = str.match(re);
if (m) {
// Using ES6 Arrow functions
m = m.map(s => s.replace(/#$/g, ''));
// ES5 Equivalent
/*m = m.map(function(s) {
return s.replace(/#$/g, '');
});*/ // getting rid of the trailing #
document.body.innerHTML = "<pre>" + JSON.stringify(m, 0, 4) + "</pre>";
}
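The snippet above only strips the trailing #. If the results should come without any # at all, as the question asks under "Extra", one possible sketch is to strip a leading and a trailing # from each match. The function name extractTags is made up for this example.

```javascript
// Hypothetical helper built on the [^ #] variant of the regex above.
function extractTags(str) {
  const matches = str.match(/#[^ #]+(?:(?: +[^ #]+)*#)?/g) || [];
  // Remove the leading "#" and, if present, the closing "#".
  return matches.map((tag) => tag.replace(/^#|#$/g, ""));
}

console.log(extractTags("#The test# rain in #SPAIN stays mainly in the #plain"));
// [ 'The test', 'SPAIN', 'plain' ]
```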

You can also try this regex.
#(?:\b[\s\S]*?\b#|\w+)
(?: opens a non capture group for alternation
\b matches a word boundary
\w matches a word character
[\s\S] matches any character
Use it with the g (global) flag.

javascript insert after every digit

I am trying to add a space after every occurrence of a digit with JavaScript.
"2tim" will be "2 tim"
var v = '2tim';
v.replace(/(\d+)/, /\1 /);

There are 3 things wrong with your code:
The second argument to replace should be a string.
To use a captured group, use the dollar sign.
You don't want to capture all digits into the same group (\d+). Just capture one digit, and make the regex global.
var v = '2tim';
v = v.replace(/(\d)/g, '$1 ');
Here's the fiddle: http://jsfiddle.net/qujsq/
If you want to add a space only after a group of digits, then do use a +:
var v = '12times';
v = v.replace(/(\d+)/g, '$1 ');
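To see the two variants side by side, here is a small sketch with hypothetical function names. The third function is an extra assumption beyond the answer: it uses a lookahead so that no space is added when the digits end the string or are already followed by whitespace.

```javascript
// Space after every single digit.
function spaceAfterEachDigit(v) {
  return v.replace(/(\d)/g, "$1 ");
}

// Space after every run of digits.
function spaceAfterDigitRun(v) {
  return v.replace(/(\d+)/g, "$1 ");
}

// Assumed variant: only when a non-digit, non-whitespace character follows.
function spaceBetweenDigitsAndText(v) {
  return v.replace(/(\d+)(?=[^\d\s])/g, "$1 ");
}

console.log(spaceAfterEachDigit("12times"));          // "1 2 times"
console.log(spaceAfterDigitRun("12times"));           // "12 times"
console.log(JSON.stringify(spaceAfterDigitRun("tim2")));        // "tim2 "
console.log(JSON.stringify(spaceBetweenDigitsAndText("tim2"))); // "tim2"
console.log(spaceBetweenDigitsAndText("a1b22c"));     // "a1 b22 c"
```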
