can i replace regExp with another regExp? - javascript

var str='The_Andy_Griffith_Show'; // string to perform replace on
var regExp1=/\s|[A-Z]/g;
var regExp2=/[^A-Z]/g; // regular expression
var str2 =str.replace(regExp2,regExp1);
// expected output: The_ Andy_ Griffith_ Show
I want to replace all the first capital letters of a string with a space and that same letter, and if that's not possible is there a workaround?

If you want to add a space before every capital letter, it is enough to use:
var str='The_Andy_Griffith_Show';
str = str.replace(/[A-Z]/g, ' $&')
console.log(str); // => " The_ Andy_ Griffith_ Show"
Here, /[A-Z]/g matches all ASCII uppercase letters and $& is a backreference to the whole match value.
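As for the literal question: the second argument of replace is a replacement string or a function, so a RegExp passed there is simply converted to its text form. A minimal sketch of that behaviour (the names attempt and fixed are illustrative, not from the question):

```javascript
const str = 'The_Andy_Griffith_Show';

// A RegExp used as the replacement is coerced to a string, so every match
// is replaced by the literal text "/x/".
const attempt = 'a-b'.replace(/-/g, /x/);
console.log(attempt); // "a/x/b"

// A replacement pattern such as $& (the whole match) does the job instead.
const fixed = str.replace(/[A-Z]/g, ' $&');
console.log(fixed); // " The_ Andy_ Griffith_ Show"
```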
If you only want to add a space before the first capital letter in a word, you need to use capturing groups and backreferences to their values in the replacement pattern:
var str='The_Andy_Griffith_Show'; // string to perform replace on
str = str.replace(/(^|[^A-Z])([A-Z])/g, '$1 $2')
console.log(str); // => " The_ Andy_ Griffith_ Show"
Remove ^| if you do not want to add space before a capital letter at the string start (i.e. use /([^A-Z])([A-Z])/g).
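The three variants only differ on input with consecutive capitals or a capital at the start. A small sketch, using a made-up input string 'NASA_Show' that is not from the question:

```javascript
const input = 'NASA_Show'; // hypothetical input with a run of capitals

// Space before every capital letter.
const everyCapital = input.replace(/[A-Z]/g, ' $&');

// Space only before a capital that starts the string or follows a non-capital.
const firstCapital = input.replace(/(^|[^A-Z])([A-Z])/g, '$1 $2');

// Same, but without the ^ alternative: the capital at the string start is skipped.
const skipStart = input.replace(/([^A-Z])([A-Z])/g, '$1 $2');

console.log(JSON.stringify(everyCapital)); // " N A S A_ Show"
console.log(JSON.stringify(firstCapital)); // " NASA_ Show"
console.log(JSON.stringify(skipStart));    // "NASA_ Show"
```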

Just an alternative to the other answers.
To get the expected result you could match each non-uppercase character that is followed by an uppercase character, then replace it with the match ($&) and a space.
For example:
var str='The_Andy_Griffith_Show';
str = str.replace(/[^A-Z](?=[A-Z])/g, '$& ')
console.log(str);
Or simply match the underscores that are followed by an uppercase character.
var str='The_Andy_Griffith_Show';
str = str.replace(/[_](?=[A-Z])/g, '$& ')
console.log(str);

To add space to all occurrences of capital letters:
var str = 'The_Andy_Griffith_Show',
str2 = str.replace(/[A-Z]/g, letter => ` ${letter}`);
console.log(str2);
Note that if you do not want to add a space before a capital letter at the very start of the string, just use the regular expression /(?!^)[A-Z]/g.
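A quick sketch of that variant on the string from the question (the names title and spaced are illustrative):

```javascript
const title = 'The_Andy_Griffith_Show';

// (?!^) fails at position 0, so the leading "T" is left alone.
const spaced = title.replace(/(?!^)[A-Z]/g, letter => ` ${letter}`);

console.log(spaced); // "The_ Andy_ Griffith_ Show"
```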

Related

Finding exact words in text, excluding quoted words

In the javascript code below I need to find in a text exact words, but excluding the words that are between quotes. This is my attempt, what's wrong with the regex? It should find all the words excluding word22 and "word3". If I use only \b in the regex it selects exact words but it doesn't exclude the words between quotes.
var text = 'word1, word2, word22, "word3" and word4';
var words = [ 'word1', 'word2', 'word3' , 'word4' ];
words.forEach(function(word){
  var re = new RegExp('\\b^"' + word + '^"\\b', 'i');
  var pos = text.search(re);
  if (pos > -1)
    alert(word + " found in position " + pos);
});
First, we'll use a function to escape the characters of the word, just in case there's some that have special meaning for regexp.
// from https://stackoverflow.com/a/30851002/240443
function regExpEscape(literal_string) {
  return literal_string.replace(/[-[\]{}()*+!<=:?.\/\\^$|#\s,]/g, '\\$&');
}
Then, we construct a regular expression as an alternation between individual word regexps. For each word, we assert that it starts with a word boundary, ends with a word boundary, and has an even number of quote characters between its end, and the end of string. (Note that from the end of word3, there is only one quote till the end of string, which is odd.)
let text = 'word1, word2, word22, "word3" and word4';
let words = [ 'word1', 'word2', 'word3' , 'word4' ];
let regexp = new RegExp(words.map(word =>
'\\b' + regExpEscape(word) + '\\b(?=(?:[^"]*"[^"]*")*[^"]*$)').join('|'), 'g')
text.match(regexp)
// => word1, word2, word4
let m; // declare the match variable so the loop also works in strict mode
while ((m = regexp.exec(text)) !== null) {
  console.log(m[0], m.index);
}
// word1 0
// word2 7
// word4 34
EDIT: Actually, we can speed the regexp up a bit if we factor out the surrounding conditions:
let regexp = new RegExp(
'\\b(?:' +
words.map(regExpEscape).join('|') +
')\\b(?=(?:[^"]*"[^"]*")*[^"]*$)', 'g')
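Putting the pieces of this answer together, a self-contained sketch could look like the following. The function name findUnquotedWords and the shape of the returned objects are assumptions made for illustration; the escape helper and the pattern are the ones shown above.

```javascript
// Escape helper as given in the answer above.
function regExpEscape(literal) {
  return literal.replace(/[-[\]{}()*+!<=:?.\/\\^$|#\s,]/g, '\\$&');
}

// Hypothetical helper: returns every whole-word hit that is followed by an
// even number of double quotes, i.e. that is not inside a quoted section.
function findUnquotedWords(text, words) {
  const pattern = new RegExp(
    '\\b(?:' + words.map(regExpEscape).join('|') +
    ')\\b(?=(?:[^"]*"[^"]*")*[^"]*$)', 'g');
  const hits = [];
  let m;
  while ((m = pattern.exec(text)) !== null) {
    hits.push({ word: m[0], index: m.index });
  }
  return hits;
}

const hits = findUnquotedWords(
  'word1, word2, word22, "word3" and word4',
  ['word1', 'word2', 'word3', 'word4']);
console.log(hits);
```

This assumes the quotes in the text are balanced; with an unmatched quote the even/odd count no longer says whether a word is quoted.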
Your exclusion of the quote character is wrong: outside a character class, ^ is an anchor, so ^" actually matches the beginning of the string followed by a quote. Try this instead:
var re = new RegExp('\\b[^"]' + word + '[^"]\\b', 'i');
Also, this site is great for debugging regexes: https://regexpal.com
Edit: Because \b also matches next to a quotation mark, this needs to be tweaked further. JavaScript did not support lookbehinds when this was written (they were added in ES2018), so we have to get a little tricky.
var re = new RegExp('(?:^|[^"\\w])' + word + '(?:$|[^"\\w])','i')
So what this is doing is saying
(?: Don't capture this group
^|[^"\w]) either match the start of the line, or any non-word character (anything other than a letter, digit or underscore) that isn't a quote
word match your word here (it is not captured)
(?: Don't capture this group either
$|[^"\w]) either match the end of the line, or again any non-word character that isn't a quote
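On engines that support ES2018 lookbehind, the same idea can be written without consuming the neighbouring characters, so search() reports the position of the word itself. This is a sketch under two assumptions: the runtime supports (?<!...), and the words contain only word characters (they are not escaped here).

```javascript
const text = 'word1, word2, word22, "word3" and word4';
const words = ['word1', 'word2', 'word3', 'word4'];

const found = [];
for (const word of words) {
  // Not preceded and not followed by a word character or a double quote.
  const re = new RegExp('(?<![\\w"])' + word + '(?![\\w"])', 'i');
  const pos = text.search(re);
  if (pos > -1) found.push([word, pos]);
}

console.log(found); // word1 at 0, word2 at 7, word4 at 34
```

Like the pattern above, this only looks at the characters directly next to the word, so a word in the middle of a longer quoted phrase would still be reported.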

Return word before or after a string with newline characters

In short: I want to return the word right before or after a newline character in a string. How would I accomplish that?
I want to return: 1,150 and Svendborg
This is my string:
var newline = /\n/;
var str = "Specialzed Road Expert 2017\nkr.1,150 - Svendborg\n\nSpecialzed"
This will essentially match a whole line with a leading and trailing newline character with groups to match just the first and last "words".
var str = "Specialzed Road Expert 2017\nkr.1,150 - Svendborg\n\nSpecialzed";
var matches = str.match(/\n([^\s]+).*?([^\s]+)\n/);
console.log(matches);
Your words would be in matches[1] and matches[2] with matches[0] being the whole line.
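With the pattern above, matches[1] is "kr.1,150" rather than the "1,150" the question asks for. A hedged variation, assuming the "kr." currency prefix is optional and should be dropped (that prefix handling is my assumption, not part of the answer):

```javascript
const str = "Specialzed Road Expert 2017\nkr.1,150 - Svendborg\n\nSpecialzed";

// (?:kr\.)? skips an optional "kr." prefix; \S+ is shorthand for [^\s]+.
const match = str.match(/\n(?:kr\.)?(\S+).*?(\S+)\n/);

// Fall back to an empty array so destructuring is safe when nothing matches.
const [, first, last] = match || [];

console.log(first, last); // 1,150 Svendborg
```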

concat string using javascript

var str = 'abc 123 hello xyz';
How can I turn the string above into abc123helloxyz? I tried trim, but it still left the spaces between the characters. I can't use split(' ') either, because there is sometimes more than one space.
You can use a regex. \s matches any whitespace character, + makes it match one or more of them in a row, and the g flag keeps searching after the first occurrence is found.
var str = 'abc 123 hello xyz';
str = str.replace(/\s+/g, "");
console.log(str);
Use a regex.
var newstr = str.replace(/ +/g, '')
This will replace any number of spaces with an empty string.
You can also expand it to include other whitespace characters like so
var newstr = str.replace(/[ \n\t\r]+/g, '')
Replace the spaces in the string:
str = str.replace(" ", "");
Edit: as has been brought to my attention, this only replaces the first occurrence of the space character. Using regex as per the other answers is indeed the correct way to do it.
The cleanest way is to use the following regex:
\s means "one whitespace character", and \s+ means "one or more whitespace characters".
The g flag means "replace all occurrences", and the replacement is the empty string.
var str = 'abc 123 hello xyz';
console.log("Before: " + str);
str = str.replace(/\s+/g, "");
console.log("After: " + str);
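To see how the suggestions in this thread differ, here is a small comparison on a made-up string with mixed whitespace (the input and variable names are illustrative only):

```javascript
// Two spaces, a tab, and a newline surrounded by spaces.
const messy = 'abc  123\thello \n xyz';

// A string pattern replaces only the first occurrence.
const firstOnly = messy.replace(' ', '');

// / +/g removes runs of plain spaces but keeps tabs and newlines.
const spacesOnly = messy.replace(/ +/g, '');

// /\s+/g removes every kind of whitespace.
const allWhitespace = messy.replace(/\s+/g, '');

// split also accepts a regex, which handles runs of whitespace.
const viaSplit = messy.split(/\s+/).join('');

console.log(JSON.stringify(firstOnly));     // "abc 123\thello \n xyz"
console.log(JSON.stringify(spacesOnly));    // "abc123\thello\nxyz"
console.log(JSON.stringify(allWhitespace)); // "abc123helloxyz"
console.log(JSON.stringify(viaSplit));      // "abc123helloxyz"
```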

regex to remove number (year only) from string

I know the regex that separates two words as following:
input:
'WonderWorld'
output:
'Wonder World'
"WonderWorld".replace(/([A-Z])/g, ' $1');
Now I am looking to remove number in year format from string, what changes should be done in the above code to get:
input
'WonderWorld 2016'
output
'Wonder World'
You can match the location before an uppercase letter (excluding the beginning of the string) with \B(?=[A-Z]), and match a four-digit number together with any whitespace before it with \s*\b\d{4}\b. In a callback, check whether the match is empty and replace accordingly. If the match is empty, we matched the location before an uppercase letter (=> replace with a space); if not, we matched the year (=> replace with an empty string). The four-digit chunks are only matched as whole words due to the \b word boundaries around the \d{4}.
var re = /\B(?=[A-Z])|\s*\b\d{4}\b/g;
var str = 'WonderWorld 2016';
var result = str.replace(re, function(match) {
return match ? "" : " ";
});
document.body.innerHTML = "<pre>'" + result + "'</pre>";
A similar approach, just a different pattern for matching glued words (might turn out more reliable):
var re = /([a-z])(?=[A-Z])|\s*\b\d{4}\b/g;
var str = 'WonderWorld 2016';
var result = str.replace(re, function(match, group1) {
return group1 ? group1 + " " : "";
});
document.body.innerHTML = "<pre>'" + result + "'</pre>";
Here, ([a-z])(?=[A-Z]) matches and captures into Group 1 a lowercase letter that is followed with an uppercase one, and inside the callback, we check if Group 1 matched (with group1 ?). If it matched, we return the group1 + a space. If not, we matched the year at the end, and remove it.
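The leading \b in front of \d{4} matters when the text contains longer numbers. A sketch comparing the pattern with and without it; the helper tidy and the input 'Model 12345' are made up for illustration:

```javascript
const withBoundary = /\B(?=[A-Z])|\s*\b\d{4}\b/g;
const withoutBoundary = /\B(?=[A-Z])|\s*\d{4}\b/g;

// Empty match: position before a capital, so insert a space.
// Non-empty match: the year (plus leading whitespace), so drop it.
const tidy = (s, re) => s.replace(re, match => (match ? '' : ' '));

const year = tidy('WonderWorld 2016', withBoundary);
const kept = tidy('Model 12345', withBoundary);
const clipped = tidy('Model 12345', withoutBoundary);

console.log(year);    // "Wonder World"
console.log(kept);    // "Model 12345" (five digits are not a year)
console.log(clipped); // "Model 1" (the last four digits were removed)
```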
Try this (the trailing trim() removes the spaces that the replacement leaves at both ends):
"WonderWorld 2016".replace(/([A-Z])|\b[0-9]{4}\b/g, ' $1').trim()
How about this, a single regex to do what you want:
"WonderWorld 2016".replace(/([A-Z][a-z]+)([A-Z].*)\s.*/g, '$1 $2');
"Wonder World"
get everything apart from digits and spaces.
A rewrite of @Wiktor Stribiżew's solution:
str can be any "WonderWorld 2016" | "OneTwo 1000 ThreeFour" | "Ruby 1999 IamOnline"
str.replace(/([a-z])(?=[A-Z])|\s*\d{4}\b/g, function(m, g) {
return g ? g + " " : "";
});
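Wrapped in a function, the snippet above can be run on the three sample strings it mentions. The function name splitAndDropYear is hypothetical:

```javascript
// Hypothetical wrapper around the replace call shown above.
function splitAndDropYear(str) {
  return str.replace(/([a-z])(?=[A-Z])|\s*\d{4}\b/g, function (m, g) {
    // g is set when a lowercase letter before a capital matched;
    // otherwise the match is the four-digit number and is removed.
    return g ? g + " " : "";
  });
}

const results = [
  "WonderWorld 2016",
  "OneTwo 1000 ThreeFour",
  "Ruby 1999 IamOnline"
].map(splitAndDropYear);

console.log(results);
// [ 'Wonder World', 'One Two Three Four', 'Ruby Iam Online' ]
```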
import re
remove_year_regex = re.compile(r"[0-9]{4}")

regex: Match when between text but not between digits

Please help.
I need a regular expression (to be used in javascript) to replace "." with "#" in a text containing Unicode characters.
Replacement takes place only when "." appears between text but not between digits.
Input: "ΦΨ. ABC. DEF. 123.456"
Desired output: "ΦΨ# ABC# DEF# 123.456"
Any suggestions?
You can use capturing groups in the regex and use back-references to obtain the required result:
var re = /(\D)\.(\D)/g;
var str = 'ΦΨ. ABC. DEF. 123.456';
var subst = '$1#$2';
var result = str.replace(re, subst);
alert(result);
Regex Explanation:
\D - A non-digit character
\. - A literal dot
The non-digit characters are captured into groups, and then inserted back with the help of $1 and $2 back-references.
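Because the groups consume the characters on both sides, this pattern skips a dot at the very start or end of the string and the second dot in something like "a.b.c". A possible alternative uses lookarounds instead. This is a sketch under two assumptions: the engine supports ES2018 lookbehind, and "between digits" means a digit on both sides of the dot.

```javascript
const input = 'ΦΨ. ABC. DEF. 123.456';

// Pattern from the answer above.
const withGroups = s => s.replace(/(\D)\.(\D)/g, '$1#$2');

// Replace a dot unless it has a digit on both sides.
const withLookarounds = s => s.replace(/(?<!\d)\.|\.(?!\d)/g, '#');

console.log(withGroups(input));        // "ΦΨ# ABC# DEF# 123.456"
console.log(withLookarounds(input));   // "ΦΨ# ABC# DEF# 123.456"
console.log(withGroups('a.b.c'));      // "a#b.c"
console.log(withLookarounds('a.b.c')); // "a#b#c"
console.log(withGroups('End.'));       // "End."
console.log(withLookarounds('End.'));  // "End#"
```

Note that the two versions also differ for a dot with a digit on only one side, such as "3. Next": the lookaround version replaces it, the capturing-group version does not.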
Try this (replace returns a new string, so assign the result):
var str = "ΦΨ. ABC. DEF. 123.456";
str = str.replace(/[^\d.]+\.[^\d]/g, function (m) {
  return m.replace('.', '#');
});
