JavaScript Regex - Remove Whitespace from Start and End


I worked on the challenge below for about 3 hours and none of my code worked, so I looked at the solution to understand why. The solution confused me because I thought \s was used to identify whitespace, not to remove it. Can someone give me a hand and explain why the solution uses \s instead of \S, and why it uses the empty string ("") to get rid of the whitespace on both ends?
CHALLENGE
Write a regex and use the appropriate string methods to remove whitespace at the beginning and end of strings.
//SOLUTION
let hello = " Hello, World! ";
let wsRegex = /^\s+|\s+$/g;
let result = hello.replace(wsRegex, "");

\s means whitespace characters in regex, like <space>, <tab>, etc.
^ means the beginning of the string
$ means the end of the string
| means OR (match the left side or the right side)
+ means 1 or more (based off of the rule on the left)
/a regex/g the g means "global", aka "match multiple times" since you could need to match at the beginning AND end
So the regex means:
/^\s+|\s+$/g
/ / Wrap the regex (how you do it in JS)
^\s+ Try to match at the beginning one or more whitespace chars
| Or...
\s+$ Try to match whitespace chars at the end
g Match as many times as you can
String.prototype.replace replaces the match(es) found in the regex with the string provided as the 2nd argument, in this case an empty string.
So the process internally is:
Look for all sections that match the regex (which will be the whitespace at the beginning and the whitespace at the end)
Replace each match with "", removing those matches entirely
let hello = " Hello, World! ";
let wsRegex = /^\s+|\s+$/g;
let result = hello.replace(wsRegex, "");
console.log('"' + result + '"');
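To see what the regex actually selects, here is a small sketch that lists each match and compares the result with the built-in String.prototype.trim, which removes the same leading and trailing whitespace. The sample string and the names matches, viaRegex and viaTrim are only illustrative, not part of the challenge:

```javascript
const hello = "   Hello, World!  ";
const wsRegex = /^\s+|\s+$/g;

// Collect every match together with the position where it starts.
const matches = [...hello.matchAll(wsRegex)].map(m => ({ text: m[0], index: m.index }));
console.log(matches); // one entry for the leading run, one for the trailing run

// Replacing each match with "" removes it.
const viaRegex = hello.replace(wsRegex, "");

// For comparison: the built-in trim.
const viaTrim = hello.trim();

console.log('"' + viaRegex + '"');
console.log(viaRegex === viaTrim);
```

Outside of a regex exercise, trim() is usually the simpler choice.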
Some people prefer String.prototype.replaceAll to .replace when they use the global flag; with a regex argument, replaceAll requires that flag:
let hello = " Hello, World! ";
let wsRegex = /^\s+|\s+$/g;
let result = hello.replaceAll(wsRegex, "");
console.log('"' + result + '"');
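A short sketch of how the g flag changes things, assuming a runtime that has replaceAll (for example a recent Node.js or browser). The variable names are illustrative:

```javascript
const hello = " Hello, World! ";

// With the g flag, replaceAll behaves like replace.
const trimmed = hello.replaceAll(/^\s+|\s+$/g, "");
console.log('"' + trimmed + '"');

// Without the g flag, replaceAll rejects the regex with a TypeError.
let errorName = null;
try {
  hello.replaceAll(/^\s+|\s+$/, "");
} catch (err) {
  errorName = err.name;
}
console.log(errorName);

// replace without g only handles the first match (the leading whitespace).
const firstOnly = hello.replace(/^\s+|\s+$/, "");
console.log('"' + firstOnly + '"');
```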

The second argument of replace is the string that replaces each match of the first argument.
The regex matches the whitespace at the beginning (^) and at the end ($) of the string, and each match is then replaced by "".
The regex /(\S)/g matches every character that is not whitespace, so with it you would write something like hello.replace(/(\S)/g, '$1'); which puts each matched character back unchanged.
$1 refers to the first capturing group of your regex.
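To make the \s versus \S difference concrete, here is a small sketch. The variable names are made up for illustration:

```javascript
const hello = " Hello, World! ";

// \S matches NON-whitespace, so replacing it with "" deletes the text
// and leaves only the whitespace behind.
const onlySpaces = hello.replace(/\S/g, "");

// Replacing each non-whitespace character with itself ($1) changes nothing.
const unchanged = hello.replace(/(\S)/g, "$1");

// \s matches whitespace, so this is the one that trims the ends.
const trimmed = hello.replace(/^\s+|\s+$/g, "");

console.log(JSON.stringify(onlySpaces));
console.log(JSON.stringify(unchanged));
console.log(JSON.stringify(trimmed));
```

In other words, the pattern selects what you want to get rid of, and the empty string is what you put in its place.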

Related

How to remove characters in a string that matches a RegEx while preserving spaces?

I am trying to remove characters from a string so that it will match this RegEx: ^[-a-zA-Z0-9._:,]+$. For example:
const input = "test // hello". The expected output would be "test hello". I tried the following:
input.replace(/^[-a-zA-Z0-9._:,]+$/g, "")
But this does not seem to work

The example output "test hello" that you give does not match your regex, because the regex does not allow spaces. Assuming you want to keep spaces, use
input.replace(/[^-a-zA-Z0-9._:, ]/g, "")
The negation character ^ must be inside the [...]. The + is not needed, because /g already ensures that all matching characters are replaced (that is, removed).
If you also want to condense consecutive spaces into a single space (as implied by your example), use
input.replace(/[^-a-zA-Z0-9._:, ]/g, "").replace(/\s+/g, " ")
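Here is a runnable sketch of the three variants side by side, using the input from the question. The variable names are illustrative:

```javascript
const input = "test // hello";

// Anchored pattern from the question: it only matches when the WHOLE string
// consists of allowed characters, so nothing is replaced here.
const anchored = input.replace(/^[-a-zA-Z0-9._:,]+$/g, "");

// Negated class: removes each disallowed character (space is allowed).
const stripped = input.replace(/[^-a-zA-Z0-9._:, ]/g, "");

// Collapse the run of spaces that is left behind.
const condensed = stripped.replace(/\s+/g, " ");

console.log(JSON.stringify(anchored));
console.log(JSON.stringify(stripped));
console.log(JSON.stringify(condensed));
```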

I like to use the following canonical approach:
var input = "test // hello";
var output = input.replace(/\s*[^-a-zA-Z0-9._:, ]+\s*/g, " ").trim()
console.log(output);
The logic here is to target all unwanted characters and their surrounding whitespace. We replace with just a single space. Then we do a trim at the end in case there might be an extra leading/trailing space.
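A sketch of the same idea wrapped in a function, to show why the final trim matters. The name cleanInput and the second sample string are assumptions made for this example:

```javascript
// Hypothetical helper: replace each run of unwanted characters (plus the
// whitespace around it) with one space, then trim the ends.
function cleanInput(text) {
  return text.replace(/\s*[^-a-zA-Z0-9._:, ]+\s*/g, " ").trim();
}

console.log(cleanInput("test // hello"));

// Unwanted characters at the start or end would leave a stray space
// without the trim().
console.log(cleanInput("## test // hello !!"));
```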

Javascript regular expression to extract characters from mid string with optional end character

I would like to extract characters from mid string with optional end character. If the optional end character is not found, extract until end of string. The first characters are S= and the last optional character is &.
Example #1:
"rilaS=testingabc"
should extract:
"testingabc"
Example #2:
"rilaS=testing123&thistest"
should extract:
"testing123"
This is what I have so far (Javascript):
var Str = "rilaS=testing123&thistest";
var tmpStr = Str.match("S=(.*)[\&]{0,1}");
var newStr = tmpStr[1];
alert(newStr);
But it does not detect that the end should be the ampersand (if found). Thank you before hand.
Answer (By ggorlen)
var Str = "rilaS=testing123&thistest";
var tmpStr = Str.match("S=([^&]*)");
var newStr = tmpStr[1];
alert(newStr);

You may use /S=([^&]*)/ to grab from an S= to end of line or &:
["rilaS=testingabc", "rilaS=testing123&thistest"].forEach(s =>
console.log(s.match(/S=([^&]*)/)[1])
);

Just in case you are wondering why your original regex didn't work: the problem is that the (.*) pattern is greedy, meaning it will happily slurp up anything, including &, and not leave it for later items to match. Because the trailing [\&]{0,1} is optional, it simply matches nothing. This is why you want the "not &" class: it will match up to, but not including, the &.
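The difference can be checked with a short sketch. The lazy variant at the end is an alternative I am adding for comparison; it is not from the answers above:

```javascript
const samples = ["rilaS=testingabc", "rilaS=testing123&thistest"];

// Greedy: .* runs to the end of the string, and the optional & matches nothing.
const greedy = samples.map(s => s.match(/S=(.*)&?/)[1]);

// Negated class: stops at the first & or at the end of the string.
const negated = samples.map(s => s.match(/S=([^&]*)/)[1]);

// Lazy alternative: take as little as possible until an & or the end of the string.
const lazy = samples.map(s => s.match(/S=(.*?)(?:&|$)/)[1]);

console.log(greedy);
console.log(negated);
console.log(lazy);
```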

regex preceded by two or more special characters

I am stuck creating a regex such that, if the word is preceded or followed by more than one special character on either side, the regex 'exec' method returns null. Only if the word is wrapped in exactly one bracket on each side should 'exec' give a result. Below is the regular expression I have come up with.
Only if the string is like "(test)" should regex.exec return values; for other combinations such as "((test))" or "((test)" or "(test))" it should be null. The code below does not return null when it should. Please suggest a fix.
var w1 = "\(test\)";
alert(new RegExp('(^|[' + '\(\)' + '])(' + w1 + ')(?=[' + '\(\)' + ']|$)', 'g').exec("this is ((test))"))

If you have a list of words and want to filter them, you can do the following.
string.split(' ').filter(function(word) {
return !(/^[!##$%^&*()]{2,}.+/).test(word) && !(/[!##$%^&*()]{2,}$/).test(word);
});
The split() function splits a string at a space character and returns an array of words, which we can then filter.
To keep the valid words, we will test two regex expressions to see if the word starts or ends with 2 or more special characters respectively.
RegEx Breakdown
^ - Expression starts with the following
[] - A single character in the block
!##$%^&*() - These are the special characters I used. Replace them with the ones you want.
{2,} - Matches 2 or more of the preceding characters
.+ - Matches 1 or more of any character
$ - Expression ends with the preceding
To do the same check with the exec function:
!(/^[!##$%^&*()]{2,}.+/).exec(string) && !(/[!##$%^&*()]{2,}$/).exec(string)
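A runnable sketch of the filtering idea. The function name keepValidWords and the sample sentence are assumptions; the character class lists the same characters as above (a duplicate inside a class has no effect), and a word is kept only when it neither starts nor ends with two or more of them:

```javascript
// Swap in the special characters you care about.
const startsWithTwo = /^[!#$%^&*()]{2,}/;
const endsWithTwo = /[!#$%^&*()]{2,}$/;

// Hypothetical helper: keep words that pass both checks.
function keepValidWords(text) {
  return text.split(" ").filter(word =>
    !startsWithTwo.test(word) && !endsWithTwo.test(word)
  );
}

const kept = keepValidWords("((test)) (test) !!wow ok$$ fine");
console.log(kept);
```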

If I understand correctly, you are looking for any string which contains (test), anywhere in it, and exactly that, right?
In that case, what you probably need is the following:
var regExp = /.*[^(]\(test\)[^)].*/;
alert(regExp.exec("this is ((test))")); // → null
alert(regExp.exec("this is (test))" )); // → null
alert(regExp.exec("this is ((test)" )); // → null
alert(regExp.exec("this is (test) ...")); // → ["this is (test) ..."]
Explanation:
.* matches any character (except newline) zero or more times, as many times as possible.
[^(] matches a single character that is not the literal character (, and [^)] matches a single character that is not the literal character ).
This makes sure your test string appears in the given string wrapped in exactly one parenthesis on each side. Note that both classes must consume a character, so (test) at the very start or very end of the string is not matched.

You can use the following regex:
(^|[^(])(\(test\))(?!\))
Replace with $1<span style="new">$2</span>.
The regex features an alternation group (^|[^(]) that matches either the start of the string ^ or any character other than (. This alternation is a kind of workaround because older JS regex engines did not support look-behinds (they were added in ES2018).
Then, (\(test\)) matches and captures (test). Note that the round brackets are escaped. If they were not, they would be treated as capturing group delimiters.
The (?!\)) is a look-ahead that makes sure there is no literal ) right after test). Look-aheads are fully supported by the JS regex engine.
A JS snippet:
var re = /(^|[^(])(\(test\))(?!\))/gi;
var str = 'this is (test)\nthis is ((test))\nthis is ((test)\nthis is (test))\nthis is ((test\nthis is test))';
var subst = '$1<span style="new">$2</span>';
var result = str.replace(re, subst);
alert(result);
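On an engine with ES2018 look-behind support (current Node.js and browsers), the alternation workaround can be swapped for a negative look-behind. This is a sketch of that alternative, not the code from the answer; the name exactlyOnePair is made up:

```javascript
// (?<!\() : not preceded by "("    (?!\)) : not followed by ")"
const exactlyOnePair = /(?<!\()\(test\)(?!\))/;

const samples = [
  "this is (test)",
  "this is ((test))",
  "this is ((test)",
  "this is (test))",
  "(test)"
];

// No g flag, so test() keeps no state between calls.
const results = samples.map(s => exactlyOnePair.test(s));
console.log(results);
```

Because look-arounds consume no characters, this also handles (test) at the very start or end of the string.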

Javascript Regex with variable and $1

I have read How do you pass a variable to a Regular Expression javascript
I'm looking to create a regular expression to get and replace a value with a variable..
section = 'abc';
reg = new RegExp('\[' + section + '\]\[\d+\]','g');
num = duplicate.replace(reg,"$1++");
where $1 = \d+ +1
and... without increment... it doesn't work...
it returns something like:
[abc]$1
Any idea?

Your regex is on the right track, however to perform any kind of operation you must use a replacement callback:
section = "abc";
reg = new RegExp("(\\["+section+"\\]\\[)(\\d+)(\\])","g");
num = duplicate.replace(reg,function(_,before,number,after) {
return before + (parseInt(number,10)+1) + after;
});
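A self-contained version of the callback approach, as a sketch. The sample value of duplicate and the helper names escapeRegExp and incrementIndex are assumptions, since the question does not show them; escaping section guards against regex metacharacters in the variable:

```javascript
// Escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: add 1 to the number in every "[section][n]".
function incrementIndex(text, section) {
  const reg = new RegExp("(\\[" + escapeRegExp(section) + "\\]\\[)(\\d+)(\\])", "g");
  return text.replace(reg, (_, before, number, after) =>
    before + (parseInt(number, 10) + 1) + after
  );
}

const duplicate = "item[abc][3] item[abc][10] item[xyz][3]"; // made-up sample
const num = incrementIndex(duplicate, "abc");
console.log(num);
```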

I think you need to read up more on Regular Expressions. Your current regular expression comes out to:
/[abc][d+]/g
Which will match an "a" "b" or "c", followed by a "d" or "+", like: ad or c+ or bd or even zebra++ etc.
A great resource to get started is: http://www.regular-expressions.info/javascript.html

I see at least two problems.
The \ character has a special meaning in JavaScript strings. It is used to escape special characters in the string. For example: \n is a new line, and \r is a carriage return. You can also escape quotes and apostrophes to include them in your string: "This isn't a normally \"quoted\" string... It has actual \" characters inside the string as well as delimiting it."
The second problem is that, in order to use a backreference ($1, $2, etc.) you must provide a capturing group in your pattern (the regex needs to know what to backreference). Try changing your pattern to:
'\\[' + section + '\\]\\[(\\d+)\\]'
Note the double-backslashes. This escapes the backslash character itself, allowing it to be a literal \ in a string. Also note the use of ( and ) (the capturing group). This tells the regex what to capture for $1.
After the regex is instantiated, with section === 'abc':
new RegExp('\\[' + section + '\\]\\[(\\d+)\\]', 'g');
Your pattern is now:
/\[abc\]\[(\d+)\]/g
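And your .replace will replace each match with the captured digits followed by a literal ++ (for example, [abc][5] becomes 5++), because "$1++" is plain replacement text and no arithmetic is performed.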
Demo: http://jsfiddle.net/U46yx/
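Both problems can be observed by inspecting the source property of the resulting RegExp. A small sketch with illustrative variable names and sample strings:

```javascript
const section = "abc";

// Single backslashes are dropped by the string literal before RegExp sees them.
const broken = new RegExp('\[' + section + '\]\[\d+\]', 'g');
console.log(broken.source);
const matchesZebra = broken.test("zebra++"); // "a" followed by "+"

// Doubled backslashes survive, and (\d+) gives $1 something to refer to.
const fixed = new RegExp('\\[' + section + '\\]\\[(\\d+)\\]', 'g');
console.log(fixed.source);

// "$1++" is just text: the digits followed by two plus signs.
const replaced = "x[abc][5]".replace(fixed, "$1++");
console.log(replaced);
```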

Why isn't this regex matching the expected way?

I'm trying to get rid of the slash character in case it exists at the end of my string. I used the following expression, intending to match any character not being slash at the end of the line.
var str = "http://hazaa.com/blopp/";
str.match("[^/$]+", "g");
For some reason (surely logical and explainable, but not one I can grasp on my own), the string gets split into three strings, as follows.
["http:", "hazaa.com", "blopp"]
What am I assuming wrongly?
How to resolve it?

In str.match("[^/$]+", "g");, why put the dollar sign inside the brackets? Inside a character class it is a literal $; as the end-of-string anchor it belongs outside, namely str.match(/[^\/]+$/) (which matches only when the string does not end in a slash).
To remove all the trailing slashes, you can use str.replace(/\/+$/, ""). (If you'd like to remove the last trailing slash ONLY, remove the + in the replace's regex.)
Update:
One more way that doesn't use replace:
function stripEndingSlashes(str) {
var matched = str.match("(.*[^/]+)/*$");
return matched ? matched[1] : "";
}
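A sketch showing what the character class from the question really matches, next to the replace-based fix. The variable names are illustrative:

```javascript
const str = "http://hazaa.com/blopp/";

// $ inside [...] is a literal dollar sign, so this reads
// "runs of characters that are neither / nor $".
const pieces = str.match(/[^\/$]+/g);
console.log(pieces);

// match() takes a single argument; the "g" string is ignored,
// so only the first match comes back.
const firstOnly = str.match("[^/$]+", "g");
console.log(firstOnly[0]);

// Removing the trailing slash(es) is a job for replace.
const noTrailing = str.replace(/\/+$/, "");
console.log(noTrailing);
```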

The regexp is choosing "everything except slash". That is why match() returns the parts of the string between slashes.
You can resolve it with the replace() function:
var str = "http://hazaa.com/blopp/";
//replace the last slash with an empty string to remove it
str = str.replace(/\/$/,'');
A regexp literal is always delimited by / characters. So here the regexp is:
\/ : this means a single slash character. In order to prevent Javascript from interpreting your slash as the end of regexp, it needs to be 'escaped' with a backslash.
$ : this means the end of the string

Your current regex will match the portion of string until the first / or $ is encountered. The second parameter is ignored; there is no second parameter for String.match.
To remove the trailing slash, use the String.replace function:
var str = "http://hazaa.com/blopp/";
str = str.replace(/\/$/, "");
console.log(str);
// "http://hazaa.com/blopp"
If you need to check whether a string ends with a slash, use the String.match method like this:
var str = "http://hazaa.com/blopp/";
var match = str.match(/\/$/);
console.log(match);
// null if string does not end with /
// ["/"] if string ends with a /
If you need to grab every thing except the last character(s) being /, use this:
var r = /(.+?)\/*$/;
console.log("http://hazaa.com/blopp//".match(r)); // ["http://hazaa.com/blopp//", "http://hazaa.com/blopp"]
console.log("http://hazaa.com/blopp/".match(r)); // ["http://hazaa.com/blopp/", "http://hazaa.com/blopp"]
console.log("http://hazaa.com/bloppA".match(r)); // ["http://hazaa.com/bloppA", "http://hazaa.com/bloppA"]
The second element (index 1) of the returned array contains the desired portion of the URL. The regex works as follows:
(.+?) un-greedy match (and capture) any character
\/*$ matches optional trailing slash(es)
The first portion of the regex is intentionally un-greedy. If it were greedy, it would attempt to find the biggest match as long as the whole regex matches (consuming the trailing / in the process). When un-greedy, it finds the smallest match as long as the whole regex matches.
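The effect of the ? after .+ can be seen by running both variants on the same input. A small sketch with illustrative variable names:

```javascript
const url = "http://hazaa.com/blopp//";

// Greedy: .+ swallows the trailing slashes, and \/* then matches nothing.
const greedy = url.match(/(.+)\/*$/)[1];

// Un-greedy: .+? stops as early as it can, leaving the slashes to \/*.
const lazy = url.match(/(.+?)\/*$/)[1];

console.log(greedy);
console.log(lazy);
```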

Why use regex? Just check if the last symbol of the string is a slash and then slice. Like this:
if (str.slice(-1) === '/') {
str = str.slice(0, -1);
}
