Given measurement data like:
2"
3" Contract
When coming back from the server it looks like this:
"\"2\"\"\""
"\"3\"\" Contract\""
I want the data to be displayed as a proper measurement to the user. So:
2"
3" Contract
As shown above
I resorted to complicated regexes to get the second example working (3" Contract) but it would just turn 2" to 2.
let measurement_formatted = value.replace("\"\"", '\"');
measurement_formatted = measurement_formatted.replace(/(^"|"$)/g, '');
measurement_formatted = measurement_formatted.replace("\"", '\"');
How can I develop a proper regex for both cases?
First of all, the \ before each " is only there to show that the " is escaped inside a string delimited by " characters; it is not part of the data.
Based on that, the string "\"3\"\" Contract\"" is the same as '"3"" Contract"', because escaping " is no longer needed when the string is delimited by ' characters.
To answer, or rather land some help (which I'll always gladly do), you may use the following regex /^"*|(\D)"/g in conjunction with the replace method:
/ : tells the JS engine that we're creating a regex.
^"* : tells the JS engine to match any " at the start of the string (0 or more).
| : acts as the logical OR operator.
(\D)" :
(\D) : creates a capturing group that matches any single NON-DIGIT character (letters, spaces and " itself all count).
" : the literal " character.
g : tells the JS engine to match all the occurrences of that regex.
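The idea here is to tell the replace method to replace every " that is preceded by a non-digit character with just that non-digit character, which deletes the " itself. Any " at the start of the string is matched by the first alternative and replaced with nothing, because the capturing group is empty in that case.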
Here's a live example :
const regex = /^"*|(\D)"/g;
/** $1 : means write down the first matched capturing group */
console.log('"3"" Contract"'.replace(regex, '$1')); // 3" Contract
console.log('2"'.replace(/^"*|(\D)"/g, '$1')); // 2"
Learn more about the replace method.
Hope I managed to land some help.
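One caveat worth checking: the snippet above tests '2"' rather than the server value '"2"""', and with the regex above that value appears to come out as 2"". Assuming the server applies CSV-style quoting (wrap the field in quotes, double every inner quote), which the question does not confirm, a minimal sketch that reverses those two steps could look like this. The helper name unquoteField is hypothetical.

```javascript
// Hypothetical helper; assumes CSV-style quoting (outer quotes + doubled inner quotes).
function unquoteField(value) {
  return value
    .replace(/^"|"$/g, '') // drop one wrapping quote at each end
    .replace(/""/g, '"');  // collapse each doubled quote into a single one
}

console.log(unquoteField('"2"""'));          // 2"
console.log(unquoteField('"3"" Contract"')); // 3" Contract

// For comparison, the single regex from the answer on the raw server value:
console.log('"2"""'.replace(/^"*|(\D)"/g, '$1')); // 2""
```

If the quoting turns out to follow a different rule, the two replace calls would need to change accordingly.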
I want to write a regular expression, in JavaScript, for finding the string starting and ending with :.
For example "hello :smile: :sleeping:" from this string I need to find the strings which are starting and ending with the : characters. I tried the expression below, but it didn't work:
^:.*\:$
My guess is that you not only want to find the string, but also replace it. For that you should look at using a capture in the regexp combined with a replacement function.
const emojiPattern = /:(\w+):/g
function replaceEmojiTags(text) {
  return text.replace(emojiPattern, function (tag, emotion) {
    // The emotion will be the captured word between your tags,
    // so either "smile" or "sleeping" in your example
    //
    // In this function you would take that emotion and return
    // whatever you want based on the input parameter and the
    // whole tag would be replaced
    //
    // As an example, let's say you had a bunch of GIF images
    // for the different emotions:
    return '<img src="/img/emoji/' + emotion + '.gif" />';
  });
}
With that code you could then run your function on any input string and replace the tags to get the HTML for the actual images in them. As in your example:
replaceEmojiTags('hello :smile: :sleeping:')
// 'hello <img src="/img/emoji/smile.gif" /> <img src="/img/emoji/sleeping.gif" />'
EDIT: To support hyphens within the emotion, as in "big-smile", the pattern needs to be changed since it is only looking for word characters. For this there is probably also a restriction such that the hyphen must join two words so that it shouldn't accept "-big-smile" or "big-smile-". For that you need to change the pattern to:
const emojiPattern = /:(\w+(-\w+)*):/g
That pattern is looking for any word that is then followed by zero or more instances of a hyphen followed by a word. It would match any of the following: "smile", "big-smile", "big-smile-bigger".
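As a rough sketch of how the hyphen-aware pattern might be used in practice: the version below only replaces tags whose name is in a known list, so that text such as a time of day is left alone. The knownEmotions set and the function name are made up for illustration, and the inner group is written as non-capturing so the callback receives only the name.

```javascript
// Hypothetical list of emotions that have an image; not from the original answer.
const knownEmotions = new Set(['smile', 'big-smile', 'sleeping']);
const hyphenEmojiPattern = /:(\w+(?:-\w+)*):/g;

function replaceKnownEmojiTags(text) {
  return text.replace(hyphenEmojiPattern, (tag, emotion) =>
    knownEmotions.has(emotion)
      ? '<img src="/img/emoji/' + emotion + '.gif" />'
      : tag // leave unknown tags such as ":30:" untouched
  );
}

console.log(replaceKnownEmojiTags('at 10:30:15 :big-smile:'));
// at 10:30:15 <img src="/img/emoji/big-smile.gif" />
```

Without the lookup, the ":30:" inside the time would be turned into an image tag as well.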
The ^ and $ are anchors (start and end respectively). These cause your regex to explicitly match an entire string which starts with : has anything between it and ends with :.
If you want to match characters within a string you can remove the anchors.
Your * indicates zero or more so you'll be matching :: as well. It'll be better to change this to + which means one or more. In fact if you're just looking for text you may want to use a range [a-z0-9] with a case insensitive modifier.
If we put it all together we'll have a regex like this: /:([a-z0-9]+):/gmi
It matches a : followed by one or more alphanumeric characters and a closing :, with the modifiers g (global), m (multi-line) and i (case insensitive, for things like :FacePalm:). The m modifier has no effect here because the pattern no longer contains ^ or $.
Using it in JavaScript we can end up with:
var mytext = 'Hello :smile: and jolly :wave:';
var matches = mytext.match(/:([a-z0-9]+):/gmi);
// matches = [':smile:', ':wave:'];
You'll have an array with each match found.
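Note that match with the g flag returns the whole matches, colons included. If only the names are wanted, one option (a sketch, assuming an engine that supports String.prototype.matchAll) is to read capture group 1 from each match:

```javascript
const tagPattern = /:([a-z0-9]+):/gi;
const sample = 'Hello :smile: and jolly :wave:';

// Whole matches, colons included.
const fullTags = sample.match(tagPattern);
// Capture group 1 of every match, i.e. the text between the colons.
const tagNames = Array.from(sample.matchAll(tagPattern), (m) => m[1]);

console.log(fullTags); // [ ':smile:', ':wave:' ]
console.log(tagNames); // [ 'smile', 'wave' ]
```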
I am reading about split and below is a variable looking at the string values. However I do not understand what the symbols are looking for.
According to the page: If separator contains capturing parentheses, matched results are returned in the array.
var myString = 'Hello 1 word. Sentence number 2.';
var splits = myString.split(/(\d)/);
console.log(splits);
// Results
[ "Hello ", "1", " word. Sentence number ", "2", "." ]
My question is, what is happening here? Parentheses "(" or ")" is not part of the string. Why is space or "." separated for some and not the other?
Another one is /\s*;\s*/
It states that it removes the spaces before and after the semicolon if there are 0 or more of them. Does this mean \s* looks for spaces and removes them, and ';' in this case is the separator?
var names = 'Harry Trump ;Fred Barney; Helen Rigby ; Bill Abel ;Chris Hand ';
console.log(names);
var re = /\s*;\s*/;
var nameList = names.split(re);
console.log(nameList);
// Results
["Harry Trump", "Fred Barney", "Helen Rigby", "Bill Abel", "Chris Hand "]
If so, why doesn't /\s*^\s*/ remove the spaces before and after the ^ symbol if my string looked like this?
var names = 'Harry Trump ^Fred Barney^ Helen Rigby ^ Bill Abel ^Chris Hand ';
console.log(names);
var re = /\s*^\s*/;
var nameList = names.split(re);
console.log(nameList);
I would like to know what the symbols mean and why they are in a certain order. Thank you.
It seems you got your examples from here.
First let's look at this one /(\d)/.
Working inside out, recognize that \d matches any single digit (it is a shorthand character class, not an escaped literal character).
Now, from the article, wrapping the parentheses around the escape tells the split method to keep the delimiter (which in this case is any digit) in the returned array. Notice that without the parentheses, the returned array wouldn't have numeric elements (as strings of course). Lastly, it is wrapped in slashes (//) to create a regular expression. Basically this case says: split the string by digits and keep the digits in the returned array.
The second case /\s*;\s*/ is a little more complicated and will take some understanding of regular expressions. First note that \s matches a whitespace character (a space, a tab, a line break and so on). In regular expressions, a character c followed by a * says 'look for 0 or more of c, in consecutive order'. So this regular expression matches strings like ' ; ', ';', etc (I added the single quotes to show the spaces). Note that in this case, we don't have parentheses, so the semicolons will be excluded from the returned array.
If you're still stuck, I'd suggest reading about regular expressions and practicing writing them. This website is great, just be wary of the fact that regular expressions on that site may be slightly different from those used in JavaScript in terms of syntax.
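To make the \s* behaviour concrete, here is a small sketch with a made-up input that mixes tabs, line breaks and spaces around the semicolons. It also shows what the parentheses change.

```javascript
// Made-up input: a tab before the first semicolon, a line break and a space after it.
const messy = 'a\t;\n b ;c';

// Without parentheses the separators (whitespace included) are dropped.
const plainParts = messy.split(/\s*;\s*/);
// With parentheses the matched separators are kept as array elements.
const keptParts = messy.split(/(\s*;\s*)/);

console.log(JSON.stringify(plainParts)); // ["a","b","c"]
console.log(JSON.stringify(keptParts));  // ["a","\t;\n ","b"," ;","c"]
```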
The 1st example below splits the input string at any digit, keeping the delimiter (i.e. the digit) in the final array.
The 2nd example below shows that leaving the parentheses out still splits the array at any digit, but those digit delimiters are not included in the final array.
The 3rd example below splits the input string any time the following pattern is encountered: as many consecutive spaces as possible (including none) immediately followed by a semi-colon immediately followed by as many consecutive spaces as possible (including none).
The 4th example below shows that you can indeed split a similar input string as in the 3rd example but with "^" replacing ";". However, because the "^" by itself means "the start of the string" you have to tell JavaScript to find the actual "^" by putting a backslash (i.e. a special indicator designated for this purpose) right in front of it, i.e. "\^".
const show = (msg) => {console.log(JSON.stringify(msg));};
var myString = 'Hello 1 word. Sentence number 2.';
var splits1 = myString.split(/(\d)/);
show(splits1);
var splits2 = myString.split(/\d/);
show(splits2);
var names1 = 'Harry Trump ;Fred Barney; Helen Rigby ; Bill Abel ;Chris Hand ';
var nameList1 = names1.split(/\s*;\s*/);
show(nameList1);
var names2 = 'Harry Trump ^Fred Barney^ Helen Rigby ^ Bill Abel ^Chris Hand ';
var nameList2 = names2.split(/\s*\^\s*/);
show(nameList2);
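For comparison, here is a sketch of what the unescaped pattern from the question does with the same kind of input. Since ^ only matches at the very start of the string (no m flag), no separator is found and the string comes back as a single element. The variable names are made up, and the trim() call is an optional extra to drop the trailing space in the last name.

```javascript
const caretNames = 'Harry Trump ^Fred Barney^ Helen Rigby ^ Bill Abel ^Chris Hand ';

// Unescaped ^ is an anchor, so nothing is split.
const unescaped = caretNames.split(/\s*^\s*/);
// Escaped \^ is the literal caret character.
const escaped = caretNames.split(/\s*\^\s*/);
// Trimming first removes the trailing space that split would otherwise keep.
const trimmed = caretNames.trim().split(/\s*\^\s*/);

console.log(unescaped.length);        // 1
console.log(JSON.stringify(escaped)); // ["Harry Trump","Fred Barney","Helen Rigby","Bill Abel","Chris Hand "]
console.log(JSON.stringify(trimmed)); // ["Harry Trump","Fred Barney","Helen Rigby","Bill Abel","Chris Hand"]
```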
I have a string like the following:
SOME TEXT (BI1) SOME MORE TEXT (BI17) SOME FINAL TEXT (BI1234)
Question
I am trying to make a regex to get just the information between the round brackets, for example the end string would look like:
BI1 BI17 BI1234
I have found this example on stackoverflow which will get the first value BI1, but will ignore the rest after.
Get text between two rounded brackets
this is the REGEX I created from the above link: /\(([^)]+)\)/g but it includes the brackets, I want to remove these.
I am using this website to attempt to solve this query which has a testing window to see if the regex entered works:
http://www.regexr.com
Additional Information
There can be any amount of numbers also, which is why I have given 3 different examples.
This is a continuous string, not on separate lines.
Thanks for any help on this matter.
While this isn't possible using just regexes, you can do it with string#split and the following regex:
\).*?\(|^.*?\(|\).*?$
Yielding code that looks a bit like this:
function getBracketed(str) {
  return str.split(/\).*?\(|^.*?\(|\).*?$/).filter(Boolean);
}
(You need to filter out the empty strings that'll appear at the beginning and end if you do it this way - hence the extra operation).
If you need to keep all inside parentheses and remove everything else, you might use
var str = "SOME TEXT (BI1) SOME MORE TEXT (BI17) SOME FINAL TEXT (BI1234)";
var result = str.replace(/.*?\(([^()]*)\)/g, " $1").trim();
console.log(result);
If you need to get only the BI+digits pattern inside parentheses, use
/.*?\((BI\d+)\)/g
Details:
.*? - match any 0+ chars other than linebreak symbols
\( - match a (
(BI\d+) - Group 1 capturing BI + 1 or more digits (\d+) (or [^()]* - zero or more chars other than ( and ))
\) - a closing ).
To get all the values as array (say, for later joining), use
var str = "SOME TEXT (BI1) SOME MORE TEXT (BI17) SOME FINAL TEXT (BI1234)";
var re = /\((BI\d+)\)/g;
var res = str.match(re).map(function(s) { return s.substring(1, s.length - 1); });
console.log(res);
console.log(res.join(" "));
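As an alternative sketch, assuming String.prototype.matchAll is available in your environment, the captured group can be read directly, so there is no need to cut the brackets off with substring. It also yields an empty array instead of failing when nothing matches.

```javascript
const bracketText = 'SOME TEXT (BI1) SOME MORE TEXT (BI17) SOME FINAL TEXT (BI1234)';

// m[1] is capture group 1: the BI + digits part without the brackets.
const codes = Array.from(bracketText.matchAll(/\((BI\d+)\)/g), (m) => m[1]);
const noCodes = Array.from('no brackets here'.matchAll(/\((BI\d+)\)/g), (m) => m[1]);

console.log(codes.join(' ')); // BI1 BI17 BI1234
console.log(noCodes.length);  // 0
```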
I am trying to create a regular expression for the string filtering. I want to get the symbol "#" and anything that is written after that and before a space.
Can someone help me with this?
For example:
hi I am #vaibhav .
The expected result this regular expression should give is vaibhav.
I made this:
/#[a-z]*/
However, I am not sure if this will conform to the above mentioned criteria.
To get a substring from the # up to the first space after it, use
#\S+
The \S means a non-whitespace character.
If you do not need #, use a capturing group:
#(\S+)
The value you need will be in Group 1.
If you are using JavaScript:
var re = /#(\S+)/g;
var str = 'hi I am #vaibhav . hi, and I am #strib .';
var m;
while ((m = re.exec(str)) !== null) {
  document.write("The value is: <b>" + m[1] + "</b><br/>");
}
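The loop above writes to the page, so it only runs in a browser. A sketch of the same exec loop that collects the values into an array instead (the function name extractHashtags is made up) could be:

```javascript
function extractHashtags(text) {
  const re = /#(\S+)/g; // a fresh regex per call, so lastIndex starts at 0
  const tags = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    tags.push(m[1]); // group 1: everything after # up to the next whitespace
  }
  return tags;
}

console.log(extractHashtags('hi I am #vaibhav . hi, and I am #strib .')); // [ 'vaibhav', 'strib' ]
console.log(extractHashtags('ping #vaibhav.'));                           // [ 'vaibhav.' ]
```

The second call shows that \S+ also picks up punctuation that directly follows the name.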
The simplest solution is to use a negated set.
Search characters that are not '#'
Read in the '#'
Now capture characters that are not ' '
If you're trying to match and capture you can accomplish that like this:
[^#]*#([^ ]*).*
If you only want to search then you don't need to match the whole string and you can just extract the actual match section:
#([^ ]*)
The most complicated situation is where you need to deal with an escaped '#'. Here's an example of a match using that:
(?:[^\\#]|\\.)*#([^ ]*).*
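In JavaScript, the difference between the plain search and the escape-aware pattern can be sketched as follows. The input string is made up, and String.raw is used only so the backslash does not need doubling.

```javascript
// Made-up input containing an escaped \# before the real tag.
const taggedInput = String.raw`price \#1 for #vaibhav today`;

// Plain search: stops at the first '#', even though it is escaped.
const naiveMatch = /#([^ ]*)/.exec(taggedInput);
// Escape-aware: consumes "\#" as a pair, so the first free '#' is the real tag.
const escapedMatch = /(?:[^\\#]|\\.)*#([^ ]*).*/.exec(taggedInput);

console.log(naiveMatch[1]);   // 1
console.log(escapedMatch[1]); // vaibhav
```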
You can do that with lookarounds.
Edited version:
(?<=#)\w+
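A quick sketch of the lookbehind version in JavaScript, assuming an engine with lookbehind support (ES2018 or later). Note that \w+ stops at the first non-word character, so it behaves differently from \S+ on names containing a hyphen.

```javascript
// (?<=#) requires a '#' right before the match without including it in the result.
const lookbehindTags = 'hi I am #vaibhav . and #strib-2 .'.match(/(?<=#)\w+/g);

console.log(lookbehindTags); // [ 'vaibhav', 'strib' ]
```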
I have some data in a textarea :
(yes it is multiline)
"#ObjectTypeID", DbType.In
"#ObjectID", DbType.Int32,
"#ClaimReasonID", DbType.I
"#ClaimReasonDetails", DbTy
"#AccidendDate", DbType.Da
"#AccidendPlaceID", DbType
"#AccidendPlaceDetails", Db
"#TypeOfMedicalTreatment",
"#MedicalTreatmentDate", Db
"#CreatedBy", DbType.Int32
"#Member_ID", DbType.Strin
.ExecuteScalar(command).ToS
In each row, I want to remove this section (from the " (inclusive) till the end of the row):
I've managed to do this :
value=value.replace(/\"[a-z,. ]+(?!.*\")/gi,'')
Which means: search for the first " that has characters after it and that does not have a future ".
This will yield the required results :
"#ObjectTypeID
"#ObjectID32,
"#ClaimReasonID
"#ClaimReasonDetails
"#AccidendDate
"#AccidendPlaceID
"#AccidendPlaceDetails
"#TypeOfMedicalTreatment
"#MedicalTreatmentDate
"#CreatedBy32
"#Member_ID
.ExecuteScalar(command).ToS
Question:
I understand why it is working, but I don't understand why the following is not working:
value=value.replace(/\".+(?!.*\")/gi,'')
http://jsbin.com/fanep/4/edit
I mean: it is supposed to search for a " that has characters after it and that doesn't have a future " ...
What am I missing? I really hate to declare [a-z,. ]
+ is greedy. Since "the whole thing" matches your rule of "must not have a " after", it will go with that.
The reason your first regex works is because you are disallowing most characters by explicitly whitelisting certain ones.
To fix, try adding ? after the + - this will make it lazy instead, matching as little as possible while still meeting the rules.
Additionally, you are searching for the stuff you want to keep... and then deleting it.
Try this instead:
value = value.replace(/"[^"\r\n]*(?=[\r\n]|$)/g, '');
This will remove everything from the last " to the end of a line (or end of the input).
value=value.replace(/\"[a-z,. ]+(?!.*\")/gi,'')
means: search for the first " that has characters after it and that does not have a future "
To be exact: It matches the first " that has some of the characters [a-z,. ] after it, which then is not (in any distance) followed by another ".
I don't understand why the following is not working:
value=value.replace(/\".+(?!.*\")/gi,'')
You have removed the restriction of the character class. .+ will now match any char, including quotes. Regardless whether greedy or not, it will now find the first " that is followed by an amount of any characters (including other quotes) that are no more followed by quotes - i.e. it will suffice if .+ matches until the last quote.
I really hate to declare [a-z,. ]
You can just use the class of all characters except quotes: [^"]. Indeed, I think the following lookahead-free version matches your intent better:
value = value.replace(/"[^"\n\r]*$/gm, '');
The one that doesn't work fails because the .+ is greedy: it eats up all it can. (Visual tools can help here, such as this one: http://regex101.com/r/eJ5kJ2/1) We can make it clearer that .+ is matching too much by putting it in a capture group: http://regex101.com/r/qF7nR9/1
In your one that does work (http://regex101.com/r/kR8vL6/1), you've changed that to [a-z,. ]+, which means "one or more a to z, comma, period, or space" (note that the . there is just a period, not a wildcard). That's much more limited (in particular, it doesn't include #).
Side note: There's no need to escape the " with a backslash, " isn't a special character in regular expressions.
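To see the difference without a visual tool, here is a small sketch that runs the greedy pattern, a lazy variant and the restricted character class against one sample row from the question. The variable names are made up.

```javascript
const line = '"#ObjectID", DbType.Int32,';

// Greedy .+ runs to the end of the line; nothing follows, so (?!.*") succeeds there.
const greedy = line.replace(/".+(?!.*")/i, '');
// Lazy .+? stops as soon as no quote is left ahead, i.e. right after the second quote.
const lazy = line.replace(/".+?(?!.*")/i, '');
// The restricted class cannot match at the first quote (# is not in the class).
const restricted = line.replace(/"[a-z,. ]+(?!.*")/i, '');

console.log(JSON.stringify(greedy));     // ""
console.log(JSON.stringify(lazy));       // ", DbType.Int32,"
console.log(JSON.stringify(restricted)); // "\"#ObjectID32,"
```

On this row the lazy variant still starts at the first quote, so it removes the part that was meant to be kept.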
Why the below regex is not working?
\".+(?!.*\")
Answer:
\" matches the first " and the following .+ would match greedily up to the last character of the line. Because that last character isn't followed by zero or more characters plus a ", the negative lookahead succeeds there, so the above regex matches the whole line from the first " onward.
For your case, you could simply use the below regex (with the m flag, so that $ matches at the end of each line) to match from the second " up to the end of the line.
\"[^"\n]*$
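Applied to a multi-line value, a minimal sketch could look like the following. It uses three of the sample rows from the question, and \r is excluded from the class as well, on the assumption that the textarea may contain Windows line endings.

```javascript
const rows = [
  '"#ObjectTypeID", DbType.In',
  '"#ObjectID", DbType.Int32,',
  '.ExecuteScalar(command).ToS',
].join('\n');

// g: every line, m: $ matches at each line end.
// The class excludes quotes and line breaks, so only the last quote on a line can match.
const cleaned = rows.replace(/"[^"\r\n]*$/gm, '');

console.log(cleaned);
// "#ObjectTypeID
// "#ObjectID
// .ExecuteScalar(command).ToS
```

Lines without any quote, such as the last one, are left unchanged.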