JavaScript regex to extract different parts of a string - javascript

This is an interesting one - I am looking for a JavaScript regex solution to extract different parts from a string. Any input is much appreciated.
Example -
";1XYZ123_UK;1;2.3;evt14=0.0|evt87=0.0,;1XYZ456_UK;4;5.6;evt14=0.0;fly;0;0;;;"
I am trying to extract just these bits from the string, ignoring the rest:
"1XYZ123_UK;1;2.3;1XYZ456_UK;4;5.6;"
Basically extract anything starting with 1XYZ up until it encounters 'evt'.

var s = ';1XYZ123_UK;1;2.3;evt14=0.0|evt87=0.0,;1XYZ456_UK;4;5.6;evt14=0.0';
var r = s.match(/1XYZ((?!evt).)*/g);
This will give you your desired strings:
["1XYZ123_UK;1;2.3;", "1XYZ456_UK;4;5.6;"]
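To see why the pattern above stops where it does, here is a minimal sketch using the full sample string from the question. The variable names are only illustrative, and the non-capturing group (?:...) is a small variation on the answer's capturing group that behaves the same way with match and the g flag.

```javascript
// Full sample input from the question.
const input = ';1XYZ123_UK;1;2.3;evt14=0.0|evt87=0.0,;1XYZ456_UK;4;5.6;evt14=0.0;fly;0;0;;;';

// (?!evt). consumes one character only if "evt" does not start at that
// position, so each match ends right before the next "evt".
const pieces = input.match(/1XYZ(?:(?!evt).)*/g) || [];

// Joining the pieces gives the single string asked for in the question.
const joined = pieces.join('');
console.log(pieces); // [ '1XYZ123_UK;1;2.3;', '1XYZ456_UK;4;5.6;' ]
console.log(joined); // 1XYZ123_UK;1;2.3;1XYZ456_UK;4;5.6;
```

The `|| []` fallback is there because match returns null when nothing matches.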

var s= ";1XYZ123_UK;1;2.3;evt14=0.0|evt87=0.0,;1XYZ456_UK;4;5.6;evt14=0.0"
s = s.replace(/(evt.+?(?:\||;|$))/g, "");
console.log(s) // ";1XYZ123_UK;1;2.3;1XYZ456_UK;4;5.6;"

Use groups ((...)) to capture parts of the matched string. After a successful match, the captured substrings can be accessed via the array returned from String.prototype.match or RegExp.prototype.exec.
The first element of the array (index 0) is the whole match; the next (index 1) is the first capture.
E.g.
var re = /1XY(.*)evt/;
var result = theString.match(re);
Then, if there is a match (result is not null),
result[0]
will be the whole match (starting with 1XY and ending with evt), while
result[1]
will be the text between those strings. Note that .* is greedy, so with several evt occurrences the match runs to the last one; use .*? to stop at the first.
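A short sketch of the difference between a greedy and a lazy capture on the question's data may help here. The variable names are made up for illustration, and matchAll needs an ES2020-capable runtime.

```javascript
const text = ';1XYZ123_UK;1;2.3;evt14=0.0|evt87=0.0,;1XYZ456_UK;4;5.6;evt14=0.0';

// Greedy: (.*) runs as far as it can, so the match ends at the LAST "evt".
const greedy = text.match(/1XY(.*)evt/);
console.log(greedy[1]); // Z123_UK;1;2.3;evt14=0.0|evt87=0.0,;1XYZ456_UK;4;5.6;

// Lazy: (.*?) stops at the FIRST "evt" after each "1XYZ".
// matchAll with the g flag yields one match array per occurrence.
const lazyCaptures = [...text.matchAll(/1XYZ(.*?)evt/g)].map((m) => m[1]);
console.log(lazyCaptures); // [ '123_UK;1;2.3;', '456_UK;4;5.6;' ]
```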

Related

Whats wrong with this regex logic

I am trying to fetch the value after the equals sign. It works, but I am getting duplicated values. Any idea what's wrong here?
// Regex for finding a word after "=" sign
var myregexpNew = /=(\S*)/g;
// Regex for finding a word before "=" sign
var mytype = /(\S*)=/g;
//Setting data from Grid Column
var strNew = "QCById=20";
var matchNew = myregexpNew.exec(strNew);
var newtype = mytype.exec(strNew);
alert(matchNew);
https://jsfiddle.net/6vjjv0hv/
exec returns an array: the first element is the full match and the following ones are the submatches. That's why you get ["=20", "20"] (using console.log here instead of alert would make it clearer what you get).
When looking for submatches and using exec, you're usually interested in the elements starting at index 1.
Regarding the parsing as a whole, there are clearly better solutions, like using a single regex with two submatches, but it depends on the real goal.
You can try without using Regex like this:
var val = 'QCById=20';
var myString = val.substr(val.indexOf("=") + 1);
alert(myString);
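If both sides of the = are needed without a regex, the same indexOf idea can be extended. This is only a sketch; the function name splitAtFirstEquals is hypothetical and not part of the question.

```javascript
// Split "key=value" at the first "=" without using a regex.
// Returns null when there is no "=" in the input.
function splitAtFirstEquals(text) {
  const index = text.indexOf('=');
  if (index === -1) {
    return null;
  }
  return { key: text.slice(0, index), value: text.slice(index + 1) };
}

const pair = splitAtFirstEquals('QCById=20');
console.log(pair.key);   // QCById
console.log(pair.value); // 20
```

Checking for -1 matters: without it, indexOf("=") + 1 would be 0 and the whole string would come back as the "value".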
Currently, exec is returning the whole match array, not just the captured value.
REGEXP.exec(SOMETHING) returns an array (see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec).
The first item in the array is the full match and the rest matches the parenthesized substrings.
You do not get duplicated values, you just get an array of a matched value and the captured text #1.
See RegExp#exec() help:
If the match succeeds, the exec() method returns an array and updates properties of the regular expression object. The returned array has the matched text as the first item, and then one item for each capturing parenthesis that matched containing the text that was captured.
Just use the [1] index to get the captured text only.
var myregexpNew = /=(\S*)/g;
var strNew = "QCById=20";
var matchNew = myregexpNew.exec(strNew);
if (matchNew) {
  console.log(matchNew[1]);
}
To get values on both sides of =, you can use /(\S*)=(\S*)/g regex:
var myregexpNew = /(\S*)=(\S*)/g;
var strNew = "QCById=20";
var matchNew = myregexpNew.exec(strNew);
if (matchNew) {
  console.log(matchNew[1]);
  console.log(matchNew[2]);
}
Also, you may want to check that the captured values are not undefined or empty, since \S* may capture an empty string. Or use the /(\S+)=(\S+)/g regex, which requires at least one non-whitespace character before and after the = sign.
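The g flag only pays off when exec is called repeatedly, since each call resumes from the regex's lastIndex. Below is a rough sketch of that loop. The function name parsePairs and the two-pair input string are assumptions made up for the example, and the key pattern [^\s=]+ is a slightly stricter variation on \S+.

```javascript
// Collect every key=value pair in a string into a plain object.
function parsePairs(text) {
  // With the g flag, each exec() call continues after the previous match.
  const pairPattern = /([^\s=]+)=(\S+)/g;
  const result = {};
  let match;
  while ((match = pairPattern.exec(text)) !== null) {
    // match[0] is the whole "key=value"; match[1] and match[2] are the captures.
    result[match[1]] = match[2];
  }
  return result;
}

const parsed = parsePairs('QCById=20 Status=Open');
console.log(parsed); // { QCById: '20', Status: 'Open' }
```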

remove all but a specific portion of a string in javascript

I am writing a little app for Sharepoint. I am trying to extract some text from the middle of a field that is returned:
var ows_MetaInfo="1;#Subject:SW|NameOfADocument
vti_parservers:SR|23.0.0.6421
ContentTypeID:SW|0x0101001DB26Cf25E4F31488B7333256A77D2CA
vti_cachedtitle:SR|NameOfADocument
vti_title:SR|ATitleOfADocument
_Author:SW:|TheNameOfOurCompany
_Category:SW|
ContentType:SW|Document
vti_author::SR|mrwienerdog
_Comments:SW|This is very much the string I need extracted
vti_categories:VW|
vtiapprovallevel:SR|
vti_modifiedby:SR|mrwienerdog
vti_assignedto:SR|
Keywords:SW|Project Name
ContentType _Comments"
So, all I want returned is "This is very much the string I need extracted".
Do I need a regex and a string replace? How would you write the regex?
Yes, you can use a regular expression for this (this is the sort of thing they are good for). Assuming you always want the string after the pipe (|) on the line starting with "_Comments:SW|", here's how you can extract it:
var matchresult = ows_MetaInfo.match(/^_Comments:SW\|(.*)$/m);
var comment = (matchresult==null) ? "" : matchresult[1];
Note that the .match() method of the String object returns an array. The first element (index 0) is the entire match (here, the entire match is the whole line, as we anchored it with ^ and $; adding the "m" after the regex makes it a multiline regex, allowing ^ and $ to match the start and end of any line within the multi-line input), and the rest of the array holds the submatches that we capture using parentheses. Above we've captured the part of the line that you want, so it will be present in the second item of the array (index 1).
If there is no match ("_Comments:SW|" doesn't appear in ows_MetaInfo), then .match() will return null, which is why we test it before pulling out the comment.
If you need to adjust the regex for other scenarios, have a look at the Regex docs on Mozilla Dev Network: https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions
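The same multiline approach can be generalized to any key in the field. The sketch below is an assumption-laden illustration: getMetaField and escapeRegExp are hypothetical helper names, and sampleMeta is a shortened stand-in for the real ows_MetaInfo value, joined with "\n" line breaks.

```javascript
// Escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return the text after "key|" on the line that starts with that key,
// or an empty string when the key is not present.
function getMetaField(metaInfo, key) {
  const pattern = new RegExp('^' + escapeRegExp(key) + '\\|(.*)$', 'm');
  const match = metaInfo.match(pattern);
  return match ? match[1] : '';
}

// Shortened sample data, assumed to use "\n" line breaks.
const sampleMeta = [
  '1;#Subject:SW|NameOfADocument',
  'ContentType:SW|Document',
  '_Comments:SW|This is very much the string I need extracted',
  'vti_categories:VW|',
].join('\n');

console.log(getMetaField(sampleMeta, '_Comments:SW'));
// This is very much the string I need extracted
```

If the real data uses "\r\n" line breaks, the capture would keep a trailing "\r", so a trim() on the result may be needed.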
You can use this code:
var match = ows_MetaInfo.match(/_Comments:SW\|([^\n]+)/);
if (match)
document.writeln(match[1]);
I'm far from competent with RegEx, so here is my RegEx-less solution. See comments for further detail.
var extractedText = ExtractText(ows_MetaInfo);
function ExtractText(arg) {
  // Use the pipe delimiter to turn the string into an array
  var aryValues = arg.split("|");
  // The comment text sits between "_Comments:SW|" and the next key, so it is
  // the piece that ends with "vti_categories:VW"
  for (var i = 0; i < aryValues.length; i++) {
    if (aryValues[i].indexOf("vti_categories:VW") != -1)
      // Drop the trailing key name and the line break before it
      return aryValues[i].replace("vti_categories:VW", "").trim();
  }
  return null;
}
Here's a working fiddle to demonstrate.

String manipulation - getting value after the last position of a char

How can I get the value after the last occurrence of a character (., ;, +, _, etc.)?
e.g.
string.name+org.com
I want to get "com".
Is there any function in jQuery?
Use lastIndexOf and substr to find the character and get the part of the string after it:
var extension = name.substr(name.lastIndexOf(".") + 1);
Demo: http://jsfiddle.net/Guffa/K3BWn/
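The lastIndexOf idea can also cover a set of delimiters by taking the largest index found. This is a sketch only; afterLastOf is a hypothetical name, and slice is used in place of substr.

```javascript
// Return the part of `text` after the last occurrence of any character
// in `delimiters`. If none is found, the whole string is returned.
function afterLastOf(text, delimiters) {
  // lastIndexOf gives -1 for a missing character, so Math.max picks the
  // right-most delimiter that is actually present.
  const lastIndex = Math.max(...[...delimiters].map((ch) => text.lastIndexOf(ch)));
  return text.slice(lastIndex + 1);
}

console.log(afterLastOf('string.name+org.com', '.;+_')); // com
console.log(afterLastOf('string.name+org', '.;+_'));     // org
```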
A simple and readable approach to get the substring after the last occurrence of a character from a defined set is to split the string with a regular expression containing a character class and then use pop() to get the last element of the resulting array:
The pop() method removes the last element from an array and returns that element.
See a JS demo below:
var s = 'string.name+org.com';
var result = s.split(/[.;+_]/).pop();
console.log(result);
Note that no g flag is needed: split() splits at all non-overlapping occurrences of the regex by default.
NOTE: If you need to match ^, ], \ or -, you may escape them and use anywhere inside the character class (e.g. /[\^\-\]\\]/). It is possible to avoid escaping ^ (if you do not put it right after the opening [), - (if it is right after the opening [, right before the closing ], after a valid range, or between a shorthand character class and another symbol): /[-^\]\\]/.
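As a sketch of the escaping rules in the note above, the helper below builds the character class from an arbitrary delimiter string and escapes the four characters that are special inside a class. The name afterLastDelimiter is hypothetical, and the delimiter string is assumed to be non-empty.

```javascript
// Build a character class from `delimiterChars` and return the part of
// `text` after the last delimiter.
function afterLastDelimiter(text, delimiterChars) {
  // Inside [...], only \ ] ^ and - need escaping to be taken literally.
  const escaped = delimiterChars.replace(/[\\\]^-]/g, '\\$&');
  const splitter = new RegExp('[' + escaped + ']');
  return text.split(splitter).pop();
}

console.log(afterLastDelimiter('string.name+org.com', '.;+_')); // com
// The second argument here is the four characters - ^ ] and \
console.log(afterLastDelimiter('a-b^c]d\\e', '-^]\\'));         // e
```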
Also, if you need to split with a single char, no regex is necessary:
// Get the substring after the last dot
var result = 'string.name+org.com'.split('.').pop();
console.log(result);
Not jQuery, just JavaScript: lastIndexOf and substring would do it for a single character (but not since the question was updated to ask for multiple characters). A regular expression would also work: a capture group containing a negated character class, followed by an end-of-string anchor, e.g. /([^.;+_]+)$/ used with RegExp#exec or String#match.
E.g.:
var match = /([^.;+_]+)$/.exec(theStringToTest),
result = match && match[1];
var s = "string.name+org.com",
lw = s.replace(/^.+[\W]/, '');
console.log(lw) /* com */
This will also work for the following (note that \W does not match _, so an underscore is not treated as a delimiter):
string.name+org/com
string.name+org.info
You can use RegExp Object.
Try this code:
"http://stackoverflow.com".replace(/.*\./,"");
I'll throw in a crazy (i.e. no RegExp) one:
var s = 'string.name+org.com';
var a = s.split('.'); //puts all sub-Strings delimited by . into an Array
var result = a[a.length-1]; //gets the last element of that Array
alert(result);
EDIT: Since the updated question demands multiple delimiters, this is probably not the way to go. Too crazy.....
Use a JavaScript function like this (it assumes the part you want is always three characters long):
url.substr(url.length - 3);
Maybe this is too late to consider, but this code works fine for me with jQuery:
var afterDot = value.substr(value.lastIndexOf('_') + 1);
You could just replace '_' with '.'.
var myString = 'asd/f/df/xc/asd/test.jpg'
var parts = myString.split('/');
var answer = parts[parts.length - 1];
console.log(answer);

Regex in javascript complex

string str contains somewhere within it http://www.example.com/ followed by 2 digits and 7 random characters (upper or lower case). One possibility is http://www.example.com/45kaFkeLd or http://www.example.com/64kAleoFr. So the only certain aspect is that it always starts with 2 digits.
I want to retrieve "64kAleoFr".
var url = str.match([regex here]);
The regex you’re looking for is /[0-9]{2}[a-zA-Z]{7}/.
var string = 'http://www.example.com/64kAleoFr',
match = (string.match(/[0-9]{2}[a-zA-Z]{7}/) || [''])[0];
console.log(match); // '64kAleoFr'
Note that on the second line, I use the good old .match() trick to make sure no TypeError is thrown when no match is found. Once this snippet has executed, match will either be the empty string ('') or the value you were after.
you could use
var url = str.match(/\d{2}.{7}$/)[0];
where:
\d{2} //two digits
.{7} //seven characters
$ //end of the string
if you don't know whether it will be at the end, drop the $ anchor and match the leading slash instead:
var url = str.match(/\/\d{2}.{7}/)[0].slice(1); // match the "/" at the beginning and slice it out
what about using split ?
alert("http://www.example.com/64kAleoFr".split("/")[3]);
var url = "http://www.example.com/",
    re = new RegExp(url.replace(/\./g,"\\.") + "(\\d{2}[A-Za-z]{7})"),
    str = "This is a string with a url: http://www.example.com/45kaFkeLd in the middle.";
var code = str.match(re);
if (code != null) {
  // we have a match
  alert(code[1]); // "45kaFkeLd"
}
The url needs to be part of the regex if you want to avoid matching other strings of characters elsewhere in the input. The above assumes that the url should be configurable, so it constructs a regex from the url variable (noting that "." has special meaning in a regex so it needs to be escaped). The bit with the two digits and seven letters is then in parentheses so it can be captured.
Demo: http://jsfiddle.net/nnnnnn/NzELc/
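If the configurable URL might contain regex metacharacters other than ".", a fuller escape step is safer. The following is a sketch under that assumption; escapeRegExp and extractCode are hypothetical helper names, not something from the question.

```javascript
// Escape every character that is special in a regex, not only ".".
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Find baseUrl followed by 2 digits and 7 letters; return that code or null.
function extractCode(text, baseUrl) {
  const pattern = new RegExp(escapeRegExp(baseUrl) + '(\\d{2}[A-Za-z]{7})');
  const match = text.match(pattern);
  return match ? match[1] : null;
}

const sample = 'This is a string with a url: http://www.example.com/45kaFkeLd in the middle.';
console.log(extractCode(sample, 'http://www.example.com/')); // 45kaFkeLd
```

Returning null instead of indexing the match directly avoids a TypeError when the URL is not in the input.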
http://www\\.example\\.com/([0-9]{2}\\w{7}) is your pattern (the backslashes are doubled because it is written as a string, e.g. for new RegExp). You'll get your 2 digits and 7 random characters in group 1.
If you look at your example strings, both have a couple of digits and a random string after a slash (/). If that pattern is fixed, I would rather suggest splitting the string on the slash and taking the last element of the resulting array.
Here is how:
var string = "http://www.example.com/64kAleoFr";
var ar = string.split("/");
ar[ar.length - 1]; // "64kAleoFr"
Hope it helps

Javascript Regexp - Match Characters after a certain phrase

I was wondering how to use a regexp to match a phrase that comes after a certain match. Like:
var phrase = "yesthisismyphrase=thisiswhatIwantmatched";
var match = /phrase=.*/;
That will match from the phrase= to the end of the string, but is it possible to get everything after the phrase= without having to modify a string?
You use capture groups (denoted by parentheses).
When you execute the regex via the match or exec function, they return an array containing the whole match followed by the substrings captured by the capture groups. You can then access what got captured via that array. E.g.:
var phrase = "yesthisismyphrase=thisiswhatIwantmatched";
var myRegexp = /phrase=(.*)/;
var match = myRegexp.exec(phrase);
alert(match[1]);
or
var arr = phrase.match(/phrase=(.*)/);
if (arr != null) { // Did it match?
alert(arr[1]);
}
phrase.match(/phrase=(.*)/)[1]
returns
"thisiswhatIwantmatched"
The brackets specify a so-called capture group. Contents of capture groups get put into the resulting array, starting from 1 (0 is the whole match).
It is not so hard. Just assume your context is:
const context = "https://example.com/pa/GIx89GdmkABJEAAA+AAAA";
We want the part after pa/, so use this code:
const pattern = context.match(/pa\/(.*)/)[1];
The first item of the array (index 0) includes pa/, while the second item (index 1), the capture group, is without pa/; use whichever you need.
Let's try this; I hope it works:
var p = /\b([\w|\W]+)\1+(\=)([\w|\W]+)\1+\b/;
console.log(p.test('case1 or AA=AA ilkjoi'));
console.log(p.test('case2 or AA=AB'));
console.log(p.test('case3 or 12=14'));
If you want to get the value after the phrase, excluding the phrase itself, use this:
/(?:phrase=)(.*)/
the result will be
0: "phrase=thisiswhatIwantmatched" //full match
1: "thisiswhatIwantmatched" //matching group
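For comparison, here is a small sketch of three ways to get only the text after phrase=. The lookbehind and named-group forms are not from the answers above; they assume an ES2018-capable engine (and optional chaining assumes ES2020).

```javascript
const phrase = 'yesthisismyphrase=thisiswhatIwantmatched';

// 1. Capture group: the wanted text is at index 1.
const viaGroup = (phrase.match(/phrase=(.*)/) || [])[1];

// 2. Lookbehind: "phrase=" must precede the match but is not part of it,
//    so the whole match (index 0) is already the wanted text.
const viaLookbehind = (phrase.match(/(?<=phrase=).*/) || [])[0];

// 3. Named group: same as 1, but read by name instead of by index.
const viaNamedGroup = phrase.match(/phrase=(?<value>.*)/)?.groups.value;

console.log(viaGroup);      // thisiswhatIwantmatched
console.log(viaLookbehind); // thisiswhatIwantmatched
console.log(viaNamedGroup); // thisiswhatIwantmatched
```

All three yield undefined rather than throwing when the phrase is absent.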
