I am writing a little app for Sharepoint. I am trying to extract some text from the middle of a field that is returned:
var ows_MetaInfo="1;#Subject:SW|NameOfADocument
vti_parservers:SR|23.0.0.6421
ContentTypeID:SW|0x0101001DB26Cf25E4F31488B7333256A77D2CA
vti_cachedtitle:SR|NameOfADocument
vti_title:SR|ATitleOfADocument
_Author:SW:|TheNameOfOurCompany
_Category:SW|
ContentType:SW|Document
vti_author::SR|mrwienerdog
_Comments:SW|This is very much the string I need extracted
vti_categories:VW|
vtiapprovallevel:SR|
vti_modifiedby:SR|mrwienerdog
vti_assignedto:SR|
Keywords:SW|Project Name
ContentType _Comments"
So, all I want returned is "This is very much the string I need extracted".
Do I need a regex and a string replace? How would you write the regex?
Yes, you can use a regular expression for this (this is the sort of thing they are good for). Assuming you always want the string after the pipe (|) on the line starting with "_Comments:SW|", here's how you can extract it:
var matchresult = ows_MetaInfo.match(/^_Comments:SW\|(.*)$/m);
var comment = (matchresult==null) ? "" : matchresult[1];
Note that the .match() method of the String object returns an array. The first element (index 0) is the entire match; here that is the whole line, because we anchored the pattern with ^ and $. (Adding the "m" flag after the regex makes it a multiline regex, so ^ and $ match the start and end of any line within the multi-line input.) The rest of the array holds the submatches captured with parentheses. Above we've captured the part of the line that you want, so it will be the second item in the array (index 1).
If there is no match ("_Comments:SW|" doesn't appear in ows_MetaInfo), then .match() returns null, which is why we test the result before pulling out the comment.
If you need to adjust the regex for other scenarios, have a look at the Regex docs on Mozilla Dev Network: https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions
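If several fields need to be pulled out of the same string, the same pattern can be built per field name. This is only a minimal sketch, assuming each value follows a marker such as SW| or SR| on a single line terminated by \n; getMetaField and the shortened sample string are hypothetical and not part of SharePoint:

```javascript
// Hypothetical helper: returns the text after "<fieldName>:<TYPE>|" on its line,
// or "" when the field is absent.
function getMetaField(metaInfo, fieldName) {
  // Escape characters that have a special meaning in a regular expression
  var escaped = fieldName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // "m" lets ^ and $ match at line boundaries inside the multi-line input
  var re = new RegExp("^" + escaped + ":[A-Z]+\\|(.*)$", "m");
  var matchresult = metaInfo.match(re);
  return matchresult === null ? "" : matchresult[1];
}

// Shortened, made-up sample in the same shape as ows_MetaInfo
var sample = "vti_title:SR|ATitleOfADocument\n" +
             "_Comments:SW|This is very much the string I need extracted\n" +
             "vti_categories:VW|";

console.log(getMetaField(sample, "_Comments")); // This is very much the string I need extracted
console.log(getMetaField(sample, "vti_title")); // ATitleOfADocument
```

A missing field and a field with an empty value both come back as "", which may or may not be what you want to distinguish.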
You can use this code:
var match = ows_MetaInfo.match(/_Comments:SW\|([^\n]+)/);
if (match)
document.writeln(match[1]);
I'm far from competent with RegEx, so here is my RegEx-less solution. See comments for further detail.
var extractedText = ExtractText(ows_MetaInfo);
function ExtractText(arg) {
    // Use the pipe delimiter to turn the string into an array
    var aryValues = arg.split("|");
    // The comment text sits between "_Comments:SW|" and the next pipe, so it ends up
    // in the same array element as the key that follows it, "vti_categories:VW"
    for (var i = 0; i < aryValues.length; i++) {
        if (aryValues[i].indexOf("vti_categories:VW") != -1)
            // Strip the key and the line break that precedes it
            return aryValues[i].replace("vti_categories:VW", "").trim();
    }
    return null;
}
Here's a working fiddle to demonstrate.
I have the following example url: #/reports/12/expense/11.
I need to get the id just after reports, i.e. 12. What I am asking is the most suitable way to do this. I can search for "reports" in the url and take the content just after it, but if at some point I decide to change the url, I will have to change my algorithm.
What do you think is the best way here? Some code examples would also be very helpful.
It's hard to write code that is future-proof since it's hard to predict the crazy things we might do in the future!
However, if we assume that the id will always be the string of consecutive digits in the URL then you could simply look for that:
function getReportId(url) {
var match = url.match(/\d+/);
return (match) ? Number(match[0]) : null;
}
getReportId('#/reports/12/expense/11'); // => 12
getReportId('/some/new/url/report/12'); // => 12
You should use a regular expression to find the number inside the string. Passing the regular expression to the string's .match() method will return an array containing the matches based on the regular expression. In this case, the item of the returned array that you're interested in will be at the index of 1, assuming that the number will always be after reports/:
var text = "#/reports/12/expense/11";
var id = text.match(/reports\/(\d+)/);
alert(id[1]);
\d+ here means one or more consecutive digits.
var text = "#/reports/12/expense/11";
var id = text.match("#/[a-zA-Z]*/([0-9]*)/[a-zA-Z]*/")
console.log(id[1])
Regex explanation:
#/ matches the characters #/ literally
[a-zA-Z]* - matches a word
/ matches the character / literally
1st Capturing group - ([0-9]*) - this matches a number.
[a-zA-Z]* - matches a word
/ matches the character / literally
Regular expressions can be tricky (and expensive), so if you can efficiently do the same thing without them, you usually should. Looking at your URL format, you would probably want to put at least a few constraints on it, otherwise the problem becomes very complex. For instance, you probably want to assume that the value always appears directly after the key, so in your sample reports=12 and expense=11, but reports and expense could be switched (e.g. expense/11/reports/12) and you would get the same result.
I would just use string split:
var parts = url.split("/");
for (var i = 0; i < parts.length; i++) {
    if (parts[i] === "reports") {
        this.reportValue = parts[i + 1];
        i++; // skip the value we just read
    } else if (parts[i] === "expense") {
        this.expenseValue = parts[i + 1];
        i++; // skip the value we just read
    }
}
So this way your key/value parts can appear anywhere in the array
Note: you will also want to check that i+1 is in the range of the parts array. But that would just make this sample code ugly and it is pretty easy to add in. Depending on what values you are expecting (or not expecting) you might also want to check that values are numbers using isNaN
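The range and isNaN checks mentioned in the note could look roughly like this. It is a sketch under the assumption that every value is numeric and directly follows its key; parseKeyValuePath and the keys array are names made up for the illustration:

```javascript
// Hypothetical helper: collects numeric values that directly follow known keys.
function parseKeyValuePath(url, keys) {
  var parts = url.split("/");
  var result = {};
  // Stop one short of the end so parts[i + 1] is always in range
  for (var i = 0; i < parts.length - 1; i++) {
    if (keys.indexOf(parts[i]) === -1) continue;
    var raw = parts[i + 1];
    // Number("") is 0, so reject empty segments explicitly
    if (raw !== "" && !isNaN(Number(raw))) {
      result[parts[i]] = Number(raw);
      i++; // skip the value that was just consumed
    }
  }
  return result;
}

console.log(parseKeyValuePath("#/reports/12/expense/11", ["reports", "expense"]));
// => { reports: 12, expense: 11 }
console.log(parseKeyValuePath("expense/11/reports/12", ["reports", "expense"]));
// => { expense: 11, reports: 12 }
```

Returning a plain object instead of assigning to this keeps the sketch independent of where it is called from.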
I'm getting nowhere with this...
I need to test a string if it contains %2 and at the same time does not contain /. I can't get it to work using regex. Here is what I have:
var re = new RegExp(/.([^\/]|(%2))*/g);
var s = "somePotentially%2encodedStringwhichMayContain/slashes";
console.log(re.test(s)) // true
Question:
How can I write a regex that checks a string if it contains %2 while not containing any / slashes?
While the link referred to by Sebastian S. is correct, there's an easier way to do this as you only need to check if a single character is not in the string.
/^[^\/]*%2[^\/]*$/
EDIT: Too late... Oh well :P
Try the following:
^(?!.*/).*%2
Either use inverse matching, as shown in "Regular expression to match a line that doesn't contain a word?",
or use indexOf in an if statement. indexOf returns the position of a substring within a string, or -1 if it is not found:
var s = "test/";
if(s.indexOf("/")!=-1){
//contains "/"
}else {
//doesn't contain "/"
}
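The three suggestions in this thread can be compared side by side. This is a minimal sketch assuming single-line input strings; the function names are invented for the comparison, and the slash in the lookahead pattern is escaped because it is written as a regex literal:

```javascript
// Approach 1: one anchored regex. Every character must be a non-slash,
// and "%2" must occur somewhere in between.
function viaAnchoredRegex(s) {
  return /^[^\/]*%2[^\/]*$/.test(s);
}

// Approach 2: a negative lookahead rejects any string containing a slash,
// then the rest of the pattern looks for "%2".
function viaLookahead(s) {
  return /^(?!.*\/).*%2/.test(s);
}

// Approach 3: no regex at all, just two substring searches.
function viaIndexOf(s) {
  return s.indexOf("%2") !== -1 && s.indexOf("/") === -1;
}

var samples = ["some%2encodedString", "some%2encoded/String", "plainString"];
samples.forEach(function (s) {
  console.log(s, viaAnchoredRegex(s), viaLookahead(s), viaIndexOf(s));
});
// some%2encodedString true true true
// some%2encoded/String false false false
// plainString false false false
```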
In the following code:
"a sasas b".match(/sas/g) //returns ["sas"]
The string actually includes two occurrences of sas: a [sas]as b and a sa[sas] b.
How can I modify the RegEx to match both?
Another example:
"aaaa".match(/aa/g); // actually includes [aa]aa, a[aa]a, aa[aa]
Please consider the issue in general, not just the instances above.
A pure RegEx solution is preferred.
If you want to match at least one such "merged" occurrence, then you could do something like:
"a sasas b".match(/s(as)+/g)
If you want to retrieve the matches as separate results, then you have a bit more work to do; this is not a case that regular expressions are designed to handle. The basic algorithm would be:
1. Attempt a match. If it was unsuccessful, stop.
2. Extract the match you are interested in and do whatever you want with it.
3. Take the substring of the original target string, starting from one character following the first character in your match.
4. Start over, using this substring as the new input.
(To be more efficient, you could match with an offset instead of using substrings; that technique is discussed in this question.)
For example, you would start with "a sasas b". After the first match, you have "sas". Taking the substring that starts one character after the match starts, we would have "asas b". The next match would find the "sas" here, and you would again repeat the process with "as b". This would fail to match, so you would be done.
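The substring-based loop described above might be sketched as follows. matchOverlapping is a made-up name, and the sketch deliberately rebuilds the regex without the g flag so that exec always searches from the start of the remaining substring:

```javascript
// Hypothetical helper implementing the substring loop: match, record,
// then retry on the text starting one character after the match began.
function matchOverlapping(input, re) {
  // A non-global copy: exec on it ignores lastIndex and starts at position 0
  var plain = new RegExp(re.source, re.ignoreCase ? "i" : "");
  var results = [];
  var rest = input;
  var m;
  while (rest.length > 0 && (m = plain.exec(rest)) !== null) {
    results.push(m[0]);
    // Drop everything up to and including the first character of the match
    rest = rest.substring(m.index + 1);
  }
  return results;
}

console.log(matchOverlapping("a sasas b", /sas/g)); // [ 'sas', 'sas' ]
console.log(matchOverlapping("aaaa", /aa/));        // [ 'aa', 'aa', 'aa' ]
```

Because the remaining string shrinks by at least one character per iteration, the loop always terminates, even for patterns that can match the empty string.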
This significantly-improved answer owes itself to #EliGassert.
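String.prototype.match_overlap = function(re)
{
    // exec only advances through the string for global regexes, so recompile if needed
    if (!re.global)
        re = new RegExp(re.source,
            'g' + (re.ignoreCase ? 'i' : '')
                + (re.multiline ? 'm' : ''));
    var matches = [];
    var result;
    while ((result = re.exec(this)) !== null)
    {
        matches.push(result);
        // Resume one character after the start of this match to allow overlaps
        re.lastIndex = result.index + 1;
    }
    return matches.length ? matches : null;
};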
#EliGassert points out that there is no need to walk through the entire string character by character; instead we can find a match anywhere (i.e. do without the anchor), and then continue one character after the index of the found match. While researching how to retrieve said index, I found that the re.lastIndex property, used by exec to keep track of where it should continue its search, is in fact settable! This works rather nicely with what we intend to do.
The only bit needing further explanation might be the beginning. In the absence of the g flag, exec may never return null (always returning its one match, if it exists), thus possibly going into an infinite loop. Since, however, match_overlap by design seeks multiple matches, we can safely recompile any non-global RegExp as a global RegExp, importing the i and m options as well if set.
Here is a new jsFiddle: http://jsfiddle.net/acheong87/h5MR5/.
document.write("<pre>");
document.write('sasas'.match_overlap(/sas/));
document.write("\n");
document.write('aaaa'.match_overlap(/aa/));
document.write("\n");
document.write('my1name2is3pilchard'.match_overlap(/[a-z]{2}[0-9][a-z]{2}/));
document.write("</pre>");
Output:
sas,sas
aa,aa,aa
my1na,me2is,is3pi
var match = "a sasas b".match(/s(?=as)/g);
for(var i =0; i != match.length; ++i)
alert(match[i]);
Going off of the comment by Q. Sheets and the response by cdhowie, I came up with the above solution: it consumes ONE character in the regular expression and does a lookahead for the rest of the match string. With these two pieces, you can construct all the positions and matching strings in your regular expression.
I wish there was an "inspect but don't consume" operator that you could use to actually include the rest of the matching (lookahead) string in the results, but there unfortunately isn't -- at least not in JS.
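One way to get close to that in JavaScript is to put a capturing group inside the lookahead: the lookahead itself consumes nothing, but the group's text is still recorded on each match. A sketch, assuming an environment that supports String.prototype.matchAll (ES2020):

```javascript
// The overall match is zero-length; the interesting text is in capture group 1.
// matchAll requires the g flag and steps past zero-length matches by itself.
var matches = Array.from("a sasas b".matchAll(/(?=(sas))/g), function (m) {
  return { text: m[1], index: m.index };
});

console.log(matches);
// [ { text: 'sas', index: 2 }, { text: 'sas', index: 4 } ]
```

With a manual exec loop instead of matchAll, lastIndex would have to be bumped by hand after each zero-length match to avoid looping forever.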
Here's a generic way to do it:
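String.prototype.match_overlap = function(regexp)
{
    // regexp.source is the pattern text without the surrounding slashes or flags
    // (flags are not carried over here). The non-capturing group keeps the anchor
    // applied to the whole pattern, even if it contains alternation.
    var re = new RegExp('^(?:' + regexp.source + ')');
    var matches = [];
    var result;
    for (var i = 0; i < this.length; i++)
        if ((result = re.exec(this.substr(i))) !== null)
            matches.push(result);
    return matches.length ? matches : null;
};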
Usage:
var results = 'sasas'.match_overlap(/sas/);
Returns:
An array of (overlapping) matches, or null.
Example:
Here's a jsFiddle in which this:
document.write("<pre>");
document.write('sasas'.match_overlap(/sas/));
document.write("\n");
document.write('aaaa'.match_overlap(/aa/));
document.write("\n");
document.write('my1name2is3pilchard'.match_overlap(/[a-z]{2}[0-9][a-z]{2}/));
document.write("</pre>");
returns this:
sas,sas
aa,aa,aa
my1na,me2is,is3pi
Explanation:
To explain a little bit, we intend for the user to pass a RegExp object to this new function, match_overlap, as he or she would do normally with match. From this we want to create a new RegExp object anchored at the beginning (to prevent duplicate overlapped matches—this part probably won't make sense unless you encounter the issue yourself—don't worry about it). Then, we simply match against each substring of the subject string this and push the results to an array, which is returned if non-empty (otherwise returning null). Note that if the user passes in an expression that is already anchored, this is inherently wrong—at first I stripped anchors out, but then I realized I was making an assumption in the user's stead, which we should avoid. Finally one could go further and somehow merge the resulting array of matches into a single match result resembling what would normally occur with the //g option; and one could go even further and make up a new flag, e.g. //o that gets parsed to do overlap-matching, but this is getting a little crazy.
How can I get the value after the last occurrence of a delimiter character (. ; + _ etc.)?
e.g.
string.name+org.com
I want to get "com".
Is there any function in jQuery?
Use lastIndexOf and substr to find the character and get the part of the string after it:
var extension = name.substr(name.lastIndexOf(".") + 1);
Demo: http://jsfiddle.net/Guffa/K3BWn/
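The same lastIndexOf idea can be stretched to a set of delimiter characters by taking the right-most position of any of them. A minimal sketch; afterLastDelimiter is a hypothetical helper, and the delimiter set ".;+_" is taken from the question:

```javascript
// Hypothetical helper: returns the text after the last occurrence of any
// character in `delimiters`, or the whole string if none is present.
function afterLastDelimiter(s, delimiters) {
  var last = -1;
  for (var i = 0; i < delimiters.length; i++) {
    // lastIndexOf returns -1 when the character does not occur
    last = Math.max(last, s.lastIndexOf(delimiters.charAt(i)));
  }
  return s.substring(last + 1);
}

console.log(afterLastDelimiter("string.name+org.com", ".;+_")); // com
console.log(afterLastDelimiter("string.name+org", ".;+_"));     // org
```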
A simple and readable approach to get the substring after the last occurrence of a character from a defined set is to split the string with a regular expression containing a character class and then use pop() to get the last element of the resulting array:
The pop() method removes the last element from an array and returns that element.
See a JS demo below:
var s = 'string.name+org.com';
var result = s.split(/[.;+_]/).pop();
console.log(result);
Note that split() splits at all non-overlapping occurrences of the regex by default.
NOTE: If you need to match ^, ], \ or -, you may escape them and use anywhere inside the character class (e.g. /[\^\-\]\\]/). It is possible to avoid escaping ^ (if you do not put it right after the opening [), - (if it is right after the opening [, right before the closing ], after a valid range, or between a shorthand character class and another symbol): /[-^\]\\]/.
Also, if you need to split with a single char, no regex is necessary:
// Get the substring after the last dot
var result = 'string.name+org.com'.split('.').pop();
console.log(result);
Not jQuery, just JavaScript: lastIndexOf and substring would do it for a single delimiter (though not since the question was updated to ask about multiple characters). So would a regular expression with a capture group containing a negated character class followed by an end-of-string anchor, e.g. /([^.;+_]+)$/ used with RegExp#exec or String#match.
E.g.:
var match = /([^.;+_]+)$/.exec(theStringToTest),
result = match && match[1];
var s = "string.name+org.com",
lw = s.replace(/^.+[\W]/, '');
console.log(lw) /* com */
this will also work for
string.name+org/com
string.name+org.info
You can use RegExp Object.
Try this code:
"http://stackoverflow.com".replace(/.*\./,"");
I'll throw in a crazy (i.e. no RegExp) one:
var s = 'string.name+org.com';
var a = s.split('.'); //puts all sub-Strings delimited by . into an Array
var result = a[a.length-1]; //gets the last element of that Array
alert(result);
EDIT: Since the updated question demands multiple delimiters, this is probably not the way to go. Too crazy...
Use a JavaScript string function such as substr (this assumes the part you want is always three characters long):
url.substr(url.length - 3);
Maybe this is too late to consider, but this code works fine for me using jQuery:
var afterDot = value.substr(value.lastIndexOf('_') + 1);
You could just replace '_' with '.'
var myString = 'asd/f/df/xc/asd/test.jpg'
var parts = myString.split('/');
var answer = parts[parts.length - 1];
console.log(answer);
I am trying to target ?state=wildcard in this statement :
?state=uncompleted&dancing=yes
I would like to target the entire ?state=uncompleted part, but also allow it to match whatever word comes after the = sign. So uncompleted could also be completed, unscheduled, or what have you.
One caveat: I could match the wildcard up to the ampersand, but what if there is no ampersand and the state param is by itself?
Try this regular expression:
var regex = /\?state=([^&]+)/;
var match = '?state=uncompleted&dancing=yes'.match(regex);
match; // => ["?state=uncompleted", "uncompleted"]
It will match every character after the string "\?state=" except an ampersand, all the way to the end of the string, if necessary.
Alternative regex: /\?state=(.+?)(?:&|$)/
It will match everything up to the first & char or the end of the string
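Both cases from the question (with and without a following ampersand) can be checked with a small wrapper around the first pattern. getState is a name invented for this sketch:

```javascript
// Hypothetical wrapper: returns the value of the state parameter, or null
// when the string does not contain "?state=" followed by at least one character.
function getState(query) {
  var match = query.match(/\?state=([^&]+)/);
  return match ? match[1] : null;
}

console.log(getState("?state=uncompleted&dancing=yes")); // uncompleted
console.log(getState("?state=completed"));               // completed
console.log(getState("?dancing=yes"));                   // null
```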
IMHO, you don't need regex here. As we all know, regexes tend to be slow, especially when using look aheads. Why not do something like this:
var URI = '?state=done&user=ME'.split('&');
var passedVals = [];
This gives us ['?state=done','user=ME'], now just do a for loop:
for (var i=0;i<URI.length;i++)
{
passedVals.push(URI[i].split('=')[1]);
}
passedVals will contain whatever you need. An added benefit of this approach is that you can parse a request into an Object:
var URI = 'state=done&user=ME'.split('&');
var urlObjects ={};
for (var i=0;i<URI.length;i++)
{
urlObjects[URI[i].split('=')[0]] = URI[i].split('=')[1];
}
I left out the '?' at the start of the string, because a simple .replace('?','') can fix that easily...
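In environments that provide the standard URLSearchParams class (modern browsers and Node.js), the splitting can also be delegated to it. Note that this swaps in a built-in API rather than the loop shown above; a minimal sketch:

```javascript
// URLSearchParams ignores a leading "?" and decodes the values
var params = new URLSearchParams("?state=uncompleted&dancing=yes");

console.log(params.get("state"));   // uncompleted
console.log(params.get("missing")); // null

// Copy the pairs into a plain object, like urlObjects in the loop version
var urlObjects = {};
params.forEach(function (value, key) {
  urlObjects[key] = value;
});
console.log(urlObjects); // { state: 'uncompleted', dancing: 'yes' }
```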
You can match as many characters that are not a &. If there aren't any &s at all, that will of course also work:
/(\?state=[^&]+)/.exec("?state=uncompleted");
/(\?state=[^&]+)/.exec("?state=uncompleted&a=1");
// both: ["?state=uncompleted", "?state=uncompleted"]