I have a solution for my question, but I'm trying to get better at regex especially in javascript. I just wanted to bring this to the community to see if I could write this in a better way.
So, I get a datetime string that comes from .net and I need to extract the date from it.
Currently what I have is:
var time = "2009-07-05T00:00:00.0000000-05:00".match(/(^\d{4}).(\d{2}).(\d{2})/i);
As I said, this works, but I was hoping to make it more direct so that only the year, month and day end up in the array. What I get with this is four results: YYYY-MM-DD, YYYY, MM, DD.
Essentially I just want the 3 results returned, not so much that this doesn't work (because I can just ignore the first index in the array), but so that I can learn to use regex a bit better.
Any time you have parentheses in your regex, the value that matches each parenthesized group is returned as well.
time[0] is what matches the whole expression
time[1] is what matches (^\d{4}), i.e. the year
time[2] is what matches the first (\d{2}), i.e. the month
time[3] is what matches the second (\d{2}), i.e. the day
You can't change this behavior to remove time[0], and you don't really want to (since the underlying code is already generating it, removing it wouldn't give any performance benefit).
If you don't care about getting back the value from a parenthesized expression, you can use (?:expression) to make it non-capturing.
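As a rough sketch of both points, the whole match can be dropped with slice(1), and a non-capturing group keeps a part of the pattern out of the result. The function name extractDateParts is made up for illustration, and the pattern assumes the string always starts with YYYY-MM-DD:

```javascript
// Hypothetical helper, not part of the original question.
// Assumes the input begins with a date in YYYY-MM-DD form.
function extractDateParts(dateTime) {
  var match = dateTime.match(/^(\d{4})-(\d{2})-(\d{2})/);
  // match[0] is the whole match; slice(1) keeps only the captured groups.
  return match ? match.slice(1) : null;
}

var input = "2009-07-05T00:00:00.0000000-05:00";
var parts = extractDateParts(input); // ["2009", "07", "05"]

// With (?:...) the month and day are still matched but not captured,
// so only the year appears after the whole match.
var yearOnly = input.match(/^(\d{4})(?:-\d{2}){2}/); // ["2009-07-05", "2009"]
```

The array returned by match still starts with the whole match; slice(1) just produces a copy without it.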
I don't think you can do that, but you can do this:
var myregexp = /(^\d{4})-(\d{2})-(\d{2})/g;
var match = myregexp.exec(subject);
while (match != null) {
for (var i = 1; i < match.length; i++) {
// matched text: match[i]
}
match = myregexp.exec(subject);
}
And then just loop from index 1. The first item in the array is the whole match, and the captured groups follow it.
I'm new to regex, and have been researching all night how to remove the first 2 zeros from a string like "08/08/2017" (without removing 0 in "2017")
The 5+ regex tutorials I've reviewed do not seem to cover what I need here.
The date could be any sysdate returned from the system. So the regex also needs to work for "12/12/2017"
Here is the best I have come up with:
let sysdate = "08/08/2017"
let todayminuszero = sysdate.replace("0","");
let today = todayminuszero.replace("0","");
It works, but obviously it's unprofessional.
From the tutorials, I'm pretty sure I can do something along the lines of this:
str.replace(/\d{2}//g,""),);
This pattern would avoid getting the 3rd zero in str.
Replacement String would have to indicate 8/8/
Not sure how to write this though.
For date manipulation I would normally use dedicated date functions, but this should work for the case you described. If you need other formats, I would suggest removing the zeros in a different way; it all depends on your use case.
let sysdate = "08/08/2017";
let todayminuszero = sysdate.replace(/0(?=\d\/)/gi,"");
console.info(todayminuszero);
(?= ... ) is called a lookahead. It lets you check what comes next without making it part of the match, so that text is not replaced.
In this case, (?=\d\/) checks for a digit followed by a slash.
Here is some more information, if you want to read about lookahead and more: http://www.regular-expressions.info/lookaround.html
A good place to test regular expressions is https://regex101.com/
I always use it for more advanced expressions, since it displays all matching groups along with a great explanation. It is a great resource if you are learning or building difficult expressions.
Info: as mentioned by Rajesh, the i flag is not needed for this expression; I just use it out of personal preference. This flag just makes the match case-insensitive.
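To see how the lookahead behaves on a few inputs, here is a small sketch. The wrapper name stripLeadingZeros is made up for illustration, and it assumes the date is always slash-separated with the year last:

```javascript
// Hypothetical wrapper around the replace call shown above.
// Assumes a slash-separated date with the year in the last position.
function stripLeadingZeros(dateStr) {
  // Remove a 0 only when it is followed by exactly one digit and a slash.
  return dateStr.replace(/0(?=\d\/)/g, "");
}

console.log(stripLeadingZeros("08/08/2017")); // 8/8/2017
console.log(stripLeadingZeros("12/12/2017")); // 12/12/2017 (unchanged)
console.log(stripLeadingZeros("10/01/2017")); // 10/1/2017
```

The 0 in "10/" is left alone because it is followed directly by a slash, and the 0 in "2017" is left alone because no slash follows the next digit.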
-- Out of Scope, but may be interesting --
A longer solution without regex could look like this:
let sysdate = "08/08/2017";
let todayminuszero = sysdate.split("/").map(x => parseInt(x)).join("/");
console.info(todayminuszero);
The downside is that this solution has many moving parts: the split function to make an array ("08/08/2017" to ["08", "08", "2017"]), the map function with an arrow function (=>) and parseInt to turn each string item into an integer (like "08" to 8), and finally the join function that creates the final string out of the newly created integer array.
you should use this
let sysdate = "08/08/2017"
let todayminuszero = sysdate.replace(/(^|\/)0/g,"$1");
console.log(todayminuszero);
Take a look at this:
function stripLeadingZerosDate(dateStr){
return dateStr.split('/').reduce(function(date, datePart){
return date += parseInt(datePart) + '/'
}, '').slice(0, -1);
}
console.log(stripLeadingZerosDate('01/02/2016'));// 1/2/2016
console.log(stripLeadingZerosDate('2016/02/01'));// "2016/2/1"
By "first 2 zeros", I understand you mean the zero before the 8 in the month and in the day.
You can try something like this:
Idea
Create a regex that matches each group of digits representing the day, month and year.
Use this regex to replace values.
Use a function to return the processed value.
var sysdate = "08/08/2017"
var numRegex = /(\d)+/g;
var result = sysdate.replace(numRegex, function(match){
return parseInt(match)
});
console.log(result)
I have a string with times (formatted HH:MM), each on a new line. I want to create a JS function to check whether there are any times that do not belong. It should simply return true or false.
Example correct string: var s = "5:45\n07:00\n13:00\n17:00";
5:45
07:00
13:00
17:00
Example incorrect string: var s = "5:45\n07:00\n55:00\n17:00";
5:45
07:00
55:00 // incorrect time here, should return false
17:00
My regex experience is little to none. Playing around on Scriptular I created this expression to detect times that do match:
/^[0-2]?[0-9]\:[0-5][0-9]$/m. This however is not sufficient.
So, how can I get this to work with a string s as indicated above?
function checkIfStringConforms(s)
{
var all_good = [some magic with regex here]
return all_good;
}
PS: I have Googled around and checked answers on SO. My regex skill is... eh.
Your regex is OK, but it would also match 29:00, so it needs some improvement. Then, it's always a bit more difficult to find non-matches than it is to find matches. You could try and remove all matches from the string and then see if it's empty (except for whitespace):
result = s.replace(/^(?:2[0-3]|[01]?[0-9]):[0-5][0-9]$/mg, "");
If result is empty after that, there were no illegal times in your string.
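Put together, the idea might look like the sketch below. It fills in the checkIfStringConforms stub from the question and uses trim() to ignore the newlines that remain after the replacement; note that this also means blank lines are accepted, which is an assumption about what should count as valid:

```javascript
// Sketch of the remove-all-matches approach described above.
// Assumption: leftover whitespace (including blank lines) is acceptable.
function checkIfStringConforms(s) {
  // Delete every line that is a valid H:MM or HH:MM time (00:00 to 23:59).
  var leftover = s.replace(/^(?:2[0-3]|[01]?[0-9]):[0-5][0-9]$/mg, "");
  // Only the line breaks should remain if every line was valid.
  return leftover.trim() === "";
}

console.log(checkIfStringConforms("5:45\n07:00\n13:00\n17:00")); // true
console.log(checkIfStringConforms("5:45\n07:00\n55:00\n17:00")); // false
```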
It can be done without the use of any regex. Just split on the newline and see if every line matches your format. For that we can use Array.prototype.every:
function checkIfStringConforms(s) {
  return s.split("\n").every(function(str){
    var arr = str.split(":");
    // hours must be 0-23 and minutes must be 0-59
    return (arr[0] < 24 && arr[0] > -1) && (arr[1] < 60 && arr[1] > -1);
  });
}
/(((2[^0-3]|[3-9].):..)|(..?:[^0-5].))(\n|$)/
Testing this regexp against s returns true if s has at least one invalid time. Please check it carefully before use: your question is quite broad and the restrictions are not fully defined. The regex assumes that each line looks like x:xx or xx:xx (where x is a digit); I’m not sure this assumption covers all your data.
I have the following example url: #/reports/12/expense/11.
I need to get the id just after reports, i.e. 12. What I am asking is the most suitable way to do this. I can search for reports in the URL and take the content just after it, but if at some point I decide to change the URL, I will have to change my algorithm.
What do you think is the best way here? Some code examples would also be very helpful.
It's hard to write code that is future-proof since it's hard to predict the crazy things we might do in the future!
However, if we assume that the id will always be the first string of consecutive digits in the URL, then you could simply look for that:
function getReportId(url) {
var match = url.match(/\d+/);
return (match) ? Number(match[0]) : null;
}
getReportId('#/reports/12/expense/11'); // => 12
getReportId('/some/new/url/report/12'); // => 12
You should use a regular expression to find the number inside the string. Passing the regular expression to the string's .match() method will return an array containing the matches based on the regular expression. In this case, the item of the returned array that you're interested in will be at the index of 1, assuming that the number will always be after reports/:
var text = "#/reports/12/expense/11";
var id = text.match(/reports\/(\d+)/);
alert(id[1]);
\d+ here means that you're looking for one or more consecutive digits.
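One caveat: match returns null when nothing matches, so reading id[1] directly would throw in that case. A guarded version could look like the following sketch; the function name getIdAfterReports is made up for illustration, and it assumes the id always directly follows reports/:

```javascript
// Hypothetical helper: returns the number after "reports/", or null.
function getIdAfterReports(url) {
  var match = url.match(/reports\/(\d+)/);
  // match[0] is "reports/12"; match[1] is the captured digits.
  return match ? Number(match[1]) : null;
}

console.log(getIdAfterReports("#/reports/12/expense/11")); // 12
console.log(getIdAfterReports("#/expense/11"));            // null
```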
var text = "#/reports/12/expense/11";
var id = text.match("#/[a-zA-Z]*/([0-9]*)/[a-zA-Z]*/")
console.log(id[1])
Regex explanation:
#/ matches the characters #/ literally
[a-zA-Z]* - matches a word
/ matches the character / literally
1st Capturing group - ([0-9]*) - this matches a number.
[a-zA-Z]* - matches a word
/ matches the character / literally
Regular expressions can be tricky (and expensive), so if you can efficiently do the same thing without them, you usually should. Looking at your URL format, you would probably want to put at least a few constraints on it; otherwise the problem will be very complex. For instance, you probably want to assume the value will always appear directly after the key, so in your sample reports=12 and expense=11, but reports and expense could be switched (e.g. expense/11/reports/12) and you would get the same result.
I would just use string split:
var parts = url.split("/");
for (var i = 0; i < parts.length; i++) {
  if (parts[i] === "reports") {
    this.reportValue = parts[i + 1];
    i++; // skip the value that was just read
  } else if (parts[i] === "expense") {
    this.expenseValue = parts[i + 1];
    i++; // skip the value that was just read
  }
}
So this way your key/value parts can appear anywhere in the array
Note: you will also want to check that i+1 is in the range of the parts array. But that would just make this sample code ugly and it is pretty easy to add in. Depending on what values you are expecting (or not expecting) you might also want to check that values are numbers using isNaN
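For reference, a version with the range and isNaN checks added might look like this sketch. The function name parsePathValues and the idea of passing the keys as an array are assumptions made for illustration:

```javascript
// Hypothetical helper: collects the numeric value that follows each key.
// Assumes keys and values alternate as path segments (key/value/key/value).
function parsePathValues(url, keys) {
  var parts = url.split("/");
  var values = {};
  // Stop one short of the end so that parts[i + 1] always exists.
  for (var i = 0; i + 1 < parts.length; i++) {
    var next = parts[i + 1];
    // isNaN("") is false, so the empty segment is excluded explicitly.
    if (keys.indexOf(parts[i]) !== -1 && next !== "" && !isNaN(next)) {
      values[parts[i]] = Number(next);
      i++; // skip the value that was just read
    }
  }
  return values;
}

console.log(parsePathValues("#/reports/12/expense/11", ["reports", "expense"]));
// { reports: 12, expense: 11 }
```

Keys that are missing or followed by a non-numeric segment are simply left out of the returned object.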
In following code:
"a sasas b".match(/sas/g) //returns ["sas"]
The string actually includes two sas strings: a [sas]as b and a sa[sas] b.
How can I modify RegEx to match both?
Another example:
"aaaa".match(/aa/g); //actually include [aa]aa,a[aa]a,aa[aa]
Please consider the issue in general not just above instances.
A pure RegEx solution is preferred.
If you want to match at least one such "merged" occurrence, then you could do something like:
"a sasas b".match(/s(as)+/g)
If you want to retrieve the matches as separate results, then you have a bit more work to do; this is not a case that regular expressions are designed to handle. The basic algorithm would be:
Attempt a match. If it was unsuccessful, stop.
Extract the match you are interested in and do whatever you want with it.
Take the substring of the original target string, starting from one character following the first character in your match.
Start over, using this substring as the new input.
(To be more efficient, you could match with an offset instead of using substrings; that technique is discussed in this question.)
For example, you would start with "a sasas b". After the first match, you have "sas". Taking the substring that starts one character after the match starts, we would have "asas b". The next match would find the "sas" here, and you would again repeat the process with "as b". This would fail to match, so you would be done.
This significantly improved answer owes itself to @EliGassert.
String.prototype.match_overlap = function(re)
{
if (!re.global)
re = new RegExp(re.source,
'g' + (re.ignoreCase ? 'i' : '')
+ (re.multiline ? 'm' : ''));
var matches = [];
var result;
while (result = re.exec(this)) {
  matches.push(result);
  // resume one character after the start of this match
  re.lastIndex = result.index + 1;
}
return matches.length ? matches : null;
}
@EliGassert points out that there is no need to walk through the entire string character by character; instead we can find a match anywhere (i.e. do without the anchor), and then continue one character after the index of the found match. While researching how to retrieve said index, I found that the re.lastIndex property, used by exec to keep track of where it should continue its search, is in fact settable! This works rather nicely with what we intend to do.
The only bit needing further explanation might be the beginning. In the absence of the g flag, exec may never return null (always returning its one match, if it exists), thus possibly going into an infinite loop. Since, however, match_overlap by design seeks multiple matches, we can safely recompile any non-global RegExp as a global RegExp, importing the i and m options as well if set.
Here is a new jsFiddle: http://jsfiddle.net/acheong87/h5MR5/.
document.write("<pre>");
document.write('sasas'.match_overlap(/sas/));
document.write("\n");
document.write('aaaa'.match_overlap(/aa/));
document.write("\n");
document.write('my1name2is3pilchard'.match_overlap(/[a-z]{2}[0-9][a-z]{2}/));
document.write("</pre>");
Output:
sas,sas
aa,aa,aa
my1na,me2is,is3pi
var match = "a sasas b".match(/s(?=as)/g);
for(var i =0; i != match.length; ++i)
alert(match[i]);
Going off of the comment by Q. Sheets and the response by cdhowie, I came up with the above solution: it consumes ONE character in the regular expression and does a lookahead for the rest of the match string. With these two pieces, you can construct all the positions and matching strings in your regular expression.
I wish there was an "inspect but don't consume" operator that you could use to actually include the rest of the matching (lookahead) string in the results, but there unfortunately isn't -- at least not in JS.
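One workaround worth noting: a capturing group placed inside a lookahead still records the text it matched, even though the lookahead itself consumes nothing. The sketch below relies on that. The function name overlappingMatches is made up for illustration, and for brevity it ignores any flags on the pattern passed in:

```javascript
// Hypothetical helper: returns every overlapping match as a string.
// Note: flags such as i or m on the input pattern are not carried over.
function overlappingMatches(text, pattern) {
  // (?=(...)) matches at a position without consuming any characters,
  // while the inner group captures the text that would have matched.
  var re = new RegExp("(?=(" + pattern.source + "))", "g");
  var found = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    found.push(m[1]);
    // A zero-width match leaves lastIndex in place, so advance it manually.
    re.lastIndex = m.index + 1;
  }
  return found;
}

console.log(overlappingMatches("a sasas b", /sas/)); // ["sas", "sas"]
console.log(overlappingMatches("aaaa", /aa/));       // ["aa", "aa", "aa"]
```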
Here's a generic way to do it:
String.prototype.match_overlap = function(regexp)
{
// use .source so that flags are not mixed into the pattern text
var re = new RegExp('^(?:' + regexp.source + ')');
var matches = [];
var result;
for (var i = 0; i < this.length; i++)
if (result = re.exec(this.substr(i)))
matches.push(result);
return matches.length ? matches : null;
}
Usage:
var results = 'sasas'.match_overlap(/sas/);
Returns:
An array of (overlapping) matches, or null.
Example:
Here's a jsFiddle in which this:
document.write("<pre>");
document.write('sasas'.match_overlap(/sas/));
document.write("\n");
document.write('aaaa'.match_overlap(/aa/));
document.write("\n");
document.write('my1name2is3pilchard'.match_overlap(/[a-z]{2}[0-9][a-z]{2}/));
document.write("</pre>");
returns this:
sas,sas
aa,aa,aa
my1na,me2is,is3pi
Explanation:
To explain a little bit, we intend for the user to pass a RegExp object to this new function, match_overlap, as he or she would do normally with match. From this we want to create a new RegExp object anchored at the beginning (to prevent duplicate overlapped matches—this part probably won't make sense unless you encounter the issue yourself—don't worry about it). Then, we simply match against each substring of the subject string this and push the results to an array, which is returned if non-empty (otherwise returning null). Note that if the user passes in an expression that is already anchored, this is inherently wrong—at first I stripped anchors out, but then I realized I was making an assumption in the user's stead, which we should avoid. Finally one could go further and somehow merge the resulting array of matches into a single match result resembling what would normally occur with the //g option; and one could go even further and make up a new flag, e.g. //o that gets parsed to do overlap-matching, but this is getting a little crazy.
There is a data parameter for a div that looks as follows:
<div data-params="[possibleText&]start=2011-11-01&end=2011-11-30[&possibleText]">
</div>
I want to remove everything from start= through the end of the second date from that data-params attribute. There may or may not be text before start= and after the second date.
How can I accomplish this using javascript or jQuery? I know how to get the value of the "data-params" attribute and how to set it, I'm just not sure how to remove just that part from the string.
Thank you!
Note: The dates will not always be the same.
I'd use a regular expression:
var text = $('div').attr('data-params');
var dates = text.match(/start=\d{4}-\d{2}-\d{2}&end=\d{4}-\d{2}-\d{2}/)[0]
// dates => "start=2011-11-01&end=2011-11-30"
The regular expression is not too complex. The notation \d means "match any digit" and \d{4} means "match exactly 4 digits". The rest is literal characters. Finally, the [0] at the end is there because JavaScript's match returns an array where the first element is the whole match and the rest are subgroups. We don't have any subgroups and we do want the whole match, so we just grab the first element.
If you want to pull out the actual dates instead of the full query string, you can create subgroups by adding parentheses around the parts you want, like this:
var dates = text.match(/start=(\d{4}-\d{2}-\d{2})&end=(\d{4}-\d{2}-\d{2})/)
// dates[0] => "start=2011-11-01&end=2011-11-30"
// dates[1] => "2011-11-01"
// dates[2] => "2011-11-30"
Here, dates[1] is the start date (the first parenthesized subgroup) and dates[2] is the end date (the second subgroup).
My regex skills aren't that good but this should do it
var txt = "[possibleText&]start=2011-11-01&end=2011-11-30[&possibleText]";
var requiredTxt = txt.replace(/^(.*)start=\d{4}-\d{2}-\d{2}&end=\d{4}-\d{2}-\d{2}(.*)$/, "$1$2");
I'm sure there are better ways to match your string with regex, but the $1 and $2 will put the first group and second group match into your requiredTxt stripping out the start/end stuff in the middle.
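If the surrounding text is itself made of &-separated parameters, removing only the date range can leave a stray & behind. The sketch below handles that case; the function name removeDateRange is made up for illustration, and the & separator is an assumption based on the placeholder in the question:

```javascript
// Hypothetical helper: strips "start=YYYY-MM-DD&end=YYYY-MM-DD" together
// with one neighbouring "&" so no empty parameter is left behind.
function removeDateRange(params) {
  var range = /&?start=\d{4}-\d{2}-\d{2}&end=\d{4}-\d{2}-\d{2}/;
  // Remove the range (and a preceding "&" if present), then drop a
  // leading "&" that remains when the range was the first parameter.
  return params.replace(range, "").replace(/^&/, "");
}

console.log(removeDateRange("a=1&start=2011-11-01&end=2011-11-30&b=2")); // a=1&b=2
console.log(removeDateRange("start=2011-11-01&end=2011-11-30&b=2"));     // b=2
console.log(removeDateRange("a=1&start=2011-11-01&end=2011-11-30"));     // a=1
```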
Say you have your data-params in a variable foo. Call foo.match as follows:
foo.match("[\\?&]start=([^&#]*)"); //returns ["&start=2011-11-01", "2011-11-01"]
foo.match("[\\?&]end=([^&#]*)"); //returns ["&end=2011-11-30", "2011-11-30"]