I have code that extracts query string parameters.
So (for example) if the window URL is:
....&a=1&.....
the code first splits on & and then splits each piece on =.
However, sometimes we use base64 values, which can end with extra ='s (padding).
And this is where my code breaks:
the result is N4JOJ7yZTi5urACYrKW5QQ and it should be N4JOJ7yZTi5urACYrKW5QQ==
So I changed my regex to:
match an = that is not followed by (the end of the string OR another =)
'a=N4JOJ7yZTi5urACYrKW5QQ=='.split(/\=(?!($|=))/)
It does work (you can run it in the console),
but the result is ["a", undefined, "N4JOJ7yZTi5urACYrKW5QQ=="]
Why am I getting undefined?
How can I fix my regex so that it yields only ["a", "N4JOJ7yZTi5urACYrKW5QQ=="]?
P.S.
I know I can replace all the trailing ='s with something temporary and then replace them back,
but this question is tagged as regex, so I'm looking for a way to fix my regex.
This happens because your pattern contains a capturing group, ($|=), and split includes the value of every capturing group in its result. You can make the group non-capturing with ?::
"a=N4JOJ7yZTi5urACYrKW5QQ==".split(/=(?!(?:$|=))/);
However, you can also flatten the pattern and drop the extra group altogether:
"a=N4JOJ7yZTi5urACYrKW5QQ==".split(/=(?!$|=)/);
The value in the URL needs to be encoded:
'a=N4JOJ7yZTi5urACYrKW5QQ=='
should be
'a=N4JOJ7yZTi5urACYrKW5QQ%3D%3D'
Look into encodeURIComponent().
And if you want to use a regular expression to separate the key from the value:
> "abc=fooo".match(/([^=]+)=?(.*)?/);
["abc=fooo", "abc", "fooo"]
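To make the encoding suggestion concrete, here is a minimal sketch. The helper name parseQuery is made up for illustration, and it assumes the query string has already been stripped of the leading "?". It splits each pair on the first = only, so padding survives even when the value was not encoded.

```javascript
// Hypothetical helper: turns "k1=v1&k2=v2" into a plain object.
function parseQuery(query) {
  var params = {};
  query.split("&").forEach(function (part) {
    if (!part) return;           // skip empty segments such as "a=1&&b=2"
    var i = part.indexOf("=");   // only the first "=" separates key from value
    var key = i === -1 ? part : part.slice(0, i);
    var value = i === -1 ? "" : part.slice(i + 1);
    params[decodeURIComponent(key)] = decodeURIComponent(value);
  });
  return params;
}

// Encoding the value turns the "=" padding into %3D.
var encoded = "a=" + encodeURIComponent("N4JOJ7yZTi5urACYrKW5QQ==") + "&b=2";
console.log(encoded);               // a=N4JOJ7yZTi5urACYrKW5QQ%3D%3D&b=2
console.log(parseQuery(encoded).a); // N4JOJ7yZTi5urACYrKW5QQ==
```

Real base64 can also contain + and /, which encodeURIComponent escapes as well, so encoding is the safer route whichever way you parse.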
Why must you use split? A regex match with two captures, like /^([^=]+)=(.*)$/, would seem more obvious.
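For comparison, here is a quick sketch of the split and match approaches side by side (variable names are illustrative). One thing to watch with the two-capture idea: if the key is matched with a greedy .+, it runs up to the padding, so the sketch uses [^=]+ for the key instead.

```javascript
var pair = "a=N4JOJ7yZTi5urACYrKW5QQ==";

// Capturing group in the separator: split() adds the group's value to the result.
// Inside a negative lookahead that succeeded, the group captured nothing, hence undefined.
var withCapture = pair.split(/=(?!($|=))/);
console.log(withCapture); // ["a", undefined, "N4JOJ7yZTi5urACYrKW5QQ=="]

// No capturing group: only the two pieces come back.
var flattened = pair.split(/=(?!$|=)/);
console.log(flattened); // ["a", "N4JOJ7yZTi5urACYrKW5QQ=="]

// match() with two captures; [^=]+ stops the key at the first "=".
var m = pair.match(/^([^=]+)=(.*)$/);
console.log(m[1], m[2]); // a N4JOJ7yZTi5urACYrKW5QQ==

// A greedy key swallows everything up to the padding.
var greedy = pair.match(/^(.+)=(.+)$/);
console.log(greedy[1], greedy[2]); // a=N4JOJ7yZTi5urACYrKW5QQ =
```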
Related
I have URL pathnames that look similar to this: /service-area/i-need-this/but-not-this/. The /service-area/ part never changes, and the rest of the path is dynamic.
I need to get the part of the URL saying i-need-this.
Here was my attempt:
location.pathname.match(new RegExp('/service-area/' + "(.*)" + '/'));.
The goal was to get everything between /service-area/ and the next /, but it's actually matching up to the last occurrence of /, not the first occurrence. So the output from this is actually i-need-this/but-not-this.
I'm not so good with regex; is there a way it can be tweaked to get the desired result?
You need a lazy regex rather than a greedy one - so (.*?) instead of (.*). See also: What do 'lazy' and 'greedy' mean in the context of regular expressions?
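A small sketch of the difference, using a hard-coded string as a stand-in for location.pathname (the variable names are just for illustration). A negated character class is shown as a third option, since it reaches the same result without relying on laziness.

```javascript
var pathname = "/service-area/i-need-this/but-not-this/"; // stand-in for location.pathname

// Greedy: (.*) grabs as much as it can, so the match ends at the last "/".
var greedyMatch = pathname.match(/\/service-area\/(.*)\//);

// Lazy: (.*?) grabs as little as it can, so the match ends at the first "/".
var lazyMatch = pathname.match(/\/service-area\/(.*?)\//);

// Alternative: allow anything except "/" inside the capture.
var negatedMatch = pathname.match(/\/service-area\/([^\/]*)\//);

console.log(greedyMatch[1]);  // i-need-this/but-not-this
console.log(lazyMatch[1]);    // i-need-this
console.log(negatedMatch[1]); // i-need-this
```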
You can do this without a regex too using replace and split:
var path = '/service-area/i-need-this/but-not-this/';
var res = path.replace('/service-area/', '').split('/')[0];
console.log(res);
I have a url like http://www.somedotcom.com/all/~childrens-day/pr?sid=all.
I want to extract childrens-day. How to get that? Right now I am doing it like this
url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all"
url.match('~.+\/');
But what I am getting is ["~childrens-day/"].
Is there a short and sweet way (there surely is) to get the above text without the ~ and the /, i.e. just childrens-day?
Thanks
You could use a negated character class and a capture group ( ) and refer to capture group #1. The caret (^) inside of a character class [ ] is considered the negation operator.
var url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all";
var result = url.match(/~([^~]+)\//);
console.log(result[1]); // "childrens-day"
Note: If you have many URLs inside one string, you may want to add the ? quantifier for a non-greedy match.
var result = url.match(/~([^~]+?)\//);
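To illustrate the multi-URL case, here is a sketch with two URLs in one string. The second URL (~teachers-day) is invented for the example. With the g flag and exec in a loop, the lazy version picks up each name separately, while the greedy version runs on to the last / before the next ~.

```javascript
// Two URLs in one string; the second one is made up for this example.
var text = "http://www.somedotcom.com/all/~childrens-day/pr " +
           "http://www.somedotcom.com/all/~teachers-day/pr";

// Lazy capture, global flag: collect every name.
var re = /~([^~]+?)\//g;
var names = [];
var found;
while ((found = re.exec(text)) !== null) {
  names.push(found[1]);
}
console.log(names); // ["childrens-day", "teachers-day"]

// Greedy capture: the first match overshoots into the second URL.
var greedyName = text.match(/~([^~]+)\//)[1];
console.log(greedyName); // childrens-day/pr http://www.somedotcom.com/all
```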
Like so:
var url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all"
var matches = url.match(/~(.+?)\//);
console.log(matches[1]);
Working example: http://regex101.com/r/xU4nZ6
Note that your regular expression wasn't actually properly delimited either, not sure how you got the result you did.
Use non-capturing groups with a captured group then access the [1] element of the matches array:
(?:~)(.+)(?:/)
Keep in mind that you will need to escape your / if using it also as your RegEx delimiter.
Yes, it is.
url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all";
url.match('~(.+)\/')[1];
Just wrap what you need in a parenthesized group. No other modifications to your code are needed.
References: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
You could just do a string replace.
url.replace('~', '');
url.replace('/', '');
http://www.w3schools.com/jsref/jsref_replace.asp
Ok, So I hit a little bit of a snag trying to make a regex.
Essentially, I want a string like:
error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z
to be parsed to read
error=some=new item
user=max
dateFrom=2013-01-15T05:00:00.000Z
dateTo=2013-01-16T05:00:00.000Z
So I want it to pull known keywords, and ignore other strings that have =.
My current regex looks like this:
(error|user|dateFrom|dateTo|timeFrom|timeTo|hang)\=[\w\s\f\-\:]+(?![(error|user|dateFrom|dateTo|timeFrom|timeTo|hang)\=])
So I'm using a list of known keywords, supplied dynamically, so that only those are treated as keys.
How could I write it to include this requirement?
You could use a replace like so:
var input = "error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z";
var result = input.replace(/\s*\b((?:error|user|dateFrom|dateTo|timeFrom|timeTo|hang)=)/g, "\n$1");
result = result.replace(/^\r?\n/, ""); // remove the leading newline added before the first keyword
Result:
error=some=new item
user=max
dateFrom=2013-01-15T05:00:00.000Z
dateTo=2013-01-16T05:00:00.000Z
Another way to tokenize the string:
var tokens = inputString.split(/ (?=[^= ]+=)/);
The regex looks for a space that is followed by a run of non-space, non-equals-sign characters ending with =, and splits at those spaces.
Result:
["error=some=new item", "user=max", "dateFrom=2013-01-15T05:00:00.000Z", "dateTo=2013-01-16T05:00:00.000Z"]
Using the technique above and adapting the regex from your question:
var tokens = inputString.split(/(?=\b(?:error|user|dateFrom|dateTo|timeFrom|timeTo|hang)=)/);
This will correctly split the input that Qtax pointed out in the comments: "error=user=max foo=bar"
["error=", "user=max foo=bar"]
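If the end goal is key/value pairs rather than lines, the same lookahead idea can feed an object. This is only a sketch: the keys array and the fields object are names made up here, and the separator is built from the keyword list so it can change at runtime. Unlike the zero-width split above, it consumes the whitespace before each keyword, so tokens carry no trailing spaces, but it also means a keyword is only recognized after whitespace.

```javascript
var inputString = "error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z";

// Hypothetical keyword list; could be supplied dynamically.
var keys = ["error", "user", "dateFrom", "dateTo", "timeFrom", "timeTo", "hang"];

// Split on whitespace that is immediately followed by "<keyword>=".
var splitter = new RegExp("\\s+(?=(?:" + keys.join("|") + ")=)");

var fields = {};
inputString.split(splitter).forEach(function (token) {
  var i = token.indexOf("="); // the first "=" ends the key
  fields[token.slice(0, i)] = token.slice(i + 1);
});

console.log(fields.error);    // some=new item
console.log(fields.user);     // max
console.log(fields.dateFrom); // 2013-01-15T05:00:00.000Z
console.log(fields.dateTo);   // 2013-01-16T05:00:00.000Z
```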
I have strings in my program that are like so:
var myStrings = [
"[asdf] thisIsTheText",
"[qwerty] andSomeMoreText",
"noBracketsSometimes",
"[12345]someText"
];
I want to capture the strings "thisIsTheText", "andSomeMoreText", "noBracketsSometimes", "someText". The pattern of inputs will always be the same, square brackets with something in them (or maybe not) followed by some spaces (again, maybe not), and then the actual text I want.
How can I do this?
Thanks
One approach:
var actualTextYouWant = originalString.replace(/^\[[^\]]+\]\s*/, '');
This will return a copy of originalString with the initial [...] and whitespace removed.
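Applied to the whole array from the question, that might look like the sketch below. The variable name cleaned is illustrative, and the pattern uses * instead of + inside the brackets, on the assumption that empty brackets "[]" should be stripped too.

```javascript
var myStrings = [
  "[asdf] thisIsTheText",
  "[qwerty] andSomeMoreText",
  "noBracketsSometimes",
  "[12345]someText"
];

// Remove an optional leading [...] block and any whitespace after it.
var cleaned = myStrings.map(function (s) {
  return s.replace(/^\[[^\]]*\]\s*/, "");
});

console.log(cleaned);
// ["thisIsTheText", "andSomeMoreText", "noBracketsSometimes", "someText"]
```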
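This should get you started (note that in JavaScript the ] inside the character class has to be escaped, otherwise [^] is read as "any character"):
/(?:\[[^\]]*\])?\s*(\w+)/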
I have a string like this: "http://something.org/dom/My_happy_dog_%28is%29cool!"
How can I remove the initial domain, the underscores and the percent-encoded parts?
For now I'm just chaining several replace calls, like
str = str.replace("http://something.org/dom/","");
str = str.replace("_%28"," ");
and so on, but it's really ugly. Any help?
Thanks!
EDIT:
The exact output would be "My happy dog is cool!", so I would like to get rid of the initial address, remove the underscores and percent codes, and put the spaces in the right places!
The problem is that when I try a regex in Chrome, "something goes wrong". Is it a problem with Chrome or with my regex?
I'd suggest:
var str = "http://something.org/dom/My_happy_dog_%28is%29cool!";
str.substring(str.lastIndexOf('/')+1).replace(/(_)|(%\d{2,})/g,' ');
The reason I took this approach is that RegEx is fairly expensive, and is often tricky to fine tune to the point where edge-cases become less troublesome; so I opted to use simple string manipulation to reduce the RegEx work.
Effectively the above creates a substring of the given str variable, starting at lastIndexOf('/') + 1, so that the substring begins just after the last / rather than at it.
The regex: (_) matches the underscores, the | serves as an or operator, and (%\d{2,}) matches a % sign followed by two or more digits.
The parentheses around each side of the | create capture groups. They are not strictly required here, because replace() substitutes every whole match with the ' ' (single-space) string passed as its second argument.
References:
lastIndexOf().
replace().
substring().
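One detail worth checking when you run this: "_%28" is two separate matches, so the replace above leaves two spaces between "dog" and "is". Here is a sketch of a variant that collapses them. The names lastSegment and readable are illustrative, and the pattern accepts hex digits so codes such as %2F would be caught as well.

```javascript
var str = "http://something.org/dom/My_happy_dog_%28is%29cool!";

// Everything after the last "/".
var lastSegment = str.substring(str.lastIndexOf("/") + 1);
console.log(lastSegment); // My_happy_dog_%28is%29cool!

var readable = lastSegment
  .replace(/_|%[0-9A-Fa-f]{2}/g, " ") // underscores and percent codes become spaces
  .replace(/\s+/g, " ")               // collapse runs of spaces
  .trim();

console.log(readable); // My happy dog is cool!
```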
You can use unescape to decode the percent-encoded characters (unescape is deprecated nowadays; decodeURIComponent is the usual replacement):
str = unescape("http://something.org/dom/My_happy_dog_%28is%29cool!")
str = str.replace("http://something.org/dom/","");
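The same idea with decodeURIComponent, as a rough sketch (variable names are illustrative). Note that decoding turns %28 and %29 into real parentheses rather than removing them, so this gets you a readable string but not exactly the "My happy dog is cool!" from the question.

```javascript
var address = "http://something.org/dom/My_happy_dog_%28is%29cool!";

// Decode the percent codes, then drop the fixed prefix.
var decoded = decodeURIComponent(address).replace("http://something.org/dom/", "");
console.log(decoded); // My_happy_dog_(is)cool!

// Underscores to spaces; the /g flag replaces every occurrence.
var spaced = decoded.replace(/_/g, " ");
console.log(spaced); // My happy dog (is)cool!
```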
Maybe you could use a regular expression to pull out what you need, rather than getting rid of what you don't want. What is it you are trying to keep?
You can also chain them together as in:
str.replace("http://something.org/dom/", "").replace("something else", "");
You haven't defined the problem very exactly. To get rid of all stretches of characters ending in %<digit><digit> you'd say
var re = /.*%\d\d/g;
var str = str.replace(re, "");
OK, if you want to strip all of that, I think you would need something like this:
/(http:\/\/.*\.[a-z]{3}\/.*\/)|(\%[a-z0-9][a-z0-9])|_/g
Test:
var string = "http://something.org/dom/My_happy_dog_%28is%29cool!";
string = string.replace(/(http:\/\/.*\.[a-z]{3}\/.*\/)|(\%[a-z0-9][a-z0-9])|_/g,"");