I have strings in my program that are like so:
var myStrings = [
"[asdf] thisIsTheText",
"[qwerty] andSomeMoreText",
"noBracketsSometimes",
"[12345]someText"
];
I want to capture the strings "thisIsTheText", "andSomeMoreText", "noBracketsSometimes", "someText". The pattern of inputs will always be the same, square brackets with something in them (or maybe not) followed by some spaces (again, maybe not), and then the actual text I want.
How can I do this?
Thanks
One approach:
var actualTextYouWant = originalString.replace(/^\[[^\]]+\]\s*/, '');
This will return a copy of originalString with the initial [...] and whitespace removed.
This should get you started:
/(?:\[[^\]]*\])?\s*(\w+)/
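Applied to the sample array from the question, the replace-based approach might look like the sketch below. The helper name stripBracketPrefix is made up for illustration, and the sketch assumes the bracketed prefix, when present, sits at the very start of the string.

```javascript
// Hypothetical helper: drop an optional leading "[...]" plus any whitespace after it.
function stripBracketPrefix(s) {
  // ^           anchor at the start of the string
  // \[[^\]]*\]  a "[", any run of non-"]" characters, then "]"
  // \s*         optional whitespace following the bracket
  return s.replace(/^\[[^\]]*\]\s*/, "");
}

const myStrings = [
  "[asdf] thisIsTheText",
  "[qwerty] andSomeMoreText",
  "noBracketsSometimes",
  "[12345]someText"
];

const texts = myStrings.map(stripBracketPrefix);
console.log(texts);
```

Strings without a bracketed prefix pass through unchanged, because replace() returns the original text when the pattern does not match.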
Related
I have a string that will be formatted something like ___<test#email.com>____ where the underscores are irrelevant stuff I don't need and vary in length. I need to select and store what is between the brackets.
My problem is that all of the substring solutions I have seen operate off of a hard-coded integer location in the string. But the start and end of the substring I want to select (the brackets) will never be in the same place.
So I thought that if I could find the location of the brackets and feed that to a substring solution, that would work. But all of the ways I have found of identifying special characters only report whether there are special characters, not where they are.
Thanks in advance!
based on this answer
var text = '___<test#email.com>____';
var values = text.split(/[<>]+/);
console.log(values); // your values should be at indexes 1, 3, 5, etc...
Here's a regex that should set you on your way.
let string = "asdf asdf asdf as <thing#stuff.com> jl;kj;l kj ;lkj ;lk j;lk";
let myMatches = string.match(/<[^>]*>/g); // [^>]* keeps each match inside one pair of brackets
let myMatch = myMatches[0].slice(1, -1); // drop the surrounding < and >
The .match function returns an array of matches, so you can find multiple <stuff> entries.
There's probably a way to do it without the slicing, but that's all I've got for now.
With Regex:
var myRe = /<([^>]*)>/;
var myArray = myRe.exec("____<asdf>___");
if (myArray)
    console.log(myArray[1]); // "asdf"
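Since the question asks about locating the brackets and then taking a substring, here is a minimal sketch of that idea next to a capture-group version. Both helper names are made up for illustration, and the sketch assumes the brackets are never nested.

```javascript
// Hypothetical helper: return the text between the first "<" and the next ">",
// or null if no such pair exists.
function betweenAngleBrackets(text) {
  const start = text.indexOf("<");            // position of the opening bracket
  if (start === -1) return null;
  const end = text.indexOf(">", start + 1);   // first ">" after it
  if (end === -1) return null;
  return text.substring(start + 1, end);
}

// Hypothetical helper: collect every "<...>" payload using a capture group.
function allBetweenAngleBrackets(text) {
  // [^<>]* cannot run past a bracket, so each match stays inside one pair.
  return Array.from(text.matchAll(/<([^<>]*)>/g), m => m[1]);
}

console.log(betweenAngleBrackets("___<test#email.com>____"));
console.log(allBetweenAngleBrackets("a <one> b <two> c"));
```

indexOf() reports where a character is rather than only whether it exists, which is the piece the question was missing.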
I have a url like http://www.somedotcom.com/all/~childrens-day/pr?sid=all.
I want to extract childrens-day. How to get that? Right now I am doing it like this
url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all"
url.match('~.+\/');
But what I am getting is ["~childrens-day/"].
Is there a (definitely there would be) short and sweet way to get the above text without ["~ and /"] i.e just childrens-day.
Thanks
You could use a negated character class and a capture group ( ) and refer to capture group #1. The caret (^) inside of a character class [ ] is considered the negation operator.
var url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all";
var result = url.match(/~([^~]+)\//);
console.log(result[1]); // "childrens-day"
Note: If you have many URLs inside one string, you may want to add the ? quantifier for a non-greedy match.
var result = url.match(/~([^~]+?)\//);
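For the many-URLs case, a minimal sketch could collect every segment in one pass. The helper name tildeSegments and the second URL are made up for illustration. Instead of a lazy quantifier, this variant excludes "/" from the character class, which has the same effect here.

```javascript
// Hypothetical helper: return every segment that sits between "~" and the next "/".
function tildeSegments(text) {
  // [^~\/]+ stops at the first "/", so no lazy quantifier is needed.
  return Array.from(text.matchAll(/~([^~\/]+)\//g), m => m[1]);
}

const urls =
  "http://www.somedotcom.com/all/~childrens-day/pr?sid=all " +
  "http://www.somedotcom.com/all/~fathers-day/pr?sid=all"; // second URL is made up

console.log(tildeSegments(urls));
```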
Like so:
var url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all"
var matches = url.match(/~(.+?)\//);
console.log(matches[1]);
Working example: http://regex101.com/r/xU4nZ6
Note that your regular expression wasn't written as a regex literal. It still worked because match() converts a string argument into a RegExp.
Use non-capturing groups with a captured group then access the [1] element of the matches array:
(?:~)(.+)(?:/)
Keep in mind that you will need to escape your / if using it also as your RegEx delimiter.
Yes, it is.
url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all";
url.match('~(.+)\/')[1];
Just wrap what you need in a parenthesized group. No other changes to your code are needed.
References: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
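You could just do a string replace on the matched text. replace() returns a new string rather than modifying the original, so keep the result:
var segment = url.match('~.+\/')[0]; // "~childrens-day/"
segment = segment.replace('~', '').replace('/', ''); // "childrens-day"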
http://www.w3schools.com/jsref/jsref_replace.asp
Ok, So I hit a little bit of a snag trying to make a regex.
Essentially, I want a string like:
error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z
to be parsed to read
error=some=new item
user=max
dateFrom=2013-01-15T05:00:00.000Z
dateTo=2013-01-16T05:00:00.000Z
So I want it to pull known keywords, and ignore other strings that have =.
My current regex looks like this:
(error|user|dateFrom|dateTo|timeFrom|timeTo|hang)\=[\w\s\f\-\:]+(?![(error|user|dateFrom|dateTo|timeFrom|timeTo|hang)\=])
So the keyword list is built dynamically, which lets me list the keywords that are known.
How could I write it to include this requirement?
You could use a replace like so:
var input = "error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z";
var result = input.replace(/\s*\b((?:error|user|dateFrom|dateTo|timeFrom|timeTo|hang)=)/g, "\n$1");
result = result.replace(/^\r?\n/, ""); // remove the leading newline
Result:
error=some=new item
user=max
dateFrom=2013-01-15T05:00:00.000Z
dateTo=2013-01-16T05:00:00.000Z
Another way to tokenize the string:
var tokens = inputString.split(/ (?=[^= ]+=)/);
The regex looks for a space that is followed by a run of characters that are neither spaces nor equals signs and that ends with =, and splits at those spaces.
Result:
["error=some=new item", "user=max", "dateFrom=2013-01-15T05:00:00.000Z", "dateTo=2013-01-16T05:00:00.000Z"]
Using the technique above with the keyword list from your question:
var tokens = inputString.split(/(?=\b(?:error|user|dateFrom|dateTo|timeFrom|timeTo|hang)=)/);
This will correctly split the input that Qtax pointed out in the comments: "error=user=max foo=bar"
["error=", "user=max foo=bar"]
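If the goal is a lookup table rather than a list of tokens, the keyword-based split could be extended roughly as follows. This is only a sketch: the helper name parseKnownPairs is made up, the keyword list is copied from the question's regex, and \s* is added in front of the lookahead so the separating whitespace is consumed instead of being left at the end of each token.

```javascript
// Assumption: this keyword list is copied from the regex in the question.
const KEYWORDS = ["error", "user", "dateFrom", "dateTo", "timeFrom", "timeTo", "hang"];

// Hypothetical helper: split "key=value key=value ..." at known keywords only.
function parseKnownPairs(input) {
  // Split at optional whitespace that is immediately followed by "<keyword>=".
  const splitter = new RegExp("\\s*(?=\\b(?:" + KEYWORDS.join("|") + ")=)");
  const result = {};
  for (const token of input.split(splitter)) {
    const eq = token.indexOf("=");   // only the first "=" separates key from value
    if (eq === -1) continue;         // skip fragments that have no key
    result[token.slice(0, eq)] = token.slice(eq + 1);
  }
  return result;
}

const parsed = parseKnownPairs(
  "error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z"
);
console.log(parsed);
```

Because only the first = in each token is treated as the separator, a value such as "some=new item" stays intact.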
I have a string like this: "http://something.org/dom/My_happy_dog_%28is%29cool!"
How can I remove the leading domain, the underscores and the percent-encoded stuff?
For now I'm just doing some multiple replace, like
str = str.replace("http://something.org/dom/","");
str = str.replace("_%28"," ");
and so on, but it's really ugly. Any help?
Thanks!
EDIT:
The exact output I want is "My happy dog is cool!", so I would like to get rid of the initial address, remove the underscores and percent-escapes, and put the spaces in the right places!
The problem is that when I try to use a regex in Chrome, "something goes wrong". Is it a problem with Chrome or with my regex?
I'd suggest:
var str = "http://something.org/dom/My_happy_dog_%28is%29cool!";
str.substring(str.lastIndexOf('/')+1).replace(/(_)|(%\d{2,})/g,' ');
The reason I took this approach is that RegEx is fairly expensive, and is often tricky to fine tune to the point where edge-cases become less troublesome; so I opted to use simple string manipulation to reduce the RegEx work.
Effectively the above creates a substring of the given str variable, from the index point of the lastIndexOf('/') (which does exactly what you'd expect) and adding 1 to that so the substring is from the point after the / not before it.
The regex: (_) matches the underscores, the | serves as an "or" operator, and (%\d{2,}) matches a % sign followed by two or more digit characters.
The parentheses around each side of the | create capturing groups. They are not strictly required here, because replace() substitutes the whole match with the ' ' (single-space) string passed as its second argument.
References:
lastIndexOf().
replace().
substring().
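As a rough sketch of how this approach could be tightened to produce exactly "My happy dog is cool!": the version below is an assumption-laden variant, not the answer's original code. The helper name slugToText is made up, percent-escapes are matched as two hex digits (so sequences like %2F are covered too), and runs of spaces are collapsed afterwards.

```javascript
// Hypothetical helper: turn the last path segment of a URL into readable text.
function slugToText(url) {
  return url
    .substring(url.lastIndexOf("/") + 1)  // keep what follows the last "/"
    .replace(/_|%[0-9A-Fa-f]{2}/g, " ")   // underscores and %XX escapes become spaces
    .replace(/\s+/g, " ")                 // collapse runs of whitespace into one space
    .trim();                              // drop leading/trailing whitespace
}

console.log(slugToText("http://something.org/dom/My_happy_dog_%28is%29cool!"));
```

The collapse step matters because "_%28" would otherwise turn into two consecutive spaces.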
You can use unescape to decode the percent-escapes (unescape is deprecated; decodeURIComponent is the current alternative):
str = unescape("http://something.org/dom/My_happy_dog_%28is%29cool!")
str = str.replace("http://something.org/dom/","");
Maybe you could use a regular expression to pull out what you need, rather than getting rid of what you don't want. What is it you are trying to keep?
You can also chain them together as in:
str.replace("http://something.org/dom/", "").replace("something else", "");
You haven't defined the problem very exactly. To get rid of all stretches of characters ending in %<digit><digit> you'd say
var re = /.*%\d\d/g;
str = str.replace(re, "");
OK, if you want to replace all that stuff, I think you would need something like this:
/(http:\/\/.*\.[a-z]{3}\/.*\/)|(\%[a-z0-9][a-z0-9])|_/g
Test:
var string = "http://something.org/dom/My_happy_dog_%28is%29cool!";
string = string.replace(/(http:\/\/.*\.[a-z]{3}\/.*\/)|(\%[a-z0-9][a-z0-9])|_/g,"");
I have a string, which I want to extract the value out. The string is something like this:
cdata = "![CDATA[cu1hcmod6rbg3eenmk9p80c484ma9B]]";
And I want cu1hcmod6rbg3eenmk9p80c484ma9B. In other words, I want anything inside the ![CDATA[*]].
I tried to use the following javascript snippet:
cdata = "![CDATA[cu1hcmod6rbg3eenmk9p80c484ma9B]]";
rePattern = new RegExp("![?:\\s+]]","m");
arrMatch = rePattern.exec( cdata );
result = arrMatch[0];
But the code is not working. I'm pretty sure that the way I specify the matching string is causing the problem. Any idea how to fix it?
Your pattern should be something like...
/^!\[CDATA\[(.+?)\]\]$/
Which is...
Match literal starting ![CDATA[.
Lazy match everything up until the closing ] and save it in capturing group $1 (thanks Phrogz for his excellent suggestion).
Match extra ]].
Your string should be available as arrMatch[1].
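Put together with the code from the question, that pattern might be used as in this minimal sketch. The helper name extractCdata is made up, and returning null for non-matching input is an assumption about what the caller wants.

```javascript
// Hypothetical helper: return the payload of "![CDATA[...]]", or null if the
// input does not have that exact shape.
function extractCdata(cdata) {
  const match = /^!\[CDATA\[(.+?)\]\]$/.exec(cdata);
  return match ? match[1] : null; // match[0] would be the whole matched string
}

console.log(extractCdata("![CDATA[cu1hcmod6rbg3eenmk9p80c484ma9B]]"));
console.log(extractCdata("no cdata here"));
```

Checking the result of exec() before indexing avoids a TypeError when the input does not match.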
Try this:
var cdata = "![CDATA[cu1hcmod6rbg3eenmk9p80c484ma9B]]";
var regPattern = /(.*CDATA\[)(.*)(\]\].*)/gm;
alert(cdata.replace(regPattern, "$2"));