Parse string regex for known keys but leave separator - javascript

OK, so I hit a bit of a snag trying to write a regex.
Essentially, I want a string like:
error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z
to be parsed to read
error=some=new item
user=max
dateFrom=2013-01-15T05:00:00.000Z
dateTo=2013-01-16T05:00:00.000Z
So I want it to pull known keywords, and ignore other strings that have =.
My current regex looks like this:
(error|user|dateFrom|dateTo|timeFrom|timeTo|hang)\=[\w\s\f\-\:]+(?![(error|user|dateFrom|dateTo|timeFrom|timeTo|hang)\=])
The known keywords are supplied dynamically, so I can list which keys count as known.
How could I write it to include this requirement?

You could use a replace like so:
var input = "error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z";
var result = input.replace(/\s*\b((?:error|user|dateFrom|dateTo|timeFrom|timeTo|hang)=)/g, "\n$1");
result = result.replace(/^\r?\n/, ""); // remove the leading newline inserted before the first key
Result:
error=some=new item
user=max
dateFrom=2013-01-15T05:00:00.000Z
dateTo=2013-01-16T05:00:00.000Z

Another way to tokenize the string:
var tokens = inputString.split(/ (?=[^= ]+=)/);
The regex looks for a space that is followed by a run of characters that are neither spaces nor equals signs and that ends with =, and splits at those spaces.
Result:
["error=some=new item", "user=max", "dateFrom=2013-01-15T05:00:00.000Z", "dateTo=2013-01-16T05:00:00.000Z"]
Using the technique above with the key list from your question:
var tokens = inputString.split(/\s*(?=\b(?:error|user|dateFrom|dateTo|timeFrom|timeTo|hang)=)/);
The \s* consumes any whitespace in front of a known key, so the tokens carry no trailing spaces.
This also correctly splits the input that Qtax pointed out in the comments, "error=user=max foo=bar":
["error=", "user=max foo=bar"]
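Since the question mentions that the key list is dynamic, here is a minimal sketch that builds the splitting regex from an array of keys and collects the tokens into an object. The names knownKeys and parseKnownKeys are made up for illustration, and this variant assumes keys are separated by whitespace, so it will not split "error=user=max".

```javascript
// Hypothetical list of known keys (taken from the regex in the question).
const knownKeys = ["error", "user", "dateFrom", "dateTo", "timeFrom", "timeTo", "hang"];

// Hypothetical helper: split on whitespace that precedes "<known key>=".
function parseKnownKeys(input, keys) {
  // Escape regex metacharacters in case a key contains any.
  const escaped = keys.map(k => k.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  const splitter = new RegExp("\\s+(?=(?:" + escaped.join("|") + ")=)");
  const result = {};
  for (const token of input.split(splitter)) {
    const eq = token.indexOf("="); // only the first "=" separates key and value
    if (eq === -1) continue;       // skip tokens without a key
    result[token.slice(0, eq)] = token.slice(eq + 1);
  }
  return result;
}

const parsed = parseKnownKeys(
  "error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z",
  knownKeys
);
console.log(parsed);
```

Values keep any inner = signs (for example "some=new item"), because only the first = in each token is treated as the separator.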

Related

JS how do I select a string from between two special characters?

I have a string that will be formatted something like ___<test#email.com>____ where the underscores are irrelevant stuff I don't need, and that stuff varies in length. I need to select and store what is between the brackets.
My problem is that all of the sub string solutions I have seen operate off of a hard integer location in the string. But the start and end of the substring I want to select (the brackets) will never be the same.
So I thought that if I could find the location of the brackets, I could feed those positions to a substring solution and that would work. But all of the ways I have found of identifying special characters only report whether there are special characters, not where they are.
Thanks in advance!
based on this answer
var text = '___<test#email.com>____';
var values = text.split(/[<>]+/);
console.log(values); // your values should be at indexes 1, 3, 5, etc...
Here's a regex that should set you on your way.
let string = "asdf asdf asdf as <thing#stuff.com> jl;kj;l kj ;lkj ;lk j;lk";
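let myMatches = string.match(/<[^>]*>/g) || []; // [^>]* stops at the first ">", so each <...> is matched separately
let myMatch = myMatches.length ? myMatches[0].slice(1, -1) : null; // strip the surrounding < and >
With the g flag, the .match function returns an array of all matches (or null if there are none), so you can find multiple <stuff> entries.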
There's probably a way to do it without the slicing, but that's all I've got for now.
With Regex:
var myRe = /<([^>]*)>/; // capture what is between the brackets; no g flag is needed for a single exec call
var myArray = myRe.exec("____<asdf>___");
if (myArray)
  console.log(myArray[1]); // "asdf"
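The question's original idea (find where the brackets are, then take a substring) also works without a regex. A minimal sketch, where between is a made-up helper name:

```javascript
// Hypothetical helper: return the text between the first `open` and the next `close`.
function between(text, open, close) {
  const start = text.indexOf(open);          // position of the opening delimiter
  if (start === -1) return null;             // no opening delimiter found
  const from = start + open.length;
  const end = text.indexOf(close, from);     // search for the close only after the open
  if (end === -1) return null;               // no closing delimiter found
  return text.substring(from, end);
}

const email = between("___<test#email.com>____", "<", ">");
console.log(email);
```

String.prototype.indexOf returns the position of the delimiter (or -1 when it is missing), which is exactly the "where is it" information the question was looking for.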

How can I inverse matched result of the pattern?

Here is my string:
Organization 2
info#something.org.au more#something.com market#gmail.com single#noidea.com
Organization 3
headmistress#money.com head#skull.com
Also this is my pattern:
/^.*?#[^ ]+|^.*$/gm
As you see in the demo, the pattern matches this:
Organization 2
info#something.org.au
Organization 3
headmistress#money.com
My question: How can I make it inverse? I mean I want to match this:
more#something.com market#gmail.com single#noidea.com
head#skull.com
How can I do that? Actually I can write a new (and completely different) pattern to grab expected result, but I want to know, Is "inverting the result of a pattern" possible?
No, I don't believe there is a way to directly invert a regular expression while otherwise keeping it the same.
However, you could achieve something close to what you're after by using your existing RegExp to replace its matches with an empty string:
var everythingThatDidntMatchStr = str.replace(/^.*?#[^ ]+|^.*$/gm, '');
You can also collect the matches of the first RegExp and use Array.prototype.forEach() to replace each matched string with an empty string via String.prototype.replace():
var re = str.match(/^.*?#[^ ]+|^.*$/gm) || []; // match() returns null when nothing matches
var res = str;
re.forEach(val => { res = res.replace(val, ""); }); // a string pattern is replaced literally, so no regex escaping is needed
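Both approaches leave behind empty lines and leading spaces where the matches used to be. A small sketch of cleaning that up, assuming the input from the question is held in a variable named str (the variable names here are only for illustration):

```javascript
// Sample input from the question, joined with newlines.
const str = [
  "Organization 2",
  "info#something.org.au more#something.com market#gmail.com single#noidea.com",
  "Organization 3",
  "headmistress#money.com head#skull.com"
].join("\n");

const pattern = /^.*?#[^ ]+|^.*$/gm;

// Remove everything the pattern matches, then tidy what is left line by line.
const leftovers = str
  .replace(pattern, "")
  .split("\n")
  .map(line => line.trim())          // drop the space left in front of the remainder
  .filter(line => line.length > 0);  // drop lines that were matched completely

console.log(leftovers);
```

This is still not a true inversion of the pattern; it only recovers the text the pattern did not consume.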

Find and Replace all occurrences of a phrase in a json string using capturing groups

I have a stringified JSON which looks like this:
...
"message":null,"elementId":["xyz1","l9ie","xyz1"]}}]}], "startIndex":"1",
"transitionTime":"3","sourceId":"xyz1","isLocked":false,"autoplay":false
,"mutevideo":false,"loopvideo":false,"soundonhover":false,"videoCntrlVisibility":0,
...,"elementId":["dgff","xyz1","jkh9"]}}]}]
... it goes on.
The part I need to work on is the value of the elementId key. (The 2nd key in the first line, and the last key).
This key is present in multiple places in the JSON string. The value of this key is an array containing 4-character ids.
I need to replace one of these ids with a new one.
The kernel of the idea is something like:
var elemId = 'xyz1' // for instance
var regex = new RegExp(elemId, 'g');
var newString = jsonString.replace(regex, newRandomId);
jsonString = newString;
There are a couple of problems with this approach. The regex will match the id anywhere in the JSON. I need a regex which only matches it inside the elementId array; and nowhere else.
I'm trying to use a capturing group to match just the occurrences I need, but I can't quite crack it. I have:
/.*elementId":\[".*(xyz1).*"\]}}]/
But this doesn't match the 1st occurrence of 'xyz1' in the array.
So, firstly, I need a regex which can match all the 'xyz1's inside elementId; but nowhere else. The sequence of square and curly brackets after elementId ends doesn't change anywhere in the string, if that helps.
Secondly, even if I have a capturing group that works, string.replace doesn't act as expected. Instead of replacing just the match inside the capturing group, it replaces the whole match.
So, my second requirement is replacing only the captured groups, not the whole match.
What I need is a piece of JS code which will replace my 'xyz1's where needed and return the following string (assuming the newRandomId is 'abcd'):
"message":null,"elementId":["abcd","l9ie","abcd"]}}]}], "startIndex":"1",
"transitionTime":"3","sourceId":"xyz1","isLocked":false,"autoplay":false
,"mutevideo":false,"loopvideo":false,"soundonhover":false,"videoCntrlVisibility":0,
...,"elementId":["dgff","abcd","jkh9"]}}]}]
Note that the value of 'sourceId' is unaffected.
EDIT: I have to work with the JSON. I can't parse it and work with the object since I don't know all the places the old id might be in the object and looping through it multiple times (for multiple elements) would be time-consuming
Assuming you can't just parse and change the JS object, you could use 2 regexes: one to extract the array and one to change the desired ids inside it:
var output = input.replace(/("elementId"\s*:\s*\[)((?:".{4}",?)*)(\])/g, function(_,start,content,end){
  return start + content.replace(/"xyz1"/g, '"rand"') + end;
});
The arguments _, start, content, end are produced by the regex match (see the documentation for String.prototype.replace with a function as the replacement):
_ is the whole matched string (from "elementId":[ to ]). I chose this name because it's an old convention for arguments you don't use
start is the first captured group ("elementId":[)
content is the second captured group, that is, the inner part of the array
end is the third captured group, ]
Using the groups instead of hardcoding the start and end parts in the returned string serves two purposes:
avoid duplication (DRY principle)
make it possible to have variable strings (for example, my regex accepts optional spaces after the :)
var input = document.getElementById("input").innerHTML.trim();
var output = input.replace(/("elementId":\s*\[)((?:".{4}",?)*)(\])/g, function(_,start,content,end){
  return start + content.replace(/"xyz1"/g, '"rand"') + end;
});
document.getElementById("output").innerHTML = output;
Input:
<pre id=input>
"message":null,"elementId":["xyz1","l9ie","xyz1"]}}]}], "startIndex":"1",
"transitionTime":"3","sourceId":"xyz1","isLocked":false,"autoplay":false
,"mutevideo":false,"loopvideo":false,"soundonhover":false,"videoCntrlVisibility":0,
...,"elementId":["dgff","xyz1","jkh9"]}}]}]
</pre>
Output:
<pre id=output>
</pre>
Notes:
It would be easy to do the whole operation in one regex if the searched id were never repeated within one array. But the present structure makes it easy to handle several ids to replace at once.
I use non-capturing groups (?:...) in order to unclutter the arguments passed to the external replacing callback.
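To make the old and new ids parameters instead of hardcoded strings, the same two-step idea can be wrapped in a function. This is only a sketch: replaceElementId is a made-up name, and it assumes the ids never contain a ] character.

```javascript
// Hypothetical helper: replace oldId with newId, but only inside "elementId" arrays.
function replaceElementId(jsonString, oldId, newId) {
  // Groups: 1 = '"elementId":[', 2 = array contents, 3 = ']'
  const arrayPattern = /("elementId"\s*:\s*\[)([^\]]*)(\])/g;
  // JSON.stringify adds the surrounding quotes, so only whole ids are matched.
  const quotedOld = JSON.stringify(oldId);
  const quotedNew = JSON.stringify(newId);
  return jsonString.replace(arrayPattern, (_, start, content, end) =>
    // split/join replaces every occurrence without building a second regex
    start + content.split(quotedOld).join(quotedNew) + end
  );
}

const sample =
  '"message":null,"elementId":["xyz1","l9ie","xyz1"]}}]}], "sourceId":"xyz1",' +
  '"elementId":["dgff","xyz1","jkh9"]}}]}]';

const replaced = replaceElementId(sample, "xyz1", "abcd");
console.log(replaced);
```

Because the replacement is a function, only the captured array contents are changed, and "sourceId":"xyz1" is left alone.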

Extract text from HTML with Javascript regex

I am trying to parse a webpage and to get the number reference after <li>YM#. For example I need to get 1234-234234 in a variable from the HTML that contains
<li>YM# 1234-234234 </li>
Many thanks for your help someone!
Rich
Currently, your regex only matches if there is a single digit before the dash and a single digit after it. This will let you match one or more digits in each place instead (the \s* allows for the space after YM# in your HTML):
/YM#\s*[0-9]+-[0-9]+/g
Then you also need to capture the number, so we use a capture group:
/YM#\s*([0-9]+-[0-9]+)/g
Then we need to refer to the capture group, so we use the following code instead of String.match:
var regex = /YM#\s*([0-9]+-[0-9]+)/; // no g flag is needed for a single exec call
var match = regex.exec(text);
var id = match ? match[1] : null; // exec returns null when nothing matches
// match[0]: match of the entire regex
// after that, each of the groups gets a number
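If the page can contain several such list items, String.prototype.matchAll (ES2020) returns every match together with its capture groups. A minimal sketch, with a made-up html sample string:

```javascript
// Made-up sample HTML containing two references.
const html = "<ul><li>YM# 1234-234234 </li><li>YM# 77-8 </li></ul>";

// matchAll requires the g flag; each result's index 1 holds the first capture group.
const ids = Array.from(html.matchAll(/YM#\s*(\d+-\d+)/g), m => m[1]);

console.log(ids);
```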
(?<=<li>YM#\s)([\d-]+)
http://regexr.com?30ng5
This will match the numbers. The lookbehind (?<=...) needs a JavaScript engine that supports it.
Try this:
(<li>[^#<>]*?# *)([\d\-]+)\b
and read the result from the second capture group ($2).

Javascript Regex after specific string

I have several Javascript strings (using jQuery). All of them follow the same pattern, starting with 'ajax-', and ending with a name. For instance 'ajax-first', 'ajax-last', 'ajax-email', etc.
How can I make a regex to only grab the string after 'ajax-'?
So instead of 'ajax-email', I want just 'email'.
You don't need RegEx for this. If your prefix is always "ajax-" then you just can do this:
var name = string.substring(5);
Given a comment you made on another user's post, try the following:
var $li = jQuery(this).parents('li').get(0);
var ajaxName = $li.className.match(/(?:^|\s)ajax-(.*?)(?:$|\s)/)[1];
Below kept for reference only
var ajaxName = 'ajax-first'.match(/(\w+)$/)[0];
alert(ajaxName);
Use the \w (word character) pattern and anchor it to the end of the string with $. This grabs everything past the last hyphen (assuming the value consists only of upper- or lowercase letters, digits, or underscores).
The non-regex approach could also use the String.split method, coupled with Array.pop.
var parts = 'ajax-first'.split('-');
var ajaxName = parts.pop();
alert(ajaxName);
You can also replace the ajax- prefix with an empty string, for example string.replace(/^ajax-/, "").
I like the split method @Brad Christie mentions, but I would just do
function getLastPart(str,delimiter) {
  return str.split(delimiter)[1];
}
This works if you will always have only two-part strings separated by a hyphen. If you wanted to generalize it for any particular piece of a multiple-hyphenated string, you would need to write a more involved function that included an index, but then you'd have to check for out of bounds errors, etc.
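A rough sketch of that more general version, with the bounds check included. The name getPart and the convention that a negative index counts from the end are my own choices for illustration:

```javascript
// Hypothetical helper: return the piece at `index` after splitting on `delimiter`.
// A negative index counts from the end (-1 is the last piece).
function getPart(str, delimiter, index) {
  const parts = str.split(delimiter);
  const i = index < 0 ? parts.length + index : index;
  // Return undefined instead of throwing when the index is out of bounds.
  return i >= 0 && i < parts.length ? parts[i] : undefined;
}

console.log(getPart("ajax-email", "-", 1));       // second piece
console.log(getPart("ajax-first-name", "-", -1)); // last piece
console.log(getPart("ajax-email", "-", 5));       // out of bounds
```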
