Javascript regex match and get values from string - javascript

I've got a string of text which can have specific tags in it.
Example: var string = '<pause 4>This is a line of text.</pause><pause 7>This is the next part of the text.</pause>';
What I'm trying to do is run a regex match against the <pause #></pause> tags.
For each tag found (in this case <pause 4></pause> and <pause 7></pause>), I want to grab the number (4 and 7) and the length of the string between the <pause #>...</pause> tags divided by 2.
I can't figure out how to grab all the matches, then loop through each one and grab the values I'm looking for.
My function for this looks like this for now; it's not much:
/**
* checkTags(string)
* Just check for tags, and add them
* to the proper arrays for drawing later on
* @return string
*/
function checkTags(string) {
// Regular expressions we will use
var regex = {
pause: /<pause (.*?)>(.*?)<\/pause>/g
}
var matchedPauses = string.match(regex.pause);
// For each match
// Grab the pause seconds <pause SECONDS>
// Grab the length of the string divided by 2 "string.length/2" between the <pause></pause> tags
// Push the values to "pauses" [seconds, string.length/2]
// Remove the tags from the original string variable
return string;
}
If anyone can explain to me how I could do this, I would be very thankful! :)

With the g flag, match(/.../g) returns only the full matches and drops the subgroups, so you're going to need exec or replace to get them. Here's an example of a replace-based helper function to get all matches:
function matchAll(re, str) {
var matches = [];
str.replace(re, function() {
matches.push([...arguments]);
});
return matches;
}
var string = '<pause 4>This is a line of text.</pause><pause 7>This is the next part of the text.</pause>';
var re = /<pause (\d+)>(.+?)<\/pause>/g;
console.log(matchAll(re, string))
Since you're removing tags anyways, you can also use replace directly.
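A minimal sketch of that direct replace approach, assuming the pause tags are never nested and the seconds value is a whole number. The name extractPauses and the shape of the returned object are made up for illustration:

```javascript
// Hypothetical helper: collects [seconds, text.length / 2] pairs and strips the tags.
function extractPauses(input) {
  var pauses = [];
  // The callback receives the full match first, then each capture group in order.
  var stripped = input.replace(/<pause (\d+)>(.*?)<\/pause>/g, function (whole, seconds, text) {
    pauses.push([Number(seconds), text.length / 2]);
    return text; // keep the inner text, drop the surrounding tags
  });
  return { text: stripped, pauses: pauses };
}

var result = extractPauses('<pause 4>This is a line of text.</pause><pause 7>This is the next part of the text.</pause>');
console.log(result.pauses);
console.log(result.text);
```

Returning the inner text from the callback is what removes the tags, so the collecting and the cleanup happen in a single pass.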

You need a loop to find every match of your RegExp pattern in the text.
Each call to exec returns an array containing the full matched text, then the value of the first group (the seconds), then the value of the second group (the text between the tags).
var str = '<pause 4>This is a line of text.</pause><pause 7>This is the next part of the text.</pause>';
function checkTags(str) {
// Regular expressions we will use
var regex = {
pause: /<pause (.*?)>(.*?)\<\/pause>/g
}
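var matches = [];
var matchedPauses; // declared here so it does not leak into the global scope
// With the g flag, each exec() call continues from the previous match and returns null at the end
while ((matchedPauses = regex.pause.exec(str)) !== null) {
matches.push([matchedPauses[1], matchedPauses[2].length / 2]);
}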
return matches;
}
console.log(checkTags(str));

As a starting point, since you don't have much so far, you could try this one (the lazy .*? keeps each match to a single tag pair):
/<pause [0-9]+>.*?<\/pause>/g
Then, to get the number out of each match, match again using
/[0-9]+>/g
To get rid of the trailing > sign:
str = str.slice(0, -1);

Related

I need help getting the first n characters of a string up to when a number character starts

I'm working with a string where I need to extract the first n characters up to where the numbers begin. What would be the best way to do this, given that the string sometimes starts with a number? From 7EUSA8889er898 I would need to extract 7EUSA, but from other strings such as SWFX74849948 I would need to extract SWFX.
I'm not sure how to do this with regex; my limited knowledge is blocking me at this point:
^(\w{4}) just gets me the first four characters, but I don't really have a stopping point, as sometimes the string could be somelongstring292894830982, which would require me to get somelongstring
\w matches any word character, which includes letters, digits and the underscore, so it will not stop at the digits.
You could match an optional digit [0-9]? from the start of the string ^ and then match A-Za-z one or more times:
^[0-9]?[A-Za-z]+
const regex = /^[0-9]?[A-Za-z]+/;
[
"7EUSA8889er898",
"somelongstring292894830982",
"SWFX74849948"
].forEach(s => console.log(s.match(regex)[0]));
You can use this regex:
(^\d+?[a-zA-Z]+)|(^\d+|[a-zA-Z]+)
I tried it with these examples and it worked well:
1- somelongstring292894830982 -> somelongstring
2- 7sdfsdf5456 -> 7sdfsdf
3- 875werwer54556 -> 875werwer
If you want a function where the RegExp is parameterized by n, this would be:
function getStr(str,n) {
var pattern = "\\d?\\w{0,"+n+"}";
var reg = new RegExp(pattern);
var result = reg.exec(str);
if(result[0]) return result[0].substr(0,n);
}
There are answers to this but here is another way to do it.
var string1 = '7EUSA8889er898';
var string2 = 'SWFX74849948';
var Extract = function (args) {
var C = args.split(''); // Split string into an array of characters
var NI = []; // Store indexes of all digits
// Loop through the characters -> if a character is a digit, record its own index
C.forEach(function (ch, i) { if (/\d/.test(ch)) NI.push(i); });
// A leading digit (index 0) is kept, so cut at the first digit index after it
var end = NI[0] === 0 ? NI[1] : NI[0];
return C.slice(0, end).join('');
};
console.log(Extract(string1));
console.log(Extract(string2));
Output
7EUSA
SWFX
Since it's hard to tell what you are trying to match, I'd go with a general regex
^\d?\D+(?=\d)
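
A short sketch of how that pattern behaves on the strings from the question. The helper name leadingPrefix and the extra "nodigits" input are assumptions added for illustration. Note that \D matches any non-digit, not only letters, and that the lookahead requires a digit to follow, so a string without a later digit yields no match:

```javascript
// Hypothetical helper around the pattern above; returns null when no digit follows the prefix.
function leadingPrefix(s) {
  var m = s.match(/^\d?\D+(?=\d)/);
  return m ? m[0] : null;
}

["7EUSA8889er898", "SWFX74849948", "somelongstring292894830982", "nodigits"].forEach(function (s) {
  console.log(s, "->", leadingPrefix(s));
});
```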

conditional substring in JS

I have a function that should clean a string. I have two kinds of strings: "SATISFACTION._IS1.TOUTES_SITUATIONS.current_month_note" or "SATISFACTION._IS1.TOUTES_SITUATIONS.note".
Note that TOUTES_SITUATIONS is variable.
What I want to return is "TOUTES_SITUATIONS".
Here's my code
const extractSituation: Function = (sentence: string): string => {
return sentence.substring(
sentence.lastIndexOf('1.') + 2,
sentence.lastIndexOf('.n'),
);
};
Currently it handles only one type of sentence, "SATISFACTION._IS1.TOUTES_SITUATIONS.note", but not "SATISFACTION._IS1.TOUTES_SITUATIONS.current_month_note".
How can I handle both of them?
Array indexes start from 0, so the situation is the element at index 2 after splitting on '.'. Try something like this:
const extractSituation: Function = (sentence: string): string => {
return sentence.split('.')[2];
};
You might be able to just use a regex for this to pull out the text between the first SATISFACTION._IS1. and last .:
let s = "SATISFACTION._IS1.TOUTES_SITUATIONS.current_month_note"
let s2 = "SATISFACTION._IS1.TOUTES_someother.text__SITUATIONS.note"
let regex = /^SATISFACTION\._IS1\.(.*)\..*$/
console.log(s.match(regex)[1])
console.log(s2.match(regex)[1])
I'm assuming:
1) You need to handle a general case where the 'situation' could be any string without periods.
2) The break before and after the 'situation' is delimited by '.'
You can retrieve the substring from the start of the situation to the end of the sentence, then find the index of the next '.' to find the substring containing only the situation.
const extractSituation: Function = (sentence: string): string => {
// sentence truncated up to start of situation
var situation = sentence.substring(sentence.lastIndexOf('1.') + 2);
// Up to the next period from start of situation
return situation.substring(0, situation.indexOf('.'));
};
This code only works given that you can assume every situation is preceded by your '1.' index.

Why does this jQuery code not work?

Why doesn't the following jQuery code work?
$(function() {
var regex = /\?fb=[0-9]+/g;
var input = window.location.href;
var scrape = input.match(regex); // returns ?fb=4
var numeral = /\?fb=/g;
scrape.replace(numeral,'');
alert(scrape); // Should alert the number?
});
Basically I have a link like this:
http://foo.com/?fb=4
How do I first locate the ?fb=4 and then retrieve the number only?
Consider using the following code instead:
$(function() {
var matches = window.location.href.match(/\?fb=([0-9]+)/i);
if (matches) {
var number = matches[1];
alert(number); // will alert 4!
}
});
Test an example of it here: http://jsfiddle.net/GLAXS/
The regular expression is only slightly modified from what you provided. The global flag was removed, as you're not going to have multiple fb= parameters to match (otherwise your URL would be invalid!). The case-insensitive flag was added to match FB= as well as fb=.
The number is wrapped in parentheses to denote a capturing group, which is the magic that lets match hand back the number on its own.
If match matches the regular expression we specify, it'll return the matched string in the first array element. The remaining elements contain the value of each capturing group we define.
In our running example, the string "?fb=4" is matched and so is the first value of the returned array. The only capturing group we have defined is the number matcher; which is why 4 is contained in the second element.
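To make the shape of the returned array concrete, here is a small sketch that hard-codes the URL from the question (an assumption, so that it runs outside a browser) and compares the result with and without the global flag:

```javascript
var url = "http://foo.com/?fb=4"; // example URL from the question, hard-coded for illustration

var single = url.match(/\?fb=([0-9]+)/i);  // no g flag: full match followed by capture groups
var global = url.match(/\?fb=([0-9]+)/gi); // g flag: only the full matches, groups are dropped

console.log(single[0], single[1]);
console.log(global);
```

This is also why the original code could not get at the number: with the g flag, match returns an array of whole matches only.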
If all you need is to grab the value of fb, just use capturing parentheses:
var regex = /\?fb=([0-9]+)/g;
var input = window.location.href;
var tokens = regex.exec(input);
if (tokens) { // there's a match
alert(tokens[1]); // grab first captured token
}
So, you want to feed in a query string and then get a value based on the parameter name?
I had half a mind to just point you to "Get query string values in JavaScript".
But then I saw a small kid abusing a much respectful Stack Overflow answer.
// Revised, cooler.
function getParameterByName(name) {
var match = RegExp('[?&]' + name + '=([^&]*)')
.exec(window.location.search);
return match ?
decodeURIComponent(match[1].replace(/\+/g, ' '))
: null;
}
And while you are at it, just call the function like this.
getParameterByName("fb")
How about using the following function to read the query string parameter in JavaScript:
function getQuerystring(key, default_) {
if (default_==null)
default_="";
key = key.replace(/[\[]/,"\\\[").replace(/[\]]/,"\\\]");
var regex = new RegExp("[\\?&]"+key+"=([^&#]*)");
var qs = regex.exec(window.location.href);
if(qs == null)
return default_;
else
return qs[1];
}
and then:
alert(getQuerystring('fb'));
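As an alternative to a hand-written regex, environments that provide the standard URL and URLSearchParams APIs (modern browsers and Node.js) can parse the query string directly. This is a different technique from the functions above, sketched under that assumption; the name getParam and its fallback argument are hypothetical:

```javascript
// Hypothetical helper using the standard URL API instead of a regular expression.
function getParam(href, key, fallback) {
  var value = new URL(href).searchParams.get(key); // get() returns null when the key is absent
  return value === null ? fallback : value;
}

console.log(getParam("http://foo.com/?fb=4", "fb", ""));
console.log(getParam("http://foo.com/?fb=4", "missing", "none"));
```

In a browser you would pass window.location.href as the first argument.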
If you are new to regex, why not try a program that illustrates the ins and outs of regular expressions?

splitting a string in js where a pattern DOES NOT match?

I am trying to split a textarea value where a pattern does not match.
The text is like the following:
Some Good Tutorials
http://a.com/page1
http://a.com/page2
Some Good Images
http://i.com/p1
http://i.com/p2
Some Good Videos
http://m.com/p1
http://m.com/p2
Now I want to get only the links from the text. One approach would be to split the whole string into an array of strings wherever a line is not a URL, and then split each string in that array on "\n".
Edit:
Okay, I found a solution: I can find the lines which do not begin with http:// or https://, replace them with a placeholder, and after that get the links.
I am weak in regex, though, so can someone tell me how to do this in JavaScript?
Match the pattern; don't split with it.
value = value.match(/https?:\/\/.+/g)
(.+ matches everything to the end of a line, and https? covers both http:// and https://)
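A runnable sketch of the match-based idea on sample text like the question's. The https:// line is an assumption added to show both schemes, and the m flag is used so that only lines starting with a URL are picked up:

```javascript
var text = [
  "Some Good Tutorials",
  "http://a.com/page1",
  "http://a.com/page2",
  "Some Good Images",
  "https://i.com/p1" // assumed https variant, not in the original sample
].join("\n");

// With the m flag, ^ and $ anchor at each line; . never matches a newline.
// match() with the g flag returns null when nothing matches, hence the || [].
var links = text.match(/^https?:\/\/.+$/gm) || [];
console.log(links);
```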
Solved finally! Here is the code:
function split_lines() {
var oText = $('linkTxtArea').value;
removeBlankLines(); // a helper function to remove blank lines
oText = oText.split("\n"); // first split the string into an array
for (var i = 0; i < oText.length; i++) // loop over the array
{
if (!oText[i].match(/^http:/)) // check whether the line does not begin with http:
{
oText[i] = oText[i].replace(oText[i], "!replaced!"); // replace it with '!replaced!'
}
}
oText = oText.toString().split("!replaced!"); // now join the array to a string and then split that string by '!replaced!'
for (var i = 1; i < oText.length; i++)
{
oText[i] = oText[i].replace(/^,/,"").replace(/,$/,""); // there were some extra commas left so i fixed it
}
return oText;
}

How to get portion of the attribute value using jquery

My div has this id attribute value:
<div id = "2fComponents-2fPromotion-2f" class = "promotion">
Now I want to get only a portion of it, say Promotion and its value 2f. How can I get this using jQuery? Is there a built-in function for it?
You can use a regular expression here:
var attId = $(".promotion").attr("id");
// Perform a match on "Promotion-" followed by 2 characters in the range [0-9a-f]
var match = attId.match(/Promotion-([0-9a-f]{2})/);
alert(match[1]); // match[0] contains "Promotion-2f", match[1] contains "2f"
This assumes that the "value" of Promotion is a hexadecimal value and that the characters [a-f] will always be lower case. It's also easily adjusted to match other values. For instance, if the id contained component-3a and I changed the regex to /component-([0-9a-f]{2})/, the match array would be ["component-3a", "3a"].
The match method takes a regular expression as its input and searches the string for the results. The result is returned as an array of matches, with the first index being the complete match (the equivalent regex for this alone would be /Promotion-[0-9a-f]{2}/). Any sub-expression matches (expressions enclosed in parentheses) are added to the array in the order they appear in the expression, so ([0-9a-f]{2}), the only sub-expression here, is added to the array at index 1. If nothing matches, match returns null instead of an array.
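To illustrate how sub-expressions are ordered in the result, here is a sketch with a hypothetical two-group variant of the pattern. The id is hard-coded instead of being read through jQuery so the snippet runs without a DOM:

```javascript
// Hypothetical two-group variant, used only to show how groups are ordered in the result.
var attId = "2fComponents-2fPromotion-2f"; // id from the question, hard-coded for illustration
var match = attId.match(/(Promotion)-([0-9a-f]{2})/);

if (match) {
  console.log(match[0]); // complete match
  console.log(match[1]); // first group
  console.log(match[2]); // second group
}

// When nothing matches, the result is null rather than an empty array.
var noMatch = "no-code-here".match(/(Promotion)-([0-9a-f]{2})/);
console.log(noMatch);
```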
match method on MSDN
var id = $("div.promotion").attr("id");
var index = id.indexOf("Promotion");
var promotion = '';
// if the word 'Promotion' is present
if(index !== -1) {
// extract it up to the end of the string
promotion = id.substring(index);
// split it at the hyphen '-', the second offset is the promotion code
alert(promotion.split('-')[1]);
} else {
alert("promotion code not found");
}
you can get the id attribute like this:
var id= $('div.promotion').attr('id');
But then I think you would have to use regular expressions to parse data from the string, as the format doesn't appear to be straightforward.
If you are storing lots of info in the id could you consider using multiple attributes like:
<div class="promotion" zone="3a-2f-2f" home="2f"></div>
Then you could get the data like this:
var zone= $('div.promotion').attr('zone');
var home= $('div.promotion').attr('home');
Or you could use jQuery.data()
HTH
$(function () {
var promotion = $('.promotion').attr('id').match(/Promotion-([0-9a-f]{2})/);
if (promotion) { // match() returns null when nothing matches, so test the result itself
alert(promotion[1]);
}
else {
return false;
}
});
