match css rules in javascript - javascript

I need to create a regular expression to find a class inside a CSS file.
For example I have this css file:
#label-blu{
}
.label-blu, .test{
}
.label-blu-not-match{
}
.label-blu{
}
.label-blu span{
}
In this case I need it to return 3 matches.
This is my regular expression:
var css = data;
var find_css = 'label-blu';
var found = css.match(/([#|\.]?)([\w|:|\s|\.]+)/gmi).length;
console.log('found: ' + found);
The variable data contains the whole CSS string.
How can I solve this?
Thanks

There are two points. First, \w does not match a hyphen, so \w+ stops at each one:
("word-does-not-include-hyphen").replace(/\w+/g, 'test')
Second, are you sure you should be matching against the label text label-blu rather than the full CSS text itself? Currently you are finding the pieces of label-blu on either side of the hyphen:
var css = 'label-blu';
var found = css.match(/([#|\.]?)([\w|:|\s|\.]+)/gmi);
/// which gives ['label','blu']
Which is the reason for the returned length of two, rather than three. Were you not hoping to match the three items in the css text i.e
#label-blu
.label-blu-not-match
.label-blu
If so, you will need to match against different text: the entire CSS, rather than just the string 'label-blu'.
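The hyphen behaviour described above can be checked directly. This is a minimal sketch with illustrative variable names: \w covers letters, digits and the underscore, so a hyphenated name splits into two matches unless the hyphen is added to the character class.

```javascript
var label = 'label-blu';

// \w+ stops at the hyphen, so the name is split into two matches.
var wordOnly = label.match(/\w+/g);

// Adding the hyphen to the character class keeps the name in one piece.
var withHyphen = label.match(/[\w-]+/g);

console.log(wordOnly.length, withHyphen.length);
```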
However if you are trying to match:
#label-blu
.label-blu, .test
.label-blu
.label-blu span
Then you will need a different RegExp and the entire CSS string. I just need clarification on which of the two you need.
update
It's still not clear exactly what you wish to match in the CSS text, which is why I outlined both options. However, assuming you want to match the last four items I mention (and don't wish to match label-blu-not-match), the following should help:
http://jsfiddle.net/5d7JX/
var found = csstext.match(/[#\.]label-blu([,:\s\.][^\{]*)?\{/gmi);
However, the above is not foolproof for all possible CSS formats, nor does it protect against matches within the CSS rule-sets themselves. Generally speaking, using only regular expressions to scan through code that is complicated to parse is frowned upon, unless you are solving a very specific use case.
update 2
Yes, excluding the ID selectors just involves removing the # part of the RegExp:
var found = csstext.match(/\.label-blu([,:\s\.][^\{]*)?\{/gmi);
I recommend that you read up on your regular expressions, this site is a good place:
http://www.regular-expressions.info/
update 3
To include a variable as part of a regular expression, you need to escape its special characters so the string is matched literally and they won't interfere. As far as I'm aware there isn't a built-in function to escape or quote for regular expressions in JavaScript; however, you can find one here:
How to escape regular expression in javascript?
So if you add this to your code:
RegExp.quote = function(str) {
return (str+'').replace(/([.?*+^$[\]\\(){}|-])/g, "\\$1");
};
You then also need to convert your regexp to the object equivalent:
var reg = new RegExp('\\.label-blu([,:\\s\\.][^\\{]*)?\\{', 'gmi');
var found = csstext.match(reg);
And then add this:
var label = 'label-blu';
var reg = new RegExp('\\.' + RegExp.quote(label) + '([,:\\s\\.][^\\{]*)?\\{', 'gmi');
var found = csstext.match(reg);
http://jsfiddle.net/5d7JX/1/
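Putting the pieces of this answer together, here is a minimal sketch run against the CSS from the question. The helper names escapeRegExp and countClassRules are hypothetical, and the same caveats apply: this is not a CSS parser and it can be fooled by text inside rule-sets.

```javascript
// Hypothetical helper: prefix every regex metacharacter with a backslash
// so the class name is matched literally.
function escapeRegExp(str) {
  return String(str).replace(/[.*+?^${}()|[\]\\-]/g, '\\$&');
}

// Hypothetical helper: count rule headers that start with .className,
// followed either directly by "{" or by a separator and more selector text.
function countClassRules(cssText, className) {
  var reg = new RegExp('\\.' + escapeRegExp(className) + '([,:\\s.][^{]*)?\\{', 'gi');
  var found = cssText.match(reg);
  return found ? found.length : 0; // match() returns null when nothing matches
}

var sampleCss = [
  '#label-blu{', '}',
  '.label-blu, .test{', '}',
  '.label-blu-not-match{', '}',
  '.label-blu{', '}',
  '.label-blu span{', '}'
].join('\n');

console.log('found: ' + countClassRules(sampleCss, 'label-blu'));
```

The null check matters because match() returns null rather than an empty array when there is no match, so reading .length directly would throw.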

In your example, if you use:
var findClass = /(\.label-blu)(?!-)/g;
var found = css.match(findClass).length;
it should return 3.
maybe a better solution is:
var findClass = /(\.label-blu)[\s{,]+/g;
var found = css.match(findClass).length;
This covers the possibility that something other than '-' is appended to the class name: it only matches the class when it is followed by whitespace, a '{' or a ','.
let me know if you have any questions
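A variation on the same lookahead idea, offered as a sketch rather than as this answer's own method: instead of listing the characters that may follow the class, reject any character that could continue a class name. The function name countClass is hypothetical.

```javascript
// Count occurrences of .label-blu that are not followed by another
// name character (letter, digit, underscore or hyphen).
function countClass(cssText) {
  var found = cssText.match(/\.label-blu(?![\w-])/g);
  return found ? found.length : 0;
}

console.log(countClass('.label-blu{} .label-blu-not-match{} .label-blu:hover{}'));
```

This form also accepts selectors such as .label-blu:hover and .label-blu.active, which a trailing [\s{,]+ would skip.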

Related

Extract inner text from anchor tag string using a regular expression in JavaScript

I am new to AngularJS. I have a regex which gets all the anchor tags. My regex is
/<a[^>]*>([^<]+)<\/a>/g
and I am using the match function on a string like this:
var str = 'abc.jagadale#gmail.com'
So Now I am using the code like
var value = str.match(/<a[^>]*>([^<]+)<\/a>/g);
So here I am expecting the output to be abc.jagadale#gmail.com, but I am getting exactly the same string as the input. Can anyone please help me with this? Thanks in advance.
Why are you trying to reinvent the wheel?
You are trying to parse the HTML string with a regex, which will be a very complicated task. Just use the DOM or jQuery to get the link contents; they are made for this.
Put the HTML string in as the HTML of a jQuery/DOM element.
Then search this created DOM element for all the a elements
inside it and return their contents in an array.
This is how your code should look:
var str = 'abc.jagadale#gmail.com';
var results = [];
$("<div></div>").html(str).find("a").each(function(l) {
results.push($(this).text());
});
Demo:
var str = 'abc.jagadale#gmail.com';
var results = [];
$("<div></div>").html(str).find("a").each(function(l) {
results.push($(this).text());
});
console.log(results);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
You need to capture the group inside the anchor tags. The regular expression already captures the inner text with the group ([^<]+), but there are different ways to extract it.
When the match function is used without the g flag, it returns an array whose first element is the whole match and whose following elements are the captured groups.
Try this:
var reg = /<a[^>]*>([^<]+)<\/a>/g
reg.exec(str)[1]
Also, the match function only includes the captured groups in its result if the g flag is not present; with the g flag it returns just the full matches.
Check https://javascript.info/regexp-groups for further documentation.
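To collect the captured text from every anchor rather than only the first, exec can be called in a loop on a global regex. This is a minimal sketch; the function name anchorTexts and the sample markup are made up for illustration, and it shares the usual limits of matching HTML with a regex.

```javascript
// Hypothetical helper: return the inner text of each anchor in the string.
function anchorTexts(html) {
  var reg = /<a[^>]*>([^<]+)<\/a>/g;
  var texts = [];
  var m;
  // With the g flag, each exec() call resumes after the previous match.
  while ((m = reg.exec(html)) !== null) {
    texts.push(m[1]); // m[0] is the whole match, m[1] is the first group
  }
  return texts;
}

var sampleHtml = '<a href="#one">first</a> and <a href="#two">second</a>';
var innerTexts = anchorTexts(sampleHtml);

// For contrast: match() with the g flag returns only the full matches.
var fullMatches = sampleHtml.match(/<a[^>]*>([^<]+)<\/a>/g);
console.log(innerTexts, fullMatches);
```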
Brief
Don't use regex for this. Regex is a great tool, don't get me wrong, but it's not what you're looking for. Regex cannot properly parse HTML and should only be used to do so if it's a limited, known set of HTML.
Try, for example, adding content:">" to your style attribute. You'll see your pattern now fails or gives you an incorrect result. I don't like to use this quote all the time, but I think it's necessary to use it in this case:
Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.
Use builtin functions. jQuery makes this super easy to accomplish. See my Code section for a demonstration. It's way more legible than any regex variant.
Code
DOM from page
The following snippet gets all anchors on the actual page.
$("a").each(function() {
console.log($(this).text())
})
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
abc.jagadale#gmail.com
abc2.jagadale#gmail.com
DOM in string
The following snippet gets all anchors in the string (converted to DOM element)
var s = `email3#domain.com
email4#domain.com`
$("<div></div>").html(s).find("a").each(function() {
console.log($(this).text())
})
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
email1#domain.com
email2#domain.com
Given the use case of parsing a string, instead of having an actual DOM to work with, it does seem like regex is the way to go, unless you want to load the HTML into a document fragment and parse that.
One way to get all of your matches is to make use of split:
var htmlstr = "<p><a href='url'>asdf#bsdf.com</a></p>"
var matches = htmlstr.split(/<a.+?>([A-Za-z.#]+)<\/a>/).filter((t, i) => i % 2)
Using split with a regex that contains a capture group returns the captured text along with the text around it; filtering by index % 2 then pares the result down to just the captured text.
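To see why the odd indexes are the ones to keep, it helps to look at the array split produces before filtering. This is a sketch on a made-up two-anchor string, which keeps the # used in the addresses elsewhere in this thread.

```javascript
var htmlTwo = "<p><a href='one'>asdf#bsdf.com</a> and <a href='two'>x.y#z.com</a></p>";
var anchorRe = /<a.+?>([A-Za-z.#]+)<\/a>/;

// split() inserts each captured group between the surrounding pieces,
// so the pieces sit at even indexes and the captures at odd ones.
var pieces = htmlTwo.split(anchorRe);

// Keep only the odd indexes, i.e. the captured anchor texts.
var captured = pieces.filter(function (t, i) { return i % 2; });
console.log(pieces, captured);
```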

Regex lookbehind workaround for Javascript?

I am terrible at regex so I will communicate my question a bit unconventionally in the name of trying to better describe my problem.
var TheBadPattern = /(\d{2}:\d{2}:\d{2},\d{3})/;
var TheGoodPattern = /([a-zA-Z0-9\-,.;:'"])(?:\r\n?|\n)([a-zA-Z0-9\-])/gi;
// My goal is to then do this
inputString = inputString.replace(TheGoodPattern, '$1 $2');
Question: I want to match all the good patterns and do the subsequent find/replace UNLESS they are preceded by the bad pattern. Any ideas on how? I was able to accomplish this in other languages that support lookbehind, but I am at a loss without it. (PS: from what I understand, JS does not support lookahead/lookbehind or, if you prefer, '?<!' and '?<='.)
JavaScript does support lookaheads. And since you only need a lookbehind (and not a lookahead, too), there is a workaround (which doesn't really aid the readability of your code, but it works!). So what you can do is reverse both the string and the pattern.
inputString = inputString.split("").reverse().join("");
var pattern = /([a-z0-9\-])(?:\n\r?|\r)([a-z0-9\-,.;:'"])(?!\d{3},\d{2}:\d{2}:\d{2})/gi
inputString = inputString.replace(pattern, '$1 $2');
inputString = inputString.split("").reverse().join("");
Note that you had redundantly used the upper case letters (they are taken care of by the i modifier).
I would actually test it for you if you supplied some example input.
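Since no sample input was supplied, here is the reversal trick on a deliberately simple, made-up case: finding "good" followed by a digit unless it is preceded by "bad ". The function names are hypothetical. Engines that implement ES2018 lookbehind assertions can express the same thing directly, which the last lines use as a cross-check.

```javascript
// Reverse a string. Note: split('') works on UTF-16 code units, so this
// sketch assumes the text has no surrogate pairs or combining marks.
function reverseString(s) {
  return s.split('').reverse().join('');
}

// Emulates /(?<!bad )good\d/g: the lookbehind becomes a lookahead on the
// reversed text, and every literal in the pattern is written backwards.
function findGoodNotAfterBad(text) {
  var reversed = reverseString(text);
  var hits = reversed.match(/\ddoog(?! dab)/g) || [];
  // Un-reverse each hit and restore the original left-to-right order.
  return hits.map(reverseString).reverse();
}

var sampleText = 'find good1 but not bad good3, then good2';
var viaReverse = findGoodNotAfterBad(sampleText);

// For comparison, the native form in engines with ES2018 lookbehind support.
var viaLookbehind = sampleText.match(/(?<!bad )good\d/g) || [];
console.log(viaReverse, viaLookbehind);
```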
I have also used the reverse methodology recommended by m.buettner, and it can get pretty tricky depending on your patterns. I find that workaround works well if you are matching simple patterns or strings.
With that said I thought I would go a bit outside the box just for fun. This solution is not without its own foibles, but it also works and it should be easy to adapt to existing code with medium to complicated regular expressions.
http://jsfiddle.net/52QBx/
js:
function negativeLookBehind(lookBehindRegExp, matchRegExp, modifiers)
{
var text = $('#content').html();
var badGoodRegex = regexMerge(lookBehindRegExp, matchRegExp, modifiers);
var badGoodMatches = text.match(badGoodRegex) || []; // match() returns null when nothing is found
var placeHolderMap = {};
for(var i = 0;i<badGoodMatches.length;i++)
{
var match = badGoodMatches[i];
var placeHolder = "${item"+i+"}"
placeHolderMap[placeHolder] = match;
$('#content').html($('#content').html().replace(match, placeHolder));
}
var text = $('#content').html();
var goodRegex = matchRegExp;
var goodMatches = text.match(goodRegex);
for(var prop in placeHolderMap)
{
$('#content').html($('#content').html().replace(prop, placeHolderMap[prop]));
}
return goodMatches;
}
function regexMerge(regex1, regex2, modifiers)
{
/*this whole concept could be its own beast, so I just asked to have modifiers for the combined expression passed in rather than determined from the two regexes passed in.*/
return new RegExp(regex1.source + regex2.source, modifiers);
}
var result = negativeLookBehind(/(bad )/gi, /(good\d)/gi, "gi");
alert(result);
html:
<div id="content">Some random text trying to find good1 text but only when that good2 text is not preceded by bad text so bad good3 should not be found bad good4 is a bad oxymoron anyway.</div>​
The main idea is to find every occurrence of the combined pattern (the lookbehind followed by the real match) and temporarily replace those occurrences in the text being searched. I used a map because the hidden values can vary, so each replacement has to be reversible. Then we can run just the regex for the items you really want to find, without the ones preceded by the lookbehind pattern getting in the way. After the results are determined, we swap the original items back in and return the results. It is a quirky, yet functional, workaround.
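The same hide-then-search idea can be sketched on a plain string, without jQuery or the DOM. This is a variant rather than the code above: instead of a reversible placeholder map it overwrites each combined match with NUL characters in a copy of the text, which assumes the text contains no NUL characters and that the wanted pattern cannot match them. The function name matchNotPreceded is hypothetical.

```javascript
// Return matches of matchRegExp that are not directly preceded by
// lookBehindRegExp. Works on a copy, so nothing has to be swapped back.
function matchNotPreceded(text, lookBehindRegExp, matchRegExp, modifiers) {
  var combined = new RegExp(lookBehindRegExp.source + matchRegExp.source, modifiers);
  // Blank out every "lookbehind + match" occurrence, keeping the length.
  var masked = text.replace(combined, function (whole) {
    return new Array(whole.length + 1).join('\u0000');
  });
  // Whatever still matches was not preceded by the lookbehind pattern.
  return masked.match(new RegExp(matchRegExp.source, modifiers)) || [];
}

var content = 'Some random text trying to find good1 text but only when that ' +
  'good2 text is not preceded by bad text so bad good3 should not be found ' +
  'bad good4 is a bad oxymoron anyway.';
var goodOnly = matchNotPreceded(content, /(bad )/, /(good\d)/, 'gi');
console.log(goodOnly);
```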

Javascript Regex after specific string

I have several Javascript strings (using jQuery). All of them follow the same pattern, starting with 'ajax-', and ending with a name. For instance 'ajax-first', 'ajax-last', 'ajax-email', etc.
How can I make a regex to only grab the string after 'ajax-'?
So instead of 'ajax-email', I want just 'email'.
You don't need RegEx for this. If your prefix is always "ajax-" then you just can do this:
var name = string.substring(5);
Given a comment you made on another user's post, try the following:
var $li = jQuery(this).parents('li').get(0);
var ajaxName = $li.className.match(/(?:^|\s)ajax-(.*?)(?:$|\s)/)[1];
Demo can be found here
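The class-name pattern above throws if the element has no ajax- class, because match() returns null in that case. This is a minimal sketch with a guard; the function name ajaxNameFromClass is hypothetical and it takes the className string directly so it can run without a DOM.

```javascript
// Extract the part after "ajax-" from a space-separated class list,
// or return null when no such class is present.
function ajaxNameFromClass(className) {
  var m = className.match(/(?:^|\s)ajax-(.*?)(?:$|\s)/);
  return m ? m[1] : null;
}

console.log(ajaxNameFromClass('item ajax-first selected'));
```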
Below kept for reference only
var ajaxName = 'ajax-first'.match(/(\w+)$/)[0];
alert(ajaxName);
Use the \w (word) pattern and bind it to the end of the string. This will force a grab of everything past the last hyphen (assuming the value consists of only [upper/lower]case letters, numbers or an underscore).
The non-regex approach could also use the String.split method, coupled with Array.pop.
var parts = 'ajax-first'.split('-');
var ajaxName = parts.pop();
alert(ajaxName);
You can try replacing ajax- with "".
I like the split method @Brad Christie mentions, but I would just do
function getLastPart(str,delimiter) {
return str.split(delimiter)[1];
}
This works if you will always have only two-part strings separated by a hyphen. If you wanted to generalize it for any particular piece of a multiple-hyphenated string, you would need to write a more involved function that included an index, but then you'd have to check for out of bounds errors, etc.
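The more involved function described above might look like the following sketch. The name getPart, the support for negative indexes and the choice to return undefined when out of bounds are all assumptions made for illustration.

```javascript
// Return the piece at the given index after splitting on the delimiter.
// A negative index counts from the end; out-of-range yields undefined.
function getPart(str, delimiter, index) {
  var parts = String(str).split(delimiter);
  var i = index < 0 ? parts.length + index : index;
  return i >= 0 && i < parts.length ? parts[i] : undefined;
}

console.log(getPart('ajax-first', '-', 1), getPart('ajax-first-name', '-', -1));
```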

jquery / javascript: regex to replace instances of an html tag

I'm trying to take some parsed XML data, search it for instances of the font tag, and replace that tag (and anything that may be inside the font tag) with a simple span tag.
This is how I've been doing my regexes:
var emailReg = /^([\w-\.]+@([\w-]+\.)+[\w-]{2,4})?$/; //Test against valid email
console.log('regex: ' + emailReg.test(testString));
and so I figured the font regex would be something like this:
var fontReg = /'<'+font+'[^><]*>|<.'+font+'[^><]*>','g'/;
console.log('regex: ' + fontReg.test(testString));
but that isn't working. Anyone know a way to do this? Or what I might be doing wrong in the code above?
I think namuol's answer will serve you better than any RegExp-based solution, but I also think the RegExp deserves some explanation.
JavaScript doesn't allow for interpolation of variable values in RegExp literals.
The quotations become literal character matches and the addition operators become 1-or-more quantifiers. So, your current regex becomes capable of matching these:
# left of the pipe `|`
'<'font'>
'<''''fontttt'>
# right of the pipe `|`
<#'font'>','g'
<#''''fontttttt'>','g'
But, it will not match these:
<font>
</font>
To inject a variable value into a RegExp, you'll need to use the constructor and string concat:
var fontReg = new RegExp('<' + font + '[^><]*>|<.' + font + '[^><]*>', 'g');
On the other hand, if you meant for literal font, then you just needed:
var fontReg = /<font[^><]*>|<.font[^><]*>/g;
Also, each of those can be shortened by using .?, allowing the halves to be combined:
var fontReg = new RegExp('<.?' + font + '[^><]*>', 'g');
var fontReg = /<.?font[^><]*>/g;
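As a rough sketch of how the constructor form could be used for the actual replacement on a string: the function name replaceTag is hypothetical, the tag name is assumed to be a plain word with no regex metacharacters, and attributes on the opening tag are dropped. The regex caveats about parsing markup still apply.

```javascript
// Replace <fromTag ...> and </fromTag> with <toTag> and </toTag>.
function replaceTag(markup, fromTag, toTag) {
  // \b stops "font" from also matching longer names such as "fontface".
  var open = new RegExp('<' + fromTag + '\\b[^><]*>', 'gi');
  var close = new RegExp('</' + fromTag + '\\s*>', 'gi');
  return markup
    .replace(open, '<' + toTag + '>')
    .replace(close, '</' + toTag + '>');
}

var font = 'font';
var testString = '<p><font color="red">hi</font></p>';
console.log(replaceTag(testString, font, 'span'));

// The combined pattern from above finds both the opening and closing tag.
var bothTags = testString.match(new RegExp('<.?' + font + '[^><]*>', 'g'));
```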
If I understand your problem correctly, this should replace all font tags with simple span tags using jQuery:
$('font').replaceWith(function () {
return $('<span>').append($(this).contents());
});
Here's a working fiddle: http://jsfiddle.net/RhLmk/2/

Regexp for matching numbers and units in an HTML fragment?

I'm trying to make a regexp that will match numbers, excluding numbers that are part of other words or numbers inside certain html tags. The part for matching numbers works well but I can't figure out how to find the numbers inside the html.
Current code:
//number regexp part
var prefix = '\\b()';//for future use
var baseNumber = '((\\+|-)?([\\d,]+)(?:(\\.)(\\d+))?)';
var SIBaseUnit = 'm|kg|s|A|K|mol|cd';
var SIPrefix = 'Y|Z|E|P|T|G|M|k|h|ia|d|c|m|µ|n|p|f|a|z|y';
var SIUnit = '(?:('+SIPrefix+')?('+SIBaseUnit+'))';
var generalSuffix = '(PM|AM|pm|am|in|ft)';
var suffix = '('+SIUnit+'|'+generalSuffix+')?\\b';
var number = '(' + prefix + baseNumber + suffix + ')';
//trying to make it match only when not within tags or inside excluded tags
var htmlBlackList = 'script|style|head'
var htmlStartTag = '<[^(' + htmlBlackList + ')]\\b[^>]*?>';
var reDecimal = new RegExp(htmlStartTag + '[^<]*?' + number + '[^>]*?<');
<script>
var htmlFragment = "<script>alert('hi')</script>";
var style = "<style>.foo { font-size: 14pt }</style>";
// ...
</script>
<!-- turn off this style for now
<style> ... </style>
-->
Good luck getting a regular expression to figure that out.
You're using JavaScript, so I'm guessing you're probably running in a browser. Which means you have access to the DOM, giving you access to the browser's very capable HTML parser. Use it.
The [^...] negated character class only works on single characters, not on compound expressions like (script|style|head). What you want is a negative lookahead, (?!...):
var htmlStartTag = '<(?!(' + htmlBlackList + ')\\b)[^>]*?>';
(?! ... ) means 'not followed by ...' but [^ ... ] means 'a single character not in ...'.
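The difference is easy to see by testing both forms against a few start tags. This is a sketch: the blacklist comes from the question, while the variable names and the ^ anchors are added here for illustration.

```javascript
var htmlBlackList = 'script|style|head';

// Character-class version from the question: rejects any tag whose first
// letter is one of the characters ( s c r i p t | y l e h a d ).
var charClassTag = new RegExp('^<[^(' + htmlBlackList + ')]\\b[^>]*?>');

// Negative-lookahead version: rejects only the listed tag names.
var lookaheadTag = new RegExp('^<(?!(' + htmlBlackList + ')\\b)[^>]*?>');

console.log(charClassTag.test('<p>'), lookaheadTag.test('<p>'));
console.log(lookaheadTag.test('<script>'), lookaheadTag.test('<header>'));
```

The character-class version wrongly rejects p because the letter p appears in "script", whereas the lookahead version accepts it and still rejects script, style and head.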
I'm trying to make a regexp that will match numbers, excluding numbers that are part of other words or numbers inside certain html tags.
Regex cannot parse HTML. Do not use regex to parse HTML. Do not pass Go. Do not collect £200.
To ‘only match something not-within something else’ you would need a negative lookbehind assertion (“(?<!”), but JavaScript Regexps do not support lookbehind, and most other regex implementations don't support the complex variable-length lookbehind you'd need to have any hope of matching a context like being inside a tag. Even if you did have variable-length lookbehind, that'd still not reliably parse HTML, because as previously mentioned many times every day, regex cannot parse HTML.
Use an HTML parser. A browser HTML parser will be able to digest even partial input without complaining.
