JavaScript: Find nested [quote] - javascript

I want to do client-side form validation with JavaScript (jQuery is also used). The goal is to remove nested bbCode [quote] tags deeper than level 2. Say we have this text:
[quote=SoundMAX][quote=Laplundik][quote=SoundMAX]
blahblahblah[/quote]
blahblah
[/quote]
blah[/quote]
And get this:
[quote=SoundMAX][quote=Laplundik]
blahblah
[/quote]
blah[/quote]
My only idea is to .replace [quote] with <div>, then create a DOM object, remove anything deeper than level 2 with jQuery, and convert everything back to bbCode. But that solution seems too complicated. Is there a more elegant one?
EDIT:
Thanks for the nice solutions. Based on darioo's answer, I did this:
var text=$('#edit-privatemsgbody').val();
var tmp=[];
var level=0;
for (var i=0,l=text.length;i<l;i++){
if(text[i]=='['&&text[i+1]=='q') level++;
if(text[i-6]=='q'&&text[i-7]=='/'&&text[i-8]=='[') level--;
if(level<3) tmp.push(text[i]);
}
alert(tmp.join(''));
Which works just fine.
But idealmachine's solution was like a flash. I didn't know about replace callback function parameters before, now that is handy! I'll settle with it.

Actually, you can use regex if you look at it as a limited tool that cannot handle the nesting itself. The .replace string method can call a function to find the replacement text for each match, which allows us to track how deep we are in the markup structure (code also posted at http://jsfiddle.net/Zbgr3/3/):
var quoteLevel = 0;
alert(s.replace(/\[(\/?)quote[^\]]*\]|./gi, function(tag, slash) {
// Opening tag?
if(tag.length > 1 && !slash.length) quoteLevel += 1;
// What to strip
var strip = quoteLevel > 2;
// Closing tag?
if(tag.length > 1 && slash.length) quoteLevel -= 1;
if(strip) return '';
return tag;
}));
If you want some tolerance for errors in the markup, you could add some extra code that, for example, prevents quoteLevel from falling below zero.
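As a minimal sketch of that error tolerance, the callback can be wrapped in a function that clamps the level at zero. The names stripDeepQuotes and maxDepth are hypothetical and not part of the original answer, and the pattern uses [\s\S] instead of . so that line breaks inside a stripped quote are removed as well:

```javascript
// Hypothetical helper: stripDeepQuotes and maxDepth are illustrative names.
function stripDeepQuotes(text, maxDepth) {
  var quoteLevel = 0;
  // Match either a [quote...] / [/quote] tag or any single character.
  return text.replace(/\[(\/?)quote[^\]]*\]|[\s\S]/gi, function (token, slash) {
    var isTag = token.length > 1;
    // Opening tag: go one level deeper before deciding what to strip.
    if (isTag && !slash) quoteLevel += 1;
    var strip = quoteLevel > maxDepth;
    // Closing tag: go one level up, but never below zero.
    if (isTag && slash) quoteLevel = Math.max(0, quoteLevel - 1);
    return strip ? '' : token;
  });
}
```

Calling stripDeepQuotes(text, 2) on the sample from the question should leave only the two outer quote levels, and a stray [/quote] no longer pushes the counter negative.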

Use a regular array as a stack. Every time you encounter [quote], increase your array by one using its push() method. When you encounter [/quote], decrease your array by one using its pop() method.
If you encounter [quote] and your array length is 2, remove that [quote], and remove the next [/quote] you encounter.
If you don't have the same number of open and closed quotes, then you'll have to handle that in a way you find appropriate.
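The stack idea could be sketched roughly as follows. This is one possible reading, not the answerer's code: limitQuoteDepth and maxDepth are hypothetical names, the sketch drops the content of over-deep quotes along with their tags (which is what the question asks for), and it silently discards a closing tag that has no matching opening tag.

```javascript
// Hypothetical helper: limitQuoteDepth and maxDepth are illustrative names.
function limitQuoteDepth(text, maxDepth) {
  var tagPattern = /\[quote[^\]]*\]|\[\/quote\]/gi;
  var stack = [];      // one entry per currently open [quote]
  var output = '';
  var lastIndex = 0;
  var match;
  while ((match = tagPattern.exec(text)) !== null) {
    var tag = match[0];
    var isClosing = tag.charAt(1) === '/';
    // Text between tags is kept only while we are within the allowed depth.
    if (stack.length <= maxDepth) output += text.slice(lastIndex, match.index);
    if (!isClosing) {
      stack.push(tag);
      if (stack.length <= maxDepth) output += tag;
    } else if (stack.length > 0) {
      if (stack.length <= maxDepth) output += tag;
      stack.pop();
    }
    // A closing tag with an empty stack is unmatched and is dropped here.
    lastIndex = tagPattern.lastIndex;
  }
  // Keep the trailing text unless an over-deep quote was left open.
  if (stack.length <= maxDepth) output += text.slice(lastIndex);
  return output;
}
```

Dropping unmatched closing tags is only one way to handle unbalanced markup; keeping them verbatim would be an equally defensible choice.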

Related

How to make indexOf only match 'hi' as a match and not 'hirandomstuffhere'?

Basically, I was playing around with a Steam bot some time ago and made it auto-reply when you said things in an array, i.e. a 'hello-triggers' array, which would contain things like "hi", "hello" and such. I made it so that whenever it received a message, it would check for matches using indexOf(), and everything worked fine until I noticed it would treat 'hiasodkaso' or 'hidemyass' as a "hi" trigger.
So it would match anything that contained the word even if it was in the middle of a word.
How would I go about making indexOf only notice it if it's the exact word, and not something else in the same word?
I do not have the script that I use but I will make an example that is pretty much like it:
var hiTriggers = ['hi', 'hello', 'yo'];
// here goes the receiving message function and what not, then:
for (var i = 0; i < hiTriggers.length; i++) {
if (message.indexOf(hiTriggers[i]) >= 0) {
// randomHelloMsg is already defined; pick a random entry from it
bot.sendMessage(SteamID, randomHelloMsg[Math.floor(Math.random() * randomHelloMsg.length)]);
}
}
Regex wouldn't be used for this, right? As it is to be used for expressions or whatever. (my English isn't awesome, ikr)
Thanks in advance. If I wasn't clear enough on something, please let me know and I'll edit/formulate it in another way! :)
You can extend prototype:
String.prototype.regexIndexOf = function(regex, startpos) {
var indexOf = this.substring(startpos || 0).search(regex);
return (indexOf >= 0) ? (indexOf + (startpos || 0)) : indexOf;
}
and do:
var foo = "hia hi hello";
foo.regexIndexOf(/hi\b/);
Or if you don't want to extend the string object:
foo.substr(i).search(/hi\b/);
Both examples were taken from the top answers of Is there a version of JavaScript's String.indexOf() that allows for regular expressions?
Regex wouldn't be used for this, right? As it is to be used for expressions or whatever. (my English isn't awesome, ikr)
Actually, regex is for any old pattern matching. It's absolutely useful for this.
fmsf's answer should work for what you're trying to do. However, extending native objects' prototypes is generally frowned upon, AFAIK. You can easily break libraries by doing so, so I'd avoid it when possible. In this case you could turn his regexIndexOf into a standalone function, or use something like:
//takes a string and a word and searches for the word on word boundaries
function regexIndexWord(str, word){
// build a real RegExp; "\\b" is needed because "\b" in a string literal is a backspace
return str.search(new RegExp("\\b"+word+"\\b"));
}
Which would let you search based on your array of words without having to add the special symbols to each one.
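Applied to the trigger array from the question, whole-word matching could look like the sketch below. The names escapeRegExp and containsTrigger are hypothetical; the escaping step is an addition so that triggers containing regex metacharacters are treated literally, and \b only behaves as expected for triggers that start and end with a letter, digit or underscore.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: returns true if any trigger occurs as a whole word.
function containsTrigger(message, triggers) {
  for (var i = 0; i < triggers.length; i++) {
    // \b on both sides rejects matches in the middle of a longer word.
    var pattern = new RegExp('\\b' + escapeRegExp(triggers[i]) + '\\b', 'i');
    if (pattern.test(message)) return true;
  }
  return false;
}
```

With this, containsTrigger(message, hiTriggers) could replace the indexOf check in the loop from the question.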

Splitting a long phrase into an array

I need to take the phrase
It’s that time of year when you clean out your closets, dust off shelves, and spruce up your floors. Once you’ve taken care of the dust and dirt, what about some digital cleaning? Going through all your files and computers may seem like a daunting task, but we found ways to make the process fairly painless.
and upon pressing a button
split it into an array
iterate over that array at each step
Build SPAN elements as you go, along with the attributes
Add the SPAN elements to the original DIV
Add a click handler to the SPAN elements, or to the DIV, which causes the style on the SPAN to change on mouseover.
So far I had
function splitString(stringToSplit, separator) {
var arrayOfStrings = stringToSplit.split(separator);
print('The original string is: "' + stringToSplit + '"');
print('The separator is: "' + separator + '"');
print("The array has " + arrayOfStrings.length + " elements: ");
for (var i=0; i < arrayOfStrings.length; i++)
print(arrayOfStrings[i] + " / ");
}
var space = " ";
var comma = ",";
splitString(tempestString, space);
splitString(tempestString);
splitString(monthString, comma);
for (var i=0; i < myArray.length; i++)
{
}
var yourSpan = document.createElement('span');
yourSpan.innerHTML = "Hello";
var yourDiv = document.getElementById('divId');
yourDiv.appendChild(yourSpan);
yourSpan.onmouseover = function () {
alert("On MouseOver");
}
and for html I have
<p>The DIV that will serve as your input (and output) is here, with
id="transcriptText":</p>
<div id="transcriptText"> It’s that time of year when you clean out your
closets, dust off shelves, and spruce up your floors. Once you’ve taken
care of the dust and dirt, what about some digital cleaning? Going
through all your files and computers may seem like a daunting task, but
we found ways to make the process fairly painless.</div>
<br>
<div id="divideTranscript" class="button"> Transform the
Transcript! </div>
Any help on how to move on? I have been stuck for quite some time.
Well, first off this looks like homework.
That said, I'll try to help without giving you the actual code, since we're not supposed to give actual working solutions to homework. You're splitting the string too many times (once is all that's needed based on the instructions you gave) and you have to actually store the result of the split call somewhere that your other code can use it.
Your instructions say to add attributes to the span, but not which attributes nor what their contents should be.
Your function should follow the instructions:
1) Split the string. Since it doesn't specify on what, I'd assume words. So split it on spaces only and leave the punctuation where it is.
2) with the array of words returned from the split() function, iterate over it like you attempt to, but inside the braces that scope the loop is where you want to concatenate the <span> starting and ending tags around the original word.
3) use the document.createElement() to make that current span into a DOM element. Attach the mouseover and click handlers to it, then appendChild() it to the div.
4) Add the handler to your button to call the above function.
Note that it's possibly more efficient to assign to the innerHTML property to insert all the spans at once, but then you have to loop again to add the hover/click handlers.

How to search DOM to count the number of $ symbol found on a product page?

I am looking for the best possible way to find how many $ symbols are on a page. Is there a better method than reading document.body.innerHTML and counting how many $ characters it contains?
Your question can be split into two parts:
How can we get the webpage's text content without HTML tags?
How can we count the '$' symbols in that text? We can generalize this second part a bit: how can we find the number of occurrences of one string in another string?
And the 'best possible way to do this':
Amaan got the idea right of finding the text, but let's take it further.
var text = document.body.innerText || document.body.textContent;
Adding textContent to the code helps us cover more browsers, since innerText is not supported by all of them.
The second part is a bit trickier. It all depends on the number of '$' symbol occurrences on the page.
For example, if we know for sure, that there is at least one occurrence of the symbol on the page we would use this code:
text.match(/\$/g).length;
Which performs a global regular expression match on the given string and counts the length of the returned array. It's pretty fast and concise.
On the other hand, if we're not sure if the symbol appears on the page at least once, we should modify the code to look like this:
var match = text.match(/\$/g); // declared with var to avoid an implicit global
if (match) {
match.length;
}
This just checks the value returned by the match function and if it's null, does nothing.
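The null check can also be folded into a small function that always returns a number. This is a minimal sketch; countDollarSigns is a hypothetical name, not something from the original answer:

```javascript
// Hypothetical helper: counts '$' characters, returning 0 when there are none.
function countDollarSigns(text) {
  var matches = text.match(/\$/g); // null when the text contains no '$'
  return matches ? matches.length : 0;
}
```

In a browser it would be called with the page text obtained earlier, for example countDollarSigns(text).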
I would recommend using the third option only when there is a large occurrence of the symbols in the page or you're going to perform the search many many times. This is a custom function (taken from here) to count the occurrence of the specified string in another string. It performs better than the other two, but is longer and harder to understand.
var occurrences = function(string, subString, allowOverlapping) {
string += "";
subString += "";
if (subString.length <= 0) return string.length + 1;
var n = 0,
pos = 0;
var step = (allowOverlapping) ? (1) : (subString.length);
while (true) {
pos = string.indexOf(subString, pos);
if (pos >= 0) {
n++;
pos += step;
} else break;
}
return (n);
};
occurrences(text, '$');
I'm also including a little jsfiddle 'benchmark' so you can compare these three different approaches yourself.
Also: No, there isn't a better way of doing this than just getting the body text and counting how many '$' symbols there are.
You should probably use document.body.innerText or document.body.textContent so that the HTML markup doesn't give you false positives.
Something like this should work:
document.body.innerText.match(/\$/g).length;
An alternate way I can think of, would be to use window.find like this:
var len = 0;
while(window.find('$') === true){
len++;
}
(This may be unreliable because it depends on where the user clicked last. It will work fine if you do it onload, before any user interaction.)

Regex to search html return, but not actual html jQuery

I'm making a highlighting plugin for a client to find things in a page, and I decided to test it with a help viewer I'm still building, but I'm having an issue that will (probably) require some regex.
I do not want to parse HTML, and I'm totally open to doing this differently; this just seems like the best/right way.
http://oscargodson.com/labs/help-viewer
http://oscargodson.com/labs/help-viewer/js/jquery.jhighlight.js
Type something in the search box. OK, now refresh the page and type something like class, class=" or <a and you'll notice it searches the actual HTML (as expected). How can I search only the text?
If I do .text() it'll vaporize all the HTML and what I get back will just be a big blob of text, but I still want the HTML so I don't lose formatting, links, images, etc. I want this to work like CMD/CTRL+F.
You'd use this plugin like:
$('article').jhighlight({find:'class'});
To remove them:
.jhighlight('remove')
==UPDATE==
While Mike Samuel's idea below does in fact work, it's a tad heavy for this plugin. It's mainly for a client looking to erase bad words and/or MS Word characters during a "publishing" process of a form. I'm looking for a more lightweight fix, any ideas?
You really don't want to use eval, mess with innerHTML or parse the markup "manually". The best way, in my opinion, is to deal with text nodes directly and keep a cache of the original html to erase the highlights. Quick rewrite, with comments:
(function($){
$.fn.jhighlight = function(opt) {
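var isRemove = (opt === 'remove')
, options = $.extend({}, $.fn.jhighlight.defaults, isRemove ? {} : opt) // copy, so the shared defaults are never mutated
, txtProp = this[0].textContent ? 'textContent' : 'innerText';
if (!isRemove && $.trim(options.find).length < 1) return this;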
return this.each(function(){
var self = $(this);
// use a cache to clear the highlights
if (!self.data('htmlCache'))
self.data('htmlCache', self.html());
if(opt === 'remove'){
return self.html( self.data('htmlCache') );
}
// create Tree Walker
// https://developer.mozilla.org/en/DOM/treeWalker
var walker = document.createTreeWalker(
this, // walk only on target element
NodeFilter.SHOW_TEXT,
null,
false
);
var node
, matches
, flags = 'g' + (!options.caseSensitive ? 'i' : '')
, exp = new RegExp('('+options.find+')', flags) // capturing
, expSplit = new RegExp(options.find, flags) // no capturing
, highlights = [];
// walk this wayy
// and save matched nodes for later
while(node = walker.nextNode()){
if (matches = node.nodeValue.match(exp)){
highlights.push([node, matches]);
}
}
// must replace stuff after the walker is finished
// otherwise replacing a node will halt the walker
for(var nn=0,hln=highlights.length; nn<hln; nn++){
var node = highlights[nn][0]
, matches = highlights[nn][1]
, parts = node.nodeValue.split(expSplit) // split on matches
, frag = document.createDocumentFragment(); // temporary holder
// add text + highlighted parts in between
// like a .join() but with elements :)
for(var i=0,ln=parts.length; i<ln; i++){
// non-highlighted text
if (parts[i].length)
frag.appendChild(document.createTextNode(parts[i]));
// highlighted text
// skip last iteration
if (i < ln-1){
var h = document.createElement('span');
h.className = options.className;
h[txtProp] = matches[i];
frag.appendChild(h);
}
}
// replace the original text node
node.parentNode.replaceChild(frag, node);
};
});
};
$.fn.jhighlight.defaults = {
find:'',
className:'jhighlight',
color:'#FFF77B',
caseSensitive:false,
wrappingTag:'span'
};
})(jQuery);
If you're doing any manipulation on the page, you might want to replace the caching with another clean-up mechanism, not trivial though.
You can see the code working here: http://jsbin.com/anace5/2/
You also need to add display:block to your new HTML elements; without it, the layout is broken in a few browsers.
In the javascript code prettifier, I had this problem. I wanted to search the text but preserve tags.
What I did was start with HTML, and decompose that into two bits.
The text content
Pairs of (index into text content where a tag occurs, the tag content)
So given
Lorem <b>ipsum</b>
I end up with
text = 'Lorem ipsum'
tags = [6, '<b>', 11, '</b>']
which allows me to search on the text, and then based on the result start and end indices, produce HTML including only the tags (and only balanced tags) in that range.
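A rough sketch of that decomposition step is shown below. It is not the prettifier's actual code: decompose is a hypothetical name, and the naive tag pattern ignores comments, entities and attribute values that contain ">".

```javascript
// Hypothetical helper: splits HTML into plain text plus (index, tag) pairs.
function decompose(html) {
  var tagPattern = /<[^>]+>/g; // naive: assumes no '>' inside attribute values
  var text = '';
  var tags = [];
  var lastIndex = 0;
  var match;
  while ((match = tagPattern.exec(html)) !== null) {
    // Append the text that precedes this tag.
    text += html.slice(lastIndex, match.index);
    // Record where in the text the tag occurs, then the tag itself.
    tags.push(text.length, match[0]);
    lastIndex = tagPattern.lastIndex;
  }
  text += html.slice(lastIndex);
  return { text: text, tags: tags };
}
```

For 'Lorem <b>ipsum</b>' this yields the text 'Lorem ipsum', with '<b>' recorded at index 6 and '</b>' at index 11 (the end of the text).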
Have a look here: getElementsByTagName() equivalent for textNodes.
You can probably adapt one of the proposed solutions to your needs (i.e. iterate over all text nodes, replacing the words as you go - this won't work in cases such as <tag>wo</tag>rd but it's better than nothing, I guess).
I believe you could just do:
$('#article :not(:has(*))').jhighlight({find : 'class'});
Since it grabs all leaf nodes in the article, it would require valid XHTML. That is, it would only match "link" in the following example:
<p>This is some paragraph content with a <a>link</a></p>
DOM traversal / selector application could slow things down a bit so it might be good to do:
var article_nodes = article_nodes || $('#article :not(:has(*))'); // cache the selection
article_nodes.jhighlight({find : 'class'});
Maybe something like this could be helpful:
>+[^<]*?(s(<[\s\S]*?>)?e(<[\s\S]*?>)?e)[^>]*?<+
The first part >+[^<]*? finds > of the last preceding tag
The third part [^>]*?<+ finds < of the first subsequent tag
In the middle we have (<[\s\S]*?>)? between characters of our search phrase (in this case - "see").
After regular expression searching you could use the result of the middle part to highlight search phrase for user.

Javascript Regex and getElementByID

I'm trying to search for all elements in a web page with a certain regex pattern.
I'm failing to understand how to utilize Javascript's regex object for this task. My plan was to collect all elements with a jQuery selector
$('div[id*="Prefix_"]');
Then further match the element ID in the collection with this
var pattern = /Prefix_/ + [0 - 9]+ + /_Suffix$/;
//Then somehow match it.
//If successful, modify the element in some way, then move onto next element.
An example ID would be "Prefix_25412_Suffix". Only the 5 digit number changes.
This looks terrible and probably doesn't work:
1) I'm not sure if I can store all of what jQuery's returned into a collection and then iterate through it. Is this possible?? If I could I could proceed with step two. But then...
2) What function would I be using for step 2? The regex examples all use String.match method. I don't believe something like element.id.match(); is valid?
Is there an elegant way to run through the elements identified with a specific regex and work with them?
Something in the vein of C#
foreach (element e in ElementsCollectedFromIDRegexMatch) { //do stuff }
Just use the "filter" function:
$('div[id*=Prefix_]').filter(function() {
return /^Prefix_\d+_Suffix$/.test(this.id);
}).each(function() {
// whatever you need to do here
// "this" will refer to each element to be processed
});
Using what jQuery returns as a collection and iterating through it is, in fact, the fundamental point of the whole library, so yes you can do that.
edit — a comment makes me realize that the initial selector with the "id" test is probably not useful; you could just operate on all the <div> elements on the page to start with, and let your own filtering pluck out the ones you really want.
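The ID test itself does not depend on jQuery. An element's id is an ordinary string, so any string or RegExp method works on it. A minimal sketch on plain strings, where matchingIds is a hypothetical helper name:

```javascript
// Anchored pattern: the whole ID must be Prefix_<digits>_Suffix.
var idPattern = /^Prefix_\d+_Suffix$/;

// Hypothetical helper: keeps only the IDs that match the pattern.
function matchingIds(ids) {
  return ids.filter(function (id) {
    return idPattern.test(id);
  });
}
```

The ^ and $ anchors matter here: without them, an ID such as "xPrefix_1_Suffix_old" would also pass the test.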
You can use filter function. i.e:
$('div[id*="Prefix_"]').filter(function(){
return this.id.match(/Prefix_\d+_Suffix/);
});
You could do something like
$('div[id*="Prefix_"]').each(function(){
if($(this).attr('id').search(/do your regex here/) != -1) {
//change the dom element here
}
});
You could try using the filter method, to do something like this...
var pattern = /^Prefix_[0-9]+_Suffix$/;
$('div[id*="Prefix_"]').filter(function(index)
{
return $(this).attr("id").search(pattern) != -1;
}
);
... and return a jQuery collection that contains all (if any) of the elements which match your spec.
Can't be sure of the exact syntax, off the top of my head, but this should at least point you in the right direction
