Get all html between two elements - javascript

Problem:
Extract all HTML between two headers, including the headers' own HTML. The header text is known, but not the formatting, tag name, etc. The headers are not within the same parent, and the content between them might (well, almost certainly will) have nested children of its own.
To clarify: headers could be inside a <h1> or <div> or any other tag. They may also be surrounded by <b>, <i>, <font> or more <div> tags. The key is: the only text within the element is the header text.
The tools I have available are: C# 3.0 utilizing a WebBrowser control, or jQuery/JS.
I've taken the jQuery route, traversing the DOM, but I've run into the issue of children and adding them appropriately. Here is the code so far:
function getAllBetween(firstEl,lastEl) {
var collection = new Array(); // Collection of Elements
var fefound =false;
$('body').find('*').each(function(){
var curEl = $(this);
if($(curEl).text() == firstEl)
fefound=true;
if($(curEl).text() == lastEl)
return false;
// need something to add children children
// otherwise we get <table></table><tbody></tbody><tr></tr> etc
if (fefound)
collection.push(curEl);
});
var div = document.createElement("DIV");
for (var i=0,len=collection.length;i<len;i++){
$(div).append(collection[i]);
}
return($(div).html());
}
Should I continue down this road, with some sort of recursive function checking/handling children, or would a whole new approach be better suited?
For the sake of testing, here is some sample markup:
<body>
<div>
<div>Start</div>
<table><tbody><tr><td>Oops</td></tr></tbody></table>
</div>
<div>
<div>End</div>
</div>
</body>
Any suggestions or thoughts are greatly appreciated!

My thought is a regex, something along the lines of
.*<(?<tag>.+)>Start</\1>(?<found_data>.+)<\1>End</\1>.*
should get you everything between the Start and end div tags.
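A rough JavaScript version of the same idea, assuming the page HTML is already available as a string. The helper names escapeRegExp and htmlBetweenHeaders are made up for this sketch, and like the pattern above it assumes both headers use the same tag name. Regular expressions are fragile on real-world HTML, and the result is a raw slice of markup that may contain unbalanced tags.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: find the first element whose only content is startText
// and the next element with the same tag name whose only content is endText.
// Returns the matched slice (headers included) and the part in between,
// or null when no such pair is found.
function htmlBetweenHeaders(html, startText, endText) {
  const pattern = new RegExp(
    "<(?<tag>[a-z][a-z0-9]*)[^>]*>" + escapeRegExp(startText) + "</\\k<tag>>" +
    "(?<between>[\\s\\S]*?)" + // lazy, and [\s\S] also matches newlines
    "<\\k<tag>[^>]*>" + escapeRegExp(endText) + "</\\k<tag>>",
    "i"
  );
  const match = pattern.exec(html);
  if (!match) return null;
  return { outer: match[0], between: match.groups.between };
}
```

With the sample markup from the question, outer starts at <div>Start</div> and ends at <div>End</div>, so the closing </div> of the first wrapper and the opening <div> of the second end up inside the slice.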

Here's an idea:
$(function() {
// Get the parent div start is in:
var $elie = $("div:contains(Start)").eq(0), htmlArr = [];
// Push HTML of that div to the HTML array
htmlArr.push($('<div>').append( $elie.clone() ).html());
// Keep moving along and adding to array until we hit END
// (or run out of siblings, so the loop cannot spin forever)
while($elie.next().length && $elie.find("div:contains(End)").length != 1) {
$elie = $elie.next();
htmlArr.push($('<div>').append( $elie.clone() ).html());
};
// htmlArr now has the HTML
// let's see what it is:
alert(htmlArr.join(""));
});
Try it out with this jsFiddle example
This takes the entire parent div that Start is in. I'm not sure that's what you want, though. The outerHTML is produced by $('<div>').append( element.clone() ).html(), since outerHTML support is not cross-browser yet. All the HTML is stored in an array; you could also just store the elements in the array.

Related

Find the tag JavaScript is running in

I generate the HTML source on the backend from separate, independent widgets.
I simply include pieces of markup like this in the resulting HTML output:
<div>
I want to work with this DOM element
<script>
new Obj(/*but I can't get this <div> as a parameter! */);
</script>
</div>
I'm looking for a way to find the DOM element in which the Obj is created (without any unique IDs). This would add flexibility to my app and speed up development. But is that technically possible in JavaScript?
You could seed an element in there, get its parent, and then remove the element.
<div>
I want to work with this DOM element
<script>
document.write("<div id='UniqueGUID_3477zZ7786_' style='display:none;'></div>");
var thatDivYouWanted;
(function(){
var target = document.getElementById("UniqueGUID_3477zZ7786_");
thatDivYouWanted = target.parentNode;
target.parentNode.removeChild(target);
})();
new Obj(thatDivYouWanted); // pass the <div> found above
</script>
</div>
The following code works:
<script>
function Obj(color) {
var scriptTags = document.getElementsByTagName("script");
var scriptTag = scriptTags[scriptTags.length - 1];
// find parent or do whatsoever
var divTag = scriptTag.parentNode;
divTag.style.backgroundColor = color;
}
</script>
<div>
I want to work with this DOM element
<script>new Obj("green");</script>
</div>
<div>
I want to work with this DOM element
<script>new Obj("yellow");</script>
</div>
<div>
I want to work with this DOM element
<script>new Obj("lime");</script>
</div>
This method has very simple code and has almost zero impact on performance.
Note: I am pretty sure this won't work in IE6 (as far as I remember, it does not support manipulating open tags).
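In current browsers the running script can also be read from document.currentScript, which avoids relying on "the last script tag so far". A minimal sketch, assuming a classic (non-module) script that runs synchronously during parsing; the helper name containerOfRunningScript and its doc parameter are made up for illustration:

```javascript
// Hypothetical helper: return the element that contains the <script>
// currently being executed, or null if it cannot be determined.
// `doc` defaults to the global document when one exists.
function containerOfRunningScript(
  doc = typeof document !== "undefined" ? document : null
) {
  if (!doc) return null;
  // document.currentScript is set while a classic script is executing.
  let script = doc.currentScript;
  if (!script) {
    // Fallback from the answer above: during parsing, the last <script>
    // in the document so far is the one being executed.
    const scripts = doc.getElementsByTagName("script");
    script = scripts.length ? scripts[scripts.length - 1] : null;
  }
  return script ? script.parentNode : null;
}
```

Inside the widget markup this could be called as new Obj(containerOfRunningScript()). It has to be called while the script is first running; inside a later callback or event handler document.currentScript is null.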
I believe your approach is not ideal. If you're trying to obtain the <div>, it should be done programmatically in a conventional way, using JavaScript and the APIs that let you query for the target <div>.
Instead of executing inline, you can execute in a separate scope in a controlled way (DOM ready, then query, then your method). You can target your div by using an ID, CSS class name, or any other CSS selector in JavaScript.
This allows you to do the following pretty much anywhere you want, not inline.
// on dom ready...
var div = document.getElementById('myDiv'), // replace with any other selector method
myObject = new Obj(div); // Obj is the constructor from the question
Need to find your div? https://developer.mozilla.org/en-US/docs/DOM/Document.querySelectorAll
If you know beforehand how the page will be structured, you could use for example:
document.getElementsByTagName("div")[4]
to access the 5th div.

JS - Remove a tag without deleting content

I am wondering if it is possible to remove a tag but leave the content intact. For example, is it possible to remove the SPAN tag but leave the SPAN's content there?
<p>The weather is sure <span>sunny</span> today</p> //original
<p>The weather is sure sunny today</p> //turn it into this
I have tried using replaceWith(), but it turned the HTML into
<p>
"The weather is sure "
"sunny"
" today"
</p>
EDIT : After testing all of your answers, I realized that my code is at fault. The reason why I keep getting three split text nodes is due to the insertion of the SPAN tag. I'll create another question to try to fix my problem.
<p>The weather is sure <span>sunny</span> today</p>
var span=document.getElementsByTagName('span')[0]; // get the span
var pa=span.parentNode;
while(span.firstChild) pa.insertBefore(span.firstChild, span);
pa.removeChild(span);
jQuery has easier ways:
var spans = $('span');
spans.contents().unwrap();
With different selector methods, it is possible to remove deeply nested spans or just direct children spans of an element.
There are several ways to do it. jQuery is the easiest:
//grab and store inner span html
var content = $('p span').html();
//"Re"set inner p html (this keeps only the span's content and drops the rest of the paragraph)
$('p').html(content);
Plain JavaScript can do the same with String.replace on the element's innerHTML. (I don't remember the regex to do the replace in one stroke, but this is the easy way)
paragraphElement.innerHTML = paragraphElement.innerHTML.replace("<span>", "").replace("</span>", "");
It's just three text nodes instead of one. It doesn't make a visible difference does it?
If it's a problem, use the DOM normalize method to combine them:
$(...)[0].normalize();
$(function(){
var newLbl=$("p").clone().find("span").remove().end().html();
alert(newLbl);
});
Example : http://jsfiddle.net/7gWdM/6/
If you're not looking for a jQuery solution, here's something that's a little more lightweight and focused on your scenario.
I created a function called getText() and used it recursively. In short, you get the child nodes of your p element and retrieve all the text nodes within that p node.
Just about everything in the DOM is a node of some sort. From the following links I found that text nodes have a numerical nodeType value of 3; once you identify where your text nodes are, you take their nodeValue and concatenate it into the full text of the paragraph.
https://developer.mozilla.org/en/nodeType
https://developer.mozilla.org/En/DOM/Node.nodeValue
var para = document.getElementById('p1'); // get your paragraph
var texttext = getText(para); // pass the paragraph to the function
para.innerHTML = texttext; // set the paragraph with the new text
function getText(pNode) {
if (pNode.nodeType == 3) return pNode.nodeValue;
var pNodes = pNode.childNodes; // get the child nodes of the passed element
var nLen = pNodes.length; // count how many there are
var text = "";
for (var idx=0; idx < nLen; idx++) { // loop through the child nodes
if (pNodes[idx].nodeType != 3 ) { // if the child node isn't a text node
text += getText(pNodes[idx]); // pass it to the function again and
// concatenate its value to your text string
} else {
text += pNodes[idx].nodeValue; // otherwise concatenate the value of the text
// to the entire text
}
}
return text;
}
I haven't tested this for all scenarios, but it will do for what you're doing at the moment. It's a little more complex than a replace string since you're looking for the text node and not hardcoding to remove specific tags.
Good Luck.
If someone is still looking for that, the complete solution that has worked for me is:
Assuming we have:
<p>hello this is the <span class="highlight">text to unwrap</span></p>
the js is:
// get the parent
var parentElem = $(".highlight").parent();
// replacing with the same contents
$(".highlight").replaceWith(
function() {
return $(this).contents();
}
);
// normalize parent to strip extra text nodes
parentElem.each(function(index, element){
element.normalize();
});
If it’s the only child span inside the parent, you could do something like this:
HTML:
<p class="parent">The weather is sure <span>sunny</span> today</p>
JavaScript:
var parentEl = document.querySelector('.parent'); // declared with var, and not named "parent", to avoid clobbering window.parent
parentEl.innerHTML = parentEl.innerText;
So just replace the HTML of the element with its text.
You can remove the span element and keep its HTML content or inner text intact with jQuery's unwrap() method.
<html>
<head>
<script src="https://code.jquery.com/jquery-1.12.4.min.js"></script>
<script type="text/javascript">
$(document).ready(function(){
$("button").click(function(){
$("p").find("span").contents().unwrap();
});
});
</script>
</head>
<body>
<p>The weather is sure <span style="background-color:blue">sunny</span> today</p>
<button type="button">Remove span</button>
</body>
</html>
You can see an example here: How to remove a tag without deleting its content with jQuery

How to swap two div tags?

I want to swap two HTML div tags entirely, tags and all. I tried the code below, but it does not work.
jQuery('#AllBlock-'+Id).insertAfter('#AllBlock-'+Id.next().next());
How can I swap two div tags entirely?
You have some bracket mismatching in your code. It looks like you might be trying to do this:
jQuery('#AllBlock-'+Id).insertAfter(jQuery('#AllBlock-'+Id).next().next());
Which would take something like:
<div id="AllBlock-5"></div>
<div id="AllBlock-6"></div>
<div id="AllBlock-7"></div>
And, if called with Id 5, turn it into this:
<div id="AllBlock-6"></div>
<div id="AllBlock-7"></div>
<div id="AllBlock-5"></div>
This is because you're taking block 5, and moving it (using insertAfter) to the place after the block that's next().next() (or next-but-one) from itself, which would be block 7.
If you want to always swap #AllBlock-Id with #AllBlock-[Id+2], so they switch places and end up like the following:
<div id="AllBlock-7"></div>
<div id="AllBlock-6"></div>
<div id="AllBlock-5"></div>
You might want to try:
var $block = jQuery('#AllBlock-'+Id);
var $pivot = $block.next();
var $blockToSwap = $pivot.next();
$blockToSwap.insertBefore($pivot);
$block.insertAfter($pivot);
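If the two blocks are not always exactly two apart, a more general swap can be written against the plain DOM. This is only a sketch: swapNodes is a hypothetical helper name, and it assumes neither node contains the other.

```javascript
// Hypothetical helper: exchange the positions of two DOM nodes.
// Works on plain DOM nodes; with jQuery objects, pass $el[0].
function swapNodes(a, b) {
  if (a === b) return;
  const parentA = a.parentNode;
  // Remember what followed `a`. If `b` directly follows `a`, then after the
  // first move `a` itself marks the slot where `b` has to go.
  const afterA = a.nextSibling === b ? a : a.nextSibling;
  b.parentNode.insertBefore(a, b); // put `a` immediately before `b`
  parentA.insertBefore(b, afterA); // put `b` where `a` used to be (null appends)
}
```

For the example above it could be called as swapNodes(jQuery('#AllBlock-5')[0], jQuery('#AllBlock-7')[0]). Because the nodes are moved rather than recreated, event handlers attached to them stay in place.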
You can't do this because you can't concatenate a string and a jQuery object.
Try this:
var div = $('#AllBlock-'+Id);
div.insertAfter(div.next().next());
It should be like this; you should close the bracket after Id:
jQuery('#AllBlock-'+Id).insertAfter(jQuery('#AllBlock-'+Id).next().next());
You'll need to detach the existing dom object first, then re-use it later:
$('#divid').detach().insertAfter('#someotherdivid');
As I understand it, you want to swap a div, when clicked, with the last div. What will you do if it is the last div? Move it to the top?
This solution should solve the problem. Furthermore, you can modify the regex to match the format of your ID. This can probably be made more concise and robust; for example, you could find the last ID in a more sophisticated way, perhaps just by changing the selector. After all, you do not want to go rearranging the footer or something just because it's the last div on the page.
$('div').click(function() {
//set regex
var re = /(^\w+-)(\d+)$/i;
//get attr broken into parts
var str = $(this).attr('id').match(re)[1],
id = $(this).attr('id').match(re)[2];
//get div count and build last id
var lastStr = $('div:last').attr('id').match(re)[1],
lastID = $('div:last').attr('id').match(re)[2];
//if we have any div but the last, swap it with the end
if ( id !== lastID ) {
$(this).insertAfter('#'+lastStr+lastID);
}
//otherwise, move the last one to the top of the stack
else {
$(this).insertBefore('div:first');
} });
Check out this working fiddle: http://jsfiddle.net/sQYhD/
You may also be interested in the jquery-ui library: http://jqueryui.com/demos/sortable/
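To make the regex step in the click handler above easier to adapt, the ID parsing can be pulled out into small functions. A sketch only: parseBlockId and siblingSelector are hypothetical names, and unlike the handler above they return null for elements whose id is missing or does not fit the pattern instead of throwing.

```javascript
// Same pattern as in the click handler: a word prefix ending in "-", then digits.
const BLOCK_ID = /^(\w+-)(\d+)$/;

// Hypothetical helper: split "AllBlock-5" into its prefix and numeric part.
// Returns null for a missing id or one that does not fit the pattern.
function parseBlockId(id) {
  const match = typeof id === "string" ? id.match(BLOCK_ID) : null;
  return match ? { prefix: match[1], number: Number(match[2]) } : null;
}

// Hypothetical helper: build the selector of the block `offset` positions away.
function siblingSelector(id, offset) {
  const parsed = parseBlockId(id);
  return parsed ? "#" + parsed.prefix + (parsed.number + offset) : null;
}
```

For example, siblingSelector("AllBlock-5", 2) yields "#AllBlock-7", which could then be passed to jQuery.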

How to append text to a div element?

I’m using AJAX to append data to a <div> element, where I fill the <div> from JavaScript. How can I append new data to the <div> without losing the previous data found in it?
Try this:
var div = document.getElementById('divID');
div.innerHTML += 'Extra stuff';
Using appendChild:
var theDiv = document.getElementById("<ID_OF_THE_DIV>");
var content = document.createTextNode("<YOUR_CONTENT>");
theDiv.appendChild(content);
Using innerHTML:
This approach will remove all the listeners on the existing elements, as mentioned by @BiAiB, so use caution if you are planning to use this version.
var theDiv = document.getElementById("<ID_OF_THE_DIV>");
theDiv.innerHTML += "<YOUR_CONTENT>";
Beware of innerHTML, you sort of lose something when you use it:
theDiv.innerHTML += 'content';
Is equivalent to:
theDiv.innerHTML = theDiv.innerHTML + 'content';
Which will destroy all nodes inside your div and recreate new ones. All references and listeners to elements inside it will be lost.
If you need to keep them (when you have attached a click handler, for example), you have to append the new contents with the DOM functions (appendChild, insertBefore):
var newNode = document.createElement('div');
newNode.innerHTML = data;
theDiv.appendChild(newNode);
If you want to do it fast and don't want to lose references and listeners use: .insertAdjacentHTML();
"It does not reparse the element it is being used on and thus it does not corrupt the existing elements inside the element. This, and avoiding the extra step of serialization make it much faster than direct innerHTML manipulation."
Supported on all mainline browsers (IE6+, FF8+,All Others and Mobile): http://caniuse.com/#feat=insertadjacenthtml
Example from https://developer.mozilla.org/en-US/docs/Web/API/Element/insertAdjacentHTML
// <div id="one">one</div>
var d1 = document.getElementById('one');
d1.insertAdjacentHTML('afterend', '<div id="two">two</div>');
// At this point, the new structure is:
// <div id="one">one</div><div id="two">two</div>
If you are using jQuery you can use $('#mydiv').append('html content') and it will keep the existing content.
http://api.jquery.com/append/
IE9+ (Vista+) solution, without creating new text nodes:
var div = document.getElementById("divID");
div.textContent += data + " ";
However, this didn't quite do the trick for me since I needed a new line after each message, so my DIV turned into a styled UL with this code:
var li = document.createElement("li");
var text = document.createTextNode(data);
li.appendChild(text);
ul.appendChild(li);
From https://developer.mozilla.org/en-US/docs/Web/API/Node/textContent :
Differences from innerHTML
innerHTML returns the HTML as its name indicates. Quite often, in order to retrieve or write text within an element, people use innerHTML. textContent should be used instead. Because the text is not parsed as HTML, it's likely to have better performance. Moreover, this avoids an XSS attack vector.
Even this will work:
var div = document.getElementById('divID');
div.innerHTML += 'Text to append';
An option that I think is better than any of the ones mentioned so far is Element.insertAdjacentText().
// Example listener on a child element
// Included in this snippet to show that the listener does not get corrupted
document.querySelector('button').addEventListener('click', () => {
console.log('click');
});
// to actually insert the text:
document.querySelector('div').insertAdjacentText('beforeend', 'more text');
<div>
<button>click</button>
</div>
Advantages to this approach include:
Does not modify the existing nodes in the DOM; does not corrupt event listeners
Inserts text, not HTML (Best to only use .insertAdjacentHTML when deliberately inserting HTML - using it unnecessarily is less semantically appropriate and can increase the risk of XSS)
Flexible; the first argument to .insertAdjacentText may be beforebegin, beforeend, afterbegin, afterend, depending on where you'd like the text to be inserted
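To illustrate the XSS point: if untrusted text does have to pass through an HTML-parsing API such as insertAdjacentHTML (for instance because it is wrapped in markup), it needs escaping first. A minimal sketch; escapeHtml and HTML_ESCAPES are hypothetical names, not browser built-ins:

```javascript
// Characters that are significant in HTML text and attribute values.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;"
};

// Hypothetical helper: make arbitrary text safe to embed in an HTML string.
// A single pass over the input avoids escaping the same character twice.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, function (ch) {
    return HTML_ESCAPES[ch];
  });
}
```

It might be used as div.insertAdjacentHTML('beforeend', '<li>' + escapeHtml(data) + '</li>'). When only plain text is being added, insertAdjacentText or textContent makes the escaping step unnecessary.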
You can use jQuery, which makes it very simple.
Just download the jQuery file and add it to your HTML,
or use the online link:
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script>
and try this:
$("#divID").append(data);
The following method is less general than the others, but it works well when you are sure that the last child node of the div is already a text node. That way you avoid creating a new text node by using appendData (see the MDN reference for appendData).
let mydiv = document.getElementById("divId");
let lastChild = mydiv.lastChild;
if(lastChild && lastChild.nodeType === Node.TEXT_NODE ) //test if there is at least a node and the last is a text node
lastChild.appendData("YOUR TEXT CONTENT");
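JavaScript
document.getElementById("divID").innerHTML = "this text will be added to div";
jQuery
$("#divID").html("this text will be added to div");
Both calls replace the div's current content; use innerHTML += or .append() to keep what is already there. Call .html() without any arguments to see what you have entered.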
You can use the browser console to quickly test these functions before using them in your code.
Why not just use setAttribute?
thisDiv.setAttribute('attrName','data you wish to append');
Then you can read this data back with:
thisDiv.getAttribute('attrName');

How to insert HTML string in between two DOM nodes?

Consider following DOM fragment:
<div id="div-1">foo</div>
<div id="div-2">bar</div>
Is it possible to insert HTML string (EDIT: that contains tags to render) between divs without wrapping it in another div (EDIT: or some other tag) created via document.createElement and setting its innerHTML property?
Most browsers support element#insertAdjacentHTML(), which finally became standard in the HTML5 specification. Unfortunately, Firefox 7 and lower don't support it, but I managed to find a workaround that uses ranges to insert the HTML. I've adapted it below to work for your scenario:
var el = document.getElementById("div-2"),
html = "<span>Some HTML <b>here</b></span>";
// Internet Explorer, Opera, Chrome, Firefox 8+ and Safari
if (el.insertAdjacentHTML)
el.insertAdjacentHTML ("beforebegin", html);
else {
var range = document.createRange();
var frag = range.createContextualFragment(html);
el.parentNode.insertBefore(frag, el);
}
Live example: http://jsfiddle.net/AndyE/jARTf/
This does it for straight text, which is how I read your original question (see below for an update for strings that include tags):
var div = document.getElementById('div-2');
var textNode = document.createTextNode('your text');
div.parentNode.insertBefore(textNode, div);
Live example
If you start with:
<div id="div-1">foo</div><div id="div-2">bar</div>
(note that there's no whitespace between them) then the result of the above is exactly what you would get with this HTML:
<div id="div-1">foo</div>your text<div id="div-2">bar</div>
If you really have that whitespace between the divs, you'll already have a text node there and the above will insert another one next to it. For virtually all intents and purposes, that doesn't matter, but if that bothers you, you can append to the existing text node instead if you like:
var text = 'your text';
var div = document.getElementById('div-2');
var prev = div.previousSibling;
if (prev && prev.nodeType == 3) { // 3 == TEXT_NODE
// Prev is a text node, append to it
prev.nodeValue = prev.nodeValue + text;
}
else {
// Prev isn't a text node, insert one
var textNode = document.createTextNode(text);
div.parentNode.insertBefore(textNode, div);
}
Live example
Links to W3C docs: insertBefore, createTextNode
Including HTML tags
In your revised question you've said you want to include tags to be interpreted in doing all this. It's possible, but it's roundabout. First you put the HTML string into an element, then you move the stuff over, like this:
var text, div, tempdiv, node, next, parent;
// The text
text = 'your text <em>with</em> <strong>tags</strong> in it';
// Get the element to insert in front of, and its parent
div = document.getElementById('div-2');
parent = div.parentNode;
// Create a temporary container and render the HTML to it
tempdiv = document.createElement('div');
tempdiv.innerHTML = text;
// Walk through the children of the container, moving them
// to the desired target. Note that we have to be sure to
// grab the node's next sibling *before* we move it, because
// these things are live and when we move it, well, the next
// sibling will become div-2!
node = tempdiv.firstChild;
while (node) {
next = node.nextSibling;
parent.insertBefore(node, div);
node = next;
}
Live example
But here there be dragons: you have to select the container element as appropriate to the content you're inserting. For instance, we couldn't use a <tr> in your text with the code above because we're inserting it into a div, not a tbody, so that's invalid and the results are implementation-specific. These sorts of complexities are why we have libraries to help us out. You've asked for a raw DOM answer and that's what the above is, but I really would check out jQuery, Closure, Prototype, YUI, or any of several others. They'll smooth a lot of stuff over for you.
var neuB = document.createElement("b");
var neuBText = document.createTextNode("mit fettem Text ");
neuB.appendChild(neuBText);
document.getElementById("derText").insertBefore(neuB, document.getElementById("derKursiveText"));
You are looking for insertBefore.
Using jQuery it is very simple:
$("#div-1").after("Other tag here!!!");
See: jquery.after
Obviously, jQuery is not a pure JavaScript solution.
