Retrieve html code of selected text - javascript

In Mozilla, I can select some text and print the selected text using contentWindow.getSelection(). But I am trying to get the underlying HTML code block for this selected text. Is there any way I can retrieve it?
I need to extract URLs and other information, such as src attributes, from underneath any clickable text that a user selects. I need the code block of its parent node.
Thanks.

Retrieving the HTML should be relatively easy, but it depends on what you are wanting. window.getSelection() returns a selection object. You can use:
window.getSelection().anchorNode to obtain the Node in which the selection begins and
window.getSelection().focusNode to get the Node in which the selection ends.
For instance:
let selection = contentWindow.getSelection();
let firstElement = selection.anchorNode;
let lastElement = selection.focusNode;
What you do once you have the nodes depends on what you actually want to find. You have not specified that, so anything past finding those nodes would just be a guess. For instance, you might start at the parent of the anchorNode and check whether it contains the focusNode (firstElement.parentNode.contains(lastElement)); if it does not, keep moving up to the next parent until one does, then use that parent's innerHTML. Alternatively, you could find the first parent element of the anchorNode that contains the focusNode and then use a TreeWalker to walk the DOM tree: once you reach the anchorNode, start accumulating HTML, and stop when you encounter the focusNode.
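As a rough sketch of the first approach (walk up from the anchorNode until an ancestor contains the focusNode), something like the following might work. The function names are made up for this example, and it returns outerHTML rather than innerHTML on the assumption that you also want the attributes (href, src) of the enclosing element itself.

```javascript
// Walk up from the anchor node until an ancestor also contains the focus node.
// Node.contains() returns true when both arguments are the same node.
function findCommonAncestor(anchorNode, focusNode) {
  let candidate = anchorNode;
  while (candidate && !candidate.contains(focusNode)) {
    candidate = candidate.parentNode;
  }
  return candidate; // null if the two nodes share no ancestor
}

// Return the closest Element (nodeType 1) at or above the given node.
function closestElement(node) {
  while (node && node.nodeType !== 1) {
    node = node.parentNode;
  }
  return node;
}

// Hypothetical helper: HTML of the smallest element enclosing the selection.
function getSelectionHtml(selection) {
  if (!selection || selection.rangeCount === 0) {
    return '';
  }
  const ancestor = closestElement(
    findCommonAncestor(selection.anchorNode, selection.focusNode)
  );
  return ancestor ? ancestor.outerHTML : '';
}
```

You would call it as getSelectionHtml(contentWindow.getSelection()). From the enclosing element you could then read attributes such as href or src, for example with querySelectorAll('[href], [src]').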

Do you have a mouse event listener or something before you do contentWindow.getSelection?
If you do you can get the selected node by doing:
function onMouseUp(event) {
    var aWindow = event.target.ownerDocument.defaultView;
    // Should test whether aWindow is the chrome area or the actual content area.
    // This check is a guess; testing for a content window is done in some similar way.
    var contentWindow = aWindow.document instanceof Ci.nsIHTMLDocument ? aWindow : null;
    if (!contentWindow) { return; }
    // Call contentWindow.getSelection() here and check that a selection exists.
    // If more than one range is selected, get the node for each range;
    // this code assumes only one range is selected.
    var nodeOfFirstRange = event.explicitOriginalTarget;
    var elementOfNode = nodeOfFirstRange.parentNode;
    var htmlOfElement = elementOfNode.innerHTML;
}
// The listener function must be passed as the second argument.
Services.wm.getMostRecentWindow('navigator:browser').gBrowser.addEventListener('mouseup', onMouseUp, false);
One issue with this code: the user may press the mouse button in the content window, highlight some text, and then release the button while the pointer is outside the content window (over the browser chrome, or outside the browser entirely if the window is not maximized, or over the OS taskbar). In that case the handler may not behave as expected, so just use this code as a guide.

Related

Get HTML Node (Not Necessarily Element) On Click

I have a web application that, after clicking any node in the HTML, needs to retrieve the index of that node in its parent's childNodes array. However, I am having trouble getting the currently selected node through an onclick event. The returned target of the event is the containing element rather than the specific node inside the element. This difference is important when text nodes exist, such as:
<div>This is Node 1<span>node 2</span>, node 3, and <span>node 4</span></div>
If you click on the spans for Node 2 or Node 4, it's straightforward to know where you are. However, if you click on the text for Node 1 and Node 3, I can't seem to find where the event would help you figure out which part of the actual content was clicked on.
This happens to be important because a later operation needs to check for certain properties either forward or backward through the document until the first match. So, if both Node 2 and Node 4 are a match for the search, I need to know if I am in Node 1 or Node 3 in order to know which one to return. For example, if searching rightwards, starting in Node 1 means that Node 2 should be returned, and starting in Node 3 means that Node 4 should be returned. Obviously, this is a simplification, but it demonstrates the issue. Does anyone know the canonical solution for this? If I can get the node object or the index, that should be sufficient. jquery is fine, but not necessary.
Maybe something like this demo could help you out a bit:
document.getElementsByTagName('div')[0].addEventListener('click', function () {
    var fullStr = this.innerHTML.replace(/<[^>]*>/g, ''),
        sel = window.getSelection(),
        str = sel.anchorNode.data,
        clickPos = sel.focusOffset,
        wordPosLeft = str.slice(0, clickPos + 1).search(/\S+$/),
        wordPosRight = str.slice(clickPos).search(/\s/),
        wordClicked,
        nextWordRegex,
        nextWordPosLeft,
        nextWord;
    if (wordPosRight < 0) {
        wordClicked = str.slice(wordPosLeft);
    } else {
        wordClicked = str.slice(wordPosLeft, wordPosRight + clickPos);
    }
    nextWordRegex = new RegExp(wordClicked);
    nextWordPosLeft = fullStr.search(nextWordRegex) + wordClicked.length;
    nextWord = fullStr.slice(nextWordPosLeft).match(/^\s*(\S*)\s*.*$/)[1];
    console.log('wordClicked: ' + wordClicked);
    console.log('nextWord: ' + nextWord);
});
See this fiddle.
You need to wrap your nodes in containers. If you click on the "Node 1" text, the function will return the <div> element. But if you change your markup to this:
<div>
<span>This is Node 1</span>
<span>node 2</span>
<span>, node 3, and </span>
<span>node 4</span>
</div>
it would work and return the <span> container. I don't think it is possible any other way.
You could also fall back on JavaScript split() or regex operations.
If you're just trying to work out the text of the element you clicked, minus child nodes text, I have a solution:
$('body').on('click', function(e) {
    alert('Node Text: ' + $(e.target).clone().children().remove().end().text());
});
http://jsfiddle.net/xoegujqu/1/
Essentially, delegate the click event to the highest-level element you want this to run for (this example just uses body, but you'll probably want to be more specific). Use $(e.target) to get the element that was actually clicked, .clone() to clone it so you can modify it without affecting the actual page content, .children().remove() to remove all its child elements (and everything inside them), .end() to go back to the previous jQuery selector object, then finally .text() to get the remaining text content.
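If you would rather not depend on jQuery, roughly the same result can be had by concatenating only the direct text-node children of the clicked element. This is a minimal sketch; ownText is a name made up for this example.

```javascript
// Concatenate the text of an element's direct text-node children,
// skipping the text that lives inside child elements.
function ownText(element) {
  let text = '';
  for (const child of element.childNodes) {
    if (child.nodeType === 3) { // 3 === Node.TEXT_NODE
      text += child.data;
    }
  }
  return text;
}
```

Inside a click handler you would call ownText(e.target). No clone is needed because nothing on the page is modified.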
Check out event bubbling / propagation.
Also see the useCapture section of https://developer.mozilla.org/en-US/docs/Web/API/EventTarget.addEventListener
It is not possible to do this as far as I know. You cannot:
Detect events on text nodes.
Detect the position of the text node relative to window or page.
This answer gives an idea with some good insight, but does not do what you want (return the index of the node).
I believe you are out of luck, unless you can find a way to use the solution above to determine index.
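One possible workaround, building on the selection-based demo earlier in this thread: after a click on selectable text, browsers generally place the caret at the click position, so the selection's anchorNode is the text node that was clicked. That is an assumption rather than a guarantee (it will not hold where text selection is disabled, for example). The function names below are made up for this sketch.

```javascript
// Index of a node within its parent's childNodes, or -1 if it has no parent.
function childNodeIndex(node) {
  if (!node || !node.parentNode) {
    return -1;
  }
  return Array.prototype.indexOf.call(node.parentNode.childNodes, node);
}

// Hypothetical helper: index of the node the caret landed in after a click.
function clickedNodeIndex(selection) {
  if (!selection || !selection.anchorNode) {
    return -1;
  }
  return childNodeIndex(selection.anchorNode);
}
```

In a click handler you would call clickedNodeIndex(window.getSelection()). Note that a click inside one of the spans yields the text node inside that span, so you would need to walk up to the child of the <div> before taking the index.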

How can I restyle a word when rendering a pdf with pdf.js?

When I render a PDF with pdf.js, I would like to be able to restyle some words (e.g. highlight them). Is this possible?
Yes, it is possible to highlight words while working with PDF.js
As a page contains
a canvas (for the rendered content)
an HTMLDivElement (for the non-rendered text content)
you can use the latter one to select text elements.
Having access to the Selection API on your browser, you can get the selection via document.getSelection().
The following code demonstrates how to do that if the selected text does not (internally) span across multiple HTMLElements:
var s = document.getSelection();
var oldstr = s.anchorNode.textContent;
var textBeforeSelection = oldstr.substr(0, s.anchorOffset);
var textInsideSelection = oldstr.substr(s.anchorOffset, s.focusOffset - s.anchorOffset);
var textAfterSelection = oldstr.substr(s.focusOffset, oldstr.length - s.focusOffset);
s.anchorNode.parentElement.innerHTML
= textBeforeSelection
+ "<span class='highlight'>"
+ textInsideSelection
+ "</span>"
+ textAfterSelection;
For a selection that spans multiple (internal) HTMLElements, you might be able to traverse the DOM starting at s.anchorNode by successively calling nextSibling until you reach s.focusNode.
I say might, because elements can be positioned in the document in a different order than the one they have on the view.
Assuming s.anchorNode is not s.focusNode and the selection was made forwards (anchor before focus; swap the two roles for a backwards selection),
s.anchorNode would be highlighted from s.anchorOffset until the end of the node,
s.focusNode from 0 to s.focusOffset and
all nodes between them could be highlighted entirely
This works (or at least might work) for text nodes - the idea could be extended to non-text nodes by surrounding each non-text node with the highlighting span.
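To make the multi-node case concrete, here is a small sketch that only computes which part of each text node to highlight. It assumes a forward selection (start before end in document order), so the first node is covered from the start offset to its end and the last node from 0 up to the end offset. It also assumes you have already collected the text nodes in document order into an array; the function name and the textNodes parameter are made up for this example.

```javascript
// For each text node touched by the selection, work out the character
// range [from, to) that should be wrapped in a highlighting span.
function highlightSegments(textNodes, startNode, startOffset, endNode, endOffset) {
  const first = textNodes.indexOf(startNode);
  const last = textNodes.indexOf(endNode);
  if (first === -1 || last === -1 || first > last) {
    return [];
  }
  const segments = [];
  for (let i = first; i <= last; i++) {
    const node = textNodes[i];
    const from = i === first ? startOffset : 0;
    const to = i === last ? endOffset : node.textContent.length;
    if (to > from) {
      segments.push({ node: node, from: from, to: to });
    }
  }
  return segments;
}
```

Each returned segment can then be split into before/inside/after text in the same way as the single-node code above.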

Save the reference $(this) to DB

I'm not sure how to ask this question correctly, since my understanding of the DOM is lacking.
What I'm trying to do is catch any click event on any given DOM element. I then want to save the element type, as well as a complete reference to the element, in a database. But I'm not sure this is possible at all.
What I want to achieve is to save a whole interaction with a web app, in such a way that you can later replay every action performed on the site in a given session.
I have tried different approaches, like getting the X and Y position of the clicked element and later triggering a click on those x-y coordinates, but there are several problems with this approach. I've also tried to traverse the DOM backwards until I reach the body tag, to build a unique selector, but this also has its shortcomings. The best solution I can think of would be to save whatever $(this) contains.
If click events are the only thing you want to track, you probably want to add click event handlers to every clickable element on the page.
This would require starting at the <body> and walking the DOM, adding handlers as you go.
At the same time, I'd add a new data-xpath attribute to each element containing an XPath selector so you can use it in your handler to note the element being clicked, and so replay the user's interaction.
See http://www.w3schools.com/xpath/xpath_intro.asp for an introduction to XPath.
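For illustration, a positional XPath for an element could be built along these lines. This is only a sketch of one way to do it; getXPath is a made-up name, and the path is only stable as long as the page structure does not change between recording and replay.

```javascript
// Build a positional XPath such as /html[1]/body[1]/div[2] for an element.
function getXPath(element) {
  const steps = [];
  for (let el = element; el && el.nodeType === 1; el = el.parentNode) {
    // Count preceding siblings with the same tag name to get the 1-based position.
    let position = 1;
    for (let sib = el.previousElementSibling; sib; sib = sib.previousElementSibling) {
      if (sib.tagName === el.tagName) {
        position++;
      }
    }
    steps.unshift(el.tagName.toLowerCase() + '[' + position + ']');
  }
  return '/' + steps.join('/');
}
```

To replay, the stored path can be resolved with document.evaluate(path, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue, and the resulting element clicked.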
Falling asleep yesterday I got an idea, and ended up with this code today. It works as intended, but I'm guessing that XPath would perform better?
$(document).click(function(event) {
    var target = $(event.target);
    var parents = target.parents();
    var myParents = '';
    $($(parents).get().reverse()).each(function(key, value){
        var parentIndex = $(this).index()+1;
        myParents += $(this).prop("tagName")+':NTH-CHILD('+parentIndex+') > ';
    });
    var childIndex = target.index()+1;
    var childTag = target.get(0).tagName;
    myParents += childTag+':NTH-CHILD('+childIndex+')';
    alert(myParents);
});
The above code will return a unique selector string like this:
HTML:NTH-CHILD(1) > BODY:NTH-CHILD(2) > SECTION:NTH-CHILD(1) > UL:NTH-CHILD(1) > LI:NTH-CHILD(3) > A:NTH-CHILD(1)

Obtaining a DOM Range by clicking anywhere within an Element

Given the following HTML...
<p>Today is <span data-token="DateTime.DayOfWeek">$$DayOfWeek$$</span>,
</p>
<p>Tomorrow is the next day, etc, etc....</p>
Clicking on $$DayOfWeek$$ returns a DOM Range object (via a component, which is a WYSIWYG editor bundled with KendoUI).
I can then access the entire Element like so...
var element = range.startContainer.parentElement;
console.log(element);
which outputs...
<span data-token="DateTime.DayOfWeek">$$DayOfWeek$$</span>
What I am trying to figure out is how to construct a Range object that consists of the entire Element, as a Range.
The desired 'high level' behaviour is to single click a piece of text, and have the browser select all the text within that element, returning a Range object.
Happy to accept a jQuery solution.
HTML
<p>Today is <span data-token="DateTime.DayOfWeek">$$DayOfWeek$$</span>,</p>
<p>Tomorrow is the next day, etc, etc....</p>
JS
var span = document.querySelector('[data-token]');
span.addEventListener('click', function() {
    var sel = window.getSelection();
    var range = document.createRange();
    sel.removeAllRanges();
    range.setStart(span.childNodes[0], 0);
    range.setEnd(span.childNodes[0], span.innerText.length);
    sel.addRange(range);
});
Here's a fiddle for you:
http://jsfiddle.net/V66zH/2/
It might not be fully cross-browser, but it works in Chrome. See JavaScript Set Window selection for some additional optimizations elsewhere.
It also assumes only one childNode, as in your example HTML.
Some additional reference for Ranges (https://developer.mozilla.org/en-US/docs/Web/API/range) and Selections (https://developer.mozilla.org/en-US/docs/Web/API/Selection)
Here is a way I came up with that seems to work, if I understand you correctly: you want a click to produce a range containing everything in the element surrounding the click.
Leaving out the onclick code, which I assume you can handle, here is the DOM range code you describe:
var sel=document.getSelection(); //find the node that was clicked
var rng=sel.getRangeAt(0); //get the first range of the selection (the index is required)
//now, extend the start and end range to the whole element:
rng.setStartBefore(rng.startContainer.parentNode.firstChild);
rng.setEndAfter(rng.endContainer.parentNode.lastChild);
//DEMO: verify the correct range using a temp div/alert:
var t=document.createElement("div");
t.appendChild(rng.cloneContents());
alert(t.innerHTML);
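If the goal is simply a range covering everything inside a known element, Range.selectNodeContents() does this in one call and does not depend on how many child nodes the element has. A minimal sketch (selectElementContents is a name made up for this example):

```javascript
// Select everything inside the given element and return the resulting Range.
function selectElementContents(element) {
  const doc = element.ownerDocument;
  const range = doc.createRange();
  range.selectNodeContents(element); // start and end snap to the element's contents
  const selection = doc.getSelection();
  selection.removeAllRanges();
  selection.addRange(range);
  return range;
}
```

For the markup in the question, you could call selectElementContents(range.startContainer.parentElement) from the click handler.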

Javascript get range compared to a parent element

I have a function that returns an array (it won't work in IE) with two elements:
the HTML code of what the user selected inside a div (id=text)
the range of the selection
When the user selects a simple string directly inside the text div, the range returns the correct values. But when the user selects a string inside a child element of the div (div#text->p, for example), the range's values are relative to the child element, and I want them to be relative to the parent (div#text).
Here is a JsFiddle, http://jsfiddle.net/paglia_s/XKjr5/: if you select a string of normal text, or normal text plus bolded text, in the textarea you get the right selection, while if you select only the bolded word ("am") you get the wrong one, because the range is relative to the child element.
Is there a way to make the range always relative to div#text?
You could use my Rangy library and its new TextRange module, which provides methods of Range and selection to convert to and from character offsets within the visible text of a container element. For example:
var container = document.getElementById("text");
var sel = rangy.getSelection();
if (sel.rangeCount > 0) {
    var range = sel.getRangeAt(0);
    var rangeOffsets = range.toCharacterRange(container);
}
rangeOffsets has properties start and end relative to the visible text inside container. The visible text isn't necessarily the same as what jQuery's text() method returns, so you'll need to use Rangy's innerText() implementation. Example:
http://jsfiddle.net/timdown/KGMnq/5/
Alternatively, if you don't want to use Rangy, you could adapt functions I've posted on Stack Overflow before. However, these rely on DOM Range and Selection APIs so won't work on IE < 9.
If you don't want to use a library here is a way which worked for me.
The function returns the cursor offset relative to the textContent of the given node (not in relation to the sub nodes).
Note: The current cursor position must lie in the given node or in any of its sub-nodes.
It's not cross-browser compatible (especially not with IE), but I think it would not take much work to fix that as well:
function getCursorPositionInTextOf(element) {
    var range = document.createRange(),
        curRange = window.getSelection().getRangeAt(0);
    range.setStart(element, 0);
    range.setEnd(curRange.startContainer, curRange.startOffset);
    // Measure the length of the text from the start of the given element
    // to the start of the current range (the position of the cursor).
    // cloneContents() returns a DocumentFragment; read its textContent directly.
    return range.cloneContents().textContent.length;
}
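Another library-free option is to walk the container's text nodes and add up their lengths until the selection boundary is reached. The sketch below takes a boundary point (a node plus an offset, as found in range.startContainer and range.startOffset) and returns its character offset relative to the container's text. The function names are made up for this example, and it counts raw text-node characters, which is not necessarily the same as the visible text.

```javascript
// Total number of characters in all text nodes at or below the given node.
function textLength(node) {
  if (node.nodeType === 3) {
    return node.data.length;
  }
  let total = 0;
  for (const child of node.childNodes || []) {
    total += textLength(child);
  }
  return total;
}

// Character offset of the boundary point (node, offset) measured from the
// start of container's text, or -1 if node is not inside container.
function textOffsetWithin(container, node, offset) {
  let count = 0;
  let found = false;

  function walk(current) {
    if (current === node) {
      if (current.nodeType === 3) {
        count += offset; // in a text node the offset counts characters
      } else {
        // in an element the offset counts child nodes
        for (let i = 0; i < offset && i < current.childNodes.length; i++) {
          count += textLength(current.childNodes[i]);
        }
      }
      found = true;
      return;
    }
    if (current.nodeType === 3) {
      count += current.data.length;
      return;
    }
    for (const child of current.childNodes || []) {
      walk(child);
      if (found) {
        return;
      }
    }
  }

  walk(container);
  return found ? count : -1;
}
```

Calling it once with range.startContainer and range.startOffset, and once with range.endContainer and range.endOffset, gives start and end offsets that are both relative to div#text.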
