Convert GET HTML response into document - javascript

I am developing a page that reads the source of another page, and I need to extract certain information from it. The project already fetches the live source with the data, but I cannot for the life of me figure out how to convert this string into a document.
My rationale for using a document is that I need to use getElementById etc to get the value of these items.
What have I tried?
Assigning the HTML to an invisible div on my page. This kind of works though it doesn't render the entire HTML string and provides a "shorter" rendition of this page.
Manually finding the substrings. As you can imagine this is a crappy way to do things and provides very unreliable results.
DOM parser to convert the doc and then query it but that fails miserably.
Any assistance at all would be seriously appreciated.
pertinent code:
$.ajax({
    method: "GET",
    dataType: '',
    crossDomain: true,
    xhrFields: {
        withCredentials: true
    },
    success: function(res) {
        //shows the entire source just fine.
        console.log("Value of RES: " + res);
        bootbox.hideAll();
        //shows a "truncated" copy of the source
        alert(res);
        $("#hiddendiv").html(res);
        var name = document.getElementById("myitem");
        alert(name);
    },

Create a hidden IFRAME on your document. Then set the contents of that IFRAME to the HTML that you want to query. Target that IFRAME with your javascript when you do your querying. See How can I access iframe elements with Javascript? to understand how.

Another (probably better) option is to use jQuery. jQuery lets you create HTML, manipulate it, and query it in memory. Querying DOM elements in jQuery is even easier than in pure JavaScript. See: http://jquery.com/.
//Get a jQuery object representing your HTML
var $html = $( "<div><span id='label'></span></div>" );
//Query against it
var $label = $html.find( "#label" ); //finds the span with an id of 'label'
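Since the question mentions trying a DOM parser, here is a minimal sketch of that route. It assumes a browser that provides the standard DOMParser; the helper names htmlToDocument and textById are made up for illustration. Parsing as "text/html" yields a complete, separate document, whereas assigning the string to a div discards the html, head and body wrappers.

```javascript
// Parse a full HTML string into a separate Document.
// Browser-only: relies on the standard DOMParser being available.
function htmlToDocument(html) {
  return new DOMParser().parseFromString(html, "text/html");
}

// Return the text of the element with the given id, or null if it is absent.
// Works on any object that offers getElementById.
function textById(doc, id) {
  const el = doc.getElementById(id);
  return el ? el.textContent : null;
}

// Possible use inside the success callback from the question:
//   const doc = htmlToDocument(res);
//   console.log(textById(doc, "myitem"));
```

Checking for null before reading textContent avoids an exception when the fetched page does not contain the expected id.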

Related

How to examine the contents of data returned from an ajax call

I have an ajax call to a PHP module which returns some HTML. I want to examine this HTML and extract the data from some custom attributes before considering whether to upload the HTML into the DOM.
I can see the data from the network activity as well as via console.log. I want to extract the values for the data-pk attribute and test them before deciding whether to upload the HTML or just bypass it.
$.ajax({
    url: "./modules/get_recent.php",
    method: "POST",
    data: {chat_id:chat_id, chat_name:chat_name, host_id:host_id, host_name:host_name}, // received as a $_POST array
    success: function(data)
    {
        console.log(data);
    },
})
and some of the console log data are:
class="the_pks" data-pk="11"
class="the_pks" data-pk="10"
etc.
In the above data I want to extract and 'have a look at' the numbers 11 and 10.
I just do not know how to extract these data-pk values from the returned data from the ajax call. Doing an each on class 'the_pks' does not work as at the time I am looking at the data they have not been loaded into the DOM.
I have searched SO for this but have not come up with an answer.
Any advice will be most appreciated.
I hope I understand your question.
If you get HTML as a response, you can always create an element and insert that HTML inside it, without adding it to the DOM. After that you can search it as you would any DOM element.
const node = document.createElement("div");
// data is a string, so assign it as markup
// (appendChild only accepts a Node, not a string)
node.innerHTML = data;
And after that you can search inside the node:
node.querySelectorAll("[data-pk]")
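To connect this to the "test them before deciding" part of the question, here is a rough sketch. It assumes a browser DOM; extractPks and hasNewPk are hypothetical helpers, and the rule "insert only when a pk is newer than the last one seen" is an invented example, not something stated in the question.

```javascript
// Parse the returned HTML in a detached <template> and collect data-pk values.
// Nothing is attached to the live page. `doc` defaults to the page's document.
function extractPks(html, doc = document) {
  const template = doc.createElement("template");
  template.innerHTML = html;
  return Array.from(
    template.content.querySelectorAll("[data-pk]"),
    (el) => Number(el.getAttribute("data-pk"))
  );
}

// Hypothetical decision rule: true if any pk is greater than the last one seen.
function hasNewPk(pks, lastSeenPk) {
  return pks.some((pk) => pk > lastSeenPk);
}

// Possible use in the success callback (lastSeenPk is an assumed variable):
//   const pks = extractPks(data);
//   if (hasNewPk(pks, lastSeenPk)) { /* insert data into the page */ }
```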
I will re-engineer this - it was probably a clumsy way to try and achieve what I wanted anyway. Thanks to those who tried to help.

How to extract XML tag contents using jquery?

My ASP.NET MVC4 controller returns an XML string when we pass it a SERIAL. When I send a request using C#, it works fine: the XML string comes back and looks like
<CalculatedCode> 12312312 </CalculatedCode>
I also need to do it via jQuery, as below. The query works, but it returns an XMLDocumentObject instead of an XML string. I looked at the jQuery documentation to try to parse it, but I'm a jQuery noob and I'm sure I'm making an error in the code.
$.ajax({
    url: '@Url.Action("Index", "Home")',
    type: 'GET',
    dataType: 'xml',
    data: { SERIAL: serial}, //SERIAL comes from a textbox
    success: function (responseData) {
        //Code below is messed up, it simply needs to find the CalculatedCode tag
        //and extract the value within this tag
        xmlDoc = $.parseXML(response);
        $xml = $(xmlDoc);
        $thecode = $xml.find("CalculatedCode");
        // ToDo: Bug stackoverflow members with this noob question
    }
});
Thank you very much :)
The response is already parsed when you set the dataType to 'xml', so there is no need for $.parseXML. However, if the element is a root element, find() doesn't work, as it only searches descendants; you'll need filter():
$xml = $(responseData);
$thecode = $xml.filter("CalculatedCode").text();
A trick that gets the element either way is to append the XML to another element:
$xml = $('<div />').append(responseData);
$thecode = $xml.find("CalculatedCode").text();
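Because dataType: 'xml' hands the callback an XML document, the value can also be read with the standard DOM method getElementsByTagName, which searches the whole document, root element included. A minimal sketch; readCalculatedCode is a hypothetical helper name.

```javascript
// Read the trimmed text of the first <CalculatedCode> element in an XML document.
// Returns null when no such element exists.
function readCalculatedCode(xmlDoc) {
  const el = xmlDoc.getElementsByTagName("CalculatedCode")[0];
  return el ? el.textContent.trim() : null;
}

// Possible use in the success callback from the question:
//   success: function (responseData) {
//     var code = readCalculatedCode(responseData);
//   }
```

The trim() call removes the spaces that surround the number in the sample response.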

Loading ASPX page with Javascript intact, with a twist

I am writing a single page web application for Microsoft SharePoint.
I'd like to pull in content with $.get(), but I've run into a bit of a catch 22.
If I pull in the content like this:
function getLocalPage(url, callback) {
    $.get(url, function(data) {
        var $page = $(data).filter('.tileContent').html();
        callback($page);
    });
}
I get the node I'm looking for, but my script tags have been stripped out.
If I pull in content like this:
(reference to: jquery html() strips out script tags )
function getLocalPage(url, callback) {
    $.get(url, function(data) {
        var dom = $(data);
        dom.filter('script').each(function(){
            $.globalEval(this.text || this.textContent || this.innerHTML || '');
        });
        var $page = dom.filter('.tileContent');
        callback($page);
    });
}
The javascript embedded in SharePoint blows my page up, and seems to cause a full postback.
Is there any way to get only the node I would like, with the script tags intact?
Can't seem to have it both ways.
Rather than the shorthand jQuery method .get(), try .ajax() with dataType: 'html'.
The .ajax() documentation says of dataType: 'html':
"html": Returns HTML as plain text; included script tags are evaluated when inserted in the DOM.
It also says:
"If 'html' is specified, any embedded JavaScript inside the retrieved data is executed before the HTML is returned as a string."
The emphasis here is on the word "before" meaning that the embedded JavaScript, when executed, cannot act directly on the HTML with which it is delivered (or a resulting DOM-fragment), though it can act directly on the existing DOM, prior to fragment insertion.
However, any functions included in the embedded JavaScript do indeed become available to act on the HTML/DOM-fragment and the DOM at large, if called later. The first opportunity to call such functions is in the .ajax() success handler (or chained .done() handler).
Unless extraordinary (and potentially messy) measures are taken, the success handler (or any functions called by it) will need prior "knowledge" of the names of any functions that are delivered in this way, otherwise (realistically) they will be uncallable.
I'm not sure about code delivered inside a $(function() {...}); structure, which may execute when the current event thread has completed(?). If so, then it could potentially be made to act on the delivered HTML/DOM-fragment. By all means try.
With all that in mind, try the following approach, with appropriately phrased JavaScript in the response.
function getLocalPage(url, callback) {
    $.ajax({
        url: url,
        dataType: 'html',
        success: function(data) {
            callback($(data).find('.tileContent'));
        }
    });
}
or better:
function getLocalPage(url, callback) {
    $.ajax({
        url: url,
        dataType: 'html'
    }).done(function(data) {
        callback($(data).find('.tileContent'));
    });
}
Notes:
I changed .filter() to .find(), which seems more appropriate but you may need to change it back.
Your final version may well have further .ajax() options.

Update link text with content from XML document

For each link, e.g.
http://www.wowhead.com/item=78363
I'd like to retrieve the ID at the end of the URL in the href attribute. For example, 78363, as seen above. Using this ID, I'd like to retrieve an XML page and get data from it based on the ID. The URL of the XML document is the same as the link to the item, but ending with &xml, so:
http://www.wowhead.com/item=78363&xml
From XML page I need the value inside the CDATA section seen below:
<name>
<![CDATA[Vagaries of Time]]>
</name>
That is, "Vagaries of Time". Then I need to insert the name inside the link tag:
<a href="http://www.wowhead.com/item=78363">Vagaries of Time</a>
How can I accomplish this?
Loop through the links based on a regular expression, send an Ajax request, parse the text using regular expressions, and done.
$('a').each(function() {
    var $this = $(this);
    if(/item=\d+$/.test(this.href)) {
        $.ajax({
            url: this.href + '&xml',
            dataType: 'text',
            success: function(data) {
                $this.text(/<!\[CDATA\[([^\]]+)/.exec(data)[1]);
            }
        });
    }
});
You'll most likely want to add some error-checking, of course. Also, if your XML document is more complex than the example you posted, consider parsing using the native XML capabilities instead - I just used a regular expression for simplicity there.
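As one way to add that error-checking, the two regular-expression steps can be pulled into small functions that return null instead of throwing when nothing matches. This is only a sketch; itemIdFromHref and nameFromXml are hypothetical names, and the pattern assumes the name element looks like the sample above.

```javascript
// Extract the numeric item ID from a link such as ".../item=78363".
// Returns null when the href does not end in item=<digits>.
function itemIdFromHref(href) {
  const match = /item=(\d+)$/.exec(href);
  return match ? match[1] : null;
}

// Extract the CDATA contents of the <name> element from the XML text.
// Returns null when the element or its CDATA section is missing.
function nameFromXml(xmlText) {
  const match = /<name>\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*<\/name>/.exec(xmlText);
  return match ? match[1] : null;
}

// Possible use in the success callback:
//   var name = nameFromXml(data);
//   if (name !== null) { $this.text(name); }
```

Returning null lets the caller leave the link text untouched when the response is not what was expected.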

jQuery, Ajax and getting a complete html structure back

I'm new to jQuery and to some extent JavaScript programming. I've successfully started to use jQuery for my Ajax calls however I'm stumped and I'm sure this is a newbie question but here goes.
I'm trying to return in an Ajax call a complete html structure, to the point a table structure. However what keeps happening is that jQuery either strips the html tags away and only inserts the deepest level of "text", or the special characters like <, >, etc. get replaced with the escaped ones.
I need to know how to turn off this processing of the received characters. Using Firebug I see the responses going out of my web server correctly, but the page received by the user and thus processed by jQuery is incorrect. A quick example will show what I mean.
I'm sending something like this
<results><table id="test"><tr>test</tr></table></results>
what shows up on my page if I do a page source view is this.
&lt;results&gt;&lt;table....
so you can see the special characters are getting converted and I don't know how to stop it.
The idea is for the <results></results> to be the xml tag and the text of that tag to be what gets placed into an existing <div> on my page.
Here is the JavaScript that I'm using to pull down the response and inserts:
$.post(url, params, function(data)
{
    $('#queryresultsblock').text(data)
}, "html");
I've tried various options other than "html" like, "xml", "text" etc. They all do various things, the "html" gets me the closest so far.
The simplest way is just to return your raw HTML and use the html method of jQuery.
Your result:
<table id="test"><tr>test</tr></table>
Your Javascript call:
$.post(url, params, function(data){ $('#queryresultsblock').html(data) })
Another solution, simpler but with less control (you can only do a GET request), is to use load:
$("#queryresultsblock").load(url);
If you must return your result in a results XML tag, you can try adding a jQuery selector to your load call:
$("#queryresultsblock").load(url + " #test");
You can't put unescaped HTML inside of XML. There are two options I see as good ways to go.
One way is to send escaped HTML in the XML, then have some JavaScript on the client side unescape that HTML. So you would send
<results>&lt;table....
and the JavaScript would convert the &lt; to < and such.
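The escaping and unescaping step could look roughly like the sketch below. It covers only the five common entities; escapeHtml and unescapeHtml are hypothetical helper names, not jQuery functions. If the response is parsed as XML, the parser already decodes entities when you read an element's text, so a manual unescape is only needed when working with the raw response text.

```javascript
// Map of the characters that must be escaped to their entities.
const TO_ENTITY = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

// Reverse map, built from TO_ENTITY so the two stay in sync.
const FROM_ENTITY = Object.fromEntries(
  Object.entries(TO_ENTITY).map(([ch, entity]) => [entity, ch])
);

// Replace special characters with entities (what the server would do).
function escapeHtml(text) {
  return text.replace(/[&<>"']/g, (ch) => TO_ENTITY[ch]);
}

// Replace the five known entities with their characters in a single pass,
// so "&amp;lt;" becomes "&lt;" rather than "<".
function unescapeHtml(text) {
  return text.replace(/&(?:amp|lt|gt|quot|#39);/g, (entity) => FROM_ENTITY[entity]);
}
```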
The other option, and what I would do, is to use JSON instead of XML.
{"results": "<table id=\"test\"><tr>test</tr></table>"}
The JavaScript should be able to extract that HTML structure as a string and insert it directly into your page without any sort of escaping or unescaping.
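A small sketch of the JSON route, using the sample markup from the question. The payload is built here with JSON.stringify to stand in for the server, which takes care of escaping the inner double quotes; htmlFromResponse is a hypothetical helper name.

```javascript
// Stand-in for the server: serialize the markup as a JSON string value.
// JSON.stringify escapes the inner double quotes automatically.
const payload = JSON.stringify({
  results: '<table id="test"><tr>test</tr></table>',
});

// Client side: parse the JSON text and return the HTML string,
// or an empty string if the "results" field is missing or not a string.
function htmlFromResponse(jsonText) {
  const parsed = JSON.parse(jsonText);
  return typeof parsed.results === "string" ? parsed.results : "";
}

// Possible use with jQuery, as in the question:
//   $('#queryresultsblock').html(htmlFromResponse(responseText));
```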
The other thing you could do is create an external .html file with just your HTML code snippet in it. So create include.html with the following as its contents:
<results><table id="test"><tr>test</tr></table></results>
Then use the jQuery .load function to get it onto the page.
