Using jQuery on ajax response triggers additional network requests - javascript

I am writing a small script that takes a bunch of links from a page, fetches them and scours the results for some data.
E.g. like this:
let listLinks = $('.item a');
listLinks.each(function() {
    let url = this.href;
    fetch(url, {
        credentials: 'include'
    })
    .then(response => response.text())
    .then(function(html) {
        let name = $('#title h1', html);
    });
});
My problem is that once we hit the selector call on the response, the network tab in my browser's dev tools lights up with requests for a ton of resources, as if something (jQuery?) is just loading the entire page!
What the hell is going on here?
I don't want to load the entire page (resources and all); I just want to take a bunch of text from the HTML response!
Edit: After some more scrutiny, I discovered it only makes network requests for any images on the ajaxed page, but not for scripts or stylesheets.
It does not make these requests if I try to process the HTML in another way - say, call .indexOf() on it. Only if I decide to traverse it via jQuery.
Edit2: Poking around in dev tools, the network tab has an "initiator" column. It says this is the initiator for the requests: github code. I don't know what to make of that however...
P.S. Inb4 "just regex it".

I've discovered the cause:
My code above (relevant line):
$('#title h1', html)
is equivalent to
$(html).find('#title h1')
And $(html) essentially creates DOM elements. Actual, literal DOM objects.
When you create an <img> element (which the HTML I parse contains), the browser automatically issues a network request.
Relevant StackOverflow question:
Set img src without issuing a request
With the code in the question, the created DOM elements are still associated with the current document (as noted here), therefore the browser automatically makes a request for any new <img> it doesn't already have.
The correct solution is to create a separate document, e.g.
let parser = new DOMParser();
let doc = parser.parseFromString(html, "text/html");
let name = $('#title h1', doc);
No network requests go out in this case.
JSFiddle
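For completeness, here is a rough sketch of how that fix slots into the fetch loop from the question (same selectors as above; purely illustrative):
let listLinks = $('.item a');
listLinks.each(function() {
    fetch(this.href, { credentials: 'include' })
        .then(response => response.text())
        .then(function(html) {
            // Parse into a detached document so the browser never "adopts"
            // the <img> elements and therefore never requests them.
            let doc = new DOMParser().parseFromString(html, 'text/html');
            let name = $('#title h1', doc).text();
            console.log(name);
        });
});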

The problem is that you are using fetch. Use jQuery.ajax instead:
$.ajax({
    url: 'URL',
    type: 'GET',
    dataType: 'html',
    success: function(responseHTML) {
        console.log(responseHTML);
    }
});

Related

How to call a fetch request and wait for its answer inside an onBeforeRequest in a web extension

I'm trying to write a web extension that stops the requests from a locally provided URL list, fetches the URL's response, analyzes it in a certain way and, based on the analysis results, blocks or doesn't block the request.
Is that even possible?
The browser doesn't matter.
If it's possible, could you provide some examples?
I tried doing it with Chrome extensions, but it seems like it's not possible.
I heard it's possible in Firefox (Mozilla), though.
I think that this is only possible using the old webRequestBlocking API which Chrome is removing as a part of Manifest v3. Fortunately, Firefox is planning to continue supporting blocking web requests even as they transition to manifest v3 (read more here).
In terms of implementation, I would highly recommend referring to the MDN documentation for webRequest, in particular their section on modifying responses and their documentation for the filterResponseData method.
Mozilla have also provided a great example project that demonstrates how to achieve something very close to what I think you want to do.
Below I've modified their background.js code slightly so it is a little closer to what you want to do:
function listener(details) {
    if (mySpecialUrls.indexOf(details.url) === -1) {
        // Ignore this url, it's not on our list.
        return {};
    }
    let filter = browser.webRequest.filterResponseData(details.requestId);
    let decoder = new TextDecoder("utf-8");
    let encoder = new TextEncoder();
    filter.ondata = event => {
        let str = decoder.decode(event.data, {stream: true});
        // Just change any instance of Example in the HTTP response
        // to WebExtension Example.
        str = str.replace(/Example/g, 'WebExtension Example');
        filter.write(encoder.encode(str));
        filter.disconnect();
    };
    // This is a BlockingResponse object, you can set parameters here to e.g. cancel the request if you want to.
    // See: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/BlockingResponse#type
    return {};
}
browser.webRequest.onBeforeRequest.addListener(
    listener,
    // 'main_frame' means this will only affect requests for the main frame of the browser (e.g. the HTML for a page rather than the images, CSS, etc. that are loaded afterwards). You might want to look into whether you want to expand this.
    {urls: ["*://*/*"], types: ["main_frame"]},
    ["blocking"]
);
Correction:
The above example only works properly if the response data fits in one chunk. If it is larger (and you still want to inspect the entirety of the response data), you would need to put all of the data into a buffer and then work on it once all data has been received. See the documentation here for more information: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/StreamFilter/ondata#webextension_examples (the code section titled "This example combines all buffers into a single buffer" would be of most interest to you, I think).
In terms of using this API to block responses, data is only returned to the page if you call filter.write(), so if you don't like the response, you can simply not call it (just call filter.close()) and an empty response will be returned. You can also return only part of the full response body by filter.write()ing only the bits that you want to return.
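To make that concrete, here is a rough sketch of the buffering variant, adapted to the listener above (looksAcceptable is a hypothetical stand-in for whatever analysis you want to run on the full body):
// Inside listener(details), replacing the single-chunk filter.ondata above:
// buffer the whole response, analyze it, then either forward it or drop it.
let filter = browser.webRequest.filterResponseData(details.requestId);
let decoder = new TextDecoder("utf-8");
let encoder = new TextEncoder();
let fullBody = "";

filter.ondata = event => {
    // Accumulate each chunk; {stream: true} handles multi-byte characters
    // split across chunk boundaries.
    fullBody += decoder.decode(event.data, {stream: true});
};

filter.onstop = () => {
    fullBody += decoder.decode(); // flush any remaining bytes
    if (looksAcceptable(fullBody)) {            // hypothetical analysis function
        filter.write(encoder.encode(fullBody)); // forward the (possibly modified) body
    }
    // If write() was never called, the page simply receives an empty body.
    filter.close();
};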

How to send user-created HTML to W3C validator for automatic checking—without cross-domain errors?

I am writing an application for users, in which they input valid HTML into a text field.
I have a button in jQuery which tries to load the text field area into the W3C validator:
$('#inspecthtml').on('click', function() {
    var storyhtml = $('#story').text();
    var validatorurl = "http://validator.w3.org/#validate_by_input";
    var newWin = open(validatorurl, 'Validator', 'height=600,width=600');
    newWin.onload = function() {
        newWin.document.getElementById("fragment").value = storyhtml;
    };
});
I get an error message in the console (using Chrome):
Unsafe JavaScript attempt to access frame with URL
http://api.flattr.com/button/view/?url=http%3A%2F%2Fvalidator.w3.org%2F&title=View%20W3C-Validator%20on%20flattr.com&
from frame with URL http://validator.w3.org/#validate_by_input. The
frame being accessed set 'document.domain' to 'flattr.com', but the
frame requesting access did not. Both must set 'document.domain' to
the same value to allow access.
I attribute this to the cross domain security (see Unsafe JavaScript attempt to access frame with URL)
My question: Is there a way to send the data to the validator, so my users can check their own mark-up?
I think with the code snippet below you can get the same effect and user experience you're after.
It’s written using jQuery’s $.ajax(…) with some DOMParser and document.write(…) to put the styled results and UI of the W3C HTML Checker into a new window the way it seems you want.
var validator_baseurl = "https://validator.w3.org/nu/";
var validator_requesturl = validator_baseurl
    + "?showsource=yes&showoutline=yes";
$.ajax({
    url: validator_requesturl,
    type: "POST",
    crossDomain: true,
    data: storyhtml,
    contentType: "text/html;charset=utf-8",
    dataType: "html",
    success: function (response) {
        var results = (new DOMParser()).parseFromString(response, "text/html");
        results.querySelector("link[rel=stylesheet]").href
            = validator_baseurl + "style.css";
        results.querySelector("script").src
            = validator_baseurl + "script.js";
        results.querySelector("form").action
            = validator_requesturl;
        var newWin = window.open("about:blank",
            "Checker results", "height=825,width=700");
        newWin.document.open();
        newWin.document.write(results.documentElement.outerHTML);
        newWin.document.close();
        newWin.location.hash = "#textarea";
        setTimeout(function() {
            newWin.document.querySelector("textarea").rows = "5";
        }, 1000);
    }
});
Explanation
causes a POST request to be sent to the W3C HTML Checker
makes the storyhtml text the POST body
makes text/html;charset=utf-8 the POST body’s media type (what the checker expects)
causes the checker to actually check the storyhtml contents automatically
shows the checker results in a new window right when it’s first opened, in one step (so your users don’t need to do a second step to manually submit it for checking themselves)
replaces relative URLs for the checker’s frontend CSS+JS with absolute URLs (otherwise in this “standalone window” context, the CSS wouldn’t get applied, and the script wouldn’t run)
newWin.location.hash = "#textarea" is needed to make the checker show the textarea
Notes
intentionally uses the current W3C HTML Checker (not the legacy W3C markup validator)
intentionally sends the content to be checked as a raw POST body, not as multipart/form-data; the checker supports multipart/form-data, but sending a raw POST body is easier and better
the setTimeout textarea bit isn’t required; I just put it to make the results visible without scrolling (bottom part of new window below textarea); you can of course remove it if you want
sets the new window’s height and width a bit larger than the 600x600 in the question’s original code; again, I just did that to make things easier to see; change them however you want
uses standard DOM operations that may have better jQuery methods/idioms (I don't normally use jQuery, so I can imagine there are ways to streamline that part further with jQuery)
could of course also be done without using jQuery at all—using standard Fetch or XHR instead (and I’d be happy to also add examples here that use Fetch and XHR if desired)
tested & works as expected in Edge, Firefox, Chrome & Safari; but as with any code that uses document.open, Safari users need to unset Preferences > Security > Block pop-up windows
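Since the notes mention it, here is a rough sketch of the same request done with fetch() instead of $.ajax() (same assumptions: storyhtml holds the user's markup, and the response still needs the same URL fix-ups before being written into the new window):
var validator_requesturl = "https://validator.w3.org/nu/?showsource=yes&showoutline=yes";

fetch(validator_requesturl, {
    method: "POST",
    headers: { "Content-Type": "text/html;charset=utf-8" },
    body: storyhtml
})
    .then(function(response) { return response.text(); })
    .then(function(responseHtml) {
        // Parse the checker's HTML into a detached document, then fix up the
        // stylesheet/script/form URLs and document.write() it into a new
        // window exactly as in the $.ajax() version above.
        var results = new DOMParser().parseFromString(responseHtml, "text/html");
        // ...same post-processing as in the success callback above...
    });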

jquery load() equivalent for offline use

I am looking for an equivalent to jQuery's load() method that will work offline. I know from jQuery's documentation that it only works on a server. I have some files from which I need to call the HTML found inside a particular <div> in those files. I simply want to take the entire site and put it on a computer without an internet connection, and have that portion of the site (the load() portion) function just as if it were connected to the internet. Thanks.
Edit: BTW, it doesn't have to be js; it can be any language that will work.
Edit2:
My sample code (just in case there are syntax errors I am missing; this is for the files in the same directory):
function clickMe() {
    var book = document.getElementById("book").value;
    var chapter = document.getElementById("chapter").value;
    var myFile = "'" + book + chapter + ".html'";
    $('#text').load(myFile + '#source');
}
You can't achieve load() over the file protocol; no other ajax request is going to work for HTML files either. I have tried even with the crossDomain and isLocal options on, without any success, even when specifying the protocol.
The problem is that even if jQuery tries, the browser will block the request for security reasons (well, most browsers, as the snippet below works in FF), since allowing a page to load local files would give it access to a lot of things.
The one thing you could load locally is JavaScript files, but that probably means changing a lot of the application/website architecture (see the sketch at the end of this answer).
Only works in FF
$.ajax({
    url: 'test.html',
    type: 'GET',
    dataType: 'text',
    isLocal: true,
    success: function(data) {
        document.body.innerHTML = data;
    }
});
What FF does well is that it detects that the file requesting local files is itself on the file protocol, while other browsers don't. I am not sure whether it has restrictions on the type of files you can request.
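As for the JavaScript-files route mentioned above, a rough sketch of that workaround could look like this (the file naming and the chapterHtml global are assumptions, not part of the original question):
// Each chapter file (e.g. Genesis1.js, hypothetical name) wraps its markup in a global:
//   window.chapterHtml = '<div id="source">...chapter text...</div>';
// Then, instead of $('#text').load(...), inject a <script> tag, which browsers
// will happily load over file://
function loadChapter(book, chapter) {
    var script = document.createElement('script');
    script.src = book + chapter + '.js';
    script.onload = function() {
        $('#text').html(window.chapterHtml); // pull the markup out of the global
    };
    document.head.appendChild(script);
}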
You can still use the jQuery load function in this context:
You could add an OfflineContent div on your page:
<div id="OfflineContent">
</div>
And then click a button which calls:
$('#OfflineContent').load('OfflinePage.html #contentToLoad');
Button code:
$("#btnLoadContent").click(function() {
$('#OfflineContent').load('OfflinePage.html #contentToLoad');
});
In OfflinePage.html you would then have another section with the id contentToLoad, which would be displayed on the initial page.

Parsing Ajax loaded HTML content in IE

I have seen similar questions around, but none of them seem to have answers that help my case...
Basically, I want to load some HTML in using $.ajax() (which is on a different domain), and have it parsed into its own DOM, so I can apply attributes and manipulate HTML in my actual window DOM.
$.ajax({
    type: 'GET',
    url: 'http://example.com/index.html',
    dataType: 'html',
    crossDomain: true,
    cache: false,
    success: function(data) {
        var src = $('body img', data).first().attr("src");
        //also tried: var src = $('body', $(data)).first().attr("src");
        $('#someDiv img').attr("src", src);
    }
});
Where an example HTML file is:
<html>
    <body>
        <img src="someurl"></img>
    </body>
</html>
It works in Firefox, but not in IE; no matter what I try, whenever I try to parse and read it, it returns null.
Any suggestions?
EDIT:
It appears there was some ambiguity with my question. The issue is the parsing, not the AJAX. The AJAX returns the html string correctly, but jQuery fails to parse it.
EDIT 2:
I found a 'solution', but it isn't nearly as nice as I wanted it to be: it chops and sorts through the HTML string and extracts the data, rather than applying it to a DOM. It seems to run efficiently, as I can predict the order of the data.
Boiled down, it is something like this:
var imgsrcs = new Array(5);
var searchItem = '<img src="';
for (var a = 0; a < 5; a++) {
    imgsrcs[a] = ''; // start from an empty string so += doesn't append to undefined
    var startLoc = data.search(searchItem) + searchItem.length;
    for (var i = 0; i < data.length; i++) {
        if (data.charAt(startLoc + i) == '"')
            break;
        imgsrcs[a] += data.charAt(startLoc + i);
    }
    data = data.substring(startLoc + i, data.length);
}
$('.image').each(function(i) {
    $(this).attr("src", imgsrcs[i]);
});
Fairly ugly, but I solved my problem, so I thought I may as well post it.
This is a Same Origin Policy problem.
The crossDomain flag in jquery's ajax function doesn't automatically make cross domain requests work in all browsers (not all browsers support CORS). Since you're requesting this from a different domain, a normal request won't actually be able to read the data (or even make the request).
Normally, for JSON data, you can do JSONP, which is what the crossDomain flag often enables. However, JSON is unique because it can be natively read as JavaScript. Since HTML cannot be, you'd need to wrap it in parseable JavaScript to employ a trick like JSONP.
Rather than do that on your own, though, I'd highly suggest that you look into the easyXDM library in order to do cross domain messages like this. You'd essentially open up a hidden iframe on the other domain, and pass messages back and forth between the parent and the hidden frame. And, since the hidden frame is on the same domain as the html, it will have no problem ajaxing for it.
http://easyxdm.net/wp/
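To give a rough idea of the shape of that approach (treat the details as assumptions based on my reading of the easyXDM docs, and provider.html as a hypothetical helper page you would host on the other domain), the consumer side might look something like this:
// Consumer side (your domain): open a hidden channel to the other domain and
// ask it to fetch the HTML on our behalf with a same-origin request.
var socket = new easyXDM.Socket({
    remote: "http://example.com/provider.html", // hypothetical helper page on the other domain
    onReady: function() {
        // Ask the provider to fetch a page for us once the channel is up.
        socket.postMessage("http://example.com/index.html");
    },
    onMessage: function(message, origin) {
        // message is the HTML string the provider fetched; parse or inspect it
        // here (for example with the string-processing approach from Edit 2).
        console.log(message.length + " characters received from " + origin);
    }
});

// Provider side (provider.html, hosted on the other domain), sketched:
//   var socket = new easyXDM.Socket({
//       onMessage: function(url, origin) {
//           $.get(url, function(html) { socket.postMessage(html); }, "html");
//       }
//   });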

prototype ajax fetching images from other domain

I have two paths like:
a) localhost/firstapplication/
b) localhost/secondapplication/images
In firstapplication I do an ajax request to secondapplication/html/index.html, i.e. I fetch the whole responseText.
In secondapplication there are some img tags:
<img src="../images/testpicture.png" alt="test" />
My problem: if I append the whole responseText, my browser looks for the images... the link is relative, which means in firstapplication/images.
But I want the images of the secondapplication.
Is there any way to get them really easily? Or do I have to change the value of the src attribute in each img tag from "../images" to a fixed path like "localhost/secondapplication/images/"?
Thanks for the support.
I'm working with Prototype JS 1.7 and I'd prefer a solution with this framework. Thanks!
If firstapplication and secondapplication are on different domains, the AJAX will not work due to the Same Origin Policy. As such, I have not given a response to your image problem, because once deployed live your code will not work.
I see a few possibilities
Use an iframe instead of AJAX.
Have the second domain serve absolute URLs.
Manipulate the URLs when the AJAX completes.
// Ajax.Updater takes the container first, then the URL.
new Ajax.Updater('ELEMENT_ID', 'secondapplication/html/index.html', {
    onSuccess: function(response) {
        var receiver = $(this.container.success);
        var otherDomain = 'http://localhost/secondapplication/';
        var selector = '[src]:not([src^=/]):not([src^=http])';
        receiver.select(selector).each(function(element) {
            element.src = otherDomain + element.readAttribute('src');
        });
        selector = '[href]:not([href^=/]):not([href^=http]):not([href^=#])';
        receiver.select(selector).each(function(element) {
            element.href = otherDomain + element.readAttribute('href');
        });
    }
});
// otherDomain must end in a solidus, /
// not tested
