GM_xmlhttpRequest's responsetext is missing some HTML - javascript

If I go to this Google Maps page, some of the HTML is missing in View Source, but shows up in Firebug.
Likewise, when that same URL is passed to my function, the following HTML does not show up in the responseText, but it does show in Firebug when I open the page.
<a id="mapmaker-link" class="kd-button mini left" style="" href="https://www.google.com/mapmaker?ll=41.06877,-112.047203&spn=0.038696,0.132093&t=h&z=14&vpsrc=0&q=1093+W+3090+S,Syracuse,+UT&utm_medium=website&utm_campaign=relatedproducts_maps&utm_source=mapseditbutton_normal">
Here is the function I'm using:
function updateMap(url) {
    GM_xmlhttpRequest({
        method: 'GET',
        url: url,
        onload: function (resp) {
            // Grab the first query parameter after "mapmaker?" (the ll= lat/long pair)
            var ll = resp.responseText.split("mapmaker?")[1];
            ll = ll.split("&")[0];
            document.getElementById('googlemap').href = url + "&" + ll;
        }
    });
}
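For reference, here is what those two split() calls extract, shown on an abbreviated copy of the anchor markup quoted above (a sketch; the string is shortened from the real response):

```javascript
// Abbreviated sample of the mapmaker anchor found in the page source
var sample = 'href="https://www.google.com/mapmaker?ll=41.06877,-112.047203&spn=0.038696,0.132093&t=h"';

// First split: keep everything after "mapmaker?"
var ll = sample.split("mapmaker?")[1];

// Second split: keep only the first query parameter, the lat/long pair
ll = ll.split("&")[0]; // "ll=41.06877,-112.047203"
```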
I have placed a sample responseText value at pastebin.com/Tt8nrzG8.

The response is "missing" HTML because the called page loads that HTML (and almost all of the page's content) via AJAX.
GM_xmlhttpRequest (and all other current AJAX methods) only gets the static source of a given page. Such XHR requests cannot run the requested page's JavaScript the way a browser does when you browse to it.
In fact, if you save the sample responseText that you linked as an HTML file and open it, you'll see that almost none of the visible page content is there.
See "How to get an AJAX get-request to wait for the page to be rendered before returning a response?" for the same type of problem. Note that the answer there recommends using an API, if one is available.
So, use the Google Maps API to get the lat/long you want for your URL.
Or, the easiest approach is still to have the script also run on Google Maps pages and do a one-time zoom on links that carry your special URL parameter, as I recommended on your previous question. This has the added advantage that no calls to Google are made, or needed, until you actually decide to click your Google Maps link.
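A minimal sketch of the first step of that approach, detecting the special parameter on the Maps page (the parameter name myScriptZoom is an assumption for illustration, not anything Google defines):

```javascript
// Hypothetical marker parameter that the userscript appends to its own links
var ZOOM_PARAM = 'myScriptZoom';

// True when a Maps URL carries the marker, so a script running on the
// Maps page knows it should perform its one-time zoom
function hasZoomMarker(href) {
    return new URL(href).searchParams.has(ZOOM_PARAM);
}
```

A script that also runs on Google Maps pages could call hasZoomMarker(location.href) on load and, when it returns true, do the zoom once.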
If you do opt for the iframe approach (again, NOT recommended for ANY Google site), beware that you will need to adjust the URL to tell Google to allow iframing and the lat/long information will be in a different part of the page.


Manifest V3 web extension overwrite response body

How would I overwrite the response body for an image with a dynamic value in a Manifest V3 Chrome extension?
This overwrite would happen in the background, as in the Firefox example below, meaning no attaching debuggers or requiring users to press a button every time the page loads to modify the response.
I'm creating a web extension that stores an image in the extension's IndexedDB storage and then overrides the response body with that image on requests for a certain image. I have this working in a Manifest V2 extension in Firefox via the browser.webRequest.onBeforeRequest API with the code below, but browser.webRequest and MV2 are deprecated in Chrome. In MV3, browser.webRequest was replaced with browser.declarativeNetRequest, but it doesn't have the same level of access: you can only redirect requests and modify headers, not the body.
Firefox-compatible example:
browser.webRequest.onBeforeRequest.addListener(
    (details) => {
        const request = browser.webRequest.filterResponseData(details.requestId);
        request.onstart = async () => {
            // 'racetrack' holds the replacement image data, defined elsewhere
            request.write(racetrack);
            request.disconnect();
        };
    },
    {
        urls: ['https://www.example.com/image.png'],
    },
    ['requestBody', 'blocking']
);
The Firefox solution is the only one that worked for me, albeit exclusive to Firefox. I attempted to write a proof-of-concept userscript with xhook to modify the content of a DOM image element, but it didn't return the modified image as expected. Previously, I tried redirecting to a data URI and to an external image; the redirect itself worked fine, but the website threw an error that it couldn't load the required resources.
I'm guessing I'm going to have to write a content script that injects a Service Worker into the page (unexplored territory for me), plus a rule that redirects, say, /extension-injected-sw.js to a web-available script. But I'm not sure how to pull that off, whether the service worker could still communicate with the extension, or whether that would work at all. Or is there a better way to do this that I'm overlooking?
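For what it's worth, a rough sketch of that injected-service-worker idea might look like the following. Everything here is an assumption for illustration: the target URL, the cache name, and whether the page's CSP would allow registering the worker at all:

```javascript
// Hypothetical injected service worker, e.g. served as /extension-injected-sw.js
const OVERRIDE_URL = 'https://www.example.com/image.png'; // assumed target image

// Decide whether a request should be answered with the stored image
function shouldOverride(requestUrl) {
    return requestUrl === OVERRIDE_URL;
}

// Register the handler only inside a real worker context
if (typeof self !== 'undefined' && typeof self.addEventListener === 'function') {
    self.addEventListener('fetch', (event) => {
        if (shouldOverride(event.request.url)) {
            // Answer from a cache the extension populated earlier
            // (e.g. copied out of IndexedDB), falling back to the network
            event.respondWith(
                caches.open('override-cache')
                    .then((cache) => cache.match(event.request))
                    .then((cached) => cached || fetch(event.request))
            );
        }
    });
}
```

The open questions from the paragraph above (how the worker talks back to the extension, and how to serve the worker script itself) are left open here as well.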
Thank you for your time!

Google Apps Script: Getting a URL parameter to use in Javascript

I have a web app with Google Apps Script and would like to take a URL parameter and use it in modifying my HTML via Javascript, but am finding this tricky.
If I try using window.location in my Javascript it gives a different URL than the one shown in the address bar. The URL shown in the address bar is like this ... https://script.google.com/macros/s/MY_SCRIPT_ID/exec?param1=value1 .... but window.location gives something like this https://SOME_SORT_OF_LONG_ID-script.googleusercontent.com/userCodeAppPanel (it doesn't have param1 / value1 at all).
I know how to get the parameter value when I'm in the doGet(e) function -- by using e.parameter.param1 -- but I don't know how to be able to then subsequently use that value in some Javascript.
Help, please!
The URL that GAS serves your HTML from is never the actual web app URL; it is essentially another ID that Google uses to keep track of its web pages. Remember that all Google apps run on Google's servers.
This may not be the same with a standalone script, though I suspect it will be; I do know that for a Google Doc, the actual URL is:
https://docs.google.com/document/d/{{{ Your Document ID }}}
I expect a standalone app will be similar. Try using ScriptApp.getScriptId(), and then adding the ID to the actual URL of your script.
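Another option worth sketching: since doGet(e) already sees e.parameter.param1, the value can be interpolated into the served HTML with an HtmlService template, so the client-side JavaScript receives it directly. The file name index and the variable names below are assumptions:

```javascript
// Apps Script side (runs on Google's servers; shown as comments because it
// cannot execute outside that environment):
//   function doGet(e) {
//     var t = HtmlService.createTemplateFromFile('index');
//     t.param1 = e.parameter.param1 || '';
//     return t.evaluate();
//   }
// In index.html, a <?= ?> scriptlet prints the value into the page:
//   <script>var param1 = '<?= param1 ?>';</script>

// Stand-in for what the scriptlet does: server-side string interpolation
function renderParamScript(value) {
    return "<script>var param1 = '" + value + "';</script>";
}
```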

$.get() doesn't work with redirections

I'm writing an extension for Google Chrome that needs to search the contents of all URLs on a Google search results page.
For example, after searching for jquery in the Google search box, I want to see the title tag of every link in the result page. I get all the links with var links = $('a'), then I try to use jQuery's get() function as below, but it doesn't give me the right result:
$.get($('a')[i], function (data) {
    console.warn(data);
});
and the result is:
<script>window.googleJavaScriptRedirect=1</script><META name="referrer" content="origin"><script>var m={navigateTo:function(b,a,d){if(b!=a&&b.google){if(b.google.r){b.google.r=0;b.location.href=d;a.location.replace("about:blank");}}else{a.location.replace(d);}}};m.navigateTo(window.parent,window,"https://www.facebook.com/r.php");</script><noscript><META http-equiv="refresh" content="0;URL='https://www.facebook.com/r.php'"></noscript>
AJAX $.get() does work with normal HTTP redirects.
The problem you have is that there is a JavaScript redirect on the page you are trying to load with $.get(). Script on the requested page never runs, so the redirect never happens, and you are left with the stub you pasted above.
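If all you need is the final destination, one brittle workaround is to pull it out of the redirect stub itself. This is only a sketch: the regex is tuned to the exact m.navigateTo(...) pattern shown above and will break if Google changes the markup:

```javascript
// Extract the destination URL from Google's JS-redirect stub
function extractRedirectTarget(html) {
    var m = html.match(/navigateTo\([^)]*"(https?:\/\/[^"]+)"\)/);
    return m ? m[1] : null;
}
```

Running it over the response pasted above yields the real target, which could then be fetched with a second $.get().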

Reload part of the page

Please, help me!
Can you explain how to reload part of a page while also changing the URL, but without using a hash tag?
some code:
URL: mysite.com/first/
<html>
<body>
    <h1>RELOAD!</h1>
    <div>few words...</div>
    <input type="button">
</body>
</html>
I want it to change to:
URL: mysite.com/second/
<html>
<body>
    <h1>RELOAD!</h1>
    <div>other few words...</div>
    <input type="button">
</body>
</html>
How can I reload only the content of the DIV, swapping it for other content?
I have seen a few examples, like this one:
if (location.href.indexOf("#") > -1)
    location.assign(location.href.replace('#', "/"));
BUT - it reloads the whole page!
I cannot use .htaccess, because the "#" character and the text after it are never sent to the server.
I have also seen this code:
history.pushState(null, null, '#myhash');

if (history.pushState) {
    history.pushState(null, null, '#myhash');
}
else {
    location.hash = '#myhash';
}
but I cannot understand it properly.
Maybe there is another, better way to do it.
There are actually two different problems here:
How to load content from the server and display it in the existing page, without reloading the whole page
How to make it look like you are on a new URL, without reloading the page
Neither of these have anything to do with .htaccess (by which is generally meant Apache's mod_rewrite module) because they are both about how the client is behaving, not the server.
The first part is generally referred to as "AJAX", about which you will find tons of information online. The "X" originally stood for "XML", but actually you can fetch whatever kind of data you want, such as plain text, or a piece of ready-made HTML, and use JavaScript to put it into place on your page. The popular jQuery library has a method called .load(), which makes a request to the server, and uses the response to replace a particular part of the page.
The second part is a little trickier - since the page hasn't actually been reloaded, you essentially want the browser to lie about the current URL. The reason you will see a lot of examples changing only the part of the URL after the # is precisely because that part isn't sent to the server; traditionally, it's used to scroll the current page to a particular "anchor". You can therefore change it as often as you like, and if the user bookmarks or shares your page, you can look at the part after the # and restore the state they bookmarked/shared.
However, as part of the "HTML5" group of technologies, an ability was added to change the actual URL bar of the browser, by "pushing a state" to the history object. In other words, adding an entry to the back/forward menu of the browser, without actually loading a new page. There are obvious security restrictions (you can't pretend the user navigated to a completely different domain), but the API itself is quite simple. Here is the MDN documentation explaining it.
For your simple example, assuming jQuery has been included, you might do something like this:
// Find the div with a jQuery selector; this would be more specific in reality
jQuery('div')
    // Request some text from the server to replace the div
    .load(
        // This URL can be anything that generates the appropriate HTML
        '/ajax.php?mode=div-content&stage=second',
        // Add a callback function for when the AJAX call has finished
        function (responseText, textStatus, XMLHttpRequest) {
            // Inside the callback function, set the browser's URL bar
            // and history to pretend this is a new page
            history.pushState({}, '', '/second/');
        }
    );
Note that jQuery is far from the only way of doing this, it just keeps the example simple to make use of an existing function that does a lot of the work for us.
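One gap in the example above: after pushState, the Back and Forward buttons change the URL without reloading anything, so a popstate handler is also needed. A minimal sketch, reusing the same hypothetical /ajax.php endpoint:

```javascript
// Map the "pretty" path back to the AJAX endpoint used in the example
// (both names are assumptions carried over from it)
function contentUrlFor(path) {
    var stage = path.replace(/\//g, '');
    return '/ajax.php?mode=div-content&stage=' + stage;
}

// Re-load the matching content whenever the user navigates Back/Forward
if (typeof window !== 'undefined') {
    window.addEventListener('popstate', function () {
        jQuery('div').load(contentUrlFor(location.pathname));
    });
}
```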

Is there a way to mitigate downloading of resources (images/css and js files) with Javascript?

I have a html page on my localhost - get_description.html.
The snippet below is part of the code:
<input type="text" id="url"/>
<button id="get_description_button">Get description</button>
<iframe id="description_container" src="#"/>
When the button is clicked, the src of the iframe is set to the URL entered in the textbox. The pages fetched this way are very big, with lots of linked files. What interests me in the page is a block of text contained in a <div id="description"> element.
Is there a way to mitigate downloading of resources linked in the page that loads into the iframe?
I don't want to use curl because the data is only available to logged-in users, and the steps needed to get the content with curl are too complicated. The iframe approach is simple because I use it on a box that sends the right cookies to identify the request as coming from a logged-in user; the problem is that it is very wasteful to download nearly 1 MB of data just to keep 1 KB of it and throw out the rest.
Edit
If the proposed method works only in Firefox, that is fine, so I added the Firefox tag. It is also possible that the answer lies in the realm of Firefox add-on techniques, so I added that tag as well.
The problem is not that I cannot get at what I'm looking for; rather, it is that the easy iframe method is wasteful.
I know that Firefox can load only the text of a page. If you open a page and press Ctrl+U you are taken to the 'view page source' window. There, links behave as normal and are clickable; if you click a link in source view, the source of the new page is loaded into the view-source window without the linked resources being downloaded, which is exactly what I'm trying to get. But I don't know how to access this behaviour.
Another example is the Adblock add-on. It somehow kills elements before they get loaded. With plain JavaScript this is not possible, because a script is triggered too late to intervene in good time.
The Same Origin Policy forbids any web page from accessing the contents of a page on a different domain, so basically you cannot do that.
However, it seems that some browsers do allow access to another page's content when the request comes from a local web page, which appears to be your case.
Safari and IE 6/7/8 are browsers that allow a local web page to do so via XMLHttpRequest (source: the Google Browser Security Handbook), so you may want to use one of those browsers for this (note that future versions of those browsers may not allow it anymore).
Apart from this solution, I only see two possibilities:
If the web pages you need to fetch content from are somehow controlled by you, you can create a simpler interface to let other web pages get the content you need (for example by allowing JSONP requests).
If the web pages you need to fetch content from are not controlled by you, the only solution I see is to fetch the content server side, logging in from the server directly (I know that you don't want to do so, but I don't see any other possibility if the ones I mentioned are not practicable).
Hope it helps.
Actually, I've seen a cross-domain jQuery .load() request before, here: http://james.padolsey.com/javascript/cross-domain-requests-with-jquery/
The author claims that code like the following, found on that page,
$('#container').load('http://google.com'); // SERIOUSLY!

$.ajax({
    url: 'http://news.bbc.co.uk',
    type: 'GET',
    success: function (res) {
        var headline = $(res.responseText).find('a.tsh').text();
        alert(headline);
    }
});

// Works with $.get too!
would work. (The BBC code might not work because of the recent redesign, but you get the idea.)
Apparently it uses YQL wrapped in a jQuery plugin to do the trick. I cannot say I fully understand what he is doing there, but it appears to work and fits the bill. Once you load the data, I suppose it is a simple matter of filtering out what you need.
If you prefer something that works at the browser level, may I suggest Mozilla's Jetpack framework for lightweight extensions. I've not yet read the documentation in its entirety, but it should contain the APIs needed for this to work.
There are various ways to go about this in AJAX, I'm going to show the jQuery way for brevity as one option, though you could do this in vanilla JavaScript as well.
Instead of an <iframe> you can just use a container, let's say a <div> like this:
<div id="description_container"></div>
Then to load it:
$(function () {
    $("#get_description_button").click(function () {
        $("#description_container").load($("input").val() + " #description");
    });
});
This uses the .load() method, which takes a string in the format .load("url selector"); it then takes the matching element from the fetched page and places its content inside the container you're loading into, in this case #description_container.
This is just the jQuery route, mainly to keep the illustration simple: yes, you can do what you want, but you don't have to do it exactly like this. The concept is getting just what you need from an AJAX request, rather than through an <iframe>.
Your description sounds like you are fetching pages from the same domain (you said that you need to be logged in and have session credentials), so have you tried an async request via XMLHttpRequest? It might complain if the HTML on a page is particularly messed up, but you should still be able to get the raw text via .responseText and extract what you need with a regex.
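A minimal sketch of that regex idea, assuming the target block really is <div id="description"> and contains no nested div elements (a regex like this stops at the first closing tag, so nesting breaks it):

```javascript
// Pull the inner HTML of <div id="description"> out of the raw page text
function extractDescription(html) {
    var m = html.match(/<div id="description">([\s\S]*?)<\/div>/);
    return m ? m[1] : null;
}
```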
