Using Javascript to grab element on remote page - javascript

I would like to grab an element from a remote HTML page. As I am requesting data from a different domain I am using the below code to add the source as a script. Yes, very dodgy.
<script type="text/javascript">
var script = document.createElement('script');
script.setAttribute('type', 'text/javascript');
script.setAttribute('src', 'http://remoteDomain.com/page.html');
document.getElementsByTagName('head')[0].appendChild(script);
</script>
The above code fetches and appends the entire page to my document head. Seems to work okay. However, now I would like to be able to grab an element by ID, or even by regex, from this source.
Can this be done?
I am aware that the above code is dirty, so I'd be happy to receive any suggestions to clean it up!

Indeed very dodgy... But there are cross-domain AJAX techniques that you can use. Some help here: http://usejquery.com/posts/9/the-jquery-cross-domain-ajax-guide

The above code fetches and appends the entire page to my document head.
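It doesn't really; it just creates a script element whose src points there. The browser then tries to run the returned HTML as JavaScript, which fails, so nothing from the remote page ends up in your document.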
It looks like you are trying to get around Same Origin Policy.
Can you use a server side proxy?
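If a server-side proxy is an option, the lookup itself becomes straightforward. The sketch below is only an illustration under assumptions: `/proxy` is a hypothetical same-origin endpoint that returns the remote page's HTML as text, and `someId` is a placeholder id; neither comes from the question.

```javascript
// Build the same-origin proxy URL. The remote URL is encoded so that its own
// "?" and "&" characters are not mistaken for the proxy's parameters.
function buildProxyUrl(proxyPath, remoteUrl) {
  return proxyPath + '?url=' + encodeURIComponent(remoteUrl);
}

// Browser-only: fetch the HTML through the proxy, parse it into a detached
// document, and look the element up by id. Resolves to null if nothing matches.
async function grabRemoteElement(proxyPath, remoteUrl, id) {
  const response = await fetch(buildProxyUrl(proxyPath, remoteUrl));
  if (!response.ok) {
    throw new Error('Proxy request failed with status ' + response.status);
  }
  const html = await response.text();
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return doc.getElementById(id);
}

// Example call (hypothetical endpoint and id):
// grabRemoteElement('/proxy', 'http://remoteDomain.com/page.html', 'someId')
//   .then(function (el) { console.log(el ? el.textContent : 'not found'); });
```

Parsing with `DOMParser` keeps the remote markup out of the live page, and scripts inside the parsed document are not executed.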

Browsers go to great lengths to prevent this being done client-side unless the site you're trying to read explicitly opts in.
Otherwise any random web page you visit could read info from your bank account, say.

Related

How to Get Access to Page's Script Context

My addon uses a content script to interact with the page. But it also needs access to the page's javascript so it can run one of the page's routines. So my content script needs access to the page's script context.
Here's what I mean.
Addon uses main.js which access content.js and uses messaging to communicate.
But the web page (into which content.js is being injected) has its own JavaScript. My content.js needs access to that context so it can fetch the values from variables there.
How can one get that?
I have been reading these mdn docs, but it seems like they are talking about an html page that you code yourself, like you would for a preferences page. But in my case I am working with an external website, not something coded just for the addon.
The approach listed on the MDN page also works for external pages, not just your own.
I.e. unsafeWindow.myPageVar will work.
This works:
var script = document.createElement("script");
script.innerHTML = "alert( myPageVar );";
document.body.appendChild( script );
Credit goes to this fellow.
I don't know whether this is the best way to do this, however. I hope that someone else more knowledgeable than me will answer.
Here's how to return a value:
var retval = unsafeWindow.SomePageFunction();
alert(retval);
It's called "unsafe" because you never know what about the page might have been changed or might change. That's how it is when the addon interacts with page scripts.
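To get a value back from an injected script rather than just alerting it, one common pattern is to have the injected code post a message that the content script listens for. This is a rough sketch, not the add-on SDK's own mechanism: `buildProbeSource`, `readPageVariable` and the `channel` tag are hypothetical names, `myPageVar` is the variable from the example above, and it assumes the page's Content Security Policy allows inline scripts and that the value can be structured-cloned.

```javascript
// Build the source of a script that runs in the page's own context and
// posts the value of one global variable back via window.postMessage.
function buildProbeSource(varName, channel) {
  return '(function () {' +
    'window.postMessage({ channel: ' + JSON.stringify(channel) +
    ', value: window[' + JSON.stringify(varName) + '] }, "*");' +
    '})();';
}

// Browser-only: inject the probe and call back with the page's value.
function readPageVariable(varName, callback) {
  var channel = 'page-probe-' + varName; // hypothetical tag used to filter messages
  function onMessage(event) {
    if (event.source !== window || !event.data || event.data.channel !== channel) {
      return; // ignore unrelated messages
    }
    window.removeEventListener('message', onMessage);
    callback(event.data.value);
  }
  window.addEventListener('message', onMessage);

  var script = document.createElement('script');
  script.textContent = buildProbeSource(varName, channel);
  document.documentElement.appendChild(script);
  script.remove(); // the inline code has already run at this point
}

// Example call from the content script:
// readPageVariable('myPageVar', function (value) { console.log(value); });
```

Only globals reachable as properties of `window` (for example those declared with `var`) are visible this way. The page can see and fake these messages too, so the result deserves the same caution as `unsafeWindow`.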

Missing forward slash after fqdn

So here is the situation: I'm getting an ad from my custom adserver like so:
src = 'http://www.adserver.com/www/delivery/ajs.php?zoneid=1&cb=37930400855&charset=UTF-8&loc=http%3A//thissite.com/';
script = document.createElement 'script'
script.type = 'text/javascript'
script.src = src
$('.banner-container').append script
The URL is correct in the src variable, and it is still correct when it is inserted into the DOM:
<script type="text/javascript" src="http://www.adserver.com/www/delivery/ajs.php?zoneid=1&amp;cb=37930400855&amp;charset=UTF-8&amp;loc=http%3A//thissite.com/"></script>
But the second the browser tries to fetch it the url changes to
http://www.adserver.comwww/delivery/ajs.php?zoneid=1&cb=37930400855&charset=UTF-8&loc=http%3A//thissite.com/
See how, right after the .com, it strips the / so that comwww runs together. That makes the request throw an error and, of course, the ad is not displayed. I have tried URI encoding and other little things I had read or seen on Stack Overflow, to no avail.
Perhaps the problem is on the ad server site. They likely have a bad rewriterule, or a bad internal redirect. I have run your sample code with a different domain and it works fine.
Try visiting the JS URL in your browser directly, or using a command-line tool like curl, and check whether it redirects. If it does, the adserver.com site is most likely redirecting badly. If they have a support contact, you should file a ticket with that company.
I am sorry that this does not directly solve your problem, but I feel that this response is a proper "answer" for this site.
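One way to rule out the client side before contacting the ad server is to check how the URL string itself parses. This is just a quick sanity check using the standard `URL` constructor; `describeUrl` is a hypothetical helper name.

```javascript
// Split a URL string into the parts the browser will actually request.
function describeUrl(src) {
  var parsed = new URL(src);
  return {
    host: parsed.host,
    pathname: parsed.pathname,
    search: parsed.search
  };
}

var src = 'http://www.adserver.com/www/delivery/ajs.php?zoneid=1&cb=37930400855&charset=UTF-8&loc=http%3A//thissite.com/';
var parts = describeUrl(src);
console.log(parts.host);     // www.adserver.com
console.log(parts.pathname); // /www/delivery/ajs.php
```

If host and path come out cleanly separated like this, the string handed to the browser is fine, which points to a redirect issued by the server as the source of the fused `comwww`.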

update javascript function

This should be a simple problem, I just can't seem to stumble upon the right answer:
So I have a site in HTML with many pages that all link to the newest one, so I created a simple JavaScript function in a separate file:
function newest() {
window.location = "http://xxxxxxxxxx.xxx/6.html";
}
With the line:
<script type="text/javascript" src="javascript.js"></script>
In my HTML document.
So I can update the number every time a new page is posted. The problem is that when I post a new one, the code doesn't refresh from the user side until you delete the cookies (if I replace it with 7, it will still redirect to 6).
Sorry if it is a stupid question, but everything I have looked up seems way off topic.
The cache expects your JavaScript file to be immutable, so unless you can change the file name from outside that script, this approach is not going to work. How about just creating a 'latest.html' page that is either a file system link to the newest page or a redirect to it?
A simple client side solution would be to inject the script with different version attributes appended to it.
So HTML page can contain a script like :
var script = document.createElement('script');
script.type = 'text/javascript';
script.src = 'http://xxxxxxxxxx.xxx/javascript.js?v=' + Math.random();
document.getElementsByTagName('head')[0].appendChild(script);
Notice the random number? It makes the URL different on every load, so the browser cannot reuse a cached copy.
Here javascript.js is the file containing your code:
function newest() {
window.location = "http://xxxxxxxxxx.xxx/6.html";
}
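A variation on the random-number approach, offered as a sketch: `Math.random()` forces a fresh download on every page view, whereas a value derived from the current time, rounded down to a fixed window, lets the browser reuse its cached copy within that window and still pick up changes without editing every page. `timeBucketedUrl`, `loadFreshScript`, the one-hour window and the `example.com` URL are all hypothetical.

```javascript
// Return the URL with a "v" parameter that only changes once per time window.
function timeBucketedUrl(url, nowMs, windowMs) {
  var parsed = new URL(url);
  parsed.searchParams.set('v', String(Math.floor(nowMs / windowMs)));
  return parsed.toString();
}

// Browser-only: load the script with a version that changes every hour.
function loadFreshScript(url) {
  var script = document.createElement('script');
  script.src = timeBucketedUrl(url, Date.now(), 60 * 60 * 1000);
  document.getElementsByTagName('head')[0].appendChild(script);
}

// Two hours after the epoch, with a one-hour window, the bucket is 2.
console.log(timeBucketedUrl('http://example.com/javascript.js', 7200000, 3600000));
// http://example.com/javascript.js?v=2
```

With this scheme a visitor may see the old redirect target for up to one window after an update, so the window length is a trade-off between freshness and caching.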
You can also turn off caching of resources (such as JavaScript files) on the client machine by including instructions that tell the web browser not to cache them. Refer to this link for how to turn off caching for your webpage.

Reading document.links from an IFrame

EDIT:
Just a quick mention as to the nature of this program. The purpose of this program is for web inventory. Drawing different links and other content into a type of hierarchy. What I'm having trouble with is pulling a list of links from a webpage within an IFrame.
I get the feeling this one is gonna bite me hard. (other posts indicate relevance to xss and domain controls)
I'm just trying something with JavaScript and IFrames. Basically I have a panel with an IFrame inside that goes to whatever website you want it to. I'm trying to generate a list of links from the webpage within the IFrame. It's strictly read-only.
Yet I keep coming up against the permission denied problem.
I understand this is there to stop cross-site scripting attacks, and the resolution seems to be to set the document domain to the host site.
JavaScript permission denied. How to allow cross domain scripting between trusted domains?
However, I don't think this will work if I'm trying to go from site to site.
Here's the code I have so far, pretty simple:
function getFrameLinks()
{
/* You can all ignore this. This is here because there is a frame within a frame. It should have no effect on the program. Just start reading from 'contentFrameElem'. */
//ignore this
var functionFrameElem = document.getElementById("function-IFrame");
console.log("element by id parent frame ");
console.log(functionFrameElem);
var functionFrameData = functionFrameElem.contentDocument;
console.log("Element data");
console.log(functionFrameData);
//get the content and turn it into a doc
var contentFrameElem = functionFrameData.getElementById("content-Frame")
console.log(contentFrameElem);
var contentFrameData = contentFrameElem.contentDocument;
console.log(contentFrameData);
//get the links
//var contentFrameLinks = contentFrameData.links;
var contentFrameLinks = contentFrameData.getElementsByTagName('a');
return contentFrameLinks;
}
Goal: OK, so this is not permitted because it is very similar to XSS. Perhaps someone could point out a way to store the document locally instead. I don't seem to have any problems accessing document.links with internal pages in the frame.
Possibly some sort of temp database or cache. The simpler the solution the better.
If you want to read it just for yourself and in your own browser, you can write a simple proxy in PHP on your server. The simplest code:
<?php /* proxy.php */ readfile($_GET['url']); ?>
Now set your iframe src to your proxy file:
<iframe src="http://localhost/proxy.php?url=http://www.google.com"
id="function-IFrame"></iframe>
Now you can access the iframe content, because it is served from your (local) server.
If you want to set the URL programmatically, remember to encode it (urlencode in PHP or encodeURIComponent in JS).
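Once the frame is served through the proxy it is same-origin, so its document can be read. A small sketch of the link-listing step, assuming the `proxy.php` script above; `collectLinks` is a hypothetical helper name.

```javascript
// Collect the href of every link in a document, sorted and without duplicates.
function collectLinks(doc) {
  var hrefs = Array.from(doc.links, function (link) { return link.href; });
  return Array.from(new Set(hrefs)).sort();
}

// Browser-only usage (iframe id taken from the markup above):
// var frame = document.getElementById('function-IFrame');
// frame.onload = function () { console.log(collectLinks(frame.contentDocument)); };
// frame.src = 'http://localhost/proxy.php?url=' + encodeURIComponent('http://www.google.com');
```

Relative links in the proxied page will resolve against the proxy's address rather than the original site, so they may need to be rewritten before they are stored.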
Here is a bookmarklet you can run on any page (assuming the links are not in an iframe):
javascript:var x=function(){var lnks=document.links,list=[];for (var i=0,n=lnks.length;i<n;i++) {var href = lnks[i].href; list.push(href)};if (list.length>0) { var w=window.open('','_blank');w.document.write(list.length+' links found<br/><ul><li>'+list.sort().join('</li><li>')+'</ul>');w.document.close()}};void(x());
Another way (on Windows) is to save your HTML file with the .HTA extension.
Then you can grab whatever lives in the IFrame.
You might be interested in using YQL (Yahoo Query Language) to retrieve filtered results from remote URLs.
Example of retrieving all the links from the yahoo.com domain:

Redirect document.write from javascript script

We want to serve ads on our site but the adserver we are in talks with has issues with delivering their advertising fast enough for us.
The issue as I see it is that we are supposed to include a <script src="http://advertiserurl/myadvertkey"></script> where we want to display the ad and it will then download a script and use document.write to insert some html.
Problem is that the call to the advertiser website is slowish and the code returned then downloads another file (the ad) which means the speed of rendering our pages slows while we wait for the request to be filled.
Is there a way to take the output from the document.write call and write this in after the page has loaded?
Basically I want to do this:
<html>
<body>
<script>
function onLoad() {
var urlToGetContentFrom = 'http://advertiserurl/myadvertkey';
// download js from above url somehow
var advertHtml = ''; // do something awesome to interpret the document.write output
$('someElement').innerHTML = advertHtml;
}
</script>
</body>
</html>
Or anything similar that will let me get the output of that file and display it.
If I understand correctly, you want to capture document.write to a variable instead of writing it to the document. You can actually do this:
var advertHtml = '';
var oldWrite = document.write;
document.write = function(str)
{
advertHtml += str;
}
// Ad code here
// Put back the old function
document.write = oldWrite;
// Later...
...innerHTML = advertHtml;
You still have the hit of loading the script file though.
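The same idea can be wrapped in a helper so that the original function is restored even if the ad code throws. This is a minimal sketch, assuming the ad code runs synchronously inside the callback; `captureWrites` is a hypothetical name, and it also covers `document.writeln` and calls with several arguments.

```javascript
// Run some code with doc.write/doc.writeln redirected into a buffer,
// then restore the originals and return everything that was written.
function captureWrites(doc, runAdCode) {
  var captured = [];
  var oldWrite = doc.write;
  var oldWriteln = doc.writeln;
  doc.write = function () {
    captured.push(Array.prototype.join.call(arguments, ''));
  };
  doc.writeln = function () {
    captured.push(Array.prototype.join.call(arguments, '') + '\n');
  };
  try {
    runAdCode();
  } finally {
    // Put back the old functions even if the ad code failed.
    doc.write = oldWrite;
    doc.writeln = oldWriteln;
  }
  return captured.join('');
}

// Browser usage (hypothetical ad call and element id):
// var advertHtml = captureWrites(document, function () { showAdvert(); });
// document.getElementById('someElement').innerHTML = advertHtml;
```

A script loaded through a `<script src>` tag runs later, not inside the callback, so for that case the override would have to stay in place until the script's load event fires.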
To decouple the main page loading from the ad loading, you can put the ad in its own page in an iframe or, similarly, download the script file with AJAX and execute it whenever it comes down. If the former is not adequate, because of referring URI or whatever, the latter gives you some flexibility: you could use string replacement to rewrite "document.write" to something else, or perhaps temporarily replace it like "document.write = custom_function;".
You may be interested in the JavaScript library I developed, which allows loading third-party scripts that use document.write after window.onload. Internally, the library overrides document.write, appending DOM elements dynamically and running any included scripts, which may use document.write as well.
I have set up a demo, in which I load 3 Google Ads, an Amazon widget as well as Google Analytics dynamically.
You'd run into some security issues going cross-domain due to the Same Origin Policy. I would look into JSONP if you are able to change the advertising content or service:
http://docs.jquery.com/Ajax/jQuery.getJSON#urldatacallback
