Dynamically load & parse local HTML from within HTML? - javascript

A bit of an unusual setup:
I'm writing an HTML page that in turn loads another HTML page, parses it, analyzes it, and displays information about it.
The parsing is fairly easy using jQuery. I just need to figure out how to load the external page - that is, when page A is displayed in the browser, it needs to load page B, analyze page B, and display information about page B.
Both pages are local (not served via a web server).
Both load() and ajax() from jQuery run into the cross-origin permission issue:
XMLHttpRequest cannot load file://localhost/Users/me/test.html. Origin null is not allowed by Access-Control-Allow-Origin.
I can load the page with a script tag, but then I don't know how to access it so I can parse it:
<script type="text/html" src="test.html"></script>
Any ideas?

Have you thought about using JavaScript/jQuery to create an iframe? (You can use CSS to hide the iframe from the end user.) Then you can listen for the iframe's onload event and parse the page through the iframe's contentDocument property (I believe).
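
A minimal sketch of the iframe approach, assuming both pages sit in the same folder and that the browser lets page A read the frame's document (some browsers treat each file:// page as a separate origin and will still refuse). The helper name `loadHiddenFrame` and its callback are made up for illustration; `doc` is the host page's document, passed in explicitly.

```javascript
// Hypothetical helper: loads `url` in a hidden iframe and hands the loaded
// document to `onReady`. `doc` is the host page's document object.
function loadHiddenFrame(doc, url, onReady) {
  var frame = doc.createElement('iframe');
  frame.style.display = 'none';        // keep the frame invisible
  frame.onload = function () {
    // contentDocument is null if the browser blocks access to the frame
    onReady(frame.contentDocument);
  };
  frame.src = url;
  doc.body.appendChild(frame);         // loading starts once attached
  return frame;
}

// In page A (browser only, with jQuery loaded):
// loadHiddenFrame(document, 'test.html', function (pageB) {
//   if (pageB) { console.log($(pageB).find('a').length + ' links in page B'); }
// });
```

Checking the callback argument for null before parsing keeps the failure mode visible when the browser refuses access.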

Related

UWP: webview does not display page using navigateToString method

I am trying to use the webview element in a universal app written in JavaScript. My aim is to browse some websites while adding some content of my own to their HTML documents.
First, I set the src attribute of the webview to www.example.com and it browses the site. This was just to make sure the webview is capable of browsing the site.
Next, I tried getting the HTML and loading it into the webview using the navigateToString method like this:
$.get(url, function (data) {
    webView.navigateToString(data);
});
This causes the page to be loaded out of shape (apparently some .js or .css files are not loaded or are blocked from running), or it isn't loaded at all.
I wonder what the difference is between loading the page by its URL and loading its HTML manually like this. Is there a workaround for this problem?
Note: I'm new to both JS and HTML.
A web page is usually not made of a single HTML file. To make it work, you will have to retrieve not only the HTML but also the JavaScript and CSS files.
This can be tedious work.
If you are trying to open something from the web, the easiest way is to perform a regular navigate(), which takes the URI as a parameter and performs a "full" browse (as a browser would). The retrieval and loading of the CSS/JS will be done for you.
If you want to open a local page (local to your application), navigateToString() is a good path, but you will have to host all the page dependencies (CSS/JS files) locally or embed all the style and code in the HTML page itself.
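
To illustrate the "embed all the style" option, here is a rough sketch that inlines stylesheets into an HTML string before it is handed to navigateToString(). The helper `inlineStylesheets` and the `cssByHref` map are assumptions for illustration: it presumes you have already loaded each CSS file's text yourself, and the regular expressions only cope with simple link tags that have a quoted href.

```javascript
// Hypothetical helper: replaces <link rel="stylesheet" href="X"> with
// <style>...</style> whenever the CSS text for X is present in cssByHref.
// Any other <link> tag is left untouched.
function inlineStylesheets(html, cssByHref) {
  return html.replace(/<link\b[^>]*>/gi, function (tag) {
    var isStylesheet = /\brel\s*=\s*["']?stylesheet["']?/i.test(tag);
    var href = /\bhref\s*=\s*["']([^"']+)["']/i.exec(tag);
    if (!isStylesheet || !href) { return tag; }
    if (!Object.prototype.hasOwnProperty.call(cssByHref, href[1])) { return tag; }
    return '<style>' + cssByHref[href[1]] + '</style>';
  });
}

// Inside the app (webView and data as in the question; cssText is assumed
// to have been fetched separately):
// webView.navigateToString(inlineStylesheets(data, { 'site.css': cssText }));
```

Scripts and images would need similar treatment, which is why a plain navigate() is usually less work for remote pages.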

Phantomjs: Modifying html dom before opening it as webpage

I need to process HTML files that contain corrupted scripts, added to them via script tags.
I'm planning to remove all script tags present in the web page via PhantomJS.
But on opening the web page via webpage.open(), a PhantomJS parse error is thrown, since it cannot parse the JS content within the script tag.
Here is an example:
<html>
<head>
<script>
corrupted JS
if(dadadd
;
</script>
<body>
some content
</body>
</html>
Can someone suggest the right way to clean this web page using PhantomJS?
It's not (easily) possible. You could download the static HTML (not by opening the page, but rather by making an Ajax request in page.evaluate()), then change it according to your needs, then assign it to page.content.
This still might not work, because as soon as you assign it to page.content, you're saying that PhantomJS should interpret this source as a page from an unknown domain (about:blank). Since the page source contains all kinds of links/scripts/stylesheets without a domain name, you'll have to change those too in order for the page to successfully load all kinds of resources.
It might be easier to just have a proxy between PhantomJS and the internet with a custom rule to adjust the page source to your needs.
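
A sketch of the "change according to your needs" step, assuming the raw HTML has already been fetched as a string. `stripScripts` is a hypothetical helper, and its regular expression is a crude one that only handles ordinary script open/close pairs; it is not an HTML parser.

```javascript
// Hypothetical helper: removes every <script ...>...</script> block from an
// HTML string, without trying to parse the JavaScript inside it.
function stripScripts(html) {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, '');
}

var dirty = '<html><head><script>if(dadadd\n;</script></head>' +
            '<body>some content</body></html>';
var clean = stripScripts(dirty);

// In a PhantomJS script, the cleaned source could then be assigned with:
// page.content = clean;
```

As the answer notes, relative links in the cleaned source would still need rewriting after the assignment.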

Javascript to disable onload function inside an iframe

I have an iframe on my web page whose source URL is a remote web page I don't have access to. The problem is that the body onload event of that page runs some JS:
<body onload="if(top!=self) top.location.href=location.href">
This changes my web page to their web page's URL. Is there any way I can use JavaScript on my web page to disable or rewrite the body onload function in that iframe? I think that is different from the iframe onload event.
The only way to even try to bust this would be to use server-side code (PHP, Ruby, Python, etc.) to essentially proxy their page into an iframe that you control. You could grab the source of their document and do a replacement to get rid of their onload event server-side.
You'd want to set up a structure something like:
AppFolder
    index.html
    your_iframe.php
In your iframe script, simply download the contents of the URL, do the replacement, and echo the result back out to the browser. There are still a number of issues that you'd have to work out, such as relative resource paths in the iframed document.
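
The replacement step might look like the following. The answer suggests a server-side language such as PHP; this sketch shows only the string replacement and is written in JavaScript instead, so it would have to run in whatever server-side proxy you set up. `stripBodyOnload` is a made-up name, and the code assumes the onload attribute value is quoted.

```javascript
// Hypothetical helper: removes the onload attribute from the first <body>
// tag of an HTML string. The outer pattern skips over quoted attribute
// values so that a ">" inside a value does not end the tag early.
function stripBodyOnload(html) {
  var bodyTag = /<body\b(?:"[^"]*"|'[^']*'|[^>"'])*>/i;
  return html.replace(bodyTag, function (tag) {
    return tag.replace(/\s+onload\s*=\s*("[^"]*"|'[^']*')/i, '');
  });
}

var framed = '<body onload="if(top!=self) top.location.href=location.href">';
var safe = stripBodyOnload(framed);
```

This only removes the inline attribute; a frame-busting script placed elsewhere in the page would need its own replacement.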

inject javascript into an iframe for a portal application

I have an iframe that has a web application inside it. It expects that certain JavaScript functions will exist on the page that it can call. How can I inject these JavaScript functions into the iframe from my parent application?
Your question is a little vague on details, such as whether you control the content inside the iframe or not. But there are a number of ways to go about accessing/applying Javascript between frames.
In the page contained within the iFrame:
parent.FunctionName();
This will call a function that exists within your main page that contains the iframe.
Similarly:
YourIFrameName.FunctionName();
Will call a function in your iframe from the parent.
You can also package the needed JavaScript functions into a .js file and include it in the header of whatever page needs them (the iframe and/or the main page).
Include this in your <head>:
<script type="text/javascript" src="YourJavascriptFile.js"></script>
However, if you do not control the contents, and run into the same origin policy, you have two options:
1) Rethink your application.
2) A workable mess: You would need to call a script from the iframe that does some cURL type magic to pull the page contents of the included web app, inject the needed Javascript, and then output the altered contents in a meaningful way.
If you decide you need to go the route of #2, I can edit with more specifics.
As Robert mentioned, I think that violates the same origin policy in most (if not all) browsers.
Alternatively, instead of trying to inject the functions into the iframe, why not have the iframe content reference them directly from the iframe's parent?
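
A small sketch of that alternative, assuming the frame and its parent are same-origin. The names `portalApi`, `showMessage`, and `callParent` are invented for illustration; `win` stands for the iframe's own window object, passed in explicitly.

```javascript
// Hypothetical setup: the parent page defines the shared functions once, e.g.
//   window.portalApi = { showMessage: function (text) { alert(text); } };
// Code inside the iframe then looks them up on its parent instead of
// expecting them to be injected into the frame.
function callParent(win, name, args) {
  var api = win.parent && win.parent.portalApi;
  if (!api || typeof api[name] !== 'function') {
    throw new Error('parent does not provide ' + name);
  }
  return api[name].apply(api, args || []);
}

// Inside the iframe (browser only):
// callParent(window, 'showMessage', ['hello from the frame']);
```

If the two pages are on different origins, the lookup on `win.parent` is blocked by the browser, which is the same-origin problem described above.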

URL Hash modification after document.write()

I download a whole HTML web page via jQuery AJAX. I want to replace the content of the current page with the one downloaded via AJAX. I do it with document.write(). It doesn't work correctly, because whenever I try to modify the hash, the web page is reloaded.
I know that in IE an iframe is necessary, but that is not the problem, because I use the jQuery History plugin. The problem is due to the use of document.write(), but I don't know why.
Update:
index.php -> main entry point, which downloads JS code to parse URL after hash and invoke request.php.
request.php -> request entry point. It returns the webpage.
It works OK when I simulate a direct request to request.php and the downloaded webpage updates the hash.
It doesn't work (in Firefox only) when I simulate an original request to index.php, which downloads the web page via request.php, and the downloaded page modifies the hash.
I use document.write() to write the content of the webpage to the current window. So the problem is about the modification of the hash in a document "being written".
Don't use document.write().
Instead, use $('your selector').html(your_html_fetched_via_ajax);
I think you can't replace the whole html element, because that means erasing the reference to the JavaScript script tag. I would say your best bet is to either just link to the request.php page or just change the body tag:
$('body').html(response_html);
And I agree with harshath.jr, don't use document.write().
The individuals pointing you towards an iframe are correct. Add the iframe, and simply set the src attribute to the page you're fetching...you won't even need request.php.
If you really want to try to load in the HTML without an iframe, you'd have to parse out the elements in the head and add them to your document's <head>, and also parse the contents of the <body> and add them to the current page's body. It's not guaranteed to display correctly, though. I think an iframe is really what you're looking for.
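
A rough sketch of that parsing step, assuming the fetched page arrives as one string (called response_html in the answers above). `splitDocument` is a hypothetical helper built on simple regular expressions, so it will miss unusual markup.

```javascript
// Hypothetical helper: pulls the inner markup of <head> and <body> out of a
// full HTML document string. A missing section yields an empty string.
function splitDocument(html) {
  function inner(tag) {
    var re = new RegExp('<' + tag + '\\b[^>]*>([\\s\\S]*?)</' + tag + '\\s*>', 'i');
    var m = re.exec(html);
    return m ? m[1] : '';
  }
  return { head: inner('head'), body: inner('body') };
}

// In the page (browser with jQuery):
// var parts = splitDocument(response_html);
// $('head').append(parts.head);
// $('body').html(parts.body);
```

Because the current document and its script tags stay in place, this avoids the document.write() problem, though styles and scripts from the fetched head may still conflict with the host page.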
