I'm using CasperJS to scrape a website. The page source has a <noscript> tag, and instead of the page I need to scrape I get its fallback content, which claims I don't have JavaScript enabled.
javascriptEnabled is true by default in CasperJS, but I added it to my initialization anyway, to no avail.
Are there any workarounds for this issue? It might also be an issue with PhantomJS...
OK, this issue has been fixed. Here is what I did, in case anyone has questions. The HTML was rendered by JavaScript that took a long time to load. Open the page as you normally would in a browser and find an element that only appears once the JavaScript has loaded. Note that View Source doesn't work for this; you have to use Inspect Element, which shows the current DOM.
I then did:
casper.waitForSelector('.SOME_CLASS', function() {
    this.echo(this.getHTML('.SOME_CLASS'));
    this.echo(this.getElementInfo('.SOME_CLASS').text);
});
This makes the script wait until the JavaScript has rendered that element before it continues.
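If the element never appears, waitForSelector eventually times out. A minimal sketch of handling that case follows; the helper name echoWhenRendered and its parameters are made up for illustration, and casper is assumed to be an already-created CasperJS instance. It relies on the optional onTimeout and timeout arguments of waitForSelector.

```javascript
// Hypothetical helper: `casper` is a CasperJS instance created elsewhere,
// `selector` matches an element that only exists after the page's scripts run.
function echoWhenRendered(casper, selector, timeoutMs) {
  casper.waitForSelector(
    selector,
    function onFound() {
      // Runs once the selector is present in the live DOM.
      this.echo(this.getHTML(selector));
    },
    function onTimeout() {
      // Runs if the selector has not appeared within timeoutMs.
      this.echo('Timed out waiting for ' + selector);
    },
    timeoutMs
  );
}
```

Something like echoWhenRendered(casper, '.SOME_CLASS', 10000) would then replace the bare waitForSelector call, giving slow pages up to ten seconds.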
Hello. I'm fixing some scripts on a website that was made by someone else.
First I need to look through all of the existing scripts, but Chrome DevTools doesn't show all of the source code inside a script tag.
I tried to copy it, but what I copied still contained "...".
I searched the whole page source for a keyword and found it in the hidden part, so Chrome probably has the full script and just isn't showing it to me.
How can I see the whole script?
Double-click the ellipsis or single-click the arrow to expand the script region.
You could also right-click the node and copy its outer HTML, or edit the node as HTML inline. Explore the other options available in the context menu.
NOTE: If you are editing an existing web page, you should probably start with the source code for the site and use an HTML code editor to edit the page and scripts. Otherwise any changes you make won't fix the site for all users; the changes will be just for you.
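As an alternative to expanding nodes by hand (not something the answer above describes), the inline script text can also be read from the DOM in the DevTools console. A minimal sketch, with inlineScriptSources being a made-up helper name:

```javascript
// Returns the full text of every inline <script> in `doc`.
// External scripts (those with a src attribute) have no inline text and are skipped.
function inlineScriptSources(doc) {
  var scripts = doc.getElementsByTagName('script');
  var sources = [];
  for (var i = 0; i < scripts.length; i++) {
    if (!scripts[i].src) {
      sources.push(scripts[i].textContent);
    }
  }
  return sources;
}
```

In the console, copy(inlineScriptSources(document).join('\n')) should put the untruncated text on the clipboard.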
First time poster. Currently trying to work on a project but I am having an issue with iframes. I'm making an extension for Chrome, and part of its functionality right now is to be fetching any shapes/borders on a page. Unfortunately, that also includes ones within iframes. I'm currently stuck because I cannot get around security policies from Cross-Domain sources.
I was wondering, however, if it is possible to create a function that displays what the Chrome console does when I hit "inspect element" on a page... I tried to make a console function called "save" that would automatically download a file of anything output to the log, but the output for the iframe's HTML is blank except for its tag...
When I am inspecting an element on the page, I can see the contents of that iframe just fine. Is there any way to just get the actual text from Inspect Element and store that? I know this may be silly, but I genuinely have no idea. I am pretty new to JavaScript. I just need to match up script tags for shapes for part of the extension's functionality.
I'm assuming that this is going to end up being impossible, but I figured I'd ask. I'm also assuming that this inspect element functionality is something that only the browser can work with. But hey, maybe there's a way. Thanks for any help.
I use a userscript to modify the client-side code of a website. This code is adding an anchor tag to the page. Its target is _blank. The thing is that if I click this link too frequently, the site errors. A simple refresh on the new tab fixes the problem.
When I click on the link, it instantly opens a new tab. But I don't want that new tab to render until I visit it, or until some time delay has passed. Is there a way of achieving this?
I am using Firefox, so Firefox-only solutions are fine. I found this, but I don't see a way of using it to prevent the tab from rendering in the first place. When I Google for this, I see results about add-ons that can solve the problem. But, the links to them always 404. Ideally, the solution would only affect the tabs created by this script instead of the way all tabs work, but if the only way to do it is to affect the way all tabs work, I'd accept that as a solution.
The Tampermonkey documentation says there is a GM_openInTab function. It has a parameter called loadInBackground, but it only decides if the new tab is focused when you click the link.
If there is a way of making this new tab render some HTML of my choosing, I think that would be a neat solution, i.e., I'd write some HTML that, on focus, goes to the actual website's page. If this is an option, I'd need to know how to open a tab to HTML of my choosing in Greasemonkey.
(This is just an implementation of the idea you described in your question.)
You can host a simple page somewhere that waits for focus and then redirects to whatever you pass in the URL's query string, and open that page in background tabs. For example:
load-url-from-search-on-focus.html?http://example.com:
<!doctype html>
<body
onload="document.title=u=location.search.slice(1)"
onfocus="u?document.location.replace(u):document.write('?search missing')">
Try it.
(A data: URI could have been used instead of a hosted page, if browsers did not block top-level navigations to data: URIs as a security precaution :|)
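On the userscript side, the link then needs to point at the holder page rather than at the real target. A minimal sketch, assuming the holder page is hosted at a placeholder address (HOLDER_PAGE below is hypothetical) and that the opener passed in behaves like Tampermonkey's GM_openInTab(url, loadInBackground):

```javascript
// Hypothetical URL of the hosted holder page; replace with your own.
var HOLDER_PAGE = 'https://example.org/load-url-from-search-on-focus.html';

// Builds the holder URL. The target goes after "?" unencoded, because the
// holder page reads it back verbatim with location.search.slice(1).
function holderUrlFor(target) {
  return HOLDER_PAGE + '?' + target;
}

// `openInTab` is expected to behave like GM_openInTab(url, loadInBackground).
function openDeferred(target, openInTab) {
  return openInTab(holderUrlFor(target), true);
}
```

In the userscript this would be called as openDeferred(link.href, GM_openInTab). One limitation of passing the target unencoded: a target containing "#" would lose its fragment, since the fragment is not part of location.search.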
I'm building an online document portal that supports all Microsoft Office formats.
Instead of building my own module, I'm using the Google Docs online viewer, since it already handles
this task properly. My only problem is that it loads the header toolbar, which I don't want.
Take for example this URL (I just googled for any document). The navigation toolbar at the foot is fine, but I want the header toolbar hidden, all within the iframe.
https://docs.google.com/viewer?url=http://www.scorpioncomputerservices.com/Press%20Coverage/Billgates.doc&embedded=false&controls=false
After inspecting the element in Chrome, I found the section of code controlling it. The problem is how to hide this element on page load by forcing a script or style to be executed on the page while it is loading.
I would like to know if there's a way I could force-delete or hide the element controlling the toolbar within the iframe, or, better still, whether there are any alternatives to what I intend to do. My code would have looked like this:
var obj = iframe.contentDocument.querySelector('[role="toolbar"]');
obj.parentNode.removeChild(obj);
// or - I'm not sure any of this would work, and since it is loaded inside an iframe,
// how do I execute this?
obj.remove();
I don't want my audience to be able to download the document. Obviously, curious developers might find a way, but that's going to be less than 2% - 5% of the total users.
How do I go about this, please, using JavaScript, CSS, or any library?
If you change the GET parameter embedded to true, the viewer won't display the top bar. However, there's no way to edit the page inside the iframe: it is served from Google's domain, so the browser's cross-origin protection will prevent you from running any JavaScript to modify its content.
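A minimal sketch of building such a viewer URL for an iframe's src follows. The function name viewerUrl is made up for illustration, and the base address and parameter names are taken from the URL in the question; whether the top bar is actually hidden is up to the viewer.

```javascript
// Builds a Google Docs viewer URL with embedded=true for a given document URL.
function viewerUrl(documentUrl) {
  // encodeURIComponent keeps characters in the document URL (spaces, &, ?)
  // from being misread as part of the viewer's own query string.
  return 'https://docs.google.com/viewer?url=' +
    encodeURIComponent(documentUrl) +
    '&embedded=true';
}
```

The result would then be assigned to the iframe, for example iframe.src = viewerUrl(myDocumentUrl).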
The only way to use the Google document viewer with modifications is to have your site load it in the background (not in an iframe) and modify it before serving the page to the user.
Alternatively, I recommend using an open source JS PDF viewer such as ViewerJS.
The client's website has product listings. The prices for the product are pulled dynamically in through an iFrame at the bottom of the page. There is Javascript on the page that automatically resizes this iFrame to the correct height based upon how big the iFrame content is, once it's loaded.
The client is reporting that when printing the page, they cannot see anything from the iFrame where the prices should be - apparently it is not printing in IE, just the main page itself.
I am on a Mac and so can't test in IE, so I'm having a hard time experimenting with this.
Can anyone clarify the expected behaviour in this situation? Is it possible to get IE to print both page and included iFrames by default, and if so, how would I go about doing this? I can only find examples for printing a specific frame from a parent window.
Thanks!
The expected behaviour should be what you're experiencing in other browsers. If the page is printed, the iframe should be printed along with it. It would be difficult to imagine that everyone else got it wrong and IE got it correct in this instance.
Below is a bit of speculation on what the issue might be, but without knowing more/seeing code it's difficult to know the specifics:
This issue could be due to some CSS that you may have on your page. I've read of similar iframe printing issues where the visibility was initially set to hidden, resulting in the iframe not printing correctly. To get around this specific case the user had to set the width and height to 0px instead. Without knowing more about your site, I cannot say whether this is what is happening.
Another issue may be your dynamic resizing based on the contents of the iframe. A simple test would be to comment that section out and set a generic width and height on the iframe to see if the printing issue still occurs. Perhaps those dynamic styles are not being carried over to the print stylesheet and are not getting applied (so the iframe does not appear at all).
As a quick suggestion, look into css media types:
print
Intended for paged material and for documents viewed on screen in
print preview mode. Please consult the section on paged media for
information about formatting issues that are specific to paged media.
Helpful link: Print Specification
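Combining the two suggestions above, one thing to try is a print-only rule that gives the iframe a fixed size regardless of what the resize script did on screen. This is an untested sketch, not a confirmed fix for IE; the function names, the selector, and the dimensions are all placeholders.

```javascript
// Builds a print-only CSS rule giving the matched iframe a fixed size.
// The selector and dimensions are placeholders, not values from the client's site.
function buildPrintRule(selector, widthPx, heightPx) {
  return '@media print { ' + selector + ' { ' +
    'width: ' + widthPx + 'px !important; ' +
    'height: ' + heightPx + 'px !important; } }';
}

// Appends the rule to the page; `doc` is the parent page's document.
function addPrintRule(doc, cssText) {
  var style = doc.createElement('style');
  style.appendChild(doc.createTextNode(cssText));
  doc.head.appendChild(style);
  return style;
}
```

For example, addPrintRule(document, buildPrintRule('iframe', 700, 900)) on the parent page. The same rule could equally be written directly into a print stylesheet.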
This was an interesting point, so I did a test using IE8 (on a server, not locally).
I printed in IE8 a web page that included an iframe of something that I built. And it printed some of the contents the first time (the other contents showed up black). The second time I printed, the iframe contents were all black.
However, in my example, the contents in the iframe are changing constantly (images and text that fade in and out) and the css background behind it is black.
This test has the contents of the iFrame on a different host server than the contents of the main page. But to my knowledge, there is a cross-domain policy file working here.
Cross-domain policy issues were my first guess, but it's entirely possible there is some issue with how Internet Explorer renders the screenshot when it sends it to the printer.
If you are using JavaScript, why not try the window.print() function along with print media CSS?
I can't explain why IE isn't working, but maybe you can fix the problem by adding this code to the parent page, in order to force each iframe to refresh:
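$(document).ready(function() {
    // Note: $.browser was removed in jQuery 1.9, so this check needs
    // an older jQuery or the jQuery Migrate plugin.
    if ($.browser.msie) { // Only for IE
        $('iframe').each(function() {
            // Re-assigning src forces the iframe to reload.
            $(this).attr('src', $(this).attr('src'));
        });
    }
});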
To detect the browser, I use this method.
I don't use the contentDocument.location.reload(true) method, so as to be sure the iframe is refreshed. See SO topic.
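For pages without jQuery or without $.browser, the same idea can be sketched in plain JavaScript. This swaps the $.browser.msie check for a test on document.documentMode, on the assumption that this property is only defined in Internet Explorer; the function names are made up for illustration.

```javascript
// Re-assigns each iframe's src so the browser reloads it.
// `doc` is the parent page's document; returns the number of iframes touched.
function refreshIframes(doc) {
  var frames = doc.getElementsByTagName('iframe');
  for (var i = 0; i < frames.length; i++) {
    frames[i].src = frames[i].src;
  }
  return frames.length;
}

// Assumption: document.documentMode is only defined in Internet Explorer.
function isInternetExplorer(doc) {
  return typeof doc.documentMode === 'number';
}
```

It would be called once the parent page has loaded, along the lines of if (isInternetExplorer(document)) refreshIframes(document);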
Try this plugin; it should solve your problem:
http://projects.erikzaadi.com/jQueryPlugins/jQuery.printElement/