Finding where links are - javascript

When I go to this webpage, I see green buttons with the text "信息公开". My task is to collect the link behind each of these green buttons. So if there are ten buttons, I need all ten links.
However, I cannot find the text "信息公开" in the page when I download it in Chrome. I suspect that some JavaScript is executed to load the information related to "信息公开". Indeed, when I use Chrome to inspect the green buttons, I find information that is not in the HTML files I downloaded.
How can I find out where the links are?

You have two JavaScript-based options:
a) Use a headless browser like PhantomJS to scrape the site for the links. Because it runs the page's scripts, the JavaScript-loaded content should not be a problem. This is the way to go if you want to automate the scraping (for example, running it daily and posting the links somewhere).
b) Much simpler, but not as automatic: use jQuery in the Chrome console to build a selector that matches all the links. For example, this piece of code prints the links of the yellow community box on the right side of Stack Overflow:
$('.community-bulletin a').each(function(){console.log($(this).attr('href'))})
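If the page itself does not load jQuery, the same idea works with plain DOM calls. This is only a minimal sketch: it assumes the green buttons are `a` elements whose visible text contains "信息公开", and the function name `collectLinks` and the `'a'` selector are illustrative, not taken from the page in question.

```javascript
// Collect the href of every element under `root` that matches `selector`
// and whose visible text contains `text`.
// `root` is normally `document`; anything exposing querySelectorAll works.
function collectLinks(root, selector, text) {
  return Array.from(root.querySelectorAll(selector))
    .filter((el) => (el.textContent || '').includes(text))
    .map((el) => el.href);
}

// In the Chrome console, once the page has finished loading:
// console.log(collectLinks(document, 'a', '信息公开'));
```

Run it from the console after the buttons have appeared, since the links only exist once the page's scripts have inserted them.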

Related

Using puppeteer as a semi-automated system

I've got a certain task which I would normally accomplish with a Chrome extension.
The only thing I cannot do with a Chrome extension is take a screenshot of a node. I've been trying to use the debugger API, but I couldn't make it capture a proper screenshot of an element that extends beyond the viewport (even using the captureBeyondViewport property and some other tricks).
So my goal right now is to make Puppeteer a semi-automated tool. It should open a browser and work in a way similar to Chrome extensions. That means I have to be able to run certain code on any tab of the Puppeteer browser instance, just like content scripts do. Once something happens (for example, a click on a certain element), Puppeteer should perform certain actions (e.g. take a screenshot of the node or a full-page screenshot). It must be able to cover multiple tabs and act only on the tab where the triggering event happened.
If anyone can give me a good option to try with a Chrome extension, that would be even better. The goal is to be able to capture a screenshot of an element even if it's big enough to extend beyond the viewport.
PS: html2canvas isn't an option because the screenshots it produces of the website I need to work with lack images and generally look a bit weird.
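One possible shape for the Puppeteer side, as a rough sketch rather than a tested solution: expose a Node-side callback to each page, install a click listener before the page's own scripts run, and screenshot the clicked node from Node. The names `instrumentPage`, `instrumentBrowser`, `__reportClick` and the `data-shot-target` attribute are made up for illustration, and whether `elementHandle.screenshot()` copes with nodes larger than the viewport on your site would still need checking.

```javascript
// Sketch only: `page` is a Puppeteer Page, `outDir` an existing directory.
async function instrumentPage(page, outDir) {
  // Node-side callback bound to this page, so the screenshot is taken
  // only in the tab where the click happened.
  await page.exposeFunction('__reportClick', async () => {
    const node = await page.$('[data-shot-target]');
    if (!node) return null;
    const path = `${outDir}/shot-${Date.now()}.png`;
    await node.screenshot({ path });
    // Clear the marker so the next click starts clean.
    await node.evaluate((el) => el.removeAttribute('data-shot-target'));
    return path;
  });

  // Browser-side listener, installed on every document loaded after this
  // call (an already loaded document needs a reload to pick it up).
  await page.evaluateOnNewDocument(() => {
    document.addEventListener('click', (event) => {
      event.target.setAttribute('data-shot-target', '');
      window.__reportClick();
    }, true);
  });
}

// Instrument the tabs that are already open and every tab opened later.
async function instrumentBrowser(browser, outDir) {
  for (const page of await browser.pages()) await instrumentPage(page, outDir);
  browser.on('targetcreated', async (target) => {
    if (target.type() !== 'page') return;
    const page = await target.page();
    if (page) await instrumentPage(page, outDir);
  });
}

// Usage (assumes the puppeteer package is installed):
// const puppeteer = require('puppeteer');
// const browser = await puppeteer.launch({ headless: false, defaultViewport: null });
// await instrumentBrowser(browser, './shots');
```

Marking the clicked node with an attribute is just one way to hand it from the page to Node; it avoids having to compute a unique CSS selector for an arbitrary element.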

Display PDF like on web

I am trying to display a PDF file on the web without offering a download or copy option.
Then I found this: https://books.google.co.in/books?id=kwBvDwAAQBAJ&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false
Can you tell me how I can achieve this on my website?
What mplungjan said in his comment is correct. Anything that is put on the web can be copied one way or another. It appears that the Google site you linked to just shows an image of each page (see https://books.google.co.in/books/content?id=kwBvDwAAQBAJ&pg=PP1&img=1&zoom=3&hl=en&sig=ACfU3U0s8V3HjcApLeNwIGStMQlzZFaotA) with transparent pixels laid over each image so that you can't right-click to save it. But it's easy to see what they're doing by viewing the source in the inspector.
If you don't want your users to be able to download the entire file, you could break it up into multiple small files (or images, like Google is doing in your link), which would make it a little harder for them to get the files. But you can't really stop them from downloading anything.

Disable toolbar in PDF Web View Element to disable Download and Print

My issue is that I have to deploy on a local server (without internet), so I cannot use Google Doc Viewer in this case. All I want is to prevent the user from downloading or printing the document. I have tried hiding or removing the toolbar in JS, but it is not working.
You may be able to disable the toolbar somehow, but that isn't good enough to keep users from downloading or printing it anyway, and nothing you can do will be. If a person can see something, they can copy it, no matter what you try to do to stop them (and all trying will do is inconvenience legitimate users). Previous similar questions:
How to prevent downloading images and video files from my website?
disable downloading of image from a html page
https://graphicdesign.stackexchange.com/questions/39462/is-it-possible-to-prevent-download-of-images-when-designing-a-website
Although those talk about images, the exact same reasoning applies to PDFs.

Overlay a frame on every webpage a visitor visits?

So I want to be able to have a space that overlays content on any website with the click of a button (something that also sits above everything else on a web page). An example would be the Google Translate page, http://translate.google.com/translate?u=about%3Ablank&hl=en&langpair=auto|en&tbb=1&ie=UTF-8 where the frame at the top overlays any website that is entered in the URL box.
What I want to do is have a box like this overlay every webpage, like Google Translate does, but have it hide with a click on a floating image, say an arrow.
The files will be stored locally on my HDD, but I don't see this being an issue.
I don't know what language to code this in. I assume JavaScript, but I do not know which classes to call to do this. Any advice, chaps? I'm not asking for a handout, just a point in the right direction!
It looks like you want to develop a browser extension. Look here for Chrome:
http://code.google.com/chrome/extensions/getstarted.html
There are similar ways to do it for IE, Firefox, and Safari.
It sounds like you'll either need to use frames or an iframe. They are very similar in how you interact with them (say to make them load a new page) although they are different in their implementation.
A great site for learning about frames is W3Schools:
http://w3schools.com/html/html_frames.asp
http://w3schools.com/html/html_iframe.asp
You can use JavaScript to reference a frame by its name or its ID. Ex: window.frames['framename'].location = 'hello.php' or document.getElementById('frameId').src = 'hello.php'.
One problem with using frames is that search engines don't like them. If you are using an iframe, the search engine will search your page, but still not the iframe.
As for resizing/hiding the frame/iframe, you can do that with both frames and an iframe, although the method for accomplishing it varies depending on what you use.
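For the iframe case, loading a page and hiding or showing the frame from a click could be sketched as follows. This is a minimal sketch under assumptions: the helper names `loadInFrame` and `toggleFrame` and the element ids in the comments are hypothetical, and `frame` is whatever `document.getElementById` returns for your iframe.

```javascript
// Point an iframe element at a new page.
function loadInFrame(frame, url) {
  frame.src = url;
}

// Hide the frame if it is visible, show it again otherwise.
// Returns true when the frame is visible after the call.
function toggleFrame(frame) {
  const wasHidden = frame.style.display === 'none';
  frame.style.display = wasHidden ? '' : 'none';
  return wasHidden;
}

// Typical wiring in a page (both ids are placeholders):
// const frame = document.getElementById('frameId');
// loadInFrame(frame, 'hello.php');
// document.getElementById('arrow').addEventListener('click', () => toggleFrame(frame));
```

Setting `display` back to an empty string lets the stylesheet decide how the frame is shown, rather than hard-coding `block` or `inline`.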

Can I programmatically save part of a web page as an image?

If possible, I would like to do this with a simple button. The users are not terribly comfortable with computers, which is why I haven't just told them to print screen or use the snipping tool.
I know it can be done in Mozilla-based browsers with <canvas> and drawWindow(). But this application is running on Internet Explorer 7 and 8.
The page shows some graphs (generated by a ReportViewer control) based on the input of a couple of dropdowns. Does that mean a client-side script is the only option? Or could I do it in the ASP.NET back end somehow? Perhaps re-generating an image any time the dropdowns are changed?
(I've been a desktop dev for so long that I don't quite "get" what you can and can't do in web apps yet.)
From what I understand, you've got some dropdowns and you're generating a graph based on what the user selects in them?
If I were doing this with PHP (just trying to give you ideas here; I don't know what's possible and what's not in ASP.NET), I would write an ImageMagick or GD library script that builds a JPEG from the variables passed in a query string.
For instance, this would output a JPEG image of a simple graph with 3 points on it:
http://mydomain.org/image.php?value1=10&value2=20&value3=30
Then for the front end I would probably use jQuery/AJAX to call that script and show the image as the user changes the values.
That gives you an image that you can either force to download or instruct users to right-click and choose "Save as".
Anyway, this is just an idea, not a solution. I don't know about ASP.NET, but this is how I would do it in PHP.
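On the front end, building that image URL from the current dropdown values could look roughly like this. It is a minimal sketch: the function name `buildGraphUrl` is made up, the parameter names simply follow the example URL above, and the element ids in the comments are placeholders.

```javascript
// Build the image URL from the current dropdown values.
// Parameters are named value1, value2, ... in the order given.
function buildGraphUrl(baseUrl, values) {
  const params = new URLSearchParams();
  values.forEach((value, index) => {
    params.append(`value${index + 1}`, String(value));
  });
  return `${baseUrl}?${params.toString()}`;
}

// In the page, refresh the image whenever a dropdown changes:
// const img = document.getElementById('graph');
// img.src = buildGraphUrl('http://mydomain.org/image.php', [10, 20, 30]);
```

Because the graph is then an ordinary image with its own URL, the same URL can back a plain download link for users who are not comfortable with right-clicking.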
