We need a module that takes only a URL and returns a screenshot/image of that page without opening it visibly, so the whole thing happens in the background.
All the tools I've read about here either take a specific div element as input or redirect to the page.
It should also work with a URL that redirects, so:
function getImageOfPage(url) {
    // get the HTML page of this URL (don't show it)
    // wait for the page to load all photos
    // create a screenshot image
    return img;
}
There's a solution for this using Node.js. You need the webshot module (npm i webshot).
Here's some sample code:
const webshot = require('webshot');

webshot('https://example.com', 'img_name.png', function(err) {
    if (!err) {
        console.log("Screenshot taken!");
    }
});
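Note that webshot is fairly old; if it gives you trouble, a headless-browser approach with Puppeteer does the same job in the background, follows redirects automatically, and can wait for network activity to settle before capturing. A minimal sketch, assuming Puppeteer is installed (the URL and file name are placeholders, and waiting for networkidle0 is just one way to approximate "all photos loaded"):

const puppeteer = require('puppeteer'); // npm i puppeteer

async function getImageOfPage(url, outPath) {
    const browser = await puppeteer.launch(); // headless: nothing is shown on screen
    const page = await browser.newPage();
    // 'networkidle0' resolves once there have been no network connections
    // for 500 ms, which approximates "all images finished loading";
    // redirects are followed automatically along the way
    await page.goto(url, { waitUntil: 'networkidle0' });
    await page.screenshot({ path: outPath, fullPage: true });
    await browser.close();
}

getImageOfPage('https://example.com', 'img_name.png');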
What I'm hoping to accomplish: if there isn't a file for a particular location, hide the link; if there is, show it.
I have this piece of HTML:
<p>[[C1:event_location]]
<br />[[C1:event_location_city]], [[C1:event_location_state]], [[C1:event_location_country]]
<br />Google Map
| Route Map
| 1km Route Map
| 5km Route Map
</p>
What's happening here is that the Luminate platform uses its own custom tags (the E130 and C1:event_title tags you see there) to build the href linking to the FTP folder, pointing at a PDF file whose name starts with Route_map_, Route_map_1k_, or Route_map_5k_ and ends with the name of the location. If the PDF file is there and you click the link, it loads the PDF; if there isn't a file with that name, it loads a page-not-found page.
Ex:
I'm on my website, looking at the location Abbotsford. In the FTP folder I have a file called ROUTE_MAP_1K_Abbotsford.pdf, but I do NOT have a file called ROUTE_MAP_5K_Abbotsford.pdf.
If I click the 1km link, it loads the PDF file; if I click the 5km link, it loads a page-not-found page because the PDF doesn't exist. So because the "5km Route Map" file doesn't exist, that element should be hidden.
What I'm hoping is possible is some JS/jQuery way of checking whether the file exists and, if it doesn't, hiding the link.
EDIT: I realize after seeing some answers that I worded the question wrong. What is happening is that if I hover my mouse over the 5km Route Map link, the browser shows (at the bottom) examplewebsite.com/documents/2018Maps/ROUTE_MAP_5K_Abbotsford.pdf even though the file isn't actually there, because the Luminate code renders that URL regardless. If you click the link, it takes you to examplewebsite.com/site/PageServer?pagename=page_not_found_Run (you can still navigate the site; it doesn't give a real server 404 error). Is there a way to check whether the file actually exists and, if it doesn't, hide the link?
Thank you for your time!
I believe the code below is a bit slow, as it makes an AJAX request for every single one of your PDFs, but it will do the trick, since you seem to only be able to solve this from the client side and you only really need the fail/error case with 404:
$("a[href^='pdf']").each(function() {
var $currentLink = $(this);
$.ajax($currentLink.attr("href")).fail(function(jqXHR, textStatus, errorThrown) {
if(jqXHR.status == 404 || errorThrown == 'Not Found') {
$currentLink.remove();
}
});
});
Here's a JS check for whether a file exists or not:
function UrlExists(url) {
    // Synchronous HEAD request: blocks until the server responds.
    // (Synchronous XMLHttpRequest is deprecated in modern browsers.)
    var http = new XMLHttpRequest();
    http.open('HEAD', url, false);
    http.send();
    return http.status != 404;
}
This will check for a 404 status (file not found).
If you'd like it to return true only when the file DOES exist, change the check from http.status != 404 to http.status == 200.
jQuery solution:

$.get(url)
    .done(function() {
        // file exists
    })
    .fail(function() {
        // file does not exist
    });
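Given the edit in the question, note that a 404 check alone may never fire here: the missing file redirects to a "page not found" page that comes back with a normal status. A sketch of one way around that, using fetch to follow the request to its final URL (the page_not_found substring is an assumption taken from the URL quoted in the question):

// Hide PDF links whose target fails outright or redirects to the soft
// "page not found" page; 'page_not_found' is assumed from the
// pagename=page_not_found_Run URL in the question.
$("a[href$='.pdf']").each(function() {
    var link = this;
    fetch(link.href, { method: 'HEAD' }).then(function(response) {
        // response.url is the final URL after any redirects
        var soft404 = response.url.indexOf('page_not_found') !== -1;
        if (!response.ok || soft404) {
            $(link).hide();
        }
    });
});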
For additional context, look into:
How do I check if file exists in jQuery or pure JavaScript?
I have a dynamic page of which I want to take a screenshot. I am using phantom for it. The website is built on the MEAN stack.
Here's the code I am currently using:
var phantom = require('phantom');

phantom.create().then(function(ph) {
    ph.createPage().then(function(page) {
        page.open("http://localhost:3000/#!/dashboard/somedynamicpageid").then(function(status) {
            page.render('screenshot.png');
            page.close();
            ph.exit();
        });
    });
});
Now the problem is, I don't get a complete screenshot of the page. What I get in the screenshot is just some partially loaded portion of the page; the images, graphs, and textual data are still missing. How can I take the screenshot once my entire page has loaded (with all the images, dynamic text, and other async content)?
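One common workaround, sketched below rather than guaranteed for this particular dashboard: delay the render until the async content has had time to arrive, and let render() finish before closing the page. The 5000 ms delay is an assumption to tune for your page.

var phantom = require('phantom');

phantom.create().then(function(ph) {
    ph.createPage().then(function(page) {
        page.open("http://localhost:3000/#!/dashboard/somedynamicpageid").then(function(status) {
            // Give AJAX calls, images, and graphs time to finish loading;
            // 5000 ms is an arbitrary guess, adjust as needed
            setTimeout(function() {
                page.render('screenshot.png').then(function() {
                    page.close();
                    ph.exit();
                });
            }, 5000);
        });
    });
});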
I am writing a test that clicks a button, which opens a new tab and directs you to a new website. I want to read in that website's URL so I can parse the part after the rfp code in the page name. I then open a decoder site and use it to decode the URL, to be sure the decoded page name works properly.
The code I'm using:
this.switchesToGetQuotePage = function() {
    browser.getAllWindowHandles().then(function(handles) {
        newWindowHandle = handles[1]; // this is your new window
        browser.switchTo().window(newWindowHandle).then(function() {
            browser.getCurrentUrl().then(function(text) {
                console.log(text);
            });
        });
    });
};
When I call the getCurrentUrl function, it returns the following as the value:
data:,
Use the Protractor built-in getLocationAbsUrl() to get the URL of the current page if it's Angular based. Here's how:
browser.getLocationAbsUrl().then(function(url) {
    console.log(url);
});
However, if you are working on a non-Angular page, wait until the page has loaded (the URL changes through redirections until the final page is delivered to the client) and then use getCurrentUrl() on the page. Here's how:
var ele = $("ELEMENT_ON_NEW_PAGE"); // replace it with an element on the new page
browser.switchTo().window(newWindowHandle).then(function() {
    browser.wait(protractor.ExpectedConditions.visibilityOf(ele), 10000).then(function() {
        browser.getCurrentUrl().then(function(text) {
            console.log(text);
        });
    });
});
Hope it helps.
I've been debugging this for a long time and it has me completely baffled. I need to save ads to my computer for a work project. Here is an example ad that I got from CNN.com:
http://ads.cnn.com/html.ng/site=cnn&cnn_pagetype=main&cnn_position=300x250_rgt&cnn_rollup=homepage&page.allowcompete=no&params.styles=fs&Params.User.UserID=5372450203c5be0a3c695e599b05d821&transactionID=13999976982075532128681984&tile=2897967999935&domId=6f4501668a5e9d58&kxid=&kxseg=
When I visit this link in Google Chrome and Firefox, I see an ad (if the link stops working, simply go to CNN.com and grab the iframe URL for one of the ads). I developed a PhantomJS script that will save a screenshot and the HTML of any page. It works on any website, but it doesn't seem to work on these ads. The screenshot is blank and the HTML contains a tracking pixel (a 1x1 transparent gif used to track the ad). I thought that it would give me what I see in my normal browser.
The only thing that I can think of is that the AJAX calls are somehow messing up PhantomJS, so I hard-coded a delay but I got the same results.
Here is the most basic piece of test code that reproduces my problem:
var fs = require('fs');
var page = require('webpage').create();
var url = phantom.args[0];

page.open(url, function (status) {
    if (status !== 'success') {
        console.log('Unable to load the address!');
        phantom.exit();
    } else {
        // Output results immediately
        var html = page.evaluate(function () {
            return document.getElementsByTagName('html')[0].innerHTML;
        });
        fs.write("HtmlBeforeTimeout.htm", html, 'w');
        page.render('RenderBeforeTimeout.png');

        // Output results after a delay (for AJAX)
        window.setTimeout(function () {
            var html = page.evaluate(function () {
                return document.getElementsByTagName('html')[0].innerHTML;
            });
            fs.write("HtmlAfterTimeout.htm", html, 'w');
            page.render('RenderAfterTimeout.png');
            phantom.exit();
        }, 9000); // 9 second delay
    }
});
You can run this code using this command in your terminal:
phantomjs getHtml.js 'http://www.google.com/'
The above command works well. When you replace the Google URL with an ad URL (like the one at the top of this post), it gives me the unexpected results that I explained.
Thanks so much for your help! This is my first question that I've ever posted on here, because I can almost always find the answer by searching Stack Overflow. This one, however, has me completely stumped! :)
EDIT: I'm running PhantomJS 1.9.7 on Ubuntu 14.04 (Trusty Tahr)
EDIT: Okay, I've been working on it for a while now and I think it has something to do with cookies. If I clear all of my history and view the link in my browser, it also comes up blank. If I then refresh the page, it displays fine. It also displays fine if I open it in a new tab. The only time it doesn't is when I try to view it directly after clearing my cookies.
EDIT: I've tried loading the link twice in PhantomJS without exiting (manually requesting it twice in my script before calling phantom.exit()). It doesn't work. In the PhantomJS documentation it says that the cookie jar is enabled by default. Any ideas? :)
You should try using the onLoadFinished callback instead of checking for status in page.open. Something like this should work:
var fs = require('fs');
var page = require('webpage').create();
var url = phantom.args[0];

page.open(url);

page.onLoadFinished = function () {
    // Output results immediately
    var html = page.evaluate(function () {
        return document.getElementsByTagName('html')[0].innerHTML;
    });
    fs.write("HtmlBeforeTimeout.htm", html, 'w');
    page.render('RenderBeforeTimeout.png');

    // Output results after a delay (for AJAX)
    window.setTimeout(function () {
        var html = page.evaluate(function () {
            return document.getElementsByTagName('html')[0].innerHTML;
        });
        fs.write("HtmlAfterTimeout.htm", html, 'w');
        page.render('RenderAfterTimeout.png');
        phantom.exit();
    }, 9000); // 9 second delay
};
I have an answer here that loops through all files in a local folder and saves images of the resulting pages: Using Phantom JS to convert all HTML files in a folder to PNG
The same principle applies to remote HTML pages.
Here is what I have from the output:
Before Timeout:
http://i.stack.imgur.com/GmsH9.jpg
After Timeout:
http://i.stack.imgur.com/mo6Ax.jpg
I was searching Google for a JS library that can capture an image of any website or URL, and I came to know that the PhantomJS library can do it. Here is a small piece of code I found which captures the GitHub home page and converts it to a PNG image.
If anyone is familiar with PhantomJS, please tell me the meaning of this line:
var page = require('webpage').create();
Can I give any name here instead of webpage?
And if I need to capture only a portion of a webpage, how can I do that with the help of this library? Can anyone guide me?
var page = require('webpage').create();
page.open('http://github.com/', function () {
    page.render('github.png');
    phantom.exit();
});
https://github.com/ariya/phantomjs/wiki
Thanks!
Here is a simple PhantomJS script for grabbing an image:
var page = require('webpage').create(),
    address, output;

address = "http://google.com";
output = "your_image.png";
page.viewportSize = { width: 900, height: 600 };

page.open(address, function (status) {
    if (status !== 'success') {
        console.log('Unable to load the address!');
        phantom.exit();
    } else {
        window.setTimeout(function () {
            page.render(output);
            console.log('done');
            phantom.exit();
        }, 10000); // give the page 10 seconds to finish loading
    }
});
Where:
- address is your URL string.
- output is your filename string.
- width and height are the dimensions of the area of the site to capture (comment that line out if you want the whole page).
To run this from the command line, save the above as script_name.js and fire off phantomjs with the JS file as the first argument.
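For example, using the file name above:

phantomjs script_name.js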
Hope this helps :)
The line you ask about:
var page = require('webpage').create();
As far as I can tell, that line does three things: it loads a module (require('webpage')), creates a WebPage object in PhantomJS (.create()), and assigns that object to the variable page.
The name "webpage" tells it which module to load.
http://phantomjs.org/api/webpage/
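Other built-in PhantomJS modules are loaded the same way; for instance, the two that appear elsewhere on this page:

var system = require('system'); // command-line arguments, stdin/stdout
var fs = require('fs');         // file system access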
I too need a way to use page.render() to capture just one section of a web page, but I don't see an easy way to do this. It would be nice to select a page element by ID and render out just that element at whatever size it is. They should really add that in the next version of PhantomJS.
For now, my only workaround is to add an anchor tag to my URL http://example.com/page.html#element to make the page scroll to the element that I want, and then set a width and height that gets close to the size I need.
I recently discovered that I can manipulate the page somewhat before rendering, so I want to try to use this technique to hide all of the other elements except the one I want to capture. I have not tried this yet, but maybe I will have some success.
See this page and look at how they use querySelector(): https://github.com/ariya/phantomjs/blob/master/examples/technews.js
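For what it's worth, PhantomJS does expose a documented page.clipRect property that restricts render() to a rectangle, so pairing it with getBoundingClientRect() inside page.evaluate() can capture a single element. A sketch (the URL and the #element selector are placeholders):

var page = require('webpage').create();
page.open('http://example.com/page.html', function () {
    var rect = page.evaluate(function () {
        var r = document.querySelector('#element').getBoundingClientRect();
        // return a plain object: evaluate() passes results back as JSON
        return { top: r.top, left: r.left, width: r.width, height: r.height };
    });
    // Clip the render area to the element's bounding box
    page.clipRect = { top: rect.top, left: rect.left, width: rect.width, height: rect.height };
    page.render('element.png');
    phantom.exit();
});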