I want to take a screenshot of a JSP page in the browser. I have googled a lot, and everyone points to the java.awt.Robot functionality. It is great, but what I need is a screenshot of the full web page, including the content inside the scrollable area of the browser window. Moreover, I want only the web page content, not the status bar or the other tabs and menus of the browser. I have used the following code:
import java.awt.AWTException;
import java.awt.Dimension;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.Toolkit;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;
// iText 5 imports (adjust the package names for other iText versions)
import com.itextpdf.text.Document;
import com.itextpdf.text.Image;
import com.itextpdf.text.pdf.PdfWriter;

public class ScreenCapture {

    public void takeCapture() {
        try {
            // Capture the whole screen; Robot can only see what is
            // currently visible, not the scrolled-away page content.
            Robot robot = new Robot();
            String format = "jpg";
            String fileName = "D:\\PDFTest\\PartialScreenshot." + format;
            Dimension screenSize = Toolkit.getDefaultToolkit().getScreenSize();
            Rectangle captureRect = new Rectangle(0, 0, screenSize.width, screenSize.height);
            BufferedImage screenFullImage = robot.createScreenCapture(captureRect);
            ImageIO.write(screenFullImage, format, new File(fileName));

            // Wrap the screenshot in a single-page PDF.
            String input = "D:\\PDFTest\\PartialScreenshot.jpg";
            String output = "D:\\PDFTest\\PartialScreenshot.pdf";
            Document document = new Document();
            try {
                PdfWriter.getInstance(document, new FileOutputStream(output));
                document.open();
                document.add(Image.getInstance(input));
                document.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        } catch (AWTException | IOException ex) {
            System.err.println(ex);
        }
    }
}
Is there a way to take a screenshot of the full JSP web page as it is displayed in the browser (including the content inside the scrollable window), and then convert that screenshot into a PDF? Don't tell me to convert the page directly into a PDF using Flying Saucer, as that is not working in my case.
This is not possible in pure Java.
However, you can add the html2canvas library to your JSP page. You can then use JavaScript to submit the canvas image to your servlet and process it as you please.
See the following question and answer that deal with a similar problem: How to upload a screenshot using html2canvas?
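For illustration, here is a minimal sketch of the servlet side, assuming the page posts the canvas as a base64 data URL in a parameter named imageData (the servlet class, mapping, and parameter name are hypothetical):

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Base64;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical servlet that receives the html2canvas screenshot.
@WebServlet("/screenshot")
public class ScreenshotServlet extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // The page is assumed to post canvas.toDataURL("image/png"),
        // i.e. a string like "data:image/png;base64,iVBORw0...".
        String dataUrl = request.getParameter("imageData");
        String base64 = dataUrl.substring(dataUrl.indexOf(',') + 1);
        byte[] png = Base64.getDecoder().decode(base64);

        // Process it as you please, e.g. save it and then convert it
        // to PDF with the same iText code as above.
        try (OutputStream out = new FileOutputStream("D:\\PDFTest\\FullPage.png")) {
            out.write(png);
        }
    }
}

On the JSP side, html2canvas renders the page (including scrolled-away content) into a canvas, whose toDataURL() output can be posted to this servlet.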
I have created a list of all the page URIs I'd like to download an image from for a vehicle service manual.
The images are delivered via a PHP script, as can be seen here: http://www.atfinley.com/service/index.php?cat=g2&page=32
This is probably meant to deter behavior like mine; however, every single Acura Legend owner shouldn't have to depend on a single host for their vehicle's manual.
I'd like to design a bot in JS/Java that can visit every URL I've stored in this txt document (https://pastebin.com/yXdMJipq) and automate the download of the available PNG at each resource.
I'll eventually be creating a PDF of the manual and publishing it for open and free use.
If anyone has ideas for libraries I could use, or ways to delve into the solution, please let me know. I am most fluent in Java.
I'm thinking a solution might be to fetch the HTML document at each URL and download the image from its <img src> attribute.
I have written something similar, but unfortunately I can't find it anymore. Nevertheless, I remember using the JSoup Java library, which comes in pretty handy.
It includes an HTTP client, and you can run CSS selectors on the document just like with jQuery.
This is the example from their front page:
Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
Elements newsHeadlines = doc.select("#mp-itn b a");
Creating PDFs is quite tricky, but I use Apache PDFBox for such things.
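For example, a minimal PDFBox 2.x sketch that puts one downloaded PNG on its own PDF page (the file paths are placeholders):

import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

public class PngToPdf {
    public static void main(String[] args) throws Exception {
        try (PDDocument doc = new PDDocument()) {
            PDImageXObject image = PDImageXObject.createFromFile("page1.png", doc);
            // Size the page to the image so nothing is cropped or scaled.
            PDPage page = new PDPage(new PDRectangle(image.getWidth(), image.getHeight()));
            doc.addPage(page);
            try (PDPageContentStream content = new PDPageContentStream(doc, page)) {
                content.drawImage(image, 0, 0);
            }
            doc.save(new File("manual.pdf"));
        }
    }
}

Repeating the createFromFile/addPage steps in a loop over all downloaded pages would give you the whole manual in one file.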
I know you asked for a JavaScript solution, but I believe PHP (which you also added as a tag) is more suitable for the task. Here are some guidelines to get you started:
1. Move all the URLs into an array and create a foreach loop that will iterate over it.
2. Inside the loop, use the PHP Simple HTML DOM Parser to retrieve the image URL attribute for each page.
3. Still inside the loop, use the image URL in a cURL request to grab the file and save it into your custom folder. You can find the code required for this part here.
If this process proves to be too long and you get a PHP runtime error, consider storing the URLs generated by step 2 in a file, then using that file to build a new array and running step 3 on it as a separate process.
Finished solution for grabbing image URLs:
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.util.Scanner;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class Acura {

    public static void main(String[] args) throws IOException {
        try {
            File list = new File("F:/result.txt");
            Scanner read = new Scanner(list);
            Writer write = new FileWriter("F:/imgurls.txt");
            double s = 0;
            while (read.hasNextLine()) {
                try {
                    s++;
                    String url = read.nextLine();
                    // Fetch the page and grab the first image's absolute URL.
                    Document doc = Jsoup.connect(url).get();
                    Element img = doc.select("img").first();
                    String imgUrl = img.absUrl("src");
                    write.write(imgUrl + "\n");
                    // 2690 is the total number of URLs in result.txt.
                    System.out.println((s / 2690 * 100) + "%");
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            read.close();
            write.close();
        } catch (FileNotFoundException e1) {
            e1.printStackTrace();
        }
    }
}
Generates a nice long list of image URLs in a text document.
Could have done it in a non-sequential manner (see the sketch after the download snippet below), but I was heavily intoxicated when I did this. However, I did add a progress bar for my own peace of mind :)
// Response here is org.jsoup.Connection.Response
Scanner read;
try {
    File list = new File("F:/imgurls.txt");
    read = new Scanner(list);
    double s = 0;
    while (read.hasNextLine()) {
        try {
            s++;
            String url = read.nextLine();
            // Download the raw PNG bytes; ignoreContentType is needed
            // because the response is an image, not HTML.
            Response imageResponse = Jsoup.connect(url).ignoreContentType(true).execute();
            FileOutputStream writer = new FileOutputStream(new java.io.File("F:/Acura/" + (int) s + ".png"));
            writer.write(imageResponse.bodyAsBytes());
            writer.close();
            System.out.println((s / 2690 * 100) + "%");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    read.close();
} catch (FileNotFoundException e1) {
    e1.printStackTrace();
}
Worked for generating PNGs.
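A rough sketch of the non-sequential variant of the URL grabber mentioned above, assuming the same F:/result.txt input (the thread count and the decision to print rather than write to a file are arbitrary choices):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.jsoup.Jsoup;

public class AcuraParallel {

    public static void main(String[] args) throws Exception {
        List<String> urls = Files.readAllLines(Paths.get("F:/result.txt"));
        ExecutorService pool = Executors.newFixedThreadPool(8); // arbitrary thread count
        for (String url : urls) {
            pool.submit(() -> {
                try {
                    // Same JSoup lookup as the sequential version.
                    String imgUrl = Jsoup.connect(url).get().select("img").first().absUrl("src");
                    // Printing instead of writing to a shared Writer, since
                    // concurrent writes would need synchronization; output
                    // order is not guaranteed.
                    System.out.println(imgUrl);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}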
Hi, I have a variable contentAll whose contents I am writing out as a PDF on the server and mailing to users. They want to see it open in a browser.
Server-side C# code is as below:
List<int> contentAll = new List<int>();
// contentAll = [01, 02, 03, ...] - just a sample
byte[] base64EncodedStringBytes = Encoding.ASCII.GetBytes(Convert.ToBase64String(contentAll.SelectMany<int, byte>(BitConverter.GetBytes).ToArray()));
return Ok(base64EncodedStringBytes);
jQuery code is as below:
var pdfWin = window.open("data:application/pdf;base64," + response, "_blank", 'height=650,width=840');
where response is the returned base64EncodedStringBytes.
I am able to write the PDF and send it, so there are no issues with the content.
However, Chrome is not opening it; the PDF window shows an error. Can someone assist?
I am building an app that pulls files from SharePoint 2013 or SharePoint 2010 for viewing in HTML. In C#, files are pulled out of SharePoint (multi-page documents like Word, Excel, PDF, TIFF, etc.), then fed into various third-party software (DataLogics and Aspose), which breaks the documents down into their individual pages and streams the individual pages to the browser in PNG format.
So in HTML, we have an img element whose src is set to a specific URL in an ASHX service. The ASHX service grabs the file out of SharePoint and, based on query string params, returns the desired page as a Stream.
Here is how we shoot it back:
[WebService(Namespace = "url")]
[WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
public class FileTransfer : IHttpHandler, IReadOnlySessionState
{
    public void ProcessRequest(HttpContext context)
    {
        var stream = GetStream(context.Request);
        int chunkSize = 2097152; // 2MB
        byte[] chunk = new byte[chunkSize];
        int bytesRead = 0;
        do
        {
            bytesRead = stream.Read(chunk, 0, chunkSize);
            HttpContext.Current.Response.OutputStream.Write(chunk, 0, bytesRead);
        }
        while (bytesRead > 0);
    }
}
This works perfectly 100% of the time in any browser when the file we are breaking down comes directly from SharePoint.
We also provide a feature where the user can upload a document. This is where the problem comes in. Uploaded documents are not saved in SharePoint; instead, their data is stored in SessionState until the user chooses to save. Files are uploaded to an ASMX service, then the browser requests their individual pages via the above ASHX.
Files are uploaded like this in an ASMX service:
[WebMethod(EnableSession = true)]
[ScriptMethod(ResponseFormat = ResponseFormat.Json)]
public object Upload()
{
    var request = HttpContext.Current.Request;
    if (request.Files.Count == 1)
    {
        var uniqueId = request["uniqueId"];
        var pageNum = request["pageNum"];
        var file = request.Files[0];
        using (var memoryStream = new MemoryStream())
        {
            file.InputStream.CopyTo(memoryStream);
            var docInfo = UploadItem(uniqueId, pageNum, memoryStream.ToArray());
            return docInfo;
        }
    }
    return null;
}
UploadItem adds the uniqueId and byte[] to SessionState.
Files are sent from JavaScript like this (FileUpload is tied to the change event of an input of type=file):
this.FileUpload = function (files) {
    var upload = new XMLHttpRequest();
    upload.onreadystatechange = () => {
        if (upload.readyState == 4) {
            // handle response
        }
    };
    UpdateFormDigest((<any>window)._spPageContextInfo.webServerRelativeUrl, (<any>window)._spFormDigestRefreshInterval);
    var data = new FormData();
    data.append("uniqueId", uniqueId);
    data.append("pageNum", pageNum);
    data.append("data", files[0]);
    upload.open('POST', "myurl");
    upload.setRequestHeader("X-RequestDigest", $("#__REQUESTDIGEST").val());
    upload.send(data);
};
Now we come to the actual bug.
Images are rendered using:
<img src="url to ASHX service" />
In Firefox and Chrome, page images from uploaded documents always show up just fine. But in IE (9, 10, or 11), it renders only the first portion of them, then shows broken image icons on the image placeholders. For these broken images, the network tab of IE shows it received 0 KB, and the error event is hit. But if I put a breakpoint in the ASHX just before it returns the stream, the stream always has a size.
More interestingly, if you take the URL that the src points to and paste it into a new window, the image shows up just fine.
I even tried to load the images in JavaScript first, like this:
var img = new Image();
img.onload = function () {
    // use jQuery to append the image to the page
};
img.src = "url to ASHX service";
In this scenario, Chrome and Firefox work fine as usual, but IE fails again. Except this way, the network tab of IE shows it received the correct number of kilobytes in response. However, it still shows the broken image icon and won't render images to the screen after some unknown threshold. The first several images come back, but once one breaks, all of the rest break.
I also modified the ASHX service to return base64 data instead of a stream, then bound the base64 to the src. In the debugger you can see the base64 assigned to the src of the img elements that show the broken image icon. So the data is there for sure, but IE just isn't rendering it...
I tried to recreate this problem outside of our SharePoint environment in this fiddle using knockout JS. Basically, I grab a ton of big images and throw them on the screen with each button click. But it works just fine. It works perfectly if I use jQuery too.
http://jsfiddle.net/bsdez92f/
Not sure where to go from here.
Any ideas?
So it turns out that the image size was causing the problem. I scaled the images down to thumbnail size on the server side and returned those to the browser. All is working fine at this point.
My goal is to print an RDLC report on the client machine without a preview. I cannot use the ReportViewer print button, since it requires the installation of an ActiveX object and there are no permissions for that. So I'm using iTextSharp to create a PDF from the byte array returned by the rendered LocalReport, and I add JavaScript for printing.
During debugging, I can see that the PDF is generated and has 2 pages, and everything looks OK. I don't receive any errors and the function exits OK, but it doesn't print. What am I doing wrong, or what am I missing?
This is my code:
string jsPrint = "var pp = this.getPrintParams(); pp.interactive = pp.constants.interactionLevel.silent; this.print(pp);";
byte[] bytes = report.Render("PDF", null, out mimeType, out encoding, out extension, out streamids, out warnings);

using (MemoryStream ms = new MemoryStream())
{
    Document doc = new Document();
    PdfWriter writer = PdfWriter.GetInstance(doc, ms);
    doc.SetPageSize(PageSize.A4);
    doc.Open();
    PdfContentByte cb = writer.DirectContent;
    PdfImportedPage page;
    PdfReader reader = new PdfReader(bytes);
    int pages = reader.NumberOfPages;

    // Copy each rendered page into the new document.
    for (int i = 1; i <= pages; i++)
    {
        doc.SetPageSize(PageSize.A4);
        doc.NewPage();
        page = writer.GetImportedPage(reader, i);
        cb.AddTemplate(page, 0, 0);
    }

    // Attach the print script so the PDF tries to print when opened.
    PdfAction jAction = PdfAction.JavaScript(jsPrint, writer);
    writer.AddJavaScript(jAction);
    doc.Close();
}
Thanks.
Regarding your question about PdfStamper (in the comments): it should be as simple as this:
string jsPrint = "var pp = this.getPrintParams(); pp.interactive = pp.constants.interactionLevel.silent; this.print(pp);";
PdfReader reader = new PdfReader(bytes);
MemoryStream stream = new MemoryStream();
PdfStamper stamper = new PdfStamper(reader, stream);
stamper.Writer.AddJavaScript(jsPrint);
stamper.Close();
reader.Close();
Regarding your original question: automatic printing of PDF documents is considered a security hazard: one could send a PDF to an end user, and that PDF would cause the printer to spew out pages. That used to be possible with (really) old PDF viewers, but modern viewers prevent it from happening.
In other words: you may be trying to meet a requirement of the past. Today's PDF viewers always require an action from the end user to print a PDF document.
Does anybody know how to make a print-page button print a PDF document?
At the moment I'm using:
<a href="javascript:window.print()">Print Page</a>
Obviously that just prints the page, though. I have had to create PDFs for each page and thought it would be easier just to print the PDF instead of the page (cross-browser printing styles kinda suck ;).
Any ideas?
There is no standard way to print anything as a PDF in every browser, such as on the Windows platform. On the Mac, there is always an option to print something as a PDF file, so regular printing will do.
I suggest you use iTextSharp. If you are using ASP.NET C#, this code works for you. It runs on the server side, though. You can just put the HTML inside a panel to make it readable on the server.
/// Import these namespaces:
using System.IO;
using System.Text;
using System.Web.Services;
using System.Web.UI;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.html.simpleparser;

/// Call this method whenever you need to convert the HTML content
/// inside the panel (which runs on the server side).
[WebMethod]
public void ConvertHtmlStringToPDF()
{
    // Render the server-side panel to an HTML string.
    StringBuilder sb = new StringBuilder();
    StringWriter tw = new StringWriter(sb);
    HtmlTextWriter hw = new HtmlTextWriter(tw);
    pnlPDF.RenderControl(hw);
    string htmlDisplayText = sb.ToString();

    // Parse the HTML into a PDF document held in memory.
    Document document = new Document();
    MemoryStream ms = new MemoryStream();
    PdfWriter writer = PdfWriter.GetInstance(document, ms);
    StringReader se = new StringReader(htmlDisplayText);
    HTMLWorker obj = new HTMLWorker(document);
    document.Open();
    obj.Parse(se);
    document.Close();

    // Send the PDF to the browser as a download.
    Response.Clear();
    Response.AddHeader("content-disposition", "attachment; filename=report.pdf");
    Response.ContentType = "application/pdf";
    Response.Buffer = true;
    // ToArray() returns only the bytes actually written;
    // GetBuffer() may include unused buffer capacity.
    Response.OutputStream.Write(ms.ToArray(), 0, (int)ms.Length);
    Response.OutputStream.Flush();
    Response.End();
}