In the site I am working on there are many JavaScript functions that dynamically generate content when executed. The problem is that this content is not visible when viewing the source: it shows only the JavaScript function, so the generated content is also not visible to search engines.
Is there any way to make this content visible in the source so it is visible to search engines?
The answer to your question is 'no'. Search engines do not attempt to parse and run JS (which is necessary to recreate the output the user sees).
I'm building a Chrome Extension.
The extension injects some CSS and JavaScript when .html files on the users local drive are loaded in the browser (file:///).
My extension adds an extensive UI to the page that allows the user to modify and manipulate the original source code from their .html file.
The primary purpose of the extension is debugging and QAing HTML email newsletters. Here are just a few of the things it does:
Checking links for the appropriate parameters.
Toggling images off and on to simulate popular email clients.
Displaying the source code side-by-side to show a desktop view and multiple mobile sized views.
A function that takes the original HTML and generates a plain text version.
A function that toggles <style> blocks off and on to simulate popular email clients ignoring them.
Email files are backed up via Dropbox and the Dropbox API is integrated to allow for quick sharing right from the email newsletter.
Until now I've been using JavaScript in my injected content script like this to create all of my menu items:
var debugOrb = document.createElement("div");
debugOrb.id = "borders-orb";
debugOrb.className = "borders-orb orb glyph";
debugOrb.addEventListener("click", toggleBorders, false);
orbsBottom.appendChild(debugOrb);
Here's an extended view of the code I've written to create all of these toggles/menu items: http://pastebin.com/LQTkNhpP
My problem is that now I'm going to be adding a LOT more clickable menu items like this. And it feels like if I do, it's going to get out of hand really quick. Especially since I'll be nesting a lot of divs to make the whole thing look organized and using JavaScript to create lots of text nodes too.
My first thought was: what if I could just create my entire menu in regular HTML, then inject that file into the page with the JavaScript in my content script? I'm barely intermediate level with JavaScript, though. And as I understand it, if I did this, I'd lose my ability to use onclick handlers for all of these divs I'm creating.
Is there an efficient way to handle my goal that I'm not aware of?
Notes:
I'm not using any framework/plugins like React, Angular, or jQuery.
Once the HTML is added you can always get the element by id and then add an event listener to that element. You can have functions that relate to the divs and then, on load, create the event listeners: element.addEventListener('click', yourFunction);
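For example, something along these lines (a rough sketch; the "orbs-bottom" id and the HTML string are only illustrative, and toggleBorders is the handler from your snippet):

// Inject the menu markup as a string (it could also be loaded from an
// .html file bundled with the extension via chrome.runtime.getURL).
var menuHtml =
  '<div id="orbs-bottom">' +
  '  <div id="borders-orb" class="borders-orb orb glyph"></div>' +
  '</div>';
document.body.insertAdjacentHTML('beforeend', menuHtml);

// Once the HTML is in the DOM, grab the elements by id and attach
// listeners exactly as you would for elements made with createElement.
document.getElementById('borders-orb')
  .addEventListener('click', toggleBorders, false);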
I have an HTML site with a page of info for each county in the US. I want to convert this into a new WordPress site. I can do this one by one, but my issue comes when I have mass changes to affiliate code or common text: I would have to go to each page and manually change it, and with over 3000 pages that would be way too time consuming. I don't want to use iframes, but I would like to know if there is a way to call the HTML pages into the WordPress page that makes sense SEO-wise.
I am open to creating a page for each county, or having one page with text or buttons on it with each county listed that, when clicked, will insert the info below. I know a lot about static HTML coding but am new to PHP.
If you don't want iframes, I think only two options remain. I don't know if they will work in WordPress though.
1. PHP Include
With the very simple PHP include() statement, you can include the old HTML files in your new website. If you have an HTML file, for example, name your new file yourname.php and add this in the position where you want your old page to appear:
<?php include('path_to_old_page/name.html'); ?>
This will include the full old page, but the file needs to be on the same server.
2. AJAX
With JavaScript you can perform XMLHttpRequests to load files from the server. This is easiest when using jQuery, where you can use the $(selector).load('path_to_old_page/name.html') statement. This will load the file into the HTML elements to which the selector applies.
(The selector works the same as CSS selectors, see the w3schools page for more)
This will also include the full old page, as long as it is on the same server.
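If you'd rather not use jQuery, a plain fetch() call does the same thing; here is a minimal sketch (the path and the #county-content id are only examples):

// Load the old static page and drop its markup into a container div.
fetch('path_to_old_page/name.html')
  .then(function (response) {
    return response.text();
  })
  .then(function (html) {
    document.getElementById('county-content').innerHTML = html;
  });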
You can have your static pages in WordPress as well. If you want to create a new county named "example", create a new WordPress page named "example" by entering the title "example". Now for the content: just copy the page content (only the "example" county related HTML code from your static website) and place that code inside the newly created WordPress "example" page. Make sure you add this HTML content inside the 'Text' tab in the editor. Your page will be created with all your existing data; you can then view this page and use its URL wherever you want.
The system I am working on has a questionnaire in it and then shows the responses to the admin in a nice report on screen. I need to create functionality that turns the on screen report into a pdf, similar to how the browser generates a pdf of the page when you select print. Although I need to only turn a section of the page into pdf. And it would be ideal to be able to alter the HTML so that the pdf page breaks don't interfere with the presentation of the report.
You can download a pdf of how the report looks, generated by the browser functionality. This is just an example, I need the pdf to be generated by a link or button and not include the whole page (the top part in this case).
I have tried some php HTML to pdf generators, but it's difficult because the HTML is dynamically generated so I'm not sure how to send all the HTML, once rendered, to the page that creates the pdf.
To overcome the page breaks, I've considered using javascript or jquery to read the height of the div of each question within the report and then write a simple script to calculate if the next div will fit on the page and if not add a margin on top of that div so that it starts on a new page.
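Roughly, the page-break script I have in mind would look like this (untested sketch; the .report-question selector and the page height in pixels are placeholders):

// Untested sketch: if the next question block would straddle a page
// boundary, push it down so it starts on a new page.
var PAGE_HEIGHT = 1122; // roughly A4 at 96dpi, just an approximation
var offset = 0;
document.querySelectorAll('.report-question').forEach(function (div) {
  var height = div.offsetHeight;
  var remaining = PAGE_HEIGHT - (offset % PAGE_HEIGHT);
  if (height > remaining) {
    // Not enough room left on the current page: add a top margin so
    // the block starts on the next page, and count that gap.
    div.style.marginTop = remaining + 'px';
    offset += remaining;
  }
  offset += height;
});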
What software can I use to generate the pdf, given these requirements? Either PHP or JavaScript. Appreciate the help.
Have you considered Snappy for PHP? It makes use of wkhtmltopdf behind the scenes to convert any HTML document into PDF.
We are using it and it works great.
https://github.com/KnpLabs/snappy
You could try mPDF and use the page-break-inside: avoid property, which is actually a CSS property. I have not used this, but it might be what you're looking for.
Looks like you can add this property to the <div> and <table> tags (mPDF Supported CSS).
I have a page with a lot of JavaScript. However, the page, once rendered, remains static; there are no moving things or special effects, etc. It should be possible to render the same HTML without any JavaScript at all, using only plain HTML and CSS. This is exactly what I want: I would like to get a no-JavaScript version of this particular page. Naturally, I do not expect any dynamic behavior, so I am OK if buttons are dead, for example. I just want them rendered.
Now, I do not want an image. It needs to be HTML with CSS, which may be embedded in the HTML; that is fine too.
How can I do it?
EDIT
I am sorry, but I must not have been clear. My web site works with JavaScript and will not work without it. I do not want to check whether it works without it; I know it will not, and I really do not care about that. This is not what I am asking. I am asking about a specific page, which I want to grab as pure HTML + CSS. The fact that its dynamic nature is lost is of no importance.
EDIT2
There is a suggestion to grab the HTML from the DOM inspector. This is the first thing I did: in the Chrome development tools I copied the root html element as HTML and saved it to a file. Of course, this does not work, because it continues to reference the CSS files on the web. I guess I should have mentioned that I want it to work from the file system.
Next was to save the page as complete, with all its environment, using some kind of Save menu (browser dependent). That saves the page and all the related files, forming a closure, which can be opened from the file system. But the HTML has to be manually cleaned up of all the JavaScript, which is tedious and error prone.
EDIT3
I seem to keep forgetting things. Images should be preserved, of course.
I have to do a similar task on a semi-regular basis. As yet I haven't found an automated method, but here's my workflow:
Open the page in Google Chrome (I imagine Firefox also has the relevant tools);
"Save Page As" (complete page), rename the html page to something nicer, delete any .js scripts which got downloaded, move everything into a single folder;
On the original page, open the Elements tab (DOM inspector), find and delete any tags which I know cause problems (Facebook "like" buttons for example; I also try to delete script tags at this stage because it's easier) and copy as HTML (right-click the <html> tag). Paste this into (replacing the contents of) the downloaded HTML file (remember to keep the DOCTYPE, which doesn't get copied);
Search all HTML files for any remaining script sections and delete them (also delete any noscript content), and search for " on" (that's with a space at the start) to remove handlers (onload, onclick, etc);
Search for images (src=, url()), find common patterns in image filenames and use regular expressions to replace them globally; for example, replace /images/ with nothing so that src="/images/myimage.png" becomes src="myimage.png". This needs to be applied to all HTML and CSS files. Also make sure the CSS files have the correct path (href). While doing this I usually replace all href links with #;
Finally open the converted page in a browser (actually I tend to do this early on so that I can see if any change I make causes it to break), use the Console tab to check for 404 errors (images that didn't get downloaded or had a different name) and the Network tab to check if anything is still being loaded from the online version;
For any files which didn't get downloaded I go back to the original page and use the Resources tab to find them and download manually;
(Optional) Cull any content which isn't needed (tracker images/iframes, unused CSS, etc).
It's a big job. I'd love a tool which automated all that, but so far I haven't found one. The pages I download are quite badly made (shops) which have a lot of unusual code, so that's why there are so many steps. You might not need to follow every step.
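If it helps, here is a rough snippet I sometimes run in the DevTools console before copying the <html> element; it strips script/noscript tags and inline on* handlers so there is less manual cleanup afterwards (adapt as needed):

// Remove script and noscript elements.
document.querySelectorAll('script, noscript').forEach(function (el) {
  el.remove();
});
// Strip inline event handler attributes (onclick, onload, etc).
document.querySelectorAll('*').forEach(function (el) {
  Array.prototype.slice.call(el.attributes).forEach(function (attr) {
    if (attr.name.indexOf('on') === 0) {
      el.removeAttribute(attr.name);
    }
  });
});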
I have an aspx application.
On every GET the server responds with a "basic" HTML page containing everything except the table grids.
This "grid information" is contained in a hidden input (JSON format) on the page.
This is by design and cannot be changed.
A normal visitor will see the page HTML:
head, body, scripts, meta tags
text, labels, inputs...
<div id='gridcontainer'></div>
more html
more html
Then on page load I dynamically render a table inside the div (gridcontainer) using JavaScript.
So after the onload event is executed, the user also sees the table grid inside the div.
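For reference, the client-side rendering is roughly like this (simplified; the gridData field name is just illustrative, and it assumes the JSON is an array of row arrays):

// Read the JSON from the hidden input and build the table on load.
window.addEventListener('load', function () {
  var rows = JSON.parse(document.getElementById('gridData').value);
  var table = document.createElement('table');
  rows.forEach(function (row) {
    var tr = document.createElement('tr');
    row.forEach(function (cell) {
      var td = document.createElement('td');
      td.textContent = cell;
      tr.appendChild(td);
    });
    table.appendChild(tr);
  });
  document.getElementById('gridcontainer').appendChild(table);
});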
In this situation Google is not indexing the information in the tabular grids, because it is rendered by JavaScript after page load.
The application has the ability to render the exact same content in HTML without using JavaScript (losing some functionality). When I say the exact same content I really mean the same page (same content, same headers, same meta tags, same title), just not rendered by JavaScript.
The content length may be different if we compare both responses, because the pure HTML response might be bigger than HTML + JSON + JavaScript.
This is what I want the spider to see:
head, body, scripts, meta tags
text, labels, inputs...
<div id='gridcontainer'>
<table> table row 1, table row 2 ... </table>
</div>
more html
more html
To sum up, I want to deliver the "HTML" version to spiders and the other (javascript rendered) to visitors.
Is this cloaking?
Is this dangerous with search engines, or is it a totally legal method if the content I am displaying is exactly the same (no tricks)?
Thanks in advance!
If the content is basically the same and a human viewer would say that it's the same content, then it's legal. I know of a fairly major site that does this with Google's blessing. Any site that has a page that is largely generated with client-side JS has to do something like this for Google to see anything useful. Since Google doesn't currently evaluate JavaScript, there is no other choice for a page that uses JS-generated HTML.
I don't know if there's a way to get Google's blessing to avoid any accidental penalty.
The important point is that the actual content of the page needs to be the same. The details of the formatting do not have to be identical.
Note: For legal advice, contact a lawyer.
Yes, this is 'cloaking'.
Yes, it's morally questionable.
But No, it isn't illegal. *(subject to the disclaimer at the top of this answer)
But either way don't do it, because Yes, Google will kill your rankings if they catch you trying to serve content to them which the user doesn't get to see.
If you use progressive enhancement you won't have any issues at all. What you would do is serve the HTML version so users who don't have JavaScript enabled can still see the content. Then add JavaScript that, when the page loads, removes the current HTML and adds the enhanced version of that same content. The key is that the content is the same; just the experience is different due to lack of JavaScript capabilities. This will never get you in trouble with the search engines and is great for accessibility. Accessibility is one of the main tenets of SEO.
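As a rough illustration of the idea (the ids here are made up): the server renders a plain table inside the container, and the script only swaps in the enhanced version when it actually runs.

// The server has already rendered a plain <table> inside #gridcontainer.
// If JavaScript runs, replace it with an enhanced copy of the same data;
// if it doesn't, the plain server-rendered table is what everyone
// (including search engines) sees.
document.addEventListener('DOMContentLoaded', function () {
  var container = document.getElementById('gridcontainer');
  var plainTable = container && container.querySelector('table');
  if (!plainTable) {
    return; // nothing to enhance, keep the server-rendered fallback
  }
  var enhanced = plainTable.cloneNode(true); // same content
  enhanced.classList.add('enhanced-grid');   // richer styling/behavior
  container.replaceChild(enhanced, plainTable);
});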