I am new to JavaScript and Node.js, and I am trying to figure out whether Node has an alternative to document.getElementById() that does the same thing. If it cannot be done in Node, is it possible to have a pure JS file that manipulates the DOM and a separate Node file? For background, what I am trying to do is convert CSV lines into a JSON object and then update the webpage with the new information, which is why I want to use document.getElementById().
document.getElementById() is a function that exists in a browser. There is no such function in nodejs.
It is possible to get a 3rd party module that will parse an HTML web page, create a DOM and then allow you to access the DOM programmatically to see what's in the web page. Cheerio and Puppeteer are two such 3rd party modules, each with differing levels of features. Puppeteer actually uses the Chromium browser engine and can even run Javascript in the page and generate screenshots. Cheerio parses the HTML and lets you access just what it creates (without Javascript running).
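For example, a minimal Cheerio sketch (assuming the module is installed via npm, and with an invented bit of markup) looks roughly like this:

```javascript
// Cheerio sketch: parse a string of HTML and query it much like the browser DOM
const cheerio = require('cheerio');

// the markup would normally come from a file or an HTTP response; a literal string here
const html = '<html><body><div id="status">pending</div></body></html>';
const $ = cheerio.load(html);

// roughly the Cheerio equivalent of document.getElementById('status').textContent
console.log($('#status').text()); // "pending"

// it can also modify the parsed document and serialize it back to a string
$('#status').text('done');
console.log($.html());
```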
It sounds like maybe you're a bit confused about how web pages work. A browser running on the end user's computer loads a web page. Once the page is loaded, the server's job is done. The web page exists only in the browser on the user's computer. The server can't directly, on its own, change that web page.
To change that web page (without reloading it), you would have to have supporting Javascript code in the web page (that runs in the user's browser). For example, you could have your Javascript make an Ajax call from the web page that would request certain data from the server. When the server gets that request, it could generate the data and return JSON back to the browser. The Javascript in the browser would then receive that JSON, parse it into a Javascript object and then use the DOM to insert new objects into the existing web page based on the data it received.
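As a rough sketch of that round trip (the route name, element id, and CSV layout are invented for illustration), the server might convert the CSV into JSON and the page's own script might fetch it and update the DOM:

```javascript
// server.js — Express route that converts a CSV file into JSON (illustrative names)
const express = require('express');
const fs = require('fs');
const app = express();

app.use(express.static('public')); // serves the HTML page below

app.get('/data', (req, res) => {
  const [header, ...rows] = fs.readFileSync('data.csv', 'utf8').trim().split('\n');
  const keys = header.split(',');
  res.json(rows.map(line => {
    const values = line.split(',');
    return Object.fromEntries(keys.map((key, i) => [key, values[i]]));
  }));
});

app.listen(3000);
```

```html
<!-- public/index.html — this script runs in the browser, where getElementById exists -->
<ul id="output"></ul>
<script>
  fetch('/data')
    .then(response => response.json())
    .then(records => {
      const list = document.getElementById('output');
      records.forEach(record => {
        const item = document.createElement('li');
        item.textContent = JSON.stringify(record);
        list.appendChild(item);
      });
    });
</script>
```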
Note that all changes to the existing web page in the browser are made by the Javascript running in the web page in the user's browser, not directly by the server. The server can supply data, but cannot directly change the user's web page itself. Of course, the user could request an updated page, in which case the browser would fetch a new version of the whole page and the server could supply one with different data in it, but that would involve reloading the whole page.
There are also template engines for nodejs, so that when your server is generating a web page, the template engine can help you create the HTML for that page with dynamic data incorporated into it. This doesn't dynamically change a web page that is already sitting in a browser being displayed; instead, it helps you generate a page from scratch that incorporates the dynamic data when the page is first downloaded. Examples of template engines that work with Express in nodejs are Pug, EJS, Nunjucks, Handlebars and many others.
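A minimal Express + EJS sketch of that server-side rendering approach (file names and data shape are invented here):

```javascript
// server.js — render the page with EJS so the dynamic data is in the HTML as delivered
const express = require('express');
const app = express();

app.set('view engine', 'ejs'); // templates are looked up in ./views by default

app.get('/', (req, res) => {
  // in practice this might be the data parsed from the CSV
  const rows = [{ name: 'alpha', value: 1 }, { name: 'beta', value: 2 }];
  res.render('index', { rows });
});

app.listen(3000);
```

```html
<!-- views/index.ejs -->
<ul>
  <% rows.forEach(row => { %>
    <li><%= row.name %>: <%= row.value %></li>
  <% }) %>
</ul>
```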
Related
I am using HtmlAgilityPack and trying to load some webpages. Some of the pages are JavaScript-based and load asynchronously. Is there any way to load a web page after x seconds, or after making sure the page is completely loaded?
Html Agility Pack does not mimic the client-side calls that dynamically load content into the DOM. It is a headless scraper that downloads the static page served by the server; if you want the dynamic content, you will have to mimic the calls made by the client browser. If you do not want to emulate those calls yourself, you can use something like Selenium to drive a real browser instead of a headless scraper, the downside being that the browser will be opened on the host machine.
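If a Node-based headless browser is an option (Puppeteer was mentioned earlier in this thread), a rough sketch of waiting for the page to finish loading, or simply for a fixed number of seconds, before reading the rendered markup:

```javascript
// Puppeteer sketch: read the markup only after the dynamic content has loaded
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // 'networkidle0' resolves once no requests have been in flight for 500 ms;
  // for the "after x seconds" approach, sleep with a plain setTimeout instead
  await page.goto('https://example.com', { waitUntil: 'networkidle0' });
  // await new Promise(resolve => setTimeout(resolve, 5000));

  console.log(await page.content()); // the fully rendered HTML

  await browser.close();
})();
```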
I want to show a website preview for a link, similar to what Facebook does when a user posts a link. My question has been asked before in the following link, but I am going to ask about specific issues with my solutions. I have two approaches for showing a webpage preview, which are as follows: 1. server-side HTML processing, 2. client-side HTML processing.
1. Server-side HTML processing
I used System.Net.WebClient().DownloadString(url) to retrieve the web page data on the server side, and I tried to extract the most important information from the page, but in most cases the main part of the page is loaded by JavaScript, so I do not have access to that information.
Another server-side option is to work with WebBrowser and WebDocument objects. Because I have not worked with these classes and I don't know how much they would affect web server performance, I only present this option for discussion. So, is there any server-side HTML grabber that fetches all of the HTML data, including the JavaScript-loaded HTML source?
2. Client-side HTML processing
The simplest client-side approach is to use an iframe tag, but it has the following two problems:
a. I cannot access the innerHTML of the frame for links on other domains.
b. I cannot load HTTPS webpages such as Dropbox and Facebook in the iframe because of the "X-Frame-Options" error.
My question is: is there any other client-side solution to retrieve the dynamic HTML source (loaded by JavaScript) of 3rd-party webpages (usually HTTPS)? Or can I solve the above problems with some tricks?
I guess the server-side approach would be the most viable option. On the client side you can use proxy services that work around the cross-domain limitation, for example crossorigin.
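A sketch of the client-side proxy idea (the proxy URL is a placeholder, and the exact request format depends on the service you pick):

```javascript
// Browser-side sketch: fetch a third-party page through a CORS proxy that adds
// the Access-Control-Allow-Origin headers the browser requires
const proxy = 'https://your-cors-proxy.example/'; // placeholder for a crossorigin-style service
const target = 'https://www.example.com/';

fetch(proxy + encodeURIComponent(target))
  .then(response => response.text())
  .then(html => {
    // parse the markup without an iframe, so X-Frame-Options is not an issue
    const doc = new DOMParser().parseFromString(html, 'text/html');
    console.log(doc.title);
  });
```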
To generate a preview similar to the one Facebook provides, you need to get the Open Graph information for the target page. Libraries that process Open Graph data are available for multiple platforms; OpenGraph-Net could be used on the .NET platform.
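OpenGraph-Net is a .NET library, but the same Open Graph meta tags can be read with any HTML parser. A hedged Node sketch using Cheerio (Node 18+ assumed for the built-in fetch):

```javascript
// Server-side sketch: download a page and read its Open Graph meta tags
const cheerio = require('cheerio');

async function getPreview(url) {
  const html = await (await fetch(url)).text();
  const $ = cheerio.load(html);

  // og:* tags are what Facebook itself reads when building a link preview
  const og = name => $(`meta[property="og:${name}"]`).attr('content');

  return {
    title: og('title') || $('title').text(),
    description: og('description'),
    image: og('image'),
  };
}

// replace with the link the user posted
getPreview('https://www.example.com/some-article').then(console.log);
```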
I'm processing URLs with my program. Basically, what it does is use file_get_contents($url) to get the web content and attach a piece of JavaScript at the bottom, which processes the HTML source for the biggest image and inserts it into the database via Ajax.
The problem is that I have to launch the Firefox browser to do this processing. I really don't need Firefox to render the page visually or do any of its other work; all I want is for my script to do its job.
So is there a way to use Firefox's HTML/CSS and JavaScript engine without having to launch the entire browser?
I want to get the HTML content of a web page, but most of the content is generated by JavaScript.
Is it possible to get this generated HTML (with Python, if possible)?
The only way I know of to do this from your server is to run the page in an actual browser engine that will parse the HTML, build the normal DOM environment, run the javascript in the page and then reach into that DOM engine and get the innerHTML from the body tag.
This could be done by firing up Chrome with the appropriate URL from Python and then using a Chrome plugin to fetch the dynamically generated HTML after the page was done initializing itself and communicate back to your Python.
Check out Selenium. It has a Python driver, which might be what you're looking for.
If most of the content is generated by JavaScript, then that JavaScript may be making Ajax calls to retrieve the content. You may be able to call those server-side scripts directly from your Python app.
Do check that this doesn't violate the website's terms, though, and get permission.
Alright, first off, this is not a malicious question. I have no intention of using any of this information for ill gain.
I have an application that contains an embedded browser. This browser runs within the application's process, so I can't access it via Selenium WebDriver or anything like that. I know that it's possible to dynamically append scripts and HTML to loaded web pages via WebDriver, because I've done it.
In the embedded browser, I don't have access to the pages that get loaded. Instead, I can create my own HTML/JavaScript pages and execute them to manipulate the application that houses the browser. I'm having trouble manipulating the existing pages within the browser.
Is there a way to dynamically add JavaScript to a page when you navigate to it and have it execute right after the page loads?
Something like
page1.navigateToUrl(executeThisScriptOnLoad)
page2 then executes the passed script.
I guess it is not possible to do this without knowledge of the destination site, although you could send data to the site and then use the eval() function to evaluate the sent data on the destination page.