Is it possible to manipulate DOM in node js? - javascript

I built a random quotes app using vanilla js and ajax and DOM manipulation and was wondering if I can do the js bit in the back end using node/express. However I can't seem to figure out if DOM manipulation ( like generating a new quote upon clicking the button ) can be done via NODE JS ?

You can parse HTML into a DOM with Node.js using a library such as libxmljs or cheerio which would then allow you to manipulate the result.
You can't use code running in Node.js to manipulate the DOM that a browser has built, or directly handle a click on a button rendered in the browser because both the button and DOM will be in the browser and not in the Node.js program.

The DOM is not part of JS. It is part of HTML, and as such requires the context of the browser in order for it to become available.
What you want to do is continue your DOM manipulation in the browser and make API calls to your node back end. This would be using something like the fetch API to broker the data and then responding to the data by updating the DOM appropriately for your application.

You can try to use some template rendering language like (ejs, handlebars...) to use your vanilla javascript code.
There are many template engine manipulation you can use to generate server side rendering your page.
With template engines you can serve html and manipulate the DOM before serving to the client.

You need distinguish what's dom and what's node firstly.
1.DOM
Means to be running in browser, it's need three type files
js
css
html
2.Node
It's a programing language, a tool, looks like javascript. You can't running node in brsower. But you can create html/css/js file by node coding, alse you can return them by server framework (eg. node express)
3.My suggestion
If you truely wanna write some code in node express, and do some DOM manipulate.
try:
SSR (Server Side Render)
Coding your DOM code in ejs file or some other server side html template
Return your page by writing server api(Express is good choice to do that)

Well, here Node/Express is a backend tool and your browser (where vanilla JS would run) is a front end.
A backend can generate output in the form of HTML/JSON/XML or even plain text!. But similarly, as we can't change the value returned by a function, The backend cannot change once it outputs something.
Here the frontend tools are used, that are able to change the DOM which is vanilla JS in your case.
So one way is fetching a new resource from backend all together OR the smarter way is to just fetch the quote you want to change (by creating a REST API on the same backend)
so you need some route like /page/quotes that return an HTML page
and some route like /api/random_quote that may return a random quote in JSON format. that you may call from frontend JS using something like fetch

Related

Wait for the JS to be executed before scraping the page

I'm trying to scrape the following page using hQuery: http://www.oddsportal.com/search/Paris+SG/soccer/
I realised half way that the odds of each game are included using JS (before, it's just -). Is there any way to get the page after the javascript has been executed or should I find another website??
My guess is that you would have to use another browser (not hQuery) and look into the code and see if there are any events that are emitted that you can catch up on.
You cannot using PHP
Scraping a site gives you whatever the server responds with to the HTTP request that you make (from which the "initial" state of the DOM tree is derived, if that content is HTML). It cannot take into account the "current" state of the DOM after it has been modified by Javascript.
You can using other powerful tools like selenium
You would need PhantomJs PHP wrapper for that is easy to use and gives more control and features, please see my answer here
Scraping a dynamically loading website with php curl
Hope it helps

Strategy for making React image gallery SEO-friendly

I wrote a React image gallery or slideshow. I need to make the alt text indexable by search engines, but because my server is in PHP, React.renderToString is of limited use.
The server is in PHP + MySQL. The PHP uses Smarty, a decent PHP template engine, to render the HTML. The rest of the PHP framework is my own. The Smarty template has this single ungainly line:
<script>
var GalleryData = {$gallery};
</script>
which is rendered by the PHP's controller function as follows:
return array(
'gallery' => json_encode($gallery),
);
($gallery being the result table of a MySQL query).
And my .js:
React.render(<Gallery gallery={GalleryData} />, $('.gallery').get(0));
Not the most elegant setup, but given that my server is in PHP there doesn't seem to be much of a better way to do it (?)
I did a super quick hack to fix this at first shot - I copied the rendered HTML from Firebug, and manually inserted it into a new table in the DB. I then simply render this blob of code from PHP and we're good to go on the browser.
There was one complication which is that because React components are only inserted into the DOM as they're first rendered (as I understand it), and because the gallery only shows one image slide at a time, I had to manually click through all slides once before saving the HTML code out.
Now however the alt text is editable by CMS and so I need to automate this process, or come up with a better solution.
Rewriting the server in Node.js is out of the question.
My first guess is that I need to install Node, and write a script that creates the same React component. Because the input data (including the alt text) has to come from MySQL, I have a few choices:
connect to the MySQL DB from Note, and replicate the query
create a response URL on the PHP side that returns only the JSON (putting the SQL query into a common function)
fetch the entire page in Node but extracting GalleryData will be a mess
I then have to ensure that all components are rendered into the DOM, so I can script that by manually calling the nextSlide() method as many times as there are slides (less one).
Finally I'll save the rendered DOM into the DB again (so the Node script will require a MySQL connection after all - maybe the 1st option is the best).
This whole process seems very complicated for such a basic requirement. Am I missing something?
I'm completely new to Node and the whole idea of building a DOM outside of the browser is basically new to me. I don't mind introducing Node into my architecture but it will only be to support React being used on the front-end.
Note that the website has about 15,000 pageviews a month, so massive scalability isn't a consideration - I don't use any page caching as it simply isn't needed for this volume of traffic.
I'm likely to have a few React components that need to be rendered statically like this, but maintaining a small technical overhead (e.g. maintaing a set of parallel SQL queries in Node) won't be a big problem.
Can anyone guide me here?
I think you should try rendering React components on server-side using PHP. Here is a PHP lib to do that.
But, yes, you'll basically need to use V8js from your PHP code. However, it's kind of experimental and you may need to use other around. (And this "other way around" may be using Node/Express to render your component. Here is some thoughts on how to do it.)

Generating HTML in JavaScript vs loading HTML file

Currently I am creating a website which is completely JS driven. I don't use any HTML pages at all (except index page). Every query returns JSON and then I generate HTML inside JavaScript and insert into the DOM. Are there any disadvantages of doing this instead of creating HTML file with layout structure, then loading this file into the DOM and changing elements with new data from JSON?
EDIT:
All of my pages are loaded with AJAX calls. But I have a structure like this:
<nav></nav>
<div id="content"></div>
<footer></footer>
Basically, I never change nav or footer elements, they are only loaded once, when loading index.html file. Then on every page click I send an AJAX call to the server, it returns data in JSON and I generate HTML code with jQuery and insert like this $('#content').html(content);
Creating separate HTML files, and then for example using $('#someID').html(newContent) to change every element with JSON data, will use even more code and I will need 1 more request to server to load this file, so I thought I could just generate it in browser.
EDIT2:
SEO is not very important, because my website requires logging in so I will create all meta tags in index.html file.
In general, it's a nice way of doing things. I assume that you're updating the page with AJAX each time (although you didn't say that).
There are some things to look out for. If you always have the same URL, then your users can't come back to the same page. And they can't send links to their friends. To deal with this, you can use history.pushState() to update the URL without reloading the page.
Also, if you're sending more than one request per page and you don't have an HTML structure waiting for them, you may get them back in a different order each time. It's not a problem, just something to be aware of.
Returning HTML from the AJAX is a bad idea. It means that when you want to change the layout of the page, you need to edit all of your files. If you're returning JSON, it's much easier to make changes in one place.
One thing that definitly matters :
How long will it take you to develop a new system that will send data as JSON + code the JS required to inject it as HTML into the page ?
How long will it take to just return HTML ? And how long if you can re-use some of your already existing server-side code ?
and check how much is the server side interrection of your pages...
also some advantages of creating pure HTML :
1) It's simple markup, and often just as compact or actually more compact than JSON.
2) It's less error prone cause all you're getting is markup, and no code.
3) It will be faster to program in most cases cause you won't have to write code separately for the client end.
4) The HTML is the content, the JavaScript is the behavior. You're mixing both for absolutely no compelling reason.
in javascript or nay other scripting language .. if you encountered a problem in between the rest of the code will not work
and also it is easier to debug in pure html pages
my opinion ... use scriptiong code wherever necessary .. rest of the code you can do in html ...
it will save the triptime of going to server then fetch the data and then displaying it again.
Keep point No. 4 in your mind while coding.
I think that you can consider 3 methods:
Sending only JSON to the client and rendering according to a template (i.e.
handlerbar.js)
Creating the pages from the server-side, usually faster rendering also you can cache the page.
Or a mixture of this would be to generate partial views from the server and sending them to the client, for example it's like having a handlebar template on the client and applying the data from the JSON, but only having the same template on the server-side and rendering it on the server and sending it to the client in the final format, on the client you can just replace the partial views.
Also some things to think about determined by the use case of the applicaton, is that if you are targeting SEO you should consider ColBeseder advice, of if you are targeting mobile users, probably you would better go with the JSON only response, as this is a more lightweight response.
EDIT:
According to what you said you are creating a single page application, if this is correct, then probably you can go with either the JSON or a partial views like AngularJS has. But if your server-side logic is written to handle only JSON response, then probably you could better use a template engine on the client like handlerbar.js, underscore, or jquery templates, and you can define reusable portions of your HTML and apply to it the data from the JSON.
If you cared about SEO you'd want the HTML there at page load, which is closer to your second strategy than your first.
Update May 2014: Google claims to be getting better at executing Javascript: http://googlewebmastercentral.blogspot.com/2014/05/understanding-web-pages-better.html Still unclear what works and what does not.
Further updates probably belong here: Do Google or other search engines execute JavaScript?

How to analyze the markup of a web page after javascript processing within a script/from CLI?

I have been researching for the standard practice to analyze the markup of a web page after javascript processing within a script or from the command line, i.e. without any browser?
This needs to happen on a Linux environment. Are the are "installables" that would allow you to pass HTML markup including javascript and it would return the markup after simulating a standard browser request and all Javascript calls have been done?
If there are any Perl Modules you can think of than that would be of even more help.
I have been looking at https://developer.mozilla.org/en/SpiderMonkey and http://search.cpan.org/~mschilli/JavaScript-SpiderMonkey-0.12/SpiderMonkey.pm but I am not sure this would allow me to pass in a full HTML document in and get the processed version with all javascript DOM manipulations back?
Please let me know.
Update, I figure it out
I figured it all out - this is what needs to be done:
#!/usr/bin/perl
use WWW::Scripter;
$w = new WWW::Scripter;
$w->use_plugin('JavaScript');
$w->get('http://www.google.com');
print $w->content(),"\n";
You have to use a browser, a new one like WWW::Scripter::Plugin::Javascript
or an old one like WWW::Mechanize::Firefox
Maybe the solution could be headless browser like PhantomJS. Not a perl module, but very practical for front-end testing and automation.

HTML that's both server-side and javascript generated - how to combine?

I'm usually a creative gal, but right now I just can't find any good solution. There's HTML (say form rows or table rows) that's both generated javascript-based and server-sided, it's exactly the same in both cases. It's generated server-sided when you open the page (and it has to stay server-sided for Google) and it's generated by AJAX, to show live updates or to extend the form by new, empty rows.
Problem is: The HTML generation routines are existing twice now, and you know DRY (don't repeat yourself), aye? Each time something's changed I have to edit 2 places and this just doesn't fit my idea of good software.
What's your best strategy to combine the javascript-based and server-sided HTML generation?
PS: Server-sided language is always different (PHP, RoR, C++).
PPS: Please don't give me an answer for Node.JS, I could figure that out on my own ;-)
Here's the Ruby on Rails solution:
Every model has its own partial. For example, if you have models Post and Comment, you would have _post.html.erb and _comment.html.erb
When you call "render #post" or "render #comment", RoR will look at the type of the object and decide which partial to use.
This means that you can redner out an object in the same way in many different views.
I.e. in a normal response or in an AJAX response you'd always just call "render #post"
Edit:
If you would like to render things in JS without connecting to the server (e.g. you get your data from a different server or whatever), you can make a JS template with the method I mentioned, send it to the client and then have the client render new objects using that template.
See this like for a JS templating plugin: http://api.jquery.com/category/plugins/templates/
Make a server handler to generate the HTML. Call that code from the server when you open the page, and when you need to do a live update, do an AJAX request to that handler so you don't have to repeat the code in the client.
What's your best strategy to combine the javascript-based and server-sided HTML generation?
If you want to stay DRY, don't try to combine them. Stick with generating the HTML only on the server (clearly the preferable option for SEO), or only on the client.
Make a page which generates the HTML on the server and returns it, e.g.:
http://example.com/serverstuff/generaterows?x=0&y=foo
If you need it on the server, access that link, or call the subroutine that accessing the link calls. If you need it on the client, access that link with AJAX, which will end up calling the same server code.
Or am I missing something? (I'm not sure what you mean by "generated by AJAX").
I don't see another solution if you have two different languages. Either you have a PHP/RoR/whatever to JavaScript compiler (so you have source written in one language and automatically generated in the others), or you have one generate output that the other reads in.
Load the page without any rows/data.
And then run your Ajax routines to fetch the data first time on page load
and then subsequently fetch updates/new records as and when required/as decided by your code.

Categories