I've been reading articles online about what universal javascript is but I'm still not comfortable with the definitions each site is giving which is, "code that can run on the client and server." Does this mean that a node.js app is inherently universal javascript because it will have javascript running in the client side and server side. Or does universal javascript have to do with server side rendering then client side rendering?
Preface: I cannot find any highly-authoritative (e.g. ECMA, Microsoft, Mozilla or Google) source that provides a strict definition of "universal JavaScript" or "isomorphic JavaScript" - at most I've found a few blog posts (albeit by influential personalities) however I can see why a newcomer might be confused.
It seems there are two definitions going around which are similar, but with crucial differences:
1. To refer to JavaScript which runs anywhere
This definition refers to JavaScript that does not depend on any specific client-side or server-side API; instead, it only uses features present in JavaScript's built-in library (String, Array, Date, Function, Math, etc.) or other libraries that similarly restrict their dependencies (a transitive relation).
Remember that "JavaScript" does not mean that the DOM API, AJAX, HTML5 <canvas> (and so on) are available - it just means the JavaScript scripting language is being used - that's it. JavaScript has been available outside of web browsers for over 20 years now (Windows supports JavaScript as a shell-scripting language in cscript.exe/wscript.exe, ASP 3.0 supported server-side JScript as an alternative to VBScript, and the .NET Framework has "JScript.NET" too).
So in this case, if you wrote a library that adds some useful string functions, which only references String, then that script would work without issue in a Node.js server environment or an in-browser environment.
But if your script ever used the window object (only present in browsers) or express (a library only for Node) then it loses "universal" status because it cannot "run everywhere".
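A minimal sketch of that distinction, using hypothetical helper names (capitalizeWords and currentPageTitle are made up for illustration, not taken from any library). The first function touches only built-in String and Array methods, so it should behave the same in a browser, in Node.js, or in any other conforming engine. The second assumes a browser-provided window object and is guarded so that it does not throw where window is absent.

```javascript
// Universal by the first definition: depends only on the language's built-ins.
function capitalizeWords(text) {
  return String(text)
    .split(' ')
    .map((word) => (word ? word[0].toUpperCase() + word.slice(1) : word))
    .join(' ');
}

// Not universal: relies on `window`, which only browsers provide.
// The typeof guard keeps this from throwing where `window` is absent.
function currentPageTitle() {
  if (typeof window === 'undefined') {
    return null; // no browser environment available
  }
  return window.document.title;
}

console.log(capitalizeWords('universal javascript'));
```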
2. To refer to JavaScript which renders the same HTML whether on the server or on the client
e.g. http://isomorphic.net/
This definition is actually a strict subset of the first definition, as the same script must (by definition) run inside both a server/Node.js context and a browser DOM context, and when it runs it generates content (typically HTML) that is then displayed in the user's browser. By doing this it must take a dependency on both a Node API and the W3C DOM, so it cannot strictly run "anywhere", because neither is available in a cscript.exe environment, for example.
Note: There is debate if use of XMLHttpRequest or fetch makes a script universal or not - as their presence is not guaranteed (as technically they're part of the DOM, not JavaScript's built-in library).
In this 2015 blog post ( https://medium.com/@ghengeveld/isomorphism-vs-universal-javascript-4b47fb481beb ) the author argues that only the term "isomorphic JavaScript" should be used to refer to rendering code that runs in both browser and server environments, while "universal JavaScript" should refer to truly portable, environment-agnostic JavaScript (i.e. my first definition).
Nowadays Single Page Applications (SPAs) have become very popular, but they have problems: SEO, for example.
So, how does an SPA work? JavaScript loads in the browser and fetches data from an API. Most of the rendering is done on the client side. But search engine bots have a hard time indexing the page, because without JS it doesn't contain much.
Now a universal/isomorphic app comes to the rescue. On the initial page load, the page is rendered on the server. After that, the app works like an SPA. It gets better SEO because when a search engine bot asks for a page, the server returns the whole rendered HTML page, with content and meta tags.
Edit
An isomorphic app can be built with JavaScript (Node.js), PHP or some other language on the server, but if the app is written with Node.js, then we can call it universal, as both the backend and frontend are in JavaScript.
I'll try to explain it with examples, even if other answers seem already accurate.
A basic example
Imagine you develop an SPA that renders a Hello World message. This means that your browser loads an HTML file with a <script> tag (or a reference to a JS file) that actually makes this happen. You can prove that "Hello World" is generated by JavaScript in the client browser, because if you deactivate JavaScript you won't see any message.
Now isolate the code that prints the string "Hello World". It doesn't need much adaptation to work on the server side. In fact, the server just needs to send an HTML string that contains the <h1>Hello World</h1> inside its body.
So what makes it universal/isomorphic? The fact that the code can tell which environment it is running in (the browser, the server or possibly another environment) and keeps functioning. Remember: a given piece of code usually only runs in one of the two environments; the point is that you wrote some common code that can run in both (universal).
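Here is a rough sketch of the Hello World case, assuming the usual typeof check for window and document as the way to tell the environments apart. The function names (renderGreeting, detectEnvironment, show) are hypothetical. No HTML escaping is done on the name, so treat this as an illustration only.

```javascript
// Shared code: produces the same markup string in every environment.
function renderGreeting(name) {
  return '<h1>Hello ' + name + '</h1>';
}

// Environment check: `window` and `document` are only defined in browsers.
function detectEnvironment() {
  const hasDom = typeof window !== 'undefined' && typeof document !== 'undefined';
  return hasDom ? 'browser' : 'server';
}

// Environment-specific glue around the shared function.
function show(name) {
  const html = renderGreeting(name);
  if (detectEnvironment() === 'browser') {
    document.body.innerHTML = html; // client: write into the live page
    return html;
  }
  // server: return a complete document to send as the HTTP response body
  return '<!DOCTYPE html><html><body>' + html + '</body></html>';
}
```

Only show() knows about the environment; renderGreeting() is the part that is actually shared.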
The behavior of a more complex Universal App
Imagine that you set out to develop a new universal website. The code can detect which environment it is running in and work just fine. So, let's say, 80% of your code is shared and doesn't even need to know the environment, and the rest of your code is there to manage the fact that your app can run on the client or on the server.
How does this work?
The client first contacts the server, which returns HTML with all the content of the page, produced on the server. So the server renders the application. In the meantime the browser downloads the script file that lets your single page app work in the client. The client then renders the same page again. You won't notice anything, because if it is done properly the result will be identical (of course all the animations and real-time features have to work client-side, so you will eventually see your animations starting).
When the user clicks an internal link, uses an interactive feature, or fills out and submits a form, the client-side code handles it. The server doesn't get any page request, especially assuming that all the interactions go through an API that is separate from our isomorphic app.
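The flow described above could be sketched roughly as follows. Everything named here is an assumption for illustration: renderApp, renderPage, handleRequest, startClient, the /client.js path and the window.__INITIAL_STATE__ variable are not from any particular framework. Item text is not escaped. The server renders the markup and embeds the state it used, and the client script renders again from that same state.

```javascript
// Shared between server and client: turns application state into markup.
function renderApp(state) {
  const items = state.items.map((item) => '<li>' + item + '</li>').join('');
  return '<ul id="list">' + items + '</ul>';
}

// Server side: embed the rendered markup and the state it was rendered from,
// so the client script can start from the same data.
function renderPage(state) {
  // Escape "<" so the JSON cannot close the surrounding <script> element.
  const serialized = JSON.stringify(state).replace(/</g, '\\u003c');
  return (
    '<!DOCTYPE html><html><body>' +
    '<div id="root">' + renderApp(state) + '</div>' +
    '<script>window.__INITIAL_STATE__ = ' + serialized + ';</script>' +
    '<script src="/client.js"></script>' +
    '</body></html>'
  );
}

// Node-style request handler (hypothetical wiring for the first request).
function handleRequest(req, res) {
  const state = { items: ['first', 'second'] };
  res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
  res.end(renderPage(state));
}

// Client side (would live in /client.js): render again from the same state.
function startClient(win) {
  const root = win.document.getElementById('root');
  root.innerHTML = renderApp(win.__INITIAL_STATE__);
}
```

On the server, handleRequest could be passed to Node's http.createServer; in the browser, /client.js would end with startClient(window).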
If the user goes crazy and wants to deactivate JavaScript, how do you assure that, for example, forms still work? Here is a trick you can use:
<form
method="post"
action="/api/fakeBackendRoute"
onSubmit={this.handleSubmit}
>
[input fields here]
</form>
When the client JS is available, handleSubmit is executed and the browser's default form submission is prevented (for example with event.preventDefault()). This way the request to the server-side route will never fire.
If the client JS is disabled, handleSubmit will never be executed, so you have to make sure that /api/fakeBackendRoute handles the data exactly as the client would.
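A sketch of what such a handleSubmit might look like, assuming it is attached to the form's submit event so that event.target is the form element. The sendToApi callback is a hypothetical stand-in for whatever AJAX call the app uses.

```javascript
// Client-side handler: runs only when JavaScript is enabled.
// `sendToApi` is a hypothetical function that posts the data via AJAX.
function makeHandleSubmit(sendToApi) {
  return function handleSubmit(event) {
    // Stop the browser's default full-page POST to the form's action URL.
    event.preventDefault();
    const fields = {};
    // Collect name/value pairs from the form's input elements.
    for (const input of Array.from(event.target.elements)) {
      if (input.name) {
        fields[input.name] = input.value;
      }
    }
    return sendToApi(fields);
  };
}
```

With JavaScript disabled none of this runs, and the browser falls back to the form's method and action attributes.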
Why do people use it?
In my opinion, the difficulty of developing a universal app is often underestimated. Good reasons to use one are:
Be more SEO friendly
Support very old browsers. For example, if you want to support IE8, you could do something like this:
<!--[if gt IE 8]><!-->
<script src="yourfile.js"></script>
<!--<![endif]-->
Be more accessible for people that don't want to use JavaScript
Other reasons could be:
Performance, if it matters to your application. You can improve the response time of the first request by, for example, using Node's streaming capabilities to stream the HTML string, and then do more of the work on the client, where things will likely be faster. You can still decide whether it is faster to render on the client or on the server, depending on the content and how you build your assets.
If someone knows other good reasons, just comment below and I will add them.
Some good reference links:
https://medium.com/airbnb-engineering/isomorphic-javascript-the-future-of-web-apps-10882b7a2ebc
https://medium.com/front-end-developers/handcrafting-an-isomorphic-redux-application-with-love-40ada4468af4
https://github.com/xgrommx/awesome-redux
Related
When I discovered that Node.js was built using the V8 JavaScript engine, I thought:
Great, web scraping will be easier as the page
will be rendered like in the browser, with a
"native" DOM supporting XPath and any AJAX calls on
the page executed.
Why doesn't it have a native DOM when it uses the same JavaScript engine as Chrome?
Why doesn't it have a mode to run JavaScript in retrieved pages?
What am I not understanding about JavaScript engines vs the engine in a web browser?
Many thanks!
The DOM is the DOM, and the JavaScript implementation is simply a separate entity. The DOM represents a set of facilities that a web browser exposes to the JavaScript environment. There's no requirement however that any particular JavaScript runtime will have any facilities exposed via the global object.
What Node.js is is a stand-alone JavaScript environment completely independent of a web browser. There's no intrinsic link between web browsers and JavaScript; the DOM is not part of the JavaScript language or specification or anything.
I use the old Rhino Java-based JavaScript implementation in my Java-based web server. That environment also has nothing at all to do with any DOM. It's my own application that's responsible for populating the global object with facilities to do what I need it to be able to do, and it's not a DOM.
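The same idea can be illustrated in Node.js itself with its built-in vm module: the host application decides what the script's global object contains. This is only a sketch; the print function and the log array are hypothetical names chosen for the example.

```javascript
const vm = require('node:vm');

// The host decides what the script's global object contains.
const globals = {
  log: [],
  print(message) {
    globals.log.push(String(message));
  },
};
vm.createContext(globals);

// The script sees `print` because the host supplied it, and sees no
// `document` because the host did not.
const script = `
  print('document is ' + typeof document);
  print('print is ' + typeof print);
`;
vm.runInContext(script, globals);

console.log(globals.log);
```

The engine runs the language; whatever else the script can reach is up to the embedder.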
Note that there are projects like jsdom if you want a virtual DOM in your Node project. Because of its very nature as a server-side platform, a DOM is a facility that Node can do without and still make perfect sense for a wide variety of server applications. That's not to say that a DOM might not be useful to some people, but it's just not in the same category of services as things like process control, I/O, networking, database interop, and so on.
There may be some "official" answer to the question "why?" out there, but it's basically just the business of those who maintain Node (the Node Foundation now). If some intrepid developer out there decides that Node should ship by default with a set of modules to support a virtual DOM, and puts in the work to make that happen, then Node will have a DOM.
P.S: When reading this question I was also wondering if V8 (node.js is built on top of this) had a DOM
Why when it uses the same JS engine as Chrome doesn't it have a native
DOM?
But I searched Google and found Google's V8 page, which says the following:
JavaScript is most commonly used for client-side scripting in a
browser, being used to manipulate Document Object Model (DOM) objects
for example. The DOM is not, however, typically provided by the
JavaScript engine but instead by a browser. The same is true of
V8—Google Chrome provides the DOM. V8 does however provide all the
data types, operators, objects and functions specified in the ECMA
standard.
node.js uses V8 and not Google Chrome.
Likewise, why doesn't it have a mode to run JS in retrieved pages?
I also think we don't really need it that bad. Ryan Dahl created node.js as one man (single programmer). Maybe now he (his team) will develop this, but I was already extremely amazed by the amount of code he produced (crazy). He wanted to make a non-blocking easy/efficient library, which I think he did a mighty good job at.
But then again, another developer created a module which is pretty good and actively developed (today) at https://github.com/tmpvar/jsdom.
What am I not understanding about Javascript engines vs the engine in
a web browser? :)
Those are different things as is hopefully clear from the quote above.
The Document Object Model (DOM in short) is a programming interface for HTML and XML documents and it represents the page so that programs can change the document structure, style, and content. More on this subject.
The necessary distinction between client-side (browser) and server-side (Node.js) and their main goals:
Client-side: accessing and displaying information of the web
Server-side: providing stable and reliable ways to deliver web information
Why is there no DOM in Node.js by default?
By default, Node.js doesn't have access, nor have any knowledge about the actual DOM in your own browser. Node.js just delivers the data, that will be used by your own browser to process and render the whole website, the DOM included. The server provides the data to your browser to use and process. That is the intended way.
Why wouldn't you want to access the DOM in Node.js?
Accessing your browser's actual DOM from Node.js is simply outside the purpose of the server. Your browser's role is to display the data coming from the server. However, it is certainly possible, and there are multiple solutions of varying depth for pre-rendering, manipulating or changing the DOM using AJAX calls. We'll see what future trends will bring.
Why would you want to access the DOM in Node.js?
By default, you shouldn't access your actual DOM (or at least some of its data) using Node.js. Client side and server side are separated in terms of role, functionality, and responsibility, based on years of experience and knowledge. There are, however, several situations where there are solid reasons to do so:
Gathering usage data (A/B testing, UI/UX efficiency and feedback)
Headless testing (Development, automation, web-scraping)
How can you access the DOM in Node.js?
jsdom: pure-JavaScript implementation, good for testing your own DOM/browser-related project
cheerio: great solution if you like/often use jQuery
puppeteer: Google's own way to provide headless testing using Google Chrome
own solution (your possible future project link here)
These solutions do not provide a way to access your browser's own, actual DOM by default, but you can create a project that sends some form of data about your DOM to the server and then use, render or manipulate that data based on your needs.
...and yes, web-scraping and web development in terms of tools and utilities became more sophisticated and certainly easier in several fields.
node.js chose not to include it in their standard library. For any functionality, there is an inevitable tradeoff between comprehensiveness, scalability, and maintainability.
That doesn't mean it's not potentially useful. There is at least one JavaScript DOM implementation intended for NodeJS (among other CommonJS implementations).
You seem to have a flawed assumption that V8 and the DOM are inextricably related; that's not the case. The DOM is actually handled by WebKit (Chrome's rendering engine at the time; Chrome has since moved to Blink). V8 doesn't implement the DOM, it only executes the JavaScript calls to it. Don't let this discourage you: Node.js has carved out a significant niche in the realtime server market, but don't let anybody tell you it's just for servers. Node makes it possible to build almost anything with JavaScript.
It is possible to do what you're talking about. For example there is the very good jsdom library if you really need access to the DOM, and node-htmlparser, there are also some really good scraping libraries that take advantage of these like apricot.
2018 answer: mainly for historical reasons, but this may change in future.
Historically, very little DOM manipulation was done on the server. Additionally, as other answers allude, the JS stdlib and the DOM are separate libraries - if you're using node for, say, Unix scripting, then HTMLElement, NodeList, etc. aren't really relevant to that.
However: server-side DOM manipulation is now a very common part of delivering web apps. Web servers need to understand the structure of pages, and, if asked to render a resource as HTML, deliver HTML content that reflects the initial state of a web application. This means web apps load much faster than if the server simply delivers a stub page and has the browsers then do the work of filling in the real content. Currently this is done with JSDom and similar, but in the same way node has Request and Response objects built in, having DOM functions maintained as part of the stdlib would help with this task.
Javascript != browser. Javascript as a language is not tied to browsers; node.js is simply an implementation of Javascript that is intended for servers, not browsers. Hence no DOM.
If you read DOM as 'linked objects immediately accessible from my script', then the answer is 'it does, but it's very different from the set of objects available to a web document's script'. The main reason is that node is 'evented I/O for V8', not 'HTML tree objects for V8'.
Node is a runtime environment, it does not render a DOM like a browser.
Because there isn't a DOM. DOM stands for Document Object Model. There is no document in Node, so there is no DOM to manipulate. That is definitely a browser thing.
You can use a library like cheerio though which gives you some simple DOM manipulation.
Node is server-level JavaScript. It's just the language applied to a basic system API, more like C++ or Java.
It seems people have answered 'why' but not 'how'. A quick answer to 'how' is that in a web browser, a document object is exposed (hence DOM, Document Object Model). On Windows this object is called the document object. You can refer to this page and look at the methods it exposes for handling HTML documents, like createElement. I don't use node.js and haven't done COM programming in a while, but I'd imagine you could use the DOM in node.js by calling the COM interface IHTMLDocument3. Of course, for other platforms like Mac OS X or Linux you would probably have to use something from their OS API. This should allow you to build a webpage server-side using the DOM, or to scrape incoming web pages.
Node.js is for serverside programming. There is no DOM to be rendered in the server.
1) What does it mean for it to have a Document Object Model? There's no document to represent.
2) Most of the time you're not retrieving pages. You can, but most Node apps probably won't be.
3) Without a document and a browser, Javascript is just another programming language. So you may ask why there isn't a DOM in C# or Java
Good afternoon!
We're looking to get a javascript variable from a webpage, that we are usually able to retrieve typing app in the Chrome DevTools.
However, we're looking to realize this headlessly as it has to be performed on numerous apps.
Our ideas :
Using a Puppeteer instance to go to the page, type the command and return the variable, which works, but is very resource-consuming.
Using a GET/POST request to the page trying to inject the JS command, but we didn't succeed.
We're then wondering if there will be an easier solution, as a special API that could extract the variable?
The goal would be to automate this process with no human interaction.
Thanks for your help!
Your question is not so much about a JS API (since the webpage is not yours to edit, you can only request it) as it is about webcrawling / browser automation.
You have to add details to get a definitive answer, but I see two scenarios:
the website actively checks for evidence of human browsing (for example, it sits behind CloudFlare and has requested this option); or the scripts depend heavily on there being a browser execution environment available. In this case, the simplest option is to automate a browser, because a headless option has to get many things right to fool the server or the scripts. I would use karate, which is easier than, say, selenium and can execute in-browser scripts. It is written in Java, but you can execute it externally and just read its reports.
the website does not check for such evidence and the scripts do not really require a browser execution environment. Then you can simply download everything required locally and attempt to jury-rig the JS into executing in any JS environment. According to your post, this fails; but it is impossible to help unless you can describe how it fails. This option can be headless.
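For the second scenario, one way to jury-rig this might be Node's built-in vm module with a minimal stand-in for window. This is a sketch under strong assumptions: the page's script must not need a real DOM, the variable must be set on window or declared with var at the top level, and the script text must already have been downloaded by other means. The extractGlobal helper and the sample script are hypothetical. The vm module is not a security boundary, so this suits only scripts you are prepared to run.

```javascript
const vm = require('node:vm');

// Runs script source text in a fresh context with a minimal stand-in for
// `window`, then reads back one global variable by name.
function extractGlobal(scriptSource, variableName) {
  const sandbox = {};
  sandbox.window = sandbox; // in browsers, `window` is the global object itself
  sandbox.self = sandbox;
  vm.createContext(sandbox);
  // The timeout guards against scripts that never finish.
  vm.runInContext(scriptSource, sandbox, { timeout: 1000 });
  return sandbox[variableName];
}

// Stand-in for script text that would have been downloaded from the page.
const downloaded = 'window.app = { version: "1.2.3", ready: true };';
const app = extractGlobal(downloaded, 'app');
console.log(app.version);
```

If the script throws because it expects document, navigator or similar, that is a sign the page falls into the first scenario and a real browser is the simpler route.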
You can embed Chrome into your application and instrument it. It will be headless.
We've used this approach in the past to copy content from PowerPoint Online.
We were using .NET to do this and therefore used CEFSharp.
We have an app that sits behind a firewall and behind a CAS authentication layer. It has a feature that allows users with a special role to customize the way the app works by writing JavaScript functions that get inserted into the application at runtime, and which can be fired by events such as button clicks and page load and the like. (The JS is not "eval"'d - it is written into the page server-side.)
Needless to say, this feature raises security concerns!
Are there recommendations beyond what's being done already to secure this, that is beyond a) firewall, b) robust authentication and c) authorization.
EDIT:
In response to questions in comments:
1. Does the injected code become part of the application, or it is executed as an independent application (separated context)?
Yes, it becomes a part of the application. It currently gets inserted, server-side, into a script tag.
Does inserted JavaScript run on clients' browsers other than the original writer?
Yes. It gets persisted, and then gets inserted into all future requests.
(The application can be thought of as an "engine" for building custom applications against a generic backend data store which is accessed by RESTful calls. Each custom application can have its own set of these custom JavaScripts.)
You really shouldn't just accept arbitrary JavaScript. Ideally, what should happen is that you tokenize whatever JavaScript is sent and ensure that every token is valid JavaScript, first and foremost (this should apply in all below scenarios).
After that, you should verify that whatever JavaScript is sent does not access sensitive information.
That last part may be extremely difficult or even impossible to verify in obfuscated code, and you may need to consider that no matter how much verification you do, this is an inherently unsafe practice. As long as you understand that, below are some suggestions for making this process a little safer than it normally is:
As @FDavidov has mentioned, you could also restrict the JavaScript from running as part of the application and sandbox it in a separate context much like Stack Snippets do.
Another option is to restrict the JavaScript to a predefined whitelist of functions (some of which you may have implemented) and globals. Do not allow it to interact directly with DOM or globals except of course primitives, control flow, and user-defined function definitions. This method does have some success depending on how robustly enforced the whitelist is. Here is an example that uses this method in combination with the method below.
Alternatively, if this is possible with what you had in mind, do not allow the code to run on anyone's machine other than the original author of the code. This would basically be moving a Userscript-like functionality into the application proper (which I honestly don't see the point), but it would definitely be safer than allowing it to run on any client's browser.
I have the following situation. A customer uses JavaScript with jQuery to create a complex website. We would like to use JavaScript and jQuery on the server (IIS) for the following reasons:
Skills transfer - we would like to use JavaScript and jQuery on the server and not have to use e.g. VBScript / classic ASP. The .NET Framework, Java, etc. are ruled out because of this.
Improved options for search/accessibility. We would like to be able to use jQuery as a templating system, but this isn't viable for search engines and users with js turned off - unless we can selectively run this code on the server.
There is significant investment in IIS and Windows Server, so changing that is not an option.
I know you can run jScript on IIS using windows Script host, but am unsure of the scalability and the process surrounding this. I am also unsure whether this would have access to the DOM.
Here is a diagram that hopefully explains the situation. I was wondering if anyone has done anything similar?
EDIT: I am not looking for criticism of the web architecture; I simply want to know if there are any options for manipulating the DOM of a page before it is sent to the client, using JavaScript. Jaxer is one such product (no IIS). Thanks.
Have a look at bringing the browser to the server, Rhino, and Use Microsoft's IIS as a Java servlet engine.
The first link is from John Resig's (jQuery's creator) blog.
Update August 2 2011
Node.js is coming to Windows.
The idea to reuse client JS on the server may sound tempting, but I am not sure that jQuery itself would be ready to run in server environment.
You will need to define global context for jQuery somehow by initializing window, document, self, location, etc.. I am not sure it is doable.
Besides, as Cheeso has mentioned, Active Server Pages is a very outdated technology; it was replaced with ASP.NET by Microsoft at the beginning of the century. I used to maintain a legacy system using ASP 3.0 for more than a year and that was a pain. The most wonderful pastime was debugging: you will hardly find any tooling for the purpose today and will have to decipher beautiful errors like this one in the IIS log:
error '800a9c68'
Application-defined or object-defined error
Nevertheless, I can confirm that I managed to reuse client and server JScript. But this was code written by me who knew that it was going to be used on the server.
P.S. I would not recommend going that way. There are plenty of templating frameworks which are familiar to those who write HTML and JavaScript.
JScript runs on IIS via something called ASP.
Active Server Pages.
It was first available in 1996.
Eventually ASP.NET was introduced as a successor. But ASP is still supported.
There is no DOM for the HTML page, though.
You might need to reconsider your architecture a bit.
I think the only viable solutions you're likely to find anywhere near ready to go involve putting IIS in front of Java. There are two browser-like environments I'm aware of coded for Java:
1) Env-js (see http://groups.google.com/group/envjs and http://github.com/thatcher/env-js )
I believe this one has contributions from jQuery's John Resig and was put together with jQuery testing/support in mind.
2) HTMLUnit (see http://htmlunit.sourceforge.net/ ) This one's older, and wasn't originally conceived around jQuery, but there are reports in the wild of using it to run jQuery's test suite successfully (http://daniel.gredler.net/2007/08/08/htmlunit-taming-jquery/ ).
If you want something pure-IIS/MS, I think your observation about Windows Script Host and/or something like the semi-abandoned JScript.NET is probably about as close as you're going to come, along with a port (which you'll probably have to start) of something like Env-js or HTMLUnit.
Also, I don't know if you've seen the Wikipedia list of server-side JavaScript solutions:
http://en.wikipedia.org/wiki/Server-side_JavaScript
Finally... you could probably write a serviceable jQuery-like library in any language that already has some kind of DOM library and first-class functions (or, failing that an eval facility). See, for example pQuery for Perl (http://metacpan.org/pod/pQuery ). This would get you the benefits of the jQuery style of manipulating documents. Skill transfer is great and JavaScript has a wonderful confluence of very nice features, but on the other hand, having developers who care enough to learn multiple languages is also great, and js isn't the only nice language out there.
I think it's mainly a browser-based script, so you are probably better off using technologies based on VB or .NET to generate HTML from templates. I'm sure there are some, because in the Java world there are a few of these around (like Velocity). You'd then use jQuery to add client-side functionality so that the website is more usable than it would otherwise have been.
What exactly do you mean by
"A customer uses JavaScript with
jQuery to create a complex website"
Half the point of jQuery is to make it easy for the developer to manipulate the DOM, and therefore add interactive enhancements to a web site. By running the Javascript on the server and only rendering HTML you will lose the ability to add these enhancements, without doing a round trip to the server (think WebForms postback model...ugh).
Now if what you really mean is the customer uses a site builder based on jQuery, why not have that tool output flat HTML in the first place?
Take a look at this technology. You can invoke scripts to run at server, at client, or both. Plus, this really implements the firefox engine on the server. Take a look at it.
Aptana's Jaxer is the first AJAX web server so far. I have not tried it yet, but I will. Looks promising and very powerful.
I was looking into GWT. It seems nice, but our software has a requirement that it must work without JS. Is it possible?
No, it isn't. GWT provides a windowing toolkit that is specifically designed to run on the client, not on the server. Degraded (e.g. non-javascript) code would need to deliver complete HTML to the browser, which GWT simply does not do. It compiles your java code to a javascript file that is delivered to the client and builds the UI by DOM-manipulation on the client. Then there's some code to talk back to the server, some implicit, some written by you yourself. This model does not lend itself well to degrading gracefully.
The only way to degrade somewhat gracefully is to provide a second, non-javascript UI or use another toolkit that doesn't render the frontend on the client but delivers HTML. Sorry.
You could degrade gracefully by creating an html structure that is just 'good enough' (with form posts, linked menus, etc) and then have GWT attach to each part of that structure, augmenting its behavior. For example, make an HTML drop down dynamic, replace a link to another page with a component that opens a lightbox, or replace a link to another page with an XML http request to do the same thing (e.g. cast a vote).
I've done this a number of times for clients.
It's the opposite way that most GWT gets developed, but it can work.
I was looking at this issue myself when designing my website. GWT isn't really any better than just writing Javascript files in that their syntax is almost identical. The true benefit comes when you share client and server libraries. Hopefully you've resolved this issue in the last two years, but at any rate here are a couple examples that you may find useful.
Creating Gmail: With GWT, you can create an EmailFormatter in a shared package that does the email listing markup so that your server doesn't have to. You could then add support for legacy browsers ("older version") by using the same EmailFormatter class on the server side.
Form verification: While it is absolutely necessary from a security perspective to validate user input server side, it is more convenient for most users to have Javascript check a form before it is submitted. You can use the same Java code with GWT to do this.
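The shared-validation idea is not specific to GWT. Expressed in plain JavaScript rather than GWT's Java, a rough sketch might look like the following, where validateSignup and its deliberately simplistic rules are hypothetical. The same file could be loaded by the browser for a quick check before submitting and by the server for the authoritative check.

```javascript
// One validation routine, written against built-ins only so the same file
// can be loaded by the browser and by the server.
function validateSignup(form) {
  const errors = [];
  if (typeof form.email !== 'string' || !/^[^@\s]+@[^@\s]+$/.test(form.email)) {
    errors.push('email');
  }
  if (typeof form.password !== 'string' || form.password.length < 8) {
    errors.push('password');
  }
  return errors;
}

// Export for Node.js when a CommonJS `module` exists; in a browser the
// function is simply a global loaded via a <script> tag.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = { validateSignup };
}
```

The server-side call remains the one that matters for security; the client-side call only saves the user a round trip.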