Using Browser Javascript on Server Side (in Rails 3) - javascript

There is a great bookmarklet script that takes a HTML document and, using javascript, strips out the main article content (like Instapaper, but better).
I want to know the most efficient way to use this same javascript script on the server side with Rails 3.
Is it even possible? The ideal would be to be able to request a URL from the server (in Rails) and then parse the response using the javascript, and return the processed text (and then persist it to a db).
I was thinking of just adapting the script in Ruby, but this seems silly, especially since jQuery and javascript itself have a bunch of built-in functions for parsing a DOM. On the other hand, the script uses DOM constructions from the browser, so it might require a server-side browser?
Any suggestions?

We actually do this very thing in one of our webapps. If you want to implement this functionality server-side in your Ruby on Rails application, your best bet is to use a Ruby HTML/XML parsing library, such as Nokogiri.
I wrote an article specifically explaining how to strip out the important information from a linked webpage, like Instapaper does, using Ruby + Nokogiri.
Create a Printable Format for Any Webpage with Ruby and Nokogiri

Maybe run the script in something like Rhino Shell and capture the output?

Node.js comes to mind when speaking about server-side Javascript.
I think the Javascript stuff of readability could also be translated into Ruby but this would probably require some serious amount of work.

Related

Incorporating a C# library within a Javascript "wrapper"

I'm building a plain old HTML website that I'm putting a little Javascript into. Out of curiosity, I've been considering using other languages to accomplish the tasks that I want to achieve using Javascript, which include locating (and ordering) objects by date and tag variables.
Is it at all possible to incorporate C# in my JS code, referencing a library located somewhere in the site's directory? Would it theoretically be possible, or would I essentially just be translating the C# into the equivalent Javascript?
Sorry, you can't. Browsers can only run Javascript code. There are other languages (coffeescript, typescript, etc) that "compile" to javascript, but in the end the only valid language for client side scripting is Javascript.
There are some "C# to javascript" translators, however that only gives you a similar syntax, but not the wide standard library that .NET Framework comes with.
What you can do however is run your C# code on server side (you can run ANYTHING there, since it's completely under your control) and then use AJAX to call it. This is known as a "web service" and it's a pretty standard practice.

Load a DOM and Execute javascript, server side, with .Net

I would like to load a DOM using a document (in string form) or a URL, and then Execute javascript functions (including jquery selectors) against it. This would be totally server side, in process, no client/browser.
Basically I need to load the dom and then use jquery selectors and text() & type val() functions to extract strings from it. I don't really need to manipulate the dom.
I have looked at .Net javascript engines such as Jurassic and Jint, but neither support loading a DOM, and so therefore can't do what I need.
I would be willing to consider non .Net solutions (node.js, ruby, etc) if they exist, but would really prefer .Net.
edit
The below is a good answer, but currently I'm trying a different route, I'm attempting to port envjs to jurassic. If I can get that working I think it will do what I want, stay tuned....
The answer depends on what you are trying to do. If your goal is basically a complete web browser simulation, or a "headless browser," there are a number of solutions, but none of them (that I know of) exist cleanly in .NET. To mimic a browser, you need a javascript engine and a DOM. You've identified a few engines; I've found Jurassic to be both the most robust and fastest. The google chrome V8 engine is also very popular; the Neosis Javascript.NET project provides a .NET wrapper for it. It's not quite pure .NET since you have a non-.NET dependency, but it integrates cleanly and is not much trouble to use.
But as you've noted, you still need a DOM. In pure C# there is XBrowser, but it looks a bit stale. There are javascript-based representations of the entire browser DOM like jsdom, too. You could probably run jsdom in Jurassic, giving you a DOM simulation without a browser, all in C# (though likely very slowly!) It would definitely run just fine in V8. If you get outside the .NET realm, there are other better-supported solutions. This question discusses HtmlUnit. Then there's Selenium for automating actual web browsers.
Also, bear in mind that a lot of the work done around the these tools is for testing. While that doesn't mean you couldn't use them for something else, they may not perform or integrate well for any kind of stable use in inline production code. If you are trying to basically do real-time HTML manipulation, then a solution mixing a lot of technologies not that aren't widely used except for testing might be a poor choice.
If your need is actually HTML manipulation, and it doesn't really need to use Javascript but you are thinking more about the wealth of such tools available in JS, then I would look at C# tools designed for this purpose. For example HTML Agility Pack, or my own project CsQuery, which is a C# jQuery port.
If you are basically trying to take some code that was written for the client, but run it on a server -- e.g. for sophisticated/accelerated web scraping -- I'd search around using those terms. For example this question discusses this, with answers including PhantomJS, a headless webkit browser stack, as well as some of the testing tools I have already mentioned. For web scraping, I would imagine you can live without it all being in .NET, and that may be the only reasonable answer anyway.

Is it possible to create dynamic HTML pages with Javascript?

Is it possible to create dynamic HTML page with Javascript ? Now or tomorrow...
(Is it possible to see javascript replacing PHP, ASP, JSP or ASP.NET ?)
It sounds like your question is whether JS can be used for server-side code. The answer is yes - the most popular is certainly Node.js, which I highly recommend.
It's not up to version 1 yet, but it's already being used in production by a bunch of high-profile companies.
For more info, see this SO question.
Dynamic HTML (DHTML) is defined as JavaScript + HTML + CSS. From wikipedia:
Dynamic HTML, or DHTML, is an umbrella term for a collection of technologies used together to create interactive and animated web sites by using a combination of a static markup language (such as HTML), a client-side scripting language (such as JavaScript), a presentation definition language (such as CSS), and the Document Object Model.
But, it sounds like you are asking about using JavaScript on the server. ASP uses JavaScript (or vbscript). ASP.Net can use JScript.Net. Node.js is a newer server implementation of JavaScript.
The thing about server technologies like ASP or JSP is that there is more to them than just the programming language. They include frameworks and templating engines. JavaScript can't do this on its own because it requires things like declarative syntax. But, as a language, JavaScript has been used on the server for a long time.
Yes, you can take a look at Node.js which is a server implementation of Javascript.
If you want to do this without any server-side code. Yes you can by using a templating system. You consume all of the site information, usually in JSON format, then generate all HTML client-side.
http://net.tutsplus.com/tutorials/javascript-ajax/quick-tip-an-introduction-to-jquery-templating/
Of course, you need a method to update the data files to make this "dynamic".

Execute javascript on IIS server

I have the following situation. A customer uses JavaScript with jQuery to create a complex website. We would like to use JavaScript and jQuery on the server (IIS) for the following reasons:
Skills transfer - we would like to use JavaScript and jQuery on the server and not have to use eg VB Script. / classic asp. .Net framework/Java etc is ruled out because of this.
Improved options for search/accessibility. We would like to be able to use jQuery as a templating system, but this isn't viable for search engines and users with js turned off - unless we can selectively run this code on the server.
There is significant investment in IIS and Windows Server, so changing that is not an option.
I know you can run jScript on IIS using windows Script host, but am unsure of the scalability and the process surrounding this. I am also unsure whether this would have access to the DOM.
Here is a diagram that hopefully explains the situation. I was wondering if anyone has done anything similar?
EDIT: I am not looking for critic on web architecture, I am simply wanting to know if there are any options for manipulating the DOM of a page before it is sent to the client, using javascript. Jaxer is one such product (no IIS) Thanks.
Have a look at bringing the browser to the server, Rhino, and Use Microsoft's IIS as a Java servlet engine.
The first link is from John Resig's (jQuery's creator) blog.
Update August 2 2011
Node.js is coming to Windows.
The idea to reuse client JS on the server may sound tempting, but I am not sure that jQuery itself would be ready to run in server environment.
You will need to define global context for jQuery somehow by initializing window, document, self, location, etc.. I am not sure it is doable.
Besides, as Cheeso has mentioned, Active Server Pages is a very outdated technology, it was replaced with ASP.Net by Microsoft in the beginning of the century. I used to maintain a legacy system using ASP 3.0 for more than a year and that was pain. The most wonderful pastime was debugging: you will hardly find anything for the purpose today and will have to decript beautiful errors like in IIS log:
error '800a9c68'
Application-defined or object-defined error
Nevertheless, I can confirm that I managed to reuse client and server JScript. But this was code written by me who knew that it was going to be used on the server.
P.S. I would not recommend move that way. There are plenty templating frameworks which are familiar to those who write HTML and JavaScript.
JScript runs on IIS via something called ASP.
Active Server Pages.
It was first available in 1996.
Eventually ASP.NET was introduced as a successor. But ASP is still supported.
There is no DOM for the HTML page, though.
You might need to reconsider your architecture a bit.
I think the only viable solutions you're likely to find anywhere near ready to go involve putting IIS in front of Java. There are two browser-like environments I'm aware of coded for Java:
1) Env-js (see http://groups.google.com/group/envjs and http://github.com/thatcher/env-js )
I believe this one has contributions from jQuery's John Resig and was put together with jQuery testing/support in mind.
2) HTMLUnit (see http://htmlunit.sourceforge.net/ ) This one's older, and wasn't originally conceived around jQuery, but there are reports in the wild of using it to run jQuery's test suite successfully (http://daniel.gredler.net/2007/08/08/htmlunit-taming-jquery/ ).
If you want something pure-IIS/MS, I think your observation about windowsScript host and/or something like the semi-abandoned JScript.NET is probably about as close as you're going to come, along with a port (which you'll probably have to start) of something like Env-js or HTMLUnit.
Also, I don't know if you've seen the Wikipedia list of server-side JavaScript solutions:
http://en.wikipedia.org/wiki/Server-side_JavaScript
Finally... you could probably write a serviceable jQuery-like library in any language that already has some kind of DOM library and first-class functions (or, failing that an eval facility). See, for example pQuery for Perl (http://metacpan.org/pod/pQuery ). This would get you the benefits of the jQuery style of manipulating documents. Skill transfer is great and JavaScript has a wonderful confluence of very nice features, but on the other hand, having developers who care enough to learn multiple languages is also great, and js isn't the only nice language out there.
I think it's mainly a browser based script so probably you are better of using technologies based on VB or .NET to perform or generate HTML from templates. I'm sure there are because in the java world there are a few of these around (like velocity). You'd then use jQuery to create or add client side functionality and usability so it makes the website more usable than it would have been.
What exactly do you mean by
"A customer uses JavaScript with
jQuery to create a complex website"
Half the point of jQuery is to make it easy for the developer to manipulate the DOM, and therefore add interactive enhancements to a web site. By running the Javascript on the server and only rendering HTML you will lose the ability to add these enhancements, without doing a round trip to the server (think WebForms postback model...ugh).
Now if what you really mean is the customer uses a site builder based on jQuery, why not have that tool output flat HTML in the first place?
Take a look at this technology. You can invoke scripts to run at server, at client, or both. Plus, this really implements the firefox engine on the server. Take a look at it.
Aptana's Jaxer is the first AJAX web server so far. I have not tryed it yet, but I will. Looks promising and very powerful.

Tree-Based (vs. HTML-Based) Web Framework?

Anyone who writes client-side JavaScript is familiar with the DOM - the tree structure that your browser references in memory, generated from the HTML it got from the server. JavaScript can add, remove and modify nodes on the DOM tree to make changes to the page. I find it very nice to work with (browser bugs aside), and very different from the way my server-side code has to generate the page in the first place.
My question is: what server-side frameworks/languages build a page by treating it as a DOM tree from the beginning - inserting nodes instead of echoing strings? I think it would be very helpful if the client-side and server-side code both saw the page the same way. You could certainly hack something like this together in any web server language, but a framework dedicated to creating a page this way could make some very nice optimizations.
Open source, being widely deployed and having been around a while would all be pluses.
You're describing Rhino on Rails, which is not out but will be soon.
Similarly, Aptana Jaxer, however RnR will include an actual framework (Rails) whereas Jaxer is just the server technology.
Aptana's Jaxer AJAX server might be something for you to check out, as it uses JS server-side, as well.
That being said, I would argue that you're better off not generating your markup with print statements or echos, but rather template and hook in your dynamic content.
Jaxer is server-side javascript + the DOM. You can integrate jaxer with other languages, by post-processing their output.
Also in java, php, ... you can use xpath to manipulate the DOM.
I see where you're coming from but it's all a bit moot isn't it. You can't send anything but rendered content to the browser, and you have to do it all in one go (AJAX aside). There's no value from what you are suggesting (from what I can see) as even if you build it tree-like, you're still only building a page which is sent wholesale to the client.

Categories