I use JavaScript for rendering 20 tables of 100 rows each.
The data for each table is provided by the controller as JSON.
Each table is split into sections that have "totals" and some other JavaScript logic. Some totals are outside of the table itself.
As a result, the JavaScript blocks the browser for a couple of seconds (especially in IE6) :(
I was considering using http://code.google.com/p/jsworker/,
however Google Gears Workers (and I guess workers in general) will not allow me to make changes to the DOM from the worker code, and it also seems that I cannot use jQuery inside the jsworker worker code. (Maybe I am wrong here?)
This issue seems fundamental to JavaScript coding practice. Can you share your thoughts on how to approach it?
Workers can only communicate with the page by passing messages. They are unable to interact with the DOM directly because this would introduce enormous thread-safety difficulties.
You will need to optimise your DOM manipulation code to speed up the processing time. It's worth consulting Google for good practices.
One way to speed up execution is to build the table outside of the DOM and insert it into the document only when it is completed. This stops the browser having to re-draw on every insertion, which is where a lot of the time is spent.
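To make that concrete, here is a minimal sketch of building a table off-DOM and attaching it once; the container id, the rows argument, and its label field are hypothetical placeholders for your own JSON data.

```javascript
// Minimal sketch: build the whole table off-DOM, then insert it once.
// "rows" and its "label" field are hypothetical stand-ins for your JSON data.
function renderTable(containerId, rows) {
  var table = document.createElement('table');

  for (var i = 0; i < rows.length; i++) {
    var tr = document.createElement('tr');
    var td = document.createElement('td');
    td.appendChild(document.createTextNode(rows[i].label));
    tr.appendChild(td);
    table.appendChild(tr);
  }

  // Only one insertion into the live document, so only one reflow/repaint,
  // instead of one per row.
  document.getElementById(containerId).appendChild(table);
}
```

Yielding back to the browser between tables (for example with setTimeout(..., 0)) can further reduce how long the UI appears frozen.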
You are not supposed to change the UI from a background worker in general. You should always signal the main thread that the worker is finished, and return a result to the main thread, which can then process it.
HTML5 includes an async attribute for script tags, and writing to the page from such a script (e.g. with document.write) gives you a blank page containing just the stuff you wanted to write.
So you need another approach there.
I am not a JS guru, so I can't help you with an implementation, but at least you now have the concept :)
If you want to dig into the world of browser performance, the High Performance Web Sites blog has heaps of great info, including when JavaScript blocks page rendering and best practices to avoid such issues.
It looks like I'm asking about a tricky problem that's been explored a lot over the past decades without a clear solution. I've seen Is It Possible to Sandbox JavaScript Running In the Browser? along with a few smaller questions, but all of them seem to be mislabeled - they all focus on sandboxing cookies and DOM access, and not JavaScript itself, which is what I'm trying to do; iframes or web workers don't sound exactly like what I'm looking for.
Architecturally, I'm exploring the pathological extreme: not only do I want full control of what functions get executed, so I can disallow access to arbitrary functions, DOM elements, the network, and so forth, I also really want to have control over execution scheduling so I can prevent evil or poorly-written scripts from consuming 100% CPU.
Here are two approaches I've come up with as I've thought about this. I realize I'm only going to perfectly nail two out of fast, introspected and safe, but I want to get as close to all three as I can.
Idea 1: Put everything inside a VM
While it wouldn't present a JS "front", perhaps the simplest and most architecturally elegant solution to my problem could be a tiny, lightweight virtual machine. Actual performance wouldn't be great, but I'd have full introspection into what's being executed, and I'd be able to run eval inside the VM and not at the JS level, preventing potentially malicious code from ever encountering the browser.
Idea 2: Transpilation
First of all, I've had a look at Google Caja, but I'm looking for a solution itself written in JS so that the compilation/processing stage can happen in the browser without me needing to download/run anything else.
I'm very curious about the various transpilers (TypeScript, CoffeeScript, this gigantic list, etc) - if these languages perform full tokenization->AST->code generation that would make them excellent "code firewalls" that could be used to filter function/DOM/etc accesses at compile time, meaning I get my performance back!
My main concern with transpilation is whether there are any attacks that could be used to generate the kind of code I'm trying to block. These languages' parsers weren't written with security in mind, after all. This was my motivation behind using a VM.
This technique would also mean I lose execution introspection. Perhaps I could run the code inside one or more web workers and "ping" the workers every second or so, killing off workers that [have presumably gotten stuck in an infinite loop and] don't respond. That could work.
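A minimal sketch of that watchdog idea follows; the worker file name and the ping/pong message format are assumptions for illustration. It relies on the fact that a worker stuck in a tight loop cannot run its message handler, so it never answers the ping.

```javascript
// Watchdog sketch: assumes "untrusted-worker.js" replies to {type: 'ping'}
// with {type: 'pong'} from its onmessage handler (hypothetical protocol).
var worker = new Worker('untrusted-worker.js');
var alive = true;

worker.onmessage = function (event) {
  if (event.data && event.data.type === 'pong') {
    alive = true;
  }
};

var watchdog = setInterval(function () {
  if (!alive) {
    // No pong since the last ping: the worker is presumably stuck, kill it.
    worker.terminate();
    clearInterval(watchdog);
    return;
  }
  alive = false;
  worker.postMessage({ type: 'ping' });
}, 1000);
```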
We all know we can spin up some web workers, give them some mundane tasks to do and get a response at some point, at which stage we normally parse the data and render it somehow to the user, right?
Well... Can anyone please provide a relatively in-depth explanation as to why Web Workers do not have access to the DOM? Given a well thought out OO design approach, I just don't get why they shouldn't.
EDITED:
I know this is not possible and it won't have a practical answer, but I feel I needed a more in-depth explanation. Feel free to close the question if you feel it's irrelevant to the community.
The other answers are correct; JS was originally designed to be single-threaded as a simplification, and adding multi-threading at this point has to be done very carefully. I wanted to elaborate a little more on thread safety in Web Workers.
When you send messages to workers using postMessage, one of three things happens:
1. The browser serializes the message into a JSON string, and de-serializes it on the other end (in the worker thread). This creates a copy of the object.
2. The browser constructs a structured clone of the message, and passes it to the worker thread. This allows messaging using more complex data types than 1., but is essentially the same thing.
3. The browser transfers ownership of the message (or parts of it). This applies only for certain data types. It is zero-copy, but once the main context transfers the message to the worker context, it becomes unavailable in the main context. This is, again, to avoid sharing memory.
When it comes to DOM nodes:
1. is obviously not an option, since DOM nodes contain circular references,
2. could work, but in a read-only mode, since any changes made by the worker to the clone of the node will not be reflected in the UI, and
3. is unfortunately not an option since the DOM is a tree, and transferring ownership of a node would imply transferring ownership of the entire tree. This poses a problem if the UI thread needs to consult the DOM to paint the screen and it doesn't have ownership.
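A small sketch of mechanisms 2 and 3 from the list above, assuming a worker script named worker.js (hypothetical); the transferable-object form is the standard second argument to postMessage.

```javascript
var worker = new Worker('worker.js'); // hypothetical worker script

// 2. Structured clone: the worker receives a deep copy of this object;
// mutating the copy over there changes nothing here.
worker.postMessage({ rows: [1, 2, 3], label: 'totals' });

// 3. Transfer of ownership: zero-copy, but the buffer is detached from
// the main context once it has been handed over.
var buffer = new ArrayBuffer(1024 * 1024);
worker.postMessage(buffer, [buffer]);
console.log(buffer.byteLength); // 0 - the main thread no longer owns it
```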
The entire world of Javascript in a browser was originally designed with a huge simplification - single threadedness (before there were webWorkers). This single assumption makes coding a browser a ton simpler because there can never be thread contention and can never be two threads of activity trying to modify the same objects or properties at the same time. This makes writing Javascript in the browser and implementing Javascript in the browser a ton simpler.
Then, along came a need for an ability to do some longer running "background" type things that wouldn't block the main thread. Well, the ONLY practical way to put that into an existing browser with hundreds of millions of pages of Javascript already out there on the web was to make sure that this "new" thread of Javascript could never mess up the existing main thread.
So, it was implemented in a way that the ONLY way it could communicate with the main thread or any of the objects that the main thread could modify (such as the entire DOM and nearly all host objects) was via messaging which, by its implementation is naturally synchronized and safe from thread contention.
Here's a related answer: Why doesn't JavaScript get its own thread in common browsers?
Web Workers do not have DOM access because DOM code is not thread-safe. By DOM code, I mean the code in the browser that handles your DOM calls.
Making thread-unsafe code safe is a huge project, and none of the major browsers are doing it.
The Servo project is writing a new browser from scratch, and I think they are writing their DOM code with thread-safety in mind.
As if it weren't enough that JavaScript isn't multithreaded, apparently JavaScript doesn't even get its own thread but shares one with a load of other stuff. Even in most modern browsers JavaScript is typically in the same queue as painting, updating styles, and handling user actions.
Why is that?
From my experience, an immensely improved user experience could be gained if JavaScript ran on its own thread, if only because JS would no longer block UI rendering, and because developers would be freed from the intricate and limited message-queue optimization boilerplate (yes, you too, web workers!) that they currently have to write all over the place to keep the UI responsive.
I'm interested in understanding the motivation behind such a seemingly unfortunate design decision. Is there a convincing reason from a software architecture perspective?
User Actions Require Participation from JS Event Handlers
User actions can trigger Javascript events (clicks, focus events, key events, etc.) that participate in and potentially influence the user action, so clearly the single JS thread can't be executing while user actions are being processed: if it were, it couldn't participate in those actions because it would already be doing something else. So, the browser doesn't process the default user actions until the JS thread is available to participate in that process.
Rendering
Rendering is more complicated. A typical DOM modification sequence goes like this: 1) DOM modified by JS, layout marked dirty, 2) JS thread finishes executing so the browser now knows that JS is done modifying the DOM, 3) Browser does layout to relayout changed DOM, 4) Browser paints screen as needed.
Step 2) is important here. If the browser did a new layout and screen painting after every single JS DOM modification, the whole process could be incredibly inefficient if the JS was actually going to make a bunch of DOM modifications. Plus, there would be thread synchronization issues because if you had JS modifying the DOM at the same time as the browser was trying to do a relayout and repaint, you'd have to synchronize that activity (e.g. block somebody so an operation could complete without the underlying data being changed by another thread).
FYI, there are some work-arounds that can be used to force a relayout or to force a repaint from within your JS code (not exactly what you were asking, but useful in some circumstances).
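As one small illustration of that last point, reading a layout property such as offsetHeight is a well-known way to force a synchronous relayout from script; the element id below is a hypothetical placeholder.

```javascript
var box = document.getElementById('box'); // hypothetical element id

box.style.width = '300px';   // DOM changed; layout is merely marked dirty
var h = box.offsetHeight;    // reading a layout property forces a synchronous relayout
box.style.width = (h * 2) + 'px';

// Doing this repeatedly in a loop ("layout thrashing") makes step 3) above
// run far more often than necessary.
```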
Multiple Threads Accessing DOM Really Complex
The DOM is essentially a big shared data structure. The browser constructs it when the page is parsed. Then loading scripts and various JS events have a chance to modify it.
If you suddenly had multiple JS threads with access to the DOM running concurrently, you'd have a really complicated problem. How would you synchronize access? You couldn't even write the most basic DOM operation that would involve finding a DOM object in the page and then modifying it because that wouldn't be an atomic operation. The DOM could get changed between the time you found the DOM object and when you made your modification. Instead, you'd probably have to acquire a lock on at least a sub-tree in the DOM preventing it from being changed by some other thread while you were manipulating or searching it. Then, after making the modifications, you'd have to release the lock and release any knowledge of the state of the DOM from your code (because as soon as you release the lock, some other thread could be changing it). And, if you didn't do things correctly, you could end up with deadlocks or all sorts of nasty bugs. In reality, you'd have to treat the DOM like a concurrent, multi-user datastore. This would be a significantly more complex programming model.
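To make the non-atomic find-then-modify problem concrete, here is roughly what multithreaded DOM code might have to look like; lockSubtree() and unlock() are purely hypothetical, no such API exists.

```javascript
// Purely hypothetical API, for illustration only: nothing like
// lockSubtree()/unlock() exists in the real DOM.
function renameHeading(newText) {
  var lock = document.body.lockSubtree(); // keep other threads out of <body>
  try {
    var heading = document.querySelector('h1'); // find...
    if (heading) {
      heading.textContent = newText;            // ...then modify
    }
  } finally {
    // After this, any knowledge of the DOM's state is immediately stale.
    lock.unlock();
  }
}
// Without the lock, another thread could remove or replace the <h1>
// between the querySelector() call and the assignment.
```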
Avoid Complexity
There is one unifying theme among the "single threaded JS" design decision. Keep things simple. Don't require an understanding of a multiple-threaded environment and thread synchronization tools and debugging of multiple threads in order to write solid, reliable browser Javascript.
One reason browser Javascript is a successful platform is because it is very accessible to all levels of developers and it is relatively easy to learn and to write solid code in. While browser JS may get more advanced features over time (like we got with WebWorkers), you can be absolutely sure that these will be done in a way that simple things stay simple while more advanced things can be done by more advanced developers, but without breaking any of the things that keep things simple now.
FYI, I've written a multi-user web server application in node.js and I am constantly amazed at how much less complicated much of the server design is because of the single-threaded nature of nodejs Javascript. Yes, there are a few things that are more of a pain to write (learn promises for writing lots of async code), but wow, the simplifying assumption that your JS code is never interrupted by another request drastically simplifies the design and testing, and reduces the hard-to-find-and-fix bugs that concurrency design and coding is always fraught with.
Discussion
Certainly the first issue could be solved by allowing user action event handlers to run in their own thread so they could occur any time. But, then you immediately have multi-threaded Javascript and now require a whole new JS infrastructure for thread synchronization and whole new classes of bugs. The designers of browser Javascript have consistently decided not to open that box.
The Rendering issue could be improved if desired, but at a significant complication to the browser code. You'd have to invent some way to guess when the running JS code seems like it is no longer changing the DOM (perhaps some number of ms go by with no more changes) because you have to avoid doing a relayout and screen paint immediately on every DOM change. If the browser did that, some JS operations would become 100x slower than they are today (the 100x is a wild guess, but the point is they'd be a lot slower). And, you'd have to implement thread synchronization between layout, painting and JS DOM modifications which is doable, but complicated, a lot of work and a fertile ground for browser implementation bugs. And, you have to decide what to do when you're part-way through a relayout or repaint and the JS thread makes a DOM modification (none of the answers are great).
To combine all modules into a single resource, we wrote each module into a separate script tag and hid the code inside a comment block (/* */). When the resource first loads, none of the code is parsed since it is commented out. To load a module, find the DOM element for the corresponding script tag, strip out the comment block, and eval() the code....
On an iPhone 2.2 device, 200k of JavaScript held within a block comment adds 240ms during page load, whereas 200k of JavaScript that is parsed during page load added 2600 ms. That's more than a 10x reduction in startup latency by eliminating 200k of unneeded JavaScript during page load!
http://googlecode.blogspot.co.uk/2009/09/gmail-for-mobile-html5-series-reducing.html
https://developers.google.com/speed/docs/best-practices/mobile
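As a rough sketch of the technique the quoted article describes (the script id, the module contents, and the comment-stripping regexes here are assumptions, not Gmail's actual code): ship the module inside a block comment and only strip and eval() it when it is needed.

```javascript
// Assumes the module was shipped commented-out in a script tag, e.g.
//   <script type="text/javascript" id="module-settings">
//   /* function initSettings() { ... module code ... } */
//   </script>
function loadModule(scriptId) {
  var source = document.getElementById(scriptId).text;
  // Strip the surrounding /* ... */ so the code can finally be parsed.
  var code = source.replace(/^\s*\/\*/, '').replace(/\*\/\s*$/, '');
  // The parse/compile cost is paid only now, not at page load.
  // (A real loader would evaluate in global scope, e.g. via an indirect eval.)
  eval(code);
}

loadModule('module-settings');
```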
The gmail article is more than three years old and there have been great advances in mobile performance since then, namely things like iOS's Nitro and JIT coming to mobile. Are the performance gains still to be had from using eval?
It's not the same technology issue as it was before, since JavaScript engines have become so performant. Rather, there are other considerations in terms of being more app-like.
There are tricks now that are different in approach such as using web workers for ajax requests to free up the thread, utilizing the GPU with CSS transformations and requestAnimationFrame or even asm.js. Using localStorage/sessionStorage and Application Cache is another approach along those lines where you can really get a lot of client-side caching up front to avoid calling anything more than the content JSON / images data urls / videos and load/execute things into memory as needed from those caches.
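A minimal sketch of the localStorage idea from the previous paragraph: cache a JSON payload locally so repeat visits skip the network. The URL, cache key, and callback shape are placeholders.

```javascript
// Cache JSON responses in localStorage; the URL and key naming are hypothetical.
function getCachedJson(url, callback) {
  var key = 'cache:' + url;
  var cached = localStorage.getItem(key);
  if (cached) {
    callback(JSON.parse(cached)); // served from the client-side cache
    return;
  }
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url);
  xhr.onload = function () {
    localStorage.setItem(key, xhr.responseText);
    callback(JSON.parse(xhr.responseText));
  };
  xhr.send();
}

getCachedJson('/api/content.json', function (data) {
  // render "data" into the page
});
```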
It's a different time, in other words, and your question is interesting but not focused on the areas that really make a difference in web-app performance.
When developing a web app, versus a web site, what reasons are there to use multiple HTML pages, rather than using one HTML page and doing everything through Javascript?
I would expect that it depends on the application -- maybe -- but would appreciate any thoughts on the subject.
Thanks in advance.
EDIT:
Based on the responses here, and some of my own research, if you wanted to do a single-page, fully JS-Powered site, some useful tools would seem to include:
jQuery Plug-ins:
jQuery History:
http://balupton.com/projects/jquery-history
jQuery Address:
http://plugins.jquery.com/project/jquery-address
jQuery Pagination:
http://plugins.jquery.com/project/pagination
Frameworks:
Sproutcore
http://www.sproutcore.com/
Cappuccino
http://cappuccino.org/
Possibly, JMVC:
http://www.javascriptmvc.com/
page based applications provide:
ability to work on any browser or device
simpler programming model
they also provide the following (although these are solvable by many js frameworks):
bookmarkability
browser history
refresh or F5 to repeat action
indexability (in case the application is public and open)
One of the bigger reasons is going to be how searchable your website is.
Doing everything in javascript is going to make it complicated for search engines to crawl all the content of your website, and thus not fully index it. There are ways around this (with Google's recent AJAX SEO guidelines) but I'm not sure if all search engines support this yet. On top of that, it's a little bit more complex than just making separate pages.
The bigger issue, whether you decide to build multiple HTML pages, or you decide to use some sort of framework or CMS to generate them for you, is that the different sections of your website have URLs that are unique to them. E.g., an about section would have a URL like mywebsite.com/about, and that URL is used on the actual "about" link within the website.
One of the biggest downfalls of single-page, Ajax-ified websites is complexity. What might otherwise be spread across several pages suddenly finds its way into one huge, master page. Also, it can be difficult to coordinate the state of the page (for example, tracking if you are in Edit mode, or Preview mode, etc.) and adjusting the interface to match.
Also, one master page that is heavy on JS can be a performance drag if it has to load multiple, big JS files.
At the OP's request, I'm going to discuss my experience with JS-only sites. I've written four relevant sites: two JS-heavy (Slide and SpeedDate) and two JS-only (Yazooli and GameCrush). Keep in mind that I'm a JS-only-site bigot, so you're basically reading John Hinckley on the subject of Jodie Foster.
The idea really works. It produces graceful, responsive sites at very low operational cost. My estimate is that the cost for bandwidth, CPU, and such goes to 10% of the cost of running a similar page-based site.
You need fewer but better (or at least, better-trained) programmers. JavaScript is a powerful and elegant language, but it has huge problems that a more rigid and unimaginative language like Java doesn't have. If you have a whole bunch of basically mediocre guys working for you, consider JSP or Ruby instead of JS-only. If you are required to use PHP, just shoot yourself.
You have to keep basic session state in the URL anchor (hash). Users simply expect the URL to represent the state of the site: reload, bookmark, back, forward. jQuery's Address plug-in will do a lot of the work for you.
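A bare-bones sketch of that idea without any plug-in, using the hashchange event; the showSection routing and the element id are hypothetical placeholders.

```javascript
// Keep the current view in the URL hash so reload/back/bookmark all work.
// showSection() and the 'content' element are placeholders for real routing.
function showSection(name) {
  document.getElementById('content').className = 'section-' + name;
}

window.onhashchange = function () {
  showSection(location.hash.replace(/^#/, '') || 'home');
};

// Navigation links just change the hash, e.g. <a href="#about">About</a>,
// and the browser records history for free.
window.onhashchange(); // render the state encoded in the URL on first load
```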
If SEO is an issue for you, investigate Google Ajax Crawling. Basically, you make a very simple parallel site, just for search engines.
When would I not use JS-only? If I were producing a site that was almost entirely content, where the user did nothing but navigate from one place to another, never interacting with the site in a complicated manner. So, Wikipedia and ... well, that's about it. A big reference site, with a lot of data for the user to read.
modularization.
multiple files allow you to more cleanly break out different workflow paths and process parts.
chances are your Business Rules do not usually directly impact your layout rules, and multiple files make it easier to edit what needs to be edited without the risk of breaking something unrelated.
I actually just developed my first application using only one page.
..it got messy
My idea was to create an application that mimicked the desktop environment as much as possible. In particular I wanted a detailed view of some app data to be in a popup window that would maintain its state regardless of the section of the application the user was in.
Thus my frankenstein was born.
What ended up happening due to budget/time constraints was the code got out of hand. The various sections of my JavaScript source got muddled together. Maintaining the proper state of various views I had proved to be... difficult.
With proper planning and technique I think the 'one-page' approach is a very easy way to open up some very interesting possibilities (ex: widgets that maintain state across application sections). But it also opens up many... many potential problem areas, including...
Flooding the global namespace (if you don't already have your own... make one)
Code organization can easily get... out of hand
Context - It's very easy to
I'm sure there are more...
In short, I would urge you to stay away from relying on JavaScript for the compatibility issues alone. What I've come to realize is there is simply no need to rely on JavaScript for everything.
I'm actually in the process of removing JavaScript dependencies in favor of Progressive Enhancement. It just makes more sense. You can achieve the same or similar effects with properly coded JavaScript.
The idea is to...
Develop a well-formatted, fully functional application without any JavaScript
Style it
Wrap the whole thing with JavaScript
Using Progressive Enhancement, one can develop an application that delivers the best possible experience for the user.
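As a tiny sketch of step 3 above ("wrap the whole thing with JavaScript"): a plain form that works without JS, enhanced to submit via AJAX when JS is available. The form id and URL are made up for illustration.

```javascript
// Enhancement layer for a form that already works without JavaScript:
//   <form id="comment-form" action="/comments" method="post"> ... </form>
// The id and action URL are hypothetical.
var form = document.getElementById('comment-form');
if (form && window.XMLHttpRequest) {
  form.onsubmit = function (e) {
    e.preventDefault(); // JS is available, so take over the submission
    var xhr = new XMLHttpRequest();
    xhr.open('POST', form.action);
    xhr.onload = function () {
      // update the page in place instead of doing a full reload
    };
    xhr.send(new FormData(form));
  };
}
// Without JavaScript the form still posts normally, so nothing is lost.
```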
For some additional arguments, check out The Single Page Interface Manifesto and some (mostly) negative reaction to it on Hacker News (link at the bottom of the SPI page):
The Single Page Interface Manifesto: http://itsnat.sourceforge.net/php/spim/spi_manifesto_en.php
stofac, first of all, thanks for the link to the Single Page Interface (SPI) Manifesto (I'm the author of this boring text)
That said, SPI != doing everything through Javascript.
Take a look at this example (server-centric):
http://www.innowhere.com/insites/
The same in GAE:
http://itsnatsites.appspot.com/
More info about the GAE approach:
http://www.theserverside.com/news/thread.tss?thread_id=60270
In my opinion, coding a complex SPI application/web site fully in JavaScript is very complex and problematic; the best approach, in my view, is "hybrid programming" for SPI: a mix of server-centric (for big state management) and client-centric (a.k.a. JavaScript by hand) for special effects.
Doing everything on a single page using ajax everywhere would break the browser's history/back button functionality and be annoying to the user.
I utterly despise JS-only sites where it is not needed. That extra condition makes all the difference. By way of example, consider the oft-quoted Google Docs: in this case JS not only helps improve the experience, it is essential. But some parts of Google Help have been JS-only and yet it adds nothing to the experience; it is only showing static content.
Here are reasons for my upset:
Like many, I am a user of NoScript and love it. Pages load faster, I feel safer and the more distracting adverts are avoided. The last point may seem like a bad thing for webmasters but I don't want anyone to get rewarded for pushing annoying flashy things in my face, if tactless advertisers go out of business I consider it natural selection.
Obviously this means some visitors to your site are either going to be turned away or feel hassled by the need to provide a temporary exclusion. This reduces your audience.
You are duplicating effort. The browser already has a perfectly good history function and you shouldn't need to reinvent the wheel by redrawing the previous page when a back button is clicked. To make matters worse going back a page shouldn't require re-rendering. I guess I am a student of If-it-ain't-broke-don't-fix-it School (from Don't-Repeat-Yourself U.).
There are no HTTP headers when traversing "pages" in JS. This means no cache controls, no expiries, content cannot be adjusted for requested language nor location, no meaningful "page not found" nor "unavailable" responses. You could write error handling routines within your uber-page that respond to failed AJAX fetches but that is more complexity and reinvention, it is redundant.
No caching is a big deal for me, without it proxies cannot work efficiently and caching has the greatest of all load reducing effects. Again, you could mimic some caching in your JS app but that is yet more complexity and redundancy, higher memory usage and poorer user experience overall.
Initial load times are greater. By loading so much Javascript on the first visit you are causing a longer delay.
More JavaScript complexity means more debugging in various browsers. Server-side processing means debugging only once.
Unfuddle (a bug-tracker) left a bad taste. One of my most unpleasant web experiences was being forced to use this service by an employer. On the surface it seems well suited; the JS-heavy section is private so doesn't need to worry about search engines, only repeat visitors will be using it so have time to turn off protections and shouldn't mind the initial JS library load.
But its use of JS is pointless; most content is static. "Pages" were still being fetched (via AJAX) so the delay is the same. With the benefit of AJAX it should be polling in the background to check for changes, but I wouldn't get notified when the visible page had been modified. Sections had different styles so there was an awkward re-rendering when traversing those; loading external stylesheets by Javascript is Bad Practice™. Ease of use was sacrificed for whizz-bang "look at our Web 2.0" features. Such a business-orientated application should concentrate on speed of retrieval, but it ended up slower.
Eventually I had to refuse to use it as it was disrupting the team's work flow. This is not good for client-vendor relationships.
Dynamic pages are harder to save for offline use. Some mobile users like to download in advance and turn off their connection to save power and data usage.
Dynamic pages are harder for screen readers to parse. While the number of blind users is probably smaller than the number with NoScript or a mobile connection, it is inexcusable to ignore accessibility - and in some countries even illegal, see the "Disability Discrimination Act" (1999) and "Equality Act" (2010).
As mentioned in other answers the "Progressive Enhancement", née "Unobtrusive Javascript", is the better approach. When I am required to make a JS-only site (remember, I don't object to it on principle and there are times when it is valid) I look forward to implementing the aforementioned AJAX crawling and hope it becomes more standardised in future.