how much can d3 js scale

I am trying to build a network graph (like a network of the brain) that displays millions of nodes. I would like to know how far I can push d3.js in terms of the number of nodes on a single graph.
For example, see http://linkedjazz.org/network/ and http://fatiherikli.github.io/programming-language-network/#foundation:Cappuccino
I am not that familiar with d3.js (though I am a JS dev). I just want to know whether d3.js is the right tool to build a massive network visualization (one million nodes or more) before I start looking at other tools.
My requirements are simple: build an interactive, web-based network visualization that can scale.

Doing a little searching myself, I found the following D3 Performance Test.
Be careful: I locked up a few of my browser tabs trying to push this to the limit.
Some further searching led me to a possible solution where you can pre-render the d3.js charts server side, but I'm not sure this will help depending on your level of interaction desired.
That can be found here.

"Scaling" is not really an abstract question, it's all about how much you want to do and what kind of hardware you have available. You've defined one variable: "millions of nodes". So, the next question is what kind of hardware will this run on? If the answer is "anything that hits my website", the answer is "no, it will not scale". Not with d3 and probably not with anything. Low cost smartphones will not handle millions of nodes. If the answer is "high end workstations" the answer is "maybe".
The only way to know for sure is to take the lowest-end hardware profile you plan to support and test it. Can you guarantee users have access to a 64GB 16 core workstation? An 8GB 2 core laptop? Whatever it is, load up a page with whatever the maximum number of nodes is and sketch in something to simulate the demands of the type of interaction you want and see if it works.
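One way to run that kind of test is to generate a synthetic graph at the target size and time whatever rendering code you are evaluating. This is only a rough sketch: `makeSyntheticGraph` and `timeRender` are hypothetical helper names, and random positions say nothing about how a real layout algorithm would behave.

```javascript
// Hypothetical helper: builds a random graph of a given size for load testing.
// Links refer to nodes by index.
function makeSyntheticGraph(nodeCount, linkCount, width = 1000, height = 1000) {
  const nodes = Array.from({ length: nodeCount }, (_, id) => ({
    id,
    x: Math.random() * width,
    y: Math.random() * height,
  }));
  const links = Array.from({ length: linkCount }, () => ({
    source: Math.floor(Math.random() * nodeCount),
    target: Math.floor(Math.random() * nodeCount),
  }));
  return { nodes, links };
}

// Hypothetical helper: times an arbitrary render callback, in milliseconds.
function timeRender(graph, render) {
  const start = Date.now();
  render(graph);
  return Date.now() - start;
}
```

Run it on the lowest-end device you plan to support, increasing the node count until interaction stops feeling responsive.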

How much d3 scales is very dependent on how you go about using it.
If you use d3 to render lots of svg elements, browsers will start to have performance issues in the upper thousands of elements. You can render up to about 100k elements before the browser crashes, but at that point user interaction is basically useless.
It is possible, however, to render lots and lots of lines or circles with a canvas. With canvas, everything is drawn into a single bitmap. Rather than creating a new element for each node or line, you draw it into that bitmap. The downside of this is that animation is a bit more difficult, since you can't move elements in a canvas, only draw on top of it or redraw the whole thing. This isn't impossible, but it would be computationally expensive with a million nodes.
Since canvas doesn't have nodes, if you want to use the enter/exit/update paradigm with it, you have to put placeholder elements in the DOM. Here's a good example of how to do that: DOM-to-canvas with D3.
Since the memory costs of canvas don't scale with the number of nodes, it makes for a very scalable solution for large visualizations, but workarounds are required to get it to be interactive.
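As a minimal sketch of the canvas approach, the function below draws every link and node in one pass. It assumes `ctx` is a standard 2D canvas context and that the graph has the hypothetical shape `{ nodes: [{x, y}], links: [{source, target}] }`, with `source` and `target` as indices into `nodes`. Layout and interaction (for example hit testing) are not covered.

```javascript
// Draws all links and nodes onto a 2D canvas context in one pass.
// The graph shape used here is an assumption for illustration.
function drawGraph(ctx, graph, width, height, radius = 1.5) {
  ctx.clearRect(0, 0, width, height);

  // Batch every link into a single path so stroke() is called once.
  ctx.beginPath();
  for (const { source, target } of graph.links) {
    const a = graph.nodes[source];
    const b = graph.nodes[target];
    ctx.moveTo(a.x, a.y);
    ctx.lineTo(b.x, b.y);
  }
  ctx.stroke();

  // Same for nodes: one path, one fill().
  ctx.beginPath();
  for (const n of graph.nodes) {
    ctx.moveTo(n.x + radius, n.y); // start a new subpath for each circle
    ctx.arc(n.x, n.y, radius, 0, 2 * Math.PI);
  }
  ctx.fill();
}
```

In a browser this would be called with something like `canvas.getContext("2d")`. Every change to the graph means calling the function again to redraw the whole bitmap, which is the animation cost described above.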

Related

Hiding crucial data from an SVG

I have a SVG generated map for the game I am developing. I have no problems with the game being open-source and it uses open web technologies such as HTML and SVG. No problems there.
But at the same time I want the players not to be able to see or reverse engineer a map of the whole world (to retain true exploration). For now I generate map using a seed that is secret and not version controlled. So even though the algorithm is known curious players can use open-sourced code to generate "game-like worlds" but not that exact one. This solves the "global" problem.
But since SVG is rendered on a page as a single Voronoi diagram all the data (I don't mind the coordinates of points) would be extractable. Data like resources, land types, biomes, climate etc. could be fetched from SVG to gain an upper hand in finding good locations for settlements.
Any idea how to prevent that? Players have limited vision so I thought about either:
not rendering the whole Voronoi diagram at all (just the visible part), but that could be potentially tricky to do (maybe, haven't looked into it yet),
inserting the resource/land tile data into SVG graph only to visible locations
I can see the benefits of both approaches and if done correctly it could even boost the performance (not rendering the whole thing/rendering with less data) and lead to bigger worlds without impacting performance.
Any other ideas/programming/architectural approaches to help with the issue?
(I am using Vue.js, d3.js, svg-pan-zoom and Laravel backend just in case it helps.)

The ideas you gave are good, but implementing them will take a lot of work and time.
I have a suggestion. It will work for most users. Some users may "hack" it, but I believe it will work about 95% of the time.
You can create a very big rectangle, from the top-left point (0,0) to the bottom-right point. The rectangle is white and sits on top of all the other shapes.
This way, if someone downloads the SVG, they will see nothing but a big white rectangle.
In your game's HTML, you can add a CSS rule to hide this rectangle.
If you follow this method, most users (those who don't have image editing software) will not be able to see the map.
Users who know how to inspect elements in HTML may still see the map. But I believe that most of those who see a white box will not suspect that there is something behind it.
I think this is a simple, temporary approach that you can use before moving on to more defensive ones.
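For comparison, the more defensive idea from the question (only attaching resource and land data to visible locations) could be sketched roughly as below. The cell shape `{id, x, y, ...}`, the `viewers` list and the function name are all assumptions for illustration. The filtering only protects anything if it runs on the server, so that hidden data never reaches the browser; it is written in JavaScript here for consistency, but the same logic applies in a Laravel backend.

```javascript
// Strips gameplay data from cells that no viewer can currently see.
// Coordinates are kept for every cell, since the question says those
// are not sensitive. All names here are hypothetical.
function redactHiddenCells(cells, viewers, visionRadius) {
  const r2 = visionRadius * visionRadius;
  const isVisible = (cell) =>
    viewers.some((v) => (cell.x - v.x) ** 2 + (cell.y - v.y) ** 2 <= r2);

  return cells.map((cell) =>
    isVisible(cell)
      ? { ...cell, visible: true }
      : { id: cell.id, x: cell.x, y: cell.y, visible: false }
  );
}
```

The client can then render the full Voronoi diagram from the coordinates while only the visible cells carry biome or resource attributes.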

D3 + Leaflet, only draw if visible (big data vis)

In order to have a reasonable performance with a lot of svg paths, svg text and svg textpath elements on a leaflet map, I wonder how D3 handles elements which are currently not on screen.
So for example when I zoom in to an area such as Washington State, 99.9% of the world is not shown - is D3 default behaviour to draw all the other elements regardless?
I am basing my project on Mike Bostocks d3 + leaflet example. There are no viewports/ viewbox attributes used - is it done somewhere else? Thanks for your input.

I think there are two parts to this question:
1. Drawing of SVG DOM elements that are off screen
As @LarsKotthoff mentions, it's probably not worth worrying about these, as the browser will probably do a better job than you of optimising them away.
2. Processing of data that will result in SVG DOM elements being drawn off screen
I think this is where you can make a difference. If you have data manipulation/processing that is expensive, then processing things that will not be displayed seems like a waste of cycles. The only way I can think of improving this situation is to determine as early as possible whether something will be off screen or not. If it is going to be off screen, then ignore it when doing any further data processing.
In these situations though, you need a way to detect when it moves into view or out of view and either process or not, as appropriate. This may result in some additional overhead that makes it not worth doing.
Your individual situation will determine how effective this can be for you, but if you have a specific example, then users here may be able to assist with re-factoring to help performance.
There are also other things you can do, like re-thinking the visualisation to require fewer elements in the first place. In my experience, performance has not really been an issue until there is so much information on screen that the value of the visualisation has been diminished. Removing the extraneous information has resulted in improved performance and improved comprehension of the visualisation. Of course, this is my particular experience and there are definitely times when that won't apply.
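The "decide early whether something is off screen" step can be sketched as a plain filter that runs before any expensive processing. The feature shape `{lng, lat}` and the bounds shape `{west, south, east, north}` are assumptions for illustration, and this simple version does not handle a viewport that crosses the antimeridian.

```javascript
// Keeps only the features that fall inside the viewport, optionally
// padded by a margin (in degrees) so items just off screen are ready
// when the user pans. Shapes and names are hypothetical.
function cullToViewport(features, bounds, margin = 0) {
  const west = bounds.west - margin;
  const east = bounds.east + margin;
  const south = bounds.south - margin;
  const north = bounds.north + margin;
  return features.filter(
    (f) => f.lng >= west && f.lng <= east && f.lat >= south && f.lat <= north
  );
}
```

With Leaflet, the bounds object could be built from `map.getBounds()` (via `getWest()`, `getSouth()`, `getEast()` and `getNorth()`) in a `moveend` handler, with the filtered array then passed to the D3 data join. Whether the extra bookkeeping pays off depends on how expensive the downstream processing is.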

Javascript big-data visualisation

I'm about to develop a UI for medical research application. I need to make a time series line graph. Main issue is the amount of data:
5,000 points per graph, with a few graphs displayed simultaneously. I'm expecting 50,000 points to be processed altogether.
The question is which presentation library to use.
The main features I’m looking for are: Handles huge data sets, Zoom, annotations, live update.
I’m already looking into http://dygraphs.com/ and http://meteorcharts.com/
I wouldn't want any library that renders the data as DOM elements, or that uses SVG (from performance perspective)

Well, I think I'll give everyone my own answer to my question:
Dygraphs (http://dygraphs.com/) seems to be spot on. The documentation, despite a lot of apparent effort, leaves a lot to be desired. But in terms of performance, features and flexibility, it's the best I've seen, at least for my current project's needs.
Way to go, Dygraphs!

Have you checked out D3? I'm using that for a lot of graph visualization. It looks like this example renders to svg.
My stuff renders to a SVG for force graph visualizations too, but it looks like D3 might be able to use either a canvas or SVG, but I'm not positive about what all can be rendered to which. Honestly, I'd give D3 a try even with SVG, it might be fast enough. I remember reading something about someone simulating thousands of particles using D3's force graph visualizations without issues. It's SUPER easy to get your data into the right format for it to use.
Good luck!

I am developing a very similar application.
I am currently using Flot for the chart rendering. It handles annotations and zoom, take a look at their plugin library.
I recommend this downsampling plugin, which will speed up graph rendering. Rendering 5,000 points on your graph is useless: you have fewer pixels across your screen than that! The plugin only renders the points that actually have visual importance.
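To illustrate the general idea of downsampling (not the plugin's actual algorithm, which may well differ), here is a simple min/max bucketing sketch. It assumes `points` is an array of `[x, y]` pairs sorted by `x`, splits them into `bucketCount` buckets (roughly one per horizontal pixel) and keeps only the lowest and highest point of each bucket, so spikes survive.

```javascript
// Min/max decimation: returns at most 2 * bucketCount points.
// A swapped-in, simplified technique for illustration only.
function downsampleMinMax(points, bucketCount) {
  if (bucketCount <= 0 || points.length <= 2 * bucketCount) {
    return points.slice();
  }
  const bucketSize = points.length / bucketCount;
  const out = [];
  for (let b = 0; b < bucketCount; b++) {
    const start = Math.floor(b * bucketSize);
    // The last bucket always runs to the end to avoid rounding gaps.
    const end =
      b === bucketCount - 1 ? points.length : Math.floor((b + 1) * bucketSize);
    let min = points[start];
    let max = points[start];
    for (let i = start + 1; i < end; i++) {
      if (points[i][1] < min[1]) min = points[i];
      if (points[i][1] > max[1]) max = points[i];
    }
    // Emit in x order so the drawn line never doubles back.
    if (min === max) out.push(min);
    else if (min[0] <= max[0]) out.push(min, max);
    else out.push(max, min);
  }
  return out;
}
```

For a chart that is 500 pixels wide, `downsampleMinMax(series, 500)` would reduce a 5,000-point series to at most 1,000 points before handing it to the charting library.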
This only gives you the graph. You may want some kind of dashboard to present all that... I am currently looking at Grafana, which is used for a totally different purpose but makes awesome dashboards. It may be possible to "rip out" their dashboarding features (it uses Flot as well).
Another option is Highcharts, but that's not free.

Check out the Raphaël JS library.
Raphaël is a small JavaScript library that should simplify your work with vector graphics on the web. If you want to create your own specific chart or image crop and rotate widget, for example, you can achieve it simply and easily with this library.

Graphing Algorithm for many nodes

I have been trying to develop a web based application to help in the graphing of nodes and their interactions.
I have attempted to use Sigma.js with the Force Atlas extension.
For my simple tests (a few nodes) the results are quite good-looking; however, with an additional thousand nodes the result becomes quite a mess.
Is there any way to make the result more viewable (easier on the eyes, not just one big blob)? How would I go about doing this? Are there any algorithms already written that I could use?

You can try the Fruchterman-Reingold force layout (for which there is a sigma plugin). It specifically minimises the number of links that cross each other, so it is in general more suitable for large graphs (unless all the nodes have lots of connections).
In addition, the fisheye plugin may help to make more sense of the graph after it has been drawn.

sigma.layout.forceAtlas2 scales much better, however it won't do miracles if the graph has a strong density of connections.

What's the fastest way to draw to an HTML 5 canvas?

I'm investigating the possibility of producing a game using only HTML's canvas as the display media. To take an example task I need to do, I need to construct the game environment from a number of isometric tiles. Of course, working in 2D means they by necessity come in rectangular packages so there's a large overlap between tiles.
I'm old enough that the natural solution to this problem is to call BitBltMasked. Oh wait, no, an HTML canvas doesn't have something as simple and as pleasing as BitBlt. It seems that the only way to dump pixel data in to a canvas is either with drawImage() which has no useful drawing modes that ignore the alpha channel or to use ImageData objects that have the image data in an array.. to which every. access. is. bounds. checked. and. therefore. dog. slow.
OK, that's more of a rant than a question (things the W3C like tend to provoke that from me), but what I really want to know is how to draw fast to a canvas? I'm finding it very difficult to ditch the feeling that doing 100s of drawImages() a second where every draw respects the alpha channel is inherently sinful and likely to make my application perform like arse in many browsers. On the other hand, the only way to implement BitBlt proper relies heavily on a browser using a hotspot-like execution technique to make it run fast.
Is there any way to draw fast across every possible implementation, or do I just have to forget about performance?

This is a really interesting problem, and there's a few interesting things you can do to solve it.
First, you should know that drawImage can accept a Canvas, not just an image. The "sub-Canvas"es don't even need to be in the DOM. This means that you can do some compositing on one canvas, then draw it to another. This opens a whole world of optimization opportunities, especially in the context of isometric tiles.
Let's say you have an area that's 50 tiles long by 50 tiles wide (I'll say meters for the sake of my own sanity). You might divide the area into 10x10m chunks. Each chunk is represented by its own Canvas. To draw the full scene, you'd simply draw each of the chunks' Canvas objects to the main canvas that's shown to the user. If only four chunks are visible (a 20x20m area), you would only perform four drawImage operations.
Of course, each of those individual chunks will need to render its own Canvas. On game ticks where nothing happens in the chunk, you simply don't do anything: the Canvas will remain unchanged and will be drawn as you'd expect. When something does change, you can do one of a few things depending on your game:
If your tiles extend into the third dimension (i.e. you have a Z-axis), you can draw each "layer" of the chunk into its own Canvas and only update the layers that need to be updated. For example, if each chunk contains ten layers of depth, you'd have ten Canvas objects. If something on layer 6 was updated, you would only need to re-paint layer 6's Canvas (probably one drawImage per square meter, which would be 100), then perform one drawImage operation per layer in the chunk (ten) to re-draw the chunk's Canvas. Decreasing or increasing the chunk size may increase or decrease performance depending on the number of updates you make to the environment in your game. Further optimizations can be made to eliminate drawImage calls for obscured tiles and the like.
If you don't have a third dimension, you can simply perform one drawImage per square meter of a chunk. If two chunks are updated, that's only 200 drawImage calls per tick (plus one call per chunk visible on the screen). If your game involves very few updates, decreasing the chunk size will decrease the number of calls even further.
You can perform updates to the chunks in their own game loop. If you're using requestAnimationFrame (as you should be), you only need to paint the chunk Canvas objects to the screen. Independently, you can perform game logic in a setTimeout loop or the like. Then, each chunk could be updated in its own tick between frames without affecting performance. This can also be done in a web worker using getImageData and putImageData to send the rendered chunk back to the main thread whenever it needs to be updated, though making this work seamlessly will take a good deal of effort.
The other option that you have is to use a library like pixi.js to render the scene using WebGL. Even for 2D, it will increase performance by decreasing the amount of work that the CPU needs to do and shifting that over to the GPU. I'd highly recommend checking it out.
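The chunking approach described above can be sketched as a small cache that repaints a chunk's canvas only when it has been marked dirty, then composites the visible chunks with one drawImage each. This is a simplified illustration: the class and method names are hypothetical, chunks are laid out on a plain grid rather than with isometric offsets, and the canvas factory is passed in (for example `() => document.createElement("canvas")`) so the same logic does not depend on the DOM.

```javascript
// Caches one off-screen canvas per chunk and repaints only dirty chunks.
// All names are hypothetical; placement assumes a simple square grid.
class ChunkCache {
  constructor(createCanvas, chunkPx) {
    this.createCanvas = createCanvas;
    this.chunkPx = chunkPx;
    this.chunks = new Map(); // "cx,cy" -> { canvas, dirty }
  }

  getChunk(cx, cy) {
    const key = `${cx},${cy}`;
    let chunk = this.chunks.get(key);
    if (!chunk) {
      const canvas = this.createCanvas();
      canvas.width = this.chunkPx;
      canvas.height = this.chunkPx;
      chunk = { canvas, dirty: true }; // new chunks need a first paint
      this.chunks.set(key, chunk);
    }
    return chunk;
  }

  markDirty(cx, cy) {
    this.getChunk(cx, cy).dirty = true;
  }

  // visible: array of [cx, cy]; paintChunk(ctx, cx, cy) draws the tiles.
  // Returns how many chunks had to be repainted this frame.
  draw(mainCtx, visible, paintChunk) {
    let repainted = 0;
    for (const [cx, cy] of visible) {
      const chunk = this.getChunk(cx, cy);
      if (chunk.dirty) {
        paintChunk(chunk.canvas.getContext("2d"), cx, cy);
        chunk.dirty = false;
        repainted++;
      }
      mainCtx.drawImage(chunk.canvas, cx * this.chunkPx, cy * this.chunkPx);
    }
    return repainted;
  }
}
```

On a frame where nothing changed, `draw` performs one drawImage per visible chunk and no tile painting at all, which is where the savings come from.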

I know that GameJS has blit operations, and I certainly assume any other html5 game libraries do as well (gameQuery, LimeJS, etc etc). I don't know if these packages have addressed the specific array-bounds-checking concern that you had, but in practice their samples seem to work plenty fast on all platforms.
You should not make assumptions about what speedups make sense. For example, the GameJS developer reports that he was going to implement dirty rectangle tracking, but it turned out that modern browsers do this automatically.
For this reason and others, I suggest getting something working before thinking about speed. Also, make use of drawing libraries, as the authors have presumably spent some time optimizing performance.
I have no personal knowledge about this, but you can look into the appMobi "direct canvas" HTML element, which is allegedly a much faster version of the normal canvas. I'm confused about whether this works in all browsers, just WebKit browsers, or just appMobi's own special browser.
Again, you should not make assumptions about what speedups make sense without a very deep knowledge of web browser internal processes. That webpage about "direct canvas" mentions a bunch of things that slow down canvas-drawing: "Reflowing text, mapping hot spots, creating indexes for reference links, on and on." Alpha-blending and array-bounds-checking are not mentioned as prominent causes of slowness!

Unfortunately, there's no way around the alpha composition overhead. Clipping may be one solution, but I doubt there would be much, if any, performance gain. Not to mention how complicated such a route would be to implement on irregular shapes.
When you have to draw the entire display, you're going to have to deal with the performance hit. Although afterwards, you have a whole screen's worth of pre-calculated alpha imagery and you can draw this image data at an offset in one drawImage call. Then, you would only have to individually draw the new tiles that are scrolled into view.
But still, the browser is having to redraw each pixel at a different location in the canvas. Which is quite expensive. It would be nice if there was a method for just scrolling pixels, but no luck there either.
One idea that comes to mind is that you could implement multiple canvases, translating each individual canvas instead of redrawing the pixels. This would allow the browser to decide how to redraw those pixels, in a more native way, at least in theory anyway. Then you could render the newly visible tiles on a new, or used/cached, canvas element. Positioning it to match up with the last screen render.
But that's just my two blits... I mean bits... duh, I mean cents :]
