D3.js scattergraph with large (>500,000) points? Clustering? - javascript

I'm looking at plotting a scatterplot with a large number of points (500,000 and upwards).
Currently, we're doing this in Python with Matplotlib. It plots the points, and it provides controls to pan and zoom. I don't believe it provides any clustering or points, it just plots them all - doesn't make much sense at the zoomed out view, I suppose, but you can zoom in and they're all there.
I was looking at doing the chart in JavaScript, to make it a bit easier to distribute. I was looking at D3.js, to see if something similar is feasible there. I did find this example of a basic scatterplot:
http://bl.ocks.org/mbostock/3887118
Firstly, would you be able to plot that number of points? (500,000 and upwards) I was under the impression you couldn't due to the overhead of all the DOM objects? Are there ways around this?
Secondly, is there any kind of clustering available, either a library or even just an example of this being done in D3.js?
Thirdly, if anybody knows any good examples of pan/zoom functionality and clustering, or even just a packaged JS library that handles it, that would be awesome.
Fourth, it would be also nice to have click handlers for each point - and to display some text either in a overlay, or even just in a separate window. Any thoughts on this?

Can you draw half a million points with D3? Sure, but not with SVG. You'll have to use canvas (here's a simple example with 10,000 points that includes brush-based selection: http://bl.ocks.org/emeeks/306e64e0d687a4374bcd) and that means that you no longer have individual elements to assign click handlers to. You will not be able to render half a million points with SVG, because all those DOM elements will choke your interface, as you mentioned.
D3 does include quadtree support that can be leveraged for clustering. It's in use in the above example to speed up search but you could use it to nest elements in proximity at certain scales.
Ultimately, your choices are:
1) Some other library/custom implementation that renders in canvas and polls the mouse position to give you the data element rendered at that point.
2) A sophisticated custom D3 approach that nests elements in proximity and only renders SVG elements appropriate at the zoom level and canvas position (pan) you're at.

Yes, D3.js can be made to work with million scale data with two things:
pre-rendering on the server side. For more see here: https://mango-is.com/blog/engineering/pre-render-d3-js-charts-at-server-side/
By aggregating (or clustering) part of the data so that user can interact and expand the graph if need be. For this use collapsible nodes if you can (http://bl.ocks.org/mbostock/1062288).
Also avoid using force layout. It takes time to settle and converge to a stable positioning.
For clustering libraries, I would pick one up off the shelf. I would choose the scikits library from python, there are many in JavaScript but they are not very robust as they mostly cover k-means or hierarchical clustering. I would precalculate the coordinates using scikits by clustering and then render it using D3.
D3 handles Pan and zoom. Again click handlers and text display are available in D3. (http://bl.ocks.org/robschmuecker/7880033)

Related

Easy way to enlarge visualization in D3 (without zooming in browser)

I'd have a few different visualizations (containing primarily nodes, links and paths), which I would like to enlarge by a factor 3 without having to change the width and length of every single class. Is there a simple way in D3 to make the whole visualization bigger? I'm currently zooming in using the browser, but that is not a good solution as it requires other users to do so as well.

Comparison of two network graphs

I have two network graphs but placing both of them next to each other is the easiest way to compare if the graphs are small. But as the graph grows, it is making hard for the user to compare the views. I wanted to know the best way to merge two graphs and show the comparison.
In the above picture it can be seen that no of nodes are same but the way they are linked is different.
I would like to know how to present the compared data.
Any ideas about different views to present such comparison using d3.js.
I would suggest not trying to apply force layout or similar method for drawing graphs (that would draw the graph in a fashion similar to the on in the picture in your question). Instead, I wold like to suggest using circular layout for both graphs (similar to chord diagram):
This visual example is made for other purposes, but similar principles could be applied to your problem:
Layout all vertexes on a circle, in equidistant style (if there are some vertices belonging only to one of two graphs, they can be grouped and marked different color)
If there is a link between two vertices in both graphs, connect them in one color (lets say green)
If there is a link between two vertices in one graph only, connect them in appropriate color, dependant on a graph (lets say red and purple)
This method scales well with number of vertices.
Hope this helps.
Following methods for network comparison could be useful in the current scenario:
NetConfer, a web application
Nature Scientific Reports, an article guiding comparisons
CompNet, a GUI tool

Draw Zoomable Time Line With Scroll Bars On Html5 Canvas

I need a Time Line For My Web Project.
Something like this - I read the code of this Time Line but did not understand it because it is not documented enough.
My problem is the math behind all of this (not the interaction with the canvas).
I have read several articles about the math of the scroll bars, but none of them talk about zoom.
Some
articles suggest to hold canvas element with very large width value - and to display just the
View Port.
I don't think that's the right way to do it - I want to draw just the correct viewport.
In my project, I have array of n points.
Each point holds time value represented in seconds, but not all of the points are within the Viewp Port.
Considering the current zoom level, how do I calculate:
What points should be drawn and where to draw them?
What is the size and position of the thumb?
Any articles / tutorials about such a thing?
You might be able to use something like Flot which handles the placement of points, as well as zooming and panning. Here's an example of that.
There are a bunch of other drawing libraries, here a good list.
You always have Raphealjs.com , one of the most used library to play with SVG, with this you can write your own js to generate the timeline.

Render large matrix (1000*1000) in Javascript

I need to display a large matrix within our web-based application. The matrix dimensions are approx. 1000*1000 and each cell is either filled or not.
Basically, it should look like this (much larger and without the colors):
http://mbostock.github.com/protovis/ex/matrix.html
I need basic interaction, such as zooming and clicking on a cell. The matrix is likely to to be a sparse matrix.
I tried Protovis but rendering takes forever if the matrix is larger than 80*80.
What Javascript library might be suitable for this task?
I would use an HTML5 Canvas for fast drawing. (This super-simple demo renders in a few seconds on my computer.) If you want to zoom in, you can see this answer.
In order to display a million items to a user, each element would probably have to be the size of a single pixel.
I'd just use a canvas.
You could try the JavaScript library clustergrammer.js (see https://github.com/cornhundred/clustergrammer.js). It uses D3.js to make interactive (zoomable, reorderable, filterable, etc) visualizations. It can handle on the order of 100,000 data points, but if you matrix is sparse enough then you can render large matrices.
Here's an example of clustergrammer.js being used to visualize a 6000x230 matrix http://amp.pharm.mssm.edu/clustergrammer/viz/568affd5b6541b84f3a68234

Raphael Pie Chart

I am trying to create a pie chart and am customizing the example here: http://raphaeljs.com/pie.html
I need to draw lines from the labels to the slices but IE is giving me trouble and I'm not sure what to do about overlapping lines and labels.
Has anyone done this before?
this maybe useful to you:
Overlapping issues can usually be solved by using:
a) ".toFront()" on the raphael element, that should appear in foreground
b) ".getBBox()" at your label and use its parameters to figure out the starting point.
those functions can be looked up in the raphaeljs reference.
.getBBox() should be a good way to start, when you try to connect elements. you can easily figure out its meassurements and use those values (x,y,width,height) to calculate the entry point of your path
that makes it easier to avoid any overlapping at all. but keep in mind the end SVG elements are directly placed within the dom and therefor are meant to work in layers.
overlapping / partially hiding elements often offer you a great animation potential and are not really a bad thing to work with.

Categories