I have a dataset which is best represented by a graph. It consists of nodes of 6 or 7 different "types" with directed edges (dependencies on one another, guaranteed not to have cyclic dependencies). The dataset is essentially a template of a layered configuration, and the user needs to be able to select bits and pieces of the configuration from different layers which are desired, and have the dependent bits be brought in automatically.
The general UI need is for a user to select or un-select items from multi-select boxes (one such box for each node type), and have "depended-on" items in the other boxes become selected or unselected as needed. I need to be able to pull down the dataset from the server, let the user select the desired bits (with the dependency processing being done in javascript on the client side for responsiveness), and then submit the result back when they are finished.
The dataset is large and complex enough that actually showing it as a graph would be overwhelming and confusing to the user. Only basic graph traversal operations are needed, since all that is required is to cascade selections out the dependencies. (For example, a user un-selecting a node would result in that nodes dependencies becoming unselected if there were no other selected node which still depended on them. A user selecting a node would result in all of that node's dependencies becoming selected.) A simple depth or breadth first search following directed edges from the start node will suffice to visit all affected nodes. If I can follow edges either direction, bonus. (If not I can easily generate an edge-reversed graph and use that when needed.)
I have dug around on here and found references to a number of javascript graph visualization libraries, but most of these discussions seem to interpret "graph" as "chart" and I have no charting needs here. My digging has led me to this list: Raphael, protovis, flare, D3, jsVis, Dracula, and prefuse. From this list it looks like jsVis or Dracula might have the underlying graph constructs I need if I just ignore the visualization side, but it isn't clear to me from the documentation if that is the case. I have to rule out a few others because I cannot bring in any flash dependencies. Unfortunately I don't have time to prototype things with this many libraries. (I will be digging into jsVis and dracula more though, barring some handy input here.)
If anyone has experience with something from that list and believes that the graph portion of it can be used independently of the visualization portion, that will certainly meet my needs. If there is some other library I could use that meets my needs, that would be great too. One final requirement regarding licensing: the library needs to be "free" in a non-copyleft way - So ideally Apache v2.0, BSD, MIT, or something like that.
I haven't used it, but you might want to check out data.js. It's an MIT-licensed library with a range of data-structure utilities. In particular, it includes Data.Node and Data.Graph:
A Data.Graph can be used for representing arbitrary complex object graphs. Relations between objects are expressed through links that point to referred objects. Data.Graphs can be traversed in various ways.
Related
My institution is dealing with several fairly large networks:
around 1000 nodes and 2000 edges (so smaller than this one),
fixed node positions,
potentially dozens of attributes per element,
and a minified network file size of 1 MB and up.
We visualize the networks in custom-built HTML pages (one network per page) with cytoscape.js. The interface allows the user to select different pre-defined coloring options for the nodes (the main functionality) - colors are calculated from one of a set of numeric node attributes. Accessory functions include node qtips, finding a node and centering on it (via a select2 dropdown), and highlighting nodes based on their assignment to pre-defined groups (via another select2 drowpdown).
We face the problem that the page (and so necessarily the whole browser) becomes unresponsive for at least 5 seconds whenever a color change is initiated, and we are looking for strategies to remedy that. Here are some of the things that we have tried or are contemplating:
Transplant node attributes to separate files that are fetched on demand via XHRs (done, but performance impact unclear).
Offload cytoscape.js to a Web worker (produced errors - probably due to worker DOM restrictions - or did not improve performance).
Cache color hue calculation results with lodash's memoize() or similar (pending).
Export each colorized network as an image and put some fixed-position HTML elements (or a canvas?) on top of the image stack for node qtips and such. So we would bascially replace cytoscape.js with static images and custom JavaScript.
I appreciate any suggestions on alternative or complementary strategies for improving performance, as well as comments on our attempts so far. Thanks!
Changing the style for 1000 nodes and 2000 edges takes ~150ms the machine I'm using at this moment. It follows that the issue is very likely to be in your own code.
You haven't posted an example or a sample of what you've currently written, so it's not possible for me to say what the performance issue is in your code.
You've hinted that you're using widgets like <select>. I suspect you're reading from the DOM (e.g. widget state) for each element. DOM operations (even reads) are slow.
Whatever your performance issue is, you've decided to use custom functions for styling. The docs note this several times, but I'll reiterate here: Custom functions can be expensive. Using custom functions is like flying a plane without autopilot. You can do it, but now you have to pay attention to all the little details yourself. If you stick to the in-built stylesheets features, style application is handled for you quickly and automatically (like autopilot).
If you want to continue to use expensive custom functions, you'll have to use caching. Lodash makes this easy. Even if you forgo custom functions for setting .data() with mappers, you still may find caching with Lodash useful for your calculations.
You may find these materials useful:
Chrome JS profiling
Firefox performance profiling
Cytoscape.js performance hints
I'm hooked on D3 as a way to visualize the relationships among a zillion data entities. (Okay, maybe among a couple thousand.)
Now I want to be able to see textual information in parallel with the graphical entities -- i.e. a graphical view in one pane and a textual view in another -- so I've recently been playing with Johnny Stromberg's List.js package (http://www.listjs.com/). It's quite responsive and does many of the things I want: sorting, filtering, etc.
But it's not D3 ;).
Ideally, I'd want something that maintains the textual list using the same binding mechanism as the graphical view in order to guarantee that the list and the display accurately track one another (i.e. using the .enter() and .exit() methods...). And it should be easy to select a line in the list and have the selection reflected in the graphical view, and vice versa. Filtering items in the list should cause the same filtering in the view. Etc.
Before I dash off and write my own extensions to D3, has anyone done something like this? Or have pointers to starting points?
I am trying to build a network graph (like a network for brain) to display millions of nodes. I would like to know to what extent I can push the d3 js to in terms of adding more network nodes on one graph?
Like for example, http://linkedjazz.org/network/ and http://fatiherikli.github.io/programming-language-network/#foundation:Cappuccino
I am not that familiar with d3.js (though I am a JS dev), I just want to know if d3.js is the right tool to build a massive network visualization (one million nodes +) before I start looking at some other tools.
My requirements are simply: build a interactive web based network visualization that can scale
Doing a little searching myself, I found the following D3 Performance Test.
Be Careful, I locked up a few of my browser tabs trying to push this to the limit.
Some further searching led me to a possible solution where you can pre-render the d3.js charts server side, but I'm not sure this will help depending on your level of interaction desired.
That can be found here.
"Scaling" is not really an abstract question, it's all about how much you want to do and what kind of hardware you have available. You've defined one variable: "millions of nodes". So, the next question is what kind of hardware will this run on? If the answer is "anything that hits my website", the answer is "no, it will not scale". Not with d3 and probably not with anything. Low cost smartphones will not handle millions of nodes. If the answer is "high end workstations" the answer is "maybe".
The only way to know for sure is to take the lowest-end hardware profile you plan to support and test it. Can you guarantee users have access to a 64GB 16 core workstation? An 8GB 2 core laptop? Whatever it is, load up a page with whatever the maximum number of nodes is and sketch in something to simulate the demands of the type of interaction you want and see if it works.
How much d3 scales is very dependent on how you go about using it.
If you use d3 to render lots of svg elements, browsers will start to have performance issues in the upper thousands of elements. You can render up to about 100k elements before the browser crashes, but at that point user interaction is basically useless.
It is possible, however, to render lots and lots of lines or circles with a canvas. In canvas, everything is rendered in a single image file. Rather than creating a new element for each node or line, you draw a line in the image file for it. The downside of this is that animation is a bit more difficult, since you can't move elements in a canvas, only draw on top of a canvas or redraw the whole thing. This isn't impossible, but would be computationally expensive with a million nodes.
Since canvas doesn't have nodes, if you want to use the enter/exit/update paradigm with it, you have to put placeholder elements in the DOM. Here's a good example of how to do that: DOM-to-canvas with D3.
Since the memory costs of canvas don't scale with the number of nodes, it makes for a very scalable solution for large visualizations, but workarounds are required to get it to be interactive.
I am trying to draw directed acyclic graph using d3.js. While searching for the layout, I came across Dagre but it seems to be of less use as I do not want to use DOT based code anywhere. If anyone knows about pure Javascript solution for this or plugin/custom layout for DAG, please let me know. Thanks in advance.
Dagre author here. Dagre doesn't include any of the graphviz code - it is pure JavaScript. It is based on similar layout techniques though; both are based on techniques from the Sugiyama paper.
You can find some examples of dagre here:
http://cpettitt.github.io/project/dagre-d3/latest/demo/interactive-demo.html
http://cpettitt.github.io/project/dagre-d3/latest/demo/sentence-tokenization.html
http://cpettitt.github.io/project/dagre-d3/latest/demo/tcp-state-diagram.html
The minified source can be found here: http://cpettitt.github.io/project/dagre-d3/latest/dagre-d3.min.js. It clocks in at about 44K.
Rendering directed acyclic graphs (and actually highlighting the directedness property) is a domain of the Sugiyama layout algorithms.
They basically assign layers (through a topological sorting) to the nodes and then calculate a sequencing for the nodes in the layers. Using a simple heuristic to reverse cycles first, this works well for cyclic graphs as well. Graphviz DOT has an implementation of this layout called dot, which is also the name of the file format it uses, so there sometimes is a bit confusion when DOT is mentioned.
Of course there are other implementations of the algorithm, even a cross-compiled Javascript version of dot is available. The probably most feature-complete solution available for Javascript is the commercial implementation of the algorithm in the yFiles library. So if this is in a commercial scenario, you might want to take a look at the corresponding live demo. Note that although yFiles comes with its own rendering and editor implementation, you could still plug the code into d3.js, since the layout algorithms can be used as standalone implementations to "just" calculate the coordinates of the nodes, the edge connection points, the bends, and the labels. This specific implementation supports a great number of additional constraints, like "Port Constraints" (to restrict the direction of the outgoing and incoming edges as well as their exact locations at the nodes), hierarchically grouped nodes (where each node can have a parent node and the parent node "contains" all of its child nodes), layer and sequence constraints, edge labeling constraints, different edge routing styles, bus-routing, and more.
Disclosure: I work for the company that creates said library, however on SO I do not represent my employer.
Here is my requirement:
I need to create a visualization of links between different representations of a person. The image below I think indicates that fairly clearly.
Additionally, those rectangles would also contain some data about that representation of a person (such as demographics and the place). I also need to be able to handle events when clicking on the boxes or the links between them, as a sort of management tool (so, for example, double clicking a link to delete it, or something along those lines). Just as importantly, since the number of people and links will varies, I need it to be displayed by spacing out the people in a roughly equidistant fashion like the image shows.
What would be a javascript library that could accomplish this? I have done some research and have yet not found something that can cleanly do this but I'm hardly an expert in those libraries.
Here are the ones I've looked at:
Arbor js: Can dynamically create the spacing and links of the graph but I'm responsible for rendering all the visuals and there's really no hooks for things like clicking the links.
jsPlumb: Easily create connections between elements and draws them nicely enough but doesn't seem to address any layout issues. Since I don't know how many people will be on the screen, I have to be able to space them out equidistant and that doesn't seem to be a concern of jsPlumb.
D3.js: This creates a good visualization with the spacing I need but I don't see how I can show the data inside each node or do things like like mouse events on the links or box.
I'm feeling a bit lost so I'm hoping someone could point me to something that could help me or maybe point me to an example from one of these libraries that shows me that what I want is possible.
I ended up using Arbor with Raphael as my rendering library and it's worked out very well.
Take a look at Dracula Graph Library. It's a simple library that seems to do both layout as well as rendering graphs (using Raphael under the hood). It's a bit underdeveloped however.