Class diagram - optimal node positioning - javascript

I am trying to build a class diagram model viewer using d3.js and d3-dag.
The most crucial part of this viewer is that it should position nodes optimally, so that links do not cross (whenever possible) and it is clear what is connected to what.
We know:
Width of each node
Height of each node
Links starting coordinate
Links ending coordinate
Links all corner coordinates
We want:
To see where connections end (can be achieved by manually moving nodes).
To minimize link crossings (if possible).
What I need is mostly theoretical.
Is there any known algorithm that solves the problem above? (The language does not matter; I just need a theoretical reference.)
Putting examples below
What's the current situation
What can I achieve by myself
What would be perfect
[Images omitted: Examples 1 to 6. Examples 1, 2, 4 and 5 each show Current, Achievable and Perfect layouts; Example 3 shows Current and a combined Achievable And Perfect layout; Example 6 shows Current and Perfect.]
Update
Traditional (node-to-node link) crossing is already minimized in this case (thanks to d3-dag). The issue is that we don't only have node-to-node relationships; we also have node-row-to-row relationships, and in this case d3-dag fails.

I used d3-dag to topologically sort the nodes and then repositioned them vertically: top if odd, bottom if even.
Although it's not the algorithm I was looking for, it dramatically improved the look of the components and made them much more readable.
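The update above can be sketched roughly as follows. This is a minimal illustration in plain JavaScript, not d3-dag's API: `topologicalOrder` and `staggeredPositions` are hypothetical helper names, the sort is Kahn's algorithm rather than whatever d3-dag does internally, and the node ids, `columnWidth` and `rowOffset` are assumptions made for the example.

```javascript
// Kahn's algorithm: returns node ids in a topological order.
// `nodes` is an array of ids, `links` an array of [source, target] pairs.
function topologicalOrder(nodes, links) {
  const indegree = new Map(nodes.map((id) => [id, 0]));
  const children = new Map(nodes.map((id) => [id, []]));
  for (const [source, target] of links) {
    children.get(source).push(target);
    indegree.set(target, indegree.get(target) + 1);
  }
  const queue = nodes.filter((id) => indegree.get(id) === 0);
  const order = [];
  while (queue.length > 0) {
    const id = queue.shift();
    order.push(id);
    for (const child of children.get(id)) {
      indegree.set(child, indegree.get(child) - 1);
      if (indegree.get(child) === 0) queue.push(child);
    }
  }
  if (order.length !== nodes.length) throw new Error("graph has a cycle");
  return order;
}

// Place nodes left to right in topological order, alternating rows:
// the 1st, 3rd, 5th, ... node goes to the top row, the others to the bottom.
function staggeredPositions(order, columnWidth, rowOffset) {
  return order.map((id, index) => ({
    id,
    x: index * columnWidth,
    // index is 0-based, so (index + 1) is the 1-based position in the order
    y: (index + 1) % 2 === 1 ? -rowOffset : rowOffset,
  }));
}
```

Negative `y` is treated as "top" here because SVG's y axis points downward; flip the sign if your coordinate system differs.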
[Images omitted: three Old/New pairs comparing the layout before and after this change.]

I'm not sure this is still a problem, but if so, could you elaborate on what "row -> row" relationships are? Is that a strict definition of which nodes belong on which row? More recent versions of d3-dag support a rank criterion for nodes, which specifies relative ordering and equality constraints for a node's row, but that may not be what you're looking for.
If you want to reposition just the edges, there may be a way to pull out d3-dag's internals to do only the crossing minimization. Essentially, d3-dag does what most layered layouts do: it breaks each edge up into sections and changes the order of each edge in each row. After the ordering is set, it's easy to assign coordinates. If you can phrase your problem like that, you should be able to use code like this to phrase it as an integer linear program and find the optimal edge layout.
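To make the "order within each row" idea concrete, here is a small sketch of counting crossings between two adjacent rows and reordering one of them. It is plain JavaScript and does not use d3-dag's internals; `countCrossings` and `barycenterOrder` are hypothetical names, and the barycenter rule is a common heuristic that I am swapping in for illustration, not the integer linear program mentioned above, so it does not guarantee an optimal ordering.

```javascript
// Count crossings between two adjacent rows.
// `upper` and `lower` are arrays of node ids in left-to-right order;
// `edges` is an array of [upperId, lowerId] pairs.
function countCrossings(upper, lower, edges) {
  const upperIndex = new Map(upper.map((id, i) => [id, i]));
  const lowerIndex = new Map(lower.map((id, i) => [id, i]));
  const spans = edges.map(([u, v]) => [upperIndex.get(u), lowerIndex.get(v)]);
  let crossings = 0;
  for (let i = 0; i < spans.length; i++) {
    for (let j = i + 1; j < spans.length; j++) {
      const [u1, v1] = spans[i];
      const [u2, v2] = spans[j];
      // Two edges cross when their endpoints appear in opposite orders.
      if ((u1 - u2) * (v1 - v2) < 0) crossings++;
    }
  }
  return crossings;
}

// Barycenter heuristic: sort the lower row by the average position
// of each node's neighbours in the upper row.
function barycenterOrder(upper, lower, edges) {
  const upperIndex = new Map(upper.map((id, i) => [id, i]));
  const score = new Map();
  lower.forEach((id, i) => {
    const neighbours = edges
      .filter(([, v]) => v === id)
      .map(([u]) => upperIndex.get(u));
    // Nodes without neighbours keep their current position as their score.
    const sum = neighbours.reduce((a, b) => a + b, 0);
    score.set(id, neighbours.length > 0 ? sum / neighbours.length : i);
  });
  return [...lower].sort((a, b) => score.get(a) - score.get(b));
}
```

A typical use is to sweep down and then up through the rows a few times, applying `barycenterOrder` to each row and keeping the ordering with the lowest `countCrossings` total.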

Related

several questions regarding cytoscape.js

I've just started using cytoscape (with cytoscape-klay layout) to render some graphs and I'm unable to find answers to some of the questions I have:
Is there a way to left-align the graph? (graph seems to be centered horizontally, but in my use case, I need it to start from the left)
Edit: my graph is neither pannable nor zoomable, and panBy and shift don't seem to work
Is there a way to get the graph rendered without specifying the container dimensions? (right now, if I don't specify the dimensions, the canvas itself has a dimension of 0x0 and since I specify a width/height for every node, I think it should be possible)
Edit: I can't use an absolute positioned container since my view has some sticky parts. Also, I want the container dimension to be exactly what the graph needs as it is not pannable. I'm currently doing the following to accommodate both 1 and 2:
const dim = cy.$('*').boundingBox(); // get real size of graph
// getElementById takes the bare id, without the '#' used in CSS selectors
const container = document.getElementById('cy');
container.setAttribute('style', `width: ${dim.w - 120}px; height: ${dim.h}px; transform: translateX(-175px)`);
Is there a way to define css in a css file or does it have to be in js?
Edit: I'm aware that I can use a js/json file - I'm wondering if I can also use a css file (I assume not, but I just want to make sure)
My nodes should be interactive (open / close), and since I work with Vue.js I would prefer to have my nodes as Vue components. This seems impossible with cytoscape itself, as it supports only a label for a node. I found the nodeHtmlLabel package, which gets me really close, but it doesn't adhere to the styles that were set in js, and reactivity seems to be a problem. Is there a better alternative?
Edit: I'm not asking about compound nodes. For a simple example of what I need, you can think of each node having a title and a description but only the title is visible and if you click on the node, it will open to show the description (node height would change)
Can I customize the edges? (Have an edge consist of a straight line connected to a bezier curved line, use my own control points, etc.) If not, how do these weights / distances work? Do they work equally well for Top-Down and Left-Right graphs? (I'm trying to have something like this)
Edit: I'm aware of the current edge types, but I'm looking for something that is a merge between taxi and bezier. Since this is not currently supported, I wanted to know if there is any way to define a custom line myself
Background - I managed to get really close to the desired design with dagre-d3 but I have a problem with ranks and it seems that this package is unmaintained :(, hence I decided to look for an alternative

Is it possible to have single stable dagre layout in cytoscape.js

cytoscape.js dagre layout works really well.
However, there is no single stable layout for a particular graph.
The layout algorithm seems to use some random number generator to calculate node positions. This leads to an annoying situation where the same graph is sometimes rendered differently on the screen.
Is there any simple way to fix that? Usually a random seed value can be set to some user-defined number. I was unable to find any suggestion on how to do that in the cytoscape.js docs.
I have seen a similar issue with Dagre as well. After digging a little into the source code, the reason might be the ranking algorithm that's used:
Link
AFAIK, the network simplex algorithm chooses a random starting point to find the optimum, and in cases where multiple optima exist, it doesn't guarantee that the same solution will be reached every time.

How can I prevent overlapping in a family tree generator?

I'm creating an interactive family tree creator, unlike simpler versions which are basic pedigree charts/trees.
The requirements for mine (based on familyecho.com) are:
multiple partners vs just a simple 2 parent to 1 child that you normally see.
multiple siblings
partners don't necessarily need to have children
there doesn't always have to be a parent "pair", there can just be a single father/mother
The problem I'm encountering is: I'm generating the offsets based on the "current" node/family member and when I go past the first generation with say, 2 parents, it overlaps.
Example of the overlap as well as partner not being drawn on the same X axis:
Here is the actual app and main js file where I'm having the issue. And here is a simplified jsfiddle I created that demonstrates the parent/offset issue though I really have to solve overlapping for this in general, in addition to making sure partners are drawn on the same x axis as other partners.
How can I go about solving this and possible future overlapping conflicts? Do I need some sort of redraw function that detects collisions and adjusts the offsets of each block upon detecting? I'm trying to make it seamless so there's a limited amount of redrawing done.
An example of calculating offset relative to the "context" or current node:
var offset = getCurrentNodeOffset();
if ( relationship == RELATIONSHIPS.PARTNER ) {
    var t = offset.top; // same level
    var l = offset.left + ( blockWidth + 25 );
} else {
    var t = offset.top - (blockHeight + 123 ); // higher
    var l = offset.left - ( blockWidth - 25 );
}
I'm going to give a complicated answer, and that's because this situation is more complicated than you seem aware of. Graph layout algorithms are an active field of research. It's easy to attempt a simpler-than-general algorithm and then have it fail in spectacular ways when you make unwarranted, and usually hidden, assumptions.
In general, genetic inheritance graphs are not planar (see Planar Graphs on Wikipedia). Although uncommon, it certainly happens that not all ancestral positions are filled by unique people. This happens, for example, when second cousins have children.
Another non-planar situation can arise with children from non-monogamous parents. The very simplest example is two men and two women, each pairing with children (thus at least four). You can't lay out even the four parent pairs in one rank without curved lines.
These are only examples. I'm sure you'll discover more as you work on your algorithm. The real lesson here is to explicitly model the class of relationship your algorithm is able to lay out and to have verification code in the algorithm to detect when the data doesn't meet these requirements.
The question you are actually asking, though, is far more basic. You're having basic difficulties because you need to be using a depth-first traversal of the graph. This is the (easiest) full version of what it means to lay out "from the top down" (in one of the comments). This is only one of many algorithms for tree traversal.
You're laying out a directed graph with (at least) implicit notion of rank. The subject is rank 0; parents are rank 1; grandparents at rank 2. (Apropos the warnings above, ranking is not always unique.) Most of the area of such graphs is in the ancestry. If you don't lay out the leaf nodes first, you don't have any hope of succeeding. The idea is that you lay out nodes with the highest rank first, progressively incorporating lower-ranked nodes. Depth-first traversal is the most common way of doing this.
I would treat this as a graph-rewriting algorithm. The basic data structure is a hybrid of rendered subgraphs and the underlying ancestry graph. A rendered subgraph is (1) a subtree of the whole graph with (1a) a set of progeny, all of whose ancestors are rendered, and (2) a collection of rendering data: positions of nodes and lines, etc. The initial state of the hybrid is the whole graph with no rendered subgraphs. The final state is a rendered whole graph. Each step of the algorithm converts some set of elements at the leaf boundary of the hybrid graph into a (larger) rendered subgraph, reducing the number of elements in the hybrid. At the end there's only one element, the rendered graph as a whole.
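A minimal sketch of the "lay out the highest ranks first" idea, assuming the simplest case where the ancestry really is a tree (no shared ancestors, no extra partners). The `{ name, parents }` shape and the `layoutAncestors` function are hypothetical, made up for this illustration, and the sketch would need the verification code described above before being fed real data.

```javascript
// Each person is { name, parents: [person, ...] } (a hypothetical shape).
// Post-order, depth-first traversal: every ancestor is positioned
// before the person below it, so parents never have to be moved later.
function layoutAncestors(person, blockWidth, gap, rowHeight) {
  const placed = [];
  let nextLeafX = 0; // x for the next ancestor whose own parents are unknown

  function visit(node, rank) {
    const parentXs = node.parents.map((parent) => visit(parent, rank + 1));
    let x;
    if (parentXs.length === 0) {
      // Leaf of the ancestry tree: take the next free slot.
      x = nextLeafX;
      nextLeafX += blockWidth + gap;
    } else {
      // Centre the node under its already-placed parents.
      x = (Math.min(...parentXs) + Math.max(...parentXs)) / 2;
    }
    // Higher ranks (older generations) are drawn further up the page.
    placed.push({ name: node.name, rank, x, y: 0 - rank * rowHeight });
    return x;
  }

  visit(person, 0);
  return placed;
}
```

Because each leaf gets its own slot and every other node is centred within the span of its own ancestors, two blocks on the same rank end up at least `blockWidth + gap` apart, as long as the input is a tree.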
Since you are already using Family Echo, I'd suggest you look at how they develop their online family tree diagram, since they seem to have solved your problem.
When I enter your sample diagram into Family Echo, I can build a nice looking tree that seems to be what you are looking for with no cross over.
Although they are creating their diagrams with html and css, you can add the people to their diagrams one by one and then inspect where the boxes are being placed in terms of the left and top pixel locations of each element.
If I had more expertise in JavaScript, I would have tried building up some code to replicate some of what Family Echo is doing, but I'm afraid that's not my mojo.
You'll have to adjust all the branches off the node that you change. Each branch has to recalculate the positions of its nodes, and each node has to be recalculated locally, all the way down to the leaves. Once the leaves are calculated, you have to recalculate back up again, all of it recursively. It's like a real tree: when you physically add a branch to the trunk, the other branches move to leave some space and all the leaves are repositioned automatically. You have to imagine that process and simulate it in your diagram: process each branch until you reach each leaf, then recalculate upward until you have recomputed the neighbours of the modified node (one level above where you started). That is neither an easy nor a simple job.

D3 Scatterplot with thousands of data points

I would like to make a scatter plot using D3 with the ability of only looking at a small section at a time using some sort of slider across the x-axis. Is there a method in javascript where I can efficiently buffer the data and quickly access the elements as the user scrolls left or right?
My goal is similar to this protovis example here, but with 10 times the amount of data. This example chokes when I make that many data points.
I have done a scatterplot with around 10k points where I needed to filter sections of the plot interactively.
Here is a series of tips that worked for me; I hope some of them help you too:
Use a key function for your .data() operator as it is done at the end of this tutorial. The advantage of using keys is that you do not need to update elements that do not change.
Not related to d3, but I divided my data space into a grid, so that each data point is associated with a single cell (in other words, each cell is an index to a set of points). This way, when I needed to access the range from, say, x_0 to x_1, I knew which cells I needed, and hence I could access a much smaller set of candidate data points (avoiding iterating over all points).
Avoid transitions: in my experience, .transition() is not very smooth when thousands of SVG elements are selected (it may be better now in newer versions or with faster processors)
In my case it was more convenient to make points invisible (.attr("display","none")) or visible rather than removing and creating SVG elements (I wonder if this is more time efficient too)
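The grid tip can be sketched like this. It is a simplified, one-dimensional version (cells along the x axis only, since the question is about a slider across x), and `buildGrid`, `queryRange`, `xMin` and `cellWidth` are names I made up for the example rather than anything from d3.

```javascript
// Build a one-dimensional grid over x: cell k holds the points with
// x in [xMin + k * cellWidth, xMin + (k + 1) * cellWidth).
function buildGrid(points, xMin, cellWidth) {
  const cells = new Map();
  for (const point of points) {
    const k = Math.floor((point.x - xMin) / cellWidth);
    if (!cells.has(k)) cells.set(k, []);
    cells.get(k).push(point);
  }
  return { cells, xMin, cellWidth };
}

// Return the points with x0 <= x <= x1, visiting only the cells
// that overlap the requested range instead of every point.
function queryRange(grid, x0, x1) {
  const first = Math.floor((x0 - grid.xMin) / grid.cellWidth);
  const last = Math.floor((x1 - grid.xMin) / grid.cellWidth);
  const result = [];
  for (let k = first; k <= last; k++) {
    for (const point of grid.cells.get(k) || []) {
      // Cells at the edges of the range may hold points outside it.
      if (point.x >= x0 && point.x <= x1) result.push(point);
    }
  }
  return result;
}
```

The result of `queryRange` is what you would hand to the keyed `.data()` join, or use to decide which points to show and which to hide.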

Data visualization: Bubble charts, Venn diagrams, and tag clouds (oh my!)

Suppose I have a large list of objects (thousands or tens of thousands), each of which is tagged with a handful of tags.
There are dozens or hundreds of possible tags and their usage follows a typical power law:
some tags are used extremely often but most are rare.
All but the most frequent couple dozen tags could typically be ignored, in fact.
Now the problem is how to visualize the relationship between these tags.
A tag cloud is a nice visualization of just their frequencies but it ignores which tags occur with which other tags.
Suppose tag :bar only occurs on objects also tagged :foo.
That should be visually apparent.
Similarly for three tags that tend to occur together.
You could make each tag a bubble and let them partially overlap with each other.
Technically that's a Venn diagram but treating it that way might be unwieldy.
For example, Google charts can create Venn diagrams, but only for 3 or fewer sets (tags):
http://code.google.com/apis/chart/docs/gallery/venn_charts.html
The reason they limit it to 3 sets is that any more and it looks horrendous.
See "extensions to higher numbers of sets" on the Wikipedia page: http://en.wikipedia.org/wiki/Venn_diagrams
But that's only if every possible intersection is non-empty.
If no more than 3 tags ever co-occur (maybe after throwing out the rare tags) then a collection of Venn diagrams could work (with the sizes of the bubbles representing tag frequency).
Or perhaps a graph (as in vertices and edges) with visually thicker or thinner edges to represent frequency of co-occurrence.
Do you have any ideas, or pointers to tools or libraries?
Ideally I'd do this with javascript but I'm open to things like R and Mathematica or really anything else.
I'm happy to share some actual data (you'll laugh if I tell you what it represents) if anyone is curious.
Addendum: The application I originally had in mind was TagTime but it occurs to me that this also maps well to the problem of visualizing one's delicious bookmarks.
If I understand your question correctly, an image matrix should work nicely here. The implementation I have in mind would be an n x m matrix in which the tagged items are rows and each tag type is a separate column. Every cell in the matrix would be either a "1" or a "0", i.e., a particular item either has a given tag or it doesn't.
In the matrix below (which I rotated 90 degrees so it would fit better in this window, so columns actually represent tagged items, and each row shows the presence or absence of a given tag across all items), I simulated the scenario in which there are 8 tags and 200 tagged items. A "0" is blue and a "1" is light yellow.
All values in this matrix were randomly selected (each tagged item is eight draws from a box consisting of two tokens, one blue and one yellow, meaning no tag and tag, respectively). So, not surprisingly, there's no visual evidence of a pattern here, but if there is one in your data, this technique, which is dead simple to implement, can help you find it.
I used R to generate and plot the simulated data, using only base graphics (no external packages or libraries):
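# create the matrix from one random row of 8 tags
# (r1 has to exist before it is passed to matrix())
r1 = sample(0:1, 8, replace=TRUE)
A = matrix(data=r1, nrow=1, ncol=8)
# append 199 more random rows, for 200 tagged items in total
for (i in seq_len(199)){r1 = sample(0:1, 8, replace=TRUE); A = rbind(A, r1)}
# now plot it
image(z=A, ann=F, axes=F, col=topo.colors(12))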
I would create something like this if you are targeting the web. Edges connecting the nodes could be thicker or darker in color, or perhaps a stronger force connecting them so they are close in distance. I would also add the tag name inside the circle.
Some libraries that would be very good for this include:
Protovis (Javascript)
Flare (Adobe Flash)
Some other fun javascript libraries worth looking into are:
Processing for Javascript
Raphael
Although this is an old thread, I just came across it today.
You may also want to consider using a Self-Organizing Map.
Here is an example of a self-organizing map for world poverty. It used 39 of what you call your "tags" to arrange what you call your "objects".
http://www.cis.hut.fi/research/som-research/povertymap.gif
I'm not sure it would work, as I did not test it, but here is how I would start:
You can create a matrix as doug suggests in his answer, but instead of having documents as rows and tags as columns, you take a square matrix where tags are both rows and columns. The value of the cell [T1;T2] will be the number of documents tagged with both T1 and T2 (note that by doing that you'll get a symmetric matrix, because [T1;T2] will have the same value as [T2;T1]).
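As a rough sketch of that square matrix, assuming the data is available as one array of tag names per document (the `cooccurrenceMatrix` name and the input shape are my assumptions, not part of any library):

```javascript
// `items` is an array of tag arrays, one per tagged document.
// Returns the sorted tag list and a square matrix where
// matrix[i][j] is the number of documents carrying both tags[i] and tags[j].
function cooccurrenceMatrix(items) {
  const tags = [...new Set(items.flat())].sort();
  const index = new Map(tags.map((tag, i) => [tag, i]));
  const matrix = tags.map(() => new Array(tags.length).fill(0));
  for (const itemTags of items) {
    // De-duplicate so a tag repeated on one document is counted once.
    const unique = [...new Set(itemTags)];
    for (const a of unique) {
      for (const b of unique) {
        matrix[index.get(a)][index.get(b)] += 1;
      }
    }
  }
  return { tags, matrix };
}
```

In this version the diagonal cell [T1;T1] ends up holding the plain frequency of T1, which is convenient for sizing bubbles, and each row can be used directly as the vector for that tag.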
Once you have done that, each row (or column) is a vector locating the tag in a space with T dimensions. Tags near each other in this space often occur together. To visualize co-occurrence you can then use a method to reduce the dimensionality of your space, or any clustering method. For example, you can use a Kohonen self-organizing map to project your T-dimensional space to a 2D space. You'll then get a 2D matrix where each cell represents an abstract vector in the tag space (meaning the vector won't necessarily exist in your data set). This vector reflects a topological constraint of your source space, and can be seen as a "model" vector reflecting a significant co-occurrence of some tags. Moreover, cells near each other on this map will represent vectors close to each other in the source space, thus allowing you to map the tag space onto a 2D matrix.
Final visualization of the matrix can be done in many ways but I cannot give you advice on that without first seeing the results of the previous processing.
