D3 vs Scipy (Voronoi diagram implementation) - javascript

Background
I'm working with a set of 8000 geographical points stored in a CSV file. On one hand, I visualise Voronoi diagrams built from these points using the D3 library. On the other hand, I calculate the same Voronoi diagrams in Python using SciPy.
My workflow is simple: I process the data on the Python side (heatmaps, analysis and so on) and then visualise the results using D3. But today I accidentally found that the Voronoi diagrams produced by SciPy and D3 are different. I noticed this after using geojson.io to plot the GeoJSON of the Voronoi cells made in Python, just to see whether I could visualise everything there.
As I said, the diagrams were different: some cells had different angles and some even had additional vertices.
Question:
Why is that happening? Why do the Voronoi diagrams calculated by these two libraries (D3 and SciPy) differ?
Further description
How it is done on the D3 side: Based on Chris Zetter's example http://chriszetter.com/blog/2014/06/15/building-a-voronoi-map-with-d3-and-leaflet/ I translate latitude and longitude into a custom projection to visualise the points on the Mapbox map.
var voronoi = d3.geom.voronoi()
.x(function(d) { return d.x; })
.y(function(d) { return d.y; })
.clipExtent([[N_W.x , N_W.y],[S_E.x, S_E.y]])
I create the Voronoi diagram from the points that are visible within the map border plus some padding (filteredPoints):
filteredPoints = points.filter(function(d) {
var latlng = new L.LatLng(d.latitude, d.longitude);
if (!drawLimit.contains(latlng)) { return false };
// this translates points from coordinates to pixels
var point = map.latLngToLayerPoint(latlng);
key = point.toString();
if (existing.has(key)) { return false };
existing.add(key);
d.x = point.x;
d.y = point.y;
return true;
});
voronoi(filteredPoints).forEach(function(d) { d.point.cell = d});
How it is done on Python side: I use scipy.spatial.Voronoi.
import numpy
from scipy.spatial import Voronoi
def create_voronois():
    points = numpy.array(points_list)
    vor = Voronoi(points)
Where "points_list" is a list of my 8000 geographical points.
EDIT:
Screenshot from my visualisation - black borders are Voronois made with D3, white ones are made by scipy.spatial.Voronoi. As we can see scipy is wrong. Did anyone compare these 2 libraries before?
http://imgur.com/b1ndx0F
Code to run. It prints GeoJson with badly calculated Voronois.
import numpy
from scipy.spatial import Voronoi
from geojson import FeatureCollection, Feature, Polygon
points = [
[22.7433333333000, 53.4869444444000],
[23.2530555556000, 53.5683333333000],
[23.1066666667000, 53.7200000000000],
[22.8452777778000, 53.7758333333000],
[23.0952777778000, 53.4413888889000],
[23.4152777778000, 53.5233333333000],
[22.9175000000000, 53.5322222222000],
[22.7197222222000 ,53.7322222222000],
[22.9586111111000, 53.4594444444000],
[23.3425000000000, 53.6541666667000],
[23.0900000000000, 53.5777777778000],
[23.2283333333000, 53.4713888889000],
[23.3488888889000, 53.5072222222000],
[23.3647222222000 ,53.6447222222000]]
def create_voronois(points_list):
    points = numpy.array(points_list)
    vor = Voronoi(points)
    point_voronoi_list = []
    feature_list = []
    for region in range(len(vor.regions) - 1):
        vertice_list = []
        for x in vor.regions[region]:
            vertice = vor.vertices[x]
            # swap the coordinate order before writing the polygon
            vertice = (vertice[1], vertice[0])
            vertice_list.append(vertice)
        polygon = Polygon([vertice_list])
        feature = Feature(geometry=polygon, properties={})
        feature_list.append(feature)
    feature_collection = FeatureCollection(feature_list)
    print(feature_collection)
create_voronois(points)

Apparently your javascript code is applying a transformation to the data before computing the Voronoi diagram. This transformation does not preserve the relative distances of the points, so it does not generate the same result as your scipy code. Note that I'm not saying that your d3 version is incorrect. Given that the data are latitude and longitude, what you are doing in the javascript code might be correct. But to compare it to the scipy code, you have to do the same transformations if you expect to get the same Voronoi diagram.
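To see why a projection can change the diagram, consider which site is nearest to a given location, since that is what defines a Voronoi cell. The sketch below is only an illustration, assuming the map uses a spherical Mercator projection (Leaflet's default CRS). The `mercator` and `nearest` helpers and the sample coordinates are made up for this example and are not part of D3 or SciPy.

```javascript
// Spherical Mercator on a unit sphere: x grows linearly with longitude,
// while y is stretched by roughly 1 / cos(latitude).
function mercator([lon, lat]) {
  const rad = Math.PI / 180;
  return [lon * rad, Math.log(Math.tan(Math.PI / 4 + (lat * rad) / 2))];
}

// Index of the site closest to `query` by plain Euclidean distance.
function nearest(query, sites) {
  let best = 0;
  let bestDist = Infinity;
  sites.forEach((s, i) => {
    const d = Math.hypot(s[0] - query[0], s[1] - query[1]);
    if (d < bestDist) {
      bestDist = d;
      best = i;
    }
  });
  return best;
}

// Hypothetical sample: site 0 lies 0.10 degrees east of the query,
// site 1 lies 0.08 degrees north of it.
const query = [23.0, 53.5];
const sites = [[23.1, 53.5], [23.0, 53.58]];

const rawWinner = nearest(query, sites);
const projectedWinner = nearest(mercator(query), sites.map(mercator));
console.log(rawWinner, projectedWinner);
```

In raw degrees the northern site (index 1) is closer, but at this latitude the projection stretches north-south distances by about 1.7, so after projecting the eastern site (index 0) wins. The query location therefore belongs to different cells in the two diagrams.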
The scripts below show that, if you preserve the relative distance of the input points, scipy's Voronoi function and d3.geom.voronoi generate the same diagram.
Here's a script that uses scipy's Voronoi code:
import numpy
from scipy.spatial import Voronoi, voronoi_plot_2d
import matplotlib.pyplot as plt
points = [
[22.7433333333000, 53.4869444444000],
[23.2530555556000, 53.5683333333000],
[23.1066666667000, 53.7200000000000],
[22.8452777778000, 53.7758333333000],
[23.0952777778000, 53.4413888889000],
[23.4152777778000, 53.5233333333000],
[22.9175000000000, 53.5322222222000],
[22.7197222222000, 53.7322222222000],
[22.9586111111000, 53.4594444444000],
[23.3425000000000, 53.6541666667000],
[23.0900000000000, 53.5777777778000],
[23.2283333333000, 53.4713888889000],
[23.3488888889000, 53.5072222222000],
[23.3647222222000, 53.6447222222000]]
vor = Voronoi(points)
voronoi_plot_2d(vor)
plt.axis('equal')
plt.xlim(22.65, 23.50)
plt.ylim(53.35, 53.85)
plt.show()
It generates this plot:
Now here's a javascript program that uses d3.geom.voronoi:
<html>
<head>
<script type="text/javascript" src="http://mbostock.github.com/d3/d3.js"></script>
<script type="text/javascript" src="http://mbostock.github.com/d3/d3.geom.js"></script>
</head>
<body>
<div id="chart">
</div>
<script type="text/javascript">
// This code is a hacked up version of http://bl.ocks.org/njvack/1405439
var w = 800,
h = 400;
var data = [
[22.7433333333000, 53.4869444444000],
[23.2530555556000, 53.5683333333000],
[23.1066666667000, 53.7200000000000],
[22.8452777778000, 53.7758333333000],
[23.0952777778000, 53.4413888889000],
[23.4152777778000, 53.5233333333000],
[22.9175000000000, 53.5322222222000],
[22.7197222222000, 53.7322222222000],
[22.9586111111000, 53.4594444444000],
[23.3425000000000, 53.6541666667000],
[23.0900000000000, 53.5777777778000],
[23.2283333333000, 53.4713888889000],
[23.3488888889000, 53.5072222222000],
[23.3647222222000, 53.6447222222000]
];
// Translate and scale the points. The same scaling factor (2*h) must be used
// on x and y to preserve the relative distances among the points.
// The y coordinates are also flipped.
var vertices = data.map(function(point) {return [2*h*(point[0]-22.5), h - 2*h*(point[1]-53.4)]})
var svg = d3.select("#chart")
.append("svg:svg")
.attr("width", w)
.attr("height", h);
var paths, points;
points = svg.append("svg:g").attr("id", "points");
paths = svg.append("svg:g").attr("id", "point-paths");
paths.selectAll("path")
.data(d3.geom.voronoi(vertices))
.enter().append("svg:path")
.attr("d", function(d) { return "M" + d.join(",") + "Z"; })
.attr("id", function(d,i) {
return "path-"+i; })
.attr("clip-path", function(d,i) { return "url(#clip-"+i+")"; })
.style("fill", d3.rgb(230, 230, 230))
.style('fill-opacity', 0.4)
.style("stroke", d3.rgb(50,50,50));
points.selectAll("circle")
.data(vertices)
.enter().append("svg:circle")
.attr("id", function(d, i) {
return "point-"+i; })
.attr("transform", function(d) { return "translate(" + d + ")"; })
.attr("r", 2)
.attr('stroke', d3.rgb(0, 50, 200));
</script>
</body>
</html>
It generates:
Based on a visual inspection of the results, I'd say they are generating the same Voronoi diagram.

Related

Bubble Map with leaflet and D3.js [problem] : bubbles overlapping

I have a basic map here, with dummy data. Basically a bubble map.
The problem is I have multiple dots (ex:20) with exact same GPS coordinates.
The following image shows my csv with dummy data; the blue highlight marks the overlapping dots in this basic example. That's because many companies share the same city GPS coordinates.
Here is a fiddle with the code I'm working on :
https://jsfiddle.net/MathiasLauber/bckg8es4/45/
After a lot of research, I found that d3.js provides a force simulation function that keeps dots from colliding.
// Avoiding bubbles overlapping
var simulationforce = d3.forceSimulation(data)
.force('x', d3.forceX().x(d => xScale(d.longitude)))
.force('y', d3.forceY().y(d => yScale(d.latitude)))
.force('collide', d3.forceCollide().radius(function(d) {
return d.radius + 10
}))
simulationforce
.nodes(cities)
.on("tick", function(d){
node
.attr("cx", function(d) { return projection.latLngToLayerPoint([d.latitude, d.longitude]).x; })
.attr("cy", function(d) {return projection.latLngToLayerPoint([d.latitude, d.longitude]).y; })
});
The problem is I can't make force layout work and my dots are still on top of each other. (lines: 188-200 in the fiddle).
If you have any tips, suggestions, or if you notice basic errors in my code, just let me know =D
Bunch of code close to what i'm trying to achieve
https://d3-graph-gallery.com/graph/circularpacking_group.html
https://jsbin.com/taqewaw/edit?html,output
There are 3 problems:
1. To position the circles near their original positions, the initial x and y positions need to be specified in the data passed to the simulation.nodes() call.
2. When doing a force simulation, you need to provide the selection to be updated in the on tick callback (see node in the on('tick') callback function).
3. The tick callback needs to use the d.x and d.y values calculated by the simulation.
Relevant code snippets below
// 1. Add x and y (cx, cy) to each row (circle) in data
const citiesWithCenter = cities.map(c => ({
...c,
x: projection.latLngToLayerPoint([c.latitude, c.longitude]).x,
y: projection.latLngToLayerPoint([c.latitude, c.longitude]).y,
}))
// citiesWithCenter will be passed to selectAll('circle').data()
// 2. node selection you forgot
const node = selection
.selectAll('circle')
.data(citiesWithCenter)
.enter()
.append('circle')
...
// node is used in the simulation's tick handler
simulationforce.nodes(citiesWithCenter).on('tick', function () {
node
.attr('cx', function (d) {
// 3. use previously computed x value
// on the first tick run, the values in citiesWithCenter is used
return d.x
})
.attr('cy', function (d) {
// 3. use previously computed y value
// on the first tick run, the values in citiesWithCenter is used
return d.y
})
})
Full working demo here: https://jsfiddle.net/b2Lhfuw5/
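The third point is the easiest to miss: the simulation writes its results into `d.x` and `d.y`, so re-projecting latitude and longitude inside the tick handler throws those results away. Here is a minimal sketch of that idea, with a hand-rolled collision pass standing in for `d3.forceCollide`. `separate` is a hypothetical helper and not D3's algorithm, and the pixel coordinates are invented.

```javascript
// Hypothetical stand-in for one collision pass: pushes overlapping
// circles apart and writes the result back into node.x / node.y.
function separate(nodes) {
  for (let i = 0; i < nodes.length; i++) {
    for (let j = i + 1; j < nodes.length; j++) {
      const a = nodes[i];
      const b = nodes[j];
      const dx = b.x - a.x;
      const dy = b.y - a.y;
      const dist = Math.hypot(dx, dy);
      const minDist = a.radius + b.radius;
      if (dist >= minDist) continue;
      // Unit vector from a to b; coincident centres get an arbitrary direction.
      const ux = dist === 0 ? 1 : dx / dist;
      const uy = dist === 0 ? 0 : dy / dist;
      // Each circle moves half of the overlap, in opposite directions.
      const shift = (minDist - dist) / 2;
      a.x -= ux * shift;
      a.y -= uy * shift;
      b.x += ux * shift;
      b.y += uy * shift;
    }
  }
}

// Two companies projected to the same pixel, seeded with x and y up front.
const nodes = [
  { name: "A", x: 100, y: 50, radius: 5 },
  { name: "B", x: 100, y: 50, radius: 5 },
];
separate(nodes);

// A tick handler should read these updated values, not re-project lat/lng.
const positions = nodes.map(d => [d.x, d.y]);
console.log(positions);
```

After the pass the two circles sit at x = 95 and x = 105. Computing `cx` from latitude and longitude again would put both back at x = 100.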

Datum-Data difference in map behavior in d3

I'm pretty new to d3js and trying to understand the difference between using data and datum to attach data to elements. I've done a fair bit of reading the material online and I think I theoretically understand what's going on but I still lack an intuitive understanding. Specifically, I have a case where I'm creating a map using topojson. I'm using d3js v7.
In the first instance, I have the following code to create the map within a div (assume height, width, projection etc. setup correctly):
var svg = d3.select("div#map").append("svg")
.attr("width", width)
.attr("height", height)
.attr("transform", "translate(" + 15 + "," + 0 + ")");
var path = d3.geoPath()
.projection(projection);
var mapGroup = svg.append("g");
d3.json("json/world-110m.json").then(function(world){
console.log(topojson.feature(world, world.objects.land))
mapGroup.append("path")
.datum(topojson.feature(world, world.objects.land))
.attr("class", "land")
.attr("d", path);
});
The console log for the topojson feature looks like this:
And the map comes out fine (with styling specified in a css file):
But if I change datum to data, the map disappears. I'm trying to improve my understanding of how this is working and I'm struggling a little bit after having read what I can find online. Can someone explain the difference between data and datum as used in this case and why one works and the other doesn't?
Thanks for your help!
There are several differences between data() and datum(), but for the scope of your question the main difference is that data() accepts only 3 things:
An array;
A function;
Nothing (in that case, it's a getter);
As you can see, topojson.feature(world, world.objects.land) is an object. Thus, all you need in order to use data() here (again, this is not idiomatic D3, I'm just addressing your specific question) is to wrap it in an array:
.data([topojson.feature(world, world.objects.land)])
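As a rough mental model (a simplified sketch, not D3's actual implementation), `datum()` hands the same value to every selected element, while `data()` pairs elements with array entries by index. `bindDatum` and `bindData` below are hypothetical helpers working on plain objects. Treating a non-array argument as an empty list is an assumption of this model, chosen to mirror the symptom in the question.

```javascript
// Hypothetical model of selection.datum(value): every element gets the value.
function bindDatum(elements, value) {
  elements.forEach(el => {
    el.__data__ = value;
  });
  return elements;
}

// Hypothetical model of selection.data(values): element i gets values[i];
// elements without a matching entry drop out of the returned selection.
// Assumption: anything that is not an array is treated as an empty list.
function bindData(elements, values) {
  const list = Array.isArray(values) ? values : [];
  return elements.filter((el, i) => {
    if (i >= list.length) return false;
    el.__data__ = list[i];
    return true;
  });
}

// Stand-in for topojson.feature(world, world.objects.land).
const land = { type: "Feature", geometry: null };

const viaDatum = bindDatum([{}], land);  // one element, bound to the object
const viaObject = bindData([{}], land);  // bare object: nothing is bound
const viaArray = bindData([{}], [land]); // wrapped in an array: one element bound
console.log(viaDatum.length, viaObject.length, viaArray.length);
```

In this model the later `.attr("d", path)` call has no element to act on when the bare object is passed, which matches the map disappearing.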
Here is your code using data():
var svg = d3.select("div#map").append("svg")
.attr("width", 500)
.attr("height", 300)
.attr("transform", "translate(" + 15 + "," + 0 + ")");
var path = d3.geoPath();
var mapGroup = svg.append("g");
d3.json("https://raw.githubusercontent.com/d3/d3.github.com/master/world-110m.v1.json").then(function(world) {
const projection = d3.geoEqualEarth()
.fitExtent([
[0, 0],
[500, 300]
], topojson.feature(world, world.objects.land));
path.projection(projection);
mapGroup.append("path")
.data([topojson.feature(world, world.objects.land)])
.attr("class", "land")
.attr("d", path);
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/5.7.0/d3.min.js"></script>
<script src="https://unpkg.com/topojson@3"></script>
<div id="map"></div>

How to crossfilter histogram and scatterplot matrix in d3 v4?

I am using this kind of scatterplot matrix and a histogram as two views in d3. Both of them get their data from the same csv file. This is what the histogram looks like (x axis):
To brush the histogram I use the code below, which is similar to this snippet:
svg.append("g")
.attr("class", "brush")
.call(d3.brushX()
.on("end", brushed));
function brushed() {
if (!d3.event.sourceEvent) return;
if (!d3.event.selection) return;
var d0 = d3.event.selection.map(x.invert),
d1 = [Math.floor(d0[0]*10)/10, Math.ceil(d0[1]*10)/10];
if (d1[0] >= d1[1]) {
d1[0] = Math.floor(d0[0]);
d1[1] = d1[0]+0.1;
}
d3.select(this).transition().call(d3.event.target.move, d1.map(x));
}
How can I link the two views, so that when I brush the histogram, the scatterplot matrix will show the brushed points as colored in red, and the other points as, lets say, grey?
This can get you started:
3 html files:
2 for the visuals (histogram.html and scatter.html)
1 to hold them in iframes (both.html):
Dependency:
jQuery (add to all 3 files)
Create table with 2 cells in both.html:
Add iframes to each cell:
<iframe id='histo_frame' width='100%' height='600px' src='histogram.html'></iframe>
<iframe id='scatter_frame' width='100%' height='600px' src='scatter.html'></iframe>
I am using this histogram, and this scatterplot.
Add the linky_dink function to call the function inside your scatter.html (see below...):
function linky_dink(linked_data) {
document.getElementById('scatter_frame').contentWindow.color_by_value(linked_data);
}
In your scatter.html change your cell.selectAll function to this:
cell.selectAll("circle")
.data(data)
.enter().append("circle")
.attr("cx", function(d) { return x(d[p.x]); })
.attr("cy", function(d) { return y(d[p.y]); })
.attr("r", 4)
.attr('data-x', function(d) { return d.frequency }) // get x value being plotted
.attr('data-y', function(d) { return d.year }) // get y value being plotted
.attr("class", "all_circles") // add custom class
.style("fill", function(d) { return color(d.species); });
}
Note the three added lines, marked with trailing comments:
Now our scatterplot circle elements retain the x and y values, along with a custom class we can use for targeting.
Create a color_by_value function:
function color_by_value(passed_value) {
$('.all_circles').each(function(d, val) {
if(Number($(this).attr('data-x')) == passed_value) {
$(this).css({ fill: "#ff0000" })
}
});
}
We know from above this function will be called from the linky_dink function of the parent html file. If the passed value matches that of the circle it will be recolored to #ff0000.
Finally, look for the brushend() function inside your histogram.html file. Find where it says: d3.selectAll("rect.bar").style("opacity", function(d, i) { .... and change to:
d3.selectAll("rect.bar").style("opacity", function(d, i) {
if(d.x >= localBrushYearStart && d.x <= localBrushYearEnd || brush.empty()) {
parent.linky_dink(d.y)
return(1)
} else {
return(.4)
}
});
Now, in addition to controlling the rect opacity on brushing, we are also calling our linky_dink function in our both.html file, thus passing any brushed histogram value onto the scatterplot matrix for recoloring.
Result:
Not the greatest solution for obvious reasons. It only recolors the scatterplot when the brushing ends. It targets circles by sweeping over all classes which is horribly inefficient. The colored circles are not uncolored when the brushing leaves those values since this overwhelms the linky_dink function. And I imagine you'd rather not use iframes, let alone 3 independent files. Finally, jQuery isn't really needed as D3 provides the needed functionality. But there was also no posted solution, so perhaps this will help you or someone else come up with a better answer.
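Independent of iframes or jQuery, the core of the linking is a predicate that decides whether a point falls inside the brushed range. Here is a minimal sketch, assuming the brush yields a `[min, max]` pair in data units and each row carries a numeric field. `colorsForBrush` and the field name `value` are hypothetical.

```javascript
// Returns one fill colour per row: red inside the brushed range, grey outside.
// A null range (no active brush) leaves every point grey.
function colorsForBrush(rows, range, accessor) {
  return rows.map(row => {
    if (!range) return "grey";
    const v = accessor(row);
    return v >= range[0] && v <= range[1] ? "red" : "grey";
  });
}

// Hypothetical rows shared by the histogram and the scatterplot matrix.
const rows = [{ value: 0.2 }, { value: 0.5 }, { value: 0.9 }];
const fills = colorsForBrush(rows, [0.4, 0.6], d => d.value);
console.log(fills); // [ 'grey', 'red', 'grey' ]
```

In a single-page version, the brush handler could compute the range with `x.invert` as in the question and apply the same test inside `selection.style("fill", ...)` on the scatterplot circles. Points that leave the range would then turn grey again.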

Force colliding labels but not their points in d3

I am using d3 to make a line chart that has to support up to 100 points on it, making it very crowded. The problem is that some of the labels overlap.
The method I was trying involved drawing all the points, then separately drawing all the labels and running a force collision on the labels to stop them overlapping, then after the force collision drawing a line between each of the labels and their associated point.
I can't make the forces work, let alone the drawing of lines after.
Any suggestions for a better way to do this are heartily welcomed also.
Here is my code:
$.each(data.responseJSON.responsedata, function(k, v) {
var thispoint = svg.append("g").attr("transform", "translate("+pointx+","+pointy+")");
thispoint.append("circle").attr("r", 10).style("fill","darkBlue").style("stroke","black");
var label = svg.append("text").text(v.conceptName).style("text-anchor", "end").attr("font-family", "Calibri");
label.attr("transform", "translate("+(pointx)+","+(pointy-12)+") rotate(90)")
});
nodes = d3.selectAll("text")
simulation = d3.forceSimulation(nodes)
.force("x", d3.forceX().strength(10))
.force("y", d3.forceY().strength(10))
.force("collide",d3.forceCollide(20).strength(5))
.velocityDecay(0.15);
ticks = 0;
simulation.nodes(data)
.on("tick", d => {
ticks = ticks + 1;
d3.select(this).attr("x", function(d) { return d.x }).attr("y", function(d) { return d.x });
console.log("updated" + this)
});
Force layout is a relatively expensive way of moving labels to avoid collision. It is iterative and computationally intensive.
More efficient algorithms add the labels one at a time, determining the best position for each. For example a 'greedy' strategy adds each label in sequence, selecting the position where the label has the lowest overlap with already added labels.
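Here is a minimal sketch of such a greedy pass, assuming axis-aligned label boxes and four candidate positions around each anchor point. It illustrates the idea only and is not the d3fc implementation. `overlapArea`, `candidates` and `greedyLayout` are names invented here.

```javascript
// Overlap area of two boxes given as {x, y, width, height}.
function overlapArea(a, b) {
  const w = Math.min(a.x + a.width, b.x + b.width) - Math.max(a.x, b.x);
  const h = Math.min(a.y + a.height, b.y + b.height) - Math.max(a.y, b.y);
  return w > 0 && h > 0 ? w * h : 0;
}

// Four candidate boxes, each with one corner on the anchor point (x, y).
function candidates(label) {
  const { x, y, width, height } = label;
  return [
    { x, y, width, height },
    { x: x - width, y, width, height },
    { x, y: y - height, width, height },
    { x: x - width, y: y - height, width, height },
  ];
}

// Place labels one at a time, choosing for each the candidate with the
// lowest total overlap against the labels that are already placed.
function greedyLayout(labels) {
  const placed = [];
  for (const label of labels) {
    let best = null;
    let bestCost = Infinity;
    for (const box of candidates(label)) {
      const cost = placed.reduce((sum, p) => sum + overlapArea(box, p), 0);
      if (cost < bestCost) {
        bestCost = cost;
        best = box;
      }
    }
    placed.push(best);
  }
  return placed;
}

// Two hypothetical labels whose default positions would overlap.
const placed = greedyLayout([
  { x: 10, y: 10, width: 20, height: 10 },
  { x: 12, y: 8, width: 20, height: 10 },
]);
console.log(placed);
```

The first label keeps its default position. The second would overlap it there, so the pass moves it to the first candidate with zero overlap, here the box starting at y = -2. Each label is decided once, with no repeated simulation ticks.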
I've created a D3 component, d3fc-label-layout, that implements a number of label layout strategies:
https://github.com/d3fc/d3fc-label-layout
Here's an example of how to use it:
// Use the text label component for each datapoint. This component renders both
// a text label and a circle at the data-point origin. For this reason, we don't
// need to use a scatter / point series.
const labelPadding = 2;
const textLabel = fc.layoutTextLabel()
.padding(labelPadding)
.value(d => d.language);
// a strategy that combines greedy placement with removal
// of overlapping labels
const strategy = fc.layoutRemoveOverlaps(fc.layoutGreedy());
// create the layout that positions the labels
const labels = fc.layoutLabel(strategy)
.size((d, i, g) => {
// measure the label and add the required padding
const textSize = g[i].getElementsByTagName('text')[0].getBBox();
return [
textSize.width + labelPadding * 2,
textSize.height + labelPadding * 2
];
})
.position((d) => {
return [
d.users,
d.orgs
]
})
.component(textLabel);
https://bl.ocks.org/ColinEberhardt/27508a7c0832d6e8132a9d1d8aaf231c

D3.js shade area between two lines using CSS fill [duplicate]

So I have a chart plotting traffic vs. date and rate vs. date. I'm trying to shade the area between the two lines. However, I want to shade it a different color depending on which line is higher. The following works without that last requirement:
var area = d3.svg.area()
.x0(function(d) { return x(d3.time.format("%m/%d/%Y").parse(d.original.date)); })
.x1(function(d) { return x(d3.time.format("%m/%d/%Y").parse(d.original.date)); })
.y0(function(d) { return y(parseInt(d.original.traffic)); })
.y1(function(d) { return y(parseInt(d.original.rate)); })
However, adding that last requirement, I tried to use defined():
.defined(function(d){ return parseInt(d.original.traffic) >= parseInt(d.original.rate); })
Now this mostly works, except when lines cross. How do I shade the area under one line BETWEEN points? It's shading based on the points and I want it to shade based on the line. If I don't have two consecutive points on one side of the line, I don't get any shading at all.
Since you don't have datapoints at the intersections, the simplest solution is probably to get the areas above and below each line and use clipPaths to crop the difference.
I'll assume you're using d3.svg.line to draw the lines that the areas are based on. This way we'll be able to re-use the .x() and .y() accessor functions on the areas later:
var trafficLine = d3.svg.line()
.x(function(d) { return x(d3.time.format("%m/%d/%Y").parse(d.original.date)); })
.y(function(d) { return y(parseInt(d.original.traffic)); });
var rateLine = d3.svg.line()
.x(trafficLine.x()) // reuse the traffic line's x
.y(function(d) { return y(parseInt(d.original.rate)); })
You can create separate area functions for calculating the areas both above and below your two lines. The area below each line will be used for drawing the actual path, and the area above will be used as a clipping path. Now we can re-use the accessors from the lines:
var areaAboveTrafficLine = d3.svg.area()
.x(trafficLine.x())
.y0(trafficLine.y())
.y1(0);
var areaBelowTrafficLine = d3.svg.area()
.x(trafficLine.x())
.y0(trafficLine.y())
.y1(height);
var areaAboveRateLine = d3.svg.area()
.x(rateLine.x())
.y0(rateLine.y())
.y1(0);
var areaBelowRateLine = d3.svg.area()
.x(rateLine.x())
.y0(rateLine.y())
.y1(height);
...where height is the height of your chart, and assuming 0 is the y-coordinate of the top of the chart, otherwise adjust those values accordingly.
Now you can use the area-above functions to create clipping paths like this:
var defs = svg.append('defs');
defs.append('clipPath')
.attr('id', 'clip-traffic')
.append('path')
.datum(YOUR_DATASET)
.attr('d', areaAboveTrafficLine);
defs.append('clipPath')
.attr('id', 'clip-rate')
.append('path')
.datum(YOUR_DATASET)
.attr('d', areaAboveRateLine);
The id attributes are necessary because we need to refer to those definitions when actually clipping the paths.
Finally, use the area-below functions to draw paths to the svg. The important thing to remember here is that for each area-below, we need to clip to the opposite area-above, so the Rate area will be clipped based on #clip-traffic and vice versa:
// TRAFFIC IS ABOVE RATE
svg.append('path')
.datum(YOUR_DATASET)
.attr('d', areaBelowTrafficLine)
.attr('clip-path', 'url(#clip-rate)')
// RATE IS ABOVE TRAFFIC
svg.append('path')
.datum(YOUR_DATASET)
.attr('d', areaBelowRateLine)
.attr('clip-path', 'url(#clip-traffic)')
After that you'll just need to give the two regions different fill colors or whatever you want to do to distinguish them from one another. Hope that helps!
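A different approach from the clipping above, not part of this answer's method, is to insert the missing data points where the two series cross, so that a `defined()`-style test has a point on the crossing itself. Here is a minimal sketch, assuming straight-line (linear) interpolation between samples and numeric x values such as timestamps. `crossing` is a hypothetical helper.

```javascript
// Given two consecutive samples p0 and p1, each {x, traffic, rate},
// return the point where the two straight segments cross, or null
// when they do not cross between the samples.
function crossing(p0, p1) {
  const d0 = p0.traffic - p0.rate;
  const d1 = p1.traffic - p1.rate;
  if (d0 === d1 || d0 * d1 > 0) return null; // parallel, or no sign change
  const t = d0 / (d0 - d1); // fraction of the way from p0 to p1
  const value = p0.traffic + t * (p1.traffic - p0.traffic);
  return { x: p0.x + t * (p1.x - p0.x), traffic: value, rate: value };
}

// Hypothetical samples: traffic rises from 10 to 20 while rate falls from 20 to 10.
const hit = crossing(
  { x: 0, traffic: 10, rate: 20 },
  { x: 1, traffic: 20, rate: 10 }
);
console.log(hit); // { x: 0.5, traffic: 15, rate: 15 }
```

This only matches the drawn lines when they use linear interpolation. With curved interpolation the clip-path approach remains the safer choice.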
