I'm working on a chart in which each observation has a value in the range [-100, 100], and I want to plot each point's position on a scale. The challenge is that the vast majority of the points have values in one region of the scale (the distribution is essentially Gaussian with mean 0).
In the past, when I've needed to plot something like a Zipf probability density distribution, I've used log scales to spread out the points in the congested region. Now my situation is similar, except that I have two distributions for which I need to spread out the points (the positive scale from [0, max] and the mirrored negative scale from [0, min]).
I know I could create one scale for positive values and one for negative values, but I'm wondering if it's possible to achieve this layout with only one scale. It seems that something like a parabolic scale could help out here (if that exists). Is it possible to achieve something like this in D3?
Before explaining my proposed solution, a note on the comments under this question: you cannot use a log scale in your situation, because your domain crosses zero and log(0) is undefined (it tends to minus infinity). This is stated explicitly in the docs:
As log(0) = -∞, a log scale domain must be strictly-positive or strictly-negative; the domain must not include or cross zero.
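As an aside, if a log-like look is still wanted, one common workaround is a symmetric log ("symlog") transform, which is roughly linear near zero and logarithmic further out, so it is defined at zero and mirrored around it. The plain-JavaScript sketch below is only an illustration under assumptions: `symlog`, `symlogScale`, and the constant `c` are hypothetical names, and this is a different technique from the one this answer goes on to use.

```javascript
// Symmetric log transform: an odd function that is defined at 0.
// `c` (hypothetical parameter) sets the width of the near-linear region around zero.
function symlog(x, c) {
  return Math.sign(x) * Math.log1p(Math.abs(x) / c);
}

// Map a value from [-max, max] to the pixel range [r0, r1] through symlog.
function symlogScale(max, r0, r1, c) {
  var top = symlog(max, c);
  return function (x) {
    var t = (symlog(x, c) / top + 1) / 2; // normalise to [0, 1]
    return r0 + t * (r1 - r0);
  };
}

var sx = symlogScale(100, 20, 580, 10);
console.log(sx(-100), sx(0), sx(100)); // the two ends and the centre of the range
console.log(sx(10) - sx(0));           // wider than the 28px a linear scale would give
```

A smaller `c` stretches the region around zero more aggressively, so it would need tuning against the real data.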
That being said, let's go to the proposed solution.
You could create your own scale (it's not that complicated). Here, however, I'll use an interpolator function, based on this excellent answer (not a duplicate, though, because you want the opposite) and this code from Mike Bostock.
Using a linear scale, we set an interpolator:
var xScale = d3.scaleLinear()
  .domain([-100, 100])
  .interpolate(easeInterpolate(d3.easeQuadInOut));
Then, we use an easing in the easeInterpolate function:
function easeInterpolate(ease) {
  return function(a, b) {
    var i = d3.interpolate(a, b);
    return function(t) {
      return i(ease(t));
    };
  };
}
Here I'm using d3.easeQuadInOut, which I think suits you, but you can swap it for another easing, or even create your own.
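To see what the easing does to pixel positions, here is a minimal sketch that runs without D3. The quadratic in-out curve is written out by hand (it should match d3.easeQuadInOut), and `easedScale` is a hypothetical helper standing in for the scale plus interpolator, using the same domain and range as the demo.

```javascript
// Hand-written quadratic in-out easing on [0, 1].
function quadInOut(t) {
  return t < 0.5 ? 2 * t * t : 1 - 2 * (1 - t) * (1 - t);
}

// Hypothetical helper: a linear domain -> range mapping with an easing applied to t.
function easedScale(domain, range, ease) {
  return function (x) {
    var t = (x - domain[0]) / (domain[1] - domain[0]);
    return range[0] + ease(t) * (range[1] - range[0]);
  };
}

var eased = easedScale([-100, 100], [20, 580], quadInOut);
var linear = easedScale([-100, 100], [20, 580], function (t) { return t; });

// Pixel gap between the neighbouring values 0 and 4 (near the centre)...
console.log(linear(4) - linear(0), eased(4) - eased(0));
// ...and between 96 and 100 (near the edge).
console.log(linear(100) - linear(96), eased(100) - eased(96));
```

The eased gap is about twice the linear one at the centre and shrinks to less than a pixel at the edges, which is the trade-off this approach makes.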
Have a look at this demo. I'm creating 51 circles, evenly spaced from -100 to +100 (-100, -96, -92, -88, ... up to +100). You can see that they are pushed away from the center. If you use this scale with your data, you'll avoid crowding the data points around zero:
var data = d3.range(51).map(function(d) {
  return -100 + (d * 4);
});
var svg = d3.select("body")
  .append("svg")
  .attr("width", 600)
  .attr("height", 100);
var xScale = d3.scaleLinear()
  .domain([-100, 100])
  .range([20, 580])
  .interpolate(easeInterpolate(d3.easeQuadInOut));
svg.append("g")
  .attr("transform", "translate(0,70)")
  .call(d3.axisBottom(xScale));
svg.selectAll(null)
  .data(data)
  .enter()
  .append("circle")
  .attr("cx", function(d) {
    return xScale(d);
  })
  .attr("cy", 50)
  .attr("r", 4)
  .attr("fill", "teal");
function easeInterpolate(ease) {
  return function(a, b) {
    var i = d3.interpolate(a, b);
    return function(t) {
      return i(ease(t));
    };
  };
}
<script src="https://d3js.org/d3.v4.min.js"></script>
In case you ask, that last tick is not 80100. That's just the 80 tick overlapping with the 100 tick (the same thing happens with the -80 and the -100).
It is also worth noting that there is nothing wrong with using transformed scales: even though the transformation deforms or skews the chart, it is perfectly valid and does not lead to misinterpretation, as long as you inform the users about it.
I'm just starting out with D3 and am quickly understanding that it's a pretty low level tool.
I'm using D3 to produce a Marimekko chart using this great example by Mike Bostock on bl.ocks, which is in all honesty far too advanced a place for me to start, but I started using D3 because I need a Marimekko chart, so here I am.
The x-axis here has ticks, 0 to 100% with 10% intervals. If my understanding of these code excerpts is correct...
Set the x axis to a linear scale
var x = d3.scale.linear().range([0, width - 3 * margin]);
Give the x-axis 10 ticks
var xtick = svg.selectAll(".x").data(x.ticks(10))
In my use case, I'd like to have the x-axis ticks at the irregular intervals inherent to a Marimekko chart, and the axis labels to be the category rather than a percentage.
The desired behaviour, as far as x-axis labelling goes, is illustrated by this bl.ocks example by 'cool Blue'.
I've got as far as understanding that I need an ordinal scale rather than a linear one (as in this excerpt of cool Blue's code):
var padding = 0.1, outerPadding = 0.3,
x = d3.scale.ordinal().rangeRoundBands([0, width], padding, outerPadding);
How can I modify Mike Bostock's code so that the x-axis ticks label each column (ideally centered), instead of showing a percentage of the width?
I wouldn't say that D3 is that low level, since it has a lot of abstractions. However, D3 is not a charting tool (and, in that sense, it is low level...), which makes things hard for a beginner.
However, you're lucky: the changes needed here are minimal. First, you'll pass the correct data to the selection that generates the axis...
var xtick = svg.selectAll(".x")
  .data(segments)
  //etc..
... and then use the same math for the translate, but adding half the sum:
.attr("transform", function(d, i) {
  return "translate(" + x((d.offset + d.sum / 2) / sum) + "," + y(1) + ")";
});
Of course, you'll print the key, not a percentage:
xtick.text(function(d) {
  return d.key;
});
Here is the updated bl.ocks: https://bl.ocks.org/anonymous/09a8881e5bab2b12e7fd46c90a63b3fd/fd7b1a7b20f8436666f1544b6774778e748934ba
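The centering arithmetic can be checked without D3. The sketch below is only an illustration: the `segments` data is made up, and `x` is a hypothetical stand-in for the linear scale that maps a fraction in [0, 1] to pixels. The `offset`, `sum`, and `key` fields follow the names used in the snippets above.

```javascript
// Hypothetical data: each segment has a key and a sum (its column width in data units).
var segments = [
  { key: "A", sum: 50 },
  { key: "B", sum: 30 },
  { key: "C", sum: 20 }
];

// Total of all columns, and each column's running offset from the left.
var sum = segments.reduce(function (acc, d) { return acc + d.sum; }, 0);
segments.reduce(function (offset, d) {
  d.offset = offset;
  return offset + d.sum;
}, 0);

// Stand-in for the linear x scale: maps [0, 1] to [0, width] pixels.
var width = 400;
function x(fraction) { return fraction * width; }

// Each label sits at the column's left edge plus half its width.
var labels = segments.map(function (d) {
  return { key: d.key, px: x((d.offset + d.sum / 2) / sum) };
});
console.log(labels);
```

With these made-up numbers the labels land at the middle of each column, at roughly 100, 260, and 360 pixels.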
I'm trying to create a simple scatter plot in d3 (similar to this one from matplotlib):
I use extent() to set the scale's input domain range.
xScale.domain(d3.extent(xvalues));
Using this approach results in some dots overlapping the axes in the D3 plot:
How can I avoid this overlap and add a margin similar to the one in the matplotlib plot?
Input values vary, so simple increment / decrement of the extent() output doesn't look like a general solution.
In general, the best way of handling this is to call the scale's .nice() function, which will round the ends of the domain of the scale to nice values. In your particular case, this doesn't work, as the values are "nice" already.
In this case I would compute the extent of the domain and extend it by a fraction of that. For example:
var padding = (xScale.domain()[1] - xScale.domain()[0]) / 10;
xScale.domain([xScale.domain()[0] - padding, xScale.domain()[1] + padding]).nice();
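The same padding rule can be written as a small standalone function. This is a sketch under assumptions: `paddedExtent` and its `fraction` parameter are hypothetical names, and the result would still be passed to the scale's domain (optionally followed by .nice()) as in the lines above.

```javascript
// Hypothetical helper: widen [min, max] by `fraction` of its span on each side.
function paddedExtent(values, fraction) {
  var min = Math.min.apply(null, values);
  var max = Math.max.apply(null, values);
  var padding = (max - min) * fraction;
  return [min - padding, max + padding];
}

console.log(paddedExtent([0, 2, 5, 10], 0.1)); // [-1, 11]
```

Because the padding is proportional to the span, it adapts to varying inputs; note that it collapses to zero when all values are equal, so that case would need separate handling.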
In your matplotlib image, the dots do not overlap the axes because the x scale extends into negative values.
In D3:
var xScale = d3.scale.linear()
  .domain([
    d3.min(data, function(d) {
      return d.val;
    }) - 10, // widen the lower bound so values at zero do not sit on the axis
    d3.max(data, function(d) {
      return d.val;
    })
  ])
  .range([height, 0]);
I'm using Mike Bostock's example as a template and building on it. My bar chart here.
After the transition to the stacked version, I am unable to get the y positions of the bars right: taller bars cover the shorter ones, most likely because of the stack's valueOffset attribute. I have been stuck on this issue for a few days now.
Changes from Mike's example:
removed group labels in stacked chart
new y-axis y2 on linear scale. The domain for this axis is from 0 to the maximum of all the sums of values in each year which is 141.
defined new stack stack_year for relative positions of the bars.
Relevant code:
// y2 definition
y2.domain([0, d3.max(dataByGroup_year, function(d) { return d.year_wise_sum; })]).range([height, 0]);
// calculates sum of all wins per year
dataByGroup_year.forEach(function(d) {
  var order = d.values.map(function(d) { return d.value; });
  d.year_wise_sum = d3.sum(order);
});
function transitionStacked() {
  var t = svg.transition().duration(750),
      g = t.selectAll(".group").attr("transform", "translate(0," + y0(y0.domain()[0]) + ")");
  g.selectAll("rect").attr("x", function(d) { return x(d.year); })
    .attr("y", function(d) { return height - y2(d.valueOffset); })
    .attr("height", function(d) { return height - y2(d.value); });
  g.selectAll(".group-label").text("");
}
y0 is the ordinal scale used for multiple charts. y1 is the linear scale used for each chart in multiple charts.
Full HTML code at github
Data used: input file. I disabled tips for each bar.
Update: JSFIDDLE
Any help is much appreciated! Thank you
There were a number of issues here, which I've fixed up in this fiddle: http://jsfiddle.net/henbox/wL9x6cjk/4/
Part of the problem was the data itself (as per my comment above). There were some repeated values, which prevented valueOffset from being calculated correctly (using d3.layout.stack).
I've also changed how the y attribute of each rect is calculated in the transitionStacked function. I changed what you had:
.attr("y", function(d) {
  return height - y2(d.valueOffset);
})
to:
.attr("y", function (d) {
  return y2(d.value + d.valueOffset) - height;
})
Note that you need to sum d.value and d.valueOffset before applying the scale in order to calculate the top-left corner position of the rect. Additionally, you don't need to recalculate the x attribute, since it doesn't change between the two chart views, so I removed it.
I also removed the call to stack_year(dataByGroup_year). You don't need to build the stack layout here just to calculate the maximum sum per year.
Finally, I tidied up the y-axis positioning a bit so there's enough space for the x-axis labels, and simplified the positioning of group elements in the stacked view. I also moved the x-axis so that it is appended to svg rather than to a group, which simplified the positioning of elements.
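The point about summing the value and its offset before scaling can be illustrated without D3. This is a sketch under assumptions, not the fiddle's code: `makeY` is a hypothetical stand-in for a linear scale with domain [0, max] and range [height, 0], the data is made up, and the rects are positioned in plain, untranslated SVG coordinates (so there is no extra `- height` term here).

```javascript
// Hypothetical linear scale: domain [0, max] -> range [height, 0] (SVG y grows downward).
function makeY(max, height) {
  return function (v) { return height - (v / max) * height; };
}

var height = 200;
var y2 = makeY(100, height);

// Two stacked values in one year: the second sits on top of the first.
var bars = [
  { value: 30, valueOffset: 0 },
  { value: 20, valueOffset: 30 }
];

var rects = bars.map(function (d) {
  var top = y2(d.value + d.valueOffset); // scale the sum, not the parts
  var bottom = y2(d.valueOffset);        // where the bar below ends
  return { y: top, height: bottom - top };
});
console.log(rects);
```

Scaling the parts separately and then combining them goes wrong because the range is inverted: y2(0) is `height`, not 0, so scaled values do not add up the way the raw values do.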
I am having a bit of trouble scaling my graph according to the length of the bars. For example, in the jsfiddle, I can't draw a bar beyond a data point of size 25. I know one way to fix this would be to make the width and height of the body larger, but I was thinking that scaling the entire graph would be much better, so that one bar doesn't end up looking abnormally large.
http://jsfiddle.net/NkkDC/
I was thinking, I would have to scale the "y" function here, but I wasn't sure how.
bars.on("click", clickEvent)
  .transition().duration(2000).delay(200)
  .attr("y", function(d, i) { return i * 20; })
  .attr("width", x)
  .attr("height", 20);
Thanks in advance :)
The input domain of your xScale can change every time you add a new value (since you could add a new maximum), so the scale needs to be recalculated whenever the chart is re-rendered. Moving the declaration of the x-scale inside your render function should do the trick:
var xScale = d3.scale.linear()
  .domain([0, d3.max(data)])
  .range([0, 420]);
http://jsfiddle.net/NkkDC/1/
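The effect of rebuilding the scale on each render can be seen in a D3-free sketch. Here `makeXScale` is a hypothetical stand-in for the linear scale above (domain [0, max of data], range [0, 420]) and `render` only returns the bar widths instead of drawing them.

```javascript
// Hypothetical stand-in for d3.scale.linear().domain([0, max]).range([0, 420]).
function makeXScale(data) {
  var max = Math.max.apply(null, data);
  return function (v) { return (v / max) * 420; };
}

// Rebuild the scale on every render so a new maximum rescales all bars.
function render(data) {
  var xScale = makeXScale(data);
  return data.map(function (d) { return xScale(d); }); // bar widths in pixels
}

var data = [5, 10, 25];
console.log(render(data)); // the widest bar (25) fills the 420px range
data.push(50);
console.log(render(data)); // 50 now fills the range; 25 shrinks to half of it
```

If the scale were created once outside `render`, the new value 50 would map to 840px and overflow the chart.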
Say we have a y-scale that converts the data domain to rangebands that are 20px for each data element.
var y = d3.scale.ordinal()
  .domain(data)
  .rangeBands([0, 20 * data.length]);
I would normally use this scale to determine the y-coordinate of something in this manner:
svg.selectAll("rect")
  .data(data)
  .enter()
  .append("rect")
  .attr("y", y)
But in some tutorials, I've seen an alternative syntax.
.attr("y", function(d) { return y(d); })
I'm not sure what's going on here, and I would like to understand it before I move on with learning D3. I know y is a function, so the d within parenthesis would be an argument of that function. But we already specified the data input in the y function. I would love to read an explanation of what we're actually doing with y(d). What are the pros and cons of the two alternatives?
I think that's a minor difference that's maybe left over from copying and pasting other attrs that are set with an anonymous function (as in the second case). There should be no reason to wrap y in an anonymous function.
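To make the equivalence concrete: setting the domain only configures the scale once, while y(d) looks up the position of one datum. When a function is given as an attribute value, it is called for each element with the datum (and its index), so passing y directly and wrapping it give the same result. The sketch below mimics that behaviour in plain JavaScript; `makeBandScale` and `computeAttr` are hypothetical stand-ins, not D3 functions.

```javascript
// Hypothetical stand-in for an ordinal band scale with fixed-height bands.
function makeBandScale(domain, bandHeight) {
  return function (d) { return domain.indexOf(d) * bandHeight; };
}

// Mimics how a function attribute value is invoked: with the datum and its index.
function computeAttr(data, fn) {
  return data.map(function (d, i) { return fn(d, i); });
}

var data = ["a", "b", "c"];
var y = makeBandScale(data, 20);

var direct = computeAttr(data, y);                               // like .attr("y", y)
var wrapped = computeAttr(data, function (d) { return y(d); });  // like the wrapped form
console.log(direct, wrapped); // both [0, 20, 40]
```

The wrapper only matters when the function being passed would interpret the extra index argument, or when the value needs further computation, such as adding an offset to y(d).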