dc.js - how to create a row chart from multiple columns

I need to create a row chart in dc.js with inputs from multiple columns in a CSV. That is, I need to map each column to a row, with each column's total as the row value.
There may be an obvious solution to this, but I can't seem to find any examples.
many thanks
S
update:
Here's a quick sketch. Apologies for the standard
Row chart;
column1 ----------------- 64 (total of column 1)
column2 ------- 35 (total of column 2)
column3 ------------ 45 (total of column 3)

Interesting problem! It sounds somewhat similar to a pivot, requested for crossfilter here. A solution comes to mind using "fake groups" and "fake dimensions", however there are a couple of caveats:
it will reflect filters on other dimensions
but, you will not be able to click on the rows in the chart in order to filter anything else (because what records would it select?)
The fake group constructor looks like this:
function regroup(dim, cols) {
    var _groupAll = dim.groupAll().reduce(
        function(p, v) { // add
            cols.forEach(function(c) {
                p[c] += v[c];
            });
            return p;
        },
        function(p, v) { // remove
            cols.forEach(function(c) {
                p[c] -= v[c];
            });
            return p;
        },
        function() { // init
            var p = {};
            cols.forEach(function(c) {
                p[c] = 0;
            });
            return p;
        });
    return {
        all: function() {
            // or _.pairs, anything to turn the object into an array
            return d3.map(_groupAll.value()).entries();
        }
    };
}
What it is doing is reducing all the rows to a single object that holds one running total per requested column, and then turning that object into the array format dc.js expects group.all to return.
You can pass any arbitrary dimension to this constructor - it doesn't matter what it's indexed on because you can't filter on these rows... but you probably want it to have its own dimension so it's affected by all other dimension filters. Also give this constructor an array of columns you want turned into groups, and use the result as your "group".
E.g.
var dim = ndx.dimension(function(r) { return r.a; });
var sidewaysGroup = regroup(dim, ['a', 'b', 'c', 'd']);
Full example here: https://jsfiddle.net/gordonwoodhull/j4nLt5xf/5/
(Notice how clicking on the rows in the chart results in badness, because, what is it supposed to filter?)
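The object-to-array step inside all() can be sketched on its own, without crossfilter or d3. This is a minimal sketch, assuming the reduced value is a plain object of column totals; toKeyValueArray is a hypothetical helper name (not part of dc.js or crossfilter), and the totals are the ones from the sketch in the question.

```javascript
// Hypothetical helper: turn an object of column totals into the
// [{key, value}, ...] array shape that a dc.js chart reads from group.all().
function toKeyValueArray(totals) {
    return Object.keys(totals).map(function (k) {
        return { key: k, value: totals[k] };
    });
}

// Totals taken from the sketch in the question.
var totals = { column1: 64, column2: 35, column3: 45 };
var rows = toKeyValueArray(totals);
console.log(rows);
```

Inside regroup, all() could presumably return toKeyValueArray(_groupAll.value()) instead, which would avoid the dependency on d3.map.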

Are you looking for stacked row charts? For example, this chart has each row represent a category and each color represents a sub-category:
Unfortunately, this feature is not yet supported in dc.js. The feature request is at https://github.com/dc-js/dc.js/issues/397. If you are willing to wade into some non-library code, you could check out the examples referenced in that issue.
Alternatively, you could use a stackable bar chart. This link seems to have a good description of how this works: http://www.solinea.com/blog/coloring-dcjs-stacked-bar-charts

Related

crossfilter.js - Histogram with custom reduce function fails in filtering data

following situation: I have two plots, one scatterplot and one histogram for the x-values in this scatterplot. I wrote a custom reduce function that looks similar to this:
let grouping = this._cf_dimensions[attribute].group().reduce(
    function(elements, item) { // add
        elements.items.push(item);
        elements.count++;
        return elements;
    },
    function(elements, item) { // remove
        // console.log("item.id = " + item.id);
        let match = false;
        for (let i = 0; i < elements.items.length && !match; i++) {
            // Compare hyperparameter signature.
            if (item.id === elements.items[i].id) {
                match = true;
                elements.items.splice(i, 1);
                elements.count--;
            }
        }
        return elements;
    },
    function() { // init
        return {items: [], count: 0};
    }
);
The problem: When I select points in my scatterplot, the correlating histogram does not update properly. I traced it back to the remove function, i.e. the second of the three functions above, being called for only one of my five groups (I checked by comparing the length of elements with the original group size). That means that the items to be removed won't necessarily be found.
In other words, the scatterplot selects the correct set of datapoints, but the remove function in the bar chart grouping shown above, while registering the incoming filter update, is not called for all groups of this grouping (equivalently: not called for all bars in the bar chart).
I'm a bit at a loss, since I seem to remember successfully implementing dashboards with dc.js and crossfilter.js in the past exactly like this. Do I misunderstand something about the custom reduce concept or is there something obvious I'm overlooking?
Thanks!
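One way to narrow this down is to exercise the three reducers in isolation, outside crossfilter, to see whether the remove logic itself behaves as intended. The following is a minimal sketch, assuming each record carries a unique id; reduceInitial, reduceAdd and reduceRemove are hypothetical names for the three functions above, and the search loop is swapped for Array.prototype.findIndex.

```javascript
// init: an empty bin
function reduceInitial() {
    return { items: [], count: 0 };
}

// add: a record enters the bin
function reduceAdd(elements, item) {
    elements.items.push(item);
    elements.count++;
    return elements;
}

// remove: a record leaves the bin; records not present are ignored
function reduceRemove(elements, item) {
    var idx = elements.items.findIndex(function (e) {
        return e.id === item.id;
    });
    if (idx !== -1) {
        elements.items.splice(idx, 1);
        elements.count--;
    }
    return elements;
}

var bin = reduceInitial();
reduceAdd(bin, { id: 'a', x: 1 });
reduceAdd(bin, { id: 'b', x: 2 });
reduceRemove(bin, { id: 'a', x: 1 });
reduceRemove(bin, { id: 'zzz', x: 9 }); // not in this bin: no change
console.log(bin.count, bin.items);
```

If the reducers behave correctly on their own, the remaining question is which records crossfilter passes to remove for each group, which this sketch does not cover.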

Highcharts - gap between series in stacked area chart

I've created a stacked area chart in Highcharts, which you can see in the image below and in the following jsfiddle: http://jsfiddle.net/m3dLtmoz/
I have a workaround for the gaps you see, which is to group the data for each series by month so that each series looks something like this instead:
series: [{
data: [
[1464739200000,2471],
[1467331200000,6275],
[1470009600000,2574],
[1472688000000,7221],
[1475280000000,3228]
]}
]
While the above isn't exactly what I'm going for, the way the series above is structured does give me what I ultimately want, which is this:
I'm really dying to know why the original setup isn't working appropriately, however. I've tested other instances where datetimes group and aggregate properly based on a single datetime x axis value. I'm stumped as to why this particular data set isn't working. I've tried using the dataGrouping option in the Highstock library, but wasn't able to integrate that effectively. I've messed with options as far as tickInterval goes to no avail. I tried setting the stacking: 'normal' option in each series instead of in plotOptions, but that made no difference. I've seen issues on GitHub dealing with stacked area charts, but nothing seems to exactly match up with what I'm seeing. Any help is appreciated - thank you much!
You receive an error in the console. Most series types require data to be sorted in ascending order of x. Stacking has nothing to do with it, see example.
Series types which do not require sorted data are scatter and polygon. No error in scatter
You should sort and group the points on your own. If you want to group them by month, you have to prepare the data before you put it in a chart. The example below averages the values that share the same datetime.
function groupData(unsortedData) {
    // Sort a copy by x value (timestamp) so the input is left untouched.
    var data = unsortedData.slice();
    data.sort(function (a, b) {
        return a[0] - b[0];
    });
    if (data.length === 0) {
        return [];
    }
    var i = 1,
        len = data.length,
        den = 1,          // number of points sharing the current x
        sum = data[0][1], // running sum of their y values
        groupedData = [];
    for (; i < len; i++) {
        if (data[i - 1][0] === data[i][0]) {
            sum += data[i][1];
            den++;
        } else {
            // x changed: emit the average for the previous x
            groupedData.push([data[i - 1][0], sum / den]);
            den = 1;
            sum = data[i][1];
        }
    }
    // emit the last group
    groupedData.push([data[i - 1][0], sum / den]);
    return groupedData;
}
example: http://jsfiddle.net/e4enhw9a/1/
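Grouping by month, as in the question's workaround, can be prepared the same way before the data reaches the chart. This is a minimal sketch, assuming points are [timestampMs, value] pairs and that values within a month should be summed rather than averaged; groupByMonth is a hypothetical helper, not a Highcharts function.

```javascript
// Hypothetical helper: sum [timestampMs, value] points per UTC calendar month.
function groupByMonth(points) {
    var totals = new Map(); // month start (ms since epoch, UTC) -> summed value
    points.forEach(function (p) {
        var d = new Date(p[0]);
        var monthStart = Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), 1);
        totals.set(monthStart, (totals.get(monthStart) || 0) + p[1]);
    });
    // Return [x, y] pairs sorted ascending by x.
    return Array.from(totals.entries()).sort(function (a, b) {
        return a[0] - b[0];
    });
}

var monthly = groupByMonth([
    [1467331200000, 2],            // 2016-07-01 UTC
    [1464739200000 + 86400000, 5], // 2016-06-02 UTC
    [1464739200000, 1]             // 2016-06-01 UTC
]);
console.log(monthly);
```

The result is already in ascending x order, so it can be passed as a series' data array.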

How do I get a Google visualization Table to sort on formatted cell values when present?

I'm using numerous Google visualization Tables to display a variety of data in a single-paged, multi-tabbed web app. Some of these tables have columns that use formatted values. When a user clicks on the column header for these columns, the data is sorted according to the sort order of the underlying values rather than the sort order of the formatted values. This results in some cases in the data not appearing to the user to be sorted.
I'm looking for a way to fix this that can be reused across all my Tables, preferably by writing one event listener that can be attached to each Table and will handle all of the specifics regardless of the data types used in the various columns of the various associated DataTables, and regardless of whether any given column uses formatted values. I do have the condition that if any cell in a column uses a formatted value, then all of the cells in that column do, but I do not have the condition that every column uses formatted values. If a column does not use formatted values, then I want the normal sort based on the type of data in the column (e.g., number, string, date, etc.).
As an example, I could have a DataTable like the following. (This is not actual data, but it mirrors situations that are problematic for me.)
var dataTable = new google.visualization.DataTable({
cols: [
{type: 'number', label: 'ID'},
{type: 'number', label: 'Places'},
{type: 'string', label: 'Things'}
],
rows: [
{c:[{v:1},{v:102735,f:'Harbor Place'},{v:'pet',f:'Dalmation'}]},
{c:[{v:2},{v:163848,f:'Alphaville'},{v:'my',f:'Top Favorites'}]},
{c:[{v:3},{v:113787,f:'Beta City'},{v:'ten',f:'Best Things'}]}
]
});
Without my doing any special configuration, this table, if sorted ascending on the ID column, would look like this, as the user would expect:
ID   Places         Things
----------------------------------
1    Harbor Place   Dalmation
2    Alphaville     Top Favorites
3    Beta City      Best Things
If the user clicks on the header for the Places column, the table then looks like this, because the ascending sort is performed on the underlying numeric values:
ID   Places         Things
----------------------------------
1    Harbor Place   Dalmation
3    Beta City      Best Things
2    Alphaville     Top Favorites
The user, however, expects the table to look like this:
ID   Places         Things
----------------------------------
2    Alphaville     Top Favorites
3    Beta City      Best Things
1    Harbor Place   Dalmation
If the user clicks on the header for the Things column, the table would sort ascending as follows, because of the sort order of the underlying string values:
ID   Places         Things
----------------------------------
2    Alphaville     Top Favorites
1    Harbor Place   Dalmation
3    Beta City      Best Things
The user is expecting this:
ID   Places         Things
----------------------------------
3    Beta City      Best Things
1    Harbor Place   Dalmation
2    Alphaville     Top Favorites
I've searched on StackOverflow and on Google and haven't found any discussion of this particular situation. So I've been working on writing my own event listener, but it quickly became very involved when trying to write it as something that can be reused for any of my Tables. So am I missing something simple? Has anyone else had a situation like this and resolved it; if so, what did you do?
you can use the DataView Class to set the row order
dataView.setRows([1, 2, 0]);
then draw the chart with the DataView
table.draw(dataView, options);
getting the order of the rows during the 'sort' event is the tricky part,
since there aren't any convenience methods for formatted values, such as getDistinctValues
in the following working snippet, the formatted values are extracted into an array,
sorted, then getFilteredRows is used to get the row index of the formatted value
google.charts.load('current', {
callback: function () {
var dataTable = new google.visualization.DataTable({
cols: [
{type: 'number', label: 'ID'},
{type: 'number', label: 'Places'},
{type: 'string', label: 'Things'}
],
rows: [
{c:[{v:1},{v:102735,f:'Harbor Place'},{v:'pet',f:'Dalmation'}]},
{c:[{v:2},{v:163848,f:'Alphaville'},{v:'my',f:'Top Favorites'}]},
{c:[{v:3},{v:113787,f:'Beta City'},{v:'ten',f:'Best Things'}]}
]
});
// use DataView to set order of rows
var dataView = new google.visualization.DataView(dataTable);
dataView.setRows([1, 2, 0]);
var table = new google.visualization.Table(document.getElementById('chart_div'));
var options = {
// use event to set order
sort: 'event',
// set column arrow
sortColumn: 1,
sortAscending: true
};
google.visualization.events.addListener(table, 'sort', function(props) {
var sortValues = [];
var sortRows = [];
var sortDirection = (props.ascending) ? 1 : -1;
// load values into sortable array
for (var i = 0; i < dataTable.getNumberOfRows(); i++) {
sortValues.push({
v: dataTable.getValue(i, props.column),
f: dataTable.getFormattedValue(i, props.column)
});
}
sortValues.sort(function (row1, row2) {
return row1.f.localeCompare(row2.f) * sortDirection;
});
// set row indexes
sortValues.forEach(function (sortValue) {
sortRows.push(dataTable.getFilteredRows([{column: props.column, value: sortValue.v}])[0]);
});
// use DataView to set order of rows
dataView.setRows(sortRows);
// set column arrow
options.sortColumn = props.column;
options.sortAscending = props.ascending;
table.draw(dataView, options);
});
table.draw(dataView, options);
},
packages: ['table']
});
<script src="https://www.gstatic.com/charts/loader.js"></script>
<div id="chart_div"></div>
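The ordering step in the 'sort' listener can also be sketched in plain JavaScript by carrying each row's index along with its formatted value. This is a minimal sketch under that assumption; rowOrderByFormatted is a hypothetical helper, not part of the Google Visualization API, and it sidesteps the getFilteredRows lookup, which would pick the same first match twice if two rows shared an underlying value.

```javascript
// Hypothetical helper: given the formatted values of one column (in row order),
// return the row indexes sorted by formatted value.
function rowOrderByFormatted(formattedValues, ascending) {
    var direction = ascending ? 1 : -1;
    return formattedValues
        .map(function (f, rowIndex) {
            return { f: f, rowIndex: rowIndex };
        })
        .sort(function (a, b) {
            return a.f.localeCompare(b.f) * direction;
        })
        .map(function (entry) {
            return entry.rowIndex;
        });
}

// Formatted values of the Places column from the example table.
var places = ['Harbor Place', 'Alphaville', 'Beta City'];
var ascendingOrder = rowOrderByFormatted(places, true);
var descendingOrder = rowOrderByFormatted(places, false);
console.log(ascendingOrder, descendingOrder);
```

In the listener, the formatted values would come from dataTable.getFormattedValue(i, props.column), and the returned array would be passed to dataView.setRows.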

First row in D3 DataTable is sorted incorrectly - D3/DC/Crossfilter

I have some basic d3/dc/crossfilter code that is used to render a data table. The table should be sorted by column N (which is simply the row number). Whenever the table is rendered in the browser, some random row is the top row, then the second row shows the highest N, followed by the next highest, and so on as expected. I initially thought this error might be due to Chrome's sort being unstable, but given that all the values of column N are unique, this shouldn't be an issue.
my code:
function makeGraphs(error, trades) {
    trades.forEach(function(d) {
        d['N'] = +d['N']
    })
    var tradeTable = dc.dataTable("#dc-table-graph");
    var trades = crossfilter(trades);
    var NDimension = trades.dimension(function (d) {
        return d['N']});
    tradeTable.width(960).height(800)
        .dimension(NDimension)
        .group(function(d) { return "trades"
        })
        .size(110)
        .columns([
            function(d) { return d.N; },
            function(d) { return d['Profit']; },
        ])
        .sortBy(function(d){ return -d.N; })
        .order(d3.descending)
    dc.renderAll();
};
my output looks as follows
N Profit
54 .56
107 .36
106 .33
105 .25
104 .21
all the way down to N=1
Obviously the first row shouldn't be 54, yet for some reason it is.
Any ideas?
Thanks
I built a working example of this, but am unable to recreate the problem: http://bl.ocks.org/esjewett/5d84984dd8436542bb33
Please feel free to fork the block at http://blockbuilder.org/esjewett/5d84984dd8436542bb33 and try to recreate the problem for us.
I would note that
trades.forEach(function(d) {
d['N'] = +d['N']
})
doesn't seem to me to affect the sorting, because the sortBy function uses -d.N, which coerces the value to a number anyway.
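The coercion mentioned above is easy to check in isolation. A minimal sketch, assuming N arrives from the CSV as a string; the row object is made up for illustration.

```javascript
// A row as it might look straight after CSV parsing: N is still a string.
var row = { N: '54' };

// Unary minus converts its operand to a number before negating it.
var sortKey = -row.N;
console.log(typeof sortKey, sortKey);

// A non-numeric string coerces to NaN instead.
var badKey = -'abc';
console.log(Number.isNaN(badKey));
```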

Optimising a group of dc.js line graphs

I have a group of graphs visualizing a bunch of data for me (here), based off a csv with approximately 25,000 lines of data, each having 12 parameters. However, doing any interaction (such as selecting a range with the brush on any of the graphs) is slow and unwieldy, completely unlike the dc.js demo found here, which deals with thousands of records as well but maintains smooth animations, or crossfilter's demo here which has 10 times as many records (flights) as I do.
I know the main resource hogs are the two line charts, since they have data points every 15 minutes for about 8 solid months. Removing either of them makes the charts responsive again, but they're the main feature of the visualizations, so is there any way I can make them show less fine-grained data?
The code for the two line graphs specifically is below:
var lineZoomGraph = dc.lineChart("#chart-line-zoom")
.width(1100)
.height(60)
.margins({top: 0, right: 50, bottom: 20, left: 40})
.dimension(dateDim)
.group(tempGroup)
.x(d3.time.scale().domain([minDate,maxDate]));
var tempLineGraph = dc.lineChart("#chart-line-tempPer15Min")
.width(1100).height(240)
.dimension(dateDim)
.group(tempGroup)
.mouseZoomable(true)
.rangeChart(lineZoomGraph)
.brushOn(false)
.x(d3.time.scale().domain([minDate,maxDate]));
Separate but relevant question; how do I modify the y-axis on the line charts? By default they don't encompass the highest and lowest values found in the dataset, which seems odd.
Edit: some code I wrote to try to solve the problem:
var graphWidth = 1100;
var dataPerPixel = data.length / graphWidth;
var tempGroup = dateDim.group().reduceSum(function(d) {
if (d.pointNumber % Math.ceil(dataPerPixel) === 0) {
return d.warmth;
}
});
d.pointNumber is a unique point ID for each data point, cumulative from 0 to 22 thousand ish. Now however the line graph shows up blank. I checked the group's data using tempGroup.all() and now every 21st data point has a temperature value, but all the others have NaN. I haven't succeeded in reducing the group size at all; it's still at 22 thousand or so. I wonder if this is the right approach...
Edit 2: found a different approach. I create the tempGroup normally but then create another group which filters the existing tempGroup even more.
var tempGroup = dateDim.group().reduceSum(function(d) { return d.warmth; });
var filteredTempGroup = {
all: function () {
return tempGroup.top(Infinity).filter( function (d) {
if (d.pointNumber % Math.ceil(dataPerPixel) === 0) return d.value;
} );
}
};
The problem I have here is that d.pointNumber isn't accessible so I can't tell if it's the Nth data point (or a multiple of that). If I assign it to a var it'll just be a fixed value anyway, so I'm not sure how to get around that...
When dealing with performance problems with d3-based charts, the usual culprit is the number of DOM elements, not the size of the data. Notice the crossfilter demo has lots of rows of data, but only a couple hundred bars.
It looks like you might be attempting to plot all the points instead of aggregating them. I guess since you are doing a time series it may be unintuitive to aggregate the points, but consider that your plot can only display 1100 points (the width), so it is pointless to overwork the SVG engine plotting 25,000.
I'd suggest bringing it down to somewhere between 100-1000 bins, e.g. by averaging each day:
var daysDim = data.dimension(function(d) { return d3.time.day(d.time); });
// _.isLegitNumber is not a built-in underscore/lodash function; it is assumed
// to be a user-defined mixin that returns true for finite numbers.
function reduceAddAvg(attr) {
    return function(p,v) {
        if (_.isLegitNumber(v[attr])) {
            ++p.count;
            p.sums += v[attr];
            p.averages = (p.count === 0) ? 0 : p.sums/p.count; // guard against dividing by zero
        }
        return p;
    };
}
function reduceRemoveAvg(attr) {
    return function(p,v) {
        if (_.isLegitNumber(v[attr])) {
            --p.count;
            p.sums -= v[attr];
            p.averages = (p.count === 0) ? 0 : p.sums/p.count;
        }
        return p;
    };
}
function reduceInitAvg() {
    return {count:0, sums:0, averages:0};
}
...
// average a parameter (column) named "param"
var daysGroup = daysDim.group().reduce(reduceAddAvg('param'), reduceRemoveAvg('param'), reduceInitAvg);
(reusable average reduce functions from the FAQ)
Then specify your xUnits to match, and use elasticY to auto-calculate the y axis:
chart.xUnits(d3.time.days)
.elasticY(true)
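The averaging reducers can be tried out on their own before wiring them into a group. This is a minimal sketch, assuming a plain function stands in for the _.isLegitNumber mixin and that the averaged field is warmth, as in the question; isLegitNumber here is a hypothetical helper.

```javascript
// Assumed stand-in for the _.isLegitNumber mixin: true for finite numbers only.
function isLegitNumber(x) {
    return typeof x === 'number' && isFinite(x);
}

function reduceAddAvg(attr) {
    return function (p, v) {
        if (isLegitNumber(v[attr])) {
            ++p.count;
            p.sums += v[attr];
            p.averages = (p.count === 0) ? 0 : p.sums / p.count;
        }
        return p;
    };
}

function reduceRemoveAvg(attr) {
    return function (p, v) {
        if (isLegitNumber(v[attr])) {
            --p.count;
            p.sums -= v[attr];
            p.averages = (p.count === 0) ? 0 : p.sums / p.count;
        }
        return p;
    };
}

function reduceInitAvg() {
    return { count: 0, sums: 0, averages: 0 };
}

// Simulate one daily bin receiving three records, one of them invalid.
var add = reduceAddAvg('warmth');
var remove = reduceRemoveAvg('warmth');
var dayBin = reduceInitAvg();
add(dayBin, { warmth: 10 });
add(dayBin, { warmth: 20 });
add(dayBin, { warmth: NaN }); // skipped by the guard
var afterAdds = { count: dayBin.count, averages: dayBin.averages };
remove(dayBin, { warmth: 10 }); // as if a filter excluded this record
console.log(afterAdds, dayBin);
```

Because each bin's value is an object rather than a number, the chart also needs to be told which field to plot, for example with chart.valueAccessor(function(d) { return d.value.averages; }).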
