ChartJS difficulty with X Axis and pulling appropriate data - javascript

I'm having some difficulty iterating over some JSON data and displaying the proper X axis (date/time). I've tried several times to loop through the JSON data and extract the time for each worker in "history", but without success.
Here is a sample of the JSON - https://github.com/devdevdevdev1/rvnpoolcharts/blob/main/samplejson.json
And here is my code - https://github.com/devdevdevdev1/rvnpoolcharts/blob/main/worker_stats.js
To get it to work (somewhat), I've just thrown 48 random numbers as points on the x-axis:
displayWorkerHashrateGraph(Array.from(Array(48).keys()), dataset, label);
but obviously there are several problems with this, such as if a new worker joins after one has already been reporting, it starts at 1 and doesn't line up with the other workers.
You can see here that there are no dates/times on the x axis, and the label says 78 instead of a date label.
Current Graph
I've also tried to loop through the list of each worker and extract the time like so:
let labelsList = [];
let historyList = [];
// Begin history loop
for (var w in workerData.history) {
    var worker = getWorkerNameFromAddress(w);
    var a = {
        key: worker,
        hashrate: []
    };
    for (var wh in workerData.history[w]) {
        a.hashrate.push([workerData.history[w][wh].time * 1000, workerData.history[w][wh].hashrate]);
        labelsList.push([w, workerData.history[w][wh].time * 1000]);
    }
    historyList.push(a);
}
But that ends up looking like this: it seems it's taking the time for all three of them, but not lining up with the hashrate, and all the values are pushed to the left.
What I'm trying to achieve:
Pull the time value for each worker iteration from the history element, regardless of when a worker starts reporting.
Mark the hashrate on the graph at the proper date/time
Only display the last hour of stats (BONUS: a way to zoom out to see all 12 hours of stats)
I'd really appreciate any help you can give. I'm a bit new to JavaScript, I've tried many times to achieve the above, and I've seriously been banging my head against a wall. Thanks in advance!
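For what it's worth, the usual approach with Chart.js is to emit {x, y} points keyed by real timestamps and let a time scale place them, so late-joining workers still line up. A minimal sketch, assuming the history is shaped like the linked sample JSON (worker address mapping to an array of {time, hashrate} entries, which is an assumption here):

```javascript
// Mock data shaped like the sample JSON (an assumption): one array of
// {time, hashrate} samples per worker. worker2 joins one interval late.
const workerData = {
  history: {
    "addr1.worker1": [
      { time: 1600000000, hashrate: 50 },
      { time: 1600000600, hashrate: 55 }
    ],
    "addr1.worker2": [
      { time: 1600000600, hashrate: 20 }
    ]
  }
};

function buildDatasets(history) {
  return Object.entries(history).map(([worker, samples]) => ({
    label: worker,
    // {x, y} points let Chart.js place each sample at its real time,
    // instead of at a sequential array index
    data: samples.map(s => ({ x: s.time * 1000, y: s.hashrate }))
  }));
}

const datasets = buildDatasets(workerData.history);
```

With datasets like these you'd configure the x axis as a time scale (in Chart.js 3+ that needs a date adapter such as chartjs-adapter-date-fns), set its `min` to `Date.now() - 3600e3` to show only the last hour, and use chartjs-plugin-zoom to zoom back out to the full 12 hours.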

Related

Get visible points for a series in LightningChartJs

Is there a function in LightningChart JS to get all visible points from a line or point series in a chart?
If I zoom the chart, I want to show something when no visible points are available. In some cases I have breaks in my data.
For now I have to check the range and filter all points within this range, but that doesn't seem very performant. I guess LC is aware of all the visible points and could give me that.
I would very much welcome any thoughts on the subject or other solutions. Thanks.
LightningChart JS doesn't track the data points that are visible at any time. So the method that you have used to solve the issue is the best way currently.
Something like this seems to be reasonably performant.
function getDataInRange(data, rangeStart, rangeEnd) {
    const inRangeData = []
    const dataLength = data.length
    let curPoint
    for (let i = 0; i < dataLength; i += 1) {
        curPoint = data[i]
        if (curPoint.x >= rangeStart && curPoint.x <= rangeEnd) {
            inRangeData.push(curPoint)
        }
    }
    return inRangeData
}
On my personal machine it can process 1 million points in ~10ms ± 2ms. If you only want to know that a point is visible in the range then you could just break the loop as soon as a single point is in the visible range.
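The early-exit variant might look like this (a sketch, assuming the same {x, y} point shape as above):

```javascript
// Returns true as soon as any point falls inside [rangeStart, rangeEnd],
// so the loop rarely needs to scan the whole array.
function hasPointInRange(data, rangeStart, rangeEnd) {
  for (let i = 0; i < data.length; i += 1) {
    const p = data[i];
    if (p.x >= rangeStart && p.x <= rangeEnd) return true;
  }
  return false;
}

const points = [{ x: 1 }, { x: 5 }, { x: 9 }];
```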
Late to the game but for anybody googling:
If you already have a chart defined and it happens to be named 'chart' (otherwise change chart to your chart's object name), you can track the visible start and end data points like this:
axisX = chart.getDefaultAxisX()
window.axisXScaleChangeToken = axisX.onScaleChange((s, e) => {
    window.axisXVisibleDataRangeStart = s
    window.axisXVisibleDataRangeEnd = e
})
let visiblePoints = [];
for (let i of cur.data) {
    if (i[0] > window.axisXVisibleDataRangeStart && i[0] < window.axisXVisibleDataRangeEnd) visiblePoints.push(i)
}
Every time the X axis is scaled/zoomed/moved, axisXVisibleDataRangeStart and axisXVisibleDataRangeEnd will change. You're then iterating over where your data points are stored (cur.data in my case and the example) and comparing: If timestamp is within range, push to visiblePoints.
(I am using OHLC where data[0] is the timestamp. Your comparison might be to an object array where {x:} is the value you're looking to compare. You get the idea.)
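For object-shaped points, the same range check can be written as a one-line filter (a sketch with made-up points, assuming `{x, y}` objects rather than OHLC arrays):

```javascript
const points = [
  { x: 100, y: 1 },
  { x: 200, y: 2 },
  { x: 300, y: 3 }
];

// Stand-ins for the visible-range values captured in onScaleChange
const start = 150, end = 250;

// Keep only points whose timestamp falls inside the visible range
const visiblePoints = points.filter(p => p.x > start && p.x < end);
```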
To remove the listener and stop the logging:
axisX.offScaleChange(window.axisXScaleChangeToken)

Why is my reducer behaving differently between the first filter and subsequent filters applied in dc.js?

I'm working on a data visualization that has an odd little bug:
It's a little tricky to see, but essentially, when I click on a point in the line chart, that point corresponds to a specific issue of a magazine. The choropleth updates to reflect geodata for that issue, but, critically, the geodata is for a sampled period that corresponds to the issue. Essentially, the choropleth will look the same for any issue between January-June or July-December of a given year.
As you can see, I have a key called Sampled Issue Date (for Geodata), and the value should be the date of the issue the geodata is based on (basically, they would get the geographical distribution for one specific issue and call it representative of ALL data in a six-month period). Yet, when I initially click on an issue, I'm always getting the last sampled date in my data. All of the geodata is correct, and, annoyingly, all subsequent clicks display the correct information. So it's only that first click (after refreshing the page OR clearing an issue) that I have a problem with.
Honestly, my code is a nightmare right now because I'm focused on debugging, but you can see my reducer for the remove function on GitHub which is also copy/pasted below:
// Reducer function for raw geodata
function geoReducerAdd(p, v) {
    // console.log(p.sampled_issue_date, v.sampled_issue_date, state.periodEnding, state.periodStart)
    ++p.count
    p.sampled_mail_subscriptions += v.sampled_mail_subscriptions
    p.sampled_single_copy_sales += v.sampled_single_copy_sales
    p.sampled_total_sales += v.sampled_total_sales
    p.state_population = v.state_population // only valid for population viz
    p.sampled_issue_date = v.sampled_issue_date
    return p
}
function geoReducerRemove(p, v) {
    const currDate = new Date(v.sampled_issue_date)
    // if(currDate.getFullYear() === 1921) {
    //     console.log(currDate)
    // }
    currDate <= state.periodEnding && currDate >= state.periodStart ? console.log(v.sampled_issue_date, p.sampled_issue_date) : null
    const dateToRender = currDate <= state.periodEnding && currDate >= state.periodStart ? v.sampled_issue_date : p.sampled_issue_date
    --p.count
    p.sampled_mail_subscriptions -= v.sampled_mail_subscriptions
    p.sampled_single_copy_sales -= v.sampled_single_copy_sales
    p.sampled_total_sales -= v.sampled_total_sales
    p.state_population = v.state_population // only valid for population viz
    p.sampled_issue_date = dateToRender
    return p
}
// generic georeducer
function geoReducerDefault() {
    return {
        count: 0,
        sampled_mail_subscriptions: 0,
        sampled_single_copy_sales: 0,
        sampled_total_sales: 0,
        state_population: 0,
        sampled_issue_date: ""
    }
}
The problem could be somewhere else, but I don't think it's a crossfilter issue (I'm not running into the "two groups from the same dimension" problem for sure) and adding additional logic to the add reducer makes things even less predictable (understandably - I don't ever really need to render the sample date for all values anyway.) The point of this is that I'm completely lost about where the flaw in my logic is, and I'd love some help!
EDIT: Note that the reducers are for the reduce method on a dc.js dimension, not the native javascript reducer! :D
Two crossfilters! Always fun to see that... but it can be tricky because nothing in dc.js directly supports that, except for the chart registry. You're on your own for filtering between different chart groups, and it can be tricky to map between data sets with different time resolutions and so on.
The problem
As I understand your app, when a date is selected in the line chart, the choropleth and accompanying text should have exactly one row from the geodata dataset selected per state.
The essential problem is that Crossfilter is not great at telling you which rows are in any given bin. So even though there's just one row selected, you don't know what it is!
This is the same problem that makes minimum, maximum, and median reductions surprisingly complicated. You often end up building new data structures to capture what crossfilter throws away in the name of efficiency.
A general solution
I'll go with a general solution that's more than you need, but it can be helpful in similar situations. The only alternative I know of is to go completely outside crossfilter and look in the original dataset. That's fine too, and maybe more efficient. But it can be buggy, and it's nice to work within the system.
So let's keep track of which dates we've seen per bin. When we start out, every bin will have all the dates. Once a date is selected, there will be only one date (but not exactly the one that was selected, because of your two-crossfilter setup).
Instead of the sampled_issue_date stuff, we'll keep track of an object called date_counts now:
// Reducer function for raw geodata
function geoReducerAdd(p, v) {
    // ...
    const canonDate = new Date(v.sampled_issue_date).getTime()
    p.date_counts[canonDate] = (p.date_counts[canonDate] || 0) + 1
    return p
}
function geoReducerRemove(p, v) {
    // ...
    const canonDate = new Date(v.sampled_issue_date).getTime()
    if (!--p.date_counts[canonDate])
        delete p.date_counts[canonDate]
    return p
}
// generic georeducer
function geoReducerDefault() {
    return {
        // ...
        date_counts: {}
    }
}
What does it do?
Line by line
const canonDate = new Date(v.sampled_issue_date).getTime()
Maybe this is paranoid, but this canonicalizes the input dates by converting them to the number of milliseconds since 1970. I'm sure you'd be safe using the string dates directly, but who knows, there could be a stray space or a zero or something.
Also, you can't index an object with a Date object; you have to convert it to an integer.
p.date_counts[canonDate] = (p.date_counts[canonDate] || 0) + 1
When we add a row, we'll check if we currently have a count for the row's date. If so, we'll use the count we have. Otherwise we'll default to zero. Then we'll add one.
if(!--p.date_counts[canonDate])
delete p.date_counts[canonDate]
When we remove a row, we know that we have a count for the date for that row (because crossfilter won't tell us it's removing the row unless it was added earlier). So we can go ahead and decrement the count. Then if it hits zero we can remove the entry.
Like I said, it's overkill. In your case, the count will only go to 1 and then drop to 0. But it's not much more expensive to do this than to keep just a single date.
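To see the bookkeeping in action, here's the date-counting core of the reducers run against two toy rows (the other fields from the answer are omitted for brevity):

```javascript
// Date-counting core of the reducers, as in the answer above
function geoReducerAdd(p, v) {
  const canonDate = new Date(v.sampled_issue_date).getTime();
  p.date_counts[canonDate] = (p.date_counts[canonDate] || 0) + 1;
  return p;
}
function geoReducerRemove(p, v) {
  const canonDate = new Date(v.sampled_issue_date).getTime();
  if (!--p.date_counts[canonDate]) delete p.date_counts[canonDate];
  return p;
}

const p = { date_counts: {} };
const rows = [
  { sampled_issue_date: "1921-01-15" },
  { sampled_issue_date: "1921-07-15" }
];

rows.forEach(v => geoReducerAdd(p, v)); // two dates, each with count 1
geoReducerRemove(p, rows[1]);           // a filter removes the July row

// Only the January date remains, mirroring the "one date left" invariant
const remaining = Object.keys(p.date_counts);
```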
Rendering the side panel
When we render the side panel, there should only be one date left in date_counts for that selected item.
console.assert(Object.keys(date_counts).length === 1) // only one entry
console.assert(Object.entries(date_counts)[0][1] === 1) // with count 1
document.getElementById('geo-issue-date').textContent = new Date(+Object.keys(date_counts)[0]).format('mmm dd, yyyy')
Usability notes
From a usability perspective, I would recommend not calling filter(null) on mouseleave, or if you really want to, putting it on a timeout which gets cancelled when you see a mouseenter. One should be able to "scrub" over the line chart and see the changes over time in the choropleth without accidentally switching back to the unfiltered colors.
I also noticed (and filed) an issue because I noticed that dots to the right of the mouse pointer are shown, making them difficult to click. The reason is that the dots are overlapping, so only a little sliver of a crescent is hoverable. At least with my trackpad, the click causes the pointer to travel leftward. (I can see the date go back a week in the tooltip and then return.) It's not as much of a problem when you're zoomed in.

How to calculate ideal and actual burn for burndown chart

I'm trying to calculate data for a burndown chart for a course. The course has start and end dates, an exercise count, and the student's actual start and finish dates. I have JSON data from the server with the course data, which I process. First I calculate totalExcercisesCount, then the number of days the student has to finish the course. In the end I get the following data object:
const chartDataObj = {
idealBurn: [],
actualBurn: [],
idealIncrement: 0,
totalExcercisesCount: 12,
totalExercisesDoneCount: 4,
timeLine: {
courseFrom: "2018-09-10",
courseTo: "2019-06-21",
start: "2018-09-11",
finish: "2018-10-01",
totalDays: 20,
}
}
After that I build the ideal line, and here comes the first problem. I'm trying the following:
chartDataObj.idealIncrement = Math.floor(
    chartDataObj.timeLine.totalDays / chartDataObj.totalExcercisesCount
);
for (i = 0; i <= chartDataObj.timeLine.totalDays - 1; i++) {
    chartDataObj.idealBurn.push(chartDataObj.idealIncrement * (i + 1));
}
chartDataObj.idealBurn.reverse();
The problem is that if the number of days is much larger than the number of exercises, I get a wrong ideal burn.
I have 12 exercises to complete, but on the second day it shows something like 19. What am I doing wrong here?
And then I need to fill in the actual burn data. But the problem is, how do I fill it according to the dates the exercises were completed and show it on the graph? I mean, in my final data object I have just totalExercisesDoneCount, but in the initial JSON I have info about the dates when exercises were finished. Should I group them by dates or not?
I also have a codepen prepared with the chart and all the code. Any help will be appreciated. Thanks!
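For comparison, the ideal line of a burndown chart usually starts at the total and decreases by a constant exercises-per-day rate, rather than multiplying a days-per-exercise increment. A sketch of that formulation (an assumption about the intended shape, not the asker's code):

```javascript
const totalExercises = 12;
const totalDays = 20;

// Constant burn rate; fractional values are fine for a guide line
const perDay = totalExercises / totalDays;

const idealBurn = [];
for (let i = 0; i <= totalDays; i++) {
  // Work remaining after day i, running from 12 down to 0
  idealBurn.push(totalExercises - perDay * i);
}
```

For the actual line, grouping the completion dates by day and subtracting a running total of finished exercises from the starting count would give one point per day, so grouping by date seems reasonable.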

Rendering a multi-line chart in DC.js

JSFiddle here: https://jsfiddle.net/ayyrickay/k1crg7xu/47/
My code is a bit of a mess right now, but essentially, I have two choropleths, and I want to render a multiline chart based on the choropleth data - I just have no idea how to wrangle the data to make it work.
The line chart should be a composite line chart. One line would be New Yorker circulation data, the other would be Saturday Evening Post circulation data. The y axis is issue_circulation, the x axis is actual_issue_date.
In the current implementation I've set up two crossfilters (one for each data set) and I'm creating a dimension for the choropleth and one for the line chart. The choropleths render properly, but I've yet to get the line charts to render. I can't tell if it's because of the format of my data ({key: date, value: y-axis-value}) or if my implementation of crossfilter is just too janky. I'm trying to understand based on other StackOverflow questions, but nothing I've tried seems to work (this includes prefiltering the data like I'm doing now, creating two different crossfilters and separate dimensions, trying to be meticulous about parsing dates, etc.)
When you're using a time scale for the X axis, the keys of your group should be Date objects. So it won't work to format the dates as strings when creating the dimensions & groups; instead just use raw Date objects.
Since Dates are slow, I suggest doing this as a data preprocessing step:
data.forEach(function(d) {
    d.actual_issue_date = new Date(d.actual_issue_date);
})
Then your dimension key functions just extract the date object:
const dimension1 = title1Circulation.dimension(d => d.actual_issue_date)
const lineChartYear1 = title1Circulation.dimension(d => d.actual_issue_date)
const lineChartYear2 = title2Circulation.dimension(d => d.actual_issue_date)
This ends up looking kind of messy, because the Saturday Evening Post data fluctuates a lot by week:
Zoomed in:
Assuming this isn't a data cleaning problem (kind of looks like it?), one way to improve the display would be to aggregate by month:
const circulationGroup1 = lineChartYear1.group(d => d3.timeMonth(d)).reduceSum(d => d.issue_circulation)
const circulationGroup2 = lineChartYear2.group(d => d3.timeMonth(d)).reduceSum(d => d.issue_circulation)
composite
    .xUnits(d3.timeMonths)
This rounds the group key down to the beginning of each month, adding together all the values for each month.
Still kind of messy, but better:
Welp, you still have some work to do, but anyway, that's why the data was not displaying!
Fork of your fiddle.
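`d3.timeMonth(d)` rounds a date down to the first of its month; a dependency-free sketch of the same rounding, for anyone who wants to see what the monthly group keys look like:

```javascript
// Equivalent of d3.timeMonth(d): round a Date down to the start of its month
// (local time, midnight), so all issues in a month share one group key.
function floorToMonth(d) {
  return new Date(d.getFullYear(), d.getMonth(), 1);
}

const key = floorToMonth(new Date(1925, 5, 17)); // June 17, 1925
```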

Algorithm run from within Node HTTP request takes much longer to run

I have a node app which plots data on an x,y dot plot graph. Currently, I make a GET request from the front end and my back end node server accepts the requests, loops through an array of data points, draws a canvas using Node Canvas and streams it back to the front end where it's displayed as a PNG image.
Complicating things is that there can be polygons, so my algorithm calculates whether a point is inside a polygon, using the point-in-polygon package, and colors that data point differently if it is.
This works fine when there are fewer than 50,000 data points. However, when there are 800,000 the request takes approximately 23 seconds. I have profiled the code and most of that time is spent looping through all the data points and figuring out where to plot each one on the canvas and what color it should be (depending on whether it's in one or more polygons). Here's a plunker I made. Basically I do something like this:
for (var i = 0; i < data.length; i++) {
    // get raw points
    x = data[i][0];
    y = data[i][1];
    // convert to a point on canvas
    pointX = getPointOnCanvas(x);
    pointY = getPointOnCanvas(y, 'y');
    color = getColorOfCell(pointX, pointY);
    plotColor.push({
        color: color,
        pointX: pointX,
        pointY: pointY
    });
}
// draw the dots down here
The algorithm itself is not the problem. The issue I have is that when the algorithm is run within an HTTP request, it takes a long time to calculate what color a point is - about 16 seconds. But if I do it in Chrome on the front end, it takes just over a second (see the plunker). When I run the algorithm on the command line with Node, it takes less than a second. So the fact that my app runs the algorithm within an HTTP request is slowing it down massively. So, a couple of questions:
Why would this be? Why does running an algorithm from within an HTTP request take so much longer?
What can I do to fix this, if anything? Would it somehow be possible to make a request to start the task, and then notify the front end when it's finished and retrieve the PNG?
EDIT
I fully tested running the algorithm and creating a PNG through the command line. It's much quicker: less than half a second to work out what color each of the 800k data points should be. I'm thinking of using a socket to make a request to the server and start the task, then have it return the image. I'm baffled, though, why the code should take so long when run within an HTTP request...
EDIT
The problem is Mongo and Mongoose. I store the coordinates of each polygon in Mongo. I fetch these coordinates once, but I compare them to each x, y point. Somehow, this is what's massively delaying the algorithm. If I close the Mongo document, the algorithm goes from 16 seconds to 1.5 seconds...
Edit
@DevDig pointed out the main problem in the comments section - when using a Mongoose object there are lots of getters and setters slowing it down. Using lean() in the query reduces the algorithm from 16 seconds to 1.5 seconds.
Just finished running a version of your code as a Node.js service. The code is taken from your plunker. Execution time was 171mSec for 100,000 rows of data (I replicated the first 10K rows 10 times). Here's what I did:
First, your data.json and gates.json files aren't really JSON files; they're JavaScript files. I removed the var data/gates = statements from the front and removed the ending semicolon. The issue you're encountering may have to do with how you're reading in your data sets in your app. Since you don't modify gates or data, I read them in as part of the set-up on the server, which is exactly how you process them in the browser. If you need to read the files in each time you access the server, then that, of course, will change the timing. That change took the execution time from 171mSec to 515mSec - still nothing near what you're seeing. This is being executed on a MacBook Pro. If needed, I can update timings from a network-accessed cloud server.
getting the files:
var fs = require("fs");
var path = require("path");
var data = [];
var allGatesChain;
var events = [];
var x, y, pointX, pointY;
var filename = __dirname + "/data.txt";
data = JSON.parse(fs.readFileSync(filename, "utf-8"));
filename = __dirname + "/gates.json";
var gates = JSON.parse(fs.readFileSync(filename, "utf-8"));
I moved your routines to create allGatesChain and events into the exported function:
allGatesChain = getAllGatesChain();
generateData();
console.log("events is "+events.length+" elements long. events[0] is: "+events[0]);
console.log("data is "+data.length+" elements long. data[0] is "+data[0]);
and then ran your code:
var start, end;
var plotColor = [];
start = new Date().getTime();
for (var i = 0; i < data.length; i++) {
    // get raw points
    x = data[i][0];
    y = data[i][1];
    // convert to a point on canvas
    pointX = getPointOnCanvas(x);
    pointY = getPointOnCanvas(y, 'y');
    color = getColorOfCell({
        gateChain: allGatesChain,
        events: events,
        i: i
    });
    plotColor.push({
        color: color,
        pointX: pointX,
        pointY: pointY
    });
}
end = new Date().getTime();
var _str = "loop execution took: "+(end-start)+" milliseconds.";
console.log(_str);
res.send(_str);
result was 171mSec.
