I have an ordered data set of decimal numbers. The data is always similar, but not always the same. The expected data is a few (0 - 5) large numbers, followed by several (10 - 90) average numbers, then followed by smaller numbers. There are cases where a large number may be mixed into the average numbers. See the following arrays.
let expectedData = [35.267,9.267,9.332,9.186,9.220,9.141,9.107,9.114,9.098,9.181,9.220,4.012,0.132];
let expectedData = [35.267,32.267,9.267,9.332,9.186,9.220,9.141,9.107,30.267,9.114,9.098,9.181,9.220,4.012,0.132];
I am trying to analyze the data by getting the average without the high numbers at the front and the low numbers at the back. High or low values in the middle are fine to keep in the average. I have a partial solution below. Right now I am sort of brute forcing it, but the solution isn't perfect: on smaller datasets the first average calculation is skewed by the large numbers.
My question is: Is there a way to handle this type of problem, which is identifying patterns in an array of numbers?
My algorithm is:
Get an average of the array
Calculate an above/below average value
Remove front (n) elements that are above average
Remove end elements that are below average
Recalculate average
In JavaScript I have the following (this is partial; it leaves out the below-average step):
let total = expectedData.reduce((rt, cur) => { return rt + cur; }, 0);
let avg = total / expectedData.length;
let aboveAvg = avg * 0.1 + avg;
let remove = -1;
for (let k = 0; k < expectedData.length; k++) {
    if (expectedData[k] > aboveAvg) {
        remove = k;
    } else {
        if (k == 0) {
            remove = -1; //no need to remove
        }
        //break because we don't want large values from middle removed.
        break;
    }
}
if (remove >= 0) {
    //remove front above average
    expectedData.splice(0, remove + 1);
}
//remove belows
//recalculate average
I believe you are looking for an outlier detection algorithm. There are already a bunch of questions related to this on Stack Overflow.
However, each outlier detection algorithm has its own merits.
Here are a few of them:
https://mathworld.wolfram.com/Outlier.html
High outliers are anything beyond the 3rd quartile + 1.5 * the inter-quartile range (IQR)
Low outliers are anything beneath the 1st quartile - 1.5 * IQR
Grubbs's test
You can check how it works for your expectations here
Apart from these two, there is a comparison calculator here. You can use it to try other algorithms as needed.
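The quartile rule quoted above can be sketched in a few lines of JavaScript. This is only an illustration: the function names `quantile` and `iqrOutliers` are made up for this example, and the quartiles are computed with linear interpolation, which is just one of several common conventions (other conventions can give slightly different fences).

```javascript
// Quantile of an already sorted array, using linear interpolation
// between the two nearest ranks (one convention among several).
function quantile(sorted, q) {
  const pos = (sorted.length - 1) * q;
  const lower = Math.floor(pos);
  const upper = Math.ceil(pos);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (pos - lower);
}

// Split the data into low outliers, high outliers and inliers
// using the fences Q1 - k * IQR and Q3 + k * IQR.
function iqrOutliers(data, k = 1.5) {
  const sorted = [...data].sort((a, b) => a - b); // copy, keep the input order intact
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  const lowFence = q1 - k * iqr;
  const highFence = q3 + k * iqr;
  return {
    low: data.filter(v => v < lowFence),
    high: data.filter(v => v > highFence),
    inliers: data.filter(v => v >= lowFence && v <= highFence),
  };
}

const expectedData = [35.267,9.267,9.332,9.186,9.220,9.141,9.107,9.114,9.098,9.181,9.220,4.012,0.132];
const result = iqrOutliers(expectedData);
console.log(result.high, result.low);
```

Note that this rule ignores position: a large value sitting in the middle of the array would be flagged as well, so if middle spikes must stay in the average you would still need a position-aware step on top of it.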
I would first try a sliding window coupled with a hysteresis / band filter in order to detect the high-value peaks.
Then, as the sliding window advances, you can add the previous first value (which is now the last of the analyzed values) to the global sum, and add 1 to the total number of values.
When you encounter a peak (= something that causes the hysteresis to move or overflows the band filter), you either remove the values (which may be costly) or, better, set the value to NaN so you can safely ignore it.
You should keep computing a sliding average within the sliding window in order to auto-correct the hysteresis/band filter, so that it rejects only the start values of a peak (the end values are the start values of the next one); once the values have stabilized at a new level, they will be kept again.
The size of the sliding window sets how many consecutive "stable" values are needed before values are kept, or in other words how many unstable values are rejected when you reach a new level.
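A simplified sketch of that idea, assuming a trailing window and a relative tolerance band around the window average (it does not keep a separate hysteresis state, so treat it as one possible reading of the approach, not the full filter). The name `stableAverage` and the parameters `windowSize` and `band` are made up for this example.

```javascript
// Keep a value only when the trailing window ending at it is "stable",
// i.e. every value in the window lies within +/- band * average of the window.
// Rejected values are marked as NaN instead of being removed.
function stableAverage(data, windowSize = 3, band = 0.2) {
  const marked = [];
  let sum = 0;
  let count = 0;
  for (let i = 0; i < data.length; i++) {
    const win = data.slice(Math.max(0, i - windowSize + 1), i + 1);
    const winAvg = win.reduce((a, b) => a + b, 0) / win.length;
    const stable =
      win.length === windowSize &&
      win.every(v => Math.abs(v - winAvg) <= band * winAvg);
    if (stable) {
      marked.push(data[i]);
      sum += data[i]; // global sum of accepted values
      count += 1;     // number of accepted values
    } else {
      marked.push(NaN); // ignored in the average
    }
  }
  return { average: count > 0 ? sum / count : NaN, count, marked };
}

const expectedData = [35.267,9.267,9.332,9.186,9.220,9.141,9.107,9.114,9.098,9.181,9.220,4.012,0.132];
const stats = stableAverage(expectedData);
console.log(stats.average, stats.count);
```

With a window of 3, the first two values after the leading peak are rejected along with the peak itself, which is the trade-off described above: a larger window demands more consecutive stable values before anything is kept.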
For that, you can find the mode of the values (rounded down) and then take all the numbers within a certain range around the mode. That range can be taken from the data itself, for example 10% of max - min. That helps you filter your data. You can pick the percentage that fits your needs. Something like this:
let expectedData = [35.267,9.267,9.332,9.186,9.220,9.141,9.107,9.114,9.098,9.181,9.220,4.012,0.132];
expectedData.sort((a, b) => a - b);
/// Get the range of the data
const RANGE = expectedData[ expectedData.length - 1 ] - expectedData[0];
const WINDOW = 0.1; /// Window of selection 10% from left and right
/// Frequency of each number
let dist = expectedData.reduce((acc, e) => (acc[ Math.floor(e) ] = (acc[ Math.floor(e) ] || 0) + 1, acc), {});
let mode = +Object.entries(dist).sort((a, b) => b[1] - a[1])[0][0];
let newData = expectedData.filter(e => mode - RANGE * WINDOW <= e && e <= mode + RANGE * WINDOW);
console.log(newData);
I have a line chart of messages sent per day/month, with two datepickers above it. I want to be able to select a start date and an end date, and have the chart read those dates and display exactly that range of points.
I already have zoom and pan configured in my chart.js file and I can do it manually, but I was wondering how I can do it through what I just described.
In this picture I have 3 months of data already. It always starts out displaying 1 month.
I did it!
I created a function to be called after every date change on my datepicker, like this:
function filterDate(initialDate, finalDate) {
    //first I used the datepicker's values to pinpoint the index of both dates (initial and final) in my original labelsData array
    const start = labelsData.indexOf(initialDate);
    const end = labelsData.indexOf(finalDate) + 1; //<- +1 because slice() excludes the end index
    //then I sliced copies of my dataset [labels (x) and data (y)] so they contain only the range of dates that I want
    labelsData2 = labelsData.slice(start, end);
    //I used the same indexes to slice my sentData array as well, since they have the same length.
    sentData2 = sentData.slice(start, end);
    //then I updated my chart!
    myChart.data.datasets[0].data = sentData2;
    myChart.data.labels = labelsData2;
    myChart.update();
}
I'm trying to build a loan calculator where you can input the amount you want to borrow and the number of months over which you plan to pay it off, and select a type of credit (car, studies, home, etc.) that determines the interest rate.
In JavaScript, I read the values from Loan Amount, Months to Pay, and a dropdown list of credit types that provides the different interest rates. I tried to work from this formula and write it out like this:
p*(r*(1+r)^n/1-(1+r)^n);
Am I using the right formula to get the fixed monthly payment, and am I writing it correctly in code? I'm also producing the output this way:
document.getElementById("id-name").innerHTML = p*(r*(1+r)^n/1-(1+r)^n);
Is this the right way to do it, or should I compute the formula into another variable z and set innerHTML = z? Finally, this is the full JS function with the input variables:
function CALCULADORA() {
var x = document.getElementById("list").value;
var i = x/100;
var c = document.getElementById("cuotas").value;
var y = document.getElementById("valor").value;
document.getElementById("CALCULATOR").innerHTML = y*(x*(1+x)^c/1-(1+x)^c);
}
/*
x is Interest rate in percentage given by a dropdown list
i is Percentage / 100
c is Months to pay loan
y is Loan value
*/
The issue is that I'm not getting the full formula: the result displayed is only the loan value divided by the number of months. Thanks for any help you can give me! This is one of my first programs.
Make sure you are converting those values to numbers before doing math on them, specifically when adding. Something like 1+x will not give you what you expect when x is a string like "0.2": it concatenates and yields "10.2".
Assuming those values can be decimals, you will need to do something like:
var x = parseFloat(document.getElementById("list").value);
var i = x/100;
var c = parseFloat(document.getElementById("cuotas").value);
var y = parseFloat(document.getElementById("valor").value);
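Beyond the string conversion, note that `^` in JavaScript is the bitwise XOR operator, not exponentiation, so the power has to be written with `Math.pow` (or `**`). A minimal sketch of the fixed monthly payment, assuming the standard annuity formula P * r(1+r)^n / ((1+r)^n - 1) and that the rate passed in is already a per-month fraction (e.g. 0.01 for 1% per month); the function name `monthlyPayment` is made up for this example.

```javascript
// Fixed monthly payment for a loan.
// principal:   amount borrowed
// monthlyRate: interest rate per month as a fraction (assumption: 0.01 means 1% per month)
// months:      number of monthly payments
function monthlyPayment(principal, monthlyRate, months) {
  if (monthlyRate === 0) {
    return principal / months; // no interest: split the principal evenly
  }
  const growth = Math.pow(1 + monthlyRate, months); // (1 + r)^n
  return principal * (monthlyRate * growth) / (growth - 1);
}

const payment = monthlyPayment(1000, 0.01, 12);
console.log(payment.toFixed(2)); // "88.85"
```

Computing the value into a variable first and then assigning it to `innerHTML` (optionally through `toFixed(2)`) keeps the formula readable and easy to check.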
I'm trying to use dc.js and Crossfilter, but I have a problem adding up values that share the same category.
To explain: for the demo I have 3 columns (Project, Amount, Action).
Then I create amount categories with this code:
var amount = d.amount;
if (amount <= 10) {
    return '< 10';
} else if (amount > 10 && amount < 50) {
    return '<50';
} else if (amount >= 50 && amount <= 80) {
    return '< 80';
} else {
    return '> 80';
}
All I want is: if it's the same project, add up all the amounts and then create these categories.
So in < 10 category there is only Redaction. In the >10 and <50 category there'll be design and website hosting... And if >50 there'll be Website Design.
Here is the Jsfiddle : http://jsfiddle.net/nicart/179n4bfg/6/
Thank you for your help, I'm totally lost.
So you want to dynamically calculate this category? For example, if the "Design" project had a "stuff" action with a value of 5, and a filter were applied to only show "stuff", then "Design" would fall into the "<10" category? Or do you want each project categorized according to its overall value no matter what filters are applied?
If the former, you're going to have to create a group wrapper to create a "fake" group and re-aggregate a standard group on your Project dimension into your categorical values at run-time. See here, where it talks about creating a "fake group": https://github.com/dc-js/dc.js/wiki/FAQ
If the latter, then you should do this as a pre-calculation step and add a category dimension to your underlying data for each record.
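The pre-calculation option can be sketched in plain JavaScript, before the records are handed to Crossfilter. The field names `project`, `amount` and `action` follow the columns described in the question; the sample values, the helper names `categorize` and `addProjectCategory`, and the added fields `projectTotal` and `category` are assumptions made for this example.

```javascript
// Same thresholds and labels as in the question, applied to a project's total.
function categorize(total) {
  if (total <= 10) return '< 10';
  if (total < 50) return '<50';
  if (total <= 80) return '< 80';
  return '> 80';
}

// Sum the amounts per project, then stamp every record with its
// project's total and the category derived from that total.
function addProjectCategory(records) {
  const totals = new Map();
  for (const r of records) {
    totals.set(r.project, (totals.get(r.project) || 0) + r.amount);
  }
  return records.map(r => ({
    ...r,
    projectTotal: totals.get(r.project),
    category: categorize(totals.get(r.project)),
  }));
}

// Hypothetical sample data
const records = [
  { project: 'Redaction', amount: 5, action: 'write' },
  { project: 'Redaction', amount: 3, action: 'review' },
  { project: 'Design', amount: 20, action: 'sketch' },
  { project: 'Design', amount: 15, action: 'mockup' },
  { project: 'Website Design', amount: 60, action: 'build' },
  { project: 'Website Design', amount: 30, action: 'deploy' },
];
const enriched = addProjectCategory(records);
console.log(enriched.map(r => r.project + ': ' + r.category));
```

A dimension can then be built on the `category` field like on any other column. Because the totals are computed once up front, the categories will not change when filters are applied.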
Check out this Pie Chart.
It uses the .label method.
Hope this answers your question to some extent. You may modify if it helps.
Let me know :-)
I have an array - it's populated with 1s and 0s.
When "rendered" out - it looks like this:
Basically, I would like to select the lower sections (highlighted in red). Is there any way to select just the lowest 1s?
Thanks again!
[edit]
I should mention that the lowest points are random each time!
[edit2]
Currently I'm just selecting everything below a certain area and seeing if it's a 1 and doing what I want... Is there no better way?
You loop through the 2D array in reverse...
var lowest = [];
var threshold = 6; // find the 6 "lowest" 1's
for (var row = arr.length - 1; row >= 0; row--) {
    for (var col = arr[row].length - 1; col >= 0; col--) {
        if (arr[row][col] == 1 && threshold > 0) {
            threshold--;
            lowest.push({x: col, y: row});
        }
    }
}
Another way:
1) Compute the per-row density (the number of black pixels per row) and put this data in a new 1D array.
2) Decide which rows count as leg, possibly with a threshold or a relative threshold (e.g. less than 30% of the mean value of the non-empty rows).
3) Push all (x,y) values in the 'leg' rows.
This avoids lots of small points 'eating' the pixel threshold before you reach the body of the monster.
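The three steps above could look roughly like this. It is a sketch under assumptions: the grid is a 2D array of 0s and 1s, the function name `lowDensityPoints` and the `ratio` parameter are made up for this example, and the sample grid is invented. Any sparse row is selected, wherever it is, so a thin row at the top of the shape would be picked up too.

```javascript
// Select the 1-cells in rows whose density is below ratio * mean density
// of the non-empty rows.
function lowDensityPoints(arr, ratio = 0.3) {
  // 1) per-row density: number of 1s in each row
  const density = arr.map(row =>
    row.reduce((n, cell) => n + (cell === 1 ? 1 : 0), 0)
  );
  // 2) relative threshold based on the mean of the non-empty rows
  const nonEmpty = density.filter(d => d > 0);
  if (nonEmpty.length === 0) return [];
  const mean = nonEmpty.reduce((a, b) => a + b, 0) / nonEmpty.length;
  const limit = ratio * mean;
  // 3) collect all (x, y) of the 1s in the sparse rows
  const points = [];
  arr.forEach((row, y) => {
    if (density[y] > 0 && density[y] < limit) {
      row.forEach((cell, x) => {
        if (cell === 1) points.push({ x: x, y: y });
      });
    }
  });
  return points;
}

// Hypothetical shape: a wide body on top of two thin legs
const grid = [
  [0, 1, 1, 1, 1, 1, 1, 0],
  [1, 1, 1, 1, 1, 1, 1, 1],
  [1, 1, 1, 1, 1, 1, 1, 1],
  [0, 1, 0, 0, 0, 0, 1, 0],
  [0, 1, 0, 0, 0, 0, 1, 0],
];
const legs = lowDensityPoints(grid, 0.5); // ratio chosen arbitrarily for this grid
console.log(legs);
```

The right ratio depends on how thin the lower sections are compared with the rest of the shape, so it is worth tuning it against a few of the randomly generated grids.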