how to filter a set of data by variance in typescript? - javascript

Let's say I have a set of data like this with a row for each minute in the last 4 hours:
[
{ X:1000, Y:2000, Z:3000, DateTime: 12/15/2018 12:00 },
{ X:998, Y:2011, Z:3020, DateTime: 12/15/2018 12:01 }
]
I need an array of property names whose values are within a 20% variance for all rows. So if Y and Z above meet this criterion but X does not, then the output should look like this:
[Y, Z]
What typescript code could I use to do this?

I don't know exactly what "variance" or "variance percentage" mean in your question. I just used this formula to calculate variance: https://www.wikihow.com/Calculate-Variance
For the variance percentage, I simply divided the variance by the mean value and expressed it in percentage.
Feel free to replace my calculateVariancePercentage with a more correct implementation.
const ACCEPTABLE_VARIANCE_THRESHOLD = 20;
const dataset = [
{ X:1000, Y:2000, Z:3000, DateTime: '12/15/2018 12:00' },
{ X:998, Y:2011, Z:3020, DateTime: '12/15/2018 12:01' }
];
const calculateVariancePercentage = (data) => {
const meanValue = data.reduce((sum, element) => sum + element, 0) / data.length;
// accumulate the squared deviations (the accumulator must be added on each step)
const sumOfDeviations = data.reduce((sod, element) => sod + Math.pow(element - meanValue, 2), 0);
const variance = sumOfDeviations / (data.length - 1);
return variance / meanValue * 100;
}
const variables = Object.keys(dataset[0]).filter(key => key !== 'DateTime');
const result = variables.filter(variable => {
const varData = dataset.map(row => row[variable]);
const varianceInPercentage = calculateVariancePercentage(varData);
console.log(varianceInPercentage);
return varianceInPercentage <= ACCEPTABLE_VARIANCE_THRESHOLD;
});
console.log(result);
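Since the question leaves "20% variance" open, here is a minimal typed sketch under a different assumption: a property qualifies when every one of its values deviates from that property's mean by at most the given percentage. The name keysWithinTolerance and the Row type are hypothetical, and this is only one possible reading of the requirement, not a replacement for the statistical variance above.

```typescript
// Assumption: a row maps property names to numbers, plus a DateTime string.
type Row = Record<string, number | string>;

// Hypothetical helper: returns the property names whose values all stay
// within tolerancePct percent of that property's mean across all rows.
function keysWithinTolerance(rows: Row[], tolerancePct: number): string[] {
  const first = rows[0];
  if (first === undefined) return [];
  const keys = Object.keys(first).filter((key) => key !== "DateTime");
  return keys.filter((key) => {
    const values = rows.map((row) => Number(row[key]));
    const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
    // A zero mean makes a relative deviation undefined; accept only all-zero columns.
    if (mean === 0) return values.every((v) => v === 0);
    return values.every(
      (v) => (Math.abs(v - mean) / Math.abs(mean)) * 100 <= tolerancePct
    );
  });
}

const sample: Row[] = [
  { X: 1000, Y: 2000, Z: 3000, DateTime: "12/15/2018 12:00" },
  { X: 500, Y: 2011, Z: 3020, DateTime: "12/15/2018 12:01" },
];
console.log(keysWithinTolerance(sample, 20));
```

In this made-up sample X swings between 1000 and 500, so it falls outside the 20% band while Y and Z stay inside it.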

Related

Generate a new uniformly distributed array from the elements of the previous array

It is basically a constraint optimisation problem where I need to find similar colors given a particular color, from a color database.
Consider an array (signifying the distance between the inputColor and the allColorsInDB): [0, 5, 6, 1, 0.56, 4350, 64, 345, 20, 14, 2.045, 54, 56]
Construct a new array of limit N (user input) with the elements of the previous array such that:
K is the middle element of the new array. (K is user input and is already present in the original array)
Each interval between i th and (i+1)th element in the new array is same or closest possible.
Constraints:
N <= originalArray.length
K is always present in the original Array
This is what I'm trying...
const convert = require('@csstools/convert-colors');
export type Lab = [number, number, number];
const calculateEuclideanDistance = (color1: Lab, color2: Lab) => {
const [l1, a1, b1] = color1;
const [l2, a2, b2] = color2;
return Math.sqrt((l2 - l1) ** 2 + (a2 - a1) ** 2 + (b2 - b1) ** 2);
};
const getSimilarColors = (brand: keyof typeof colors.brands, color: string, threshold?: number, limit?: number) => {
const brandColors = colors.brands[brand].colors;
const colorLab = convert.hex2lab(color);
let allColors = Object.keys(brandColors)
.map((curr) => {
// @ts-ignore
const currLab = brandColors[curr as keyof typeof brandColors].lab as Lab;
return {
...brandColors[curr as keyof typeof brandColors],
hex: curr,
distance: calculateEuclideanDistance(colorLab, currLab),
};
})
// Sorting colors from dark to light
.sort((a, b) => a.lab[0]! - b.lab[0]!);
if (threshold) {
allColors = allColors.filter((c) => c.distance <= threshold);
}
if (!limit || limit >= allColors.length) {
return allColors;
}
// TODO: Improve this logic, right now we're assuming limit is always 5, hence computing it statically
const filteredColors = new Array(5);
const midIndex = allColors.findIndex((c) => c.hex === color)!;
// eslint-disable-next-line prefer-destructuring
filteredColors[0] = allColors[0];
filteredColors[2] = allColors[midIndex];
filteredColors[4] = allColors[allColors.length - 1];
filteredColors[1] = allColors[Math.floor(midIndex / 2)];
filteredColors[3] = allColors[Math.floor((allColors.length - 1 + midIndex) / 2)];
return filteredColors;
};
Result that I'm expecting:
[Image: Sorted Colors Based on LAB]
The result I'm getting from the API is correct for now, but I'm computing it statically.
[Images: Input, Output, Formatted Output]
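One way the static five-slot logic above might be generalized is sketched here, under stated assumptions: the limit is odd and at least 3 (so the chosen color can sit exactly in the middle), and "equal intervals" is interpreted as equal steps in index position within the sorted list, not equal steps in distance value. The function name pickSpreadAround is hypothetical. For a limit of 5 it selects the same indexes as the hard-coded version.

```typescript
// Hypothetical helper: picks `limit` items from `sorted` so that the item at
// `midIndex` lands in the middle, the first and last items are the ends of the
// list, and the picks in between are spread evenly by index on each side.
function pickSpreadAround<T>(sorted: readonly T[], midIndex: number, limit: number): T[] {
  if (limit < 3 || limit % 2 === 0) {
    throw new Error("limit must be an odd number >= 3");
  }
  if (midIndex < 0 || midIndex >= sorted.length) {
    throw new Error("midIndex is out of range");
  }
  const half = (limit - 1) / 2; // number of steps on each side of the middle
  const lastIndex = sorted.length - 1;
  const indices: number[] = [];
  // Left side: from index 0 up to and including midIndex.
  for (let i = 0; i <= half; i++) {
    indices.push(Math.floor((midIndex * i) / half));
  }
  // Right side: from just after midIndex up to the last index.
  for (let i = 1; i <= half; i++) {
    indices.push(Math.floor(midIndex + ((lastIndex - midIndex) * i) / half));
  }
  return indices.map((i) => sorted[i] as T);
}

const demo = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
console.log(pickSpreadAround(demo, 4, 5));
```

Note that when the middle item sits very close to either end of the list, neighboring picks can coincide, so a real implementation may want to de-duplicate or shift indexes.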

Count dates within date intervals

I have an array abc containing dates and times. I need to convert it into time slots of 1 day, with limits determined by startDate and endDate, where 'x' contains the time slot and 'y' contains the count of occurrences in that slot. How can I get the count of occurrences in abc per interval and map it correctly to the date intervals?
const abc = ['2021-09-05T00:53:44.953Z', '2021-08-05T05:08:10.950Z', '2022-03-05T00:53:40.951Z'];
const startDate = '2021-07-05';
const endDate = '2021-11-05';
const res = [{x: '2021-07-05 - 2021-08-05' , y: '1' },{x: '2021-08-05 - 2021-09-05' , y: '2' }, {x: '2021-09-05 - 2021-10-05' , y: '1' },{x: '2021-10-05 - 2021-11-05' , y: '0' }];
console.log(res);
As per my understanding, I created a simple working demo as per the start and end date you provided in the question :
const abc = ['2021-09-05T00:53:44.953Z', '2021-08-05T05:08:10.950Z', '2022-03-05T00:53:40.951Z'];
const startDate = '2021-07-05';
const endDate = '2021-11-05';
function countDates(inputArray, startDate, endDate) {
let count = 0;
// use the function's own parameter rather than the outer abc variable
const dateArray = inputArray.map((item) => new Date(item.split("T")[0]).getTime());
dateArray.forEach((dayTime) => {
if(dayTime >= new Date(startDate).getTime() && dayTime <= new Date(endDate).getTime()) {
count ++;
}
});
return [{x: `${startDate} - ${endDate}`, y: count}];
}
console.log(countDates(abc, startDate, endDate));
Note: I am assuming you want to count one range at a time, between startDate and endDate.
This may be one possible solution to achieve the desired objective:
Code Snippet
Please look at countWithinLimits method which houses the significant portions of the solution.
const data = [
'2021-07-05T00:53:44.953Z', '2021-07-04T00:53:44.953Z',
'2021-07-14T00:53:44.953Z', '2021-07-12T00:53:44.953Z',
'2021-07-06T00:53:44.953Z', '2021-07-05T00:53:44.953Z',
'2021-07-07T00:53:44.953Z', '2021-07-11T00:53:44.953Z',
'2021-07-08T00:53:44.953Z', '2021-07-10T00:53:44.953Z',
'2021-07-09T00:53:44.953Z', '2021-07-07T00:53:44.953Z',
'2021-07-10T00:53:44.953Z', '2021-07-05T00:53:44.953Z',
'2021-07-11T00:53:44.953Z', '2021-07-07T00:53:44.953Z',
];
const startDate = '2021-07-05';
const endDate = '2021-07-11';
// expected result structure for reference
const expectedResult = [
{x: '2021-07-05 - 2021-07-06', y: '1' },
{x: '2021-07-06 - 2021-07-07', y: '2' },
{x: '2021-07-07 - 2021-07-08', y: '1' },
{x: '2021-07-08 - 2021-07-09', y: '0' }
];
const countWithinLimits = (st, en, arr = data) => {
// helper function to add 'i' days to given 'dt'
const addDays = (dt, i) => {
const nx = new Date(dt);
return (
(
new Date(nx.setDate(nx.getDate() + i))
).toISOString().split('T')[0]
);
};
// transform 'dt' into look like 'x' in the expected result
const transformToKey = dt => (`${dt} - ${addDays(dt, 1)}`);
// set constants for start and end dates
const stDate = new Date(st), enDate = new Date(en);
// first determine the number of slots
// (each will be 1-day duration, from st till en)
const numDays = (
Math.ceil(
Math.abs(enDate - stDate) / (1000 * 60 * 60 * 24)
)
);
// next, obtain an array with the slots
// something like this: ['2021-07-05 - 2021-07-06', ....]
const slots = (
[...Array(numDays).keys()]
.map(i => addDays(st, i))
.map(d => transformToKey(d))
);
// generate an object with props as the slots and values as zeroes
// like this: { '2021-07-05 - 2021-07-06': 0, ....}
const slotObj = slots.reduce(
(fin, itm) => ({...fin, [itm]: 0}),
{}
);
// iterate through the data (arr)
// find the slot in which a given date fits
// and add 1 to the counter, if the slot is found in slotObj
// do not count the date if it doesn't match any slots
const countPerSlot = arr.reduce(
(fin, itm) => ({
...fin,
...(
[transformToKey(itm.split('T')[0])] in fin
? {
[transformToKey(itm.split('T')[0])]: (
fin[transformToKey(itm.split('T')[0])] + 1
)
}
: {}
)
}),
{...slotObj}
);
// finally, transform the countPerSlot object
// into the expected result array
return (
Object.entries(countPerSlot)
.map(
([k, v]) => ({ x: k, y: v})
)
);
};
console.log(countWithinLimits(startDate, endDate));
Explanation
While there are comments in-line in the above code-snippet, should there be any specific point that requires a more detailed explanation, 'comments' below may be used to notify and this answer may be updated with more details.
The big-picture idea is this:
split the solution into different smaller-parts
first, generate the time-slots (of length 1 day)
next, create an object where the props is the period (like 2021-07-05 - 2021-07-06)
now, iterate through the data and increment a counter corresponding to the prop where the date fits
and finally, transform the object into an array that matches the expected result ([ {x : '2021-07-05 - 2021-07-06', y: '2' }, .... ])
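The steps above can also be sketched in a single pass by computing each date's slot index arithmetically instead of building string keys. This is a minimal sketch under assumptions: the function name countPerDaySlot is hypothetical, slots are 24-hour windows starting at UTC midnight of the start date, each window includes its start and excludes its end, and 'y' is kept as a number.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: counts how many timestamps fall into each 1-day slot
// between `start` (inclusive) and `end` (exclusive).
function countPerDaySlot(
  dates: readonly string[],
  start: string,
  end: string
): { x: string; y: number }[] {
  // Date-only ISO strings such as '2021-07-05' are parsed as UTC midnight.
  const startMs = Date.parse(start);
  const endMs = Date.parse(end);
  const numSlots = Math.max(0, Math.round((endMs - startMs) / DAY_MS));
  const counts: number[] = new Array<number>(numSlots).fill(0);

  for (const d of dates) {
    const slot = Math.floor((Date.parse(d) - startMs) / DAY_MS);
    // Dates outside every slot are ignored.
    if (slot >= 0 && slot < numSlots) {
      counts[slot] = (counts[slot] ?? 0) + 1;
    }
  }

  const fmt = (ms: number): string => new Date(ms).toISOString().slice(0, 10);
  return counts.map((y, i) => ({
    x: `${fmt(startMs + i * DAY_MS)} - ${fmt(startMs + (i + 1) * DAY_MS)}`,
    y,
  }));
}

const stamps = [
  "2021-07-05T00:53:44.953Z",
  "2021-07-06T10:00:00.000Z",
  "2021-07-05T23:00:00.000Z",
  "2021-07-08T00:00:00.000Z",
];
console.log(countPerDaySlot(stamps, "2021-07-05", "2021-07-07"));
```

Because the slot index is computed directly, this avoids re-spreading the accumulator object for every element, which may matter for long inputs.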

the most performant way finding an item in array within a range of indexes

I'm looking for something like the built-in Array.prototype.find(), but I want to be able to search starting from a given index, without creating a new shallow copy of range of items from this array.
possibilities:
using Array.prototype.find with no starting index.
using Array.prototype.slice() and then .find, something like arr.slice(startIndex).find(...). this creates a shallow copy.
writing my own find implementation using for loop that start looking from given index.
using lodash.find(). but I care about bundle size and lodash is quite heavy. I actually prefer avoiding any kind of third-party packages.
here are some performance test results :
const SIZE = 1000000
const START_I = 800000
const ItemVal = 900000
...
find: average time: 12.1ms
findSlice: average time: 2.48ms
findLodash: average time: 0.26ms
myFind: average time: 0.26ms
surprisingly enough, it seems that native .find performed worse even with starting index 0:
const SIZE = 1000000
const START_I = 0
const ItemVal = 900000
...
find: average time: 12.61ms
findSlice: average time: 17.51ms
findLodash: average time: 1.93ms
myFind: average time: 2.17ms
for array size 1000000, starting index of 0 and the matching item at position 900000, Array.prototype.find() took 12.61ms vs 2.17ms for the simple for loop search(!). am I missing something?
the test code is:
const {performance} = require('perf_hooks');
const _ = require('lodash')
const SIZE = 1000000
const START_I = 0
const ItemVal = 900000
let arr = Array(SIZE).fill(0)
arr = arr.map((a, i) => i)
const myFind = (arr, func, startI) => {
for (let i = startI; i < arr.length; i++) {
if (func(arr[i])) return arr[i]
}
return -1
}
const functions = {
find: () => arr.find(a => a === ItemVal), // looking at all array - no starting index
findSlice: () => arr.slice(START_I).find(a => a === ItemVal),
findLodash: () => _.find(arr, a => a === ItemVal, START_I),
myFind: () => myFind(arr, a => a === ItemVal, START_I),
}
const repeat = 100
const test_find = () => {
for (let [name, func] of Object.entries(functions)) {
let totalTime = 0
for (let i = 0; i < repeat; i++) {
let t_current = performance.now()
func()
totalTime += performance.now() - t_current
}
console.log(`${name}: average time: ${Math.round(totalTime / repeat * 100) / 100}ms`)
}
}
test_find()
what is the best way to find an item in an array, starting the search from a given index? and also, how does Array.prototype.find perform worse than my own simple for loop implementation of find?
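For reference, the hand-written option from the list of possibilities could be typed roughly as follows. This is a sketch, not a benchmarked recommendation: the name findFrom is hypothetical, and unlike myFind above it returns undefined when nothing matches, mirroring Array.prototype.find, so that a stored value of -1 cannot be confused with "not found".

```typescript
// Hypothetical helper: like Array.prototype.find, but starts scanning at
// startIndex and does not copy the array.
function findFrom<T>(
  items: readonly T[],
  predicate: (item: T, index: number) => boolean,
  startIndex = 0
): T | undefined {
  for (let i = Math.max(0, startIndex); i < items.length; i++) {
    const item = items[i] as T;
    if (predicate(item, i)) return item;
  }
  return undefined;
}

const numbers = Array.from({ length: 10 }, (_, i) => i * 10);
console.log(findFrom(numbers, (n) => n % 20 === 0, 3)); // scans from index 3
console.log(findFrom(numbers, (n) => n === 0, 1)); // index 0 is skipped
```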

JavaScript array of dates to array of ranges of dates

I have a JavaScript array of dates (as strings) like the following:
["2020-07-24T04:00:00.000Z", "2020-07-25T04:00:00.000Z", "2020-07-26T04:00:00.000Z", "2020-07-27T04:00:00.000Z", "2020-07-28T04:00:00.000Z", "2020-07-29T04:00:00.000Z", "2020-07-30T04:00:00.000Z", "2020-07-31T04:00:00.000Z", "2020-08-01T04:00:00.000Z", "2020-11-29T05:00:00.000Z", "2020-12-30T05:00:00.000Z", "2020-12-31T05:00:00.000Z", "2021-01-01T05:00:00.000Z", "2021-01-02T05:00:00.000Z", "2021-02-18T05:00:00.000Z"]
I want to convert this into an array of arrays of [first, last] contiguous date ranges, e.g., as below:
[["2020-07-24T04:00:00.000Z", "2020-08-01T04:00:00.000Z"], ["2020-11-29T05:00:00.000Z"], ["2020-12-30T05:00:00.000Z", "2021-01-02T05:00:00.000Z"], []]
How do I do this? Code attempt below:
var ranges = [];
for (var i = 0; i < popNull.length; i++) {
let currentRange = [];
let current = new Date(popNull[i]);
let tomorrow = new Date(current.getTime() + (24 * 60 * 60 * 1000));
let next = new Date(popNull[i+1]);
if (next === tomorrow) {
}
else {
}
}
I've made a couple of assumptions in the code below
That the dates are pre-sorted in ascending date order
That "contiguous" means less than or equal to 24 hours.
All dates are formatted in a way that can be passed directly to the Date constructor on the platform of choice.
const input = ["2020-07-24T04:00:00.000Z", "2020-07-25T04:00:00.000Z", "2020-07-26T04:00:00.000Z", "2020-07-27T04:00:00.000Z", "2020-07-28T04:00:00.000Z", "2020-07-29T04:00:00.000Z", "2020-07-30T04:00:00.000Z", "2020-07-31T04:00:00.000Z", "2020-08-01T04:00:00.000Z", "2020-11-29T05:00:00.000Z", "2020-12-30T05:00:00.000Z", "2020-12-31T05:00:00.000Z", "2021-01-01T05:00:00.000Z", "2021-01-02T05:00:00.000Z", "2021-02-18T05:00:00.000Z"].map(x => new Date(x));
let aggregation = input.reduce( (acc,i) => {
if(acc.prev){
const diffInHrs = (i - acc.prev)/1000/60/60;
if(diffInHrs <= 24){
acc.result[acc.result.length-1][1] = i;
}
else{
acc.result.push([i])
}
acc.prev = i;
return acc;
}
else{
return {prev:i, result:[[i]]}
}
},{});
console.log(aggregation.result)
You can reduce the dates by keeping track of the latest one and comparing the current date with the previous one. You can diff their epoch values and check whether they are within a day of each other.
const dates = ["2020-07-24T04:00:00.000Z", "2020-07-25T04:00:00.000Z", "2020-07-26T04:00:00.000Z", "2020-07-27T04:00:00.000Z", "2020-07-28T04:00:00.000Z", "2020-07-29T04:00:00.000Z", "2020-07-30T04:00:00.000Z", "2020-07-31T04:00:00.000Z", "2020-08-01T04:00:00.000Z", "2020-11-29T05:00:00.000Z", "2020-12-30T05:00:00.000Z", "2020-12-31T05:00:00.000Z", "2021-01-01T05:00:00.000Z", "2021-01-02T05:00:00.000Z", "2021-02-18T05:00:00.000Z"];
const DAY_MILLIS = 8.64e7;
const ranges = dates
.reduce((acc, dateStr, index, all) => {
const dateObj = new Date(dateStr);
if (acc.length === 0) {
acc.push({ start: dateObj, prev: dateObj });
} else {
let last = acc[acc.length - 1];
const { start, prev } = last;
if (dateObj.getTime() - prev.getTime() <= DAY_MILLIS) {
last.prev = dateObj;
} else {
last.end = prev;
acc.push({ start: dateObj, prev: dateObj });
}
if (index === all.length - 1) {
last = acc[acc.length - 1];
if (last.end == null) {
last.end = last.prev;
}
}
}
return acc;
}, [])
.map(({ start, prev, end }) =>
((startStr, endStr) =>
startStr !== endStr ? [startStr, endStr] : [startStr])
(start.toISOString(), end.toISOString()));
console.log(ranges);
Output
[
[ "2020-07-24T04:00:00.000Z", "2020-08-01T04:00:00.000Z" ],
[ "2020-11-29T05:00:00.000Z" ],
[ "2020-12-30T05:00:00.000Z", "2021-01-02T05:00:00.000Z" ],
[ "2021-02-18T05:00:00.000Z" ]
]
You can do the following using Array#reduce():
Go through each date.
Check if the current date will extend last range.
if yes, then overwrite the end in the range pair (second element)
if no, start a new range
If it happens that a range only has a single date, then use the start to compare with. The logic still holds - extending the range will add a second date. If the new date is not within the desired time frame, then a new range is created and the previous range is left with a single element in it.
const areDatesWithin = ms => (str1, str2) => {
if (!str1 || !str2)
return false;
const date1 = new Date(str1);
const date2 = new Date(str2);
return (date2 - date1) <= ms;
}
const areDatesWithin1Day = areDatesWithin(1000 * 60 * 60 * 24);
function combineInRanges(dates) {
return dates.reduce((acc, nextDate) => {
const lastDateRange = acc[acc.length-1] ?? [];
//compare with range end (if there) or range start
const lastDate = lastDateRange[1] ?? lastDateRange[0];
//check if the range needs to be extended
const mergeWithRange = areDatesWithin1Day(lastDate, nextDate);
if (mergeWithRange) {
//change the end of the range
lastDateRange[1] = nextDate;
} else {
//start a new range
acc.push([nextDate]);
}
return acc;
}, []);
}
const arr = ["2020-07-24T04:00:00.000Z", "2020-07-25T04:00:00.000Z", "2020-07-26T04:00:00.000Z", "2020-07-27T04:00:00.000Z", "2020-07-28T04:00:00.000Z", "2020-07-29T04:00:00.000Z", "2020-07-30T04:00:00.000Z", "2020-07-31T04:00:00.000Z", "2020-08-01T04:00:00.000Z", "2020-11-29T05:00:00.000Z", "2020-12-30T05:00:00.000Z", "2020-12-31T05:00:00.000Z", "2021-01-01T05:00:00.000Z", "2021-01-02T05:00:00.000Z", "2021-02-18T05:00:00.000Z"];
console.log(combineInRanges(arr));
https://stackoverflow.com/a/67182108/20667780
Jamiec's answer works. However, if the array holds UTC dates that are offset to a local timezone, the gap between two consecutive days across a daylight saving time change can be more than 24 hours. In that case you have to compare diffInHrs against 25 instead of 24.
Otherwise, it's a perfect answer.
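To illustrate the daylight saving point, here is a small sketch of the same grouping idea with the allowed gap passed in as a parameter. The name toRanges and the maxGapMs parameter are hypothetical, the input is assumed to be sorted ascending, and the 25-hour default is just the threshold suggested above, not a general rule.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical helper: groups sorted ISO date strings into [first, last]
// ranges; a date extends the current range when it is at most maxGapMs
// after the previous date. A range with one date stays a one-element array.
function toRanges(isoDates: readonly string[], maxGapMs: number = 25 * HOUR_MS): string[][] {
  const ranges: string[][] = [];
  let prevMs = Number.NaN;
  for (const iso of isoDates) {
    const ms = Date.parse(iso);
    const last = ranges[ranges.length - 1];
    if (last !== undefined && ms - prevMs <= maxGapMs) {
      last[1] = iso; // extend the current range
    } else {
      ranges.push([iso]); // start a new range
    }
    prevMs = ms;
  }
  return ranges;
}

// The second and third entries are 25 hours apart (offset changes from 04:00Z to 05:00Z).
const dstDates = [
  "2020-10-31T04:00:00.000Z",
  "2020-11-01T04:00:00.000Z",
  "2020-11-02T05:00:00.000Z",
  "2020-11-29T05:00:00.000Z",
];
console.log(toRanges(dstDates)); // 25-hour tolerance keeps the first three together
console.log(toRanges(dstDates, 24 * HOUR_MS)); // 24-hour tolerance splits them
```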
It's a sort of reduction based on the even-ness of the index...
let array = ['a', 'b', 'c', 'd', 'e', 'f'];
let pairs = array.reduce((acc, el, idx) => {
idx % 2 ? acc[acc.length-1].push(el) : acc.push([el]);
return acc;
}, []);
console.log(pairs)

Average of every hour in a array

I have an array which updates every minute. When I want to show it over a day, I want to have the average of every hour of that day.
The most recent minute is added at the end of the array.
//get the last elements from the array
var hours= (this.today.getHours() + 1) * 60
var data = Array.from(this.temps.data)
let lastData = data.slice(Math.max(data.length - hours))
let newData: any
// calculate the average of every hour
for (let i = 0; i < minutes; i++) {
var cut = i * 60
for (let i = cut; i < (cut + 60); i++) {
newData = newData + lastData[i];
let test = newData/60
console.log(test);
}
}
I can't figure out how I make an array from every last 60 elements.
My goal is to get an array like
avgHour[20,22,30,27,]
The array I have is updated every minute. So I need the average of every 60 elements to get an hour.
array looks like this
data[25,33,22,33]
It is every minute from a week so really long.
This Worked For me
var arrays = [], size = 60;
while (arr.length > 0){
arrays.push(arr.splice(0, size));
}
for (let i = 0; i < (arrays.length - 1); i++) {
var sum = 0
for (let b = 0; b < 60; b++) {
sum += arrays[i][b]
}
let avg = sum/60
arr2.push(avg)
}
this just splits the array every 60 elements. Now I can calculate the average for every 60.
duplicate of How to split a long array into smaller arrays, with JavaScript
Thanks for the help!
I am a big fan of the functional programming library Ramda. (Disclaimer: I'm one of its authors.) I tend to think in terms of simple, reusable functions.
When I think of how to solve this problem, I think of it through a Ramda viewpoint. And I would probably solve this problem like this:
const avgHour = pipe(
splitEvery(60),
map(mean),
)
// some random data
const data = range(0, 7 * 24 * 60).map(_ => Math.floor(Math.random() * 20 + 10))
console.log(avgHour(data))
<script src="//cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.js"></script>
<script>const {pipe, splitEvery, map, mean, range} = R</script>
I think that is fairly readable, at least once you understand that pipe creates a pipeline of functions, each handing its result to the next one.
Now, there is often not a reason to include a large library like Ramda to solve a fairly simple problem. But all the functions used in that version are easily reusable. So it might make sense to try to create your own versions of these functions and keep them available to the rest of your application. In fact, that's how libraries like Ramda actually get built.
So here is a version that has simple implementations of those functions, ones you might place in a utility library:
const pipe = (...fns) => (x) => fns.reduce((v, f) => f(v), x)
const splitEvery = (n) => (xs) => {
let i = 0, a = []
while (i < xs.length) {a.push(xs.slice(i, i + n)); i += n}
return a
}
const map = (fn) => (xs) => xs.map(x => fn(x))
const sum = (xs) => xs.reduce((a, b) => a + b, 0)
const mean = (xs) => sum(xs) / (xs.length || 1)
const avgHour = pipe(
splitEvery(60),
map(mean)
)
const range = (lo, hi) => [...Array(hi - lo)].map((_, i) => lo + i)
// some random data
const data = range(0, 7 * 24 * 60).map(_ => Math.floor(Math.random() * 20 + 10))
console.log(avgHour(data))
You can reduce the data and group by hour, then simply map to get each hour's average. I'm using moment to parse the dates below, you can do that with whatever lib/js you prefer...
const arr = Array.from({length: 100}, () => ({time: moment().subtract(Math.floor(Math.random() * 10), 'hours'), value: Math.floor(Math.random() * 100)}));
const grouped = [...arr.reduce((a, b) => {
let o = a.get(b.time.get('hour')) || {value: 0, qty: 0};
a.set(b.time.get('hour'), {value: o.value + b.value, qty: o.qty + 1});
return a;
}, new Map)].map(([k, v]) => ({
[k]: v.value / v.qty
}));
console.log(grouped)
<script src="https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.24.0/moment-with-locales.min.js"></script>
By grouping and then reducing you can do this like following.
function groupBy(list, keyGetter) {
const map = {};
list.forEach((item) => {
const key = keyGetter(item);
if (!map[key]) {
map[key] = [item];
} else {
map[key].push(item);
}
});
return map;
}
const data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16];
const now = (new Date()).getTime();
const stepSize = 60000*1;
const withTime = data.reverse().map((x, i) => { return { time: new Date(now - stepSize * i), temp: x } });
const grouped = groupBy(withTime, x => new Date(x.time.getFullYear(), x.time.getMonth(), x.time.getDate(), x.time.getHours()).valueOf());
const avg = Object.entries(grouped).map((x) => {
return {
time: new Date(Number(x[0])),
temp: x[1].map(y => y.temp).reduce((acc, val) => acc + val) * (1.0 / x[1].length)
}
});
console.log(avg);
To measure the average I needed to split the array every 60 elements.
This is the solution I found
//Calculate the average of every 60 elements to get the average of an hour
var arr2: number[] = []
var arr: number[] = []
arr = Array.from(this.temps.data)
var arrays = [], size = 60;
while (arr.length > 0){
arrays.push(arr.splice(0, size));
}
for (let i = 0; i < (arrays.length - 1); i++) {
var sum = 0
for (let b = 0; b < 60; b++) {
sum += arrays[i][b]
}
let avg = sum/60
arr2.push(avg)
}
In the end, I think it was a mistake to take only the last elements of the array, because this is a better solution. But thanks for the help!
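A compact typed variant of the same chunk-and-average idea is sketched below. It differs from the loop above in two ways that are assumptions on my part rather than requirements from the question: it does not mutate the source array (slice instead of splice), and it averages a trailing partial chunk over its actual length instead of dropping it. The name chunkAverages is hypothetical.

```typescript
// Hypothetical helper: averages every `size` consecutive values.
// A final chunk shorter than `size` is averaged over its own length.
function chunkAverages(values: readonly number[], size = 60): number[] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error("size must be a positive integer");
  }
  const averages: number[] = [];
  for (let start = 0; start < values.length; start += size) {
    const chunk = values.slice(start, start + size); // copy, source stays intact
    const sum = chunk.reduce((acc, v) => acc + v, 0);
    averages.push(sum / chunk.length);
  }
  return averages;
}

console.log(chunkAverages([1, 2, 3, 4, 5], 2)); // small chunk size for illustration
```

With per-minute readings, calling chunkAverages(readings) with the default size of 60 would give one average per hour.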
