I am trying to figure out a formula to calculate the urgency of a set of arbitrary tasks, based on the number of days until a 'deadline' and the % of the task already completed.
So far I have a 'function' which represents this:
U = ((dd * 25) - (100 - cp))
Where:
dd = Day difference from deadline to current date (in an integer value)
cp = current completion % (in an integer value - in increments of 5 currently)
This gives me a linear function, and the 25 in the function indicates a 25% per day progression of the task.
So that at any given date:
Where U <0 task is urgent
Where U =0 task is on schedule
Where U >0 task is ahead of schedule
(The actual display of whether a task is on schedule (within a range) would be handled separately.)
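As a minimal sketch, the formula above could be written in JavaScript as follows. The function name `urgency` and the `ratePerDay` parameter are assumptions made for illustration; the question only fixes the value 25.

```javascript
// Sketch of U = (dd * 25) - (100 - cp); names are illustrative, not from the question.
function urgency(dd, cp, ratePerDay = 25) {
  // dd: whole days until the deadline, cp: completion percentage (0-100)
  return dd * ratePerDay - (100 - cp);
}

console.log(urgency(4, 0));  // 0: four days left, nothing done, exactly on schedule
console.log(urgency(2, 25)); // -25: 75% remaining but only 50% achievable, so urgent
console.log(urgency(3, 50)); // 25: ahead of schedule
```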
Are there any other methods to calculate the urgency of a task, from the difference of two dates and weighted by a variable?
From current responses:
Using the start date, end date and current date differences along with completion % to calculate urgency
Possibly using a non-linear function to increase U when cp >75% and decrease U when cp < 75%. Are there any advantages for linear vs non-linear functions?
This will be used in MySQL & javascript, as I'd like a way to display how on track a task is using the U value. So finding a method to correctly (more so than my current method) calculate the value for U is what I'm attempting to do.
Solution
The solution I went with (based on marked solution):
((((((end_date - now) / (end_date - start_date)) * 100) * (100 - cp)) * 10) * -1)
Minor Changes made
Using the rule of three as a start, multiplied by 10 just to increase the values & create a wider range without needing to factor for float values too much.
Also multiplied by -1, this was so that completed tasks then give a negative number, while incomplete tasks show a higher number (makes sense: higher urgency of a task therefore a higher number)
I may in future add to this, adding a velocity for a task as suggested & also taking into account the number of people assigned to a given task.
This function is only going to be used as a rough guide to show someone which tasks (in a given list) they might need to do first.
Also, as I used this in MySQL, the function needed to be wrapped in an IFNULL (due to existing data in my case)
IFNULL( *function* ,-200)
This gives an initial arbitrary value of -200 if the result is NULL (as some tasks do not have a start/end date)
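For the JavaScript side, a rough equivalent of the formula and its IFNULL wrapper might look like the sketch below. The names `urgencyScore` and `fallback` are hypothetical, and treating a zero-length task like a missing date is an assumption that the original does not state.

```javascript
// Sketch of the chosen formula; `urgencyScore` and `fallback` are hypothetical names.
function urgencyScore(startDate, endDate, now, cp, fallback = -200) {
  // Mirror the IFNULL wrapper: tasks without a start or end date get the fallback.
  if (startDate == null || endDate == null) return fallback;
  const total = endDate - startDate; // works for Date objects or numeric timestamps
  if (total === 0) return fallback;  // assumption: zero-length tasks are treated like missing dates
  const remainingTime = (endDate - now) / total;
  return remainingTime * 100 * (100 - cp) * 10 * -1;
}

const start = new Date("2024-01-01T00:00:00Z");
const end = new Date("2024-01-11T00:00:00Z");
const now = new Date("2024-01-06T00:00:00Z");
console.log(urgencyScore(start, end, now, 50)); // -25000
console.log(urgencyScore(null, end, now, 50));  // -200
```

Subtracting two `Date` objects yields a difference in milliseconds, so the ratio is unit-free and plain numeric timestamps work as well.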
Thanks for the assistance & suggestions
Given that:
due is day difference from deadline to current date
estimated is the time needed for a task
done is the progress in percentage
This would be a simple rule of three:
var rest = estimated / 100 * (100 - done); // time still needed for the remaining work
var state;
if (due < rest) {
    state = 'behind';
} else if (due == rest) {
    state = 'on';
} else {
    state = 'ahead';
}
Note that very few tasks would be exactly "on schedule", because due and rest would have to match exactly. You could instead check against a range, such as rest < due + 0.5 && rest > due - 0.5, imitating a non-linear prioritization.
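The range check could be folded into the comparison roughly like this. The function name `scheduleState` and the `tolerance` parameter are assumptions for illustration; 0.5 is the value used in the example condition.

```javascript
// Sketch of the range-based check; `tolerance` is an assumed parameter.
function scheduleState(due, estimated, done, tolerance = 0.5) {
  const rest = estimated * (100 - done) / 100; // time still needed
  if (due <= rest - tolerance) return 'behind';
  if (due >= rest + tolerance) return 'ahead';
  return 'on'; // rest - tolerance < due < rest + tolerance
}

console.log(scheduleState(10.3, 20, 50)); // on
console.log(scheduleState(5, 20, 50));    // behind
console.log(scheduleState(15, 20, 50));   // ahead
```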
Related
I have an ordered data set of decimal numbers. This data is always similar, but not always the same. The expected data is a few (0 - 5) large numbers, followed by several (10 - 90) average numbers, then followed by smaller numbers. There are cases where a large number may be mixed into the average numbers. See the following arrays.
let expectedData = [35.267,9.267,9.332,9.186,9.220,9.141,9.107,9.114,9.098,9.181,9.220,4.012,0.132];
let expectedData = [35.267,32.267,9.267,9.332,9.186,9.220,9.141,9.107,30.267,9.114,9.098,9.181,9.220,4.012,0.132];
I am trying to analyze the data by getting the average without high numbers on front and low numbers on back. The middle high/low are fine to keep in the average. I have a partial solution below. Right now I am sort of brute forcing it but the solution isn't perfect. On smaller datasets the first average calculation is influenced by the large number.
My question is: Is there a way to handle this type of problem, which is identifying patterns in an array of numbers?
My algorithm is:
Get an average of the array
Calculate an above/below average value
Remove front (n) elements that are above average
Remove end elements that are below average
Recalculate average
In JavaScript I have: (this is partial leaving out below average)
let total= expectedData.reduce((rt,cur)=> {return rt+cur;}, 0);
let avg = total/expectedData.length;
let aboveAvg = avg*0.1+avg;
let remove = -1;
for(let k=0;k<expectedData.length;k++) {
if(expectedData[k] > aboveAvg) {
remove=k;
} else {
if(k==0) {
remove = -1;//no need to remove
}
//break because we don't want large values from middle removed.
break;
}
}
if(remove >= 0 ) {
//remove front above average
expectedData.splice(0,remove+1);
}
//remove belows
//recalculate average
I believe you are looking for an outlier detection algorithm. There are already a number of questions related to this on Stack Overflow.
However, each outlier detection algorithm has its own merits.
Here are a few of them
https://mathworld.wolfram.com/Outlier.html
High outliers are anything beyond the 3rd quartile + 1.5 * the inter-quartile range (IQR)
Low outliers are anything beneath the 1st quartile - 1.5 * IQR
Grubbs's test
You can check how it works for your expectations here
Apart from these two, there is a comparison calculator here. You can use it to try other algorithms as needed.
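A minimal sketch of the IQR rule described above, applied to the first sample array from the question. The helper names `quantile` and `removeIqrOutliers` are hypothetical, and the quantile uses linear interpolation, which is only one of several common definitions.

```javascript
// Linear-interpolation quantile on an ascending-sorted array (one common definition).
function quantile(sorted, q) {
  const pos = (sorted.length - 1) * q;
  const lower = Math.floor(pos);
  const upper = Math.ceil(pos);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (pos - lower);
}

// Hypothetical helper: drops values outside [Q1 - k*IQR, Q3 + k*IQR], keeping the original order.
function removeIqrOutliers(data, k = 1.5) {
  const sorted = [...data].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  return data.filter(x => x >= q1 - k * iqr && x <= q3 + k * iqr);
}

const expectedData = [35.267,9.267,9.332,9.186,9.220,9.141,9.107,9.114,9.098,9.181,9.220,4.012,0.132];
const kept = removeIqrOutliers(expectedData);
const average = kept.reduce((sum, x) => sum + x, 0) / kept.length;
console.log(kept.length, average.toFixed(4)); // 10 9.1866
```

Note that this filter also drops large values from the middle of the array (such as 30.267 in the second sample), so it does not match the requirement of keeping mid-array highs and lows.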
I would have tried a sliding window coupled with a hysteresis / band filter in order to detect the high-value peaks first.
Then, when your sliding windows advance, you can add the previous first value (which is now the last of analyzed values) to the global sum, and add 1 to the number of total values.
When you encounter a peak (=something that causes the hysteresis to move or overflow the band filter), you either remove the values (may be costly), or better, you set the value to NaN so you can safely ignore it.
You should keep computing a sliding average within your sliding window in order to be able to auto-correct the hysteresis/band filter, so it will reject only the start values of a peak (the end values are the start values of the next one), but once values are stabilized to a new level, values will be kept again.
The size of the sliding window sets how many consecutive "stable" values are needed before values are kept, or in other words how many unstable values are rejected when you reach a new level.
For that, you can check the mode of the values (rounded) and then take all the numbers in a certain range around the mode. That range can be taken from the data itself, for example by taking 10% of the max - min difference. That helps you to filter your data. You can select the percentage that fits your needs. Something like this:
let expectedData = [35.267,9.267,9.332,9.186,9.220,9.141,9.107,9.114,9.098,9.181,9.220,4.012,0.132];
expectedData.sort((a, b) => a - b);
/// Get the range of the data
const RANGE = expectedData[ expectedData.length - 1 ] - expectedData[0];
const WINDOW = 0.1; /// Window of selection 10% from left and right
/// Frequency of each number
let dist = expectedData.reduce((acc, e) => (acc[ Math.floor(e) ] = (acc[ Math.floor(e) ] || 0) + 1, acc), {});
let mode = +Object.entries(dist).sort((a, b) => b[1] - a[1])[0][0];
let newData = expectedData.filter(e => mode - RANGE * WINDOW <= e && e <= mode + RANGE * WINDOW);
console.log(newData);
I need to develop an algorithm that randomly selects values within user-specified intervals. Furthermore, these values need to be separated by a minimum user-defined distance. In my case the values and intervals are times, but this may not be important for the development of a general algorithm.
For example: A user may define three time intervals (0900-1200, 1200-1500; 1500-1800) upon which 3 values (1 per interval) are to be selected. The user may also say they want the values to be separated by at least 30 minutes. Thus, values cannot be 1159, 1201, 1530 because the first two elements are separated by only 2 minutes.
A few hundred (however many I am able to give) points will be awarded to the most efficient algorithm. The answer can be language agnostic, but answers either in pseudocode or JavaScript are preferred.
Note:
The number of intervals, and the length of each interval, are completely determined by the user.
The distance between two randomly selected points is also completely determined by the user (but must be less than the length of the next interval)
The user-defined intervals will not overlap
There may be gaps between the user-defined intervals (e.g., 0900-1200, 1500-1800, 2000-2300)
I already have the following two algorithms and am hoping to find something more computationally efficient:
Randomly select value in Interval #1. If this value is less than user-specified distance from the beginning of Interval #2, adjust the beginning of Interval #2 prior to randomly selecting a value from Interval #2. Repeat for all intervals.
Randomly select values from all intervals. Loop through array of selected values and determine if they are separated by user-defined minimum distance. If not (i.e., values are too close), randomly select new values. Repeat until valid array.
This works for me, and I'm currently not able to make it "more efficient":
function random(intervals, gap = 1){
if(!intervals.length) return [];
// ensure the ordering of the groups
intervals = intervals.sort((a,b) => a[0] - b[0])
// holds the generated values, one per interval
let res = []
for(let i = 0; i < intervals.length; i++){
let [min, max] = intervals[i]
// check if can exist a possible number
if(i < intervals.length - 1 && min + gap > intervals[i+1][1]){
throw new Error("invalid ranges and gap")
}
// if we can't create a number in the current section, try to generate another number from the previous
if( i > 0 && res[i-1] + gap > max){
// reset the max value for the previous interval to force the number to be smaller
intervals[i-1][1] = res[i-1] - 1
res.pop()
i-=2
}
else {
// set as min the lower between the min of the interval and the previous number generated + gap
if( i > 0 ){
min = Math.max(res[i-1] + gap , min)
}
// usual formula to get a random number in a specific interval
res.push(Math.round(Math.random() * (max - min) + min))
}
}
return res
}
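console.log(random([
    [900, 1200], // written as 900, not 0900: a leading zero is a legacy octal-style literal and a syntax error in strict mode
    [1200, 1500],
    [1500, 1800],
], 400))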
This works as follows:
1) Generate the first number.
2) Check whether the next number can be generated (for the gap rule):
- If it can, generate it and go back to point 2 (but with the following number).
- If it can't, I set the max of the previous interval to just below the number generated for it, and generate that number again (so that it generates a lower number).
I can't figure out what the complexity is, since there are random numbers involved, but it might happen that with 100 intervals, at the generation of the 100th random number, you see that you can't, and so in the worst case this might go back to generating everything from the first one.
However, every time it goes back, it shrinks the range of the intervals, so it will converge to a solution if one exists.
This seems to do the job. For explanations see comments in the code ...
Be aware that this code does not do any checks of your conditions, i.e. that the intervals do not overlap and that they are big enough to allow the mindist to be fulfilled. If the conditions are not met, it may generate erroneous results.
This algorithm allows the minimum distance between two values to be defined for each interval separately.
Also be aware that an interval limit like 900 in this algorithm does not mean 9:00 o'clock, but just the numeric value 900. If you want the intervals to represent times, you have to represent them as, for instance, minutes since midnight. I.e. 9:00 will become 540, 12:00 will become 720 and 15:00 will become 900.
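As a small illustration of that conversion, the helpers below turn a clock value written as HHMM into minutes since midnight and back. Both function names are hypothetical and are not part of the code in this answer.

```javascript
// Hypothetical helper: converts a clock value written as HHMM (e.g. 1530) to minutes since midnight.
function hhmmToMinutes(hhmm) {
  const hours = Math.floor(hhmm / 100);
  const minutes = hhmm % 100;
  return hours * 60 + minutes;
}

// Hypothetical helper: converts minutes since midnight back to a zero-padded "HHMM" string.
function minutesToHhmm(total) {
  const hours = Math.floor(total / 60) % 24;
  const minutes = total % 60;
  return String(hours).padStart(2, '0') + String(minutes).padStart(2, '0');
}

console.log(hhmmToMinutes(900));  // 540
console.log(hhmmToMinutes(1200)); // 720
console.log(hhmmToMinutes(1500)); // 900
console.log(minutesToHhmm(540));  // 0900
```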
EDIT
With the current edit it also supports wrap-overs at midnight (Although it does not support intervals or minimum distances of more than a whole day)
//these are the values entered by the user
//as intervals are non overlapping I interpret
//an interval [100, 300, ...] as 100 <= x < 300
//ie the upper limit is not part of that interval
//the third value in each interval is the minimum
//distance from the last value, ie [500, 900, 200]
//means a value between 500 and 900 and it must be
//at least 200 away from the last value
//if the upper limit of an interval is less than the lower limit
//this means a wrap-around at midnight.
//the minimum distance in the first interval is obviously 0
let intervals = [
[100, 300, 0],
[500, 900, 200],
[900, 560, 500]
]
//the total upper limit of an interval (for instance number of minutes in a day)
//length of a day. if you don't need wrap-arounds set to
//Number.MAX_SAFE_INTEGER
let upperlimit = 1440;
//generates a random value x with min <= x < max
function rand(min, max) {
return Math.floor(Math.random() * (max - min)) + min;
}
//holds all generated values
let vals = [];
let val = 0;
//Iterate over all intervals, to generate one value
//from each interval
for (let iv of intervals) {
//the next random must be greater than the beginning of the interval
//and if the last value is within range of mindist, also greater than
//lastval + mindist
let min = Math.max(val + iv[2], iv[0]);
//the next random must be less than the end of the interval
//if the end of the interval is less than the current min
//there is a wrap-around at midnight, thus extend the max
let max = iv[1] < min ? iv[1] + upperlimit : iv[1];
//generate the next val. if it reaches or exceeds upperlimit
//it's on the next day, thus remove a whole day
val = rand(min, max);
if (val >= upperlimit) val -= upperlimit;
vals.push(val);
}
console.log(vals)
As you may notice, this is more or less an implementation of your proposal #1, but I don't see any way of making this more "computationally efficient" than that. You can't get around selecting one value from each interval, and the most efficient way of always generating a valid number is to adjust the lower limit of the interval, if necessary.
Of course, with this approach, the selection of next number is always limited by the selection of the previous. Especially if the minimum distance between two numbers is near the length of the interval, and the previous number selected was rather at the upper limit of its interval.
This can simply be done by separating the intervals by the required number of minutes. However, there might be edge cases, such as a given interval being shorter than the separation or, even worse, two consecutive intervals together being shorter than the separation, in which case you can safely throw an error; for example, if 1500 - 900 < 30 had been true in the [[900,1200],[1200,1500]] case. So it is best to check this for each pair of consecutive tuples and throw an error if they don't satisfy it before trying any further.
Then it gets a little hairy. I mean probabilistically. A naive approach would choose a random value among [900,1200] and, depending on the result, would add 30 to it and limit the bottom boundary of the second tuple accordingly. Say the random number chosen among [900,1200] turns out to be 1190; then we will force the second random number to be chosen among [1220,1500]. This makes the second random choice dependent on the outcome of the first choice and, as far as I remember from probability lessons, this is no good. I believe we have to find all possible borders and make a random choice among them and then make two safe random choices, one from each range.
Another point to consider is that this might be a long list of tuples to start with. So we should take care not to limit the second tuple in each turn, since it will be the first tuple on the next turn and we would like to have it as wide as possible. So perhaps taking the minimum possible value from the first range (limiting the first range as much as possible) may turn out to be more productive than random tries, which might (most probably) cause a problem in further steps.
I can give you the code, but since you haven't shown any attempts, you will have to settle for this rod and go fish for yourself.
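The up-front feasibility check and the concern about not over-constraining later intervals can be combined in a single pass, sketched below. This is not the answerer's code: `pickSeparated` is a hypothetical helper that assumes integer values with inclusive bounds, and it does not produce a uniform distribution over all valid combinations.

```javascript
// Hypothetical helper: picks one integer per interval, each at least `gap` after the previous one.
function pickSeparated(intervals, gap, rng = Math.random) {
  // Copy and sort so the caller's array is left untouched.
  const sorted = intervals.map(([lo, hi]) => [lo, hi]).sort((a, b) => a[0] - b[0]);
  const n = sorted.length;
  // Backward pass: the latest value each interval may take so that all later intervals still fit.
  const latest = new Array(n);
  for (let i = n - 1; i >= 0; i--) {
    latest[i] = i === n - 1 ? sorted[i][1] : Math.min(sorted[i][1], latest[i + 1] - gap);
  }
  // Forward pass: draw each value between the earliest and latest feasible bounds.
  const result = [];
  let prev = -Infinity;
  for (let i = 0; i < n; i++) {
    const earliest = Math.max(sorted[i][0], prev + gap);
    if (earliest > latest[i]) throw new Error("intervals and gap cannot be satisfied");
    prev = earliest + Math.floor(rng() * (latest[i] - earliest + 1));
    result.push(prev);
  }
  return result;
}

// Times as minutes since midnight: 09:00-12:00, 12:00-15:00, 15:00-18:00, at least 30 minutes apart.
console.log(pickSeparated([[540, 720], [720, 900], [900, 1080]], 30));
```

Because every draw is capped by the backward pass, no backtracking or retry loop is needed, and an impossible configuration is reported with an error instead of looping.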
I need to calculate the percentile rank of a particular value against a large number of values filtered in various different ways. The data is all stored on Parse.com, which has a limitation of returning a maximum of 1000 rows per query. The number of values stored is likely to exceed well over 100,000.
By 'percentile rank', I mean I need to calculate the percentage of values that the provided value is greater than. I am not trying to calculate the value of a provided percentile. For example, given a list of values {20, 23, 24, 29, 30, 31, 35, 40, 40, 43} the percentile rank of the provided value 35 is 70%. The algorithm for this is simply the rank of the value / count of values * 100. Not sure if 'percentile rank' is the correct terminology for this.
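To make the two readings of this definition concrete, here is a small sketch using the example list (the function names are hypothetical). Counting only values strictly below 35 gives 60%, whereas the 70% figure corresponds to rank / count with the value itself included (rank 7 of 10).

```javascript
// Hypothetical helper: percentage of values strictly below `value`.
function percentBelow(values, value) {
  const below = values.filter(v => v < value).length;
  return below * 100 / values.length;
}

// Hypothetical helper: percentage of values at or below `value` (rank / count * 100).
function percentAtOrBelow(values, value) {
  const atOrBelow = values.filter(v => v <= value).length;
  return atOrBelow * 100 / values.length;
}

const values = [20, 23, 24, 29, 30, 31, 35, 40, 40, 43];
console.log(percentBelow(values, 35));     // 60
console.log(percentAtOrBelow(values, 35)); // 70
```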
I have considered a couple of different approaches to this. The first is to pull down the full list of values (into Parse Cloud) and then calculate the percentile rank from there, then filter the list and calculate again, repeating the last two steps as many times as required. The problem with this approach is it will not work once we reach 1000 values, which we can expect pretty quickly.
Another option, which is the best I can come up with so far, is to query the count of items, and the rank of the provided value. For example:
var rank_world_alltime = new Parse.Query("Values")
.lessThan("value", request.params.value) // Filters query to values less than the provided value, so counting this query will return the rank
.count();
var count_world_alltime = new Parse.Query("Values")
.count();
Parse.Promise.when(rank_world_alltime, count_world_alltime).then(function(rank, count) {
var percentile = rank / count * 100;
console.log("world_alltime_percentile = " + percentile);
});
This works well for a single calculation, but I need to perform multiple calculations, and this approach very quickly becomes a lot of queries. I expect to need to run about 15 calculations per call, which is 30 queries. All calculations need to complete in under 3 seconds before Parse terminates the job, and I am limited to 30 reqs/second, so this is very quickly going to become a problem.
Does anyone have any suggestions on how else I could approach this? I've thought about somehow pre-processing some of this but can't quite work out how to do so, as the filters will be based on time and location (city and country), so there are potentially a LOT of pre-calculations that will need to be run at regular intervals. The results do not need to be 100% accurate but something close.
I don't know much about Parse, but as far as I understand from what you say, it is some kind of cloud database thingy that holds your high scores and limits you to 1000 rows per query, 3 seconds per job, and 30 queries per second.
In order to have approximate calculations and halve the number of queries, I would first of all cache the totals (count_world_alltime, count_region,week, whatever), if you can save them somewhere locally. For numbers around 100K, just getting the order of magnitude (thus not the latest updated number) should be good enough to get a percentile.
Maybe you can get several counts per query. However, my lack of expertise in Parse/NoSQL stops me from being sure of this, so you'll have to check their documentation. If it is possible, then for the case where you need percentiles for a series of values all in the same category, I would:
Order the values, let's call them a,b,c,d,e (once ordered)
Get the number of values between the intervals [0,a] [a,b] [b,c] [c,d] [d,e]
Use the cached total to get the percentiles (where Nxy is the number of values in [x,y]) :
Pa = 100 * N0a / total
Pb = 100 * ( N0a + Nab ) / total
Pc = 100 * ( N0a + Nab + Nbc ) / total
and so on...
If you need a value ranked worldwide, the other per region, some per week others over all times, etc, this doesn't apply. In that case I don't think you can get below 1 query/number, with caching the totals.
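The cumulative-count formulas above amount to a running sum, which could be sketched as follows. The function name and both inputs are assumptions: `bucketCounts` stands for the per-interval counts (N0a, Nab, Nbc, ...) and `total` for the cached overall count.

```javascript
// Sketch: turns per-interval counts into percentiles using a cached total.
function cumulativePercentiles(bucketCounts, total) {
  let running = 0;
  return bucketCounts.map(count => {
    running += count;             // N0a, then N0a + Nab, then N0a + Nab + Nbc, ...
    return 100 * running / total; // Pa, Pb, Pc, ...
  });
}

console.log(cumulativePercentiles([100, 150, 250], 1000)); // [ 10, 25, 50 ]
```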
Is there a clever way to determine, say an array index, that falls within a given range? The application is similar to a playlist for a single video file with a set of from/to times that denote a "chapter".
i.e. Chapters:
00:01 - 00:30 : Call To Order
00:31 - 00:45 : Pledge of Allegiance
00:46 - 02:25 : Opening Remarks
02:26 - 32:07 : Old Business
etc., etc., etc.
I have a list of these items on the page, and as the player reports where in the video it is currently playing by returning the current timestamp, I need to use jQuery to highlight the LI of the "chapter" in which the current video timestamp falls. So if the video is currently at 1:15, that's "Opening Remarks", and the 3rd list item would be highlighted.
I've tried a number of approaches, but ultimately use PHP to write a huge series of IF/ELSEs because a playlist could have anywhere from 5 to 100 different Chapters in it and can be modified by the user at any time.
Ideally, I'd like an array using the Start time as the Key and chapter as the value, and a function that returns the first index that is >= the current timestamp. Is there any clever approach to accomplishing this? My way "works", but good God, it's inefficient, running through 100 if/elses 10 times per second.
P.S. I should mention that all values are actually in seconds, with the question using H:M:S for clarity. Ultimately, I'm trying to understand how to select an array index if it falls within a given range.
Something like this:
var chapters = {
    1: "callToOrder",
    31: "pledgeOfAllegiance",
    46: "openingRemarks",
    146: "oldBusiness"
};
function currentChapter(seconds) {
    // Find the latest chapter start that is not after the current time.
    var key, start, found = -Infinity;
    for (key in chapters) {
        start = Number(key); // object keys are strings, so compare them as numbers
        if (start <= seconds && start > found) {
            found = start;
        }
    }
    return (found === -Infinity) ? null : chapters[found];
}
It will run in linear time in the number of chapters. In practice, this should be acceptable. If it isn't, then you could replace chapters with an array of objects and perform a binary search.
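A minimal sketch of that binary-search variant, assuming the chapters are kept in an array sorted by start time. The names `chapterList` and `currentChapterIndex` are hypothetical.

```javascript
// Chapters sorted by start time in seconds (same data as the chapters object above).
const chapterList = [
  { start: 1, name: "callToOrder" },
  { start: 31, name: "pledgeOfAllegiance" },
  { start: 46, name: "openingRemarks" },
  { start: 146, name: "oldBusiness" },
];

// Returns the index of the last chapter whose start is <= seconds, or -1 if there is none.
function currentChapterIndex(list, seconds) {
  let lo = 0, hi = list.length - 1, found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (list[mid].start <= seconds) {
      found = mid;  // candidate; a later chapter may still qualify
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}

console.log(currentChapterIndex(chapterList, 75)); // 2 (openingRemarks)
console.log(currentChapterIndex(chapterList, 0));  // -1
```

The returned index maps directly onto the position of the list item to highlight, and each lookup takes a logarithmic number of comparisons.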
I'm in the process of coding an application that does the following:
Generates a random number with 4 digits.
Changes it once per calendar day.
Won't change that full day. Only once in a day.
I tried:
function my_doubt()
{
var place = document.getElementById("my_div")
place.innerHTML=Math.floor((Math.random()*100)+1);
}
I'm getting a random number with Math.random(). However, I'm rather clueless about how to generate a different number for each day. What are some common approaches for tackling this problem?
Note: It doesn't have to be really random. A pseudo-random number is also OK.
You need to seed the random number generator with a number derived from the current date, for example "20130927" for today.
You haven't been clear about your requirements, so I don't know how random you need (do you have requirements for how uniform of a distribution you need?).
This will generate a random looking 4 digit number which may be good enough for your requirements, but if you perform an analysis you'll find the number isn't actually very random:
function rand_from_seed(x, iterations){
iterations = iterations || 100;
for(var i = 0; i < iterations; i++)
x = (x ^ (x << 1) ^ (x >> 1)) % 10000;
return x;
}
var random = rand_from_seed(~~((new Date)/86400000)); // Seed with the epoch day.
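As an alternative sketch of the same seeding idea, the date can be turned into a number such as 20130927 and scrambled by a small seeded generator. The generator below is the widely circulated mulberry32 routine, swapped in here for illustration instead of the function above; `dailyNumber` is a hypothetical helper, and using the UTC date and the range 1000-9999 are assumptions.

```javascript
// mulberry32: a small seeded 32-bit generator, used here only to scramble the date seed.
function mulberry32(seed) {
  let a = seed | 0;
  return function () {
    a = (a + 0x6D2B79F5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Hypothetical helper: a four-digit number (1000-9999) that stays fixed for a given UTC date.
function dailyNumber(date = new Date()) {
  // Seed of the form YYYYMMDD, e.g. 20130927.
  const seed = date.getUTCFullYear() * 10000 + (date.getUTCMonth() + 1) * 100 + date.getUTCDate();
  return 1000 + Math.floor(mulberry32(seed)() * 9000);
}

console.log(dailyNumber());
```

Every client that evaluates this on the same UTC date gets the same number, with no storage involved, but anyone who reads the script can predict future values.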
Now that your question is a bit more reasonable, clearer and nicer in tone, I can give you a way to get the same result on the client side. However, as others mentioned, you probably want to keep the number on the server to ensure consistency.
var oneDayInMs = 1000*60*60*24;
var currentTimeInMs = new Date().getTime(); // milliseconds since the epoch (UTC)
var timeInDays = Math.floor(currentTimeInMs / oneDayInMs);
var numberForToday = timeInDays % 10000; // 0..9999
console.log(numberForToday);
// zero-filling of numbers less than four digits might be optional for you
// zero-filled value will be a string to maintain its leading 0s
var fourDigitNumber = numberForToday.toString();
while(fourDigitNumber.length < 4)
{
    fourDigitNumber = "0" + fourDigitNumber;
}
console.log(fourDigitNumber);
// remember that this number increases by one each day and repeats after 10000 days
1) Create a random number in JavaScript.
2) Store it in a cookie that will expire after one day.
3) Get the value from the cookie; if it does not exist, go to 1.
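A minimal sketch of these three steps follows. The cookie name `dailyNumber`, the function name, and the `doc` parameter (which defaults to the browser's `document` and exists only so the logic can be exercised outside a browser) are all assumptions. Note that `max-age=86400` expires 24 hours after creation rather than at midnight, so the number changes once per 24-hour period, not strictly once per calendar day.

```javascript
// Hypothetical helper: returns the stored four-digit number, creating it if the cookie is missing.
function dailyNumberFromCookie(doc = document) {
  // Step 3: try to read the value back from the cookie string.
  const match = /(?:^|;\s*)dailyNumber=(\d{4})(?:;|$)/.exec(doc.cookie);
  if (match) return match[1];
  // Step 1: create a random four-digit number (1000-9999).
  const value = String(Math.floor(Math.random() * 9000) + 1000);
  // Step 2: store it in a cookie that expires after one day (86400 seconds).
  doc.cookie = "dailyNumber=" + value + "; max-age=86400; path=/";
  return value;
}

// In a browser: console.log(dailyNumberFromCookie());
```

Unlike a date-seeded approach, the value is different for every browser, because each visitor draws their own number.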