How to bucket JSON data for a range? - javascript

I'm new to programming so appreciate any help and patience here. :)
We have some data that we're trying to sort by buckets (days of the month in this case) to create smaller sets of data like so:
export const Weather = [
{
city: "STL",
month: "july",
daysbucket: "1 to 10",
percentages: {
sunny: "45.5",
rainy: "20.5",
cloudy: "10.8",
partlycloudy: "23.2",
},
},
Is there a better way than using a string like daysbucket: "1 to 10"?
The idea is to run some forecast simulations using a probability function that pulls the percentage of the past weather for a certain day without having to list these percentages for each day of the month. So far I planned to get the day of the month and then do some if statements to slot it into a string for 1-10, 11-20, etc. but wanted to see if there was a better way before I get too far. I have several data sets with a variety of buckets stored as strings but I also have control over the data so can change it as needed. All of the data is stored in MongoDB.
Thanks in advance!!!

To make calculations and comparisons with daysbucket easier, it is better to define it like this:
export const Weather = [
{
city: "STL",
month: "july",
daysbucket: {
from: 1,
to: 10
},
percentages: {
sunny: "45.5",
rainy: "20.5",
cloudy: "10.8",
partlycloudy: "23.2",
},
},
Having this structure you can compare it like:
if (day >= daysbucket.from && day <= daysbucket.to) { ... }
And so on. Note that to compare numbers, the values should be defined as numbers, not strings. If they are strings, convert them to numbers first (use parseInt() or Number()).
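A minimal sketch of the lookup, assuming buckets are stored with numeric from/to bounds and that both bounds are inclusive. The helper name findBucket is made up for illustration, the percentages are stored as numbers here rather than strings, and only the first bucket's values come from the question; the other two buckets are empty placeholders.

```javascript
// Hypothetical sample: only the first bucket's numbers come from the question.
// The other two buckets are placeholders to show the lookup.
const weather = [
  {
    city: "STL",
    month: "july",
    daysbucket: { from: 1, to: 10 },
    percentages: { sunny: 45.5, rainy: 20.5, cloudy: 10.8, partlycloudy: 23.2 },
  },
  { city: "STL", month: "july", daysbucket: { from: 11, to: 20 }, percentages: {} },
  { city: "STL", month: "july", daysbucket: { from: 21, to: 31 }, percentages: {} },
];

// Return the entry whose bucket contains `day` (bounds inclusive), or undefined.
function findBucket(data, city, month, day) {
  return data.find(
    (w) =>
      w.city === city &&
      w.month === month &&
      day >= w.daysbucket.from &&
      day <= w.daysbucket.to
  );
}

const entry = findBucket(weather, "STL", "july", 10);
console.log(entry.daysbucket); // the 1 to 10 bucket
```

With numeric bounds there is no need for a chain of if statements that maps a day to a string label; the same lookup works for any bucket sizes.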

I would use an object to do that. Something like:
daysbucket: {
min:1,
max: 10
}


Mongodb aggregation query by date difference

I am trying to make a cron job which syncs my documents. It should try to do it x amount of times but only after 2h have passed since the last try. On each document I have "lastSyncAt" field which I can use.
{
"_id" : ObjectId("628d8c4ddb65027a2cfd1019"),
"calculatedAt" : "",
"count" : 0,
"createdAt" : "2022-05-25 01:54:21",
"lastSyncAt" : "2022-05-25 03:54:21"
}
How should I approach this?
Should I get the lastSyncAt value in pipeline and calculate difference between currentDate? How do I get only the hours portion of that in my pipeline?
Should I convert lastSyncAt into a Unix timestamp, get currentDate as a Unix timestamp, subtract them, and check whether the difference is greater than 7200 seconds (2 hours)?
Or should I take another approach?
I'm not even sure what approach to take. Not looking for code but an idea how to handle this.
Thx
Update:
Thanks to @derek-menénedez I managed to get it working as shown below:
[
// Stage 1
{
$addFields: {
lastSyncAt: {
$dateDiff: {
startDate: {$dateFromString: {
dateString: "$lastSyncAt",
timezone: "Europe/Zagreb"
}},
endDate: "$$NOW",
unit: "minute",
timezone: "Europe/Zagreb"
}
}
}
},
// Stage 2
{
$match: {
lastSyncAt: {
$gt: 120
}
}
}
]
You can use the aggregation framework to achieve the things that you want:
https://mongoplayground.net/p/1RzPCYbeHEP
You can try to remove the projection on the example to validate the number of hours.
$dateFromString operator helps you to create a date from a string
$dateDiff operator helps you to extract the diff of two dates

Extract specific digits from a .txt file

I have been asked to count the number of tweets per hour by day (0 - 23) in a huge text file of random tweets. The date is not interesting, only the tweet per hour. I want to return them in a new array of objects. Each object should have properties hour and count like this:
{hour: x, count: y},
I've made a function where I'm declaring an empty array, in which I will put my data:
function(tweets) {
let result = [];
and I think I need to push them like this:
result.push({hour: x, count: y});
But I don't know how to extract the specific hour from my object (key and value).
In the huge raw data file, each tweet is logged with a date like this:
created_at: "30-06-2015 14:27",
Any suggestions or experience? I'm currently learning about regex and for loops. Should I use them in this code or is there a smarter way?
Edit: as you asked for more details:
The raw data are objects in an array with the following structure:
{
time: Date-object,
created_at: "30-06-2015 14:27",
fromUsername: "victor",
text: "asyl og integration",
lang: "da",
source: "Twitter for Android",
}
For extracting the text, I see a good answer here. Instead of console.log, parse each match and save it to your array.
As for the regexp, I think it should be something like
var re = /created_at: \"([^\"]*)\",/g;
What I would do is work from a different angle:
Create an object with a date-time key for the start of each hour that you care about. Presumably this is a limited time span, such as all tweets that happened before now.
So generate something that looks like this dynamically:
{
'2019-03-01T17:22:30Z': 0, // or simply '1552667443928'
'2019-03-01T18:22:30Z': 0,
'2019-03-01T19:22:30Z': 0,
'2019-03-01T20:22:30Z': 0,
...etc
}
Which you can do using current Date and then a loop to create additional previous date times:
const now = new Date()
// you can use a generator here or simply a while loop:
const dateTimes = {}
while(now > REQUIRED_DATE)
dateTimes[new Date(now.setHours(now.getHours() - 1))] = 0
Now you have an exhaustive list of all the hours.
Then, check if the given tweet is within that hour:
check if item.created_at < currentHourBeingLooked because you should loop through the Object.keys(dateTimes).
Then, loop through each item in your list and check if it fits that dateTime if so increment dateTimes[currentHour]++.
So, the hardest part will be converting created_at to a normal looking date time string:
const [datePortion, timePortion] = "30-06-2015 14:27".split(' ')
const [day, month, year] = datePortion.split('-')
const [hour, minute] = timePortion.split(':')
Now, with the day, month, year, hour, and minute, you can build a Date object in JavaScript (note that monthIndex is zero-based, so January is 0):
It follows the formula:
From MDN:
new Date(year, monthIndex [, day [, hours [, minutes [, seconds [, milliseconds]]]]]);
AKA:
new Date(year, monthIndex, day, hours, minutes, seconds);
So for December 17, 2019 @ 3:24am it'll be:
const date = new Date(2019, 11, 17, 3, 24, 0);
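Putting the split and the Date constructor together, a minimal sketch might look like the following. The function name parseCreatedAt is hypothetical, and it assumes every created_at value has the exact form "DD-MM-YYYY HH:mm" and should be read in the local time zone.

```javascript
// Parse a "DD-MM-YYYY HH:mm" string into a Date in the local time zone.
// Assumes the input always has exactly this shape (no validation here).
function parseCreatedAt(createdAt) {
  const [datePortion, timePortion] = createdAt.split(" ");
  const [day, month, year] = datePortion.split("-").map(Number);
  const [hour, minute] = timePortion.split(":").map(Number);
  // monthIndex is zero-based, so June ("06") becomes 5.
  return new Date(year, month - 1, day, hour, minute);
}

const parsed = parseCreatedAt("30-06-2015 14:27");
console.log(parsed.getHours()); // 14
```

Converting the pieces with Number before calling the constructor avoids relying on implicit string-to-number conversion, and subtracting 1 from the month is the step that is easiest to forget.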
I'll assume that you already know to use regex from the post pointed by Ralkov to get all of your created_at dates, and my answer will go from that.
You said the date is not important so once you have the string
'created_at: "30-06-2015 14:27"'
we need to get rid of everything except the hour. I did it by extracting substrings; feel free to try other approaches, this is just to get you started.
var date = obj.substr(obj.indexOf(' ') + 1);
var time = date.substr(date.indexOf(' ') + 1);
var hour = time.substr(0, time.indexOf(':'));
will get you the hour
"14"
Note that this only works for one day; you need some additional changes if you'd like to store tweet-hour counts for different days in the same data structure.
When you write your for loop, call the following function each time you find a tweet and have extracted its hour. It stores key-value pairs in a Map defined outside the function, creating a new pair if necessary or updating the existing one with the new tweet count.
function newTweet(hour, tweetsPerHour) {
var tweetsThisHour = tweetsPerHour.get(hour);
tweetsThisHour = tweetsThisHour === undefined ? 0 : tweetsThisHour;
tweetsPerHour.set(hour, ++tweetsThisHour);
console.log(tweetsThisHour)
}
complete code:
var obj = 'created_at: "30-06-2015 14:27"';
var date = obj.substr(obj.indexOf(' ')+1);
var time = date.substr(date.indexOf(' ')+1);
var hour = time.substr(0, time.indexOf(':'));
var tweetsPerHour = new Map();
newTweet(hour, tweetsPerHour); //this is the extracted hour
newTweet("16", tweetsPerHour); //you can try different hours as well
newTweet("17", tweetsPerHour);
function newTweet(hour, tweetsPerHour) {
var tweetsThisHour = tweetsPerHour.get(hour);
tweetsThisHour = tweetsThisHour === undefined ? 0 : tweetsThisHour;
tweetsPerHour.set(hour, ++tweetsThisHour);
console.log(hour + " tweet count: " + tweetsThisHour)
}
What the code does is store the hour and the tweet count as key-value pairs in the Map:
{"14" => 1, "16" => 1, "17" => 1}
For example, if you add "14" again, it updates to:
{"14" => 2, "16" => 1, "17" => 1}
Dig into JavaScript Map objects as well.
Your code flow is something like the following:
Read .txt file
loop through dates -> get hour from date -> newTweet(hour, tweetsPerHour).
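Since the question says the raw data is already an array of objects, the whole flow can also be sketched without a regex. This is only one possible shape: the function name countTweetsPerHour and the sample tweets are hypothetical, and it assumes every created_at value looks like "DD-MM-YYYY HH:mm".

```javascript
// Count tweets per hour of day (0-23), ignoring the date.
// Assumes each tweet has created_at in the form "DD-MM-YYYY HH:mm".
function countTweetsPerHour(tweets) {
  const counts = new Array(24).fill(0);
  for (const tweet of tweets) {
    const timePortion = tweet.created_at.split(" ")[1]; // e.g. "14:27"
    const hour = Number(timePortion.split(":")[0]); // e.g. 14
    counts[hour] += 1;
  }
  // Turn the counts array into the requested [{hour, count}, ...] shape.
  return counts.map((count, hour) => ({ hour, count }));
}

// Hypothetical sample input
const sampleTweets = [
  { created_at: "30-06-2015 14:27" },
  { created_at: "30-06-2015 14:59" },
  { created_at: "01-07-2015 09:03" },
];

const result = countTweetsPerHour(sampleTweets);
console.log(result[14]); // { hour: 14, count: 2 }
```

Using a fixed array of 24 counters means hours with no tweets still appear in the result with a count of 0.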

Get current working shift from array with Moment.js

Original question
Get nearest time in the past with Moment.js
Unfortunately the original question wasn't good enough for my use case. However, M. Mennan Kara's answer addresses exactly my original question, so do check it out.
Improved question with the case example can be found below.
Time is now 04:00 (using 24-hour clock). I'd like to parse string 22:00:00 to Moment.js object.
let parsed = moment('22:00:00', 'HH:mm:ss');
That worked like a charm. Unfortunately, the function returns the current day by default. So, my question is: isn't it possible to parse to the nearest time in the past?
Improved question
Get current working shift from array with Moment.js
Following is an example case about how it should work in my project. I have working shifts in array and want to save current shift in currentShift variable.
let currentShift = null;
let shifts = [
{ name: 'early', start: '06:00:00', end: '14:00:00' },
{ name: 'late', start: '14:00:00', end: '22:00:00' },
{ name: 'night', start: '22:00:00', end: '06:00:00' }
];
shifts.forEach(item => {
let start = moment(item.start, 'HH:mm:ss');
if (start <= moment()) {
currentShift = item;
}
});
How about comparing the parsed time with the current time and subtracting one day if it is after the current time, since you are looking for the nearest time in the past?
let parsed = moment('22:00:00', 'HH:mm:ss');
if (parsed.isAfter(moment())) {
parsed.subtract(1, 'days');
}
https://jsfiddle.net/y40gvsmo/7/
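For the improved question (finding the current shift, including the night shift that wraps past midnight), here is a sketch that swaps Moment.js for plain Date arithmetic on seconds since midnight. The helper names toSeconds and findCurrentShift are hypothetical, and it assumes a shift's start is inclusive and its end is exclusive.

```javascript
const shifts = [
  { name: "early", start: "06:00:00", end: "14:00:00" },
  { name: "late", start: "14:00:00", end: "22:00:00" },
  { name: "night", start: "22:00:00", end: "06:00:00" },
];

// Convert "HH:mm:ss" to seconds since midnight.
function toSeconds(hms) {
  const [h, m, s] = hms.split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

// Return the shift containing `date` (start inclusive, end exclusive), or null.
function findCurrentShift(shiftList, date = new Date()) {
  const now =
    date.getHours() * 3600 + date.getMinutes() * 60 + date.getSeconds();
  const match = shiftList.find((shift) => {
    const start = toSeconds(shift.start);
    const end = toSeconds(shift.end);
    return start < end
      ? now >= start && now < end // shift within a single day
      : now >= start || now < end; // shift that wraps past midnight
  });
  return match || null;
}

// 04:00 local time falls into the night shift.
console.log(findCurrentShift(shifts, new Date(2024, 0, 1, 4, 0, 0)).name); // "night"
```

Comparing times of day directly sidesteps the question of which calendar day a parsed "22:00:00" belongs to.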

Elastic Search Date Math Issue, unsure of bounds

I'm trying to query an amount of records from a database but am a little bit confused about date math. Here's an example of one of my blocks:
var searchParams = {
"query": {
"bool" : {
"must" : [
{ "term": { } },
{ "term": { } },
{ "range": {
"dateUTC": {
"gt": "now-7d/d",
"lt": "now/d",
"format": "basic_date_time"
}
}
}
]
}
}
};
Now if I wanted to query all the results from only today, would I use gt: now-1d/d, lt: now/d? Or would I use gt: now-2d/d, lt: now/d, like the edges around one day? The program I'm trying to make is meant to query all the results from today, yesterday, etc., from the second the day starts to midnight of that day. Would I have to switch my date math for that?
Thanks
I think this post answers it pretty clearly: https://discuss.elastic.co/t/several-date-math-questions/27453/3
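To answer your questions though:
"to query all the result from only today": gte: now/d, lt: now
from midnight 2 days ago to midnight today: gte: now-2d/d, lt: now/d
The /d here takes the date calculated in the formula and rounds it to the day, so if now is the 18th of July 00:31:43, now/d yields the 18th of July 00:00:00. Use gte rather than gt for the lower bound: with gt, Elasticsearch rounds a value such as now/d up to the end of that day instead, which would exclude the whole day.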
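To query "today, yesterday, etc." one whole day at a time, the range clause can be generated from a day offset. This is a sketch only: the helper name dayRange is hypothetical, the field name dateUTC is taken from the question, and it uses gte for the lower bound and lt for the upper bound so that each day runs from its first instant up to, but not including, the start of the next day.

```javascript
// Build a range clause covering one whole calendar day.
// daysAgo = 0 is today, 1 is yesterday, and so on.
function dayRange(field, daysAgo) {
  const lower = daysAgo === 0 ? "now/d" : `now-${daysAgo}d/d`;
  let upper;
  if (daysAgo === 0) {
    upper = "now+1d/d"; // start of tomorrow
  } else if (daysAgo === 1) {
    upper = "now/d"; // start of today
  } else {
    upper = `now-${daysAgo - 1}d/d`; // start of the following day
  }
  return { range: { [field]: { gte: lower, lt: upper } } };
}

console.log(JSON.stringify(dayRange("dateUTC", 1)));
// {"range":{"dateUTC":{"gte":"now-1d/d","lt":"now/d"}}}
```

The returned object can take the place of the range entry in the must array of the bool query.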

How to get the total month by given months

I have an array like this
var array = [
{ date:2014-11-11,
title:test },
{ date:2014-11-12,
title:test },
{ date:2014-11-13,
title:test },
…more
…more
{ date:2015-01-01
title:test},
{ date:2015-01-02
title:test},
…more
…more
{ date:2015-03-01
title:test}
]
My question is how to get the total number of months for each year.
For example, I need to have 2 months (nov to dec) in 2014 and 3 months (Jan to March) in 2015.
var firstYear = parseInt($filter('date')(array[0].date, 'yyyy'));
var lastYear = parseInt($filter('date')(array[array.length-1].date, 'yyyy'));
I am not sure what I can do next to get the total number of months for each year.
Can anyone help me with it? Thanks!
Your array syntax is not valid JavaScript. Do you really have date strings like:
date: '2014-11-11',
or is the value a Date object representing that date? Is the date local or UTC? Anyway, I'll assume you have strings and whether they are UTC or local doesn't matter.
My question is how to get the total month of each year. For example, I need to have 2 months (nov to dec) in 2014 and 3 months (Jan to March) in 2015.
I'm not exactly sure what you want; you should provide an example of your expected output. The following returns the month ranges (the span from the first to the last month present) in particular years. There is no conversion of Strings to Dates:
// Sample data
var array = [
{ date:'2014-11-11',
title:'test'},
{ date:'2014-11-12',
title:'test'},
{ date:'2014-11-13',
title:'test'},
{ date:'2015-01-01',
title:'test'},
{ date:'2015-01-02',
title:'test'},
{ date:'2015-03-01',
title:'test'}
];
And the function:
function getMonthCount2(arr) {
var yearsMonths = arr.map(function(v){return v.date.substr(0,7).split(/\D/)}).sort();
var monthRanges = {}
yearsMonths.forEach(function(v,i) {
if (monthRanges[v[0]]) {
monthRanges[v[0]] += v[1] - yearsMonths[--i][1];
} else {
monthRanges[v[0]] = 1;
}
});
return monthRanges;
}
console.log(getMonthCount2(array)); // {'2014': 1, '2015': 3} for the sample data above, which has no December entries
The above assumes valid input, you may want to put in a validation step to ensure the data is clean before passing it to the function.
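If what is wanted is the number of different months that actually occur in each year, rather than the span from the first to the last month, a Set per year is one way to do it. This is a sketch under that assumption: the function name countDistinctMonths and the sample data are hypothetical, and the dates are assumed to be "YYYY-MM-DD" strings.

```javascript
// Count how many different months appear in each year.
// Assumes date strings in "YYYY-MM-DD" form.
function countDistinctMonths(arr) {
  const monthsByYear = {};
  for (const item of arr) {
    const [year, month] = item.date.split("-");
    if (!monthsByYear[year]) {
      monthsByYear[year] = new Set();
    }
    monthsByYear[year].add(month);
  }
  // Replace each Set with its size to get a plain {year: count} object.
  const counts = {};
  for (const year of Object.keys(monthsByYear)) {
    counts[year] = monthsByYear[year].size;
  }
  return counts;
}

// Hypothetical sample input
const sample = [
  { date: "2014-11-11" },
  { date: "2014-12-02" },
  { date: "2015-01-01" },
  { date: "2015-03-01" },
];
console.log(countDistinctMonths(sample)); // { '2014': 2, '2015': 2 }
```

Note the difference in 2015: there are entries for January and March only, so this counts 2 months, whereas a first-to-last span would give 3.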
If you're dealing with dates and times, you should probably think about using a library, as there are many nuances that come with working with dates and times.
I just did something similar to this and solved it using Moment.js and the moment-range extension.
Looking at the docs for moment-range, it seems that you can do something like this:
var start = new Date(2012, 2, 1);
var end = new Date(2012, 7, 5);
var range1 = moment().range(start, end);
range1.by('months', function(moment) {
// Do something with `moment`
});
