MongoDB aggregate/group -> get amount of single active users by date - javascript

I'm trying to aggregate over saved user sessions and get the number of unique active visitors per date within a date range.
I have a session model which contains these properties:
{
'environment' : ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx"),
'created' : ISODate("2021-01-05T22:02:25.757Z"),
'visitor' : ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx")
}
The result should look like this:
[
{
'date' : '2021-01-03',
'count' : 5 // this is the number of unique visitors. If two documents have the same date and visitor id, only one should be counted.
},
{
'date' : '2021-01-05',
'count' : 15
},
{
'date' : '2021-01-06',
'count' : 11
},
...etc...
]
This is the last pipeline I tried which of course is wrong:
const data = await Session.aggregate([
{
'$match': {
environment : ObjectID( args.environment ),
created : { '$gte': new Date( args.start ), '$lte': new Date( args.end ) }
}
},
{
$addFields:{
createdDate:{
$dateFromParts:{
year:{
$year:"$created"
},
month:{
$month:"$created"
},
day:{
$dayOfMonth : "$created"
}
}
}
}
},
{
$group:{
_id:{
date:"$createdDate",visitor:"$visitor"
},
count:{$sum:1}
}
},
{
$project:{
_id:0,
date:"$_id.date",
count:"$count",
}
}
])
I tried a few pipelines of my own and a few combinations found on SO, but without much success so far.
Help would be very much appreciated.

I think what you are searching for is the $addToSet operator.
Returns an array of all unique values that results from applying an expression to each document in a group of documents that share the same group by key.
Docs: https://docs.mongodb.com/manual/reference/operator/aggregation/addToSet/
You just need to group by day and add each visitor id to the set: ids that are already in the set are not added again. After that you only need to count the elements in that set with $size.
const data = await Session.aggregate([
{
$match: {
environment : ObjectID( args.environment ),
created : { '$gte': new Date( args.start ), '$lte': new Date( args.end ) }
}
},
{
$group: {
_id: {
$dateToString: {
format : "%Y-%m-%d",
date : "$created"
},
},
visitors: { $addToSet: "$visitor" }
}
},
{
$project: {
_id: 0,
date: "$_id",
count: { $size: "$visitors" }
}
}
])
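For intuition, the $group / $addToSet / $size combination above computes the same thing as the following in-memory sketch. The helper name countUniqueVisitorsPerDay and the sample data are made up for illustration; days are bucketed in UTC, which is also what $dateToString does when no timezone is given.

```javascript
// In-memory equivalent of: $group by day + $addToSet visitor + $size.
// Hypothetical helper for illustration, not part of MongoDB or Mongoose.
function countUniqueVisitorsPerDay(sessions) {
  const visitorsByDay = new Map(); // "YYYY-MM-DD" -> Set of visitor ids
  for (const { created, visitor } of sessions) {
    const day = created.toISOString().slice(0, 10); // UTC date part
    if (!visitorsByDay.has(day)) visitorsByDay.set(day, new Set());
    visitorsByDay.get(day).add(String(visitor)); // a Set ignores duplicates
  }
  return [...visitorsByDay.entries()]
    .map(([date, visitors]) => ({ date, count: visitors.size }))
    .sort((a, b) => a.date.localeCompare(b.date));
}

// Made-up sample: visitor "a" has two sessions on 2021-01-03.
const sampleSessions = [
  { created: new Date("2021-01-03T10:00:00Z"), visitor: "a" },
  { created: new Date("2021-01-03T18:30:00Z"), visitor: "a" },
  { created: new Date("2021-01-03T19:00:00Z"), visitor: "b" },
  { created: new Date("2021-01-05T22:02:25.757Z"), visitor: "a" },
];
const perDay = countUniqueVisitorsPerDay(sampleSessions);
// perDay: [{ date: "2021-01-03", count: 2 }, { date: "2021-01-05", count: 1 }]
```

Unlike this sketch, $group does not guarantee any output order, so a { $sort: { date: 1 } } stage after $project is worth adding if the result has to be chronological.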

You can use a $project stage with $dateToString, which also lets you specify a timezone if needed:
const data = await Session.aggregate([
{
'$match': {
environment : ObjectID( args.environment ),
created : { '$gte': new Date( args.start ), '$lte': new Date( args.end ) }
}
},
{
$project:{
visitor: 1,
createdDate: { $dateToString: { format: "%Y-%m-%d", date: "$created" } },
}
},
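{
$group:{
_id:{
date:"$createdDate",visitor:"$visitor"
}
}
},
{
// each document is now one unique (date, visitor) pair, so counting per date gives the unique visitors
$group:{
_id:"$_id.date",
count:{$sum:1}
}
},
{
$project:{
_id:0,
date:"$_id",
count:1
}
}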
])
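As a sketch of the timezone option mentioned above: $dateToString accepts a timezone field (MongoDB 3.6 and later), so sessions can be bucketed by local calendar day instead of UTC. The helper name buildDayProjection and the zone "Europe/Berlin" are assumptions chosen for the example.

```javascript
// Builds a $project stage that buckets `created` by calendar day in the given
// IANA timezone. Hypothetical helper; the zone below is only an example.
function buildDayProjection(timezone) {
  return {
    $project: {
      visitor: 1,
      createdDate: {
        $dateToString: { format: "%Y-%m-%d", date: "$created", timezone },
      },
    },
  };
}

const projectStage = buildDayProjection("Europe/Berlin");
```

The returned object can replace the $project stage in the pipeline above. A session created at 23:30 UTC would then count towards the next day in Berlin.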

Related

Mongoose: insert object into a sub-document array

I am trying to insert a time object into the times array for a specific activity name for a specific user. For example, if the user was "someuser" and I wanted to add a time to the times for guitar I am unsure as to what to do.
{
username: "someuser",
activities: [
{
name: "guitar",
times: []
},
{
name: "code",
times: []
}
]
}, {
username: "anotheruser",
activities: []
}
This is currently the function that I have, I cannot figure out what I am doing wrong, any help would be greatly appreciated:
function appendActivityTime(user, activityName, newRange) {
User.updateOne(
{username: user, 'activities.name': activityName},
{ $push: {'activities.$.times': {newRange}},
function (err) {
if (err) {
console.log(err);
} else {
console.log("Successfully added time range: " + newRange);
}
}}
);
}
appendActivityTime("someuser", "guitar", rangeObject);
I tried your approach and it worked for me:
db.getCollection("test").updateOne(
{ username: "someuser", "activities.name": "guitar" },
{ $push: { "activities.$.times": { from: ISODate(), to: ISODate() } } } // ISODate() is a mongo shell helper; in Node.js use Date objects
)
results:
{
"_id" : ObjectId("5f6c384af49dcd4019982b2c"),
"username" : "someuser",
"activities" : [
{
"name" : "guitar",
"times" : [
{
"from" : ISODate("2020-09-24T06:15:03.578+0000"),
"to" : ISODate("2020-09-24T06:15:03.578+0000")
}
]
},
{
"name" : "code",
"times" : [
]
}
]
}
What I would suggest using instead is arrayFilters: they are much more precise, and once you get used to them they become very handy.
If you are not confident updating nested documents, let Mongoose build the query:
let document = await Model.findOne ({ });
document.activities = new_object;
await document.save();
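A minimal sketch of the arrayFilters variant suggested above, assuming MongoDB 3.6 or later. The helper names and the identifier act are made up for this example. Compared with the function in the question, the range is pushed directly rather than wrapped as { newRange }, and the second argument contains only the update operators.

```javascript
// Builds the arguments for Model.updateOne() using the filtered positional
// operator $[<identifier>]. `act` is an arbitrary identifier for this sketch.
function buildAppendTimeUpdate(username, activityName, newRange) {
  return {
    filter: { username },
    update: { $push: { "activities.$[act].times": newRange } },
    options: { arrayFilters: [{ "act.name": activityName }] },
  };
}

// Hypothetical usage; `User` is assumed to be the Mongoose model from the question.
async function appendActivityTime(User, username, activityName, newRange) {
  const { filter, update, options } = buildAppendTimeUpdate(
    username,
    activityName,
    newRange
  );
  return User.updateOne(filter, update, options);
}
```

Called as appendActivityTime(User, "someuser", "guitar", rangeObject), this should push rangeObject onto the times array of every activity named "guitar" for that user.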

How to work with date conditions using mongodb query operators?

I have a mongo db model with the name DrSlots. One of the fields in the model is slots which is as follows
slots: [
{
slot: {
start: {
type: Date,
},
end: {
type: Date,
},
},
status: {
type: String,
},
},
],
Now I want to find the slots based on certain conditions. Firstly the start time should be greater or equal to the start time provided by the user and the end time should be lesser or equal to the end time provided by the user in the same document. For this reason, I wrote the following query which for some reason is not executing correctly.
const slots = await DrSlots.findOne({
$and: [
{ doctor: req.params.doctorId },
{ dateOfAppointment: params.date },
{
"slots.slot": {
start: { $gte: params.start },
end: { $lte: params.end },
},
},
],
});
I am not getting correct results.
Secondly, if params.start or params.end is not provided by the user, the query should not check it. How would I implement this? TIA
To find the slots between start and end, you can use $elemMatch. Because start and end are nested under slot in each array element, the conditions need the slot. prefix:
$and: [
...,
{
slots: {
$elemMatch: {
"slot.start": { $gte: params.start },
"slot.end": { $lte: params.end },
}
}
}
]
As also pointed out by @Taplar in the comments.
Reference: https://docs.mongodb.com/manual/reference/operator/query/elemMatch/
// You can build the query conditionally.
let query = [
{ doctor: req.params.doctorId },
{ dateOfAppointment: params.date }
];
// Only add the slot condition when both params.start and params.end are provided.
if (params.start && params.end) {
query.push({
// $elemMatch must be applied to the slots array itself; start and end live under slot
slots: {
$elemMatch: {
"slot.start": { $gte: params.start },
"slot.end": { $lte: params.end },
}
}
})
}
const slots = await DrSlots.findOne({
$and: query
});
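For the second part of the question (skip a bound when it is not provided), here is a minimal sketch that treats start and end independently. The helper name buildSlotFilter and its parameter object are assumptions for the example; the field names come from the schema in the question.

```javascript
// Builds a findOne() filter in which the start and end bounds are optional.
// Hypothetical helper for illustration.
function buildSlotFilter({ doctorId, date, start, end }) {
  const filter = { doctor: doctorId, dateOfAppointment: date };

  // Collect only the conditions whose parameter was actually supplied.
  const slotConditions = {};
  if (start) slotConditions["slot.start"] = { $gte: start };
  if (end) slotConditions["slot.end"] = { $lte: end };

  // $elemMatch requires both conditions to hold for the same array element.
  if (Object.keys(slotConditions).length > 0) {
    filter.slots = { $elemMatch: slotConditions };
  }
  return filter;
}
```

Usage would look like DrSlots.findOne(buildSlotFilter({ doctorId: req.params.doctorId, date: params.date, start: params.start, end: params.end })). Note that findOne still returns the whole document with all of its slots; the filter only decides whether the document matches.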

Compare Dates from arrays of different objects in aggregation

In my project, users complete combinations of courses (called sessions). Playing a course is called an attempt. During an attempt, users can close it and come back later (so we keep a timelog object).
I have a request from the client that needs to return, for each session, the users (and their attempts) who have played all or part of their session during a certain timeframe.
"During a certain timeframe" means that the client sends a begin and end date, and we count a user for a specific session if:
- the first attempt began before the end of the timeframe => the started of the first timelog of the first attempt < ending date
- the last attempt finished after the beginning of the timeframe => the ended of the last timelog of the last attempt > starting date
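The two conditions above amount to an interval-overlap test. As an in-memory sketch (the function name and sample data are made up, and it takes the earliest started and latest ended over all timelog entries, which equals "first of the first" and "last of the last" as long as attempts and timelogs are stored chronologically):

```javascript
// attempts: array of { timelog: [{ started: Date, ended: Date }, ...] }
// Returns true when the user's activity overlaps [frameStart, frameEnd].
function playedDuringTimeframe(attempts, frameStart, frameEnd) {
  const starts = attempts.flatMap((a) => a.timelog.map((t) => t.started.getTime()));
  const ends = attempts.flatMap((a) => a.timelog.map((t) => t.ended.getTime()));
  if (starts.length === 0) return false; // no timelog at all

  const begin = Math.min(...starts); // started of the first timelog
  const end = Math.max(...ends); // ended of the last timelog
  return begin < frameEnd.getTime() && end > frameStart.getTime();
}

// Made-up sample based on the timelog entry shown in the question.
const exampleAttempts = [
  {
    timelog: [
      {
        started: new Date("2018-09-06T15:31:49.163Z"),
        ended: new Date("2018-09-06T15:32:03.935Z"),
      },
    ],
  },
];
const inSeptember = playedDuringTimeframe(
  exampleAttempts,
  new Date("2018-09-01T00:00:00Z"),
  new Date("2018-10-01T00:00:00Z")
);
```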
Here is an example of an attempt object (the only one we need to use here):
{
"_id" : ObjectId("5b9148650ab5f43b5e829a4b"),
"index" : 0,
"author" : ObjectId("5acde2646055980a84914b6b"),
"timelog" : [
{
"started" : ISODate("2018-09-06T15:31:49.163Z"),
"ended" : ISODate("2018-09-06T15:32:03.935Z")
},
...
],
"session" : ObjectId("5b911d31e58dc13ab7586f9b")}
My idea was to make an aggregate on the attempts, to group those using author and session as an _id for the $group stage, and to push all the attempts of the user for this particular session into an array userAttempts.
Then to add an $addFields stage to retrieve the started field of the first timelog of the first attempt and the ended field of the last timelog of the last attempt.
And finally to $filter or $match using those new fields.
Here is my aggregate:
const newDate = new Date()
_db.attempts.aggregate([
{ $match: {
author: { $in: programSessionsData.users },
$or: [{ programSession: { $in: programSessionIds } }, { oldTryFor: { $in: programSessionIds } }],
globalTime: $ex,
timelog: $ex }
},
{
$group: {
_id: {
user: "$author",
programSession: "$programSession"
},
userAttempts: { $push: { attemptId: "$_id", lastTimelog: { $arrayElemAt: ["$timelog", -1] }, timelog: "$timelog" } }
}
},
{
$addFields: { begin: { $reduce: {
input: "$userAttempts",
initialValue: newDate,
in: {
$cond: {
if: { $lt: ["$$this.timelog.0.started", "$$value"] },
then: "$$this.timelog.0.started",
else: "$$value"
} }
} } }
}
I also tried this for the addFields stage:
{
$addFields: { begin: { $reduce: {
input: "$userAttempts",
initialValue: newDate,
in: { $min: ["$$this.timelog.0.started", "$$value"] }
} } }
}
However, begin is an empty array every time.
I do not really know how I can extract those two dates, or compare dates with each other.
Note: the end one is more difficult, which is why I first have to extract lastTimelog. If you have another method, I would gladly take it.
Also, this code runs on a Node server, so I cannot use ISODate, and the MongoDB version used is 3.6.3.
After playing with aggregate a bit, I came up with two solutions:
Solution 1
_db.attempts.aggregate([
{ $match: query }, // query stands for the same filter object as in the question's $match stage
{
$group: {
_id: {
user: "$author",
programSession: "$programSession"
},
userAttempts: { $push: { attemptId: "$_id", timelog: "$timelog" } }
}
}, {
$addFields: {
begin: { $reduce: {
input: "$userAttempts",
initialValue: newDate,
in: { $min: [{ $reduce: {
input: "$$this.timelog",
initialValue: newDate,
in: { $min: ["$$this.started", "$$value"] }
} }, "$$value"] }
} },
end: { $reduce: {
input: "$userAttempts",
initialValue: oldDate,
in: { $max: [{ $reduce: {
input: "$$this.timelog",
initialValue: oldDate,
in: { $max: ["$$this.ended", "$$value"] }
} }, "$$value"] }
} }
}
},
{
$match: {
begin: { $lt: req.body.ended },
end: { $gt: req.body.started }
}
}
], { allowDiskUse: true });
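newDate is today and oldDate is an arbitrary date in the past.
I had to chain two $reduce expressions because "$$this.timelog.0.started" always returned nothing. The likely reason is that aggregation expressions, unlike query filters, do not treat a numeric path component as an array index: "timelog.0" looks for a field named 0 inside each timelog element, which yields an empty array.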
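One way to avoid the nested $reduce is positional access with $arrayElemAt, which is available in MongoDB 3.6. The following is only a sketch: buildBeginStage and the variable name first are made up, and the stage has not been run against the poster's data.

```javascript
// Expression for "started of the first timelog entry" of the attempt that
// $reduce is currently visiting ($$this). $arrayElemAt does the indexing.
const firstStarted = {
  $let: {
    vars: { first: { $arrayElemAt: ["$$this.timelog", 0] } },
    in: "$$first.started",
  },
};

// Builds an $addFields stage computing `begin` with a single $reduce.
// Hypothetical helper; initialDate plays the role of newDate in the answer.
function buildBeginStage(initialDate) {
  return {
    $addFields: {
      begin: {
        $reduce: {
          input: "$userAttempts",
          initialValue: initialDate,
          in: { $min: [firstStarted, "$$value"] },
        },
      },
    },
  };
}

const beginStage = buildBeginStage(new Date());
```

The end field can be built the same way with index -1, the ended field and $max.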
Solution 2
_db.attempts.aggregate([
{ $match: query }, // query stands for the same filter object as in the question's $match stage
{
$addFields: {
firstTimelog: { $arrayElemAt: ["$timelog", 0] },
lastTimelog: { $arrayElemAt: ["$timelog", -1] }
}
},
{
$group: {
_id: {
user: "$author",
programSession: "$programSession"
},
begin: { $min: "$firstTimelog.started" },
end: { $max: "$lastTimelog.ended" },
userAttempts: { $push: { attemptId: "$_id", timelog: "$timelog"} }
}
},
{
$match: {
begin: { $lt: req.body.ended },
end: { $gt: req.body.started }
}
}
], { allowDiskUse: true });
This one is a lot more straightforward and seems simpler, but oddly enough, in my testing Solution 1 is always quicker, at least with the object distribution in my project.

Access object in object for the next date

I know this has been asked before, but I can't seem to find the answer: how do I access the data inside dataevent? I want to show data for upcoming dates in the JadwalBooking collection.
Schema:
"keterangan" : "keterangan di rubah",
"dataevent" : {
"time_start" : 60,
"time_end" : 660,
"_id" : ObjectId("5b3da607acddef1c24317dd0"),
"name" : "event 1",
"description" : "lorem ipsum, lorem ipsum",
"date" : ISODate("2018-11-25T00:00:00.000Z")
}
Query:
const data = await JadwalBooking.aggregate([
{
$match: {
dataevent: {
$elemMatch: {
date: {
$gte: new Date(new moment().format("YYYY-MM-DD")),
}
}
}
}
},
{
$project:
{
_id: 1,
dataevent: 1,
keterangan: 1,
}
},
{
$sort: { date: 1 }
}
]);
You need to use dot notation for both the query and the sort on dataevent.date:
const data = await JadwalBooking.aggregate([
{
$match: {
"dataevent.date": {
$gte: new Date(new moment().format("YYYY-MM-DD"))
}
}
},
{
$project:
{
_id: 1,
dataevent: 1,
keterangan: 1,
}
},
{
$sort: { "dataevent.date": 1 }
}
]);
You don't need $elemMatch in your case. $elemMatch is for array fields: it matches documents in which at least one element of an array of objects satisfies all the given conditions. Here dataevent is a single embedded object, not an array.
In your case a simple query with dot notation will work.
Try this:
const data = await JadwalBooking.aggregate([
{
$match: {
"dataevent.date": {
$gte: new Date(new moment().format("YYYY-MM-DD"))
}
}
},
{
$project:
{
_id: 1,
dataevent: 1,
keterangan: 1,
}
},
{
$sort: { "dataevent.date": 1 }
}
]);
Since the question is not specific to aggregation, a plain find with a sort works as well:
db.collection
.find({"dataevent.date" : {$gt : new Date(new moment().format("YYYY-MM-DD"))}})
.sort({"dataevent.date": 1})
One more thing:
Based on your schema, you don't really need $project either, since you are retrieving the whole document.
Note: $elemMatch is for arrays; for an embedded object you need to use dot notation.
const data = await JadwalBooking.aggregate([
{
$match: {
"dataevent.date": {
$gte: new Date(new moment().format("YYYY-MM-DD"))
}
}
},
{
$sort: { "dataevent.date": 1 }
}
]);
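If pulling in moment only for the date boundary feels heavy, the lower bound can also be built with plain Date. This is a sketch under one stated assumption: "today" is taken as the current UTC calendar day, whereas moment().format("YYYY-MM-DD") uses the server's local date. The helper names are made up.

```javascript
// Midnight (UTC) of the day that `now` falls on.
function startOfUtcDay(now = new Date()) {
  return new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate())
  );
}

// Hypothetical helper building the same $match/$sort pipeline as above.
function buildUpcomingEventsPipeline(now = new Date()) {
  return [
    { $match: { "dataevent.date": { $gte: startOfUtcDay(now) } } },
    { $sort: { "dataevent.date": 1 } },
  ];
}
```

The pipeline would then be passed as JadwalBooking.aggregate(buildUpcomingEventsPipeline()).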

MongoDB Aggregation Framework - How To Do Multiple $group Queries

I have the following MongoDB aggregation query that finds all records within a specified month, $groups the records by day, and then returns an average price for each day. I would also like to return a price average for the entire month. Can I do this by using multiple $groups, if so, how?
PriceHourly.aggregate([
{ $match: { date: { $gt: start, $lt: end } } },
{ $group: {
_id: "$day",
price: { $avg: '$price' },
system_demand: { $avg: '$system_demand'}
}}
], function(err, results){
results.forEach(function(r) {
r.price = Helpers.round_price(r.price);
r.system_demand = Helpers.round_price(r.system_demand);
});
console.log("Results Length: "+results.length, results);
res.jsonp(results);
}); // PriceHourly();
Here is my model:
// Model
var PriceHourlySchema = new Schema({
created: {
type: Date,
default: Date.now
},
day: {
type: String,
required: true,
trim: true
},
hour: {
type: String,
required: true,
trim: true
},
price: {
type: Number,
required: true
},
date: {
type: Date,
required: true
}
},
{
autoIndex: true
});
The short answer is "What is wrong with just expanding your date range to include all the days in a month?", and therefore that is all you need to change in order to get your result.
And could you "nest" grouping stages? Yes you can add additional stages to the pipeline, that is what the pipeline is for. So if you first wanted to "average" per day and then take the average over all the days of the month, you can form like this:
PriceHourly.aggregate([
{ "$match": {
"date": {
"$gte": new Date("2014-03-01"), "$lt": new Date("2014-04-01")
}
}},
{ "$group": {
"_id": "$day",
"price": { "$avg": "$price" },
"system_demand": { "$avg": "$system_demand" }
}},
{ "$group": {
"_id": null,
"price": { "$avg": "$price" },
"system_demand": { "$avg": "$system_demand" }
}}
])
That said, the second stage is likely redundant, since the monthly average can arguably be computed with one single $group statement.
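One caveat when choosing between the two-stage and the single-stage form: an average of daily averages equals the average over all records only when every day has the same number of records. The following in-memory sketch with made-up numbers illustrates the difference; with exactly one record per hour for every day, both values would coincide.

```javascript
const average = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;

// Made-up records: three prices on the first day, one on the second.
const records = [
  { day: "2014-03-01", price: 100 },
  { day: "2014-03-01", price: 110 },
  { day: "2014-03-01", price: 120 },
  { day: "2014-03-02", price: 200 },
];

// Two-stage form: average per day, then average of those averages.
const pricesByDay = new Map();
for (const { day, price } of records) {
  if (!pricesByDay.has(day)) pricesByDay.set(day, []);
  pricesByDay.get(day).push(price);
}
const dailyAverages = [...pricesByDay.values()].map(average); // [110, 200]
const averageOfDailyAverages = average(dailyAverages); // 155

// Single-stage form: one average over every record in the month.
const overallAverage = average(records.map((r) => r.price)); // 132.5
```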
But there is a longer commentary on this schema. You do not actually state much of the purpose of what you are doing other than obtaining an average, or what the schema is meant to contain. So I want to describe something that is maybe a bit different.
Suppose you have a collection that includes the "product", "type" the "current price" and the "timestamp" as a date when that "price" was "changed". Let us call the collection "PriceChange". So every time this event happens a new document is created.
{
"product": "ABC",
"type": 2,
"price": 110,
"timestamp": ISODate("2014-04-01T00:08:38.360Z")
}
This could change many times in an hour, a day or whatever the case.
So if you were interested in the "average" price per product over the month you could do this:
PriceChange.aggregate([
{ "$match": {
"timestamp": {
"$gte": new Date("2014-03-01"), "$lt": new Date("2014-04-01")
}
}},
{ "$group": {
"_id": "$product",
"price_avg": { "$avg": "$price" }
}}
])
Also, without any additional fields you can get the average price per product for each day of the month:
PriceChange.aggregate([
{ "$match": {
"timestamp": {
"$gte": new Date("2014-03-01"), "$lt": new Date("2014-04-01")
}
}},
{ "$group": {
"_id": {
"day": { "$dayOfMonth": "$timestamp" },
"product": "$product"
},
"price_avg": { "$avg": "$price" }
}}
])
Or you can even get the last price for each month over a whole year:
PriceChange.aggregate([
{ "$match": {
"timestamp": {
"$gte": new Date("2013-01-01"), "$lt": new Date("2014-01-01")
}
}},
{ "$group": {
"_id": {
"date": {
"year": { "$year" : "$timestamp" },
"month": { "$month": "$timestamp" }
},
"product": "$product"
},
"price_last": { "$last": "$price" }
}}
])
So those are some of the things you can do using the built-in Date Aggregation Operators to achieve various results. These can even aid in collecting this information for writing into new "pre-aggregated" collections, to be used for faster analysis.
I suppose there would be one way to combine a "running" average against all prices using mapReduce. So again from my sample:
PriceChange.mapReduce(
function () {
emit( this.timestamp.getDate(), this.price )
},
function (key, values) {
var sum = 0;
values.forEach(function(value) {
sum += value;
});
return ( sum / values.length );
},
{
"query": {
"timestamp": {
"$gte": new Date("2014-03-01"), "$lt": new Date("2014-04-01")
}
},
"out": { "inline": 1 },
"scope": { "running": 0, "counter": 0 },
"finalize": function(key,value) {
running += value;
counter++;
return { "dayAvg": value, "monthAvg": running / counter };
}
}
)
And that would return something like this:
{
"results" : [
{
"_id" : 1,
"value" : {
"dayAvg" : 105,
"monthAvg" : 105
}
},
{
"_id" : 2,
"value" : {
"dayAvg" : 110,
"monthAvg" : 107.5
}
}
],
}
But if you are otherwise expecting to see discrete values for both the day and the month, then that would not be possible without running separate queries.
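On newer servers there is an alternative to separate queries: the $facet stage (MongoDB 3.4 and later) runs several sub-pipelines over the same input and returns their results in one document. A minimal sketch using the PriceHourly field names from the question; the helper name and the facet names daily and monthly are made up.

```javascript
// Builds a pipeline returning per-day averages and the month-wide average
// in a single result document. Hypothetical helper for illustration.
function buildDailyAndMonthlyPipeline(start, end) {
  return [
    { $match: { date: { $gte: start, $lt: end } } },
    {
      $facet: {
        // one entry per day
        daily: [
          { $group: { _id: "$day", price: { $avg: "$price" } } },
          { $sort: { _id: 1 } },
        ],
        // a single entry covering every matched record
        monthly: [{ $group: { _id: null, price: { $avg: "$price" } } }],
      },
    },
  ];
}

const marchPipeline = buildDailyAndMonthlyPipeline(
  new Date("2014-03-01"),
  new Date("2014-04-01")
);
```

Passing marchPipeline to PriceHourly.aggregate() should yield one document of the shape { daily: [...], monthly: [...] }. Because everything ends up in a single document, the output is subject to the 16 MB document size limit.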
