How can I aggregate across multiple collections at once with a loop? - javascript

Suppose each collection has these common fields: birthday and gender.
How can I get the birthday counts grouped by year, for each group of collections?
Expected result:
group r01 {id: 1987, count: 21121}, {id: 1988, count: 22}, ...
The output should combine the counts from user_vip_r01 and user_general_r01.
group r15 {id: 1986, count: 2121}, {id: 1985, count: 220}, ...
The output should combine the counts from user_vip_r15 and user_general_r15.
I know how to write the group-by-year query,
but I don't know how to write a loop that iterates over all my collections in JavaScript.
Also, what if the collection names are partly irregular?
For example, user_old_r01, user_new_r01, and user_bad_r01 should all be processed in group r01.
Is it possible to use a regex to match them?
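Group by year:
// Count documents per birth year, sorted by year ascending
var pipeline = [
  { $group: {
      _id: { $year: '$birthday' },
      count: { $sum: 1 }
  } },
  { $sort: { _id: 1 } }
];
// Append $out so the result is written to output_collection_name
var cur = db[source_collection].runCommand('aggregate', {
  pipeline: pipeline.concat([{ $out: output_collection_name }]),
  allowDiskUse: true,
  cursor: {} // the aggregate command requires a cursor option on MongoDB 3.6+
});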
collection list
"user_vip_r01",
"user_vip_r15",
"user_vip_r16",
"user_vip_r17",
"user_vip_r18",
"user_vip_r19",
"user_vip_r20",
"user_vip_r201",
....
"user_general_r01",
"user_general_r15",
"user_general_r16",
"user_general_r17",
"user_general_r18",
"user_general_r19",
"user_general_r20",
"user_general_r201",
...
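One possible way to approach the loop is sketched below. It assumes every collection name ends in a suffix of the form rNN (r01, r15, r201, ...) and starts with user_, as in the list above. The helper names groupCollectionsBySuffix and mergeYearCounts are hypothetical, not part of MongoDB, and the shell usage in the trailing comment is untested.

```javascript
// Hypothetical helper: bucket collection names by their trailing "rNN" suffix.
// The middle part (vip, general, old, new, bad, ...) is allowed to vary.
function groupCollectionsBySuffix(names) {
  const groups = {};
  for (const name of names) {
    const match = /^user_.+_(r\d+)$/.exec(name);
    if (!match) continue; // skip collections that do not follow the pattern
    (groups[match[1]] = groups[match[1]] || []).push(name);
  }
  return groups;
}

// Hypothetical helper: sum the per-year counts returned by several collections.
function mergeYearCounts(resultSets) {
  const totals = new Map();
  for (const rows of resultSets) {
    for (const { _id, count } of rows) {
      totals.set(_id, (totals.get(_id) || 0) + count);
    }
  }
  return [...totals]
    .map(([_id, count]) => ({ _id, count }))
    .sort((a, b) => a._id - b._id);
}

// In the mongo shell this could be driven roughly as follows, where
// `pipeline` is the group-by-year pipeline from the question:
// const groups = groupCollectionsBySuffix(db.getCollectionNames());
// for (const [suffix, names] of Object.entries(groups)) {
//   const perCollection = names.map((n) => db.getCollection(n).aggregate(pipeline).toArray());
//   printjson({ group: suffix, years: mergeYearCounts(perCollection) });
// }
```

The grouping and the merging are kept as plain functions so they can be checked without a database; only the commented loop touches MongoDB.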

Related

how to get array of fields in mongo aggregate

I am using the MongoDB aggregation framework. Suppose I have a collection structured like this:
[
  {
    _id: ObjectId(123),
    name: "john",
    sessionDuration: 29
  },
  {
    _id: ObjectId(456),
    name: "moore",
    sessionDuration: 45
  },
  {
    _id: ObjectId(789),
    name: "cary",
    sessionDuration: 25
  }
]
I want to build a pipeline that returns something like this:
{
  durationsArr: [29, 45, 25, '$sessionDuration_Field_From_Document']
}
I am doing this because I want the average duration across all documents: first collect all the durations into an array, then add a final stage that applies $avg.
Any idea how I can get an array of the sessionDuration field? Or is there a better approach to calculating the average sessionDuration for the collection? Please explain thoroughly, as I am new to MongoDB aggregation.
1. $group - Group all documents into a single group (_id: null).
1.1. $avg - Calculate the average of sessionDuration for all documents.
db.collection.aggregate([
  {
    $group: {
      _id: null,
      avgSessionDuration: {
        $avg: "$sessionDuration"
      }
    }
  }
])
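If the intermediate array is still wanted, a sketch along the following lines may work: $push collects the values inside $group, and the expression form of $avg in a later $project averages the array. The field names durationsArr and avgSessionDuration are taken from the question and the answer above; the plain-JavaScript part only mirrors what the two stages are expected to compute and does not talk to a database.

```javascript
// Sample documents from the question (ObjectId values omitted)
const docs = [
  { name: "john", sessionDuration: 29 },
  { name: "moore", sessionDuration: 45 },
  { name: "cary", sessionDuration: 25 },
];

// Stage 1: collect every sessionDuration into one array with $push.
// Stage 2: average that array with the expression form of $avg.
const pipeline = [
  { $group: { _id: null, durationsArr: { $push: "$sessionDuration" } } },
  { $project: { _id: 0, durationsArr: 1, avgSessionDuration: { $avg: "$durationsArr" } } },
];

// Plain-JavaScript equivalent of what the two stages compute
const durationsArr = docs.map((d) => d.sessionDuration);
const avgSessionDuration =
  durationsArr.reduce((sum, n) => sum + n, 0) / durationsArr.length;

console.log(durationsArr, avgSessionDuration); // [ 29, 45, 25 ] 33
```

For the average alone, the single $group stage with $avg is shorter and avoids building a potentially large array.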

how to $bucket only for unique document based on some field

I am using the MongoDB aggregation framework. Suppose I have a collection structured like this:
[
  {
    _id: ObjectId(123),
    name: "john",
    sessionDuration: 29
  },
  {
    _id: ObjectId(456),
    name: "moore",
    sessionDuration: 45
  },
  {
    _id: ObjectId(789),
    name: "john",
    sessionDuration: 25
  },
  {
    _id: ObjectId(910),
    name: "john",
    sessionDuration: 45
  },
  etc...
]
Documents with the same name belong to one user with several sessions. In this example, John uses the service from three devices, with three session durations: two under 30 (29, 25) and one under 50 (45).
I want to run a bucket query with boundaries [0, 30, 50], but each range must count only unique names: a user with several sessions in the same range should be counted once. The result should look like this:
{
  time: Unique_Name_Users_Only_Lies_In_This_Boundary,
  '30': 1,
  '50': 2,
}
John had two sessions under 30, so only one of those two should be counted.
What I tried:
I first grouped all the documents by unique name and then applied the bucket, but this approach also drops John's session with a sessionDuration of 45.
How can I count only the unique names within each $bucket boundary?
One option is to use the $bucket with $addToSet and then use $group with $arrayToObject to get your formatting:
db.collection.aggregate([
  {$bucket: {
    groupBy: "$sessionDuration",
    boundaries: [0, 30, 50],
    default: "Other",
    output: {res: {$addToSet: "$name"}}
  }},
  {$group: {
    _id: 0,
    res: {$push: {k: {$toString: "$_id"}, v: {$size: "$res"}}}
  }},
  {$replaceRoot: {newRoot: {$arrayToObject: "$res"}}}
])
Notice that the _id of a bucket is its lower boundary, so with boundaries [0, 30, 50] the keys in the result are 0 and 30 rather than 30 and 50. You can manipulate this if you really want, but I don't recommend it.
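To make the counting rule concrete, here is a rough plain-JavaScript imitation of the $bucket + $addToSet + $size combination, run on the sample documents from the question. The function name countUniqueNamesPerBucket is hypothetical, and the sketch only approximates the pipeline's behaviour (each bucket is the half-open range from its lower boundary up to, but not including, the next one).

```javascript
const sessions = [
  { name: "john", sessionDuration: 29 },
  { name: "moore", sessionDuration: 45 },
  { name: "john", sessionDuration: 25 },
  { name: "john", sessionDuration: 45 },
];

// Hypothetical helper mirroring $bucket + $addToSet + $size in plain JavaScript
function countUniqueNamesPerBucket(docs, boundaries) {
  const sets = new Map();
  for (const { name, sessionDuration } of docs) {
    // Find the bucket [lower, upper) that contains this duration
    let key = "Other";
    for (let i = 0; i < boundaries.length - 1; i++) {
      if (sessionDuration >= boundaries[i] && sessionDuration < boundaries[i + 1]) {
        key = String(boundaries[i]); // keyed by lower boundary, like $bucket's _id
        break;
      }
    }
    if (!sets.has(key)) sets.set(key, new Set());
    sets.get(key).add(name); // a Set ignores repeated names, like $addToSet
  }
  const result = {};
  for (const [key, names] of sets) result[key] = names.size;
  return result;
}

console.log(countUniqueNamesPerBucket(sessions, [0, 30, 50])); // { '0': 1, '30': 2 }
```

John's two sessions under 30 collapse into one entry, while his 45 session is still counted in the second bucket alongside moore.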

Mongoose get all documents matching array intersection

By array intersection I mean this: the inventory has many more elements than any single document's ingredients array, and I want the query to return every document whose ingredients are all contained in the inventory. $all returns zero results, because the inventory contains elements that are not in ingredients, even when every ingredient is found in the inventory.
I have thousands of documents that have an array field of strings:
{
  ...
  recipe: "recipe1",
  ingredients: [ "1 cup cooked quinoa", "6 tbsp butter", "1 large egg" ]
  ...
},
{
  ...
  recipe: "recipe2",
  ingredients: [ "2 lemons", "2 tbsp butter", "1 large egg" ]
  ...
},
{
  ...
  recipe: "recipe3",
  ingredients: [ "1lb salmon", "1 pinch pepper", "4 spears asparagus" ]
  ...
}
I'm trying to find all documents where every element of the ingredients array is contained in a sample array with many elements. For this example, say it contains only this:
inventory = [ "lemons", "butter", "egg", "milk", "bread", "salmon", "asparagus", "pepper" ]
With this inventory array, I want to get recipe2 and recipe3.
Right now I have this inventory array and query (thanks to turivishal):
let inventory = ["lemons", "butter", "egg", "milk", "bread", "salmon", "asparagus", "pepper"];
inventory = inventory.map((i) => new RegExp(i, "i"));
Query:
Recipe.find({
  ingredients: { $all: inventory }
})
Expected result:
{
...
recipe: "recipe2",
ingredients: [ "2 lemons", "2 tbsp butter", "1 large egg" ]
...
}
{
...
recipe: "recipe3",
ingredients: [ "1lb salmon", "1 pinch pepper", "4 spears asparagus" ]
...
}
But I'm getting zero results.
You can use aggregation operators in the find query through an $expr condition.
First, join the array of strings with the | (OR) symbol into a single string and use it as the regex pattern. Then:
$filter to iterate over the ingredients array
$regexMatch to check whether an element contains any of the inventory words
$size to get the number of filtered elements
$eq to check that the filtered count equals the size of the actual ingredients array
let inventory = ["lemons", "butter", "egg", "milk", "bread", "salmon", "asparagus", "pepper"];
let inventoryStr = inventory.join("|");
// "lemons|butter|egg|milk|bread|salmon|asparagus|pepper"
Recipe.find({
  $expr: {
    $eq: [
      {
        $size: {
          $filter: {
            input: "$ingredients",
            cond: {
              $regexMatch: {
                input: "$$this",
                regex: inventoryStr,
                options: "i"
              }
            }
          }
        }
      },
      { $size: "$ingredients" }
    ]
  }
})
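The same size comparison can be tried out in plain JavaScript before running it against the database. The sketch below uses the sample recipes from the question; escapeRegex is a hypothetical helper added as a precaution, on the assumption that real inventory items might contain regex metacharacters such as parentheses.

```javascript
const recipes = [
  { recipe: "recipe1", ingredients: ["1 cup cooked quinoa", "6 tbsp butter", "1 large egg"] },
  { recipe: "recipe2", ingredients: ["2 lemons", "2 tbsp butter", "1 large egg"] },
  { recipe: "recipe3", ingredients: ["1lb salmon", "1 pinch pepper", "4 spears asparagus"] },
];
const inventory = ["lemons", "butter", "egg", "milk", "bread", "salmon", "asparagus", "pepper"];

// Hypothetical helper: escape regex metacharacters so each item is matched literally
const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// One case-insensitive alternation: lemons|butter|egg|...
const inventoryRegex = new RegExp(inventory.map(escapeRegex).join("|"), "i");

// Keep a recipe only when the number of matching ingredients equals the total,
// which is the same comparison the $filter / $size / $eq expression makes
const matches = recipes.filter(
  (r) => r.ingredients.filter((i) => inventoryRegex.test(i)).length === r.ingredients.length
);

console.log(matches.map((r) => r.recipe)); // [ 'recipe2', 'recipe3' ]
```

Because this is a substring match, an inventory item such as "egg" would also match an ingredient like "eggplant"; whether that matters depends on the data.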

How to implement multi level sorting in javascript using lodash?

The problem:
A job board with a filter system where users can set tag preferences.
I need to sort the list of jobs by the number of matching tags. Multiple jobs can have the same tag count, so within each tag count group I then have to sort by most recent job first.
My approach right now: I group the jobs by their tag match count and discard the ones with a zero count. I then sort each group by date in descending order.
Finally, I merge the groups in descending order of tag count.
Is there a better way of doing this in lodash or plain JavaScript, using functional programming tools like reduce, map, and filter?
DateA < DateB < DateC and so on...
Input:
[
{ tagCount: 2, date: DateA},
{ tagCount: 2, date: DateB},
{ tagCount: 1, date: DateC},
{ tagCount: 3, date: DateD},
]
Output:
[
{ tagCount: 3, date: DateD},
{ tagCount: 2, date: DateB},
{ tagCount: 2, date: DateA},
{ tagCount: 1, date: DateC}
]
You can use _.orderBy() with ['tagCount', 'date'] as the sorting properties and ['desc', 'desc'] as the respective sort order for each of them:
const input = [{ tagCount: 2, date: 'DateA'}, { tagCount: 2, date: 'DateB'}, { tagCount: 1, date: 'DateC'}, { tagCount: 3, date: 'DateD'}]
const result = _.orderBy(input, ['tagCount', 'date'], ['desc', 'desc'])
console.log(result)
<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.20/lodash.min.js" integrity="sha512-90vH1Z83AJY9DmlWa8WkjkV79yfS2n2Oxhsi2dZbIv0nC4E6m5AbH8Nh156kkM7JePmqD6tcZsfad1ueoaovww==" crossorigin="anonymous"></script>
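For the plain-JavaScript route the question also asks about, a comparator with a fallback can express the same two-level ordering. This is only a sketch: the concrete dates are made-up stand-ins for DateA to DateD, and the extra zero-count job is added to show the filtering step described in the question.

```javascript
// Made-up dates standing in for DateA < DateB < DateC < DateD
const jobs = [
  { tagCount: 2, date: new Date("2021-01-01") }, // DateA
  { tagCount: 2, date: new Date("2021-02-01") }, // DateB
  { tagCount: 1, date: new Date("2021-03-01") }, // DateC
  { tagCount: 3, date: new Date("2021-04-01") }, // DateD
  { tagCount: 0, date: new Date("2021-05-01") }, // no matching tags, dropped below
];

const sorted = jobs
  .filter((job) => job.tagCount > 0) // filter() returns a new array, so jobs is not mutated
  // Higher tagCount first; when the counts tie (difference is 0), newer date first
  .sort((a, b) => b.tagCount - a.tagCount || b.date - a.date);

console.log(sorted.map((job) => job.tagCount)); // [ 3, 2, 2, 1 ]
```

Subtracting two Date objects compares their timestamps, so no grouping or merging step is needed.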

Is it possible to find random documents in a collection without repeated field values? (mongodb\node.js)

For example, I have a collection users with the following structure:
{
_id: 1,
name: "John",
from: "Amsterdam"
},
{
_id: 2,
name: "John",
from: "Boston"
},
{
_id: 3,
name: "Mia",
from: "Paris"
},
{
_id: 4,
name: "Kate",
from: "London"
},
{
_id: 5,
name: "Kate",
from: "Moscow"
}
How can I get 3 random documents in which the names are not repeated?
Using the function getThreeNumbers(1, 5), I get an array of 3 non-repeating numbers and search by _id:
var random_nums = getThreeNumbers(1, 5); // [2,3,1]
users.find({_id: {$in: random_nums}}, function (err, data) {...}); // [John, Mia, John]
But the result can contain two Johns or two Kates, which is unwanted behavior. How can I get three random documents ([John, Mia, Kate], not [John, Kate, Kate] or [John, Mia, John]) with 1 or at most 2 queries? Which Kate or John document is picked should be random, but no name should be repeated.
There you go - see the comments in the code for further explanation of what the stages do:
users.aggregate([
  {
    // eliminate duplicates based on the "name" field and keep track of the first document of each group
    $group: {
      "_id": "$name",
      "doc": { $first: "$$ROOT" }
    }
  },
  {
    // restore the original document structure
    $replaceRoot: {
      newRoot: "$doc"
    }
  },
  {
    // select 3 random documents from the result
    $sample: {
      size: 3
    }
  }
])
As always with the aggregation framework, you can run the query with more or fewer stages in order to see the transformations step by step.
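The stage-by-stage idea can also be followed outside the database. The sketch below imitates the three stages in plain JavaScript on the sample users; firstPerName and sample are hypothetical helpers, and the Fisher-Yates shuffle is a stand-in for $sample, not how MongoDB implements it.

```javascript
const users = [
  { _id: 1, name: "John", from: "Amsterdam" },
  { _id: 2, name: "John", from: "Boston" },
  { _id: 3, name: "Mia", from: "Paris" },
  { _id: 4, name: "Kate", from: "London" },
  { _id: 5, name: "Kate", from: "Moscow" },
];

// Imitates $group with $first followed by $replaceRoot:
// keep the first document seen for each name
function firstPerName(docs) {
  const byName = new Map();
  for (const doc of docs) {
    if (!byName.has(doc.name)) byName.set(doc.name, doc);
  }
  return [...byName.values()];
}

// Stand-in for $sample: Fisher-Yates shuffle on a copy, then take `size` items
function sample(items, size) {
  const copy = items.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy.slice(0, size);
}

const picked = sample(firstPerName(users), 3);
console.log(picked.map((u) => u.name)); // three distinct names, in random order
```

In this sketch the document kept for a duplicated name is always the first one encountered; shuffling the input first, for example firstPerName(sample(users, users.length)), would make that choice random as well.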
I think what you are looking for is the $group aggregation stage, which will give you the distinct values in the collection. $group requires an _id expression, so it can be used as:
db.users.aggregate( [ { $group : { _id : "$name" } } ] );
MongoDB docs: Retrieve Distinct Values
