I'm using Mongoose to connect to MongoDB.
Here is my schema:
const mongoose = require("mongoose");
const Schema = mongoose.Schema;
const TestSchema = new Schema({
story: String,
seenByUser: [String],
medicationStage: {
type: String,
enum: [
"no-medication",
"just-after-medication",
"towards-end-of-medication",
],
required: true,
},
});
// Compile model from schema
const TestModel = mongoose.model("Test", TestSchema);
module.exports = {
Test: TestModel,
};
Here you can see a field called seenByUser, which is an array of user names (strings). I'm setting up an aggregation pipeline: given a user name, I want to fetch all the documents where that user name does not occur in the seenByUser array, grouped by medicationStage. I'm unable to create this pipeline. Please help me out.
If I've understood correctly, you can try this aggregation:
First, $match the documents where yourValue does not exist in the seenByUser array.
Then $group by medicationStage. This creates a nested array (because $push adds the whole array).
Finally, $project with $reduce to flatten the array.
db.collection.aggregate({
"$match": {
"seenByUser": {
"$ne": yourValue
}
}
},
{
"$group": {
"_id": "$medicationStage",
"seenByUser": {
"$push": "$seenByUser"
}
}
},
{
"$project": {
"_id": 0,
"medicationStage": "$_id",
"seenByUser": {
"$reduce": {
"input": "$seenByUser",
"initialValue": [],
"in": {
"$concatArrays": [
"$$value",
"$$this"
]
}
}
}
}
})
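To make the three stages concrete, here is a minimal in-memory sketch in plain JavaScript that mimics what the pipeline computes. It does not talk to MongoDB; the sample documents and the helper name unseenByStage are made up for illustration, and the real $group stage does not guarantee any output order.

```javascript
// Hypothetical sample documents shaped like the Test schema.
const sampleDocs = [
  { story: "a", seenByUser: ["alice", "bob"], medicationStage: "no-medication" },
  { story: "b", seenByUser: ["carol"], medicationStage: "no-medication" },
  { story: "c", seenByUser: ["bob"], medicationStage: "just-after-medication" },
  { story: "d", seenByUser: ["alice"], medicationStage: "just-after-medication" },
];

// Mimics $match ($ne on an array field), $group with $push, and the
// $reduce/$concatArrays flattening, all in memory.
function unseenByStage(documents, userName) {
  const groups = new Map();
  for (const doc of documents) {
    // $match: keep documents whose array does NOT contain userName.
    if (doc.seenByUser.includes(userName)) continue;
    // $group + flatten: append this document's names to its stage bucket.
    const bucket = groups.get(doc.medicationStage) || [];
    groups.set(doc.medicationStage, bucket.concat(doc.seenByUser));
  }
  return [...groups].map(([medicationStage, seenByUser]) => ({
    medicationStage,
    seenByUser,
  }));
}

const unseenResult = unseenByStage(sampleDocs, "alice");
console.log(JSON.stringify(unseenResult, null, 2));
```

Documents "a" and "d" are dropped because "alice" has already seen them; the remaining names are collected per medicationStage.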
I'm struggling trying to find all the docs with a specific property in the child. For example, I want to find all the users with their child active.
These are my models
const userSchema = new mongoose.Schema({
child: {
type: mongoose.Schema.Types.ObjectId,
ref: 'child'
}
});
const childSchema = new mongoose.Schema({
active: {
type: Boolean,
default: true
}
});
I tried populate with match (.populate({path: 'child', match: {active: true}})), but I get all the users, with the child property set to null when the child is not active. I need only the users with an active child. All my research points to the dot syntax, but for some reason I get an empty array. See below:
let usersWithActiveChild = await User.find({'child.active': true});
console.log(usersWithActiveChild) // --> displays '[]'
Thanks for your help!
This can be accomplished easily by using aggregation framework.
First we join two collections with $lookup.
The lookup result is an array, but the relation between User and Child is one to one, so we take the first item with $arrayElemAt: ["$child", 0].
And lastly, we apply our filter "child.active": true, using $match.
let usersWithActiveChild = await User.aggregate([
{
$lookup: {
from: "childs", // must be the PHYSICAL collection name (the child model's collection.name)
localField: "child",
foreignField: "_id",
as: "child",
},
},
{
$addFields: {
child: {
$arrayElemAt: ["$child", 0],
},
},
},
{
$match: {
"child.active": true,
},
},
]);
Sample docs:
db={
"users": [
{
"_id": ObjectId("5a834e000102030405000000"),
"child": ObjectId("5a934e000102030405000000")
},
{
"_id": ObjectId("5a834e000102030405000001"),
"child": ObjectId("5a934e000102030405000001")
},
{
"_id": ObjectId("5a834e000102030405000002"),
"child": ObjectId("5a934e000102030405000002")
},
],
"childs": [
{
"_id": ObjectId("5a934e000102030405000000"),
"active": true
},
{
"_id": ObjectId("5a934e000102030405000001"),
"active": false
},
{
"_id": ObjectId("5a934e000102030405000002"),
"active": true
}
]
}
Output:
[
{
"_id": ObjectId("5a834e000102030405000000"),
"child": {
"_id": ObjectId("5a934e000102030405000000"),
"active": true
}
},
{
"_id": ObjectId("5a834e000102030405000002"),
"child": {
"_id": ObjectId("5a934e000102030405000002"),
"active": true
}
}
]
A better approach would be to first get the active children, and then look up the users like this:
db.childs.aggregate([
{
$match: {
"active": true
}
},
{
$lookup: {
from: "users",
localField: "_id",
foreignField: "child",
as: "user"
}
}
])
When you use a ref to refer to another schema, Mongoose stores the documents in separate collections in MongoDB.
The value stored in the child field of the user document is the ObjectId of the referenced child, not the child document itself.
If you were to look at the data directly in MongoDB you would find something similar to this:
User collection:
{
_id: ObjectId("5a934e000102030405000000"),
child: ObjectId("5a934e000102030405000001")
}
Child collection:
{
_id: ObjectId("5a934e000102030405000001"),
active: true
}
When you populate the user object, Mongoose fetches the user document, and then fetches the child. Since the user documents have been retrieved already, the match in the populate call filters the children, as you noted.
The dotted notation 'child.active' can only be used if the child is stored in MongoDB as a subdocument, like
{
_id: ObjectId("5a934e000102030405000000"),
child:{
_id: ObjectId("5a934e000102030405000001"),
active: true
}
}
But your child is defined as a ref, so this will not be the case.
In order to filter the list of user documents based on the content of the referenced child, you will need to either
- populate with match as you have done and then filter the result set, or
- aggregate the user collection, lookup the child documents, and then match the child field.
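The first option can be sketched without a database. The array below only simulates what a populate call with match might return (users whose child did not match carry child: null); the data and the helper name withActiveChild are hypothetical.

```javascript
// Simulated result of populating 'child' with match { active: true }:
// users whose child did not match come back with child set to null.
const populatedUsers = [
  { _id: "u1", child: { _id: "c1", active: true } },
  { _id: "u2", child: null },
  { _id: "u3", child: { _id: "c3", active: true } },
];

// Keep only users whose populated child survived the match.
function withActiveChild(users) {
  return users.filter((user) => user.child !== null && user.child !== undefined);
}

const activeUsers = withActiveChild(populatedUsers);
console.log(activeUsers.map((u) => u._id));
```

This filtering happens in the application after all users have been fetched, so for large collections the aggregation route is likely the better fit.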
I am trying to use $match to find items with a specific _id in a double embedded document.
I have a collection called users, which contains information such as name and email, and also an embedded document with the business the user is with.
I also have a collection called businesses, whose documents contain an embedded document with the building the business is in.
I also have a collection called building.
I am trying to write a Mongo query that returns all of the users with a business at a certain building ID.
I have an aggregate function which uses $lookup to match the users to the building they are in, and this works. However, now I am trying to use $match to return only the documents with a specific building id.
Here is an example of my user, business and building documents:
User:
_id: 5ca487c0eeedbe8ab59d7a7a
name: "John Smith"
email: "jsmith9#gmail.com"
business: Object
    _id: 5ca48481eeedbe8ab59d7a38
    name: "Visitors"
Business:
_id: 5ca48481eeedbe8ab59d7a38
name: "Visitors"
building: Object
    _id: 5ca48481eeedbe8ab59d7a36
    name: "Building1"
Building:
_id: 5ca48481eeedbe8ab59d7a36
name: "Building1"
When I return the aggregated query it returns documents in the following format:
{
"_id": "5ca487c0eeedbe8ab59d7a7a",
"name": "John Smith",
"email": "jsmith9#gmail.com",
"business": {
"_id": "5ca48481eeedbe8ab59d7a38",
"name": "Visitors"
},
"__v": 0,
"user_building": {
"_id": "5ca48481eeedbe8ab59d7a38",
"name": "Visitors",
"building": {
"_id": "5ca48481eeedbe8ab59d7a36",
"name": "Building1"
},
"__v": 0
}
},
However, when I add the $match, it returns []. What am I doing wrong here?
router.get("/:id", async (req, res) => {
const users_buildings = await User.aggregate([
{
$lookup: {
from: "businesses",
localField: "business._id",
foreignField: "_id",
as: "user_building"
}
},
{ $unwind: "$user_building" },
{
$match: {
"user_building.building": { _id: req.params.id }
}
}
]);
You need to match the _id inside the building object. Try this:
{
$match: {
"user_building.building._id": req.params.id
}
}
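If that does not work, cast the id to an ObjectId, because values in an aggregation pipeline are not cast automatically: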
{
$match: {
"user_building.building._id": ObjectId(req.params.id)
}
}
OP edit: I imported ObjectId with:
var ObjectId = require('mongodb').ObjectID;
and used the second solution and it worked correctly.
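Since a malformed id can make the cast throw, it may be worth validating req.params.id before building the stage. The following is only a sketch under the assumption that ids arrive as 24-character hex strings; isObjectIdHex and buildBuildingMatch are hypothetical helper names, and the cast function is injected so the snippet runs without the driver installed.

```javascript
// Hypothetical helper: true for a 24-character hexadecimal string.
function isObjectIdHex(value) {
  return typeof value === "string" && /^[0-9a-fA-F]{24}$/.test(value);
}

// Builds the $match stage. `toObjectId` is injected (in a real app it could
// wrap the driver's ObjectId constructor) so this sketch runs standalone.
function buildBuildingMatch(id, toObjectId) {
  if (!isObjectIdHex(id)) return null; // caller can respond with HTTP 400
  return { $match: { "user_building.building._id": toObjectId(id) } };
}

// Stand-in cast used only for this demonstration.
const fakeCast = (s) => ({ $oid: s });

const matchStage = buildBuildingMatch("5ca48481eeedbe8ab59d7a36", fakeCast);
const rejected = buildBuildingMatch("not-an-id", fakeCast);
console.log(JSON.stringify(matchStage), rejected);
```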
I'm pretty new to Mongoose and MongoDB in general so I'm having a difficult time figuring out if something like this is possible:
Item = new Schema({
id: Schema.ObjectId,
dateCreated: { type: Date, default: Date.now },
title: { type: String, default: 'No Title' },
description: { type: String, default: 'No Description' },
tags: [ { type: Schema.ObjectId, ref: 'ItemTag' }]
});
ItemTag = new Schema({
id: Schema.ObjectId,
tagId: { type: Schema.ObjectId, ref: 'Tag' },
tagName: { type: String }
});
var query = Models.Item.find({});
query
.desc('dateCreated')
.populate('tags')
.where('tags.tagName').in(['funny', 'politics'])
.run(function(err, docs){
// docs is always empty
});
Is there a better way to do this?
Edit
Apologies for any confusion. What I'm trying to do is get all Items that contain either the funny tag or politics tag.
Edit
Document without where clause:
[{
_id: 4fe90264e5caa33f04000012,
dislikes: 0,
likes: 0,
source: '/uploads/loldog.jpg',
comments: [],
tags: [{
itemId: 4fe90264e5caa33f04000012,
tagName: 'movies',
tagId: 4fe64219007e20e644000007,
_id: 4fe90270e5caa33f04000015,
dateCreated: Tue, 26 Jun 2012 00:29:36 GMT,
rating: 0,
dislikes: 0,
likes: 0
},
{
itemId: 4fe90264e5caa33f04000012,
tagName: 'funny',
tagId: 4fe64219007e20e644000002,
_id: 4fe90270e5caa33f04000017,
dateCreated: Tue, 26 Jun 2012 00:29:36 GMT,
rating: 0,
dislikes: 0,
likes: 0
}],
viewCount: 0,
rating: 0,
type: 'image',
description: null,
title: 'dogggg',
dateCreated: Tue, 26 Jun 2012 00:29:24 GMT
}, ... ]
With the where clause, I get an empty array.
With a modern MongoDB (3.2 or later) you can use $lookup as an alternative to .populate() in most cases. This also has the advantage of actually doing the join "on the server", as opposed to what .populate() does, which is actually "multiple queries" to "emulate" a join.
So .populate() is not really a "join" in the sense of how a relational database does it. The $lookup operator on the other hand, actually does the work on the server, and is more or less analogous to a "LEFT JOIN":
Item.aggregate(
[
{ "$lookup": {
"from": ItemTags.collection.name,
"localField": "tags",
"foreignField": "_id",
"as": "tags"
}},
{ "$unwind": "$tags" },
{ "$match": { "tags.tagName": { "$in": [ "funny", "politics" ] } } },
{ "$group": {
"_id": "$_id",
"dateCreated": { "$first": "$dateCreated" },
"title": { "$first": "$title" },
"description": { "$first": "$description" },
"tags": { "$push": "$tags" }
}}
],
function(err, result) {
// "tags" is now filtered by condition and "joined"
}
)
N.B. The .collection.name here actually evaluates to the "string" that is the actual name of the MongoDB collection as assigned to the model. Since mongoose "pluralizes" collection names by default and $lookup needs the actual MongoDB collection name as an argument ( since it's a server operation ), then this is a handy trick to use in mongoose code, as opposed to "hard coding" the collection name directly.
Whilst we could also use $filter on arrays to remove the unwanted items, this is actually the most efficient form due to Aggregation Pipeline Optimization for the special condition of a $lookup followed by both an $unwind and a $match condition.
This actually results in the three pipeline stages being rolled into one:
{ "$lookup" : {
"from" : "itemtags",
"as" : "tags",
"localField" : "tags",
"foreignField" : "_id",
"unwinding" : {
"preserveNullAndEmptyArrays" : false
},
"matching" : {
"tagName" : {
"$in" : [
"funny",
"politics"
]
}
}
}}
This is highly optimal as the actual operation "filters the collection to join first", then it returns the results and "unwinds" the array. Both methods are employed so the results do not break the BSON limit of 16MB, which is a constraint that the client does not have.
The only problem is that it seems "counter-intuitive" in some ways, particularly when you want the results in an array, but that is what the $group is for here, as it reconstructs to the original document form.
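The $lookup, $unwind, $match, $group sequence described above can be mimicked in memory to see what each step contributes. This is only an illustration in plain JavaScript with made-up data, not how the server executes it; lookupUnwindMatchGroup is a hypothetical name.

```javascript
// Hypothetical in-memory stand-ins for the two collections.
const itemTagDocs = [
  { _id: 1, tagName: "movies" },
  { _id: 2, tagName: "funny" },
  { _id: 3, tagName: "politics" },
];
const itemDocs = [
  { _id: "a", title: "dogggg", tags: [1, 2] },
  { _id: "b", title: "news", tags: [1] },
  { _id: "c", title: "satire", tags: [2, 3] },
];

function lookupUnwindMatchGroup(items, tagDocs, wanted) {
  const grouped = new Map();
  for (const item of items) {
    // $lookup: resolve the referenced tag documents.
    const joined = tagDocs.filter((tag) => item.tags.includes(tag._id));
    // $unwind + $match: look at one tag per row, keep only wanted names.
    for (const tag of joined) {
      if (!wanted.includes(tag.tagName)) continue;
      // $group: $first for scalar fields, $push for the tags.
      if (!grouped.has(item._id)) {
        grouped.set(item._id, { _id: item._id, title: item.title, tags: [] });
      }
      grouped.get(item._id).tags.push(tag);
    }
  }
  return [...grouped.values()];
}

const filteredItems = lookupUnwindMatchGroup(itemDocs, itemTagDocs, ["funny", "politics"]);
console.log(JSON.stringify(filteredItems, null, 2));
```

Item "b" disappears entirely because none of its tags match, which is the behaviour of $unwind without preserveNullAndEmptyArrays.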
It's also unfortunate that we simply cannot at this time actually write $lookup in the same eventual syntax the server uses. IMHO, this is an oversight to be corrected. But for now, simply using the sequence will work and is the most viable option with the best performance and scalability.
Addendum - MongoDB 3.6 and upwards
Though the pattern shown here is fairly optimized due to how the other stages get rolled into the $lookup, it does have one failing in that the "LEFT JOIN" which is normally inherent to both $lookup and the actions of populate() is negated by the "optimal" usage of $unwind here which does not preserve empty arrays. You can add the preserveNullAndEmptyArrays option, but this negates the "optimized" sequence described above and essentially leaves all three stages intact which would normally be combined in the optimization.
MongoDB 3.6 adds a "more expressive" form of $lookup that allows a "sub-pipeline" expression. This not only meets the goal of retaining the "LEFT JOIN" but still allows an optimal query that reduces the results returned, with a much simplified syntax:
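Item.aggregate([
{ "$lookup": {
"from": ItemTags.collection.name,
"let": { "tags": "$tags" },
"pipeline": [
{ "$match": {
// filter on the tag's own field; the join condition goes in $expr
"tagName": { "$in": [ "politics", "funny" ] },
"$expr": { "$in": [ "$_id", "$$tags" ] }
}}
],
// "as" is required: it names the output array field
"as": "tags"
}}
])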
The $expr used in order to match the declared "local" value with the "foreign" value is actually what MongoDB does "internally" now with the original $lookup syntax. By expressing in this form we can tailor the initial $match expression within the "sub-pipeline" ourselves.
In fact, as a true "aggregation pipeline" you can do just about anything you can do with an aggregation pipeline within this "sub-pipeline" expression, including "nesting" the levels of $lookup to other related collections.
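As a rough illustration of such nesting, the pipeline below resolves each ItemTag's tagId inside the sub-pipeline. It is a sketch only: the collection names "itemtags" and "tags" are assumptions based on Mongoose's default pluralization of the ItemTag and Tag models, and the output field name "tag" is made up.

```javascript
// Assumed collection names; adjust to the real names in your database.
const ITEM_TAGS = "itemtags";
const TAGS = "tags";

const nestedLookup = [
  {
    $lookup: {
      from: ITEM_TAGS,
      let: { tags: "$tags" },
      pipeline: [
        // Join condition ($expr) plus the tagName filter.
        {
          $match: {
            tagName: { $in: ["funny", "politics"] },
            $expr: { $in: ["$_id", "$$tags"] },
          },
        },
        // Nested join: resolve each ItemTag's tagId into a "tag" array.
        { $lookup: { from: TAGS, localField: "tagId", foreignField: "_id", as: "tag" } },
      ],
      as: "tags",
    },
  },
];

console.log(JSON.stringify(nestedLookup, null, 2));
```

The array is a plain object literal, so it could be handed to Item.aggregate() as is.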
Further usage is a bit beyond the scope of what the question here asks, but in relation to even "nested population" the new usage pattern of $lookup allows this to be much the same, and a "lot" more powerful in its full usage.
Working Example
The following gives an example using a static method on the model. Once that static method is implemented the call simply becomes:
Item.lookup(
{
path: 'tags',
query: { 'tags.tagName' : { '$in': [ 'funny', 'politics' ] } }
},
callback
)
Or enhancing to be a bit more modern even becomes:
let results = await Item.lookup({
path: 'tags',
query: { 'tagName' : { '$in': [ 'funny', 'politics' ] } }
})
Making it very similar to .populate() in structure, but it's actually doing the join on the server instead. For completeness, the usage here casts the returned data back to mongoose document instances for both the parent and child cases.
It's fairly trivial and easy to adapt or just use as is for most common cases.
N.B The use of async here is just for brevity of running the enclosed example. The actual implementation is free of this dependency.
const async = require('async'),
mongoose = require('mongoose'),
Schema = mongoose.Schema;
mongoose.Promise = global.Promise;
mongoose.set('debug', true);
mongoose.connect('mongodb://localhost/looktest');
const itemTagSchema = new Schema({
tagName: String
});
const itemSchema = new Schema({
dateCreated: { type: Date, default: Date.now },
title: String,
description: String,
tags: [{ type: Schema.Types.ObjectId, ref: 'ItemTag' }]
});
itemSchema.statics.lookup = function(opt,callback) {
let rel =
mongoose.model(this.schema.path(opt.path).caster.options.ref);
let group = { "$group": { } };
this.schema.eachPath(p =>
group.$group[p] = (p === "_id") ? "$_id" :
(p === opt.path) ? { "$push": `$${p}` } : { "$first": `$${p}` });
let pipeline = [
{ "$lookup": {
"from": rel.collection.name,
"as": opt.path,
"localField": opt.path,
"foreignField": "_id"
}},
{ "$unwind": `$${opt.path}` },
{ "$match": opt.query },
group
];
this.aggregate(pipeline,(err,result) => {
// return early so a failed query does not fall through to result.map
if (err) return callback(err);
result = result.map(m => {
m[opt.path] = m[opt.path].map(r => rel(r));
return this(m);
});
callback(err,result);
});
}
const Item = mongoose.model('Item', itemSchema);
const ItemTag = mongoose.model('ItemTag', itemTagSchema);
function log(body) {
console.log(JSON.stringify(body, undefined, 2))
}
async.series(
[
// Clean data
(callback) => async.each(mongoose.models,(model,callback) =>
model.remove({},callback),callback),
// Create tags and items
(callback) =>
async.waterfall(
[
(callback) =>
ItemTag.create([{ "tagName": "movies" }, { "tagName": "funny" }],
callback),
(tags, callback) =>
Item.create({ "title": "Something","description": "An item",
"tags": tags },callback)
],
callback
),
// Query with our static
(callback) =>
Item.lookup(
{
path: 'tags',
query: { 'tags.tagName' : { '$in': [ 'funny', 'politics' ] } }
},
callback
)
],
(err,results) => {
if (err) throw err;
let result = results.pop();
log(result);
mongoose.disconnect();
}
)
Or a little more modern for Node 8.x and above with async/await and no additional dependencies:
const { Schema } = mongoose = require('mongoose');
const uri = 'mongodb://localhost/looktest';
mongoose.Promise = global.Promise;
mongoose.set('debug', true);
const itemTagSchema = new Schema({
tagName: String
});
const itemSchema = new Schema({
dateCreated: { type: Date, default: Date.now },
title: String,
description: String,
tags: [{ type: Schema.Types.ObjectId, ref: 'ItemTag' }]
});
itemSchema.statics.lookup = function(opt) {
let rel =
mongoose.model(this.schema.path(opt.path).caster.options.ref);
let group = { "$group": { } };
this.schema.eachPath(p =>
group.$group[p] = (p === "_id") ? "$_id" :
(p === opt.path) ? { "$push": `$${p}` } : { "$first": `$${p}` });
let pipeline = [
{ "$lookup": {
"from": rel.collection.name,
"as": opt.path,
"localField": opt.path,
"foreignField": "_id"
}},
{ "$unwind": `$${opt.path}` },
{ "$match": opt.query },
group
];
return this.aggregate(pipeline).exec().then(r => r.map(m =>
this({ ...m, [opt.path]: m[opt.path].map(r => rel(r)) })
));
}
const Item = mongoose.model('Item', itemSchema);
const ItemTag = mongoose.model('ItemTag', itemTagSchema);
const log = body => console.log(JSON.stringify(body, undefined, 2));
(async function() {
try {
const conn = await mongoose.connect(uri);
// Clean data
await Promise.all(Object.entries(conn.models).map(([k,m]) => m.remove()));
// Create tags and items
const tags = await ItemTag.create(
["movies", "funny"].map(tagName =>({ tagName }))
);
const item = await Item.create({
"title": "Something",
"description": "An item",
tags
});
// Query with our static
const result = (await Item.lookup({
path: 'tags',
query: { 'tags.tagName' : { '$in': [ 'funny', 'politics' ] } }
})).pop();
log(result);
mongoose.disconnect();
} catch (e) {
console.error(e);
} finally {
process.exit()
}
})()
And from MongoDB 3.6 and upward, even without the $unwind and $group building:
const { Schema, Types: { ObjectId } } = mongoose = require('mongoose');
const uri = 'mongodb://localhost/looktest';
mongoose.Promise = global.Promise;
mongoose.set('debug', true);
const itemTagSchema = new Schema({
tagName: String
});
const itemSchema = new Schema({
title: String,
description: String,
tags: [{ type: Schema.Types.ObjectId, ref: 'ItemTag' }]
},{ timestamps: true });
itemSchema.statics.lookup = function({ path, query }) {
let rel =
mongoose.model(this.schema.path(path).caster.options.ref);
// MongoDB 3.6 and up $lookup with sub-pipeline
let pipeline = [
{ "$lookup": {
"from": rel.collection.name,
"as": path,
"let": { [path]: `$${path}` },
"pipeline": [
{ "$match": {
...query,
"$expr": { "$in": [ "$_id", `$$${path}` ] }
}}
]
}}
];
return this.aggregate(pipeline).exec().then(r => r.map(m =>
this({ ...m, [path]: m[path].map(r => rel(r)) })
));
};
const Item = mongoose.model('Item', itemSchema);
const ItemTag = mongoose.model('ItemTag', itemTagSchema);
const log = body => console.log(JSON.stringify(body, undefined, 2));
(async function() {
try {
const conn = await mongoose.connect(uri);
// Clean data
await Promise.all(Object.entries(conn.models).map(([k,m]) => m.remove()));
// Create tags and items
const tags = await ItemTag.insertMany(
["movies", "funny"].map(tagName => ({ tagName }))
);
const item = await Item.create({
"title": "Something",
"description": "An item",
tags
});
// Query with our static
let result = (await Item.lookup({
path: 'tags',
query: { 'tagName': { '$in': [ 'funny', 'politics' ] } }
})).pop();
log(result);
await mongoose.disconnect();
} catch(e) {
console.error(e)
} finally {
process.exit()
}
})()
What you are asking for isn't directly supported, but it can be achieved by adding another filter step after the query returns.
First, .populate( 'tags', null, { tagName: { $in: ['funny', 'politics'] } } ) is definitely what you need to do to filter the tags documents. Then, after the query returns, you'll need to manually filter out documents that don't have any tags docs that matched the populate criteria. Something like:
query....
.exec(function(err, docs){
docs = docs.filter(function(doc){
return doc.tags.length;
})
// do stuff with docs
});
Try replacing
.populate('tags').where('tags.tagName').in(['funny', 'politics'])
by
.populate( 'tags', null, { tagName: { $in: ['funny', 'politics'] } } )
Update: Please take a look at the comments - this answer does not correctly match to the question, but maybe it answers other questions of users which came across (I think that because of the upvotes) so I will not delete this "answer":
First: I know this question is really outdated, but I searched for exactly this problem and this SO post was the Google entry #1. So I implemented the docs.filter version (accepted answer) but as I read in the mongoose v4.6.0 docs we can now simply use:
Item.find({}).populate({
path: 'tags',
match: { tagName: { $in: ['funny', 'politics'] }}
}).exec((err, items) => {
// items is an array, so inspect the tags of each item
items.forEach(item => console.log(item.tags))
// each item's tags contains only tags where tagName is 'funny' or 'politics'
})
Hope this helps future search machine users.
After having the same problem myself recently, I've come up with the following solution:
First, find all ItemTags where tagName is either 'funny' or 'politics' and return an array of ItemTag _ids.
Then, find Items whose tags array contains any of those ItemTag _ids
ItemTag
.find({ tagName : { $in : ['funny','politics'] } })
.lean()
.distinct('_id')
.exec((err, itemTagIds) => {
if (err) { return console.error(err); }
// the schema field is "tags"; $in matches items having at least one of the ids
Item.find({ tags: { $in: itemTagIds } }, (err, items) => {
console.log(items); // Items filtered by tagName
});
});
@aaronheckmann's answer worked for me, but I had to replace return doc.tags.length; with return doc.tags != null; because that field contains null if it doesn't match the conditions written inside populate.
So the final code:
query....
.exec(function(err, docs){
docs = docs.filter(function(doc){
return doc.tags != null;
})
// do stuff with docs
});
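Depending on whether the populated path is an array or a single reference, an unmatched populate may leave either an empty array or null. A filter that tolerates both could look like the sketch below; the plain objects only stand in for Mongoose documents, and hasMatchedTags is a hypothetical helper name.

```javascript
// Simulated populated documents (plain objects standing in for Mongoose docs).
const populatedDocs = [
  { title: "one", tags: [{ tagName: "funny" }] },
  { title: "two", tags: [] },     // array path where nothing matched
  { title: "three", tags: null }, // single reference where nothing matched
];

// True when the populate match left at least one tag on the document.
function hasMatchedTags(doc) {
  return Array.isArray(doc.tags) ? doc.tags.length > 0 : doc.tags != null;
}

const keptDocs = populatedDocs.filter(hasMatchedTags);
console.log(keptDocs.map((d) => d.title));
```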
I have a schema, Comment, like the one below. It's a system of "comments" and "replies", but each comment and reply has multiple versions. When a user wants to view a comment, I want to return just the most recent version with the status of APPROVED.
const Version = new mongoose.Schema({
user: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User'
},
body: String,
created: Date,
title: String,
status: {
type: String,
enum: [ 'APPROVED', 'OPEN', 'CLOSED' ]
}
})
const Reply = new mongoose.Schema({
user: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User'
},
created: Date,
versions: [ Version ]
})
const Comment = new mongoose.Schema({
user: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User'
},
created: Date,
versions: [ Version ],
replies: [ Reply ]
})
I've gotten the parent Comment to display how I want with the code below. However, I've had trouble applying that to the sub-document, Reply.
const requestedComment = yield Comment.aggregate([
{ $match: {
query
} },
{ $project: {
user: 1,
replies: 1,
versions: {
$filter: {
input: '$versions',
as: 'version',
cond: { $eq: [ '$$version.status', 'APPROVED' ] }
}
},
}},
{ "$unwind": "$versions" },
{ $sort: { 'versions.created': -1 } },
{ $group: {
_id: '$_id',
body: { $first: '$versions.body' },
title: { $first: '$versions.title' },
replies: { $first: '$replies' }
}}
])
.exec()
Any help achieving the same result with the replies subdocuments would be appreciated. I would like to return the most recent APPROVED version of each reply in a form like this:
comment: {
body: "The comment's body.",
user: ObjectId(...),
replies: [
{
body: "The reply's body."
user: ObjectId(...)
}
]
}
Basically you just need to continue the same process on from the existing pipeline, but this time $unwind the "versions" of each "replies" entry and $sort them there.
So these are "additional" stages to your pipeline.
// Unwind replies
{ "$unwind": "$replies" },
// Unwind inner versions
{ "$unwind": "$replies.versions" },
// Filter for only approved
{ "$match": { "replies.versions.status": "APPROVED" } },
// Sort on all "keys" and then the "version" date
{ "$sort": {
"_id": 1,
"replies._id": 1,
"replies.versions.created": -1
}},
// Group replies to get the latest version of each
{ "$group": {
"_id": {
"_id": "$_id",
"body": "$body",
"title": "$title",
"replyId": "$replies._id",
"replyUser": "$replies.user",
"replyCreated": "$replies.created"
},
"version": { "$first": "$replies.versions" }
}},
// Push replies back into an array in the main document
{ "$group": {
"_id": "$_id._id",
"body": { "$first": "$_id.body" },
"title": { "$first": "$_id.title" },
"replies": {
"$push": {
"_id": "$_id.replyId",
"user": "$_id.replyUser",
"created": "$_id.replyCreated", // <-- Value from Reply
"body": "$version.body", // <-- Value from specific Version
"title": "$version.title"
}
}
}}
All depending of course on which fields you want, being either from the Reply or from the Version.
Whichever fields, since you "un-wound" two arrays, you $group back "twice".
Once to get the $first items after sorting per Reply
Once more to re-construct the "replies" array using $push
That's all there is to it.
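To see what the two $group passes produce, the same selection can be mimicked in memory on a single comment. This is only an illustration with made-up data, not a replacement for the pipeline; latestApproved and latestReplies are hypothetical names.

```javascript
// Hypothetical comment document shaped like the Comment schema.
const sampleComment = {
  _id: "c1",
  replies: [
    {
      _id: "r1",
      user: "u1",
      versions: [
        { body: "old", status: "APPROVED", created: new Date("2016-01-01") },
        { body: "new", status: "APPROVED", created: new Date("2016-02-01") },
        { body: "draft", status: "OPEN", created: new Date("2016-03-01") },
      ],
    },
    {
      _id: "r2",
      user: "u2",
      versions: [{ body: "pending", status: "OPEN", created: new Date("2016-01-15") }],
    },
  ],
};

// Latest APPROVED version, or undefined when there is none.
function latestApproved(versions) {
  return versions
    .filter((v) => v.status === "APPROVED")
    .sort((a, b) => b.created - a.created)[0];
}

// Replies without an approved version are dropped, as the $match stage would do.
function latestReplies(doc) {
  return doc.replies
    .map((reply) => ({ reply, version: latestApproved(reply.versions) }))
    .filter(({ version }) => version !== undefined)
    .map(({ reply, version }) => ({ _id: reply._id, user: reply.user, body: version.body }));
}

const latest = latestReplies(sampleComment);
console.log(latest);
```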
If you were still looking at ways to "sort" the array "in-place" without using $unwind, well MongoDB just does not do that yet.
Bit of advice on your design
As a note, I see where you are going with this and this is the wrong model for the type of usage that you want.
It makes little sense to store "revision history" within the embedded structure. You are rarely going to use it in general update and query operations, and as this demonstrates, most of the time you just want the "latest".
So just do that instead, and store a "flag" indicating "revisions" if really necessary. That data can then be stored external to the main structure, and you won't have to jump through these hoops just to get the "latest accepted version" on every request.