I want to convert the field 'districtId', which has the long data type, to keyword/text so that I can run wildcard searches on it. Please guide me on how to convert a field from the long data type to keyword/text in Elasticsearch.
PUT geoxingsite/_mapping
{
"properties": {
"districtId": {
"type": "keyword"
}
}
}
I am getting the error below:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "mapper [districtId] cannot be changed from type [long] to [keyword]"
}
],
"type" : "illegal_argument_exception",
"reason" : "mapper [districtId] cannot be changed from type [long] to [keyword]"
},
"status" : 400
}
You cannot change the type of a field once it's created. However, you can add a sub-field like this:
PUT geoxingsite/_mapping
{
"properties": {
"districtId": {
"type": "long",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
When done you need to update your index in place using
POST geoxingsite/_update_by_query?wait_for_completion=false
When the task has run, you'll have a new field called districtId.keyword on which you can run your wildcard queries
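As a rough sketch of what such a wildcard query could look like once the sub-field is populated, the search body can be built as below. The helper name buildWildcardQuery and the pattern "12*" are illustrative assumptions, not part of the original answer; the wildcard clause with a value key is the usual query DSL form.

```javascript
// Hypothetical helper: builds a search body for a wildcard match on a keyword sub-field.
function buildWildcardQuery(field, pattern) {
  return {
    query: {
      wildcard: {
        // Computed key, e.g. "districtId.keyword"
        [field]: { value: pattern }
      }
    }
  };
}

// Body that could be sent with: GET geoxingsite/_search
const wildcardBody = buildWildcardQuery("districtId.keyword", "12*");
console.log(JSON.stringify(wildcardBody, null, 2));
```

The pattern is matched against the string form stored in districtId.keyword, so it only finds documents that were re-indexed after the mapping change.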
I have a collection of timestamps which record what actions are performed by users at which time. For now, the collection consists of only two actions start and end. There can only be a single end action, while there can be multiple start actions per user.
Now I want to generate a list of users where the time difference between the last start action and the end action is, for example, less than a minute.
The simplified documents in my collection timestamps look like this:
document #1
{
id: 123,
user: "user1",
type: "start",
date: 2019-09-10
}
document #2
{
id: 234,
user: "user1",
type: "end",
date: 2019-09-11
}
Now the result I want should look like this:
{
id: null,
list: ["user1", "user2"]
}
The field list should contain every user, where the time difference between the start and end action is less than a minute.
I am having trouble combining the documents that contain the start and end types. I was trying to combine them into documents that look like this:
{
id: 345
user: "user1"
date_start: 2019-09-10
date_end: 2019-09-11
}
I don't know where to start with the aggregation pipeline and how to split and combine the different types of timestamps. Furthermore, I still need to add a field that contains the difference between both dates.
The following query can get us the expected output:
db.collection.aggregate([
{
$sort:{
"date":-1
}
},
{
$group:{
"_id":{
"id":"$id",
"type":"$type"
},
"id":{
$first:"$id"
},
"user":{
$first:"$user"
},
"type":{
$first:"$type"
},
"date":{
$first:"$date"
}
}
},
{
$group:{
"_id":"$id",
"user":{
$first:"$user"
},
"info":{
$push:{
"k":"$type",
"v":"$date"
}
}
}
},
{
$addFields:{
"info":{
$arrayToObject:"$info"
}
}
},
{
$match:{
$expr:{
$lt:[
{
$subtract:[
{
$toDate:"$info.end"
},
{
$toDate:"$info.start"
}
]
},
60000
]
}
}
},
{
$group:{
"_id":null,
"users":{
$push:"$user"
}
}
},
{
$project:{
"_id":0
}
}
]).pretty()
Data set:
{
"_id" : ObjectId("5d77a117bd4e75c58d598214"),
"id" : 123,
"user" : "user1",
"type" : "start",
"date" : "2019-09-10T13:01:14.242Z"
}
{
"_id" : ObjectId("5d77a117bd4e75c58d598215"),
"id" : 123,
"user" : "user1",
"type" : "start",
"date" : "2019-09-10T13:04:14.242Z"
}
{
"_id" : ObjectId("5d77a117bd4e75c58d598216"),
"id" : 123,
"user" : "user1",
"type" : "start",
"date" : "2019-09-10T13:09:02.242Z"
}
{
"_id" : ObjectId("5d77a117bd4e75c58d598217"),
"id" : 123,
"user" : "user1",
"type" : "end",
"date" : "2019-09-10T13:09:14.242Z"
}
{
"_id" : ObjectId("5d77a117bd4e75c58d598218"),
"id" : 234,
"user" : "user2",
"type" : "start",
"date" : "2019-09-10T13:02:02.242Z"
}
{
"_id" : ObjectId("5d77a117bd4e75c58d598219"),
"id" : 234,
"user" : "user2",
"type" : "end",
"date" : "2019-09-10T13:09:14.242Z"
}
{
"_id" : ObjectId("5d77a117bd4e75c58d59821a"),
"id" : 345,
"user" : "user3",
"type" : "start",
"date" : "2019-09-10T13:08:55.242Z"
}
{
"_id" : ObjectId("5d77a117bd4e75c58d59821b"),
"id" : 345,
"user" : "user3",
"type" : "end",
"date" : "2019-09-10T13:09:14.242Z"
}
Output:
{ "users" : [ "user3", "user1" ] }
Query analysis:
Stage I: Sorting the documents in descending order of the date
Stage II: Grouping on [id, type] and picking the first date for each type, i.e. the latest date for each type
Stage III: Grouping only on id and pushing the type and associated date into an array as key-value pairs
Stage IV: Converting the array of key-value pairs into an object
Stage V: Keeping only the documents where the difference between the end and start dates is less than 60000 ms (the millisecond equivalent of 1 minute)
Stage VI: Pushing the user names of all remaining documents into an array
Stage VII: Dropping the _id field from the output
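The stages can also be traced in plain JavaScript on the same data set, which may help when checking the logic without a database. This is only a sketch: usersWithQuickEnd is a hypothetical helper name, and unlike the pipeline it explicitly skips ids that lack either a start or an end.

```javascript
// Plain-JavaScript trace of the pipeline stages; no MongoDB connection needed.
// usersWithQuickEnd is a hypothetical name, not a MongoDB API.
function usersWithQuickEnd(docs, maxGapMs) {
  // Stages I-II: keep the latest date per (id, type) pair.
  const latest = new Map();
  for (const d of docs) {
    const key = `${d.id}|${d.type}`;
    const t = Date.parse(d.date);
    const seen = latest.get(key);
    if (!seen || t > seen.t) {
      latest.set(key, { id: d.id, user: d.user, type: d.type, t });
    }
  }
  // Stages III-IV: merge per id into { user, start, end }.
  const byId = new Map();
  for (const e of latest.values()) {
    const row = byId.get(e.id) || { user: e.user };
    row[e.type] = e.t;
    byId.set(e.id, row);
  }
  // Stages V-VI: keep rows where end - start is under the threshold, collect users.
  return [...byId.values()]
    .filter(r => r.start !== undefined && r.end !== undefined && r.end - r.start < maxGapMs)
    .map(r => r.user);
}

const sampleDocs = [
  { id: 123, user: "user1", type: "start", date: "2019-09-10T13:01:14.242Z" },
  { id: 123, user: "user1", type: "start", date: "2019-09-10T13:04:14.242Z" },
  { id: 123, user: "user1", type: "start", date: "2019-09-10T13:09:02.242Z" },
  { id: 123, user: "user1", type: "end", date: "2019-09-10T13:09:14.242Z" },
  { id: 234, user: "user2", type: "start", date: "2019-09-10T13:02:02.242Z" },
  { id: 234, user: "user2", type: "end", date: "2019-09-10T13:09:14.242Z" },
  { id: 345, user: "user3", type: "start", date: "2019-09-10T13:08:55.242Z" },
  { id: 345, user: "user3", type: "end", date: "2019-09-10T13:09:14.242Z" }
];

const quickUsers = usersWithQuickEnd(sampleDocs, 60000);
console.log(quickUsers);
```

For user1 the last start is 12 seconds before the end and for user3 it is 19 seconds, so both qualify; user2 has a gap of over seven minutes and is dropped.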
I've got a sample document that I'm trying to project within a MongoDB aggregate pipeline. I'm testing with a single document that looks roughly like this:
{
"_id" : "",
"title" : "Questions",
"sortIndex" : 0,
"topics" : [
{
"_id" : "",
"title" : "Creating a Question",
"sortIndex" : 1,
"thumbnail" : "CreatingAQuestion.jpg",
"seenBy" : [ "user101", "user202" ],
"pages" : [
{
"visual" : "SelectPlanets.gif",
"text" : "Some Markdown"
}
]
},
{
"_id" : "",
"title" : "Deleting a Question",
"sortIndex" : 0,
"thumbnail" : "DeletingAQuestion.jpg",
"seenBy" : [ "user101" ],
"pages" : [
{
"visual" : "SelectCard.gif",
"text" : "Some Markdown"
}
]
}
]
}
The output I'm trying to obtain is something along these lines:
{
"_id" : "",
"title" : "Questions",
"topics" : [
{
"title" : "Creating a Question",
"thumbnail" : "CreatingAQuestion.jpg",
"seen" : true
},
{
"title" : "Deleting a Question",
"thumbnail" : "DeletingAQuestion.jpg",
"seen" : false
}
]
}
Specifically the bit I'm struggling with is the seen flag.
I've read the docs which state:
When projecting or adding/resetting a field within an embedded document...
... Or you can nest the fields:
contact: { address: { country: <1 or 0 or expression> } }
I wish to use an expression and I took note of the following:
When nesting the fields, you cannot use dot notation inside the embedded document to specify the field, e.g. contact: { "address.country": <1 or 0 or expression> } is invalid.
So I'm trying to work out how to "reference" a subdocument field within an expression. That quote suggests I can't use dot notation, but I can't seem to get it working with nested notation either. Here's what I've got so far:
db
.getCollection('chapters')
.aggregate([
{
$project: {
title: 1,
topics: {
title: 1,
thumbnail: 1,
publishedAt: 1,
test: "$seenBy",
seen: { $in: ["user202", "$seenBy"] },
}
}
}
])
So I've hard coded user202 into my query for now, and expected to see true and false for the 2 documents. I've also put in a test field to map out the seenBy field from the sub-document. What this produces is:
{
"_id" : "",
"title" : "Questions",
"topics" : [
{
"title" : "Creating a Question",
"thumbnail" : "CreatingAQuestion.jpg",
"test" : [
"user101",
"user202"
],
"seen" : true
},
{
"title" : "Deleting a Question",
"thumbnail" : "DeletingAQuestion.jpg",
"test" : [
"user101",
"user202"
],
"seen" : true
}
]
}
So obviously my "$seenBy" isn't accessing the correct topic because the test field contains the data from the 1st document.
So ultimately my question is, how can I access the seenBy field within a subdocument, referring to the current subdocument so I can create an expression?
Note: I have got this working with multiple $project stages and an $unwind, but wanted to try to compress/clean it up a bit.
You really need to use $map here. Simply notating the array in projection ( which is a bit of a boon since MongoDB 3.2 ) does not really cut it when you need a localized value for the current element. That is what you need and it's what $map provides:
db.getCollection('chapters').aggregate([
{ $project: {
title: 1,
topics: {
$map: {
input: "$topics",
as: "t",
in: {
title: "$$t.title",
thumbnail: "$$t.thumbnail",
publishedAt: "$$t.publishedAt",
test: "$$t.seenBy",
seen: { $in: ["user202", "$$t.seenBy"] },
}
}
}}
])
So for each element the current value of "seenBy" as a property is being tested by the expression. Without the $map that is not possible, and you can only really notate the "whole" array. Which is really not what you want to test here.
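The per-element behaviour of $map is close to Array.prototype.map in plain JavaScript. The sketch below is only an analogy for reasoning about the projection, run on trimmed sample data; projectTopics is a hypothetical helper name and none of this executes on the server.

```javascript
// Hypothetical client-side analogue of the $map projection above.
function projectTopics(chapter, userId) {
  return {
    title: chapter.title,
    // Each topic is evaluated on its own, like the "$$t" variable in $map.
    topics: chapter.topics.map(t => ({
      title: t.title,
      thumbnail: t.thumbnail,
      seen: t.seenBy.includes(userId) // analogue of { $in: [userId, "$$t.seenBy"] }
    }))
  };
}

const chapter = {
  title: "Questions",
  topics: [
    { title: "Creating a Question", thumbnail: "CreatingAQuestion.jpg", seenBy: ["user101", "user202"] },
    { title: "Deleting a Question", thumbnail: "DeletingAQuestion.jpg", seenBy: ["user101"] }
  ]
};

const projected = projectTopics(chapter, "user202");
console.log(JSON.stringify(projected, null, 2));
```

With "user202" the first topic gets seen: true and the second gets seen: false, which matches the output the question asks for.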
I have a projection stage as follows,
{
'name': {$ifNull: [ '$invName', {} ]},
'info.type': {$ifNull: [ '$invType', {} ]},
'info.qty': {$ifNull: [ '$invQty', {} ]},
'info.detailed.desc': {$ifNull: [ '$invDesc', {} ]}
}
I am projecting an empty object ({}) when a field is not present, because if a sort is performed on a field that doesn't exist, that document comes first in the sort order (Sort Documents Without Existing Field to End of Results). The next stage is a sort, and I want documents with missing fields to come last in the sort order. This is working as expected.
Now I want to remove the fields that have an empty object as their value (if info.detailed.desc is empty, info.detailed should not be in the output). I could do this at the Node level using lodash, like this (https://stackoverflow.com/a/38278831/6048928), but I am trying to do it at the MongoDB level. Is it possible? I tried $redact, but it filters out the entire document. Is it possible to PRUNE or DESCEND fields of a document based on value?
Removing properties completely from documents is not a trivial thing. The basics are that the server itself has not had any way of doing this prior to MongoDB 3.4 and the introduction of $replaceRoot, which essentially allows an expression to be returned as the document context.
Even with that addition it's somewhat impractical to do so without the further features of $objectToArray and $arrayToObject, as introduced in MongoDB 3.4.4. But let's run through the cases.
Working with a quick sample
{ "_id" : ObjectId("59adff0aad465e105d91374c"), "a" : 1 }
{ "_id" : ObjectId("59adff0aad465e105d91374d"), "a" : {} }
Conditionally return root object
db.junk.aggregate([
{ "$replaceRoot": {
"newRoot": {
"$cond": {
"if": { "$ne": [ "$a", {} ] },
"then": "$$ROOT",
"else": { "_id": "$_id" }
}
}
}}
])
That's a pretty simple principle and can in fact be applied to any nested property to remove its sub-keys, but it would require various levels of nesting $cond or even $switch to apply the possible conditions. The $replaceRoot of course is needed for "top level" removal, since it's the only way to conditionally express which top level keys to return.
So whilst you can in theory use $cond or $switch to decide what to return, it's generally cumbersome and you would want something more flexible.
Filter the Empty Objects
db.junk.aggregate([
{ "$replaceRoot": {
"newRoot": {
"$arrayToObject": {
"$filter": {
"input": { "$objectToArray": "$$ROOT" },
"cond": { "$ne": [ "$$this.v", {} ] }
}
}
}
}}
])
This is where $objectToArray and $arrayToObject come into use. Instead of writing out the conditions for every possible key we just convert the object contents into an "array" and apply $filter on the array entries to decide what to keep.
The $objectToArray translates any object into an array of documents representing each property as "k" for the name of the key and "v" for the value from that property. Since these are now accessible as "values", you can use methods like $filter to inspect each array entry and discard the unwanted ones.
Finally $arrayToObject takes the "filtered" content and translates those "k" and "v" values back into property names and values as a resulting object. In this way, the "filter" conditions removes any properties from the result object that did not meet the criteria.
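The same three-step shape can be sketched in plain JavaScript, where Object.entries, Array.prototype.filter and Object.fromEntries play roughly the roles of $objectToArray, $filter and $arrayToObject. This is an analogy only; the helper names are hypothetical and the sample uses plain numbers for _id rather than ObjectId values.

```javascript
// True only for plain objects with no own enumerable keys, e.g. {}.
function isEmptyPlainObject(v) {
  return v !== null && typeof v === "object" &&
    Object.getPrototypeOf(v) === Object.prototype &&
    Object.keys(v).length === 0;
}

// Hypothetical helper mirroring $objectToArray -> $filter -> $arrayToObject.
function dropEmptyTopLevel(doc) {
  const pairs = Object.entries(doc);                            // like $objectToArray
  const kept = pairs.filter(([, v]) => !isEmptyPlainObject(v)); // like $filter
  return Object.fromEntries(kept);                              // like $arrayToObject
}

const filteredDocs = [{ _id: 1, a: 1 }, { _id: 2, a: {} }].map(dropEmptyTopLevel);
console.log(filteredDocs);
```

The second document loses its "a" key entirely, while the first is returned unchanged.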
A Return to $cond
db.junk.aggregate([
{ "$project": {
"a": { "$cond": [{ "$eq": [ "$a", {} ] }, "$$REMOVE", "$a" ] }
}}
])
MongoDB 3.6 introduces a new player with the $$REMOVE constant. This is a new feature that can be applied with $cond in order to decide whether or not to show the property at all. So that is another approach when of course the release is available.
In all those above cases the "a" property is not returned when the value is the empty object that we wanted to test for removal.
{ "_id" : ObjectId("59adff0aad465e105d91374c"), "a" : 1 }
{ "_id" : ObjectId("59adff0aad465e105d91374d") }
More Complex Structures
Your specific ask here is for data containing nested properties. So continuing on from the outlined approaches we can work with demonstrating how that is done.
First some sample data:
{ "_id" : ObjectId("59ae03bdad465e105d913750"), "a" : 1, "info" : { "type" : 1, "qty" : 2, "detailed" : { "desc" : "this thing" } } }
{ "_id" : ObjectId("59ae03bdad465e105d913751"), "a" : 2, "info" : { "type" : 2, "qty" : 3, "detailed" : { "desc" : { } } } }
{ "_id" : ObjectId("59ae03bdad465e105d913752"), "a" : 3, "info" : { "type" : 3, "qty" : { }, "detailed" : { "desc" : { } } } }
{ "_id" : ObjectId("59ae03bdad465e105d913753"), "a" : 4, "info" : { "type" : { }, "qty" : { }, "detailed" : { "desc" : { } } } }
Applying the filter method
db.junk.aggregate([
{ "$replaceRoot": {
"newRoot": {
"$arrayToObject": {
"$filter": {
"input": {
"$concatArrays": [
{ "$filter": {
"input": { "$objectToArray": "$$ROOT" },
"cond": { "$ne": [ "$$this.k", "info" ] }
}},
[
{
"k": "info",
"v": {
"$arrayToObject": {
"$filter": {
"input": { "$objectToArray": "$info" },
"cond": {
"$not": {
"$or": [
{ "$eq": [ "$$this.v", {} ] },
{ "$eq": [ "$$this.v.desc", {} ] }
]
}
}
}
}
}
}
]
]
},
"cond": { "$ne": [ "$$this.v", {} ] }
}
}
}
}}
])
This needs more complex handling because of the nested levels. In the main case you need to look at the "info" key independently and first remove any sub-properties that do not qualify. Since you need to return "something", we then need to remove the "info" key itself when all of its inner properties are removed. This is the reason for the nested filter operations on each set of results.
Applying $cond with $$REMOVE
Where available this would at first seem a more logical choice, so it helps to look at this from the most simplified form first:
db.junk.aggregate([
{ "$addFields": {
"info.type": {
"$cond": [
{ "$eq": [ "$info.type", {} ] },
"$$REMOVE",
"$info.type"
]
},
"info.qty": {
"$cond": [
{ "$eq": [ "$info.qty", {} ] },
"$$REMOVE",
"$info.qty"
]
},
"info.detailed.desc": {
"$cond": [
{ "$eq": [ "$info.detailed.desc", {} ] },
"$$REMOVE",
"$info.detailed.desc"
]
}
}}
])
But then you need to look at the output this actually produces:
/* 1 */
{
"_id" : ObjectId("59ae03bdad465e105d913750"),
"a" : 1.0,
"info" : {
"type" : 1.0,
"qty" : 2.0,
"detailed" : {
"desc" : "this thing"
}
}
}
/* 2 */
{
"_id" : ObjectId("59ae03bdad465e105d913751"),
"a" : 2.0,
"info" : {
"type" : 2.0,
"qty" : 3.0,
"detailed" : {}
}
}
/* 3 */
{
"_id" : ObjectId("59ae03bdad465e105d913752"),
"a" : 3.0,
"info" : {
"type" : 3.0,
"detailed" : {}
}
}
/* 4 */
{
"_id" : ObjectId("59ae03bdad465e105d913753"),
"a" : 4.0,
"info" : {
"detailed" : {}
}
}
Whilst the other keys are removed, "info.detailed" still stays around because there is nothing that actually tests at this level. In fact you simply cannot express this in simple terms, so the only way to work around it is to evaluate the object as an expression and then apply additional filtering and conditions on each level of output to see where the empty objects still reside, and remove them:
db.junk.aggregate([
{ "$addFields": {
"info": {
"$let": {
"vars": {
"info": {
"$arrayToObject": {
"$filter": {
"input": {
"$objectToArray": {
"type": { "$cond": [ { "$eq": [ "$info.type", {} ] },"$$REMOVE", "$info.type" ] },
"qty": { "$cond": [ { "$eq": [ "$info.qty", {} ] },"$$REMOVE", "$info.qty" ] },
"detailed": {
"desc": { "$cond": [ { "$eq": [ "$info.detailed.desc", {} ] },"$$REMOVE", "$info.detailed.desc" ] }
}
}
},
"cond": { "$ne": [ "$$this.v", {} ] }
}
}
}
},
"in": { "$cond": [ { "$eq": [ "$$info", {} ] }, "$$REMOVE", "$$info" ] }
}
}
}}
])
That approach as with the plain $filter method actually removes "all" empty objects from the results:
/* 1 */
{
"_id" : ObjectId("59ae03bdad465e105d913750"),
"a" : 1.0,
"info" : {
"type" : 1.0,
"qty" : 2.0,
"detailed" : {
"desc" : "this thing"
}
}
}
/* 2 */
{
"_id" : ObjectId("59ae03bdad465e105d913751"),
"a" : 2.0,
"info" : {
"type" : 2.0,
"qty" : 3.0
}
}
/* 3 */
{
"_id" : ObjectId("59ae03bdad465e105d913752"),
"a" : 3.0,
"info" : {
"type" : 3.0
}
}
/* 4 */
{
"_id" : ObjectId("59ae03bdad465e105d913753"),
"a" : 4.0
}
Doing it all in Code
So everything here really depends on latest features or indeed "coming features" to be available in the MongoDB version you are using. Where these are not available the alternate approach is to simply remove the empty objects from the results returned by the cursor.
It's often the most sane thing to do, and really is all you require unless the aggregation pipeline needs to continue past the point where the fields are being removed. Even then, you probably should be logically working around that and leave the final results to cursor processing.
As JavaScript for the shell you can use the following approach, and the principles essentially stay the same no matter which actual language implementation:
db.junk.find().map( d => {
let info = Object.keys(d.info)
.map( k => ({ k, v: d.info[k] }))
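.filter(e => !(
// Guard against null and against a missing or non-object "desc" before calling Object.keys
e.v !== null && typeof e.v === 'object' &&
( Object.keys(e.v).length === 0 ||
( e.v.desc !== null && typeof e.v.desc === 'object' && Object.keys(e.v.desc).length === 0 ) )
))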
.reduce((acc,curr) => Object.assign(acc,{ [curr.k]: curr.v }),{});
delete d.info;
return Object.assign(d,(Object.keys(info).length !== 0) ? { info } : {})
})
This is pretty much the native language way of stating the same thing as the examples above: where one of the expected properties contains an empty object, remove that property from the output completely.
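Where the nesting depth is not known in advance, a recursive variant of the same idea may be easier to maintain. The following is a sketch under the assumption that only plain objects should be pruned; pruneEmpty and isPlainObject are hypothetical helper names, and values such as numbers, strings, arrays, dates or ObjectId instances are passed through untouched.

```javascript
// True for objects created by literals like {} or { a: 1 }, not for arrays or class instances.
function isPlainObject(v) {
  return v !== null && typeof v === "object" &&
    Object.getPrototypeOf(v) === Object.prototype;
}

// Hypothetical helper: returns a copy of `value` with empty plain objects removed
// at every depth. Returns undefined when the value itself prunes down to nothing.
function pruneEmpty(value) {
  if (!isPlainObject(value)) return value;
  const out = {};
  for (const [k, v] of Object.entries(value)) {
    const pruned = pruneEmpty(v);
    if (pruned !== undefined) out[k] = pruned;
  }
  return Object.keys(out).length === 0 ? undefined : out;
}

const nestedSamples = [
  { a: 1, info: { type: 1, qty: 2, detailed: { desc: "this thing" } } },
  { a: 2, info: { type: 2, qty: 3, detailed: { desc: {} } } },
  { a: 3, info: { type: 3, qty: {}, detailed: { desc: {} } } },
  { a: 4, info: { type: {}, qty: {}, detailed: { desc: {} } } }
];

const prunedSamples = nestedSamples.map(pruneEmpty);
console.log(JSON.stringify(prunedSamples));
```

Because a property is dropped only after its children have been pruned, "info.detailed" disappears once "desc" is gone, and "info" disappears once all of its keys are gone, without listing any field names.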
I have removed the brands object from the output JSON using $project at the end of the aggregation pipeline:
db.Product.aggregate([
{
$lookup: {
from: "wishlists",
let: { product: "$_id" },
pipeline: [
{
$match: {
$and: [
{ $expr: { $eq: ["$$product", "$product"] } },
{ user: userId }
]
}
}
],
as: "isLiked"
}
},
{
$lookup: {
from: "brands",
localField: "brand",
foreignField: "_id",
as: "brands"
}
},
{
$addFields: {
isLiked: { $arrayElemAt: ["$isLiked.isLiked", 0] }
}
},
{
$unwind: "$brands"
},
{
$addFields: {
"brand.name": "$brands.name" ,
"brand._id": "$brands._id"
}
},
{
$match:{ isActive: true }
},
{
$project: { "brands" : 0 }
}
]);
$group: {
_id: '$_id',
tasks: {
$addToSet: {
$cond: {
if: {
$eq: [
{
$ifNull: ['$tasks.id', ''],
},
'',
],
},
then: '$$REMOVE',
else: {
id: '$tasks.id',
description: '$tasks.description',
assignee: {
$cond: {
if: {
$eq: [
{
$ifNull: ['$tasks.assignee._id', ''],
},
'',
],
},
then: undefined,
else: {
id: '$tasks.assignee._id',
name: '$tasks.assignee.name',
thumbnail: '$tasks.assignee.thumbnail',
status: '$tasks.assignee.status',
},
},
},
},
},
},
},
}