Unused indexes in MongoDB

I have the following code snippet to list unused indexes in Mongo but how can I exclude TTL indexes from the results?
db.getMongo().getDBNames().forEach(function (dbname) {
if (dbname != "admin") {
db.getSiblingDB(dbname).getCollectionNames().forEach(function (cname) {
output = db.getSiblingDB(dbname)[cname].aggregate({$indexStats:{} });
output.forEach(function(findUnused) {
if (findUnused.accesses.ops == 0 && findUnused.name != "_id_") {
print(dbname + " \t" + cname + " \t" + JSON.stringify(findUnused) );
}
})
})
}})
Typical output is below. I am not sure how to exclude the TTL indexes, which are denoted by the expireAfterSeconds field.
dbname collection {"name":"collection","key":{"backupTime":1},"host":"server:27017","accesses":{"ops":0,"since":"2001-01-12T10:01:03.338Z"},"spec":{"v":2,"key":{"backupTime":1},"name":"collection","expireAfterSeconds":1234567}}

In your aggregation, after the $indexStats stage, you can add a $match with "spec.expireAfterSeconds": {$exists: false} to exclude TTL indexes. That field is present only on TTL indexes.
db.orders.aggregate([
{
$indexStats: {}
},
{
$match: {
"spec.expireAfterSeconds": {
$exists: false
},
"accesses.ops": 0,
"name": {
$ne: "_id_"
}
}
}
])
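The same conditions can also be applied on the client side, inside the loop from the question. This is a minimal sketch, assuming each element has the shape of the $indexStats output shown in the question; `isUnusedNonTtlIndex` is a hypothetical helper name, not part of the shell API, and the sample entries are made up for illustration.

```javascript
// Hypothetical helper: decides whether one $indexStats result should be reported.
// Assumes the document shape shown in the question's sample output.
function isUnusedNonTtlIndex(stat) {
  const spec = stat.spec || {};
  // A TTL index is recognised by the presence of expireAfterSeconds in its spec.
  const isTtl = Object.prototype.hasOwnProperty.call(spec, "expireAfterSeconds");
  // Loose equality on ops, as in the question's original loop.
  return stat.accesses.ops == 0 && stat.name !== "_id_" && !isTtl;
}

// Illustrative entries modelled on the question's output (values are invented).
const sampleStats = [
  { name: "_id_", accesses: { ops: 0 }, spec: { v: 2, key: { _id: 1 }, name: "_id_" } },
  { name: "backupTime_1", accesses: { ops: 0 },
    spec: { v: 2, key: { backupTime: 1 }, name: "backupTime_1", expireAfterSeconds: 1234567 } },
  { name: "status_1", accesses: { ops: 0 }, spec: { v: 2, key: { status: 1 }, name: "status_1" } },
  { name: "user_1", accesses: { ops: 42 }, spec: { v: 2, key: { user: 1 }, name: "user_1" } }
];

const unusedNames = sampleStats.filter(isUnusedNonTtlIndex).map(s => s.name);
console.log(unusedNames);
```

In the question's loop, the existing `if` condition could call such a helper instead of testing `accesses.ops` and `name` inline.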

Related

Update or Upsert multiple document mongodb

I am working on a project that uses a MongoDB database, and I want to update or upsert multiple documents.
Here is what I want to do:
if (you received an array of id )
{
create a verify condition which checks if there is a document for each id and respect the dateFin condition
}
else if (you received one id)
{
create a verify condition which checks if there is a document equal the id and respect the dateFin condition
}
then check if there is a document, then update it, else create it either you got an array or one id
// Note: in moment format strings 'mm' is minutes and 'ss' is seconds ('MM' is the month, 'SS' fractional seconds)
if (Array.isArray(req.body.user) === true)
verify = {
idUser: { $in: req.body.user.map(id => { return ObjectId(id) }) },
dateFin: { $gt: moment().utc(1).format('YYYY-MM-DD HH:mm:ss') }
}
else verify =
{
idUser: ObjectId(req.body.user),
dateFin: { $gt: moment().utc(1).format('YYYY-MM-DD HH:mm:ss') }
}
db.collection("paiement").updateMany(verify,
{
$set: {
dateFin: moment(req.body.date).format('YYYY-MM-DD HH:mm:ss'),
paiementType: "offre",
abonnement: req.body.abonnement,
dateInsert: moment().utc(1).format('YYYY-MM-DD HH:mm:ss'),
},
},
{ upsert: true },
(err, document) => {
if (err) return res.status(400).json(err);
console.log(err);
console.log(document.result);
})
}
This works with one id, but not with multiple ids. Thank you to anyone who can help.
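One possible explanation, offered as a sketch rather than a confirmed diagnosis: with upsert: true, updateMany inserts at most one document when nothing matches, so a single $in filter cannot create one document per id. A way around this is to issue one upsert per id through bulkWrite. The helper name `buildUpsertOps` and the plain string ids are assumptions for illustration; in the real code the ids would be wrapped with ObjectId and the dates formatted as in the question.

```javascript
// Hypothetical helper: builds one updateOne/upsert operation per user id.
function buildUpsertOps(userIds, now, fields) {
  // Accept either a single id or an array of ids.
  const ids = Array.isArray(userIds) ? userIds : [userIds];
  return ids.map(id => ({
    updateOne: {
      // Equality on idUser lets an upsert copy the id into the new document.
      filter: { idUser: id, dateFin: { $gt: now } },
      update: { $set: fields },
      upsert: true
    }
  }));
}

const ops = buildUpsertOps(
  ["id1", "id2"],
  "2020-01-01 00:00:00",
  { dateFin: "2020-02-01 00:00:00", paiementType: "offre" }
);
console.log(JSON.stringify(ops, null, 2));

// With a driver collection handle, the operations would then be sent as:
// db.collection("paiement").bulkWrite(ops)
```

Each id then gets its own update-or-insert decision, instead of all ids sharing one.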

MongoDB $graphLookup inside update query [duplicate]

In MongoDB, is it possible to update the value of a field using the value from another field? The equivalent SQL would be something like:
UPDATE Person SET Name = FirstName + ' ' + LastName
And the MongoDB pseudo-code would be:
db.person.update( {}, { $set : { name : firstName + ' ' + lastName } } );

The best way to do this is in version 4.2+, which allows using an aggregation pipeline in the update document with the updateOne, updateMany, or update (deprecated in most, if not all, language drivers) collection methods.
MongoDB 4.2+
Version 4.2 also introduced the $set pipeline stage operator, which is an alias for $addFields. I will use $set here as it maps with what we are trying to achieve.
db.collection.<update method>(
{},
[
{"$set": {"name": { "$concat": ["$firstName", " ", "$lastName"]}}}
]
)
Note that square brackets in the second argument to the method specify an aggregation pipeline instead of a plain update document because using a simple document will not work correctly.
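One detail that may matter when running this on real data: $concat evaluates to null if any operand is null or refers to a missing field, so documents lacking firstName or lastName would end up with name: null. The plain JavaScript function below only imitates that rule for illustration; `concatLikeMongo` is a hypothetical name, and the server does the real work.

```javascript
// Illustrative stand-in for the $concat rule: a null or missing operand yields null.
function concatLikeMongo(...parts) {
  if (parts.some(p => p === null || p === undefined)) {
    return null;
  }
  return parts.join("");
}

const people = [
  { firstName: "Hello", lastName: "World" },
  { firstName: "Solo" } // lastName is missing
];

const names = people.map(p => concatLikeMongo(p.firstName, " ", p.lastName));
console.log(names); // [ 'Hello World', null ]
```

If that is undesirable, wrapping each operand in $ifNull, for example { "$ifNull": ["$lastName", ""] }, is one way to substitute a default.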
MongoDB 3.4+
In 3.4+, you can use $addFields and the $out aggregation pipeline operators.
db.collection.aggregate(
[
{ "$addFields": {
"name": { "$concat": [ "$firstName", " ", "$lastName" ] }
}},
{ "$out": <output collection name> }
]
)
Note that this does not update your collection in place but instead replaces the existing collection or creates a new one. Also, for update operations that require "typecasting", you will need client-side processing, and depending on the operation, you may need to use the find() method instead of the .aggregate() method.
MongoDB 3.2 and 3.0
The way we do this is by $projecting our documents and using the $concat string aggregation operator to return the concatenated string.
You then iterate the cursor and use the $set update operator to add the new field to your documents using bulk operations for maximum efficiency.
Aggregation query:
var cursor = db.collection.aggregate([
{ "$project": {
"name": { "$concat": [ "$firstName", " ", "$lastName" ] }
}}
])
MongoDB 3.2 or newer
You need to use the bulkWrite method.
var requests = [];
cursor.forEach(document => {
requests.push( {
'updateOne': {
'filter': { '_id': document._id },
'update': { '$set': { 'name': document.name } }
}
});
if (requests.length === 500) {
//Execute per 500 operations and re-init
db.collection.bulkWrite(requests);
requests = [];
}
});
if(requests.length > 0) {
db.collection.bulkWrite(requests);
}
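The batching logic above can be separated from the database calls. This is a rough sketch, assuming the documents have already been read into an array; `chunkUpdateRequests` is a hypothetical helper, and 500 is simply the batch size used in the answer.

```javascript
// Hypothetical helper: turns documents into updateOne requests, grouped in batches.
function chunkUpdateRequests(docs, batchSize) {
  const batches = [];
  let current = [];
  for (const doc of docs) {
    current.push({
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { name: doc.name } }
      }
    });
    // Close the batch once it reaches the requested size.
    if (current.length === batchSize) {
      batches.push(current);
      current = [];
    }
  }
  // Keep the final, partially filled batch.
  if (current.length > 0) batches.push(current);
  return batches;
}

const docs = Array.from({ length: 1200 }, (_, i) => ({ _id: i, name: "name " + i }));
const batches = chunkUpdateRequests(docs, 500);
console.log(batches.map(b => b.length)); // [ 500, 500, 200 ]
```

Each batch could then be passed to db.collection.bulkWrite in turn.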
MongoDB 2.6 and 3.0
From this version, you need to use the now deprecated Bulk API and its associated methods.
var bulk = db.collection.initializeUnorderedBulkOp();
var count = 0;
cursor.snapshot().forEach(function(document) {
bulk.find({ '_id': document._id }).updateOne( {
'$set': { 'name': document.name }
});
count++;
if(count%500 === 0) {
// Execute per 500 operations and re-init
bulk.execute();
bulk = db.collection.initializeUnorderedBulkOp();
}
})
// clean up queues
if(count > 0) {
bulk.execute();
}
MongoDB 2.4
cursor["result"].forEach(function(document) {
db.collection.update(
{ "_id": document._id },
{ "$set": { "name": document.name } }
);
})

You should iterate through. For your specific case:
db.person.find().snapshot().forEach(
function (elem) {
db.person.update(
{
_id: elem._id
},
{
$set: {
name: elem.firstname + ' ' + elem.lastname
}
}
);
}
);

Apparently there is a way to do this efficiently since MongoDB 3.4, see styvane's answer.
Obsolete answer below
You cannot refer to the document itself in an update (yet). You'll need to iterate through the documents and update each document using a function. See this answer for an example, or this one for server-side eval().

For a database with high activity, you may run into issues where your updates affect actively changing records and for this reason I recommend using snapshot()
db.person.find().snapshot().forEach( function (hombre) {
hombre.name = hombre.firstName + ' ' + hombre.lastName;
db.person.save(hombre);
});
http://docs.mongodb.org/manual/reference/method/cursor.snapshot/

Starting Mongo 4.2, db.collection.update() can accept an aggregation pipeline, finally allowing the update/creation of a field based on another field:
// { firstName: "Hello", lastName: "World" }
db.collection.updateMany(
{},
[{ $set: { name: { $concat: [ "$firstName", " ", "$lastName" ] } } }]
)
// { "firstName" : "Hello", "lastName" : "World", "name" : "Hello World" }
The first part {} is the match query, filtering which documents to update (in our case all documents).
The second part [{ $set: { name: { ... } } }] is the update aggregation pipeline (note the square brackets signifying the use of an aggregation pipeline). $set is a new aggregation operator and an alias of $addFields.

Regarding this answer, the snapshot function is deprecated in version 3.6, according to this update. So, on version 3.6 and above, it is possible to perform the operation this way:
db.person.find().forEach(
function (elem) {
db.person.update(
{
_id: elem._id
},
{
$set: {
name: elem.firstname + ' ' + elem.lastname
}
}
);
}
);

I tried the above solution but I found it unsuitable for large amounts of data. I then discovered the stream feature:
MongoClient.connect("...", function(err, db){
var c = db.collection('yourCollection');
var s = c.find({/* your query */}).stream();
s.on('data', function(doc){
c.update({_id: doc._id}, {$set: {name : doc.firstName + ' ' + doc.lastName}}, function(err, result) { /* result == true? */ });
});
s.on('end', function(){
// stream can end before all your updates do if you have a lot
})
})

The update() method accepts an aggregation pipeline as its second parameter, like this:
db.collection_name.update(
{
// Query
},
[
// Aggregation pipeline
{ "$set": { "id": "$_id" } }
],
{
// Options
"multi": true // false when a single doc has to be updated
}
)
The field can be set or unset with existing values using the aggregation pipeline.
Note: use $ with field name to specify the field which has to be read.

Here's what we came up with for copying one field to another for ~150_000 records. It took about 6 minutes, but is still significantly less resource intensive than it would have been to instantiate and iterate over the same number of ruby objects.
js_query = %({
$or : [
{
'settings.mobile_notifications' : { $exists : false },
'settings.mobile_admin_notifications' : { $exists : false }
}
]
})
js_for_each = %(function(user) {
if (!user.settings.hasOwnProperty('mobile_notifications')) {
user.settings.mobile_notifications = user.settings.email_notifications;
}
if (!user.settings.hasOwnProperty('mobile_admin_notifications')) {
user.settings.mobile_admin_notifications = user.settings.email_admin_notifications;
}
db.users.save(user);
})
js = "db.users.find(#{js_query}).forEach(#{js_for_each});"
Mongoid::Sessions.default.command('$eval' => js)

With MongoDB version 4.2+, updates are more flexible because update, updateOne and updateMany accept an aggregation pipeline. You can now transform your documents using aggregation operators and then update them without explicitly using the $set update operator (instead we use $replaceRoot: {newRoot: "$$ROOT"}).
Here we use the aggregation pipeline to extract the timestamp from MongoDB's ObjectID "_id" field and update the documents (I am not an expert in SQL, but I think SQL does not provide an auto-generated ObjectID with a timestamp in it; you would have to create that date yourself).
var collection = "person"
agg_query = [
{
"$addFields" : {
"_last_updated" : {
"$toDate" : "$_id"
}
}
},
{
$replaceRoot: {
newRoot: "$$ROOT"
}
}
]
db.getCollection(collection).updateMany({}, agg_query, {upsert: true})

(I would have posted this as a comment, but couldn't)
For anyone who lands here trying to update one field using another field in the document with the C# driver...
I could not figure out how to use any of the UpdateXXX methods and their associated overloads since they take an UpdateDefinition as an argument.
// we want to set Prop1 to Prop2
class Foo { public string Prop1 { get; set; } public string Prop2 { get; set;} }
void Test()
{
var update = new UpdateDefinitionBuilder<Foo>();
update.Set(x => x.Prop1, <new value; no way to get a hold of the object that I can find>)
}
As a workaround, I found that you can use the RunCommand method on an IMongoDatabase (https://docs.mongodb.com/manual/reference/command/update/#dbcmd.update).
var command = new BsonDocument
{
{ "update", "CollectionToUpdate" },
{ "updates", new BsonArray
{
new BsonDocument
{
// Any filter; here the check is if Prop1 does not exist
{ "q", new BsonDocument{ ["Prop1"] = new BsonDocument("$exists", false) }},
// set it to the value of Prop2
{ "u", new BsonArray { new BsonDocument { ["$set"] = new BsonDocument("Prop1", "$Prop2") }}},
{ "multi", true }
}
}
}
};
database.RunCommand<BsonDocument>(command);

MongoDB 4.2+ Golang
result, err := collection.UpdateMany(ctx, bson.M{},
    mongo.Pipeline{
        bson.D{{"$set",
            bson.M{"name": bson.M{"$concat": []string{"$lastName", " ", "$firstName"}}},
        }},
    },
)

Getting a list of unique fields within a collection including nested [duplicate]

I'd like to get the names of all the keys in a MongoDB collection.
For example, from this:
db.things.insert( { type : ['dog', 'cat'] } );
db.things.insert( { egg : ['cat'] } );
db.things.insert( { type : [] } );
db.things.insert( { hello : [] } );
I'd like to get the unique keys:
type, egg, hello

You could do this with MapReduce:
mr = db.runCommand({
"mapreduce" : "my_collection",
"map" : function() {
for (var key in this) { emit(key, null); }
},
"reduce" : function(key, stuff) { return null; },
"out": "my_collection" + "_keys"
})
Then run distinct on the resulting collection so as to find all the keys:
db[mr.result].distinct("_id")
["foo", "bar", "baz", "_id", ...]

With Kristina's answer as inspiration, I created an open source tool called Variety which does exactly this: https://github.com/variety/variety

You can use aggregation with the $objectToArray aggregation operator, new in version 3.4.4, to convert all top-level key-value pairs into arrays of documents, followed by $unwind and $group with $addToSet to get distinct keys across the entire collection. (Use $$ROOT to reference the top-level document.)
db.things.aggregate([
{"$project":{"arrayofkeyvalue":{"$objectToArray":"$$ROOT"}}},
{"$unwind":"$arrayofkeyvalue"},
{"$group":{"_id":null,"allkeys":{"$addToSet":"$arrayofkeyvalue.k"}}}
])
You can use the following query for getting keys in a single document.
db.things.aggregate([
{"$match":{_id: "<<ID>>"}}, /* Replace with the document's ID */
{"$project":{"arrayofkeyvalue":{"$objectToArray":"$$ROOT"}}},
{"$project":{"keys":"$arrayofkeyvalue.k"}}
])

A cleaned up and reusable solution using pymongo:
from pymongo import MongoClient
from bson import Code
def get_keys(db, collection):
    client = MongoClient()
    db = client[db]
    map = Code("function() { for (var key in this) { emit(key, null); } }")
    reduce = Code("function(key, stuff) { return null; }")
    result = db[collection].map_reduce(map, reduce, "myresults")
    return result.distinct('_id')
Usage:
get_keys('dbname', 'collection')
>> ['key1', 'key2', ... ]

If your target collection is not too large, you can try this under mongo shell client:
var allKeys = {};
db.YOURCOLLECTION.find().forEach(function(doc){Object.keys(doc).forEach(function(key){allKeys[key]=1})});
allKeys;

If you are using mongodb 3.4.4 and above then you can use below aggregation using $objectToArray and $group aggregation
db.collection.aggregate([
{ "$project": {
"data": { "$objectToArray": "$$ROOT" }
}},
{ "$project": { "data": "$data.k" }},
{ "$unwind": "$data" },
{ "$group": {
"_id": null,
"keys": { "$addToSet": "$data" }
}}
])

Try this:
doc=db.things.findOne();
for (key in doc) print(key);

Using Python. Returns the set of all top-level keys in the collection:
# Using pymongo and a connection named 'db'
from functools import reduce  # reduce is not a builtin in Python 3
reduce(
    lambda all_keys, rec_keys: all_keys | set(rec_keys),
    map(lambda d: d.keys(), db.things.find()),
    set()
)

Here is a sample that works in Python.
It returns the results inline.
from pymongo import MongoClient
from bson.code import Code
mapper = Code("""
function() {
for (var key in this) { emit(key, null); }
}
""")
reducer = Code("""
function(key, stuff) { return null; }
""")
distinctThingFields = db.things.map_reduce(mapper, reducer
, out = {'inline' : 1}
, full_response = True)
## do something with distinctThingFields['results']

I am surprised no one here has answered using plain JavaScript and a Set, which automatically filters out duplicate values. A simple example in the mongo shell is below:
var allKeys = new Set()
db.collectionName.find().forEach( function (o) {for (key in o ) allKeys.add(key)})
for(let key of allKeys) print(key)
This will print all possible unique keys in the collection name: collectionName.
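Since the question also asks about nested fields, the same Set idea can be extended to walk embedded documents. This is a sketch under assumptions: `collectKeyPaths` is a hypothetical helper, nested keys are reported in dot notation, and arrays are traversed without adding their numeric indexes to the path. It is written for plain objects; in the shell, BSON wrapper types such as ObjectId may need an extra check.

```javascript
// Hypothetical helper: adds every key path found in `value` to the Set `out`.
function collectKeyPaths(value, prefix, out) {
  if (Array.isArray(value)) {
    // Descend into array elements but keep the parent's path.
    for (const item of value) collectKeyPaths(item, prefix, out);
  } else if (value !== null && typeof value === "object") {
    for (const key of Object.keys(value)) {
      const path = prefix ? prefix + "." + key : key;
      out.add(path);
      collectKeyPaths(value[key], path, out);
    }
  }
  return out;
}

// Illustrative documents; the third one is invented to show nesting.
const sampleDocs = [
  { type: ["dog", "cat"] },
  { egg: ["cat"] },
  { hello: [], owner: { name: "x", tags: [{ label: "a" }] } }
];

const allPaths = new Set();
sampleDocs.forEach(doc => collectKeyPaths(doc, "", allPaths));
console.log([...allPaths].sort());
```

In the shell, the forEach over `sampleDocs` would be replaced by a forEach over db.collectionName.find().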

I think the best way to do this, as mentioned here, is in MongoDB 3.4.4+, but without using the $unwind operator and using only two stages in the pipeline. Instead we can use the $mergeObjects and $objectToArray operators (note that $mergeObjects itself was added in MongoDB 3.6).
In the $group stage, we use the $mergeObjects operator to return a single document whose key/value pairs come from all documents in the collection.
Then comes the $project stage, where we use $map and $objectToArray to return the keys.
let allTopLevelKeys = [
{
"$group": {
"_id": null,
"array": {
"$mergeObjects": "$$ROOT"
}
}
},
{
"$project": {
"keys": {
"$map": {
"input": { "$objectToArray": "$array" },
"in": "$$this.k"
}
}
}
}
];
Now, if we have nested documents and want to get their keys as well, this is doable. For simplicity, let us consider documents with a simple embedded document that look like this:
{field1: {field2: "abc"}, field3: "def"}
{field1: {field3: "abc"}, field4: "def"}
The following pipeline yields all keys (field1, field2, field3, field4).
let allFirstSecondLevelKeys = [
{
"$group": {
"_id": null,
"array": {
"$mergeObjects": "$$ROOT"
}
}
},
{
"$project": {
"keys": {
"$setUnion": [
{
"$map": {
"input": {
"$reduce": {
"input": {
"$map": {
"input": {
"$objectToArray": "$array"
},
"in": {
"$cond": [
{
"$eq": [
{
"$type": "$$this.v"
},
"object"
]
},
{
"$objectToArray": "$$this.v"
},
[
"$$this"
]
]
}
}
},
"initialValue": [
],
"in": {
"$concatArrays": [
"$$this",
"$$value"
]
}
}
},
"in": "$$this.k"
}
}
]
}
}
}
]
With a little effort, we can also get the keys for all subdocuments in an array field whose elements are objects.

This works fine for me:
var arrayOfFieldNames = [];
var items = db.NAMECOLLECTION.find();
while(items.hasNext()) {
var item = items.next();
for(var index in item) {
arrayOfFieldNames[index] = index;
}
}
for (var index in arrayOfFieldNames) {
print(index);
}

Maybe slightly off-topic, but you can recursively pretty-print all keys/fields of an object:
function _printFields(item, level) {
if ((typeof item) != "object") {
return
}
for (var index in item) {
print(" ".repeat(level * 4) + index)
if ((typeof item[index]) == "object") {
_printFields(item[index], level + 1)
}
}
}
function printFields(item) {
_printFields(item, 0)
}
Useful when all objects in a collection have the same structure.

To get a list of all the keys minus _id, consider running the following aggregate pipeline:
var keys = db.collection.aggregate([
{ "$project": {
"hashmaps": { "$objectToArray": "$$ROOT" }
} },
{ "$group": {
"_id": null,
"fields": { "$addToSet": "$hashmaps.k" }
} },
{ "$project": {
"keys": {
"$setDifference": [
{
"$reduce": {
"input": "$fields",
"initialValue": [],
"in": { "$setUnion" : ["$$value", "$$this"] }
}
},
["_id"]
]
}
}
}
]).toArray()[0]["keys"];

I know I am late to the party, but if you want a quick solution in Python for finding all keys (even the nested ones), you could use a recursive function:
def get_keys(dl, keys=None):
    # Explicit None check, so an empty list passed in by a recursive call is not replaced
    if keys is None:
        keys = []
    if isinstance(dl, dict):
        keys += dl.keys()
        list(map(lambda x: get_keys(x, keys), dl.values()))
    elif isinstance(dl, list):
        list(map(lambda x: get_keys(x, keys), dl))
    return list(set(keys))
and use it like:
dl = db.things.find_one({})
get_keys(dl)
If your documents do not have identical keys, you can do:
dl = db.things.find({})
# Union the keys of every document, not only the first one
list(set().union(*map(get_keys, dl)))
But this solution can surely be optimized.
In general, this solves the problem of finding keys in nested dicts, so it is not MongoDB specific.

Based on @Wolkenarchitekt's answer (https://stackoverflow.com/a/48117846/8808983), I wrote a script that finds patterns in all keys in the db, and I think it can help others reading this thread:
"""
Python 3
This script takes a list of patterns and prints the collections that contain fields matching these patterns.
"""
import argparse
import pymongo
from bson import Code
# initialize mongo connection:
def get_db():
    client = pymongo.MongoClient("172.17.0.2")
    db = client["Data"]
    return db
def get_commandline_options():
    description = "To run use: python db_fields_pattern_finder.py -p <list_of_patterns>"
    parser = argparse.ArgumentParser(description=description)
    parser.add_argument('-p', '--patterns', nargs="+", help='List of patterns to look for in the db.', required=True)
    return parser.parse_args()
def report_matching_fields(relevant_fields_by_collection):
    print("Matches:")
    for collection_name in relevant_fields_by_collection:
        if relevant_fields_by_collection[collection_name]:
            print(f"{collection_name}: {relevant_fields_by_collection[collection_name]}")
    # pprint(relevant_fields_by_collection)
def get_collections_names(db):
    """
    :param pymongo.database.Database db:
    :return list: collections names
    """
    return db.list_collection_names()
def get_keys(db, collection):
    """
    See: https://stackoverflow.com/a/48117846/8808983
    :param db:
    :param collection:
    :return:
    """
    map = Code("function() { for (var key in this) { emit(key, null); } }")
    reduce = Code("function(key, stuff) { return null; }")
    result = db[collection].map_reduce(map, reduce, "myresults")
    return result.distinct('_id')
def get_fields(db, collection_names):
    fields_by_collections = {}
    for collection_name in collection_names:
        fields_by_collections[collection_name] = get_keys(db, collection_name)
    return fields_by_collections
def get_matches_fields(fields_by_collections, patterns):
    relevant_fields_by_collection = {}
    for collection_name in fields_by_collections:
        relevant_fields = [field for field in fields_by_collections[collection_name] if
                           [pattern for pattern in patterns if
                            pattern in field]]
        relevant_fields_by_collection[collection_name] = relevant_fields
    return relevant_fields_by_collection
def main(patterns):
    """
    :param list patterns: List of strings to look for in the db.
    """
    db = get_db()
    collection_names = get_collections_names(db)
    fields_by_collections = get_fields(db, collection_names)
    relevant_fields_by_collection = get_matches_fields(fields_by_collections, patterns)
    report_matching_fields(relevant_fields_by_collection)
if __name__ == '__main__':
    args = get_commandline_options()
    main(args.patterns)

As per the MongoDB documentation, a combination of distinct:
Finds the distinct values for a specified field across a single collection or view and returns the results in an array.
and the indexes collection operation is what would return all possible values for a given key, or index:
Returns an array that holds a list of documents that identify and describe the existing indexes on the collection
So one could use a method like the following to query a collection for all its registered indexes and return, say, an object with the indexes as keys (this example uses async/await for NodeJS, but obviously you could use any other asynchronous approach):
async function GetFor(collection, index) {
let currentIndexes;
let indexNames = [];
let final = {};
let vals = [];
try {
currentIndexes = await collection.indexes();
await ParseIndexes();
//Check if a specific index was queried, otherwise, iterate for all existing indexes
if (index && typeof index === "string") return await ParseFor(index, indexNames);
await ParseDoc(indexNames);
await Promise.all(vals);
return final;
} catch (e) {
throw e;
}
function ParseIndexes() {
return new Promise(function (result) {
let err;
for (let ind in currentIndexes) {
let index = currentIndexes[ind];
if (!index) {
err = "No Key For Index "+index; break;
}
let Name = Object.keys(index.key);
if (Name.length === 0) {
err = "No Name For Index"; break;
}
indexNames.push(Name[0]);
}
return result(err ? Promise.reject(err) : Promise.resolve());
})
}
async function ParseFor(index, inDoc) {
if (inDoc.indexOf(index) === -1) throw "No Such Index In Collection";
try {
await DistinctFor(index);
return final;
} catch (e) {
throw e
}
}
function ParseDoc(doc) {
return new Promise(function (result) {
let err;
for (let index in doc) {
let key = doc[index];
if (!key) {
err = "No Key For Index "+index; break;
}
vals.push(new Promise(function (pushed) {
DistinctFor(key)
.then(pushed)
.catch(function (err) {
return pushed(Promise.resolve());
})
}))
}
return result(err ? Promise.reject(err) : Promise.resolve());
})
}
async function DistinctFor(key) {
if (!key) throw "Key Is Undefined";
try {
final[key] = await collection.distinct(key);
} catch (e) {
final[key] = 'failed';
throw e;
}
}
}
So querying a collection with the basic _id index, would return the following (test collection only has one document at the time of the test):
Mongo.MongoClient.connect(url, function (err, client) {
assert.equal(null, err);
let collection = client.db('my db').collection('the targeted collection');
GetFor(collection, '_id')
.then(function () {
//returns
// { _id: [ 5ae901e77e322342de1fb701 ] }
})
.catch(function (err) {
//manage your error..
})
});
Mind you, this uses methods native to the NodeJS Driver. As some other answers have suggested, there are other approaches, such as the aggregate framework. I personally find this approach more flexible, as you can easily create and fine-tune how to return the results. Obviously, this only addresses top-level attributes, not nested ones.
Also, to guarantee that all documents are represented should there be secondary indexes (other than the main _id one), those indexes should be set as required.

We can achieve this using a mongo JS file. Add the code below to your getCollectionName.js file and run it from the Linux console as shown:
mongo --host 192.168.1.135 getCollectionName.js
db_set = connect("192.168.1.135:27017/database_set_name"); // for Local testing
// db_set.auth("username_of_db", "password_of_db"); // if required
db_set.getMongo().setSlaveOk();
var collectionArray = db_set.getCollectionNames();
collectionArray.forEach(function(collectionName){
if ( collectionName == 'system.indexes' || collectionName == 'system.profile' || collectionName == 'system.users' ) {
return;
}
print("\nCollection Name = "+collectionName);
print("All Fields :\n");
var arrayOfFieldNames = [];
var items = db_set[collectionName].find();
// var items = db_set[collectionName].find().sort({'_id':-1}).limit(100); // if you want fast & scan only last 100 records of each collection
while(items.hasNext()) {
var item = items.next();
for(var index in item) {
arrayOfFieldNames[index] = index;
}
}
for (var index in arrayOfFieldNames) {
print(index);
}
});
quit();
Thanks @ackuser

Following the thread from @James Cropcho's answer, I landed on the following, which I found to be super easy to use. It is a binary tool, which is exactly what I was looking for:
mongoeye.
Using this tool it took about 2 minutes to get my schema exported from command line.

I know this question is 10 years old but there is no C# solution and this took me hours to figure out. I'm using the .NET driver and System.Linq to return a list of the keys.
var map = new BsonJavaScript("function() { for (var key in this) { emit(key, null); } }");
var reduce = new BsonJavaScript("function(key, stuff) { return null; }");
var options = new MapReduceOptions<BsonDocument, BsonDocument>();
var result = await collection.MapReduceAsync(map, reduce, options);
var list = result.ToEnumerable().Select(item => item["_id"].ToString());

This one-liner extracts all keys from a collection into a comma-separated, sorted string:
db.<collection>.find().map((x) => Object.keys(x)).reduce((a, e) => {for (const el of e) { if(!a.includes(el)) { a.push(el) } }; return a}, []).sort((a, b) => a.toLowerCase().localeCompare(b.toLowerCase())).join(", ")
The result of this query typically looks like this:
_class, _id, address, city, companyName, country, emailId, firstName, isAssigned, isLoggedIn, lastLoggedIn, lastName, location, mobile, printName, roleName, route, state, status, token

I extended Carlos LM's solution a bit so it's more detailed.
Example of a schema:
var schema = {
_id: 123,
id: 12,
t: 'title',
p: 4.5,
ls: [{
l: 'lemma',
p: {
pp: 8.9
}
},
{
l: 'lemma2',
p: {
pp: 8.3
}
}
]
};
Type into the console:
var schemafy = function(schema, i, limit) {
var i = (typeof i !== 'undefined') ? i : 1;
var limit = (typeof limit !== 'undefined') ? limit : false;
var type = '';
var array = false;
for (key in schema) {
type = typeof schema[key];
array = (schema[key] instanceof Array) ? true : false;
if (type === 'object') {
print(Array(i).join(' ') + key+' <'+((array) ? 'array' : type)+'>:');
schemafy(schema[key], i+1, array);
} else {
print(Array(i).join(' ') + key+' <'+type+'>');
}
if (limit) {
break;
}
}
}
Run:
schemafy(db.collection.findOne());
Output
_id <number>
id <number>
t <string>
p <number>
ls <object>:
0 <object>:
l <string>
p <object>:
pp <number>

I was trying to write in nodejs and finally came up with this:
db.collection('collectionName').mapReduce(
function() {
for (var key in this) {
emit(key, null);
}
},
function(key, stuff) {
return null;
}, {
"out": "allFieldNames"
},
function(err, results) {
var fields = db.collection('allFieldNames').distinct('_id');
fields
.then(function(data) {
var finalData = {
"status": "success",
"fields": data
};
res.send(finalData);
delteCollection(db, 'allFieldNames');
})
.catch(function(err) {
res.send(err);
delteCollection(db, 'allFieldNames');
});
});
After reading the newly created collection "allFieldNames", delete it.
db.collection("allFieldNames").remove({}, function (err,result) {
db.close();
return;
});

I have a simpler workaround.
While inserting documents into your main collection "things", you also record their attributes in a separate collection, say "things_attributes".
So every time you insert into "things", you fetch the document from "things_attributes" and compare its keys with the keys of your new document; if any new key is present, append it to that document and save it again.
That way things_attributes will hold a single document of unique keys, which you can easily get whenever you need it by using findOne().

Update field in sub document mongoose

My parent model
var GameChampSchema = new Schema({
name: String,
gameId: { type: String, unique: true },
status: Number,
countPlayers: {type: Number, default: 0},
companies: [
{
name: String,
login: String,
pass: String,
userId: ObjectId
}
],
createdAt: {type: Date, default: Date.now},
updateAt: Date
})
I need to insert the userId property into the first child where it is not set.
This action is needed only on a parent matching the condition ({status: 0, countPlayers: { $lt: 10 }})

Since this is an embedded document it is quite easy:
If you want to update a document that is the first element of the array, that doesn't have a userId
db.collection.update(
{
"status": 0,
"countPlayers": {"$lt": 10 },
"companies.userId": {"$exists": false }
},
{ "$set": {"companies.$.userId": userId } }
)
Which would be nice, but apparently this doesn't match how MongoDB processes the logic and it considers that nothing matches if there is something in the array that does have the field present. You could get that element using the aggregation framework but that doesn't help in finding the position, which we need.
A simplified proposal is where there are no elements in the array at all:
db.collection.update(
{
"status": 0,
"countPlayers": {"$lt": 10 },
"companies.0": {"$exists": false }
},
{ "$push": { "companies": { "userId": userId } } }
)
And that just puts a new thing on the array.
The logical thing to me is that you actually know something about this entry and you just want to set the userId field. So I would match on the login:
db.collection.update(
{
"status": 0,
"countPlayers": {"$lt": 10 },
"companies.login": login,
},
{ "$set": {"companies.$.userId": userId } }
)
As a final thing if this is just updating the first element in the array then we don't need to match the position, as we already know where it is:
db.collection.update(
{
status: 0,
countPlayers: {"$lt": 10 }
},
{ $set: { "companies.0.userId": userId } }
)
Tracing back to my logical case, see the document structure:
{
"_id" : ObjectId("530de54e1f41d9f0a260d4cd"),
"status" : 0,
"countPlayers" : 5,
"companies" : [
{ "login" : "neil" },
{ "login" : "fred", "userId" : ObjectId("530de6221f41d9f0a260d4ce") },
{ "login": "bill" },
]
}
So if what you are looking for is finding "the first document where there is no userId", then this doesn't make sense as there are several items and you already have a specific userId to update. That means you must mean one of them. How do we tell which one? It seems by the use case that you are trying to match the information that is there to an userId based on information you have.
Logic says, look for the key value that you know, and update the position that matches.
Just substitute your model object for the db.collection part to use this with Mongoose.
See the documentation on $exists, as well as $set and $push for the relevant details.
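To make the "match on the key you know, update the position that matches" idea concrete, here is a minimal sketch in plain JavaScript. It does not talk to MongoDB; it only mimics what the login query combined with "companies.$.userId" does to a single document. The helper name setUserIdByLogin and the sample ids are assumptions made up for the illustration.

```javascript
// Plain-JavaScript illustration (no MongoDB involved) of what the pair
//   query:  { "companies.login": login }
//   update: { $set: { "companies.$.userId": userId } }
// does to one document. setUserIdByLogin is a hypothetical helper name.
function setUserIdByLogin(doc, login, userId) {
  // The query condition on the array selects the first matching element...
  const position = doc.companies.findIndex((c) => c.login === login);
  if (position === -1) {
    return false; // nothing matched, so nothing is updated
  }
  // ...and "$" in the update path stands for that position.
  doc.companies[position].userId = userId;
  return true;
}

const game = {
  status: 0,
  countPlayers: 5,
  companies: [
    { login: "neil" },
    { login: "fred", userId: "u-1" },
    { login: "bill" },
  ],
};

const updated = setUserIdByLogin(game, "bill", "u-2");
console.log(updated, JSON.stringify(game.companies));
```

Only the element whose login matched is touched; the other entries keep whatever userId they had (or none).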

Big thanks.
I solved this problem:
exports.joinGame = function(req, res) {
winston.info('start method');
winston.info('header content type: %s', req.headers['content-type']);
// get the current user
var currentUser = service.getCurrentUser(req);
winston.info('current username %s', currentUser.username);
// build the query used to find a game
var gameQuery = {"status": 0, "close": false};
gameChamp.findOne(gameQuery, {}, {sort: {"createdAt": 1 }}, function(error, game) {
if (error) {
winston.error('error %s', error);
return res.send(error);
}
// if a game was found
if (game) {
winston.info('Append current user to game: %s', game.name);
// add the userId to only one company
var updateFlag = false;
for (var i=0; i<game.companies.length; i++) {
if (!game.companies[i].userId && !updateFlag) {
game.companies[i].userId = currentUser._id;
updateFlag = true;
winston.info('Credentials for current user %s', game.companies[i]);
// if this user is the last one, close the game and notify bw that the game is full
if (i == (game.companies.length-1)) {
game.close = true;
winston.info('game %s closed', game.name);
}
}
}
// save the game to MongoDB
game.save(function(error, game) {
if (error) {
winston.error('error %s', error);
return res.send(error);
}
if (game) {
res.send({ game: game })
winston.info('Append successful to game %s', game.name);
}
});
}
});
}

How to replace string in all documents in Mongo

I need to replace a string in certain documents. I found this code by googling, but unfortunately it does not change anything. I am not sure about the syntax on the line marked below:
pulpdb = db.getSisterDB("pulp_database");
var cursor = pulpdb.repos.find();
while (cursor.hasNext()) {
var x = cursor.next();
x['source']['url'].replace('aaa', 'bbb'); // is this correct?
db.foo.update({_id : x._id}, x);
}
I would like to add some debug prints to see what the value is, but I have no experience with MongoDB Shell. I just need to replace this:
{ "source": { "url": "http://aaa/xxx/yyy" } }
with
{ "source": { "url": "http://bbb/xxx/yyy" } }

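This isn't correct in general: replace('aaa', 'bbb') matches "aaa" anywhere in the string, not only in the host name, so a URL such as http://xxx/aaa/yyy would also be changed. Note also that with a string pattern only the first occurrence is replaced, so http://aaa/xxx/aaa becomes http://bbb/xxx/aaa.
But if you are OK with this, the code will work once the result of replace is assigned back to the field (strings are immutable, so replace returns a new string instead of changing the original).
To add debug info, use the print function: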
var cursor = db.test.find();
while (cursor.hasNext()) {
var x = cursor.next();
print("Before: "+x['source']['url']);
x['source']['url'] = x['source']['url'].replace('aaa', 'bbb');
print("After: "+x['source']['url']);
db.test.update({_id : x._id}, x);
}
(And by the way, if you want to print out objects, there is also the printjson function.)
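As a quick check of how String.prototype.replace behaves on URLs like the one in the question, the following sketch can be run in Node.js or pasted into the shell. The helper replaceHost is a hypothetical name introduced here; it is one possible way to restrict the change to the host part, not something taken from the question.

```javascript
// replaceHost is a hypothetical helper name used only for this sketch.
function replaceHost(url, from, to) {
  // Only rewrite `from` when it is the whole host name right after the scheme.
  const prefix = /^(https?:\/\/)([^\/:]+)/;
  return url.replace(prefix, (match, scheme, host) =>
    host === from ? scheme + to : match
  );
}

const a = "http://aaa/xxx/aaa";
const b = "http://xxx/aaa/yyy";

console.log(a.replace("aaa", "bbb"));      // http://bbb/xxx/aaa (first occurrence only)
console.log(a.split("aaa").join("bbb"));   // http://bbb/xxx/bbb (every occurrence)
console.log(b.replace("aaa", "bbb"));      // http://xxx/bbb/yyy (path changed)
console.log(replaceHost(a, "aaa", "bbb")); // http://bbb/xxx/aaa
console.log(replaceHost(b, "aaa", "bbb")); // http://xxx/aaa/yyy (left alone)
```

Which variant is right depends on whether "aaa" can legitimately appear in the path of your URLs.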

The best way to do this if you are on MongoDB 2.6 or newer is to loop over the cursor using the .forEach method and update each document using "bulk" operations for maximum efficiency.
var bulk = db.collection.initializeOrderedBulkOp();
var count = 0;
db.collection.find().forEach(function(doc) {
print("Before: "+doc.source.url);
bulk.find({ '_id': doc._id }).update({
'$set': { 'source.url': doc.source.url.replace('aaa', 'bbb') }
});
count++;
// Send the queued operations every 200 documents and start a new batch
if (count % 200 === 0) {
bulk.execute();
bulk = db.collection.initializeOrderedBulkOp();
}
});
// Clean up queues: flush the last, partial batch if there is one
if (count % 200 !== 0)
bulk.execute();
From MongoDB 3.2, the Bulk() API and its associated methods are deprecated, and you will need to use the db.collection.bulkWrite() method.
You will need to loop over the cursor, build each operation dynamically and push it onto an array.
var operations = [];
db.collection.find().forEach(function(doc) {
print("Before: "+doc.source.url);
var operation = {
updateOne: {
filter: { '_id': doc._id },
update: {
'$set': { 'source.url': doc.source.url.replace('aaa', 'bbb') }
}
}
};
operations.push(operation);
})
// The options are the second argument of bulkWrite, not one more operation
db.collection.bulkWrite(operations, {
ordered: true,
writeConcern: { w: "majority", wtimeout: 5000 }
});
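For large collections it can help to build the operations as plain objects first and send them in batches. The sketch below shows only that preparation step in plain JavaScript, with no database connection; buildReplaceOps and chunk are hypothetical helper names, and the batch size of 2 is chosen just to keep the example small.

```javascript
// buildReplaceOps and chunk are hypothetical helper names for this sketch.
// Build one updateOne operation per document whose source.url contains `find`.
function buildReplaceOps(docs, find, replacement) {
  return docs
    .filter((doc) =>
      doc.source &&
      typeof doc.source.url === "string" &&
      doc.source.url.includes(find))
    .map((doc) => ({
      updateOne: {
        filter: { _id: doc._id },
        update: {
          $set: { "source.url": doc.source.url.replace(find, replacement) },
        },
      },
    }));
}

// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const docs = [
  { _id: 1, source: { url: "http://aaa/xxx/yyy" } },
  { _id: 2, source: { url: "http://eee/xxx/yyy" } },
  { _id: 3, source: { url: "http://aaa/zzz" } },
  { _id: 4, source: { url: "http://aaa/" } },
];

const batches = chunk(buildReplaceOps(docs, "aaa", "bbb"), 2);
console.log(JSON.stringify(batches, null, 2));
```

In the shell, each batch would then be passed to db.collection.bulkWrite(batch, { ordered: true }). Documents that do not contain the search string are skipped, so no needless writes are issued for them.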

Nowadays:
Starting with Mongo 4.2, db.collection.updateMany (like db.collection.update) can accept an aggregation pipeline, finally allowing a field to be updated based on its own value.
Starting with Mongo 4.4, the new aggregation operator $replaceOne makes it very easy to replace part of a string.
// { "source" : { "url" : "http://aaa/xxx/yyy" } }
// { "source" : { "url" : "http://eee/xxx/yyy" } }
db.collection.updateMany(
{ "source.url": { $regex: /aaa/ } },
[{
$set: { "source.url": {
$replaceOne: { input: "$source.url", find: "aaa", replacement: "bbb" }
}}
}]
)
// { "source" : { "url" : "http://bbb/xxx/yyy" } }
// { "source" : { "url" : "http://eee/xxx/yyy" } }
The first part ({ "source.url": { $regex: /aaa/ } }) is the match query, filtering which documents to update (the ones containing "aaa")
The second part ($set: { "source.url": {...) is the update aggregation pipeline (note the square brackets signifying the use of an aggregation pipeline):
$set is a new aggregation stage (Mongo 4.2) which in this case replaces the value of a field.
The new value is computed with the new $replaceOne operator. Note how source.url is modified directly based on its own value ($source.url).
Note that this is fully handled server side which won't allow you to perform the debug printing part of your question.
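If the same kind of replacement is needed for several fields, the filter and pipeline can be built from parameters. The following is a minimal sketch in plain JavaScript that only constructs the objects; buildReplaceUpdate and escapeRegex are hypothetical helper names. The optional switch to $replaceAll, which was introduced alongside $replaceOne in Mongo 4.4, replaces every occurrence instead of the first.

```javascript
// buildReplaceUpdate and escapeRegex are hypothetical helpers for this sketch.
// They only build plain objects; nothing here connects to a database.

// Escape characters that have a special meaning in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function buildReplaceUpdate(field, find, replacement, replaceAll = false) {
  const operator = replaceAll ? "$replaceAll" : "$replaceOne";
  // Match only documents whose field contains the search string.
  const filter = { [field]: { $regex: escapeRegex(find) } };
  // One-stage pipeline that rewrites the field from its own value.
  const pipeline = [
    {
      $set: {
        [field]: { [operator]: { input: "$" + field, find, replacement } },
      },
    },
  ];
  return { filter, pipeline };
}

const { filter, pipeline } = buildReplaceUpdate("source.url", "aaa", "bbb");
console.log(JSON.stringify(filter));
console.log(JSON.stringify(pipeline));
```

In the shell the result would be used as db.collection.updateMany(filter, pipeline). Escaping the search string matters because $regex treats characters such as "." as wildcards, while $replaceOne matches the string literally.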

MongoDB can do string search/replace via mapreduce. Yes, you need to have a very special data structure for it -- you can't have anything in the top keys but you need to store everything under a subdocument under value. Like this:
{
"_id" : ObjectId("549dafb0a0d0ca4ed723e37f"),
"value" : {
"title" : "Top 'access denied' errors",
"parent" : "system.admin_reports",
"p" : "\u0001\u001a%"
}
}
Once you have this neatly set up you can do:
$map = new \MongoCode("function () {
this.value['p'] = this.value['p'].replace('$from', '$to');
emit(this._id, this.value);
}");
$collection = $this->mongoCollection();
// This won't be called.
$reduce = new \MongoCode("function () { }");
$collection_name = $collection->getName();
$collection->db->command([
'mapreduce' => $collection_name,
'map' => $map,
'reduce' => $reduce,
'out' => ['merge' => $collection_name],
'query' => $query,
'sort' => ['_id' => 1],
]);
