I am running a bulk insert with continueOnError enabled, so that I can insert an array of documents into a collection with a unique constraint: documents that pass are added, and the ones that fail are ignored.
However, despite the continueOnError flag, Mongo still returns the error, which makes the framework I use think that there's a problem. I know how to suppress all errors, but this bulk insert is only one of many operations, and I'd still need to see errors from those (mind you, I'd also want to see errors of any kind other than 11000 from this bulk insert).
I am using the official MongoDB driver under Node.js, and I'm controlling the flow with ff.
How do I suppress the unique failures for the bulk insert, without suppressing other errors, or disrupting the flow?
var mongo_options = {
    collection: { strict: true },
    insert: { w: 1, strict: true },
    insertbulk: { w: 1, strict: true, continueOnError: true, keepGoing: true }
}
var f = ff(this, function () {
    this.connection.collection(table, mongo_options.collection, f.slot());
}, function (collection) {
    // DON'T SUPPRESS THIS
    collection.insert(data, mongo_options.insert, f.slot());
}, function () {
    this.connection.collection("tags", mongo_options.collection, f.slot());
}, function (collection) {
    // SUPPRESS THIS
    collection.insert(tag_array, mongo_options.insertbulk, f.slot());
}).cb(next); // <-- error going out into the world
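One possible approach (a sketch, not from the original post): wrap the ff slot in a callback that swallows only duplicate-key errors and forwards everything else. Depending on the driver version, the 11000 code may appear on err.code or only inside err.message, so the sketch checks both.

// Hypothetical helper: pass duplicate-key (11000) errors through as success,
// forward any other error untouched.
function ignoreDupes(slot) {
    return function (err, result) {
        if (err && (err.code === 11000 || /E11000/.test(err.message))) {
            return slot(null, result); // swallow the duplicate-key error
        }
        return slot(err, result);
    };
}

// Usage in the last step of the flow:
collection.insert(tag_array, mongo_options.insertbulk, ignoreDupes(f.slot()));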
What is the best way to update many records with different data?
I'm doing it like this:
const updateBody = JSON.parse(req.body);
try {
    for (let object of updateBody) {
        await prisma.comissions.upsert({
            where: {
                producer: object.producer,
            },
            update: {
                rate: object.rate,
            },
            create: object,
        });
    }
} catch (err) {
    console.error(err);
}
I'm able to update the records, but it's taking a really long time. I'm aware of $transaction, but I'm not sure how to use it.
In Prisma, the $transaction query is used in two ways:
Sequential operations: pass an array of Prisma Client queries to be executed sequentially inside a transaction.
Interactive transactions: pass a function that can contain user code, including Prisma Client queries, non-Prisma code, and other control flow, to be executed in a transaction.
In our case we should use an interactive transaction, because it contains user code. To use the callback function in the Prisma transaction, we need to add a preview feature to the schema.prisma file:
generator client {
    provider        = "prisma-client-js"
    previewFeatures = ["interactiveTransactions"]
}
// If any upsert throws, the whole transaction is rolled back.
await prisma.$transaction(async (prisma) => {
    for (let object of updateBody) {
        await prisma.comissions.upsert({
            where: {
                producer: object.producer,
            },
            update: {
                rate: object.rate,
            },
            create: object,
        });
    }
});
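For comparison, the same upserts can also run through the sequential form (a sketch, not from the original answer, reusing the question's comissions model): map each element of updateBody to an upsert and pass the resulting array to $transaction.

// Sequential operations form: every query in the array runs inside
// one transaction, no callback needed.
await prisma.$transaction(
    updateBody.map((object) =>
        prisma.comissions.upsert({
            where: { producer: object.producer },
            update: { rate: object.rate },
            create: object,
        })
    )
);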
I wrote a service that analyses videos with Google Cloud Video Intelligence, and I save the analysis results to MongoDB with Mongoose.
This is the model I use (I've simplified everything to avoid confusion):
// Video.js
const mongoose = require('mongoose');

const videoSchema = new mongoose.Schema({
    analysis_progress: {
        percent: { type: Number, required: true },
        details: {}
    },
    status: {
        type: String,
        enum: ['idle', 'processing', 'done', 'failed'],
        default: 'idle'
    }
});

module.exports = mongoose.model('Video', videoSchema);
When the analysis operation ends, I call the function below to run the update:
function detectFaces(video, results) {
    // Build the query; `results` is the analysis result
    let update = {
        $set: {
            'analysis_results.face_annotations': results.faceDetectionAnnotations
        }
    };
    Video.findOneAndUpdate({ _id: video._id }, update, { new: true }, (err, result) => {
        if (!err)
            return console.log("Successfully saved face annotations:", video._id);
        throw err; // <-- this is the line where the error is thrown
    });
}
And this is the error I get:
Error: cyclic dependency detected
at serializeObject (C:\Users\murat\OneDrive\Masaüstü\bycape\media-analysis-api\node_modules\bson\lib\bson\parser\serializer.js:333:34)
at serializeInto (C:\Users\murat\OneDrive\Masaüstü\bycape\media-analysis-api\node_modules\bson\lib\bson\parser\serializer.js:947:17)
...
Solutions I tried:
Adding { autoIndex: false } to the db config:
mongoose.connect(process.env.DB_CONNECTION, {useNewUrlParser: true, useUnifiedTopology: true, useFindAndModify: false, autoIndex: false });
Removing retryWrites=true from the Mongo URI (I didn't have that parameter in my connection URI to begin with).
So I think the source of the problem is that I am saving the whole analysis result, but I don't have any other option; I need to save it as it is.
I am open to all kinds of suggestions.
Just as I guessed, the problem was that there was a cyclic dependency in the object that came back from Google.
With the help of my colleague:
Since JSON.stringify() changes an object into simple types (string, number,
array, object, boolean), it is not capable of storing references to
objects; therefore, by using stringify and then parse you destroy the
information that stringify cannot convert.
Another way would be knowing which field held the cyclic reference and
then unsetting, or deleting, that field.
I couldn't find which field had the cyclic dependency, so I used JSON.stringify() and JSON.parse() to remove it.
let videoAnnotations = JSON.stringify(operationResult.annotationResults[0]);
videoAnnotations = JSON.parse(videoAnnotations);
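If you do want to find the offending field instead, a WeakSet-based replacer can flag it during stringification (a sketch, not from the original answer; note it will also flag shared, non-cyclic references):

// Hypothetical helper: stringify while dropping repeated object references,
// logging the key at which each one is found.
function decycle(value) {
    const seen = new WeakSet();
    return JSON.parse(JSON.stringify(value, (key, val) => {
        if (typeof val === 'object' && val !== null) {
            if (seen.has(val)) {
                console.log('Repeated/cyclic reference at key:', key);
                return undefined; // drop the field instead of throwing
            }
            seen.add(val);
        }
        return val;
    }));
}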
I want to implement a follow system between users.
For that, I want to display all of the 250 users of my app, then add a checkmark button next to the ones I already follow, and an empty button next to the ones I do not follow.
var usersRef = firebase.database().ref('/users');
var followingRef = firebase.database().ref('/followingByUser');
var displayedUsers = [];

// I loop through all users of my app
usersRef.once('value', users => {
    users.forEach(user => {
        // For each user, I check if I already follow him or not
        followingRef.child(myUid).child(user.key).once('value', follow => {
            if (follow.val()) {
                // I do follow this user, follow button is on
                displayedUsers.push({
                    name: user.val().name,
                    following: true
                });
            } else {
                // I do not follow this user, follow button is off
                displayedUsers.push({
                    name: user.val().name,
                    following: false
                });
            }
        })
    })
})
When doing that, I often (though not always) get the following error: "Error: Firebase Database (4.1.3) INTERNAL ASSERT FAILED: sendRequest call when we're not connected not allowed."
Eventually all the data is fetched, but it takes about 10 seconds instead of the 1 second it takes when the error does not occur.
I do not believe it is an internet connection issue, as I have a very fast and stable Wi-Fi connection.
Is it a bad practice to nest queries like that?
If not, why do I get this error?
My data is structured as below:
users: {
    userId1: {
        name: User 1,
        email: email#exemple.com,
        avatar: url.com
    },
    userId2: {
        name: User 2,
        email: email#exemple.com,
        avatar: url.com
    },
    ...
}

followingByUser: {
    userId1: {
        userId2: true,
        userId10: true,
        userId223: true
    },
    userId2: {
        userId23: true,
        userId100: true,
        userId203: true
    },
    ...
}
Your current database structure allows you to efficiently look up who each user is following. As you've found out, it does not allow you to look up who a user is followed by. If you also want an efficient lookup of the latter, you should add additional data to your model:
followedByUser: {
    userId2: {
        userId1: true,
    },
    userId10: {
        userId1: true,
    },
    userId223: {
        userId1: true,
    },
    ...
}
This is a quite common pattern in Firebase and other NoSQL databases: you often expand your data model to allow the use-cases that your app needs.
Also see my explanation on modeling many-to-many relations and the AskFirebase video on the same topic.
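For example, a "follow" action would then write to both indexes at once with a multi-location update (a sketch, assuming the structures above and the myUid variable from the question):

// Hypothetical helper: mark `myUid` as following `theirUid` in both
// directions with one atomic multi-location update.
function follow(myUid, theirUid) {
    var updates = {};
    updates['/followingByUser/' + myUid + '/' + theirUid] = true;
    updates['/followedByUser/' + theirUid + '/' + myUid] = true;
    return firebase.database().ref().update(updates);
}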
I am trying to use findAndModify with the Node.js MongoDB module Monk. This is the method I am using; it throws a 500 error in my cmd:
notesCollection.findAndModify(
    { _id: _id },
    [],
    { _id: _id, title: title, content: content },
    { 'new': true, 'upsert': true },
    function (err, doc) {
        if (err)
            console.error(err);
        else {
            console.log("Find and modify successful");
            console.dir(doc);
        }
    }
);
I obtained the method signature here. I get an error that looks like this and is uninformative:
POST /notes/edit/542bdec5712c0dc426d41342 500 86ms - 1.35kb
Monk implements methods that are more in line with the shell syntax for method signatures than what is provided by the native Node driver. So in this case the "shell" documentation for .findAndModify() is the more appropriate reference here:
notescollection.findAndModify(
    {
        "query": { "_id": id },
        "update": { "$set": {
            "title": title,
            "content": content
        }},
        "options": { "new": true, "upsert": true }
    },
    function (err, doc) {
        if (err) throw err;
        console.log(doc);
    }
);
Also note that you should be using the $set operator, or possibly even the $setOnInsert operator where you only want fields applied when the document is created. When operators like this are not applied, the "whole" document is replaced with whatever content you specify for the "update".
You also don't need to supply the "_id" field in the update section; even when an "upsert" occurs, anything present in the "query" portion of the statement is implied to be created in the new document.
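To illustrate the difference, $set applies on every match, while $setOnInsert only applies when the upsert creates the document (a sketch; the createdAt field is hypothetical):

notescollection.findAndModify(
    {
        "query": { "_id": id },
        "update": {
            "$set": { "title": title, "content": content }, // applied on every update
            "$setOnInsert": { "createdAt": new Date() }     // applied only on insert
        },
        "options": { "new": true, "upsert": true }
    },
    function (err, doc) {
        if (err) throw err;
        console.log(doc);
    }
);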
The monk documentation also hints at the correct syntax to use for the method signature.
Had the same problem, and even though I liked it, the accepted answer didn't work for me.
It's not clear enough, but the documentation hints at the correct syntax, starting with the signatures:
All commands accept the simple data[, …], fn. For example
findAndModify({}, {}, fn)
And from the finding section:
users.findAndModify({ _id: '' }, { $set: {} });
Finally, continuing with the signatures section:
You can pass options in the middle: data[, …], options, fn
Putting it all together:
collection.findAndModify({
    _id: '',
}, {
    $set: {
        value: '',
    },
}, {
    upsert: true,
});
So in this case, data[, …] is the pair of {}, {} objects: query and update. You can then add the callback as a 4th parameter, as in the sketch below.
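Putting query, update, options, and callback together (a sketch; noteId, title, and content are hypothetical variables):

collection.findAndModify(
    { _id: noteId },                              // query
    { $set: { title: title, content: content } }, // update
    { upsert: true, 'new': true },                // options
    function (err, doc) {                         // callback as the 4th parameter
        if (err) return console.error(err);
        console.log('Upserted:', doc);
    }
);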
On the site I am creating, users can enter different tags separated by commas. ExpressJS should then check whether they exist or not. If they do not exist, it should create an object for each of them. I have an array and am iterating through it with a for loop; however, only one object is created because of the callback... Is there any way to create multiple objects at once, depending on the array's length?
for (i = 0; i < postTopics.length; i++) {
    var postTopic = postTopics[i],
        postTopicUrl = postTopic.toString().toLowerCase().replace(' ', '-');

    Topic.findOne({ "title": postTopics[i] }, function (err, topic) {
        if (err) throw err;
        if (!topic) {
            Topic.create({
                title: postTopic,
                url: postTopicUrl
            }, function (err, topic) {
                if (err) throw err;
                res.redirect('/');
            });
        }
    });
}
Try out async.parallel.
$ npm install async
// Get the async module so we can do our parallel asynchronous queries much easier.
var async = require('async');

// Create a hash to store your query functions on.
var topicQueries = {};

// Loop through your postTopics once to create a query function for each one.
postTopics.forEach(function (postTopic) {
    // Use postTopic as the key for the query function so we can grab it later.
    topicQueries[postTopic] = function (cb) {
        // cb is the callback function passed in by async.parallel. It accepts err as the first argument and the result as the second.
        Topic.findOne({ title: postTopic }, cb);
    };
});

// Call async.parallel and pass in our topicQueries object.
// If any of the queries passed an error to cb then the rest of the queries will be aborted and this result function will be called with an err argument.
async.parallel(topicQueries, function (err, results) {
    if (err) throw err;

    // Create an array to store our Topic.create query functions. We don't need a hash because we don't need to tie the results back to anything else like we had to do with postTopics in order to check if a topic existed or not.
    var createQueries = [];

    // All our parallel queries have completed.
    // Loop through postTopics again, using postTopic to retrieve the resulting document from the results object, which has postTopic as the key.
    postTopics.forEach(function (postTopic) {
        // If there is a document at results[postTopic], the topic already exists, so skip it.
        if (results[postTopic]) return;

        // I changed .replace to use a regular expression. Passing a string only replaces the first space in the string whereas my regex searches the whole string.
        var postTopicUrl = postTopic.toString().toLowerCase().replace(/ /g, '-');

        // Since this code is executing, we know there is no topic in the DB with the title you searched for, so create a new query to create a new topic and add it to the createQueries array.
        createQueries.push(function (cb) {
            Topic.create({
                title: postTopic,
                url: postTopicUrl
            }, cb);
        });
    });

    // Pass our createQueries array to async.parallel so it can run them all simultaneously (so to speak).
    async.parallel(createQueries, function (err, results) {
        // If any one of the parallel create queries passes an error to the callback, this function will be immediately invoked with that err argument.
        if (err) throw err;

        // If we made it this far, no errors were made during topic creation, so redirect.
        res.redirect('/');
    });
});
First we create an object called topicQueries and we attach a query function to it for each postTopic title in your postTopics array. Then we pass the completed topicQueries object to async.parallel which will run each query and gather the results in a results object.
The results object ends up being a simple object hash with each of your postTopic titles as the key, and the value being the result from the DB. The if (results[postTopic]) return; line bails out early if results already has a document under that postTopic key. Meaning, the code below it only runs if there was no topic returned from the DB with that title. If there was no matching topic then we add a query function to our createQueries array.
We don't want your page to redirect after just one of those new topics finishes saving. We want to wait until all your create queries have finished, so we use async.parallel yet again, but this time we use an array instead of an object hash because we don't need to tie the results to anything. When you pass an array to async.parallel the results argument will also be an array containing the results of each query, though we don't really care about the results in this example, only that no errors were thrown. If the parallel function finishes and there is no err argument then all the topics finished creating successfully and we can finally redirect the user to the new page.
PS - If you ever run into a similar situation, except each subsequent query requires data from the query before it, then checkout async.waterfall :)
If you really want to see whether things exist already and avoid getting errors on duplicates, then the .create() method already accepts a list. You don't seem to care about getting the created document in response, so just check for the documents that are already there and send in the new ones.
So with "finding first", run the tasks in succession; async.waterfall is used just to tame the indent creep:
// Just a placeholder for your input
var topics = ["A Topic", "B Topic", "C Topic", "D Topic"];

async.waterfall(
    [
        function (callback) {
            Topic.find(
                { "title": { "$in": topics } },
                function (err, found) {
                    // assume ["B Topic", "D Topic"] are found
                    found = found.map(function (x) {
                        return x.title;
                    });
                    var newList = topics.filter(function (x) {
                        return found.indexOf(x) == -1;
                    });
                    callback(err, newList);
                }
            );
        },
        function (newList, callback) {
            Topic.create(
                newList.map(function (x) {
                    return {
                        "title": x,
                        "url": x.toString().toLowerCase().replace(' ', '-')
                    };
                }),
                function (err) {
                    if (err) throw err;
                    console.log("done");
                    callback();
                }
            );
        }
    ]
);
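Incidentally, the "url" generation repeated in these snippets could live in a "pre" save hook on the schema instead; a minimal sketch, assuming the schema object is named topicSchema:

// Hypothetical hook: derive the slug from the title before every save.
topicSchema.pre('save', function (next) {
    this.url = this.title.toString().toLowerCase().replace(/ /g, '-');
    next();
});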
You could move the "url" generation to such a hook, as sketched above. But again, if you really don't need the validation rules, go for "bulk API" operations, provided your target MongoDB and mongoose versions are new enough to support this, which really means getting a handle on the underlying driver:
// Just a placeholder for your input
var topics = ["A Topic", "B Topic", "C Topic", "D Topic"];

async.waterfall(
    [
        function (callback) {
            Topic.find(
                { "title": { "$in": topics } },
                function (err, found) {
                    // assume ["B Topic", "D Topic"] are found
                    found = found.map(function (x) {
                        return x.title;
                    });
                    var newList = topics.filter(function (x) {
                        return found.indexOf(x) == -1;
                    });
                    callback(err, newList);
                }
            );
        },
        function (newList, callback) {
            var bulk = Topic.collection.initializeOrderedBulkOp();
            newList.forEach(function (x) {
                bulk.insert({
                    "title": x,
                    "url": x.toString().toLowerCase().replace(' ', '-')
                });
            });
            bulk.execute(function (err, results) {
                console.log("done");
                callback();
            });
        }
    ]
);
That is a single write operation to the server, though of course all inserts are actually done in order and checked for errors.
Otherwise just let the duplicate errors happen, insert as an "unordered op", and check for "non-duplicate" errors afterwards if you want:
// Just a placeholder for your input
var topics = ["A Topic", "B Topic", "C Topic", "D Topic"];

var bulk = Topic.collection.initializeUnorderedBulkOp();
topics.forEach(function (x) {
    bulk.insert({
        "title": x,
        "url": x.toString().toLowerCase().replace(' ', '-')
    });
});

bulk.execute(function (err, results) {
    if (err) throw err;
    console.log(JSON.stringify(results, undefined, 4));
});
Output in results looks something like the following, indicating the "duplicate" errors, but the error is not "thrown", as we have not chosen to do so in this case:
{
    "ok": 1,
    "writeErrors": [
        {
            "code": 11000,
            "index": 1,
            "errmsg": "insertDocument :: caused by :: 11000 E11000 duplicate key error index: test.topic.$title_1 dup key: { : \"B Topic\" }",
            "op": {
                "title": "B Topic",
                "url": "b-topic",
                "_id": "53b396d70fd421057200e610"
            }
        },
        {
            "code": 11000,
            "index": 3,
            "errmsg": "insertDocument :: caused by :: 11000 E11000 duplicate key error index: test.topic.$title_1 dup key: { : \"D Topic\" }",
            "op": {
                "title": "D Topic",
                "url": "d-topic",
                "_id": "53b396d70fd421057200e612"
            }
        }
    ],
    "writeConcernErrors": [],
    "nInserted": 2,
    "nUpserted": 0,
    "nMatched": 0,
    "nModified": 0,
    "nRemoved": 0,
    "upserted": []
}
Note that when using the native collection methods, you need to take care that a connection is already established. The mongoose methods will "queue" up until the connection is made, but these will not. This is more of a testing issue, unless there is a chance this would be the first code executed.
Hopefully versions of these bulk operations will be exposed in the mongoose API soon, but the underlying functionality depends on having MongoDB 2.6 or greater on the server. Generally this is going to be the best way to process.
Of course, in all but the last sample (which does not need this), you can go absolutely "async nuts" by calling the versions of "filter", "map" and "forEach" that exist under that library. It's likely not a real issue unless you are providing really long lists for input, though.
The .initializeOrderedBulkOp() and .initializeUnorderedBulkOp() methods are covered in the node native driver manual. Also see the main manual for general descriptions of bulk operations.
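As a footnote to "exposed in the mongoose API soon": newer mongoose versions do expose this via Model.insertMany with { ordered: false } (a sketch, assuming a mongoose release that supports it); duplicates then surface in the error's writeErrors array rather than aborting the whole batch:

// Hypothetical usage: insert everything, then inspect only non-duplicate errors.
var docs = topics.map(function (x) {
    return { title: x, url: x.toString().toLowerCase().replace(/ /g, '-') };
});

Topic.insertMany(docs, { ordered: false }, function (err, inserted) {
    if (err && err.writeErrors) {
        var nonDupes = err.writeErrors.filter(function (e) {
            return e.code !== 11000;
        });
        if (nonDupes.length) throw err; // only real failures matter
    }
    console.log("done");
});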