Mongoose/Node server restart and duplicates - javascript

Ok so after a ton of trial and error, I've determined that when I drop a collection and then recreate it through my app, unique doesn't work until I restart my local node server. Here's my Schema
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var Services = new Schema ({
type : {type : String},
subscriptionInfo : Schema.Types.Mixed,
data : Schema.Types.Mixed
},{_id:false});
var Hashtags = new Schema ({
name: {type : String},
services : [Services]
},{_id:false});
var SubscriptionSchema = new Schema ({
eventId : {type: String, index: { unique: true, dropDups: true }},
hashtags : [Hashtags]
});
module.exports = mongoose.model('Subscription', SubscriptionSchema);
And Here's my route...
router.route('/')
.post(function(req, res) {
var subscription = new subscribeModel();
subscription.eventId = eventId;
subscription.save(function(err, subscription) {
if (err)
res.send(err);
else
res.json({
message: subscription
});
});
})
If I drop the collection, then hit the /subscribe endpoint seen above, it will create the entry but will not honor the duplicate. It's not until I then restart the server that it starts to honor it. Any ideas why this is? Thanks!

When your application starts and Mongoose initializes, it scans the schema definitions of the registered models and calls the .ensureIndexes() method for each index they define. Dropping a collection also drops its indexes, so the unique index stays missing until Mongoose builds it again on the next start. This is the "by design" behavior and is also covered with this statement:
When your application starts up, Mongoose automatically calls ensureIndex for each defined index in your schema. While nice for development, it is recommended this behavior be disabled in production since index creation can cause a significant performance impact. Disable the behavior by setting the autoIndex option of your schema to false.
So your general options here are:
Don't drop the collection; call .remove() instead, which leaves the indexes intact.
Manually call .ensureIndexes() after you drop a collection in order to rebuild them.
The warning in the document is generally that creating indexes for large collections can take some time and take up server resources. If the index exists this is more or less a "no-op" to MongoDB, but beware of small changes to the index definition which would result in creating "additional" indexes.
As such, it is generally best to have a deployment plan for production systems where you determine what needs to be done.
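As a rough sketch of the second option, a helper along these lines could drop a collection and immediately rebuild its indexes. The name dropAndReindex is hypothetical, and the sketch assumes a Mongoose version in which Model.collection.drop() and Model.ensureIndexes() return promises when called without a callback:

```javascript
// Hypothetical helper: drop a model's collection, then rebuild the indexes
// declared in its schema so that `unique` is enforced again without a restart.
async function dropAndReindex(Model) {
  try {
    await Model.collection.drop();
  } catch (err) {
    // MongoDB reports "ns not found" (code 26) when the collection does not
    // exist yet; that is harmless here, anything else is re-thrown.
    if (err.code !== 26) throw err;
  }
  // Re-creates every index defined in the schema.
  await Model.ensureIndexes();
}
```

It would be called wherever the app currently drops the collection, for example dropAndReindex(Subscription) with the model exported above.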

This post seems to argue that indexes are not re-built when you restart: Are MongoDB indexes persistent across restarts?


Mongoose-fuzzy-searching returns empty array unless query is empty

I am working on a wiki-like website component and I am trying to implement a fuzzy search. I found a popular Node.js plugin on npmjs for a fuzzy search of a cloud mongoDB database handled with Mongoose. I installed and saved mongoose-fuzzy-searching and hooked it up to my model and route. Then I updated all of the models, resaving each's value for the field I wanted to index. I can seem to call the function on a model, and it seems like there's an index on MongoDB Atlas, but it returns an empty array instead of any results. 2 questions:
Am I doing something wrong, or is there something I am missing?
Is there a better node library that's free? (*Free on heroku, which I think eliminates flexsearch as an option)
Here's my code.
Model:
const mongoose = require("mongoose"),
mongoose_fuzzy_searching = require('mongoose-fuzzy-searching'),
Schema = mongoose.Schema;
const IssueTemplateSchema = new Schema({
name: String,
info: String,
image: String,
tags: [{type: Schema.Types.ObjectId, ref: "Tag"}],
active: {type: Boolean, default: true},
instances: [{type: Schema.Types.ObjectId, ref: "LocalIssue"}],
issues: {type: Schema.Types.ObjectId, ref: "Issuegraph" },
});
IssueTemplateSchema.plugin(mongoose_fuzzy_searching, { fields: ['name'] });
module.exports = mongoose.model("IssueTemplate", IssueTemplateSchema);
An update to all of the issuetemplate models:
const express = require('express'),
router = express.Router(),
Issue = require('../api/issue/issue.template');
router.get("/createIndex", async (req, res) => {
Issue.find({}, async (err, issues) => {
if(err){console.log(err)}else {
for(issue of issues){
if(err){console.log(err)}else{
const name = issue.name;
await Issue.findByIdAndUpdate(issue._id, {name: name}, {strict: false});
}
}
}
});
return res.send("done");
});
Route:
router.get("/search", (req, res) => {
console.log(req.query.target);
let searchTerm = "";
if(req.query.target){
searchTerm = decodeURIComponent(req.query.target.replace(/\+/g, ' '));
}
Issue.fuzzySearch(searchTerm, (err, issue)=> {
if(err){
console.log(err);
res.send("Error fuzzy searching: " + err);
} else {
returnResult(issue);
}
});
function returnResult(result) {
console.log(result);
return res.send(result);
}
});
When I ran
npm install --save mongoose-fuzzy-searching
I received an error saying it needed mongoose 5.10, and I have 5.11, but it seemed to at least plug in. I can't see why, but when I send a request through Postman, I receive an empty array. If I leave the query blank, then I get everything. I have reset Node and am using mongoDB Cloud, where I see an index has been created. Is there perhaps a reset of the cloud database I would need to do (I don't know of such a thing), or is resetting the server enough? My knowledge level is: studying to be a freelance web developer and would appreciate any general tips on best practice, etc.
Mongoose requires you to define a schema, which makes it slow, and the find() method is fine for development but not for production. This approach is also outdated. Since you are working with MongoDB, if you need search then take a look at MongoDB Atlas full-text search. It includes search features such as autocomplete and fuzzy search.
It seems like my update operation was not actually updating the field I wanted to update, because findByIdAndUpdate() returns a query and I wasn't executing that query (with .exec() ). Await was also being used incorrectly because its purpose is not just to pause in an async function like I thought, but to wait for a promise to fulfill. A query is not a promise, and await is specifically for promises. Before I learned these details, I solved my problem in another way by using .find() , then .markModified("name") , then .save() . Once I did that, it all worked!
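For reference, the approach that ended up working can be sketched roughly as follows. The function name resaveField is made up for illustration; the idea is that marking the field as modified and calling save() runs the document's save middleware, which is presumably where the plugin builds its fuzzy search data:

```javascript
// Hypothetical helper: re-save every document so that save middleware
// (such as a plugin's pre-save hook) runs for the given field.
async function resaveField(Model, field) {
  const docs = await Model.find({}).exec();
  for (const doc of docs) {
    doc.markModified(field); // force Mongoose to treat the field as changed
    await doc.save();
  }
  return docs.length; // number of documents that were re-saved
}
```

In the route above this would be awaited before responding, along the lines of await resaveField(Issue, "name"), so that "done" is only sent once every document has been saved.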

nodejs modules code execution

I need a little help understanding Node.js code organization.
I'm from the C++ world, and I suspect I haven't understood the principles.
I need to implement a JS module that connects to MongoDB and exports a few methods for other modules, e.g. insert, update, delete.
When I write something like:
var db = MongoClient.connect(config.connectionString, {native_parser:true},function (err, db) {...});
exports.insert = function(a, b) {
// db using
//...
};
I assumed that "db" would behave like a local static variable and would be initialized in any case at the time of the call to require('this module'), but it seems that's not so, and db is uninitialized when the exported functions are called. Another question: I suppose this should be implemented using "futures" (a class from C++; I didn't find an analogue in JS) to guarantee that the db object is completely constructed at the moment it is used?
So the problem I see is that you want to use db, but since db is returned asynchronously, it may or may not be available inside the exported function; hence you need to convert the connect from async to sync.
Since the MongoDB driver cannot do sync, I suggest you use a wrapper such as mongoskin.
https://github.com/kissjs/node-mongoskin
var mongo = require('mongoskin');
var db = mongo.db(config.connectionString, {native_parser:true});
Now this should work for you.
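If adding a wrapper is not an option, another way to handle the asynchronous connection is to keep the pending connection as a Promise, which plays a role broadly similar to a C++ future. A minimal sketch, assuming a connect function that returns a Promise for a database handle exposing collection() (createStore and connect are illustrative names, not part of any driver):

```javascript
// `connect` stands in for whatever opens the connection and returns a
// Promise for a db handle; it is an assumption, not a specific driver call.
function createStore(connect) {
  // Start connecting once; every exported method waits on the same Promise.
  const dbReady = connect();

  return {
    insert: async function (collectionName, doc) {
      const db = await dbReady; // resolves only after the connection exists
      return db.collection(collectionName).insertOne(doc);
    }
  };
}
```

Callers can invoke insert() right after require(); the call simply does not proceed until the connection Promise has resolved, and a failed connection surfaces as a rejected Promise from insert().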
I had worked with C++, Java before (sometime back, not now) and now working in nodejs. I think I understood your question. Here are some key points.
Yes, Node.js modules are somewhat like classes in that they encapsulate variables, which you access only through public methods (exposed through exports). I think you are aware that there is no class implementation at all here, but it loosely maps to that behaviour.
The key difference in nodejs is the asynchronous nature of resource instantiation. By this, I mean if there are 2 statements stmt1 and stmt2, if stmt1 is called and takes time, nodejs does not wait for it to end (that is synchronous behaviour), instead it moves on to stmt2. In pre-nodejs world, we assume that reaching stmt2 means stmt1 is complete.
So, what is the workaround? How to ensure you do something after db connection is obtained. If your code is not making db calls immediately, you could assume that connection will be through. Or if you immediately want to invoke db, you write the code on a callback. Mongo exposes events called 'open' and 'error'. You can use this to ensure connection is open. Also it is best practise to track error event.
db.on('error', console.error.bind(console, 'connection error'));
db.once('open', function callback() {
console.log("Connection with database succeeded.");
// put your code
});
I am not aware of C++ future and so cannot comment on that.
Hope this helps !
[Updated] To add example
You could have db.js for setting up db connection and expose Mongoose object to create models.
'use strict';
var Mongoose = require('mongoose'),
Config = require('./config');
Mongoose.connect(Config.database.url);
var db = Mongoose.connection;
db.on('error', console.error.bind(console, 'connection error'));
db.once('open', function callback() {
console.log("Connection with database succeeded.");
});
exports.Mongoose = Mongoose;
exports.db = db;
You can include db.js in the server.js like
var DB = require('./db.js');
which will do the initialisation.
You then use mongoose (an object-document mapper for working with MongoDB, and highly recommended) to get models of database objects as shown below.
//userModel.js
var mongoose = require('mongoose'),
Schema = mongoose.Schema;
// toLower was not defined in the original snippet; this is an assumed minimal version.
function toLower(value) {
return value ? value.toLowerCase() : value;
}
var UserSchema = new Schema({
uid : {type : Number, required: false}
, email : {type : String, get: toLower, set: toLower, required: true, index: { unique: true } }
, passwd : {type : String, required: false}
});
var user = mongoose.model('user', UserSchema);
module.exports = {
User : user
};
For more information on mongoose, you can refer http://mongoosejs.com
The db is generally not closed, as I use it in a web environment where it is always on. A connection pool is maintained and connections are reused. I noticed a thread on SO that adds more details: Why is it recommended not to close a MongoDB connection anywhere in Node.js code?

Mongoose - Updating a referenced Document when saving

If I have a Schema which has an Array of references to another Schema, is there a way I can update both Documents with one endpoint?
This is my Schema:
CompanySchema = new Schema({
addresses: [{
type: Schema.Types.ObjectId,
ref: 'Address'
}]
});
I want to send a Company with the full Address object to /companies/:id/edit. With this endpoint, I want to edit attributes on Company and Address at the same time.
In Rails you can use something like nested attributes to do one big UPDATE call, and it will update the Company and update or add the Address as well.
Any idea how would you do this in Mongoose?
Cascade saves are not natively supported in Mongoose (issue).
But there are plugins (example: cascading-relations) that implement this behavior on nested populate objects.
Keep in mind that MongoDB is not a fully transactional database: the "big save" is achieved with several insert()/update() calls, and you (or the plugin) have to handle errors and rollback.
Example of cascade save:
company.save()
.then(() => Promise.all(company.addresses.map(address => {
/* update fkeys if needed */
return address.save()
})))
.catch(err => console.error('something went wrong...', err))
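For the edit endpoint in the question, the same idea could be written with async/await, as a rough sketch. It assumes company.addresses has been populated with Address documents and that addressUpdates is a hypothetical array of plain objects carrying an _id plus the fields to change. Nothing here is transactional, so a failure partway through leaves the earlier saves in place:

```javascript
// Hypothetical helper: apply incoming address changes, then save the company.
async function saveCompanyWithAddresses(company, addressUpdates) {
  for (const update of addressUpdates) {
    const address = company.addresses.find(
      (a) => String(a._id) === String(update._id)
    );
    if (!address) continue; // unknown ids are skipped in this sketch

    // Copy everything except _id onto the existing document.
    const { _id, ...fields } = update;
    address.set(fields);
    await address.save();
  }
  return company.save();
}
```

Adding brand-new addresses would need an extra branch that creates the Address document and pushes its _id onto company.addresses before the final save.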

Using Multiple Mongodb Databases with Meteor.js

Is it possible for 2 Meteor.Collections to be retrieving data from 2 different MongoDB database servers?
Dogs = Meteor.Collection('dogs') // mongodb://192.168.1.123:27017/dogs
Cats = Meteor.Collection('cats') // mongodb://192.168.1.124:27017/cats
Update
It is now possible to connect to remote/multiple databases:
var database = new MongoInternals.RemoteCollectionDriver("<mongo url>");
MyCollection = new Mongo.Collection("collection_name", { _driver: database });
Where <mongo_url> is a mongodb url such as mongodb://127.0.0.1:27017/meteor (with the database name)
There is one disadvantage with this at the moment: No Oplog
Old Answer
At the moment this is not possible. Each meteor app is bound to one database.
There are a few ways you can get around this, but it may be more complicated than it's worth:
One option - Use a separate Meteor App
In your other Meteor app (for example, one running on port 6000 on the same machine), you can still have reactivity, but you need to proxy inserts, removes and updates through a method call.
Server:
Cats = Meteor.Collection('cats')
Meteor.publish("cats", function() {
return Cats.find();
});
Meteor.methods({
updateCat: function(id, changes) {
Cats.update({_id: id}, {$set:changes});
}
});
Your current Meteor app:
var connection = DDP.connect("http://localhost:6000");
connection.subscribe("cats");
Cats = Meteor.Collection('cats', {connection: connection});
//To update a collection
connection.call("updateCat", <cat_id>, <changes>);
Another option - custom mongodb connection
This uses the node js mongodb native driver.
This is connecting to the database as if you would do in any other node js app.
There is no reactivity available and you can't use the new Meteor.Collection type collections.
var mongodb = Npm.require("mongodb"); //or var mongodb = Meteor.require("mongodb") //if you use npm package on atmosphere
var Db = mongodb.Db;
var mongoclient = mongodb.MongoClient;
var Server = mongodb.Server;
var db_connection = new Db('cats', new Server("127.0.0.1", 27017, {auto_reconnect: false, poolSize: 4}), {w:0, native_parser: false});
db_connection.open(function(err, db) {
//Connected to db 'cats'
db.authenticate('<db username>', '<db password>', function(err, result) {
//Can do queries here
db.close();
});
});
The answer is YES: it is possible to set up multiple Meteor.Collections that retrieve data from different MongoDB database servers.
As in the answer from @Akshat, you can initialize your own MongoInternals.RemoteCollectionDriver instance, through which Mongo.Collections can be created.
But there's something more to add. Contrary to @Akshat's answer, I find that Oplog support is still available in this situation.
When initializing the custom MongoInternals.RemoteCollectionDriver, DO NOT forget to specify the Oplog url:
var driver = new MongoInternals.RemoteCollectionDriver(
"mongodb://localhost:27017/db",
{
oplogUrl: "mongodb://localhost:27017/local"
});
var collection = new Mongo.Collection("Coll", {_driver: driver});
Under the hood
As described above, it is fairly simple to activate Oplog support. If you do want to know what happened beneath those two lines of code, you can continue reading the rest of the post.
In the constructor of RemoteCollectionDriver, an underlying MongoConnection will be created:
MongoInternals.RemoteCollectionDriver = function (
mongo_url, options) {
var self = this;
self.mongo = new MongoConnection(mongo_url, options);
};
The tricky part is: if MongoConnection is created with oplogUrl provided, an OplogHandle will be initialized, and starts to tail the Oplog (source code):
if (options.oplogUrl && ! Package['disable-oplog']) {
self._oplogHandle = new OplogHandle(options.oplogUrl, self.db.databaseName);
self._docFetcher = new DocFetcher(self);
}
As this blog has described: Meteor.publish internally calls Cursor.observeChanges to create an ObserveHandle instance, which automatically tracks any future changes occurred in the database.
Currently there are two kinds of observer drivers: the legacy PollingObserveDriver, which takes a poll-and-diff strategy, and the OplogObserveDriver, which uses Oplog tailing to monitor data changes. To decide which one to apply, observeChanges takes the following procedure (source code):
var driverClass = canUseOplog ? OplogObserveDriver : PollingObserveDriver;
observeDriver = new driverClass({
cursorDescription: cursorDescription,
mongoHandle: self,
multiplexer: multiplexer,
ordered: ordered,
matcher: matcher, // ignored by polling
sorter: sorter, // ignored by polling
_testOnlyPollCallback: callbacks._testOnlyPollCallback
});
In order to make canUseOplog true, several requirements should be met. The bare minimum is that the underlying MongoConnection instance has a valid OplogHandle. This is the exact reason why we need to specify oplogUrl while creating MongoConnection.
This is actually possible, using an internal interface:
var d = new MongoInternals.RemoteCollectionDriver("<mongo url>");
C = new Mongo.Collection("<collection name>", { _driver: d });

Node.js / MongoDB / Mongoose: Buffer Comparison

First, a little background:
I'm trying to check to see if an image's binary data has already been saved in Mongo. Given the following schema:
var mongoose = require('mongoose')
, Schema = mongoose.Schema;
var imageSchema = new Schema({
mime: String,
bin: { type: Buffer, index: { unique: true }},
uses : [{type: Schema.Types.ObjectId}]
});
module.exports = mongoose.model('Image', imageSchema);
...I want to query to see if an image exists, if it does add a reference that my object is using it, and then update it. If it doesn't, I want to create (upsert) it.
Given the case that it does not exist, the below code works perfectly. If it does, the below code does not and adds another Image document to Mongo. I feel like it is probably a comparison issue for the Mongo Buffer type vs node Buffer, but I can't figure out how to properly compare them. Please let me know how to update the below! Thanks!
Image.findOneAndUpdate({
mime : contentType,
bin : image
}, {
$pushAll : {
uses : [ myObject._id ]
}
}, {
upsert : true
}, function(err, image) {
if (err)
console.log(err);
// !!!image is created always, never updated!!!
});
Mongoose converts Buffer elements destined to be stored to mongodb Binary, but it performs the appropriate casts when doing queries.
The expected behavior is also checked in unit tests (as are the storage and retrieval of a node.js Buffer).
Are you sure you are passing a node.js Buffer?
In any case I think the best approach to handle the initial problem (check if an image is already in the db) would be storing a strong hash digest (sha1, sha256, ...) of the binary data and check that (using the crypto module).
When querying, as a preliminary test you could also check the binary length to avoid unnecessary computations.
For an example of how to get the digest for your image before storing/querying it:
var crypto = require('crypto');
...
// be sure image is a node.js Buffer
var image_digest = crypto.createHash('sha256');
image_digest.update(image);
image_digest = image_digest.digest('base64');
It is not a good idea to query for your image by the node.js Buffer that contains the image data. You're right that it's probably an issue between the BSON binary data type and a node Buffer, but does your application really require such a comparison?
Instead, I'd add an imageID or slug field to your schema, add an index to this field, and query on it instead of bin in your findOneAndUpdate call:
var imageSchema = new Schema({
imageID: { type: String, index: { unique: true }},
mime: String,
bin: Buffer,
uses : [{type: Schema.Types.ObjectId}]
});
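Combining this with the digest suggestion, a sketch could look like the following. It assumes imageID holds a SHA-256 hex digest of the image bytes, and upsertImage is a hypothetical helper name. $addToSet is used so the same reference is not added twice, and $setOnInsert stores the binary data only when the document is first created:

```javascript
const crypto = require('crypto');

// Returns the SHA-256 digest of a Buffer as a hex string.
function imageDigest(buffer) {
  return crypto.createHash('sha256').update(buffer).digest('hex');
}

// Hypothetical helper: find the image by digest, record the reference, and
// create the document (with its binary data) only if it does not exist yet.
function upsertImage(Image, contentType, buffer, ownerId) {
  return Image.findOneAndUpdate(
    { imageID: imageDigest(buffer) },
    {
      $addToSet: { uses: ownerId },
      $setOnInsert: { mime: contentType, bin: buffer }
    },
    { upsert: true, new: true }
  ).exec();
}
```

Because the lookup is on a short indexed string rather than on the Buffer itself, the query no longer depends on how binary values are compared.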
The hash does work. Another filter I have used is the EXIF data for the image.
As this is structured information, if you have a match on EXIF data, you could then go to the next step of checking for a match on the hash or file size...
There are heaps of node modules to get the EXIF data nice and easily for your storage :)
Example code to get EXIF data for node
