Why does creating a new tedious connection keep my program from finishing?

I'm trying to wrap the tedious MSSQL library API with promises to make it easier to use, but whenever I make a new Promise that creates a new tedious SQL connection, the program never exits, and I'm having trouble figuring out why.
This is a stripped down version of my real code with the bare minimum needed to cause the issue.
const {Connection} = require('tedious');

const connect = () =>
    new Promise((resolve, reject) =>
    {
        const config = {
            userName: '----',
            password: '----',
            domain: '----',
            server: '----',
            options: {
                database: '----',
                port: 1805,
                connectTimeout: 6000,
                readOnlyIntent: true,
                rowCollectionOnRequestCompletion: true,
                encrypt: true
            }
        };

        console.log('Pre new conn');
        // const conn = new Connection(config);
        console.log('Post new conn');

        resolve('Resolved');
    });

connect()
    .then(conn => console.log(`conn: ${conn}`))
    .catch(error => console.log(`Err: ${error}`));
When the connection succeeds I get the following output:
Pre new conn
Post new conn
conn: Resolved
If I uncomment the line const conn = new Connection(config); then I get the exact same output, but the program never exits!
I'm using tedious v2.6.4 and I'm running the program with node v8.11.3.

Node.js keeps track of open network connections, running timers, and other handles that indicate your program may not yet be done with whatever it was trying to do. While that count is non-zero, node.js does not automatically exit. If you want it to exit in that situation, you have three options:
You can close the connections that you are no longer using.
You can call .unref() on those connections to remove them from the count node.js is keeping. If it's a higher-level thing like a database connection, you may need to call .unref() on the actual socket itself (which the DB interface may or may not make available to you), or perhaps the database driver exposes its own .unref() method for this purpose.
You can manually exit your process with process.exit() when you're done with everything you wanted to do.
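For tedious specifically, the first option is usually the simplest: close the connection once you're finished with it. Here is a minimal sketch, assuming the config object from the question; conn.close() and the 'connect' event are part of the tedious API, but the promise wiring is illustrative only:

const {Connection} = require('tedious');

// Resolve with the live connection instead of a string,
// then close it when finished so node.js can exit on its own.
const connect = config =>
    new Promise((resolve, reject) => {
        const conn = new Connection(config);
        conn.on('connect', err => err ? reject(err) : resolve(conn));
    });

connect(config)
    .then(conn => {
        // ... run requests here ...
        conn.close(); // releases the socket; nothing keeps the event loop alive
    })
    .catch(error => console.log(`Err: ${error}`));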

Related

How to shutdown gracefully in sveltekit?

I am using adapter-node and a mysql pool in a sveltekit web app.
Previously, using just nodejs and express without sveltekit, I found I needed to shut down the mysql pool connections cleanly or mysql could hang when restarting the app.
I had something like:
process.on('SIGINT', () => server.close(() => pool.end()));
How would I achieve the same result in a sveltekit app? Or is it not necessary (and why)?
I can see in the sveltekit implementation where it creates the server, but there does not seem to be any way to access it so I can call close(). I don't think it would be safe to call pool.end() before the server closes.
I also couldn't find any discussion of graceful shutdown in the sveltekit docs. There was 1 github issue but it was closed over a year ago and that change has since been removed from the code.
I found a similar issue asked in the svelte github. It has no resolution, so there is likely no official solution yet. https://github.com/sveltejs/kit/issues/6841
Disclaimers at the time of writing this answer:
I am brand new to Svelte and SvelteKit and only a couple years into web development in general
SvelteKit is pre-1.0 and subject to change
I am not 100% sure this answer handles all cases
I am using adapter-node
SvelteKit currently recommends doing one-time setup code in src/hooks.server.ts. So when talking about graceful shutdown, I will only worry about shutting down the things I setup in src/hooks.server.ts.
The brief answer is to set up process.on() handlers for exit and SIGINT that do any required cleanup.
An example for setting up and shutting down a mysql database pool when using adapter-node:
// src/hooks.server.ts

// One time setup code
await import('./db.js');

// ... remaining hooks.server.ts code

// src/db.ts
import { PRIVATE_MYSQL_PASSWORD } from '$env/static/private';
import type { Pool, PoolConnection, MysqlError } from 'mysql';
import { createPool } from 'mysql';

const pool = createPool({
    connectionLimit: 10,
    host: 'localhost',
    user: 'root',
    password: PRIVATE_MYSQL_PASSWORD,
    database: 'my_db',
    multipleStatements: false,
    timezone: 'UTC',
    dateStrings: ['DATE', 'DATETIME'],
});

process.on('exit', (code) => end_db_pool(pool));
process.on('SIGINT', () => end_db_pool(pool));

function end_db_pool(pool: any) {
    pool.getConnection(function (err, connection) {
        connection.query('select 1 from my_table;', function (err, rows) {
            connection.release();
            // pool.end() only works inside getConnection();
            pool.end((err) => {
                if (err) console.log('pool.end err: ' + err);
            });
        });
    });
}

// ... remaining API for DB operations using the pool
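One caveat worth adding (my note, not part of the original answer): Node runs 'exit' handlers synchronously, and registering a SIGINT listener disables Node's default termination on Ctrl+C, so asynchronous cleanup generally belongs in the SIGINT handler followed by an explicit process.exit(). A hedged sketch of that variant, using the same mysql pool.end() API:

process.on('SIGINT', () => {
    // Close the pool, then exit explicitly; without process.exit() the
    // process would keep running because the SIGINT default is disabled.
    pool.end((err) => {
        if (err) console.log('pool.end err: ' + err);
        process.exit(0);
    });
});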
Another solution I tried, and which might still be useful, is to create a custom adapter. I copied the adapter-node code into my project and modified files/index.js for some experimenting. For now, I am using the code above and not a custom adapter-node.

How can data schema be rejected before the mongodb connection is made by node.js and mongoose?

I don't understand a specific piece of asynchronous javascript code: I have a few simple lines of javascript run by node.js that query a local mongoDB. In outline, the code does this:
require mongoose
a promise to connect to the db
mongoose.connect("...url to my local mongoDB...")
.then(console.log("Connected to DB..."))
create a schema
create a model from schema
define an async function to create a new object, save it as a document in mongoDB, and console.log the result returned after the attempt to save the document.
What I don't understand is the order of the console.log("Connected to DB") and the console.log of the result returned by document.save(). When there is no error on saving, the order seems fine: first I get "Connected to DB...", then the returned saved document.
But when there is a data validation error for not respecting some requirement, the "Connected to DB" is printed after the error message.
Given the structure of the code, I don't understand why "Connected to DB..." is printed after the print of the error. I suspect asynchronous code is the reason, but I don't understand why. These few lines of code come from the "Programming with Mosh" course, where the exact same behavior appears on his console.
A few more code details:
const mongoose = require("mongoose")

mongoose
    .connect(my_mongo_db_url)
    .then(() => console.log("Connected to DB"))
    .catch(err => console.log("Could not connect to DB"))

const courseSchema = new mongoose.Schema({ ...course schema... })
const Course = mongoose.model("Course", courseSchema)

async function createCourse(){
    const course = new Course({ ...new course values... })
    try { const result = await course.save() }
    catch (err) { console.log(err.message) }
}

createCourse()
I copy here the @jonrsharpe comment that answered my question:
"The call to course.save may be executed before the connection is made, but its internal implementation waits for the connection: https://mongoosejs.com/docs/connections.html#buffering"

Is it safe to use a single Mongoose database from two files/processes?

I've been working on a server and a push notification daemon that will both run simultaneously and interact with the same database. The idea behind this is that if one goes down, the other will still function.
I normally use Swift but for this project I'm writing it in Node, using Mongoose as my database. I've created a helper class that I import in both my server.js file and my notifier.js file.
const Mongoose = require('mongoose');
const Device = require('./device'); // This is a Schema

var uri = 'mongodb://localhost/devices';

function Database() {
    Mongoose.connect(uri, { useMongoClient: true }, function(err) {
        console.log('connected: ' + err);
    });
}

Database.prototype.findDevice = function(params, callback) {
    Device.findOne(params, function(err, device) {
        // etc...
    });
};

module.exports = Database;
Then separately from both server.js and notifier.js I create objects and query the database:
const Database = require('./db');
const db = new Database();

db.findDevice(params, function(err, device) {
    // Simplified, but I edit and save things back to the database via db
    device.token = 'blah';
    device.save();
});
Is this safe to do? When working with Swift (and Objective-C) I'm always concerned about making things thread safe. Is this a concern? Should I be worried about race conditions and modifying the same files at the same time?
Also, bonus question: How does Mongoose share a connection between files (or processes?). For example Mongoose.connection.readyState returns the same thing from different files.
The short answer is "safe enough."
The long answer has to do with understanding what sort of consistency guarantees your system needs, how you've configured MongoDB, and whether there's any sharding or replication going on.
For the latter, you'll want to read about atomicity and consistency and perhaps also peek at write concern.
A good way to answer these questions, even when you think you've figured it out, is to test scenarios: hammer a duplicate of your system with fake data and events and see whether what happens is OK or not.
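On the race-condition worry specifically: the read-modify-write pattern in the question (findOne, mutate, save) can lose updates if both processes touch the same document at once, because the later save overwrites whatever happened between the read and the write. Single-document operations in MongoDB are atomic, so one alternative is to push the modification into a single atomic update. A sketch using Mongoose's findOneAndUpdate (the field names are illustrative):

// Instead of findOne() + device.save(), apply the change server-side
// in one atomic operation:
Device.findOneAndUpdate(
    params,                      // same query params as before
    { $set: { token: 'blah' } }, // the modification
    { new: true },               // return the updated document
    function(err, device) {
        // No window for a concurrent writer to slip in between
        // the read and the write.
    }
);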

MongoDB executes queries sequentially instead of in parallel

I have an API endpoint which I am trying to stress test which reads a very large MongoDB database collection (2 million documents). Each query takes roughly 2 seconds; however, the problem I am having is that the connection to the database isn't being pooled correctly, so each query runs sequentially instead of concurrently.
I am using Mongoose to connect to my database and I am using artillery.io for testing.
Here is my connection code:
const mongoose = require('mongoose');
const Promise = require('bluebird');

const connectionString = process.env.MONGO_DB || 'mongodb://localhost/mydatabase';

mongoose.Promise = Promise;
mongoose.connect(connectionString, {
    server: { poolSize: 10 }
});

const db = mongoose.connection;

db.on('error', console.error.bind(console, 'connection error: '));

db.once('open', function() {
    console.log('Connected to: ' + connectionString);
});

module.exports = db;
It's a pretty bog-standard connection procedure; probably the most important part is the server: { poolSize: 10 } line.
I am using the following script for artillery.io testing:
config:
  target: 'http://localhost:1337'
  phases:
    -
      duration: 10
      arrivalRate: 5
      name: "Warm-up"
scenarios:
  -
    name: "Search by postcodes"
    flow:
      -
        post:
          url: "/api/postcodes/gb_full/search"
          headers:
            Content-Type: 'application/json'
          json:
            postcodes:
              - ABC 123,
              - DEF 345,
              - GHI 678
This test executes 50 calls to the API over 10 seconds. Now here's where the problem is: the API appears to execute queries sequentially. See the test results below:
"latency": {
"min": 1394.1,
"max": 57693,
"median": 30222.7,
"p95": 55396.8,
"p99": 57693
},
And the database logs are as follows:
connection accepted from 127.0.0.1:60770 #1 (1 connection now open)
...
2017-04-10T18:45:55.389+0100 ... 1329ms
2017-04-10T18:45:56.711+0100 ... 1321ms
2017-04-10T18:45:58.016+0100 ... 1304ms
2017-04-10T18:45:59.355+0100 ... 1338ms
2017-04-10T18:46:00.651+0100 ... 1295ms
It appears as though the API is only using one connection, which seems correct; however, it was my understanding that this would automatically put the poolSize to good use and execute these queries concurrently instead of one at a time.
What am I doing wrong here? How can I execute these database queries in parallel?
Edit 1 - Model and Query
To hopefully make things a little clearer, I am using the following model:
const mongoose = require('mongoose');
const db = require('...');

const postcodeSchema = mongoose.Schema({
    postcode: { type: String, required: true },
    ...
    location: {
        type: { type: String, required: true },
        coordinates: [] // coordinates must be in longitude, latitude order.
    }
});

// Define the index for the location object.
postcodeSchema.index({ location: '2dsphere' });

// Export a function that will allow us to define the collection
// name so we'll pass in something like: GB, IT, DE etc. for different data sets.
module.exports = function(collectionName) {
    return db.model('Postcode', postcodeSchema, collectionName.toLowerCase());
};
Where the db object is the connection module explained at the top of this question.
And I am executing a query using the following:
/**
 * Searches and returns GeoJSON data for a given array of postcodes.
 * @param {Array} postcodes - The postcode array to search.
 * @param {String} collection - The name of the collection to search, i.e. 'GB'.
 */
function search(postcodes, collection) {
    return new Promise((resolve, reject) => {
        let col = new PostcodeCollection(collection.toLowerCase());

        col.find({
            postcode: { $in: postcodes }
        })
        .exec((err, docs) => {
            if (err)
                return reject(err);

            resolve(docs);
        });
    });
}
And here is an example of how the function can be called:
search(['ABC 123', 'DEF 456', 'GHI 789'], 'gb_full')
    .then(postcodes => {
        console.log(postcodes);
    })
    .catch(...);
To reiterate, these queries are executed via the node.js API, so they should already be asynchronous; however, the queries themselves are being executed one after the other. Therefore I believe the problem may be on the MongoDB side, but I have no idea where to even start looking. It's almost as if MongoDB is blocking any other queries from being executed against the collection if there is already one running.
I am running an instance of mongod.exe locally on a Windows 10 machine.
If you are using mongo 3.0+ with WiredTiger as the storage engine, you have document-level locking. The queries should not execute sequentially; sharding would definitely help with the parallelism, but 2 million docs should not be a problem for most modern computer/server hardware.
You mention the MongoDB log file in the question: you should have more than one connection opened. Is that the case?
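A quick way to verify how many connections are open, assuming you have access to the mongo shell (serverStatus is a standard admin command; the output values are illustrative):

// Run in the mongo shell against the same mongod instance:
db.serverStatus().connections
// Example output (values will vary):
// { "current" : 11, "available" : 808, "totalCreated" : 23 }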
Ok, so I managed to figure out what the issues were.
Firstly, MongoDB has a read lock when a query is issued (see here). That's why it was executing queries sequentially. The only way to improve this further is by sharding the collection.
Also, as Jorge suggested, I added an index on the postcode field and this massively reduced the latency.
postcodeSchema.index({ postcode: 1 }); //, { unique: true } is a tiny bit faster.
To put it into perspective, here are the results of the stress test with the new index in place:
"latency": {
"min": 5.2,
"max": 72.2,
"median": 11.1,
"p95": 17,
"p99": null
},
The median latency has dropped from 30 seconds to 11 milliseconds which is an astonishing improvement.

sequelizejs saving an object when row was removed

I have the following code. The idea is that I update a database row on an interval; however, if I remove the row manually from the database while this script runs, the save() still goes into success(), but the row is not actually put back into the database (because sequelize issues an update query with a where clause, and no rows match). I expected a new row to be created or error() to be called. Any ideas on what I can do to make this behave the way I want?
var Sequelize = require("sequelize")
  , sequelize = new Sequelize('test', 'test', 'test', {logging: false, host: 'localhost'})
  , Server = sequelize.import(__dirname + "/models/Servers")

sequelize.sync({force: true}).on('success', function() {
    Server
        .create({ hostname: 'Sequelize Server 1', ip: '127.0.0.1', port: 0})
        .on('success', function(server) {
            console.log('Server added to db, going to interval');

            setInterval(function() {
                console.log('timeout reached');
                server.port = server.port + 1;
                server.save()
                    .success(function() { console.log('saved ' + server.port) })
                    .error(function(error) { console.log(error); });
            }, 1000);
        })
})
I'm afraid what you are trying to do is not currently supported by sequelize.
Error callbacks are only meant for actual error situations, i.e. SQL syntax errors, stuff like that. Trying to update a non-existing row is not an error in SQL.
The important distinction here is that you are modifying your database outside of your program. Sequelize has no way of knowing that! I have two possible solutions, only one of which is viable right now:
1 (works right now)
Use sequelize.query to include error handling in your query
IF EXISTS (SELECT * FROM table WHERE id = 42)
    UPDATE table SET port = newport WHERE id = 42
ELSE
    INSERT INTO table ... port = newport
Alternatively, you could create a feature request on the sequelize github for INSERT ... ON DUPLICATE KEY UPDATE syntax to be implemented.
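A minimal sketch of the first option, assuming a MySQL backend (where INSERT ... ON DUPLICATE KEY UPDATE is available) and the emitter-style API from the question's Sequelize version; the table and column names are illustrative:

// Raw upsert via sequelize.query(), sidestepping the UPDATE-with-no-match
// problem: the row is inserted if missing, updated if present.
var sql = "INSERT INTO Servers (id, hostname, ip, port) " +
          "VALUES (1, 'Sequelize Server 1', '127.0.0.1', 42) " +
          "ON DUPLICATE KEY UPDATE port = VALUES(port)";

sequelize.query(sql)
    .success(function() { console.log('row inserted or updated'); })
    .error(function(error) { console.log(error); });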
2 (will work when transactions are implemented)
Use transactions to first check if the row exists, and insert it if it does not. Transactions are on the roadmap for sequelize, but not currently supported. If you are NOT using connection pooling, you might be able to accomplish transactions manually by calling sequelize.query('BEGIN / COMMIT TRANSACTION').
