KnexJS Migration With Associated Seed Data - javascript

I'm, in the process of learning BookshelfJS/KnexJS (switching from SequelizeJS), and I'm running into an issue with importing data into multiple tables that were created via the migrations feature within KnexJS. There's 4 tables:
servers
operating_systems
applications
applications_servers
With the following constraints:
servers.operating_system_id references operating_systems.id
applications_servers.server_id references servers.id
applications_servers.application_id references applications.id
The tables get created just fine when I run knex migrate:latest --env development servers, it's when I import seeded data into the tables I get an error.
Originally, I organized the seed data for the 4 tables into 4 different files within the directory ./seeds/dev, which is just ${table_name}.js:
operating_systems.js
servers.js
applications.js
applications_servers.js
After a bit of debugging, I came to the realization that the error is being generated when the seed data within the file applications_servers.js, since when I take that one out, the other 3 run just fine. Then when I remove the 3 seed files and move applications_servers.js to the ./seeds/dev/ directory and execute knex seed:run, the applications_servers table gets populated just fine. However when I try to import the seeded data of all 4 files at once, I receive the following error:
# knex seed:run
Using environment: development
Error: ER_NO_REFERENCED_ROW_2: Cannot add or update a child row: a foreign key constraint fails (`bookshelf_knex_lessons`.`applications_servers`, CONSTRAINT `applications_servers_server_id_foreign` FOREIGN KEY (`server_id`) REFERENCES `servers` (`id`) ON DELETE CASCADE)
at Query.Sequence._packetToError (/Users/me/Documents/scripts/js/node/bookshelf_knex/node_modules/mysql/lib/protocol/sequences/Sequence.js:48:14)
at Query.ErrorPacket (/Users/me/Documents/scripts/js/node/bookshelf_knex/node_modules/mysql/lib/protocol/sequences/Query.js:83:18)
And not a single row gets inserted, into any table.
I thought that maybe it had something to do with the order that they were being imported. (Since they have to be imported in the order that the files were listed above). So thinking that maybe they were referenced in alpha-numeric order, I renamed them to:
1-operating_systems.js
2-servers.js
3-applications.js
4-applications_servers.js
However, nothing changed, no rows were inserted into any table, so just to be sure, I reversed the sequence of the number prefixes on the files, and again, no changes, not a single row was inserted, and same error returned.
Note: The migration script for creating the tables, as well as all of the seed data, was copy and pasted from a JS file I created which creates the same tables and imports the same data using BookshelfJS/KnexJS, but instead of using the migration features, it just does it manually when executed via node. You can view this file here
Any help would be appreciated!
Edit: When I combine all of the seed files within ./seeds/dev into one file, ./seeds/dev/servers.js, everything gets imported just fine. This makes me think that it may be caused by the knex migrations being executed asynchronously, thus the ID's inserted in the pivot table and the servers.operating_system_id may not be inserted yet into the associated tables... If this is the case, is there a way to setup dependencies in the seed files?

Knex.js's seed functionality does not provide any order of execution guarantees. Each seed should be written such that it can be executed in isolation - ie. your single file approach is correct.
If you want to break your individual seed files into submodules, then you might try the following:
// initial-data.js
var operatingSystems = require('./initial-data/operating-systems.js');
var servers = require('./initial-data/servers.js');
exports.seed = function(knex, Promise) {
return operatingSystems.seed(knex, Promise)
.then(function () {
return servers.seed(knex, Promise);
}).then(function() {
// next ordered migration...
});
}

I use the sql-fixtures module to handle FK relation dependencies in my seed file(s).
Imaginary implementation:
const dataSpec = {
applications_servers: [{
name: 'My ASP.Net thingie',
application_id: 'applications:0',
server_id: 'servers:0'
}],
servers: [{
name: 'My Windows server',
operating_system_id: 'operating_systems:0'
}],
operating_systems: [{
name: 'Windows Server 2k10'
}],
applications: [{
name: 'My fab web guestbook',
description: '...'
}]
}
but it might be superfluous if you declare your dependencies by means of the ORM.
Sometimes however, it is desirable to avoid coupling the DB seeds to your exact current Model implementation.

Related

Sequelize run hook once after include

I'm new to Sequelize and try to achieve the following:
Assume I have a very simple database with 3 Models/Tables:
Person, Group and Category.
Person has a Many-To-One relation to Group (1 Person can be in 1 Group, 1 Group holds multiple people) & Group has a Many-To-One relation to Category (1 Group has 1 Category, 1 Category can be applied to multiple Groups).
Because I don't want to save the whole Category in my database, but only a short string, I have a mapper in the backend in my app.
Let's say my Category-Mapper looks like this:
//category.mapper.js
module.exports = Object.freeze({
cat1: "Here is the String that should be sent to and displayed by the FrontEnd",
cat2: ".....",
});
So basically, in my database I save "cat1" as the category and every time I get one or more Categories via Sequelize from the database, I want to go into my mapper, resolve the short string to the long string and send it to the Frontend, so I wrote the following code:
//category.model.js
const categoryMapper = require("../mapper/category.mapper");
Category.afterFind((models) => {
if(!Array.isArray(models)) {
models = [models];
}
models.forEach(model => {
model.name = categoryMapper[model.name];
});
});
This works great when I call Category.findAll()..., but does not trigger when I include the Category as in this example:
Group.findAll({
include: [Category]
})
There is this rather old GitHub Issue referencing this behavior, where someone published some code to make sure the hooks run on include. See here.
I tried implementing the referenced code into my project, but when I do, the hook for Category runs twice in my following code:
Person.findAll({
include: [{
model: Group,
include: [Category]
}]
})
My assumption is, that, with the code from the GitHub-Issue comment, my hook gets triggered every time the relationship is detected and the code runs. Therefore the hook runs once after including Group, because Group has a relationship to Category and a second time when Category is actually included, which breaks my mapping function because the second time it tries to resolve the long string, which doesn't work.
I'm looking for a solution that basically runs my hooks once and only once, namely when the actual include for my model triggers, regardless of on what level the include happens.
Sorry for the lengthy post, but I did not find any solution to my problem online, but don't believe what I am trying to achieve is very exotic or specific to my project only.
If there is a better solution I am not seeing, I'm open to suggestions and new approaches.
Thanx in advance!

Cloud Firestore optimized database setup for minimal read/write operations

I'm to get an answer on whether or not the way my database is setup will cause too many recursive reads and therefore exponentially increase the number of read operations.
I currently have a collection of users, inside of each user doc, I have 3 other catalogs, goods, bundles and parts. Each user has a list of parts and a list of bundles, for example.
Each doc in the bundles catalog has an array of maps with a reference to a doc in the parts catalog in each map.
When I query the bundles, I want to also get the details of each part in the bundle. Does this require that I run another onSnapshot?
Here's an example:
database:
users (catalog)
userID
parts
partID1
partID2
partID3
bundles
bundleID1
title: "string",
parts: [
part:"/users/userID/parts/partID1,
qty: 1
]
parts: [
part:"/users/userID/parts/partID2,
qty: 1
]
parts: [
part:"/users/userID/parts/partID3,
qty: 1
]
getting the bundles
initBundle(bid) {
const path = this.database.collection('users').doc('userID').collection('bundles').doc(bid);
path.ref.onSnapshot(bundle => {
const partsArr = [];
bundle.data().parts.forEach(part => {
part.part.onSnapshot(partRef => {
const partObj = {
data: partRef.data(),
qty: part.qty
};
partsArr.push(partObj);
});
});
const bundleObj = {
title: bundle.data().title,
parts: partsArr
};
this.bundle.next(bundleObj);
});
return this.bundle;
}
I'm using Ionic/Angular for this, so when I return the item, it needs to be an array of objects. I'm sort of recreating the object to include each part on this init. As you can see, for each part in the bundles return, I am doing another onSnapshot. This seems incorrect to me.
Something that is dawning on me is that I probably should be making a single call to the user, which in turn returns everything? But how do I get the sub catalogs at that point? I'm not sure how to proceed without racking up a bill!
If you do a nested part.part.onSnapshot(partRef => { listener, be sure to manage those listeners. I know of three common approaches:
Once your outer onSnapshot listener disappears, the nested ones should probably also be stopped (as their data likely isn't needed anymore). This is a fairly simple approach, since you just need one list of listeners for the entire bundle
Alternatively you can manage each "part" listener based on the status of that part in the outer listener, removing the listener for "part1" when that disappears from the bundle. This can be made into a highly efficient solution, but does require (quite some) additional code.
Many developers use get()s for the nested document reads, as that means there is nothing to manage.

Sequelize dynamic seeding

I'm currently seeding data with Sequelize.js and using hard coded values for association IDs. This is not ideal because I really should be able to do this dynamically right? For example, associating users and profiles with a "has one" and "belongs to" association. I don't necessarily want to seed users with a hard coded profileId. I'd rather do that in the profiles seeds after I create profiles. Adding the profileId to a user dynamically once profiles have been created. Is this possible and the normal convention when working with Sequelize.js? Or is it more common to just hard code association IDs when seeding with Sequelize?
Perhaps I'm going about seeding wrong? Should I have a one-to-one number of seeds files with migrations files using Sequelize? In Rails, there is usually only 1 seeds file you have the option of breaking out into multiple files if you want.
In general, just looking for guidance and advice here. These are my files:
users.js
// User seeds
'use strict';
module.exports = {
up: function (queryInterface, Sequelize) {
/*
Add altering commands here.
Return a promise to correctly handle asynchronicity.
Example:
return queryInterface.bulkInsert('Person', [{
name: 'John Doe',
isBetaMember: false
}], {});
*/
var users = [];
for (let i = 0; i < 10; i++) {
users.push({
fname: "Foo",
lname: "Bar",
username: `foobar${i}`,
email: `foobar${i}#gmail.com`,
profileId: i + 1
});
}
return queryInterface.bulkInsert('Users', users);
},
down: function (queryInterface, Sequelize) {
/*
Add reverting commands here.
Return a promise to correctly handle asynchronicity.
Example:
return queryInterface.bulkDelete('Person', null, {});
*/
return queryInterface.bulkDelete('Users', null, {});
}
};
profiles.js
// Profile seeds
'use strict';
var models = require('./../models');
var User = models.User;
var Profile = models.Profile;
module.exports = {
up: function (queryInterface, Sequelize) {
/*
Add altering commands here.
Return a promise to correctly handle asynchronicity.
Example:
return queryInterface.bulkInsert('Person', [{
name: 'John Doe',
isBetaMember: false
}], {});
*/
var profiles = [];
var genders = ['m', 'f'];
for (let i = 0; i < 10; i++) {
profiles.push({
birthday: new Date(),
gender: genders[Math.round(Math.random())],
occupation: 'Dev',
description: 'Cool yo',
userId: i + 1
});
}
return queryInterface.bulkInsert('Profiles', profiles);
},
down: function (queryInterface, Sequelize) {
/*
Add reverting commands here.
Return a promise to correctly handle asynchronicity.
Example:
return queryInterface.bulkDelete('Person', null, {});
*/
return queryInterface.bulkDelete('Profiles', null, {});
}
};
As you can see I'm just using a hard coded for loop for both (not ideal).
WARNING: after working with sequelize for over a year, I've come to realize that my suggestion is a very bad practice. I'll explain at the bottom.
tl;dr:
never use seeders, only use migrations
never use your sequelize models in migrations, only write explicit SQL
My other suggestion still holds up that you use some "configuration" to drive the generation of seed data. (But that seed data should be inserted via migration.)
vv DO NOT DO THIS vv
Here's another pattern, which I prefer, because I believe it is more flexible and more readily understood. I offer it here as an alternative to the accepted answer (which seems fine to me, btw), in case others find it a better fit for their circumstances.
The strategy is to leverage the sqlz models you've already defined to fetch data that was created by other seeders, use that data to generate whatever new associations you want, and then use bulkInsert to insert the new rows.
In this example, I'm tracking a set of people and the cars they own. My models/tables:
Driver: a real person, who may own one or more real cars
Car: not a specific car, but a type of car that could be owned by someone (i.e. make + model)
DriverCar: a real car owned by a real person, with a color and a year they bought it
We will assume a previous seeder has stocked the database with all known Car types: that information is already available and we don't want to burden users with unnecessary data entry when we can bundle that data in the system. We will also assume there are already Driver rows in there, either through seeding or because the system is in-use.
The goal is to generate a whole bunch of fake-but-plausible DriverCar relationships from those two data sources, in an automated way.
const {
Driver,
Car
} = require('models')
module.exports = {
up: async (queryInterface, Sequelize) => {
// fetch base entities that were created by previous seeders
// these will be used to create seed relationships
const [ drivers , cars ] = await Promise.all([
Driver.findAll({ /* limit ? */ order: Sequelize.fn( 'RANDOM' ) }),
Car.findAll({ /* limit ? */ order: Sequelize.fn( 'RANDOM' ) })
])
const fakeDriverCars = Array(30).fill().map((_, i) => {
// create new tuples that reference drivers & cars,
// and which reflect the schema of the DriverCar table
})
return queryInterface.bulkInsert( 'DriverCar', fakeDriverCars );
},
down: (queryInterface, Sequelize) => {
return queryInterface.bulkDelete('DriverCar');
}
}
That's a partial implementation. However, it omits some key details, because there are a million ways to skin that cat. Those pieces can all be gathered under the heading "configuration," and we should talk about it now.
When you generate seed data, you usually have requirements like:
I want to create at least a hundred of them, or
I want their properties determined randomly from an acceptable set, or
I want to create a web of relationships shaped exactly like this
You could try to hard-code that stuff into your algorithm, but that's the hard way. What I like to do is declare "configuration" at the top of the seeder, to capture the skeleton of the desired seed data. Then, within the tuple-generation function, I use that config to procedurally generate real rows. That configuration can obviously be expressed however you like. I try to put it all into a single CONFIG object so it all stays together and so I can easily locate all the references within the seeder implementation.
Your configuration will probably imply reasonable limit values for your findAll calls. It will also probably specify all the factors that should be used to calculate the number of seed rows to generate (either by explicitly stating quantity: 30, or through a combinatoric algorithm).
As food for thought, here is an example of a very simple config that I used with this DriverCar system to ensure that I had 2 drivers who each owned one overlapping car (with the specific cars to be chosen randomly at runtime):
const CONFIG = {
ownership: [
[ 'a', 'b', 'c', 'd' ], // driver 1 linked to cars a, b, c, and d
[ 'b' ], // driver 2 linked to car b
[ 'b', 'b' ] // driver 3 has two of the same kind of car
]
};
I actually used those letters, too. At runtime, the seeder implementation would determine that only 3 unique Driver rows and 4 unique Car rows were needed, and apply limit: 3 to Driver.findAll, and limit: 4 to Car.findAll. Then it would assign a real, randomly-chosen Car instance to each unique string. Finally, when generating association tuples, it uses the string to look up the chosen Car from which to pull foreign keys and other values.
There are undoubtedly fancier ways of specifying a template for seed data. Skin that cat however you like. Hopefully this makes it clear how you'd marry your chosen algorithm to your actual sqlz implementation to generate coherent seed data.
Why the above is bad
If you use your sequelize models in migration or seeder files, you will inevitably create a situation in which the application will not build successfully from a clean slate.
How to avoid madness:
Never use seeders, only use migrations
(Anything you can do in a seeder, you can do in a migration. Bear that in mind as I enumerate the problems with seeders, because that means none of these problems gain you anything.)
By default, sequelize does not keep records of which seeders have been run. Yes, you can configure it to keep records, but if the app has already been deployed without that setting, then when you deploy your app with the new setting, it'll still re-run all your seeders one last time. If that's not safe, your app will blow up. My experience is that seed data can't and shouldn't be duplicated: if it doesn't immediately violate uniqueness constraints, it'll create duplicate rows.
Running seeders is a separate command, which you then need to integrate into your startup scripts. It's easy for that to lead to a proliferation of npm scripts that make app startup harder to follow. In one project, I converted the only 2 seeders into migrations, and reduced the number of startup-related npm scripts from 13 to 5.
It's been hard to pin down, but it can be hard to make sense of the order in which seeders are run. Remember also that the commands are separate for running migrations and seeders, which means you can't interleave them efficiently. You'll have to run all migrations first, then run all seeders. As the database changes over time, you'll run into the problem I describe next:
Never use your sequelize models in your migrations
When you use a sequelize model to fetch records, it explicitly fetches every column it knows about. So, imagine a migration sequence like this:
M1: create tables Car & Driver
M2: use Car & Driver models to generate seed data
That will work. Fast-forward to a date when you add a new column to Car (say, isElectric). That involves: (1) creating a migraiton to add the column, and (2) declaring the new column on the sequelize model. Now your migration process looks like this:
M1: create tables Car & Driver
M2: use Car & Driver models to generate seed data
M3: add isElectric to Car
The problem is that your sequelize models always reflect the final schema, without acknowledging the fact that the actual database is built by ordered accretion of mutations. So, in our example, M2 will fail because any built-in selection method (e.g. Car.findOne) will execute a SQL query like:
SELECT
"Car"."make" AS "Car.make",
"Car"."isElectric" AS "Car.isElectric"
FROM
"Car"
Your DB will throw because Car doesn't have an isElectric column when M2 executes.
The problem won't occur in environments that are only one migration behind, but you're boned if you hire a new developer or nuke the database on your local workstation and build the app from scratch.
Instead of using different seeds for Users and Profiles you could seed them together in one file using sequelizes create-with-association feature.
And additionaly, when using a series of create() you must wrap those in a Promise.all(), because the seeding interface expects a Promise as return value.
up: function (queryInterface, Sequelize) {
return Promise.all([
models.Profile.create({
data: 'profile stuff',
users: [{
name: "name",
...
}, {
name: 'another user',
...
}]}, {
include: [ model.users]
}
),
models.Profile.create({
data: 'another profile',
users: [{
name: "more users",
...
}, {
name: 'another user',
...
}]}, {
include: [ model.users]
}
)
])
}
Not sure if this is really the best solution, but thats how I got around maintaining foreign keys myself in seeding files.

Meteor: Data from External API call not rendering

I am relatively new to Meteor, and I'm trying to create a web store for my sister-in-law that takes data from her existing Etsy store and puts a custom skin on it. I've defined all of my Meteor.methods to retrieve the data, and I've proofed the data with a series of console.log statements... So, the data is there, but it won't render on the screen. Here is an example of some of the code on the server side:
Meteor.methods({
...
'getShopSections': function() {
this.unblock();
var URL = baseURL + "/sections?api_key="+apiKey;
var response = Meteor.http.get(URL).data.results;
return response;
}
...
});
This method returns an array of Object. A sample bit of JSON string from one of the returned Objects from the array:
{
active_listing_count: 20,
rank: 2,
shop_section_id: 1******0,
title: "Example Title",
user_id: 2******7
}
After fetching this data without a hitch, I was ready to make the call from the client side, and I tried and failed in several different ways before a Google search landed me at this tutorial here: https://dzone.com/articles/integrating-external-apis-your
On the client side, I have a nav.js file with the following bit of code, adapted from the above tutorial:
Template.nav.rendered = function() {
Meteor.call('getShopSections', function(err, res) {
Session.set('sections', res);
return res;
});
};
Template.nav.helpers({
category: function() {
var sections = Session.get('sections');
return sections;
}
});
And a sample call from inside my nav.html template...
<ul>
{{#each category}}
<li>{{category.title}}</li>
{{/each}}
</ul>
So, there's a few things going on here that I'm unsure of. First and foremost, the DOM is not rendering any of the category.title String despite showing the appropriate number of li placeholders. Secondly, before I followed the above tutorial, I didn't define a Session variable. Considering that the list of shop categories should remain static once the template is loaded, I didn't think it was necessary from what I understand about Session variables... but for some reason this was the difference between the template displaying a single empty <li> tag versus a number of empty <li>'s equal to category.length --- so, even though I can't comprehend why the Session variable is needed in this instance, it did bring me one perceived step closer to my goal... I have tried a number of console.log statements on the client side, and I am 100% sure the data is defined and available, but when I check the source code in my Developer Tools window, the DOM just shows a number of empty li brackets.
Can any Meteor gurus explain why 1) the DOM is not rendering any of the titles, and 2) if the Session variable indeed necessary? Please let me know if more information is needed, and I'll be very happy to provide it. Thanks!
You set the data context when you use #each, so simply use:
<li>{{title}}</li>
If a Session is the right type of reactive variable to use here or not is hard to determine without knowing what you are doing but my rough guess is that a Mini Mongo collection may be better suited for what it appears you are doing.
To get you started on deciding the correct type of reactive variable to use for this head over to the full Meteor documentation and investigate: collections, sessions, and reactive vars.
Edit: To step back and clarify a bit, a Template helper is called a reactive computation. Reactive computations inside of helpers will only execute if they are used in their respective templates AND if you use a reactive variable inside of the computation. There are multiple types of reactive variable, each with their own attributes. Your code likely didn't work at all before you used Session because you were not using a reactive variable.

Structuring Monitoring-Frontend with backbone.js

I'm building a monitoring/statistic tool. The current infrastructure looks like following:
Collector-Backend: Receives queries (JSON-objects) from the frontend, fetches and stores them in a cache. Finally it notifies frontend over a message queue.
Frontend-Server: Handles subscriptions on the 'live'-views, pushes the data received from the backend to the user via WebSockets. (Also contains some user-management etc.)
Frontend itself: Some JavaScript-Spaghetti-Code which render the data using jQuery-Flot-Graphs. There are a couple of graphs per environment which are displayed on the same page. A graph is a set of measurments.
An, what I call, environment is a (set of) system(s) which contains different measurments. Therefore each environment contains following views:
a 'live'-view which displays the measurements of the last 2 hours (updated every minute).
a statistic-view on which different pre-defined graphs can be selected over an arbitrary time (same queries as the live-views, just another view)
some special reports like the yesterdays statistics.
An environment contains following variables:
{
name: 'FooEnvironment',
description: 'Foo Environment, <insert buzzwords here>',
baseTable: 'foobar', # Which table in the backend contains the corresponding data
target: 'barTarget', # The responsible target (backend-plugin)
}
The graphs are defined as a set of measurements (as a JSON-Object) and is defined like
this:
{
name: 'FooProtocol Traffic', # Graphs display name
size: 'full' # only matters for some CSS-transformations
interval: X # Which delta each tuple differs from the other
start-date: "DATE",
end-date: "DATE",
axes: {
"y1": { label: 'Foo/t', data: [['MEASURMENT1', 'AGGREGATION'], ['MEASURMENT2', 'AGGREGATION']] },
"y2": { label: 'Bar/sec', data: [['MEASURMENT_ON_Y2', 'AGGREGATION']] }
}
}
I want to be able to manage those environments on my frontend (Adding/remove environments and add/remove graphs to them).
How would you modeling this using MVC (backbone.js)?
Please report me if I have to provide some more information :-)
I'd probably try having an Environment model and a Graph model.
The Environment model would have a Graphs property, which is a collection of Graph models.
A user's subscriptions could be a collection of Environments that are being monitored.
There would be at least three views - one for holding the representation of a graph, one for holding the representation of an environment, and one for holding the collection of environments.
As you've mentioned adding/removing graphs and environments, there's a little more detail required.
For example, you'd need a list of available environments for the user to subscribe to, and under each environment subscribed to, a list of available graphs for that environment.
I know the answer is rambling and vague, but without further detail of what data is on hand, it's the best I can do.

Categories