Updating a collection from a different database - javascript

I'm using Mongo 4.1 and would like to update a collection named "locations_copy" by adding a new field to it, an object named "time" with two subfields: "utcTime", which will be populated with the value of that document's "time" field, and "tz", which will be populated with the value of "subject.contactInfo[0].addresses[0].timeZoneID" from the document in the collection "subjects" in the database "Subjects" (a different database from the one holding the first collection) whose "_id" field matches the "subjectID" field in locations_copy.
I have tried to accomplish this with the following code:
const get_time_zone_id = function(doc) {return doc.contactInfo[0].addresses[0].timeZoneID}
const get_location_doc = function(subjectID) { return db.getSiblingDB('Subjects').subjects.find({"_id": subjectID, "contactInfo": {"$exists": true}, "$where" : function() {
return (this.contactInfo.length > 0 && this.contactInfo[0].addresses && this.contactInfo[0].addresses.length > 0 && this.contactInfo[0].addresses[0].timeZoneID)
}}, {"contactInfo" : {"$slice": 1}, "contactInfo.addresses": {"$slice": 1},"contactInfo.addresses.timeZoneID" : 1}).map(get_time_zone_id)}
db.locations_copy.aggregate( [
{ $match: {"subjectID": {"$exists": true}}},
{ $addFields: {
time: { utc: "$timeUTC",
tz: { "$arrayElemAt": [get_location_doc(ObjectId("$subjectID")), 0 ] }}
}
}
] ).forEach(function(x){db.locations_copy.save(x)})
Everything works except for one thing: when I try to pass ObjectId("$subjectID") as a parameter to "get_location_doc", it parses "$subjectID" as a literal string rather than passing the value of the underlying field in each document. I have also tried passing simply subjectID (without quotes), in which case it was undefined, and "$$subjectID", which gave me a literal string again. I understand this is because the function call is evaluated on the client before the pipeline is sent to the server.
I have tried to utilize the "$function" operator, but apparently it's only available from version 4.4 (I'm using 4.1).
I should note, that if I replace "$subjectID" with a hard-coded string ID (for example "5ff4c037bc0a716381231277") everything works as you'd expect.
Can anyone please help me accomplish what I intend? Since this script is only meant to be executed once, performance is not much of an issue.
Thank you!

db.getSiblingDB().collection.find() is a client-side operation. It is not something you can use to join collections as part of a query. For that, see https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/.
The second thing you are doing is retrieving nested fields out of a document. You can do this with $set and dot notation. See specifically the example at https://docs.mongodb.com/manual/reference/operator/aggregation/set/#adding-fields-to-an-embedded-document.
You will need to construct a single aggregation pipeline that does everything your current mix of aggregation and javascript does using only the operations documented in https://docs.mongodb.com/manual/reference/operator/aggregation/ and the stages documented in https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/.
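Since the script runs only once and performance is not a concern, a plain client-side loop is another option. This is a different technique from the single-pipeline approach described above, not an implementation of it. The sketch below assumes the field names from the question (timeUTC, subjectID, contactInfo[0].addresses[0].timeZoneID); the helper names getTimeZoneId and buildTimeUpdate are made up for illustration, and the shell loop in the trailing comment is untested.

```javascript
// Safely read contactInfo[0].addresses[0].timeZoneID; returns null when any part is missing.
function getTimeZoneId(subjectDoc) {
  var contact = subjectDoc && Array.isArray(subjectDoc.contactInfo)
    ? subjectDoc.contactInfo[0]
    : undefined;
  var address = contact && Array.isArray(contact.addresses)
    ? contact.addresses[0]
    : undefined;
  return address && address.timeZoneID ? address.timeZoneID : null;
}

// Build the update document for one locations_copy document (hypothetical helper).
function buildTimeUpdate(locationDoc, subjectDoc) {
  return {
    $set: {
      time: { utc: locationDoc.timeUTC, tz: getTimeZoneId(subjectDoc) }
    }
  };
}

// Assumed usage in the mongo shell (untested sketch):
// var subjects = db.getSiblingDB('Subjects').subjects;
// db.locations_copy.find({ subjectID: { $exists: true } }).forEach(function (loc) {
//   var subject = subjects.findOne({ _id: loc.subjectID });
//   db.locations_copy.updateOne({ _id: loc._id }, buildTimeUpdate(loc, subject));
// });
```

Here the subjectID value is read from each document on the client, so there is no "$subjectID" string to be misinterpreted.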

Related

using dynamodb SET to append to a string set

I have looked at recent posts and nothing on them has worked for me. I have a string set called "friendRequests" in my DynamoDB table and I am trying to append an element to it. However, I keep getting many errors when I try to call db.updateItem with these parameters. Here is my current code, the error is with ExpressionAttributeValues but I have probably spent an hour changing my syntax to no avail.
var params = {
TableName: "users",
Key: { "username": { "S": addFriendInfo.friendUsername } },
UpdateExpression: "SET friendRequests = list_append(friendRequests, :newFriend)",
ExpressionAttributeValues: {
':newFriend': { "SS" : [addFriendInfo.username] }
}
}
That is my code above. addFriendInfo.username and addFriendInfo.friendUsername are both just strings. This currently gives me an 'Invalid UpdateExpression: Incorrect operand type for operator or function; operator or function: list_append, operand type: SS' error. I have tried many things. Can anyone point me in the right direction to fix this damn syntax?
As the DynamoDB documentation explains, DynamoDB has two similar but distinct types - a list and a set. A list is an ordered list of items of any type, while a set is a non-empty, non-ordered collection of items of the same type. The "SS" type you used is a "set of strings". This is distinct from a "list", and you cannot use list_append on sets, as the error message tells you.
To add an element to a set (or create a new set if it doesn't yet exist) you can use the ADD operation, e.g., ADD friendRequests :newFriend.
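As a minimal sketch of that suggestion, the parameters from the question could be rewritten with ADD as shown below. The table, key and attribute names come from the question; the function name buildAddFriendParams is hypothetical and only wraps the object so it can be reused.

```javascript
// Build updateItem parameters that add one username to the "friendRequests" string set.
// ADD creates the set if it does not exist yet and ignores values already present.
function buildAddFriendParams(friendUsername, username) {
  return {
    TableName: "users",
    Key: { username: { S: friendUsername } },
    UpdateExpression: "ADD friendRequests :newFriend",
    ExpressionAttributeValues: {
      ":newFriend": { SS: [username] }
    }
  };
}

// Assumed usage with the client from the question:
// db.updateItem(buildAddFriendParams(addFriendInfo.friendUsername, addFriendInfo.username), callback);
```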

Algolia search on nested objects in a record - multiple facetFilters in one object

I’m migrating from Mongo to Firebase with Algolia on top to provide the search. But hitting a snag coming up with a comparable way to search in individual elements of a record.
I have an object that stores when a room is available: from and to. Each record can have many individual from/to combos (see the sample below with 2). I want to be able to run a search something like:
roomavailable.from <= 1522195200 AND roomavailable.to >=1522900799
But only have the query search a match within each element, not any facet in all elements. An element query in Mongo works like that. But if I run that query on the record listed below, it will return the record, because the two roomavailable objects satisfy the .from and .to query. I think.
Is there a way to ensure the search is looking only at matching a pair of .from and .to in an individual object/element?
Below is the pertinent part of the record stored in Algolia so you can see the structure.
"roomavailable": [
{
"_id": "rJbdWvY9M",
"from": 1522195200,
"to": 1522799999
},
{
"_id": "r1H_-vKqz",
"from": 1523923200,
"to": 1524268799
}
],
And here is the Mongo (mongoose) equivalent, where it's searching inside individual elements (this works):
$elemMatch: {
from: {
$lte: moment(dateArray[0]).utc().startOf('day').format()
},
to: {
$gte: moment(dateArray[1]).utc().endOf('day').format()
}
}
I have also tried this query, but it still seems to match the .from and the .to against any of the individual roomavailable elements rather than within a single one:
index.search({
query: '',
filters: filters,
facetFilters: ["roomavailable.from:1522195200", "roomavailable.to:1524268799"],
attributesToRetrieve: [
"roomavailable",
],
restrictHighlightAndSnippetArrays: true
})
I found a couple of posts on Algolia discussing using 1 bracket vs. 2 brackets in the facetFilters. I've tried both. Neither works.
Any suggestions would be awesome. Thanks!
Edit: See discussion on Algolia Discourse:
https://discourse.algolia.com/t/how-to-match-multiple-attributes-in-nested-object-with-numericfilters/4887/8
Hi @kanec, thanks for clarifying your question!
Indeed what @Alefort suggested (using roomavailable in a separate index) would be the easiest option since the query I mentioned above will definitely return the results you want. This will mean that you'll have to query the room availability index separately in order to get which IDs are available, so you'll have to use multiple-queries:
https://www.algolia.com/doc/api-reference/api-methods/multiple-queries/
That said, I asked our core API team to see if there's a more reasonable way to approach this issue, but I fear that this is a filter limit due to performance reasons with arrays. You could transform your data structure in the following and index your rooms as an object instead:
[
{
"roomavailable": {
"0": {
"_id": "rJbdWvY9M",
"from": 1522195200,
"to": 1522799999
},
"1": {
"_id": "r1H_-vKqz",
"from": 1523923200,
"to": 1524268799
}
}
}
]
So you can apply the following filter:
{
"filters": "roomavailable.0.from <= 1522195200 AND roomavailable.0.to >= 1522799999 AND roomavailable.1.from <= 1522195200 AND roomavailable.1.to >=1522900799"
}
The downside of this is that you'll need to know the length of roomavailable in order to build the search query on the front-end (you can do so at indexing time by adding a roomavailable_count property). It will also probably be less performant with a considerable number of rooms per item; in that case, switching to a dedicated index makes total sense for the following reasons:
If in your backend you frequently update available rooms you won't impact the other indices' build time
Filters will perform better (as explained above)
Indexing strategy will be simpler to handle
Let me know what you think about this and if it helps you out.
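To illustrate the dedicated-index option, here is a rough sketch of how one record could be split into one record per availability window, so that a single from/to pair is always tested within the same record. The attribute names roomID, availableFrom and availableTo, the composite objectID format and both function names are assumptions made for this example, not something from the thread.

```javascript
// Split a record holding a roomavailable array into one flat record per window.
function toAvailabilityRecords(room) {
  return (room.roomavailable || []).map(function (slot) {
    return {
      objectID: room.objectID + "_" + slot._id, // hypothetical composite ID
      roomID: room.objectID,                    // points back to the parent record
      availableFrom: slot.from,
      availableTo: slot.to
    };
  });
}

// Build the numeric filter string for a requested stay (Unix timestamps).
function buildAvailabilityFilter(from, to) {
  return "availableFrom <= " + from + " AND availableTo >= " + to;
}

// Assumed usage: index the flattened records in their own index, then search it with
// { filters: buildAvailabilityFilter(1522195200, 1522799999) } and collect the roomID values.
```

Because each flattened record holds exactly one window, both conditions can only be satisfied by the same window.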

Mongoose findOneAndUpdate: create and then update nested array

I have a program where I'm requesting weather data from a server, processing the data, and then saving it to an mlab account using mongoose. I'm gathering 10 years of data, but the API that I'm requesting the data from only allows about a year at a time to be requested.
I'm using findOneAndUpdate to create/update the document for each weather station, but am having trouble updating the arrays within the data object. (Probably not the best way to describe it...)
For example, here's the model:
const stnDataSchema = new Schema(
{
station: { type: String, default: null },
elevation: { type: String, default: null },
timeZone: { type: String, default: null },
dates: {},
data: {}
},
{ collection: 'stndata', runSettersOnQuery: true } // Schema takes a single options object
)
where the dates object looks like this:
dates: ["2007-01-01",
"2007-01-02",
"2007-01-03",
"2007-01-04",
"2007-01-05",
"2007-01-06",
"2007-01-07",
"2007-01-08",
"2007-01-09"]
and the data object like this:
"data": [
{
"maxT": [
0,
null,
4.4,
0,
-2.7,
etc.....
What I want to happen is that when I run findOneAndUpdate, it finds the document based on the station and then appends new maxT values and dates to the respective arrays. I have it working for the dates array, but am running into trouble with the data array, as the elements I'm updating are nested.
I tried this:
const update = {
$set: {'station': station, 'elevation': elevation, 'timeZone': timeZone},
$push: {'dates': datesTest, 'data.0.maxT': testMaxT}};
StnData.findOneAndUpdate( query, update, {upsert: true} ,
function(err, doc) {
if (err) {
console.log("error in updateStation", err)
throw new Error('error in updateStation')
}
else {
console.log('saved')
but got an output into mlab like this:
"data": {
"0": {
"maxT": [
"a",
"b",
The issue is that I get an object with a "0" key instead of an array of one element. I tried 'data[0].maxT' but nothing happens when I do that.
The issue is that the first time I run the data for a station, I want to create a new document with data object of the format in my third code block, and then on subsequent runs, once that document already exists, update the maxT array with new values. Any ideas?
You are getting this output:
"data": {
"0": {
"maxT": [
"a",
"b",
because you are upserting the document. Upserting gets a bit complicated when dealing with arrays of documents.
When updating an array, MongoDB knows that data.0 refers to the first element in the array. However, when inserting, MongoDB can't tell if it's meant to be an array or an object. So it assumes it's an object. So rather than inserting ["val"], it inserts {"0": "val"}.
Simplest Solution
Don't use an upsert. Insert a document for each new weather station, then use findOneAndUpdate to push values into the arrays in the documents. As long as you insert the arrays correctly the first time, you will be able to push to them without them turning into objects.
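A rough sketch of that insert-then-update flow is below. The function names are hypothetical, the field names follow the model in the question, and $each is used on the assumption that the new dates and maxT values arrive as arrays that should be appended element by element.

```javascript
// Document to insert once per station, with data already shaped as an array.
function buildInitialStation(station, elevation, timeZone) {
  return {
    station: station,
    elevation: elevation,
    timeZone: timeZone,
    dates: [],
    data: [{ maxT: [] }]
  };
}

// Update that appends a batch of dates and maxT values to the existing arrays.
function buildAppendUpdate(newDates, newMaxT) {
  return {
    $push: {
      dates: { $each: newDates },
      "data.0.maxT": { $each: newMaxT }
    }
  };
}

// Assumed usage with the StnData model from the question (no upsert option):
// StnData.create(buildInitialStation(station, elevation, timeZone), callbackFn);
// StnData.findOneAndUpdate({ station: station }, buildAppendUpdate(datesTest, testMaxT), callbackFn);
```

Because data already exists as an array when the update runs, data.0 is resolved as the first array element rather than as an object key.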
Alternative Simple Solution if data just Contains one Object
From your question, it looks like you only have one object in data. If that is the case, you could just make the maxT array top-level, instead of being a property of a single document in an array. Then it would act just like dates.
More Complicated MongoDB 3.6 Solution
If you truly cannot do without upserts, MongoDB 3.6 introduced the filtered positional operator $[<identifier>]. You can use this operator to update specific elements in an array which match a query. Unlike the simple positional operator $, the new $[<identifier>] operator can be used to upsert as long as an exact match is used.
You can read more about this operator here: https://docs.mongodb.com/manual/reference/operator/update/positional-filtered/
So your data objects will need to have a field which can be matched exactly on (say name). An example query would look something like this:
let query = {
_id: 'idOfDocument',
data: [{name: 'subobjectName'}] // Need this for an exact match
}
let update = {$push: {'data.$[el].maxT': testMaxT}}
let options = {upsert: true, arrayFilters: [{'el.name': 'subobjectName'}]}
StnData.findOneAndUpdate(query, update, options, callbackFn)
As you can see this adds much more complexity. It would be much easier to forget about trying to do upserts. Just do one insert then update.
Moreover mLab currently does not support MongoDB 3.6. So this method won't be viable when using mLab until 3.6 is supported.

CouchDB query issues

I will start off by saying while I am not new to CouchDB, I am new to querying the views using JavaScript and the web.
I have looked at multiple other questions on here, including CouchDB - Queries with params, couchDB queries, Couchdb query with AND operator, CouchDB Querying Dates, and Basic CouchDB Queries, just to list a few.
While all have good information in them, I haven't found one that has my particular problem in it.
I have a view set up like so:
function (docu) {
if(docu.status && docu.doc && docu.orgId.toString() && !docu.deleted){
switch(docu.status){
case "BASE":
emit(docu.name, docu);
break;
case "AIR":
emit(docu.eta, docu);
break;
case "CHECK":
emit(docu.checkTime, docu);
break;
}
}
}
with all documents having a status, doc, orgId, deleted, name, eta, and checkTime. (I changed doc to docu because of my custom doc key.)
I am trying to query and emit based on a set of keys, status, doc, orgId, where orgId is an integer.
My jQuery to do this looks like so:
$.couch.db("myDB").view("designDoc/viewName", {
keys : ["status","doc",orgId],
success: function(data) {
console.log(data);
},
error: function(status) {
console.log(status);
}
});
I receive
{"total_rows":59,"offset":59,"rows":[
]}
Sometimes the offset is 0, sometimes it is 59. I feel I must be doing something wrong for this not to be working correctly.
So for my questions:
1. I did not mention this, but I had to set docu.orgId.toString() because I guess it parses the URL as a string. Is there a way to use this number as a numeric value?
2. How do I correctly view multiple documents based on multiple keys, i.e. if(key1 && key2) emit(doc.name, doc)
3. Am I doing something obviously wrong that I lack the knowledge to notice?
Thank you all.
You're so very close. To answer your questions
1. When you're using docu.orgId.toString() in that if-statement you're basically saying: this value must be truthy. If you didn't convert to string, any number, other than 0, would be true. Since you are converting to a string, any value other than an empty string will be true. Also, since you do not use orgId as the first argument in an emit call, at least not in the example above, you cannot query by it at all.
2. I'll get to this.
3. A little.
The thing to remember is emit creates a key-value table (that's really all a view is) that you can use to query. Let's say we have the following documents
{type:'student', dept:'psych', name:'josh'},
{type:'student', dept:'compsci', name:'anish'},
{type:'professor', dept:'compsci', name:'kender'},
{type:'professor', dept:'psych', name:'josh'},
{type:'mascot', name:'owly'}
Now let's say we know that for this one view, we want to query 1) everything but mascots, 2) we want to query by type, dept, and name, all of the available fields in this example. We would write a map function like this:
function(doc) {
if (doc.type === 'mascot') { return; } // don't do anything
// allow for queries by type
emit(doc.type, null); // the use of null is explained below
// allow queries by dept
emit(doc.dept, null);
// allow for queries by name
emit(doc.name, null);
}
Then, we would query like this:
// look for all joshs
$.couch.db("myDB").view("designDoc/viewName", {
keys : ["josh"],
// ...
});
// look for everyone in the psych department
$.couch.db("myDB").view("designDoc/viewName", {
keys : ["psych"],
// ...
});
// look for everyone that's a professor and everyone named josh
$.couch.db("myDB").view("designDoc/viewName", {
keys : ["professor", "josh"],
// ...
});
Notice the last query isn't and in the sense of a logical conjunction, it's in the sense of a union. If you wanted to restrict what was returned to documents that were only professors and also joshs, there are a few options. The most basic would be to concatenate the key when you emit. Like
emit('type-' + doc.type + '_name-' + doc.name, null);
You would then query like this: key: "type-professor_name-josh"
It doesn't feel very proper to rely on strings like this, at least it didn't to me when I first started doing it, but it is a quite common method for querying key-value stores. The characters - and _ have no special meaning in this example, I simply use them as delimiters.
Another option would be what you mentioned in your comment, to emit an array like
emit([ doc.type, doc.name ], null);
Then you would query like
key: ["professor", "josh"]
This is perfectly fine, but generally, the use case for emitting arrays as keys, is for aggregating returned rows. For example, you could emit([year, month, day]) and if you had a simple reduce function that basically passed the records through:
function(keys, values, rereduce) {
if (rereduce) {
return [].concat.apply([], values);
} else {
return values;
}
}
You could query with the URL parameter group_level set to 1 or 2 and group the results by just the year or by year and month, respectively, on the exact same view using arrays as keys. Compared to SQL or Mongo it's mad complicated and convoluted, but hey, it's there.
The use of null in the view is really for resource saving. When you query a view, the rows contain an _id that you can use in a second ajax call to get all the documents from, for example, _all_docs.
I hope that makes sense. If you need any clarification you can use the comments and I'll try my best.
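To make the union-versus-conjunction point concrete, here is a small in-memory simulation of the two map functions above. It is plain JavaScript, not CouchDB: buildView and queryKeys are hypothetical helpers, emit is passed in as an argument instead of being a global, the _id values are invented, and key ordering is not modeled.

```javascript
var docs = [
  { _id: "1", type: "student", dept: "psych", name: "josh" },
  { _id: "2", type: "student", dept: "compsci", name: "anish" },
  { _id: "3", type: "professor", dept: "compsci", name: "kender" },
  { _id: "4", type: "professor", dept: "psych", name: "josh" },
  { _id: "5", type: "mascot", name: "owly" }
];

// Run a map function over every document and collect the emitted rows.
function buildView(allDocs, mapFn) {
  var rows = [];
  allDocs.forEach(function (doc) {
    mapFn(doc, function emit(key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    });
  });
  return rows;
}

// Return every row whose key equals one of the requested keys (compared as JSON).
function queryKeys(rows, keys) {
  var wanted = keys.map(function (k) { return JSON.stringify(k); });
  return rows.filter(function (row) {
    return wanted.indexOf(JSON.stringify(row.key)) !== -1;
  });
}

var singleKeyView = buildView(docs, function (doc, emit) {
  if (doc.type === "mascot") { return; }
  emit(doc.type, null);
  emit(doc.dept, null);
  emit(doc.name, null);
});

var compoundKeyView = buildView(docs, function (doc, emit) {
  if (doc.type === "mascot") { return; }
  emit([doc.type, doc.name], null);
});

// Union: every professor plus everyone named josh (document 4 appears twice).
var union = queryKeys(singleKeyView, ["professor", "josh"]);
// Conjunction: only professors named josh.
var both = queryKeys(compoundKeyView, [["professor", "josh"]]);
```

The single-key view returns a row for each matching emit, while the compound key narrows the result to the one document that satisfies both conditions.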

Find a document that contains a specific value in an array but not if it's the last element

My current approach is:
var v = 'Value';
Collection.find({arrayToLookIn: v}).forEach(function(obj) {
if (obj.arrayToLookIn.indexOf(v) !== obj.arrayToLookIn.length - 1) {
// do stuff
}
});
I was wondering if there's a way to specify such a rule in the find() call and do this without the inner check?
I've looked through https://docs.mongodb.org/manual/tutorial/query-documents/#match-an-array-element but didn't spot what I seek.
First question, please be gentle :)
What you can do now
You want $where, which can use JavaScript evaluation to match the document. So here you ask the evaluating code to test each array element, but not the last one:
Collection.find({
"arrayToLookIn": v,
"$where": function() {
var array = this.arrayToLookIn;
array.pop(); // remove last element
return array.some(function(el) { return el == 'Value' });
}
})
Note that because this is JavaScript sent to the server, the "Value" needs to be specified in that code rather than using a variable. You can optionally construct the JavaScript code as a "string" to join in that variable as a literal and submit that as the argument to $where.
Note that I'm leaving in the basic equality match, as $where cannot match using an index the way the equality match can. The job of $where is therefore only to "filter" out the results where the match is on the last element, not to test every single document to find whether the value is there at all.
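As a sketch of the string-building option mentioned above, the variable can be embedded with JSON.stringify so that it becomes a properly quoted literal inside the expression. The function name buildNotLastQuery is hypothetical, and slice(0, -1) with indexOf stands in for the pop() and some() calls in the answer, which changes the comparison from == to strict equality.

```javascript
// Build a find() query that matches documents where `value` appears in
// arrayToLookIn somewhere other than (or in addition to) the last position.
function buildNotLastQuery(value) {
  var literal = JSON.stringify(value); // e.g. Value becomes "Value", with quotes
  return {
    // Plain equality match first, so an index on arrayToLookIn can be used.
    arrayToLookIn: value,
    // JavaScript expression evaluated on the server for each candidate document.
    $where: "this.arrayToLookIn.slice(0, -1).indexOf(" + literal + ") !== -1"
  };
}

// Assumed usage:
// Collection.find(buildNotLastQuery(v)).forEach(function (obj) { /* do stuff */ });
```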
Better Future Way
For the curious, as of the present MongoDB 3.0 release series there is not a really efficient way to do this with the aggregation framework, so the JavaScript evaluation is the better option.
You would presently need to do something silly like find the last element in a $group after $unwind and then $match out the value after another $unwind. It's not efficient and prone to error where the value exists more than once.
Future releases will have a $slice operator which could be used like this with $redact:
Collection.aggregate([
// Still wise to do this as mentioned earlier
{ "$match": { "arrayToLookIn": v } },
// Then only return if the match was not in the last element
{ "$redact": {
"$cond": {
"if": {
"$setIsSubset": [
[v],
{ "$slice": [
"$arrayToLookIn",
0,
{ "$subtract": [ { "$size": "$arrayToLookIn" }, 1 ] }
]}
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}}
])
Here $setIsSubset compares against the array which has had its last entry removed via $slice, which returns only the elements from 0 up to the $size minus 1.
And that should be more efficient than $where as it uses native coded operations for the comparison, when the next release that has that $slice for aggregation becomes available.
Not to mention $unwind also has an option to include the index position in future releases as well. But it's still not a great option even with that addition.
