In MongoDB (2.6.1), I need to query a document by _id using pure JSON (without using ObjectIds). Based on the MongoDB extended JSON documentation, I was expecting db.collection.findOne({"_id": {"$oid": "51b6eab8cd794eb62bb3e131"}}) to work, but it does not. It even throws the following exception:
Can't canonicalize query: BadValue unknown operator: $oid
Does anyone know how to do it?
The extended JSON syntax is intended as a "transfer" format, so that if, for example, you are sending JSON output to a remote client, there is still a way to determine the actual implemented type, such as ObjectId, Date, Binary etc.
The only place AFAIK where this is implemented is within the C# driver, which provides a JSON parser utility method that takes JSON with the extended syntax fields and then "casts" those into objects of the required type.
So in much the same way you can implement your own parser routine to do much the same thing, it is just a matter of testing the key values for something that represents the type of object specified in the key. Given a sample fragment:
{ "_id": { "$oid": "51b6eab8cd794eb62bb3e131" } }
In simplified form without recursively checking by depth:
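// Parse the extended JSON text, then cast known type wrappers
var data = JSON.parse( json );
for ( var k in data ) {
    var value = data[k];
    // Skip null and primitives, which cannot carry a "$oid" key
    if ( value !== null && typeof value === "object" && value.hasOwnProperty("$oid") )
        data[k] = new ObjectId( value["$oid"] );
    // etc. for "$date", "$binary" and the other type keys
}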
So just because you may be using JavaScript it doesn't mean the "extended syntax" is valid as a query source, but you can as with other languages post-process the parsed JSON into the valid object notation required by that language and the query interface.
Similar "casting" is performed by some drivers on "string" values supplied against an _id field in order to cast to the correct object type required by the BSON wire protocol.
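The simplified loop above only looks at top-level keys. A recursive version might look like the following sketch. The name castExtendedJson and the ObjectIdCtor parameter are made up for illustration; the constructor is passed in so the sketch does not depend on any particular driver, and only the "$oid" wrapper is handled.

```javascript
// Recursively replace { "$oid": "..." } wrappers with ObjectId instances.
// `ObjectIdCtor` is an assumption: any constructor that accepts the hex string.
function castExtendedJson(value, ObjectIdCtor) {
  if (Array.isArray(value)) {
    return value.map((item) => castExtendedJson(item, ObjectIdCtor));
  }
  if (value === null || typeof value !== "object") {
    return value; // primitives pass through unchanged
  }
  const keys = Object.keys(value);
  // A wrapper is an object whose only key is "$oid" with a string value
  if (keys.length === 1 && keys[0] === "$oid" && typeof value.$oid === "string") {
    return new ObjectIdCtor(value.$oid);
  }
  // Otherwise rebuild the object, converting each nested value
  const result = {};
  for (const key of keys) {
    result[key] = castExtendedJson(value[key], ObjectIdCtor);
  }
  return result;
}
```

With the Node.js driver, this would presumably be called as castExtendedJson(JSON.parse(json), ObjectId) before handing the result to findOne.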
I am migrating a few queries from Google BigQuery to MySQL and need help replicating the BigQuery JavaScript UDF below in MySQL. I can't find any reference on the internet. Does MySQL support JavaScript UDFs?
The requirement here is to Split a JSON array into a simple array of string ( each string represents individual JSON string ).
CREATE OR REPLACE FUNCTION `<project>.<dataset>.json2array`(json STRING) RETURNS ARRAY<STRING> LANGUAGE js AS R"""
if (json) {
return JSON.parse(json).map(x=>JSON.stringify(x));
} else {
return [];
}
""";
No, MySQL does not support JavaScript stored functions. MySQL supports stored functions written in a procedural language. It also supports server-loadable functions compiled from C or C++, but these are less common.
MySQL doesn't have an ARRAY data type. The closest you can get in MySQL is a JSON data type, which may be a one-dimensional array of strings. If your JSON document is assured to be an array in that format, then you can simply do the following:
CREATE FUNCTION json2array(in_string TEXT) RETURNS JSON DETERMINISTIC
RETURN CAST(in_string AS JSON);
I'm not sure what the point of creating a stored function is in this case, since it only does what CAST() can do. So you might as well just call CAST() and skip creating the stored function.
Perhaps a good use of a stored function is to test the input to make sure it's a document with an array format (use JSON_TYPE()). For example:
CREATE FUNCTION json2array(in_string TEXT) RETURNS JSON DETERMINISTIC
RETURN IF(JSON_TYPE(in_string) = 'ARRAY', CAST(in_string AS JSON), JSON_ARRAY());
Let's jump straight to some example code:
create table test_json_table
(
data json not null
);
I can insert to the table like this:
const columns = { data: JSON.stringify({ some_json: 123 }) }; // notice that the data column is passed as a string
await knex('test_json_table').insert(columns);
And get data from the table like this:
await knex('test_json_table').select();
// returns:
// [
// { data: { some_json: 123 } } // notice that the data is returned as parsed JavaScript object (not a string)
// ]
When inserting a row the JSON column needs to be passed as a serialised string. When retrieving the row, an already parsed object is returned.
This is creating quite a mess in the project. We are using TypeScript and would like to have the same type for inserts as for selects, but this makes it impossible. It'd be fine to either always have string or always object.
I found this topic being discussed in other places, so it looks like I am not alone in this (link, link). It seems like there is no way to convert the object to a string automatically. Or am I missing something?
It'd be nice if knex provided a hook where we could manually serialise the object into string when inserting.
What would be the easiest way to achieve that? Is there any lightweight ORM with support for that? Or any other option?
You could try objection.js that allows you to declare certain columns to be marked as json attributes and those should be stringified automatically when inserting / updating their values https://vincit.github.io/objection.js/api/model/static-properties.html#static-jsonattributes
I haven't tried if it works with mysql though. I don't see any reason why it wouldn't.
I think the easiest way is to use the jsonb data type (MySQL has a json type).
We prefer PostgreSQL for this kind of problem at the office; it is an easier and more solid database for this.
Well you could call your own function before inserting that converts all objects to string and call it every time before you insert.
You can probably wrap knex to do it automatically as well.
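A rough sketch of that "convert before insert" idea is below. Both serializeJsonColumns and insertRow are hypothetical helper names, not part of knex; the only knex calls assumed are the knex(table).insert(row) form already used in the question. Dates and Buffers are left alone on the assumption that the driver handles them itself.

```javascript
// Return a copy of `row` in which every plain object or array value
// is serialised to a JSON string; other values are left untouched.
function serializeJsonColumns(row) {
  const out = {};
  for (const [column, value] of Object.entries(row)) {
    const isJsonLike =
      value !== null &&
      typeof value === "object" &&
      !(value instanceof Date) &&
      !Buffer.isBuffer(value);
    out[column] = isJsonLike ? JSON.stringify(value) : value;
  }
  return out;
}

// Hypothetical wrapper: apply the serialisation, then delegate to knex.
function insertRow(knex, table, row) {
  return knex(table).insert(serializeJsonColumns(row));
}
```

With this in place, the insert type and the select type can both describe data as an object, for example insertRow(knex, 'test_json_table', { data: { some_json: 123 } }).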
I am just trying to parse a JSON document with a Date field like this:
'death':Date('2007-03-17T04:00:00Z') using
com.mongodb.util.JSON.parse(document)
There is an exception when the value Date is encountered. Any help?
The key here is that whatever exported the data has done it wrong. Possibly someone has run something from the MongoDB shell and redirected the console output to a file. That is basically "doing it wrong".
There is a concept called MongoDB Extended JSON, which has in fact been picked up in a few other areas, notably the EJSON project. What this tries to do is make sure that any exported JSON maintains "type" information for the BSON type identifier (or other object type, in the case of EJSON) so that a similar "extended JSON" parser can "re-construct" the object to its intended form.
For a "date" object, the intended JSON representation is this:
{ "death": { "$date": "2007-03-17T04:00:00Z" } }
Since com.mongodb.util.JSON.parse is enabled with knowledge of the Extended JSON spec, any such JSON construct will result in a correct date object being constructed from the parsed data.
So what you have right now is just a "string". In fact, unless it is "quoted" like this:
{ "death" : "Date('2007-03-17T04:00:00Z')" }
it is not even valid JSON, and it would need to be manipulated into a correct form before even a basic JSON parser would accept it. At any rate, the result is still just a "string", so you would need a regex match to extract the date value and then pass that to a date object constructor.
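As an illustration of that regex fallback, here is a minimal JavaScript sketch (the question itself uses Java, where the same pattern would apply). The helper name parseShellDate is made up for this example, and it assumes the string has exactly the single-quoted Date('...') shape shown above.

```javascript
// Matches text such as Date('2007-03-17T04:00:00Z') and captures the timestamp.
const SHELL_DATE = /^Date\('([^']+)'\)$/;

// Returns a Date for a matching string, or null when the text does not
// match or the captured timestamp cannot be parsed.
function parseShellDate(text) {
  const match = SHELL_DATE.exec(text);
  if (!match) return null;
  const date = new Date(match[1]);
  return Number.isNaN(date.getTime()) ? null : date;
}
```

This only patches up the data after the fact; fixing the export remains the cleaner option.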
Clearly the "best" thing to do here is to fix the "export" source of the data so that it is doing this date parsing to JSON in the correct extended way. Then the parser you are using will do the right thing.
I am using Node.js to query MongoDB and have a field that is a Date (ISODate). In Node.js after querying the Date format that is returned looks like this
DT : 2014-10-02T02:36:23.354Z
What I am trying to figure out is how, based on the code I have below, I can efficiently convert the DT field from UTC to a local time ISO string. In other words, if local time is EDT, then something like this
DT : 2014-10-02T23:36:23.354Z
I don't think there is anything I can do in the query itself from Mongo. Should I traverse the Array result set and manually change the dates? Is there a better approach here? I am sending the response to an HTTP client.
collection.find(query,options).toArray(function (err, items) {
if (err) {
logAndSendDebugError(500, "Error issuing find against mongo Employees Collection -" + type, res);
} else {
var response = {
'type': type,
'employees': items
};
res.jsonp(response);
}
});
In ES5, Date.parse should be able to parse that format, however it isn't reliable. Manually parsing it isn't hard:
// Parse ISO 8601 UTC string like 2014-10-02T23:36:23.354Z
function parseISOUTC(s) {
var b = s.split(/\D/);
return new Date(Date.UTC(b[0], --b[1], b[2], b[3], b[4], b[5], b[6]));
}
The date instance that is created will have a local timezone offset calculated from system settings. To get a UTC ISO 8601 string from a local date you can use toISOString:
var date = new Date();
console.log(date.toISOString());
Note that toISOString is ES5 so may need a polyfill if this is used in older browsers. See MDN Date.prototype.toISOString().
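Since toISOString always reports UTC, getting an ISO 8601 string in local time means formatting the local fields yourself. The sketch below is one way to do that; toLocalISOString is a made-up name, and it appends the numeric offset (for example -04:00) rather than Z, because Z would claim the value is UTC. It assumes a four-digit year and uses padStart, which is newer than ES5.

```javascript
// Format a Date using the system's local time zone, with an explicit
// UTC offset suffix instead of "Z".
function toLocalISOString(date) {
  const pad = (n, width = 2) => String(Math.abs(n)).padStart(width, "0");
  // getTimezoneOffset() is minutes *behind* UTC, so negate it
  const offsetMinutes = -date.getTimezoneOffset();
  const sign = offsetMinutes >= 0 ? "+" : "-";
  const absOffset = Math.abs(offsetMinutes);
  return (
    date.getFullYear() + "-" + pad(date.getMonth() + 1) + "-" + pad(date.getDate()) +
    "T" + pad(date.getHours()) + ":" + pad(date.getMinutes()) + ":" + pad(date.getSeconds()) +
    "." + pad(date.getMilliseconds(), 3) +
    sign + pad(Math.floor(absOffset / 60)) + ":" + pad(absOffset % 60)
  );
}
```

Note that this uses the time zone of the machine running the code, so on a server it reflects the server's zone, not the HTTP client's.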
I would have thought the most efficient way to handle dates between the client and server would be to use the EJSON package. This covers a few things that are skirted around in other discussion here and the importance is placed on maintaining "type fidelity" when handling JSON conversion.
So what you are getting right now is the result of a "string" which is called from the "Date" object in response to a JSON.stringify call. Whatever the method being used, this is essentially what is happening where the .toJSON() method is being called from the "Date" prototype.
Rather than muck around with the prototypes or other manual processing of conversions, the EJSON package allows you to call EJSON.stringify instead, which has some built-in behavior to preserve types. Specifically, the generated JSON string would look like this for a Date element:
{ "myCreatedDate": { "$date": 1412227831060 } }
The value there is an epoch timestamp, essentially obtained from the .valueOf() prototype method, but the field is given a special structure automagically as it were. The same is true for types other than dates as well.
The corresponding "client" processing can be added to your web application in the browser with simple includes, e.g.:
<script src="components/ejson/base64.js"></script>
<script src="components/ejson/ejson.js"></script>
This makes the same EJSON object available in the browser, where you can process the received JSON with EJSON.parse. The resulting JavaScript object keeps its "Date" type when it is de-serialized.
var obj = EJSON.parse( "{ \"myCreatedDate\": { \"$date\": 1412227831060 } }" );
{
myCreatedDate: /* Actually a Date Object here */
}
So now in your client browser, you have a real Date object without any other processing. Any .toString() method called on that object is going to result in a value represented in a way that matches the current locale settings for that client.
So if you use this to pass the values around between server and client in a way that maintains an actual "Date" object, then the correct object values are maintained on both client and server and need no further conversion.
Very simple to include in your project and it takes a lot of the heavy lifting of maintaining "timezone" conversions off your hands. Give it a try.
Probably worth noting that the "core" of this comes from a MongoDB specification for Extended JSON Syntax. So aside from this (partial) implementation in the EJSON package, the same "type identifiers" are supported in several MongoDB tools as well as within several driver implementations with a custom JSON parser that will automatically convert the types. Notably the Java and C# drivers have this capability shipped with the driver libraries.
It's fairly easy to follow the convention outlined in that link, and it is intended to "map" to the BSON type specifications as well. At the worst, you can always "inspect" the results from a standard JSON parser and implement custom routines to "re-instantiate" the "types". But as noted, the software is already in place with several libraries.
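For the "inspect the results and re-instantiate the types" fallback, the standard JSON.parse reviver argument is one place to hook in. The sketch below handles only the "$date" wrapper; reviveExtendedDates is a hypothetical name, and the other type identifiers would need their own branches.

```javascript
// JSON.parse reviver that turns { "$date": <epoch ms or ISO string> }
// wrappers back into Date objects and leaves everything else alone.
function reviveExtendedDates(key, value) {
  if (value !== null && typeof value === "object" && !Array.isArray(value)) {
    const keys = Object.keys(value);
    if (keys.length === 1 && keys[0] === "$date") {
      const date = new Date(value.$date);
      // Only substitute when the wrapped value is a usable timestamp
      if (!Number.isNaN(date.getTime())) return date;
    }
  }
  return value;
}

const parsed = JSON.parse(
  '{ "myCreatedDate": { "$date": 1412227831060 } }',
  reviveExtendedDates
);
// parsed.myCreatedDate is now a Date instance rather than a plain object
```

The reviver runs on the innermost values first, so the wrapper object is seen intact once its own "$date" member has been visited.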
From JSON website:
JSON is built on two structures:
A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.
Now I have a sample service that returns a boolean (this is in PHP, but it could be any server side language):
<?php
header('Content-Type: application/json');
echo 'true';
exit;
And when requesting this page with ajax (for example with jQuery):
$.ajax({
url: 'service.php',
dataType: 'json',
success: function (data) {
console.log(data);
console.log(typeof data);
}
});
The result would be:
-> true
-> boolean
My question is why it's allowed to return boolean as a JSON.
Doesn't it have conflict with JSON definition?
ALSO
Also I can return number or string in my service:
<?php
header('Content-Type: application/json');
echo '2013';
exit;
And the result is:
-> 2013
-> number
And for string:
<?php
header('Content-Type: application/json');
echo '"What is going on?"';
exit;
And the result is:
-> What is going on?
-> string
You are correct that a valid JSON text can only be an object or an array. I asked Douglas Crockford about this in 2009 and he confirmed it, saying "Strictly speaking, it is object|array, as in the RFC."
The JSON RFC specifies this in section 2:
A JSON text is a serialized object or array.
JSON-text = object / array
The original JSON syntax listed on json.org does not make this clear at all. It defines all of the JSON types, but it doesn't say anywhere which of these types may be used as a "JSON text" - a complete valid piece of JSON.
That's why I asked Doug about it and he referred me to the RFC. Unfortunately, he didn't follow up on my suggestion to update json.org to clarify this.
Probably because of this confusion, many JSON libraries will happily create and parse (invalid) JSON for a standalone string, number, boolean, etc. even though those aren't really valid JSON.
Some JSON parsers are more strict. For example, jsonlint.com rejects JSON texts such as 101, "abc", and true. It only accepts an object or array.
This distinction may not matter much if you're just generating JSON data for consumption in your own web app. After all, JSON.parse() is happy to parse it, and that probably holds true in all browsers.
But it is important if you ever generate JSON for other people to use. There you should follow the standard more strictly.
I would suggest following it even in your own app, partly because there's a practical benefit: By sending down an object instead of a bare string, you have a built-in place to add more information if you ever need to, in the form of additional properties in the object.
Along those lines, when I'm defining a JSON API, I never use an array at the topmost level. If what I have is an array of items of some sort, I still wrap it in an object:
{
"items": [
...
]
}
This is partly for the same reason: If I later want to add something else to the response, having the top level be an object makes that easy to do without disrupting any existing client code.
Perhaps more importantly, there's also a possible security risk with JSON arrays. (I think that risk only affects the use of eval() or the Function constructor to parse JSON, so you're safe with JSON.parse(), but I'm not 100% sure on this.)
Note that the answer from Michael Geary has been outdated since RFC 7158 (2013), which no longer limits a JSON text to an array or object. The current RFC, https://www.rfc-editor.org/rfc/rfc8259, says:
A JSON text is a serialized value. Note that certain previous
specifications of JSON constrained a JSON text to be an object or an
array. Implementations that generate only objects or arrays where a
JSON text is called for will be interoperable in the sense that all
implementations will accept these as conforming JSON texts.
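The behaviour seen in the question can be checked directly with JSON.parse, which accepts any serialized value at the top level. The helper name topLevelType below is made up for this sketch.

```javascript
// Report what kind of value a JSON text parses to at the top level.
function topLevelType(text) {
  const value = JSON.parse(text);
  if (value === null) return "null";
  return Array.isArray(value) ? "array" : typeof value;
}

const samples = ["true", "2013", '"What is going on?"', '{"items": []}', "[]"];
for (const text of samples) {
  console.log(text, "->", topLevelType(text));
}
```

The first three samples are the bodies returned by the PHP snippets in the question; each one parses without error.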