Given JSON such as:
[
{ "id":"A", "status": 1, "rank":1, "score": },
{ "id":"B", "status": 1, "rank":1, "score": }
]
My script fails because of the empty score.
Given JavaScript such as:
if (json[i].score) { do something } else { calculate it }
I want to keep the field empty rather than use 0. I could use "score": "", but that implies a string (empty at start), while I want score to be a number (empty at start), so that the number I put into it stays a number and does not become a string.
How do I declare an empty/undefined number?
Note: I intuitively think this is why I sometimes encounter undefined.
EDIT: the question was edited to clarify the context, the need to check the existence of obj.score, where 0 would be misleading.
TL;DR: Use 0 (zero) if everyone starts with a score of zero. Use null if you specifically want to say that someone's score is not set yet. Omit the property entirely if it does not belong on a particular object (for example, an object that should not have a "score" property at all).
Omitting the value in JSON object
Well, in JSON you are not allowed to just omit the value. It must be set to something: a number (integer or float), object, array, boolean, null or string. To read more about the syntax, try this resource: http://json.org/.
Most popular approaches: null, 0 (zero), undefined
The usual approach is to set the value to null. In other cases it can be better to use 0 (zero), if applicable (e.g. the field means "sum").
Another approach is simply not to set the property. After deserialization you can then test for the existence of a specific property like this:
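A minimal sketch of the null approach in JavaScript, assuming the score is filled in later. The calculateScore helper and its formula are hypothetical placeholders, not part of the question:

```javascript
// Hypothetical data: "score" is null until a value has been calculated.
const players = JSON.parse(`[
  { "id": "A", "status": 1, "rank": 1, "score": null },
  { "id": "B", "status": 1, "rank": 1, "score": 0 }
]`);

// Hypothetical placeholder for whatever computes a real score.
function calculateScore(player) {
  return player.rank * 10;
}

for (const player of players) {
  // Compare against null explicitly: a plain truthiness test such as
  // if (player.score) would also treat a legitimate 0 as "missing".
  if (player.score === null) {
    player.score = calculateScore(player);
  }
}
```

Player B keeps its real score of 0, and player A ends up with a number rather than a string.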
JavaScript:
if (typeof my_object.score !== 'undefined'){
// "score" key exists in the object
}
Python:
if 'score' in my_object:
pass # "score" key exists in the object
PHP:
if (array_key_exists('score', $my_object)) {
// "score" key exists in the object
}
Less consistent approaches: false, ""
Some people also accept false as the value in such a case, but it is rather inconsistent. The same goes for an empty string (""). However, both cases should be properly supported during deserialization in most programming languages.
Why not just start score at 0? Everyone will start with a score of 0 in just about anything involving score.
[
{ "id":"A", "status": 1, "rank":1, "score": 0 },
{ "id":"B", "status": 1, "rank":1, "score": 0 }
]
By definition, a numeral value is a value type, which cannot be null. Only reference types (like strings, arrays, etc.) should be initialized to null.
For the semantics of your situation, I would suggest using a boolean to indicate whether or not there is a score to be read.
[
{ "id":"A", "status": 1, "rank":1, "empty":true },
{ "id":"B", "status": 1, "rank":1, "empty":false, "score":100}
]
Then,
if (!foo.empty) {
var score = foo.score;
}
While a null could be tested as well, it is a wrong representation of a number.
When importing a document, I get an error that is attached below.
I guess the problem arose when the data provider (esMapping.js) was changed, to use the integer sub-field to sort documents.
Is it possible to use some pattern to sort the document so that this error does not occur again? Does anyone have an idea?
The question refers to the one already asked - Enable ascending and descending sorting of numbers that are of the keyword type (Elasticsearch)
Error:
022-05-18 11:33:32.5830 [ERROR] ESIndexerLogger Failed to commit bulk. Errors:
index returned 400 _index: adama_gen_ro_importdocument _type: _doc _id: 4c616067-4beb-4484-83cc-7eb9d36eb175 _version: 0 error: Type: mapper_parsing_exception Reason: "failed to parse field [number.sequenceNumber] of type [integer] in document with id '4c616067-4beb-4484-83cc-7eb9d36eb175'. Preview of field's value: 'BS-000011/2022'" CausedBy: "Type: number_format_exception Reason: "For input string: "BS-000011/2022"""
Mapping (sequenceNumber used for sorting):
"number": {
"type": "keyword",
"copy_to": [
"_summary"
],
"fields": {
"sequenceNumber": {
"type": "integer"
}
}
}
In the returned error message, the value being indexed into the number field is a string with alphabetical characters, 'BS-000011/2022'. This is no problem for the number field that has a keyword type. However, it is an issue for the sequenceNumber sub-field which has an integer type. The text value passed into number is also passed into sequenceNumber sub-field, hence the error.
Unfortunately, the text analyzer used in the previous question won't help either, as sorting can't be performed on a text field. However, the tokenizer used by the custom analyzer document_number_analyzer can be repurposed into an ingest pipeline.
For context, this is the custom tokenizer provided by the author in the previous question:
"tokenizer": {
"document_number_tokenizer": {
"type": "pattern",
"pattern": "-0*([1-9][0-9]*)\/",
"group": 1
}
}
The custom analyzer can be run against the value above with the Elasticsearch _analyze API, like so (stack_index being a temporary index that defines the analyzer):
POST stack_index/_analyze
{
"analyzer": "document_number_analyzer",
"text": ["BS-000011/2022"]
}
The analyzer returns one token of 11, but tokens are for search analysis, not sorting.
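To see what the tokenizer's pattern captures, here is a small sketch that applies the same expression in JavaScript. It assumes JavaScript and Java regular expressions agree for a pattern this simple. The extractSequence helper is hypothetical:

```javascript
// Same pattern as document_number_tokenizer; capture group 1 holds the number
// with its leading zeros removed.
const documentNumberPattern = /-0*([1-9][0-9]*)\//;

// Hypothetical helper: returns the captured text, or null when nothing matches.
function extractSequence(text) {
  const match = documentNumberPattern.exec(text);
  return match === null ? null : match[1];
}

const token = extractSequence("BS-000011/2022"); // "11", still a string
```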
An Elasticsearch ingest pipeline, using the grok processor, can be applied to the index to extract the desired number from the value and index it as an integer. The processor needs to be configured to expect the value's format, which would be similar to 'BS-0000011/2022'. An example is provided below:
PUT _ingest/pipeline/numberSort
{
"processors": [
{
"grok": {
"field": "number",
"patterns": ["%{WORD}%{ZEROS}%{SORTVALUES:sequenceNumber:int}%{SEPARATE}%{NUMBER}"],
"pattern_definitions": {
"SEPARATE": "[/]",
"ZEROS" : "[-0]*",
"SORTVALUES": "[1-9][0-9]*"
}
}
}
]
}
Grok takes an input text value and extracts structured fields from it. The part of the pattern that extracts the sortable number is %{SORTVALUES:sequenceNumber:int}. A new field, called sequenceNumber, will be created in the document. When 'BS-000011/2022' is indexed in the number field, 11 is indexed into the sequenceNumber field as an integer.
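The extraction can be approximated outside Elasticsearch to check the pattern against sample values. This is a rough JavaScript sketch, not the grok engine itself: it assumes \w+ and \d+ are close enough stand-ins for grok's WORD and NUMBER, and applyNumberSort is a hypothetical name:

```javascript
// Approximation of %{WORD}%{ZEROS}%{SORTVALUES:sequenceNumber:int}%{SEPARATE}%{NUMBER}
// (assumption: the real WORD and NUMBER definitions are broader than \w+ and \d+).
const NUMBER_FORMAT = /^\w+[-0]*(?<sequenceNumber>[1-9][0-9]*)\/\d+$/;

// Hypothetical stand-in for the pipeline: returns a copy of the document
// with the extracted sequenceNumber added as an integer.
function applyNumberSort(doc) {
  const match = NUMBER_FORMAT.exec(doc.number);
  if (match === null) {
    // As far as I know, the grok processor also fails when no pattern matches.
    throw new Error("number does not match expected format: " + doc.number);
  }
  // The ":int" suffix in the grok pattern corresponds to this conversion.
  return { ...doc, sequenceNumber: Number.parseInt(match.groups.sequenceNumber, 10) };
}

const indexed = applyNumberSort({ number: "BS-000011/2022" });
```

Here indexed.sequenceNumber is the integer 11, while indexed.number keeps the original string.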
You can then create an index template to apply the ingest pipeline. The sequenceNumber field will need to be explicitly added as an integer type. The ingest pipeline will automatically index into it as long as a value matching the format of the input above is indexed into the number field. The sequenceNumber field will then be available to sort on.
I am trying to construct a JSON schema that meets the following:
1. Declares a top-level object with at least one property
2. The value of each property will be an array, each of which must contain exactly N items
3. Array items must be integers taken from the closed interval [J, K], or null
4. Integer items in each array must be unique within that array
5. There is no uniqueness constraint applied to null (so no implied relationship between N and the interval size K-J)
The problem I am running into is #4 and #5. It is easy enough to meet the first 3 requirements, plus part of the 4th, using this schema:
{
"$schema": "http://json-schema.org/draft/2019-09/schema#",
"type": "object",
"minProperties": 1,
"additionalProperties": {
"type": "array",
"minItems": N,
"maxItems": N,
"items": {
"anyOf": [
{
"type": "integer",
"minimum": J,
"maximum": K
},
{
"type": "null"
}
]
},
"uniqueItems": true
}
}
I am not sure how (or if it's even possible) to specify an array that applies the uniqueItems constraint to only a subset of the allowable items. I tried moving uniqueItems to lower levels of the schema with the hope that it might operate with restricted scope, but that doesn't work.
This might be possible using conditionals, but I haven't gone down that road yet since I'm not sure it will actually work, and I am hoping there is an easier approach that I have overlooked.
So, my question is: Is there a way to specify a JSON schema array that selectively enforces a uniqueness constraint only on the items that are not null?
This is beyond the capabilities of uniqueItems and not a constraint JSON Schema is able to express. You will need to check this requirement elsewhere in your application's business logic.
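As a sketch of what such a check in application code might look like, assuming the data has already passed the schema (without uniqueItems) and been parsed. Both function names are hypothetical:

```javascript
// Returns true when no non-null item appears twice; null entries are ignored.
function nonNullItemsAreUnique(items) {
  const seen = new Set();
  for (const item of items) {
    if (item === null) continue;
    if (seen.has(item)) return false;
    seen.add(item);
  }
  return true;
}

// Returns the names of the properties whose arrays break the uniqueness rule.
function findViolations(data) {
  return Object.keys(data).filter((key) => !nonNullItemsAreUnique(data[key]));
}

const violations = findViolations({
  a: [1, null, null, 2], // repeated nulls are allowed
  b: [3, 3, null],       // 3 appears twice
});
```

With this input, violations contains only "b".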
I have a string which represents a JSON object. It might be invalid, as there might be some parameters which will be replaced by another system (e.g. %param%). I need to remove all objects in which a known property is equal to true, using regex.
{
"someTopLevelProp": "value",
"arrayOfData": [
{
"firstPropIAlwaysKnow": "value",
"dontCareProp": $paramValue$,
"dontCareProp2": 2,
"flagWhichShouldIUse": true,
"somethingAtTheEnd": "value"
},
{
"absolutelyAnotherObject": %paramValue%
},
{
"firstPropIAlwaysKnow": "value",
"dontCareProp": "value",
"dontCareProp2": 2,
"flagWhichShouldIUse": false,
"somethingAtTheEnd": "value"
},
{
"firstPropIAlwaysKnow": "value",
"dontCareProp": "value",
"dontCareProp2": 2,
"flagWhichShouldIUse": true,
"somethingAtTheEnd": "value"
}
]
}
In the example above, I always have "firstPropIAlwaysKnow", which means that the object may contain the flag I need. After that there might be other properties. The most important one is the "flagWhichShouldIUse" property, which means the object should be removed (but only when its value is true). As a result I should receive:
{
"someTopLevelProp": "value",
"arrayOfData": [
{
"absolutelyAnotherObject": %paramValue%
},
{
"firstPropIAlwaysKnow": "value",
"dontCareProp": "value",
"dontCareProp2": 2,
"flagWhichShouldIUse": false,
"somethingAtTheEnd": "value"
}
]
}
My knowledge of regex is not strong enough, so I kindly ask for the community's help.
P.S. Please do not mention that parsing JSON with regex is a crazy/incorrect/bad idea; be sure I know that.
ANSWER: I now have a working regex which does this. Thank you to everyone who tried to help here. Maybe it will be useful for someone.
/{\s+?"firstPropIAlwaysKnow": "value"[^{}]+?(?:\{[^}]*\}[^}]+?)*[^}]+?"flagWhichShouldIUse": true[^}]+?},?/gi
You really can't do this with just regular expressions. Something like this might work:
let filtered = jsonstring
// split into the individual 'objects'
// might need to modify this depending on formatting. You
// could use something like /},\s*{/ to split the string,
// but couldn't re-join with same formatting
.split('},{')
// keep only the strings where the flag is not set to true
// (objects without the flag are kept as well)
.filter(s => !/"flagWhichShouldIUse":\s*true/.test(s)) // again, may need to change
// put humpty-dumpty back together again
.join('},{');
The exact splitting method will depend heavily on the structure of your JSON, and this isn't fool-proof. It doesn't handle nesting properly. If your JSON is pretty-printed, you could use the number of tab/space characters as part of the splitter: this for instance would only split for one tab only: /\n\t},\s*{/
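For the flat case only, a replace-based sketch can avoid the split/join step. It assumes the objects to remove contain no nested braces and that no string value contains a brace or a comma directly before "]". The removeFlaggedObjects name and the shortened sample are hypothetical:

```javascript
// Assumption: the objects to remove are flat (no nested braces inside them).
function removeFlaggedObjects(text) {
  return text
    // drop every flat {...} block that contains the flag set to true,
    // together with the comma and whitespace that follow it
    .replace(/\{[^{}]*"flagWhichShouldIUse":\s*true[^{}]*\}\s*,?\s*/g, "")
    // if the last array element was removed, a dangling comma is left before "]"
    .replace(/,(\s*\])/g, "$1");
}

const template = `{
  "someTopLevelProp": "value",
  "arrayOfData": [
    { "firstPropIAlwaysKnow": "value", "dontCareProp": $paramValue$, "flagWhichShouldIUse": true },
    { "absolutelyAnotherObject": %paramValue% },
    { "firstPropIAlwaysKnow": "value", "flagWhichShouldIUse": false },
    { "firstPropIAlwaysKnow": "value", "flagWhichShouldIUse": true }
  ]
}`;

const cleaned = removeFlaggedObjects(template);
```

The placeholders such as %paramValue% pass through untouched because the text is never parsed as JSON. Like the split approach, this breaks as soon as a flagged object contains a nested object.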
My JSON object looks like this and I want to sort them based on property starting with sort_
{
"sort_11832": "1",
"productsId": [
"11832",
"160",
"180"
],
"sort_160": "0",
"sort_180": "2"
}
Ideally I would like to get a result of ids based on sort order like this -
[ "160","11832","180" ]
Any suggestions on how to sort by a wildcard property name? Using JavaScript/jQuery, of course.
Here's what I'd do:
Go back to whatever is producing that truly absurd data and have them just return an array in the correct order in the first place, rather than producing an array and then repeating all the entries as separate properties with sort_ in front of them and a number.
Very, very, very, very far down in second place if the above weren't successful, I'd do this:
1. Call sort on the array, passing in a comparator function
2. In the comparator function, put "sort_" in front of each of the two values the function was given to compare, and use that property name to look up the numeric string for that property in the object
3. Parse those two numeric strings into numbers
4. Return the result of subtracting the second one's number from the first
Code is left as an exercise to the reader.
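For reference, one way the steps above might look, using the object from the question. The variable names are illustrative, and the array is copied first on the assumption that the original order should be left alone:

```javascript
const data = {
  sort_11832: "1",
  productsId: ["11832", "160", "180"],
  sort_160: "0",
  sort_180: "2",
};

// Copy the array, then order the ids by the number stored under "sort_<id>".
const sortedIds = [...data.productsId].sort(
  (a, b) => Number(data["sort_" + a]) - Number(data["sort_" + b])
);
```

With this data, sortedIds is ["160", "11832", "180"].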
Thanks everyone. It was a daft way of approaching this. Point taken, and I have now changed the data set to something like this, which makes everything much easier:
sort: [
{
\"id\": \"160\",
\"sort\": 0
},
{
\"id\": \"11832\",
\"sort\": 1
}
]
I am interacting with MongoDB through JavaScript.
Is there any difference between the following two statements?
myCollection.insert({tiles: new Int32Array(256)});
myCollection.insert({tiles: new Array(256)});
Yes, it's very different.
The first uses Int32Array, which isn't a JavaScript type that the current driver uses/understands, so it is instead projected into an object with properties following the indices of the array:
"tiles" : { "0" : 0, "1" : 0, "2" : 0.... }
The second translates to an Array in MongoDB whose elements are of type Null.
"tiles": [ null, null, null .... ]
So, the first does not remain as an array, while the second one does.
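The same distinction can be observed in plain JavaScript without a database. In this sketch JSON.stringify is only an analogy for serialization, not the driver's BSON encoder, and converting with Array.from before inserting is a suggestion rather than something confirmed against a particular driver version:

```javascript
const typed = new Int32Array(3);
const plain = new Array(3);

const typedIsArray = Array.isArray(typed); // false: typed arrays are not Array instances
const plainIsArray = Array.isArray(plain); // true

// Analogy only: JSON.stringify, not the MongoDB driver.
const typedAsJson = JSON.stringify(typed); // '{"0":0,"1":0,"2":0}'
const plainAsJson = JSON.stringify(plain); // '[null,null,null]'

// Converting first yields a plain array of numbers.
const converted = Array.from(typed); // [0, 0, 0]
```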