Convert Excel file to JSON: design of JSON code check - javascript

I want to convert the data from an Excel file to a JSON file. However, I'm not sure about the design of my JSON code (i.e. is it organized in a proper way in order to process it easily?)
I will use this JSON file with D3.js.
This is a small part of my Excel file:
I'd like to convert this data into a JSON file in order to use it with D3.js. This is what I have so far:
So my question is: is this a good design (way) for organizing the data in order to use it with D3.js?
This is a sample output:
Thanks in advance!

This is a somewhat subjective question, but from my experience, there is a better way:
Since you're working in d3, you're probably doing something like this:
d3.selectAll('div')
.data(entities)
.enter()
.append('div')
...
So you want entities to be an array. The question is what are your entities? Is there a view where entities are all the countries in the world? Is there a view where entities are all the countries plus all the regions plus the whole world? Or, are all the views going to be simply all the countries in a selected region, not including the region itself?
Unless the JSON structure you're proposing matches the combinations of entities that you plan to display, your code will have to do a bunch of concatenating and/or filtering of arrays to get a single entities array that you can bind to. Maybe that's OK, but it creates unnecessary coupling between your code and the structure of the data.
From my experience, the most flexible way (and probably the simplest in terms of coding) is to keep the hierarchy flat, like it is in the Excel file. So, instead of encoding regions into the hierarchy, just have them in a single, flat array like so:
{
"immigration": [
{
"name": "All Countries",
"values": [
{ "Year": ..., "value": ... },
{ "Year": ..., "value": ... },
...
]
},
{
"name": "Africa",
"values": [
{ "Year": ..., "value": ... },
{ "Year": ..., "value": ... },
...
]
},
{
"name": "East Africa",
"continent": "Africa",
"values": [
{ "Year": ..., "value": ... },
{ "Year": ..., "value": ... },
...
]
},
{
"name": "Burundi",
"continent": "Africa",
"region": "East Africa",
"values": [
{ "Year": ..., "value": ... },
{ "Year": ..., "value": ... },
...
]
},
{
"name": "Djibouti",
"continent": "Africa",
"region": "East Africa",
"values": [
{ "Year": ..., "value": ... },
{ "Year": ..., "value": ... },
...
]
},
...
]
}
Note that even though the array is flat, there is still a hierarchy here: the continent and region properties.
You'll have to do a bit of filtering to extract just the countries/regions you want to show. But that's simpler than traversing the hierarchy you're proposing:
var africanEntities = data.immigration.filter(function(country) {
return country.continent == "Africa";
}); // Includes the region "East Africa"
var justCountries = data.immigration.filter(function(country) {
return country.continent != null && country.region != null;
});
Also, d3 has the awesome d3.nest(), which lets you turn this flat data into a hierarchical structure with little effort:
var countriesByContinent = d3.nest()
.key(function(d) { return d.continent; })
.map(data.immigration);
var africanEntities = countriesByContinent['Africa'];
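Note that d3.nest() belongs to older D3 releases; as far as I know it was dropped from the default bundle in D3 v6 in favor of d3.group(). If it is not available in your version, the same grouping can be sketched in plain JavaScript. The sample records and the groupBy helper below are illustrative assumptions, not part of d3:

```javascript
// Hypothetical sample in the flat shape suggested above (values omitted).
var immigration = [
  { name: "All Countries" },
  { name: "Africa" },
  { name: "East Africa", continent: "Africa" },
  { name: "Burundi", continent: "Africa", region: "East Africa" },
  { name: "Djibouti", continent: "Africa", region: "East Africa" }
];

// Group records by the value returned by keyFn.
// Records without a key end up under the string "undefined".
function groupBy(records, keyFn) {
  return records.reduce(function(groups, record) {
    var key = String(keyFn(record));
    (groups[key] = groups[key] || []).push(record);
    return groups;
  }, Object.create(null));
}

var countriesByContinent = groupBy(immigration, function(d) { return d.continent; });
var africanEntities = countriesByContinent["Africa"];

console.log(africanEntities.map(function(d) { return d.name; }));
```

The grouped result still includes the region entry "East Africa", so the filter on region shown earlier is still needed if you want only countries.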
Hope that helps....

Related

How can I store mapping rules in a database and then evaluate them

I'm trying to store in MongoDB a document containing an object with the properties I want to map later. My idea is to create a function that receives 2 parameters: first, the object that holds the mapping, and second, the object the information is taken from.
For example, I want to store this JSON (it would be the first parameter of the function):
{
"name": "client.firstName",
"surname": "client.surname",
"age": "client.age",
"skills": [
{
"skillName": "client.skills[index].name",
"level": "client.skills[index].levelNumber",
"categories": [
{
"categoryName": "client.skills[index].categories[index].name",
"isImportant": "client.skills[index].categories[index].important"
}
]
}
]
}
And the second parameter would be something like this (it's the object where the information is found):
{
"client": {
"firstName": "Jake",
"surname": "Long",
"age": 20,
"skills": [
{
"name": "Fly",
"level": 102,
"categories": [
{
"name": "air",
"important": true
},
{
"name": "superpower",
"important": false
}
]
},
{
"name": "FastSpeed",
"level": 163,
"categories": [
{
"name": "superpower",
"important": false
}
]
}
]
}
}
The idea is: with the paths that I have in the first object, find the values in the second one. The problem arises when I have arrays, because when I define the mapping rules I don't know how many positions the array I want to map will have. So in the mapping object (the first one) I only define the path, and I can't give it the same length as the second one because I don't know how many elements it will have.
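One possible way to handle arrays of unknown length is to treat the single element of each mapping array as a template and repeat it once per element of the matching source array. The sketch below assumes that convention, plus the [index] placeholder from the example above. getPath, findIndexedPath, bindIndex and applyMapping are hypothetical helper names, and the sample data is shortened:

```javascript
// Read a path such as "client.skills[0].name" from an object.
function getPath(source, path) {
  var keys = path.replace(/\[(\d+)\]/g, ".$1").split(".");
  return keys.reduce(function(value, key) {
    return value == null ? undefined : value[key];
  }, source);
}

// Return the first path in a template that still contains "[index]".
function findIndexedPath(template) {
  if (typeof template === "string") {
    return template.indexOf("[index]") === -1 ? null : template;
  }
  if (template === null || typeof template !== "object") return null;
  var keys = Object.keys(template);
  for (var i = 0; i < keys.length; i++) {
    var found = findIndexedPath(template[keys[i]]);
    if (found) return found;
  }
  return null;
}

// Replace the first "[index]" of every path in a template with a real index.
function bindIndex(template, i) {
  if (typeof template === "string") return template.replace("[index]", "[" + i + "]");
  if (Array.isArray(template)) return template.map(function(t) { return bindIndex(t, i); });
  var bound = {};
  Object.keys(template).forEach(function(key) { bound[key] = bindIndex(template[key], i); });
  return bound;
}

// Build the output object by resolving every path against the source.
function applyMapping(template, source) {
  if (typeof template === "string") return getPath(source, template);
  if (Array.isArray(template)) {
    var itemTemplate = template[0];
    var sample = findIndexedPath(itemTemplate);
    if (!sample) return [];
    // Everything before the first unbound "[index]" is the source array path.
    var items = getPath(source, sample.slice(0, sample.indexOf("[index]"))) || [];
    return items.map(function(_, i) { return applyMapping(bindIndex(itemTemplate, i), source); });
  }
  var result = {};
  Object.keys(template).forEach(function(key) { result[key] = applyMapping(template[key], source); });
  return result;
}

var mapping = {
  name: "client.firstName",
  skills: [{
    skillName: "client.skills[index].name",
    categories: [{ categoryName: "client.skills[index].categories[index].name" }]
  }]
};
var source = {
  client: {
    firstName: "Jake",
    skills: [
      { name: "Fly", categories: [{ name: "air" }, { name: "superpower" }] },
      { name: "FastSpeed", categories: [{ name: "superpower" }] }
    ]
  }
};
var mapped = applyMapping(mapping, source);
console.log(JSON.stringify(mapped, null, 2));
```

Because the paths are read with plain property access, nothing has to be passed to eval; a path that does not exist in the source simply resolves to undefined.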

Search inside array using Elastic Search

I am using Elasticsearch version 6.8 and have created an index whose schema is as follows:
{
"properties": {
"title": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"tags": {
"type": "keyword",
"fields": {
"raw": {
"type": "text"
}
}
}
}}
and I have added the following documents to it:
[{
"title": "one",
"tags": ["html", "css", "javascript"]
}, {
"title": "two",
"tags": ["java", "jsp", "servlet"]
}, {
"title": "three",
"tags": ["spring", "java"]
}, {
"title": "four",
"tags": ["react", "angular", "javascript"]
}, {
"title": "five",
"tags": ["java"]
}, {
"title": "six",
"tags": []
}]
Now I have more than 10 million documents in Elasticsearch, and I want to handle the following cases:
List all tags with unique results, using skip and limit (the skip value changes but the limit is fixed). So here I want a result like:
html,
css,
javascript,
java,
jsp,
servlet,
spring,
react,
angular
Partial search inside tags: if I search for act, it should return react, also using skip and limit.
How can I do this with an Elasticsearch query? Please help.
You can find the unique values by using a terms aggregation.
GET yourindex/_search
{
"size": 0,
"aggs": {
"all_tags": {
"terms": {
"field": "tags",
"size": 100
}
}
}
}
"size": 100 returns at most 100 unique values; the default is 10. You can increase it, but larger values cost more. See the documentation for details.
For partial search you can use a wildcard query, or you can try the N-gram tokenizer. Both allow partial search, but the wildcard query will be costly. Evaluate them according to your use case.
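For the partial-search case, one more option is the include parameter of the terms aggregation, which accepts a regular expression and keeps the result unique. The sketch below only builds the request body. buildTagSearchBody and escapeRegex are hypothetical helper names, and sending the body to yourindex/_search is left to whichever client you use. As far as I know the terms aggregation has no skip/offset parameter, so paging would need something else, such as the composite aggregation and its after key.

```javascript
// Escape characters that are special in Lucene regular expressions.
function escapeRegex(text) {
  return text.replace(/[.?+*|{}\[\]()"\\#@&<>~]/g, "\\$&");
}

// Build a search body that returns unique tags containing the fragment.
function buildTagSearchBody(fragment, size) {
  return {
    size: 0, // no hits needed, only the aggregation
    aggs: {
      matching_tags: {
        terms: {
          field: "tags",
          size: size,
          include: ".*" + escapeRegex(fragment) + ".*"
        }
      }
    }
  };
}

console.log(JSON.stringify(buildTagSearchBody("act", 20), null, 2));
```

With the sample documents above, a search for act would be expected to return the single bucket react, since the regular expression is matched against each whole keyword value.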

Nested or different mongodb collections for e-commerce products with multi language?

Hi there. I used to work with relational databases, but I am now trying to implement an e-commerce shop with MongoDB.
I need products, sub-products and descriptions (multi-language).
I prefer to split everything into 3 collections (maybe this is not a good idea, because MongoDB typically uses 1 collection per entity, and in my example there are 3 collections for 1 entity):
"content": [{
pid: 1,
lang: "ru",
title: "Привет"
},
{
pid: 1,
lang: "en",
title: "Hello"
},
{
pid: 2,
lang: "ru",
title: "Пока"
},
{
pid: 2,
lang: "en",
title: "Bye"
}
],
"products": [{
"_id": 1,
"item": "almonds",
"price": 12,
},
{
"_id": 2,
"item": "pecans",
"price": 20,
},
],
"sub": [{
"_id": 11,
"pid": 1,
"features": {
"color": ["red"],
"size": 42
},
"qt": 5
},
{
"_id": 12,
"pid": 1,
"features": {
"color": ["red"],
"size": 43
},
"qt": 2
},
{
"_id": 13,
"pid": 1,
"features": {
"color": ["yellow"],
"size": 44
},
"qt": 3
},
{
"_id": 21,
"pid": 2,
"features": {
"color": ["yellow"],
"size": 41
},
"qt": 6
},
{
"_id": 22,
"pid": 2,
"features": {
"color": ["red"],
"size": 47
},
"qt": 10
}
]
Products should have sub-products so that I can filter. For example, when I want to filter items, I search the sub-product collection for all the yellow T-shirts with size 44, then $group the items by main product ID, $lookup the main products, and return the result.
Also, to get a main product with its description, I have to do a $lookup against the content collection.
Is this a good idea, or should I use 1 collection for product and content?
Like:
"products": [{
"_id": 1,
"item": "almonds",
"price": 12,
"content": [{
lang: "ru",
title: "Привет"
},
{
lang: "en",
title: "Hello"
}
]
}
]
and maybe I should also include the sub-items in the main product, like:
"products": [{
"_id": 1,
"item": "almonds",
"price": 12,
"content": [{
lang: "ru",
title: "Привет"
},
{
lang: "en",
title: "Hello"
}
],
"sub": [{
"features": {
"color": ["red"],
"size": 42
},
"qt": 5
},
{
"features": {
"color": ["red"],
"size": 43
},
"qt": 2
}
]
}
]
The main question: is it a good idea to combine everything and not care about the size of the collection? And if so, how should I filter on the nested documents ('sub-products')? Previously the 'sub-products' collection was a plain collection, and I could run an aggregation to find all items by color, for example {"features.color": { $in: ['red'] }}. How can I do this with nested documents without it becoming an overly expensive operation?
Your question about the data model design decision of how to split up your entities/documents cannot be answered in a useful way with just the information given, but I can give you some aspects to think about for the decision.
MongoDB is a document store and, much like a key-value store, it is most useful if you can design your data so that most lookups are key-based. This can be extended by creating additional indexes on document fields other than _id, which is indexed by default. Queries on fields that are not indexed result in collection scans and are very inefficient. Indexes can substantially increase the amount of storage required, so depending on your scenario cost may be a prohibiting factor against indexing every field you might want to query later. That means when designing entities you also have to consider the estimated size of collections and the possibility of referencing between collections (by GUID, for example).
So to make the right design decisions for your data model, we cannot judge based only on the data schema you want to store, but rather on the queries you want to perform later and the expected collection sizes and future scaling requirements. Those are aspects that you only touch on lightly in your question. For example, if you plan to query all kinds of complex joins and combinations of property values across all your entities, you may ask yourself whether you can afford the extra storage cost of non-normalized data and additional indexes, or whether a traditional (or modern) SQL RDBMS, or maybe a graph database, may be more suitable for your use case. Whereas if your database will stay small at all times and your main concern is developer productivity, these considerations may not matter.
Your specific question about accessing "sub" documents within an array of the parent "products" documents: this is supported through $elemMatch. For example:
db.products.find({sub: {$elemMatch: {'features.size':43, 'features.color':'red'}}})
Please note again that these queries will only be efficient if you index the fields in your query. In the case of array sub-documents, that means looking into multikey indexes.
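To illustrate what $elemMatch adds over plain dot notation, here is a plain JavaScript imitation of the matching rule. It is not MongoDB code, and matchesElem and matchesAnywhere are hypothetical names. $elemMatch requires a single element of sub to satisfy all conditions, whereas separate dot-notation conditions such as {'sub.features.size': 43, 'sub.features.color': 'red'} may each be satisfied by a different element:

```javascript
// Hypothetical sample documents with embedded sub-products.
var products = [
  { _id: 1, sub: [
    { features: { color: ["red"], size: 42 } },
    { features: { color: ["yellow"], size: 43 } }
  ] },
  { _id: 2, sub: [
    { features: { color: ["red"], size: 43 } }
  ] }
];

function isRed(s) { return s.features.color.indexOf("red") !== -1; }
function isSize43(s) { return s.features.size === 43; }

// Like $elemMatch: one sub-document must satisfy both conditions.
function matchesElem(product) {
  return product.sub.some(function(s) { return isRed(s) && isSize43(s); });
}

// Like separate dot-notation conditions: each condition may be met
// by a different sub-document.
function matchesAnywhere(product) {
  return product.sub.some(isRed) && product.sub.some(isSize43);
}

var strictIds = products.filter(matchesElem).map(function(p) { return p._id; });
var looseIds = products.filter(matchesAnywhere).map(function(p) { return p._id; });
console.log(strictIds); // [ 2 ]
console.log(looseIds);  // [ 1, 2 ]
```

Product 1 has a red item and a size 43 item, but they are different sub-products, so only the looser check matches it.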
To make better-informed decisions about DB models and your questions about queries in MongoDB, I recommend reading the official MongoDB data model design guide and the documentation for querying arrays, as well as some articles on normalization and on SQL vs NoSQL in terms of scaling/sharding, ACID and eventual consistency.

How to iterate through the following complex JSON data and access all its values, including nested ones?

I have the following JSON data, but I don't know how to iterate through it and read all its values:
var students = {
"student1": {
"first_name": "John",
"last_name": "Smith",
"age": 24,
"subject": [{
"name": "IT",
"marks": 85
},
{
"name": "Maths",
"marks": 75
},
{
"name": "English",
"marks": 60
}
]
},
"student2": {
"first_name": "David",
"last_name": "Silva",
"age": 22,
"subject": [{
"name": "IT",
"marks": 85
},
{
"name": "Maths",
"marks": 75
},
{
"name": "English",
"marks": 60
}
]
}
};
I would like to use the following methods:
Using a for...in loop
Using a simple for loop
Using $.each in jQuery
I would prefer to display the values nicely formatted in <ul> <li> elements.
Also, please suggest what the above JSON data would look like if I put it in an external .json file.
You can use a for...in loop to iterate over the object. It iterates over the properties of an object in an arbitrary order, and you need to check .hasOwnProperty unless you also want inherited properties to be shown.
Now, about accessing the object: let's say I have an object like
var myJson={name:"john",age:22,email:"email#domain.com"};
and I need to access the value of name. I would simply use the . operator on the myJson variable, i.e. console.log(myJson.name) will output john, because myJson is treated as an object. Now if I make a little change and wrap the object in an array, like below
var myJson=[{name:"john",age:22,email:"email#domain.com"}];
and you try to access the value of the name property with the same statement, you will get undefined, because the [] makes it an array containing 1 person (a JSON array). If you access it like console.log(myJson[0].name), it will print john in the console. What if there is more than one person in the array? Then it will look like the following
var myJson=[
{name:"john",age:22,email:"john#domain.com"},
{name:"nash",age:25,email:"nash#domain.com"}
];
console.log(myJson[0].name) will print john and console.log(myJson[1].name) will print nash. As I mentioned at the start, you can use a for...in loop for iterating over an object, and if we want to print the details of every person in the JSON array it will look like this:
for(var person in myJson){
console.log(myJson[person].name, myJson[person].age, myJson[person].email);
}
it will output in the console like below
john 22 john#domain.com
nash 25 nash#domain.com
I have tried to keep it simple so that you can follow along; you can look into for...in and hasOwnProperty for more detail. In your case you have a nested object in which the property/key subject is an array. So if I want to access the first_name of student1, I write students.student1.first_name, and if I want to print the name of the first subject of student1, I write students.student1.subject[0].name
Below is a sample script that prints all the students along with their personal information, subjects and marks. Since your JSON is nested, I am using a nested for...in loop. Nested iterations are not necessarily a bad thing, and many well-known algorithms rely on them, but you have to be careful about what you execute inside the nested loops.
For the sake of understanding, I am using the same JSON object from the question to make a snippet. Hope it helps you out.
var students = {
"student1": {
"first_name": "John",
"last_name": "Smith",
"age": 24,
"subject": [{
"name": "IT",
"marks": 85
},
{
"name": "Maths",
"marks": 75
},
{
"name": "English",
"marks": 60
}
]
},
"student2": {
"first_name": "David",
"last_name": "Silva",
"age": 22,
"subject": [{
"name": "IT",
"marks": 85
},
{
"name": "Maths",
"marks": 75
},
{
"name": "English",
"marks": 60
}
]
}
};
$("#print").on('click', function() {
for (var student in students) {
console.log(students[student].first_name + '-' + students[student].last_name);
for (var subject in students[student].subject) {
console.log(students[student].subject[subject].name, students[student].subject[subject].marks);
}
}
setTimeout(function() { console.clear(); }, 5000); // pass a function instead of a string to avoid eval
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<input type="button" id="print" value="print-now">
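The same traversal can also be written with Object.keys and plain for loops, building the <ul> <li> markup the question asks for. This is a minimal sketch: renderStudents is a hypothetical helper, the sample data is shortened, and values are inserted without HTML escaping.

```javascript
// Shortened version of the students object from the question.
var students = {
  student1: {
    first_name: "John", last_name: "Smith", age: 24,
    subject: [{ name: "IT", marks: 85 }, { name: "Maths", marks: 75 }]
  },
  student2: {
    first_name: "David", last_name: "Silva", age: 22,
    subject: [{ name: "IT", marks: 85 }]
  }
};

// Build a nested <ul> as a string: one <li> per student,
// with an inner <ul> listing that student's subjects.
function renderStudents(data) {
  var html = "<ul>";
  var ids = Object.keys(data); // own properties only, no hasOwnProperty check needed
  for (var i = 0; i < ids.length; i++) {
    var student = data[ids[i]];
    html += "<li>" + student.first_name + " " + student.last_name +
            " (" + student.age + ")<ul>";
    for (var j = 0; j < student.subject.length; j++) {
      html += "<li>" + student.subject[j].name + ": " +
              student.subject[j].marks + "</li>";
    }
    html += "</ul></li>";
  }
  return html + "</ul>";
}

console.log(renderStudents(students));
```

In a page, the returned string could be assigned to an element's innerHTML; with jQuery, $.each(students, function(id, student) { ... }) could replace the outer loop.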

Immutablejs: How to deserialise a complex JS object

If I receive data from a server in plain JSON that looks like this:
{
"f223dc3c-946f-4da3-8e77-e8c1fe4d241b": {
"name": "Dave",
"age": 16,
"jobs": [{
"description": "Sweep the floor",
"difficulty": 4
},{
"description": "Iron the washing",
"difficulty": 6
}]
},
"84af889a-8fc9-499b-a6ea-97e7a483130c": {
...
}
}
Do I need to loop through all the jobs and convert them to Maps, then convert each object's jobs into a List, then the entire thing into a Map?
Or does ImmutableJS do this all recursively for me?
Immutable.fromJS() is designed for exactly that: it recursively converts plain objects to Maps and arrays to Lists.
