How to convert Keras weights to JSON? - javascript

(This problem probably involves both python keras and js.)
TensorFlow.js has model.loadWeights() for a Keras LayersModel, and the description is:
Loads all layer weights from a JSON object.
Porting Note: HDF5 weight files cannot be directly loaded in JavaScript / TypeScript. The utility script at scripts/pykeras.py offers means to convert them into JSON strings compatible with this method. Porting Note: TensorFlow.js Layers supports only loading by name currently.
@param weights
A JSON mapping weight names to weight values as nested arrays of numbers, or a NamedTensorMap, i.e., a JSON mapping weight names to tf.Tensor objects.
I have a hard time understanding what this means and how to convert the weights to JSON. They link a scripts/pykeras.py file but not where it can be found (it is not in node_modules).
Any help?

Sometimes you may find only the weights for some models. These have modelTopology: null when you convert them using tensorflowjs_converter.
Needless to say, this step isn't necessary if your model topology is set to a meaningful value.
Yet if it is not too complicated a NN, it may be worth implementing it in JS or TS and loading the weights from an external source.
I did it for VGG16, but other models like the MobileNets are a lot more complex.
So this is how you can solve it when you have the model but want to load only the weights.
Get (load or write) the model (it must be a LayersModel, afaik).
Then load the weights only:
const { readFileSync } = require("fs");
const json = JSON.parse(readFileSync("path/to/model.json", "utf8")); // read as text, not as a Buffer
model.loadWeights(json.weightsManifest, false);
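For completeness, here is a minimal sketch of the whole flow, assuming a toy two-layer network and Node with @tensorflow/tfjs-node (the layer sizes, names and file path are placeholders; your topology must match the one the weights were exported from, since loading happens by name):

const tf = require("@tensorflow/tfjs-node");
const { readFileSync } = require("fs");

// Rebuild the topology by hand; layer names must match the names
// stored in the exported weights so that loading by name works.
const model = tf.sequential();
model.add(tf.layers.dense({ inputShape: [784], units: 128, activation: "relu", name: "dense_1" }));
model.add(tf.layers.dense({ units: 10, activation: "softmax", name: "dense_2" }));

// Load only the weights; the second argument relaxes strict matching,
// as in the snippet above.
const json = JSON.parse(readFileSync("path/to/model.json", "utf8"));
model.loadWeights(json.weightsManifest, false);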

Related

Format for multi-level json and data extraction method with javascript

My apologies, but I am not well versed in JSON. We are currently using a single-level JSON file for date information, one file per tour package. These are fetched via JS, then processed and inserted into the appropriate spots on the webpage. What we would like to do is combine all the tours' date information, plus some additional details, into a single JSON file that, once fetched, is cached in the browser for a few hours. Basically we end up with a local flat-file "database" with all tours for JS to access.
Doing the single-level JSON was fairly straightforward, but combining it into multiple levels is more daunting. I am wondering:
1) if there is a specific format for the data as outlined below?
2) how to use JS to extract the data from that format?
Each tour is designated by a numerical id and has a number of values. So should this first level be one or two levels (this is only the data concept, not JSON code):
tours -> tour_id, price1, price2, price3, duration, level, dates
tours -> tour_id -> price1, price2, price3, duration, level, dates
The dates value will have multiple dates, each with several values:
dates -> date1, date2, date3, date4, etc
each date has -> trip_code, start_date, end_date, price, spaces
The basic functionality will be: when the page is loaded, JS will read the tour value from the page, then find the appropriate tour within the JSON file. The general values will be extracted by one function and simply inserted into the page as-is using innerHTML. The date values will be used by a different function to build strings, and those strings will likewise be inserted into the page.
As I read through the available info, I find some folks use only braces, some use braces and brackets, various suggestions for extraction, etc. I appreciate any help towards which format / extraction method would be preferable. And by preferable, I mean whichever format / method puts the least workload on the browser. Having a slightly larger file size due to extra braces or brackets is fine if it reduces the JS overhead and speeds up the finished page.
While it is probably of no consequence to the answer, the json file will be built by PHP and saved as a static file on the server.
You can make the multi-level object in the form
multiLevel = { [tour_id]: yourOriginal1LevelObject }
For retrieving the tour you can use
const tour = multiLevel[tour_id]
or, using destructuring with a computed property key (plain {tour_id: tour} would look up the literal key "tour_id", not the value of the variable),
const { [tour_id]: tour } = multiLevel
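Putting it together, here is a minimal sketch of the two-level shape and lookup (all field names and values are invented for illustration):

// Hypothetical tours.json content, keyed by tour_id:
const tours = {
  "101": {
    price1: 999, price2: 1199, price3: 1399,
    duration: 7, level: "moderate",
    dates: [
      { trip_code: "T101A", start_date: "2024-05-01",
        end_date: "2024-05-08", price: 999, spaces: 12 }
    ]
  }
};

// Look up the tour the page refers to, then walk its dates:
const tourId = "101"; // in practice, read from the page
const tour = tours[tourId];
for (const d of tour.dates) {
  console.log(d.trip_code + ": " + d.start_date + " to " + d.end_date + ", " + d.spaces + " spaces");
}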

What is the difference between the .json and .js files used while coding three.js?

.json and .js files are used in three.js. They are both object formats. What is the difference between them? Do they require different loaders to load the object?
I was following this three.js example: http://mrdoob.github.io/three.js/examples/webgl_materials_cars.html. In that example a .js file is used for the object, and BinaryLoader is used for it. But when I did the same, i.e. used a .js file for my object with BinaryLoader, it did not work. It worked with JSONLoader. So I am wondering how one can recognize a .js or .json file and the right loader for the corresponding object.
.json and .js files are used in three.js. They are both object formats. What is the difference between them?
The .js extension indicates that the content of the file should be a script following JavaScript syntax, and so it is human readable.
The .json extension indicates that the content of the file should be a tree structure following JavaScript object syntax (a JavaScript object has nothing to do with 3D objects), and so it is also human readable. This structure is also valid under the .js extension; in other words, any valid JSON file is also a valid JS file.
Three.js loaders are, for the most part, file parsers. These loaders do not care about the file extension at all; it is ignored. The only thing that matters to the parser is the content of the file.
Do they require different loaders to load the object?
As far as I know, three.js is able to load multiple kinds of structures. Each kind has its own loader (and a loader contains one or more parsers).
The most basic one is JSONLoader. It requires a file with a specific JSON structure (data about materials, normals, positions, texture coordinates and so on; not everything is compulsory).
The example you provided uses BinaryLoader. This binary loader requires two files (as I understand it). The first file contains a JSON structure with the materials and the location of the other file (so a JSON parser is used to parse this file). The second file contains the buffers (data about normals, positions, texture coordinates...) and is a binary file. I have no idea what exact binary structure is used here. You see, this is a kind of hybrid, and if you provide the buffer data in the JSON structure, it won't be able to read it.
The last loader I have heard of is FBXLoader, which can read results created in Blender, for example. But I'm not sure whether that one is working.
In that example a .js file is used for the object, and BinaryLoader is used for it. But when I did the same, i.e. used a .js file for my object with BinaryLoader, it did not work.
I hope this is clear now. BinaryLoader expects two files, with JSON and binary structures. It ignores the file name, including the extension. If you create two files named blablabla.wtf and blabla.omg, but with the correct structures inside, it will work. I guess you had one file with a correct JSON structure; that will work with JSONLoader only.
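For illustration, loading a single file with a correct JSON structure looked roughly like this with the three.js API of that era (file name and material choice are placeholders):

// Sketch only: JSONLoader hands the parsed geometry and materials to a callback.
var loader = new THREE.JSONLoader();
loader.load("models/mymodel.js", function (geometry, materials) {
  var mesh = new THREE.Mesh(geometry, new THREE.MeshFaceMaterial(materials));
  scene.add(mesh);
});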
More about file loaders
There are three aspects we can talk about: parsing speed, file size and maintainability.
Parsing speed is more important if you want to download more and more data on the run.
File size is more important if the current size is going to break some limit (one that should not or cannot be broken).
Maintainability is more important if you need to change the file content a lot.
Binary formats are better for file size and parsing speed. But major browsers use gzip/bzip compression, which makes JSON files almost the same size as binary ones. Maintainability should always be the most important aspect. JSON structures are very easy to maintain and debug. FBX or other binary formats are better for big and robust projects with a lot of assets.
EDIT:
I'm afraid I will have to explain a little bit more...
Let's talk about the entire concept for a while. Let's say we have an empty world and we would like to put two models inside: a simple cube and some animal.
There are three basic ways to do that: generate it procedurally, use external data, or a hybrid (part is procedural, part is external data).
Procedural or hybrid might be, for example, a sea with waves.
Procedural generation is done by some algorithm in the program, while external data must be inserted with some program tool, the loader.
Examine the cube and the animal now. The cube is just a simple object made of 6 planes. It can't move, breathe, eat, anything; it just exists. On the other hand, the animal is much more complicated: it won't just sit in the middle without moving. All these things will be part of the external data (file or files).
I provided two very different things, but it is important to know that even the simplest things are complicated in 3D and can be manipulated in different ways. For example, what color does that cube have? Does each plane have the same color? Is it shiny? Can it reflect?
The main thing is what kind of description the loader can accept, read and understand. First you must know everything about the loader, and then you can create an object.
Here is an example of what structures JSONLoader can accept:
https://github.com/mrdoob/three.js/wiki/JSON-Geometry-format-4
https://github.com/mrdoob/three.js/wiki/JSON-Material-format-4
For example, if "metadata" contains "type": "Geometry", then loader will look for "indices", "vertices", "normals" and "uvs". Some parts might be optional, like "uvs".
Simple cube can be assembled only from vertices, but it is probably not what this particular loader knows and even if your structure does make sence, it might be unknown for the loader.
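To make that concrete, here is a heavily trimmed, hypothetical fragment of such a geometry description, following the fields named above (values are placeholders; the wiki pages linked above define the authoritative format):

// Hypothetical fragment of a JSONLoader geometry file:
{
  "metadata": { "type": "Geometry" },
  "vertices": [-1, -1, -1,  1, -1, -1,  1, 1, -1],
  "normals":  [ 0,  0, -1,  0,  0, -1,  0, 0, -1],
  "uvs":      [],
  "indices":  [0, 1, 2]
}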
Binary loaders are very different, because there are no words in binary code, just 0s and 1s. So you need a lot of metadata to specify what exactly is inside. The metadata can be part of the same file or of some different file. But it again depends on where the loader will look for it.
Could you please tell me what you mean by JSON structure?
Usually I mean the structure readable by a specific loader.
I am guessing that it might be the content that the .js file has.
In the case of the example you provided, yes, in this file there is a JSON structure:
http://mrdoob.github.io/three.js/examples/obj/veyron/VeyronNoUv_bin.js
The .js file content is different when it is used with BinaryLoader; as you have mentioned, it contains buffers.
To be more precise, it does not contain buffers. It contains the keyword "buffers", leading us to the file "VeyronNoUv_bin.bin", where the data for the buffers are.
It also contains additional important information related to "VeyronNoUv_bin.bin" (how many vertices, normals, etc.). So you could say the .js file contains metadata for itself and metadata for the related binary file.
Data about vertices, normals, etc. are later loaded into buffers in the program; this is why they chose the keyword "buffers". A more precise name would be dataForBuffers.
And when it is used with JSONLoader, it contains a long list of vertices. Am I understanding right?
Exactly! When JSONLoader is used, the long list of vertices etc. is read and then loaded into buffers.

StarDict support for JavaScript and a Firefox OS App

I wrote a dictionary app in the spirit of GoldenDict (www.goldendict.org, also see Google Play Store for more information) for Firefox OS: http://tuxor1337.github.io/firedict and https://marketplace.firefox.com/app/firedict
Since apps for Firefox OS are based on HTML, CSS and JavaScript (WebAPI etc.), I had to write everything from scratch. At first, I wrote a basic library for synchronous and asynchronous access to StarDict dictionaries in JavaScript: https://github.com/tuxor1337/stardict.js
Although the app can be called stable by now, overall performance is still a bit sluggish. For some dictionaries, I have a list of words of almost 1,000,000 entries! That's huge. Indexing takes a really long time (up to several minutes per dictionary), and so does lookup. At the moment, the words are stored in an IndexedDB object store. Is there an alternative? With the current solution (words accessed and inserted using binary search) the overall experience is pretty slow. Maybe it would become faster if there were locale-aware sort support in IndexedDB... Actually, I'm not even storing the terms themselves in the DB, but only their offsets in the *.syn/*.idx file. I hope to save some memory doing that. But of course I'm not able to use any IDB sorting functionality with this configuration...
Maybe it's not the best idea to do the sorting in memory, because now the app is killed by the kernel due to an OOM on some devices (e.g. the ZTE Open). A dictionary with more than 500,000 entries will definitely exceed 100 MB in memory. (That's only 200 bytes per entry, and if you suppose the keyword strings are UTF-8, you'll exceed 100 MB immediately...)
Feel free to contribute directly to the project on GitHub. Otherwise, I would be glad to hear your advice concerning the above issues.
I am working on a pure JavaScript implementation of an MDict parser (https://github.com/fengdh/mdict-js), similar to your StarDict project. MDict is another popular dictionary format with rich content (embedded images/audio/CSS etc.), which is widely supported on Windows/Linux/iOS/Android/Windows Phone. I have some ideas to share, and hope you can apply them to improve stardict.js in the future.
An MDict dictionary file (mdx/mdd) divides keywords and records into (optionally compressed) blocks, each containing around 2000 entries, and also provides a keyword block index table and a record block index table to enable quick look-up. Because of this compact data structure, I could implement my MDict parser to scan directly on the dictionary file with a small pre-loaded index table and no need for IndexedDB.
Each keyword block index looks like:
{
  num_entries: ..,
  first_word: ..,
  last_word: ..,
  comp_size: ..,   // compressed size
  decomp_size: .., // size after decompression
  offset: ..,      // offset in the mdx file
  index: ..
}
In a keyword block, each entry is a pair of [keyword, offset].
Each record block index looks like:
{
  comp_size: ..,   // compressed size
  decomp_size: .., // size after decompression
}
Given a word, use binary search to locate the keyword block that may contain it (a sketch of this step follows below).
Slice the keyword block, load all keys in it, filter out the matching one, and get its record offset.
Use binary search to locate the record block containing the word's record.
Slice the record block and retrieve its record (a definition in text, or a resource in an ArrayBuffer) directly.
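Here is a minimal sketch of that first binary-search step, assuming a hypothetical keywordBlockIndex array pre-loaded with entries shaped like the structure above (plain string comparison stands in for MDict's real collation rules):

// Find the keyword block whose [first_word, last_word] range covers `word`.
function findKeywordBlock(keywordBlockIndex, word) {
  let lo = 0, hi = keywordBlockIndex.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const block = keywordBlockIndex[mid];
    if (word < block.first_word) hi = mid - 1;
    else if (word > block.last_word) lo = mid + 1;
    else return block; // word falls inside this block's range
  }
  return null; // not in the dictionary
}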
Since each block contains only around 2000 entries, it is fast enough to look up a word among 100K–1M dictionary entries within 100 ms, quite a decent value for human interaction. mdict-js parses the file header only, so it is super fast and has low memory usage.
In the same way, it is possible to retrieve a list of neighboring words for a given phrase, even with wildcards.
Please take a look at my online demo here: http://fengdh.github.io/mdict-js/
(You have to choose a local MDict dictionary: an mdx file plus an optional mdd file.)

JSON diff of large JSON data, finding some JSON as a subset of another JSON

I have a problem I'd like to solve, the alternative being a lot of manual analysis work.
I have two JSON objects (returned from different web service APIs or HTTP responses). There is intersecting data between the two JSON objects, and they share a similar, but not identical, JSON structure. One JSON object (the smaller one) is roughly a subset of the bigger one.
I want to find all the intersecting data between the two objects. Actually, I'm more interested in the shared parameters/properties within the objects, not so much the actual values of the parameters/properties of each object, because I want to eventually use data from one JSON output to construct the other JSON as input to an API call. Unfortunately, I don't have the documentation that defines the JSON for each API. :(
What makes this tougher is that the JSON objects are huge. One spans a page if you print it out via Windows Notepad; the other spans 37 pages. The APIs return the JSON output compressed as a single line. Normal text compare doesn't do much; I'd have to reformat manually, or with a script, to break up the objects with newlines etc. for a text compare to work well. I tried with the Beyond Compare tool.
I could do a manual search/grep, but that's a pain to cycle through all the parameters inside the smaller JSON. I could write code to do it, but I'd also have to spend time on that, and test whether the code works. Or maybe there's some ready-made code for that already...
Or I can look for JSON-diff-type tools. I searched for some and came across these:
https://github.com/samsonjs/json-diff or https://tlrobinson.net/projects/javascript-fun/jsondiff
https://github.com/andreyvit/json-diff
Both failed to do what I wanted, presumably because the JSON is either too complex or too large to process.
Any thoughts on the best solution? Or might the best solution for now be manual analysis with grep for each parameter/property?
In terms of a code solution, any language will do. I just need a parser or diff tool that will do what I want.
Sorry, I can't share the JSON data structure with you either; it may be considered confidential.
Beyond Compare works well if you set up a JSON file format in it that uses Python to pretty-print the JSON. Sample setup for Windows:
Install Python 2.7.
In Beyond Compare, go under Tools, under File Formats.
Click New. Choose Text Format. Enter "JSON" as a name.
Under the General tab:
Mask: *.json
Under the Conversion tab:
Conversion: External program (Unicode filenames)
Loading: c:\Python27\python.exe -m json.tool %s %t
Note that the second parameter in the command line must be %t; if you enter two %s's you will suffer data loss.
Click Save.
Jeremy Simmons has created a better file format package for Beyond Compare, posted on the forum as "JsonFileFormat.bcpkg", that does not require Python or the like to be installed.
Just download the file and open it with BC and you are good to go. So it's much simpler.
JSON File Format
I needed a file format for JSON files.
I wanted to pretty-print & sort my JSON to make comparison easy.
I have attached my bcpackage with my completed JSON File Format.
The formatting is done via jq - http://stedolan.github.io/jq/
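(For reference, the corresponding command line, assuming jq is on your PATH, would be something like jq -S . input.json, where -S sorts the object keys; two files formatted this way can then be diffed with any text comparer.)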
Props to Stephen Dolan for the utility: https://github.com/stedolan.
I have sent a message to the folks at Scooter Software asking them to include it on the page with additional formats. If you're interested in seeing it on there, I'm sure a quick reply to the thread with an up-vote would help them see the value of posting it.
Attached file: JsonFileFormat.bcpkg (449.8 KB)
I have a small GPL project that would do the trick for simple JSON. I have not added support for nested entities, as it is more of a simple ObjectDB solution and not actually JSON (despite the fact that it was clearly inspired by it).
Long and short, the API is pretty simple. Make a new group, populate it, and then pull a subset via whatever logical parameters you need.
https://github.com/danielbchapman/groups
The API is used basically like this:
SubGroup items = group
    .notEqual("field", "value")
    .lessThan("field2", 50); // ...etc...
There's actually support for basic unions and joins, which would do pretty much what you want.
Long and short, you probably want a Set as your data type. Considering your comparisons are probably complex, you need a more complex set of methods.
My only caution is that it is GPL. If your data is confidential, odds are you may not be interested in that license.

Passing a large dataset to the client - Javascript arrays or JSON?

I'm passing a table of up to 1000 rows, consisting of name, ID, latitude and longitude values, to the client.
The list will then be processed by Javascript and converted to markers on a Google map.
I initially planned to do this with JSON, as I want the code to be readable and easy to deal with, and because we may be adding more structure to it over time.
However, my colleague suggested passing it down as a Javascript array, as it would reduce the size greatly.
This made me think: maybe JSON is a bit redundant. After all, for each row defined, the name of each field is also output repetitively. Whereas, for an array, the position of the cells is used to indicate the field.
However, would there really be a performance improvement by using an array?
The site uses GZIP compression. Is this compression effective enough to take care of any redundancy found in a JSON string?
[edit]
I realize JSON is just a notation.
But my real question is - what notation is best, performance-wise?
If I use fully named attributes, then I can have code like this:
var x = resultset.rows[0].name;
Whereas if I don't, it will look less readable, like so:
var x = resultset.rows[0][2];
My question is - would the sacrifice in code readability be worth it for the performance gains? Or not?
Further notes:
According to Wikipedia, the Deflate compression algorithm (used by gzip) performs 'Duplicate string elimination'. http://en.wikipedia.org/wiki/DEFLATE#Duplicate_string_elimination
If this is correct, I have no reason to be concerned about any redundancy in JSON, as it's already been taken care of.
JSON is just a notation (JavaScript Object Notation), and includes JS arrays -- even though there is the word "object" in its name.
See its grammar at http://json.org/, which defines an array like this (quoting):
An array is an ordered collection of values. An array begins with [ (left bracket) and ends with ] (right bracket). Values are separated by , (comma).
This means this (taken from JSON Data Set Sample) would be valid JSON:
[ 100, 500, 300, 200, 400 ]
even though it doesn't include or declare any object at all.
In your case, I suppose you could use some array, storing data by position, and not by name.
If you are worried about size, you might want to "compress" that data on the server side yourself and decompress it on the client side -- but I wouldn't do that: it would mean you'd need more processing time/power on the client side...
I'd rather go with gzipping the page that contains the data: you'll have nothing to do, it's fully automatic, and it works just fine -- and the difference in size will probably not be noticeable.
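To make the trade-off concrete, here is a sketch of the two notations for the same hypothetical row (field names invented for illustration):

// Named attributes: larger payload, readable access.
var byName = { rows: [{ id: 17, name: "Depot", lat: 51.5, lng: -0.1 }] };
var a = byName.rows[0].name;

// Positional arrays: smaller payload, access by index.
var byPos = { rows: [[17, "Depot", 51.5, -0.1]] };
var b = byPos.rows[0][1];

After gzip, the repeated property names in the first form compress very well, which is why the size difference is usually smaller than it looks.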
I suggest using a simple CSV format. There is a nice article on the Flickr Development Blog where they talked about their experience with such a problem. But it would be best to try it on your own.
