Lucene-like searching through JSON objects in JavaScript

Lucene-like searching through JSON objects in JavaScript - javascript

I have a pretty big array of JSON objects (its a music library with properties like artist, album etc, feeding a jqgrid with loadonce=true) and I want to implement lucene-like (google-like) query through whole set - but locally, i.e. in the browser, without communication with web server. Are there any javascript frameworks that will help me?

Go through your records, to create a one time index by combining all search
able fields in a single string field called index.
Store these indexed records in an Array.
Partition the Array on index .. like all a's in one array and so on.
Use the javascript function indexOf() against the index to match the query entered by the user and find records from the partitioned Array.
That was the easy part but, it will support all simple queries in a very efficient manner because the index does not have to be re-created for every query and indexOf operation is very efficient. I have used it for searching up to 2000 records. I used a pre-sorted Array. Actually, that's how Gmail and yahoo mail work. They store your contacts on browser in a pre-sorted array with an index that allows you to see the contact names as you type.
This also gives you a base to build on. Now you can write an advanced query parsing logic on top of it. For example, to support a few simple conditional keywords like - AND OR NOT, will take about 20-30 lines of custom JavaScript code. Or you can find a JS library that will do the parsing for you the way Lucene does.
For a reference implementation of above logic, take a look at how ZmContactList.js sorts and searches the contacts for autocomplete.

You might want to check FullProof, it does exactly that:
https://github.com/reyesr/fullproof

Have you tried CouchDB?
Edit:
How about something along these lines (also see http://jsfiddle.net/7tV3A/1/):
var filtered_collection = [];
var query = 'foo';
$.each(collection, function(i,e){
$.each(e, function(ii, el){
if (el == query) {
filtered_collection.push(e);
}
});
});
The (el == query) part of course could/should be modified to allow more flexible search patterns than exact match.

Related

Is there mongodb query which will insert document if field is unique otherwise execute custom function

Trying to create an activation code which should be unique, but it only consists of specific characters.
So, this is solution which i build
function findByActivationId() {
return Activation
.findOne({activationId})
.lean()
.exec();
}
let activationId = buildActivationId();
while (await findByActivationId(activationId)) {
activationId = buildActivationId();
}
This makes too many db calls, is there any better way to make query to mongodb?

Well, the major problem of checking if key is unique is based on how you are creating those.
Choose the best way for you to avoid bunch of problems later.
Your own generated string as a key
Well, you can do this but it's important to understand few disclaimers
If you want to generate your own key by the code and then compare if it is unique
in the database with all other currently created it can be done. Just create key by your
algorithm then select all keys from db and check if array of selected rows contains this freshly created string
Problems of this solution
As we can see we need to select all keys from DB and then compare each one to freshly created one. Problem can appear when your database is storing big amount of data. Every time application have to "download" big amount of data and then compare it to new one so in addition this might produce some freezes.
But if you are sure that your database will store not that much amount of unique rows, it is cool to work with.
Then it is important to create those keys properly. Now we talking about complexity, more symbols key is created from, harder to get same ones.
Shall we take a look at this example?
If you are creating keys based on letters a-z and numbers 1-9
and the length of key is for example 5, the complexity of this key is 35^5
which generates more than 52 milions possibilities.
Same keys can be generated but it is like a win on a lottery, almost impossible
And then you can just check if generated key is really unique, if not. (oh cmon) Repeat.
Other ways
Use mongodb _id which is always unique
Use UNIX timestamp to create unique key

Parse.com relations count

I want to query object from Parse DB through javascript, that has only 1 of some specific relation object. How can this criteria be achieved?
So I tried something like this, the equalTo() acts as a "contains" and it's not what I'm looking for, my code so far, which doesn't work:
var query = new Parse.Query("Item");
query.equalTo("relatedItems", someItem);
query.lessThan("relatedItems", 2);

It seems Parse do not provide a easy way to do this.
Without any other fields, if you know all the items then you could do the following:
var innerQuery = new Parse.Query('Item');
innerQuery.containedIn('relatedItems', [all items except someItem]);
var query = new Parse.Query('Item');
query.equalTo('relatedItems', someItem);
query.doesNotMatchKeyInQuery('objectId', 'objectId', innerQuery);
...
Otherwise, you might need to get all records and do filtering.
Update
Because of the data type relation, there are no ways to include the relation content into the results, you need to do another query to get the relation content.
The workaround might add a itemCount column and keep it updated whenever the item relation is modified and do:
query.equalTo('relatedItems', someItem);
query.equalTo('itemCount', 1);

There are a couple of ways you could do this.
I'm working on a project now where I have cells composed of users.
I currently have an afterSave trigger that does this:
const count = await cell.relation("members").query().count();
cell.put("memberCount",count);
This works pretty well.
There are other ways that I've considered in theory, but I've not used
them yet.
The right way would be to hack the ability to use select with dot
notation to grab a virtual field called relatedItems.length in the
query, but that would probably only work for me because I use PostGres
... mongo seems to be extremely limited in its ability to do this sort
of thing, which is why I would never make a database out of blobs of
json in the first place.
You could do a similar thing with an afterFind trigger. I'm experimenting with that now. I'm not sure if it will confuse
parse to get an attribute back which does not exist in its schema, but
I'll find out, by the end of today. I have found that if I jam an artificial attribute into the objects in the trigger, they are returned
along with the other data. What I'm not sure about is whether Parse will decide that the object is dirty, or, worse, decide that I'm creating a new attribute and store it to the database ... which could be filtered out with a beforeSave trigger, but not until after the data had all been sent to the cloud.
There is also a place where i had to do several queries from several
tables, and would have ended up with a lot of redundant data. So I wrote a cloud function which did the queries, and then returned a couple of lists of objects, and a few lists of objectId strings which
served as indexes. This worked pretty well for me. And tracking the
last load time and sending it back when I needed up update my data allowed me to limit myself to objects which had changed since my last query.

How to handle indices in Neo4j server via javascript REST client?

I have data in a standalone Neo4j REST server, including an index of nodes. I want pure JavaScript client to connect to Neo4j and serve the formatted data to d3.js, a visualisation library built on Node.js.
JugglingDB is very popular, but the Neo4j implementation was done "wrong": https://github.com/1602/jugglingdb/issues/56
The next most popular option on github is: https://github.com/thingdom/node-neo4j
looking at the method definitions https://github.com/thingdom/node-neo4j/blob/develop/lib/GraphDatabase._coffee
I'm able to use "getNodeById: (id, _) ->"
> node1 = db.getNodeById(12, callback);
returns the output from the REST server, including node properties. Awesome.
I can't figure out how to use "getIndexedNodes: (index, property, value, _) ->"
> indexedNodes = db.getIndexedNodes:(index1, username, Homer, callback);
...
indexedNodes don't get defined. I've tried a few different combinations. No joy. How do I use this command?
Also, getIndexedNodes() requires a key-value pair. Is there any way to get all, or a subset of the items in the index without looping?

One of the authors/maintainers of node-neo4j here. =)
indexedNodes don't get defined. I've tried a few different combinations. No joy. How do I use this command?
Your example seems to have some syntax errors. Are index1, username and Homer variables defined elsewhere? Assuming not, i.e. assuming those are the actual index name, property name and value, they need to be quoted as string literals, e.g. 'index1', 'username' and 'Homer'. But you also have a colon right before the opening parenthesis that shouldn't be there. (That's what's causing the Node.js REPL to not understand your command.)
Then, note that indexedNodes should be undefined -- getIndexedNodes(), like most Node.js APIs, is asynchronous, so its return value is undefined. Hence the callback parameter.
You can see an example of how getIndexedNodes() is used in the sample node-neo4j-template app the README references:
https://github.com/aseemk/node-neo4j-template/blob/2012-03-01/models/user.js#L149-L160
Also, getIndexedNodes() requires a key-value pair. Is there any way to get all, or a subset of the items in the index without looping?
getIndexedNodes() does return all matching nodes, so there's no looping required. Getting a subset isn't supported by Neo4j's REST API directly, but you can achieve the result with Cypher.
E.g. to return the 6th-15th user (assuming they have a type property set to user) sorted alphabetically by username:
db.query([
'START node=node:index1(type="user")',
'RETURN node ORDER BY node.username',
'SKIP 5 LIMIT 10'
].join('\n'), callback);
Cypher is still rapidly evolving, though, so be sure to reference the documentation that matches the Neo4j version you're using.
As mentioned above, in general, take a look at the sample node-neo4j-template app. It covers a breadth of features that the library exposes and that a typical app would need.
Hope this helps. =)

Neo4j 2 lets you do indices VIA REST. Docs here
REST Indicies

Query using multiple conditions

I recently discovered (sadly) that WebSQL is no longer being supported for HTML5 and that IndexedDB will be replacing it instead.
I'm wondering if there is any way to query or search through the entries of an IndexedDB in a similar way to how I can use SQL to search for an entry satisfying multiple conditions.
I've seen that I can search through IndexedDB using one condition with the KeyRange. However, I can't seem to find any way to search two or more columns of data without grabbing all the data from the database and doing it with for loops.
I know this is a new feature that's barely implemented in the browsers, but I have a project that I'm starting and I'm researching the different ways I could do it.
Thank you!

Check out this answer to the same question. It is more detailed than the answer I give here. The keypath parameter to store.createIndex and IDBKeyRange methods can be an array. So, crude example:
// In onupgradeneeded
var store = db.createObjectStore('mystore');
store.createIndex('myindex', ['prop1','prop2'], {unique:false});
// In your query section
var transaction = db.transaction('mystore','readonly');
var store = transaction.objectStore('mystore');
var index = store.index('myindex');
// Select only those records where prop1=value1 and prop2=value2
var request = index.openCursor(IDBKeyRange.only([value1, value2]));
// Select the first matching record
var request = index.get(IDBKeyRange.only([value1, value2]));

Let's say your SQL Query is something like:
SELECT * FROM TableName WHERE Column1 = 'value1' AND Column2 = 'value2'
Equivalent Query in JsStore library:
var Connection = new JsStore.Instance("YourDbName");
Connection.select({
From: "YourTableName"
Where: {
Column1: 'value1',
Column2: 'value2'
},
OnSuccess:function (results){
console.log(results);
},
OnError:function (error) {
console.log(error);
}
});
Now, if you are wondering what JsStore is, let me tell you it is a library to query IndexedDB in a simplified manner. Click here to learn more about JsStore

I mention some suggestions for querying relationships in my answer to this question, which may be of interest:
Conceptual problems with IndexedDB (relationships etc.)
As to querying multiple fields at once, it doesn't look like there's a native way to do that in IndexedDB (I could be wrong; I'm still new to it), but you could certainly create a helper function that used a separate cursor for each field, and iterated over them to see which records met all the criteria.

Yes, opening continuous key range on an index is pretty much that is in indexedDB. Testing for multiple condition is not possible in IndexedDB. It must be done on cursor loop.
If you find the solution, please let me know.
BTW, i think looping cursor could be very fast and require less memory than possible with Sqlite.

I'm a couple of years late, but I'd just like to point out that Josh's answer works on the presumption that all of the "columns" in the condition are part of the index's keyPath.
If any of said "columns" exist outside the the index's keyPath, you will have to test the conditions involving them on each entry which the cursor created in the example iterates over. So if you're dealing with such queries, or your index isn't unique, be prepared to write some iteration code!
In any case, I suggest you check out BakedGoods if you can represent your query as a boolean expression.
For these types of operations, it will always open a cursor on the focal objectStore unless you're performing a strict equality query (x ===? y, given x is an objectStore or index key), but it will save you the trouble writing your own cursor iteration code:
bakedGoods.getAll({
filter: "keyObj > 5 && valueObj.someProperty !== 'someValue'",
storageTypes: ["indexedDB"],
complete: function(byStorageTypeResultDataObj, byStorageTypeErrorObj){}
});
Just for the sake of complete transparency, BakedGoods is maintained by moi.

javascript: array of object for simple localization

I need to implement a simple way to handle localization about weekdays' names, and I came up with the following structure:
var weekdaysLegend=new Array(
{'it-it':'Lunedì', 'en-us':'Monday'},
{'it-it':'Martedì', 'en-us':'Tuesday'},
{'it-it':'Mercoledì', 'en-us':'Wednesday'},
{'it-it':'Giovedì', 'en-us':'Thursday'},
{'it-it':'Venerdì', 'en-us':'Friday'},
{'it-it':'Sabato', 'en-us':'Saturday'},
{'it-it':'Domenica', 'en-us':'Sunday'}
);
I know I could implement something like an associative array (given the fact that I know that javascript does not provide associative arrays but objects with similar structure), but i need to iterate through the array using numeric indexes instead of labels.
So, I would like to handle this in a for cycle with particular values (like j-1 or indexes like that).
Is my structure correct? Provided a variable "lang" as one of the value between "it-it" or "en-us", I tried to print weekdaysLegend[j-1][lang] (or weekdaysLegend[j-1].lang, I think I tried everything!) but the results is [object Object]. Obviously I'm missing something..
Any idea?

The structure looks fine. You should be able to access values by:
weekdaysLegend[0]["en-us"]; // returns Monday
Of course this will also work for values in variables such as:
weekdaysLegend[i][lang];
for (var i = 0; i < weekdaysLegend.length; i++) {
alert(weekdaysLegend[i]["en-us"]);
}
This will alert the days of the week.
Sounds like you're doing everything correctly and the structure works for me as well.

Just a small note (I see the answer is already marked) as I am currently designing on a large application where I want to put locals into a javascript array.
Assumption: 1000 words x4 languages generates 'xx-xx' + the word itself...
Thats 1000 rows pr. language + the same 7 chars used for language alone = wasted bandwitdh...
the client/browser will have to PARSE THEM ALL before it can do any lookup in the arrays at all.
here is my approach:
Why not generate the javascript for one language at a time, if the user selects another language, just respond(send) the right javascript to the browser to include?
Either store a separate javascript with large array for each language OR use the language as parametre to the server-side script aka:
If the language file changes a lot or you need to minimize it per user/module, then its quite archivable with this approach as you can just add an extra parametre for "which part/module" to generate or a timestamp so the cache of the javascript file will work until changes occures.
if the dynamic approach is too performance heavy for the webserver, then publish/generate the files everytime there is a change/added a new locale - all you'll need is the "language linker" check in the top of the page, to check which language file to server the browser.
Conclusion
This approach will remove the overhead of a LOT of repeating "language" ID's if the locales list grows large.

You have to access an index from the array, and then a value by specifying a key from the object.
This works just fine for me: http://jsfiddle.net/98Sda/.
var day = 2;
var lang = 'en-us';
var weekdaysLegend = [
{'it-it':'Lunedì', 'en-us':'Monday'},
{'it-it':'Martedì', 'en-us':'Tuesday'},
{'it-it':'Mercoledì', 'en-us':'Wednesday'},
{'it-it':'Giovedì', 'en-us':'Thursday'},
{'it-it':'Venerdì', 'en-us':'Friday'},
{'it-it':'Sabato', 'en-us':'Saturday'},
{'it-it':'Domenica', 'en-us':'Sunday'}
];
alert(weekdaysLegend[day][lang]);

We Keep Coding

JavaScript is the programming language of the Web.