How to make a dynamic search application in MarkLogic? - javascript

I am new to MarkLogic.
Is it possible to make a search application in such a way that when a user searches, he/she receives the URI links of the matching documents along with a short summary, and clicking on a URI link opens the full document? I also want to offer collection facets that further filter the records. Some of the fields I want to use as facets are present in documents of some collections but not in others; however, these collections do share a common unique field that can be used for joining or linking them.
I want to know how this is possible. How do we make collection facets? How do we make a join across different collections? How do we make the URI link clickable so it directs the user to the full document?
I want to answer questions like: show me all the maintenance documents that contain the word 'housekeeping', then click on the names of the locations (location info could be in a different collection) to further narrow the search, or click on the names of the employees who worked on these "housekeeping" jobs to narrow it further.
I built a search app just like the Top-Songs app from the MarkLogic tutorials, but that had just one collection and the same XML schema for all documents; now different collections and different XML schemas are confusing me. Please also tell me whether I should use the Search API or cts:search to achieve this. Is this achievable while keeping these collections separate, or do I need to denormalize them?
I would really appreciate your help.
Kind regards

I'd recommend having a look at slush-marklogic-node. It is a generator that creates a complete project for you, with a fairly full-featured search app. It comes with some JSON sample data and some example facets that work with it, but you can also upload other data and play with that, provided you put it in the 'data' collection.
It runs on a slightly outdated stack, unfortunately, but it is fairly stable and might give you good ideas on how to approach certain aspects. Once deployed properly, it should look like this:
http://slush-default.demo.marklogic.com/
Update:
Regarding facets on collections: the generated app comes with several example facets, of which the first is based on collections. It is driven by the faceting capabilities of the REST endpoint /v1/search, which in turn builds on top of search:search(). That function takes so-called search options that can define constraints. Here are two examples:
<!-- Facet based on document collections, simple yet elegant -->
<constraint name="Collection">
  <collection facet="true" />
  <!-- optionally enable a prefix to see a specific subset of facets
  <collection facet="true" prefix="data/" />
  -->
</constraint>

<!-- Example range facet based on the sample data -->
<constraint name="eyeColor">
  <range type="xs:string" facet="true" collation="http://marklogic.com/collation/codepoint">
    <facet-option>limit=5</facet-option>
    <facet-option>frequency-order</facet-option>
    <facet-option>descending</facet-option>
    <path-index>eyeColor</path-index>
  </range>
</constraint>
See also: https://github.com/marklogic-community/slush-marklogic-node/blob/master/app/templates/rest-api/config/options/all.xml#L105
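To give a rough idea of how a front end (or Node script) consumes that endpoint to get the clickable URI, a snippet per result, and the facet buckets, here is a minimal sketch. The host, port, credentials, options name ('all') and the use of basic authentication are all assumptions to adapt to your own REST app server:

// Minimal sketch: query /v1/search with the options above and list URI + snippets.
// Assumes Node 18+ (built-in fetch) and a MarkLogic REST app server on port 8040
// configured for basic authentication -- adjust host, port, user, and options name.
const auth = Buffer.from('rest-reader:password').toString('base64');

async function search(q) {
  const url = 'http://localhost:8040/v1/search?format=json&options=all' +
              '&q=' + encodeURIComponent(q);
  const res = await fetch(url, { headers: { Authorization: 'Basic ' + auth } });
  const body = await res.json();

  for (const result of body.results) {
    // result.uri is what you turn into a link; the full document can then be
    // fetched with GET /v1/documents?uri=<that uri>
    console.log(result.uri, result.matches);
  }
  // body.facets holds the Collection / eyeColor buckets defined in the options
  console.log(body.facets);
}

search('housekeeping');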
HTH!

Related

Auto generate sub database firestore

I have a Firestore collection with a bunch of documents, each with plenty of subfields. On a web page I need a list of specific subfields from each document.
Currently I load the entire database when the page loads and then loop through it to get the wanted values. This uses way too many reads to get very little data.
Is there a way to solve this, e.g. an auto-generated collection that contains fields from the other collection in an array, or something similar?
Many thanks in advance
Auto-creating such a subcollection with just the fields you need is a great way to reduce the bandwidth needed to load the data.
There is nothing built into Firestore to create those derived documents, but it's fairly easy to build something using Cloud Functions. Create a function that responds to a Firestore onWrite trigger, and write the subset of the data to its destination there. It's common to have a separate Cloud Function for each such use-case, and I regularly see projects with 100+ such functions.
I expect we'll also start seeing Firebase Extensions for this type of thing, but right now no-one seems to have built one.
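To make that more concrete, a minimal sketch of such a function could look like the following (1st-gen Cloud Functions for Firebase; the collection names products/productSummaries and the copied fields are made up for illustration):

// Keep a derived collection in sync with the source collection.
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

exports.syncProductSummary = functions.firestore
  .document('products/{productId}')
  .onWrite(async (change, context) => {
    const summaryRef = admin.firestore()
      .collection('productSummaries')
      .doc(context.params.productId);

    // Source document deleted: remove the derived copy as well.
    if (!change.after.exists) {
      return summaryRef.delete();
    }

    // Created or updated: write only the fields the list page needs.
    const data = change.after.data();
    return summaryRef.set({
      name: data.name,
      price: data.price,
    });
  });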

AWS most efficient way of finding datapoints with a particular tag (string value in a list)

I have a very newbie question on AWS. Let's say I am running a store that offers 1,000,000 different products. Each product has its own row in DynamoDB, in a table named products. Now I would like to attach a list of tags to each product, for example ['football', 'outdoor', 'sport', ...], so that when a customer searches for 'sport', products carrying that tag show up in the results.
I am thinking about the best way to approach this in order to offer fast but also cost-efficient searches. So far I have thought of two viable options:
Option 1: Include a tags field for each product that takes a list of tags.
'product_3' -> ['football', 'outdoor', 'sport']
Option 2: Create a new table where the key is each tag and includes a field that takes a list of products instead.
'sport' -> ['product_1', 'product_3', ...]
I am inclined to go with option 2 since it feels like it will give the faster search, but I want to double-check that I haven't made any wrong assumptions or missed some other, superior option.
It would also be great to have an infrastructure that works with word2vec, so that products related to the search word also show up even if they are not identical string values.
DynamoDB is a powerful and very useful database, but it is not designed for search. My suggestion would be to use the correct tool for the job.
The pattern that I've used successfully multiple times is to use DynamoDB Streams and Lambda to replicate a table to an Elasticsearch index.
You can then keep that string set of tags on each item in DynamoDB and manage them there. Your nominal reads, where you know the item's hash key, can still be done against DynamoDB. When you want to search, you hit Elasticsearch and get all the benefit and flexibility it provides for searching. One of those benefits is really good pagination compared to DynamoDB's API, as well as the ability to sort on other attributes.
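As a rough illustration of that pipeline, a Lambda attached to the table's stream could look something like this (Node.js with aws-sdk v2 and the Elasticsearch JS client 7.x; the index name, the "id" key, and how the cluster is reached and authenticated are assumptions to adapt):

// Replicate DynamoDB Stream records into an Elasticsearch index.
const { Client } = require('@elastic/elasticsearch');
const AWS = require('aws-sdk');

const es = new Client({ node: process.env.ES_ENDPOINT });

exports.handler = async (event) => {
  for (const record of event.Records) {
    const keys = AWS.DynamoDB.Converter.unmarshall(record.dynamodb.Keys);

    if (record.eventName === 'REMOVE') {
      await es.delete({ index: 'products', id: keys.id });
      continue;
    }

    // INSERT or MODIFY: index the full new image, tags included.
    const item = AWS.DynamoDB.Converter.unmarshall(record.dynamodb.NewImage);
    await es.index({ index: 'products', id: keys.id, body: item });
  }
};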

GatsbyJS, linking related items from CMS data using GraphQL

I am more of a beginner with GraphQL and a bit confused between learning the basics and how Gatsby uses it to import all the data for its static-site needs. I wanted to know what best practice might be in my situation.
I have two content types set up in GraphCMS. {Content type A} links to {Content type B} in a one-to-many relationship, and in my site I have a few places where I want a list of A that shows the links to B in them, and vice versa.
I would love any ideas on the best way to make this data available when it is imported from GraphCMS at build startup / page generation. I could just reference all fields twice (using fragments?) in the config's query. Or I could query the lists of A and B separately, keep a field reserved for their ids, and then in gatsby-node attach the linked items by matching those ids (roughly sketched below). Alternatively it could be done on the client side, but I imagine I should take full advantage of the generator tools?
Any and all help is greatly appreciated.
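For reference, the gatsby-node route I have in mind would look roughly like this (the type names GraphCMS_TypeA/GraphCMS_TypeB and the foreign-key field "bIds" are placeholders; gatsby-source-graphcms may already expose the relation, in which case none of this is needed):

// gatsby-node.js -- link stored ids to full nodes at build time via schema customization.
exports.createSchemaCustomization = ({ actions }) => {
  actions.createTypes(`
    type GraphCMS_TypeA implements Node {
      # Resolve the stored ids into full TypeB nodes, queryable from page queries
      linkedB: [GraphCMS_TypeB] @link(by: "remoteId", from: "bIds")
    }
  `);
};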

Search engine (elastic search + meteor): Is javascript array manipulation inefficient for arrays containing up to thousands of results?

I am working on a project in Meteor which uses ElasticSearch as a search engine. I need the search feature on the site to allow 'stacking' searches. So, for instance, one can search for a file that a user in a certain 'group' uploaded by 'stacking' the user's name, followed by the group name, and ending with the file name or some content in the file.
Now, in the MongoDB database the groups, users, and files are stored in separate collections and related to each other through ids. However, ElasticSearch uses a distributed datastore where everything is 'flat'. This makes it necessary to denormalize the data, do application-side joins, etc. (https://www.elastic.co/guide/en/elasticsearch/guide/current/relations.html).
My question is: which method would be the best...
Denormalize the data, use nested objects, etc.
--> So, when rivering data to the Elasticsearch datastore, I would make copies of the data and replace every parent element with a new one which has the data added to it.
FOR EXAMPLE: if someone comments on, let's say, a post in a group, the server would have to add the comment to the general list of comments + find the post object, append the comment to it, and re-add the post object to the database + update the group object which contains the post object which should now contain the comment + do the same for a user object (since I want to be able to stack searches on groups, users, etc.).
Basically, whenever something is added or deleted, I'd have to update every object in the database that relates to it.
Run multiple Elasticsearch queries (https://www.elastic.co/guide/en/elasticsearch/guide/current/application-joins.html) to retrieve the data I want (roughly sketched below).
Just perform search queries on each decentralized collection, and use JavaScript on the server side to compare the arrays and produce the search results.
** Note: this is for scaling up to a relatively mid-level load/usage, so around hundreds to thousands of instances of data to search through. Although, if this can work at a larger scale (millions), that would be great!
Please correct me if my understanding of anything is wrong, and thank you for reading through all this!
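To make the application-side join (option 2) concrete, it would look roughly like this; the index names, fields, and the group/file split are guesses based on my setup, and the code assumes the 7.x Elasticsearch JS client (responses wrapped in .body):

// Two-step "join": resolve group ids first, then search files filtered by them.
const { Client } = require('@elastic/elasticsearch');
const es = new Client({ node: 'http://localhost:9200' });

async function filesInGroup(groupName, fileQuery) {
  // 1. Resolve the group name to matching group ids.
  const groups = await es.search({
    index: 'groups',
    body: { query: { match: { name: groupName } } },
  });
  const groupIds = groups.body.hits.hits.map(h => h._id);

  // 2. Search files constrained to those group ids -- the "join" happens here.
  const files = await es.search({
    index: 'files',
    body: {
      query: {
        bool: {
          must: { match: { content: fileQuery } },
          filter: { terms: { groupId: groupIds } },
        },
      },
    },
  });
  return files.body.hits.hits;
}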

Controller and View for creating one-to-many object, both "container" and unlimited number of "content" objects?

Users will be able to write documents. Those documents will consist of chapters (a one-to-many relation).
Normally I would do this by creating separate views for creating a chapter and a document.
How do I implement a web page that allows editing such a "composite" view, where I can edit the document details but also create chapters without visiting different pages? Also, how can I ensure that I pass along the order of chapters the user has arranged (by moving chapters freely up and down)?
(Sorry if that question has already been asked & answered, but I do not even know how to search for it, since I do not know the proper keywords beyond "AJAX", so help in naming my requirement would also be welcome!)
Backend server applications based on REST principles work nicely with Ajax client-side implementations.
For example, your URLs could be:
/book/1
/book/1/chapters
/book/1/chapter/1
You could set it up so that a POST to /book/1/chapters would add a chapter. A GET on that same URL would return all chapters. A GET on /book/1/chapter/1 would return only chapter 1. A PUT on /book/1/chapter/1 would update an existing chapter. This is a "RESTful" architecture:
http://en.wikipedia.org/wiki/Representational_state_transfer
This is an interesting introduction: http://tomayko.com/writings/rest-to-my-wife
This is a big subject, but if you create the right backend server architecture you will find your job a lot easier. Hope this helps answer your question.
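On the client side, the Ajax calls against such a backend could look roughly like this (plain fetch; the "title" and "position" fields are just one example of how you might send the chapter order the user arranged):

// Add a chapter to a book via the REST backend sketched above.
async function addChapter(bookId, title) {
  const res = await fetch(`/book/${bookId}/chapters`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ title }),
  });
  return res.json(); // the created chapter, including its id
}

// Persist the order the user arranged by sending each chapter's new position.
async function saveChapterOrder(bookId, orderedChapterIds) {
  await Promise.all(orderedChapterIds.map((id, index) =>
    fetch(`/book/${bookId}/chapter/${id}`, {
      method: 'PUT',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ position: index + 1 }),
    })
  ));
}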
OK, partial solution:
Just Google "Nested Forms Ruby on Rails". Plenty of examples, all in Ajax, all easy.
