Elasticsearch must and must_not not working as expected - javascript

I'm trying to run a query as shown below:
query = {
"filtered": {
"filter": {
"bool": {
"must_not": [{"term":{"text":"exclude_word"}}],
"must": [{"term":{"text":"include_word"}}]
}
},
"query": {
"dis_max": {
"queries": [
{"match": {"text": "search_term"}},
{"match": {"title": "search_term"}}
]
}
}
}
}
client.search(
{
index: 'docs',
body: {
from: start_page,
size: 10,
query: query,
highlight : {
fields : {
text : {
}
}
}
}
})
The purpose of this query is to search all documents for the "search_term", excluding documents whose text field contains "exclude_word" or does not contain "include_word".
My problem is that it doesn't seem to filter out the documents containing the exclude_word. Assume that "start_page" is a variable I'm updating and that it's working fine alongside a paging front end.
My text field is analyzed and it's in the following format:
"text": ["sentence 1", "sentence 2", "sentence 3"]
Also the client.search part is just an API for node.js that I'm using and it works fine and can be ignored.
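For reference, the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, where the same intent is written as a bool query with filter and must_not clauses. Below is a minimal sketch of that form, not a confirmed fix. It assumes a 5.0+ cluster and assumes the text field uses the default standard analyzer, which lowercases tokens at index time, while term clauses are not analyzed. buildQuery is a hypothetical helper name.

```javascript
// Hypothetical helper: builds a bool query expressing the same intent as the
// filtered query above (assumes Elasticsearch 5.0 or later).
function buildQuery(searchTerm, includeWord, excludeWord) {
  return {
    bool: {
      // Scoring part: best match across the text and title fields.
      must: [{
        dis_max: {
          queries: [
            { match: { text: searchTerm } },
            { match: { title: searchTerm } }
          ]
        }
      }],
      // Non-scoring part. term is not analyzed, so the words are lowercased
      // here on the assumption that the index stores lowercased tokens.
      filter: [{ term: { text: includeWord.toLowerCase() } }],
      must_not: [{ term: { text: excludeWord.toLowerCase() } }]
    }
  };
}

const query = buildQuery('search_term', 'include_word', 'Exclude_Word');
console.log(JSON.stringify(query, null, 2));
```

The returned object would take the place of the query variable passed to client.search.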

Related

How to search for specific word and exact match in ElasticSearch

Sample data for title:
actiontype test
booleanTest
test-demo
test_demo
Test new account object
sync accounts data test
Default mapping for title:
"title": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
I tried this search query:
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "test"
}
}
]
}
}
}
Here is my expectation:
With a specific word (e.g. test), it should return the following titles.
Expected:
actiontype test
booleanTest
test-demo
test_demo
Test new account object
sync accounts data test
But got:
actiontype test
test-demo
test_demo
Test new account object
sync accounts data test
With an exact match (e.g. sync accounts data test), it should return only that record (sync accounts data test), but I got all records that contain any of these words (sync, accounts, data, test).
What should I do to make this happen? Thanks.
I am not sure which ES version you're using, but the following should give you an idea.
Using your mapping, you can get all titles containing test, including booleanTest, using the query_string query type. E.g.:
GET {index-name}/{mapping}/_search
{
"query": {
"query_string": {
"default_field": "title",
"query": "*test*"
}
}
}
However, for this to work, make sure your title field uses an analyzer with a lowercase token filter (see the settings example below). Your current mapping may not work as is, since terms are compared case-sensitively: test != TEST unless tokens are lowercased.
There are other ways, if you're interested in how ES works. E.g. you can also match booleanTest in your match query by adding a custom nGram filter to your index settings. Something like:
{
"index": {
"settings": {
"index": {
"analysis": {
"filter": {
"nGram": {
"type": "nGram",
"min_gram": "2",
"max_gram": "20"
}
},
"analyzer": {
"ngram_analyzer": {
"filter": [
"lowercase",
"nGram"
],
"type": "custom",
"tokenizer": "standard"
}
}
}
}
}
}
}
NB: ngram_analyzer is just a name. You can call it whatever you like.
min_gram and max_gram: pick numbers that work for you.
Learn more about the n-gram filter, its pros and cons, here: N-GRAM
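To see why n-grams let a search for test find booleanTest, here is a minimal sketch in plain JavaScript that imitates lowercasing followed by n-gram generation on a single token. It only illustrates the idea and is not Elasticsearch's implementation; ngrams is a hypothetical helper name.

```javascript
// Hypothetical helper: lowercases the token, then emits every substring whose
// length is between minGram and maxGram (inclusive).
function ngrams(token, minGram, maxGram) {
  const lower = token.toLowerCase();
  const grams = [];
  for (let start = 0; start < lower.length; start++) {
    for (let len = minGram; len <= maxGram && start + len <= lower.length; len++) {
      grams.push(lower.slice(start, start + len));
    }
  }
  return grams;
}

const grams = ngrams('booleanTest', 2, 20);
// "test" is one of the generated grams, so a search for test can hit this title.
console.log(grams.includes('test'));
```

The trade-off is index size: every token is stored as many grams, which is why min_gram and max_gram are worth tuning.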
Then you can add the analyzer to your field mapping like,
{
"title": {
"type": "text",
"analyzer": "ngram_analyzer",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
Lastly, exact matches work on fields of type keyword. Your mapping already has the title.keyword sub-field, so you can use a term query on it to get the exact match:
GET {index-name}/{mapping}/_search
{
"query": {
"term": {
"title.keyword": {
"value": "sync accounts data test"
}
}
}
}
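From JavaScript, the choice between the two query types could be wrapped in a small function. This is only a sketch: titleQuery and its exact flag are hypothetical names, and it builds the request body without sending it.

```javascript
// Hypothetical helper: picks the query type based on whether an exact title is wanted.
function titleQuery(title, exact) {
  if (exact) {
    // term against the keyword sub-field: the whole stored string must be equal.
    return { query: { term: { 'title.keyword': { value: title } } } };
  }
  // match against the analyzed text field: any analyzed token may match.
  return { query: { match: { title: title } } };
}

console.log(JSON.stringify(titleQuery('sync accounts data test', true)));
console.log(JSON.stringify(titleQuery('test', false)));
```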
Also, you will want to read more about these solutions and pick the best one for your indexing setup and needs. There may be other ways to achieve what you need, but this should be a good start.

How to add a string variable to a JSON query in Javascript without adding extra escaping

I'm having an issue adding a string variable to a section of JSON: when I pass the full JSON request to the server, the string variable that I added always contains extra escaped quotes, and I don't know how to remove them on insertion.
I've tried using JSON.parse(queryString), but I then get an exception reporting an unexpected token.
The number of match queries I'm trying to add varies with the number of report filters applied, so I can't hard-code the match queries in the JSON code block as they're shown in the commented-out section.
Code sample:
var queryString = '{ "match": { "log.level": "Information" } }, { "match": { "metadata.log_event_category": "Open Id Connect" } }'
var request =
{
"size": 0,
"query": {
"bool": {
"must": [
// Need to insert a string variable here that would contain the match queries commented out below...
// queryString, // doesnt work, adding the var here results in extra escapes
// JSON.parse(queryString), doesnt work!
// If I manually write in the match queries below then everything goes through OK
//{ "match": { "log.level": "Information" } },
//{ "match": { "metadata.log_event_category": "Open Id Connect" } },
{
"range": {
"#timestamp": {
"gte": chartArray[0].dateTimeFrom,
"lte": chartArray[0].dateTimeTo
}
}
}
]
}
},
"aggs": {
"myAggName": {
"date_histogram": {
"field": "#timestamp",
[elasticIntervalType]: elasticIntervalUnits,
"format": chartArray[0].scaleLabelAttributes
}
},
}
}
Screenshot shows that I'm getting extra escape quotes from the match queries when sending to the server, but I need these to be removed:
Below is a screenshot of a working example but I'm trying to replicate building this equivalent request in javascript in order to pass to my controller and send off to Elastic Search.
It looks like you could just add square brackets around the queryString to make JSON.parse work. Try:
var queryString = '{ "match": { "log.level": "Information" } }, { "match": { "metadata.log_event_category": "Open Id Connect" } }'
const must = JSON.parse("[" + queryString + "]");
must.push({
"range": {
"#timestamp": {
"gte": chartArray[0].dateTimeFrom,
"lte": chartArray[0].dateTimeTo
}
}
});
var request =
{
"size": 0,
"query": {
"bool": {
"must": must
}
},
"aggs": { /* etc. */ }
};
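A short, self-contained check of why the brackets matter: two objects separated by a comma are not a single JSON value, so JSON.parse rejects the bare string, while the bracketed string is a valid JSON array. The variable names bareError and clauses are only for illustration.

```javascript
const queryString = '{ "match": { "log.level": "Information" } }, { "match": { "metadata.log_event_category": "Open Id Connect" } }';

// Parsing the bare string fails because of the comma after the first object.
let bareError = null;
try {
  JSON.parse(queryString);
} catch (err) {
  bareError = err;
}

// Wrapped in brackets, the same text parses as an array of two clauses.
const clauses = JSON.parse('[' + queryString + ']');

console.log(bareError instanceof SyntaxError);
console.log(clauses.length);
```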
I had to split the match queries into separate items, then push them individually into the JSON request.
var must = [];
must.push(JSON.parse('{ "match": { "log.level": "Information" } }'))
must.push(JSON.parse('{ "match": { "metadata.log_event_category": "Open Id Connect" } }'))
must.push({
"range": {
"#timestamp": {
"gte": chartArray[0].dateTimeFrom,
"lte": chartArray[0].dateTimeTo
}
}
});
var request =
{
"size": 0,
"query": {
"bool": {
"must": must
}
},
"aggs": {
"myAggName": {
"date_histogram": {
"field": "#timestamp",
[elasticIntervalType]: elasticIntervalUnits,
"format": chartArray[0].scaleLabelAttributes
}
},
}
}
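Since the number of match clauses depends on the applied report filters, another option is to skip strings entirely and build the clauses as objects. A minimal sketch, assuming the filters are available as an array of { field, value } entries; that shape, the buildMust name, and the sample dates are all hypothetical.

```javascript
// Hypothetical input shape: one { field, value } entry per applied report filter.
function buildMust(filters, dateTimeFrom, dateTimeTo) {
  // One match clause per filter; [f.field] is a computed property name.
  const must = filters.map((f) => ({ match: { [f.field]: f.value } }));
  // The range clause is always appended, as in the request above.
  must.push({ range: { '#timestamp': { gte: dateTimeFrom, lte: dateTimeTo } } });
  return must;
}

const must = buildMust(
  [
    { field: 'log.level', value: 'Information' },
    { field: 'metadata.log_event_category', value: 'Open Id Connect' }
  ],
  '2020-01-01', // illustrative dates only
  '2020-01-31'
);
console.log(JSON.stringify(must, null, 2));
```

Because nothing is ever serialized to a string and parsed back, no extra escaping can appear in the request.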

ElasticSearch EJS query to fetch missing field

I have an existing ejs query as below:
let queryBody = ejs.Request()
.size(0)
.query(
ejs.BoolQuery()
.must(
ejs.RangeQuery('hour_time_stamp').gte(this.lastDeviceDate).lte(this.lastDeviceDate)
)
)
.agg(ejs.TermsAggregation('market_agg').field('market').order('sum', 'asc').size(50000)
.agg(ejs.SumAggregation('sum').field('num_devices'))
)
Currently, field('market') returns only the values where data for market is present. There is also data in the database where market is missing, which I need to access. How do I do that?
EDIT:
Values for market in ES are either null or the field is missing. I wrote an ES query to get all those documents, but I am not able to write the equivalent ejs query. Any idea how this can be done?
{
"query": {
"bool": {
"should": [
{
"exists": {
"field": "market"
}
},
{
"bool": {
"must_not": [
{
"exists": {
"field": "market"
}
}
]
}
}
]
}
}
}
As per your problem, you need a way to group the empty market fields too.
For that you can use the "missing" parameter of the terms aggregation. It defines how documents that are missing a value (as in your case) are grouped. So your query in JSON form would be modified as below:
{
"size": 0,
"query": {
"bool": {
"must": [
{
"range": {
"hour_time_stamp": {
"gte": lastDeviceDate,
"lte": lastDeviceDate
}
}
}
]
}
},
"aggs": {
"market_agg": {
"terms": {
"field": "market",
"missing": "empty_markets",
"order": { "sum": "asc" },
"size": 50000
},
"aggs": {
"sum": {
"sum": { "field": "num_devices" }
}
}
}
}
}
Or, in your code, it could be done by adding the missing parameter like this:
let queryBody = ejs.Request()
.size(0)
.query(
ejs.BoolQuery()
.must(
ejs.RangeQuery('hour_time_stamp').gte(this.lastDeviceDate).lte(this.lastDeviceDate)
)
)
.agg(ejs.TermsAggregation('market_agg').field('market').missing('empty_markets').order('sum', 'asc').size(50000)
.agg(ejs.SumAggregation('sum').field('num_devices'))
)
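To illustrate what the missing parameter changes in the result, here is a plain JavaScript sketch that imitates the bucketing on the client side: documents whose market is null or absent land in the empty_markets bucket, each bucket sums num_devices, and buckets are ordered by that sum ascending. It is not how Elasticsearch computes the aggregation; termsWithMissing and the sample documents are made up for the example.

```javascript
// Groups docs by `field`, putting docs where it is null or absent into the
// `missingKey` bucket, sums `sumField` per bucket, and sorts by sum ascending.
function termsWithMissing(docs, field, missingKey, sumField) {
  const sums = new Map();
  for (const doc of docs) {
    const key = doc[field] == null ? missingKey : doc[field];
    sums.set(key, (sums.get(key) || 0) + doc[sumField]);
  }
  return [...sums.entries()]
    .map(([key, sum]) => ({ key, sum }))
    .sort((a, b) => a.sum - b.sum);
}

// Made-up sample documents.
const docs = [
  { market: 'US', num_devices: 5 },
  { market: null, num_devices: 2 },
  { num_devices: 1 },
  { market: 'EU', num_devices: 4 }
];

const buckets = termsWithMissing(docs, 'market', 'empty_markets', 'num_devices');
console.log(buckets);
```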

Parsing Exception error when using Terms in ElasticSearch

I'm getting an error on this Elasticsearch terms query. The error message is
"[parsing_exception] [terms] unknown token [START_ARRAY] after [activeIds], with { line=1 & col=63 }"
activeIds is an array of unique ids. It looks something like
const activeIds = [ '157621a1-d892-4f4b-80ca-14feddb837a0',
'd04c5c93-a22c-48c3-a3b0-c79a61bdd923',
'296d40d9-f316-4560-bbc9-001d6f46858b',
'2f8c6c37-588d-4d24-9e69-34b6dd7366c2',
'ba0508dd-0e76-4be8-8b6e-9e938ab4abed',
'ab076ed9-1dd5-4987-8842-15f1b995bc0d',
'ea6b0cff-a64f-4ce3-844e-b36d9f161e6f' ]
let items = await es.search({
"index": table,
"body": {
"from": 0, "size": 25,
"query": {
"terms" : {
"growerId" : {
activeIds
}
},
"bool": {
"must_not": [
{ "match":
{
"active": false
}
},
],
"must": [
{ "query_string" :
{
"query": searchQuery,
"fields": ["item_name"]
}
}
],
}
}
}
})
Appreciate the help!
Edit: Answering this question- "What's the expected result? Can you elaborate and share some sample data? – Nishant Saini 15 hours ago"
I'll try to elaborate a bit.
1) Overall I'm trying to retrieve items that belong to active users. There are 2 tables: user and items. So I'm initially running an ES query that returns all the users that contain { active: true } from the user table.
2) Running that ES query returns an array of ids which I'm calling activeIds. The array looks like what I've already displayed in my example. So this works so far (let me know if you want to see the code for that, but if I'm getting an expected result then I don't think we need that now).
3) Now I want to search through the items table, and retrieve only the items that contain one of the active ids. So an item should look like:
4) The expected result is to retrieve an array of objects whose growerId matches one of the activeIds. So if I do a search query for "flowers", a single expected result should look like:
[ { _index: 'items-dev',
_type: 'items-dev_type',
_id: 'itemId=fc68dadf-21c8-43c2-98d2-cf574f71f06d',
_score: 11.397207,
_source:
{ itemId: 'fc68dadf-21c8-43c2-98d2-cf574f71f06d',
'#SequenceNumber': '522268700000000025760905838',
item_name: 'Flowers',
grower_name: 'Uhs',
image: '630b5d6e-566f-4d55-9d31-6421eb2cff87.jpg',
dev: true,
growerId: 'd04c5c93-a22c-48c3-a3b0-c79a61bdd923',
sold_out: true,
'#timestamp': '2018-12-20T16:09:38.742599',
quantity_type: 'Pounds',
active: true,
pending_inventory: 4,
initial_quantity: 5,
price: 10,
item_description: 'Field of flowers' } },
So here the growerId matches activeIds[1]
But if I do a search for "invisible", which was created by an inactive user, I get:
[ { _index: 'items-dev',
_type: 'items-dev_type',
_id: 'itemId=15200473-93e1-477c-a1a7-0b67831f5351',
_score: 1,
_source:
{ itemId: '15200473-93e1-477c-a1a7-0b67831f5351',
'#SequenceNumber': '518241400000000004028805117',
item_name: 'Invisible too',
grower_name: 'Field of Greens',
image: '7f37d364-e768-451d-997f-8bb759343300.jpg',
dev: true,
growerId: 'f25040f4-3b8c-4306-9eb5-8b6c9ac58634',
sold_out: false,
'#timestamp': '2018-12-19T20:47:16.128934',
quantity_type: 'Pounds',
pending_inventory: 5,
initial_quantity: 5,
price: 122,
item_description: 'Add' } },
Now that growerId does not match any of the ids in activeIds.
5) Using the code you helped with, it's returning 0 items.
Let me know if you need more detail. I've been working on this for a bit too long :\
The terms query accepts an array of terms, so it should be defined as below:
"terms": {
"growerId": activeIds
}
You might face other errors as well after making the above correction, so below is the full query, which might help you:
{
"from": 0,
"size": 25,
"query": {
"bool": {
"must_not": [
{
"match": {
"active": false
}
}
],
"must": [
{
"query_string": {
"query": searchQuery,
"fields": [
"item_name"
]
}
},
{
"terms": {
"growerId": activeIds
}
}
]
}
}
}
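The parsing error in the question follows from JavaScript's shorthand property syntax: { activeIds } is the same as { activeIds: activeIds }, so growerId was sent as an object with an activeIds key holding the array. A small sketch with illustrative ids shows the difference in the serialized body:

```javascript
const activeIds = ['id-1', 'id-2']; // illustrative values

// Shorthand property: growerId becomes an object with an "activeIds" key,
// which is the shape the parser rejected.
const broken = { terms: { growerId: { activeIds } } };

// The array itself is the value of growerId, as the terms query expects.
const fixed = { terms: { growerId: activeIds } };

console.log(JSON.stringify(broken));
console.log(JSON.stringify(fixed));
```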

Graphql: how to query using prefix string

I have the following query in GraphQL:
{
allBibliografia(nome: "Bibliografia 1") {
id
nome
}
}
which works. But I need to query my backend using a prefix string, as the user is typing. For example: if I type "Bibliografia 1" as above, my query returns
{
"data": {
"allBibliografia": [
{
"id": "1",
"nome": "Bibliografia 1"
}
]
}
}
but if I input just "Bibliografia" or "Bibliografia*", I get this
{
"data": {
"allBibliografia": []
}
}
I was hoping that "Bibliografia*" would work just like a wildcard in SQL, but it didn't. Does anybody know how to achieve that?
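For context, GraphQL itself defines no wildcard semantics; what an argument such as nome means is decided by the backend resolver. Below is a minimal sketch of a prefix-matching resolver in plain JavaScript. It assumes an in-memory list and a hypothetical nomePrefix argument, neither of which comes from the schema above; a real backend would translate the prefix into its own data store's query.

```javascript
// Made-up in-memory data standing in for the real backend.
const bibliografias = [
  { id: '1', nome: 'Bibliografia 1' },
  { id: '2', nome: 'Bibliografia 2' },
  { id: '3', nome: 'Outro' }
];

// Resolver in the common (parent, args) shape; nomePrefix is a hypothetical argument.
function allBibliografia(_parent, args) {
  const prefix = (args.nomePrefix || '').toLowerCase();
  // Case-insensitive "starts with" comparison on nome.
  return bibliografias.filter((b) => b.nome.toLowerCase().startsWith(prefix));
}

const result = allBibliografia(null, { nomePrefix: 'Bibliografia' });
console.log(result);
```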
