- Title: Sort
- Start Date: 2021-07-20
- Specification PR: #55
- Discovery Issue: #43
# Sort
# 1. Functional Specification
# I. Summary
This specification adds a sort feature at search time so that end-users can quickly sort search results. The fields available for sorting, called `sortable-attributes`, must be known by the engine for sorting to be usable. These fields can be of type `string` and `number`. We have also introduced a new ranking rule called `sort`, which lets the developer adjust how sorting behaves: its position within the ranking rules sets the trade-off between exhaustivity and relevancy. Finally, we introduced a `sort` parameter on the search resource so that the end-user can sort the search results according to their needs.
# Summary Key points
- `sortable-attributes` setting MUST be known by the engine before search time.
- `sort` search parameter MUST be able to operate on multiple fields at search time. e.g. `sort="price:asc,label:desc,..."`.
- `string` and `number` fields MUST be supported. Sorting on nested fields WON'T be supported for this iteration.
- `sort` ranking rule allows developers to adjust the sorting behavior between exhaustivity and relevancy.
# II. Motivation
According to our user feedback, the lack of a sorting feature is mentioned as one of the biggest deal-breakers for choosing MeiliSearch as a search engine. A search engine must be able to offer this feature, especially for e-commerce use. Moreover, competitors all offer it. Today, users must find workarounds that take time to develop and maintain to sort search results.
We want to offer a simple and versatile solution for their needs.
# III. Explanation
# As a developer, I want to configure the sortable attributes so that the end-user can sort results.
- Introduce a new `sortableAttributes` field in the global settings resource schema.
- Introduce a new `/sortable-attributes` sub-setting resource.
`sortableAttributes` field definition
- Name: `sortableAttributes`
- Type: Array[String]
- Default: []
GET settings /indexes/{indexUid}/settings
200 - Response with empty `sortableAttributes` (default case)
{
...,
"sortableAttributes": [],
...
}
200 - Response with already configured sortableAttributes
{
...,
"sortableAttributes": ["price", "release_date"],
...
}
💡 The order of the values in `sortableAttributes` has no impact. The order is determined at search time from the `sort` parameter.
POST settings /indexes/{indexUid}/settings
Request body
{
...,
"sortableAttributes": ["price", "release_date", "title"],
...
}
202 Accepted - Response body
{
"updateId": 7
}
- 💡 Sending a nonexistent field WON'T throw an error. It is possible to define settings before indexing documents, so we accept fields that may not yet exist within a document.
- 💡 `sortableAttributes` accepts `null` and `[]` values to be reset.
- 🔴 Sending a value other than a string as an array item results in a 400 Bad Request - invalid_request_error.
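The acceptance rules above can be sketched as a small client-side check. This is only an illustration under stated assumptions: `normalize_sortable_attributes` is a hypothetical helper name, not part of the engine or of any SDK, and rejecting a payload that is neither `null` nor an array is an assumption this specification does not spell out.

```python
from typing import Any, List


def normalize_sortable_attributes(payload: Any) -> List[str]:
    """Return the sortable attributes implied by a settings payload.

    Hypothetical helper mirroring the rules listed above; it is not the
    engine's implementation.
    """
    # `null` (None in Python) resets the setting to its default, `[]`.
    if payload is None:
        return []
    # Assumption: anything that is not an array is rejected as well.
    if not isinstance(payload, list):
        raise ValueError("invalid_request_error: expected an array of strings or null")
    for item in payload:
        if not isinstance(item, str):
            raise ValueError(f"invalid_request_error: {item!r} is not a string")
    # Field names are not checked against documents: settings may be defined
    # before any document is indexed.
    return list(payload)
```

With this sketch, `None` and `[]` both yield the default empty list, while `["price", 3]` is rejected because `3` is not a string.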
DELETE settings /indexes/{indexUid}/settings
💡 Resetting all settings will result in the default case, e.g. `sortableAttributes: []`.
202 Accepted - Response body
{
"updateId": 8
}
GET settings/sortable-attributes /indexes/{indexUid}/settings/sortable-attributes
200 - Response body (Default case)
[]
200 - Response body with `sortableAttributes` already configured
[
"price",
"release_date"
]
- 🔴 If a master key is set and missing from the client, the new GET `settings/sortable-attributes` API method is protected and returns a 401 Unauthorized - `missing_authorization_header`.
- 🔴 If the index is not found, a 404 Not Found response is returned.
POST settings/sortable-attributes /indexes/{indexUid}/settings/sortable-attributes
Request body
[
"price",
"release_date",
"title"
]
202 Accepted - Response body
{
"updateId": 7
}
- 💡 Sending a nonexistent field WON'T throw an error. It is possible to define settings before indexing documents, so we accept fields that may not yet exist within a document.
- 💡 The POST request body accepts `null` and `[]` values to reset the `sortableAttributes`.
- 🔴 Sending a value other than a string as an array item results in a 400 Bad Request - invalid_request_error.
- 🔴 If a master key is set and missing from the client, the new POST `settings/sortable-attributes` API method is protected and returns a 401 Unauthorized - `missing_authorization_header`.
- 🔴 If the index is not found, a 404 Not Found response is returned.
DEL settings/sortable-attributes /indexes/{indexUid}/settings/sortable-attributes
202 Accepted - Response body
{
"updateId": 8
}
- 🔴 If a master key is set and missing from the client, the new DELETE `settings/sortable-attributes` API method is protected and returns a 401 Unauthorized - `missing_authorization_header`.
- 🔴 If the index is not found, a 404 Not Found response is returned.
# As an End-User, I want to specify a `sort` parameter at search time so that I can sort search results in ascending/descending order from document attributes, whether they are `numeric` or `string`.
- Introduce a `sort` parameter on GET/POST `/search` methods.
💡 If an attribute specified as a sort criterion at search time has different types across documents, the engine always favors the numeric type. This means that documents with numeric values for this attribute are sorted before those with string values. This can lead to awkward sorting behavior, so the user should make sure the attribute they want to sort on has the same type in all documents.
💡 If an attribute specified as a sort criterion at search time does not exist on a document, that document is placed last by the `sort` ranking rule.
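The two notes above can be illustrated with a minimal sketch of the ordering for a single attribute: numbers first, then strings, then documents without the attribute. `sort_by_attribute` is a hypothetical name, not the engine's code. Two assumptions go beyond this specification: the numbers-before-strings precedence is kept for descending order too, and values of any other type are treated like a missing attribute.

```python
from typing import Any, Dict, List


def sort_by_attribute(
    docs: List[Dict[str, Any]], attribute: str, descending: bool = False
) -> List[Dict[str, Any]]:
    """Order documents by one attribute: numbers, then strings, then the rest."""

    def is_number(value: Any) -> bool:
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    numbers, strings, rest = [], [], []
    for doc in docs:
        value = doc.get(attribute)
        if is_number(value):
            numbers.append(doc)
        elif isinstance(value, str):
            strings.append(doc)
        else:
            # Missing attribute (or, by assumption, any other value type).
            rest.append(doc)

    # Each group is sorted on its own, so mixed types are never compared.
    numbers.sort(key=lambda d: d[attribute], reverse=descending)
    strings.sort(key=lambda d: d[attribute], reverse=descending)
    return numbers + strings + rest
```

Sorting each type group separately is what keeps documents with numeric values ahead of those with string values, whatever the direction.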
GET Search /indexes/{indexUid}/search
`sort` - String - E.g. `sort="price:asc,release_date:desc"`
:asc
POST Search /indexes/{indexUid}/search
`sort` - Array[String] - E.g.
Request body
{
...,
"sort": [
"price:asc",
"release_date:desc"
],
...
}
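The GET form carries the criteria as one comma-separated string, while the POST form carries them as an array. A minimal sketch of converting between the two is given below, assuming attribute names contain no commas; both function names are hypothetical.

```python
from typing import List


def sort_query_to_list(sort: str) -> List[str]:
    """Split the GET form, e.g. 'price:asc,release_date:desc', into the POST form."""
    # Empty segments (such as a trailing comma) are dropped.
    return [part.strip() for part in sort.split(",") if part.strip()]


def sort_list_to_query(sort: List[str]) -> str:
    """Join the POST form back into the GET form."""
    return ",".join(sort)
```

The order of the criteria is preserved in both directions, which matters because priorities are read from left to right.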
- 🔴 Sending a sort parameter while the `sort` ranking rule is not specified in the ranking rules settings will lead to a 400 Bad Request - invalid_sort error.
{
"message": "You must specify where `sort` is listed in the rankingRules setting to use the sort parameter at search time.",
"errorCode": "invalid_sort",
"errorType": "invalid_request_error",
"errorLink": "https://docs.meilisearch.com/errors#invalid_sort"
}
- 🔴 Sending a value not set in `sortableAttributes` will lead to a 400 Bad Request - invalid_sort error.
{
"message": "Attribute :attribute is not sortable, available sortable attributes are: ..., ...",
"errorCode": "invalid_sort",
"errorType": "invalid_request_error",
"errorLink": "https://docs.meilisearch.com/errors#invalid_sort"
}
`:attribute` is inferred when the message is generated.
- 🔴 Sending a wrongly formatted value will lead to a 400 Bad Request - invalid_sort error.
{
"message": "Invalid syntax for the sort parameter: :syntaxErrorHelper.",
"errorCode": "invalid_sort",
"errorType": "invalid_request_error",
"errorLink": "https://docs.meilisearch.com/errors#invalid_sort"
}
`:syntaxErrorHelper` is inferred when the message is generated.
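The three error cases above can be sketched as a single check that returns the error payload, or `None` when the request is acceptable. This is an illustration only. `check_sort` and `invalid_sort` are hypothetical names. The order in which the checks run is an assumption. The wording substituted for `:syntaxErrorHelper` is invented for the example, since this specification does not define it.

```python
from typing import Dict, List, Optional

ERROR_LINK = "https://docs.meilisearch.com/errors#invalid_sort"


def invalid_sort(message: str) -> Dict[str, str]:
    """Build an error payload with the fields shown above."""
    return {
        "message": message,
        "errorCode": "invalid_sort",
        "errorType": "invalid_request_error",
        "errorLink": ERROR_LINK,
    }


def check_sort(
    sort: List[str], ranking_rules: List[str], sortable_attributes: List[str]
) -> Optional[Dict[str, str]]:
    """Return an invalid_sort payload, or None when the sort parameter is valid."""
    if "sort" not in ranking_rules:
        return invalid_sort(
            "You must specify where `sort` is listed in the rankingRules "
            "setting to use the sort parameter at search time."
        )
    for criterion in sort:
        # Split on the last colon: 'price:asc' -> ('price', ':', 'asc').
        attribute, separator, order = criterion.rpartition(":")
        if not separator or not attribute or order not in ("asc", "desc"):
            # Hypothetical helper text standing in for :syntaxErrorHelper.
            helper = f"expected `attribute:asc` or `attribute:desc`, found `{criterion}`"
            return invalid_sort(f"Invalid syntax for the sort parameter: {helper}.")
        if attribute not in sortable_attributes:
            available = ", ".join(sortable_attributes)
            return invalid_sort(
                f"Attribute {attribute} is not sortable, "
                f"available sortable attributes are: {available}"
            )
    return None
```

For instance, `check_sort(["title:asc"], ["sort"], ["price", "reviews_rating"])` yields the "is not sortable" payload, and `check_sort(["price"], ["sort"], ["price"])` yields the syntax error payload.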
We want to align the way custom ranking rules are written with the syntax of the `sort` search parameter.
Current custom ranking rule definition syntax
[
"asc(title)"
]
becomes
[
"title:asc"
]
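A migration of existing settings to the new syntax could be sketched as follows. `convert_custom_rule` is a hypothetical helper, and the regular expression assumes that legacy rules are written exactly as `asc(attribute)` or `desc(attribute)`.

```python
import re

# Matches the legacy form, e.g. "asc(title)" or "desc(price)".
_LEGACY_RULE = re.compile(r"^(asc|desc)\(([^()]+)\)$")


def convert_custom_rule(rule: str) -> str:
    """Rewrite 'asc(title)' as 'title:asc'; leave any other rule untouched."""
    match = _LEGACY_RULE.match(rule)
    if match is None:
        # Built-in rules ("typo", "words", ...) and already converted rules.
        return rule
    order, attribute = match.groups()
    return f"{attribute}:{order}"
```

Applying it to every entry of a ranking rules array leaves built-in rules such as `typo` unchanged and only rewrites the custom ones.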
Search example with sort
With this set of documents
[
{
"id": 1,
"label": "Vans Classic II sweatshirt in black",
"price": 52.00,
"colors": ["black"],
"sizes": ["xs", "s", "m", "m", "xl"],
"reviews_rating": 4.5
},
{
"id": 2,
"label": "The North Face Drew Peak hoodie in green",
"price": 36.00,
"colors": ["green"],
"reviews_rating": 4.89
},
{
"id": 3,
"label": "Nike Club hoodie in navy",
"price": 52.00,
"colors": ["navy"],
"reviews_rating": 4.7
}
]
With this ranking rules definition
[
"sort",
"typo",
"words",
"proximity",
"attribute",
"exactness"
]
And with this `sortableAttributes` definition
[
"price",
"reviews_rating"
]
POST Search
Request Body
{
...,
"sort": [
"price:asc",
"reviews_rating:desc"
]
}
200 - Response
{
"hits": [
{
"id": 2,
"label": "The North Face Drew Peak hoodie in green",
"price": 36.00,
"colors": ["green"],
"reviews_rating": 4.89
},
{
"id": 3,
"label": "Nike Club hoodie in navy",
"price": 52.00,
"colors": ["navy"],
"reviews_rating": 4.7
},
{
"id": 1,
"label": "Vans Classic II sweatshirt in black",
"price": 52.00,
"colors": ["black"],
"sizes": ["xs", "s", "m", "m", "xl"],
"reviews_rating": 4.5
}
]
}
As we can see, `sort` is the most important criterion in play according to the ranking rules order. Moreover, the `sort` search parameter can handle several attributes. Priorities are determined from left to right, so the `price` field is the most important; if N documents share the same `price` value, they are sorted by `reviews_rating`, and so on.
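The left-to-right priority can be reproduced with a minimal sketch over the example documents, reduced here to the fields involved. It only models the `sort` ranking rule in first position with no query terms, and it assumes every document carries a numeric value for each attribute. `apply_sort` is a hypothetical name, not the engine's implementation.

```python
from typing import Any, Dict, List


def apply_sort(docs: List[Dict[str, Any]], sort: List[str]) -> List[Dict[str, Any]]:
    """Order documents by several 'attribute:order' criteria, leftmost first."""
    ordered = list(docs)
    # Python's sort is stable, so sorting by the least important criterion
    # first and the most important one last yields the combined order.
    for criterion in reversed(sort):
        attribute, _, order = criterion.rpartition(":")
        ordered.sort(key=lambda d: d[attribute], reverse=(order == "desc"))
    return ordered


documents = [
    {"id": 1, "price": 52.00, "reviews_rating": 4.5},
    {"id": 2, "price": 36.00, "reviews_rating": 4.89},
    {"id": 3, "price": 52.00, "reviews_rating": 4.7},
]
hits = apply_sort(documents, ["price:asc", "reviews_rating:desc"])
print([doc["id"] for doc in hits])  # [2, 3, 1]
```

Documents 3 and 1 share the same `price`, so `reviews_rating:desc` breaks the tie, which matches the order of the hits in the response above.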
# As a Developer, I want to change the position of the `sort` ranking rule so that I can tweak its behavior between exhaustivity and relevancy.
GET settings/ranking-rules /indexes/{indexUid}/settings/ranking-rules
200 - Response body (Default case)
[
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
]
POST settings/ranking-rules /indexes/{indexUid}/settings/ranking-rules
Request body
[
"sort",
"words",
"typo",
"proximity",
"attribute",
"exactness"
]
202 Accepted - Response body
{
"updateId": 7
}
- ℹ️ The position of the `sort` ranking rule is significant. The higher it is, the more weight it carries. The lower it is, the less important it is, and other rules take priority. This allows the developer to adjust the behavior between exhaustivity and relevancy with the sort criterion. If the `sort` ranking rule is in the last position, you will have a very relevant sort; that is, results will emphasize relevancy over your sort criteria. Likewise, if `sort` is in the first position, you will have a very exhaustive search that gives precedence to results less relevant to the query terms but more in line with your sort criteria.
Real condition explanation
e.g. Relevant Sort
I want a relevant sort (sort on relevant items), so I put `sort` in the last position of my ranking rules.
[
"typo",
"words",
"proximity",
"attribute",
"exactness",
"sort"
]
At search time, if I use the following request (knowing that `price` and `reviews_rating` have previously been set as sortable attributes)
{
"q": "Mac Book",
"sort": [
"price:asc",
"reviews_rating:desc"
]
}
For this search request, the ranking rules can be virtually represented as follows.
[
"typo",
"words",
"proximity",
"attribute",
"exactness",
"price:asc", // First part of the `sort` ranking rule.
"reviews_rating:desc" // Second part of the `sort` ranking rule.
]
Note that if I change the order of the values in the `sort` search parameter, it changes the order of the inner elements of the `sort` ranking rule.
{
"q": "Mac Book",
"sort": [
"reviews_rating:desc",
"price:asc"
]
}
[
"typo",
"words",
"proximity",
"attribute",
"exactness",
"reviews_rating:desc", // The order of the inner elements of the `sort` ranking rule follows the order of the `sort` query/request parameter.
"price:asc"
]
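The virtual representation used in this explanation can be sketched as a simple substitution: the `sort` entry of the ranking rules is replaced, in place, by the criteria of the `sort` search parameter. `expand_ranking_rules` is a hypothetical name used for illustration only. The engine does not necessarily work this way internally.

```python
from typing import List


def expand_ranking_rules(ranking_rules: List[str], sort: List[str]) -> List[str]:
    """Replace the `sort` rule with the criteria of the `sort` search parameter."""
    expanded: List[str] = []
    for rule in ranking_rules:
        if rule == "sort":
            # The criteria keep the order given at search time.
            expanded.extend(sort)
        else:
            expanded.append(rule)
    return expanded


rules = ["typo", "words", "proximity", "attribute", "exactness", "sort"]
print(expand_ranking_rules(rules, ["reviews_rating:desc", "price:asc"]))
```

Moving `sort` to another position in `rules` moves the whole group of criteria with it, which is how the developer trades relevancy for exhaustivity.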
# IV. Finalized Key Changes
- Add `sortableAttributes` parameter in settings schema.
- Add `sortable-attributes` API resource.
- Add the `sort` ranking rule after the `attribute` ranking rule by default to promote relevant sort.
- Add `sort` query/request parameter on `/search` resource. Support `string` and `number`. `string` can be sorted in lexicographical order.
- Change custom ranking rule format. e.g. `asc(price)` becomes `price:asc`.
- Add a new error `invalid_sort` similar to `invalid_filter`.
- The numeric type is preferred to the string type by the `sort` ranking rule. If an attribute has different types among the documents, those containing a numeric value will be placed before the documents containing a string value.
- A document not containing an attribute requested in the `sort` parameter will be placed last during the tie-breaking of the `sort` ranking rule.
# 2. Technical details
# I. Measuring
- Number of sorted attributes used in `sort` search parameter to calculate an avg per server.
- `ranking-rules` setting definition to evaluate position customization.
- Number of `sortableAttributes` to calculate an avg per server.
# 3. Future Possibilities
- Support sort on nested fields.
- Support computational functions avg, sum, min, max, median. For now, the workaround is to pre-compute values in the document before indexing and sort them at search time with a `sort` on the pre-computed field. e.g. `sort="precomputed_avg_price:desc"`.
- Ability to set a custom ranking rules order at search time. Out of scope for `sort`.