- Title: Sort
- Start Date: 2021-07-20
- Specification PR: #55
- Discovery Issue: #43
# Sort
# 1. Functional Specification
# I. Summary
This specification adds a sort feature at search time so that end-users can quickly sort search results. The fields to sort on, called sortable-attributes, must be declared to the engine before they can be used at search time. These fields can be of type string or number. The specification also introduces a new ranking rule called sort. Its position within the ranking rules lets developers adjust the sorting behavior between exhaustivity and relevancy. Finally, a sort parameter is added to the search resource so that end-users can sort the search results according to their needs.
# Summary Key points
- The `sortable-attributes` setting MUST be known by the engine before search time.
- The `sort` search parameter MUST be able to operate on multiple fields at search time, e.g. sort="price:asc,label:desc,...".
- `string` and `number` fields MUST be supported. Sorting on nested fields WON'T be supported for this iteration.
- The `sort` ranking rule allows developers to adjust the sorting behavior between exhaustivity and relevancy.
# II. Motivation
According to our user feedback, the lack of a sorting feature is mentioned as one of the biggest deal-breakers for choosing MeiliSearch as a search engine. A search engine must be able to offer this feature, especially for e-commerce use. Moreover, competitors all offer it. Today, users must find workarounds that take time to develop and maintain to sort search results.
We want to offer a simple and versatile solution for their needs.
# III. Explanation
# As a developer, I want to configure the sortable attributes so that the end-user can sort results.
- Introduce a new `sortableAttributes` field in the global settings resource schema.
- Introduce a new `/sortable-attributes` sub-setting resource.

sortableAttributes field definition
- Name: `sortableAttributes`
- Type: Array[String]
- Default: []
GET settings /indexes/{indexUid}/settings
200 - Response with empty sortableAttributes (default case)
{
...,
"sortableAttributes": [],
...
}
200 - Response with already configured sortableAttributes
{
...,
"sortableAttributes": ["price", "release_date"],
...
}
💡 The order of the values in sortableAttributes has no impact. The order is determined at search time from the sort parameter.
POST settings /indexes/{indexUid}/settings
Request body
{
...,
"sortableAttributes": ["price", "release_date", "title"],
...
}
202 Accepted - Response body
{
"updateId": 7
}
- 💡 Sending a nonexistent field WON'T throw an error. It is possible to define settings before indexing documents, so we accept fields that may not yet exist within a document.
- 💡 `sortableAttributes` accepts `null` and `[]` values to be reset.
- 🔴 Sending a value other than a string as an array item results in a 400 Bad Request - invalid_request_error.
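The payload rules above can be sketched as a small validation helper. This is only an illustration under stated assumptions: `normalize_sortable_attributes` is a hypothetical name, the `ValueError` stands in for the 400 Bad Request response, and rejecting a non-array payload is an assumption not covered by the specification.

```python
from typing import Any, List


def normalize_sortable_attributes(value: Any) -> List[str]:
    """Return the attribute list to store, or raise ValueError for a bad payload.

    Hypothetical helper mirroring the rules above; not the engine's code.
    """
    # null and [] both reset the setting to its default (an empty list).
    if value is None:
        return []
    # Assumption: anything that is not an array is rejected as well.
    if not isinstance(value, list):
        raise ValueError("sortableAttributes must be an array of strings")
    for item in value:
        # A non-string item maps to 400 Bad Request - invalid_request_error.
        if not isinstance(item, str):
            raise ValueError(f"invalid sortableAttributes item: {item!r}")
    # Attributes that do not exist in any document yet are accepted as-is.
    return list(value)
```

Accepting unknown attribute names here reflects the note that settings may be defined before any document is indexed.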
DELETE settings /indexes/{indexUid}/settings
💡 Resetting all settings restores the default case, e.g. sortableAttributes: []
202 Accepted - Response body
{
"updateId": 8
}
GET settings/sortable-attributes /indexes/{indexUid}/settings/sortable-attributes
200 - Response body (Default case)
[]
200 - Response body with sortableAttributes already configured
[
"price",
"release_date"
]
- 🔴 If a master key is set and missing from the client, the new GET `settings/sortable-attributes` API method is protected and returns a 401 Unauthorized - `missing_authorization_header`.
- 🔴 If the index is not found, a 404 Not Found response is returned.
POST settings/sortable-attributes /indexes/{indexUid}/settings/sortable-attributes
Request body
[
"price",
"release_date",
"title"
]
202 Accepted - Response body
{
"updateId": 7
}
- 💡 Sending a nonexistent field WON'T throw an error. It is possible to define settings before indexing documents, so we accept fields that may not yet exist within a document.
- 💡 The POST request body accepts `null` and `[]` values to reset the `sortableAttributes`.
- 🔴 Sending a value other than a string as an array item results in a 400 Bad Request - invalid_request_error.
- 🔴 If a master key is set and missing from the client, the new POST `settings/sortable-attributes` API method is protected and returns a 401 Unauthorized - `missing_authorization_header`.
- 🔴 If the index is not found, a 404 Not Found response is returned.
DELETE settings/sortable-attributes /indexes/{indexUid}/settings/sortable-attributes
202 Accepted - Response body
{
"updateId": 8
}
- 🔴 If a master key is set and missing from the client, the new DELETE `settings/sortable-attributes` API method is protected and returns a 401 Unauthorized - `missing_authorization_header`.
- 🔴 If the index is not found, a 404 Not Found response is returned.
# As an End-User, I want to specify a sort parameter at search time so that I can sort search results in ascending/descending order by document attributes, whether they are numeric or string.
- Introduce a `sort` parameter on GET/POST `/search` methods.

💡 If an attribute used as a sort criterion at search time has different types across documents, the engine always favors the numeric type. This means that documents with numeric values for this attribute are sorted before those with string values. This can lead to awkward sorting behavior, so users should make sure the attribute they want to sort on has the same type in all documents.
💡 If an attribute used as a sort criterion at search time does not exist on a document, that document is placed last by the sort ranking rule.
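The two notes above can be pictured with a sort key that groups documents into numbers, then strings, then documents without the attribute. This is a minimal sketch, not the engine's implementation: `sort_key` and `sort_ascending` are hypothetical names, only the ascending case is shown, and treating values of any other type like a missing attribute is an assumption.

```python
from typing import Any, Dict, List, Tuple


def sort_key(document: Dict[str, Any], attribute: str) -> Tuple[int, Any]:
    """Group documents as numbers (0), strings (1), then missing (2)."""
    value = document.get(attribute)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return (0, value)
    if isinstance(value, str):
        return (1, value)
    # Missing attribute (or any other type: an assumption of this sketch).
    return (2, 0)


def sort_ascending(documents: List[Dict[str, Any]], attribute: str) -> List[Dict[str, Any]]:
    """Sort in ascending order: numbers first, strings next, missing last."""
    return sorted(documents, key=lambda doc: sort_key(doc, attribute))
```

Because the group index comes first in the key, a number is never compared with a string, which is what keeps mixed-type attributes sortable at all.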
GET Search /indexes/{indexUid}/search
sort - String - E.g. sort="price:asc,release_date:desc"
:asc
POST Search /indexes/{indexUid}/search
sort - Array[String] - E.g.
Request body
{
...,
"sort": [
"price:asc",
"release_date:desc"
],
...
}
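The GET form carries the criteria as one comma-separated string while the POST form carries an array of strings. A client or server could bring both to the same shape with a helper along these lines. This is a sketch: `normalize_sort` is a hypothetical name, and it assumes attribute names contain no commas.

```python
from typing import List, Union


def normalize_sort(sort: Union[str, List[str]]) -> List[str]:
    """Return the sort criteria as a list, whichever form was received."""
    if isinstance(sort, str):
        # GET form: "price:asc,release_date:desc"
        return [part.strip() for part in sort.split(",") if part.strip()]
    # POST form: already ["price:asc", "release_date:desc"]
    return list(sort)
```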
- 🔴 Sending a sort parameter while the `sort` ranking rule is not specified in the ranking rules settings will lead to a 400 Bad Request - invalid_sort error.
{
"message": "You must specify where `sort` is listed in the rankingRules setting to use the sort parameter at search time.",
"errorCode": "invalid_sort",
"errorType": "invalid_request_error",
"errorLink": "https://docs.meilisearch.com/errors#invalid_sort"
}
- 🔴 Sending a value not set in `sortableAttributes` will lead to a 400 Bad Request - invalid_sort error.
{
"message": "Attribute :attribute is not sortable, available sortable attributes are: ..., ...",
"errorCode": "invalid_sort",
"errorType": "invalid_request_error",
"errorLink": "https://docs.meilisearch.com/errors#invalid_sort"
}
- `:attribute` is inferred when the message is generated.
- 🔴 Sending a badly formatted value will lead to a 400 Bad Request - invalid_sort error.
{
"message": "Invalid syntax for the sort parameter: :syntaxErrorHelper.",
"errorCode": "invalid_sort",
"errorType": "invalid_request_error",
"errorLink": "https://docs.meilisearch.com/errors#invalid_sort"
}
- `:syntaxErrorHelper` is inferred when the message is generated.
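The three error cases can be sketched as a single check that returns an `invalid_sort` error body or `None`. The function names, the order of the checks, and the wording substituted for `:syntaxErrorHelper` are assumptions made for illustration; only the error fields and the message templates come from the specification.

```python
from typing import Dict, List, Optional

ERROR_LINK = "https://docs.meilisearch.com/errors#invalid_sort"


def invalid_sort(message: str) -> Dict[str, str]:
    """Build the error body shared by the three cases."""
    return {
        "message": message,
        "errorCode": "invalid_sort",
        "errorType": "invalid_request_error",
        "errorLink": ERROR_LINK,
    }


def check_sort(
    sort: List[str], ranking_rules: List[str], sortable_attributes: List[str]
) -> Optional[Dict[str, str]]:
    """Return an invalid_sort error body, or None when the request is valid."""
    if "sort" not in ranking_rules:
        return invalid_sort(
            "You must specify where `sort` is listed in the rankingRules "
            "setting to use the sort parameter at search time."
        )
    for criterion in sort:
        attribute, separator, direction = criterion.rpartition(":")
        if not separator or not attribute or direction not in ("asc", "desc"):
            # The helper text after the colon is hypothetical.
            return invalid_sort(
                "Invalid syntax for the sort parameter: expected "
                f"`attribute:asc` or `attribute:desc`, found `{criterion}`."
            )
        if attribute not in sortable_attributes:
            available = ", ".join(sortable_attributes)
            return invalid_sort(
                f"Attribute {attribute} is not sortable, "
                f"available sortable attributes are: {available}"
            )
    return None
```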
We want to align the way custom ordering rules are written with the syntax of the sort search parameter.
Current custom ranking rule definition syntax
[
"asc(title)"
]
becomes
[
"title:asc"
]
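For existing settings, the syntax change amounts to a mechanical rewrite. The following is a minimal migration sketch with a hypothetical helper name; it leaves built-in rules and already migrated rules untouched.

```python
import re

# Matches the former custom ranking rule syntax, e.g. asc(title) or desc(price).
_LEGACY_RULE = re.compile(r"^(asc|desc)\((.+)\)$")


def migrate_custom_rule(rule: str) -> str:
    """Rewrite asc(attribute) / desc(attribute) as attribute:asc / attribute:desc."""
    match = _LEGACY_RULE.match(rule)
    if match is None:
        # Built-in rules ("typo", "words", ...) and new-style rules pass through.
        return rule
    direction, attribute = match.groups()
    return f"{attribute}:{direction}"
```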
Search example with sort
With this set of documents
[
{
"id": 1,
"label": "Vans Classic II sweatshirt in black",
"price": 52.00,
"colors": ["black"],
"sizes": ["xs", "s", "m", "m", "xl"],
"reviews_rating": 4.5
},
{
"id": 2,
"label": "The North Face Drew Peak hoodie in green",
"price": 36.00,
"colors": ["green"],
"reviews_rating": 4.89
},
{
"id": 3,
"label": "Nike Club hoodie in navy",
"price": 52.00,
"colors": ["navy"],
"reviews_rating": 4.7
}
]
With this ranking rules definition
[
"sort",
"typo",
"words",
"proximity",
"attribute",
"exactness"
]
And with this sortableAttributes definition
[
"price",
"reviews_rating"
]
POST Search
Request Body
{
...,
"sort": [
"price:asc",
"reviews_rating:desc"
]
}
200 - Response
{
"hits": [
{
"id": 2,
"label": "The North Face Drew Peak hoodie in green",
"price": 36.00,
"colors": ["green"],
"reviews_rating": 4.89
},
{
"id": 3,
"label": "Nike Club hoodie in navy",
"price": 52.00,
"colors": ["navy"],
"reviews_rating": 4.7
},
{
"id": 1,
"label": "Vans Classic II sweatshirt in black",
"price": 52.00,
"colors": ["black"],
"sizes": ["xs", "s", "m", "m", "xl"],
"reviews_rating": 4.5
}
]
}
As the response shows, sort is the most important criterion here because it comes first in the ranking rules. The sort search parameter also handles several attributes: priorities are determined from left to right, so price takes precedence, and documents that share the same price value are then sorted by reviews_rating, and so on.
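The left-to-right priority can be reproduced on the example documents with successive stable sorts, applying the least significant criterion first. This sketch assumes every document carries each sorted attribute with a numeric value, and that `sort` is the first ranking rule so no relevancy rule interferes; `apply_sort` is a hypothetical name and the documents are trimmed to the fields that matter.

```python
from typing import Any, Dict, List

documents = [
    {"id": 1, "price": 52.00, "reviews_rating": 4.5},
    {"id": 2, "price": 36.00, "reviews_rating": 4.89},
    {"id": 3, "price": 52.00, "reviews_rating": 4.7},
]


def apply_sort(docs: List[Dict[str, Any]], sort: List[str]) -> List[Dict[str, Any]]:
    """Order documents by the criteria, the leftmost one being the most important."""
    result = list(docs)
    # Sort by the last criterion first: Python's sort is stable, so each
    # earlier pass survives as the tie-breaker of the following one.
    for criterion in reversed(sort):
        attribute, _, direction = criterion.rpartition(":")
        result = sorted(
            result,
            key=lambda doc: doc[attribute],
            reverse=(direction == "desc"),
        )
    return result


ordered = apply_sort(documents, ["price:asc", "reviews_rating:desc"])
print([doc["id"] for doc in ordered])  # [2, 3, 1]
```

Documents 3 and 1 share the price 52.00, so their relative order comes from reviews_rating in descending order, which matches the response above.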
# As a Developer, I want to change the position of the sort ranking rule so that I can tweak the behavior of the sort ranking rule between exhaustivity and relevancy.
GET settings/ranking-rules /indexes/{indexUid}/settings/ranking-rules
200 - Response body (Default case)
[
"words",
"typo",
"proximity",
"attribute",
"sort",
"exactness"
]
POST settings/ranking-rules /indexes/{indexUid}/settings/ranking-rules
Request body
[
"sort",
"words",
"typo",
"proximity",
"attribute",
"exactness"
]
202 Accepted - Response body
{
"updateId": 7
}
- ℹ️ The position of the `sort` ranking rule is significant. The higher it is, the more weight it carries. The lower it is, the less important it is, and other rules take priority. This allows the developer to adjust the behavior between exhaustivity and relevancy with the sort criterion. If the `sort` ranking rule is in the last position, you will have a very relevant sort; that is, results will emphasize relevancy over your sort criteria. Likewise, if `sort` is in the first position, you will have a very exhaustive search that gives precedence to results less relevant to the query terms but more in line with your sort criteria.
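The effect of the position can be pictured with a deliberately simplified model in which all other ranking rules collapse into a single hypothetical `relevancy` score (higher is better) and the sort criterion is `price:asc`. Neither the score nor the `rank` function exists in the engine; this is only meant to show how the most important rule decides first and the next one breaks ties.

```python
from typing import Any, Dict, List


def rank(docs: List[Dict[str, Any]], sort_first: bool) -> List[Dict[str, Any]]:
    """Order documents with `sort` either before or after the relevancy rules."""
    if sort_first:
        # Exhaustive: price decides, relevancy only breaks ties.
        return sorted(docs, key=lambda d: (d["price"], -d["relevancy"]))
    # Relevant: relevancy decides, price only breaks ties.
    return sorted(docs, key=lambda d: (-d["relevancy"], d["price"]))


docs = [
    {"id": "a", "price": 10, "relevancy": 1},
    {"id": "b", "price": 30, "relevancy": 3},
    {"id": "c", "price": 20, "relevancy": 3},
]
print([d["id"] for d in rank(docs, sort_first=True)])   # ['a', 'c', 'b']
print([d["id"] for d in rank(docs, sort_first=False)])  # ['c', 'b', 'a']
```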
Real condition explanation
e.g. Relevant Sort
I want a relevant sort (sort on relevant items). I put sort in the last position of my ranking rules.
[
"typo",
"words",
"proximity",
"attribute",
"exactness",
"sort"
]
At search time, I send the following request (price and reviews_rating have previously been set as sortable-attributes):
{
"q": "Mac Book",
"sort": [
"price:asc",
"reviews_rating:desc"
]
}
Conceptually, the ranking rules for this search request can be represented as follows.
[
"typo",
"words",
"proximity",
"attribute",
"exactness",
"price:asc", // First part of the `sort` ranking rule.
"reviews_rating:desc" // Second part of the `sort` ranking rule.
]
Note that changing the order of the values in the sort search parameter changes the order of the inner elements of the sort ranking rule.
{
"q": "Mac Book",
"sort": [
"reviews_rating:desc",
"price:asc"
]
}
[
"typo",
"words",
"proximity",
"attribute",
"exactness",
"reviews_rating:desc", // The inner elements of the sort ranking rule follow the order of the `sort` query/request parameter.
"price:asc"
]
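The virtual representation used in both examples amounts to replacing the `sort` entry with the criteria of the request, in request order. A minimal sketch, with `expand_ranking_rules` as a hypothetical name:

```python
from typing import List


def expand_ranking_rules(ranking_rules: List[str], sort: List[str]) -> List[str]:
    """Replace the `sort` ranking rule with the criteria sent at search time."""
    expanded: List[str] = []
    for rule in ranking_rules:
        if rule == "sort":
            # The inner order follows the `sort` parameter of the request.
            expanded.extend(sort)
        else:
            expanded.append(rule)
    return expanded
```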
# IV. Finalized Key Changes
- Add `sortableAttributes` parameter in settings schema.
- Add `sortable-attributes` API resource.
- Add the `sort` ranking rule after the `attribute` ranking rule by default to promote relevant sort.
- Add `sort` query/request parameter on the `/search` resource. Supports `string` and `number`. `string` values are sorted in lexicographical order.
- Change custom ranking rule format, e.g. `asc(price)` becomes `price:asc`.
- Add a new error `invalid_sort` similar to `invalid_filter`.
- The numeric type is preferred to the string type by the `sort` ranking rule. If an attribute has different types among the documents, those containing a numeric value will be placed before the documents containing a string value.
- A document not containing an attribute requested in the `sort` parameter will be placed last during the tie-breaking of the `sort` ranking rule.
# 2. Technical details
# I. Measuring
- Number of sorted attributes used in the `sort` search parameter to calculate an avg per server.
- `ranking-rules` setting definition to evaluate position customization.
- Number of `sortableAttributes` to calculate an avg per server.
# 3. Future Possibilities
- Support sort on nested fields
- Support computational functions avg, sum, min, max, median. For now, the workaround is to pre-compute values in the document before indexing and sort them at search time with a `sort` on the pre-computed field, e.g. sort="precomputed_avg_price:desc".
- Ability to set a custom ranking rules order at search time. Out of scope for `sort`.
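The pre-computation workaround mentioned above could look like the following step, run on each document before indexing. This is a sketch: the `prices` field and the helper name are hypothetical, while `precomputed_avg_price` is the field name used in the example above and would also need to be listed in `sortableAttributes`.

```python
from statistics import mean
from typing import Any, Dict


def add_precomputed_avg_price(document: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the document enriched with its average price."""
    enriched = dict(document)
    prices = document.get("prices", [])
    # Documents without prices keep no pre-computed field, so the sort
    # ranking rule places them last.
    if prices:
        enriched["precomputed_avg_price"] = mean(prices)
    return enriched
```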