A Similarity search query retrieves an extensive list of potentially
relevant results from hundreds of millions of documents. Documents are
matched and ranked to provide the best possible results quickly.
However, some of the results might not be as relevant as expected.
To further improve the quality of results, you can apply a second
phase: a re-ranker based on a Cross-Encoder that scores the most
promising first-phase candidates again, using the same text provided in
the Similarity query.
The re-ranker assigns a new relevance value between 0 and 1 to each
text chunk and drops those with a relevance value below the specified
rerank_threshold.
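The thresholding step can be pictured with a minimal Python sketch. It is not the service's implementation: the `Chunk` class, the `apply_rerank_threshold` helper, and the sample relevance values are hypothetical, and the scores are hard-coded stand-ins for what a Cross-Encoder would produce. Keeping chunks exactly at the threshold is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    relevance: float  # hypothetical second-phase score in [0, 1]


def apply_rerank_threshold(chunks, rerank_threshold):
    """Drop chunks scoring below the threshold; return the rest, best first."""
    # Assumption: a chunk scoring exactly at the threshold is kept.
    kept = [c for c in chunks if c.relevance >= rerank_threshold]
    return sorted(kept, key=lambda c: c.relevance, reverse=True)


# Hard-coded scores stand in for real Cross-Encoder output.
candidates = [
    Chunk("Revenue guidance raised for next year", 0.91),
    Chunk("Company opens new office", 0.42),
    Chunk("Quarterly revenue beat expectations", 0.95),
]

results = apply_rerank_threshold(candidates, rerank_threshold=0.9)
for chunk in results:
    print(f"{chunk.relevance:.2f}  {chunk.text}")
```

In this sketch only two of the three first-phase candidates survive the second phase, which is why a high threshold can shrink the result list considerably.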
We recommend specifying a date_range and retrieving many documents or
chunks so that all the first-phase chunks pass through the re-ranker.
Only the chunks returned after the second phase count toward API query
unit usage.
The following example returns all the chunks from the previous week with
a relevance higher than 0.9: