Clarification on Rerankers in Cascading Retrieval

Thanks for the question! The typical implementation is a single “2-stage” setup where you rerank only once, over the combined results. However, in many advanced use cases there may be many more levels of reranking. Typically you would run the cheaper/faster rerankers earlier in the cascade.

A 3-stage cascade could improve results, though a faster reranker may be required to keep response latency low enough.
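To make the cascade idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and only for illustration: `cascade_rerank`, `cheap_overlap`, and `costly_phrase` are made-up names, and the two toy scorers stand in for whatever fast and slow rerankers you actually use. Each stage scores the surviving candidates and keeps only the top few, so the slower scorer sees a smaller list.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer maps (query, document) to a relevance score; higher is better.
Scorer = Callable[[str, str], float]


def cascade_rerank(
    query: str,
    candidates: Sequence[str],
    stages: Sequence[Tuple[Scorer, int]],
) -> List[str]:
    """Run each (scorer, keep) stage in order, truncating to `keep` docs."""
    docs = list(candidates)
    for scorer, keep in stages:
        docs = sorted(docs, key=lambda d: scorer(query, d), reverse=True)[:keep]
    return docs


def cheap_overlap(query: str, doc: str) -> float:
    # Hypothetical fast scorer: number of shared lowercase words.
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def costly_phrase(query: str, doc: str) -> float:
    # Hypothetical stand-in for a slower, more accurate model:
    # word overlap plus a bonus when the exact phrase appears.
    bonus = 2.0 if query.lower() in doc.lower() else 0.0
    return cheap_overlap(query, doc) + bonus


# Combined results from the first-stage retrievers (toy data).
combined = [
    "rerankers in cascading retrieval",
    "cascading style sheets basics",
    "fast rerankers for retrieval pipelines",
    "cooking pasta at home",
]

# Cheap scorer first (keep 3), then the costlier scorer (keep 1).
top = cascade_rerank(
    "cascading retrieval",
    combined,
    [(cheap_overlap, 3), (costly_phrase, 1)],
)
print(top)
```

Adding a third stage is just one more `(scorer, keep)` pair in the list. The trade-off is the one mentioned above: every extra stage adds latency, so the earlier stages need to be fast enough to leave room for it.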
