Clarification on Rerankers in Cascading Retrieval

crispin.courtenay · December 3, 2024, 4:40pm

Reading through the docs on rerankers vs. sparse and dense retrieval: does each retrieval method require its own reranker, with the outputs then combined and passed to a third reranker that outputs the most relevant result(s) to the LLM?

Just trying to mentally map this out so I can apply it.

gdj0nes · December 3, 2024, 4:49pm

Thanks for the question! The typical implementation is a single “2-stage” setup, where the sparse and dense results are combined first and you only rerank once over the combined results. However, in many advanced use cases there may be many more levels of reranking. Typically you would run cheaper/faster rerankers earlier in the cascade.

A 3-stage setup could improve results, though a faster reranker may be required to keep response latency low enough.
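
To help map this out, here is a minimal sketch of the flow in plain Python. Everything in it is a stand-in I made up for illustration, not a real retrieval or reranking API: `keyword_overlap` plays the role of a cheap sparse scorer, `trigram_jaccard` stands in for a dense retriever or a heavier reranker model, and `CORPUS`, `retrieve`, `merge`, and `cascade` are hypothetical helpers. The point is only the shape: retrieve with each method, combine, then rerank the combined set in one or more stages, cheapest first.

```python
from typing import Callable, List, Set, Tuple

Scorer = Callable[[str, str], float]

# Toy corpus, purely illustrative.
CORPUS: List[str] = [
    "rerankers reorder candidate documents by relevance",
    "sparse retrieval matches exact keywords",
    "dense retrieval uses embedding similarity",
    "bananas are rich in potassium",
]

def keyword_overlap(query: str, doc: str) -> float:
    """Cheap scorer: number of words shared by query and document."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))

def trigrams(text: str) -> Set[str]:
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}

def trigram_jaccard(query: str, doc: str) -> float:
    """Stand-in for a costlier model: Jaccard similarity of character trigrams."""
    a, b = trigrams(query), trigrams(doc)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def retrieve(query: str, scorer: Scorer, k: int) -> List[str]:
    """First-stage retrieval: top-k documents from the corpus under one scorer."""
    return sorted(CORPUS, key=lambda d: scorer(query, d), reverse=True)[:k]

def merge(*result_lists: List[str]) -> List[str]:
    """Combine result lists, dropping duplicates and keeping first-seen order."""
    seen: Set[str] = set()
    merged: List[str] = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged

def cascade(query: str, candidates: List[str],
            stages: List[Tuple[Scorer, int]]) -> List[str]:
    """Apply rerank stages in order; each stage keeps only its top `keep` docs."""
    for scorer, keep in stages:
        candidates = sorted(candidates, key=lambda d: scorer(query, d),
                            reverse=True)[:keep]
    return candidates

query = "sparse and dense retrieval"
candidates = merge(
    retrieve(query, keyword_overlap, k=3),   # "sparse" results
    retrieve(query, trigram_jaccard, k=3),   # "dense" results (stand-in)
)

# 2-stage: retrieve, then a single rerank over the combined results.
two_stage = cascade(query, candidates, [(trigram_jaccard, 2)])

# 3-stage: a cheap reranker trims the list before the costlier one runs.
three_stage = cascade(query, candidates,
                      [(keyword_overlap, 3), (trigram_jaccard, 2)])

print(two_stage)
print(three_stage)
```

Note that neither retriever gets its own reranker here. In the 3-stage variant the costlier scorer only sees what the cheap stage lets through, which is where the latency saving would come from.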