I've been reading through the docs on rerankers vs. sparse and dense retrieval. Does each retrieval method need its own reranker, with the outputs then combined and passed to a third reranker that sends the most relevant result(s) to the LLM?
Just trying to mentally map this out so I can apply it.
Thanks for the question! The typical implementation is a single “2-stage” setup, where you rerank only once, over the combined results. However, in many advanced use cases there may be many more levels of reranking. Typically you would run cheaper/faster rerankers earlier in the cascade.
A 3-stage setup could improve results, though a faster reranker may be required to keep response latency low enough.
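To make the mental map concrete, here is a minimal sketch of that flow in Python. It is only an illustration under stated assumptions: the function names (`merge_candidates`, `rerank`, `cascade`) and both scoring functions are hypothetical toy stand-ins, not any real retriever or reranker API. In practice the sparse and dense hit lists would come from your own retrievers and the scorers would be real reranking models.

```python
from typing import Callable, List, Tuple

# A scorer takes (query, document) and returns a relevance score.
Scorer = Callable[[str, str], float]


def merge_candidates(*result_lists: List[str]) -> List[str]:
    """Union the retrievers' results, dropping duplicates, keeping first-seen order."""
    seen = set()
    merged = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


def rerank(query: str, docs: List[str], scorer: Scorer, top_k: int) -> List[str]:
    """Sort docs by score (highest first) and keep the top_k."""
    ranked = sorted(docs, key=lambda d: scorer(query, d), reverse=True)
    return ranked[:top_k]


def cascade(query: str, docs: List[str], stages: List[Tuple[Scorer, int]]) -> List[str]:
    """Apply each (scorer, top_k) stage in order; each stage sees fewer documents."""
    for scorer, top_k in stages:
        docs = rerank(query, docs, scorer, top_k)
    return docs


# Toy scorers (assumptions for illustration only).
def cheap_score(query: str, doc: str) -> float:
    """Stand-in for a fast reranker: count of shared words."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def expensive_score(query: str, doc: str) -> float:
    """Stand-in for a slower, more accurate reranker: Jaccard similarity."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if (q | d) else 0.0


query = "fast reranker latency"
sparse_hits = ["reranker latency tips", "cooking pasta at home"]
dense_hits = ["reranker latency tips", "fast reranker latency benchmarks"]

# No reranker per retriever: combine first, then rerank.
candidates = merge_candidates(sparse_hits, dense_hits)

# "2-stage": retrieval, then one rerank over the combined results.
two_stage = cascade(query, candidates, [(expensive_score, 1)])

# "3-stage": a cheap reranker trims the pool before the expensive one runs.
three_stage = cascade(query, candidates, [(cheap_score, 2), (expensive_score, 1)])
```

The point of the extra stage is cost: the expensive scorer only sees what the cheap one lets through, so the earlier stage's `top_k` is the knob that trades recall against latency.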