However, generating the sparse vectors via Splade is taking a very long time, and I think it is not utilizing the GPU. Has anyone tried to use a GPU for this?
Unfortunately, the BM25 encoder currently has no optimizations. The resulting JSON is a simple DF (document frequency) count, i.e. a dictionary mapping each token to the number of documents it appears in, so theoretically you can parallelize multiple fit calls on distinct shards of your training data and then simply merge the outputs.
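To make the shard-and-merge idea concrete, here is a rough sketch in plain Python. It does not use the encoder's own API: `fit_shard`, `merge_stats` and `fit_parallel` are hypothetical helper names, the whitespace tokenizer is only a placeholder for whatever tokenization the encoder really applies, and the `n_docs` / `total_tokens` fields are my assumption that the document count and average document length also have to be combined for BM25. The point is only that per-shard DF counts add up to the same result as a single pass.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor


def fit_shard(docs):
    """Collect DF statistics for one shard (hypothetical helper, not the library API)."""
    df = Counter()
    total_tokens = 0
    for doc in docs:
        tokens = doc.lower().split()  # placeholder tokenizer (assumption)
        total_tokens += len(tokens)
        df.update(set(tokens))  # each document counts once per distinct token
    return {"n_docs": len(docs), "total_tokens": total_tokens, "df": df}


def merge_stats(shard_stats):
    """Sum the per-shard statistics into one corpus-level result."""
    merged = {"n_docs": 0, "total_tokens": 0, "df": Counter()}
    for stats in shard_stats:
        merged["n_docs"] += stats["n_docs"]
        merged["total_tokens"] += stats["total_tokens"]
        merged["df"].update(stats["df"])  # Counter.update adds counts
    return merged


def fit_parallel(shards, workers=2):
    """Fit every shard in its own process, then merge the outputs."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return merge_stats(pool.map(fit_shard, shards))


if __name__ == "__main__":
    shards = [["the cat sat", "the dog"], ["a cat", "the the end"]]
    stats = fit_parallel(shards)
    print(stats["n_docs"], dict(stats["df"]))
```

This only works if every document lands in exactly one shard; otherwise the DF counts are inflated. The merged numbers would still have to be written back in whatever JSON layout the encoder expects, which I have not shown here.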