Does the Import Data feature “vectorize” S3 files in Parquet format? In other words, can we build automated pipelines that simply drop files into S3 and have them added to the desired namespace within an index? A sketch of what we have in mind is below.
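To make the question concrete, here is a minimal sketch of the kind of pipeline step we're imagining, assuming the Pinecone Python SDK's bulk-import calls (`start_import` / `describe_import`) and a hypothetical bucket path; note our understanding is that bulk import expects pre-embedded vectors in the Parquet files, which is exactly what we'd like confirmed:

```python
# Hypothetical pipeline step: after Parquet files land in S3, kick off an
# import into the index. Index name and S3 prefix are placeholders.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("our-index")  # hypothetical index name

# Start an import of all Parquet files under this prefix.
# error_mode="CONTINUE" skips bad records instead of aborting the whole import.
import_op = index.start_import(
    uri="s3://our-bucket/exports/latest/",  # hypothetical path
    error_mode="CONTINUE",
)

# Poll the operation to see whether it completed.
print(index.describe_import(id=import_op.id).status)
```

If the import only ingests precomputed vectors, the open question is whether there's a managed way to have the embedding step happen on Pinecone's side as part of the same drop-to-S3 flow.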
We are scaling at the moment and need efficient ways of duplicating our indices, or of re-running the entire doc-processing pipeline (> 1 GB of docs) after updates to chunking/embedding. Currently we are unable to do so and have to rely on local jobs (GitHub Actions workflows). What are some options we can use without impacting our production indexes?
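For reference, the workaround we've considered is a blue/green copy: clone the production namespace into a staging index, run re-chunking/re-embedding experiments there, and cut over once validated. A rough sketch, assuming the Pinecone Python SDK's `list`/`fetch`/`upsert` calls and hypothetical index and namespace names:

```python
# Hedged sketch: copy vectors from a production index into a staging index so
# pipeline changes never touch production. Names below are placeholders.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
src = pc.Index("prod-index")      # hypothetical production index
dst = pc.Index("staging-index")   # hypothetical staging copy

namespace = "docs"  # hypothetical namespace

# index.list() yields pages of vector IDs; fetch each page and re-upsert it.
for id_page in src.list(namespace=namespace):
    fetched = src.fetch(ids=id_page, namespace=namespace)
    vectors = [
        {"id": v.id, "values": v.values, "metadata": v.metadata}
        for v in fetched.vectors.values()
    ]
    if vectors:
        dst.upsert(vectors=vectors, namespace=namespace)
```

This works but is slow and egress-heavy at our scale, which is why we're hoping there's a server-side duplication or bulk re-import option instead.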