How do I handle different content types?

My scenario involves a website hosting 100 online courses, each with various types of content such as transcripts, mp4s, titles, course descriptions, and modules (including descriptions, audio, and video).

I’m wondering if it’s more advisable to create a separate index for each content type or if it’s better to store all data in a single index to allow for cross-type queries. Any insights on how to approach this decision would be greatly appreciated.

The end goal is to create a recommendation system, search, chatbot, etc.

Thank you.

Hi @richard1, thanks for the post. The decision to partition the data for each online course depends on whether you can or want to allow queries to search across multiple courses. If it would be unacceptable for a query about one course to yield information from a different course, then you should partition your data into namespaces.

To learn more about multitenancy, check out our recently released blog post, Multi-Tenancy in Vector Databases.

You should not create a separate index for each course, as this requires significantly more resource management and infrastructure design, and it costs more.

With namespaces, queries and other operations are limited to one namespace, so different requests can search different subsets of your index. Additionally, you can easily create or delete namespaces with simple data plane operations.
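To make the namespace idea concrete, here is a minimal sketch in plain Python. It is not the real client API: `ToyIndex`, its methods, the namespace names like `course-001`, and the three-dimensional vectors are all hypothetical stand-ins, and the `content_type` field is only an assumed way of labeling records. It just illustrates one index holding every course, with each query scoped to a single namespace.

```python
import math
from collections import defaultdict


def _cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyIndex:
    """Hypothetical in-memory stand-in for one index with namespaces."""

    def __init__(self):
        # namespace -> {record_id: (vector, metadata)}
        self._namespaces = defaultdict(dict)

    def upsert(self, namespace, record_id, vector, metadata):
        # Writing to a new namespace creates it implicitly.
        self._namespaces[namespace][record_id] = (vector, metadata)

    def delete_namespace(self, namespace):
        self._namespaces.pop(namespace, None)

    def query(self, namespace, vector, top_k=3):
        # Only records in the requested namespace are scored.
        records = self._namespaces.get(namespace, {})
        scored = [
            (_cosine(vector, vec), record_id, meta)
            for record_id, (vec, meta) in records.items()
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


index = ToyIndex()
# Toy vectors; in practice these would come from an embedding model.
index.upsert("course-001", "transcript-1", [0.9, 0.1, 0.0], {"content_type": "transcript"})
index.upsert("course-001", "description", [0.1, 0.9, 0.0], {"content_type": "course_description"})
index.upsert("course-002", "transcript-1", [0.9, 0.1, 0.0], {"content_type": "transcript"})

# The query sees course-001 only, even though course-002 holds an identical vector.
hits = index.query("course-001", [1.0, 0.0, 0.0], top_k=2)
hit_ids = [record_id for _, record_id, _ in hits]
```

In this sketch all content types for a course live in the same namespace, so a single query can return a transcript and a course description together, while nothing from another course can leak into the results.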

How are you planning on embedding the audio and video (mp4) content? I'm assuming the narrative accompanies a presentation shown in the video.