I have a dataset that contains application error messages for various unique events.
Example
-
event: search horror
-
error: [standard multi line application error/exception message]
-
event: search action
-
error: [standard multi line application error/exception message]
I need to look at the error messages, group them by similarity, and return the events in each group. For example:
error: blah foo bar
event: search action, search horror, search foo (3 total)
error: blah blah blah
event: search foo (1 total)
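To make the desired output concrete, here is a minimal sketch of the grouping I have in mind. It does not use a vector DB or Pinecone at all: as a stand-in similarity measure it uses `difflib.SequenceMatcher` from the Python standard library. The function name `group_events_by_error`, the `threshold` value of 0.8, and the sample records are all hypothetical, chosen only for illustration.

```python
from difflib import SequenceMatcher


def group_events_by_error(records, threshold=0.8):
    """Greedily group (event, error) pairs by error-text similarity.

    Each record joins the first existing group whose representative error
    has a similarity ratio >= threshold; otherwise it starts a new group.
    The threshold of 0.8 is an arbitrary assumption, not a tuned value.
    """
    groups = []  # each group: {"error": representative text, "events": [...]}
    for event, error in records:
        for group in groups:
            ratio = SequenceMatcher(None, group["error"], error).ratio()
            if ratio >= threshold:
                if event not in group["events"]:
                    group["events"].append(event)
                break
        else:
            # No sufficiently similar group was found, so start a new one.
            groups.append({"error": error, "events": [event]})
    return groups


# Hypothetical sample data mirroring the example above.
records = [
    ("search action", "blah foo bar"),
    ("search horror", "blah foo bar"),
    ("search foo", "blah foo bar"),
    ("search foo", "blah blah blah"),
]

for g in group_events_by_error(records):
    print(f"error: {g['error']}")
    print(f"event: {', '.join(g['events'])} ({len(g['events'])} total)")
```

This compares every new error against every group representative, so I assume it will not scale to a large dataset, which is part of why I am asking about a vector DB.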
Questions
- Is a vector DB the right choice for storing such a dataset so that I can group the errors by similarity?
- If yes, how would I do this in Pinecone?