Sparse Vector generation using node package wink-nlp?

Dear Pinecone community,

I’m very new to this, and I’m trying to use sparse vectors with Node.js. I’m having trouble generating them, as there’s no documentation on how to do it in JS.

There’s no Node.js version of pinecone-text, but I found that wink-nlp can do BM25 vectorization:

BM25 Vectorizer : winkNLP - NLP in Node.js

How can I use it to generate a sparse vector’s indices and values so I can upsert it to Pinecone?

Using this ecommerce-search example as a guide, I tried the following:

// Require wink-nlp, model and its helper.
const model = require('wink-eng-lite-web-model');
const nlp = require('wink-nlp')(model);
const its = nlp.its;

// Require the BM25 Vectorizer.
const BM25Vectorizer = require('wink-nlp/utilities/bm25-vectorizer');

// Instantiate a vectorizer with the default configuration — no input config
// parameter indicates use default.
const bm25 = BM25Vectorizer();

// Sample corpus to train - fit tf-idf values on my corpus
const corpus = [
  'Turtle Check Men Navy Blue Shirt',
  'Peter England Men Party Blue Jeans',
  'Titan Women Silver Watch',
  'Manchester United Men Solid Black Track Pants',
  'Puma Men Grey T-shirt',
];

// Train the vectorizer on each document, using its tokens.
// The tokens are extracted using the .out() API of winkNLP.
corpus.forEach((doc) => {
  const tokens = nlp.readDoc(doc).tokens();
  // Learn the BM25 token weights from this document's tokens.
  bm25.learn(tokens.out(its.normal));
});

const v = bm25.vectorOf(
  nlp.readDoc('Turtle Check Men Navy Blue Shirt').tokens().out(its.normal)
);
console.log(v);

I believe this output contains the sparse vector’s values, but how do I get the corresponding indices?
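In case it clarifies what I’m after: my guess is that the indices are just the positions of the non-zero entries in the dense array that `vectorOf` returns, so I sketched a small helper. `toSparse` is my own hypothetical function, and I’m assuming `vectorOf` returns one slot per term in the learned vocabulary — please correct me if that’s wrong:

```javascript
// Convert a dense vector into Pinecone's sparse { indices, values }
// format by keeping only the non-zero entries. Assumes the input is a
// plain array of numbers, one slot per term in the learned vocabulary.
function toSparse(denseVector) {
  const indices = [];
  const values = [];
  denseVector.forEach((value, index) => {
    if (value !== 0) {
      indices.push(index);
      values.push(value);
    }
  });
  return { indices, values };
}

// Quick check with a made-up dense vector:
console.log(toSparse([0, 0.52, 0, 1.21, 0]));
// → indices [1, 3], values [0.52, 1.21]
```

Then I’d presumably pass `toSparse(v)` as `sparseValues` in the Pinecone upsert, alongside a dense vector, but I’m not sure that’s the right approach.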

Thank you!
