Working with the canopy-cli, linux via command line.
I created canopy index using a config file, and the config file was created via command line as instructed by Pinecone. In the config file, I set explicitly to use “gpt-3.5-turbo” for tokenizer and LLM. Furthermore, I restricted (limit) in OpenAI API to allow the project to only use “gpt-3.5-turbo”.
The “canopy new --config …”, “canopy upsert --config …” and “canopy start --config …” worked well or at least we can say that it run (to be more precise).
However, “canopy chat --no-rag” or “canopy chat” seem to point to another LLM in OpenAI, “gpt-3.5-turbo-0125”. Therefore, when I pass a query I received an exception the the following text (snipet),
… project xxxx has no permission to use gpt-3.5-turbo-0125 …
The “canopy chat” has no argument to point to a config file. And the questions are:
- am I doing something wrong?
- is there a way to force “canopy chat” to use a config file?
Thank you for your support.
Hi @alejandro, and welcome to the Pinecone forums!
Thank you for your question and sorry to hear you’re running into this.
If I’m understanding your question correctly, most canopy commands work when passing the --config flag, but the chat command does not appear to read or honor the config file?
A couple of clarifying questions so we can better help:
-
Are you following the Canopy docs in the repository? canopy/docs at main · pinecone-io/canopy · GitHub
-
Are you able to share your configuration file (After removing any secrets like your API keys)?
In the meantime, I’m going to pass your question back to the Canopy team for visibility. You may also want to consider filing your issue against the GitHub repo so the team will see it faster.
Best,
Zack
Hello @ZacharyProser , and thank you for your reply.
- Yes indeed, I’ve followed, only, the docs. I’ve cloned the pinecone-io/canopy and take it form there step by step. “canopy --version” → 0.9.0
A bit more context, I run the following commands:
- Setup API key in Pinecone
- Run
canopy create-config path_to_dir
(I’ve shared the config in point 2)
- Setup key in OpenAI, limited to use “gpt-3.5-turbo” and “text-embedding-3-small”.
- Edit the config file to use “gpt-3.5-turbo” and “text-embedding-3-small”
- Setup required env. variables
- Run
canopy new --config path_to_config_file
, it created the “canopy–test” index.
- Added “canopy-test” to env. variables.
- Run
canopy upsert path_to_jsonl_file
, it indexed all json lines.
- Run
canopy start --config path_to_config_file
, server started and I can access http://127.0.0.1:8000/docs
- Open another terminal window, and run
canopy chat --no-rag --config path_to_config
but I received and exception. No --config flag for canopy chat
- The I run
canopy chat --no-rag
, was prompt to enter query.
- Enter a query, and I received an exception and canopy chat exited. The exception message included"
"project xxxx has no permission to use “project xxxx has no permission to use gpt-3.5-turbo-0125”
Now the interesting part is, the query result “context” part was printed, but the “no context” part was the one that could not be executed and returned the exception.
Follow the exception, I added “gpt-3.5-turbo-0125” in OpenAI API.
Then I went back, to run canopy chat --no-rag
, executed the query and all worked.
- Here is my config file:
# ===========================================================
# Configuration file for Canopy Server
# ===========================================================
# ---------------------------------------------------------------------------------
# LLM prompts
# Defined here for convenience, then referenced in the chat engine configuration
# Note: line breaks in the prompts are important, and may affect the LLM's behavior
# ---------------------------------------------------------------------------------
system_prompt: &system_prompt |
Use the following pieces of context to answer the user question at the next messages. This context retrieved from a knowledge database and you should use only the facts from the context to answer. Always remember to include the source to the documents you used from their 'source' field in the format 'Source: $SOURCE_HERE'.
If you don't know the answer, just say that you don't know, don't try to make up an answer, use the context.
Don't address the context directly, but use it to answer the user question like it's your own knowledge.
query_builder_prompt: &query_builder_prompt |
Your task is to formulate search queries for a search engine, to assist in responding to the user's question.
You should break down complex questions into sub-queries if needed.
tokenizer:
# -------------------------------------------------------------------------------------------
# Tokenizer configuration
# A Tokenizer singleton instance must be initialized before initializing any other components
# -------------------------------------------------------------------------------------------
type: OpenAITokenizer # Options: [OpenAITokenizer]
params:
model_name: gpt-3.5-turbo
chat_engine:
# -------------------------------------------------------------------------------------------------------------
# Chat engine configuration
# The chat engine is the main component of the server, generating responses to the `/chat.completion` API call.
# The configuration is recursive, so that each component contains a subsection for each of its sub components.
# -------------------------------------------------------------------------------------------------------------
params:
max_prompt_tokens: 4096 # The maximum number of tokens to use for input prompt to the LLM
max_generated_tokens: 600 # Leaving `null` will use the default of the underlying LLM
max_context_tokens: null # Leaving `null` will use 70% of `max_prompt_tokens`
system_prompt: *system_prompt # The chat engine's system prompt for calling the LLM
allow_model_params_override: false # Whether to allow overriding the LLM's parameters in an API call
history_pruner: # How to prune messages if chat history is too long. Options: [RecentHistoryPruner, RaisingHistoryPruner]
type: RecentHistoryPruner
params:
min_history_messages: 1 # Minimal number of messages to keep in history
llm: &llm
# -------------------------------------------------------------------------------------------------------------
# LLM configuration
# Configuration of the LLM (Large Language Model)
# -------------------------------------------------------------------------------------------------------------
type: OpenAILLM # Options: [OpenAILLM, AzureOpenAILLM]
params:
model_name: gpt-3.5-turbo # The name of the model to use.
# You can add any additional parameters which are supported by the LLM's `ChatCompletion()` API. The values
# set here will be used in every LLM API call, but may be overridden if `allow_model_params_override` is true.
# temperature: 0.7
# top_p: 0.9
query_builder:
# -------------------------------------------------------------------------------------------------------------
# LLM configuration
# Configuration of the LLM (Large Language Model)
# -------------------------------------------------------------------------------------------------------------
type: FunctionCallingQueryGenerator # Options: [FunctionCallingQueryGenerator, LastMessageQueryGenerator, InstructionQueryGenerator]
params:
prompt: *query_builder_prompt # The query builder's system prompt for calling the LLM
function_description: # A function description passed to the LLM's `function_calling` API
Query search engine for relevant information
llm: # The LLM that the query builder will use to generate queries. Leave `*llm` to use the chat engine's LLM
<<: *llm
context_engine:
# -------------------------------------------------------------------------------------------------------------
# ContextEngine configuration
# The context engine is responsible for generating textual context for the `/query` API calls.
# -------------------------------------------------------------------------------------------------------------
params:
global_metadata_filter: null # An optional metadata filter to apply to all queries
context_builder:
# -------------------------------------------------------------------------
# Configuration for the ContextBuilder subcomponent of the context engine.
# The context builder is responsible for formulating a textual context given query results.
# -------------------------------------------------------------------------
type: StuffingContextBuilder # Options: [StuffingContextBuilder]
knowledge_base:
# -----------------------------------------------------------------------------------------------------------
# KnowledgeBase configuration
# The KnowledgeBase is a responsible for storing and indexing the user's documents
# -----------------------------------------------------------------------------------------------------------
params:
default_top_k: 5 # The default number of document chunks to retrieve for each query
chunker:
# --------------------------------------------------------------------------
# Configuration for the Chunker subcomponent of the knowledge base.
# The chunker is responsible for splitting documents' text into smaller chunks.
# --------------------------------------------------------------------------
type: MarkdownChunker # Options: [MarkdownChunker, RecursiveCharacterChunker]
params:
chunk_size: 256 # The maximum number of tokens in each chunk
chunk_overlap: 0 # The number of tokens to overlap between chunks
keep_separator: true # Whether to keep the separator in the chunks
record_encoder:
# --------------------------------------------------------------------------
# Configuration for the RecordEncoder subcomponent of the knowledge base.
# The record encoder is responsible for encoding document chunks to a vector representation
# --------------------------------------------------------------------------
type: OpenAIRecordEncoder # Options: [OpenAIRecordEncoder, AzureOpenAIRecordEncoder]
params:
model_name: # The name of the model to use for encoding
text-embedding-3-small
batch_size: 400 # The number of document chunks to encode in each call to the encoding model
create_index_params:
# -------------------------------------------------------------------------------------------
# Initialization parameters to be passed to create a canopy index. These parameters will
# be used when running "canopy new".
# -------------------------------------------------------------------------------------------
metric: cosine
spec:
serverless:
cloud: aws
region: us-east-1
# For pod indexes you can pass the spec with the key "pod" instead of "serverless"
# See the example below:
# pod:
# environment: eu-west1-gcp
# pod_type: p1.x1
# # Here you can specify here replicas, shards, pods, metadata_config if needed.
Hope this helps.