optimum_combination

Find Optimum Combination

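To test every available combination and find the best parameters, use the optimum_combination function. You can supply different embedding models, document chunk sizes, language models, document search methods, and numbers of most relevant documents (topK); the function tests every combination against the given question set and documents to find the best one. Note that the best score combination is the one with the highest correct rate, while the best cost-effective combination is the one that needs the fewest tokens to obtain a correct answer.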
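To give a sense of how quickly the search space grows, the following is a minimal sketch of enumerating every combination as a Cartesian product. It is not akasha's internal implementation: the helper `enumerate_combinations` and the placeholder values such as `"emb-a"` and `"model-x"` are hypothetical and exist only for illustration.

```python
from itertools import product


def enumerate_combinations(embeddings, chunk_sizes, models, search_types, top_ks):
    """Return every (embeddings, chunk_size, model, search_type, topK) tuple.

    Hypothetical helper for illustration; not part of akasha.
    """
    return list(product(embeddings, chunk_sizes, models, search_types, top_ks))


# Placeholder values, chosen only to show how the grid grows.
combos = enumerate_combinations(
    embeddings=["emb-a", "emb-b"],
    chunk_sizes=[200, 400, 600],
    models=["model-x"],
    search_types=["merge", "tfidf"],
    top_ks=[2],
)

# 2 embeddings * 3 chunk sizes * 1 model * 2 search types * 1 topK = 12 runs
print(len(combos))
```

Because the number of runs is the product of the list lengths, and each run evaluates the whole question set, adding one more option to any list can noticeably increase running time and token usage.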

Example

import os

import akasha.eval as eval  # note: this alias shadows Python's built-in eval()
from dotenv import load_dotenv

# Load variables from a .env file if present; the assignments below override them.
load_dotenv()
os.environ["OPENAI_API_KEY"] = "your openAI key"
os.environ["HF_TOKEN"] = "your huggingface key"

dir_path = "doc/pvc/"  # directory containing the documents
exp_name = "exp_akasha_optimum_combination"  # passed as record_exp below
embeddings_list = ["hf:shibing624/text2vec-base-chinese", "openai:text-embedding-ada-002"]
model_list = [
    "openai:gpt-3.5-turbo",
    "hf:FlagAlpha/Llama2-Chinese-13b-Chat-4bit",
    "hf:meta-llama/Llama-2-7b-chat-hf",
    "llama-gpu:model/llama-2-7b-chat.Q5_K_S.gguf",
    "llama-gpu:model/llama-2-13b-chat.Q5_K_S.gguf",
]

# Evaluate every combination against a single-choice question set.
eva = eval.Model_Eval(question_style="single_choice")
eva.optimum_combination(
    "question_pvc.txt",
    dir_path,
    embeddings_list=embeddings_list,
    model_list=model_list,
    chunk_size_list=[200, 400, 600],
    search_type_list=["merge", "tfidf"],
    record_exp=exp_name,
)
Best correct rate:  1.000
Best score combination:

embeddings: openai:text-embedding-ada-002, chunk size: 400, model: openai:gpt-3.5-turbo, search type: merge
embeddings: openai:text-embedding-ada-002, chunk size: 400, model: openai:gpt-3.5-turbo, search type: tfidf

Best cost-effective:

embeddings: hf:shibing624/text2vec-base-chinese, chunk size: 400, model: openai:gpt-3.5-turbo, search type: tfidf
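
The output above lists more than one best score combination but a single best cost-effective one. One way to read this is that several runs can tie on correct rate, while cost-effectiveness is ranked separately by tokens spent per correct answer. The sketch below illustrates that selection logic under those assumptions. It is not akasha's implementation: `RunResult`, `best_score`, `best_cost_effective`, and all numbers are hypothetical and not measured results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Outcome of one combination (hypothetical record, not an akasha type)."""
    embeddings: str
    chunk_size: int
    model: str
    search_type: str
    correct: int   # questions answered correctly
    total: int     # questions asked
    tokens: int    # tokens consumed over the whole question set

    @property
    def correct_rate(self) -> float:
        return self.correct / self.total

    @property
    def tokens_per_correct(self) -> float:
        # A run with no correct answers is treated as infinitely expensive.
        return self.tokens / self.correct if self.correct else float("inf")


def best_score(results):
    """Return the top correct rate and every run that reaches it (ties kept)."""
    top = max(r.correct_rate for r in results)
    return top, [r for r in results if r.correct_rate == top]


def best_cost_effective(results):
    """Return the run that spends the fewest tokens per correct answer."""
    return min(results, key=lambda r: r.tokens_per_correct)


# Made-up numbers for illustration only.
results = [
    RunResult("emb-a", 400, "model-x", "merge", correct=10, total=10, tokens=9000),
    RunResult("emb-a", 400, "model-x", "tfidf", correct=10, total=10, tokens=8000),
    RunResult("emb-b", 400, "model-x", "tfidf", correct=9, total=10, tokens=6300),
]

top, winners = best_score(results)
cheapest = best_cost_effective(results)
print(f"Best correct rate: {top:.3f}")
print("Best score combinations:", [(w.embeddings, w.search_type) for w in winners])
print("Best cost-effective:", (cheapest.embeddings, cheapest.search_type))
```

In this made-up data the cheapest run per correct answer is not among the highest-scoring runs, which mirrors how the two rankings in the output can name different combinations.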