Streaming Output

call_stream_model

Among the helper functions, if the LLM is an openai, huggingface, remote, gemini, or anthropic type model, you can use akasha.call_stream_model() to get streaming output.

import akasha

prompt = "say something."
# create the model object from the model name string
model_obj = akasha.handle_model("openai:gpt-3.5-turbo", False, 0.0)
# call_stream_model returns a generator that yields the answer piece by piece
streaming = akasha.call_stream_model(model_obj, prompt)

for s in streaming:
    print(s)
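
The same call works for the other supported model types; as a sketch, the openai model string can be swapped for another provider (the gemini model name below is an assumption following the same provider:model pattern):

import akasha

# hypothetical model string; only the provider:model pattern is taken from the docs
model_obj = akasha.handle_model("gemini:gemini-1.5-flash", False, 0.0)
for s in akasha.call_stream_model(model_obj, "say something."):
    print(s, end="")    # print chunks without newlines for continuous output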

Doc_QA stream

Every function of the Doc_QA class accepts the parameter stream=True to get streaming output.

import akasha

# stream=True makes Doc_QA functions return a generator instead of a full string
ak = akasha.Doc_QA(stream=True)

# query the documents in docs/mic and stream the answer
streaming = ak.get_response("docs/mic", "say something")

for s in streaming:
    print(s)
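
Because the return value is a generator, the chunks can also be collected while they are printed; a small sketch, assuming each yielded chunk is a plain text fragment:

import akasha

ak = akasha.Doc_QA(stream=True)
chunks = []
for s in ak.get_response("docs/mic", "say something"):
    print(s, end="")     # show the partial answer immediately
    chunks.append(s)     # keep the fragment for later use
full_answer = "".join(chunks)    # assumption: chunks are str fragments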

Stream Output

To stream output on a web page or through an API (printing the language model's answer word by word as it is generated), models of type openai, huggingface, remote, gemini, or anthropic can use model_obj.stream(prompt). The following example uses streamlit's write_stream to display the answer on a web page in real time.
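
Before the full streamlit example, a minimal sketch of calling model_obj.stream(prompt) directly; it assumes each streamed chunk is either a plain string or a message chunk carrying its text in a .content attribute:

import akasha

model_obj = akasha.handle_model("openai:gpt-3.5-turbo", False, 0.0)
for chunk in model_obj.stream("say something."):
    # assumption: chunks are plain strings or objects with a .content attribute
    text = chunk if isinstance(chunk, str) else getattr(chunk, "content", str(chunk))
    print(text, end="", flush=True)

The full streamlit example: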

import streamlit as st
import akasha
import gc, torch

if "pre" not in st.session_state:
st.session_state.pre = ""
if "model_obj" not in st.session_state:
st.session_state.model_obj = None

def clean():
try:
gc.collect()
torch.cuda.ipc_collect()
torch.cuda.empty_cache()
except:
pass



def stream_response(prompt:str, model_name:str="openai:gpt-3.5-turbo"):
# Mistral-7B-Instruct-v0.3 Llama3-8B-Chinese-Chat
streaming = akasha.call_stream_model(st.session_state.model_obj, prompt)
yield from streaming

model = st.selectbox("select model", ["openai:gpt-3.5-turbo","hf:model/Mistral-7B-Instruct-v0.3"])
prompt = st.chat_input("Say something")
if st.session_state.pre != model:
st.session_state.model_obj = None
clean()
st.session_state.model_obj = akasha.helper.handle_model(model, False, 0.0)
st.session_state.pre = model

if prompt:
st.write("question: " + prompt)
st.write_stream(stream_response(prompt, model))
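
Save the script (for example as app.py, a name chosen here for illustration) and launch it with streamlit run app.py; the answer then appears in the browser chunk by chunk.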

Use model_obj = akasha.helper.handle_model(model, False, 0.0) to create the model object. To run inference, call akasha.call_stream_model(model_obj, prompt); yield from makes stream_response return a generator, so the answer can be displayed in real time.
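
The same generator also works for an HTTP API; below is a minimal sketch using FastAPI's StreamingResponse (the route and setup are illustrative, not part of akasha):

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
import akasha

app = FastAPI()
# build the model object once at startup, as in the examples above
model_obj = akasha.handle_model("openai:gpt-3.5-turbo", False, 0.0)

@app.get("/stream")
def stream_answer(prompt: str):
    # call_stream_model yields text chunks, which StreamingResponse
    # forwards to the client as they are produced
    return StreamingResponse(akasha.call_stream_model(model_obj, prompt),
                             media_type="text/plain")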