call_stream_model
Among the helper functions, if the LLM is an openai, huggingface, remote, gemini, or anthropic model, you can use akasha.call_stream_model() to get streaming output.
```python
import akasha

prompt = "say something."
model_obj = akasha.handle_model("openai:gpt-3.5-turbo", False, 0.0)

# call_stream_model returns a generator of text chunks
streaming = akasha.call_stream_model(model_obj, prompt)

for s in streaming:
    print(s)
```
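Because call_stream_model returns a generator of text fragments, the chunks can also be joined into the complete answer once the stream ends; a minimal sketch under that assumption:

```python
import akasha

model_obj = akasha.handle_model("openai:gpt-3.5-turbo", False, 0.0)

# assumption: each yielded chunk is a plain text fragment,
# so concatenating them reconstructs the full answer
chunks = []
for s in akasha.call_stream_model(model_obj, "say something."):
    chunks.append(s)

answer = "".join(chunks)
print(answer)
```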
Doc_QA stream
All functions of the Doc_QA class accept the parameter stream=True to return streaming output.
```python
import akasha

# stream=True makes get_response return a generator instead of a string
ak = akasha.Doc_QA(stream=True)
streaming = ak.get_response("docs/mic", "say something")

for s in streaming:
    print(s)
```
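The same generator can be consumed incrementally; a small sketch (assuming each chunk is a plain text fragment) that prints the answer on one line as it arrives:

```python
import akasha

ak = akasha.Doc_QA(stream=True)

# print chunks without line breaks and flush immediately,
# so the answer appears as one continuously growing string
for s in ak.get_response("docs/mic", "say something"):
    print(s, end="", flush=True)
print()
```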
Stream Output
To stream output on a web page or in an API (rendering the language model's answer word by word in real time), models of type openai, huggingface, remote, gemini, or anthropic can use model_obj.stream(prompt), as in the minimal sketch below.
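A minimal sketch of calling model_obj.stream directly, assuming it yields text chunks the same way akasha.call_stream_model does:

```python
import akasha

model_obj = akasha.handle_model("openai:gpt-3.5-turbo", False, 0.0)

# assumption: .stream yields text chunks like call_stream_model
for chunk in model_obj.stream("say something."):
    print(chunk, end="", flush=True)
```

The fuller example below uses Streamlit's st.write_stream to render the answer on a web page in real time.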
```python
import streamlit as st
import akasha
import gc, torch

if "pre" not in st.session_state:
    st.session_state.pre = ""
if "model_obj" not in st.session_state:
    st.session_state.model_obj = None

def clean():
    # release the previous model's GPU memory before loading a new one
    try:
        gc.collect()
        torch.cuda.ipc_collect()
        torch.cuda.empty_cache()
    except Exception:
        pass

def stream_response(prompt: str, model_name: str = "openai:gpt-3.5-turbo"):
    # yield from turns this function into a generator that
    # st.write_stream can consume chunk by chunk
    streaming = akasha.call_stream_model(st.session_state.model_obj, prompt)
    yield from streaming

model = st.selectbox(
    "select model",
    ["openai:gpt-3.5-turbo", "hf:model/Mistral-7B-Instruct-v0.3"],
)
prompt = st.chat_input("Say something")

# rebuild the model object whenever the selected model changes
if st.session_state.pre != model:
    st.session_state.model_obj = None
    clean()
    st.session_state.model_obj = akasha.helper.handle_model(model, False, 0.0)
    st.session_state.pre = model

if prompt:
    st.write("question: " + prompt)
    st.write_stream(stream_response(prompt, model))
```
Use model_obj = akasha.helper.handle_model(model, False, 0.0) to build the model object, and call akasha.call_stream_model(model_obj, prompt) to run inference. Because stream_response uses yield from, it returns a generator, which lets st.write_stream render the answer in real time as chunks arrive.
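The API case mentioned above works the same way: the generator returned by akasha.call_stream_model can be forwarded over HTTP. A minimal sketch using FastAPI's StreamingResponse (FastAPI and the /stream endpoint are illustrative choices here, not part of akasha):

```python
import akasha
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

# build the model object once at startup and reuse it per request
model_obj = akasha.helper.handle_model("openai:gpt-3.5-turbo", False, 0.0)

@app.get("/stream")
def stream(prompt: str):
    # StreamingResponse forwards each yielded chunk to the client
    return StreamingResponse(
        akasha.call_stream_model(model_obj, prompt),
        media_type="text/plain",
    )
```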