mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2024-11-11 21:39:52 +00:00
SimpleChat v3.1: Boolean chat request options in Settings UI, cache_prompt (#7950)
* SimpleChat: Allow for chat req bool options to be user controlled
* SimpleChat: Allow user to control cache_prompt flag in request
* SimpleChat: Add sample GUI images to readme file. Show the chat screen and the settings screen.
* SimpleChat:Readme: Add quickstart block, title to image, cleanup
* SimpleChat: RePosition contents of the Info and Settings UI. Make it more logically structured and flow through.
* SimpleChat: Rename to apiRequestOptions from chatRequestOptions, so that it is not wrongly assumed that these request options are used only for the chat/completions endpoint. Rather, they are used for both endpoints, so the new name matches the semantics better.
* SimpleChat: Update image included with readme wrt settings ui
* SimpleChat:ReadMe: Switch to webp screen image to reduce size
This commit is contained in: parent f702a90e24, commit 3791ad2193
@@ -3,6 +3,13 @@
 
 by Humans for All.
 
+## quickstart
+
+To run from the build dir
+
+    bin/llama-server -m path/model.gguf --path ../examples/server/public_simplechat
+
+Continue reading for the details.
+
 ## overview
@@ -14,6 +21,8 @@ own system prompts.
 This allows seeing the generated text / ai-model response in oneshot at the end, after it is fully generated,
 or potentially as it is being generated, in a streamed manner from the server/ai-model.
 
+![Chat and Settings screens](./simplechat_screens.webp "Chat and Settings screens")
+
 Auto saves the chat session locally as and when the chat is progressing and inturn at a later time when you
 open SimpleChat, option is provided to restore the old chat session, if a matching one exists.
@@ -170,17 +179,23 @@ It is attached to the document object. Some of these can also be updated using t
 The histogram/freq based trimming logic is currently tuned for english language wrt its
 is-it-a-alpabetic|numeral-char regex match logic.
 
-chatRequestOptions - maintains the list of options/fields to send along with chat request,
+apiRequestOptions - maintains the list of options/fields to send along with api request,
 irrespective of whether /chat/completions or /completions endpoint.
 
 If you want to add additional options/fields to send to the server/ai-model, and or
 modify the existing options value or remove them, for now you can update this global var
 using browser's development-tools/console.
 
-For string and numeric fields in chatRequestOptions, including even those added by a user
-at runtime by directly modifying gMe.chatRequestOptions, setting ui entries will be auto
+For string, numeric and boolean fields in apiRequestOptions, including even those added by a
+user at runtime by directly modifying gMe.apiRequestOptions, setting ui entries will be auto
 created.
 
+cache_prompt option supported by example/server is allowed to be controlled by user, so that
+any caching supported wrt system-prompt and chat history, if usable can get used. When chat
+history sliding window is enabled, cache_prompt logic may or may not kick in at the backend
+wrt same, based on aspects related to model, positional encoding, attention mechanism etal.
+However system prompt should ideally get the benefit of caching.
+
 headers - maintains the list of http headers sent when request is made to the server. By default
 Content-Type is set to application/json. Additionally Authorization entry is provided, which can
 be set if needed using the settings ui.
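The console-driven workflow described above can be sketched as follows. This is a minimal stand-in, not the project's actual code: `gMe` here is a locally defined mock of the page global, and the copy-all-fields merge approximates what the request-building code does with these options.

```javascript
// Sketch: tweaking gMe.apiRequestOptions from the devtools console, and how
// those fields end up in the outgoing request body. gMe is mocked here so the
// snippet is self-contained; on the real page it is already defined.
const gMe = {
    apiRequestOptions: {
        "model": "gpt-3.5-turbo",
        "temperature": 0.7,
        "max_tokens": 1024,
        "n_predict": 1024,
        "cache_prompt": false,
    },
    bStream: false,
};

// What a user might type in the browser console:
gMe.apiRequestOptions["temperature"] = 0.2;   // steadier sampling
gMe.apiRequestOptions["cache_prompt"] = true; // let server reuse cached prompt prefix
gMe.apiRequestOptions["top_k"] = 40;          // an extra field, passed through as-is

// Every field is copied verbatim into the request body, for either endpoint.
const body = { messages: [] };
for (let k in gMe.apiRequestOptions) {
    body[k] = gMe.apiRequestOptions[k];
}
if (gMe.bStream) {
    body["stream"] = true;
}
console.log(JSON.stringify(body));
```

Note that removing a field (e.g. `delete gMe.apiRequestOptions["n_predict"]`) works the same way: whatever the object holds at request time is what gets sent.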
@@ -197,10 +212,10 @@ It is attached to the document object. Some of these can also be updated using t
 >0 : Send the latest chat history from the latest system prompt, limited to specified cnt.
 
 
-By using gMe's iRecentUserMsgCnt and chatRequestOptions.max_tokens one can try to control the
-implications of loading of the ai-model's context window by chat history, wrt chat response to
-some extent in a simple crude way. You may also want to control the context size enabled when
-the server loads ai-model, on the server end.
+By using gMe's iRecentUserMsgCnt and apiRequestOptions.max_tokens/n_predict one can try to control
+the implications of loading of the ai-model's context window by chat history, wrt chat response to
+some extent in a simple crude way. You may also want to control the context size enabled when the
+server loads ai-model, on the server end.
 
 
 Sometimes the browser may be stuborn with caching of the file, so your updates to html/css/js
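The sliding-window behaviour described for iRecentUserMsgCnt > 0 can be sketched as a small pure function. This is an illustration only, not the project's actual helper; `recent_chat` is a hypothetical name, and the 0 ("fresh chat") case is left out of the sketch.

```javascript
// Sketch: keep only the latest N user messages (and what follows them) after
// the most recent system prompt, so old history does not flood the model's
// context window. iRecentUserMsgCnt < 1 returns everything in this sketch.
function recent_chat(messages, iRecentUserMsgCnt) {
    if (iRecentUserMsgCnt < 1 || messages.length === 0) {
        return messages; // -1: full history; 0 (fresh chat) not modeled here
    }
    // find the latest system prompt, if any
    let iStart = 0;
    for (let i = messages.length - 1; i >= 0; i--) {
        if (messages[i].role === "system") {
            iStart = i;
            break;
        }
    }
    const tail = messages.slice(iStart);
    // walk backwards counting user messages to find the cut point
    let cnt = 0;
    let iKeep = tail.length;
    for (let i = tail.length - 1; i >= 0; i--) {
        if (tail[i].role === "user") {
            cnt += 1;
            iKeep = i;
            if (cnt >= iRecentUserMsgCnt) {
                break;
            }
        }
    }
    const sys = (tail[0].role === "system") ? [tail[0]] : [];
    return sys.concat(tail.slice(Math.max(iKeep, sys.length)));
}
```

Pairing this trimming with a max_tokens/n_predict cap bounds both the prompt side and the generation side of the context window, which is the crude control the paragraph above refers to.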
@@ -237,12 +252,12 @@ also be started with a model context size of 1k or more, to be on safe side.
 internal n_predict, for now add the same here on the client side, maybe later add max_tokens
 to /completions endpoint handling code on server side.
 
-NOTE: One may want to experiment with frequency/presence penalty fields in chatRequestOptions
-wrt the set of fields sent to server along with the user query. To check how the model behaves
+NOTE: One may want to experiment with frequency/presence penalty fields in apiRequestOptions
+wrt the set of fields sent to server along with the user query, to check how the model behaves
 wrt repeatations in general in the generated text response.
 
 A end-user can change these behaviour by editing gMe from browser's devel-tool/console or by
-using the providing settings ui.
+using the provided settings ui (for settings exposed through the ui).
 
 
 ### OpenAi / Equivalent API WebService
@@ -253,7 +268,7 @@ for a minimal chatting experimentation by setting the below.
 * the baseUrl in settings ui
     * https://api.openai.com/v1 or similar
 
-* Wrt request body - gMe.chatRequestOptions
+* Wrt request body - gMe.apiRequestOptions
     * model (settings ui)
     * any additional fields if required in future
 
@@ -222,8 +222,8 @@ class SimpleChat {
      * @param {Object} obj
      */
     request_jsonstr_extend(obj) {
-        for(let k in gMe.chatRequestOptions) {
-            obj[k] = gMe.chatRequestOptions[k];
+        for(let k in gMe.apiRequestOptions) {
+            obj[k] = gMe.apiRequestOptions[k];
         }
         if (gMe.bStream) {
             obj["stream"] = true;
@@ -740,11 +740,12 @@ class Me {
             "Authorization": "", // Authorization: Bearer OPENAI_API_KEY
         }
         // Add needed fields wrt json object to be sent wrt LLM web services completions endpoint.
-        this.chatRequestOptions = {
+        this.apiRequestOptions = {
             "model": "gpt-3.5-turbo",
             "temperature": 0.7,
             "max_tokens": 1024,
             "n_predict": 1024,
+            "cache_prompt": false,
             //"frequency_penalty": 1.2,
             //"presence_penalty": 1.2,
         };
@@ -800,51 +801,55 @@ class Me {
 
             ui.el_create_append_p(`bStream:${this.bStream}`, elDiv);
 
+            ui.el_create_append_p(`bTrimGarbage:${this.bTrimGarbage}`, elDiv);
+
+            ui.el_create_append_p(`ApiEndPoint:${this.apiEP}`, elDiv);
+
+            ui.el_create_append_p(`iRecentUserMsgCnt:${this.iRecentUserMsgCnt}`, elDiv);
+
             ui.el_create_append_p(`bCompletionFreshChatAlways:${this.bCompletionFreshChatAlways}`, elDiv);
 
             ui.el_create_append_p(`bCompletionInsertStandardRolePrefix:${this.bCompletionInsertStandardRolePrefix}`, elDiv);
 
-            ui.el_create_append_p(`bTrimGarbage:${this.bTrimGarbage}`, elDiv);
-
-            ui.el_create_append_p(`iRecentUserMsgCnt:${this.iRecentUserMsgCnt}`, elDiv);
-
-            ui.el_create_append_p(`ApiEndPoint:${this.apiEP}`, elDiv);
-
         }
 
-        ui.el_create_append_p(`chatRequestOptions:${JSON.stringify(this.chatRequestOptions, null, " - ")}`, elDiv);
+        ui.el_create_append_p(`apiRequestOptions:${JSON.stringify(this.apiRequestOptions, null, " - ")}`, elDiv);
         ui.el_create_append_p(`headers:${JSON.stringify(this.headers, null, " - ")}`, elDiv);
 
     }
 
     /**
-     * Auto create ui input elements for fields in ChatRequestOptions
+     * Auto create ui input elements for fields in apiRequestOptions
      * Currently supports text and number field types.
      * @param {HTMLDivElement} elDiv
      */
-    show_settings_chatrequestoptions(elDiv) {
+    show_settings_apirequestoptions(elDiv) {
         let typeDict = {
             "string": "text",
             "number": "number",
         };
         let fs = document.createElement("fieldset");
         let legend = document.createElement("legend");
-        legend.innerText = "ChatRequestOptions";
+        legend.innerText = "ApiRequestOptions";
         fs.appendChild(legend);
         elDiv.appendChild(fs);
-        for(const k in this.chatRequestOptions) {
-            let val = this.chatRequestOptions[k];
+        for(const k in this.apiRequestOptions) {
+            let val = this.apiRequestOptions[k];
             let type = typeof(val);
-            if (!((type == "string") || (type == "number"))) {
-                continue;
-            }
-            let inp = ui.el_creatediv_input(`Set${k}`, k, typeDict[type], this.chatRequestOptions[k], (val)=>{
-                if (type == "number") {
-                    val = Number(val);
-                }
-                this.chatRequestOptions[k] = val;
-            });
-            fs.appendChild(inp.div);
+            if (((type == "string") || (type == "number"))) {
+                let inp = ui.el_creatediv_input(`Set${k}`, k, typeDict[type], this.apiRequestOptions[k], (val)=>{
+                    if (type == "number") {
+                        val = Number(val);
+                    }
+                    this.apiRequestOptions[k] = val;
+                });
+                fs.appendChild(inp.div);
+            } else if (type == "boolean") {
+                let bbtn = ui.el_creatediv_boolbutton(`Set{k}`, k, {true: "true", false: "false"}, val, (userVal)=>{
+                    this.apiRequestOptions[k] = userVal;
+                });
+                fs.appendChild(bbtn.div);
+            }
         }
     }
@@ -870,6 +875,23 @@ class Me {
         });
         elDiv.appendChild(bb.div);
 
+        bb = ui.el_creatediv_boolbutton("SetTrimGarbage", "TrimGarbage", {true: "[+] yes trim", false: "[-] dont trim"}, this.bTrimGarbage, (val)=>{
+            this.bTrimGarbage = val;
+        });
+        elDiv.appendChild(bb.div);
+
+        this.show_settings_apirequestoptions(elDiv);
+
+        let sel = ui.el_creatediv_select("SetApiEP", "ApiEndPoint", ApiEP.Type, this.apiEP, (val)=>{
+            this.apiEP = ApiEP.Type[val];
+        });
+        elDiv.appendChild(sel.div);
+
+        sel = ui.el_creatediv_select("SetChatHistoryInCtxt", "ChatHistoryInCtxt", this.sRecentUserMsgCnt, this.iRecentUserMsgCnt, (val)=>{
+            this.iRecentUserMsgCnt = this.sRecentUserMsgCnt[val];
+        });
+        elDiv.appendChild(sel.div);
+
         bb = ui.el_creatediv_boolbutton("SetCompletionFreshChatAlways", "CompletionFreshChatAlways", {true: "[+] yes fresh", false: "[-] no, with history"}, this.bCompletionFreshChatAlways, (val)=>{
             this.bCompletionFreshChatAlways = val;
         });
@@ -880,23 +902,6 @@ class Me {
         });
         elDiv.appendChild(bb.div);
 
-        bb = ui.el_creatediv_boolbutton("SetTrimGarbage", "TrimGarbage", {true: "[+] yes trim", false: "[-] dont trim"}, this.bTrimGarbage, (val)=>{
-            this.bTrimGarbage = val;
-        });
-        elDiv.appendChild(bb.div);
-
-        let sel = ui.el_creatediv_select("SetChatHistoryInCtxt", "ChatHistoryInCtxt", this.sRecentUserMsgCnt, this.iRecentUserMsgCnt, (val)=>{
-            this.iRecentUserMsgCnt = this.sRecentUserMsgCnt[val];
-        });
-        elDiv.appendChild(sel.div);
-
-        sel = ui.el_creatediv_select("SetApiEP", "ApiEndPoint", ApiEP.Type, this.apiEP, (val)=>{
-            this.apiEP = ApiEP.Type[val];
-        });
-        elDiv.appendChild(sel.div);
-
-        this.show_settings_chatrequestoptions(elDiv);
-
     }
 
 }
BIN  examples/server/public_simplechat/simplechat_screens.webp  (new file; binary file not shown; size: 21 KiB)