SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAI Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI (#7548)

* SimpleChat:DU:BringIn local helper js modules using importmap

Use it to bring in a simple trim-garbage-at-end logic, which is
used to trim the received response.

Also, given that importmap assumes esm / standard js modules,
global variables aren't implicitly available outside the modules.
So add it as a member of document for now.
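A minimal sketch of the idea (attaching via document is just one way to
expose a module-scoped object to the devtools console; gMe/Me match the
globals used later in this commit):

    // inside a module, top-level names are module scoped
    let gMe = new Me();
    document["gMe"] = gMe;  // reachable as document["gMe"] from the console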

* SimpleChat:DU: Add trim garbage at end in loop helper

* SimpleChat:DU:TrimGarbage if unable try skip char and retry

* SimpleChat:DU: Try trim using histogram based info

TODO: May have to add a max number of unique chars in the histogram
at the end of the learning phase.

* SimpleChat:DU: Switch trim garbage hist based to maxUniq simple

Instead of blindly building a histogram for the specified substring
length, and then checking if any new char appears within the specified
min garbage length limit, NOW exit the learn state when the specified
maxUniq chars are found. In turn there should be no new chars within
the specified min garbage length limit.

TODO: Need to track char classes like alphabets, numerals and
special/other chars.

* SimpleChat:DU: Bring in maxType to the mix along with maxUniq

Allow for more unique chars, but then ensure that a given type of
char, i.e. numerals or alphabets or other types, doesn't cross the
specified maxType limit. This allows intermixed text garbage
to be identified and trimmed.

* SimpleChat:DU: Cleanup debug log messages

* SimpleChat:UI: Move html ui base helpers into its own module

* SimpleChat:DU:Avoid setting frequency/presence penalty

Some models like llama3 were found to try to be over-intelligent by
still repeating garbage, but tweaking it a bit so that it is not
exactly the same. So avoid setting these penalties and let the
model's default behaviour work, as is.

Also the simple-minded histogram based garbage trimming from the end
works to an extent, when the garbage is more predictable and
repetitive.
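For reference, the resulting request options take this shape later in this
commit (the penalty fields are left commented out rather than removed):

    this.chatRequestOptions = {
        "temperature": 0.7,
        "max_tokens": 1024,
        "n_predict": 1024,
        //"frequency_penalty": 1.2,
        //"presence_penalty": 1.2,
    };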

* SimpleChat:UI: Add and use a para-create-append helper

Also update the config params dump to indicate that one now needs
to use document to get hold of the gMe global object; this is because
of moving to module-type js.

Also add ui.mjs to importmap

* SimpleChat:UI: Helper to create bool button and use it wrt settings

* SimpleChat:UI: Add Select helper and use it wrt ChatHistoryInCtxt

* SimpleChat:UI:Select: dict-name-value, value wrt default, change

Take a dict/object of name-value pairs instead of just names.
In turn specify the actual value for the default, rather than the
string representing that value.

Trap the change event rather than the click event for the select.
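ui.mjs itself is not part of this excerpt, so the following is only a
hypothetical sketch of such a select helper (the name el_create_select and
the exact signature are assumptions):

    /**
     * @param {string} id
     * @param {(ev: Event) => any} callback
     * @param {Object<string, any>} options      dict of name-value pairs
     * @param {any} defaultValue                 compared against values, not names
     */
    export function el_create_select(id, callback, options, defaultValue) {
        let sel = document.createElement("select");
        sel.id = id;
        for (let name in options) {
            let opt = document.createElement("option");
            opt.value = String(options[name]);
            opt.innerText = name;
            if (options[name] == defaultValue) {
                opt.selected = true;
            }
            sel.appendChild(opt);
        }
        // change fires on actual selection changes, unlike click
        sel.addEventListener("change", callback);
        return sel;
    }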

* SimpleChat:UI: Add Div wrapped label+element helpers

Move settings related elements to use the new div wrapped ones.

* SimpleChat:UI:Add settings button and bring in settings ui

* SimpleChat:UI:Settings make boolean button text show meaning

* SimpleChat: Update a bit wrt readme and notes in du

* SimpleChat: GarbageTrim enable/disable, show trimmed part ifany

* SimpleChat: highlight trim, garbage trimming bit more aggressive

Make it easy for the end user to identify the trimmed text.

Make the garbage trimming logic consider a longer repeated garbage
substring.

* SimpleChat: Cleanup a bit wrt Api end point related flow

Consolidate much of the Api end point related basic meta data into
the ApiEP class.

Remove the hardcoded ApiEP/Mode settings from html+js, instead use
the generic select helper logic, in turn in the settings block.

Move the helper that generates the appropriate request json string
based on ApiEP into the SimpleChat class itself.

* SimpleChat:Move extracting assistant response to SimpleChat class

So also the trimming of garbage.

* SimpleChat:DU: Bring in both trim garbage logics to try trim

* SimpleChat: Cleanup readme a bit, add one more chathistory length

* SimpleChat:Stream:Initial handshake skeleton

Parse the received stream responses and try to extract the data from
them.

It allows for a partial read to get a single data line or multiple
data lines. In turn extract the json body and from it the delta
content/message.

* SimpleChat: Move handling oneshot mode server response

Move handling of the oneshot mode server response into SimpleChat.

Also add plumbing for moving multipart server response into same.

* SimpleChat: Move multi part server response handling in

* SimpleChat: Add MultiPart Response handling, common trimming

Add logic to call into multipart/stream server response handling.

Move trimming of garbage at the end into the common handle_response
helper.

Add a new global flag to control between oneshot and multipart/stream
mode of fetching the response. Allow the same to be controlled by the user.

If in multipart/stream mode, send the stream flag to the server.

* SimpleChat: show streamed generative text as it becomes available

Now that extracting of streamed generated text is implemented,
add logic to show it on the screen as it arrives.

* SimpleChat:DU: Add NewLines helper class

To work with an array of new lines. Allow adding, appending,
shifting, ...

* SimpleChat:DU: Make NewLines shift more robust and flexible

* SimpleChat:HandleResponseMultiPart using NewLines helper

Make the handle_response_multipart logic better and cleaner. Now it
handles the situation where a delta data line received from the server
in stream mode gets split up across reads; the logic will still handle
it appropriately.

ALERT: Except (for now) for the last data line of a request's
response.

* SimpleChat: Disable console debug by default by making it dummy

In parallel, save a reference to the original func.

* SimpleChat:MultiPart/Stream flow cleanup

Don't try utf8-decode and newlines-add_append if there is no data to
work on.

If there is no more data to get (i.e. done is set), then let the
NewLines instance return a line without a newline at the end, so that
we don't miss out on any last-data-line-without-newline kind of
scenario.

Pass the stream flag wrt utf-8 decode, so that if any multi-byte char
is only partly present in the passed buffer, it can be accounted
for along with the subsequent buffer. At the same time, because of
utf-8's characteristics there shouldn't be any unaccounted bytes at
the end for a valid block of utf8 data split across chunks, so not
bothering to call with stream set to false at the end. LATER: Look at
TextDecoder's implementation for any over-intelligence it may be doing.
If needed, one can use the done flag to account for both cases.
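A small sketch of the stream-mode decode described above (TextDecoder
holds back a trailing partial multi-byte sequence until the next call):

    const td = new TextDecoder("utf-8");
    // a 3-byte char (U+20AC) split across two chunks decodes correctly:
    td.decode(new Uint8Array([0xE2, 0x82]), {stream: true}); // "" (held back)
    td.decode(new Uint8Array([0xAC]), {stream: true});       // "€"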

* SimpleChat: Move baseUrl to Me and inturn gMe

This should allow easy updating of the base url at runtime by the
end user.

* SimpleChat:UI: Add input element helper

* SimpleChat: Add support for changing the base url

This ensures that if the user is running the server on a different
port, or wants to try to connect to a server on a different machine,
then this can be used.

* SimpleChat: Move request headers into Me and gMe

In turn allow Authorization to be sent, if not empty.

* SimpleChat: Rather need to use append to insert headers

* SimpleChat: Allow Authorization header to be set by end user

* SimpleChat:UI+: Return div and element wrt creatediv helpers

Use it to set the placeholder for the Authorization header.

Also fix a copy-paste oversight.

* SimpleChat: readme wrt authorization, maybe minimal openai testing

* SimpleChat: model request field for openai/equivalent compat

May help testing with openai/equivalent web services, if they
require this field.

* SimpleChat: readme stream-utf-8 trim-english deps, exception2error

* Readme: Add an entry for simplechat in the http server section

* SimpleChat:WIP:Collate internally, Stream mode Trap exceptions

This can help ensure that data fetched till that point can be
made use of, rather than being lost.

On some platforms, the time taken to generate a long response
may lead to the network connection being broken when the machine
enters some user-no-interaction related power saving mode.

* SimpleChat:theResp-origMsg: Undo a prev change to fix non trim

When the response handling was moved into SimpleChat, I had changed
a flow bit unnecessarily and carelessly, which resulted in the non-trim
flow missing out on retaining the ai assistant response.

This has been fixed now.

* SimpleChat: Save message internally in handle_response itself

This ensures that throwing the caught exception again for higher-up
logic doesn't lose the response collated till that time.

Go through theResp.assistant in the catch block, just to keep simple
consistency wrt backtracing, just in case.

Update the readme file.

* SimpleChat:Cleanup: Add spacing wrt shown req-options

* SimpleChat:UI: CreateDiv Divs map to GridX2 class

This allows the settings ui to be structured more cleanly.

* SimpleChat: Show Non SettingsUI config field by default

* SimpleChat: Allow for multiline system prompt

Convert SystemPrompt into a textarea with 2 rows. Reduce the
user-input textarea to 2 rows from 3, so that overall
vertical space usage remains the same.

Shorten the usage messages a bit, cleanup to sync with the settings ui.

* SimpleChat: Add basic skeleton for saving and loading chat

In turn whenever a chat message (system/user/model) is added,
the chat will be saved into the browser's localStorage.

* SimpleChat:ODS: Add a prefix to chatid wrt ondiskstorage key

* SimpleChat:ODS:WIP:TMP: Add UI to load previously saved chat

This is a temporary flow

* SimpleChat:ODS:Move restore/load saved chat btn setup to Me

This also allows being able to set the common system prompt
ui element to the loaded chat's system prompt.

* SimpleChat:Readme updated wrt save and restore chat session info

* SimpleChat:Show chat session restore button, only if saved session

* SimpleChat: AutoCreate ChatRequestOptions settings to an extent

* SimpleChat: Update main README wrt usage with server
HanishKVC 2024-06-01 21:50:18 +05:30 committed by GitHub
parent 750f60c03e
commit 2ac95c9d56
7 changed files with 1029 additions and 175 deletions

File: README.md

@@ -150,6 +150,8 @@ Typically finetunes of the base models below are supported as well.

[llama.cpp web server](./examples/server) is a lightweight [OpenAI API](https://github.com/openai/openai-openapi) compatible HTTP server that can be used to serve local models and easily connect them to existing clients.

+[simplechat](./examples/server/public_simplechat) is a simple chat client, which can be used to chat with the model exposed using the above web server (use --path to point to simplechat), from a local web browser.

**Bindings:**

- Python: [abetlen/llama-cpp-python](https://github.com/abetlen/llama-cpp-python)

File: examples/server/public_simplechat/datautils.mjs (new file)

@@ -0,0 +1,266 @@
//@ts-check
// Helpers to work with different data types
// by Humans for All
//

/**
 * Given the limited context size of local LLMs, many a times when the context gets filled
 * between the prompt and the response, it can lead to repeating-text garbage generation.
 * And many a times setting a penalty wrt repetition leads to over-intelligent garbage
 * repetition with slight variations. This garbage in turn can lead to overloading of the
 * available model context, leading to less valuable responses for subsequent prompts/queries,
 * if chat history is sent to the ai model.
 *
 * So two simple minded garbage trimming logics are experimented with below.
 * * one based on progressively-larger-substring-based-repeat-matching-with-partial-skip and
 * * another based on char-histogram-driven garbage trimming.
 * * in future the characteristics of the histogram over varying lengths could be used to
 *   allow for a more aggressive and adaptive trimming logic.
 */
/**
 * Simple minded logic to help remove repeating garbage at the end of the string.
 * The repetition needs to be perfectly matching.
 *
 * The logic progressively goes on probing for longer and longer substring based
 * repetition, till there is no longer repetition. In turn picks the one with
 * the longest chain.
 *
 * @param {string} sIn
 * @param {number} maxSubL
 * @param {number} maxMatchLenThreshold
 */
export function trim_repeat_garbage_at_end(sIn, maxSubL=10, maxMatchLenThreshold=40) {
    let rCnt = [0];
    let maxMatchLen = maxSubL;
    let iMML = -1;
    for(let subL=1; subL < maxSubL; subL++) {
        rCnt.push(0);
        let i;
        let refS = sIn.substring(sIn.length-subL, sIn.length);
        for(i=sIn.length; i > 0; i -= subL) {
            let curS = sIn.substring(i-subL, i);
            if (refS != curS) {
                let curMatchLen = rCnt[subL]*subL;
                if (maxMatchLen < curMatchLen) {
                    maxMatchLen = curMatchLen;
                    iMML = subL;
                }
                break;
            }
            rCnt[subL] += 1;
        }
    }
    console.debug("DBUG:DU:TrimRepeatGarbage:", rCnt);
    if ((iMML == -1) || (maxMatchLen < maxMatchLenThreshold)) {
        return {trimmed: false, data: sIn};
    }
    console.debug("DBUG:TrimRepeatGarbage:TrimmedCharLen:", maxMatchLen);
    let iEnd = sIn.length - maxMatchLen;
    return { trimmed: true, data: sIn.substring(0, iEnd) };
}
/**
 * Simple minded logic to help remove repeating garbage at the end of the string, till it can't.
 * If it's not able to trim, then it will try to skip a char at the end and then trim, a few times.
 * This ensures that even if there are multiple runs of garbage with different patterns, the
 * logic still tries to munch through them.
 *
 * @param {string} sIn
 * @param {number} maxSubL
 * @param {number | undefined} [maxMatchLenThreshold]
 */
export function trim_repeat_garbage_at_end_loop(sIn, maxSubL, maxMatchLenThreshold, skipMax=16) {
    let sCur = sIn;
    let sSaved = "";
    let iTry = 0;
    while(true) {
        let got = trim_repeat_garbage_at_end(sCur, maxSubL, maxMatchLenThreshold);
        if (got.trimmed != true) {
            if (iTry == 0) {
                sSaved = got.data;
            }
            iTry += 1;
            if (iTry >= skipMax) {
                return sSaved;
            }
            got.data = got.data.substring(0, got.data.length-1);
        } else {
            iTry = 0;
        }
        sCur = got.data;
    }
}
/**
 * A simple minded try at trimming garbage at the end using histogram driven characteristics.
 * There can be variation in the repetitions, as long as no new char crops up.
 *
 * This tracks the chars and their frequency in a specified length of substring at the end,
 * and in turn checks if moving further into the generated text from the end remains within
 * the same char subset or goes beyond it, and based on that either trims the string at the
 * end or not. This allows filtering of garbage at the end, even if there are certain
 * kinds of small variations in the repeated text wrt position of seen chars.
 *
 * Allow the garbage to contain up to maxUniq chars, but at the same time ensure that
 * a given type of char, i.e. numerals or alphabets or other types, doesn't cross the specified
 * maxType limit. This allows intermixed text garbage to be identified and trimmed.
 *
 * ALERT: This is not perfect and only provides a rough garbage identification logic.
 * Also it currently only differentiates between character classes wrt english.
 *
 * @param {string} sIn
 * @param {number} maxType
 * @param {number} maxUniq
 * @param {number} maxMatchLenThreshold
 */
export function trim_hist_garbage_at_end(sIn, maxType, maxUniq, maxMatchLenThreshold) {
    if (sIn.length < maxMatchLenThreshold) {
        return { trimmed: false, data: sIn };
    }
    let iAlp = 0;
    let iNum = 0;
    let iOth = 0;
    // Learn
    let hist = {};
    let iUniq = 0;
    for(let i=0; i<maxMatchLenThreshold; i++) {
        let c = sIn[sIn.length-1-i];
        if (c in hist) {
            hist[c] += 1;
        } else {
            if(c.match(/[0-9]/) != null) {
                iNum += 1;
            } else if(c.match(/[A-Za-z]/) != null) {
                iAlp += 1;
            } else {
                iOth += 1;
            }
            iUniq += 1;
            if (iUniq >= maxUniq) {
                break;
            }
            hist[c] = 1;
        }
    }
    console.debug("DBUG:TrimHistGarbage:", hist);
    if ((iAlp > maxType) || (iNum > maxType) || (iOth > maxType)) {
        return { trimmed: false, data: sIn };
    }
    // Catch and Trim
    for(let i=0; i < sIn.length; i++) {
        let c = sIn[sIn.length-1-i];
        if (!(c in hist)) {
            if (i < maxMatchLenThreshold) {
                return { trimmed: false, data: sIn };
            }
            console.debug("DBUG:TrimHistGarbage:TrimmedCharLen:", i);
            return { trimmed: true, data: sIn.substring(0, sIn.length-i+1) };
        }
    }
    console.debug("DBUG:TrimHistGarbage:Trimmed fully");
    return { trimmed: true, data: "" };
}
/**
 * Keep trimming repeatedly using the hist_garbage logic, till you no longer can.
 * This ensures that even if there are multiple runs of garbage with different patterns,
 * the logic still tries to munch through them.
 *
 * @param {string} sIn
 * @param {number} maxType
 * @param {number} maxUniq
 * @param {number} maxMatchLenThreshold
 */
export function trim_hist_garbage_at_end_loop(sIn, maxType, maxUniq, maxMatchLenThreshold) {
    let sCur = sIn;
    while (true) {
        let got = trim_hist_garbage_at_end(sCur, maxType, maxUniq, maxMatchLenThreshold);
        if (!got.trimmed) {
            return got.data;
        }
        sCur = got.data;
    }
}
/**
* Try trim garbage at the end by using both the hist-driven-garbage-trimming as well as
* skip-a-bit-if-reqd-then-repeat-pattern-based-garbage-trimming, with blind retrying.
* @param {string} sIn
*/
export function trim_garbage_at_end(sIn) {
    let sCur = sIn;
    for(let i=0; i<2; i++) {
        sCur = trim_hist_garbage_at_end_loop(sCur, 8, 24, 72);
        sCur = trim_repeat_garbage_at_end_loop(sCur, 32, 72, 12);
    }
    return sCur;
}
/**
 * NewLines array helper.
 * Allow for maintaining a list of lines.
 * Allow for a line to be built up / appended part by part.
 */
export class NewLines {

    constructor() {
        /** @type {string[]} */
        this.lines = [];
    }

    /**
     * Extract lines from the passed string; in turn either
     * append to a previous partial line or add a new line.
     * @param {string} sLines
     */
    add_append(sLines) {
        let aLines = sLines.split("\n");
        let lCnt = 0;
        for(let line of aLines) {
            lCnt += 1;
            // Add back newline removed if any during split
            if (lCnt < aLines.length) {
                line += "\n";
            } else {
                if (sLines.endsWith("\n")) {
                    line += "\n";
                }
            }
            // Append if required
            if (lCnt == 1) {
                let lastLine = this.lines[this.lines.length-1];
                if (lastLine != undefined) {
                    if (!lastLine.endsWith("\n")) {
                        this.lines[this.lines.length-1] += line;
                        continue;
                    }
                }
            }
            // Add new line
            this.lines.push(line);
        }
    }

    /**
     * Shift the oldest/earliest/0th line in the array. [Old-New|Earliest-Latest]
     * Optionally control whether only full lines (ie those with a newline at the end) will be
     * returned, or whether a partial line without a newline at the end (can only be the last
     * line) may be returned.
     * @param {boolean} bFullWithNewLineOnly
     */
    shift(bFullWithNewLineOnly=true) {
        let line = this.lines[0];
        if (line == undefined) {
            return undefined;
        }
        if ((line[line.length-1] != "\n") && bFullWithNewLineOnly){
            return undefined;
        }
        return this.lines.shift();
    }

}
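// (Editor note, illustrative usage, not part of the committed file.)
// Partial trailing lines are held back until their newline arrives,
// unless the caller passes false (ie stream done) to shift.
//
//   let nl = new NewLines();
//   nl.add_append("data: one\ndata: tw");
//   nl.shift();       // "data: one\n"
//   nl.shift();       // undefined - "data: tw" has no trailing newline yet
//   nl.shift(false);  // "data: tw" - partial line released once done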

File: examples/server/public_simplechat/index.html

@@ -8,21 +8,23 @@
    <meta name="description" content="SimpleChat: trigger LLM web service endpoints /chat/completions and /completions, single/multi chat sessions" />
    <meta name="author" content="by Humans for All" />
    <meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
-   <script src="simplechat.js" defer></script>
+   <script type="importmap">
+   {
+     "imports": {
+       "datautils": "./datautils.mjs",
+       "ui": "./ui.mjs"
+     }
+   }
+   </script>
+   <script src="simplechat.js" type="module" defer></script>
    <link rel="stylesheet" href="simplechat.css" />
  </head>
  <body>
    <div class="samecolumn" id="fullbody">
-     <div class="sameline">
+     <div class="sameline" id="heading">
        <p class="heading flex-grow" > <b> SimpleChat </b> </p>
-       <div class="sameline">
-         <label for="api-ep">Mode:</label>
-         <select name="api-ep" id="api-ep">
-           <option value="chat" selected>Chat</option>
-           <option value="completion">Completion</option>
-         </select>
-       </div>
+       <button id="settings">Settings</button>
      </div>
      <div id="sessions-div" class="sameline"></div>
@@ -30,7 +32,7 @@
      <hr>
      <div class="sameline">
        <label for="system-in">System</label>
-       <input type="text" name="system" id="system-in" placeholder="e.g. you are a helpful ai assistant, who provides concise answers" class="flex-grow"/>
+       <textarea name="system" id="system-in" rows="2" placeholder="e.g. you are a helpful ai assistant, who provides concise answers" class="flex-grow"></textarea>
      </div>
      <hr>
@@ -40,7 +42,7 @@
      <hr>
      <div class="sameline">
-       <textarea id="user-in" class="flex-grow" rows="3" placeholder="enter your query to the ai model here" ></textarea>
+       <textarea id="user-in" class="flex-grow" rows="2" placeholder="enter your query to the ai model here" ></textarea>
        <button id="user-btn">submit</button>
      </div>

File: examples/server/public_simplechat/readme.md

@@ -11,18 +11,29 @@ in a simple way with minimal code from a common code base. Inturn additionally i

multiple independent back and forth chatting to an extent, with the ai llm model at a basic level, with their
own system prompts.

+This allows seeing the generated text / ai-model response in oneshot at the end, after it is fully generated,
+or potentially as it is being generated, in a streamed manner from the server/ai-model.
+
+Auto saves the chat session locally as and when the chat is progressing, and in turn at a later time when you
+open SimpleChat, an option is provided to restore the old chat session, if a matching one exists.

The UI follows a responsive web design so that the layout can adapt to available display space in a usable
enough manner, in general.

Allows developer/end-user to control some of the behaviour by updating gMe members from browser's devel-tool
-console.
+console. In parallel, some of the settings directly useful to the end-user can also be changed using the
+provided settings ui.

-NOTE: Given that the idea is for basic minimal testing, it doesnt bother with any model context length and
-culling of old messages from the chat by default. However by enabling the sliding window chat logic, a crude
-form of old messages culling can be achieved.
+NOTE: The current web service api doesn't expose the model context length directly, so the client logic doesn't
+provide any adaptive culling of old messages, nor replace them with a summary of their content et al. However
+there is an optional sliding-window based chat logic, which provides a simple-minded culling of old messages
+from the chat history before sending to the ai model.

-NOTE: It doesnt set any parameters other than temperature and max_tokens for now. However if someone wants
-they can update the js file or equivalent member in gMe as needed.
+NOTE: Wrt options sent with the request, it mainly sets temperature, max_tokens and optionally stream for now.
+However if someone wants, they can update the js file or the equivalent member in gMe as needed.

+NOTE: One may be able to use this to chat with the openai api web-service /chat/completions endpoint, in a very
+limited / minimal way. One will need to set the model, openai url and authorization bearer key in the settings ui.
## usage

@@ -52,9 +63,15 @@ Open this simple web front end from your local browser

Once inside

-* Select between chat and completion mode. By default it is set to chat mode.
+* If you want to, you can change many of the default global settings
+    * the base url (ie ip addr / domain name, port)
+    * chat (default) vs completion mode
+    * try trim garbage in response or not
+    * amount of chat history in the context sent to server/ai-model
+    * oneshot or streamed mode.

* In completion mode
+    * one normally doesn't use a system prompt in completion mode.
    * logic by default doesnt insert any role specific "ROLE: " prefix wrt each role's message.
      If the model requires any prefix wrt user role messages, then the end user has to
      explicitly add the needed prefix, when they enter their chat message.
@@ -88,12 +105,16 @@ Once inside

* Wait for the logic to communicate with the server and get the response.
    * the user is not allowed to enter any fresh query during this time.
    * the user input box will be disabled and a working message will be shown in it.
+    * if trim garbage is enabled, the logic will try to trim repeating-text kind of garbage to some extent.

* just refresh the page, to reset wrt the chat history and or system prompt and start afresh.

* Using NewChat one can start independent chat sessions.
    * two independent chat sessions are setup by default.

+* When you want to print, switching ChatHistoryInCtxt to Full and clicking on the chat session button of
+  interest will display the full chat history till then for that session, for printing.
## Devel note

@@ -104,14 +125,31 @@ by developers who may not be from web frontend background (so inturn may not be

end-use-specific-language-extensions driven flows) so that they can use it to explore/experiment things.

And given that the idea is also to help explore/experiment for developers, some flexibility is provided
-to change behaviour easily using the devel-tools/console, for now. And skeletal logic has been implemented
-to explore some of the end points and ideas/implications around them.
+to change behaviour easily using the devel-tools/console or the provided minimal settings ui (wrt a few
+aspects). Skeletal logic has been implemented to explore some of the end points and ideas/implications
+around them.

### General

Me/gMe consolidates the settings which control the behaviour into one object.
One can see the current settings, as well as change/update them using browsers devel-tool/console.
+It is attached to the document object. Some of these can also be updated using the Settings UI.

+baseURL - the domain-name/ip-address and in turn the port to send the request to.

+bStream - control between oneshot-at-end and live-stream-as-its-generated collating and showing
+of the generated response.
+
+  the logic assumes that the text sent from the server follows utf-8 encoding.
+
+  in streaming mode - if there is any exception, the logic traps it and tries to ensure
+  that the text generated till then is not lost.
+
+    if a very long text is being generated, which leads to no user interaction for some time and
+    in turn the machine goes into a power saving mode or so, the platform may stop the network
+    connection, leading to an exception.

+apiEP - select between the /completions and /chat/completions endpoints provided by the server/ai-model.

bCompletionFreshChatAlways - whether Completion mode collates complete/sliding-window history when
communicating with the server or only sends the latest user query/message.
@@ -119,6 +157,19 @@ One can see the current settings, as well as change/update them using browsers d

bCompletionInsertStandardRolePrefix - whether Completion mode inserts role related prefix wrt the
messages that get inserted into prompt field wrt /Completion endpoint.

+bTrimGarbage - whether garbage repetition at the end of the generated ai response should be
+trimmed or left as is. If enabled, it will be trimmed so that it won't be sent back as part of
+subsequent chat history. At the same time the actual trimmed text is shown to the user once,
+when it was generated, so the user can check if any useful info/data was there in the response.
+
+  One may be able to request the ai-model to continue (wrt the last response) (if chat-history
+  is enabled as part of the chat-history-in-context setting), and chances are the ai-model will
+  continue starting from the trimmed part, thus allowing a long response to be recovered/continued
+  indirectly, in many cases.
+
+  The histogram/freq based trimming logic is currently tuned for the english language wrt its
+  is-it-an-alphabetic|numeral-char regex match logic.

chatRequestOptions - maintains the list of options/fields to send along with chat request,
irrespective of whether /chat/completions or /completions endpoint.
@@ -126,6 +177,14 @@ One can see the current settings, as well as change/update them using browsers d

modify the existing options value or remove them, for now you can update this global var
using browser's development-tools/console.

+For string and numeric fields in chatRequestOptions, including even those added by a user
+at runtime by directly modifying gMe.chatRequestOptions, settings ui entries will be auto
+created.
+
+headers - maintains the list of http headers sent when a request is made to the server. By default
+Content-Type is set to application/json. Additionally an Authorization entry is provided, which can
+be set if needed using the settings ui.
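(A concrete sketch, not part of the diff: since gMe is attached to the
document object, one could do the below from the browser's devel-tools
console; the key value is a placeholder.)

    document["gMe"].chatRequestOptions["max_tokens"] = 256;
    document["gMe"].headers["Authorization"] = "Bearer THE_OPENAI_API_KEY";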
iRecentUserMsgCnt - a simple minded SlidingWindow to limit context window load at Ai Model end.
This is disabled by default. However if enabled, then in addition to latest system message, only
the last/latest iRecentUserMsgCnt user messages after the latest system prompt and its responses

@@ -140,7 +199,8 @@ One can see the current settings, as well as change/update them using browsers d

By using gMe's iRecentUserMsgCnt and chatRequestOptions.max_tokens one can try to control the
implications of loading of the ai-model's context window by chat history, wrt chat response to
-some extent in a simple crude way.
+some extent in a simple crude way. You may also want to control the context size enabled when
+the server loads the ai-model, on the server end.
Sometimes the browser may be stuborn with caching of the file, so your updates to html/css/js

@@ -149,28 +209,15 @@ matter clearing site data, dont directly override site caching in all cases. Wor

have to change port. Or in dev tools of browser, you may be able to disable caching fully.

-Concept of multiple chat sessions with different servers, as well as saving and restoring of
-those across browser usage sessions, can be woven around the SimpleChat/MultiChatUI class and
-its instances relatively easily, however given the current goal of keeping this simple, it has
-not been added, for now.
+Currently the server to communicate with is maintained globally and not as part of a specific
+chat session. So if one changes the server ip/url in settings, then all chat sessions will auto
+switch to this new server, when you try using those sessions.

By switching between chat.add_system_begin/anytime, one can control whether one can change
the system prompt, anytime during the conversation or only at the beginning.

-read_json_early, is to experiment with reading json response data early on, if available,
-so that user can be shown generated data, as and when it is being generated, rather than
-at the end when full data is available.
-
-  the server flow doesnt seem to be sending back data early, atleast for request (inc options)
-  that is currently sent.
-
-  if able to read json data early on in future, as and when ai model is generating data, then
-  this helper needs to indirectly update the chat div with the recieved data, without waiting
-  for the overall data to be available.
### Default setup

By default things are setup to try and make the user experience a bit better, if possible.

@@ -179,7 +226,8 @@ However a developer when testing the server of ai-model may want to change these

Using iRecentUserMsgCnt reduce chat history context sent to the server/ai-model to be
just the system-prompt, prev-user-request-and-ai-response and cur-user-request, instead of
full chat history. This way if there is any response with garbage/repeatation, it doesnt
-mess with things beyond the next question/request/query, in some ways.
+mess with things beyond the next question/request/query, in some ways. The trim garbage
+option also tries to help avoid issues with garbage in the context to an extent.

Set max_tokens to 1024, so that a relatively large previous reponse doesnt eat up the space
available wrt next query-response. However dont forget that the server when started should

@@ -189,11 +237,33 @@ also be started with a model context size of 1k or more, to be on safe side.

internal n_predict, for now add the same here on the client side, maybe later add max_tokens
to /completions endpoint handling code on server side.

-Frequency and presence penalty fields are set to 1.2 in the set of fields sent to server
-along with the user query. So that the model is partly set to try avoid repeating text in
-its response.
+NOTE: One may want to experiment with frequency/presence penalty fields in chatRequestOptions
+wrt the set of fields sent to the server along with the user query, to check how the model
+behaves wrt repetitions in general in the generated text response.

-A end-user can change these behaviour by editing gMe from browser's devel-tool/console.
+An end-user can change these behaviours by editing gMe from the browser's devel-tool/console
+or by using the provided settings ui.
+### OpenAI / Equivalent API WebService
+
+One may be able to handshake with an OpenAI/equivalent api web service's /chat/completions
+endpoint for minimal chatting experimentation by setting the below.
+
+* the baseUrl in settings ui
+    * https://api.openai.com/v1 or similar
+* Wrt request body - gMe.chatRequestOptions
+    * model (settings ui)
+    * any additional fields if required in future
+* Wrt request headers - gMe.headers
+    * Authorization (available through settings ui)
+        * Bearer THE_OPENAI_API_KEY
+    * any additional optional header entries like "OpenAI-Organization", "OpenAI-Project" or so
+
+NOTE: Not tested, as there is no free tier api testing available. However logically this
+might work.
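(A concrete sketch of the above, not part of the diff; the key is a
placeholder and, as the note says, this flow is untested.)

    let gMe = document["gMe"];
    gMe.baseURL = "https://api.openai.com/v1";
    gMe.headers["Authorization"] = "Bearer THE_OPENAI_API_KEY";
    gMe.chatRequestOptions["model"] = "gpt-3.5-turbo";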
## At the end

File: examples/server/public_simplechat/simplechat.css

@@ -21,6 +21,17 @@
.role-user {
    background-color: lightgray;
}
+.role-trim {
+    background-color: lightpink;
+}
+
+.gridx2 {
+    display: grid;
+    grid-template-columns: repeat(2, 1fr);
+    border-bottom-style: dotted;
+    border-bottom-width: thin;
+    border-bottom-color: lightblue;
+}
.flex-grow {
    flex-grow: 1;

File: examples/server/public_simplechat/simplechat.js

@@ -2,6 +2,9 @@
// A simple completions and chat/completions test related web front end logic
// by Humans for All

+import * as du from "./datautils.mjs";
+import * as ui from "./ui.mjs"
+
class Roles {
    static System = "system";
    static User = "user";
@@ -9,40 +12,65 @@ class Roles {
}

class ApiEP {
-    static Chat = "chat";
-    static Completion = "completion";
+    static Type = {
+        Chat: "chat",
+        Completion: "completion",
+    }
+    static UrlSuffix = {
+        'chat': `/chat/completions`,
+        'completion': `/completions`,
+    }
+
+    /**
+     * Build the url from given baseUrl and apiEp id.
+     * @param {string} baseUrl
+     * @param {string} apiEP
+     */
+    static Url(baseUrl, apiEP) {
+        if (baseUrl.endsWith("/")) {
+            baseUrl = baseUrl.substring(0, baseUrl.length-1);
+        }
+        return `${baseUrl}${this.UrlSuffix[apiEP]}`;
+    }
}
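// (Editor note, illustrative usage, not part of the diff.) The helper
// normalizes a trailing slash before appending the endpoint suffix:
//   ApiEP.Url("http://127.0.0.1:8080/", ApiEP.Type.Chat)
//     -> "http://127.0.0.1:8080/chat/completions"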
let gUsageMsg = `
<p class="role-system">Usage</p>
<ul class="ul1">
-<li> Set system prompt above, to try control ai response charactersitic, if model supports same.</li>
+<li> System prompt above, to try control ai response characteristics.</li>
<ul class="ul2">
-<li> Completion mode normally wont have a system prompt.</li>
+<li> Completion mode - no system prompt normally.</li>
</ul>
-<li> Use shift+enter for inserting enter/newline.</li>
<li> Enter your query to ai assistant below.</li>
+<ul class="ul2">
+<li> Completion mode doesnt insert user/role: prefix implicitly.</li>
+<li> Use shift+enter for inserting enter/newline.</li>
+</ul>
<li> Default ContextWindow = [System, Last Query+Resp, Cur Query].</li>
<ul class="ul2">
-<li> experiment iRecentUserMsgCnt, max_tokens, model ctxt window to expand</li>
+<li> ChatHistInCtxt, MaxTokens, ModelCtxt window to expand</li>
</ul>
</ul>
`;
/** @typedef {{role: string, content: string}[]} ChatMessages */

+/** @typedef {{iLastSys: number, xchat: ChatMessages}} SimpleChatODS */

class SimpleChat {

-    constructor() {
+    /**
+     * @param {string} chatId
+     */
+    constructor(chatId) {
+        this.chatId = chatId;
        /**
         * Maintain in a form suitable for common LLM web service chat/completions' messages entry
         * @type {ChatMessages}
         */
        this.xchat = [];
        this.iLastSys = -1;
+        this.latestResponse = "";
    }

    clear() {
@@ -50,6 +78,27 @@ class SimpleChat {
        this.iLastSys = -1;
    }

+    ods_key() {
+        return `SimpleChat-${this.chatId}`
+    }
+
+    save() {
+        /** @type {SimpleChatODS} */
+        let ods = {iLastSys: this.iLastSys, xchat: this.xchat};
+        localStorage.setItem(this.ods_key(), JSON.stringify(ods));
+    }
+
+    load() {
+        let sods = localStorage.getItem(this.ods_key());
+        if (sods == null) {
+            return;
+        }
+        /** @type {SimpleChatODS} */
+        let ods = JSON.parse(sods);
+        this.iLastSys = ods.iLastSys;
+        this.xchat = ods.xchat;
+    }
    /**
     * Recent chat messages.
     * If iRecentUserMsgCnt < 0
@@ -94,6 +143,15 @@ class SimpleChat {
        return rchat;
    }

+    /**
+     * Collate the latest response from the server/ai-model, as it is becoming available.
+     * This is mainly useful for the stream mode.
+     * @param {string} content
+     */
+    append_response(content) {
+        this.latestResponse += content;
+    }

    /**
     * Add an entry into xchat
     * @param {string} role
@@ -107,6 +165,7 @@ class SimpleChat {
        if (role == Roles.System) {
            this.iLastSys = this.xchat.length - 1;
        }
+        this.save();
        return true;
    }

@@ -121,10 +180,8 @@ class SimpleChat {
        }
        let last = undefined;
        for(const x of this.recent_chat(gMe.iRecentUserMsgCnt)) {
-            let entry = document.createElement("p");
+            let entry = ui.el_create_append_p(`${x.role}: ${x.content}`, div);
            entry.className = `role-${x.role}`;
-            entry.innerText = `${x.role}: ${x.content}`;
-            div.appendChild(entry);
            last = entry;
        }
        if (last !== undefined) {
@@ -132,21 +189,45 @@ class SimpleChat {
        } else {
            if (bClear) {
                div.innerHTML = gUsageMsg;
+                gMe.setup_load(div, this);
                gMe.show_info(div);
            }
        }
+        return last;
    }
+    /**
+     * Setup the fetch headers.
+     * It picks the headers from gMe.headers.
+     * It inserts Authorization only if its non-empty.
+     * @param {string} apiEP
+     */
+    fetch_headers(apiEP) {
+        let headers = new Headers();
+        for(let k in gMe.headers) {
+            let v = gMe.headers[k];
+            if ((k == "Authorization") && (v.trim() == "")) {
+                continue;
+            }
+            headers.append(k, v);
+        }
+        return headers;
+    }

    /**
     * Add needed fields wrt json object to be sent wrt LLM web services completions endpoint.
     * The needed fields/options are picked from a global object.
+     * Add optional stream flag, if required.
     * Convert the json into string.
     * @param {Object} obj
     */
-    request_jsonstr(obj) {
+    request_jsonstr_extend(obj) {
        for(let k in gMe.chatRequestOptions) {
            obj[k] = gMe.chatRequestOptions[k];
        }
+        if (gMe.bStream) {
+            obj["stream"] = true;
+        }
        return JSON.stringify(obj);
    }

@@ -157,7 +238,7 @@ class SimpleChat {
        let req = {
            messages: this.recent_chat(gMe.iRecentUserMsgCnt),
        }
-        return this.request_jsonstr(req);
+        return this.request_jsonstr_extend(req);
    }

    /**
@@ -180,7 +261,60 @@ class SimpleChat {
        let req = {
            prompt: prompt,
        }
-        return this.request_jsonstr(req);
+        return this.request_jsonstr_extend(req);
+    }

+    /**
+     * Return a string form of json object suitable for specified api endpoint.
+     * @param {string} apiEP
+     */
+    request_jsonstr(apiEP) {
+        if (apiEP == ApiEP.Type.Chat) {
+            return this.request_messages_jsonstr();
+        } else {
+            return this.request_prompt_jsonstr(gMe.bCompletionInsertStandardRolePrefix);
+        }
+    }

+    /**
+     * Extract the ai-model/assistant's response from the http response got.
+     * Optionally trim the message wrt any garbage at the end.
+     * @param {any} respBody
+     * @param {string} apiEP
+     */
+    response_extract(respBody, apiEP) {
+        let assistant = "";
+        if (apiEP == ApiEP.Type.Chat) {
+            assistant = respBody["choices"][0]["message"]["content"];
+        } else {
+            try {
+                assistant = respBody["choices"][0]["text"];
+            } catch {
+                assistant = respBody["content"];
+            }
+        }
+        return assistant;
+    }

+    /**
+     * Extract the ai-model/assistant's response from the http response got in streaming mode.
+     * @param {any} respBody
+     * @param {string} apiEP
+     */
+    response_extract_stream(respBody, apiEP) {
+        let assistant = "";
+        if (apiEP == ApiEP.Type.Chat) {
+            if (respBody["choices"][0]["finish_reason"] !== "stop") {
+                assistant = respBody["choices"][0]["delta"]["content"];
+            }
+        } else {
+            try {
+                assistant = respBody["choices"][0]["text"];
+            } catch {
+                assistant = respBody["content"];
+            }
+        }
+        return assistant;
+    }
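// (Editor note, not part of the diff.) An illustrative OpenAI-style stream
// line that response_extract_stream consumes, after the "data:" prefix has
// been stripped by the caller; field values are made up:
//   {"choices":[{"delta":{"content":"Hel"},"finish_reason":null}]}
// The final line instead carries finish_reason "stop", hence the guard above.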
    /**
@@ -239,53 +373,99 @@ class SimpleChat {
        return sysPrompt;
    }

-}
-
-let gBaseURL = "http://127.0.0.1:8080";
-let gChatURL = {
-    'chat': `${gBaseURL}/chat/completions`,
-    'completion': `${gBaseURL}/completions`,
-}
-
-/**
- * Set the class of the children, based on whether it is the idSelected or not.
- * @param {HTMLDivElement} elBase
- * @param {string} idSelected
- * @param {string} classSelected
- * @param {string} classUnSelected
- */
-function el_children_config_class(elBase, idSelected, classSelected, classUnSelected="") {
-    for(let child of elBase.children) {
-        if (child.id == idSelected) {
-            child.className = classSelected;
-        } else {
-            child.className = classUnSelected;
-        }
-    }
-}
-
-/**
- * Create button and set it up.
- * @param {string} id
- * @param {(this: HTMLButtonElement, ev: MouseEvent) => any} callback
- * @param {string | undefined} name
- * @param {string | undefined} innerText
- */
-function el_create_button(id, callback, name=undefined, innerText=undefined) {
-    if (!name) {
-        name = id;
-    }
-    if (!innerText) {
-        innerText = id;
-    }
-    let btn = document.createElement("button");
-    btn.id = id;
-    btn.name = name;
-    btn.innerText = innerText;
-    btn.addEventListener("click", callback);
-    return btn;
-}
+    /**
+     * Handle the multipart response from server/ai-model
+     * @param {Response} resp
+     * @param {string} apiEP
+     * @param {HTMLDivElement} elDiv
+     */
+    async handle_response_multipart(resp, apiEP, elDiv) {
+        let elP = ui.el_create_append_p("", elDiv);
+        if (!resp.body) {
+            throw Error("ERRR:SimpleChat:SC:HandleResponseMultiPart:No body...");
+        }
+        let tdUtf8 = new TextDecoder("utf-8");
+        let rr = resp.body.getReader();
+        this.latestResponse = "";
+        let xLines = new du.NewLines();
+        while(true) {
+            let { value: cur, done: done } = await rr.read();
+            if (cur) {
+                let curBody = tdUtf8.decode(cur, {stream: true});
+                console.debug("DBUG:SC:PART:Str:", curBody);
+                xLines.add_append(curBody);
+            }
+            while(true) {
+                let curLine = xLines.shift(!done);
+                if (curLine == undefined) {
+                    break;
+                }
+                if (curLine.trim() == "") {
+                    continue;
+                }
+                if (curLine.startsWith("data:")) {
+                    curLine = curLine.substring(5);
+                }
+                let curJson = JSON.parse(curLine);
+                console.debug("DBUG:SC:PART:Json:", curJson);
+                this.append_response(this.response_extract_stream(curJson, apiEP));
+            }
+            elP.innerText = this.latestResponse;
+            elP.scrollIntoView(false);
+            if (done) {
+                break;
+            }
+        }
+        console.debug("DBUG:SC:PART:Full:", this.latestResponse);
+        return this.latestResponse;
+    }

+    /**
+     * Handle the oneshot response from server/ai-model
+     * @param {Response} resp
+     * @param {string} apiEP
+     */
+    async handle_response_oneshot(resp, apiEP) {
+        let respBody = await resp.json();
+        console.debug(`DBUG:SimpleChat:SC:${this.chatId}:HandleUserSubmit:RespBody:${JSON.stringify(respBody)}`);
+        return this.response_extract(respBody, apiEP);
+    }

+    /**
+     * Handle the response from the server be it in oneshot or multipart/stream mode.
+     * Also take care of the optional garbage trimming.
+     * @param {Response} resp
+     * @param {string} apiEP
+     * @param {HTMLDivElement} elDiv
+     */
+    async handle_response(resp, apiEP, elDiv) {
+        let theResp = {
+            assistant: "",
+            trimmed: "",
+        }
+        if (gMe.bStream) {
+            try {
+                theResp.assistant = await this.handle_response_multipart(resp, apiEP, elDiv);
+                this.latestResponse = "";
+            } catch (error) {
+                theResp.assistant = this.latestResponse;
+                this.add(Roles.Assistant, theResp.assistant);
+                this.latestResponse = "";
+                throw error;
+            }
+        } else {
+            theResp.assistant = await this.handle_response_oneshot(resp, apiEP);
+        }
+        if (gMe.bTrimGarbage) {
+            let origMsg = theResp.assistant;
+            theResp.assistant = du.trim_garbage_at_end(origMsg);
+            theResp.trimmed = origMsg.substring(theResp.assistant.length);
+        }
+        this.add(Roles.Assistant, theResp.assistant);
+        return theResp;
+    }
+
+}
@@ -302,14 +482,16 @@
        this.elDivChat = /** @type{HTMLDivElement} */(document.getElementById("chat-div"));
        this.elBtnUser = /** @type{HTMLButtonElement} */(document.getElementById("user-btn"));
        this.elInUser = /** @type{HTMLInputElement} */(document.getElementById("user-in"));
-        this.elSelectApiEP = /** @type{HTMLSelectElement} */(document.getElementById("api-ep"));
+        this.elDivHeading = /** @type{HTMLSelectElement} */(document.getElementById("heading"));
        this.elDivSessions = /** @type{HTMLDivElement} */(document.getElementById("sessions-div"));
+        this.elBtnSettings = /** @type{HTMLButtonElement} */(document.getElementById("settings"));

        this.validate_element(this.elInSystem, "system-in");
        this.validate_element(this.elDivChat, "chat-div");
        this.validate_element(this.elInUser, "user-in");
-        this.validate_element(this.elSelectApiEP, "api-ep");
+        this.validate_element(this.elDivHeading, "heading");
        this.validate_element(this.elDivChat, "sessions-div");
+        this.validate_element(this.elBtnSettings, "settings");
    }

    /**
@@ -350,13 +532,18 @@
            this.handle_session_switch(this.curChatId);
        }

+        this.elBtnSettings.addEventListener("click", (ev)=>{
+            this.elDivChat.replaceChildren();
+            gMe.show_settings(this.elDivChat);
+        });

        this.elBtnUser.addEventListener("click", (ev)=>{
            if (this.elInUser.disabled) {
                return;
            }
-            this.handle_user_submit(this.curChatId, this.elSelectApiEP.value).catch((/** @type{Error} */reason)=>{
+            this.handle_user_submit(this.curChatId, gMe.apiEP).catch((/** @type{Error} */reason)=>{
                let msg = `ERRR:SimpleChat\nMCUI:HandleUserSubmit:${this.curChatId}\n${reason.name}:${reason.message}`;
-                console.debug(msg.replace("\n", ":"));
+                console.error(msg.replace("\n", ":"));
                alert(msg);
                this.ui_reset_userinput();
            });
@@ -377,6 +564,8 @@
            // allow user to insert enter into the system prompt using shift+enter.
            // while just pressing enter key will lead to setting the system prompt.
            if ((ev.key === "Enter") && (!ev.shiftKey)) {
+                let value = this.elInSystem.value;
+                this.elInSystem.value = value.substring(0,value.length-1);
                let chat = this.simpleChats[this.curChatId];
                chat.add_system_anytime(this.elInSystem.value, this.curChatId);
                chat.show(this.elDivChat);
@@ -392,34 +581,12 @@
     * @param {boolean} bSwitchSession
     */
    new_chat_session(chatId, bSwitchSession=false) {
-        this.simpleChats[chatId] = new SimpleChat();
+        this.simpleChats[chatId] = new SimpleChat(chatId);
        if (bSwitchSession) {
            this.handle_session_switch(chatId);
        }
    }

-    /**
-     * Try read json response early, if available.
-     * @param {Response} resp
-     */
-    async read_json_early(resp) {
-        if (!resp.body) {
-            throw Error("ERRR:SimpleChat:MCUI:ReadJsonEarly:No body...");
-        }
-        let tdUtf8 = new TextDecoder("utf-8");
-        let rr = resp.body.getReader();
-        let gotBody = "";
-        while(true) {
-            let { value: cur, done: done} = await rr.read();
-            let curBody = tdUtf8.decode(cur);
-            console.debug("DBUG:SC:PART:", curBody);
-            gotBody += curBody;
-            if (done) {
-                break;
-            }
-        }
-        return JSON.parse(gotBody);
-    }

    /**
     * Handle user query submit request, wrt specified chat session.
@@ -434,7 +601,7 @@
        // So if user wants to simulate a multi-chat based completion query,
        // they will have to enter the full thing, as a suitable multiline
        // user input/query.
-        if ((apiEP == ApiEP.Completion) && (gMe.bCompletionFreshChatAlways)) {
+        if ((apiEP == ApiEP.Type.Completion) && (gMe.bCompletionFreshChatAlways)) {
            chat.clear();
        }
@@ -447,41 +614,26 @@
        }

        chat.show(this.elDivChat);

-        let theBody;
-        let theUrl = gChatURL[apiEP]
-        if (apiEP == ApiEP.Chat) {
-            theBody = chat.request_messages_jsonstr();
-        } else {
-            theBody = chat.request_prompt_jsonstr(gMe.bCompletionInsertStandardRolePrefix);
-        }
+        let theUrl = ApiEP.Url(gMe.baseURL, apiEP);
+        let theBody = chat.request_jsonstr(apiEP);

        this.elInUser.value = "working...";
        this.elInUser.disabled = true;
        console.debug(`DBUG:SimpleChat:MCUI:${chatId}:HandleUserSubmit:${theUrl}:ReqBody:${theBody}`);
+        let theHeaders = chat.fetch_headers(apiEP);
        let resp = await fetch(theUrl, {
            method: "POST",
-            headers: {
-                "Content-Type": "application/json",
-            },
+            headers: theHeaders,
            body: theBody,
        });

-        let respBody = await resp.json();
-        //let respBody = await this.read_json_early(resp);
-        console.debug(`DBUG:SimpleChat:MCUI:${chatId}:HandleUserSubmit:RespBody:${JSON.stringify(respBody)}`);
-        let assistantMsg;
-        if (apiEP == ApiEP.Chat) {
-            assistantMsg = respBody["choices"][0]["message"]["content"];
-        } else {
-            try {
-                assistantMsg = respBody["choices"][0]["text"];
-            } catch {
-                assistantMsg = respBody["content"];
-            }
-        }
-        chat.add(Roles.Assistant, assistantMsg);
+        let theResp = await chat.handle_response(resp, apiEP, this.elDivChat);
        if (chatId == this.curChatId) {
            chat.show(this.elDivChat);
+            if (theResp.trimmed.length > 0) {
+                let p = ui.el_create_append_p(`TRIMMED:${theResp.trimmed}`, this.elDivChat);
+                p.className="role-trim";
+            }
        } else {
            console.debug(`DBUG:SimpleChat:MCUI:HandleUserSubmit:ChatId has changed:[${chatId}] [${this.curChatId}]`);
        }
@ -500,7 +652,7 @@ class MultiChatUI {
} }
elDiv.replaceChildren(); elDiv.replaceChildren();
// Btn for creating new chat session // Btn for creating new chat session
let btnNew = el_create_button("New CHAT", (ev)=> { let btnNew = ui.el_create_button("New CHAT", (ev)=> {
if (this.elInUser.disabled) { if (this.elInUser.disabled) {
console.error(`ERRR:SimpleChat:MCUI:NewChat:Current session [${this.curChatId}] awaiting response, ignoring request...`); console.error(`ERRR:SimpleChat:MCUI:NewChat:Current session [${this.curChatId}] awaiting response, ignoring request...`);
alert("ERRR:SimpleChat\nMCUI:NewChat\nWait for response to pending query, before starting new chat session"); alert("ERRR:SimpleChat\nMCUI:NewChat\nWait for response to pending query, before starting new chat session");
@@ -514,7 +666,7 @@ class MultiChatUI {
            }
            this.new_chat_session(chatIdGot, true);
            this.create_session_btn(elDiv, chatIdGot);
-            el_children_config_class(elDiv, chatIdGot, "session-selected", "");
+            ui.el_children_config_class(elDiv, chatIdGot, "session-selected", "");
        });
        elDiv.appendChild(btnNew);
        // Btns for existing chat sessions
@@ -528,7 +680,7 @@ class MultiChatUI {
    }

    create_session_btn(elDiv, cid) {
-        let btn = el_create_button(cid, (ev)=>{
+        let btn = ui.el_create_button(cid, (ev)=>{
            let target = /** @type{HTMLButtonElement} */(ev.target);
            console.debug(`DBUG:SimpleChat:MCUI:SessionClick:${target.id}`);
            if (this.elInUser.disabled) {
@@ -537,7 +689,7 @@ class MultiChatUI {
                return;
            }
            this.handle_session_switch(target.id);
-            el_children_config_class(elDiv, target.id, "session-selected", "");
+            ui.el_children_config_class(elDiv, target.id, "session-selected", "");
        });
        elDiv.appendChild(btn);
        return btn;
@@ -567,46 +719,183 @@ class MultiChatUI {
 class Me {

    constructor() {
+        this.baseURL = "http://127.0.0.1:8080";
        this.defaultChatIds = [ "Default", "Other" ];
        this.multiChat = new MultiChatUI();
+        this.bStream = true;
        this.bCompletionFreshChatAlways = true;
        this.bCompletionInsertStandardRolePrefix = false;
+        this.bTrimGarbage = true;
        this.iRecentUserMsgCnt = 2;
+        this.sRecentUserMsgCnt = {
+            "Full": -1,
+            "Last0": 1,
+            "Last1": 2,
+            "Last2": 3,
+            "Last4": 5,
+        };
+        this.apiEP = ApiEP.Type.Chat;
+        this.headers = {
+            "Content-Type": "application/json",
+            "Authorization": "", // Authorization: Bearer OPENAI_API_KEY
+        }
        // Add needed fields wrt json object to be sent wrt LLM web services completions endpoint.
        this.chatRequestOptions = {
+            "model": "gpt-3.5-turbo",
            "temperature": 0.7,
            "max_tokens": 1024,
-            "frequency_penalty": 1.2,
-            "presence_penalty": 1.2,
-            "n_predict": 1024
+            "n_predict": 1024,
+            //"frequency_penalty": 1.2,
+            //"presence_penalty": 1.2,
        };
    }

    /**
+     * Disable console.debug by mapping it to an empty function.
+     */
+    debug_disable() {
+        this.console_debug = console.debug;
+        console.debug = () => {
+        };
+    }
+
+    /**
+     * Setup the load saved chat ui.
+     * @param {HTMLDivElement} div
+     * @param {SimpleChat} chat
+     */
+    setup_load(div, chat) {
+        if (!(chat.ods_key() in localStorage)) {
+            return;
+        }
+        div.innerHTML += `<p class="role-system">Restore</p>
+            <p>Load previously saved chat session, if available</p>`;
+        let btn = ui.el_create_button(chat.ods_key(), (ev)=>{
+            console.log("DBUG:SimpleChat:SC:Load", chat);
+            chat.load();
+            queueMicrotask(()=>{
+                chat.show(div);
+                this.multiChat.elInSystem.value = chat.get_system_latest();
+            });
+        });
+        div.appendChild(btn);
+    }
+
+    /**
+     * Show the configurable parameters info in the passed Div element.
+     * @param {HTMLDivElement} elDiv
+     * @param {boolean} bAll
+     */
+    show_info(elDiv, bAll=false) {
+        let p = ui.el_create_append_p("Settings (devel-tools-console document[gMe])", elDiv);
+        p.className = "role-system";
+        if (bAll) {
+            ui.el_create_append_p(`baseURL:${this.baseURL}`, elDiv);
+            ui.el_create_append_p(`Authorization:${this.headers["Authorization"]}`, elDiv);
+            ui.el_create_append_p(`bStream:${this.bStream}`, elDiv);
+            ui.el_create_append_p(`bCompletionFreshChatAlways:${this.bCompletionFreshChatAlways}`, elDiv);
+            ui.el_create_append_p(`bCompletionInsertStandardRolePrefix:${this.bCompletionInsertStandardRolePrefix}`, elDiv);
+            ui.el_create_append_p(`bTrimGarbage:${this.bTrimGarbage}`, elDiv);
+            ui.el_create_append_p(`iRecentUserMsgCnt:${this.iRecentUserMsgCnt}`, elDiv);
+            ui.el_create_append_p(`ApiEndPoint:${this.apiEP}`, elDiv);
+        }
+        ui.el_create_append_p(`chatRequestOptions:${JSON.stringify(this.chatRequestOptions, null, " - ")}`, elDiv);
+        ui.el_create_append_p(`headers:${JSON.stringify(this.headers, null, " - ")}`, elDiv);
+    }
+
+    /**
+     * Auto create ui input elements for fields in ChatRequestOptions.
+     * Currently supports text and number field types.
      * @param {HTMLDivElement} elDiv
      */
-    show_info(elDiv) {
+    show_settings_chatrequestoptions(elDiv) {
+        let typeDict = {
+            "string": "text",
+            "number": "number",
+        };
+        let fs = document.createElement("fieldset");
+        let legend = document.createElement("legend");
+        legend.innerText = "ChatRequestOptions";
+        fs.appendChild(legend);
+        elDiv.appendChild(fs);
+        for(const k in this.chatRequestOptions) {
+            let val = this.chatRequestOptions[k];
+            let type = typeof(val);
+            if (!((type == "string") || (type == "number"))) {
+                continue;
+            }
+            let inp = ui.el_creatediv_input(`Set${k}`, k, typeDict[type], this.chatRequestOptions[k], (val)=>{
+                if (type == "number") {
+                    val = Number(val);
+                }
+                this.chatRequestOptions[k] = val;
+            });
+            fs.appendChild(inp.div);
+        }
+    }

-        var p = document.createElement("p");
-        p.innerText = "Settings (devel-tools-console gMe)";
-        p.className = "role-system";
-        elDiv.appendChild(p);
+    /**
+     * Show settings ui for configurable parameters, in the passed Div element.
+     * @param {HTMLDivElement} elDiv
+     */
+    show_settings(elDiv) {

-        var p = document.createElement("p");
-        p.innerText = `bCompletionFreshChatAlways:${this.bCompletionFreshChatAlways}`;
-        elDiv.appendChild(p);
+        let inp = ui.el_creatediv_input("SetBaseURL", "BaseURL", "text", this.baseURL, (val)=>{
+            this.baseURL = val;
+        });
+        elDiv.appendChild(inp.div);

-        p = document.createElement("p");
-        p.innerText = `bCompletionInsertStandardRolePrefix:${this.bCompletionInsertStandardRolePrefix}`;
-        elDiv.appendChild(p);
+        inp = ui.el_creatediv_input("SetAuthorization", "Authorization", "text", this.headers["Authorization"], (val)=>{
+            this.headers["Authorization"] = val;
+        });
+        inp.el.placeholder = "Bearer OPENAI_API_KEY";
+        elDiv.appendChild(inp.div);

-        p = document.createElement("p");
-        p.innerText = `iRecentUserMsgCnt:${this.iRecentUserMsgCnt}`;
-        elDiv.appendChild(p);
+        let bb = ui.el_creatediv_boolbutton("SetStream", "Stream", {true: "[+] yes stream", false: "[-] do oneshot"}, this.bStream, (val)=>{
+            this.bStream = val;
+        });
+        elDiv.appendChild(bb.div);

-        p = document.createElement("p");
-        p.innerText = `chatRequestOptions:${JSON.stringify(this.chatRequestOptions)}`;
-        elDiv.appendChild(p);
+        bb = ui.el_creatediv_boolbutton("SetCompletionFreshChatAlways", "CompletionFreshChatAlways", {true: "[+] yes fresh", false: "[-] no, with history"}, this.bCompletionFreshChatAlways, (val)=>{
+            this.bCompletionFreshChatAlways = val;
+        });
+        elDiv.appendChild(bb.div);
+
+        bb = ui.el_creatediv_boolbutton("SetCompletionInsertStandardRolePrefix", "CompletionInsertStandardRolePrefix", {true: "[+] yes insert", false: "[-] dont insert"}, this.bCompletionInsertStandardRolePrefix, (val)=>{
+            this.bCompletionInsertStandardRolePrefix = val;
+        });
+        elDiv.appendChild(bb.div);
+
+        bb = ui.el_creatediv_boolbutton("SetTrimGarbage", "TrimGarbage", {true: "[+] yes trim", false: "[-] dont trim"}, this.bTrimGarbage, (val)=>{
+            this.bTrimGarbage = val;
+        });
+        elDiv.appendChild(bb.div);
+
+        let sel = ui.el_creatediv_select("SetChatHistoryInCtxt", "ChatHistoryInCtxt", this.sRecentUserMsgCnt, this.iRecentUserMsgCnt, (val)=>{
+            this.iRecentUserMsgCnt = this.sRecentUserMsgCnt[val];
+        });
+        elDiv.appendChild(sel.div);
+
+        sel = ui.el_creatediv_select("SetApiEP", "ApiEndPoint", ApiEP.Type, this.apiEP, (val)=>{
+            this.apiEP = ApiEP.Type[val];
+        });
+        elDiv.appendChild(sel.div);
+
+        this.show_settings_chatrequestoptions(elDiv);
    }
@@ -619,6 +908,9 @@ let gMe;
 function startme() {
    console.log("INFO:SimpleChat:StartMe:Starting...");
    gMe = new Me();
+    gMe.debug_disable();
+    document["gMe"] = gMe;
+    document["du"] = du;
    for (let cid of gMe.defaultChatIds) {
        gMe.multiChat.new_chat_session(cid);
    }
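
For orientation: the ApiEP.Url and chat.fetch_headers helpers used in the hunks
above are defined in an earlier part of this diff, not reproduced here. A minimal
sketch of the kind of url mapping ApiEP.Url performs - hypothetical names and
suffix values, the committed implementation may differ:

    // Sketch only: map an api endpoint type to a url on the configured base.
    let ApiEPSketch = {
        Type: {
            Chat: "chat",
            Completion: "completion",
        },
        UrlSuffix: {
            "chat": "/chat/completions",
            "completion": "/completions",
        },
        Url: (baseUrl, apiEP) => {
            return `${baseUrl}${ApiEPSketch.UrlSuffix[apiEP]}`;
        },
    };
    // ApiEPSketch.Url("http://127.0.0.1:8080", ApiEPSketch.Type.Chat)
    // => "http://127.0.0.1:8080/chat/completions"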

ui.mjs (new file):

@@ -0,0 +1,211 @@
//@ts-check
// Helpers to work with html elements
// by Humans for All
//
/**
* Set the class of the children, based on whether it is the idSelected or not.
* @param {HTMLDivElement} elBase
* @param {string} idSelected
* @param {string} classSelected
* @param {string} classUnSelected
*/
export function el_children_config_class(elBase, idSelected, classSelected, classUnSelected="") {
for(let child of elBase.children) {
if (child.id == idSelected) {
child.className = classSelected;
} else {
child.className = classUnSelected;
}
}
}
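// Usage sketch (hypothetical, not part of the module): mark the clicked
// session button as selected and clear the class on its siblings.
//   el_children_config_class(elDivSessions, "Default", "session-selected", "");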
/**
* Create button and set it up.
* @param {string} id
* @param {(this: HTMLButtonElement, ev: MouseEvent) => any} callback
* @param {string | undefined} name
* @param {string | undefined} innerText
*/
export function el_create_button(id, callback, name=undefined, innerText=undefined) {
if (!name) {
name = id;
}
if (!innerText) {
innerText = id;
}
let btn = document.createElement("button");
btn.id = id;
btn.name = name;
btn.innerText = innerText;
btn.addEventListener("click", callback);
return btn;
}
/**
 * Create a para and set it up. Optionally append it to a passed parent.
* @param {string} text
* @param {HTMLElement | undefined} elParent
* @param {string | undefined} id
*/
export function el_create_append_p(text, elParent=undefined, id=undefined) {
let para = document.createElement("p");
para.innerText = text;
if (id) {
para.id = id;
}
if (elParent) {
elParent.appendChild(para);
}
return para;
}
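// Usage sketch (hypothetical): append a styled para, the way the chat ui
// shows the trimmed part of a response.
//   let p = el_create_append_p(`TRIMMED:${theResp.trimmed}`, elDivChat);
//   p.className = "role-trim";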
/**
 * Create a button which represents a bool value, using the specified texts for true and false.
 * Whenever the user clicks the button, it toggles the value and updates the shown text.
*
* @param {string} id
* @param {{true: string, false: string}} texts
* @param {boolean} defaultValue
* @param {function(boolean):void} cb
*/
export function el_create_boolbutton(id, texts, defaultValue, cb) {
let el = document.createElement("button");
el["xbool"] = defaultValue;
el["xtexts"] = structuredClone(texts);
el.innerText = el["xtexts"][String(defaultValue)];
if (id) {
el.id = id;
}
el.addEventListener('click', (ev)=>{
el["xbool"] = !el["xbool"];
el.innerText = el["xtexts"][String(el["xbool"])];
cb(el["xbool"]);
})
return el;
}
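// Usage sketch (hypothetical): a standalone toggle for streaming mode,
// mirroring how show_settings uses it through el_creatediv_boolbutton below.
//   let btn = el_create_boolbutton("SetStream", {true: "[+] yes stream", false: "[-] do oneshot"}, true, (val)=>{
//       console.log("stream:", val);
//   });
//   document.body.appendChild(btn);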
/**
* Create a div wrapped button which represents bool value using specified text wrt true and false.
* @param {string} id
* @param {string} label
* @param {{ true: string; false: string; }} texts
* @param {boolean} defaultValue
* @param {(arg0: boolean) => void} cb
* @param {string} className
*/
export function el_creatediv_boolbutton(id, label, texts, defaultValue, cb, className="gridx2") {
let div = document.createElement("div");
div.className = className;
let lbl = document.createElement("label");
lbl.setAttribute("for", id);
lbl.innerText = label;
div.appendChild(lbl);
let btn = el_create_boolbutton(id, texts, defaultValue, cb);
div.appendChild(btn);
return { div: div, el: btn };
}
/**
* Create a select ui element, with a set of options to select from.
* * options: an object which contains name-value pairs
 * * defaultOption: the value whose name should be chosen, by default.
 * * cb : the callback receives the name string of the selected option.
*
* @param {string} id
* @param {Object<string,*>} options
* @param {*} defaultOption
* @param {function(string):void} cb
*/
export function el_create_select(id, options, defaultOption, cb) {
let el = document.createElement("select");
el["xselected"] = defaultOption;
el["xoptions"] = structuredClone(options);
for(let cur of Object.keys(options)) {
let op = document.createElement("option");
op.value = cur;
op.innerText = cur;
if (options[cur] == defaultOption) {
op.selected = true;
}
el.appendChild(op);
}
if (id) {
el.id = id;
el.name = id;
}
el.addEventListener('change', (ev)=>{
let target = /** @type{HTMLSelectElement} */(ev.target);
console.log("DBUG:UI:Select:", id, ":", target.value);
cb(target.value);
})
return el;
}
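// Usage sketch (hypothetical): the callback gets the selected option's name,
// not its value, so the caller maps the name back to a value, as show_settings
// does wrt ChatHistoryInCtxt.
//   let opts = { "Full": -1, "Last0": 1, "Last1": 2 };
//   let sel = el_create_select("SetChatHistoryInCtxt", opts, 2, (name)=>{
//       console.log("picked:", opts[name]);
//   });
//   document.body.appendChild(sel);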
/**
* Create a div wrapped select ui element, with a set of options to select from.
*
* @param {string} id
* @param {any} label
* @param {{ [x: string]: any; }} options
* @param {any} defaultOption
* @param {(arg0: string) => void} cb
* @param {string} className
*/
export function el_creatediv_select(id, label, options, defaultOption, cb, className="gridx2") {
let div = document.createElement("div");
div.className = className;
let lbl = document.createElement("label");
lbl.setAttribute("for", id);
lbl.innerText = label;
div.appendChild(lbl);
    let sel = el_create_select(id, options, defaultOption, cb);
div.appendChild(sel);
return { div: div, el: sel };
}
/**
* Create a input ui element.
*
* @param {string} id
* @param {string} type
* @param {any} defaultValue
* @param {function(any):void} cb
*/
export function el_create_input(id, type, defaultValue, cb) {
let el = document.createElement("input");
el.type = type;
el.value = defaultValue;
if (id) {
el.id = id;
}
el.addEventListener('change', (ev)=>{
cb(el.value);
})
return el;
}
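// Usage sketch (hypothetical): a number input still delivers its value as a
// string here, so callers convert explicitly, the way
// show_settings_chatrequestoptions does with Number(val).
//   let inp = el_create_input("Setmax_tokens", "number", 1024, (val)=>{
//       console.log("max_tokens:", Number(val));
//   });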
/**
* Create a div wrapped input.
*
* @param {string} id
* @param {string} label
* @param {string} type
* @param {any} defaultValue
* @param {function(any):void} cb
* @param {string} className
*/
export function el_creatediv_input(id, label, type, defaultValue, cb, className="gridx2") {
let div = document.createElement("div");
div.className = className;
let lbl = document.createElement("label");
lbl.setAttribute("for", id);
lbl.innerText = label;
div.appendChild(lbl);
let el = el_create_input(id, type, defaultValue, cb);
div.appendChild(el);
return { div: div, el: el };
}
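
Taken together, the el_creatediv_* helpers return a {div, el} pair, so callers can
append the labelled wrapper while still holding the inner element, the way
show_settings sets the Authorization input's placeholder. A minimal assembly
sketch, assuming the module is imported as ui and a hypothetical "settings" div:

    import * as ui from './ui.mjs';

    let elDiv = /** @type{HTMLDivElement} */(document.getElementById("settings"));
    let inp = ui.el_creatediv_input("SetBaseURL", "BaseURL", "text", "http://127.0.0.1:8080", (val)=>{
        console.log("baseURL:", val);
    });
    inp.el.placeholder = "http://host:port";
    elDiv.appendChild(inp.div);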