Changes:
- Move each example into its own function. This makes the code much
easier to read and understand.
- Make it easy to run only one test by commenting out function
calls in main().
- Make the output easy to parse by indenting the output for each example.
- Add shebang and +x bit to make it clear it's an executable.
- Make the host configurable via --host with a default 127.0.0.1:8080.
- Make the code look in the tools list to call the registered tool,
instead of hardcoding the returned values. This makes the code more
copy-pastable.
- Add error checking, so that the program exits 1 if the LLM didn't
return the expected values. This is useful for checking correctness.
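For illustration, the layout described above could look roughly like the
sketch below. It is not the actual example script: get_weather, TOOLS,
call_tool, check and example_single are hypothetical names, and the tool call
is canned so the sketch runs without a server.

```python
#!/usr/bin/env python3
"""Sketch of the script layout described above; all names are hypothetical."""
import argparse
import sys


def get_weather(location: str) -> str:
    """Stand-in tool; a real one would query a weather service."""
    return f"sunny in {location}"


# Registry mapping tool names to callables, so that a tool call returned by
# the model is dispatched by name instead of hardcoding the expected result.
TOOLS = {"get_weather": get_weather}


def call_tool(name: str, arguments: dict) -> str:
    """Look up the registered tool and call it with the model's arguments."""
    if name not in TOOLS:
        raise ValueError(f"model requested unknown tool: {name!r}")
    return TOOLS[name](**arguments)


def check(condition: bool, message: str) -> None:
    """Exit with status 1 when the model's answer is not what was expected."""
    if not condition:
        print(f"  ERROR: {message}", file=sys.stderr)
        sys.exit(1)


def example_single(host: str) -> None:
    """One example per function; its output is indented under its name."""
    print("example_single:")
    # A real example would send a request to the server at `host` and parse
    # the reply; the tool call below is canned (an assumption of this sketch).
    tool_call = {"name": "get_weather", "arguments": {"location": "Paris"}}
    result = call_tool(tool_call["name"], tool_call["arguments"])
    check(result.startswith("sunny"), f"unexpected tool result: {result!r}")
    print(f"  {result}")


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("--host", default="127.0.0.1:8080")
    return parser.parse_args(argv)


def main(argv=None) -> int:
    args = parse_args(argv)
    # Comment out calls below to run only a subset of the examples.
    example_single(args.host)
    return 0
```

A real script would end with `if __name__ == "__main__": sys.exit(main())`,
so that the process exit status reflects whether every check passed.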
Testing:
- Tested with Mistral-7B-Instruct-v0.3 in F16 and Q5_K_M and
Meta-Llama-3-8B-Instruct in F16 and Q5_K_M.
- I did not observe a failure even once in Mistral-7B-Instruct-v0.3.
- Llama-3 failed about a third of the time in example_concurrent: it
returned only one call instead of 3, even with F16.
Potential follow-ups:
- Do not fix the prompt encoding yet. Surprisingly, it mostly works even
if the prompt encoding is not optimized for the model.
- Add chained answer and response.
Test-only change.
* py : type-check all Python scripts with Pyright
* server-tests : use trailing slash in openai base_url
* server-tests : add more type annotations
* server-tests : strip "chat" from base_url in oai_chat_completions
* server-tests : model metadata is a dict
* ci : disable pip cache in type-check workflow
The cache is not shared between branches, and it's 250MB in size,
so it would become quite a big part of the 10GB cache limit of the repo.
* py : fix new type errors from master branch
* tests : fix test-tokenizer-random.py
Apparently, gcc applies optimisations even when pre-processing,
which confuses pycparser.
* ci : only show warnings and errors in python type-check
The "information" level otherwise has entries
from 'examples/pydantic_models_to_grammar.py',
which could be confusing for someone trying to figure out what failed,
considering that these messages can safely be ignored
even though they look like errors.