# run with: ./tests.sh --no-skipped --tags passkey
@passkey
@slow
Feature: Passkey / Self-extend with context shift

  Background: Server startup
    Given a server listening on localhost:8080

  # Generates a long text of junk and inserts a secret passkey number inside it.
  # Then we query the LLM for the secret passkey.
  # see #3856 and #4810
  Scenario Outline: Passkey
    Given a model file <hf_file> from HF repo <hf_repo>
    And   <n_batch> as batch size
    And   <n_junk> as number of junk
    And   <n_predicted> server max tokens to predict
    And   42 as seed
    And   0.0 temperature
    And   <n_ctx> KV cache size
    And   1 slots
    And   <n_ga> group attention factor to extend context size through self-extend
    And   <n_ga_w> group attention width to extend context size through self-extend
    # Can be overridden with N_GPU_LAYERS
    And   <ngl> GPU offloaded layers
    Then  the server is starting
    # Higher timeout because the model may need to be downloaded from the internet
    Then  the server is healthy with timeout 120 seconds
    Given available models
    Then  model 0 is trained on <n_ctx_train> tokens context
    Given a prefix prompt:
    """
    here is an important info hidden inside a lot of irrelevant text. Find it and memorize them. I will quiz you about the important information there.
    """
    And a passkey prompt template:
    """
    The pass key is <passkey> Remember it. <passkey> is the pass key.
    """
    And a junk suffix prompt:
    """
    The grass is green. The sky is blue. The sun is yellow. Here we go. There and back again.
    """
    And a suffix prompt:
    """
    What is the pass key?
    The pass key is
    """
    Given a "<passkey>" passkey challenge prompt with the passkey inserted every <i_pos> junk
    And   a completion request with no api error
    Then  <n_predicted> tokens are predicted matching <re_content>

    Examples:
      | hf_repo             | hf_file           | n_ctx_train | ngl | n_ctx | n_batch | n_ga | n_ga_w | n_junk | i_pos | passkey | n_predicted | re_content      |
      | TheBloke/phi-2-GGUF | phi-2.Q4_K_M.gguf | 2048        | 5   | 8192  | 512     | 4    | 512    | 250    | 50    | 42      | 1           | 42              |
      | TheBloke/phi-2-GGUF | phi-2.Q4_K_M.gguf | 2048        | 5   | 8192  | 512     | 2    | 512    | 250    | 50    | 42      | 1           | \b((?!42)\w)+\b |
      #| TheBloke/Llama-2-7B-GGUF | llama-2-7b.Q2_K.gguf | 4096 | 3 | 16384 | 512 | 4 | 512 | 500 | 300 | 1234 | 5 | 1234 |
      #| TheBloke/Mixtral-8x7B-v0.1-GGUF | mixtral-8x7b-v0.1.Q2_K.gguf | 32768 | 2 | 16384 | 512 | 4 | 512 | 500 | 100 | 0987 | 5 | 0 # 987 |
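      # Reading the two phi-2 rows (an interpretation, not stated by the test itself):
      # this assumes self-extend stretches the usable context to roughly n_ctx_train * n_ga tokens.
      #   n_ga = 4: 2048 * 4 = 8192, which covers n_ctx = 8192, so the passkey 42 is expected back.
      #   n_ga = 2: 2048 * 2 = 4096, which is below n_ctx = 8192, so the row presumably acts as a
      #             negative check: the regex accepts any word that does not contain 42.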