-`-c N, --ctx-size N`: Set the size of the prompt context. The default is 4096; if the LLaMA model was built with a longer context, increasing this value will provide better results for longer input/inference.
The `infill` program provides several ways to interact with the LLaMA models using input prompts:
-`--in-prefix PROMPT_BEFORE_CURSOR`: Provide the prefix directly as a command-line option.
-`--in-suffix PROMPT_AFTER_CURSOR`: Provide the suffix directly as a command-line option.
-`--interactive-first`: Run the program in interactive mode and wait for input right away. (More on this below.)
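A non-interactive invocation might look like the following sketch. The model path and the prefix/suffix strings are placeholders, not part of this repository; substitute your own model and code fragment:

```shell
# Hypothetical example: ask the model to fill in the body between
# a function definition (prefix) and a trailing statement (suffix).
./infill -m ./models/your-model.gguf \
  --in-prefix "def helloworld():\n    print(\"hell" \
  --in-suffix "\n    print(\"goodbye world\")\n"
```

The model generates only the text that belongs between the prefix and the suffix, which is what makes this mode suitable for fill-in-the-middle code completion.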
## Interaction
The `infill` program offers a seamless way to interact with LLaMA models, allowing users to receive real-time infill suggestions. The interactive mode can be triggered using `--interactive` or `--interactive-first`.
### Interaction Options
-`-i, --interactive`: Run the program in interactive mode, allowing users to get real-time code suggestions from the model.
-`--interactive-first`: Run the program in interactive mode and immediately wait for user input before starting the text generation.
-`--color`: Enable colorized output to visually distinguish between prompts, user input, and generated text.
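Combining these options, an interactive session could be started as in this sketch (again, the model path is a placeholder):

```shell
# Hypothetical example: start an interactive infill session with
# colorized output, waiting for user input before generating.
./infill -m ./models/your-model.gguf --interactive-first --color
```

With `--interactive-first`, the program waits for a prefix and suffix from the user before producing its first completion, rather than generating immediately.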