llama.cpp/prompts/LLM-questions.txt

In the context of LLMs, what is "Attention"?
In the context of LLMs, what is a completion?
In the context of LLMs, what is a prompt?
In the context of LLMs, what is GELU?
In the context of LLMs, what is ReLU?
In the context of LLMs, what is softmax?
In the context of LLMs, what is decoding?
In the context of LLMs, what is encoding?
In the context of LLMs, what is tokenizing?
In the context of LLMs, what is an embedding?
In the context of LLMs, what is quantization?
In the context of LLMs, what is a tensor?
In the context of LLMs, what is a sparse tensor?
In the context of LLMs, what is a vector?
In the context of LLMs, how is attention implemented?
In the context of LLMs, why is attention all you need?
In the context of LLMs, what is "RoPE" and what is it used for?
In the context of LLMs, what is "LoRA" and what is it used for?
In the context of LLMs, what are weights?
In the context of LLMs, what are biases?
In the context of LLMs, what are checkpoints?
In the context of LLMs, what is "perplexity"?
In the context of LLMs, what are models?
In the context of machine learning, what is "catastrophic forgetting"?
In the context of machine learning, what is "elastic weight consolidation (EWC)"?
In the context of neural nets, what is a hidden layer?
In the context of neural nets, what is a convolution?
In the context of neural nets, what is dropout?
In the context of neural nets, what is cross-entropy?
In the context of neural nets, what is over-fitting?
In the context of neural nets, what is under-fitting?
What is the difference between an interpreted computer language and a compiled computer language?
In the context of software development, what is a debugger?
When processing using a GPU, what is off-loading?
When processing using a GPU, what is a batch?
When processing using a GPU, what is a block?
When processing using a GPU, what is the difference between a batch and a block?
When processing using a GPU, what is a scratch tensor?
When processing using a GPU, what is a layer?
When processing using a GPU, what is a cache?
When processing using a GPU, what is unified memory?
When processing using a GPU, what is VRAM?
When processing using a GPU, what is a kernel?
When processing using a GPU, what is "Metal"?
In the context of LLMs, what are "Zero-Shot", "One-Shot" and "Few-Shot" learning models?
In the context of LLMs, what is the "Transformer-model" architecture?
In the context of LLMs, what is "Multi-Head Attention"?
In the context of LLMs, what is "Self-Attention"?
In the context of transformer-model architectures, how do attention mechanisms use masks?
parallel : add option to load external prompt file (#3416). Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>. 2023-10-06 13:16:38 +00:00