mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2024-11-11 13:30:35 +00:00
a8777ad84e
* Enable external file and add datestamp
* Add name of external file at end
* Upload ToK2024
* Delete ToK2024.txt
* Experiments with jeopardy
* Move ParallelQuestions to /proimpts and rename
* Interim commit
* Interim commit
* Final revision
* Remove trailing whitespace
* remove cmake_all.sh
* Remove cmake_all.sh
* Changed .gitignore
* Improved reporting and new question files.
* Corrected typo
* More LLM questions
* Update LLM-questions.txt
* Yet more LLM-questions
* Remove jeopardy results file
* Reinstate original jeopardy.sh
* Update examples/parallel/parallel.cpp
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
49 lines
2.5 KiB
Plaintext
In the context of LLMs, what is "Attention"?
In the context of LLMs, what is a completion?
In the context of LLMs, what is a prompt?
In the context of LLMs, what is GELU?
In the context of LLMs, what is ReLU?
In the context of LLMs, what is softmax?
In the context of LLMs, what is decoding?
In the context of LLMs, what is encoding?
In the context of LLMs, what is tokenizing?
In the context of LLMs, what is an embedding?
In the context of LLMs, what is quantization?
In the context of LLMs, what is a tensor?
In the context of LLMs, what is a sparse tensor?
In the context of LLMs, what is a vector?
In the context of LLMs, how is attention implemented?
In the context of LLMs, why is attention all you need?
In the context of LLMs, what is "RoPE" and what is it used for?
In the context of LLMs, what is "LoRA" and what is it used for?
In the context of LLMs, what are weights?
In the context of LLMs, what are biases?
In the context of LLMs, what are checkpoints?
In the context of LLMs, what is "perplexity"?
In the context of LLMs, what are models?
In the context of machine learning, what is "catastrophic forgetting"?
In the context of machine learning, what is "elastic weight consolidation (EWC)"?
In the context of neural nets, what is a hidden layer?
In the context of neural nets, what is a convolution?
In the context of neural nets, what is dropout?
In the context of neural nets, what is cross-entropy?
In the context of neural nets, what is over-fitting?
In the context of neural nets, what is under-fitting?
What is the difference between an interpreted computer language and a compiled computer language?
In the context of software development, what is a debugger?
When processing using a GPU, what is off-loading?
When processing using a GPU, what is a batch?
When processing using a GPU, what is a block?
When processing using a GPU, what is the difference between a batch and a block?
When processing using a GPU, what is a scratch tensor?
When processing using a GPU, what is a layer?
When processing using a GPU, what is a cache?
When processing using a GPU, what is unified memory?
When processing using a GPU, what is VRAM?
When processing using a GPU, what is a kernel?
When processing using a GPU, what is "Metal"?
In the context of LLMs, what are "Zero-Shot", "One-Shot" and "Few-Shot" learning models?
In the context of LLMs, what is the "Transformer-model" architecture?
In the context of LLMs, what is "Multi-Head Attention"?
In the context of LLMs, what is "Self-Attention"?
In the context of transformer-model architectures, how do attention mechanisms use masks?