mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2024-11-14 06:49:54 +00:00
49 lines
2.5 KiB
Plaintext
In the context of LLMs, what is "Attention"?
In the context of LLMs, what is a completion?
In the context of LLMs, what is a prompt?
In the context of LLMs, what is GELU?
In the context of LLMs, what is ReLU?
In the context of LLMs, what is softmax?
In the context of LLMs, what is decoding?
In the context of LLMs, what is encoding?
In the context of LLMs, what is tokenizing?
In the context of LLMs, what is an embedding?
In the context of LLMs, what is quantization?
In the context of LLMs, what is a tensor?
In the context of LLMs, what is a sparse tensor?
In the context of LLMs, what is a vector?
In the context of LLMs, how is attention implemented?
In the context of LLMs, why is attention all you need?
In the context of LLMs, what is "RoPE" and what is it used for?
In the context of LLMs, what is "LoRA" and what is it used for?
In the context of LLMs, what are weights?
In the context of LLMs, what are biases?
In the context of LLMs, what are checkpoints?
In the context of LLMs, what is "perplexity"?
In the context of LLMs, what are models?
In the context of machine-learning, what is "catastrophic forgetting"?
In the context of machine-learning, what is "elastic weight consolidation (EWC)"?
In the context of neural nets, what is a hidden layer?
In the context of neural nets, what is a convolution?
In the context of neural nets, what is dropout?
In the context of neural nets, what is cross-entropy?
In the context of neural nets, what is over-fitting?
In the context of neural nets, what is under-fitting?
What is the difference between an interpreted computer language and a compiled computer language?
In the context of software development, what is a debugger?
When processing using a GPU, what is off-loading?
When processing using a GPU, what is a batch?
When processing using a GPU, what is a block?
When processing using a GPU, what is the difference between a batch and a block?
When processing using a GPU, what is a scratch tensor?
When processing using a GPU, what is a layer?
When processing using a GPU, what is a cache?
When processing using a GPU, what is unified memory?
When processing using a GPU, what is VRAM?
When processing using a GPU, what is a kernel?
When processing using a GPU, what is "Metal"?
In the context of LLMs, what are "Zero-Shot", "One-Shot" and "Few-Shot" learning models?
In the context of LLMs, what is the "Transformer-model" architecture?
In the context of LLMs, what is "Multi-Head Attention"?
In the context of LLMs, what is "Self-Attention"?
In the context of transformer-model architectures, how do attention mechanisms use masks?