llama.cpp/src
Sukriti Sharma 2fffc52b50
llama : fix Roberta embeddings (#10856)
* fix: Use gpt2 tokenizer for roberta and add eos/bos tokens

Branch: RobertaTokenizer

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fixes to position embeddings

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* map roberta-bpe to gpt-2

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

* fix linting

Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>

---------

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>
2024-12-19 15:04:51 +02:00
CMakeLists.txt      remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)                 2024-12-12 19:02:49 +01:00
llama-grammar.cpp   llama : refactor sampling v2 (#9294)                             2024-09-07 15:16:19 +03:00
llama-grammar.h     llama : refactor sampling v2 (#9294)                             2024-09-07 15:16:19 +03:00
llama-impl.h        log : add CONT level for continuing previous log entry (#9610)   2024-09-24 10:15:35 +03:00
llama-sampling.cpp  sampling : refactor + optimize penalties sampler (#10803)        2024-12-16 12:31:14 +02:00
llama-sampling.h    llama : add DRY sampler (#9702)                                  2024-10-25 19:07:34 +03:00
llama-vocab.cpp     tts : add OuteTTS support (#10784)                               2024-12-18 19:27:21 +02:00
llama-vocab.h       llama : add DRY sampler (#9702)                                  2024-10-25 19:07:34 +03:00
llama.cpp           llama : fix Roberta embeddings (#10856)                          2024-12-19 15:04:51 +02:00
unicode-data.cpp    server : better security control for public deployments (#9776)  2024-10-08 13:27:04 +02:00
unicode-data.h      llama : reduce compile time and binary size (#9712)              2024-10-02 15:49:55 +02:00
unicode.cpp         unicode : improve naming style (#10838)                          2024-12-16 12:31:45 +02:00
unicode.h           unicode : improve naming style (#10838)                          2024-12-16 12:31:45 +02:00