Tags
Tags mark specific points in the repository's history as important.
Tag    Commit    Date          Message
b2346  61d1c88e  Mar 05, 2024  Vulkan Improvements (#5835)
b2345  21b08674  Mar 05, 2024  [SYCL] fix mul_mat fault in CI/unit-test (#5862)
b2343  29eee404  Mar 04, 2024  fix speculative decoding build on windows (#5874)
b2334  4ffcdce2  Mar 04, 2024  add alias for chat template (#5858)
b2333  a0fc6266  Mar 04, 2024  sync : ggml
b2331  82f3e668  Mar 04, 2024  common : use LLAMA_DEFAULT_SEED (#5855)
b2330  5a51cc1b  Mar 04, 2024  main : support special tokens as reverse/anti prompt (#5847)
b2329  67be2ce1  Mar 03, 2024  cuda : fix data race in soft max (#5853)
b2327  475df1d6  Mar 03, 2024  llama : allow for user specified embedding pooling type (#5849)
b2325  de9692a7  Mar 03, 2024  llama : fix llama_copy_state_data with fragmented KV cache (#5840)
b2324  e6029348  Mar 03, 2024  ci : schedule slow server tests only on Release or on demand (#5839)
b2323  8ef969af  Mar 03, 2024  server : init http requests thread pool with --parallel if set (#5836)
b2321  97311342  Mar 02, 2024  server: tests: passkey challenge / self-extend with context shift demo (#5832)
b2320  4a6e2d61  Mar 02, 2024  llama : add abort_callback to interrupt computation (#5409)
b2319  494c8703  Mar 02, 2024  ggml : fix IQ3_S AVX implementation (#5834)
b2318  4d4d2366  Mar 02, 2024  convert : automatically fall back to HfVocab if tokenizer.model doesn't exist (#5821)
b2316  bbde6eb2  Mar 02, 2024  ggml : IQ3_S improvements (#5829)
b2314  6c32d8c7  Mar 02, 2024  llama : refactor internal quantization functions (#5830)
b2313  802da009  Mar 02, 2024  llama : fix segfault from unknown model arch name (#5820)
b2312  71564139  Mar 02, 2024  Support multiple GPUs (split mode) on SYCL backend (#5806)