Tags
Tags mark specific points in a repository's history as important, such as releases or builds.
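Each entry below pairs a build tag (for example b1430) with the commit it points at. As a minimal sketch of how such a tag is created and inspected with plain git: the tag name and commit message are taken from the first entry below, but the throwaway repository, user identity, and resulting commit hash are illustrative only, not the real repository state.

```shell
# Illustrative only: build a throwaway repo, tag a commit, then inspect the tag.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.email=you@example.com -c user.name=you commit -q --allow-empty \
    -m "cuda : improve text-generation and batched decoding performance (#3776)"
git tag b1430                  # lightweight tag marking the current commit as build b1430
git tag -l                     # lists the tag name: b1430
git log -1 --format=%s b1430   # prints the tagged commit's subject line
```

A lightweight tag like this is just a named pointer to one commit; release pages such as the listing below are generated from these pointers.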
b1430 · 2f9ec7e2 · cuda : improve text-generation and batched decoding performance (#3776) · Oct 27, 2023
b1429 · 34b2a5e1 · server : do not release slot on image input (#3798) · Oct 26, 2023
b1428 · 6961c4bd · batched-bench : print params at start · Oct 25, 2023
b1427 · cc448774 · log : disable pid in log filenames · Oct 25, 2023
b1426 · ad939626 · server : add parameter -tb N, --threads-batch N (#3584) (#3768) · Oct 24, 2023
b1425 · 1717521c · server : do not block system prompt update (#3767) · Oct 24, 2023
b1424 · b2f7e04b · sync : ggml (conv ops + cuda MSVC fixes) (#3765) · Oct 24, 2023
b1423 · abd21fc9 · cmake : add missed dependencies (#3763) · Oct 24, 2023
b1422 · 2b4ea35e · cuda : add batched cuBLAS GEMM for faster attention (#3749) · Oct 24, 2023
b1421 · daab3d7f · Add more tokenizer tests (#3742) · Oct 24, 2023
b1420 · 469c9add · metal : handle ggml_scale for n%4 != 0 (close #3754) · Oct 24, 2023
b1419 · e3932593 · Revert "make : add optional CUDA_NATIVE_ARCH (#2482)" · Oct 23, 2023
b1416 · 5be6c803 · llama : remove token functions with `context` args in favor of `model` (#3720) · Oct 23, 2023
b1414 · 96981f37 · make : add optional CUDA_NATIVE_ARCH (#2482) · Oct 22, 2023
b1413 · 438c2ca8 · server : parallel decoding and multimodal (#3677) · Oct 22, 2023
b1412 · 9e70cc03 · Add test for MPT tokenization (#3728) · Oct 22, 2023
b1410 · a5e7dbd6 · llama : validate special token ids are in range when loading GGUF model (#3635) · Oct 22, 2023
b1409 · d3956aea · main : escape prompt for cfg_negative_prompt and consecutive inputs in main... · Oct 22, 2023
b1408 · 22c69a27 · batched : add len CLI argument · Oct 22, 2023
b1407 · 465219b9 · CLBlast: Add outer loops over src0 for broadcasting in mulmat · Oct 20, 2023