Tags
Tags give the ability to mark specific points in history as being important
b2943 · e932094d · server : return error on too large embedding input (#7389) · May 20, 2024
b2941 · 33c8d50a · Add provisions for windows support for BF16 code including CMake provision for... · May 20, 2024
b2940 · d359f309 · llama : remove MPI backend (#7395) · May 20, 2024
b2939 · 1ea2a003 · quantize : fix --keep-split check (#7374) · May 19, 2024
b2938 · f030ec1f · Vulkan Embedding Fix (#7360) · May 19, 2024
b2937 · e4e6f67b · ggml : fix another case of quants nans (#7387) · May 19, 2024
b2936 · 5ca49cbe · ggml: implement quantized KV cache for FA (#7372) · May 19, 2024
b2934 · 41858392 · server: fix seed being reported back (#7382) · May 19, 2024
b2933 · 6aade19e · Add StableLM2 pre-tokenizer (#7349) · May 19, 2024
b2932 · ab33f7a3 · cuda : clear error after buffer allocation failure (#7376) · May 19, 2024
b2930 · 854d365a · cmake : update android comments (#7341) · May 19, 2024
b2929 · f5bf7617 · Capture CUDA logging output (#7298) · May 19, 2024
b2928 · 059031b8 · ci : re-enable sanitizer runs (#7358) · May 18, 2024
b2927 · 511182ea · android : use "ci-android" branch for CI (#7341) · May 18, 2024
b2926 · 133d99c5 · CUDA: deduplicate FlashAttention code (#7352) · May 18, 2024
b2923 · 0f98acfa · llama : add support for larger Granite Code Models (20B, 34B) (#7324) · May 18, 2024
b2922 · ca57e0f3 · perplexity : ndot progress and show stats with < 100 tasks (#7348) · May 18, 2024
b2921 · c1b295ee · Update and fix Vulkan soft_max and argsort implementations (#7237) · May 18, 2024
b2918 · 05834841 · ggml : fix quants nans when all the group weights are very close to zero (#7313) · May 18, 2024
b2917 · ef277de2 · cmake : fix typo in AMDGPU_TARGETS (#7356) · May 18, 2024