Tags

Tags mark specific points in the repository's history as important.

b2282
cb49e0f8 · Attempt to fix android build (#5752) · Feb 27, 2024

b2281
0becb22a · IQ4_XS: a 4.25 bpw quantization (#5747) · Feb 27, 2024

b2280
c24a2a6e · cuda : replace remaining shfl_xor with calls to warp_reduce functions (#5744) · Feb 27, 2024

b2279
1f30b7a9 · ggml-quants : fix avx2 iq1_s vec_dot when compiled with gcc (#5742) · Feb 27, 2024

b2278
9d533a77 · llama : fix defrag bugs + add parameter (#5735) · Feb 27, 2024

b2277
cbbd1efa · Makefile: use variables for cublas (#5689) · Feb 27, 2024

b2276
b11a93df · fix server hangs on empty prompt (#5733) · Feb 26, 2024

b2275
a33e6a0d · Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (#5721) · Feb 26, 2024

b2274
47bb7b48 · CUDA: fix DEBUG_CUDA_MALLOC (#5729) · Feb 26, 2024

b2272
e849078c · [SYCL] Add support for soft_max ALiBi (#5639) · Feb 26, 2024

b2271
67fd3313 · unicode : reuse iterator (#5726) · Feb 26, 2024

b2270
4804215c · server: CI fix trailing space (#5728) · Feb 26, 2024

b2269
8a533f0d · server: CI tests reduce build matrix (#5725) · Feb 26, 2024

b2268
269de86b · llama : fix Gemma rope type (#5691) · Feb 26, 2024

b2266
e3965cf3 · server: tests - slow inference causes timeout on the CI (#5715) · Feb 25, 2024

b2264
bf08e006 · llama : refactor k-shift implementation + KV defragmentation (#5691) · Feb 25, 2024

b2263
f7625019 · server : fix crash when system prompt is bigger than batch size (#5714) · Feb 25, 2024

b2262
abbabc5e · ggml-quants : provide ggml_vqtbl1q_u8 for 64bit compatibility (#5711) · Feb 25, 2024

b2261
f1a98c52 · make : fix nvcc version is empty (#5713) · Feb 25, 2024

b2259
930b1780 · server: logs - unified format and --log-format option (#5700) · Feb 25, 2024
