Tags
Tags give the ability to mark specific points in history as being important.
b2090 · ee1628bd · Basic Vulkan Multi-GPU implementation (#5321) · Feb 07, 2024
b2087 · 316c7faf · llama : add MiniCPM support (#5346) · Feb 07, 2024
b2086 · f3e2b4fa · server : update `/props` with "total_slots" value (#5373) · Feb 07, 2024
b2084 · 213d1439 · server : remove model.json endpoint (#5371) · Feb 06, 2024
b2083 · 17c97fb0 · CUDA: mul_mat_vec_q max. batch size 8 -> 4 (#5370) · Feb 06, 2024
b2081 · f57fadc0 · Slight quantization improvement for Q4_K and Q5_K (#5361) · Feb 06, 2024
b2079 · 2c516611 · CUDA: mul_mat_vec_q for batch sizes > 1 (#5351) · Feb 06, 2024
b2078 · 8a79c591 · server : include total "num_slots" in props endpoint (#5349) · Feb 06, 2024
b2077 · 31e79032 · server : add `dynatemp_range` and `dynatemp_exponent` (#5352) · Feb 06, 2024
b2076 · 4ffc7a17 · server : various fixes for the prompt field in /completion (#5300) · Feb 06, 2024
b2074 · 098f6d73 · make: Use ccache for faster compilation (#5318) · Feb 05, 2024
b2072 · c6b39553 · ggml : make use of ggml-quants.h possible in C++ code (#5338) · Feb 05, 2024
b2071 · abb61944 · ggml : avoid duplicating function calls using MIN/MAX macros (#5325) · Feb 05, 2024
b2070 · 89503dcb · iq3_xxs: quards for the no-imatrix situation (#5334) · Feb 05, 2024
b2068 · 6fdfa2ec · iq2_xxs: tune quantization (#5320) · Feb 05, 2024
b2067 · a2d60c91 · server : allow to get default generation settings for completion (#5307) · Feb 05, 2024
b2066 · e6f81775 · common : add dynamic temperature parameters to main example cli (#5295) · Feb 05, 2024
b2062 · 4833ac20 · [SYCL] Fix cpy with dims of 3 (#5289) · Feb 05, 2024
b2060 · 5ed26e1f · Adding some imatrix tools (#5302) · Feb 04, 2024
b2059 · 277fad30 · cmake : use set() for LLAMA_WIN_VER (#5298) · Feb 03, 2024