Tags
Tags mark specific points in the repository's history as important. Each entry below gives the tag name, abbreviated commit hash, commit subject, and date.
b2258 · d52d7819 · server: concurrency fix + monitoring - add /metrics prometheus compatible endpoint (#5708) · Feb 25, 2024
b2257 · 12894088 · cmake : fix compilation for Android armeabi-v7a (#5702) · Feb 25, 2024
b2256 · ab336a9d · code : normalize enum names (#5697) · Feb 25, 2024
b2254 · 9e359a4f · server: continue to update other slots on embedding concurrent request (#5699) · Feb 24, 2024
b2253 · 4c4cb307 · IQ3_S: a much better alternative to Q3_K (#5676) · Feb 24, 2024
b2252 · 525213d2 · server: init functional tests (#5566) · Feb 24, 2024
b2251 · fd43d66f · server : add KV cache quantization options (#5684) · Feb 23, 2024
b2249 · 15499eb9 · mpt : do not duplicate token_embd.weight on disk (#5670) · Feb 22, 2024
b2248 · 96633eec · gemma : use more bits for the token_embd.weight tensor (#5650) · Feb 22, 2024
b2247 · 847eedbd · py : add Gemma conversion from HF models (#5647) · Feb 22, 2024
b2246 · 7e4f339c · ggml : always define ggml_fp16_t as uint16_t (#5666) · Feb 22, 2024
b2245 · 334f76fa · sync : ggml · Feb 22, 2024
b2241 · 373ee3fb · Add Gemma chat template (#5665) · Feb 22, 2024
b2240 · 4cb4d8b2 · workflows: nix: hardcode cachix ids, build unconditionally (#5663) · Feb 22, 2024
b2239 · 3a03541c · minor : fix trailing whitespace (#5638) · Feb 22, 2024
b2237 · a46f5074 · server : fallback to chatml, add AlphaMonarch chat template (#5628) · Feb 22, 2024
b2235 · 4ef245a9 · mpt : add optional bias tensors (#5638) · Feb 22, 2024
b2234 · 973053d8 · llama : fix loading models with shared tok_embd and output (#5651) · Feb 22, 2024
b2233 · 7c8bcc11 · Add docs for llama_chat_apply_template (#5645) · Feb 22, 2024 (see the API sketch after this list)
b2232 · 7fe4678b · llama : fix session save/load with quantized KV (#5649) · Feb 21, 2024
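
Several tags above touch llama.cpp's chat-template support (#5645, #5628, #5665). As a rough illustration only, here is a minimal C sketch of formatting a short conversation with llama_chat_apply_template, assuming the Feb-2024 signature documented in #5645 and a build at least as new as b2237, whose #5628 added recognition of the literal template name "chatml"; the NULL model pointer and the buffer-size handling are assumptions of this sketch, not something these tags confirm.

#include <stdbool.h>
#include <stdio.h>
#include "llama.h"   // llama_chat_apply_template, llama_chat_message

int main(void) {
    // A two-message conversation to be rendered into a single prompt string.
    const struct llama_chat_message msgs[] = {
        { "system", "You are a helpful assistant." },
        { "user",   "Hello!"                       },
    };
    char buf[1024];

    // Apply the built-in ChatML template (the "chatml" name is an
    // assumption, per #5628) and append the assistant turn header so the
    // model can start its reply (add_ass = true).
    const int32_t n = llama_chat_apply_template(
        NULL,      // model: assumed unused when an explicit template is passed
        "chatml",  // template name or template string
        msgs, sizeof(msgs) / sizeof(msgs[0]),
        true,      // add_ass
        buf, (int32_t) sizeof(buf));

    // The return value is the number of bytes needed: negative means the
    // template was not recognized; larger than the buffer means retry with
    // a bigger buffer.
    if (n < 0 || n > (int32_t) sizeof(buf)) {
        fprintf(stderr, "llama_chat_apply_template failed (n = %d)\n", n);
        return 1;
    }
    printf("%.*s\n", (int) n, buf);
    return 0;
}

To try it, compile against the library built from one of the tagged checkouts (for example, linking libllama.a produced by the project's own build).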