Tags
Tags give the ability to mark specific points in history as being important
master-5488fb7 · 5488fb78 · ggml : allocate graphs in a context (#2392) · Jul 26, 2023
master-eb542d3 · eb542d39 · Add LLAMA_DEFAULT_RMS_EPS so we can change the default (#2384) · Jul 25, 2023
master-07aaa0f · 07aaa0f6 · ggml : fix ggml_flash_attn to use op_params (#2387) · Jul 25, 2023
master-875086b · 875086bd · ggml : relax contiguous constraints in activation function (#2371) · Jul 25, 2023
master-da18898 · da188983 · ggml : improve graph build time via hash table lookup (#2329) · Jul 25, 2023
master-0c06204 · 0c06204f · main : add `--in-prefix-bos` to prefix BOS to user inputs; keep EOS (#2304) · Jul 25, 2023
master-1fed755 · 1fed755b · ci : add non-AVX scalar build/test (#2356) · Jul 25, 2023
master-be2301b · be2301bc · k_quants : add AVX support to dot functions with QK_K as 64 (#2339) · Jul 25, 2023
master-1aa18ef · 1aa18ef9 · metal : concurrently dispatch commands (#2358) · Jul 25, 2023
master-129d844 · 129d844c · Fix Q4_K and Q5_K for QK_K = 64 on CUDA (#2359) · Jul 25, 2023
master-d5512b7 · d5512b78 · server: add rms_norm_eps parameter (#2380) · Jul 25, 2023
master-c798308 · c798308e · [Server] Escape HTML in webchat (#2368) · Jul 25, 2023
master-41c6741 · 41c67416 · make rms_norm_eps a parameter (#2374) · Jul 24, 2023
master-b3f138d · b3f138d0 · Chat UI extras (#2366) · Jul 24, 2023
master-5b2b2dc · 5b2b2dc6 · ggml : sync (unary ops refactor, static-correctness) (#2370) · Jul 24, 2023
master-42f70cb · 42f70cb2 · Fix scalar version of Q5_K when QK_K = 64 (#2362) · Jul 24, 2023
master-84e09a7 · 84e09a7d · llama : add grammar-based sampling (#1773) · Jul 23, 2023
master-2f9cf97 · 2f9cf974 · Some more Q4_K and Q5_K speedup on CUDA (#2346) · Jul 24, 2023
master-4f06592 · 4f06592c · Add gqa parameter support to the server (#2351) · Jul 23, 2023
master-57921ca · 57921ca6 · common : n_threads == -1 uses std::thread::hardware_concurrency() (#2347) · Jul 23, 2023