Tags

Tags give the ability to mark specific points in history as being important.

b2943

e932094d · server : return error on too large embedding input (#7389) · May 20, 2024
b2941

33c8d50a · Add provisions for windows support for BF16 code including CMake provision for... · May 20, 2024
b2940

d359f309 · llama : remove MPI backend (#7395) · May 20, 2024
b2939

1ea2a003 · quantize : fix --keep-split check (#7374) · May 19, 2024
b2938

f030ec1f · Vulkan Embedding Fix (#7360) · May 19, 2024
b2937

e4e6f67b · ggml : fix another case of quants nans (#7387) · May 19, 2024
b2936

5ca49cbe · ggml: implement quantized KV cache for FA (#7372) · May 19, 2024
b2934

41858392 · server: fix seed being reported back (#7382) · May 19, 2024
b2933

6aade19e · Add StableLM2 pre-tokenizer (#7349) · May 19, 2024
b2932

ab33f7a3 · cuda : clear error after buffer allocation failure (#7376) · May 19, 2024
b2930

854d365a · cmake : update android comments (#7341) · May 19, 2024
b2929

f5bf7617 · Capture CUDA logging output (#7298) · May 19, 2024
b2928

059031b8 · ci : re-enable sanitizer runs (#7358) · May 18, 2024
b2927

511182ea · android : use "ci-android" branch for CI (#7341) · May 18, 2024
b2926

133d99c5 · CUDA: deduplicate FlashAttention code (#7352) · May 18, 2024
b2923

0f98acfa · llama : add support for larger Granite Code Models (20B, 34B) (#7324) · May 18, 2024
b2922

ca57e0f3 · perplexity : ndot progress and show stats with < 100 tasks (#7348) · May 18, 2024
b2921

c1b295ee · Update and fix Vulkan soft_max and argsort implementations (#7237) · May 18, 2024
b2918

05834841 · ggml : fix quants nans when all the group weights are very close to zero (#7313) · May 18, 2024
b2917

ef277de2 · cmake : fix typo in AMDGPU_TARGETS (#7356) · May 18, 2024
