Default Branch

b828e18c75 · docker : fix vulkan build (#19352) · Updated 2026-02-05 12:10:39 +02:00

Branches

145401c9e3 · context : fix logits size overflow for huge batches · Updated 2025-08-05 05:26:46 +03:00 · 1859 behind, 2 ahead

342e7014db · imatrix : only warn about suffix when output format is unspecified · Updated 2025-08-04 22:12:27 +03:00 · 1864 behind, 2 ahead

e549515cb3 · memory : handle kv_unified for hybrid models · Updated 2025-08-03 07:45:47 +03:00 · 1873 behind, 1 ahead

91e67b8583 · imatrix : fix 3d tensor counts · Updated 2025-07-31 18:56:38 +03:00 · 1901 behind, 4 ahead

b98f80a6b4 · server : test alternative LRU logic · Updated 2025-07-29 21:19:21 +03:00 · 1922 behind, 1 ahead

0591b39e48 · ops: add MUSA · Updated 2025-07-29 12:25:32 +03:00 · 1928 behind, 1 ahead

381879e0ac · cont : tmp · Updated 2025-07-29 07:42:55 +03:00 · 1952 behind, 3 ahead

fb371c18ec · bench,common : add CPU extra buffer types · Updated 2025-07-28 21:53:18 +03:00 · 1929 behind, 1 ahead

e9f7e7cce2 · ops : update BLAS · Updated 2025-07-28 09:42:57 +03:00 · 1939 behind, 1 ahead

a5801f408f · sync : ggml · Updated 2025-07-25 14:31:39 +03:00 · 1958 behind, 2 ahead

6f4c57236b · server : fix vision test regex · Updated 2025-07-25 11:22:36 +03:00 · 1980 behind, 1 ahead

e65aa69402 · context : only sort outputs when needed · Updated 2025-07-24 18:06:34 +03:00 · 1967 behind, 1 ahead

a124399f19 · sched : fix multiple evaluations of the same graph with pipeline parallelism · Updated 2025-07-24 17:03:14 +03:00 · 1967 behind, 1 ahead

978c88ba0a · cont : add TODO · Updated 2025-07-24 16:31:10 +03:00 · 1969 behind, 2 ahead

1ef3cc1a87 · imatrix : use GGUF regardless of the output filename · Updated 2025-07-24 06:22:41 +03:00 · 1974 behind, 2 ahead

55cf48de1e · cuda : fix multi-seq, quantized FA · Updated 2025-07-22 20:48:53 +03:00 · 2016 behind, 2 ahead

0a0af0dbbd · Vulkan: Fix fprintf format-security warning · Updated 2025-07-19 12:45:31 +03:00 · 2010 behind, 1 ahead

386892ec61 · sync : ggml · Updated 2025-07-19 11:46:12 +03:00 · 2011 behind, 1 ahead

cfe5e98423 · graph : fix graph reuse reset of params · Updated 2025-07-18 17:50:32 +03:00 · 2014 behind, 1 ahead

9106d7595d · model : fix build after merge conflict · Updated 2025-07-18 11:50:59 +03:00 · 2017 behind, 1 ahead