
EPYC hybrid system benches and optimal CPU

I've finally built my semi-budget setup, though not everything went as expected. First, I purchased an EPYC 9555 QS, but I was scammed and the CPU arrived dead. After that, I could only afford a placeholder 9135 with 2 CCDs.

That's why I'm interested in inference numbers from people who bought a proper CPU. Everyone says that 16 CCDs with fewer cores (9175F) is the best choice, but based on my research the difference is not that big. On the other hand, I saw a comment where someone benched GLM-5.2 on a 9684X (CPU only) and got 12 t/s. My setup gets around 7 t/s CPU only. I've also read in some GitHub thread that the 9555 would be better than the 9355.

https://openbenchmarking.org/ only has benchmarks for small models.

My setup: 768 GB DDR5-4800, EPYC 9135, RTX 5090
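For context on why CCD count and memory speed keep coming up: token generation on CPU is usually limited by memory bandwidth, so it helps to know the theoretical ceiling. Here is a rough sketch of that arithmetic. It assumes all 12 memory channels of the SP5 socket are populated, which the post does not actually state, and the function name is made up for illustration. A 2-CCD part like the 9135 may not reach this figure in practice, since the links between the CCDs and the IO die can become the bottleneck first.

```python
def peak_dram_bandwidth_gbs(channels, mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak DRAM bandwidth in GB/s.

    Each DDR5 channel moves 64 bits (8 bytes) of data per transfer.
    """
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9


# ASSUMPTION: 12 populated channels (not stated in the post).
ASSUMED_CHANNELS = 12

peak = peak_dram_bandwidth_gbs(ASSUMED_CHANNELS, 4800)
print(f"Theoretical peak with DDR5-4800: {peak:.1f} GB/s")
```

Real sustained bandwidth will be lower than this number, so treat it as an upper bound when comparing CPUs, not as a prediction.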

Test command (ik_llama and Ubergarm/Kimi-K2.6 Q4_X):

    ./llama-sweep-bench \
      --model Ubergarm/Kimi-K2.6-Q4_X-00001-of-00014.gguf \
      --no-mmap --merge-qkv \
      -mla 3 -amb 512 \
      -b 4096 -ub 4096 \
      -ctk f16 -ctv f16 -c 32000 \
      -ngl 999 -ncmoe 999 \
      --threads 16 \
      --threads-batch 28 \
      --warmup-batch \
      -n 128

Numbers (b 4096):

| PP | TG | N_KV | T_PP s | S_PP t/s | T_TG s | S_TG t/s |
|---|---|---|---|---|---|---|
| 4096 | 128 | 0 | 15.701 | 260.87 | 7.168 | 17.86 |
| 4096 | 128 | 4096 | 16.128 | 253.96 | 7.260 | 17.63 |
| 4096 | 128 | 8192 | 16.296 | 251.35 | 7.457 | 17.16 |
| 4096 | 128 | 16384 | 17.006 | 240.86 | 7.519 | 17.02 |
| 4096 | 128 | 32768 | 18.397 | 222.65 | 7.845 | 16.32 |
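If anyone wants to compare their own run against these numbers, the speed columns are just tokens divided by seconds (S_PP = PP / T_PP, S_TG = TG / T_TG). Below is a small sketch that recomputes them from the timings in the table and shows how much speed drops between the first and last row. The helper names are made up for illustration, and because the timings are rounded to three decimals, the recomputed values can differ from the table in the last digit.

```python
# Rows copied from the table above: (PP, TG, N_KV, T_PP seconds, T_TG seconds)
ROWS = [
    (4096, 128, 0, 15.701, 7.168),
    (4096, 128, 4096, 16.128, 7.260),
    (4096, 128, 8192, 16.296, 7.457),
    (4096, 128, 16384, 17.006, 7.519),
    (4096, 128, 32768, 18.397, 7.845),
]


def summarize(rows):
    """Return (N_KV, prompt t/s, generation t/s) for each row."""
    return [(n_kv, pp / t_pp, tg / t_tg) for pp, tg, n_kv, t_pp, t_tg in rows]


def slowdown_percent(first, last):
    """Relative speed drop from first to last, in percent."""
    return 100.0 * (first - last) / first


summary = summarize(ROWS)
for n_kv, s_pp, s_tg in summary:
    print(f"N_KV={n_kv:>6}  PP={s_pp:7.2f} t/s  TG={s_tg:5.2f} t/s")

pp_drop = slowdown_percent(summary[0][1], summary[-1][1])
tg_drop = slowdown_percent(summary[0][2], summary[-1][2])
print(f"PP drop: {pp_drop:.1f}%  TG drop: {tg_drop:.1f}%")
```

Swapping in rows from a different CPU makes it easy to see whether the gap is in prompt processing, token generation, or both.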
