SWE-rebench leaderboard update: GLM-5.2, Qwen3.6-27B, Qwen3.6-35B-A3B, Gemma 4 31B and more + improved UI

Why it's recommended

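This entry covers an update to programming tools or coding capabilities, and is useful for developers evaluating workflow changes and reusable value.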

We made several updates to the SWE-rebench leaderboard: added new models, refreshed recent results, and reworked the leaderboard UI to make results easier to read, compare, and understand.

For r/LocalLLaMA, the most interesting part is probably the local / self-hosted model results. Qwen3.6-27B is quite strong for its size, and Qwen3.6-35B-A3B and Gemma 4 31B are now on the board for comparison as well.

Which local models should we test? Let us know which ones you use for coding agents or local development, and we’ll consider adding them in future updates.

> Harbor (if you want to run the agent yourself): https://hub.harborframework.com/datasets/swe-rebench/swe-rebench-leaderboard/latest

Original keywords: #leaderboard #improved #rebench #update #gemma #qwen3

View original: reddit.com

Single source; not yet cross-verified.