Comparing Models for Parametric Furniture Modeling

Not a benchmark, but I think the result is interesting. I used each of these LLMs to build a furniture model from an identical prompt, then compared build time, token usage, cost, and model quality.

GPT 5.5 is fast and token efficient, but it didn't follow the prompt: the dovetails are not at the corners, and the joinery it invented is physically impossible to assemble. (I highlighted the broken joint in each of the images.)

Fable 5 is still fast, and not as expensive as I expected. It got all the joinery right, and it even added an additional panel with sliding dovetails that inserts from behind. It also thought about a wall-mounting solution and used the French cleat approach (though the hook direction is wrong 😅).

Opus 4.8 and Sonnet 5 both took a lot more iterations (and so more tokens), and their final models still didn't get the dovetails right. GLM5.2 simply failed.

Note that I have a validator that runs basic physics checks on the output model, mainly connectivity and interference: no board floating around, and no two boards with overlapping volumes. The agent has to pass those checks for the model to count as ready. Only GPT5.5 and Fable 5 passed the checks on the first attempt; all the other models had to iterate multiple times, which is why they used so many tokens.
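To give a rough idea of what connectivity and interference checks can look like, here is a minimal Python sketch. It is not the actual validator: it assumes every board is an axis-aligned box, and the names `Board`, `interferes`, `touches`, `validate` and the tolerance `EPS` are all hypothetical. Real joinery such as dovetails interlocks, so a real check would need proper solid geometry rather than bounding boxes.

```python
from dataclasses import dataclass
from itertools import combinations

EPS = 1e-6  # assumed tolerance, in model units


@dataclass(frozen=True)
class Board:
    """Hypothetical board: an axis-aligned box given by two corners."""
    name: str
    lo: tuple[float, float, float]  # minimum corner (x, y, z)
    hi: tuple[float, float, float]  # maximum corner (x, y, z)


def overlap_extents(a: Board, b: Board) -> list[float]:
    # Per-axis overlap length; negative means a gap on that axis.
    return [min(a.hi[i], b.hi[i]) - max(a.lo[i], b.lo[i]) for i in range(3)]


def interferes(a: Board, b: Board) -> bool:
    # Positive overlap on all three axes means the boxes share volume.
    return all(e > EPS for e in overlap_extents(a, b))


def touches(a: Board, b: Board) -> bool:
    # Face contact: no gap on any axis, and a real overlap on at least two.
    ext = overlap_extents(a, b)
    return all(e >= -EPS for e in ext) and sum(e > EPS for e in ext) >= 2


def validate(boards: list[Board]) -> list[str]:
    """Return a list of error strings; an empty list means the model passes."""
    errors = []
    adjacency = {b.name: set() for b in boards}
    for a, b in combinations(boards, 2):
        if interferes(a, b):
            errors.append(f"interference: {a.name} / {b.name}")
        if touches(a, b):  # interfering boards also count as connected
            adjacency[a.name].add(b.name)
            adjacency[b.name].add(a.name)

    # Connectivity: every board must be reachable from the first one.
    if boards:
        seen = {boards[0].name}
        stack = [boards[0].name]
        while stack:
            for neighbour in adjacency[stack.pop()]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        errors.extend(f"floating: {b.name}" for b in boards if b.name not in seen)
    return errors


if __name__ == "__main__":
    left = Board("left", (0, 0, 0), (18, 300, 600))
    shelf = Board("shelf", (18, 0, 100), (400, 300, 118))
    stray = Board("stray", (1000, 0, 0), (1018, 300, 600))
    print(validate([left, shelf, stray]))  # ['floating: stray']
```

In a setup like this, the agent would regenerate the model and rerun `validate` until it returns an empty list, which is where the extra iterations and tokens come from.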

All the tests are based on Shopprentice, the AI furniture modeling skill I have been building.
