Source: reddit.com

Has anyone tried this approach with Fast Byte Latent Transformers? [R]


Paper referred to: https://arxiv.org/pdf/2412.09871v1

Has anyone switched the transformer in the entropy model here to a Mamba model? What could the possible changes be?

Just an ML newcomer asking a genuine question, since Mamba is popular and saves compute (O(n) in sequence length).
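To make the question concrete, here is a minimal sketch of how I understand entropy-based patching, assuming a single global entropy threshold: a new patch starts wherever the entropy model's next-byte distribution is too uncertain. The names `NextByteModel`, `patch_starts`, and `toy_model` are hypothetical and not taken from the paper's code, and `toy_model` is a hand-written stand-in rather than a trained network. The point is that the patcher only needs next-byte probabilities, so in principle any causal byte-level model (Transformer or Mamba) could sit behind the same interface.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical interface: any causal byte-level model that maps a byte prefix
# to a probability distribution over the next byte (256 values).
NextByteModel = Callable[[bytes], Sequence[float]]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def patch_starts(data: bytes, model: NextByteModel, threshold: float) -> List[int]:
    """Return the indices where a new patch begins.

    A boundary is placed at position i when the entropy of the model's
    prediction for byte i (given bytes 0..i-1) exceeds `threshold`.
    Calling the model once per prefix is only for clarity; a real model
    would score every position in a single pass.
    """
    if not data:
        return []
    starts = [0]
    for i in range(1, len(data)):
        if entropy(model(data[:i])) > threshold:
            starts.append(i)
    return starts


def toy_model(prefix: bytes) -> Sequence[float]:
    """Stand-in model (an assumption, not a trained network):
    maximally uncertain right after a space, fully confident otherwise."""
    if prefix.endswith(b" "):
        return [1.0 / 256] * 256
    probs = [0.0] * 256
    probs[ord("a")] = 1.0
    return probs


print(patch_starts(b"hello world again", toy_model, threshold=1.0))
# prints [0, 6, 12]
```

If this picture is right, swapping in Mamba would mostly change what is behind `model`: a recurrent state carried from byte to byte instead of attention over the whole prefix. Whether the resulting entropies give equally good patch boundaries is the part I am unsure about.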

Thanks in advance!
