Particle Scattering Sampler for llama.cpp
https://github.com/IceFog72/llama.cpp
I added an experimental sampler to llama.cpp called scatter.
The short version: it slightly smooths the model’s next-token probability distribution inside the already-selected top candidates. It is meant to make generation less rigid without doing the usual “raise temperature and wake up the garbage tail” thing.
It uses a light-scattering metaphor, but the implementation is not real physics. The actual operation is much simpler: a cheap local diffusion / moving-average step over token rank.
Think of the model’s next-token distribution as a beam. The strongest candidate is rank 1, the next strongest is rank 2, and so on. scatter lets nearby ranks exchange a little probability mass. Rank 1 can lose some mass to rank 2, 3, 4, etc. Rank 5 can exchange with nearby ranks. But it does not pull random deep-tail tokens into play.
That is the main point:
flatten the head of the distribution without leaking probability into the deep tail.
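To make the idea concrete, here is a minimal Python sketch of one way a "local diffusion over token rank" step could look. It is not the code from the fork: the function name scatter_head, its parameters (head_size, strength, steps), and the nearest-neighbour exchange rule are all assumptions chosen for illustration. Each pair of adjacent ranks inside the head exchanges mass in proportion to the gap between them, and everything past the head is copied through unchanged.

```python
def scatter_head(probs, head_size, strength, steps=1):
    """Illustrative sketch only (hypothetical name and parameters).

    probs     : probabilities sorted in descending order (rank 1 first)
    head_size : number of top-ranked candidates allowed to exchange mass
    strength  : value in [0, 1]; 0 leaves the distribution unchanged
    steps     : how many diffusion passes to apply
    """
    head = list(probs[:head_size])
    for _ in range(steps):
        smoothed = head[:]
        for i in range(len(head) - 1):
            # Mass flows from the stronger rank to its weaker neighbour,
            # proportional to the gap between the two.
            flow = 0.5 * strength * (head[i] - head[i + 1])
            smoothed[i] -= flow
            smoothed[i + 1] += flow
        head = smoothed
    # Ranks beyond the head are never touched, so no mass reaches the tail.
    return head + list(probs[head_size:])


probs = [0.6, 0.2, 0.1, 0.05, 0.03, 0.02]
out = scatter_head(probs, head_size=4, strength=0.5)
print([round(p, 4) for p in out])
```

With these inputs the four head probabilities move from 0.6, 0.2, 0.1, 0.05 to approximately 0.5, 0.275, 0.1125, 0.0625, while the last two entries stay exactly as they were. The head still sums to 0.95, so the smoothing only redistributes mass that was already in the head. A single pass moves mass between direct neighbours only; in this sketch, raising steps is what lets rank 1's mass reach ranks 3 and 4.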
Status
Implemented and tested as an experimental sampler for llama.cpp.
What is currently implemented:
- Native sampler API:
  - llama_sampler_init_scatter(...)
  - llama_sampler_init_scatter_ext(...)
- Sampler-chain name: scatter
- Sampler-chain character: r
- Included in the default sampler chain between xtc and temperature
- Disabled by default: with default parameters it returns a noop sampler
- Fixed scattering strength
- Optional adaptive strength using entropy feedback
- Optional repeated-token absorption
- Optional collision / mean-free-path gating
- Invariant tests in tests/test-sampling.cpp
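
The list above mentions adaptive strength using entropy feedback but does not spell out the rule. Purely as an illustration, the Python sketch below shows one hypothetical form such feedback could take: measure how sharp the head is via its normalized entropy, and raise the scattering strength when the head is sharper than a target. The names normalized_entropy and adaptive_strength, the target_entropy parameter, and the scaling formula are all assumptions, not the fork's actual behaviour.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of the renormalized probs, scaled to [0, 1]."""
    total = sum(probs)
    n = len(probs)
    if n < 2 or total <= 0:
        return 0.0
    h = -sum((p / total) * math.log(p / total) for p in probs if p > 0)
    return h / math.log(n)


def adaptive_strength(head_probs, base_strength, target_entropy):
    """Hypothetical feedback rule: scatter more when the head is too sharp."""
    # How far the head's entropy falls below the target (0 if at or above it).
    deficit = max(0.0, target_entropy - normalized_entropy(head_probs))
    # Scale the base strength up by the deficit, capped at 1.0.
    return min(1.0, base_strength * (1.0 + deficit))


flat_head = [0.25, 0.25, 0.25, 0.25]
sharp_head = [0.97, 0.01, 0.01, 0.01]
print(adaptive_strength(flat_head, 0.3, 0.8))   # head already flat: base strength
print(adaptive_strength(sharp_head, 0.3, 0.8))  # sharp head: strength is raised
```

In this sketch an already-flat head keeps the base strength, and a very peaked head gets a somewhat larger one. The real sampler may use a different signal, direction, or time scale for its feedback.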