Mapping Local Nodes - Mildlyinteresting
I've been working on mapping (with tags) and steering local models based on the activation paths they take in response to specific questions during A/B testing. There is no insight or "how to" here, no benchmarks or improvement suggestions, no products. I just think the activation paths these models take with different prompts are really neat.
Here is a batch of 5 prompts and the activation paths (total tokens) each model took to answer them without steering. The answers to the questions were normal and as expected.
I find the nodes and path each model uses to complete each task incredibly interesting. The questions themselves are designed to evoke a different activation path from the model, which explains the variance between them.
Tested here are Gemma 4 31b qat, with minimal variance; Gemma 4 26b qat, with extreme variance; and Qwen 3.6 35b q_4, with moderate variance in activation path illumination.
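For anyone wondering what "variance in activation path" could mean in numbers, here is a rough toy sketch. It is not my actual tooling and not tied to any of the models above. Everything in it is an assumption for illustration: the tiny random ReLU network, the `embed` helper, the `L<layer>-N<neuron>` tag format, and the choice of "top-k strongest neurons per layer" as the definition of a path. With a real model you would capture activations through framework hooks (for example PyTorch's `register_forward_hook`) instead of a hand-rolled forward pass.

```python
import random

def make_layers(sizes, seed=0):
    """Build random weight matrices for a toy feed-forward net (hypothetical stand-in for a real model)."""
    rng = random.Random(seed)
    return [[[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]

def embed(prompt, dim):
    """Hypothetical prompt embedding: bucket characters by code point."""
    vec = [0.0] * dim
    for ch in prompt:
        vec[ord(ch) % dim] += 1.0
    return vec

def activation_path(layers, x, top_k=3):
    """Return the set of tags for the top_k most active neurons in each layer."""
    path = set()
    for layer_idx, weights in enumerate(layers):
        # ReLU layer: weighted sum per neuron, clipped at zero
        x = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]
        ranked = sorted(range(len(x)), key=lambda i: x[i], reverse=True)[:top_k]
        # Only neurons that actually fired count as part of the path
        path.update(f"L{layer_idx}-N{i}" for i in ranked if x[i] > 0)
    return path

def jaccard(a, b):
    """Overlap between two paths: 1.0 means identical, 0.0 means disjoint."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

layers = make_layers([8, 16, 16])
prompts = ["Explain gravity.", "Write a sarcastic reply.", "Sum 2 and 2."]
paths = [activation_path(layers, embed(p, 8)) for p in prompts]

for i in range(len(prompts)):
    for j in range(i + 1, len(prompts)):
        print(prompts[i], "|", prompts[j], "| overlap:", round(jaccard(paths[i], paths[j]), 2))
```

Low average overlap between prompts would correspond to what I'm calling high variance, and high overlap to minimal variance. That is one possible measure, not a standard one.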
Rant: These activation paths are akin to the neural networks (neuron clusters) that fire with each activation within our own minds. There is no "sarcastic node" that fires independently. It's a series of nodes firing together to create that sarcasm. It's fascinating to me. We have the ability to essentially perform brain surgery on these models. We get the opportunity to find clusters (L10-N8156X) where specific reactionary traits and inference behaviors exist, and can then exploit or manipulate them into a form we want. I figured if there was one sub that might get my investment in this, it would be here.
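To make the "brain surgery" idea a bit more concrete, here is a minimal toy sketch of steering by scaling tagged neurons during a forward pass. It is an illustration under assumptions, not my real setup: the two-layer hand-written network, the `forward` and `parse_tag` helpers, and the simplified `L<layer>-N<neuron>` tag format are all made up for the example (my real tags carry extra suffixes that this parser does not handle).

```python
import re

TAG = re.compile(r"L(\d+)-N(\d+)")

def parse_tag(tag):
    """Split a hypothetical 'L<layer>-N<neuron>' tag into (layer, neuron) indices."""
    m = TAG.fullmatch(tag)
    if m is None:
        raise ValueError(f"unrecognized tag: {tag!r}")
    return int(m.group(1)), int(m.group(2))

def forward(layers, x, steer=None):
    """Run a toy ReLU network, multiplying tagged neurons by a factor after each layer.

    steer maps tags to factors: 0.0 ablates a neuron, values above 1.0 amplify it.
    """
    targets = {parse_tag(tag): factor for tag, factor in (steer or {}).items()}
    for layer_idx, weights in enumerate(layers):
        x = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]
        for (layer, neuron), factor in targets.items():
            if layer == layer_idx:
                x[neuron] *= factor
    return x

# Layer 0 passes both inputs through; layer 1 sums them.
layers = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0]],
]
x = [2.0, 3.0]

print(forward(layers, x))                    # [5.0]  unsteered
print(forward(layers, x, {"L0-N1": 0.0}))    # [2.0]  neuron ablated
print(forward(layers, x, {"L0-N1": 2.0}))    # [8.0]  neuron amplified
```

Scaling a single neuron is the crudest possible intervention. Since traits come from several nodes firing together, a real attempt would presumably steer a whole set of tags at once, which the `steer` dict allows.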