AnalysisAI Models
Jun 14, 12:00 AM
Google's Diffusion Gemma generates 256 tokens at once
The 2-billion parameter model uses image diffusion techniques to produce 256 tokens simultaneously, significantly speeding up local inference. It iteratively denoises masked tokens rather than predicting one at a time, trading some coherence for parallel generation.
