
Sakana AI proposes DiffusionBlocks for block-wise training

DiffusionBlocks trains transformer networks one block at a time, which reduces training memory by a factor of B, where B is the number of blocks. Performance is maintained across diverse architectures.
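To make the factor-of-B claim concrete, here is a back-of-the-envelope sketch in Python. It is not Sakana AI's implementation: it assumes equally sized blocks, counts only per-parameter training state (one gradient plus two Adam-style moment buffers, an assumed setup), and ignores activations and the weights themselves. The function names and example sizes are hypothetical.

```python
def end_to_end_state_bytes(num_blocks: int, params_per_block: int,
                           bytes_per_value: int = 4, copies: int = 3) -> int:
    """Training state when all blocks are optimized together.

    `copies` is an assumption: 1 gradient + 2 optimizer moment buffers.
    """
    return num_blocks * params_per_block * bytes_per_value * copies


def block_wise_state_bytes(params_per_block: int,
                           bytes_per_value: int = 4, copies: int = 3) -> int:
    """Training state when only one block is optimized at a time."""
    return params_per_block * bytes_per_value * copies


num_blocks = 12               # B, a hypothetical block count
params_per_block = 7_000_000  # hypothetical block size

full = end_to_end_state_bytes(num_blocks, params_per_block)
single = block_wise_state_bytes(params_per_block)
print(f"end-to-end: {full / 1e6:.0f} MB, block-wise: {single / 1e6:.0f} MB")
print(f"reduction factor: {full // single}")
```

Under these assumptions the reduction factor equals the number of blocks, because only one block's gradients and optimizer buffers need to be resident at any time.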

14 days ago