Analysis · AI Models
11 hours ago
FlashMemory-DeepSeek-V4 introduces Lookahead Sparse Attention for ultra-long context
FlashMemory-DeepSeek-V4 proposes Lookahead Sparse Attention (LSA) and a Neural Memory Indexer to overcome GPU memory bottlenecks in ultra-long-context inference. It is built on the DeepSeek-V4 architecture.
