arxiv:2504.04823
SunYuxuan
snowdusky
AI & ML interests
None yet
Recent Activity
upvoted a paper 15 days ago
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation
upvoted a paper 22 days ago
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression
upvoted a paper 3 months ago
Scaling Embeddings Outperforms Scaling Experts in Language Models
Organizations
None yet