Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference
Junyan Li
senfu
AI & ML interests
None yet
Recent Activity
upvoted a paper 1 day ago
FlowCompile: An Optimizing Compiler for Structured LLM Workflows
submitted a paper about 2 months ago
FlowCompile: An Optimizing Compiler for Structured LLM Workflows
updated a dataset 11 months ago
senfu/test