
MEDUSA: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

  • Tianle Cai
  • Yuhong Li
  • Zhengyang Geng
  • Hongwu Peng
  • Jason D. Lee
  • Deming Chen
  • Tri Dao

Research output: Contribution to journal › Conference article › peer-review

Fingerprint


Keyphrases

Computer Science

Engineering