Abstract
High-level synthesis (HLS) tools generate register-transfer level (RTL) hardware descriptions from behavioral-level specifications through resource allocation, scheduling and binding. Traditionally, HLS tools build datapath pipelines by inserting pipeline registers to break combinational logic into single-cycle segments; accurately analyzing that the number of available cycles for signal propagation is proven to be infeasible at the RT-level. Thus, RT-level timing analyses must pessimistically assume each path has at most one cycle for signal propagation. This leads to false positives in critical-path analyses, prevents RTL synthesis tools from optimizing real critical paths, and forces HLS flows to insert pipeline registers without improving hardware quality. In this paper, we present an efficient behavioral-level multicycle path analysis (BL-MCPA) algorithm that leverages control-data flow information to reduce time complexity of multicycle path analysis from exponential to polynomial. BL-MCPA helps eliminate false positives in timing analysis, and improves the reported fmax by 15% on average. With BL-MCPA, we avoid unnecessary pipeline register insertion, and reduce execution latency by 25% and register usage by 29% under a user fmax constraint of 300 MHz. Using BL-MCPA, we replace large multiplexers (MUXs) by pipelined MUX-trees and reduce execution latency of hardware by up to 67% on designs whose performance is limited by the large MUXs.
Original language | English (US) |
---|---|
Article number | 6917054 |
Pages (from-to) | 1832-1845 |
Number of pages | 14 |
Journal | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems |
Volume | 33 |
Issue number | 12 |
DOIs | |
State | Published - Dec 2014 |
Keywords
- High-level synthesis
- chaining
- performance optimization
- pipelining
- timing analysis
ASJC Scopus subject areas
- Software
- Computer Graphics and Computer-Aided Design
- Electrical and Electronic Engineering