With the growing popularity of mobile devices, the focus of system-on-chip design has shifted from high performance to low-power operation. Traditional design methodology, however, is limited by the margins reserved for process, voltage, and temperature (PVT) variations. We therefore propose a systematic solution that detects and corrects timing errors in real time, eliminating redundant margins in the design of an ARM microprocessor. A prototype stochastic ARM1136 processor was implemented in TSMC 90 nm technology. Two circuit-level techniques, Razor and Surger, are combined into a hybrid error detection mechanism that observes both global and local timing information. To enable aggressive voltage scaling with a hardware-based error tolerance mechanism, we propose an activity-driven optimization flow that reshapes the slack distribution based on path-activation probability. The chip operates at 250 MHz in the worst case while consuming 48.82 mW. The overall power overhead of the proposed error tolerance mechanism is about 25% (15.25% from hold-fixing latches plus 10.53% from Razor). Eliminating design margins saves 51% of the energy on average across the three corner cases, and a saving of 42.8% was measured at the lowest operating voltage.
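The idea behind weighting slack by path-activation probability can be illustrated with a minimal sketch. It is not the flow used for the chip: the path names, delays, activation probabilities, the `delay_scale` factor standing in for the slowdown under voltage scaling, and the assumption that paths are activated independently are all hypothetical. Only the 4 ns clock period, which corresponds to the reported 250 MHz, comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    delay_ns: float    # nominal path delay (hypothetical value)
    activation: float  # probability the path is exercised in a given cycle


def slack(path: Path, period_ns: float, delay_scale: float = 1.0) -> float:
    """Timing slack after stretching the delay by delay_scale."""
    return period_ns - path.delay_ns * delay_scale


def expected_error_rate(paths, period_ns, delay_scale=1.0):
    """Probability that at least one violating path is activated in a cycle.

    Assumes paths are activated independently (a simplification).
    """
    p_no_error = 1.0
    for p in paths:
        if slack(p, period_ns, delay_scale) < 0:
            p_no_error *= 1.0 - p.activation
    return 1.0 - p_no_error


def optimization_order(paths, period_ns, delay_scale=1.0):
    """Violating paths, most frequently activated first."""
    violating = [p for p in paths if slack(p, period_ns, delay_scale) < 0]
    return sorted(violating, key=lambda p: p.activation, reverse=True)


PERIOD_NS = 4.0  # 250 MHz clock

# Hypothetical paths, for illustration only.
PATHS = [
    Path("alu_bypass", 3.0, 0.50),
    Path("mul_carry", 3.6, 0.01),
    Path("cache_tag", 3.8, 0.20),
]

if __name__ == "__main__":
    for scale in (1.0, 1.1, 1.2):
        rate = expected_error_rate(PATHS, PERIOD_NS, scale)
        order = [p.name for p in optimization_order(PATHS, PERIOD_NS, scale)]
        print(f"delay x{scale}: error rate {rate:.3f}, fix first: {order}")
```

In this toy model, a rarely activated path that violates timing adds little to the error rate, so it can be left to the error detection and correction hardware, while frequently activated violators are the ones worth giving extra slack.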