Large-scale simulations with the discrete element method (DEM) are often computationally expensive. A variety of approaches address this persistent challenge, either by developing more efficient computational algorithms or by improving the computing hardware employed. In this article we propose to combine parallel computation with reconfigurable hardware. This is done by (a) employing multiple threads/processors to run the code on massively parallel machines, and (b) exploring the placement of time-consuming parts of the DEM code onto field-programmable gate arrays (FPGAs) for further gains in speed. Copyright ASCE 2006.