In this paper, we introduce a reconfigurable architecture, named 3D CMOS-NEM FPGA, which utilizes Nanoelectromechanical (NEM) relays and 3D integration techniques synergistically. Unique features of our architecture include: hybrid CMOS-NEM FPGA look-up tables (LUTs) and configurable logic blocks (CLBs), NEM-based switch blocks (SBs) and connection blocks (CBs), and face-to-face 3D stacking. This architecture also has a built-in feature named direct link which is dedicated local communication channel using the short vertical wire between the two stacks to further enhance performance. A customized 3D FPGA placement and routing flow has been developed. By replacing CMOS components with NEM relays, a 19.5% delay reduction can be achieved compared to the baseline 2D CMOS architecture. 3D stacking together with NEM devices achieves a 31.5% delay reduction over the baseline. The best performance of this architecture is achieved by adding direct links, which provides a 41.9% performance gain over the baseline.