Key strategies used in the development of a scalable, three-dimensional direct simulation Monte Carlo (DSMC) program are described. The code employs an octree-based adaptive mesh refinement (AMR) scheme that provides flexibility in capturing multiscale physics. It is coupled with a robust cut-cell algorithm to incorporate complex triangulated geometries. Designed for distributed-memory systems, with the Message Passing Interface (MPI) used for communication, the code is highly scalable. The paper explains the special considerations required for embedded geometries in a parallel environment. An efficient algorithm is described that checks for particle-surface interactions only when particles are close enough to the geometry. The code employs the Borgnakke-Larsen continuous relaxation model to simulate inelastic collisions of diatomic molecules. Finally, the code is validated by simulating hypersonic flows of argon and nitrogen over hemisphere and double-wedge configurations, and the results are compared with those obtained from the SMILE code.