The method of moments (MoM) is widely used for solving electromagnetic scattering and radiation problems. Its major disadvantage is that its computational and storage complexities are both O(N^2), where N is the number of unknowns, which results in a large memory requirement and a long computation time (J.-M. Jin, Theory and Computation of Electromagnetic Fields. Hoboken, New Jersey: Wiley, 2010). To alleviate these problems, a GPU-accelerated multilevel fast multipole algorithm (MLFMA) has been developed that is capable of solving one-million-unknown problems on four GPUs (J. Guan, S. Yan, and J.-M. Jin, IEEE Trans. Antennas Propag., vol. 60, pp. 3607-3616, June 2013). However, this parallelized algorithm requires substantially more GPU resources as the problem size increases further. When those resources are limited, computational efficiency drops because the MLFMA then requires more data communication between the CPU and the GPU. To overcome this problem, a 'compute on-the-fly' strategy is investigated in this work, with the objective of solving larger problems with limited GPU resources.
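
To illustrate the storage argument and the general idea behind computing on the fly, the following is a minimal NumPy sketch, not the implementation investigated in this work. It makes several assumptions that do not come from the text above: the kernel is a scalar free-space Green's function with a placeholder zero self-term, the matrix entries are single-precision complex (8 bytes each), and the function names (`dense_storage_bytes`, `kernel_rows`, `matvec_stored`, `matvec_on_the_fly`) are hypothetical. It does not reproduce the MLFMA or any GPU code; it only contrasts storing a dense matrix with regenerating it block by block during a matrix-vector product.

```python
import numpy as np


def dense_storage_bytes(n, bytes_per_entry=8):
    """Memory for a dense n-by-n matrix (assumed 8 bytes per complex entry)."""
    return n * n * bytes_per_entry


def kernel_rows(points, rows, k):
    """Hypothetical kernel: rows of a scalar Green's function matrix.

    Returns the block A[rows, :] with entries exp(-j*k*r) / (4*pi*r).
    The self-term (r == 0) is set to zero as a placeholder; a real MoM
    code would evaluate a singular integral there.
    """
    diff = points[rows][:, None, :] - points[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    r_safe = np.where(r == 0.0, 1.0, r)  # avoid division by zero
    block = np.exp(-1j * k * r_safe) / (4.0 * np.pi * r_safe)
    block[r == 0.0] = 0.0
    return block


def matvec_stored(points, x, k):
    """Build the full matrix once (O(N^2) memory), then multiply."""
    a = kernel_rows(points, np.arange(len(points)), k)
    return a @ x


def matvec_on_the_fly(points, x, k, block=64):
    """Regenerate the matrix one row block at a time and discard it.

    Peak memory is O(block * N) instead of O(N^2), at the cost of
    recomputing the entries on every product.
    """
    n = len(points)
    y = np.empty(n, dtype=complex)
    for start in range(0, n, block):
        rows = np.arange(start, min(start + block, n))
        y[rows] = kernel_rows(points, rows, k) @ x
    return y
```

Under these assumptions, `dense_storage_bytes(10**6)` evaluates to 8 x 10^12 bytes, roughly 8 TB, for a one-million-unknown dense matrix. The two products return the same vector, so the sketch trades repeated arithmetic for a smaller memory footprint and less data to move, which is the kind of trade-off a compute-on-the-fly approach relies on.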