Exploiting the sparsity of signals in an adaptive dictionary or transform domain benefits various applications in image and video processing. In contrast to synthesis dictionary learning, transform learning allows for cheap computations, and has been demonstrated to perform well in applications such as image denoising. We recently proposed methods for online sparsifying transform learning, which are particularly useful for processing large-scale or streaming data. Online transform learning has good convergence guarantees and a much lower computational cost than online synthesis dictionary learning. In this work, we present a video denoising framework based on online 3D spatio-temporal sparsifying transform learning. The proposed scheme has low computational and memory costs, and can potentially handle streaming video. Our numerical experiments show promising performance for the proposed video denoising method compared to popular prior and state-of-the-art methods.