The exponential increase in the volume of images and videos captured on construction sites and the growing availability of building information models (BIM) and schedules with production-level details have created a unique opportunity to automate how progress is monitored and reported on construction sites. However, state-of-the-art methods for automated progress comparison are still in their infancy, largely because they either leverage only the geometry of 3D reconstructed scenes to reason about the presence of elements, or detect and classify construction materials in 2D images without considering geometrical characteristics. To the best of our knowledge, this paper is the first to offer a computer vision method that jointly reasons about the geometry and appearance of observed BIM elements in site images and videos to monitor and report on their state of progress. The new method fuses structure-from-motion geometrical features with directional and radial appearance features in a new deep convolutional neural network (CNN) architecture to detect and classify the state of work-in-progress. Our experimental results show that using geometrical features reduces errors in appearance-based recognition methods and offers a new opportunity to scale automated progress detection methods to real-world settings.