We present a novel and intuitive framework for building modular vision systems for complex tasks such as surveillance. Inspired by graphical models, especially factor graphs, the framework captures the dependencies between variables in the form of a graph, which enforces principled coordination and exchange of information between modules. Unlike traditional probabilistic graphical models, the framework leaves the design of individual modules flexible: different learning and inference mechanisms can work in a common setting. It also makes it easy to integrate additional modules into an already functional system. We demonstrate the ease of building a complex vision system within this framework by designing a fully automatic multi-target tracking system for a video surveillance scenario, and we obtain favorable results for the tracking application.