This paper proposes a novel framework for semantic indexing and retrieval in digital video. The components of the framework are probabilistic multimedia objects (multijects) and a network of such objects (multinets). The main contribution of this paper is a novel application of a factor graph framework to model the interactions in a network of multijects (multinet) at a semantic level. Factor graphs are statistical graphical models that provide an efficient framework for exact and approximate inference via the sum-product algorithm. Incorporating the statistical interactions between the concepts using factor graphs enhances the detection probability of individual multijects and provides a unified framework for integrating multiple modalities and supports inference of unobservable concepts based on their relation with observable concepts. Our experiments reveal significant performance improvement using the inference on the factor graph models.