We propose a fusion framework that integrates multiple cues for tracking by finding a set of optimal dynamic weights for the different tracking modalities. In the setting of Bayesian sequential estimation, we give an optimality criterion for the dynamic weight of each modality: a linear combination of the proposal distributions from the multiple cues should approximate the posterior distribution p(x_t | y_t). The fusion problem is then formulated as an optimization problem with a non-convex objective function, which we further convert into a constrained convex programming problem. We give the equations for finding the globally optimal solution and derive an approximate analytical solution, which we justify by comparing it with the fusion weights/mixture weights in [8, 32]. The framework dynamically identifies reliable cues and relies more heavily on them. We test it for human tracking on a very challenging surveillance video taken at a crowded subway station, and also for articulated tracking. The promising results support the claim that the proposed fusion framework can integrate weak modalities to improve tracking performance.
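
As a minimal illustration of what a linear combination of per-cue proposal distributions looks like, the following Python sketch builds a fused proposal q(x) = sum_k w_k q_k(x) for a one-dimensional state and samples from it. Everything in it is an assumption made for illustration: the Gaussian form of each cue's proposal, the two hypothetical cues, the hand-picked weights, and the function names. In particular, the weights are simply normalized scores and are not the optimal or approximate analytical weights derived in this work.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normalize(scores):
    """Rescale non-negative scores so that they sum to one."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return [s / total for s in scores]


def mixture_pdf(x, weights, cues):
    """Evaluate the fused proposal sum_k w_k * q_k(x)."""
    return sum(w * gaussian_pdf(x, m, s) for w, (m, s) in zip(weights, cues))


def sample_mixture(weights, cues, n, rng):
    """Draw n samples: pick a cue by its weight, then sample its proposal."""
    samples = []
    for _ in range(n):
        mean, std = rng.choices(cues, weights=weights, k=1)[0]
        samples.append(rng.gauss(mean, std))
    return samples


# Hypothetical cues: (mean, std) of each cue's proposal for a 1-D state.
cues = [(0.0, 1.0), (5.0, 3.0)]
# Hand-picked scores for illustration only, not the derived fusion weights.
weights = normalize([3.0, 1.0])
rng = random.Random(0)
samples = sample_mixture(weights, cues, 1000, rng)
```

In this sketch a cue with a larger weight contributes more samples to the fused proposal, which is the sense in which a tracker could rely more on a cue it considers reliable.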