Carrier Sense Multiple Access (CSMA) protocols have been shown to achieve the full capacity region for data communication in wireless networks with polynomial complexity. However, existing throughput-optimal schemes incur a delay that scales exponentially with the network size, even in the simplified scenario of transmission jobs with uniform sizes. Although CSMA protocols with order-optimal average delay have been proposed for specific topologies, no existing work provides a worst-case delay guarantee for each job in general network settings, let alone when jobs have non-uniform lengths and throughput optimality is still targeted. In this paper, we tackle this issue by proposing a two-timescale CSMA-based data communication protocol that makes dynamic decisions on rate control, link scheduling, job transmission, and job dropping in polynomial complexity. Through rigorous analysis, we demonstrate that the proposed protocol achieves a throughput utility arbitrarily close to its offline optimum for jobs with non-uniform sizes and worst-case delay guarantees, at the cost of a longer maximum allowable delay.