Soft Continuum Arms (SCAs) are challenging to control because of their highly nonlinear behavior and their sensitivity to external loading. Recent efforts to address the control problem with machine learning techniques have been limited to simple SCA architectures. In this paper, we train a model-free reinforcement learning control policy based on Deep Deterministic Policy Gradient (DDPG) for end-effector path tracking on a BR2 SCA. Unlike simple SCA architectures, the BR2 SCA can both bend and rotate spatially, which enlarges its workspace and enables it to perform complex tasks. The control policy is first validated in simulation and then implemented on a prototype BR2 with state feedback. The proposed control policy achieves an average tracking error of less than 3 cm, which is smaller than the diameter of the SCA. The efficacy of the control policy is validated under different loading conditions, both in simulation and on the SCA prototype.