Many companies now use crowdsourcing to leverage external (as well as internal) crowds to perform specialized work, and so methods of improving efficiency are critical. Tasks in crowdsourcing systems with specialized work have multiple steps and each step requires multiple skills. Steps may have different flexibilities in terms of obtaining service from one or multiple agents, due to varying levels of dependency among parts of steps. Steps of a task may have precedence constraints among them. Moreover, there are variations in loads of different types of tasks requiring different skill-sets and availabilities of different types of agents with different skill-sets. Considering these constraints together necessitates the design of novel schemes to allocate steps to agents. In addition, large crowdsourcing systems require allocation schemes that are simple, fast, decentralized and offer customers (task requesters) the freedom to choose agents. In this work we study the performance limits of such crowdsourcing systems and propose efficient allocation schemes that provably meet the performance limits under these additional requirements. We demonstrate our algorithms on data from a crowdsourcing platform run by a non-profit company and show significant improvements over current practice.