We consider the problem of circuit partitioning for multiple-chip implementation. One motivation for studying this problem is the current needs of good partitioning tools for implementing a circuit on multiple FPGA chips. We allow duplication of logic gates as it would reduce circuit delay. Circuit partitioning with duplication of logic gates is also called circuit clustering. In this paper, we present a circuit clustering algorithm that minimizes circuit delay subject to both area and pin constraints on each chip, using the general delay model. We develop a repeated network cut technique for finding a cluster that is bounded by both area and pin constraints. Our algorithm achieves optimal delay under either the area constraint only or the pin constraint only. Under both area and pin constraints, our algorithm achieves optimal delay in most cases. We outline the condition under which the nonoptimality occurs and show that the condition occurs rarely in practice. We tested our algorithm on a set of benchmark circuits and consistently obtained optimal or near-optimal delays.