This paper addresses the problem of distributed multi-agent optimization in which each agent i has a local cost function h_i(x), and the goal is to optimize a global cost function consisting of the average of the local cost functions. Such optimization problems are of interest in many contexts, including distributed machine learning and distributed robotics. We consider the distributed optimization problem in the presence of faulty agents. We focus primarily on Byzantine failures, but also briefly discuss some results for crash failures. For the Byzantine fault-tolerant optimization problem, the ideal goal is to optimize the average of the local cost functions of the non-faulty agents. However, this goal cannot be achieved in the presence of Byzantine agents. Therefore, we consider a relaxed version of the fault-tolerant optimization problem, in which the goal is to generate an output that is an optimum of a global cost function formed as a convex combination of the local cost functions of the non-faulty agents. More precisely, if N denotes the set of non-faulty agents in a given execution, then there must exist weights α_i ≥ 0 for i ∈ N, with Σ_{i∈N} α_i = 1, such that the output is an optimum of the cost function Σ_{i∈N} α_i h_i(x). Ideally, we would like α_i = 1/|N| for all i ∈ N; however, the maximum number of non-zero weights (α_i's) that can be guaranteed is |N| − f, where f is the maximum number of Byzantine faulty agents. The contribution of this paper is an iterative distributed optimization algorithm that achieves this optimal fault-tolerance. Specifically, it ensures that at least |N| − f agents have weights that are bounded away from 0 (in particular, lower bounded by 1/(2(|N| − f))). The proposed distributed algorithm has a simple iterative structure, with each agent maintaining only a small amount of local state.
We show that the iterative algorithm ensures two properties as time goes to ∞: consensus (i.e., the outputs of the non-faulty agents become identical in the limit) and optimality (in the sense that the output is an optimum of a suitably defined global cost function). After a finite number of iterations, the algorithm satisfies these properties approximately.
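The iterative structure described above can be illustrated with a minimal simulation. This is a sketch, not the paper's algorithm: it assumes a synchronous, complete-communication model, scalar states, and quadratic local costs h_i(x) = (x − c_i)², and it uses trimmed-mean aggregation (a standard Byzantine-robust building block) as the filtering rule. All names, parameters, and the choice of adversarial behavior are invented for the example.

```python
import random

def trimmed_mean(values, f):
    """Discard the f largest and f smallest received values, average the rest."""
    s = sorted(values)
    kept = s[f:len(s) - f]
    return sum(kept) / len(kept)

def simulate(n=10, f=2, steps=300, eta=0.05, seed=1):
    """Synchronous round-based simulation: n agents, up to f of them Byzantine."""
    rng = random.Random(seed)
    good = list(range(n - f))                        # indices of non-faulty agents
    # Hypothetical quadratic local costs h_i(x) = (x - c_i)^2 for non-faulty i.
    centers = [rng.uniform(-1.0, 1.0) for _ in good]
    x = {i: rng.uniform(-5.0, 5.0) for i in good}    # initial local states
    for _ in range(steps):
        # Non-faulty agents broadcast their states; Byzantine agents may send
        # arbitrary values (here: large random outliers).
        msgs = [x[i] for i in good] + [rng.uniform(-100.0, 100.0) for _ in range(f)]
        for i in good:
            agg = trimmed_mean(msgs, f)              # robust aggregation step
            grad = 2.0 * (agg - centers[i])          # gradient of local cost at agg
            x[i] = agg - eta * grad                  # local gradient descent step
    return x, centers

states, centers = simulate()
vals = list(states.values())
```

Dropping the f extreme values on each side guarantees that the aggregate lies within the range of the non-faulty agents' states regardless of what the Byzantine agents send, so the non-faulty states contract toward (a convex combination of) the non-faulty optima and approach consensus.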