Publications

Google Scholar lists papers in citation count order.

B. O'Donoghue, E. Chu, N. Parikh, and S. Boyd. SCS (Splitting Conic Solver). GitHub, 2019.

SCS, the splitting conic solver, is a numerical optimization package for solving large-scale convex cone problems, based on our paper Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding. It is written in C and can be used from C, C++, Python, Matlab, R, Julia, and Ruby programs via linked interfaces. It can also be called as a solver from the convex optimization toolboxes CVX (3.0 or later), CVXPY, Convex.jl, and YALMIP.
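
As a brief illustration of the CVXPY route, the sketch below builds a small lasso problem and asks CVXPY to hand it to SCS; the problem sizes, data, and regularization weight are made up for illustration, and any recent CVXPY release with SCS installed should accept it:

    import cvxpy as cp
    import numpy as np

    # Small synthetic lasso instance (data chosen only for illustration).
    np.random.seed(0)
    m, n = 30, 10
    A = np.random.randn(m, n)
    b = np.random.randn(m)

    x = cp.Variable(n)
    objective = cp.Minimize(0.5 * cp.sum_squares(A @ x - b) + 0.1 * cp.norm1(x))
    prob = cp.Problem(objective)

    # CVXPY canonicalizes the problem to a cone program and passes it to SCS.
    prob.solve(solver=cp.SCS, verbose=True)
    print(prob.status, prob.value)

SCS also ships its own lower-level interfaces, so it can be called directly without going through a modeling layer.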

B. O'Donoghue, E. Chu, N. Parikh, and S. Boyd. Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding. Journal of Optimization Theory and Applications, 2016.

We introduce a first-order method for solving very large cone programs to modest accuracy. The method applies an operator splitting technique, the alternating direction method of multipliers, to the homogeneous self-dual embedding, an equivalent feasibility problem that involves finding a nonzero point in the intersection of a subspace and a cone.
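
To make the embedding concrete, here is a deliberately minimal Python sketch of the iteration described in the paper, specialized to a linear program over the nonnegative orthant. The function name and dense linear algebra are mine; it omits the scaling, termination criteria, and infeasibility certificates of the actual method, and it assumes a solvable problem so that the embedding yields \(\tau > 0\):

    import numpy as np

    def toy_hsd_admm_lp(A, b, c, iters=5000):
        """Sketch: ADMM on the homogeneous self-dual embedding of the LP
        minimize c'x  subject to  Ax + s = b,  s >= 0."""
        m, n = A.shape
        N = n + m + 1
        # Skew-symmetric matrix Q; the embedding asks for u in C, v in C*, Qu = v.
        Q = np.zeros((N, N))
        Q[:n, n:n + m] = A.T
        Q[:n, -1] = c
        Q[n:n + m, :n] = -A
        Q[n:n + m, -1] = b
        Q[-1, :n] = -c
        Q[-1, n:n + m] = -b
        Minv = np.linalg.inv(np.eye(N) + Q)  # I + Q is invertible (Q is skew-symmetric);
                                             # the real solver caches a factorization
        u = np.zeros(N); u[-1] = 1.0         # u = (x, y, tau)
        v = np.zeros(N)                      # v = (r, s, kappa)

        def proj_cone(z):
            # Cone for u: x free, y >= 0 (dual of the nonnegative orthant), tau >= 0.
            w = z.copy()
            w[n:] = np.maximum(w[n:], 0.0)
            return w

        for _ in range(iters):
            u_tilde = Minv @ (u + v)     # linear-system step (subspace Qu = v)
            u = proj_cone(u_tilde - v)   # cone projection step
            v = v - u_tilde + u          # dual update (running residual)

        tau = u[-1]
        return u[:n] / tau, u[n:n + m] / tau   # recover primal x and dual y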

N. Parikh and S. Boyd. Proximal Algorithms. Foundations and Trends in Optimization, 2014.

This monograph is about a class of optimization algorithms called proximal algorithms. Much as Newton’s method is a standard tool for solving unconstrained smooth optimization problems of modest size, proximal algorithms can be viewed as an analogous tool for nonsmooth, constrained, large-scale, or distributed versions of these problems. They are very generally applicable, but are especially well suited to problems of substantial recent interest involving large or high-dimensional datasets. Proximal methods sit at a higher level of abstraction than classical algorithms like Newton’s method: the base operation is evaluating the proximal operator of a function, which itself involves solving a small convex optimization problem. These subproblems, which generalize the problem of projecting a point onto a convex set, often admit closed-form solutions or can be solved very quickly with standard or simple specialized methods. Here, we discuss the many different interpretations of proximal operators and algorithms, describe their connections to many other topics in optimization and applied mathematics, survey some popular algorithms, and provide a large number of examples of proximal operators that commonly arise in practice.
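
As one standard concrete example of the kind covered in the monograph (the code itself is a minimal sketch of mine): the proximal operator of a scaled \(\ell_1\) norm is elementwise soft thresholding, and alternating a gradient step on a smooth loss with this prox gives the proximal gradient method, here applied to the lasso:

    import numpy as np

    def prox_l1(v, t):
        """Proximal operator of t*||.||_1: elementwise soft thresholding."""
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def proximal_gradient_lasso(A, b, lam, iters=500):
        """Proximal gradient method (ISTA) for 0.5*||Ax - b||^2 + lam*||x||_1."""
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, L = Lipschitz constant of the gradient
        x = np.zeros(A.shape[1])
        for _ in range(iters):
            grad = A.T @ (A @ x - b)             # gradient of the smooth part
            x = prox_l1(x - step * grad, step * lam)
        return x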

N. Parikh and S. Boyd. Block Splitting for Distributed Optimization. Mathematical Programming Computation, 2014.

This paper describes a general-purpose method for solving convex optimization problems in a distributed computing environment. In particular, if the problem data includes a large linear operator or matrix \(A\), the method allows for handling each subblock of \(A\) on a separate machine. The approach works as follows. First, we define a canonical problem form called graph form, in which we have two sets of variables \(x\) and \(y\) related by a linear operator \(A\), such that the objective function is separable across these two sets of variables. Many problems are easily expressed in graph form, including cone programs and a wide variety of regularized loss minimization problems from statistics, like logistic regression, the support vector machine, and the lasso. Next, we describe graph projection splitting, a form of Douglas-Rachford splitting or the alternating direction method of multipliers, to solve graph form problems serially. Finally, we derive a distributed block splitting algorithm based on graph projection splitting. In a statistical or machine learning context, this allows for training models exactly with a huge number of both training examples and features, such that each processor handles only a subset of both. To the best of our knowledge, this is the only general-purpose method with this property. We present several numerical experiments in both the serial and distributed settings.
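
The following is a minimal sketch of the serial algorithm as I read it, specialized to the lasso in graph form with \(f(y) = \tfrac{1}{2}\|y-b\|^2\) and \(g(x) = \lambda\|x\|_1\), a unit penalty parameter, and no stopping tests; the function names are mine:

    import numpy as np

    def soft_threshold(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def graph_projection_splitting_lasso(A, b, lam, iters=500):
        """Graph form: minimize f(y) + g(x) subject to y = Ax."""
        m, n = A.shape
        x, y = np.zeros(n), np.zeros(m)          # graph-feasible iterates
        xt, yt = np.zeros(n), np.zeros(m)        # scaled dual (tilde) variables
        K = np.linalg.inv(np.eye(n) + A.T @ A)   # cached for the graph projection
        for _ in range(iters):
            # Separable proximal step on g and f.
            xh = soft_threshold(x - xt, lam)     # prox of lam*||.||_1
            yh = (b + y - yt) / 2.0              # prox of 0.5*||. - b||^2
            # Projection onto the graph {(x, y): y = Ax}.
            x = K @ (xh + xt + A.T @ (yh + yt))
            y = A @ x
            # Dual updates.
            xt = xt + xh - x
            yt = yt + yh - y
        return x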

N. Parikh. Distributed Convex Optimization with Proximal Methods. PhD thesis, Stanford University, 2014.

This thesis includes work from the monographs “Proximal Algorithms” and “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers,” along with additional material from “Block Splitting for Distributed Optimization.”

E. Chu, N. Parikh, A. Domahidi, and S. Boyd. Code Generation for Embedded Second-Order Cone Programming. Proceedings of the European Control Conference, 2013.

This paper describes a framework for generating easily verifiable code to solve convex optimization problems in embedded applications by transforming them into equivalent second-order cone programs. In embedded applications, it is critical to be able to verify code correctness, but it is also desirable to be able to rapidly prototype and deploy high-performance solvers for different problems. To balance these two requirements, we propose a code generation system that takes high-level descriptions of convex optimization problems and generates code that maps the parameters in the original problem to data in an equivalent second-order cone program, which is then solved by a single, external solver that can be verified once and for all. A novel aspect is that we restrict the parameters in the original problem to appear only in affine functions, which lets us map the parameters to problem data without performing any floating point operations. As a result, the generated code is lightweight, fast, and trivial to verify. The approach thus marries the benefits of high-level parser-solvers with custom, high-performance, high-reliability solvers for embedded applications.
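
As a toy illustration of the affine-parameter point (my own example, not the paper's generated output): for minimizing \(\|x\|_2\) subject to \(a^Tx = b\) with parameters \(a\) and \(b\), a generated mapping simply copies the parameter entries into the data of an equivalent second-order cone program:

    import numpy as np

    def stuff_min_norm_problem(a, b):
        """Map parameters (a, b) of  minimize ||x||_2  s.t.  a'x = b
        to data for a second-order cone program in the variable z = (x, t):
            minimize c'z   s.t.   F z = g,   h - G z in the SOC Q^{n+1}.
        Every parameter entry is copied into the data; no arithmetic is
        performed on parameter values."""
        n = a.shape[0]
        c = np.zeros(n + 1); c[-1] = 1.0          # objective: minimize t
        F = np.zeros((1, n + 1)); F[0, :n] = a    # a'x = b   (entries copied)
        g = np.array([float(b)])                  #           (entry copied)
        G = np.zeros((n + 1, n + 1))              # SOC constraint ||x|| <= t:
        G[0, -1] = -1.0                           #   h - Gz = (t, x) in Q^{n+1}
        G[1:, :n] = -np.eye(n)
        h = np.zeros(n + 1)
        return c, F, g, G, h

Because the mapping is pure assignment, verifying it amounts to checking indices rather than arithmetic.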

E. Chu, B. O'Donoghue, N. Parikh, and S. Boyd. A Primal-Dual Operator Splitting Method for Conic Optimization. Unfinished draft, 2013.

We develop a simple operator splitting method for solving a primal conic optimization problem; we show that the iterates also solve the dual problem. The resulting algorithm is very simple to describe and implement and yields solutions of modest accuracy in competitive times. Several versions of the algorithm are amenable to parallelization, either via distributed linear algebra or GPU-accelerated matrix-vector multiplication. We provide a simple, single-threaded C implementation for reference.

N. Parikh and S. Boyd. Block Splitting for Large-Scale Distributed Learning. Neural Information Processing Systems, Workshop on Big Learning, 2011.

Machine learning and statistics with very large datasets are now topics of widespread interest in both academia and industry. Many such tasks can be posed as convex optimization problems, so algorithms for distributed convex optimization serve as a powerful, general-purpose mechanism for training a wide class of models on datasets too large to process on a single machine. Previous work has shown how to solve such problems so that each machine only looks at either a subset of the training examples or a subset of the features. In this paper, we extend these algorithms by showing how to split problems by both examples and features simultaneously, which is necessary to deal with datasets that are very large in both dimensions. We present some experiments with these algorithms run on Amazon’s Elastic Compute Cloud.

S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning, 2011.

Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features, training examples, or both. As a result, both decentralized collection or storage of these datasets and the accompanying distributed solution methods are either necessary or at least highly desirable. In this paper, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas-Rachford splitting, Spingarn’s method of partial inverses, Dykstra’s alternating projections, Bregman iterative algorithms for \(\ell_1\) problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.
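
As one concrete instance discussed in the monograph, here is a minimal sketch (mine) of ADMM in scaled form applied to the lasso, with a fixed penalty parameter and no stopping criterion or penalty adaptation:

    import numpy as np

    def admm_lasso(A, b, lam, rho=1.0, iters=500):
        """ADMM for the lasso: minimize 0.5*||Ax - b||^2 + lam*||x||_1,
        split into an x-update (ridge-like solve), a z-update (soft
        thresholding), and a running sum of residuals as the scaled dual u."""
        m, n = A.shape
        # Factor once; every x-update reuses the same cached inverse.
        K = np.linalg.inv(A.T @ A + rho * np.eye(n))
        Atb = A.T @ b
        x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
        for _ in range(iters):
            x = K @ (Atb + rho * (z - u))                    # quadratic step
            z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # prox step
            u = u + x - z                                    # dual update
        return z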