University of Wisconsin-Madison

Stephen J. Wright holds the George B. Dantzig Professorship, the Sheldon Lubar Chair, and the Amar and Balinder Sohi Professorship of Computer Sciences at the University of Wisconsin-Madison. His research is in computational optimisation and its applications to many areas of science and engineering.

Prior to joining UW-Madison in 2001, Wright held positions at North Carolina State University (1986-90), Argonne National Laboratory (1990-2001), and the University of Chicago (2000-2001). He graduated with a Ph.D. from the University of Queensland in 1984. He has served as Chair of the Mathematical Optimisation Society and as a Trustee of SIAM. He is a Fellow of SIAM. In 2014, he won the W.R.G. Baker award from IEEE.

Wright is the author or coauthor of widely used textbooks and reference books in optimisation, including “Primal-Dual Interior-Point Methods” and “Numerical Optimisation”. He has published widely on optimisation theory, algorithms, software, and applications.

Wright is the current editor-in-chief of the SIAM Journal on Optimisation and previously served as editor-in-chief or associate editor of Mathematical Programming (Series A), Mathematical Programming (Series B), SIAM Review, SIAM Journal on Scientific Computing, and several other journals and book series.

Professor Stephen Wright will be lecturing at AMSI Winter School 2017, delivering a course on “Optimisation Techniques For Data Analysis”.

AMSI caught up with Professor Stephen Wright from the University of Wisconsin-Madison. We asked Stephen about his research and his interests, and got some advice…

1. Can you tell us about your work? What drives your interest in this field?

I’m interested in the properties of fundamental algorithms in optimisation, particularly algorithms that are relevant to data science and machine learning applications. There has been a surge of interest in optimisation over the past decade as it emerged as an important – perhaps the most important – enabling discipline for data science and machine learning. I like to understand in mathematical terms why algorithms that are of practical use work the way they do – what can we prove about their convergence behaviour and their complexity properties? I’m also interested in the practical side of optimisation – adapting optimisation methods to solve problems in science and engineering effectively. It is fun to collaborate with people from other disciplines, such as process control engineers, power systems engineers, statisticians, and database systems people – to learn about other domains of knowledge and other research cultures.

2. What are the most interesting “big questions” or challenges facing researchers in your area?

Currently there is a lot of interest surrounding nonconvex optimisation. Some problems in data analysis lead to nonconvex formulations, which in general are supposed to be hard to solve. But the algorithms often work better than we have any right to expect. A lot of people are trying to understand why – trying to identify “hidden convexity” or other structure that makes these problems easier than they appear at first. Some very interesting results are emerging in this area. A related area of intense interest – more in machine learning than in optimisation – is understanding why deep learning works as well as it does. Deep learning has revolutionised several important areas such as speech and image recognition, so it is a key technology in smartphones and self-driving cars, for example. But there is no satisfactory mathematical understanding as to why it does so well. A lot of smart people are working very hard on trying to gain an understanding, and small advances have been made. But real breakthroughs have been elusive so far.

3. What are some key industry applications of your work?

Data analysis and machine learning are really the driving applications these days. For example, parallel stochastic gradient is used almost universally to train deep neural networks. We did some of the first work in 2011 on a mathematical theory for this algorithm, and followed up with theory for parallel coordinate descent. Other technologies such as convex quadratic programming are popular in both machine learning and finance – our code OOQP is used in both venues. And I heard last year that my code PCx was the first linear programming code used at Google, around 1997-98 when it was just a handful of Stanford graduate students. There are also some high-impact applications in process control.

4. Why did you become a mathematician/statistician?

I did well at maths in high school, in Queensland state competitions, and attended a national summer school at ANU before my senior year in high school. I didn’t have a strong vocational interest on entering university, so I took honours maths almost by default. It gradually became a career. I have found some interesting problems to work on at different stages of my career, and I enjoy the “expository” side too – writing books and giving review talks and articles that tell the story of recent research advances in optimisation.

5. Do you have any advice for future researchers?

It is a really exciting time in my research community (optimisation, machine learning, data science, computer science) but the competition has become intense as interest in these fields has grown. Be prepared to work really hard!