(X'X)⁻¹X'Y

# Happy birthday to Gauss, father of the first predictive algorithm

Carl Friedrich Gauss, the “Prince of Mathematicians.”
Image: Wikimedia

If you work or study in a field that uses math, you almost certainly owe a debt to Carl Friedrich Gauss, the great German mathematician, who was born 241 years ago today. Architects and designers use his findings to measure the curvature of surfaces. Engineers take advantage of his discovery of the discrete Fourier transform, a method used for image and sound processing. Biologists use his formulation of the normal distribution to study patterns in nature.

Because of his many remarkable findings, Gauss is sometimes called the “Prince of Mathematicians.” The breadth of his accomplishments is highlighted in today’s Google doodle.

Gauss was so prolific that he considered his invention of statistical regression, a central tool in modern statistics and data science, too trivial to even report when he discovered it. Specifically, Gauss invented least squares regression, the earliest form of regression analysis. The method calculates the straight line that best fits a set of data, where “best” means the line that minimizes the sum of the squared vertical distances between the data points and the line. It is primarily used to understand the relationship between variables or to predict future outcomes. For example, you could use regression to understand the strength of the relationship between parental height and the height of children, or even parental height and the income of children when they become adults.
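For readers who want to see the arithmetic, here is a minimal sketch of a least squares line fit in Python. It applies the formula at the top of this article, (X'X)⁻¹X'Y, to the simplest case of one predictor plus an intercept. The function name `fit_line` and the height figures are invented for illustration; they are not real data and not anything Gauss wrote down.

```python
def fit_line(xs, ys):
    """Least squares fit of the line y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    # Entries of X'X and X'Y when X has a column of ones and a column of x values.
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # Determinant of the 2x2 matrix X'X; it is zero when every x is the same.
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("x values must not all be equal")

    # (X'X)^-1 X'Y, written out by inverting the 2x2 matrix explicitly.
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


# Hypothetical heights in centimeters, made up for this example.
parent_heights = [160, 165, 170, 175, 180]
child_heights = [165, 167, 171, 174, 178]

intercept, slope = fit_line(parent_heights, child_heights)
# Predict the height of a child whose parent is 172 cm tall.
predicted = intercept + slope * 172
```

With more than one predictor the same formula applies, but the matrix X'X grows, and in practice it is usually solved with a linear algebra library rather than inverted by hand.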

Although the French mathematician Adrien-Marie Legendre was the first to publish the method, in 1805, Gauss claimed to have invented it in 1795. He asserted his priority in an 1809 paper in which he used regression to predict the location of an asteroid. Legendre disputed that Gauss deserved credit for the invention, and the dispute led to lifelong hostility between the two. Still, as the historian Stephen Stigler explains in a paper on the imbroglio (pdf), Gauss would get most of the accolades because he did more to explain regression’s fundamental value. He also provided an algorithm to compute it.

Gauss and Legendre never actually used the term regression for their method. It was first referred to as the “regression line” by statistician Karl Pearson in 1901.

Today, regression is one of the foundations of modern social science, and among the most important tools of computer scientists. It has been further developed since Gauss’s time, of course, but the basics are still quite similar. It is used by economists to try to understand what causes economic growth, by psychologists to analyze experiments on human behavior, and by biostatisticians to figure out which drugs are safe to use.

Machines also use regression to predict things like which advertisement to show you on a webpage. While newer prediction algorithms, like support vector machines and neural networks, generally surpass least squares regression in terms of accuracy, the method remains popular for its simplicity and speed of calculation. More than 200 years later, the work of Gauss is as relevant as ever.