### Example: The Calculus of Fitting

The graph shows data on the monthly energy use by a house (measured in "therms") versus the heating-degree-days for each month. What's the relationship between the two variables? A linear model is pretty reasonable here:

$$\text{therms} = m \cdot \text{hdd}.$$

What should m be? Introduce the idea of residuals and the criterion of picking the best m by minimizing the sum of squares of the residuals.

How you choose to do this might depend on the course you are teaching. Calculus? Perhaps write down the sum of squared residuals:

$$\sum_{i=1}^{n} (y_i - m x_i)^2.$$
With $x_i$ and $y_i$ known (they are the data, after all: $x_i$ is the heating-degree-days and $y_i$ the therms for month $i$), this is a function of $m$ alone. If you want to do classical calculus optimization, differentiate that function with respect to $m$ and set the result to zero. Expanded, the function is a quadratic in $m$ with coefficients built from $\sum y_i^2$, $\sum x_i y_i$, and $\sum x_i^2$; the best-fitting $m$ turns out to depend only on the last two.
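
For the calculus route, the steps might be written out as follows. This is a sketch of the standard derivation; the name $S(m)$ for the sum of squared residuals is introduced here for convenience and is not part of the original notation.

$$
\begin{aligned}
S(m) &= \sum_{i=1}^{n} (y_i - m x_i)^2
      = \sum_{i=1}^{n} y_i^2 \;-\; 2m \sum_{i=1}^{n} x_i y_i \;+\; m^2 \sum_{i=1}^{n} x_i^2, \\
\frac{dS}{dm} &= -2 \sum_{i=1}^{n} x_i y_i + 2m \sum_{i=1}^{n} x_i^2 = 0
\quad\Longrightarrow\quad
m = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}.
\end{aligned}
$$

Because $d^2S/dm^2 = 2\sum x_i^2 > 0$ whenever at least one $x_i$ is nonzero, this critical point is a minimum.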

Statistics? With the data in a spreadsheet, calculate the sum of squared residuals explicitly for some given m and plot that point. Then try another m, and so on. Gradually, a parabola will emerge, reminding students of what they learned in algebra and calculus.
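
If a spreadsheet is not at hand, the same try-an-m-and-plot-it exercise can be sketched in Python. The numbers below are hypothetical, made up purely for illustration (they are not the measurements in the graph), and the names `hdd`, `therms`, and `sum_sq_resid` are assumptions of this sketch.

```python
import numpy as np

# Hypothetical illustrative data, NOT the measurements shown in the graph:
# heating-degree-days and therms for six months.
hdd = np.array([100.0, 300.0, 500.0, 700.0, 900.0, 1100.0])
therms = np.array([20.0, 50.0, 85.0, 110.0, 150.0, 180.0])

def sum_sq_resid(m, x, y):
    """Sum of squared residuals for the model y = m * x."""
    return float(np.sum((y - m * x) ** 2))

# Try a grid of candidate slopes, one at a time, as one would by hand.
candidates = np.linspace(0.10, 0.22, 13)
ssr = np.array([sum_sq_resid(m, hdd, therms) for m in candidates])

# The candidate with the smallest sum of squared residuals.
best_on_grid = candidates[np.argmin(ssr)]

# Fit a degree-2 polynomial to the (m, SSR) points; a near-perfect fit
# is what "a parabola emerges" looks like numerically.
coeffs = np.polyfit(candidates, ssr, 2)

print("best m on the grid:", best_on_grid)
print("quadratic coefficients (m^2, m, constant):", coeffs)
```

Plotting `ssr` against `candidates` gives the picture students would build up point by point; the coefficient on $m^2$ should come out close to the sum of the squared `hdd` values.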

Computer science? Write a program to calculate the best fit. Then, introduce the idea of random sampling with replacement and have students construct a resampling function, which is a useful exercise in indexing. By repeating the fit on many resampled data sets (called bootstrapping in statistics), your students can find the range of m consistent with the data.
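
One way the programming exercise might look is sketched below. The data are hypothetical (not the values in the graph), the function names are inventions of this sketch, and the 95% percentile interval is just one common way to summarize the bootstrap results, not the only one.

```python
import numpy as np

# Hypothetical illustrative data, NOT the measurements shown in the graph.
hdd = np.array([100.0, 300.0, 500.0, 700.0, 900.0, 1100.0])
therms = np.array([20.0, 50.0, 85.0, 110.0, 150.0, 180.0])

def best_fit_slope(x, y):
    """Least-squares slope for the no-intercept model y = m * x."""
    return float(np.dot(x, y) / np.dot(x, x))

def resample_indices(n, rng):
    """Draw n row indices from 0..n-1 at random, with replacement."""
    return rng.integers(0, n, size=n)

def bootstrap_slopes(x, y, n_trials=1000, seed=0):
    """Refit the slope on n_trials resampled copies of the data."""
    rng = np.random.default_rng(seed)
    slopes = np.empty(n_trials)
    for t in range(n_trials):
        idx = resample_indices(len(x), rng)
        # Index x and y with the SAME indices so each (x, y) pair stays together.
        slopes[t] = best_fit_slope(x[idx], y[idx])
    return slopes

slopes = bootstrap_slopes(hdd, therms)
low, high = np.percentile(slopes, [2.5, 97.5])

print("slope from the full data:", best_fit_slope(hdd, therms))
print("middle 95% of bootstrap slopes:", low, "to", high)
```

The indexing point worth drawing out for students is that `x[idx]` and `y[idx]` share one index array; resampling the two columns independently would destroy the relationship being fitted.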

Linear algebra? Project the vector of y values down onto the vector of x values. That's what least-squares fitting is all about. Computing the coefficient m is an exercise in dot products.
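
The projection view can be checked numerically with a few lines of NumPy. As a sketch under stated assumptions: the data are hypothetical (not taken from the graph), and the comparison against `np.linalg.lstsq` is added here only as a sanity check.

```python
import numpy as np

# Hypothetical illustrative data, NOT the measurements shown in the graph.
x = np.array([100.0, 300.0, 500.0, 700.0, 900.0, 1100.0])  # heating-degree-days
y = np.array([20.0, 50.0, 85.0, 110.0, 150.0, 180.0])      # therms

# Coefficient of the projection of y onto x: a ratio of two dot products.
m = np.dot(x, y) / np.dot(x, x)

# The projection itself (the fitted values) and what is left over.
projection = m * x
residual = y - projection

# The residual of a projection is orthogonal to x, so this is ~0
# up to floating-point rounding.
orthogonality = np.dot(residual, x)

# Sanity check against NumPy's general least-squares solver,
# using x as a one-column matrix (no intercept term).
m_lstsq = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]

print("m from dot products:", m)
print("m from lstsq:       ", m_lstsq)
print("residual . x:       ", orthogonality)
```

The near-zero value of `residual . x` is the geometric statement of least squares: the leftover part of y has no component along x.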