Linear regression

Given two lists of values

xs = [x(1), x(2), ..., x(n)]
ys = [y(1), y(2), ..., y(n)]

the equation of the regression line of ys on xs is y = a + bx, where \[b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}\] \[a = \frac{\sum y_i - b \sum x_i}{n}\]
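
These formulas are the standard least-squares solution: a and b minimise the sum of squared vertical distances between the data and the line. A brief sketch of the derivation, where S is a name introduced here for that sum: \[S(a,b) = \sum \left(y_i - a - b x_i\right)^2\] Setting both partial derivatives of S to zero gives the normal equations \[\sum y_i = n a + b \sum x_i\] \[\sum x_i y_i = a \sum x_i + b \sum x_i^2\] The first equation yields the expression for a. Substituting it into the second and multiplying through by n yields the expression for b.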

Define the function

regresionLineal :: [Double] -> [Double] -> (Double,Double)

such that (regresionLineal xs ys) is the pair (a,b) of coefficients of the regression line of ys on xs. For example, for the values

ejX, ejY :: [Double]
ejX = [5,  7, 10, 12, 16, 20, 23, 27, 19, 14]
ejY = [9, 11, 15, 16, 20, 24, 27, 29, 22, 20]

we have

λ> regresionLineal ejX ejY
(5.195045748716805,0.9218924347243919)
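
As a hand check of this result, the sums for ejX and ejY are n = 10, Σx = 153, Σy = 193, Σx² = 2789 and Σxy = 3366, so \[b = \frac{10 \cdot 3366 - 153 \cdot 193}{10 \cdot 2789 - 153^2} = \frac{4131}{4481} \approx 0.92189\] \[a = \frac{193 - b \cdot 153}{10} \approx 5.19505\] which agrees with the pair returned by the function.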

To check the definition, we define the procedure

grafica :: [Double] -> [Double] -> IO ()
grafica xs ys =
    plotPathsStyle
      [YRange (0,10+mY)]
      [(defaultStyle {plotType = Points,
                      lineSpec = CustomStyle [LineTitle "Datos",
                                              PointType 2,
                                              PointSize 2.5]},
                     zip xs ys),
       (defaultStyle {plotType = Lines,
                      lineSpec = CustomStyle [LineTitle "Ajuste",
                                              LineWidth 2]},
                     [(x,a+b*x) | x <- [0..mX]])]
    where (a,b) = regresionLineal xs ys
          mX    = maximum xs
          mY    = maximum ys

such that (grafica xs ys) plots the points given by the lists xs and ys together with their regression line. For example, (grafica ejX ejY) draws the ten sample points with the fitted line running through them.


Solutions

import Data.List (genericLength)
import Graphics.Gnuplot.Simple

ejX, ejY :: [Double]
ejX = [5,  7, 10, 12, 16, 20, 23, 27, 19, 14]
ejY = [9, 11, 15, 16, 20, 24, 27, 29, 22, 20]

-- (regresionLineal xs ys) is the pair (a,b) of coefficients of the
-- regression line y = a+bx of ys on xs.
regresionLineal :: [Double] -> [Double] -> (Double,Double)
regresionLineal xs ys = (a,b)
    where n     = genericLength xs            -- number of points
          sumX  = sum xs
          sumY  = sum ys
          sumX2 = sum (zipWith (*) xs xs)     -- sum of the squares of xs
          sumXY = sum (zipWith (*) xs ys)     -- sum of the products x*y
          b     = (n*sumXY - sumX*sumY) / (n*sumX2 - sumX^2)
          a     = (sumY - b*sumX) / n

grafica :: [Double] -> [Double] -> IO ()
grafica xs ys =
    plotPathsStyle
      [YRange (0,10+mY)]
      [(defaultStyle {plotType = Points,
                      lineSpec = CustomStyle [LineTitle "Datos",
                                              PointType 2,
                                              PointSize 2.5]},
                     zip xs ys),
       (defaultStyle {plotType = Lines,
                      lineSpec = CustomStyle [LineTitle "Ajuste",
                                              LineWidth 2]},
                     [(x,a+b*x) | x <- [0..mX]])]
    where (a,b) = regresionLineal xs ys
          mX    = maximum xs
          mY    = maximum ys
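
The definition can also be checked without drawing anything. A least-squares line passes through the point of means and its residuals sum to zero, so both facts can be tested numerically. The following is a minimal sketch, not part of the original solution: the names media, pasaPorLaMedia and sumaResiduos and the tolerance 1e-9 are assumptions introduced here, and regresionLineal is repeated only so that the file loads on its own.

```haskell
import Data.List (genericLength)

ejX, ejY :: [Double]
ejX = [5,  7, 10, 12, 16, 20, 23, 27, 19, 14]
ejY = [9, 11, 15, 16, 20, 24, 27, 29, 22, 20]

-- Same definition as in the solution, repeated to keep the sketch
-- self-contained.
regresionLineal :: [Double] -> [Double] -> (Double,Double)
regresionLineal xs ys = (a,b)
    where n     = genericLength xs
          sumX  = sum xs
          sumY  = sum ys
          sumX2 = sum (zipWith (*) xs xs)
          sumXY = sum (zipWith (*) xs ys)
          b     = (n*sumXY - sumX*sumY) / (n*sumX2 - sumX^2)
          a     = (sumY - b*sumX) / n

-- Hypothetical helper: arithmetic mean of a non-empty list.
media :: [Double] -> Double
media zs = sum zs / genericLength zs

-- Hypothetical check: the fitted line passes through the point
-- (media xs, media ys), up to an assumed tolerance of 1e-9.
pasaPorLaMedia :: [Double] -> [Double] -> Bool
pasaPorLaMedia xs ys = abs (a + b * media xs - media ys) < 1e-9
    where (a,b) = regresionLineal xs ys

-- Hypothetical helper: sum of the residuals y - (a+b*x), which should
-- be zero up to rounding error.
sumaResiduos :: [Double] -> [Double] -> Double
sumaResiduos xs ys = sum [y - (a + b*x) | (x,y) <- zip xs ys]
    where (a,b) = regresionLineal xs ys
```

In GHCi, (pasaPorLaMedia ejX ejY) should evaluate to True and (sumaResiduos ejX ejY) to a value very close to zero. Neither check is meaningful when all the elements of xs are equal, because the denominator of b is then zero.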