Page 160 - Data Science Algorithms in a Week

Regression


            The nonzero intercept term may be caused by errors in the measurements or by
            other forces at play that the equation does not capture. Since it is relatively small, the
            final velocity should still be estimated reasonably well. Substituting the distance of
            300 km (300000 m) into the equation, we get:

                                     v^2 = 4.206 * 300000 - 317.708 = 1261482.292
                                     v = 1123.157


            Therefore, for the projectile to reach a distance of 300 km from the source, we need to
            fire it at a speed of approximately 1123.157 m/s.
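
            The substitution above can be checked with a few lines of Python. This is only a
            sketch of the arithmetic: the coefficients are the ones quoted in the equation, while
            the variable names (slope, intercept, distance_m) are chosen here for illustration.

```python
import math

# Coefficients of the fitted line v^2 = slope * distance + intercept,
# as quoted in the equation above. The names are illustrative only.
slope = 4.206
intercept = -317.708

distance_m = 300 * 1000  # 300 km expressed in metres

# Squared velocity predicted by the linear model, then its square root.
v_squared = slope * distance_m + intercept
v = math.sqrt(v_squared)

print(round(v_squared, 3))
print(round(v, 3))
```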



            Summary

            We can think of variables as being dependent on each other in a functional way. For
            example, the variable y is a function of x, denoted by y=f(x). The function f(x) has constant
            parameters. For example, if y depends on x linearly, then f(x)=a*x+b, where a and b are
            the constant parameters of the function f(x). Regression is a method for estimating these
            constant parameters so that the estimated f(x) follows y as closely as possible. The
            closeness is formally measured by the squared error between f(x) and y over the data samples.
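
            As a minimal sketch of this idea, the following Python code computes the squared
            error for f(x)=a*x+b and estimates a and b with the standard closed-form
            least-squares formulas. The data points and the function names squared_error and
            fit_line are hypothetical and serve only as an illustration.

```python
def squared_error(a, b, xs, ys):
    """Sum of squared differences between f(x) = a*x + b and y."""
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys))


def fit_line(xs, ys):
    """Closed-form least-squares estimates of a and b for f(x) = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical samples lying close to the line y = 2*x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

a, b = fit_line(xs, ys)
print(a, b)
print(squared_error(a, b, xs, ys))
```

            For any other choice of a and b, squared_error returns a value at least as large as
            the one obtained for the fitted parameters.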

            The gradient descent method minimizes this error by repeatedly updating the constant
            parameters in the direction of the steepest descent (that is, opposite to the gradient
            formed by the partial derivatives of the error with respect to the parameters). With a
            suitably small step size, the parameters converge to the values that result in the
            minimal error.

            The statistical software R supports the estimation of linear regression models with the
            function lm.
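
            The gradient descent update summarized above can be sketched in Python as follows.
            This is an illustration under stated assumptions, not a definitive implementation:
            the data, the learning rate of 0.05, the number of steps, and the function name
            gradient_descent are all hypothetical, and the error is taken to be the mean squared
            error for f(x)=a*x+b.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Estimate a and b in f(x) = a*x + b by minimizing the mean squared error."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error with respect to a and b.
        grad_a = (2.0 / n) * sum((a * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((a * x + b - y) for x, y in zip(xs, ys))
        # Move against the gradient, that is, in the direction of steepest descent.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Hypothetical samples lying close to the line y = 2*x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

a, b = gradient_descent(xs, ys)
print(round(a, 4), round(b, 4))
```

            The learning rate matters: if it is too large for the data at hand, the updates
            overshoot and the parameters diverge instead of converging.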



            Problems


                   1.  Cloud storage cost prediction: Our software application generates data on a
                      monthly basis and stores this data in cloud storage together with the data from
                      the previous months. We are given the following bills for the cloud storage and
                      we would like to estimate the running costs for the first year of using this cloud
                      storage:







