Page 157 - Data Science Algorithms in a Week
Regression
Bratislava Berlin 517 1h 15m 2.250
Vienna Dublin 1686 2h 50m 2.833
Vienna Amsterdam 932 1h 55m 1.917
Amsterdam Budapest 1160 2h 10m 2.167
Bratislava Amsterdam 978 ? ?
Analysis:
We can reason that the flight duration consists of two parts: the first is the time needed
for takeoff and landing; the second is the time the airplane spends travelling through the
air at cruising speed. The first part is roughly constant. The second part is proportional
to the distance flown, because we assume that the cruising speed is similar across all the
flights in the table. Therefore, the flight time can be expressed as a linear function of
the flight distance.
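The reasoning above can be written as a short formula. The symbols are our own notation rather than the book's: $t$ is the flight time in hours, $d$ the distance in km, $t_0$ the assumed constant takeoff and landing overhead, and $v$ the assumed common cruising speed in km per hour.

```latex
\[
  t \;=\; t_0 + \frac{d}{v} \;=\; t_0 + \frac{1}{v}\, d
\]
```

Under these assumptions the intercept of a fitted line estimates $t_0$ and the slope estimates $1/v$, so the reciprocal of the slope gives an estimate of the cruising speed.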
Analysis using R:
Input:
source_code/6/flight_time.r
flights = data.frame(
distance = c(365,1462,1285,1096,517,1686,932,1160),
time = c(1.167,2.333,2.250,2.083,2.250,2.833,1.917,2.167)
)
model = lm(time ~ distance, data = flights)
print(model)
Output:
$ Rscript flight_time.r
Call:
lm(formula = time ~ distance, data = flights)
Coefficients:
(Intercept)     distance
  1.2335890    0.0008387
According to the linear regression, the combined takeoff and landing time for an average
flight is about 1.2336 hours (the intercept), or roughly 74 minutes. Each additional km of
distance takes 0.0008387 hours (the slope); in other words, the estimated speed of the
airplane is 1/0.0008387, or about 1192 km per hour. The actual typical speed of an
airplane on short-distance flights like the ones in the table is about 850 km per hour. This
leaves room for improvement in our estimate (refer to exercise 6.3).
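As a minimal sketch, the fitted model can also be used to fill in the unknown duration of the 978 km Bratislava to Amsterdam flight and to recompute the speed implied by the slope. The variable names `new_flight`, `predicted_time`, and `implied_speed` are illustrative and do not come from the book's source code; `predict` and `coef` are standard R functions.

```r
# Same data and model as in flight_time.r.
flights = data.frame(
  distance = c(365, 1462, 1285, 1096, 517, 1686, 932, 1160),
  time = c(1.167, 2.333, 2.250, 2.083, 2.250, 2.833, 1.917, 2.167)
)
model = lm(time ~ distance, data = flights)

# Predicted duration in hours for the 978 km flight
# (new_flight and predicted_time are illustrative names).
new_flight = data.frame(distance = 978)
predicted_time = predict(model, newdata = new_flight)
print(predicted_time)

# Speed implied by the slope: 1 / (hours per km) gives km per hour.
implied_speed = 1 / coef(model)["distance"]
print(implied_speed)
```

With the coefficients reported above, the prediction is 1.2335890 + 0.0008387 * 978, which is roughly 2.05 hours, and the implied speed is the same 1192 km per hour quoted in the text.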