
Quantitative Techniques to Transport Planning
Chapter 7: Regression
Introduction
In the previous section we found out how to investigate and measure the degree of association between two variables. But we saw that a relationship does not imply a direct causal connection. Sometimes, however, there is a logical connection. For example:
In these cases we might well want to use the relationship to predict the value of one variable from another. We are going to look only at cases where there is a straight-line relationship. In order to present the relationship graphically, we need to fit a line through a set of points.
The first, and most important, thing to note is that there is no one best line through a set of points: you must decide what you want to use the line for.
Consider the set of points below, which represent a set of people plotted according to their height and weight. There is a clear pattern of the taller people being heavier.
What might we want to use this data for? Most commonly we use such data to predict a person's weight from their height.
But equally we could want to predict a person's height from their weight.
We need two different lines to do these two things. In the first case we are trying to predict weight, so we want our line to fit the points as closely as possible in the weight direction.
In the second case we are trying to predict height, so we need the fit in the height direction to be as close as possible.
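These two lines will be different: they will have different slopes and cut the y axis at different points. So we need to decide, right at the beginning, which variable we want to predict (it goes on the y axis) and which variable we are predicting it from (it goes on the x axis).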
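
The point that the two lines differ can be checked numerically. The sketch below is only an illustration: the heights and weights are made-up values, not the data plotted on this page, and the helper name `fit_line` is hypothetical. It fits one line to predict weight from height and another to predict height from weight, then redraws the second on the same axes as the first so the slopes can be compared.

```python
# Hypothetical (made-up) heights in cm and weights in kg, for illustration only.
heights = [150.0, 160.0, 165.0, 170.0, 180.0, 190.0]
weights = [52.0, 58.0, 66.0, 64.0, 80.0, 85.0]


def fit_line(x, y):
    """Return (intercept, slope) of the line y = a + b*x that minimises
    the sum of squared deviations measured in the y direction."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Line 1: predict weight from height (fit is closest in the weight direction).
a_w, b_w = fit_line(heights, weights)

# Line 2: predict height from weight (fit is closest in the height direction).
a_h, b_h = fit_line(weights, heights)

# Redraw line 2 on the axes of line 1 (height on x, weight on y) by rearranging
# height = a_h + b_h * weight into weight = -a_h / b_h + (1 / b_h) * height.
slope_2_on_same_axes = 1.0 / b_h
intercept_2_on_same_axes = -a_h / b_h

print(f"Line 1: weight = {a_w:.2f} + {b_w:.3f} * height")
print(f"Line 2: weight = {intercept_2_on_same_axes:.2f} "
      f"+ {slope_2_on_same_axes:.3f} * height")
```

Unless the points lie exactly on a straight line, the two printed slopes and intercepts will not agree, which is why the choice of which variable to predict has to be made before fitting.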
Having decided which variable goes on which axis, we can draw our scatter chart and then fit the line through the points. By far the easiest way of doing this is by eye. Move a (preferably clear plastic) ruler over the points until you think they are balanced equally on either side. A rather more accurate way is to fit the least squares regression line. This is the one line through the set of data that minimises the sum of the squares of the y deviations of the points from the line.
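
As a rough check of what "minimises the sum of the squares of the y deviations" means in practice, the sketch below compares a least squares line with a stand-in for a line fitted by eye. Everything specific here is an assumption made for illustration: the data values are made up, the function names are hypothetical, and the "by eye" line is simply taken to be the line through the first and last points.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the line minimising squared y deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def sum_sq_y_dev(x, y, intercept, slope):
    """Sum of squared vertical (y) distances between the points and the line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical (made-up) heights in cm and weights in kg, for illustration only.
heights = [150.0, 160.0, 165.0, 170.0, 180.0, 190.0]
weights = [52.0, 58.0, 66.0, 64.0, 80.0, 85.0]

# Least squares line for predicting weight (y) from height (x).
a, b = least_squares_line(heights, weights)
best_sse = sum_sq_y_dev(heights, weights, a, b)

# Stand-in for a line fitted by eye: the line through the first and last points.
eye_slope = (weights[-1] - weights[0]) / (heights[-1] - heights[0])
eye_intercept = weights[0] - eye_slope * heights[0]
eye_sse = sum_sq_y_dev(heights, weights, eye_intercept, eye_slope)

print(f"least squares line: sum of squared y deviations = {best_sse:.2f}")
print(f"'by eye' line:      sum of squared y deviations = {eye_sse:.2f}")
```

Any other straight line through the same points, however carefully placed, gives a sum of squared y deviations at least as large as that of the least squares line.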
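This ‘best fit’ line is the one that minimises the sum of squared y deviations, $\sum_i (y_i - a - b x_i)^2$, where the line is written $y = a + bx$ and $(x_i, y_i)$ are the plotted points.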
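
For reference, the least squares line can be written in closed form. The notation is an assumption and is not taken from the original page: the line is $y = a + bx$, the points are $(x_i, y_i)$ for $i = 1, \dots, n$, and $\bar{x}$ and $\bar{y}$ are the means of the x and y values. Setting the partial derivatives of the sum of squared y deviations $S(a, b)$ with respect to $a$ and $b$ to zero gives the slope and intercept:

```latex
\begin{align*}
S(a, b) &= \sum_{i=1}^{n} \bigl(y_i - a - b\,x_i\bigr)^2 \\
b &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
          {\sum_{i=1}^{n} (x_i - \bar{x})^2} \\
a &= \bar{y} - b\,\bar{x}
\end{align*}
```

Because $a = \bar{y} - b\,\bar{x}$, the fitted line always passes through the point of means $(\bar{x}, \bar{y})$.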