Appendix B Discrete choice models

Discrete choice models have been employed widely in travel demand analysis since the 1970s, with the most common application being in the choice of travel mode. Modal split is the relative proportion of travellers or shippers using one particular mode compared with the other available modes. Most models of modal choice use a ‘utility’ function representation of the attributes of the different modes and of the travellers or shippers as a main set of independent (explanatory) variables. The utility function is usually a weighted sum of the modal and personal attributes considered (such as travel time and reliability, travel cost, service frequency and socio-economic characteristics). The simplest choice models consider only two alternatives (for example, Mode A and Mode B) and are known as binary choice models. In general terms, a binary model can be expressed as:

$\frac{p_A}{p_B} = F(U_A, U_B)$

where p_A and p_B are the probabilities of choosing modes A and B, U_A and U_B are the utility functions for modes A and B, and F(U_A,U_B) is some suitable function. The models are often expressed for one of the modes only, for example, as:

$p_A = f(U_A, U_B)$

and they can be extended to include more than two alternatives in the choice set. The discrete choice models are often termed ‘behavioural’ because they can represent causality in that they can be derived from a theory that explicitly maps out the decision-making processes of the individual taking the decision. The theoretical basis is usually that of utility maximisation. It assumes the utility an individual ascribes to an alternative is defined by a utility function in which the attributes of the alternative and characteristics of the individual are determining factors. The choice of a particular alternative is made on the basis of comparing the levels of utility derivable from each of the available alternatives.

Of necessity, the models estimate the probability that an individual, in a given situation, will choose a particular alternative, rather than predicting the definite selection of a preferred alternative. Assume that an individual i can choose one alternative r from a set of K available alternatives and that the utility of alternative r for that individual is given by U_ri. Alternative r will then be chosen if:

$U_{ri} \geq U_{ki} \quad \text{for all } k \in K,\ k \neq r$

Given that there will almost always be some uncertainty concerning the specification of the utility function – because of measurement errors, omission of unobservable attributes and other specification errors – utility functions have a random component. Thus it is only possible to determine the probability that a given alternative will be chosen. We can represent the utility function U_ri as:

$U_{ri} = V_{ri} + \varepsilon_{ri}$

where V_ri is the deterministic part of the utility function and ε_ri is the random part. Then the probability that an individual will select alternative r can be written as:

$p_{ri} = \Pr(U_{ri} \geq U_{ki}) = \Pr(V_{ri} + \varepsilon_{ri} \geq V_{ki} + \varepsilon_{ki}) = \Pr(V_{ri} - V_{ki} \geq \varepsilon_{ki} - \varepsilon_{ri}) \quad \text{for all } k \in K,\ k \neq r$

Specific mathematical forms of the choice model then emerge depending on the assumptions adopted about the form of the joint distribution of the random errors ε_ki – ε_ri. If this distribution is assumed to be the normal distribution, then the choice probability model is the probit model. Unfortunately, the probit choice probabilities have no closed-form expression and are awkward to compute when there are more than two alternatives. As a result, the practice is to assume that the random errors are independently and identically distributed according to a Weibull (Gumbel, or type I extreme value) distribution, which approximates the normal distribution to some degree. The advantage of this assumption is that the resultant choice model is the multinomial logit model, which is mathematically tractable. The functional form of the multinomial logit model is:

$p_{ri} = \frac{\exp(U_{ri})}{\sum_{k \in K} \exp(U_{ki})}$
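As a minimal sketch of the multinomial logit formula above, the following Python function converts a set of utilities into choice probabilities. The function name `mnl_probabilities`, the mode labels and the utility values are illustrative assumptions and are not taken from any estimated model.

```python
import math

def mnl_probabilities(utilities):
    """Return multinomial logit choice probabilities.

    `utilities` maps each alternative's label to its utility value.
    """
    # Subtracting the largest utility leaves the probabilities unchanged
    # (only utility differences matter) but avoids overflow in exp().
    u_max = max(utilities.values())
    exp_u = {alt: math.exp(u - u_max) for alt, u in utilities.items()}
    total = sum(exp_u.values())
    return {alt: value / total for alt, value in exp_u.items()}

# Hypothetical utilities for three modes (illustrative numbers only).
example_utilities = {"car": -1.0, "bus": -1.5, "train": -1.2}
example_probabilities = mnl_probabilities(example_utilities)
for mode, p in example_probabilities.items():
    print(f"{mode}: {p:.3f}")
```

Because only differences in utility enter the formula, adding the same constant to every utility leaves the probabilities unchanged.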

The binomial form of this model – using the earlier notation of alternatives A and B – is:

$p_A = \frac{\exp(U_A)}{\exp(U_A) + \exp(U_B)} = \frac{1}{1 + \exp(U_B - U_A)}$

This function is such that if U_A and U_B are equal, then the probability of choosing each of the two alternatives is 0.5, while if U_A > U_B, then the probability of choosing A is greater than choosing B. The utility functions are generally weighted linear functions of the attributes. For example:

$U_{ri} = \alpha_r + \sum_{j=1}^{J} \beta_{rj} X_{rji} + \sum_{l=1}^{L} \gamma_l Y_{li}$

where α_r is an alternative-specific constant, the {β_rj, j = 1, …, J} are constant coefficients for the attributes {X_rji} (e.g. service variables) of the alternative and the {γ_l, l = 1, …, L} are constant coefficients for the attributes {Y_li} (e.g. socio-economic characteristics) of the individual decision‑maker i. If, for example, the utility function is:

$U = \alpha + \beta_1 \times \text{price} + \beta_2 \times \text{time} + \beta_3 \times \text{reliability}$

then dividing both sides of equation (7.3) by β₁ yields the money value of time (β₂/β₁) and the money value of reliability (β₃/β₁).
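The coefficient ratios described above can be illustrated with a short Python sketch. The function name `money_values`, the coefficient values and the units (price in dollars, time in hours, reliability as the percentage of services arriving on time) are hypothetical and chosen only for illustration.

```python
def money_values(beta_price, beta_time, beta_reliability):
    """Return the implied money values of time and reliability.

    Each value is the ratio of the attribute coefficient to the price
    coefficient, i.e. the price change that has the same effect on
    utility as a one-unit increase in the attribute.
    """
    if beta_price == 0:
        raise ValueError("price coefficient must be non-zero")
    return {
        "value_of_time": beta_time / beta_price,
        "value_of_reliability": beta_reliability / beta_price,
    }

# Hypothetical coefficients (not estimated from any data set).
values = money_values(beta_price=-0.05, beta_time=-0.60, beta_reliability=0.02)
print(values)
```

With these assumed coefficients, an extra hour of travel time has the same effect on utility as a price increase of about 12 dollars, while an extra percentage point of reliability is equivalent to a price reduction of about 0.40 dollars. The sign of each ratio indicates whether the attribute is valued as a cost or as a benefit.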

Coefficients in the utility functions are generally estimated from observed data sets using maximum likelihood techniques implemented in software packages such as LIMDEP.
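To indicate what maximum likelihood estimation involves, the sketch below fits a binary logit model with a constant and a single coefficient using Newton-Raphson steps on the log-likelihood. It is not the routine used by LIMDEP or any other package. The function names, the tiny data set and the fixed number of iterations are assumptions made for illustration, and a practical estimation would rely on established software with convergence checks and standard errors.

```python
import math

def log_likelihood(alpha, beta, x, y):
    """Log-likelihood of a binary logit with utility difference alpha + beta * x."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(alpha + beta * xi)))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

def fit_binary_logit(x, y, iterations=25):
    """Estimate (alpha, beta) by Newton-Raphson.

    x[i] is the attribute difference (mode A minus mode B) faced by
    individual i; y[i] is 1 if that individual chose A, otherwise 0.
    """
    alpha, beta = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(alpha + beta * xi)))
            w = p * (1.0 - p)
            # Gradient of the log-likelihood.
            g0 += yi - p
            g1 += (yi - p) * xi
            # Negative Hessian (information matrix).
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by hand.
        alpha += (h11 * g0 - h01 * g1) / det
        beta += (h00 * g1 - h01 * g0) / det
    return alpha, beta

# Hypothetical observations: travel time of A minus travel time of B
# (minutes) and whether mode A was chosen.
time_diff = [-20, -15, -10, -10, -5, -5, 0, 0, 5, 5, 10, 10, 15, 20]
chose_a = [1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0]

alpha_hat, beta_hat = fit_binary_logit(time_diff, chose_a)
print(alpha_hat, beta_hat, log_likelihood(alpha_hat, beta_hat, time_diff, chose_a))
```

In this made-up data set travellers tend to choose the faster mode, so the estimated time coefficient is negative.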

While the multinomial logit model is a powerful tool for understanding travel choices, it has some significant limitations. The most important of these is its reliance on the ‘Axiom of the Independence of Irrelevant Alternatives (IIA)’ (Luce 1959), which states that "if a set of alternative choices exists, then the relative probability of choice among any two alternatives is unaffected by the removal (or addition) of any set of other alternatives". This means that the ratio p_A/p_B is independent of the other alternatives available in the choice set. This property is the basis of the multinomial logit model. Unfortunately, while the model is attractive and easy to use, it really applies only in rather special circumstances and its use in practice can lead to certain anomalies (for example, the ‘red bus-blue bus’ anomaly^[1]). The general solution is to use nested logit models that present a hierarchy of choices in which the decision‑maker usually has to choose between no more than two alternatives at any point in the nested structure. Multinomial logit models are generally used at each of the decision points. An example of a nested logit model for mode choice is displayed in Figure 10.

Figure 10: Structure of a nested logit modal choice model

This model incorporates three broad modes of travel: car, public transport and non‑motorised transport. The car mode is split into car driver and car passenger. Public transport is split into three separate elemental modes, depending on the mode of access taken to use the transit services (note that this model does not distinguish between rail and bus services). Non‑motorised transport is split into bicycle and pedestrian modes. Douglas, Franzmann and Frost (2003) describe the development of similar discrete choice models for modal choice in Brisbane.
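A nested structure of this kind can be sketched in Python using the standard two-level nested logit formulation, in which a logsum term carries the attractiveness of each nest up to the choice among nests. This formulation is an assumption and is not necessarily the one behind Figure 10. The utilities and nest scale parameters are hypothetical, and the three public transport access modes are given the placeholder labels `pt_access_1` to `pt_access_3` because they are not named in the text.

```python
import math

def nested_logit_probabilities(nests, scale):
    """Two-level nested logit choice probabilities.

    nests: {nest_name: {alternative: utility}}
    scale: {nest_name: lam}, with 0 < lam <= 1; setting lam = 1 for
           every nest reproduces the multinomial logit model.
    Returns {alternative: probability}.
    """
    # Logsum (inclusive value) of each nest.
    logsum = {}
    for name, alts in nests.items():
        lam = scale[name]
        logsum[name] = math.log(sum(math.exp(v / lam) for v in alts.values()))
    # Upper level: choice among nests.
    denom = sum(math.exp(scale[name] * logsum[name]) for name in nests)
    probabilities = {}
    for name, alts in nests.items():
        lam = scale[name]
        p_nest = math.exp(lam * logsum[name]) / denom
        # Lower level: choice of an alternative within the nest.
        within = sum(math.exp(v / lam) for v in alts.values())
        for alt, v in alts.items():
            probabilities[alt] = p_nest * math.exp(v / lam) / within
    return probabilities

# Hypothetical utilities and scale parameters (illustrative only).
example_nests = {
    "car": {"car_driver": -0.5, "car_passenger": -1.2},
    "public_transport": {"pt_access_1": -1.0, "pt_access_2": -1.4, "pt_access_3": -1.8},
    "non_motorised": {"bicycle": -2.0, "pedestrian": -1.6},
}
example_scale = {"car": 0.6, "public_transport": 0.5, "non_motorised": 0.8}

nested = nested_logit_probabilities(example_nests, example_scale)
for alternative, p in nested.items():
    print(f"{alternative}: {p:.3f}")
```

A scale parameter below one means the alternatives inside a nest are treated as closer substitutes for each other than for alternatives in other nests, which is how the nested structure relaxes the IIA property across nests.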

The use of multinomial logit models in freight transport studies is illustrated in Wigan, Rockliffe, Thoresen and Tsolakis (1998). This study used the models to estimate the value of time spent in transit and reliability of arrival time for both long-haul and metropolitan freight.

More complicated, but more generally applicable, choice models are also available, such as the mixed logit models (see Louviere, Hensher & Swait 2000; Train, Revelt & Ruud 2004). While the standard logit model assumes the utility function coefficients are the same for the entire population (everyone has the same values of time and of other quality attributes), the mixed logit model allows the analyst to make the more realistic assumption that the coefficients vary across the population according to some distribution (such as uniform, normal or lognormal). The mixed logit model relaxes the assumption of IIA and allows correlation in the unobserved components of utility between alternatives. The model is, however, much more data intensive and computationally demanding.
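The idea of coefficients that vary across the population can be sketched by simulation: a coefficient is drawn repeatedly from its assumed distribution, logit probabilities are computed for each draw, and the results are averaged. The sketch below assumes a normally distributed time coefficient and a fixed cost coefficient. All names and numbers are hypothetical, and it shows only how probabilities are simulated, not how a mixed logit model is estimated.

```python
import math
import random

def logit_probabilities(utilities):
    """Standard logit probabilities for a list of utilities."""
    u_max = max(utilities)
    exp_u = [math.exp(u - u_max) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

def mixed_logit_probabilities(times, costs, beta_cost, beta_time_mean,
                              beta_time_sd, draws=5000, seed=0):
    """Simulated choice probabilities with a random time coefficient."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    totals = [0.0] * len(times)
    for _ in range(draws):
        # One draw of the time coefficient from its assumed distribution.
        beta_time = rng.gauss(beta_time_mean, beta_time_sd)
        utilities = [beta_cost * c + beta_time * t for c, t in zip(costs, times)]
        for k, p in enumerate(logit_probabilities(utilities)):
            totals[k] += p
    # Average the logit probabilities over the draws.
    return [t / draws for t in totals]

# Hypothetical attributes of two modes: time in minutes, cost in dollars.
example_times = [30.0, 45.0]
example_costs = [6.0, 3.0]
simulated = mixed_logit_probabilities(example_times, example_costs,
                                      beta_cost=-0.2, beta_time_mean=-0.05,
                                      beta_time_sd=0.03)
print(simulated)
```

A normal distribution allows some draws of the time coefficient to be positive. A lognormal distribution for the negative of the coefficient is one common way of keeping its sign fixed.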

It must be noted that reliance on the ‘independence of irrelevant alternatives’ axiom does not invalidate the multinomial logit model. It is a perfectly acceptable model as long as care is taken to ensure its basic assumptions are not violated in a given application.

One advantage of the multinomial logit model is that it is possible to derive point elasticity values from it. Considering the choice model defined by equations EQ 3.1 and EQ 3.2, it can be shown that the elasticity of the probability of choosing mode A for an individual with respect to changes in X_rji (the j^th independent variable of alternative r) is given by:

$\eta^{r}_{Aji} = [\delta_{Ar} - p_{ri}] \beta_{rj} X_{rji}$

where δ_Ar is a delta function defined as:

$\delta_{Ar} = \begin{cases} 0 & \text{if } A \neq r \text{ (cross elasticity)} \\ 1 & \text{if } A = r \text{ (direct elasticity)} \end{cases}$

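This result indicates that the direct elasticity for alternative A depends only on the attributes of that alternative, while only the attributes of the other alternative r enter the equation for the cross elasticities. The cross elasticity with respect to an attribute of alternative r is therefore the same for every other alternative, which is a consequence of the IIA property.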
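The elasticity formula can be checked numerically. The sketch below evaluates the formula for a three-mode multinomial logit model in which utility depends only on travel time, and compares it with a finite-difference estimate of the same elasticity. The mode labels, travel times and time coefficient are hypothetical.

```python
import math

def mnl(utilities):
    """Multinomial logit probabilities for a dict of utilities."""
    u_max = max(utilities.values())
    exp_u = {k: math.exp(u - u_max) for k, u in utilities.items()}
    total = sum(exp_u.values())
    return {k: e / total for k, e in exp_u.items()}

def utilities_from_times(times, beta_time):
    """Utility of each mode as beta_time * travel time (no constants)."""
    return {mode: beta_time * t for mode, t in times.items()}

def point_elasticity(p, a, r, beta, x):
    """Elasticity of the probability of `a` with respect to attribute x of `r`."""
    delta = 1.0 if a == r else 0.0
    return (delta - p[r]) * beta * x

def numerical_elasticity(times, beta_time, a, r, step=1e-4):
    """Central-difference check: (dP_a / dX_r) * (X_r / P_a)."""
    up = dict(times)
    up[r] = times[r] + step
    down = dict(times)
    down[r] = times[r] - step
    p_up = mnl(utilities_from_times(up, beta_time))[a]
    p_down = mnl(utilities_from_times(down, beta_time))[a]
    p_base = mnl(utilities_from_times(times, beta_time))[a]
    return (p_up - p_down) / (2 * step) * times[r] / p_base

# Hypothetical travel times (minutes) and time coefficient.
times = {"car": 25.0, "bus": 40.0, "train": 35.0}
beta_time = -0.05
p = mnl(utilities_from_times(times, beta_time))

direct_bus = point_elasticity(p, "bus", "bus", beta_time, times["bus"])
cross_car = point_elasticity(p, "car", "bus", beta_time, times["bus"])
cross_train = point_elasticity(p, "train", "bus", beta_time, times["bus"])
print(direct_bus, cross_car, cross_train)
```

In this example the direct elasticity of bus use with respect to bus travel time is negative. The cross elasticities of car and train use with respect to bus travel time are positive and equal to each other.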

Properly established discrete choice models are powerful decision-support tools for policy analysis and project evaluation. At a basic level, the models can be used to estimate changes in market shares across modes, or across alternative routes, if a project is undertaken that will reduce the price or improve one or more of the service quality attributes for one mode or route. The models may also be used to help a transport operator determine the price to charge that will maximise profits. At a more complex level, the models can be used to estimate the welfare gains to customers from an improvement in a service quality attribute, which is important for inclusion in benefit-cost analysis.

For further information on the development, estimation and application of discrete choice models, the determination of elasticity values from discrete choice models and the application of the models in project evaluation, see: Ortuzar and Willumsen (2011); Taplin, Hensher and Smith (1999); Hensher and Button (2000); and Louviere, Hensher and Swait (2000).

[1] Consider the case where travellers can choose between two modes – say car and bus – to make a given trip and further assume that the utility values for both of these modes are equal. Then there is a probability of 0.5 of using each mode. Now assume the public transport operator paints half of the buses red and the other half blue. There are now three apparent modes: car, red bus and blue bus. All modes still have the same utility values, so the ratio of choice probabilities between any pair of modes is unity and the overall modal split is (according to the multinomial logit model) one third car, one third blue bus and one third red bus. The proportion of travellers using cars has then decreased from 0.5 to 0.33, but nothing has actually changed except the colours of the buses. This is an illogical result and is due to the fact that the two bus modes are actually variations of the same choice alternative for the travellers – they are not truly independent alternatives.
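The arithmetic of the red bus-blue bus anomaly can be reproduced with a few lines of Python. The helper name `mnl` and the common utility value of zero are assumptions, since any equal utility value gives the same result.

```python
import math

def mnl(utilities):
    """Multinomial logit probabilities for a dict of utilities."""
    u_max = max(utilities.values())
    exp_u = {k: math.exp(u - u_max) for k, u in utilities.items()}
    total = sum(exp_u.values())
    return {k: e / total for k, e in exp_u.items()}

# Two modes with equal utility.
before = mnl({"car": 0.0, "bus": 0.0})
# The same bus service presented as two 'modes' with the same utility.
after = mnl({"car": 0.0, "red bus": 0.0, "blue bus": 0.0})

print("car share before:", before["car"])
print("car share after: ", after["car"])
print("car / red bus ratio after:", after["car"] / after["red bus"])
```

The ratio of the car probability to that of either bus 'mode' stays at unity, as IIA requires, so the car share is forced down from one half to one third.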