## Methodology

**Combined Travel Demand and Route Choice Models**

Predicting traffic in a road network requires solving two closely related problems: the matrix estimation (ME) problem, which estimates the demand for travel, and the trip assignment (TA) problem, which predicts how those trips are routed through the network. The TA problem takes the travel demand, expressed as an origin-destination (O-D) matrix, as input and estimates the traffic volumes, whereas the ME problem estimates the O-D matrix, often using traffic volumes as inputs.

Trip demands are sometimes available directly from household surveys, but in many cases an up-to-date survey is not available. However, given an initial “guess” for the O-D matrix, we can often obtain a good estimate of the actual trip matrix by using traffic volume information to refine that guess. The initial guess is usually an outdated or subjectively estimated O-D matrix. Available methods for solving the ME problem include least-squares, entropy-based and statistical methods. The approach used in the current study is inspired by the Bayesian network approach of Castillo et al. (2008), which is described in more detail later.

If the TA and ME problems are treated sequentially, the estimates of the O-D flows and the link flows can be inconsistent with each other. Therefore, the two problems are often combined into a single problem and solved using bilevel or decomposition approaches. By iterating between the two problems (using any available method for each), we eventually converge on a solution that satisfies both.
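The iteration between the two problems can be illustrated with a deliberately small toy case. The sketch below is not the method used in this study: it assumes a single O-D pair served by two routes with linear travel costs, a traffic count available on route 1 only, and a simple rescaling rule for the ME step. All function names and parameter values are hypothetical.

```python
def assign_two_routes(demand, a1, b1, a2, b2):
    """TA step (toy): user-equilibrium split of one O-D demand over two
    routes with assumed linear costs t_i = a_i + b_i * x_i."""
    # Equal-cost condition a1 + b1*x1 = a2 + b2*(demand - x1), solved for x1.
    x1 = (a2 - a1 + b2 * demand) / (b1 + b2)
    x1 = min(max(x1, 0.0), demand)  # keep the flow feasible
    return x1, demand - x1


def estimate_demand(count_route1, current_demand, x1):
    """ME step (toy): rescale the demand so that the share currently
    assigned to route 1 reproduces the observed count on route 1."""
    if x1 <= 0.0:
        return current_demand  # route 1 unused, the count carries no information
    share = x1 / current_demand
    return count_route1 / share


def solve_combined(count_route1, demand0, a1, b1, a2, b2,
                   tol=1e-9, max_iter=100):
    """Alternate the TA and ME steps until the demand stops changing."""
    demand = demand0
    for _ in range(max_iter):
        x1, _ = assign_two_routes(demand, a1, b1, a2, b2)
        new_demand = estimate_demand(count_route1, demand, x1)
        if abs(new_demand - demand) < tol:
            return new_demand
        demand = new_demand
    return demand


# Hypothetical inputs: initial guess of 500 trips, 600 vehicles counted on route 1.
demand = solve_combined(600, 500, a1=10, b1=0.01, a2=12, b2=0.02)
flows = assign_two_routes(demand, 10, 0.01, 12, 0.02)
```

With these assumed values the loop settles on a demand of about 800 trips, split roughly 600/200 between the routes, at which point both routes have equal cost and the assigned flow on route 1 matches the count. Convergence of such a fixed-point scheme is not guaranteed in general; it holds here because of the chosen cost parameters.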

In the current study, no survey data were available to generate the initial O-D matrix. Furthermore, Andorra has a population of 80,000 but attracts 8 million tourists per year. The highly irregular and seasonal travel patterns in Andorra therefore cannot be captured using static methods such as surveys. However, several months of high-resolution geolocated telecoms data were available, and these could be used to generate initial ‘guesses’ for the O-D matrix for any time period and division of zones.

**RNC Telecoms Data**

Through a partnership with Andorra Telecom, three months of Radio Network Controller (RNC) data were available. The RNC keeps track of the locations of devices as they move around the coverage area. Each time a connected device interacts with the network (call, text or cellular data), moves from one cell to another or goes unobserved for 90 minutes, a record is made of the subscriber ID, the timestamp, the coordinates of the device and the home network of the subscriber.

The initial O-D matrices could be extracted from the RNC data by first identifying stay-points (Li et al., 2008), then mapping the stay-points to Traffic Analysis Zones (TAZs), and finally scanning each device's sequence of TAZs for transitions from one zone to another. For each transition found, a trip was added to the O-D matrix for the appropriate time period.
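The final scanning step can be sketched as follows. This is a minimal illustration rather than the study's implementation: it assumes stay-points have already been detected and mapped to TAZs, that each stay is a hypothetical `(arrival, departure, taz)` tuple with times in seconds, and that a trip is attributed to the time period in which the device left the origin zone.

```python
from collections import Counter


def build_od_matrix(stays_by_device, period_length=3600):
    """Count trips between consecutive stays in different TAZs.

    stays_by_device: dict mapping a device ID to a list of
        (arrival, departure, taz) tuples (assumed format).
    period_length: length of a time period in seconds (assumed hourly).
    Returns a Counter keyed by (period, origin_taz, destination_taz).
    """
    od = Counter()
    for stays in stays_by_device.values():
        ordered = sorted(stays)  # chronological order by arrival time
        # Compare each stay with the one that follows it.
        for (_, departure, origin), (_, _, destination) in zip(ordered, ordered[1:]):
            if origin != destination:  # a change of zone counts as one trip
                period = int(departure // period_length)
                od[(period, origin, destination)] += 1
    return od


# Hypothetical input: two devices, zones labelled "Z1" and "Z2".
stays = {
    "device_a": [(0, 1800, "Z1"), (2400, 5000, "Z2"), (5600, 9000, "Z1")],
    "device_b": [(0, 1000, "Z1"), (1500, 4000, "Z2")],
}
od_matrix = build_od_matrix(stays)
```

Here both devices leave Z1 for Z2 during the first hour, and `device_a` returns to Z1 during the second hour, so the matrix holds two trips for `(0, "Z1", "Z2")` and one for `(1, "Z2", "Z1")`. Consecutive stays in the same zone add no trip.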