mvstat.net Tarn Duong Research Software CV

Contents

  1. Background
  2. How to run a simulation
  3. Results from a simulation
  4. How to create an interface between Splus/R and C
  5. References

Background

Realistic stochastic traffic models require intensive and extensive computation. Splus/ R are high level statistical programming languages that make it easy to implement these complex models. However the downside is, due to their user-friendliness (and compiled nature), they are not the most efficient languages and not entirely feasible for realistically sized traffic network simulations.

As a compromise then, the most computationally intensive functions can be written in C (which is a more efficient programming language) to improve performance and leaving the rest of the (non-performance critical) functions in Splus/ R. The C code is dynamically loaded at run-time and called by the appropriate Splus/ R functions. This way usability is maintained AND efficiency improved.

Now this is pretty much standard: what is different about this software is that it you can control amount and type of traffic assignment. By changing the amount and type of reassignment, you can run simulations in much less time. There are two types of reassignment:

  • explicit - this is where, for a given traveller, a set of costs are generated for all links in the network and a shortest path is then found
  • implicit - after a certain number of paths have been assigned explicitly, they then can then be used to compute an emprical estimate of the assignment probabilities. Then the rest of the travellers can now be assigned to a path using these empirical probabilities. This is done by randomly selecting (with equal probability) from the explicitly assigned paths, and assigning this to a given traveller.
The important points to note about reassignment are:
  • Decreasing the amount of total reassignment (explicit + implicit) gives a smoother link flow time series and decreases the time required for a simulation. A smooth-ish time series is probably more realistic than a more jagged one as travellers have their own inertia when they choose their travel paths each day.
  • For a given level of total reassignment, increasing the amount of implicit reassignment or equivalently decreasing the amount of explicit reassignment, also decreases the time required for a simulation. However this also decreases the accuracy of the results so there is a trade-off here between time and accuracy.
To ensure a minimum level of accuracy, a threshold is used. All travellers to be reassigned are randomly numbered 1, 2, 3, ... and all those that exceed the threshold are implicitly reassigned whereas those that don't are explicitly reassigned. This way, at least the threshold number of travellers are explicitly reassigned. This also means that implicit reassignment is not carried out if the total number of reassigned travellers is less than the threshold.

Please note that a label correcting algorithm is used here to find the shortest paths. This particular form of the algorithm uses the 'forward star' form of a network. Please see Urban Transportation Networks by Sheffi for further details.

How to run a simulation

Preparing the data

You must set up the datasets as the code expects it. The main input datasets are:
  • network matrix
  • demand/ origin-destination (OD) matrix
  • node coordinates matrix (if you wish to plot the network flows)
  • vector of memory weights

Network matrix

The network is an n_L x 6 matrix where n_L is the number of links in the network with respective columns:
  • From node
  • To node
  • BPR cost parameter - constant
  • BPR cost parameter - flow coefficient
  • BPR cost parameter - power
  • standard deviation

Memory weights vector

The memory weights vector is of length n_W where n_W is the number of memory weights. These weights must sum to 1. In this example, the 5 memory weights are all equal to 0.2.

Using the Splus/ R functions

External functions

In this section, the functions that you will be using are described (in a format similar to the Splus/ R help).
  • assign.path - assigns shortest paths to all travellers
  • change.cost - change BPR cost parameters or standard deviation for a link
  • plot.flow.hist - plot flow history for one link
  • plot.network - plot current flows for entire network
  • prepare.network - prepares the network, demand, node co-ordinates and memory weights
  • total.mean - computes the totals and mean link flow and total cost?????
The function prepare.network is used first before change.cost and assign.path can be used. The remaining functions (...) are used after assign.path has completed.

prepare.network

Once you have set up the datasets as detailed in the previous section, prepare.network then converts the network to its forward star form, the demand and node-coordinate matrices into lists. The function header is
prepare.network(network, demand, node.coord, memory)
It takes the network, demand matrix, node coordinate matrix and the memory weights and returns a list of the appropriate data structures which is henceforth called the network specification.

assign.path

The network specification is used by assign.path to assign travellers their minimum cost or shortest paths. The function header is (with the default values)
assign.path(net.spec, err = 'a', n.iter = 100, graph = 'n', scale = 100,
    save.name = "save", default.init = 'y', spath.table.init,
    flow.mem.init, append = 'n', flow.hist.init, thresh = 100, ...)

where
    net.spec - network specification
    err - type of error structure
      'a' - additive i.e. cost + error (default)
      'm' - multiplicative i.e. cost * (1 + error)
    n.iter - no. of iterations
    thresh - threshold for implicit reassignment (default = 100)
      if traveller <= threshold then explicit reassign
      if traveller > threshold then implict reassign
    graph
      'y' to plot graph of network of link flows
      'n' don't plot link flows (default)
    scale - line width in plot = 1 + flow/ scale (default = 100)
    save.name - filename where output is saved (default = "save.Rdata")
    default.init
      'y' for default initialisation i.e. first n_W link flows and shortest paths are all zero
      'n' for user-specified initialisation
    spath.table.init - initial (user-specified) shortest paths
    flow.mem.init - initial (user-specified) link flows
    append
      'y' appends current flow history to flow.hist.init
      'n' start new flow history
    flow.hist.init - initial (user-specified) flow history
This function then returns a list with fields (you mostly will only need to look at the flow history - the other fields are used internally):
    flow.hist - link flow history (ie flows for all links for all iterations)
    spath.table - shortest paths taken by each traveller
    flow.mem - previous n_W link flows
    cost.mem - previous n_W measured costs

plot.network

If graph is equal to 'y' is assign.path then a plot of the flow for each iteration is shown. This is recommended as it helps visualise the data and it doesn't decrease the speed of the simulation significantly. The graph has the following features
  • nodes are circled numbers
  • the link flows are drawn in red
  • arrows indicate the direction of the traffic flow
  • the thickness of the arrow is directly proportional to the link flow
  • dotted lines indicate no link, in that direction, connects the nodes
An example of a plot is

If graph = 'n' then a message is printed in the command line stating which iteration the function is up to.

The function header is

plot.network(net.spec, scale = 100, flow, ...)
where
    net.spec - network specification
    scale - line width in plot = 1 + flow/ scale (default = 100)
    flow - link flows (from one iteration) to be ploted

plot.flow.hist

This function plots the flow history (output from assign.path) of a particular link.

The function header is

plot.flow.hist(net.spec, link, flow.hist,...)
where
    net.spec - network specification
    link - link in the form of (from node, to node) eg. c(1, 5) is the link from node 1 to node 5
    flow.hist - link flow history

change.cost

This function was written to make easy changing the BPR cost parameters or the standard deviation corresponding to a link. The function header is
change.cost(net.spec, link, const.new = NULL, coeff.new = NULL,
    power.new = NULL, sd.new = NULL)
    net.spec - network specification
    link - link eg. c(1, 5)
    const.new - new BPR constant parameter
    coeff.new - new BPR floe coefficient parameter
    power.new - new BPR power parameter
    const.new - new standard deviation

total.mean

This function takes the flow history and calculates the total cost on the network for each iteration and the mean total cost over all iterations. The total and mean link flows (over time) are also calculated. The function header is
total.mean(net.spec, flow.hist, ...)
where
    net.spec - network specification
    flow.hist - flow history
This function then returns
    cost.tot - total cost on network
    cost.mean - mean total cost on network
    flow.tot - total link flow on network
    flow.mean - mean link flow on network

Internal functions

In this section, the internal functions ie. these you will probably will not need to use. They have been provided for completeness.
  • bpr.cost - calculates the cost using the BPR parameters and the link flow
  • flow.comp - computes the link flows
  • link.num - finds the link number corresponding to the link: (from node, to node)
  • probit.error - adds (independent normal) errors according to the probit model
  • spath - finds the shortest path (for one traveller) through the network (written in C and Splus/ R)

Results from simulations

Example 1

A 3 x 3 grid is the test network. The cost structure is very simple: each link had the same BPR cost parameters ie. const = 10, coefficient = 0.0025 and power = 2 so the cost function for day i is
    cost(flowi) = 10 + 0.0025 * flowi2
where flow_i is the flow on day i. The memory weights are w1 = w2 = w3 = w4 = w5 = 0.2 which means that each travellers puts equal emphasis on the past 5 days. So the measured cost on day i is
    measured cost = w1 * cost(flowi-1) + w2 * cost(flowi-2) + w3 * cost(flowi-3) + w4 * cost(flowi-4) + w5 * cost(flowi-5)
for i > 5. Each link has an additive error so the
    personal cost = measured cost + error
where error ~ N(0, 9). The demand is 10 travellers for OD-pair (1, 9) and another 10 for OD-pair (5, 1). All travellers are explicitly reassigned on each day. The plot of the link flows on the 500th day is below. The red arrows indicate the direction and amount of flow on each link. The dotted lines indicate that there is no link in that direction.

On the 501st day, road works on link (1, 4) closes it off to traffic (in the south direction only). After another 500 iterations, the flow is shown below. Note that he traffic that used to flow on link (1,4) is now diverted to link (1, 2):

The flow history time series for link (2, 5) shows the break at iteration 500 where link (1, 4) was closed off:

The first 500 iterations took 94 s on a Celeron 550 MHz PC. The second 93 s. As there were 20 travellers in total, this equates to 10000 shortest paths.

Example 2

In this example, we use a real life network of the Leiceister RA area to display the more advanced features of the software, ie. in particular the way travellers are explicitly or implicitly reassigned. Here is a picture of the network:
There are 30 OD pairs in the demand matrix with a total of 3827 travellers. The error on the link costs is a 10% multiplicative:
    personal cost = measured cost * (1 + error)
where error ~ N(0, 0.01). The memory weights are again w1 = w2 = w3 = w4 = w5 = 0.2. Unfortunately the data files are confidential so it can't be provided to you. In the cells in the table below, are the times that each simulation took to run through 100 iterations, as well as the mean link flow for link (13, 8):
      Proportion of reassignment
    Threshold 1 0.5 0.2
    100 1454 s
    426.42
    1007 s
    427.15
    557 s
    429.47
    Infinite 2754 s
    427.05
    1452 s
    423.02
    571 s
    429.86
An infinite threshold just means that no implicit reassignment is performed hence all travellers are explicitly reassigned. Looking at the results in the table, there are two notable trends:
  • simulations run faster as the proportion of total reassignment decreases - this is sensible as fewer shortest paths have to be explicitly enumerated
  • simulations run faster as the threshold decreases - again because fewer shortest paths have to be explicitly enumerated
  • time savings from implicit reassignment are greater when the proportion is total reassignment is close to 1 - this is expected as more travellers will exceed the threshold than when this proportion is close to 0

How to create an interface between Splus/R and C

Under Windows

Under Windows, shared libraries are also known as dynamic link libraries (DLLs) and have a .dll file extension.

The C compiler used was LCC Win32 - it is available free on the Internet. It is pretty nifty in that it can create DLLs in a single integrated project. I have also included instructions that can be used to create DLLs manually - these instructions can be easily(?) adapted to other C compilers.

Automatically creating DLLs

  1. Create a project. Suppose you call this project file. By default, LCC Win32 expects that the source file is called file.c and will create a DLL called file.dll. Next, on the "Definition of a new project" screen, remember to click the radio button for the dynamic link library in the bottom right hand corner. Select the directories where LCC Win32 expects input and output.
  2. On the next screen, don't use the application wizard. When you click "No" the next screen is to add source files. If you haven't written any source files, click cancel and continue with an empty project. You can add (and validate) files later, before compilation. Otherwise add and validate your source files (all the defaults should be OK.)
  3. For each function that you want to export (make visible) to Splus/ R, __declspec(dllexport) should be added to the function declaration to prompt the C compiler to export it. For example if the usual function declaration is
    void hello(int *world)
    then the exporting version is just
    void __declspec(dllexport) hello(int *world)
    Note:
    • There are two underscores in __declspec(dllexport)
    • Each exported function must return type void
    • All parameters must be pointers (eg. int *).
  4. Then just compile. LCC Win32 will create the DLL in the directory specified in step 1. It will also create an object file file.obj and export file file.exp (this second file contains all the names of the exported functions).

Manually creating DLLs

These instructions are heavily borrowed from the website for creating DLLs (for Gauss) written by Thierry Roncalli.
  1. Cut and paste the dll.tpl file (found in, say, C:\lcc\lib\wizard) before the actual function code.

    For each function that you want to export (make visible) to Splus/ R, __declspec(dllexport) should be added to the function declaration to prompt the C compiler to export it. For example if the usual function declaration is

    void hello(int world)
    then the exporting version is just
    void __declspec(dllexport) hello(int *world)
    Note:
    • There are two underscores in __declspec(dllexport)
    • Each exported function must return type void
    • All parameters must be pointers (eg. int *).
    • Make sure to #include <windows.h>
  2. Remember to set the PATH environment variable in your autoexec.bat e.g.
    SET PATH = %PATH%; C:\LCC\BIN
    if the LCC Win32 binary .exe file is in the directory C:\LCC\BIN (This only needs to be done once and you may need to restart your computer for Windows to update the PATH variable).
  3. Move into the directory with the source code file.c is. The following commands will produce a DLL.
    • Compile the file with
      lcc file.c
    • Build the DLL with
      lcclnk -dll file.obj
    This will create an object file file.obj and export file file.exp (this contains all the names of the exported functions).

Under Unix

R

To create a shared library of file.c simply type in the command line
R CMD SHLIB file.c
This will automatically create the object file (file.o) and the shared library (file.so). Then just run R as usual.

Splus

I have used Splus 5 for this (mostly because the machines that have older versions of Splus installed on them don't support shared libraries), the process is much much simpler. Just type in
Splus5 CHAPTER
in the command line, in the directory where file.c is. Then type in
Splus5 make
which creates the shared library (S.so) as well as the makefile (makefile) and the object file (file.o). Please see the page by the Ohio State University Statistics Department for more details.

Dynamically (un)loading shared libraries into Splus/ R

Under Windows

After creating a DLL, you then need to load into Splus/ R. There is static loading (see Venables & Ripley). However dynamic loading is preferred as it is more time and memory efficient. The are two Splus/ R commands involved in using DLLs
  1. dyn.load is used to dynamically load the DLL
  2. .C is used to call the C function in the DLL
For example, to load file.dll then use
dyn.load("C:\myfiles\file.dll")
If the C function name is hello then in the file.exp it's known as _hello. This underscore is important! To check that hello has been loaded use
is.loaded(symbol.C("_hello"))
Suppose the C function is hello(int *world) and that it increments the value of the world parameter. Then to call this function in Splus/ R:
world <- 5
.C(symbol.C("_hello"), as.integer(world))
returns
[[1]]
6
Hence, when using .C, a list of the modified values of all parameters that are passed into a C function is returned to Splus/ R.

To ensure that any difference between Splus/R and C in storing variables are correctly converted, use the correspondence outlined in the table below (adapted from Venables & Ripley and Writing R Extensions).

Splus/ R storage mode C type
"logical"
"integer"
"double"
"character"
"complex"
long *, int *
long *, int *
double *
char **
Rcomplex * (contained in R.h)

To dynamically unload any loaded shared libraries, just use

dyn.unload("C:\myfiles\file.dll")

Under Unix

R

The commands are similar to the Windows ones.
  1. dyn.load is used to dynamically load the DLL
  2. .C is used to call the C function in the DLL
For example, to load file.so then use
dyn.load("file.so")
To check that hello has been loaded use
is.loaded(symbol.C("hello"))
Note there is no underscore like in Windows.

Suppose the C function is hello(int *world) and that it increments the value of the world parameter. Then to call this function in Splus/ R:

world <- 5
.C(symbol.C("hello"), as.integer(world))
returns
[[1]]
6
Hence, when using .C, a list of the modified values of all parameters that are passed into a C function is returned to Splus/ R.

The type casting applies in the same way as for Windows.

To dynamically unload any loaded shared libraries, just use

dyn.unload("file.so")

Splus

This no longer needs to be done explicitly in Splus 5. Just start a normal Splus session in the directory where the shared library is stored and it will be automatically loaded. Easy! And there is no need to dynamically unload shasred libraries. (Though you still need to keep in mind the Splus to C conversion of types in the above table.)

References

Venables, W. N. & Ripley, B. D. (1994) Modern applied statistics with S-Plus (2nd ed.) New York: Springer Verlag.

R Development Core Team (2000) Writing R extensions http://cran.r-project.org

Becker, R. A., Chambers J. M., Wilks, A. R. (1988) The new S language California: Cole Advanced Books & Software

Navia, J. (1998) LCC-Win32 user's manual http://www.cs.virginia.edu/~lcc-win32/

Roncalli, T. (?) Creating DLLs for GAUSS using LCC-Win32 http://www.city.ac.uk/cubs/ferc/thierry/gdll1.html

Sheffi, (1985) Urban Transportation Networks Englewood Cliffs: Prentice-Hall.

Department of Statistics, The Ohio State University, (?) Dynamic Loading in Splus http://www.stat.ohio-state.edu/department/docs/statpack/dynload.html


Document originally written by T. Duong in 2001, cosmetic updates June 2010.