ABMUS 2019

State Estimation and Data Assimilation for an Agent-Based Model using a Probabilistic Framework


Nick Malleson, Luke Archer, Minh Kieu, Jonathan A. Ward, Alison Heppenstall, Christoforos Anagnostopoulos

University of Leeds and Improbable, UK

dust.leeds.ac.uk


These slides and abstract: https://urban-analytics.github.io/dust/presentations.html

How many people are there in Trafalgar Square right now?

We need to better understand urban flows:

Crime – how many possible victims?

Pollution – who is being exposed? Where are the hotspots?

Economy – can we attract more people to our city centre?

Health – can we encourage more active travel?

City Simulation

Understanding and predicting short-term urban flows

Problem: Models will Diverge

Uncertainty abounds

Inputs (measurement noise)

Parameter values

Model structure

Nonlinear models predict the near future well, but diverge from reality over time.
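
As a hedged illustration of this divergence (not the authors' model), the sketch below uses the logistic map as a stand-in nonlinear model and runs it twice from initial states that differ by a small measurement error. The map, the parameter r = 4, and the size of the error are all assumptions chosen for illustration.

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map, a stand-in nonlinear model (assumption)."""
    return r * x * (1.0 - x)


def run(x0, steps):
    """Return the trajectory [x_0, x_1, ..., x_steps]."""
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic_map(xs[-1]))
    return xs


truth = run(0.2, 50)
model = run(0.2 + 1e-6, 50)  # same model, slightly wrong initial state

errors = [abs(a - b) for a, b in zip(truth, model)]
early_error = max(errors[:5])   # largest error over steps 0-4
late_error = max(errors[30:])   # largest error over steps 30-50

print(f"max error, steps 0-4:   {early_error:.2e}")
print(f"max error, steps 30-50: {late_error:.2e}")
```

The two runs agree closely over the first few steps and become effectively unrelated later on, which is the behaviour that motivates correcting the model with observations as it runs.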

Possible Solution: Dynamic Data Assimilation

Used in meteorology and hydrology to constrain models so that they stay closer to reality.

Try to improve estimates of the true system state by combining:

Noisy, real-world observations

Model estimates of the system state

The combined estimate should be more accurate than the observations (or the model) in isolation.
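
A minimal worked example of combining the two sources, assuming both the model estimate and the observation are Gaussian. This is the standard precision-weighted (scalar Kalman) update, shown only to illustrate the idea; it is not the method used later in these slides, and the numbers are hypothetical.

```python
def combine(model_mean, model_var, obs_mean, obs_var):
    """Precision-weighted combination of a model estimate and an observation."""
    # The gain is close to 1 when the model is much less certain than the sensor.
    gain = model_var / (model_var + obs_var)
    post_mean = model_mean + gain * (obs_mean - model_mean)
    post_var = (1.0 - gain) * model_var
    return post_mean, post_var


# Hypothetical numbers: the model estimates 120 people (variance 400),
# a noisy sensor counts 150 people (variance 100).
mean, var = combine(120.0, 400.0, 150.0, 100.0)
print(f"combined estimate: {mean:.1f} people, variance {var:.1f}")
```

With these numbers the combined estimate is about 144 people with a variance of about 80, which is lower than the variance of either input.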

Diagram of dynamic data assimilation and an ABM

Current Question:

How much data are needed to successfully model a (pedestrian) system?

people in a train station

Example: Crowds in a train station

We want a real-time model to forecast short-term crowding

How much data do we need?

Counts of people entering?

Counts at various points in the concourse (e.g. cameras)?

Full traces of all individuals?

Hypothetical Train Station ('StationSim')

A bird's-eye view of the hypothetical train station

Crowding emerges from random choices of exit and maximum walking speed

Very simple, not designed to be a competitive crowd model

Markovian model

Produces output from a set of inputs (the ‘state vector’) without any other information.

The state vector is made up of all model variables

Model variables are the positions of all agents

Parameters are fixed

E.g. an agent's max speed and its chosen destination

This is easier for the DA algorithm

Later work will include the parameters in the state vector and get the DA algorithm to find suitable values
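
The split between state and parameters can be sketched as below. This is a hypothetical toy and not the StationSim code: agents simply walk towards their exit with no collisions, and the names `step`, `speeds` and `exits` are invented for illustration. The point is that the next state depends only on the current state vector (positions), while speeds and destinations stay fixed outside it.

```python
import math
import random


def step(state, speeds, exits):
    """Advance all agents one tick; the result depends only on `state`.

    state  : flat list [x0, y0, x1, y1, ...] of agent positions
    speeds : fixed per-agent maximum speed (a parameter, not in the state)
    exits  : fixed per-agent destination as (x, y) tuples (also a parameter)
    """
    new_state = []
    for i, speed in enumerate(speeds):
        x, y = state[2 * i], state[2 * i + 1]
        ex, ey = exits[i]
        dx, dy = ex - x, ey - y
        dist = math.hypot(dx, dy)
        if dist > 0:
            move = min(speed, dist)  # never overshoot the exit
            x += dx / dist * move
            y += dy / dist * move
        new_state.extend([x, y])
    return new_state


rng = random.Random(0)
n_agents = 3
speeds = [rng.uniform(0.5, 1.5) for _ in range(n_agents)]
exits = [(200.0, rng.choice([25.0, 75.0])) for _ in range(n_agents)]

# Agents start on the left edge (x = 0) at random heights.
state = []
for _ in range(n_agents):
    state.extend([0.0, rng.uniform(0.0, 100.0)])

for _ in range(10):
    state = step(state, speeds, exits)
print(state)
```

Keeping the parameters out of the state vector means the assimilation step only has to estimate 2 × (number of agents) values.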

Probabilistic Modelling

Use probability theory to express all forms of uncertainty

Synonymous with Bayesian modelling

Probabilistic Programming: "a general framework for expressing probabilistic models as computer programs" (Ghahramani, 2015)

By expressing the model probabilistically (i.e. with variables represented as probability distributions), we can explore the impacts of uncertainty and (importantly) begin to assimilate data.

(hopefully)

Modelling Framework

The modelling framework

1. Run StationSim to generate pseudo-truth data

2. Data assimilation framework

Run the model for N iterations up to time t (i.e. 'now')

Construct a Bayesian network to represent the state vector

This gives a prior estimate of the current state

Observe some 'current' pseudo-truth data and use the probabilistic model (with MCMC) to produce a posterior estimate of the state vector

Forecasts from this point should be more accurate.
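
The prior, observe, posterior sequence above can be sketched for a single state variable. This is a minimal sketch under strong assumptions: one agent's x-coordinate, a Gaussian prior standing in for the model's estimate at time t, a Gaussian observation error, and a hand-written Metropolis sampler in place of the probabilistic programming library (which these slides do not name). All numbers are hypothetical.

```python
import math
import random


def log_posterior(x, obs, prior_mean, prior_sd, obs_sd):
    """Unnormalised log posterior: Gaussian prior times Gaussian likelihood."""
    log_prior = -0.5 * ((x - prior_mean) / prior_sd) ** 2
    log_like = -0.5 * ((obs - x) / obs_sd) ** 2
    return log_prior + log_like


def metropolis(obs, prior_mean, prior_sd, obs_sd,
               n_samples=20000, burn_in=2000, step_sd=2.0, seed=1):
    """Random-walk Metropolis sampler for the posterior over x."""
    rng = random.Random(seed)
    x = prior_mean  # start the chain at the prior (model) estimate
    log_p = log_posterior(x, obs, prior_mean, prior_sd, obs_sd)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = x + rng.gauss(0.0, step_sd)
        log_p_new = log_posterior(proposal, obs, prior_mean, prior_sd, obs_sd)
        # Accept with probability min(1, p_new / p).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        if i >= burn_in:
            samples.append(x)
    return samples


prior_mean, prior_sd = 50.0, 5.0  # model estimate of the agent's x-position
obs, obs_sd = 56.0, 2.0           # noisy pseudo-truth observation

samples = metropolis(obs, prior_mean, prior_sd, obs_sd)
post_mean = sum(samples) / len(samples)
post_sd = math.sqrt(sum((s - post_mean) ** 2 for s in samples) / len(samples))
print(f"posterior mean {post_mean:.2f}, sd {post_sd:.2f}")
```

Because this toy case is conjugate, the exact posterior is known (mean about 55.2, standard deviation about 1.86), so the sampler can be checked against it. The posterior is narrower than the prior, which is why forecasts started from it should be more accurate.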

"By the time of the workshop the paper will report initial experiments"

Malleson, 2019

Challenges

Technically difficult: probabilistic programming is still relatively new

Difficult to test and debug

Opportunities

Positions of agents could be latent (unobserved) in the probabilistic model

Can then include additional observed variables, such as crowd density, and use these to generate a posterior over the latent ones

In other words: very elegant way to include different data (hopefully!)
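
A hedged sketch of the latent/observed idea: agent positions are latent, and the only observation is how many agents a camera sees inside a zone. The setup is hypothetical (a 1-D corridor, Gaussian priors on each agent's position, an error-free count), and it uses simple rejection sampling rather than MCMC because the observation is a discrete count.

```python
import random


def count_in_zone(positions, zone):
    """Observed variable: number of agents inside the camera zone [lo, hi)."""
    lo, hi = zone
    return sum(lo <= x < hi for x in positions)


def posterior_samples(prior_means, prior_sd, zone, observed_count,
                      n_draws=20000, seed=2):
    """Draw latent positions from the prior and keep the draws that
    reproduce the observed count (rejection sampling)."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_draws):
        positions = [rng.gauss(m, prior_sd) for m in prior_means]
        if count_in_zone(positions, zone) == observed_count:
            kept.append(positions)
    return kept


prior_means = [40.0, 50.0, 60.0]  # model's estimate of three agents' positions
zone = (45.0, 55.0)               # stretch of corridor covered by the camera
samples = posterior_samples(prior_means, 5.0, zone, observed_count=3)

# Posterior mean position of the first agent, given that all three were seen.
agent0_mean = sum(s[0] for s in samples) / len(samples)
print(f"kept {len(samples)} draws; agent 0 posterior mean {agent0_mean:.1f}")
```

The camera never observes any individual position, yet the count still pulls the estimate for the first agent from its prior mean of 40 into the zone.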

Immediate Next Steps

Results!!

Get it working with all agents

Experiment with different types of observations, e.g.

Cameras that count the passers-by in a single place

Full traces of all agents

Basically: how much data do we need to find solutions that fit the observations?

Conclusion

Overall aim: data assimilation for agent-based models

We want to be able to simulate cities, assimilating real-time 'smart city' data as they emerge to reduce uncertainty (and prevent divergence).

Current goal: use a new probabilistic programming library to:

Experiment with the amount of data needed to simulate a system

Perform Bayesian inference on an ABM

Implement data assimilation
