Python microsim model

Submodules

microsim.QUANTRampAPI2 module

microsim.activity_location module

class microsim.activity_location.ActivityLocation(name, locations, flows, individuals, duration_col)

Bases: object

Class to represent information about activity locations, e.g. retail destinations, workplaces, etc.

get_dangers()

Get the danger associated with each location as a list. These will be in the same order as the location IDs returned by get_ids()

Return type: List[float]

get_dataframe_copy(): Get a copy of the dataframe that underpins this ActivityLocation :rtype: DataFrame :return:

get_ids()

Return the IDs of each destination. Callers shouldn’t need to know these; use get_dangers or update_dangers instead

Return type: List[int]

get_indices()

Return the index (row number) of each destination. Callers shouldn’t need to know these; use get_dangers or update_dangers instead

Return type: List[int]

get_name()

Get the name of this activity. This is used to label columns in the file of individuals

Return type: str

update_dangers(dangers): Update the danger associated with each location :type dangers: List[float] :param dangers: A list of dangers for each location. Must be in the same order as the locations as returned by get_ids.

microsim.column_names module

class microsim.column_names.ColumnNames

Bases: object

Used to record standard dataframe column names used throughout

ACTIVITY_DURATION = '_Duration'

ACTIVITY_DURATION_INITIAL = '_Duration_Initial'

ACTIVITY_FLOWS = '_Flows'

ACTIVITY_RISK = '_Risk'

ACTIVITY_VENUES = '_Venues'

class Activities

Bases: object

ALL = ['Retail', 'PrimarySchool', 'SecondarySchool', 'Home', 'Work']

HOME = 'Home'

PRIMARY = 'PrimarySchool'

RETAIL = 'Retail'

SECONDARY = 'SecondarySchool'

WORK = 'Work'

CURRENT_RISK = 'current_risk'

DISEASE_EXPOSED_DAYS = 'exposed_days'

DISEASE_PRESYMP = 'presymp_days'

DISEASE_STATUS = 'disease_status'

DISEASE_STATUS_CHANGED = 'status_changed'

DISEASE_SYMP_DAYS = 'symp_days'

class DiseaseStatuses

Bases: object

ALL = [0, 1, 2, 3, 4, 5, 6]

ASYMPTOMATIC = 4

DEAD = 6

EXPOSED = 1

PRESYMPTOMATIC = 2

RECOVERED = 5

SUSCEPTIBLE = 0

SYMPTOMATIC = 3

INDIVIDUAL_AGE = 'DC1117EW_C_AGE'

INDIVIDUAL_ETH = 'DC2101EW_C_ETHPUK11'

INDIVIDUAL_SEX = 'DC1117EW_C_SEX'

LOCATION_DANGER = 'Danger'

LOCATION_ID = 'ID'

LOCATION_NAME = 'Location_Name'

TRAVEL_BUS = 'Bus'

TRAVEL_CAR = 'Car'

TRAVEL_TRAIN = 'Train'

TRAVEL_WALK = 'Walk'
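One plausible way these constants fit together is as suffixes appended to an activity name, for example 'Retail' + '_Duration'. That composition is an assumption drawn from the naming pattern above, not something this reference states; the values below are copied from the constants listed, and activity_columns is a hypothetical helper.

```python
# Values copied from the constants listed above
ACTIVITIES = ['Retail', 'PrimarySchool', 'SecondarySchool', 'Home', 'Work']
ACTIVITY_DURATION = '_Duration'
ACTIVITY_VENUES = '_Venues'
ACTIVITY_FLOWS = '_Flows'


def activity_columns(activity: str) -> dict:
    """Build per-activity column names (assumed pattern: name + suffix)."""
    return {
        'duration': activity + ACTIVITY_DURATION,
        'venues': activity + ACTIVITY_VENUES,
        'flows': activity + ACTIVITY_FLOWS,
    }


retail_cols = activity_columns('Retail')
all_duration_cols = [a + ACTIVITY_DURATION for a in ACTIVITIES]
```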

microsim.dashboard module

Created on Thu Jun 4 16:22:57 2020

@author: Natalie

microsim.dashboard.calc_nr_days(data_file)

microsim.dashboard.create_counts_dict(conditions_dict, r_range, data_dir, start_day, end_day, start_run, nr_runs, age_cat)

Counts per condition (3D, mean and standard deviation).

Produces 5 types of counts:

msoacounts: nr per msoa and day
agecounts: nr per age category and day
totalcounts: nr per day (across all areas)
cumcounts: nr per MSOA and day
uniquecounts: nr with ‘final’ disease status across time period, e.g. someone who is presymptomatic, symptomatic and recovered is only counted once as recovered

Output:

msoas (list of msoas), totalcounts_dict, cumcounts_dict, agecounts_dict, msoacounts_dict, cumcounts_dict_3d, totalcounts_dict_std, cumcounts_dict_std, agecounts_dict_std, msoacounts_dict_std, totalcounts_dict_3d, agecounts_dict_3d, msoacounts_dict_3d, uniquecounts_dict_3d, uniquecounts_dict_std, uniquecounts_dict
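The ‘uniquecounts’ idea, counting each person once under their final status, can be sketched in a few lines. The input layout (a dict mapping a hypothetical person ID to that person's status on each day) and the reading of ‘final’ as the status on the last day are assumptions for illustration; the status codes are those of ColumnNames.DiseaseStatuses.

```python
from collections import Counter

# Status codes as listed under ColumnNames.DiseaseStatuses
PRESYMPTOMATIC, SYMPTOMATIC, RECOVERED = 2, 3, 5

# Hypothetical input: person ID -> disease status on each day of the period
status_by_day = {
    'p1': [0, 1, 2, 3, 5],   # ends the period recovered
    'p2': [0, 0, 1, 2, 3],   # ends the period symptomatic
    'p3': [0, 0, 0, 0, 0],   # never infected
}

# Count each person once, under the status they hold on the last day
unique_counts = Counter(days[-1] for days in status_by_day.values())
```

Person p1 passes through the presymptomatic and symptomatic states but contributes only to the recovered count.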

microsim.dashboard.create_difference_dict(dict_sc0, dict_sc1, lookup_dict)

microsim.dashboard.create_msoa_dangers_dict(dangers_dict, keys, msoa_codes)

Converts dangers_dict to MSOA level data for the appropriate venue types. Produces average danger score (sum dangers in MSOA / total nr venues in MSOA).

Output: dangers_msoa_dict
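The average danger score described above (sum of dangers in an MSOA divided by the number of venues in that MSOA) can be sketched as below. The input layout, a list of (MSOA code, danger) pairs, is a hypothetical simplification and not the structure of dangers_dict.

```python
from collections import defaultdict

# Hypothetical input: (MSOA code of the venue, venue danger score)
venue_dangers = [
    ('E02000001', 0.2),
    ('E02000001', 0.6),
    ('E02000002', 0.9),
]

sums = defaultdict(float)
counts = defaultdict(int)
for msoa, danger in venue_dangers:
    sums[msoa] += danger
    counts[msoa] += 1

# Average danger score = sum of dangers in MSOA / number of venues in MSOA
avg_danger_by_msoa = {msoa: sums[msoa] / counts[msoa] for msoa in sums}
```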

microsim.dashboard.create_venue_dangers_dict(locations_dict, r_range, data_dir, start_day, end_day, start_run, nr_runs)

Reads in venue pickle files (venues from locations_dict) and populates dangers_dict_3d (raw data: venue, day, run), dangers_dict (mean across runs) and dangers_dict_std (standard deviation across runs).

Possible output includes:

dangers_dict – mean (value to be plotted)
dangers_dict_std – standard deviation (could plot as error bars)
dangers_dict_3d – full 3D data (for debugging)

microsim.dashboard_QUANT module

Created on Thu Jun 4 16:22:57 2020

@author: Natalie

microsim.dashboard_QUANT.calc_nr_days(data_file)

microsim.dashboard_QUANT.create_counts_dict(conditions_dict, r_range, data_dir, start_day, end_day, start_run, nr_runs, age_cat)

Counts per condition (3D, mean and standard deviation).

Produces 5 types of counts:

msoacounts: nr per msoa and day
agecounts: nr per age category and day
totalcounts: nr per day (across all areas)
cumcounts: nr per MSOA and day
uniquecounts: nr with ‘final’ disease status across time period, e.g. someone who is presymptomatic, symptomatic and recovered is only counted once as recovered

Output:

msoas (list of msoas), totalcounts_dict, cumcounts_dict, agecounts_dict, msoacounts_dict, cumcounts_dict_3d, totalcounts_dict_std, cumcounts_dict_std, agecounts_dict_std, msoacounts_dict_std, totalcounts_dict_3d, agecounts_dict_3d, msoacounts_dict_3d, uniquecounts_dict_3d, uniquecounts_dict_std, uniquecounts_dict

microsim.dashboard_QUANT.create_difference_dict(dict_sc0, dict_sc1, lookup_dict)

microsim.dashboard_QUANT.create_msoa_dangers_dict(dangers_dict, keys, msoa_codes)

Converts dangers_dict to MSOA level data for the appropriate venue types. Produces average danger score (sum dangers in MSOA / total nr venues in MSOA).

Output: dangers_msoa_dict

microsim.dashboard_QUANT.create_venue_dangers_dict(locations_dict, r_range, data_dir, start_day, end_day, start_run, nr_runs)

Reads in venue pickle files (venues from locations_dict) and populates dangers_dict_3d (raw data: venue, day, run), dangers_dict (mean across runs) and dangers_dict_std (standard deviation across runs).

Possible output includes:

dangers_dict – mean (value to be plotted)
dangers_dict_std – standard deviation (could plot as error bars)
dangers_dict_3d – full 3D data (for debugging)

microsim.initialisation_cache module

class microsim.initialisation_cache.InitialisationCache(cache_dir)

Bases: object

Class to handle caching of initialisation data, e.g. individuals and activity locations dataframes

cache_files_exist()

is_empty()

read_from_cache()

store_in_cache(individuals, activity_locations)

microsim.load_msoa_locations module

microsim.load_msoa_locations.calculate_msoa_buildings(osm_buildings, msoa_shapes)

microsim.load_msoa_locations.load_devon_msoas(data_dir, msoa_filename='devon_msoas.csv')

microsim.load_msoa_locations.load_msoa_shapes(data_dir, visualize=False)

microsim.load_msoa_locations.load_osm_shapefile(data_dir)

microsim.load_msoa_locations.main()

microsim.main module

Core RAMP-UA model.

Created on Wed Apr 29 19:59:25 2020

@author: nick

microsim.main.create_params(calibration_params, disease_params)

microsim.main.run_opencl_model(individuals_df, activity_locations, time_activity_multiplier, iterations, data_dir, base_dir, use_gui, use_gpu, use_cache, initialise, calibration_params, disease_params)

microsim.main.run_python_model(individuals_df, activity_locations_df, time_activity_multiplier, msim_args, iterations, repetitions, parameters_file)

microsim.microsim_initialisation module

microsim.microsim_model module

class microsim.microsim_model.Microsim(individuals, activity_locations, time_activity_multiplier=None, random_seed=None, disable_disease_status=False, r_script_dir='./R/py_int/', data_dir='./data/', scen_dir='default', output=True, output_every_iteration=False, hazard_individual_multipliers={}, hazard_location_multipliers={}, risk_multiplier=1.0, disease_params={})

Bases: object

Class containing code for running timesteps of the Python/ R microsim model. This operates on two main dataframes: individuals and activity_locations.

calculate_new_disease_status(): Call an R function to calculate the new disease status for all individuals. Update the indivdiuals dataframe in place :rtype: None :return: . Update the dataframe inplace

change_behaviour_with_disease(): When people have the disease the proportions that they spend doing activities changes. This function applies those changes inline to the individuals dataframe :rtype: None :return: None. Update the dataframe inplace

run(iterations, repnr)

Run the model (call the step() function) for the given number of iterations.

Parameters

iterations (int) – The number of iterations to run
repnr (int) – The repetition number of this model. Like an ID. Used to create a new unique directory for this model instance.

Return type: None

step()

Step (iterate) the model for 1 iteration

Return type: None

update_behaviour_during_lockdown()

Unilaterally alter the proportions of time spent on different activities before and after ‘lockdown’. Otherwise this doesn’t do anything.

Note: ignores people who are currently showing symptoms (ColumnNames.DiseaseStatuses.SYMPTOMATIC)

update_venue_danger_and_risks(decimals=8)

Update the danger score for each location, based on where the individuals who have the infection visit. Then look through the individuals again, assigning some of that danger back to them as ‘current risk’.

Parameters

risk_multiplier – Risk is calculated as duration * flow * risk_multiplier.
decimals – Number of decimals to round the individual risks and dangers to (default 8, per the signature above). If ‘None’ then do no rounding
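The two passes described above can be sketched with plain lists. The formula duration * flow * risk_multiplier comes from the parameter description; everything else here is an assumption for illustration: the record layout, the 'infectious' flag, and in particular the weighting used to hand danger back to individuals in the second pass.

```python
RISK_MULTIPLIER = 1.0
DECIMALS = 8

# Hypothetical individuals: time share on the activity, flows to venues
# (venue index -> proportion of visits), and whether they are infectious
individuals = [
    {'duration': 0.2, 'flows': {0: 0.5, 1: 0.5}, 'infectious': True},
    {'duration': 0.1, 'flows': {1: 1.0}, 'infectious': False},
]

# Step 1: infectious individuals add danger to the venues they visit
dangers = [0.0, 0.0]
for person in individuals:
    if person['infectious']:
        for venue, flow in person['flows'].items():
            dangers[venue] += person['duration'] * flow * RISK_MULTIPLIER
dangers = [round(d, DECIMALS) for d in dangers]

# Step 2 (assumed weighting): hand danger back to visitors as current risk
current_risk = [
    round(sum(dangers[v] * p['duration'] * f for v, f in p['flows'].items()),
          DECIMALS)
    for p in individuals
]
```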

microsim.population_initialisation module

Core RAMP-UA model.

Created on Wed Apr 29 19:59:25 2020

@author: nick

class microsim.population_initialisation.PopulationInitialisation(data_dir='./data/', read_data=True, testing=False, debug=False, quant_object=None)

Bases: object

A class used to load different datasources and generate the population of people ready to be iterated. This produces dataframes of people and places ready to start either model implementation.

classmethod add_disease_columns(individuals)

Adds columns required to estimate disease prevalence

Return type: DataFrame

classmethod add_individual_flows(flow_type, individuals, flow_matrix)

Take a flow matrix from MSOAs to (e.g. retail) locations and assign flows to individuals.

It assigns the id of the destination of the flow according to its column in the matrix. So the first column that has flows for a destination is given index 0, the second is index 1, etc. This is probably not the same as the ID of the venue that they point to (e.g. the first store probably has ID 1, but will be given the index 0), so it is important that when the activity_locations are created, they are created in the same order as the columns that appear in the matrix. The first column in the matrix must also be the first row in the locations data.

Parameters

flow_type (str) – What type of flows are these. This will be appended to the column names. E.g. “Retail”.
individuals (DataFrame) – The DataFrame containing information about all individuals
flow_matrix (DataFrame) – The flow matrix, created by (e.g.) read_retail_flows_data()

Return type: DataFrame
Returns: The DataFrame of individuals with new locations and probabilities added
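The difference between a venue's column index and its ID, and why the locations must be in the same order as the matrix columns, can be illustrated with a toy example. All values below are hypothetical, and the real method works on DataFrames rather than lists.

```python
# Hypothetical venue IDs, in the same order as the flow-matrix columns
venue_ids = [1, 2, 3]

# One row of a hypothetical flow matrix for a single origin MSOA:
# the value in column i is the flow to the venue in row i of the locations
flow_row = [0.0, 0.7, 0.3]

# Individuals store column indices (0-based), not venue IDs
venue_indices = [i for i, flow in enumerate(flow_row) if flow > 0]
venue_flows = [flow_row[i] for i in venue_indices]

# Translating an index back to an ID only works if the orders match
visited_ids = [venue_ids[i] for i in venue_indices]
```

If venue_ids were sorted differently from the matrix columns, visited_ids would silently point at the wrong venues, which is the failure the ordering requirement guards against.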

classmethod add_work_flows(flow_type, individuals, workplaces, commuting_flows, flow_threshold)

Create a dataframe of work locations that individuals travel to. The flows are based on general commuting patterns and assume one work location per industry type MSOA.

Parameters

flow_type (str) – The name for these flows (probably something like ‘Work’)
individuals (DataFrame) – The dataframe of synthetic individuals
workplaces (DataFrame) – The dataframe of workplaces (i.e. occupations)
commuting_flows (DataFrame) – The general commuting flows between MSOAs (an O-D matrix)
flow_threshold – Only include the top x destinations as possible flows. ‘None’ means no limit.

Return type: DataFrame
Returns: The new ‘individuals’ dataframe (with new columns)
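The flow_threshold behaviour (keep only the top x destinations, with None meaning no limit) might look like the sketch below. The helper top_destinations and the dict-based input are hypothetical; the real method operates on an O-D matrix held in a DataFrame.

```python
from typing import Dict, Optional


def top_destinations(flows: Dict[str, float],
                     flow_threshold: Optional[int]) -> Dict[str, float]:
    """Keep the flow_threshold largest flows; None keeps everything."""
    ranked = sorted(flows.items(), key=lambda kv: kv[1], reverse=True)
    if flow_threshold is not None:
        ranked = ranked[:flow_threshold]
    return dict(ranked)


# Hypothetical commuting flows from one origin MSOA to destination MSOAs
flows_from_origin = {'E02000001': 120.0, 'E02000002': 15.0, 'E02000003': 60.0}
top_two = top_destinations(flows_from_origin, flow_threshold=2)
unlimited = top_destinations(flows_from_origin, flow_threshold=None)
```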

classmethod check_sim_flows(locations, flows)

Check that the flow matrix looks OK, raising an error if not

Parameters

locations (DataFrame) – A DataFrame with information about each location (destination)
flows (DataFrame) – The flow matrix itself, showing flows from origin MSOAs to destinations

classmethod extract_msoas_from_individuals(individuals)

Analyse a DataFrame of individuals and extract the unique MSOA codes, returning them as a list in ascending order

Parameters: individuals (DataFrame)
Return type: List[str]

classmethod generate_travel_time_colums(individuals)

TODO: Read the raw travel time columns and create standard ones to show how long individuals spend travelling on different modes. Ultimately these will be turned into activities

Parameters: individuals (DataFrame)
Return type: DataFrame

classmethod pad_durations(individuals, activity_locations)

Some individuals’ activity durations don’t add up to 1. In these cases pad them out with extra time at home.

Parameters

individuals
activity_locations

Return type: DataFrame
Returns: The new individuals dataframe
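Padding can be sketched for a single individual as follows. The helper pad_with_home and the dict layout are hypothetical, and the real method works on the whole individuals dataframe; how it treats durations that already exceed 1 is not documented here, so this sketch leaves those untouched.

```python
def pad_with_home(durations: dict, home: str = 'Home') -> dict:
    """Return a copy whose durations sum to 1, adding any shortfall to home."""
    padded = dict(durations)
    shortfall = 1.0 - sum(padded.values())
    if shortfall > 0:
        padded[home] = padded.get(home, 0.0) + shortfall
    return padded


# Hypothetical individual whose activity durations only sum to 0.9
person = {'Home': 0.5, 'Work': 0.3, 'Retail': 0.1}
padded = pad_with_home(person)
```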

classmethod read_commuting_flows_data(study_msoas)

Read the commuting flows between each MSOA

Parameters: study_msoas (List[str]) – A list of MSOAs in the study area (flows outside of this will be ignored)
Return type: (<class ‘pandas.core.frame.DataFrame’>, <class ‘pandas.core.frame.DataFrame’>)
Returns: A dataframe with origin and destination flows in all MSOAs in the study area

classmethod read_individual_time_use_and_health_data(home_name)

Read a population of individuals. Includes time-use & health info.

Parameters: home_name (str) – A string to describe flows to people’s homes (probably ‘Home’)

Returns: A tuple with new dataframes of individuals and households

Return type: DataFrame

classmethod read_retail_flows_data(study_msoas, quant_object)

Read the flows between each MSOA and the most commonly visited shops

Parameters

study_msoas (List[str]) – A list of MSOAs in the study area (flows outside of this will be ignored)
quant_object (QuantRampAPI) – The QuantRampAPI object used to estimate destination school and retail locations

Return type

(<class ‘pandas.core.frame.DataFrame’>, <class ‘pandas.core.frame.DataFrame’>)

Returns

A tuple of two dataframes. One containing all of the flows and another containing information about the stores themselves.

classmethod read_school_flows_data(study_msoas, quant_object)

Read the flows between each MSOA and the most likely schools attended by pupils in this area. All schools are initially read together, but flows are separated into primary and secondary

Parameters

study_msoas (List[str]) – A list of MSOAs in the study area (flows outside of this will be ignored)
quant_object (QuantRampAPI) – A pointer to the QuantRampAPI object that implements the spatial interaction model

Return type

(<class ‘pandas.core.frame.DataFrame’>, <class ‘pandas.core.frame.DataFrame’>, <class ‘pandas.core.frame.DataFrame’>, <class ‘pandas.core.frame.DataFrame’>)

Returns

A tuple of three dataframes: all schools, then the flows to primary and secondary (Schools, PrimaryFlows, SecondaryFlows). Although all the schools are in one dataframe, no primary flows will flow to secondary schools and vice versa.

static read_time_activity_multiplier(lockdown_file)

Sometimes people should spend more time at home than normal, e.g. after lockdown. This function reads a file that tells us how much more time should be spent at home on each day.

Parameters: lockdown_file – Where to read the mobility data from (assume it’s within the DATA_DIR).
Return type: DataFrame
Returns: A dataframe with ‘day’ and ‘timeout_multiplier’ columns

microsim.quant_api module

class microsim.quant_api.QuantRampAPI(quant_dir='QUANT_RAMP')

Bases: object

Class that handles integration of QUANT data into the RAMP microsim model. QUANT spatial interaction data include probabilities of trips from MSOA or IZ origins to primary schools, secondary schools and retail locations. Based on QUANTRampAPI.py provided by UCL. For further details about QUANT, see https://github.com/maptube/QUANT_RAMP

static getProbableHospitalByMSOAIZ(dfHospitalPopulation, dfHospitalZones, hospital_probHij, msoa_iz, threshold)

Given an MSOA area code (England and Wales) or an Intermediate Zone (IZ) 2001 code (Scotland), return a list of all the surrounding hospitals whose probability of being visited by the MSOA_IZ is greater than or equal to the threshold. Hospital ids are taken from the NHS England export of “location”; see hospitalZones for ids and names (and east/north).

NOTE: code identical to the primary school version, only with switched lookup tables

Parameters

dfHospitalPopulation (pandas.DataFrame) – the population in an MSOA IZ zone who can travel to hospital
dfHospitalZones (pandas.DataFrame) – locations of hospitals and size (in number of beds)
hospital_probHij (numpy.matrix) – matrix of probability scores of a hospital being visited
msoa_iz – An MSOA code (England/Wales e.g. E02000001) or an IZ2001 code (Scotland e.g. S02000001)
threshold (float) – Probability threshold e.g. 0.5 means return all possible hospital points with probability>=0.5

Returns

a list of [ {id: ‘hospitalid1’, p: 0.5}, {id: ‘hospitalid2’, p:0.6}, … etc] (NOTE: not sorted in any particular order)

Return type

list
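The thresholding and the documented return shape, a list of {id, p} records, can be sketched without the QUANT lookup tables. The helper venues_above_threshold and its list-based inputs are hypothetical; the real method takes DataFrames and a probability matrix and resolves the row for msoa_iz itself.

```python
from typing import Dict, List, Union


def venues_above_threshold(venue_ids: List[str], probs: List[float],
                           threshold: float) -> List[Dict[str, Union[str, float]]]:
    """Return {id, p} records for venues with probability >= threshold."""
    return [
        {'id': venue_id, 'p': p}
        for venue_id, p in zip(venue_ids, probs)
        if p >= threshold
    ]


# Hypothetical row of a probability matrix for a single MSOA/IZ origin
hospital_ids = ['hospitalid1', 'hospitalid2', 'hospitalid3']
row = [0.5, 0.6, 0.1]
likely = venues_above_threshold(hospital_ids, row, threshold=0.5)
```

The comparison is inclusive, so a venue whose probability equals the threshold is kept.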

static getProbablePrimarySchoolsByMSOAIZ(dfPrimaryPopulation, dfPrimaryZones, primary_probPij, msoa_iz, threshold)

Given an MSOA area code (England and Wales) or an Intermediate Zone (IZ) 2001 code (Scotland), return a list of all the surrounding primary schools whose probability of being visited by the MSOA_IZ is greater than or equal to the threshold. School ids are taken from the Edubase list of URN.

NOTE: code identical to the secondary school version, only with switched lookup tables

Parameters

dfPrimaryPopulation (pandas.DataFrame) – the population in an MSOA IZ zone who go to primary school
dfPrimaryZones (pandas.DataFrame) – the points representing the schools (location and ID code) from the schools database dump for the UK
primary_probPij (numpy.matrix) – matrix of probability scores of a primary school being visited
msoa_iz (str) – An MSOA code (England/Wales e.g. E02000001) or an IZ2001 code (Scotland e.g. S02000001)
threshold (float) – Probability threshold e.g. 0.5 means return all possible schools with probability>=0.5

Returns

a list of probabilities in the same order as the venues

Return type

list

static getProbableRetailByMSOAIZ(dfRetailPointsPopulation, dfRetailPointsZones, retailpoints_probSij, msoa_iz, threshold)

Given an MSOA area code (England and Wales) or an Intermediate Zone (IZ) 2001 code (Scotland), return a list of all the surrounding retail points whose probability of being visited by the MSOA_IZ is greater than or equal to the threshold. Retail ids are from ????

Parameters

dfRetailPointsPopulation (pandas.DataFrame) – the population in an MSOA IZ zone who might use retail (everyone) and the amounts available for retail spending (although that is not used in RAMP)
dfRetailPointsZones (pandas.DataFrame) – destinations (the locations of shops)
retailpoints_probSij (numpy.matrix) – matrix of probability scores of a retail points being visited
msoa_iz (str) – An MSOA code (England/Wales e.g. E02000001) or an IZ2001 code (Scotland e.g. S02000001)
threshold (float) – Probability threshold e.g. 0.5 means return all possible retail points with probability>=0.5

Returns

a list of probabilities in the same order as the venues

Return type

list

static getProbableSecondarySchoolsByMSOAIZ(dfSecondaryPopulation, dfSecondaryZones, secondary_probPij, msoa_iz, threshold)

Given an MSOA area code (England and Wales) or an Intermediate Zone (IZ) 2001 code (Scotland), return a list of all the surrounding secondary schools whose probability of being visited by the MSOA_IZ is greater than or equal to the threshold. School ids are taken from the Edubase list of URN.

NOTE: code identical to the primary school version, only with switched lookup tables

Parameters

dfSecondaryPopulation (pandas.DataFrame) – the population in an MSOA IZ zone who go to secondary school
dfSecondaryZones (pandas.DataFrame) – the points representing the schools (location and ID code) from the schools database dump for the UK
secondary_probPij (numpy.matrix) – matrix of probability scores of a secondary school being visited
msoa_iz (str) – An MSOA code (England/Wales e.g. E02000001) or an IZ2001 code (Scotland e.g. S02000001)
threshold (float) – Probability threshold e.g. 0.5 means return all possible schools with probability>=0.5

Returns

a list of probabilities in the same order as the venues

Return type

list

classmethod get_flows(venue, msoa_list, threshold, thresholdtype)

Wrapper function that generates flow data for a given venue type, MSOA list, and specified threshold.

Parameters

venue (str) – venue type from ColumnNames.Activities class to generate flow data for
msoa_list (list) – list of MSOAs to calculate flow data for
threshold (float) – Probability threshold e.g. 0.5 means return all possible venues with probability>=0.5
thresholdtype (str) – the threshold type setting, can either be prob or nr

Returns

dataframe of flow data

Return type

pandas.DataFrame

classmethod read_data(QUANT_DIR)

Reads in all data in the provided data directory and creates a series of class attributes

Parameters: QUANT_DIR (str) – a string of the full path to QUANT files

microsim.r_interface module

class microsim.r_interface.RInterface(script_dir)

Bases: object

An RInterface object can be used to create an R session, initialise everything that is needed for the disease status estimation function, and then interact with session to calculate the disease status.

calculate_disease_status(individuals, iteration, repnr, disease_params)

Call the R ‘run_status’ function to calculate the new disease status. It will return a new dataframe with a few columns, including the new status.

Parameters

individuals (DataFrame) – The individuals dataframe from which new statuses need to be calculated
iteration (int) – The iteration number (i.e. number of model steps so far)
repnr (int) – The repetition number of the model. Like a unique ID for the model.
disease_params (dict) – A dictionary of disease parameters used in the R model.

Returns: a new dataframe that includes new disease statuses

microsim.utilities module

class microsim.utilities.Optimise

Bases: object

Functions to optimise the memory use of pandas dataframes. From https://medium.com/bigdatarepublic/advanced-pandas-optimize-speed-and-memory-a654b53be6c2

static optimize(df, datetime_features=[])

microsim.utilities.check_durations_sum_to_1(individuals, activities)

microsim.utilities.data_setup(url='https://ramp0storage.blob.core.windows.net/rampdata/devon_data.tar.gz')

A wrapper function for downloading and unpacking Azure stored devon_data

Parameters

archive (str) – A string directory path to archive file using
url (str, optional) – A url to an archive file. Defaults to “https://ramp0storage.blob.core.windows.net/rampdata/devon_data.tar.gz”.

microsim.utilities.download_data(url)

Download data utility function

Parameters: url (str, optional) – A url to an archive file. Defaults to “https://ramp0storage.blob.core.windows.net/rampdata/devon_data.tar.gz”.

microsim.utilities.unpack_data(archive)

Unpack a tar data archive

Parameters: archive (str) – A string directory path to archive file using
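Unpacking a tar archive with the standard library might look like the sketch below. The function name unpack_archive_sketch and the dest parameter are hypothetical, and this is not necessarily how unpack_data is implemented; only tarfile.open and TarFile.extractall are real standard-library calls.

```python
import tarfile


def unpack_archive_sketch(archive: str, dest: str = '.') -> None:
    """Extract a (possibly gzip-compressed) tar archive into dest."""
    # Mode 'r:*' lets tarfile detect the compression automatically
    with tarfile.open(archive, 'r:*') as tar:
        # Only extract archives from a source you trust
        tar.extractall(path=dest)
```

A call such as unpack_archive_sketch('devon_data.tar.gz', dest='.') would then expand the downloaded archive into the current directory.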
