Python microsim model
Submodules
microsim.QUANTRampAPI2 module
microsim.activity_location module
- class microsim.activity_location.ActivityLocation(name, locations, flows, individuals, duration_col)
Bases:
object
Class to represent information about activity locations, e.g. retail destinations, workplaces, etc.
- get_dangers()
Get the danger associated with each location as a list. These will be in the same order as the location IDs returned by get_ids()
- Return type
List[float]
- get_dataframe_copy()
Get a copy of the dataframe that underpins this ActivityLocation.
- Return type
DataFrame
- get_ids()
Return the IDs of each destination. Shouldn’t need to know these. Use get_dangers or update_dangers instead
- Return type
List[int]
- get_indices()
Return the index (row number) of each destination. Shouldn’t need to know these. Use get_dangers or update_dangers instead
- Return type
List[int]
- get_name()
Get the name of this activity. This is used to label columns in the file of individuals
- Return type
str
- update_dangers(dangers)
Update the danger associated with each location.
- Parameters
dangers (List[float]) – A list of dangers for each location. Must be in the same order as the locations as returned by get_ids.
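The ordering contract between get_ids(), get_dangers() and update_dangers() can be illustrated with a small stand-in. This is only a sketch: it does not import microsim, and the ToyLocations class and its data are hypothetical, mimicking the documented method names in plain Python.

```python
from typing import List


class ToyLocations:
    """Hypothetical stand-in that mimics the documented accessor names."""

    def __init__(self, ids: List[int]):
        self._ids = list(ids)
        self._dangers = [0.0] * len(ids)

    def get_ids(self) -> List[int]:
        return list(self._ids)

    def get_dangers(self) -> List[float]:
        return list(self._dangers)

    def update_dangers(self, dangers: List[float]) -> None:
        # One danger value per location, in the same order as get_ids().
        if len(dangers) != len(self._ids):
            raise ValueError("Need exactly one danger value per location")
        self._dangers = list(dangers)


shops = ToyLocations([11, 12, 13])

# Position i in the dangers list always refers to position i in get_ids().
new_dangers = [d + 0.5 * i for i, d in enumerate(shops.get_dangers())]
shops.update_dangers(new_dangers)
danger_by_id = dict(zip(shops.get_ids(), shops.get_dangers()))
```

Because callers read and write whole lists in a fixed order, they never need to look up individual IDs or row indices.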
microsim.column_names module
- class microsim.column_names.ColumnNames
Bases:
object
Used to record standard dataframe column names used throughout
- ACTIVITY_DURATION = '_Duration'
- ACTIVITY_DURATION_INITIAL = '_Duration_Initial'
- ACTIVITY_FLOWS = '_Flows'
- ACTIVITY_RISK = '_Risk'
- ACTIVITY_VENUES = '_Venues'
- class Activities
Bases:
object
- ALL = ['Retail', 'PrimarySchool', 'SecondarySchool', 'Home', 'Work']
- HOME = 'Home'
- PRIMARY = 'PrimarySchool'
- RETAIL = 'Retail'
- SECONDARY = 'SecondarySchool'
- WORK = 'Work'
- CURRENT_RISK = 'current_risk'
- DISEASE_EXPOSED_DAYS = 'exposed_days'
- DISEASE_PRESYMP = 'presymp_days'
- DISEASE_STATUS = 'disease_status'
- DISEASE_STATUS_CHANGED = 'status_changed'
- DISEASE_SYMP_DAYS = 'symp_days'
- class DiseaseStatuses
Bases:
object
- ALL = [0, 1, 2, 3, 4, 5, 6]
- ASYMPTOMATIC = 4
- DEAD = 6
- EXPOSED = 1
- PRESYMPTOMATIC = 2
- RECOVERED = 5
- SUSCEPTIBLE = 0
- SYMPTOMATIC = 3
- INDIVIDUAL_AGE = 'DC1117EW_C_AGE'
- INDIVIDUAL_ETH = 'DC2101EW_C_ETHPUK11'
- INDIVIDUAL_SEX = 'DC1117EW_C_SEX'
- LOCATION_DANGER = 'Danger'
- LOCATION_ID = 'ID'
- LOCATION_NAME = 'Location_Name'
- TRAVEL_BUS = 'Bus'
- TRAVEL_CAR = 'Car'
- TRAVEL_TRAIN = 'Train'
- TRAVEL_WALK = 'Walk'
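The ACTIVITY_* constants all start with an underscore, which suggests they are suffixes joined to an activity name. The sketch below shows that reading. The constant values are copied from the listing above, but the composition rule itself is an assumption and is not confirmed by this page.

```python
# Values copied from the listing above; the real constants live in
# microsim.column_names.ColumnNames.
ACTIVITIES = ["Retail", "PrimarySchool", "SecondarySchool", "Home", "Work"]
ACTIVITY_VENUES = "_Venues"
ACTIVITY_FLOWS = "_Flows"
ACTIVITY_DURATION = "_Duration"

# Assumption: a per-activity column name is the activity name plus a suffix.
columns = {
    activity: [
        activity + ACTIVITY_VENUES,
        activity + ACTIVITY_FLOWS,
        activity + ACTIVITY_DURATION,
    ]
    for activity in ACTIVITIES
}
```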
microsim.dashboard module
Created on Thu Jun 4 16:22:57 2020
@author: Natalie
- microsim.dashboard.calc_nr_days(data_file)
- microsim.dashboard.create_counts_dict(conditions_dict, r_range, data_dir, start_day, end_day, start_run, nr_runs, age_cat)
Counts per condition (3D, mean and standard deviation). Produces 5 types of counts:
msoacounts: nr per MSOA and day
agecounts: nr per age category and day
totalcounts: nr per day (across all areas)
cumcounts: nr per MSOA and day
uniquecounts: nr with ‘final’ disease status across the time period, e.g. someone who is presymptomatic, symptomatic and recovered is only counted once as recovered
Output: msoas (list of MSOAs), totalcounts_dict, cumcounts_dict, agecounts_dict, msoacounts_dict, cumcounts_dict_3d, totalcounts_dict_std, cumcounts_dict_std, agecounts_dict_std, msoacounts_dict_std, totalcounts_dict_3d, agecounts_dict_3d, msoacounts_dict_3d, uniquecounts_dict_3d, uniquecounts_dict_std, uniquecounts_dict
- microsim.dashboard.create_difference_dict(dict_sc0, dict_sc1, lookup_dict)
- microsim.dashboard.create_msoa_dangers_dict(dangers_dict, keys, msoa_codes)
Converts dangers_dict to MSOA level data for the appropriate venue types. Produces average danger score (sum dangers in MSOA / total nr venues in MSOA) Output: dangers_msoa_dict
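The averaging rule stated above (sum of dangers in an MSOA divided by the number of venues in that MSOA) can be sketched in plain Python. The MSOA codes and danger values are made up for illustration, and the real function's input layout (dangers_dict, keys, msoa_codes) is not reproduced here.

```python
from collections import defaultdict

# Hypothetical venue-level data: (MSOA code, danger score) per venue.
venue_dangers = [
    ("E02004129", 0.25),
    ("E02004129", 0.75),
    ("E02004130", 1.0),
]

totals = defaultdict(float)
counts = defaultdict(int)
for msoa, danger in venue_dangers:
    totals[msoa] += danger
    counts[msoa] += 1

# Average danger = sum of dangers in the MSOA / number of venues in the MSOA.
msoa_average = {msoa: totals[msoa] / counts[msoa] for msoa in totals}
```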
- microsim.dashboard.create_venue_dangers_dict(locations_dict, r_range, data_dir, start_day, end_day, start_run, nr_runs)
Reads in venue pickle files (venues from locations_dict) and populates dangers_dict_3d (raw data: venue, day, run), dangers_dict (mean across runs) and dangers_dict_std (standard deviation across runs).
Possible output includes:
dangers_dict: mean (value to be plotted)
dangers_dict_std: standard deviation (could plot as error bars)
dangers_dict_3d: full 3D data (for debugging)
microsim.dashboard_QUANT module
Created on Thu Jun 4 16:22:57 2020
@author: Natalie
- microsim.dashboard_QUANT.calc_nr_days(data_file)
- microsim.dashboard_QUANT.create_counts_dict(conditions_dict, r_range, data_dir, start_day, end_day, start_run, nr_runs, age_cat)
Counts per condition (3D, mean and standard deviation). Produces 5 types of counts:
msoacounts: nr per MSOA and day
agecounts: nr per age category and day
totalcounts: nr per day (across all areas)
cumcounts: nr per MSOA and day
uniquecounts: nr with ‘final’ disease status across the time period, e.g. someone who is presymptomatic, symptomatic and recovered is only counted once as recovered
Output: msoas (list of MSOAs), totalcounts_dict, cumcounts_dict, agecounts_dict, msoacounts_dict, cumcounts_dict_3d, totalcounts_dict_std, cumcounts_dict_std, agecounts_dict_std, msoacounts_dict_std, totalcounts_dict_3d, agecounts_dict_3d, msoacounts_dict_3d, uniquecounts_dict_3d, uniquecounts_dict_std, uniquecounts_dict
- microsim.dashboard_QUANT.create_difference_dict(dict_sc0, dict_sc1, lookup_dict)
- microsim.dashboard_QUANT.create_msoa_dangers_dict(dangers_dict, keys, msoa_codes)
Converts dangers_dict to MSOA level data for the appropriate venue types. Produces average danger score (sum dangers in MSOA / total nr venues in MSOA) Output: dangers_msoa_dict
- microsim.dashboard_QUANT.create_venue_dangers_dict(locations_dict, r_range, data_dir, start_day, end_day, start_run, nr_runs)
Reads in venue pickle files (venues from locations_dict) and populates dangers_dict_3d (raw data: venue, day, run), dangers_dict (mean across runs) and dangers_dict_std (standard deviation across runs).
Possible output includes:
dangers_dict: mean (value to be plotted)
dangers_dict_std: standard deviation (could plot as error bars)
dangers_dict_3d: full 3D data (for debugging)
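Reducing the raw (venue, day, run) data to a mean and a standard deviation across runs can be sketched as follows. The nested-list layout and the numbers are hypothetical. The sketch uses the population standard deviation, which is an assumption, since the page does not say which estimator the dashboard uses.

```python
from statistics import mean, pstdev

# Hypothetical raw data indexed as dangers_3d[venue][day][run].
dangers_3d = [
    [[1.0, 3.0], [2.0, 2.0]],  # venue 0: two days, two runs each
    [[0.0, 4.0], [5.0, 7.0]],  # venue 1
]

# Collapse the run axis: one mean and one standard deviation per venue and day.
dangers_mean = [[mean(runs) for runs in venue] for venue in dangers_3d]
dangers_std = [[pstdev(runs) for runs in venue] for venue in dangers_3d]
```

The mean is the value to plot, and the standard deviation can be drawn as error bars around it.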
microsim.initialisation_cache module
microsim.load_msoa_locations module
- microsim.load_msoa_locations.calculate_msoa_buildings(osm_buildings, msoa_shapes)
- microsim.load_msoa_locations.load_devon_msoas(data_dir, msoa_filename='devon_msoas.csv')
- microsim.load_msoa_locations.load_msoa_shapes(data_dir, visualize=False)
- microsim.load_msoa_locations.load_osm_shapefile(data_dir)
- microsim.load_msoa_locations.main()
microsim.main module
Core RAMP-UA model.
Created on Wed Apr 29 19:59:25 2020
@author: nick
- microsim.main.create_params(calibration_params, disease_params)
- microsim.main.run_opencl_model(individuals_df, activity_locations, time_activity_multiplier, iterations, data_dir, base_dir, use_gui, use_gpu, use_cache, initialise, calibration_params, disease_params)
- microsim.main.run_python_model(individuals_df, activity_locations_df, time_activity_multiplier, msim_args, iterations, repetitions, parameters_file)
microsim.microsim_initialisation module
microsim.microsim_model module
- class microsim.microsim_model.Microsim(individuals, activity_locations, time_activity_multiplier=None, random_seed=None, disable_disease_status=False, r_script_dir='./R/py_int/', data_dir='./data/', scen_dir='default', output=True, output_every_iteration=False, hazard_individual_multipliers={}, hazard_location_multipliers={}, risk_multiplier=1.0, disease_params={})
Bases:
object
Class containing code for running timesteps of the Python/ R microsim model. This operates on two main dataframes: individuals and activity_locations.
- calculate_new_disease_status()
Call an R function to calculate the new disease status for all individuals. Updates the individuals dataframe in place.
- Return type
None
- change_behaviour_with_disease()
When people have the disease, the proportions of time that they spend doing activities change. This function applies those changes in place to the individuals dataframe.
- Return type
None
- run(iterations, repnr)
Run the model (call the step() function) for the given number of iterations.
- Parameters
iterations (int) – The number of iterations to run
repnr (int) – The repetition number of this model. Like an ID. Used to create a new unique directory for this model instance.
- Return type
None
- step()
Step (iterate) the model for 1 iteration
- Return type
None
- update_behaviour_during_lockdown()
Unilaterally alter the proportions of time spent on different activities before and after ‘lockdown’. Otherwise this doesn’t do anything.
Note: ignores people who are currently showing symptoms (ColumnNames.DiseaseStatuses.SYMPTOMATIC)
- update_venue_danger_and_risks(decimals=8)
Update the danger score for each location, based on where the individuals who have the infection visit. Then look through the individuals again, assigning some of that danger back to them as ‘current risk’.
- Parameters
risk_multiplier – Risk is calculated as duration * flow * risk_multiplier.
decimals – Number of decimals to round the individual risks and dangers to (default 8). If ‘None’ then do no rounding
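The two passes described above (danger accumulates at venues, then flows back to visitors as current risk) can be sketched in plain Python. Only the product duration * flow * risk_multiplier comes from this page. The visit records, the names, and the way danger is shared back to visitors in step 2 are assumptions made for illustration.

```python
# Hypothetical visits: (individual, venue, duration, flow, is_infectious).
visits = [
    ("anna", "shop_a", 0.5, 0.5, True),
    ("ben", "shop_a", 0.25, 1.0, False),
    ("ben", "shop_b", 0.5, 1.0, False),
]
risk_multiplier = 1.0
decimals = 8

# Step 1: infectious visitors add duration * flow * risk_multiplier to a venue.
danger = {}
for person, venue, duration, flow, infectious in visits:
    if infectious:
        danger[venue] = danger.get(venue, 0.0) + duration * flow * risk_multiplier

# Step 2 (assumed weighting): every visitor receives a share of the venue's
# danger, scaled by their own duration * flow.
current_risk = {}
for person, venue, duration, flow, _ in visits:
    share = danger.get(venue, 0.0) * duration * flow
    current_risk[person] = current_risk.get(person, 0.0) + share

# Optional rounding, skipped when decimals is None.
if decimals is not None:
    danger = {v: round(d, decimals) for v, d in danger.items()}
    current_risk = {p: round(r, decimals) for p, r in current_risk.items()}
```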
microsim.population_initialisation module
Core RAMP-UA model.
Created on Wed Apr 29 19:59:25 2020
@author: nick
- class microsim.population_initialisation.PopulationInitialisation(data_dir='./data/', read_data=True, testing=False, debug=False, quant_object=None)
Bases:
object
A class used to load different datasources and generate the population of people ready to be iterated. This produces dataframes of people and places ready to start either model implementation.
- classmethod add_disease_columns(individuals)
Adds columns required to estimate disease prevalence
- Return type
DataFrame
- classmethod add_individual_flows(flow_type, individuals, flow_matrix)
Take a flow matrix from MSOAs to (e.g. retail) locations and assign flows to individuals.
It assigns the ID of the destination of the flow according to its column in the matrix. So the first column that has flows for a destination is given index 0, the second is index 1, etc. This is probably not the same as the ID of the venue that they point to (e.g. the first store probably has ID 1, but will be given the index 0), so it is important that when the activity_locations are created, they are created in the same order as the columns that appear in the matrix. The first column in the matrix must also be the first row in the locations data.
- Parameters
flow_type (str) – What type of flows are these. This will be appended to the column names. E.g. “Retail”.
individuals (DataFrame) – The DataFrame containing information about all individuals
flow_matrix (DataFrame) – The flow matrix, created by (e.g.) read_retail_flows_data()
- Returns
The DataFrame of individuals with new locations and probabilities added
- Return type
DataFrame
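The difference between a destination's column index and its venue ID can be shown with a single row of a flow matrix. This is a sketch with made-up IDs and flows, not the project's implementation.

```python
# Hypothetical venue IDs, listed in the same order as the flow-matrix columns.
venue_ids = [1, 2, 5]

# One row of a flow matrix: flows from a single MSOA to each venue column.
flows_from_msoa = [0.7, 0.0, 0.3]

# Individuals store the 0-based column index, not the venue ID.
destination_indices = [i for i, flow in enumerate(flows_from_msoa) if flow > 0]
destination_flows = [flows_from_msoa[i] for i in destination_indices]

# Translating back to IDs only works if the locations keep the column order.
destination_ids = [venue_ids[i] for i in destination_indices]
```

If the locations table were sorted differently from the matrix columns, index 2 would silently point at the wrong venue, which is why the ordering requirement matters.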
- classmethod add_work_flows(flow_type, individuals, workplaces, commuting_flows, flow_threshold)
Create a dataframe of work locations that individuals travel to. The flows are based on general commuting patterns and assume one work location per industry type MSOA.
- Parameters
flow_type (str) – The name for these flows (probably something like ‘Work’)
individuals (DataFrame) – The dataframe of synthetic individuals
workplaces (DataFrame) – The dataframe of workplaces (i.e. occupations)
commuting_flows (DataFrame) – The general commuting flows between MSOAs (an O-D matrix)
flow_threshold – Only include the top x destinations as possible flows. ‘None’ means no limit.
- Returns
The new ‘individuals’ dataframe (with new columns)
- Return type
DataFrame
- classmethod check_sim_flows(locations, flows)
Check that the flow matrix looks OK, raising an error if not.
- Parameters
locations (DataFrame) – A DataFrame with information about each location (destination)
flows (DataFrame) – The flow matrix itself, showing flows from origin MSOAs to destinations
- classmethod extract_msoas_from_individuals(individuals)
Analyse a DataFrame of individuals and extract the unique MSOA codes, returning them as a list in ascending order.
- Parameters
individuals (DataFrame)
- Return type
List[str]
- classmethod generate_travel_time_colums(individuals)
TODO: Read the raw travel time columns and create standard ones to show how long individuals spend travelling on different modes. Ultimately these will be turned into activities.
- Parameters
individuals (DataFrame)
- Return type
DataFrame
- classmethod pad_durations(individuals, activity_locations)
Some individuals’ activity durations don’t add up to 1. In these cases pad them out with extra time at home.
- Parameters
individuals
activity_locations
- Returns
The new individuals dataframe
- Return type
DataFrame
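Padding can be sketched for a single individual as follows. The activity names come from ColumnNames.Activities, while the dictionary layout and the durations are hypothetical. The real method works on the individuals dataframe.

```python
# Hypothetical durations for one individual, as fractions of a day.
durations = {"Home": 0.5, "Work": 0.25, "Retail": 0.125}

total = sum(durations.values())
if total < 1.0:
    # Assign the unaccounted time to the home activity.
    durations["Home"] += 1.0 - total
```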
- classmethod read_commuting_flows_data(study_msoas)
Read the commuting flows between each MSOA
- Parameters
study_msoas (List[str]) – A list of MSOAs in the study area (flows outside of this will be ignored)
- Return type
(<class ‘pandas.core.frame.DataFrame’>, <class ‘pandas.core.frame.DataFrame’>)
- Returns
A dataframe with origin and destination flows in all MSOAs in the study area
- classmethod read_individual_time_use_and_health_data(home_name)
Read a population of individuals. Includes time-use & health info.
- Parameters
home_name (str) – A string to describe flows to people’s homes (probably ‘Home’)
- Returns
A tuple with new dataframes of individuals and households
- Return type
DataFrame
- classmethod read_retail_flows_data(study_msoas, quant_object)
Read the flows between each MSOA and the most commonly visited shops
- Parameters
study_msoas (List[str]) – A list of MSOAs in the study area (flows outside of this will be ignored)
quant_object (QuantRampAPI) – The QuantRampAPI object used to estimate destination school and retail locations
- Return type
(<class ‘pandas.core.frame.DataFrame’>, <class ‘pandas.core.frame.DataFrame’>)
- Returns
A tuple of two dataframes. One containing all of the flows and another containing information about the stores themselves.
- classmethod read_school_flows_data(study_msoas, quant_object)
Read the flows between each MSOA and the most likely schools attended by pupils in this area. All schools are initially read together, but flows are separated into primary and secondary
- Parameters
study_msoas (List[str]) – A list of MSOAs in the study area (flows outside of this will be ignored)
quant_object (QuantRampAPI) – A pointer to the QuantRampAPI object that implements the spatial interaction model
- Return type
(<class ‘pandas.core.frame.DataFrame’>, <class ‘pandas.core.frame.DataFrame’>, <class ‘pandas.core.frame.DataFrame’>, <class ‘pandas.core.frame.DataFrame’>)
- Returns
A tuple of three dataframes: all schools, then the flows to primary and secondary (Schools, PrimaryFlows, SecondaryFlows). Although all the schools are in one dataframe, no primary flows will flow to secondary schools and vice versa.
- static read_time_activity_multiplier(lockdown_file)
Sometimes people should spend more time at home than normal, e.g. after lockdown. This function reads a file that tells us how much more time should be spent at home on each day.
- Parameters
lockdown_file – Where to read the mobility data from (assume it’s within the DATA_DIR).
- Returns
A dataframe with ‘day’ and ‘timeout_multiplier’ columns
- Return type
DataFrame
microsim.quant_api module
- class microsim.quant_api.QuantRampAPI(quant_dir='QUANT_RAMP')
Bases:
object
Class that handles integration of QUANT data into the RAMP microsim model QUANT spatial interaction data include probabilities of trips from MSOA or IZ origins to primary schools, secondary schools and retail locations. Based on QUANTRampAPI.py provided by UCL. For further details about QUANT, see https://github.com/maptube/QUANT_RAMP
- static getProbableHospitalByMSOAIZ(dfHospitalPopulation, dfHospitalZones, hospital_probHij, msoa_iz, threshold)
Given an MSOA area code (England and Wales) or an Intermediate Zone (IZ) 2001 code (Scotland), return a list of all the surrounding hospitals whose probability of being visited by the MSOA_IZ is greater than or equal to the threshold. Hospital ids are taken from the NHS England export of “location” - see hospitalZones for ids and names (and east/north). NOTE: code identical to the primary school version, only with switched lookup tables
- Parameters
dfHospitalPopulation (pandas.DataFrame) – the population in an MSOA IZ zone who can travel to hospital
dfHospitalZones (pandas.DataFrame) – locations of hospitals and size (in number of beds)
hospital_probHij (numpy.matrix) – matrix of probability scores of a hospital being visited
msoa_iz – An MSOA code (England/Wales e.g. E02000001) or an IZ2001 code (Scotland e.g. S02000001)
threshold (float) – Probability threshold e.g. 0.5 means return all possible hospital points with probability>=0.5
- Returns
a list of [ {id: ‘hospitalid1’, p: 0.5}, {id: ‘hospitalid2’, p:0.6}, … etc] (NOTE: not sorted in any particular order)
- Return type
list
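The thresholding step and the documented return shape (a list of {id, p} dictionaries) can be sketched without the QUANT lookup tables. The IDs and probabilities are invented, and probable_venues is a hypothetical helper rather than part of QuantRampAPI.

```python
from typing import Dict, List, Union

# Hypothetical row of the probability matrix for one MSOA/IZ, with the
# hospital ID that corresponds to each column.
hospital_ids = ["hospitalid1", "hospitalid2", "hospitalid3"]
probabilities = [0.5, 0.1, 0.6]


def probable_venues(
    ids: List[str], probs: List[float], threshold: float
) -> List[Dict[str, Union[str, float]]]:
    """Keep every venue whose probability is at least the threshold."""
    return [{"id": i, "p": p} for i, p in zip(ids, probs) if p >= threshold]


result = probable_venues(hospital_ids, probabilities, threshold=0.5)
```

The comparison is inclusive, so a venue whose probability equals the threshold is kept.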
- static getProbablePrimarySchoolsByMSOAIZ(dfPrimaryPopulation, dfPrimaryZones, primary_probPij, msoa_iz, threshold)
Given an MSOA area code (England and Wales) or an Intermediate Zone (IZ) 2001 code (Scotland), return a list of all the surrounding primary schools whose probability of being visited by the MSOA_IZ is greater than or equal to the threshold. School ids are taken from the Edubase list of URN. NOTE: code identical to the secondary school version, only with switched lookup tables
- Parameters
dfPrimaryPopulation (pandas.DataFrame) – the population in an MSOA IZ zone who go to primary school
dfPrimaryZones (pandas.DataFrame) – the points representing the schools (location and ID code) from the schools database dump for the UK
primary_probPij (numpy.matrix) – matrix of probability scores of a primary school being visited
msoa_iz (str) – An MSOA code (England/Wales e.g. E02000001) or an IZ2001 code (Scotland e.g. S02000001)
threshold (float) – Probability threshold e.g. 0.5 means return all possible schools with probability>=0.5
- Returns
a list of probabilities in the same order as the venues
- Return type
list
- static getProbableRetailByMSOAIZ(dfRetailPointsPopulation, dfRetailPointsZones, retailpoints_probSij, msoa_iz, threshold)
Given an MSOA area code (England and Wales) or an Intermediate Zone (IZ) 2001 code (Scotland), return a list of all the surrounding retail points whose probability of being visited by the MSOA_IZ is greater than or equal to the threshold. Retail ids are from ????
- Parameters
dfRetailPointsPopulation (pandas.DataFrame) – the population in an MSOA IZ zone who might use retail (everyone) and the amounts available for retail spending (although that is not used in RAMP)
dfRetailPointsZones (pandas.DataFrame) – destinations (the locations of shops)
retailpoints_probSij (numpy.matrix) – matrix of probability scores of a retail points being visited
msoa_iz (str) – An MSOA code (England/Wales e.g. E02000001) or an IZ2001 code (Scotland e.g. S02000001)
threshold (float) – Probability threshold e.g. 0.5 means return all possible retail points with probability>=0.5
- Returns
a list of probabilities in the same order as the venues
- Return type
list
- static getProbableSecondarySchoolsByMSOAIZ(dfSecondaryPopulation, dfSecondaryZones, secondary_probPij, msoa_iz, threshold)
Given an MSOA area code (England and Wales) or an Intermediate Zone (IZ) 2001 code (Scotland), return a list of all the surrounding secondary schools whose probability of being visited by the MSOA_IZ is greater than or equal to the threshold. School ids are taken from the Edubase list of URN. NOTE: code identical to the primary school version, only with switched lookup tables
- Parameters
dfSecondaryPopulation (pandas.DataFrame) – the population in an MSOA IZ zone who go to secondary school
dfSecondaryZones (pandas.DataFrame) – the points representing the schools (location and ID code) from the schools database dump for the UK
secondary_probPij (numpy.matrix) – matrix of probability scores of a secondary school being visited
msoa_iz (str) – An MSOA code (England/Wales e.g. E02000001) or an IZ2001 code (Scotland e.g. S02000001)
threshold (float) – Probability threshold e.g. 0.5 means return all possible schools with probability>=0.5
- Returns
a list of probabilities in the same order as the venues
- Return type
list
- classmethod get_flows(venue, msoa_list, threshold, thresholdtype)
Wrapper function that generates flow data for a given venue type, MSOA list, and specified threshold.
- Parameters
venue (str) – venue type from ColumnNames.Activities class to generate flow data for
msoa_list (list) – list of MSOAs to calculate flow data for
threshold (float) – Probability threshold e.g. 0.5 means return all possible hospital points with probability>=0.5
thresholdtype (str) – the threshold type setting, can either be prob or nr
- Returns
dataframe of flow data
- Return type
pandas.DataFrame
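The two thresholdtype settings can be read as two ways of pruning destinations. The sketch below shows one plausible interpretation: 'prob' keeps venues at or above a probability, and 'nr' keeps a fixed number of the most probable venues. The meaning of 'nr' is an assumption, and apply_threshold is a hypothetical helper, not a QuantRampAPI method.

```python
from typing import List, Tuple


def apply_threshold(
    probs: List[float], threshold: float, thresholdtype: str
) -> List[Tuple[int, float]]:
    """Return (venue index, probability) pairs that survive the threshold."""
    indexed = list(enumerate(probs))
    if thresholdtype == "prob":
        # Keep venues whose probability reaches the threshold.
        return [(i, p) for i, p in indexed if p >= threshold]
    if thresholdtype == "nr":
        # Assumed meaning: keep the `threshold` most probable venues.
        ranked = sorted(indexed, key=lambda pair: pair[1], reverse=True)
        return ranked[: int(threshold)]
    raise ValueError(f"Unknown thresholdtype: {thresholdtype!r}")


probs = [0.1, 0.6, 0.3]
by_prob = apply_threshold(probs, 0.25, "prob")
by_nr = apply_threshold(probs, 1, "nr")
```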
- classmethod read_data(QUANT_DIR)
Reads in all data in the provided data directory and creates a series of class object attributes
- Parameters
QUANT_DIR (str) – a string of the full path to QUANT files
microsim.r_interface module
- class microsim.r_interface.RInterface(script_dir)
Bases:
object
An RInterface object can be used to create an R session, initialise everything that is needed for the disease status estimation function, and then interact with session to calculate the disease status.
- calculate_disease_status(individuals, iteration, repnr, disease_params)
Call the R ‘run_status’ function to calculate the new disease status. It will return a new dataframe with a few columns, including the new status.
- Parameters
individuals (DataFrame) – The individuals dataframe from which new statuses need to be calculated
iteration (int) – The iteration number (i.e. number of model steps so far)
repnr (int) – The repetition number of the model. Like a unique ID for the model.
disease_params (dict) – A dictionary of disease parameters used in the R model.
- Returns
a new dataframe that includes new disease statuses
microsim.utilities module
- class microsim.utilities.Optimise
Bases:
object
Functions to optimise the memory use of pandas dataframes. From https://medium.com/bigdatarepublic/advanced-pandas-optimize-speed-and-memory-a654b53be6c2
- static optimize(df, datetime_features=[])
- microsim.utilities.check_durations_sum_to_1(individuals, activities)
- microsim.utilities.data_setup(url='https://ramp0storage.blob.core.windows.net/rampdata/devon_data.tar.gz')
A wrapper function for downloading and unpacking Azure stored devon_data
- Args:
archive (str): A string directory path to archive file using
url (str, optional): A url to an archive file. Defaults to “https://ramp0storage.blob.core.windows.net/rampdata/devon_data.tar.gz”.
- microsim.utilities.download_data(url)
Download data utility function
- Args:
url (str, optional): A url to an archive file. Defaults to “https://ramp0storage.blob.core.windows.net/rampdata/devon_data.tar.gz”.
- microsim.utilities.unpack_data(archive)
Unpack a tar data archive
- Args:
archive (str): A string directory path to archive file using
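Unpacking a .tar.gz archive can be sketched with the standard library alone. This is not the project's unpack_data implementation: unpack_archive, the file names and the destination argument are hypothetical, and the archive is built locally so that nothing is downloaded.

```python
import tarfile
import tempfile
from pathlib import Path


def unpack_archive(archive: str, destination: str) -> None:
    """Extract a (possibly gzip-compressed) tar archive into destination."""
    with tarfile.open(archive, "r:*") as tar:
        tar.extractall(destination)


# Build a tiny archive in a temporary directory so the sketch runs offline.
workdir = Path(tempfile.mkdtemp())
(workdir / "example.csv").write_text("msoa,value\n")
archive_path = workdir / "example_data.tar.gz"
with tarfile.open(archive_path, "w:gz") as tar:
    tar.add(workdir / "example.csv", arcname="example_data/example.csv")

unpack_archive(str(archive_path), str(workdir / "unpacked"))
extracted = workdir / "unpacked" / "example_data" / "example.csv"
```

When extracting archives from an untrusted source, recent Python versions also accept a filter argument to extractall that rejects unsafe member paths.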