Merge branch 'master' into molaro/harvest-training-data
Merge master
marghe-molaro committed Oct 18, 2024
2 parents 7232f97 + 8d088f0 commit 98a8832
Showing 16 changed files with 645 additions and 62 deletions.
4 changes: 3 additions & 1 deletion .github/workflows/tests.yml
@@ -10,6 +10,8 @@ on:
- requirements/**
- resources/**
- src/tlo/**
- src/scripts/profiling/scale_run.py
- src/scripts/profiling/shared.py
- tests/**
- pyproject.toml
- tox.ini
@@ -83,4 +85,4 @@ jobs:
path: ${{ matrix.file }}.results.xml
summary: true
display-options: fEX
title: Results for ${{ matrix.file }}
title: Results for ${{ matrix.file }}
1 change: 1 addition & 0 deletions CITATION.cff
@@ -113,6 +113,7 @@ authors:
family-names: Janoušková
orcid: https://orcid.org/0000-0002-4104-0119
affiliation: University College London
website: https://profiles.ucl.ac.uk/90260
- given-names: Rachel
family-names: Murray-Watson
affiliation: Imperial College London
4 changes: 2 additions & 2 deletions README.md
@@ -1,5 +1,5 @@
<div style="text-align: center" align="center">
<img src="docs/thanzi-la-onse.png" alt="Thanzi La Onze" />
<img src="docs/thanzi-la-onse.png" alt="Thanzi la Onse" />
<br />
<h1>Thanzi la Onse model</h1>
</div>
@@ -24,7 +24,7 @@ The __Thanzi la Onse model (TLOmodel)__ is a part of the [Thanzi la Onse][thanzi
TLOmodel is developed in a collaboration between:

- [Kamuzu University of Health Sciences][kuhes-link]
- [MRC Centre for Global Infectioous Disease Analysis][mrc-gida-link], [Imperial College London][imperial-link]
- [MRC Centre for Global Infectious Disease Analysis][mrc-gida-link], [Imperial College London][imperial-link]
- [Institute for Global Health][igh-link], [University College London][ucl-link]
- [Centre for Advanced Research Computing][arc-link], [University College London][ucl-link]
- [Centre for Health Economics][che-link], [University of York][york-link]
1 change: 1 addition & 0 deletions contributors.yaml
@@ -195,6 +195,7 @@
family-names: Janoušková
orcid: "https://orcid.org/0000-0002-4104-0119"
affiliation: "University College London"
website: "https://profiles.ucl.ac.uk/90260"
github-username: EvaJanouskova
contributions:
- Epidemiology and modelling
7 changes: 6 additions & 1 deletion docs/publications.rst
@@ -14,21 +14,26 @@ Overview of the Model

Analyses Using The Model
========================
* `Health workforce needs in Malawi: analysis of the Thanzi La Onse integrated epidemiological model of care <https://human-resources-health.biomedcentral.com/articles/10.1186/s12960-024-00949-2>`_

* `A new approach to Health Benefits Package design: an application of the Thanzi La Onse model in Malawi <https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1012462>`_

* `The Changes in Health Service Utilisation in Malawi During the COVID-19 Pandemic <https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0290823>`_

* `Modeling Contraception and Pregnancy in Malawi: A Thanzi La Onse Mathematical Modeling Study <https://onlinelibrary.wiley.com/doi/10.1111/sifp.12255>`_

* `Factors Associated with Consumable Stock-Outs in Malawi: Evidence from a Facility Census <https://www.sciencedirect.com/science/article/pii/S2214109X24000950>`_

* `The Effects of Health System Frailties on the Projected Impact of the HIV and TB Programmes in Malawi <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4508436>`_
* `The Effects of Health System Frailties on the Projected Impact of the HIV and TB Programmes in Malawi <https://www.sciencedirect.com/science/article/pii/S2214109X24002596>`_

* `Estimating the health burden of road traffic injuries in Malawi using an individual-based model <https://injepijournal.biomedcentral.com/articles/10.1186/s40621-022-00386-6>`_

* `The potential impact of intervention strategies on COVID-19 transmission in Malawi: A mathematical modelling study. <https://bmjopen.bmj.com/content/11/7/e045196>`_

* `The potential impact of including pre-school aged children in the praziquantel mass-drug administration programmes on the S.haematobium infections in Malawi: a modelling study <https://www.medrxiv.org/content/10.1101/2020.12.09.20246652v1>`_

* `A Decade of Progress in HIV, Malaria, and Tuberculosis Initiatives in Malawi. <https://www.medrxiv.org/content/10.1101/2024.10.08.24315077v1>`_


Healthcare Seeking Behaviour
============================
82 changes: 82 additions & 0 deletions src/scripts/dependencies/tlo_module_graph.py
@@ -0,0 +1,82 @@
"""Construct a graph showing dependencies between modules."""

import argparse
from pathlib import Path
from typing import Dict, Set

from tlo.dependencies import DependencyGetter, get_all_dependencies, get_module_class_map
from tlo.methods import Metadata

try:
    import pydot
except ImportError:
    pydot = None


def construct_module_dependency_graph(
    excluded_modules: Set[str],
    disease_module_node_defaults: Dict,
    other_module_node_defaults: Dict,
    get_dependencies: DependencyGetter = get_all_dependencies,
):
    """Construct a pydot object representing module dependency graph.
    :param excluded_modules: Set of ``Module`` subclass names to exclude from the graph.
    :param disease_module_node_defaults: Any dot node attributes to apply by default
        to disease module nodes.
    :param other_module_node_defaults: Any dot node attributes to apply by default
        to non-disease module nodes.
    :param get_dependencies: Function which given a module gets the set of module
        dependencies. Defaults to extracting all dependencies.
    :return: Pydot directed graph representing module dependencies.
    """
    if pydot is None:
        raise RuntimeError("pydot package must be installed")
    module_class_map = get_module_class_map(excluded_modules)
    module_graph = pydot.Dot("modules", graph_type="digraph")
    disease_module_subgraph = pydot.Subgraph("disease_modules")
    module_graph.add_subgraph(disease_module_subgraph)
    other_module_subgraph = pydot.Subgraph("other_modules")
    module_graph.add_subgraph(other_module_subgraph)
    disease_module_subgraph.set_node_defaults(**disease_module_node_defaults)
    other_module_subgraph.set_node_defaults(**other_module_node_defaults)
    for name, module_class in module_class_map.items():
        node = pydot.Node(name)
        if Metadata.DISEASE_MODULE in module_class.METADATA:
            disease_module_subgraph.add_node(node)
        else:
            other_module_subgraph.add_node(node)
    for key, module in module_class_map.items():
        for dependency in get_dependencies(module, module_class_map.keys()):
            if dependency not in excluded_modules:
                module_graph.add_edge(pydot.Edge(key, dependency))
    return module_graph


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        "output_file", type=Path, help=(
            "Path to output graph to. File extension will determine output format - for example: dot, dia, png, svg"
        )
    )
    args = parser.parse_args()
    excluded_modules = {
        "Mockitis",
        "ChronicSyndrome",
        "Skeleton",
        "AlriPropertiesOfOtherModules",
        "DiarrhoeaPropertiesOfOtherModules",
        "DummyHivModule",
        "SimplifiedBirths",
        "Tb",
    }
    module_graph = construct_module_dependency_graph(
        excluded_modules,
        disease_module_node_defaults={"fontname": "Arial", "shape": "box"},
        other_module_node_defaults={"fontname": "Arial", "shape": "ellipse"},
    )
    format = (
        args.output_file.suffix[1:] if args.output_file.suffix else "raw"
    )
    module_graph.write(args.output_file, format=format)
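
Reviewer note: the edge-building and output-format logic in the script above can be tried out without `tlo` or `pydot` installed. The following is a minimal standard-library sketch, not the script's implementation: the helpers `dependency_edges`, `to_dot` and `output_format` and the toy dependency mapping are hypothetical names introduced only for illustration, and DOT text is written by hand where the script uses `pydot`.

```python
from pathlib import Path
from typing import Dict, List, Set, Tuple


def dependency_edges(
    dependencies: Dict[str, Set[str]], excluded: Set[str]
) -> List[Tuple[str, str]]:
    """Return sorted (module, dependency) pairs, skipping excluded names."""
    return sorted(
        (module, dependency)
        for module, module_dependencies in dependencies.items()
        if module not in excluded
        for dependency in module_dependencies
        if dependency not in excluded
    )


def to_dot(edges: List[Tuple[str, str]]) -> str:
    """Render the edges as a DOT digraph (hand-written text, not pydot)."""
    lines = ["digraph modules {"]
    lines += [f'    "{source}" -> "{target}";' for source, target in edges]
    lines.append("}")
    return "\n".join(lines)


def output_format(path: Path) -> str:
    """Mirror the script's rule: use the file suffix, falling back to 'raw'."""
    return path.suffix[1:] if path.suffix else "raw"


# Toy mapping for illustration only; these are not the real TLO dependencies.
toy_dependencies = {
    "Hiv": {"Demography", "Tb"},
    "Tb": {"Demography"},
    "Demography": set(),
}
edges = dependency_edges(toy_dependencies, excluded={"Tb"})
print(to_dot(edges))
print(output_format(Path("graph.svg")), output_format(Path("graph")))
```

Excluding a module drops both its node and any edge pointing at it, which is why the script checks `dependency not in excluded_modules` even though excluded modules are already absent from the class map.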
14 changes: 12 additions & 2 deletions src/scripts/profiling/run_profiling.py
@@ -12,6 +12,7 @@
from pyinstrument.renderers import ConsoleRenderer, HTMLRenderer
from pyinstrument.session import Session
from scale_run import save_arguments_to_json, scale_run
from shared import memory_statistics

try:
from ansi2html import Ansi2HTMLConverter
@@ -168,6 +169,8 @@ def record_run_statistics(
**profiling_session_statistics(profiling_session),
# Disk input/output statistics
**disk_statistics(disk_usage),
# Process memory statistics
**memory_statistics(),
# Statistics from end end-state of the simulation
**simulation_statistics(completed_sim),
# User-defined additional stats (if any)
@@ -222,7 +225,7 @@ def run_profiling(
"initial_population": initial_population,
"log_filename": "scale_run_profiling",
"log_level": "WARNING",
"parse_log_file": False,
"parse_log_file": True,
"show_progress_bar": show_progress_bar,
"seed": 0,
"disable_health_system": False,
@@ -245,7 +248,7 @@ def run_profiling(

# Profile scale_run
disk_at_start = disk_io_counters()
completed_simulation = scale_run(
completed_simulation, logs_dict = scale_run(
**scale_run_args, output_dir=output_dir, profiler=profiler
)
disk_at_end = disk_io_counters()
@@ -323,6 +326,13 @@ def run_profiling(
additional_stats=additional_stats,
)
print("done")

# Write out logged profiling statistics
logged_statistics_file = output_dir / f"{output_name}.logged-stats.csv"
print(f"Writing {logged_statistics_file}", end="...", flush=True)
logs_dict["tlo.profiling"]["stats"].to_csv(logged_statistics_file, index=False)
print("done")



if __name__ == "__main__":
31 changes: 18 additions & 13 deletions src/scripts/profiling/scale_run.py
@@ -13,6 +13,7 @@
from shared import print_checksum, schedule_profile_log

from tlo import Date, Simulation, logging
from tlo.analysis.utils import LogsDict
from tlo.analysis.utils import parse_log_file as parse_log_file_fn
from tlo.methods.fullmodel import fullmodel

@@ -55,24 +56,25 @@ def scale_run(
    ignore_warnings: bool = False,
    log_final_population_checksum: bool = True,
    profiler: Optional["Profiler"] = None,
) -> Simulation:
) -> Simulation | tuple[Simulation, LogsDict]:
    if ignore_warnings:
        warnings.filterwarnings("ignore")

    # Start profiler if one has been passed
    if profiler is not None:
        profiler.start()

    # Simulation period
    start_date = Date(2010, 1, 1)
    end_date = start_date + pd.DateOffset(years=years, months=months)

    log_config = {
        "filename": log_filename,
        "directory": output_dir,
        "custom_levels": {"*": getattr(logging, log_level)},
        # Ensure tlo.profiling log records always recorded
        "custom_levels": {"*": getattr(logging, log_level), "tlo.profiling": logging.INFO},
        "suppress_stdout": disable_log_output_to_stdout,
    }

    # Start profiler if one has been passed
    if profiler is not None:
        profiler.start()

    sim = Simulation(
        start_date=start_date,
@@ -102,17 +104,19 @@ def scale_run(

    # Run the simulation
    sim.make_initial_population(n=initial_population)
    schedule_profile_log(sim)
    schedule_profile_log(sim, frequency_months=1)
    sim.simulate(end_date=end_date)

    # Stop profiling session
    if profiler is not None:
        profiler.stop()

    if log_final_population_checksum:
        print_checksum(sim)

    if save_final_population:
        sim.population.props.to_pickle(output_dir / "final_population.pkl")

    if parse_log_file:
        parse_log_file_fn(sim.log_filepath)

    if record_hsi_event_details:
        with open(output_dir / "hsi_event_details.json", "w") as json_file:
            json.dump(
@@ -124,10 +128,11 @@ def scale_run(
                ],
                json_file,
            )

    if parse_log_file:
        logs_dict = parse_log_file_fn(sim.log_filepath)
        return sim, logs_dict

    # Stop profiling session
    if profiler is not None:
        profiler.stop()
    return sim
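
Reviewer note: after this change `scale_run` returns either a bare simulation or a `(simulation, logs)` pair, depending on `parse_log_file`, so callers that do not control that flag need to handle both shapes. A minimal sketch of one way to do that is below. `FakeSimulation`, `fake_scale_run` and `unpack_result` are hypothetical stand-ins written for illustration, not part of `tlo`, and `LogsDict` is approximated here as a plain dictionary.

```python
from typing import Any, Dict, Optional, Tuple, Union


class FakeSimulation:
    """Hypothetical stand-in for tlo.Simulation, for illustration only."""


# Assumption: a plain dict is enough to stand in for tlo's LogsDict here.
LogsDict = Dict[str, Any]


def fake_scale_run(
    parse_log_file: bool,
) -> Union[FakeSimulation, Tuple[FakeSimulation, LogsDict]]:
    """Mimic the two return shapes of scale_run in the diff above."""
    sim = FakeSimulation()
    if parse_log_file:
        return sim, {"tlo.profiling": {"stats": []}}
    return sim


def unpack_result(
    result: Union[FakeSimulation, Tuple[FakeSimulation, LogsDict]],
) -> Tuple[FakeSimulation, Optional[LogsDict]]:
    """Normalise both return shapes to a (simulation, logs or None) pair."""
    if isinstance(result, tuple):
        sim, logs = result
        return sim, logs
    return result, None


sim_with_logs, logs = unpack_result(fake_scale_run(parse_log_file=True))
sim_only, no_logs = unpack_result(fake_scale_run(parse_log_file=False))
print(logs is not None, no_logs is None)
```

`run_profiling.py` sidesteps this by always passing `"parse_log_file": True` and unpacking two values directly. Any other caller that passes `parse_log_file=True` and expects a bare `Simulation` would now receive a tuple instead.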


46 changes: 39 additions & 7 deletions src/scripts/profiling/shared.py
@@ -4,6 +4,11 @@

import pandas as pd

try:
    import psutil
except ImportError:
    psutil = None

from tlo import DateOffset, Simulation, logging
from tlo.events import PopulationScopeEventMixin, RegularEvent
from tlo.util import hash_dataframe
@@ -12,9 +17,34 @@
logger.setLevel(logging.INFO)


def memory_statistics() -> dict[str, float]:
    """
    Extract memory usage statistics in current process using `psutil` if available.
    Statistics are returned as a dictionary. If `psutil` not installed an empty dict is returned.
    Key / value pairs are:
    memory_rss_MiB: float
        Resident set size in mebibytes. The non-swapped physical memory the process has used.
    memory_vms_MiB: float
        Virtual memory size in mebibytes. The total amount of virtual memory used by the process.
    memory_uss_MiB: float
        Unique set size in mebibytes. The memory which is unique to a process and which would be freed if the process
        was terminated right now
    """
    if psutil is None:
        return {}
    process = psutil.Process()
    memory_info = process.memory_full_info()
    return {
        "memory_rss_MiB": memory_info.rss / 2**20,
        "memory_vms_MiB": memory_info.vms / 2**20,
        "memory_uss_MiB": memory_info.uss / 2**20,
    }


class LogProgress(RegularEvent, PopulationScopeEventMixin):
    def __init__(self, module):
        super().__init__(module, frequency=DateOffset(months=3))
    def __init__(self, module, frequency_months=3):
        super().__init__(module, frequency=DateOffset(months=frequency_months))
        self.time = time.time()

def apply(self, population):
Expand All @@ -26,16 +56,18 @@ def apply(self, population):
key="stats",
data={
"time": datetime.datetime.now().isoformat(),
"duration": duration,
"alive": df.is_alive.sum(),
"total": len(df),
"duration_minutes": duration,
"pop_df_number_alive": df.is_alive.sum(),
"pop_df_rows": len(df),
"pop_df_mem_MiB": df.memory_usage(index=True, deep=True).sum() / 2**20,
**memory_statistics(),
},
)


def schedule_profile_log(sim: Simulation) -> None:
def schedule_profile_log(sim: Simulation, frequency_months: int = 3) -> None:
"""Schedules the log progress event, used only for profiling"""
sim.schedule_event(LogProgress(sim.modules["Demography"]), sim.start_date)
sim.schedule_event(LogProgress(sim.modules["Demography"], frequency_months), sim.start_date)


def print_checksum(sim: Simulation) -> None:
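
Reviewer note: the pattern used by `memory_statistics` above is an optional `psutil` import, a byte-to-MiB conversion, and an empty dictionary that merges harmlessly into a larger record. It can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the committed function: `bytes_to_mib` and `basic_memory_statistics` are hypothetical names, and the sketch reads only `rss` and `vms` from `Process.memory_info()` rather than the fuller `memory_full_info()` used in the diff.

```python
try:
    import psutil  # optional third-party dependency
except ImportError:
    psutil = None

BYTES_PER_MIB = 2**20


def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return n_bytes / BYTES_PER_MIB


def basic_memory_statistics() -> dict:
    """Return RSS and VMS in MiB, or an empty dict if psutil is missing."""
    if psutil is None:
        return {}
    memory_info = psutil.Process().memory_info()
    return {
        "memory_rss_MiB": bytes_to_mib(memory_info.rss),
        "memory_vms_MiB": bytes_to_mib(memory_info.vms),
    }


# Unpacking an empty dict adds no keys, so the record stays valid either way.
record = {"duration_minutes": 1.5, **basic_memory_statistics()}
print(sorted(record))
```

One consequence of the empty-dict fallback is that the columns of the logged statistics depend on whether `psutil` is installed in the environment running the profile, which is worth keeping in mind when comparing outputs across machines.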