Added Jason
tbetcke committed Oct 18, 2024
1 parent ce137d9 commit e77cd58
Showing 2 changed files with 25 additions and 1 deletion.
2 changes: 1 addition & 1 deletion phd_projects/entries/Kovalchuk_healthcare.qmd
@@ -11,7 +11,7 @@ advisor: "Dr. Yevgeniya Kovalchuk"

### Existing background work

Graphs have become a useful tool for representing information in many application domains. Social, computer, sensor and transport networks; molecular structures and business processes – all can be represented as attributed graphs. One of the characteristics of such graphs is dynamism – the graph structure, as well as the attributes of nodes and edges can change over time. The accuracy of predictive and inference models built over dynamic graphs depends on the ability of the models to adapt to these changes. This project will propose novel methods for detecting changes in graphs over time (also known as drifts) and demonstrate their usefulness in downstream machine learning and process mining tasks performed over dynamic graphs. The work will build upon the methods recently proposed by the principal supervisor based on graphs and deep learning for process mining [https://doi.org/10.1109/ACCESS.2020.3025999 (https://doi.org/10.1109/ACCESS.2020.3025999) and drift detection in business processes [https://doi.org/10.3390/e24070910](https://doi.org/10.3390/e24070910). The PhD student will take this previous work as a basis to both advance the theoretical approach to drift detection in graph streams and demonstrate its generalizability by applying to a new domain, namely, discovering disease trajectories. Disease trajectories represented as graphs can help predict disease progression, risk of developing comorbid disorders and patient outcomes more accurately [Google Scholar]. Existing solutions for discovering disease trajectories are based on statistical analysis [https://doi.org/10.1038/ncomms5022](https://doi.org/10.1038/ncomms5022) and knowledge graphs [https://doi.org/10.1186/s13326-020-00228-8](https://doi.org/10.1186/s13326-020-00228-8), thus computationally expensive and not scalable. Furthermore, these solutions are not capable of adapting to changes over time (e.g. 
changes in disease progression due to events such as the coronavirus pandemic or introducing a new drug/vaccination). Finally, there is currently no solution exists based on the UK population data. The methods built in this project will be applied to Hospital Episode Statistics (HES) data, thus revealing the healthcare picture of the entire UK population.
Graphs have become a useful tool for representing information in many application domains. Social, computer, sensor and transport networks; molecular structures and business processes – all can be represented as attributed graphs. One of the characteristics of such graphs is dynamism – the graph structure, as well as the attributes of nodes and edges can change over time. The accuracy of predictive and inference models built over dynamic graphs depends on the ability of the models to adapt to these changes. This project will propose novel methods for detecting changes in graphs over time (also known as drifts) and demonstrate their usefulness in downstream machine learning and process mining tasks performed over dynamic graphs. The work will build upon the methods recently proposed by the principal supervisor based on graphs and deep learning for process mining [https://doi.org/10.1109/ACCESS.2020.3025999](https://doi.org/10.1109/ACCESS.2020.3025999) and drift detection in business processes [https://doi.org/10.3390/e24070910](https://doi.org/10.3390/e24070910). The PhD student will take this previous work as a basis to both advance the theoretical approach to drift detection in graph streams and demonstrate its generalizability by applying to a new domain, namely, discovering disease trajectories. Disease trajectories represented as graphs can help predict disease progression, risk of developing comorbid disorders and patient outcomes more accurately [Kusuma et al](https://eprints.whiterose.ac.uk/158247/15/HEALTHINF_2020_108.pdf). Existing solutions for discovering disease trajectories are based on statistical analysis [https://doi.org/10.1038/ncomms5022](https://doi.org/10.1038/ncomms5022) and knowledge graphs [https://doi.org/10.1186/s13326-020-00228-8](https://doi.org/10.1186/s13326-020-00228-8), thus computationally expensive and not scalable. Furthermore, these solutions are not capable of adapting to changes over time (e.g. 
changes in disease progression due to events such as the coronavirus pandemic or introducing a new drug/vaccination). Finally, there is currently no solution exists based on the UK population data. The methods built in this project will be applied to Hospital Episode Statistics (HES) data, thus revealing the healthcare picture of the entire UK population.


### Main objectives of the project
24 changes: 24 additions & 0 deletions phd_projects/entries/mcewen_probabilistic.qmd
@@ -0,0 +1,24 @@
---
title: "Differentiable probabilistic deep learning with generative denoising diffusion models"
institution: "UCL"
department: "MSSL"
author: "Laura Beer"
date: "10/18/2024"
advisor: "Prof Jason McEwen"
---

## Project Description

### Existing background work

Generative AI models for images, such as denoising diffusion models (e.g. Stable Diffusion), have recently demonstrated remarkable performance (Rombach et al. 2022; [https://arxiv.org/abs/2112.10752](https://arxiv.org/abs/2112.10752)). Such generative models can be adapted to solve scientific inverse problems, such as recovering maps of the dark matter of the Universe. However, current approaches typically recover a single prediction, e.g. a single image. For robust scientific studies, single estimates are not sufficient: a principled statistical assessment is critical in order to quantify uncertainties. Embedding denoising diffusion models in a principled statistical framework for solving inverse problems remains a topical open problem in the field. A number of approximate solutions have been proposed (e.g. Chung et al. 2023; [https://arxiv.org/abs/2209.14687](https://arxiv.org/abs/2209.14687)).

McEwen and collaborators have recently developed the proximal nested sampling framework (Cai et al. 2022; [https://arxiv.org/abs/2106.03646](https://arxiv.org/abs/2106.03646)) for principled statistical inference for high-dimensional inverse imaging problems with convex likelihoods (initial code available at [https://github.com/astro-informatics/proxnest](https://github.com/astro-informatics/proxnest)). Not only is the correct underlying posterior distribution targeted but the framework also supports computation of the marginal likelihood for principled Bayesian model comparison. Recently, the framework has been extended to support deep learned data-driven priors based on simple denoisers (McEwen et al. 2023; [https://arxiv.org/abs/2307.00056](https://arxiv.org/abs/2307.00056)), although not denoising diffusion models.
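
To make the quantities mentioned above concrete, the following is a brief sketch of the standard Bayesian definitions involved. The notation (data $y$, image $x$, model $M$, likelihood $\mathcal{L}$, prior $\pi$, evidence $z$) is assumed here for illustration and is not taken from the cited papers.

$$
p(x \mid y, M) = \frac{\mathcal{L}(x)\,\pi(x)}{z},
\qquad
z = p(y \mid M) = \int \mathcal{L}(x)\,\pi(x)\,\mathrm{d}x,
$$

where $\mathcal{L}(x) = p(y \mid x, M)$ is the likelihood and $\pi(x) = p(x \mid M)$ is the prior. Nested sampling in general recasts the marginal likelihood as a one-dimensional integral over the enclosed prior volume $\xi$:

$$
z = \int_0^1 \mathcal{L}(\xi)\,\mathrm{d}\xi,
\qquad
\xi(L^*) = \int_{\mathcal{L}(x) > L^*} \pi(x)\,\mathrm{d}x,
$$

where $\mathcal{L}(\xi)$ denotes the likelihood level that encloses prior volume $\xi$. Two candidate models $M_1$ and $M_2$ (for example, two different data-driven priors) can then be compared through the ratio $z_1 / z_2$, assuming equal prior model probabilities.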

### Main objectives of the project

In this project we will develop a principled statistical framework to sample the posterior distribution of scientific inverse imaging problems that integrates the generative power of denoising diffusion models. This will be achieved by integrating denoising diffusion models into the proximal nested sampling framework. The resulting framework is expected to deliver superior reconstruction performance due to the power of generative diffusion models, to target the correct underlying posterior distribution, and to allow for Bayesian model comparison to assess different data-driven priors. The framework will be extended beyond convex likelihoods to handle general non-linear models by leveraging automatic differentiation and gradient-based likelihood constraints. Automatic differentiation will also be exploited to accelerate inference. While the focus will be mostly on theoretical, methodological and code developments, the methods developed will be demonstrated on a number of inverse imaging problems in a range of fields.

### Details of Software/Data deliverables

The main deliverable will be an open-source code implementing the framework developed. Development will involve differentiable programming, generative denoising diffusion models, and Markov chain Monte Carlo (MCMC) techniques. A number of articles will be prepared as the research progresses, targeting the main deep learning venues (e.g. ICLR, ICML, NeurIPS).
