\chapter{Pseudo-Bayesian DIP Denoising as a Preprocessing Step for Kinetic Modelling in Dynamic PET} \label{sec:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix}
\newpage
\section{Abstract} \label{sec:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_abstract}
Noise (among other artefacts) is the bane of \gls{PET}: it often causes what would otherwise be a simple problem to explode in complexity. Many methods have been proposed to alleviate the worst effects of noise; however, few take into account the temporal nature of dynamically acquired \gls{PET}. Here, we propose an adaptation of a method that has received increasing attention in the wider image denoising literature. \gls{DIP} exploits the architecture and initialisation of a carefully designed \gls{NN}, treating it as a bank of custom filters that are trained and used afresh, independently, on each new image. \gls{DIP} has been adapted to \gls{PET} previously (including dynamic \gls{PET}); however, many of these adaptations do not take into account the large memory requirements of the method. Additionally, most previous work does not address the main weakness of \gls{DIP}: its stopping criterion. Here, we propose a method that is memory efficient and includes a smoothing regularisation. In addition, we provide uncertainty estimates by incorporating a Bayesian approximation (using dropout), and prototype a training scheme in which the model is fit on all data simultaneously. The denoised images are then used as input for kinetic modelling. To evaluate the method, dynamic \gls{XCAT} simulations were produced with a \gls{FOV} covering the lung and liver. The results of the new methods (along with \gls{TV} and the original \gls{DIP}) were compared by visual analysis, \gls{SSIM}, and $K_i$ values. Results indicate that the new methods potentially outperform the previous methods, without increasing computation time, while reducing system requirements.
\section{Introduction} \label{sec:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_introduction}
Most machine learning or \gls{NN} based methods rely upon a workflow in which a model is designed, trained, validated, and deployed~\parencite{Krose1993AnNetworks}. This is logical, as the intention with \gls{NN} based methods is to treat them as general function approximators~\parencite{Krose1993AnNetworks}, where the function to be approximated is determined by the relationship between the data used and the objective or loss function. However, in the domain of image denoising, the \gls{DIP} method has received attention because training and inference are performed independently on each new image~\parencite{Ulyanov2020DeepPrior}. The \gls{DIP} method can be thought of as a custom learnt bank of filters for each input. To prevent overfitting, careful choice of the number of iterations is imperative. The authors of the original \gls{DIP} paper argue that the method can be used to solve many inverse problems~\parencite{Ulyanov2020DeepPrior}.
For \gls{PET}, there have been a number of adaptations of \gls{DIP}. \textcite{Gong2019PETPrior} used a U-Net~\parencite{Weng2021INet:Segmentation} with relatively high count, low motion brain scans. In~\parencite{Hashimoto20214DNetwork}, \gls{DIP} is extended to \gls{4D} dynamic \gls{PET}; to do this, multiple output branches are grafted onto the \gls{NN}, one for each dynamic time point. \textcite{Hashimoto2019DynamicDatasets} use the original (static) \gls{PET} acquisition, rather than noise, as input to the \gls{NN}. \textcite{Yang2022SimultaneousPrior} represent a more recent extension, in which multiple \glspl{NN} are used simultaneously.
This work seeks to both extend and simplify previous work, in order to denoise \gls{4D} dynamic \gls{PET} data.
\section{Methods} \label{sec:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods}
\subsection{Network Design and Execution} \label{sec:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution}
\begin{figure}
\centering
\includegraphics[width=1.0\linewidth]{figures/background_dropout.png}
\captionsetup{singlelinecheck=false}
\caption{
Graphical representation of dropout. On the left of this figure, a \gls{NN} experiencing a large degree of dropout can be seen; nodes which have been dropped are marked with a cross, and their incoming and outgoing connections are removed. On the right of this figure, the same network can be seen without dropout, with all nodes active and fully connected.
}
\label{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution_dropout}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=0.7\linewidth]{figures/background_activation_functions.png}
\captionsetup{singlelinecheck=false}
\caption{
Graphical representation of a number of activation functions. The sigmoid function is shown in the top left of this figure, the tanh activation function in the top right, the \gls{ReLU} activation in the middle left, the \gls{ELU} activation in the middle right, and the leaky \gls{ReLU} in the bottom right.
}
\label{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution_activation_functions}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=1.0\linewidth]{figures/background_selu.png}
\captionsetup{singlelinecheck=false}
\caption{
Graphical representation of the \gls{SELU} activation function. This figure shows a scaled version of the \gls{ELU} activation function shown in~\Fref{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution_activation_functions}. The same advantages and disadvantages as for \gls{ELU} apply here. However, as an additional advantage, if the input data to this activation are normalised then the output of the activation will also be normalised.
}
\label{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution_selu}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=1.0\linewidth]{figures/background_cnn.png}
\captionsetup{singlelinecheck=false}
\caption{
Graphical representation of the process of convolution. This figure shows a $3\times3$ convolution kernel operating on a padded image (bottom of the figure) to create the new image (top of the figure). For a \gls{CNN}, each value in the convolution kernel is determined using weights and biases, as in a fully connected \gls{NN}. However, rather than there being a weight and bias for every element of the input data, the weights are shared: there is one weight per kernel element and a single bias per kernel. The number of parameters is therefore significantly reduced, and if the \gls{NN} is fully convolutional then the input size does not have to be fixed.
}
\label{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution_cnn}
\end{figure}
Firstly, we reduced \gls{GPU} memory requirements. Because \gls{DIP} requires training at inference, the full amount of memory is needed every time the method is used. To aid clinical adoption, a hard limit of \SI{8.0}{\giga\byte} of \gls{GPU} memory was imposed. Secondly, a more robust stopping criterion is required. The stopping criterion should be flexible, so as to allow the method to correct a wider variety of inputs. One method is to look at a window of previous loss function values and exit when the gradient over this window drops below a tolerance. To aid in achieving this, as well as to address a weakness of the original \gls{DIP}, regularisation must be added to stop the \gls{NN} fitting to noise, similar to~\parencite{Liu2019ImagePrior}. Finally, because the output from this method is to be used in a further kinetic model fitting, it may improve results to have a measure of the uncertainty in the denoised images. We used dropout, as in~\parencite{Gal2016DropoutLearning}, to approximate (more expensive) Bayesian inference. A graphical representation of dropout can be seen in~\Fref{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution_dropout}. Furthermore, this work uses \gls{PET} data with a \gls{FOV} of the lung and liver, whereas most previous work uses a \gls{FOV} of the head.
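As a concrete illustration of the dropout-based uncertainty estimation, a minimal sketch of Monte Carlo dropout inference is given below. It assumes a PyTorch model in which only the dropout layers change behaviour between training and evaluation mode (as is the case here, since group normalisation has no running statistics); the function name is illustrative, not part of this work's code base.
\begin{verbatim}
import torch

def mc_dropout_predict(model, x, n_samples=50):
    """Approximate Bayesian inference via Monte Carlo dropout.

    Dropout is kept active at inference time, and repeated
    stochastic forward passes are aggregated into a mean image
    and a per-voxel standard deviation (the uncertainty map).
    """
    model.train()  # keep (spatial) dropout layers stochastic
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)
\end{verbatim}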
The \gls{NN} used was a modified U-Net~\parencite{Weng2021INet:Segmentation} with seven down/upsampling stages. Each down/upsampling stage consisted of two convolutional layers (with two, four, eight, $16$, $32$, $64$, or $128$ channels, depending on depth), followed by either a strided convolution split with a maxpooling layer (with the results concatenated) or a trilinear upsampling layer. Edge padding, group normalisation~\parencite{Wu2020GroupNormalization}, Mish activation~\parencite{Misra2019Mish:Function}, and spatial dropout were used with every convolutional layer. Examples of some activation functions can be seen in~\Fref{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution_activation_functions}, and a self-regularising activation function, which was considered for this work, can be seen in~\Fref{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution_selu}. A graphical representation of the process of convolution can be seen in~\Fref{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution_cnn}. Data were edge padded to the nearest power of two, and Gaussian noise was summed to the input data. Both input and label data were standardised. The sum of \gls{MSE} and \gls{TV} was used as the loss function, and AdamW~\parencite{Loshchilov2019DecoupledRegularization} was used as the optimiser. Training continued, for all methods, until the gradient of the loss function over a window of previous values fell below a threshold. Parameters were tuned using a grid search.
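A minimal sketch of this loss and stopping criterion follows; the \gls{TV} weight, window length, and tolerance shown are illustrative placeholders, as the values actually used were set by the grid search.
\begin{verbatim}
import numpy as np
import torch
import torch.nn.functional as F

def tv_3d(x):
    """Anisotropic total variation of a (N, C, D, H, W) tensor."""
    return (x.diff(dim=2).abs().mean()
            + x.diff(dim=3).abs().mean()
            + x.diff(dim=4).abs().mean())

def loss_fn(pred, target, tv_weight=1e-4):
    """MSE data term plus TV smoothing regularisation."""
    return F.mse_loss(pred, target) + tv_weight * tv_3d(pred)

def should_stop(losses, window=100, tol=1e-6):
    """Exit when the slope of a linear fit to the most recent
    window of loss values drops below a tolerance."""
    if len(losses) < window:
        return False
    slope = np.polyfit(np.arange(window), losses[-window:], 1)[0]
    return abs(slope) < tol
\end{verbatim}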
Two training regimes were explored: one in which each time point was treated independently (sequential), and another in which the current model weights were saved, updated independently on each time point, and then averaged to give the weights for the next iteration (combined); a sketch of the combined regime is given below.
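This sketch of the combined regime assumes a PyTorch model whose state contains only floating-point tensors (as is the case with group normalisation) and an externally supplied \texttt{fit\_one} routine performing the per-frame optimiser steps; both names are illustrative.
\begin{verbatim}
import copy
import torch

def combined_iteration(model, frames, fit_one):
    """One iteration of the combined regime: from shared weights,
    fit a copy of the model independently on each time frame,
    then average the fitted weights for the next iteration."""
    shared = copy.deepcopy(model.state_dict())
    fitted = []
    for frame in frames:
        m = copy.deepcopy(model)
        m.load_state_dict(shared)
        fit_one(m, frame)  # a few optimiser steps on this frame
        fitted.append(m.state_dict())
    averaged = {k: torch.stack([s[k] for s in fitted]).mean(dim=0)
                for k in shared}
    model.load_state_dict(averaged)
\end{verbatim}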
As an aside, in order to further justify the selection of activation function, the advantages and disadvantages of the activations shown in~\Fref{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution_activation_functions} are summarised here, with definitions given below. The sigmoid function compresses the range of the input to be between zero and one. However, for values far outside this range, the contrast between them can be significantly reduced; for instance, values differing by an order of magnitude could become almost indistinguishable once transformed. Furthermore, because this activation is bounded both above and below, the sigmoid function suffers from the vanishing gradient problem (where the gradient becomes so small that values are updated very slowly or not at all). It does not, however, suffer from the exploding gradient problem (where the gradient becomes so large that optimisation is unstable). The \gls{ReLU} activation is very quick to compute. However, it suffers when values become less than zero: there is no gradient there, and as such the values cannot change; this is the dead \gls{ReLU} problem. The \gls{ELU} activation is similar to the \gls{ReLU} activation but, in contrast, does not suffer from the dead \gls{ReLU} problem; however, because of the exponential, this activation function is slower to compute. For the leaky \gls{ReLU}, rather than all negative values being set to zero, the negative region of the activation function is set to a scaled version of the input, which means that the leaky \gls{ReLU} also does not suffer from the dead \gls{ReLU} problem.
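For reference (a standard summary, not specific to this work), these activations, together with the \gls{SELU} activation of~\Fref{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution_selu} and the Mish activation used here, are defined as:
\begin{align*}
\operatorname{sigmoid}(x) &= \frac{1}{1 + e^{-x}}, \\
\tanh(x) &= \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \\
\operatorname{ReLU}(x) &= \max(0, x), \\
\operatorname{LeakyReLU}(x) &= \max(0, x) + \alpha \min(0, x), \\
\operatorname{ELU}(x) &= \begin{cases} x & x \geq 0, \\ \alpha \left(e^{x} - 1\right) & x < 0, \end{cases} \\
\operatorname{SELU}(x) &= \lambda \operatorname{ELU}(x), \\
\operatorname{Mish}(x) &= x \tanh\left(\ln\left(1 + e^{x}\right)\right),
\end{align*}
where $\alpha > 0$ is a fixed scale parameter and, for \gls{SELU}, $\lambda \approx 1.0507$ and $\alpha \approx 1.6733$ are chosen such that normalised inputs remain normalised.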
\subsection{PET Acquisition Simulation and Image Reconstruction} \label{sec:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_pet_acquisition_simulation_and_image_reconstruction}
A series of dynamic scans, following the clinical \gls{DWB}-\gls{PET} protocol, was generated using the \gls{XCAT} phantom~\parencite{Segars2010}. Patient-derived kinetic parameters were assigned to $64$ tissues, three tumours of \SI{1.0}{\centi\meter} diameter in the left lung, and three tumours of \SI{2.5}{\centi\meter}, \SI{2.0}{\centi\meter}, and \SI{1.0}{\centi\meter} diameter in the liver. An input function for \gls{18F-FDG}, taken from~\parencite{Langsjo2004EffectsHumans}, was used to simulate the \glspl{TAC} from which the dynamic images were created.
% KT some info here on timing
\gls{PET} acquisitions were simulated (and reconstructed) using \gls{STIR}~\parencite{Thielemans2012} through \gls{SIRF}~\parencite{Ovtchinnikov2017}. \gls{Non-TOF} sinogram data were simulated with resolution modelling (a \SI{6.0}{\milli\meter} \gls{FWHM} Gaussian filter). Randoms and scatter were not included. Poisson noise was added.% corresponding to a total number of counts of XXX in time frame XXX.
% KT some info above
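As an illustration of the noise model only (not the \gls{SIRF} code used in this work; the target count level is a placeholder, since the counts used are not recorded here), Poisson noise can be imposed on a noise-free sinogram as:
\begin{verbatim}
import numpy as np

def add_poisson_noise(sinogram, total_counts, rng=None):
    """Scale a noise-free sinogram to a target total count level,
    draw an independent Poisson sample per bin, and scale back to
    the original units."""
    rng = np.random.default_rng() if rng is None else rng
    scale = total_counts / sinogram.sum()
    return rng.poisson(sinogram * scale) / scale
\end{verbatim}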
Finally, all data sets were reconstructed using \gls{OSEM}~\parencite{Hudson1994}, with $10$ iterations and $17$ subsets.
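For reference, the standard \gls{OSEM} update~\parencite{Hudson1994} for voxel $j$ at a sub-iteration using subset $S_n$ is
\[
x_j^{(k+1)} = \frac{x_j^{(k)}}{\sum_{i \in S_n} a_{ij}} \sum_{i \in S_n} a_{ij} \frac{y_i}{\sum_{j'} a_{ij'} x_{j'}^{(k)}},
\]
where $y_i$ are the measured sinogram counts and $a_{ij}$ is the system matrix element giving the probability that an emission from voxel $j$ is detected in bin $i$.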
\subsection{Kinetic Modelling} \label{sec:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_kinetic_modelling}
Indirect Patlak estimation was used to generate $K_i$ and intercept images~\parencite{Patlak1983GraphicalData}. \glspl{VOI} were defined for the three lesions and two background regions (lung and liver), and the mean values of $K_i$ and $V_d$ were calculated in each \gls{VOI}. The uncertainties of the parameters were estimated as follows: normally distributed noise, with standard deviation given by the \gls{DIP} uncertainty, was added to the dynamic images, and Patlak analysis was performed. The procedure was repeated for $10$ noise realisations, and the standard deviations of the $K_i$ and $V_d$ parameters were calculated.
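For reference, the Patlak model fits the straight line
\[
\frac{C_T(t)}{C_p(t)} = K_i \frac{\int_0^t C_p(\tau)\,\mathrm{d}\tau}{C_p(t)} + V_d, \qquad t > t^*,
\]
where $C_T$ is the tissue \gls{TAC}, $C_p$ the plasma input function, and $t^*$ the time after which the plot is linear. A minimal sketch of the uncertainty propagation described above follows; \texttt{patlak\_fit} is an illustrative placeholder for the voxel-wise Patlak fitting routine.
\begin{verbatim}
import numpy as np

def patlak_uncertainty(dynamic, sigma, patlak_fit,
                       n_realisations=10, rng=None):
    """Propagate the DIP uncertainty map through Patlak analysis:
    perturb the denoised dynamic images with Gaussian noise of the
    estimated standard deviation, refit, and take the standard
    deviation of the fitted parameters over realisations."""
    rng = np.random.default_rng() if rng is None else rng
    ki, vd = [], []
    for _ in range(n_realisations):
        noisy = dynamic + rng.standard_normal(dynamic.shape) * sigma
        k, v = patlak_fit(noisy)  # returns K_i and V_d images
        ki.append(k)
        vd.append(v)
    return np.std(ki, axis=0), np.std(vd, axis=0)
\end{verbatim}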
\subsection{Evaluation} \label{sec:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_evaluation}
In addition to the denoising performed above in~\Fref{sec:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_methods_network_design_and_execution}, data were also denoised using \gls{TV}, and the \gls{DIP} method presented in~\parencite{Gong2019PETPrior}.
Comparisons included: a visual analysis; \gls{SSIM} with respect to the ground truth~\parencite{Wang2009MeanMeasures}; $K_i$ values; a \gls{TAC} through a lesion; a profile over a lesion; and \gls{SUV}\textsubscript{max} and \gls{SUV}\textsubscript{peak} (defined following the \gls{EANM} guidelines~\parencite{Boellaard2015FDG2.0}).
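For reference, \gls{SSIM} between two images $x$ and $y$ is computed from local statistics as
\[
\mathrm{SSIM}(x, y) = \frac{\left(2\mu_x\mu_y + c_1\right)\left(2\sigma_{xy} + c_2\right)}{\left(\mu_x^2 + \mu_y^2 + c_1\right)\left(\sigma_x^2 + \sigma_y^2 + c_2\right)},
\]
where $\mu_x$ and $\mu_y$ are local means, $\sigma_x^2$ and $\sigma_y^2$ local variances, $\sigma_{xy}$ the local covariance, and $c_1$ and $c_2$ small constants that stabilise the division.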
\section{Results} \label{sec:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_results}
\begin{figure}
\centering
\includegraphics[width=0.7\linewidth]{figures/deep_image_prior_results_visual_analysis.png}
\captionsetup{singlelinecheck=false}
\caption{
The first column contains a visual analysis between the ground truth and denoised results (taken for the last time point, with \gls{SSIM} to the ground truth), and the second column contains the $K_i$ results (all voxels in a coronal view) of a Patlak reconstruction of all time points (with \gls{SSIM} to the ground truth), for the ground truth and for data denoised using \gls{TV}, the implementation of \gls{DIP} from~\parencite{Gong2019PETPrior}, and our new implementation of \gls{DIP}, trained sequentially and combined (taken for the lung \gls{FOV}). The last row contains uncertainty volumes for the data denoised using our new implementation of \gls{DIP}, trained sequentially and combined (taken for the last time point of the lung \gls{FOV}). Colour map ranges are consistent for all images in each section.
}
\label{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_results_visual_analysis}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=1.0\linewidth]{figures/deep_image_prior_results_ki.png}
\captionsetup{singlelinecheck=false}
\caption{
$K_i$ results (single voxel) of a Patlak reconstruction of all time points, with uncertainty where applicable, for the ground truth and for data denoised using \gls{TV}, the implementation of \gls{DIP} from~\parencite{Gong2019PETPrior}, and our new implementation of \gls{DIP}, trained sequentially and combined.
}
\label{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_results_ki}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=1.0\linewidth]{figures/deep_image_prior_results_tac.png}
\captionsetup{singlelinecheck=false}
\caption{
A \gls{TAC} through a lesion, fit as a third-order polynomial regression weighted using uncertainty (where available), for the ground truth, the original noisy data, and this data denoised using \gls{TV}, the implementation of \gls{DIP} from~\parencite{Gong2019PETPrior}, and our new implementation of \gls{DIP}, trained sequentially and combined, both with and without uncertainty (taken for the lung \gls{FOV}).
}
\label{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_results_tac}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=1.0\linewidth]{figures/deep_image_prior_results_profile.png}
\captionsetup{singlelinecheck=false}
\caption{
A profile through a lesion, in the \gls{SI} direction, for the ground truth, the original noisy data, and this data denoised using \gls{TV}, the implementation of \gls{DIP} from~\parencite{Gong2019PETPrior}, and our new implementation of \gls{DIP}, trained sequentially and combined (taken for the last time point of the lung \gls{FOV}).
}
\label{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_results_profile}
\end{figure}
\begin{table}
\centering
\captionsetup{singlelinecheck=false}
\caption{
Comparison of \gls{SUV}\textsubscript{max} and \gls{SUV}\textsubscript{peak} for the ground truth, the original noisy data, and this data denoised using \gls{TV}, the implementation of \gls{DIP} from~\parencite{Gong2019PETPrior}, and our new implementation of \gls{DIP}, trained sequentially and combined (taken for the last time point of the lung \gls{FOV}).
}
\resizebox*{1.0\linewidth}{!}
{
\begin{tabular}{||c|cc||}
\hline
\textbf{SUV} & \textbf{Max} & \textbf{Peak} \\
\hline
\textbf{Ground Truth} & $12.3$ & $9.53$ \\
\hline
\textbf{Noisy} & $21.5$ & $5.99$ \\
\hline
\textbf{TV} & $3.28$ & $6.24$ \\
\textbf{Original DIP} & $3.90$ & $6.93$ \\
\hline
\textbf{New DIP Sequential} & $9.19$ & $8.02$ \\
\textbf{New DIP Combined} & $9.27$ & $8.21$ \\
\hline
\end{tabular}
}
\label{tab:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_results_suv}
\end{table}
A visual comparison of the reconstructed images (see~\Fref{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_results_visual_analysis}) shows that both of the new \gls{DIP} methods perform comparably, apart from a slight reduction in noise in the combined case, whereas the \gls{TV} and original \gls{DIP} implementations appear to have struggled with over-smoothing, reducing the contrast of the lesions and introducing some edge artefacts. The uncertainty of the combined method is visibly reduced compared to that of the sequential method.
A comparison of $K_i$ values across multiple lesions (see~\Fref{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_results_ki}) shows that the new \gls{DIP} combined method most often estimates the greatest magnitude of $K_i$, which is usually closest to the ground truth. The new \gls{DIP} sequential method is slightly less accurate (and has greater uncertainty), but is still more accurate than the \gls{TV} and original \gls{DIP} implementations, which consistently and significantly underestimate $K_i$.
The overall shape of the \gls{TAC} (see~\Fref{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_results_tac}) for the new \gls{DIP} combined method appears most similar to the ground truth, albeit with a slight reduction in quantification. The sequential method is less accurate, but still more so than both the original \gls{DIP} and \gls{TV} methods. There is significant variation in the noisy \gls{TAC}, somewhat masked by the regression; nevertheless, its shape is least like the ground truth. Weighting by uncertainty appears to have improved the \gls{TAC} of the new \gls{DIP} sequential method; however, the uncertainty of the combined method is smaller, and as such its inclusion has not affected the results significantly.
The peak of the profile (see~\Fref{fig:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_results_profile}) for both new \gls{DIP} methods is comparable, and greater than for both the original \gls{DIP} and \gls{TV} methods. The peak of the noisy profile is greater than that of all other methods, including the ground truth; however, this is not necessarily beneficial, as the rest of the profile does not closely follow the ground truth (it undulates unpredictably). The profiles for both new \gls{DIP} methods are significantly smoother and more closely follow the ground truth.
\gls{SUV} (and \gls{SSIM}) results confirm the above (see~\Fref{tab:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_results_suv}).
\section{Discussion and Conclusions} \label{sec:pseudo_bayesian_dip_denoising_as_a_preprocessing_step_for_kinetic_modelling_in_dynamic_pet_appendix_discussion_and_conclusions}
% XXXSome discussion of results here.
Evaluation indicated that the new \gls{DIP} method, particularly when trained in the combined regime, provided images with less noise and greater quantitative accuracy than the other methods. The combined method also had lower uncertainty.
Results presented here were obtained on a single bed position. Initial evaluation on a bed position centred on the liver indicated that fine-tuning the parameters to the activity distribution and count level will be beneficial. Evaluation with patient data will follow.
The uncertainty estimates produced by the \gls{NN} need to be validated by comparison with results obtained from repeated noise realisations.
In the future, research will focus on the application of the method to domains other than dynamic \gls{PET} in which \gls{4D} data exist, such as motion correction.