USING WHOLE GENOME SEQUENCING DATA TO IMPROVE MYCOBACTERIUM BOVIS OUTBREAK INVESTIGATION EFFICIENCY

Summary

<div class="container" style="width:300px;">
<div class="leftcol" style="width:194px">
<b>Animal Health Component</b>
</div>
<div class="rightcol" style="width:56px; text-align:right">25%</div>
<div class="endrow" style="float:none; display:block;"></div>

<div class="leftcol">
<b>Research Effort Categories</b><br>
<div class="container" style="width: 375px;">
<div class="rec_leftcol">Basic</div>
<div class="rec_rightcol">75%</div>
<div class="endrow"></div>
<div class="rec_leftcol">Applied</div>
<div class="rec_rightcol">25%</div>
<div class="endrow"></div>
<div class="rec_leftcol">Developmental</div>
<div class="rec_rightcol">0%</div>
<div class="endrow"></div>
</div>
</div>
<div class="endrow"></div>

</div>

Objectives & Deliverables

<b>Project Methods</b><br> Efforts:

Estimating the amount of time since a herd was infected: Determining the time since herd infection depends on accurately estimating the evolutionary rate of M. bovis, or the relationship between the bacteria's mutation rate and evolutionary time. To estimate time since infection, we will develop Bayesian phylogenetic models and a novel convolutional neural network based tool. First, we will use BEAST2 to build time-calibrated phylogenies and estimate the evolutionary rate of different M. bovis lineages. Estimates of evolutionary rate will then be incorporated into linear models capable of predicting time since herd infection in an outbreak. Second, we will build a convolutional neural network that will predict the time since a herd was infected using an alignment of outbreak M. bovis sequences. The model will be trained on simulated outbreak sequences with associated time data. The fitted neural network will then be tested using real outbreak data with reliable historical estimates of the amount of time a herd was infected before the outbreak was detected, and the estimates will be compared to those generated using the Bayesian phylogenetic approach.

Estimating the geographical source of an outbreak: Two population structure algorithms will be applied to estimate geographic structure. The first, RhierBAPS, an implementation of hierBAPS in the programming language R, is a model-based population structure algorithm that incorporates sample geographic coordinates and a Markovian sequence clustering model to estimate population structure. hierBAPS cannot incorporate accessory genes, so it will be run using only core genes, which are present in every sequence in the database. The second algorithm, PopPUNK, incorporates both core and accessory genes. PopPUNK calculates the pairwise Jaccard distance between sequence k-mers to estimate sequence divergence. Clusters are then created from core and accessory distances using Gaussian mixture models. An undirected network is then created for each cluster from samples (nodes) connected by distances (edges). This network then serves as a reference network, which can be updated to incorporate information from new sequences. We will compare the accuracy of these two algorithms on labeled data and produce a dynamic database of geographic results.

Evaluation: For each aim, two separate methodologies will be used to generate estimates. In aim 1, the estimate of time since herd infection generated by the Bayesian phylogenetic approach will be compared to estimates generated from the deep learning approach. Both estimates will be compared against real outbreak data, where epidemiologic investigation led to conclusive evidence for a time associated with M. bovis introduction into a herd. In the second aim, the accuracy of hierBAPS and PopPUNK will be assessed against real outbreak data with accurate animal source data.
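The k-mer comparison described for PopPUNK above can be illustrated with a minimal Python sketch. It computes an exact Jaccard distance between the k-mer sets of two sequences and is only a simplified illustration, not PopPUNK's implementation or the project's code. The function names, the toy sequences, and the choice of k are all assumptions made for the example.

```python
from __future__ import annotations

from itertools import combinations


def kmer_set(sequence: str, k: int) -> set[str]:
    """Return the set of all overlapping k-mers in a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_distance(seq_a: str, seq_b: str, k: int) -> float:
    """Jaccard distance: 1 - |A & B| / |A | B| over the two k-mer sets."""
    a, b = kmer_set(seq_a, k), kmer_set(seq_b, k)
    union = a | b
    if not union:
        # Both sequences are shorter than k, so there is nothing to compare.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(
    sequences: dict[str, str], k: int
) -> dict[tuple[str, str], float]:
    """Distance for every unordered pair of named sequences."""
    return {
        (name_a, name_b): jaccard_distance(sequences[name_a], sequences[name_b], k)
        for name_a, name_b in combinations(sorted(sequences), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy isolates, far shorter than real genomes.
    isolates = {
        "isolate_1": "ACGTACGTGA",
        "isolate_2": "ACGTACGTGC",
        "isolate_3": "TTTTGGGGCC",
    }
    for pair, distance in pairwise_distances(isolates, k=3).items():
        print(pair, round(distance, 3))
```

In a network like the one described, each pair's distance would be the weight of the edge between the two sample nodes. At genome scale, exact k-mer sets are expensive to store and compare, so a practical tool would need an approximation of this calculation.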

Planned Completion date: 14/06/2022

Effort: $120,000.00

Project Status

COMPLETE

Principal Investigator(s)

National Institute of Food and Agriculture

Researcher Organisations

CORNELL UNIVERSITY

Source Country

United Kingdom