1. Population analysis of Legionella pneumophila epidemiology and the genetic basis for human pathogenicity
- Author
- Gorzynski, Jamie, Fitzgerald, Jonathan, Wee, Bryan, and Smith, Andrew
- Subjects
Legionella pneumophila, whole genome sequence (WGS), Sequence Types (STs)
- Abstract
Legionella are globally ubiquitous aquatic bacteria that cause both Pontiac fever (a mild flu-like illness) and Legionnaires' disease, a severe form of pneumonia with a 5-10% mortality rate. They are natural parasites of freshwater protozoa that may also cause opportunistic human infections when inhaled from the environment via aerosols. Human infections are generally sporadic, although the last decade has seen a global increase in the number of infections, and large-scale outbreaks place an appreciable annual burden on public health worldwide. The species Legionella pneumophila causes around 90% of infections, a large number of which are caused by relatively few clonal lineages, each estimated to have emerged recently and independently. However, the factors leading to their pathogenic success remain largely unknown. The growing abundance of whole genome sequence (WGS) data has revealed a new horizon for bacterial comparative genomics. Larger, more varied datasets enable more advanced statistical approaches to investigate bacterial evolution, epidemiology and pathogen emergence. In this project, I assembled a comprehensive WGS dataset to conduct population-scale genomic analysis of L. pneumophila. In addition to a historic Scottish reference isolate collection, I downloaded all publicly available assemblies and sequence reads for Legionella species. I then developed a pipeline to assemble, filter, clean and curate these data based on a range of parameters, and refined it through visual inspection. I conducted a population-wide meta-analysis of the data to explore the global distribution of Sequence Types (STs) over time. The results highlight the power of population-scale genomic analysis to monitor disease trends, although several major sources of spatial and temporal sampling bias were identified that should be accounted for in future work. I then used these data to conduct a nationwide genomic epidemiological analysis of culture-positive clinical L. pneumophila isolates from Scotland over a 36-year timeframe, in the context of global isolates and epidemiological metadata. The analysis shed new light on the epidemiology of travel-associated infections and revealed widely disseminated endemic clones that were associated with repeated infections in Scotland over many years. In addition, specific clones were identified that were isolated from the water systems of individual hospitals over very long time periods, indicating either repeated re-colonisation or long-term environmental persistence. The results indicate that routine environmental sampling is required to support the identification of epidemiological links and the attribution of outbreak sources, and to inform public health measures targeting endemic clones that present an ongoing risk. Finally, I investigated the genomic features that differentiate clinical and environmental isolates of L. pneumophila and that may be important for human infection potential. I used PIRATE to calculate the L. pneumophila pangenome, which revealed that the number of genes was closely correlated with the population structure, and identified two major lineages in which clinical genomes contained significantly fewer genes. To identify specific genes or variants correlated with an environmental or clinical source, I mapped the hits from a machine learning-based association analysis to the corresponding orthologous gene clusters, revealing a number of previously undetected associations with disease. Using a network visualisation approach, I identified strong linkage disequilibrium influencing the significance of hits in commonly syntenic genes throughout the pangenome. Taken together, the results demonstrate the value of high-resolution population-scale WGS data to monitor the distribution and spread of different Legionella pneumophila clones, including those posing a higher human health risk. Furthermore, these data enabled the identification of genomic factors significantly associated with the isolation source, which may contribute towards human infection potential.
- Published
- 2023