Background: Orofacial granulomatosis (OFG) is a rare, heterogeneous inflammatory disorder of the mouth, in which 20-40% of patients also present with, or subsequently develop, intestinal Crohn's disease (CD). Patients are typically categorised into two phenotypic sub-groups: those with isolated oral findings ('OFG only') and those with OFG and concurrent intestinal CD ('OFG/CD'). There is disagreement about whether OFG is a rare disorder in its own right or part of the complicated phenotypic spectrum of Crohn's disease, a common genetic disorder. The high prevalence of allergy/atopy in OFG and the identification of novel immunoglobulin E (IgE)-expressing subepithelial dendritic B cells in the oral mucosa of OFG patients provide evidence for an important role of IgE in OFG. Oral granulomas are also seen in other immune co-morbidities of OFG, particularly CD and sarcoidosis, and their etiology remains largely unknown. However, mutations in the NOD2 gene, which are known to cause CD, have been identified in OFG patients and are particularly enriched in those patients with associated CD (OFG/CD). The NOD2 protein has an important role in innate immunity: its COOH-terminal leucine-rich repeat (LRR) domain recognizes the bacterial peptidoglycan fragment muramyl dipeptide (MDP), triggering a signalling cascade that leads to activation of the transcription factor nuclear factor (NF)-κB and NF-κB-mediated pro-inflammatory signalling.
Aims: The aims of this study were to investigate the underlying genetic etiology of OFG and determine whether it is shared with CD; to ascertain the functional effects of rare novel NOD2 mutations implicated in OFG; and, finally, to determine the role of allergy (IgE) in the oral immune response in OFG.
Methods: Genotyping via Human Core Infinium (Illumina) SNP microarray, followed by a genome-wide association study (GWAS) of genotyped and imputed SNPs, was performed on 248 OFG patients and 9460 population controls to identify high-risk common variants that might be associated with OFG. These data were then analysed alongside two different CD GWAS datasets to calculate polygenic risk scores (PRS) for CD in the OFG cohort and to allow a comparison of genetic risk profiles between CD and OFG. In addition, whole exome sequencing (WES) was performed on DNA samples from 163 OFG patients to identify potentially pathogenic novel or rare coding variants in OFG. Next, NF-κB activation assays were optimised to carry out functional analysis of seven rare genetic variants in the NOD2 gene that had been identified in OFG patients (p.A181G, p.A616E, p.L1007insC, p.G908C, p.V793M, p.L1018F, p.A976T). Lastly, serum and salivary IgE levels were measured in OFG patients, CD patients and healthy controls with and without atopy, to determine the role of allergy in OFG.
Results and discussion: The GWAS revealed a single most significant association on chromosome 5 at the imputed SNP rs75300841 (Chromosome 5: 40551045 bp), which reached genome-wide significance (p = 1.98 × 10⁻⁸, OR = 2.324). This SNP is in linkage disequilibrium (LD) with a known CD-associated SNP, rs4613763 (Chromosome 5: 40392728 bp, p = 1.81 × 10⁻⁷, OR = 2.02), which showed the highest signal for a genotyped SNP, and with rs7720838 (Chromosome 5: 40486896 bp, p = 4.23 × 10⁻⁶, OR = 0.59), the second highest genotyped signal. Validation re-genotyping of SNPs rs4613763 and rs7720838 using TaqMan assays in an expanded cohort of OFG patients, which included 47 newly recruited patients, followed by sub-phenotype analysis, revealed that SNP rs4613763 was significantly associated with an increased disease risk in the OFG/CD group (p = 1.11 × 10⁻⁵, OR = 1.98) but less so in the OFG only group (p = 1.03 × 10⁻², OR = 1.46).
Similarly, SNP rs7720838 was significantly associated with the OFG/CD group (p = 1.19 × 10⁻⁶, OR = 2.2) but less so with the OFG only group (p = 4.90 × 10⁻², OR = 1.27). Calculation of polygenic risk scores for CD in a cohort that included all OFG patients (n = 256) and controls (n = 992), using the Jostins et al. (2012) CD GWAS meta-analysis dataset as base data, showed that all PRS models were highly significantly predictive of the OFG phenotype (p = 4.2 × 10⁻⁸); the best model was based on 397 SNPs, all of which have previously been significantly associated with CD (p < 0.01), and explained 3.8% of the phenotypic variance in CD. Similarly, sub-phenotype analysis using just the OFG/CD cohort (n = 107) and controls (n = 992) showed that all PRS models were highly significantly predictive of the OFG/CD phenotype (p = 1.69 × 10⁻⁸), with the best model, based on 230 CD-associated SNPs, explaining 6.36% of the phenotypic variance in CD. However, sub-phenotype analysis using just the OFG only patients (n = 144) and controls (n = 992) showed weaker significance across all models, with the best model, based on 397 SNPs (p = 0.011), explaining only 1.08% of the phenotypic variance. Nevertheless, all PRS model p values for both sub-phenotypes and the combined dataset remained significant after 100,000 permutations, suggesting a significant overlap between the genetic architecture of all OFG phenotypes and CD. PRS calculations using a second base dataset that encompassed genome-wide SNPs from a European GWAS (Liu et al., 2015) showed that all PRS models were highly significantly predictive of the OFG phenotype (p = 9.75 × 10⁻⁵), with the best model, based on 3681 SNPs, explaining 1.9% of the variance in CD.
Sub-phenotype analysis using the OFG/CD cohort (n = 107) and controls (n = 992) showed that all PRS models were highly significantly predictive of the OFG/CD phenotype (p = 1.69 × 10⁻⁸), with the best model, based on 21 CD-associated SNPs, explaining 3.71% of the phenotypic variance in CD. However, PRS models were not predictive of the OFG only phenotype (p = 0.1). The frequency of self-reported atopy was higher in the OFG/CD group (55.56%) than in the OFG only (47.83%) and CD (26.32%) groups. Sub-phenotypic analyses showed that mean total salivary IgE was significantly elevated in CD patients compared to OFG/CD patients (p = 0.02240), OFG only patients (p = 0.0102) and healthy controls (p = 0.02240). However, there was no statistically significant difference between OFG only patients, OFG/CD patients and healthy controls. Although mean total salivary IgE was higher in all atopic OFG patients (13.63 ng/ml) than in non-atopic OFG patients (7.88 ng/ml), the difference was not statistically significant. By contrast, mean total serum IgE was significantly elevated in all atopic OFG patients (470.73 ng/ml) compared to non-atopic OFG patients (7.30 ng/ml) (p = 0.0283). Exome sequencing of OFG patients (n = 163) identified 465 variants in 198 genes, including NCF4, TGFB1 and LAMB1. Moreover, the sequencing identified five CD-associated NOD2 variants as well as nine potentially pathogenic rare variants. Analysis of pathogenic NOD2 variant frequencies in the OFG only cohort (n = 147) and the OFG/CD cohort (n = 120) showed a significant enrichment of pathogenic NOD2 variants in OFG/CD patients compared to OFG only patients (p = 0.00007). Finally, it was demonstrated that all seven NOD2 variants, and the control p.P268S variant, previously identified in OFG, led to significantly reduced NF-κB activation in MDP-stimulated HEK 293 cells.
Conclusions: The GWAS, PRS and WES data, together with differences in the oral immune responses of OFG patients and CD patients, suggest that OFG and CD could be distinct disease entities.
There is evidence for overlapping genetic etiology between CD and the OFG/CD group, as CD-associated variants in NOD2 and newly identified loss-of-function variants in NOD2 were significantly enriched in OFG/CD cases. Likewise, the OFG GWAS data showed a significant association with a previously reported CD-associated locus on chromosome 5 in the OFG/CD cohort. CD PRS calculations showed that CD-associated SNPs account for a higher proportion of genetic variance in OFG/CD patients than in the OFG only group. However, these calculations still suggest there may be a much smaller genetic overlap between OFG only and CD, which could be due to the inclusion in the OFG only group of patients who have not yet developed intestinal CD. Moreover, the frequency of atopy differed between OFG only and OFG/CD, which indicates that the prevalence of atopy is further elevated by the presence of CD. Although no significant difference in salivary IgE levels could be detected between the two sub-phenotypes of OFG, there was a significant difference between CD and OFG. Altogether, this could suggest that OFG/CD patients may be true CD patients, or part of the spectrum of CD, and that OFG only, with no gut involvement, is a different entity that overlaps phenotypically with CD presenting with oral involvement, i.e., the OFG/CD group. As well as the CD-associated NOD2 gene, whole exome sequencing in OFG has identified a possible role for genes that have previously been associated with other phenotypically similar inflammatory diseases, such as chronic granulomatous disease and infantile inflammatory bowel disease (IBD). However, this could be due to the pleiotropic nature of mutations in these genes, or to the misdiagnosis of a phenotypically similar inflammatory disease as OFG.
Finally, the findings of this study shed more light on the pathogenesis of OFG sub-phenotypes and their relationship to CD, which, in turn, could lead to a better understanding of OFG and its diagnosis, or potential misdiagnosis, and consequently its management or treatment. For instance, salivary and serum IgE could be used to monitor IgE-mediated atopy in OFG sub-phenotypes as well as in CD. Additionally, evaluation of individual polygenic risk scores may have the potential to identify OFG only patients who are at elevated risk of developing intestinal CD later in life, thus facilitating earlier diagnosis of gastrointestinal signs and symptoms and better management of the disease.
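
Illustrative note: the polygenic risk score calculations summarised above rest on a simple idea that can be sketched in a few lines. The example below is a minimal sketch only, assuming a p-value-thresholded score in which each SNP's effect-allele count is weighted by the log odds ratio from a base GWAS. The SNP names, odds ratios, p-values and genotypes are all invented for illustration, and the function `polygenic_risk_score` is hypothetical; it is not the software or the exact model used in this study.

```python
import math

# Hypothetical base-GWAS summary statistics: SNP id -> (odds ratio, p-value).
# These values are invented for illustration and come from no real dataset.
BASE_STATS = {
    "snp_a": (1.50, 1e-6),
    "snp_b": (0.80, 5e-3),
    "snp_c": (1.10, 0.20),
}


def polygenic_risk_score(dosages, base_stats, p_threshold):
    """Sum effect-allele dosages weighted by log(OR) over SNPs with p < threshold.

    Returns the score and the number of SNPs that contributed to it.
    """
    score = 0.0
    n_used = 0
    for snp, (odds_ratio, p_value) in base_stats.items():
        # Skip SNPs that fail the threshold or are missing in the target data.
        if p_value >= p_threshold or snp not in dosages:
            continue
        score += math.log(odds_ratio) * dosages[snp]
        n_used += 1
    return score, n_used


# Hypothetical individual: effect-allele counts (0, 1 or 2) per SNP.
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
score, n_used = polygenic_risk_score(person, BASE_STATS, p_threshold=0.01)
print(f"PRS = {score:.4f} using {n_used} SNPs")
```

In practice, scores of this kind are computed at several p-value thresholds and each resulting model is tested for association with the target phenotype, which is why the number of SNPs in the "best model" differs between analyses.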