34 results on '"Lundegaard, C"'
Search Results
2. Viral bioinformatics
- Author
-
Adams, B., McHardy, A. Carolyn, Lundegaard, C., and Lengauer, T.
- Subjects
Viral Variant, Human Leucocyte Antigen, Human Immune System, Epitope Prediction, Article, Viral Evolution
- Abstract
Pathogens have presented a major challenge to individuals and populations of living organisms, probably as long as there has been life on earth. They are a prime object of study for at least three reasons: (1) Understanding the way of pathogens affords the basis for preventing and treating the diseases they cause. (2) The interactions of pathogens with their hosts afford valuable insights into the working of the hosts’ cells, in general, and of the host’s immune system, in particular. (3) The co-evolution of pathogens and their hosts allows for transferring knowledge across the two interacting species and affords valuable insights into how evolution works, in general. In the past decade computational biology has started to contribute to the understanding of host-pathogen interaction in at least three ways which are summarized in the subsequent sections of this chapter.
- Published
- 2009
3. Human leukocyte antigen (HLA) class I restricted epitope discovery in yellow fewer and dengue viruses: Importance of HLA binding strength
- Author
-
Lund, O, Nascimento, EJM, Maciel, M, Nielsen, M, Larsen, M, Lundegaard, C, Harndahl, M, Lamberth, K, Buus, S, Salmon, J, August, TJ, and Marques, ETA
- Abstract
Epitopes from all available full-length sequences of yellow fever virus (YFV) and dengue fever virus (DENV) restricted by Human Leukocyte Antigen class I (HLA-I) alleles covering 12 HLA-I supertypes were predicted using the NetCTL algorithm. A subset of 179 predicted YFV and 158 predicted DENV epitopes were selected using the EpiSelect algorithm to allow for optimal coverage of viral strains. The selected predicted epitopes were synthesized and approximately 75% were found to bind the predicted restricting HLA molecule with an affinity, K(D), stronger than 500 nM. The immunogenicity of 25 HLA-A*02:01, 28 HLA-A*24:02 and 28 HLA-B*07:02 binding peptides was tested in three HLA-transgenic mice models and led to the identification of 17 HLA-A*02:01, 4 HLA-A*2402 and 4 HLA-B*07:02 immunogenic peptides. The immunogenic peptides bound HLA significantly stronger than the non-immunogenic peptides. All except one of the immunogenic peptides had K(D) below 100 nM and the peptides with K(D) below 5 nM were more likely to be immunogenic. In addition, all the immunogenic peptides that were identified as having a high functional avidity had K(D) below 20 nM. A*02:01 transgenic mice were also inoculated twice with the 17DD YFV vaccine strain. Three of the YFV A*02:01 restricted peptides activated T-cells from the infected mice in vitro. All three peptides that elicited responses had an HLA binding affinity of 2 nM or less. The results indicate the importance of the strength of HLA binding in shaping the immune response. © 2011 Lund et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
- Published
- 2011
4. Immune epitope database analysis resource
- Author
-
Kim, Y., Ponomarenko, J., Zhu, Z., Tamang, D., Wang, P., Greenbaum, J., Lundegaard, C., Sette, A., Lund, O., Bourne, P. E., Nielsen, M., and Peters, B.
- Published
- 2012
5. SARS CTL vaccine candidates; HLA supertype-, genome-wide scanning and biochemical validation
- Author
-
Sylvester-Hvid, C, Nielsen, M, Lamberth, K, Røder, G, Justesen, S, Lundegaard, C., Worning, P., Thomadsen, H., Lund, O., Brunak, S., and Buus, Søren
- Abstract
Publication date: 2004-May. An effective Severe Acute Respiratory Syndrome (SARS) vaccine is likely to include components that can induce specific cytotoxic T-lymphocyte (CTL) responses. The specificities of such responses are governed by human leukocyte antigen (HLA)-restricted presentation of SARS-derived peptide epitopes. Exact knowledge of how the immune system handles protein antigens would allow for the identification of such linear sequences directly from genomic/proteomic sequence information (Lauemoller et al., Rev Immunogenet 2001: 2: 477-91). The latter was recently established when a causative coronavirus (SARS-CoV) was isolated and full-length sequenced (Marra et al., Science 2003: 300: 1399-404). Here, we have combined advanced bioinformatics and high-throughput immunology to perform an HLA supertype-, genome-wide scan for SARS-specific CTL epitopes. The scan includes all nine human HLA supertypes in total covering >99% of all individuals of all major human populations (Sette & Sidney, Immunogenetics 1999: 50: 201-12). For each HLA supertype, we have selected the 15 top candidates for test in biochemical binding assays. At this time (approximately 6 months after the genome was established), we have tested the majority of the HLA supertypes and identified almost 100 potential vaccine candidates. These should be further validated in SARS survivors and used for vaccine formulation. We suggest that immunobioinformatics may become a fast and valuable tool in rational vaccine design.
- Published
- 2004
6. Characterization of a new HLA-G allele encoding a nonconservative amino acid substitution in the alpha3 domain (exon 4) and its relevance to certain complications in pregnancy
- Author
-
Hviid, T V, Christiansen, O B, Johansen, J K, Hviid, U R, Lundegaard, C, Møller, C, and Morling, N
- Abstract
Publication date: 2001-Feb
- Published
- 2001
7. Immune epitope database analysis resource (IEDB-AR)
- Author
-
Zhang, Q., Wang, P., Kim, Y., Haste-Andersen, P., Beaver, J., Bourne, P. E., Bui, H.-H., Buus, S., Frankild, S., Greenbaum, J., Lund, O., Lundegaard, C., Nielsen, M., Ponomarenko, J., Sette, A., Zhu, Z., and Peters, B.
- Published
- 2008
8. ‘Query‐by Committee’— An Efficient Method to Select Information‐Rich Data for the Development of Peptide—HLA‐Binding Predictors
- Author
-
Lamberth, K., Nielsen, M., Lundegaard, C., Worning, P., Lauemøller, S. L., Lund, O., Brunak, S., and Buus, S.
- Subjects
Abstracts
- Abstract
Rationale: We have previously demonstrated that bioinformatics tools such as artificial neural networks (ANNs) are capable of performing pathogen‐, genome‐ and HLA‐wide predictions of peptide–HLA interactions. These tools may therefore enable a fast and rational approach to epitope identification and thereby assist in the development of vaccines and immunotherapy. A crucial step in the generation of such bioinformatics tools is the selection of data representing the event in question (in casu peptide–HLA interaction). This is particularly important when it is difficult and expensive to obtain data. Herein, we demonstrate the importance in selecting information‐rich data and we develop a computational method, query‐by‐committee, which can perform a global identification of such information‐rich data in an unbiased and automated manner. Furthermore, we demonstrate how this method can be applied to an efficient iterative development strategy for these bioinformatics tools. Methods: A large panel of binding affinities of peptides binding to HLA A*0204 was measured by a radioimmunoassay (RIA). This data was used to develop multiple first generation ANNs, which formed a virtual committee. This committee was used to screen (or ‘queried’) for peptides, where the ANNs agreed (‘low‐QBC’), or disagreed (‘high‐QBC’), on their HLA‐binding potential. Seventeen low‐QBC peptides and 17 high‐QBC peptides were synthesized and tested. The high‐ or low‐QBC data were added to the original data, and new high‐ or low‐QBC second generation ANNs were developed, respectively. This procedure was repeated 40 times. Results: The high‐QBC‐enriched ANN performed significantly better than the low‐QBC‐enriched ANN in 37 of the 40 tests. Conclusion: These results demonstrate that high‐QBC‐enriched networks perform better than low‐QBC‐enriched networks in selecting informative data for developing peptide–MHC‐binding predictors. 
This improvement in selecting data is not due to differences in network training performance but due to the difference in information content in the high‐QBC experiment and in the low‐QBC experiment. Finally, it should be noted that this strategy could be used in many contexts where generation of data is difficult and costly.
- Published
- 2008
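The committee-based selection described in the abstract of record 8 can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how disagreement was quantified, so the population standard deviation of the committee's predictions is used here as an assumption, and the toy "predictors" (residue-fraction functions standing in for trained ANNs) and the names `make_toy_predictor`, `qbc_scores` and `select_by_qbc` are hypothetical.

```python
from statistics import pstdev
from typing import Callable, Dict, List, Sequence, Tuple

# A committee member maps a peptide sequence to a predicted binding score.
Predictor = Callable[[str], float]


def make_toy_predictor(residue: str) -> Predictor:
    """Hypothetical stand-in for a trained ANN: fraction of one residue type."""
    return lambda peptide: peptide.count(residue) / len(peptide)


def qbc_scores(peptides: Sequence[str], committee: Sequence[Predictor]) -> Dict[str, float]:
    """Disagreement per peptide (assumption: std. dev. of committee predictions)."""
    return {p: pstdev([member(p) for member in committee]) for p in peptides}


def select_by_qbc(
    peptides: Sequence[str], committee: Sequence[Predictor], k: int
) -> Tuple[List[str], List[str]]:
    """Return (low-QBC, high-QBC): the k peptides with least and most disagreement."""
    scores = qbc_scores(peptides, committee)
    ranked = sorted(peptides, key=lambda p: scores[p])  # ascending disagreement
    low = ranked[:k]
    high = ranked[-k:][::-1]  # most contested peptide first
    return low, high


# Toy committee of three hypothetical predictors.
toy_committee = [make_toy_predictor(aa) for aa in ("L", "V", "I")]

if __name__ == "__main__":
    candidates = ["LLLLLLLLL", "AAAAAAAAA", "LLLLVAAAA", "LVILVILVI"]
    low, high = select_by_qbc(candidates, toy_committee, k=2)
    print("low-QBC :", low)
    print("high-QBC:", high)
```

In an iterative scheme like the one the abstract outlines, the high-QBC peptides would be the ones sent for measurement, added to the training data, and used to retrain the committee before the next round.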
9. SARS CTL Vaccine Candidates — HLA Supertype, Genome‐Wide Scanning and Biochemical Validation
- Author
-
Sylvester‐Hvid, C., Nielsen, M., Lamberth, K., Røder, G., Justesen, S., Lundegaard, C., Worning, P., Thomadsen, H., Lund, O., Brunak, S., and Buus, S.
- Subjects
Abstracts - Abstract
An effective SARS vaccine is likely to include components that can induce specific cytotoxic T‐cell (CTL) responses. The specificities of such responses are governed by HLA‐restricted presentation of SARS‐derived peptide epitopes. Exact knowledge of how the immune system handles protein antigens would allow for the identification of such linear sequences directly from genomic/proteomic sequence information. The latter was recently established when a causative coronavirus (SARS CoV) was isolated and full‐length sequenced. Here, we have combined advanced bioinformatics and high‐throughput immunology to perform an HLA supertype, genome‐wide scan for SARS‐specific cytotoxic T cell epitopes. The scan includes all nine human HLA supertypes in total covering >99% of all major human populations. For each HLA supertype, we have selected the 15 top candidates for test in biochemical‐binding assays. At this time (approximately 6 months after the genome was established), we have tested the majority of the HLA supertypes and identified almost 100 potential vaccine candidates. These should be further validated in SARS survivors and used for vaccine formulation. We suggest that immunobioinformatics may become a fast and valuable tool in rational vaccine design.
- Published
- 2008
10. The DNA damage-inducible dinD gene of Escherichia coli is equivalent to orfY upstream of pyrE
- Author
-
Lundegaard, C and Jensen, K F
- Published
- 1994
11. Selection of vaccine-candidate peptides from Mycobacterium avium subsp. paratuberculosis by in silico prediction, in vitro T-cell line proliferation, and in vivo immunogenicity.
- Author
-
Lybeck K, Tollefsen S, Mikkelsen H, Sjurseth SK, Lundegaard C, Aagaard C, Olsen I, and Jungersen G
- Subjects
- Animals, Female, Cattle, Emulsions, Bacterial Vaccines, Interferon-gamma metabolism, Antibodies, Bacterial, Adjuvants, Immunologic, Goats, Cell Line, Paratuberculosis prevention & control, Mycobacterium avium subsp. paratuberculosis, Tuberculosis, Bovine
- Abstract
Mycobacterium avium subspecies paratuberculosis (MAP) is a global concern in modern livestock production worldwide. The available vaccines against paratuberculosis do not offer optimal protection and interfere with the diagnosis of bovine tuberculosis. The aim of this study was to identify immunogenic MAP-specific peptides that do not interfere with the diagnosis of bovine tuberculosis. Initially, 119 peptides were selected by either (1) identifying unique MAP peptides that were predicted to bind to bovine major histocompatibility complex class II (MHC-predicted peptides) or (2) selecting hydrophobic peptides unique to MAP within proteins previously shown to be immunogenic (hydrophobic peptides). Subsequent testing of peptide-specific CD4+ T-cell lines from MAP-infected, adult goats vaccinated with peptides in cationic liposome adjuvant pointed to 23 peptides as being most immunogenic. These peptides were included in a second vaccine trial where three groups of eight healthy goat kids were vaccinated with 14 MHC-predicted peptides, nine hydrophobic peptides, or no peptides in o/w emulsion adjuvant. The majority of the MHC-predicted (93%) and hydrophobic peptides (67%) induced interferon-gamma (IFN-γ) responses in at least one animal. Similarly, 86% of the MHC-predicted and 89% of the hydrophobic peptides induced antibody responses in at least one goat. The immunization of eight healthy heifers with all 119 peptides formulated in emulsion adjuvant identified more peptides as immunogenic, as peptide-specific IFN-γ and antibody responses in at least one heifer were found toward 84% and 24% of the peptides, respectively. No peptide-induced reactivity was found with commercial ELISAs for detecting antibodies against Mycobacterium bovis or MAP or when performing tuberculin skin testing for bovine tuberculosis. The vaccinated animals experienced adverse reactions at the injection site; thus, it is recommended that future studies make improvements to the vaccine formulation. 
In conclusion, immunogenic MAP-specific peptides that appeared promising for use in a vaccine against paratuberculosis without interfering with surveillance and trade tests for bovine tuberculosis were identified by in silico analysis and ex vivo generation of CD4+ T-cell lines and validated by the immunization of goats and cattle. Future studies should test different peptide combinations in challenge trials to determine their protective effect and identify the most MHC-promiscuous vaccine candidates. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. Copyright © 2024 Lybeck, Tollefsen, Mikkelsen, Sjurseth, Lundegaard, Aagaard, Olsen and Jungersen.
- Published
- 2024
12. Characterization of HIV-specific CD4+ T cell responses against peptides selected with broad population and pathogen coverage.
- Author
-
Buggert M, Norström MM, Czarnecki C, Tupin E, Luo M, Gyllensten K, Sönnerborg A, Lundegaard C, Lund O, Nielsen M, and Karlsson AC
- Subjects
- Adult, Epitope Mapping, Female, HIV Infections virology, Histocompatibility Antigens Class II immunology, Human Immunodeficiency Virus Proteins chemistry, Humans, Male, Middle Aged, Peptides chemistry, Viral Load, Young Adult, gag Gene Products, Human Immunodeficiency Virus chemistry, nef Gene Products, Human Immunodeficiency Virus chemistry, CD4-Positive T-Lymphocytes immunology, Epitopes immunology, HIV Infections immunology, HIV-1 immunology, Peptides immunology
- Abstract
CD4+ T cells orchestrate immunity against viral infections, but their importance in HIV infection remains controversial. Nevertheless, comprehensive studies have associated increase in breadth and functional characteristics of HIV-specific CD4+ T cells with decreased viral load. A major challenge for the identification of HIV-specific CD4+ T cells targeting broadly reactive epitopes in populations with diverse ethnic background stems from the vast genomic variation of HIV and the diversity of the host cellular immune system. Here, we describe a novel epitope selection strategy, PopCover, that aims to resolve this challenge, and identify a set of potential HLA class II-restricted HIV epitopes that in concert will provide optimal viral and host coverage. Using this selection strategy, we identified 64 putative epitopes (peptides) located in the Gag, Nef, Env, Pol and Tat protein regions of HIV. In total, 73% of the predicted peptides were found to induce HIV-specific CD4+ T cell responses. The Gag and Nef peptides induced most responses. The vast majority of the peptides (93%) had predicted restriction to the patient's HLA alleles. Interestingly, the viral load in viremic patients was inversely correlated to the number of targeted Gag peptides. In addition, the predicted Gag peptides were found to induce broader polyfunctional CD4+ T cell responses compared to the commonly used Gag-p55 peptide pool. These results demonstrate the power of the PopCover method for the identification of broadly recognized HLA class II-restricted epitopes. All together, selection strategies, such as PopCover, might with success be used for the evaluation of antigen-specific CD4+ T cell responses and design of future vaccines.
- Published
- 2012
13. Reliable B cell epitope predictions: impacts of method development and improved benchmarking.
- Author
-
Kringelum JV, Lundegaard C, Lund O, and Nielsen M
- Subjects
- Epitopes chemistry, Humans, Models, Molecular, Odds Ratio, B-Lymphocytes immunology, Benchmarking, Epitopes immunology
- Abstract
The interaction between antibodies and antigens is one of the most important immune system mechanisms for clearing infectious organisms from the host. Antibodies bind to antigens at sites referred to as B-cell epitopes. Identification of the exact location of B-cell epitopes is essential in several biomedical applications such as rational vaccine design and the development of disease diagnostics and immunotherapeutics. However, experimental mapping of epitopes is resource intensive, making in silico methods an appealing complementary approach. To date, the reported performance of methods for in silico mapping of B-cell epitopes has been moderate. Several issues regarding the evaluation data sets may however have led to the performance values being underestimated: Rarely, all potential epitopes have been mapped on an antigen, and antibodies are generally raised against the antigen in a given biological context, not against the antigen monomer. Improper dealing with these aspects leads to many artificial false positive predictions and hence to incorrectly low performance values. To demonstrate the impact of proper benchmark definitions, we here present an updated version of the DiscoTope method incorporating a novel spatial neighborhood definition and half-sphere exposure as surface measure. Compared to other state-of-the-art prediction methods, DiscoTope-2.0 displayed improved performance both in cross-validation and in independent evaluations. Using DiscoTope-2.0, we assessed the impact on performance when using proper benchmark definitions. For 13 proteins in the training data set where sufficient biological information was available to make a proper benchmark redefinition, the average AUC performance was improved from 0.791 to 0.824. Similarly, the average AUC performance on an independent evaluation data set improved from 0.712 to 0.727. 
Our results thus demonstrate that given proper benchmark definitions, B-cell epitope prediction methods achieve highly significant predictive performances suggesting these tools to be a powerful asset in rational epitope discovery. The updated version of DiscoTope is available at www.cbs.dtu.dk/services/DiscoTope-2.0.
- Published
- 2012
14. Human leukocyte antigen (HLA) class I restricted epitope discovery in yellow fewer and dengue viruses: importance of HLA binding strength.
- Author
-
Lund O, Nascimento EJ, Maciel M Jr, Nielsen M, Larsen MV, Lundegaard C, Harndahl M, Lamberth K, Buus S, Salmon J, August TJ, and Marques ET Jr
- Subjects
- Amino Acid Sequence, Animals, Enzyme-Linked Immunosorbent Assay, Epitopes chemistry, Humans, Mice, Mice, Transgenic, Molecular Sequence Data, Yellow Fever Vaccine immunology, Dengue Virus immunology, Epitopes immunology, Histocompatibility Antigens Class I immunology, Yellow fever virus immunology
- Abstract
Epitopes from all available full-length sequences of yellow fever virus (YFV) and dengue fever virus (DENV) restricted by Human Leukocyte Antigen class I (HLA-I) alleles covering 12 HLA-I supertypes were predicted using the NetCTL algorithm. A subset of 179 predicted YFV and 158 predicted DENV epitopes were selected using the EpiSelect algorithm to allow for optimal coverage of viral strains. The selected predicted epitopes were synthesized and approximately 75% were found to bind the predicted restricting HLA molecule with an affinity, K(D), stronger than 500 nM. The immunogenicity of 25 HLA-A*02:01, 28 HLA-A*24:02 and 28 HLA-B*07:02 binding peptides was tested in three HLA-transgenic mice models and led to the identification of 17 HLA-A*02:01, 4 HLA-A*2402 and 4 HLA-B*07:02 immunogenic peptides. The immunogenic peptides bound HLA significantly stronger than the non-immunogenic peptides. All except one of the immunogenic peptides had K(D) below 100 nM and the peptides with K(D) below 5 nM were more likely to be immunogenic. In addition, all the immunogenic peptides that were identified as having a high functional avidity had K(D) below 20 nM. A*02:01 transgenic mice were also inoculated twice with the 17DD YFV vaccine strain. Three of the YFV A*02:01 restricted peptides activated T-cells from the infected mice in vitro. All three peptides that elicited responses had an HLA binding affinity of 2 nM or less. The results indicate the importance of the strength of HLA binding in shaping the immune response.
- Published
- 2011
15. NetTurnP--neural network prediction of beta-turns by use of evolutionary information and predicted protein sequence features.
- Author
-
Petersen B, Lundegaard C, and Petersen TN
- Subjects
- Amino Acid Sequence, Computational Biology methods, Evolution, Molecular, Internet, Molecular Sequence Data, Proteins genetics, Reproducibility of Results, Algorithms, Neural Networks, Computer, Protein Structure, Secondary, Proteins chemistry
- Abstract
β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore, NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprised of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC=0.50, Qtotal=82.1%, sensitivity=75.6%, PPV=68.8% and AUC=0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17-0.47. For the type specific β-turn predictions, only type I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. Conclusion: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.
- Published
- 2010
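For readers unfamiliar with the two-class measures quoted in the abstract of record 15 (MCC, Qtotal, sensitivity, PPV), the sketch below computes them from a confusion matrix using their standard textbook definitions. The function name `two_class_metrics` and the example counts are hypothetical; the counts are not taken from the NetTurnP evaluation.

```python
from math import sqrt
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero (undefined case)."""
    return numerator / denominator if denominator else 0.0


def two_class_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard two-class performance measures from confusion-matrix counts."""
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        # Matthews correlation coefficient: +1 perfect, 0 random, -1 inverted.
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
        # Qtotal: fraction of all residues classified correctly (accuracy).
        "q_total": _ratio(tp + tn, tp + fp + tn + fn),
        # Sensitivity: fraction of true positives that were recovered.
        "sensitivity": _ratio(tp, tp + fn),
        # PPV: fraction of positive predictions that were correct.
        "ppv": _ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    for name, value in two_class_metrics(tp=40, fp=10, tn=40, fn=10).items():
        print(f"{name}: {value:.3f}")
```

MCC is often reported alongside accuracy for problems like β-turn prediction because it remains informative when the two classes are unbalanced.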
16. NetMHCIIpan-2.0 - Improved pan-specific HLA-DR predictions using a novel concurrent alignment and weight optimization training procedure.
- Author
-
Nielsen M, Justesen S, Lund O, Lundegaard C, and Buus S
- Abstract
Background: Binding of peptides to Major Histocompatibility class II (MHC-II) molecules plays a central role in governing responses of the adaptive immune system. MHC-II molecules sample peptides from the extracellular space allowing the immune system to detect the presence of foreign microbes from this compartment. Predicting which peptides bind to an MHC-II molecule is therefore of pivotal importance for understanding the immune response and its effect on host-pathogen interactions. The experimental cost associated with characterizing the binding motif of an MHC-II molecule is significant and large efforts have therefore been placed in developing accurate computer methods capable of predicting this binding event. Prediction of peptide binding to MHC-II is complicated by the open binding cleft of the MHC-II molecule, allowing binding of peptides extending out of the binding groove. Moreover, the genes encoding the MHC molecules are immensely diverse leading to a large set of different MHC molecules each potentially binding a unique set of peptides. Characterizing each MHC-II molecule using peptide-screening binding assays is hence not a viable option. Results: Here, we present an MHC-II binding prediction algorithm aiming at dealing with these challenges. The method is a pan-specific version of the earlier published allele-specific NN-align algorithm and does not require any pre-alignment of the input data. This allows the method to benefit also from information from alleles covered by limited binding data. The method is evaluated on a large and diverse set of benchmark data, and is shown to significantly out-perform state-of-the-art MHC-II prediction methods. In particular, the method is found to boost the performance for alleles characterized by limited binding data where conventional allele-specific methods tend to achieve poor prediction accuracy. Conclusions: The method thus shows great potential for efficiently boosting the accuracy of MHC-II binding prediction, as accurate predictions can be obtained for novel alleles at highly reduced experimental costs. Pan-specific binding predictions can be obtained for all alleles with known protein sequence and the method can benefit by including data in the training from alleles even where only few binders are known. The method and benchmark data are available at http://www.cbs.dtu.dk/services/NetMHCIIpan-2.0.
- Published
- 2010
17. State of the art and challenges in sequence based T-cell epitope prediction.
- Author
-
Lundegaard C, Hoof I, Lund O, and Nielsen M
- Abstract
Sequence based T-cell epitope predictions have improved immensely in the last decade. From predictions of peptide binding to major histocompatibility complex molecules with moderate accuracy, limited allele coverage, and no good estimates of the other events in the antigen-processing pathway, the field has evolved significantly. Methods have now been developed that produce highly accurate binding predictions for many alleles and integrate both proteasomal cleavage and transport events. Moreover, so-called pan-specific methods have been developed, which allow for prediction of peptide binding to MHC alleles characterized by limited or no peptide binding data. Most of the developed methods are publicly available, and have proven to be very useful as a shortcut in epitope discovery. Here, we will go through some of the history of sequence-based predictions of helper as well as cytotoxic T cell epitopes. We will focus on some of the most accurate methods and their basic background.
- Published
- 2010
18. Major histocompatibility complex class I binding predictions as a tool in epitope discovery.
- Author
-
Lundegaard C, Lund O, Buus S, and Nielsen M
- Subjects
- Animals, Epitopes, T-Lymphocyte chemistry, Histocompatibility Antigens Class I chemistry, Humans, Protein Binding immunology, Computational Biology methods, Epitopes, T-Lymphocyte immunology, Epitopes, T-Lymphocyte metabolism, Histocompatibility Antigens Class I immunology, Histocompatibility Antigens Class I metabolism
- Abstract
Summary: Over the last decade, in silico models of the major histocompatibility complex (MHC) class I pathway have developed significantly. Before, peptide binding could only be reliably modelled for a few major human or mouse histocompatibility molecules; now, high-accuracy predictions are available for any human leucocyte antigen (HLA) -A or -B molecule with known protein sequence. Furthermore, peptide binding to MHC molecules from several non-human primates, mouse strains and other mammals can now be predicted. In this review, a number of different prediction methods are briefly explained, highlighting the most useful and historically important. Selected case stories, where these 'reverse immunology' systems have been used in actual epitope discovery, are briefly reviewed. We conclude that this new generation of epitope discovery systems has become a highly efficient tool for epitope discovery, and recommend that the less accurate prediction systems of the past be abandoned, as these are obsolete.
- Published
- 2010
19. MHC class II epitope predictive algorithms.
- Author
-
Nielsen M, Lund O, Buus S, and Lundegaard C
- Subjects
- Animals, Epitopes chemistry, Epitopes genetics, Histocompatibility Antigens Class II chemistry, Histocompatibility Antigens Class II genetics, Humans, Protein Binding immunology, Computational Biology methods, Epitopes immunology, Epitopes metabolism, Histocompatibility Antigens Class II immunology, Histocompatibility Antigens Class II metabolism
- Abstract
Summary: Major histocompatibility complex class II (MHC-II) molecules sample peptides from the extracellular space, allowing the immune system to detect the presence of foreign microbes from this compartment. To be able to predict the immune response to given pathogens, a number of methods have been developed to predict peptide-MHC binding. However, few methods other than the pioneering TEPITOPE/ProPred method have been developed for MHC-II. Despite recent progress in method development, the predictive performance for MHC-II remains significantly lower than what can be obtained for MHC-I. One reason for this is that the MHC-II molecule is open at both ends allowing binding of peptides extending out of the groove. The binding core of MHC-II-bound peptides is therefore not known a priori and the binding motif is hence not readily discernible. Recent progress has been obtained by including the flanking residues in the predictions. All attempts to make ab initio predictions based on protein structure have failed to reach predictive performances similar to those that can be obtained by data-driven methods. Thousands of different MHC-II alleles exist in humans. Recently developed pan-specific methods have been able to make reasonably accurate predictions for alleles that were not included in the training data. These methods can be used to define supertypes (clusters) of MHC-II alleles where alleles within each supertype have similar binding specificities. Furthermore, the pan-specific methods have been used to make a graphical atlas such as the MHCMotifviewer, which allows for visual comparison of specificities of different alleles.
- Published
- 2010
20. CPHmodels-3.0--remote homology modeling using structure-guided sequence profiles.
- Author
-
Nielsen M, Lundegaard C, Lund O, and Petersen TN
- Subjects
- Algorithms, Internet, Protein Folding, Reproducibility of Results, Sequence Analysis, Protein, User-Computer Interface, Software, Structural Homology, Protein
- Abstract
CPHmodels-3.0 is a web server predicting protein 3D structure by use of single-template homology modeling. The server employs a hybrid of the scoring functions of CPHmodels-2.0 and a novel remote homology-modeling algorithm. A query sequence is first modeled using the fast CPHmodels-2.0 profile-profile scoring function, which is suitable for close homology modeling. The new, computationally costly remote homology-modeling algorithm is engaged only if no suitable PDB template is identified in the initial search. CPHmodels-3.0 was benchmarked in the CASP8 competition and produced models for 94% of the targets (117 out of 128); 74% of these were predicted as high-reliability models (87 out of 117). These achieved an average RMSD of 4.6 Å when superimposed on the 3D structure. The remaining 26% low-reliability models (30 out of 117) could be superimposed on the true 3D structure with an average RMSD of 9.3 Å. These performance values place the CPHmodels-3.0 method in the group of high-performing 3D prediction tools. Besides its accuracy, one of the important features of the method is its speed. For most queries, the response time of the server is <20 min. The web server is available at http://www.cbs.dtu.dk/services/CPHmodels/.
- Published
- 2010
- Full Text
- View/download PDF
21. A generic method for assignment of reliability scores applied to solvent accessibility predictions.
- Author
-
Petersen B, Petersen TN, Andersen P, Nielsen M, and Lundegaard C
- Subjects
- Algorithms, Computational Biology, Databases, Protein, Neural Networks, Computer, Proteins chemistry, Solvents chemistry
- Abstract
Background: Estimation of the reliability of specific real-value predictions is nontrivial, and the efficacy of this is often questionable. It is important to know whether a given prediction can be trusted, and therefore the best methods associate a prediction with a reliability score or index. For discrete qualitative predictions, the reliability is conventionally estimated as the difference between output scores of selected classes. Such an approach is not feasible for methods that predict a biological feature as a single real value rather than a classification. As a solution to this challenge, we have implemented a method that predicts the relative surface accessibility of an amino acid and simultaneously predicts the reliability for each prediction, in the form of a Z-score. Results: An ensemble of artificial neural networks has been trained on a set of experimentally solved protein structures to predict the relative exposure of the amino acids. The method assigns a reliability score to each surface accessibility prediction as an inherent part of the training process. This is in contrast to the most commonly used procedures where reliabilities are obtained by post-processing the output. Conclusion: The performance of the neural networks was evaluated on a commonly used set of sequences known as the CB513 set. An overall Pearson's correlation coefficient of 0.72 was obtained, which is comparable to the performance of the currently best publicly available method, Real-SPINE. Both methods associate a reliability score with the individual predictions. However, our implementation of reliability scores in the form of a Z-score is shown to be the more informative measure for discriminating good predictions from bad ones in the entire range from completely buried to fully exposed amino acids. This is evident when comparing the Pearson's correlation coefficient for the upper 20% of predictions sorted according to reliability. For this subset, values of 0.79 and 0.74 are obtained using our and the compared method, respectively. This tendency is true for any selected subset.
- Published
- 2009
- Full Text
- View/download PDF
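Note on record 21: the abstract evaluates reliability scores by computing Pearson's correlation coefficient on the 20% of predictions with the highest reliability. The following is a minimal sketch of that evaluation step only, not the authors' implementation. The function names (`pearson`, `pearson_top_fraction`) and the input lists are illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def pearson_top_fraction(predicted, observed, reliability, fraction=0.2):
    """Correlation restricted to the most reliable fraction of predictions.

    `reliability` is any per-prediction score where higher means more
    trustworthy (a Z-score in the abstract; hypothetical values here).
    """
    order = sorted(range(len(predicted)),
                   key=lambda i: reliability[i], reverse=True)
    # Keep at least two points so that a correlation can be computed.
    keep = order[:max(2, int(len(order) * fraction))]
    return pearson([predicted[i] for i in keep],
                   [observed[i] for i in keep])
```

Comparing the value returned for `fraction=0.2` against the correlation over all predictions indicates whether the reliability score actually separates good predictions from bad ones.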
22. Pan-specific MHC class I predictors: a benchmark of HLA class I pan-specific prediction methods.
- Author
-
Zhang H, Lundegaard C, and Nielsen M
- Subjects
- Alleles, Area Under Curve, Databases, Protein, HLA Antigens immunology, Histocompatibility Antigens Class I chemistry, Humans, Ligands, Peptides immunology, Protein Binding, Statistics, Nonparametric, Computational Biology methods, Histocompatibility Antigens Class I immunology
- Abstract
Motivation: MHC:peptide binding plays a central role in activating the immune surveillance. Computational approaches to determine T-cell epitopes restricted to any given major histocompatibility complex (MHC) molecule are of special practical value in the development of, for instance, vaccines with broad population coverage against emerging pathogens. Methods have recently been published that are able to predict peptide binding to any human MHC class I molecule. In contrast to conventional allele-specific methods, these methods do allow for extrapolation to uncharacterized MHC molecules. These pan-specific human leukocyte antigen (HLA) predictors have not previously been compared using independent evaluation sets. Result: A diverse set of quantitative peptide binding affinity measurements was collected from the Immune Epitope Database (IEDB), together with a large set of HLA class I ligands from the SYFPEITHI database. Based on these datasets, three different pan-specific HLA web-accessible predictors, NetMHCpan, adaptive double threading (ADT) and kernel-based inter-allele peptide binding prediction system (KISS), were evaluated. The performance of the pan-specific predictors was also compared with a well-performing allele-specific MHC class I predictor, NetMHC, as well as a consensus approach integrating the predictions from the NetMHC and NetMHCpan methods. Conclusions: The benchmark demonstrated that pan-specific methods do provide accurate predictions also for previously uncharacterized MHC molecules. The NetMHCpan method trained to predict actual binding affinities was consistently top ranking both on quantitative (affinity) and binary (ligand) data. However, the KISS method trained to predict binary data was one of the best-performing methods when benchmarked on binary data. Finally, a consensus method integrating predictions from the two best-performing methods was shown to improve the prediction accuracy.
- Published
- 2009
- Full Text
- View/download PDF
23. Quantitative predictions of peptide binding to any HLA-DR molecule of known sequence: NetMHCIIpan.
- Author
-
Nielsen M, Lundegaard C, Blicher T, Peters B, Sette A, Justesen S, Buus S, and Lund O
- Subjects
- Algorithms, Alleles, Amino Acid Sequence physiology, Binding Sites genetics, Binding Sites immunology, Databases, Protein, HLA-DR Antigens genetics, HLA-DR Antigens immunology, Humans, Major Histocompatibility Complex genetics, Molecular Sequence Data, Predictive Value of Tests, Protein Binding immunology, Reproducibility of Results, Sequence Alignment, Sequence Analysis, Protein, HLA-DR Antigens metabolism, Protein Interaction Mapping methods
- Abstract
CD4 positive T helper cells control many aspects of specific immunity. These cells are specific for peptides derived from protein antigens and presented by molecules of the extremely polymorphic major histocompatibility complex (MHC) class II system. The identification of peptides that bind to MHC class II molecules is therefore of pivotal importance for rational discovery of immune epitopes. HLA-DR is a prominent example of a human MHC class II. Here, we present a method, NetMHCIIpan, that allows for pan-specific predictions of peptide binding to any HLA-DR molecule of known sequence. The method is derived from a large compilation of quantitative HLA-DR binding events covering 14 of the more than 500 known HLA-DR alleles. Taking both peptide and HLA sequence information into account, the method can generalize and predict peptide binding also for HLA-DR molecules where experimental data is absent. Validation of the method includes identification of endogenously derived HLA class II ligands, cross-validation, leave-one-molecule-out, and binding motif identification for hitherto uncharacterized HLA-DR molecules. The validation shows that the method can successfully predict binding for HLA-DR molecules, even in the absence of specific data for the particular molecule in question. Moreover, when compared to TEPITOPE, currently the only other publicly available prediction method aiming at providing broad HLA-DR allelic coverage, NetMHCIIpan performs equivalently for alleles included in the training of TEPITOPE while outperforming TEPITOPE on novel alleles. We propose that the method can be used to identify those hitherto uncharacterized alleles, which should be addressed experimentally in future updates of the method to cover the polymorphism of HLA-DR most efficiently. We thus conclude that the presented method meets the challenge of keeping up with the MHC polymorphism discovery rate and that it can be used to sample the MHC "space," enabling a highly efficient iterative process for improving MHC class II binding predictions.
- Published
- 2008
- Full Text
- View/download PDF
24. NetMHC-3.0: accurate web accessible predictions of human, mouse and monkey MHC class I affinities for peptides of length 8-11.
- Author
-
Lundegaard C, Lamberth K, Harndahl M, Buus S, Lund O, and Nielsen M
- Subjects
- Alleles, Animals, Epitopes chemistry, HLA Antigens genetics, Haplorhini genetics, Histocompatibility Antigens Class I genetics, Humans, Internet, Mice, Peptides chemistry, HLA Antigens metabolism, Histocompatibility Antigens Class I metabolism, Peptides immunology, Software
- Abstract
NetMHC-3.0 is trained on a large number of quantitative peptide data using both affinity data from the Immune Epitope Database and Analysis Resource (IEDB) and elution data from SYFPEITHI. The method generates high-accuracy predictions of major histocompatibility complex (MHC):peptide binding. The predictions are based on artificial neural networks trained on data from 55 MHC alleles (43 human and 12 non-human), and position-specific scoring matrices (PSSMs) for an additional 67 HLA alleles. As the only MHC class I prediction server available to do so, it offers predictions for peptides of length 8-11 for all 122 alleles. Artificial neural network predictions are given as actual IC50 values, whereas PSSM predictions are given as log-odds likelihood scores. The output is optionally available as a download for easy post-processing. The training method underlying the server is the best available, and has been used to predict possible MHC-binding peptides in a series of pathogenic viral proteomes including SARS, Influenza and HIV, resulting in an average of 75-80% confirmed MHC binders. Here, the performance is further validated and benchmarked using a large set of newly published affinity data, non-redundant to the training set. The server is free to use and available at: http://www.cbs.dtu.dk/services/NetMHC.
- Published
- 2008
- Full Text
- View/download PDF
25. Accurate approximation method for prediction of class I MHC affinities for peptides of length 8, 10 and 11 using prediction tools trained on 9mers.
- Author
-
Lundegaard C, Lund O, and Nielsen M
- Subjects
- Artificial Intelligence, Binding Sites, Protein Binding, Reproducibility of Results, Sensitivity and Specificity, Algorithms, Histocompatibility Antigens Class I chemistry, Peptides chemistry, Protein Interaction Mapping methods, Sequence Analysis, Protein methods, Software
- Abstract
Several accurate prediction systems have been developed for prediction of class I major histocompatibility complex (MHC):peptide binding. Most of these are trained on binding affinity data of primarily 9mer peptides. Here, we show how prediction methods trained on 9mer data can be used for accurate binding affinity prediction of peptides of length 8, 10 and 11. The method gives the opportunity to predict peptides with a different length than nine for MHC alleles where no such peptides have been measured. As validation, the performance of this approach is compared to predictors trained on peptides of the peptide length in question. In this validation, the approximation method has an accuracy that is comparable to or better than methods trained on a peptide length identical to the predicted peptides. Availability: The algorithm has been implemented in the web-accessible servers NetMHC-3.0: http://www.cbs.dtu.dk/services/NetMHC-3.0, and NetMHCpan-1.1: http://www.cbs.dtu.dk/services/NetMHCpan-1.1
- Published
- 2008
- Full Text
- View/download PDF
26. Modeling the adaptive immune system: predictions and simulations.
- Author
-
Lundegaard C, Lund O, Kesmir C, Brunak S, and Nielsen M
- Subjects
- Animals, Computer Simulation, Humans, Adaptation, Physiological immunology, Epitope Mapping methods, Immunity, Innate immunology, Immunologic Factors immunology, Models, Immunological
- Abstract
Motivation: Immunological bioinformatics methods are applicable to a broad range of scientific areas. The specifics of how and where they might be implemented have recently been reviewed in the literature. However, the background and concerns for selecting between the different available methods have so far not been adequately covered. Summary: Before using prediction systems, it is necessary to understand not only how the methods are constructed but also their strengths and limitations. The prediction systems in humoral epitope discovery are still in their infancy, but have reached a reasonable level of predictive strength. In cellular immunology, MHC class I binding predictions are now very strong and cover most of the known HLA specificities. These systems work well for epitope discovery, and predictions of the MHC class I pathway have been further improved by integration with state-of-the-art prediction tools for proteasomal cleavage and TAP binding. By comparison, class II MHC binding predictions have not developed to a comparable accuracy level, but new tools have emerged that deliver significantly improved predictions not only in terms of accuracy, but also in MHC specificity coverage. Simulation systems and mathematical modeling are also now beginning to reach a level where these methods will be able to answer more complex immunological questions.
- Published
- 2007
- Full Text
- View/download PDF
27. Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction.
- Author
-
Larsen MV, Lundegaard C, Lamberth K, Buus S, Lund O, and Nielsen M
- Subjects
- Binding Sites, Protein Binding, Algorithms, Epitope Mapping methods, Epitopes, T-Lymphocyte chemistry, Epitopes, T-Lymphocyte immunology, Sequence Analysis, Protein methods, T-Lymphocytes, Cytotoxic chemistry, T-Lymphocytes, Cytotoxic immunology
- Abstract
Background: Reliable predictions of cytotoxic T lymphocyte (CTL) epitopes are essential for rational vaccine design. Most importantly, they can minimize the experimental effort needed to identify epitopes. NetCTL is a web-based tool designed for predicting human CTL epitopes in any given protein. It does so by integrating predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I affinity. At least four other methods have been developed recently that likewise attempt to predict CTL epitopes: EpiJen, MAPPP, MHC-pathway, and WAPP. In order to compare the performance of prediction methods, objective benchmarks and standardized performance measures are needed. Here, we develop such a large-scale benchmark and corresponding performance measures and report the performance of an updated version 1.2 of NetCTL in comparison with the four other methods. Results: We define a number of performance measures that can handle the different types of output data from the five methods. We use two evaluation datasets consisting of known HIV CTL epitopes and their source proteins. The source proteins are split into all possible 9mers; except for annotated epitopes, all 9mers are considered non-epitopes. In the RANK measure, we compare two methods at a time and count how often each of the methods ranks the epitope highest. In another measure, we find the specificity of the methods at three predefined sensitivity values. Lastly, for each method, we calculate the percentage of known epitopes that rank within the top 5% of peptides by predicted score. Conclusion: NetCTL-1.2 is demonstrated to have a higher predictive performance than EpiJen, MAPPP, MHC-pathway, and WAPP on all performance measures. The higher performance of NetCTL-1.2 as compared to EpiJen and MHC-pathway is, however, not statistically significant on all measures. In the large-scale benchmark calculation consisting of 216 known HIV epitopes covering all 12 recognized HLA supertypes, the NetCTL-1.2 method was shown to have a sensitivity among the 5% top-scoring peptides above 0.72. On this dataset, the best of the other methods achieved a sensitivity of 0.64. The NetCTL-1.2 method is available at http://www.cbs.dtu.dk/services/NetCTL. All used datasets are available at http://www.cbs.dtu.dk/suppl/immunology/CTL-1.2.php.
- Published
- 2007
- Full Text
- View/download PDF
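Note on record 27: the abstract describes splitting each source protein into all possible 9mers, treating annotated epitopes as positives and every other 9mer as a non-epitope, and then reporting the share of known epitopes that fall within the top-scoring 5% of peptides. The sketch below illustrates that measure under stated assumptions. The `score` argument is a hypothetical stand-in for any predictor and is not the NetCTL scoring function, and the function names are illustrative.

```python
def split_into_kmers(protein, k=9):
    """Return every overlapping k-mer of a protein sequence, in order."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def top_fraction_sensitivity(protein, epitopes, score, fraction=0.05, k=9):
    """Fraction of known epitopes ranked within the top-scoring peptides.

    `score` maps a peptide string to a number (higher = more likely
    epitope). It is a placeholder for a real predictor.
    """
    peptides = split_into_kmers(protein, k)
    ranked = sorted(peptides, key=score, reverse=True)
    # Always keep at least one peptide in the top set.
    n_top = max(1, int(len(ranked) * fraction))
    top = set(ranked[:n_top])
    # Only epitopes that actually occur in this protein are counted.
    present = set(peptides)
    known = [e for e in epitopes if e in present]
    if not known:
        return 0.0
    return sum(e in top for e in known) / len(known)
```

With a toy score such as `lambda p: p.count("L")`, a leucine-only 9mer embedded in an alanine background is ranked first, so the measure returns 1.0 for that epitope. Ties between equally scored peptides are broken by sequence order here, which is a simplification.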
28. NetMHCpan, a method for quantitative predictions of peptide binding to any HLA-A and -B locus protein of known sequence.
- Author
-
Nielsen M, Lundegaard C, Blicher T, Lamberth K, Harndahl M, Justesen S, Røder G, Peters B, Sette A, Lund O, and Buus S
- Subjects
- Binding Sites, HLA-A Antigens metabolism, HLA-B Antigens metabolism, Humans, Peptides chemistry, Computational Biology methods, HLA-A Antigens chemistry, HLA-B Antigens chemistry, Peptides metabolism, Software
- Abstract
Background: Binding of peptides to Major Histocompatibility Complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC class I system (HLA-I) is extremely polymorphic. The number of registered HLA-I molecules has now surpassed 1500. Characterizing the specificity of each separately would be a major undertaking. Principal Findings: Here, we have drawn on a large database of known peptide-HLA-I interactions to develop a bioinformatics method, which takes both peptide and HLA sequence information into account, and generates quantitative predictions of the affinity of any peptide-HLA-I interaction. Prospective experimental validation of peptides predicted to bind to previously untested HLA-I molecules, cross-validation, and retrospective prediction of known HIV immune epitopes and endogenously presented peptides, all successfully validate this method. We further demonstrate that the method can be applied to perform a clustering analysis of MHC specificities and suggest using this clustering to select particularly informative novel MHC molecules for future biochemical and functional analysis. Conclusions: Encompassing all HLA molecules, this high-throughput computational method lends itself to epitope searches that are not only genome- and pathogen-wide, but also HLA-wide. Thus, it offers a truly global analysis of immune responses supporting rational development of vaccines and immunotherapy. It also promises to provide new basic insights into HLA structure-function relationships. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan.
- Published
- 2007
- Full Text
- View/download PDF
29. Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method.
- Author
-
Nielsen M, Lundegaard C, and Lund O
- Subjects
- Algorithms, Alleles, Amino Acid Motifs, Amino Acid Sequence, Animals, Databases, Genetic, Epitopes, HLA-DR Antigens chemistry, HLA-DR Antigens immunology, Humans, Inhibitory Concentration 50, Mice, Monte Carlo Method, Peptides chemistry, Peptides immunology, Predictive Value of Tests, Protein Binding, Reproducibility of Results, Sequence Alignment, Histocompatibility Antigens Class II chemistry, Histocompatibility Antigens Class II immunology, Sequence Analysis, Protein methods
- Abstract
Background: Antigen presenting cells (APCs) sample the extracellular space and present peptides from there to T helper cells, which can be activated if the peptides are of foreign origin. The peptides are presented on the surface of the cells in complex with major histocompatibility complex class II (MHC II) molecules. Identification of peptides that bind MHC II molecules is thus a key step in rational vaccine design, and the development of methods for accurate prediction of peptide:MHC interactions plays a central role in epitope discovery. The MHC class II binding groove is open at both ends, making the correct alignment of a peptide in the binding groove a crucial part of identifying the core of an MHC class II binding motif. Here, we present a novel stabilization matrix alignment method, SMM-align, that allows for direct prediction of peptide:MHC binding affinities. The predictive performance of the method is validated on a large MHC class II benchmark data set covering 14 HLA-DR (human MHC) and three mouse H2-IA alleles. Results: The predictive performance of the SMM-align method was demonstrated to be superior to that of the Gibbs sampler, TEPITOPE, SVRMHC, and MHCpred methods. Cross validation between peptide data sets obtained from different sources demonstrated that direct incorporation of peptide length potentially results in over-fitting of the binding prediction method. Focusing on amino terminal peptide flanking residues (PFR), we demonstrate a consistent gain in predictive performance by favoring binding registers with a minimum PFR length of two amino acids. Visualizing the binding motif as obtained by the SMM-align and TEPITOPE methods highlights a series of fundamental discrepancies between the two predicted motifs. For the DRB1*1302 allele for instance, the TEPITOPE method favors basic amino acids at most anchor positions, whereas the SMM-align method identifies a preference for hydrophobic or neutral amino acids at the anchors. Conclusion: The SMM-align method was shown to outperform other state-of-the-art MHC class II prediction methods. The method predicts quantitative peptide:MHC binding affinity values, making it ideally suited for rational epitope discovery. The method has been trained and evaluated on the, to our knowledge, largest benchmark data set publicly available and covers the nine HLA-DR supertypes suggested as well as three mouse H2-IA alleles. Both the peptide benchmark data set and the SMM-align prediction method (NetMHCII) are made publicly available.
- Published
- 2007
- Full Text
- View/download PDF
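Note on record 29: because the MHC class II groove is open at both ends, a peptide can sit in several binding registers, each leaving different flanking residues. The abstract reports a gain from favoring registers whose amino-terminal peptide flanking region (PFR) has at least two residues. The sketch below only enumerates registers and flags those that meet this condition. It does not reproduce SMM-align scoring, and the 9-residue core length, the function name, and the dictionary keys are assumptions made for illustration.

```python
def binding_registers(peptide, core_len=9, min_nterm_pfr=2):
    """List every candidate binding register of a peptide.

    Each register records the core, both flanking regions, and whether
    the amino-terminal flank reaches the minimum length. The core length
    of 9 is an assumption, not a value stated in the abstract.
    """
    registers = []
    for start in range(len(peptide) - core_len + 1):
        registers.append({
            "core": peptide[start:start + core_len],
            "nterm_pfr": peptide[:start],
            "cterm_pfr": peptide[start + core_len:],
            "favored": start >= min_nterm_pfr,
        })
    return registers
```

For a 15-residue peptide this yields seven registers, of which the first two (amino-terminal flank of zero or one residue) are not flagged as favored. How strongly the favored registers are weighted in the actual method is not stated in the abstract.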
30. Modelling the human immune system by combining bioinformatics and systems biology approaches.
- Author
-
Rapin N, Kesmir C, Frankild S, Nielsen M, Lundegaard C, Brunak S, and Lund O
- Abstract
Over the past decade a number of bioinformatics tools have been developed that use genomic sequences as input to predict to which parts of a microbe the immune system will react, the so-called epitopes. Many predicted epitopes have later been verified experimentally, demonstrating the usefulness of such predictions. At the same time, simulation models have been developed that describe the dynamics of different immune cell populations and their interactions with microbes. These models have been used to explain experimental findings where timing is of importance, such as the time between administration of a vaccine and infection with the microbe that the vaccine is intended to protect against. In this paper, we outline a framework for integration of these two approaches. As an example, we develop a model in which HIV dynamics are correlated with genomics data. For the first time, the fitness of wild-type and mutated virus is assessed by means of a sequence-dependent scoring matrix, derived from a BLOSUM matrix, that links protein sequences to growth rates of the virus in the mathematical model. A combined bioinformatics and systems biology approach can lead to a better understanding of immune system-related diseases where both timing and genomic information are of importance.
- Published
- 2006
- Full Text
- View/download PDF
31. A community resource benchmarking predictions of peptide binding to MHC-I molecules.
- Author
-
Peters B, Bui HH, Frankild S, Nielsen M, Lundegaard C, Kostem E, Basch D, Lamberth K, Harndahl M, Fleri W, Wilson SS, Sidney J, Lund O, Buus S, and Sette A
- Subjects
- Animals, Databases, Factual, HLA Antigens chemistry, Humans, Inhibitory Concentration 50, Macaca, Mice, Neural Networks, Computer, Pan troglodytes, ROC Curve, Software, Histocompatibility Antigens Class I chemistry, Peptides chemistry
- Abstract
Recognition of peptides bound to major histocompatibility complex (MHC) class I molecules by T lymphocytes is an essential part of immune surveillance. Each MHC allele has a characteristic peptide binding preference, which can be captured in prediction algorithms, allowing for the rapid scan of entire pathogen proteomes for peptides likely to bind MHC. Here we make public a large set of 48,828 quantitative peptide-binding affinity measurements relating to 48 different mouse, human, macaque, and chimpanzee MHC class I alleles. We use this data to establish a set of benchmark predictions with one neural network method and two matrix-based prediction methods extensively utilized in our groups. In general, the neural network outperforms the matrix-based predictions mainly due to its ability to generalize even on a small amount of data. We also retrieved predictions from tools publicly available on the internet. While differences in the data used to generate these predictions hamper direct comparisons, we do conclude that tools based on combinatorial peptide libraries perform remarkably well. The transparent prediction evaluation on this dataset provides tool developers with a benchmark for comparison of newly developed prediction methods. In addition, to generate and evaluate our own prediction methods, we have established an easily extensible web-based prediction framework that allows automated side-by-side comparisons of prediction methods implemented by experts. This is an advance over the current practice of tool developers having to generate reference predictions themselves, which can lead to underestimating the performance of prediction methods they are not as familiar with as their own. The overall goal of this effort is to provide a transparent prediction evaluation allowing bioinformaticians to identify promising features of prediction methods and providing guidance to immunologists regarding the reliability of prediction tools.
- Published
- 2006
- Full Text
- View/download PDF
32. Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach.
- Author
-
Nielsen M, Lundegaard C, Worning P, Hvid CS, Lamberth K, Buus S, Brunak S, and Lund O
- Subjects
- Binding Sites, Epitopes, T-Lymphocyte immunology, Histocompatibility Antigens Class I immunology, Histocompatibility Antigens Class II immunology, Major Histocompatibility Complex immunology, Protein Binding, Protein Interaction Mapping methods, Reproducibility of Results, Sensitivity and Specificity, Algorithms, Epitopes, T-Lymphocyte chemistry, Histocompatibility Antigens Class I chemistry, Histocompatibility Antigens Class II chemistry, Sequence Alignment methods, Sequence Analysis, Protein methods
- Abstract
Motivation: Prediction of which peptides will bind a specific major histocompatibility complex (MHC) constitutes an important step in identifying potential T-cell epitopes suitable as vaccine candidates. MHC class II binding peptides have a broad length distribution, complicating such predictions. Thus, identifying the correct alignment is a crucial part of identifying the core of an MHC class II binding motif. In this context, we wish to describe a novel Gibbs motif sampler method ideally suited for recognizing such weak sequence motifs. The method is based on the Gibbs sampling method, and it incorporates novel features optimized for the task of recognizing the binding motif of MHC classes I and II. The method locates the binding motif in a set of sequences and characterizes the motif in terms of a weight-matrix. Subsequently, the weight-matrix can be applied to effectively identify potential MHC binding peptides and to guide the process of rational vaccine design. Results: We apply the motif sampler method to the complex problem of MHC class II binding. The input to the method is amino acid peptide sequences extracted from the public databases of SYFPEITHI and MHCPEP and known to bind to the MHC class II complex HLA-DR4(B1*0401). Prior identification of information-rich (anchor) positions in the binding motif is shown to improve the predictive performance of the Gibbs sampler. Similarly, a consensus solution obtained from an ensemble average over suboptimal solutions is shown to outperform the use of a single optimal solution. In a large-scale benchmark calculation, the performance is quantified using relative operating characteristics curve (ROC) plots and we make a detailed comparison of the performance with that of both the TEPITOPE method and a weight-matrix derived using the conventional alignment algorithm of ClustalW. The calculation demonstrates that the predictive performance of the Gibbs sampler is higher than that of ClustalW and in most cases also higher than that of the TEPITOPE method.
- Published
- 2004
- Full Text
- View/download PDF
33. Reliable prediction of T-cell epitopes using neural networks with novel sequence representations.
- Author
-
Nielsen M, Lundegaard C, Worning P, Lauemøller SL, Lamberth K, Buus S, Brunak S, and Lund O
- Subjects
- Amino Acid Sequence, Epitopes, T-Lymphocyte genetics, Epitopes, T-Lymphocyte metabolism, Genome, Viral, HLA-A2 Antigen chemistry, HLA-A2 Antigen metabolism, Hepacivirus genetics, Hepacivirus immunology, Histocompatibility Antigens Class I chemistry, Humans, Markov Chains, Peptides chemistry, Peptides immunology, Peptides metabolism, Protein Binding, Epitopes, T-Lymphocyte chemistry, Histocompatibility Antigens Class I metabolism, Models, Molecular, Neural Networks, Computer
- Abstract
In this paper we describe an improved neural network method to predict T-cell class I epitopes. A novel input representation has been developed consisting of a combination of sparse encoding, Blosum encoding, and input derived from hidden Markov models. We demonstrate that the combination of several neural networks derived using different sequence-encoding schemes has a performance superior to neural networks derived using a single sequence-encoding scheme. The new method is shown to have a performance that is substantially higher than that of other methods. By use of mutual information calculations we show that peptides that bind to the HLA A*0204 complex display a signal of higher-order sequence correlations. Neural networks are ideally suited to integrate such higher-order correlations when predicting the binding affinity. It is this feature, combined with the use of several neural networks derived from different and novel sequence-encoding schemes and the ability of the neural network to be trained on data consisting of continuous binding affinities, that gives the new method an improved performance. The difference in predictive performance between the neural network methods and that of the matrix-driven methods is found to be most significant for peptides that bind strongly to the HLA molecule, confirming that the signal of higher-order sequence correlation is most strongly present in high-binding peptides. Finally, we use the method to predict T-cell epitopes for the genome of hepatitis C virus and discuss possible applications of the prediction method to guide the process of rational vaccine design.
- Published
- 2003
- Full Text
- View/download PDF
34. Analysis of two large functionally uncharacterized regions in the Methanopyrus kandleri AV19 genome.
- Author
-
Jensen LJ, Skovgaard M, Sicheritz-Pontén T, Jørgensen MK, Lundegaard C, Pedersen CC, Petersen N, and Ussery D
- Subjects
- Amino Acids genetics, Amino Acids physiology, Archaeal Proteins genetics, Archaeal Proteins physiology, Base Composition, DNA, Archaeal analysis, Multigene Family genetics, Multigene Family physiology, Open Reading Frames genetics, Open Reading Frames physiology, Predictive Value of Tests, Transcription Initiation Site, Genes, Archaeal physiology, Genome, Archaeal
- Abstract
Background: For most sequenced prokaryotic genomes, about a third of the protein coding genes annotated are "orphan proteins", that is, they lack homology to known proteins. These hypothetical genes are typically short and randomly scattered throughout the genome. This trend is seen for most of the bacterial and archaeal genomes published to date. Results: In contrast we have found that a large fraction of the genes coding for such orphan proteins in the Methanopyrus kandleri AV19 genome occur within two large regions. These genes have no known homologs except for other M. kandleri genes. However, analysis of their lengths, codon usage, and Ribosomal Binding Site (RBS) sequences shows that they are most likely true protein coding genes and not random open reading frames. Conclusions: Although these regions can be considered as candidates for massive lateral gene transfer, our bioinformatics analysis suggests that this is not the case. We predict many of the organism-specific proteins to be transmembrane and to belong to protein families that are non-randomly distributed between the regions. Consistent with this, we suggest that the two regions are most likely unrelated, and that they may be integrated plasmids.
- Published
- 2003
- Full Text
- View/download PDF