85 results for "Beccuti M"
Search Results
52. GRAPES-DD: exploiting decision diagrams for index-driven search in biological graph databases.
- Author
- Licheri N, Bonnici V, Beccuti M, and Giugno R
- Subjects
- Abstracting and Indexing, Algorithms, Databases, Factual, Vitis
- Abstract
Background: Graphs are mathematical structures widely used to express relationships among elements when representing biomedical and biological information. Several analyses are performed on top of these representations. A common task is the search for one substructure within one graph, called the target. This problem is referred to as one-to-one subgraph search, and it is known to be NP-complete. Heuristics and indexing techniques can be applied to facilitate the search. Indexing techniques are also exploited when searching in a collection of target graphs, referred to as the one-to-many subgraph problem. Filter-and-verification methods that use indexing approaches quickly prune target graphs, or parts of them, that cannot contain the query. The expensive verification phase is then performed only on the subset of promising targets. Indexing strategies extract graph features at a granularity level sufficient for a powerful filtering step. Features are stored in data structures that allow efficient access. Index size, querying time and filtering power are key points for the development of efficient subgraph searching solutions., Results: An existing approach, GRAPES, has shown good performance in terms of speed-up for both the one-to-one and one-to-many cases. However, it suffers from the size of the built index. For this reason, we propose GRAPES-DD, a modified version of GRAPES in which the indexing structure has been replaced with a Decision Diagram. Decision Diagrams are a broad class of data structures widely used to encode and manipulate functions efficiently. Experiments on biomedical structures and synthetic graphs have confirmed our expectations, showing that GRAPES-DD substantially reduces memory utilization compared to GRAPES without worsening the search time., Conclusion: The use of Decision Diagrams for searching in biochemical and biological graphs is completely new and potentially promising, thanks to their ability to compactly encode sets by exploiting their structure and regularity, and to manipulate entire sets of elements at once instead of exploring each element explicitly. Search strategies based on Decision Diagrams make indexing for biochemical graphs, and beyond, more affordable, allowing us to potentially deal with huge and ever-growing collections of biochemical and biological structures.
- Published
- 2021
- Full Text
- View/download PDF
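The filter-and-verification scheme described in entry 52 can be illustrated with a minimal sketch. This is not the GRAPES/GRAPES-DD implementation (which indexes labeled paths far more efficiently, and in GRAPES-DD stores them in a decision diagram); here graphs are plain adjacency dicts, features are node-label paths up to a fixed length, and a target survives the filtering step only if its feature set contains every query feature.

```python
def label_paths(graph, labels, max_len):
    """Enumerate the label sequences of all simple paths with up to max_len nodes."""
    feats = set()

    def walk(node, path):
        feats.add(tuple(labels[n] for n in path))
        if len(path) == max_len:
            return
        for nxt in graph[node]:
            if nxt not in path:          # simple paths only
                walk(nxt, path + [nxt])

    for start in graph:
        walk(start, [start])
    return feats

def filter_targets(query, qlabels, targets, max_len=3):
    """Filtering step of filter-and-verification: a target is a promising
    candidate only if its features include every feature of the query."""
    qfeats = label_paths(query, qlabels, max_len)
    return [name for name, (g, lab) in targets.items()
            if qfeats <= label_paths(g, lab, max_len)]
```

Only the surviving targets would then go through the expensive verification (actual subgraph matching), which is omitted here.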
53. MET Exon 14 Skipping: A Case Study for the Detection of Genetic Variants in Cancer Driver Genes by Deep Learning.
- Author
- Nosi V, Luca A, Milan M, Arigoni M, Benvenuti S, Cacchiarelli D, Cesana M, Riccardo S, Di Filippo L, Cordero F, Beccuti M, Comoglio PM, and Calogero RA
- Subjects
- Genetic Variation genetics, Humans, Neural Networks, Computer, Deep Learning, Exons genetics
- Abstract
Background: Disruption of alternative splicing (AS) is frequently observed in cancer and might represent an important signature for tumor progression and therapy. Exon skipping (ES) represents one of the most frequent AS events, and in non-small cell lung cancer (NSCLC) MET exon 14 skipping was shown to be targetable., Methods: We constructed neural networks (NN/CNN) specifically designed to detect MET exon 14 skipping events using RNAseq data. Furthermore, for discovery purposes we also developed a sparsely connected autoencoder to identify uncharacterized MET isoforms., Results: The neural networks had a MET exon 14 skipping detection rate greater than 94% when tested on a manually curated set of 690 TCGA bronchus and lung samples. When globally applied to 2605 TCGA samples, we observed that most false positives were characterized by a blurry coverage of exon 14; interestingly, they share a common coverage peak in the second intron, and we speculate that this event could be the transcription signature of a LINE1 (Long Interspersed Nuclear Element 1)-MET (Mesenchymal Epithelial Transition receptor tyrosine kinase) fusion., Conclusions: Taken together, our results indicate that neural networks can be an effective tool for quick classification of pathological transcription events, and sparsely connected autoencoders could represent the basis for the development of an effective discovery tool.
- Published
- 2021
- Full Text
- View/download PDF
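The detection task of entry 53 can be illustrated with a deliberately simplified coverage-ratio heuristic: an exon skipped in most transcripts shows depleted read coverage relative to its flanking exons. This is only a toy stand-in for intuition, not the paper's neural-network classifier; the coverage vector, index ranges, and threshold are all illustrative assumptions.

```python
def skipping_score(coverage, exon, flank_left, flank_right):
    """Ratio between mean coverage of an exon and of its flanking exons.
    A score close to 0 suggests the exon is skipped in most transcripts."""
    mean = lambda idx: sum(coverage[i] for i in idx) / len(idx)
    flank = (mean(flank_left) + mean(flank_right)) / 2
    return mean(exon) / flank if flank > 0 else float("nan")

def classify(coverage, exon, flank_left, flank_right, threshold=0.25):
    """Toy classifier: call the exon 'skipped' when its relative coverage
    falls below an (assumed) threshold."""
    score = skipping_score(coverage, exon, flank_left, flank_right)
    return "skipped" if score < threshold else "included"
```

A real classifier would also have to handle the "blurry coverage" false positives the abstract describes, which is precisely where a learned model outperforms a fixed ratio.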
54. Sparsely-connected autoencoder (SCA) for single cell RNAseq data mining.
- Author
- Alessandri L, Cordero F, Beccuti M, Licheri N, Arigoni M, Olivero M, Di Renzo MF, Sapino A, and Calogero R
- Subjects
- Algorithms, Base Sequence genetics, Cluster Analysis, Humans, Neural Networks, Computer, Software, Systems Biology methods, Exome Sequencing methods, Data Mining methods, Sequence Analysis, RNA methods, Single-Cell Analysis methods
- Abstract
Single-cell RNA sequencing (scRNAseq) is an essential tool to investigate cellular heterogeneity. Thus, it would be of great interest to disclose biological information belonging to cell subpopulations, which can be defined by clustering analysis of scRNAseq data. In this manuscript, we report a tool that we developed for the functional mining of single-cell clusters based on a Sparsely-Connected Autoencoder (SCA). This tool allows uncovering hidden features associated with scRNAseq data. We implemented two new metrics, QCC (Quality Control of Cluster) and QCM (Quality Control of Model), which quantify the ability of SCA to reconstruct valuable cell clusters and evaluate the quality of the neural network achievements, respectively. Our data indicate that the SCA-encoded space, derived from different experimentally validated data (TF targets, miRNA targets, kinase targets, and cancer-related immune signatures), can be used to grasp single-cell cluster-specific functional features. In our implementation, SCA efficacy comes from its ability to reconstruct only specific clusters, thus indicating only those clusters where the SCA encoding space is a key element for cell aggregation. SCA analysis is implemented as a module in the rCASC framework and is supported by a GUI to simplify its usage for biologists and medical personnel.
- Published
- 2021
- Full Text
- View/download PDF
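The core idea of the sparsely-connected encoder in entry 54 is that each hidden unit is wired only to a biologically allowed subset of genes (e.g., a TF unit connects only to its known target genes). A minimal sketch of that masked forward pass, assuming toy weights and masks (the real SCA learns its weights by backpropagation in a deep-learning framework):

```python
def masked_encode(x, weights, mask):
    """Forward pass of a sparsely-connected encoder layer.
    mask[h] is the set of gene indices hidden unit h may read;
    weights[h] maps those gene indices to their connection weights."""
    hidden = []
    for h, allowed in enumerate(mask):
        s = sum(weights[h][g] * x[g] for g in allowed)  # only masked genes contribute
        hidden.append(max(0.0, s))                      # ReLU activation
    return hidden
```

Because every hidden unit corresponds to a named regulator or signature, the encoded space stays directly interpretable, which is what makes cluster-specific functional mining possible.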
55. Computational Analysis of Single-Cell RNA-Seq Data.
- Author
- Alessandrì L, Cordero F, Beccuti M, Arigoni M, and Calogero RA
- Subjects
- Cluster Analysis, Datasets as Topic, Gene Expression Profiling methods, High-Throughput Nucleotide Sequencing methods, Humans, Sequence Analysis, RNA methods, Exome Sequencing methods, Computational Biology methods, RNA-Seq methods, Single-Cell Analysis methods
- Abstract
Single-cell RNAseq data can be generated using various technologies, ranging from the isolation of cells by FACS sorting or droplet sequencing to the use of frozen tissue sections retaining the spatial information of cells in their morphological context. The analysis of single-cell RNAseq data is mainly focused on the identification of cell subpopulations characterized by specific gene markers, which can be used to purify the population of interest for further biological studies. This chapter describes the steps required for dataset clustering and marker detection using a droplet dataset and a spatial transcriptomics dataset.
- Published
- 2021
- Full Text
- View/download PDF
56. Computational Analysis of circRNA Expression Data.
- Author
- Ferrero G, Licheri N, De Bortoli M, Calogero RA, Beccuti M, and Cordero F
- Subjects
- Algorithms, Datasets as Topic statistics & numerical data, Humans, RNA, Circular chemistry, RNA, Circular genetics, RNA, Untranslated analysis, RNA, Untranslated chemistry, RNA, Untranslated genetics, RNA-Seq statistics & numerical data, Sequence Analysis, RNA, Software, Transcriptome, Computational Biology methods, RNA, Circular analysis, RNA-Seq methods
- Abstract
Analysis of circular RNA (circRNA) expression from RNA-Seq data can be performed with different algorithms and analysis pipelines, with tools allowing the extraction of heterogeneous information on the expression of this novel class of RNAs. Computational pipelines were developed to facilitate the analysis of circRNA expression by leveraging different public tools in easy-to-use pipelines. This chapter describes the complete workflow for a computationally reproducible analysis of circRNA expression starting from a public RNA-Seq experiment. The main steps of circRNA prediction, annotation, classification, sequence reconstruction, quantification, and differential expression are illustrated.
- Published
- 2021
- Full Text
- View/download PDF
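The circRNA prediction step mentioned in entry 56 typically rests on one signal: a read whose two aligned segments map in reversed genomic order spans a back-splice junction. A minimal sketch of that detection logic, assuming a pre-parsed list of chimeric alignment records (real predictors work directly on aligner output and apply many additional filters):

```python
def backsplice_junctions(segments):
    """segments: records (read_id, chrom, strand, first_part_start, second_part_start).
    In a linear transcript the second read part maps downstream of the first;
    if it maps upstream instead, the read spans a back-splice junction."""
    junctions = {}
    for read, chrom, strand, first, second in segments:
        if second < first:                        # reversed genomic order
            key = (chrom, strand, second, first)  # acceptor/donor coordinates
            junctions.setdefault(key, []).append(read)
    return junctions
```

Grouping supporting reads per junction coordinate is also the basis of the later quantification step: the read count per key is a raw back-splice expression estimate.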
57. Computational modeling of the immune response in multiple sclerosis using epimod framework.
- Author
- Pernice S, Follia L, Maglione A, Pennisi M, Pappalardo F, Novelli F, Clerico M, Beccuti M, Cordero F, and Rolla S
- Subjects
- Algorithms, Daclizumab therapeutic use, Humans, Immunosuppressive Agents therapeutic use, Multiple Sclerosis, Relapsing-Remitting drug therapy, Multiple Sclerosis, Relapsing-Remitting pathology, Stochastic Processes, Immune System physiology, Models, Biological, Multiple Sclerosis, Relapsing-Remitting immunology, User-Computer Interface
- Abstract
Background: Multiple Sclerosis (MS) is nowadays the leading cause of non-traumatic disabilities in young adults in Europe, with more than 700,000 EU cases. Although huge strides have been made over the years, MS etiology remains partially unknown. Furthermore, the presence of various endogenous and exogenous factors can greatly influence the immune response of different individuals, making the disease difficult to study and understand. This becomes more evident, in a personalized fashion, when medical doctors have to choose the best therapy for patient well-being. From this perspective, the use of stochastic models, capable of taking into account all the fluctuations due to unknown factors and individual variability, is highly advisable., Results: We propose a new model to study the immune response in relapsing-remitting MS (RRMS), the most common form of MS, which is characterized by alternating episodes of symptom exacerbation (relapses) and periods of disease stability (remission). In this new model, both the peripheral lymph node/blood vessel and the central nervous system are explicitly represented. The model was created and analysed using epimod, our recently developed general framework for modeling complex biological systems. The effectiveness of our model was then shown by modeling the complex immunological mechanisms characterizing RRMS during its course and under daclizumab (DAC) administration., Conclusions: Simulation results have proven the ability of the model to reproduce in silico the immune T cell balance characterizing the RRMS course and the DAC effects. Furthermore, they confirmed the importance of a timely intervention on the disease course.
- Published
- 2020
- Full Text
- View/download PDF
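Entry 57 advocates stochastic models that capture fluctuations due to individual variability. The standard way to simulate such models exactly is Gillespie's stochastic simulation algorithm; a minimal generic sketch follows (the state, reactions, and rates below are illustrative assumptions, not the paper's RRMS model, which has many coupled immune populations):

```python
import random

def gillespie(state, reactions, t_max, seed=0):
    """Exact stochastic simulation: waiting times are exponential in the total
    propensity, and the next reaction is picked proportionally to its propensity.
    reactions: list of (propensity_fn(state) -> float, update_fn(state))."""
    rng = random.Random(seed)
    t, trace = 0.0, [(0.0, dict(state))]
    while t < t_max:
        props = [rate(state) for rate, _ in reactions]
        total = sum(props)
        if total == 0:                       # no reaction can fire
            break
        t += rng.expovariate(total)          # time to next event
        r = rng.random() * total
        for p, (_, update) in zip(props, reactions):
            r -= p
            if r <= 0:
                update(state)                # fire the chosen reaction
                break
        trace.append((t, dict(state)))
    return trace
```

For example, a single effector T cell population with proliferation and death reactions can be passed in as two (propensity, update) pairs; repeated runs with different seeds expose the relapse-like fluctuations a deterministic ODE model averages away.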
58. Impacts of reopening strategies for COVID-19 epidemic: a modeling study in Piedmont region.
- Author
- Pernice S, Castagno P, Marcotulli L, Maule MM, Richiardi L, Moirano G, Sereno M, Cordero F, and Beccuti M
- Subjects
- Betacoronavirus isolation & purification, COVID-19, Carrier State diagnosis, Carrier State epidemiology, Coronavirus Infections diagnosis, Coronavirus Infections transmission, Disease Susceptibility diagnosis, Disease Susceptibility epidemiology, Humans, Italy epidemiology, Models, Theoretical, Pneumonia, Viral diagnosis, Pneumonia, Viral transmission, Quarantine, SARS-CoV-2, Communicable Disease Control methods, Coronavirus Infections epidemiology, Coronavirus Infections prevention & control, Pandemics prevention & control, Pneumonia, Viral epidemiology, Pneumonia, Viral prevention & control
- Abstract
Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of the coronavirus disease 19 (COVID-19), is a highly transmissible virus. Since the first person-to-person transmission of SARS-CoV-2 was reported in Italy on February 21st, 2020, the number of people infected with SARS-CoV-2 increased rapidly, mainly in northern Italian regions, including Piedmont. A strict lockdown was imposed on March 21st and lasted until May 4th, when a gradual relaxation of the restrictions started. In this context, computational models and computer simulations are among the research tools that epidemiologists can exploit to understand the spread of diseases and to evaluate social measures to counteract, mitigate or delay the spread of the epidemic., Methods: This study presents an extended version of the Susceptible-Exposed-Infected-Removed-Susceptible (SEIRS) model accounting for population age structure. The infectious population is divided into three sub-groups: (i) undetected infected individuals, (ii) quarantined infected individuals and (iii) hospitalized infected individuals. Moreover, the strength of the government restriction measures and the related population response are explicitly represented in the model., Results: The proposed model allows us to investigate different scenarios of COVID-19 spread in Piedmont and the implementation of different infection-control measures and testing approaches. The results show that the implemented control measures have proven effective in containing the epidemic, mitigating the potentially dangerous impact of a large proportion of undetected cases. We also forecast the optimal combination of individual-level measures and community surveillance to contain the new wave of COVID-19 spread after the re-opening of work and social activities., Conclusions: Our model is an effective tool to investigate different scenarios and to inform policy makers about the potential impact of different control strategies. This will be crucial in the upcoming months, when very critical decisions about easing control measures will need to be taken.
- Published
- 2020
- Full Text
- View/download PDF
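The compartmental structure of entry 58 (an SEIRS model whose infectious class is split into undetected, quarantined and hospitalized sub-groups) can be sketched with one Euler step. This is a deliberate simplification: no age structure, no government-restriction strength term, only undetected cases transmit, and all parameter names and values are illustrative assumptions.

```python
def seirs_step(y, p, dt):
    """One explicit Euler step of a simplified SEIRS model with the infectious
    class split into undetected (Iu), quarantined (Iq) and hospitalized (Ih)."""
    S, E, Iu, Iq, Ih, R = y
    N = S + E + Iu + Iq + Ih + R
    new_inf = p["beta"] * S * Iu / N          # only undetected cases transmit
    dS = -new_inf + p["xi"] * R               # xi: waning immunity (the S in SEIRS)
    dE = new_inf - p["sigma"] * E             # sigma: end of incubation
    dIu = p["sigma"] * E - (p["delta"] + p["eta"] + p["gamma"]) * Iu
    dIq = p["delta"] * Iu - p["gamma"] * Iq   # delta: detection -> quarantine
    dIh = p["eta"] * Iu - p["gamma"] * Ih     # eta: detection -> hospitalization
    dR = p["gamma"] * (Iu + Iq + Ih) - p["xi"] * R
    return [v + dt * dv for v, dv in zip(y, (dS, dE, dIu, dIq, dIh, dR))]
```

The derivative terms cancel pairwise, so total population is conserved at every step, a useful sanity check when extending the model with detection or restriction scenarios.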
59. A computational framework for modeling and studying pertussis epidemiology and vaccination.
- Author
- Castagno P, Pernice S, Ghetti G, Povero M, Pradelli L, Paolotti D, Balbo G, Sereno M, and Beccuti M
- Subjects
- Adolescent, Child, Humans, Reproducibility of Results, Computational Biology methods, Computer Simulation standards, Vaccination methods, Whooping Cough epidemiology
- Abstract
Background: Emerging and re-emerging infectious diseases such as Zika, SARS, COVID-19 and pertussis pose a compelling challenge for epidemiologists due to their significant impact on global public health. In this context, computational models and computer simulations are among the research tools that epidemiologists can exploit to better understand the spreading characteristics of these diseases and to decide on vaccination policies, human interaction controls, and other social measures to counter, mitigate or simply delay their spread. Nevertheless, the construction of mathematical models for these diseases and their solution remain challenging tasks, because little effort has been devoted to the definition of a general framework easily accessible even by researchers without advanced modelling and mathematical skills., Results: In this paper we describe a new general modeling framework to study epidemiological systems, whose novelties and strengths are: (1) the use of a graphical formalism to simplify the model creation phase; (2) the implementation of an R package providing a friendly interface to access the analysis techniques implemented in the framework; (3) a high level of portability and reproducibility granted by the containerization of all analysis techniques implemented in the framework; (4) a well-defined schema and related infrastructure allowing users to easily integrate their own analysis workflows in the framework. The effectiveness of this framework is then shown through a case study in which we investigate pertussis epidemiology in Italy., Conclusions: We propose a new general modeling framework for the analysis of epidemiological systems, which exploits the Petri Net graphical formalism, the R environment, and Docker containerization to derive a tool easily accessible by any researcher, even without advanced mathematical and computational skills. Moreover, the framework was implemented following the guidelines defined by the Reproducible Bioinformatics Project, so it guarantees reproducible analyses and makes the development of new user-defined workflows simple.
- Published
- 2020
- Full Text
- View/download PDF
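The graphical formalism of entry 59 rests on a standard translation: a Petri net transition with mass-action kinetics contributes a flow term to the ODE of every place it touches. A minimal sketch of that translation, assuming transitions given as (rate, input places, output places) triples (the real framework derives the ODEs from the full graphical model):

```python
def derive_odes(transitions):
    """Build dy/dt from a Petri-net-like description: each transition
    (rate, inputs, outputs) fires with mass-action kinetics, consuming
    its inputs and producing its outputs."""
    def dydt(state):
        d = {place: 0.0 for place in state}
        for rate, inputs, outputs in transitions:
            flow = rate
            for place in inputs:
                flow *= state[place]     # mass-action: product of input markings
            for place in inputs:
                d[place] -= flow
            for place in outputs:
                d[place] += flow
        return d
    return dydt
```

For an SIR net, infection is the transition S + I -> 2I and recovery is I -> R; the generated derivatives match the textbook SIR equations, which is exactly why the graphical layer can hide the mathematics from the modeller.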
60. Docker4Circ: A Framework for the Reproducible Characterization of circRNAs from RNA-Seq Data.
- Author
- Ferrero G, Licheri N, Coscujuela Tarrero L, De Intinis C, Miano V, Calogero RA, Cordero F, De Bortoli M, and Beccuti M
- Subjects
- Animals, Humans, Databases, Nucleic Acid, RNA, Circular genetics, RNA-Seq, Software
- Abstract
Recent improvements in the cost-effectiveness of high-throughput technologies have allowed RNA sequencing of total transcriptomes suitable for evaluating the expression and regulation of circRNAs, a relatively novel class of transcript isoforms with suggested roles in transcriptional and post-transcriptional gene expression regulation, as well as possible uses as biomarkers due to their deregulation in various human diseases. A limited number of integrated workflows exists for the prediction, characterization, and differential expression analysis of circRNAs, none of them complying with computational reproducibility requirements. We developed Docker4Circ for the complete analysis of circRNAs from RNA-Seq data. Docker4Circ runs a comprehensive analysis of circRNAs in human and model organisms, including: circRNA prediction; classification and annotation using six public databases; back-splice sequence reconstruction; internal alternative splicing of circularizing exons; alignment-free circRNA quantification from RNA-Seq reads; and differential expression analysis. Docker4Circ makes circRNA analysis easier and more accessible thanks to: (i) its R interface; (ii) the encapsulation of computational tasks into Docker images; (iii) the availability of a user-friendly Java GUI; and (iv) no need for advanced bash scripting skills. Furthermore, Docker4Circ ensures a reproducible analysis since all its tasks are embedded into a Docker image following the guidelines provided by the Reproducible Bioinformatics Project.
- Published
- 2019
- Full Text
- View/download PDF
61. A computational approach based on the colored Petri net formalism for studying multiple sclerosis.
- Author
- Pernice S, Pennisi M, Romano G, Maglione A, Cutrupi S, Pappalardo F, Balbo G, Beccuti M, Cordero F, and Calogero RA
- Subjects
- Computational Biology, Disease Progression, Female, Humans, Immunosuppressive Agents therapeutic use, Pregnancy, Recurrence, Computer Simulation, Multiple Sclerosis, Relapsing-Remitting immunology, Multiple Sclerosis, Relapsing-Remitting physiopathology
- Abstract
Background: Multiple Sclerosis (MS) is an immune-mediated inflammatory disease of the Central Nervous System (CNS) which damages the myelin sheath enveloping nerve cells, thus causing severe physical disability in patients. Relapsing-Remitting Multiple Sclerosis (RRMS) is one of the most common forms of MS in adults and is characterized by a series of neurologic symptoms followed by periods of remission. Recently, many treatments were proposed and studied to counter RRMS progression. Among these drugs, daclizumab (commercial name Zinbryta), an antibody tailored against the Interleukin-2 receptor of T cells, exhibited promising results, but its efficacy was accompanied by an increased frequency of serious adverse events. Manifested side effects consisted of infections, encephalitis, and liver damage. Therefore, daclizumab has been withdrawn from the market worldwide. Another interesting aspect of RRMS regards its progression in pregnant women, in whom a lower incidence of relapses until delivery has been observed., Results: In this paper we propose a new methodology for studying RRMS, which we implemented in GreatSPN, a state-of-the-art open-source suite for modelling and analyzing complex systems through the Petri Net (PN) formalism. This methodology exploits: (a) an extended Colored PN formalism to provide a compact graphical description of the system and to automatically derive a set of ODEs encoding the system dynamics, and (b) Latin Hypercube Sampling with the PRCC index to calibrate ODE parameters so as to reproduce the real behaviours of healthy and MS subjects. To show the effectiveness of this methodology, a model of RRMS has been constructed and studied. Two different scenarios of RRMS were thus considered: in the former the effect of daclizumab administration is investigated, while in the latter RRMS is studied in pregnant women., Conclusions: We propose a new computational methodology to study RRMS disease. Moreover, we show that the model generated and calibrated according to this methodology is able to reproduce the expected behaviours.
- Published
- 2019
- Full Text
- View/download PDF
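The calibration step of entry 61 uses Latin Hypercube Sampling (LHS) to explore parameter space efficiently: each parameter range is split into n equal strata, one sample is drawn per stratum, and strata are paired at random across parameters. A minimal self-contained sketch (the PRCC sensitivity step that follows in the paper's methodology is not shown):

```python
import random

def latin_hypercube(n, bounds, seed=0):
    """Draw n LHS samples from the hyper-rectangle given by bounds,
    a list of (lo, hi) ranges, one per parameter."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        # one uniform draw inside each of the n equal strata of [lo, hi)
        strata = [lo + (hi - lo) * (i + rng.random()) / n for i in range(n)]
        rng.shuffle(strata)              # random pairing across parameters
        columns.append(strata)
    return list(zip(*columns))
```

Compared to plain Monte Carlo, every marginal range is guaranteed to be covered exactly once per stratum, which is why LHS needs far fewer model evaluations for parameter calibration.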
62. rCASC: reproducible classification analysis of single-cell sequencing data.
- Author
- Alessandrì L, Cordero F, Beccuti M, Arigoni M, Olivero M, Romano G, Rabellino S, Licheri N, De Libero G, Pace L, and Calogero RA
- Subjects
- Cluster Analysis, Humans, Leukocytes, Mononuclear metabolism, Software, Sequence Analysis, RNA, Single-Cell Analysis, Workflow
- Abstract
Background: Single-cell RNA sequencing is essential for investigating cellular heterogeneity and highlighting cell subpopulation-specific signatures. Single-cell sequencing applications have spread from conventional RNA sequencing to epigenomics, e.g., ATAC-seq. Many related algorithms and tools have been developed, but few computational workflows provide analysis flexibility while also achieving functional (i.e., information about the data and the tools used is saved as metadata) and computational reproducibility (i.e., a real image of the computational environment used to generate the data is stored) through a user-friendly environment., Findings: rCASC is a modular workflow providing an integrated analysis environment (from count generation to cell subpopulation identification) exploiting Docker containerization to achieve both functional and computational reproducibility in data analysis. rCASC provides preprocessing tools to remove low-quality cells and/or specific biases, e.g., cell-cycle effects. Subpopulation discovery can instead be achieved using different clustering techniques based on different distance metrics. Cluster quality is then estimated through the new metric "cell stability score" (CSS), which describes the stability of a cell in a cluster as a consequence of a perturbation induced by removing a random set of cells from the cell population. CSS provides better cluster robustness information than the silhouette metric. Moreover, rCASC's tools can identify cluster-specific gene signatures., Conclusions: rCASC is a modular workflow with new features that could help researchers define cell subpopulations and detect subpopulation-specific markers. It uses Docker for ease of installation and to achieve a computationally reproducible analysis. A Java GUI is provided to accommodate users without computational skills in R., (© The Author(s) 2019. Published by Oxford University Press.)
- Published
- 2019
- Full Text
- View/download PDF
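The perturbation idea behind the cell stability score in entry 62 can be sketched as: repeatedly drop a random fraction of cells, recluster, and score each cell by how often it keeps exactly the same cluster companions. This is only an illustrative reading of the metric, not rCASC's implementation; the clustering function, drop fraction, and companion-matching rule are assumptions.

```python
import random

def stability_scores(cells, cluster_fn, n_perm=50, drop_frac=0.1, seed=0):
    """cells: dict cell_id -> feature value(s); cluster_fn: dict -> {id: label}.
    Returns, per cell, the fraction of perturbation rounds in which its set
    of surviving cluster companions is unchanged."""
    rng = random.Random(seed)
    base = cluster_fn(cells)
    mates = {c: frozenset(k for k in cells if base[k] == base[c]) for c in cells}
    hits = {c: 0 for c in cells}
    rounds = {c: 0 for c in cells}
    for _ in range(n_perm):
        kept = {c: v for c, v in cells.items() if rng.random() > drop_frac}
        pert = cluster_fn(kept)          # recluster the perturbed population
        for c in kept:
            rounds[c] += 1
            new_mates = frozenset(k for k in kept if pert[k] == pert[c])
            if new_mates == mates[c] & frozenset(kept):
                hits[c] += 1
    return {c: (hits[c] / rounds[c] if rounds[c] else 0.0) for c in cells}
```

Cells sitting at cluster boundaries change companions under perturbation and score low, which is the robustness signal the silhouette metric does not capture directly.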
63. Integrative Analysis of Novel Metabolic Subtypes in Pancreatic Cancer Fosters New Prognostic Biomarkers.
- Author
- Follia L, Ferrero G, Mandili G, Beccuti M, Giordano D, Spadi R, Satolli MA, Evangelista A, Katayama H, Hong W, Momin AA, Capello M, Hanash SM, Novelli F, and Cordero F
- Abstract
Background: Most patients with Pancreatic Ductal Adenocarcinoma (PDA) are not eligible for curative surgical resection. For this reason there is an urgent need for personalized therapies. PDA is the result of complex interactions between the tumor molecular profile and metabolites produced by its microenvironment. Although recent studies identified PDA molecular subtypes, its metabolic classification is still lacking. Methods: We applied an integrative analysis to transcriptomic and genomic data of glycolytic genes in PDA. Data were collected from public datasets, and molecular glycolytic subtypes were defined using hierarchical clustering. The purity of the cancer samples was assessed by estimating the different amounts of stromal and immune infiltrate among the identified PDA subtypes. Analysis of metabolomic data from a subset of PDA cell lines allowed us to identify the different metabolites produced by the metabolic subtypes. Sera from a cohort of 31 PDA patients were analyzed using a Q-TOF mass spectrometer to measure the amount of circulating metabolic proteins present before and after chemotherapy. Results: Our integrative analysis of glycolytic genes identified two glycolytic and two non-glycolytic metabolic PDA subtypes. Glycolytic patients develop the disease earlier, have poor prognosis and low immune-infiltrated tumors, and are characterized by a gain in the chr12p13 genomic region. This gain results in the over-expression of GAPDH, TPI1, and FOXM1. PDA cell lines with the chr12p13 gain are characterized by higher lipid uptake and sensitivity to drugs targeting fatty acid metabolism. Our serum proteomic analysis confirms that TPI1 serum levels increase in poor-prognosis gemcitabine-treated patients. Conclusions: We identified four metabolic PDA subtypes with different prognostic outcomes, which may have a pivotal role in setting personalized treatments. Moreover, our data suggest TPI1 as a putative prognostic PDA biomarker.
- Published
- 2019
- Full Text
- View/download PDF
64. Reproducible bioinformatics project: a community for reproducible bioinformatics analysis pipelines.
- Author
- Kulkarni N, Alessandrì L, Panero R, Arigoni M, Olivero M, Ferrero G, Cordero F, Beccuti M, and Calogero RA
- Subjects
- Humans, MicroRNAs genetics, Reproducibility of Results, Software, User-Computer Interface, Workflow, Computational Biology methods
- Abstract
Background: Reproducibility of research is a key element of modern science and is mandatory for any industrial application. It represents the ability to replicate an experiment independently of location and operator. Therefore, a study can be considered reproducible only if all used data are available and the exploited computational analysis workflow is clearly described. However, for reproducing a complex bioinformatics analysis, the raw data and the list of tools used in the workflow may not be enough to guarantee the reproducibility of the results. Indeed, different releases of the same tools and/or of the system libraries (exploited by such tools) might lead to subtle reproducibility issues., Results: To address this challenge, we established the Reproducible Bioinformatics Project (RBP), a non-profit and open-source project whose aim is to provide a schema and an infrastructure, based on Docker images and an R package, to produce reproducible results in bioinformatics. One or more Docker images are defined for a workflow (typically one for each task), while the workflow implementation is handled via R functions embedded in a package available at a GitHub repository. Thus, a bioinformatician participating in the project first has to integrate their workflow modules into Docker image(s), exploiting an Ubuntu Docker image developed ad hoc by RBP to ease this task. Second, the workflow implementation must be realized in R according to an R skeleton function made available by RBP to guarantee homogeneity and reusability among different RBP functions. Moreover, they have to provide an R vignette explaining the package functionality, together with an example dataset which can be used to improve user confidence in the workflow., Conclusions: The Reproducible Bioinformatics Project provides a general schema and an infrastructure to distribute robust and reproducible workflows. Thus, it guarantees final users the ability to consistently repeat any analysis independently of the UNIX-like architecture used.
- Published
- 2018
- Full Text
- View/download PDF
65. SeqBox: RNAseq/ChIPseq reproducible analysis on a consumer game computer.
- Author
- Beccuti M, Cordero F, Arigoni M, Panero R, Amparore EG, Donatelli S, and Calogero RA
- Subjects
- Computational Biology methods, Reproducibility of Results, Chromatin Immunoprecipitation methods, Sequence Analysis, RNA methods, Software
- Abstract
Summary: Short-read sequencing technology has been in use for more than a decade now. However, the analysis of RNAseq and ChIPseq data is still computationally demanding, and simple access to raw data does not guarantee reproducibility of results between laboratories. To address these two aspects, we developed SeqBox, a cheap, efficient and reproducible RNAseq/ChIPseq hardware/software solution based on the NUC6I7KYK mini-PC (an Intel consumer game computer with a fast processor and a high-performance SSD disk) and the Docker container platform. In SeqBox the analysis of RNAseq and ChIPseq data is supported by a friendly GUI. This allows fast and reproducible analysis also for scientists with or without scripting experience., Availability and Implementation: Docker container images, the docker4seq package and the GUI are available at http://www.bioinformatica.unito.it/reproducibile.bioinformatics.html., Contact: beccuti@di.unito.it., Supplementary Information: Supplementary data are available at Bioinformatics online., (© The Author(s) 2017. Published by Oxford University Press.)
- Published
- 2018
- Full Text
- View/download PDF
66. Luminal breast cancer-specific circular RNAs uncovered by a novel tool for data analysis.
- Author
- Coscujuela Tarrero L, Ferrero G, Miano V, De Intinis C, Ricci L, Arigoni M, Riccardo F, Annaratone L, Castellano I, Calogero RA, Beccuti M, Cordero F, and De Bortoli M
- Abstract
Circular RNAs are highly stable molecules, present in all eukaryotes, generated by distinct transcript processing. We have exploited poly(A-) RNA-Seq data generated in our lab in MCF-7 breast cancer cells to define a compilation of exonic circRNAs more comprehensive than previously existing lists. Development of a novel computational tool, named CircHunter, allowed us to more accurately characterize circRNAs and to quantitatively evaluate their expression in publicly available RNA-Seq data from breast cancer cell lines and tumor tissues. We observed and confirmed, by ChIP analysis, that exons involved in circularization events display significantly higher levels of the histone post-transcriptional modification H3K36me3 than non-circularizing exons. This result has potential impact on circRNA biogenesis, since H3K36me3 has been involved in alternative splicing mechanisms. By analyzing an Ago-HITS-CLIP dataset we also found that circularizing exons overlap with an unexpectedly higher number of Ago binding sites than non-circularizing exons. Finally, we observed that a subset of MCF-7 circRNAs is specific to tumor versus normal tissue, while others can distinguish the Luminal subtype from other tumor subtypes, thus suggesting that circRNAs can be exploited as novel biomarkers and drug targets for breast cancer., Competing Interests: CONFLICTS OF INTEREST The authors declare that they have no competing interests.
- Published
- 2018
- Full Text
- View/download PDF
67. HashClone: a new tool to quantify the minimal residual disease in B-cell lymphoma from deep sequencing data.
- Author
- Beccuti M, Genuardi E, Romano G, Monitillo L, Barbero D, Boccadoro M, Ladetto M, Calogero R, Ferrero S, and Cordero F
- Subjects
- Algorithms, Alleles, B-Lymphocytes pathology, Clone Cells, Humans, Reproducibility of Results, Lymphoma, B-Cell genetics, Neoplasm, Residual genetics
- Abstract
Background: Mantle Cell Lymphoma (MCL) is an aggressive B-cell neoplasia accounting for about 6% of all lymphomas. The most common molecular marker of clonality in MCL, as in other B lymphoproliferative disorders, is the ImmunoGlobulin Heavy chain (IGH) rearrangement occurring in B-lymphocytes. The patient-specific IGH rearrangement is extensively used to monitor the Minimal Residual Disease (MRD) after treatment through the standardized Allele-Specific Oligonucleotides Quantitative Polymerase Chain Reaction technique. Recently, several studies have suggested that IGH monitoring through deep sequencing techniques not only produces results comparable to Polymerase Chain Reaction-based methods, but might also overcome the classical technique in terms of feasibility and sensitivity. However, no standard bioinformatics tool is currently available for data analysis in this context., Results: In this paper we present HashClone, an easy-to-use and reliable bioinformatics tool that provides B-cell clonality assessment and MRD monitoring over time by analyzing data from Next-Generation Sequencing (NGS) experiments. The HashClone strategy is composed of three steps: the first and second steps implement an alignment-free prediction method that identifies a set of putative clones belonging to the repertoire of the patient under study. In the third step, the IGH variable, diversity, and joining regions are identified by aligning the rearrangements against the international ImMunoGeneTics information system database. Moreover, the provided graphical user interface for HashClone execution and clonality visualization over time facilitates the use of the tool and the interpretation of its results.
HashClone performance was tested on NGS data derived from MCL patients to assess the major B-cell clone in the diagnostic samples and to monitor the MRD in real and artificial follow-up samples., Conclusions: Our experiments show that in all the experimental settings, HashClone was able to correctly detect the major B-cell clones and to precisely follow them across several samples, showing better accuracy than the state-of-the-art tool.
- Published
- 2017
- Full Text
- View/download PDF
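The alignment-free clone prediction described in the HashClone abstract above can be illustrated with a minimal sketch: fingerprint each read by its k-mer content, group identical fingerprints into putative clones, and report each clone's relative abundance. This is only an assumed toy reconstruction of the idea; the function names, the fingerprint choice, and the thresholds are illustrative, not HashClone's actual implementation.

```python
from collections import Counter

def kmer_signature(read, k=8):
    """Alignment-free fingerprint: the set of k-mers appearing in a read."""
    return frozenset(read[i:i + k] for i in range(len(read) - k + 1))

def putative_clones(reads, k=8, min_support=2):
    """Group reads sharing the same k-mer signature and keep groups
    supported by at least `min_support` reads (candidate clones).
    Returns signature -> relative abundance."""
    counts = Counter(kmer_signature(r, k) for r in reads)
    total = sum(counts.values())
    return {sig: n / total for sig, n in counts.items() if n >= min_support}

# Toy sample: one dominant rearrangement, a minor clone, and a singleton noise read.
sample = ["ACGTACGTACGTAAAT"] * 6 + ["TTTTGGGGCCCCAAAA"] * 3 + ["GATTACAGATTACAGA"]
clones = putative_clones(sample, k=8, min_support=2)
major = max(clones, key=clones.get)  # the major B-cell clone candidate
```

Tracking the same signatures across diagnostic and follow-up samples would then give the MRD trend over time.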
68. Dissecting the genomic activity of a transcriptional regulator by the integrative analysis of omics data.
- Author
-
Ferrero G, Miano V, Beccuti M, Balbo G, De Bortoli M, and Cordero F
- Subjects
- A549 Cells, Estrogen Receptor alpha genetics, Estrogen Receptor alpha metabolism, Forkhead Box Protein M1 genetics, Forkhead Box Protein M1 metabolism, Genomics statistics & numerical data, Humans, MCF-7 Cells, Neoplasms genetics, Neoplasms metabolism, Neoplasms pathology, Receptors, Glucocorticoid genetics, Receptors, Glucocorticoid metabolism, Chromatin Immunoprecipitation methods, Gene Expression Regulation, Neoplastic, Genomics methods, High-Throughput Nucleotide Sequencing methods
- Abstract
In the study of genomic regulation, strategies to integrate the data produced by Next Generation Sequencing (NGS)-based technologies in a meaningful ensemble are eagerly awaited and must continuously evolve. Here, we describe an integrative strategy for the analysis of data generated by chromatin immunoprecipitation followed by NGS which combines algorithms for data overlap, normalization and epigenetic state analysis. The performance of our strategy is illustrated by presenting the analysis of data relative to the transcriptional regulator Estrogen Receptor alpha (ERα) in MCF-7 breast cancer cells and of Glucocorticoid Receptor (GR) in A549 lung cancer cells. We went through the definition of reference cistromes for different experimental contexts, the integration of data relative to co-regulators and the overlay of chromatin states as defined by epigenetic marks in MCF-7 cells. With our strategy, we identified novel features of estrogen-independent ERα activity, including FoxM1 interaction, eRNAs transcription and a peculiar ontology of connected genes.
- Published
- 2017
- Full Text
- View/download PDF
69. Peculiar Genes Selection: A new features selection method to improve classification performances in imbalanced data sets.
- Author
-
Martina F, Beccuti M, Balbo G, and Cordero F
- Subjects
- Aged, Female, Gene Expression Profiling, Gene Ontology, Humans, Middle Aged, Neoplasms genetics, Reproducibility of Results, Transcription Factors metabolism, Vaccination, Algorithms, Computational Biology methods, Databases as Topic, Genes
- Abstract
High-throughput technologies provide genomic and transcriptomic data that are suitable for biomarker detection for classification purposes. However, the high dimension of the output of such technologies and the characteristics of the data sets analysed represent an issue for the classification task. Here we present a new feature selection method, based on three steps, to detect class-specific biomarkers in high-dimensional data sets. The first step detects the differentially expressed genes according to the experimental conditions tested in the experimental design, the second step filters out the features with low discriminative power, and the third step detects the class-specific features and defines the final biomarker as the union of the class-specific features. The proposed procedure is tested on two microarray datasets, one characterized by a strong imbalance between the sizes of the classes and the other perfectly balanced. We show that, using the proposed feature selection procedure, the classification performance of a Support Vector Machine on the imbalanced data set reaches 82%, whereas other methods do not exceed 73%. Furthermore, on the perfectly balanced dataset, the classification performance is comparable with that of other methods. Finally, the Gene Ontology enrichments performed on the signatures selected with the proposed pipeline confirm the biological relevance of our methodology. The package implementing Peculiar Genes Selection, 'PGS', is available for download by R users at: http://github.com/mbeccuti/PGS.
- Published
- 2017
- Full Text
- View/download PDF
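The three-step selection scheme in the PGS abstract above (differential expression, discriminative-power filtering, union of class-specific survivors) can be sketched for a two-class case as follows. This is a hedged toy version: the statistics and thresholds (mean difference, pooled spread) are assumptions for illustration, not the criteria PGS actually uses.

```python
from statistics import mean, pstdev

def class_specific_features(X, y, de_thresh=1.0, disc_thresh=0.5):
    """Toy three-step class-specific feature selection (two classes):
    1) keep genes with a large between-class mean difference,
    2) drop genes whose within-class spread is large relative to that
       difference (low discriminative power),
    3) return the union of the surviving class-specific features."""
    classes = sorted(set(y))
    selected = set()
    for g in range(len(X[0])):
        per_class = {c: [X[i][g] for i in range(len(y)) if y[i] == c] for c in classes}
        m = [mean(per_class[c]) for c in classes]
        diff = abs(m[0] - m[1])
        if diff < de_thresh:                 # step 1: not differentially expressed
            continue
        spread = max(pstdev(per_class[c]) for c in classes)
        if spread > disc_thresh * diff:      # step 2: low discriminative power
            continue
        selected.add(g)                      # step 3: class-specific survivor
    return sorted(selected)

# Toy data: gene 0 separates the classes, gene 1 is noise, gene 2 overlaps heavily.
X = [[5.0, 1.0, 3.0], [5.2, 1.1, 3.1], [0.9, 1.0, 3.2], [1.1, 0.9, 2.9]]
y = ["A", "A", "B", "B"]
picked = class_specific_features(X, y)
```

The selected indices would then feed a downstream classifier such as an SVM.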
70. A computational analysis of S-(2-succino)cysteine sites in proteins.
- Author
-
Miglio G, Sabatino AD, Veglia E, Giraudo MT, Beccuti M, and Cordero F
- Subjects
- Amino Acids chemistry, Amino Acids genetics, Computational Biology, Cysteine chemistry, Cysteine genetics, Fumarates chemistry, Humans, Models, Theoretical, Molecular Conformation, Proteins genetics, Sequence Analysis, Protein, Succinates chemistry, Cysteine analogs & derivatives, Protein Processing, Post-Translational genetics, Proteins chemistry, Proteome
- Abstract
The adduction of fumaric acid to the sulfhydryl group of certain cysteine (Cys) residues in proteins via a Michael-like reaction leads to the formation of S-(2-succino)cysteine (2SC) sites. Although its role remains to be fully understood, this post-translational Cys modification (protein succination) has been implicated in the pathogenesis of diabetes/obesity and fumarate hydratase-related diseases. In this study, theoretical approaches addressing sequence- and 3D-structure-based features possibly underlying the specificity of protein succination have been applied to perform the first analysis of the available data on the succinate proteome. A total of 182 succinated proteins, 205 modifiable, and 1750 non-modifiable sites have been examined. The number of 2SC sites per protein ranged from 1 to 3, and the overall relative abundance of modifiable sites was 10.8%. Modifiable and non-modifiable sites were not distinguishable when the hydrophobicity of the Cys-flanking peptides, the acid dissociation constant value of the sulfhydryl groups, and the secondary structure of the Cys-containing segments were compared. By contrast, significant differences were found when the accessibility of the sulphur atoms and the amino acid composition of the Cys-flanking peptides were analysed. Based on these findings, a sequence-based score function has been evaluated as a descriptor for Cys residues. In conclusion, our results indicate that modifiable and non-modifiable sites form heterogeneous subsets when features often discussed to describe Cys reactivity are examined. However, they also suggest that some differences exist, which may constitute the baseline for further investigations aimed at the development of predictive methods for 2SC sites in proteins., (Copyright © 2015 Elsevier B.V. All rights reserved.)
- Published
- 2016
- Full Text
- View/download PDF
71. Sequencing of 15 622 gene-bearing BACs clarifies the gene-dense regions of the barley genome.
- Author
-
Muñoz-Amatriaín M, Lonardi S, Luo M, Madishetty K, Svensson JT, Moscou MJ, Wanamaker S, Jiang T, Kleinhofs A, Muehlbauer GJ, Wise RP, Stein N, Ma Y, Rodriguez E, Kudrna D, Bhat PR, Chao S, Condamine P, Heinen S, Resnik J, Wing R, Witt HN, Alpert M, Beccuti M, Bozdag S, Cordero F, Mirebrahim H, Ounit R, Wu Y, You F, Zheng J, Simková H, Dolezel J, Grimwood J, Schmutz J, Duma D, Altschmied L, Blake T, Bregitzer P, Cooper L, Dilbirligi M, Falk A, Feiz L, Graner A, Gustafson P, Hayes PM, Lemaux P, Mammadov J, and Close TJ
- Subjects
- Molecular Sequence Data, Chromosomes, Artificial, Bacterial genetics, Genome, Plant genetics, Hordeum genetics
- Abstract
Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, because only 6278 bacterial artificial chromosomes (BACs) in the physical map were sequenced, fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15 622 BACs representing the minimal tiling path of 72 052 physically mapped gene-bearing BACs. This generated ~1.7 Gb of genomic sequence containing an estimated 2/3 of all Morex barley genes. Exploration of these sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high recombination rates, there are also gene-dense regions with suppressed recombination. We made use of published map-anchored sequence data from Aegilops tauschii to develop a synteny viewer between barley and the ancestor of the wheat D-genome. Except for some notable inversions, there is a high level of collinearity between the two species. The software HarvEST:Barley provides facile access to BAC sequences and their annotations, along with the barley-Ae. tauschii synteny viewer. These BAC sequences constitute a resource to improve the efficiency of marker development, map-based cloning, and comparative genomics in barley and related crops. Additional knowledge about regions of the barley genome that are gene-dense but low in recombination is particularly relevant., (© 2015 The Authors The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.)
- Published
- 2015
- Full Text
- View/download PDF
72. The molecular landscape of colorectal cancer cell lines unveils clinically actionable kinase targets.
- Author
-
Medico E, Russo M, Picco G, Cancelliere C, Valtorta E, Corti G, Buscarino M, Isella C, Lamba S, Martinoglio B, Veronese S, Siena S, Sartore-Bianchi A, Beccuti M, Mottolese M, Linnebacher M, Cordero F, Di Nicolantonio F, and Bardelli A
- Subjects
- Anaplastic Lymphoma Kinase, Cell Line, Tumor, Cetuximab, Colorectal Neoplasms genetics, Genes, erbB-1, Genetic Heterogeneity, Humans, Molecular Targeted Therapy, Proto-Oncogene Proteins c-ret metabolism, Receptor Protein-Tyrosine Kinases genetics, Receptor, Fibroblast Growth Factor, Type 2 metabolism, Colorectal Neoplasms enzymology, ErbB Receptors antagonists & inhibitors, Receptor Protein-Tyrosine Kinases metabolism
- Abstract
The development of molecularly targeted anticancer agents relies on large panels of tumour-specific preclinical models closely recapitulating the molecular heterogeneity observed in patients. Here we describe the mutational and gene expression analyses of 151 colorectal cancer (CRC) cell lines. We find that the whole spectrum of CRC molecular and transcriptional subtypes, previously defined in patients, is represented in this cell line compendium. Transcriptional outlier analysis identifies RAS/BRAF wild-type cells, resistant to EGFR blockade, functionally and pharmacologically addicted to kinase genes including ALK, FGFR2, NTRK1/2 and RET. The same genes are present as expression outliers in CRC patient samples. Genomic rearrangements (translocations) involving the ALK and NTRK1 genes are associated with the overexpression of the corresponding proteins in CRC specimens. The approach described here can be used to pinpoint CRCs with exquisite dependencies to individual kinases for which clinically approved drugs are already available.
- Published
- 2015
- Full Text
- View/download PDF
73. Alternative splicing detection workflow needs a careful combination of sample prep and bioinformatics analysis.
- Author
-
Carrara M, Lum J, Cordero F, Beccuti M, Poidinger M, Donatelli S, Calogero RA, and Zolezzi F
- Subjects
- Exons genetics, Humans, RNA genetics, RNA, Ribosomal genetics, RNA, Ribosomal metabolism, Workflow, Alternative Splicing genetics, Computational Biology methods, Gene Library, Sequence Analysis, RNA methods
- Abstract
Background: RNA-Seq provides remarkable power in the area of biomarker discovery and disease characterization. Two crucial steps that affect RNA-Seq experiment results are Library Sample Preparation (LSP) and Bioinformatics Analysis (BA). This work describes an evaluation of the combined effect of LSP methods and BA tools in the detection of splice variants., Results: Different LSPs (TruSeq unstranded/stranded, ScriptSeq, NuGEN) allowed the detection of a large common set of splice variants. However, each LSP also detected a small set of unique transcripts characterized by low coverage and/or FPKM. This effect was particularly evident with the low-input RNA NuGEN v2 protocol. A benchmark dataset, in which synthetic reads as well as reads generated from standard (Illumina TruSeq 100) and low-input (NuGEN) LSPs were spiked in, was used to evaluate the effect of LSP on the statistical detection of alternative splicing events (AltDE). For statistical detection of AltDE, Cuffdiff2 and RSEM-EBSeq were used as prototypes for splice-variant quantification, and DEXSeq as the prototype for exon-level analysis. Exon-level analysis performed slightly better than the splice-variant quantification approaches, although at most 50% of the spiked-in transcripts were detected. The performance of both splice-variant quantification and exon-level analysis improved as the number of input reads increased., Conclusion: Data derived from NuGEN v2 were not the ideal input for AltDE, especially when the exon-level approach was used. We observed that the performance of both splice-variant quantification and exon-level analysis was strongly dependent on the number of input reads. Moreover, the ribosomal RNA depletion protocol was less sensitive in detecting splicing variants, possibly due to the significant percentage of reads mapping to non-coding transcripts.
- Published
- 2015
- Full Text
- View/download PDF
74. A versatile mathematical work-flow to explore how Cancer Stem Cell fate influences tumor progression.
- Author
-
Fornari C, Balbo G, Halawani SM, Ba-Rukab O, Ahmad AR, Calogero RA, Cordero F, and Beccuti M
- Subjects
- Animals, Apoptosis physiology, Cell Proliferation, Humans, Neoplastic Stem Cells cytology, Carcinogenesis pathology, Models, Biological, Neoplastic Stem Cells pathology
- Abstract
Background: Nowadays, multidisciplinary approaches combining mathematical models with experimental assays are becoming relevant for the study of biological systems. Indeed, in cancer research multidisciplinary approaches are successfully used to understand the crucial aspects implicated in tumor growth. In particular, Cancer Stem Cell (CSC) biology represents an area particularly suited to being studied through multidisciplinary approaches, and modeling has significantly contributed to pinpointing the crucial aspects implicated in this theory. More generally, to acquire new insights on a biological system it is necessary to have an accurate description of the phenomenon, so that making accurate predictions on its future behaviors becomes more likely. In this context, the identification of the parameters influencing model dynamics can be advantageous to increase model accuracy and to provide hints for designing wet experiments. Different techniques, ranging from statistical methods to analytical studies, have been developed. Their applicability depends on case-specific aspects, such as the availability and quality of experimental data, and the dimension of the parameter space., Results: The study of a new model of CSC-based tumor progression motivated the design of a new work-flow that helps characterize possible system dynamics and identify the parameters influencing such behaviors. In detail, we extended our recent model of CSC dynamics, creating a new system capable of describing tumor growth during the different stages of cancer progression. Indeed, tumor cells appear to progress through lineage stages like those of normal tissues, with their division auto-regulated by internal feedback mechanisms. These new features have introduced some non-linearities in the model, making it more difficult to study by analytical techniques alone.
Our new work-flow, based on statistical methods, was used to identify the parameters that influence tumor growth. The effectiveness of the presented work-flow was first verified on two well-known models and then applied to investigate our extended CSC model., Conclusions: We propose a new work-flow to study complex systems in a practical and informative way, allowing an easy identification, interpretation, and visualization of the key model parameters. Our methodology is useful to investigate possible model behaviors and to establish the factors driving model dynamics. Analyzing our new CSC model guided by the proposed work-flow, we found that the deregulation of CSC asymmetric proliferation contributes to cancer initiation, in accordance with several lines of experimental evidence. Specifically, model results indicated that the probability of CSC symmetric proliferation is responsible for a switching-like behavior which discriminates between tumorigenesis and unsustainable tumor growth.
- Published
- 2015
- Full Text
- View/download PDF
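The switching-like role of the CSC symmetric-proliferation probability described in the abstract above can be illustrated with a deliberately minimal two-compartment ODE integrated by Euler's method. The equations, rates, and parameter names below are assumptions for illustration only, not the authors' actual model: a symmetric-division probability below 1/2 exhausts the CSC pool, while above 1/2 the tumor expands.

```python
def simulate(p_sym, days=60, dt=0.01, r=0.4, d=0.3):
    """Euler integration of a minimal CSC (S) / differentiated-cell (D) model:
      dS/dt = (2*p_sym - 1) * r * S           # net CSC self-renewal vs exhaustion
      dD/dt = 2*(1 - p_sym) * r * S - d * D   # differentiated progeny, dying at rate d
    Returns the final total tumor size S + D."""
    S, D = 1.0, 0.0
    for _ in range(int(days / dt)):
        dS = (2 * p_sym - 1) * r * S
        dD = 2 * (1 - p_sym) * r * S - d * D
        S += dS * dt
        D += dD * dt
    return S + D

# Scan the symmetric-division probability: the outcome switches around 0.5,
# discriminating unsustainable growth from tumorigenesis.
sizes = {p: simulate(p) for p in (0.3, 0.5, 0.7)}
```

A parameter scan like this, repeated over all model parameters, is the kind of exploration a statistical work-flow would automate.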
75. Chimera: a Bioconductor package for secondary analysis of fusion products.
- Author
-
Beccuti M, Carrara M, Cordero F, Lazzarato F, Donatelli S, Nadalin F, Policriti A, and Calogero RA
- Subjects
- Animals, Molecular Sequence Annotation, Gene Fusion, Software
- Abstract
Summary: Chimera is a Bioconductor package that organizes, annotates, analyses and validates fusions reported by different fusion detection tools; current implementation can deal with output from bellerophontes, chimeraScan, deFuse, fusionCatcher, FusionFinder, FusionHunter, FusionMap, mapSplice, Rsubread, tophat-fusion and STAR. The core of Chimera is a fusion data structure that can store fusion events detected with any of the aforementioned tools. Fusions are then easily manipulated with standard R functions or through the set of functionalities specifically developed in Chimera with the aim of supporting the user in managing fusions and discriminating false-positive results., (© The Author 2014. Published by Oxford University Press.)
- Published
- 2014
- Full Text
- View/download PDF
76. A mathematical-biological joint effort to investigate the tumor-initiating ability of Cancer Stem Cells.
- Author
-
Fornari C, Beccuti M, Lanzardo S, Conti L, Balbo G, Cavallo F, Calogero RA, and Cordero F
- Subjects
- Animals, Biomarkers metabolism, Breast Neoplasms genetics, Breast Neoplasms metabolism, CD24 Antigen genetics, Carcinogenesis genetics, Carcinogenesis metabolism, Carcinoma genetics, Carcinoma metabolism, Cell Line, Tumor, Disease Progression, Female, Gene Expression, Humans, Hyaluronan Receptors genetics, Mice, Mice, Inbred BALB C, Neoplasm Transplantation, Neoplastic Stem Cells metabolism, Spheroids, Cellular metabolism, Spheroids, Cellular pathology, Transplantation, Heterotopic, Breast Neoplasms pathology, Carcinogenesis pathology, Carcinoma pathology, Models, Statistical, Neoplastic Stem Cells pathology, Receptor, ErbB-2 genetics
- Abstract
The involvement of Cancer Stem Cells (CSCs) in tumor progression and tumor recurrence is one of the most studied subjects in current cancer research. The CSC hypothesis states that cancer cell populations are characterized by a hierarchical structure that affects cancer progression. Due to the complex dynamics involving CSCs and the other cancer cell subpopulations, a robust theory explaining their action has not been established yet. Some indications can be obtained by combining mathematical modeling and experimental data to understand tumor dynamics and to generate new experimental hypotheses. Here, we present a model describing the initial phase of ErbB2(+) mammary cancer progression, which arises from a joint effort combining mathematical modeling and cancer biology. The proposed model represents a new approach to investigate CSC-driven tumorigenesis and to analyze the relations among crucial events involving cancer cell subpopulations. Using in vivo and in vitro data we tuned the model to reproduce the initial dynamics of cancer growth, and we used its solution to characterize the observed cancer progression with respect to mutual CSC and progenitor cell variation. The model was also used to investigate which associations occur among cell phenotypes when specific cell markers are considered. Finally, we found various correlations among model parameters which cannot be directly inferred from the available biological data, and these dependencies were used to characterize the dynamics of cancer subpopulations during the initial phase of ErbB2+ mammary cancer progression.
- Published
- 2014
- Full Text
- View/download PDF
77. Combinatorial pooling enables selective sequencing of the barley gene space.
- Author
-
Lonardi S, Duma D, Alpert M, Cordero F, Beccuti M, Bhat PR, Wu Y, Ciardo G, Alsaihati B, Ma Y, Wanamaker S, Resnik J, Bozdag S, Luo MC, and Close TJ
- Subjects
- Chromosomes, Artificial, Bacterial, Cloning, Molecular, Computational Biology methods, Computer Simulation, Genes, Plant, Genetic Markers genetics, Genomic Library, Genomics, Models, Genetic, Oryza genetics, Physical Chromosome Mapping, Species Specificity, Contig Mapping methods, Hordeum genetics, Sequence Analysis, DNA
- Abstract
For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and are now in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as, in this case, gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
- Published
- 2013
- Full Text
- View/download PDF
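The pooling-and-deconvolution idea in the abstract above can be sketched in a few lines: give each BAC a unique signature (a distinct subset of pools), then decode a read from the set of pools it was sequenced in. This toy version is an assumption-laden illustration (exact signature matching, no sequencing error handling), not the authors' deconvolution algorithm.

```python
from itertools import combinations

def pooling_design(n_bacs, n_pools, weight):
    """Assign each BAC a unique signature: a distinct `weight`-subset of pools.
    Requires C(n_pools, weight) >= n_bacs."""
    sigs = list(combinations(range(n_pools), weight))[:n_bacs]
    assert len(sigs) == n_bacs, "not enough distinct signatures"
    return {bac: frozenset(sig) for bac, sig in enumerate(sigs)}

def deconvolve(read_pools, design):
    """A read observed in pools `read_pools` is assigned to the unique BAC
    whose signature equals that set; ambiguous reads return None."""
    matches = [bac for bac, sig in design.items() if sig == read_pools]
    return matches[0] if len(matches) == 1 else None

design = pooling_design(n_bacs=10, n_pools=6, weight=3)  # C(6,3) = 20 signatures
# A read whose k-mers appear in exactly the pools of BAC 4's signature decodes back to BAC 4.
bac = deconvolve(frozenset(design[4]), design)
```

In practice the decoding works on shared k-mers across hundreds of millions of reads, with error tolerance, so that each BAC can then be assembled clone-by-clone.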
78. State of art fusion-finder algorithms are suitable to detect transcription-induced chimeras in normal tissues?
- Author
-
Carrara M, Beccuti M, Cavallo F, Donatelli S, Lazzarato F, Cordero F, and Calogero RA
- Subjects
- Animals, Humans, Sequence Analysis, RNA methods, Algorithms, Gene Fusion, Software, Transcription, Genetic
- Abstract
Background: RNA-seq has the potential to discover genes created by chromosomal rearrangements. Fusion genes, also known as "chimeras", are formed by the breakage and re-joining of two different chromosomes. Chimeras have been implicated in the development of cancer. A few past publications reported fusion events also in normal tissue, but with very limited overlap between their results. More recently, two fusion genes in normal tissues were detected using both RNA-seq and protein data. Due to these heterogeneous results in identifying chimeras in normal tissue, we decided to evaluate the efficacy of state-of-the-art fusion finders in detecting chimeras in RNA-seq data from normal tissues., Results: We compared the performance of six fusion-finder tools: FusionHunter, FusionMap, FusionFinder, MapSplice, deFuse and TopHat-fusion. To evaluate sensitivity we used a synthetic dataset of fusion products, called the positive dataset; in these experiments FusionMap, FusionFinder, MapSplice, and TopHat-fusion were able to detect more than 78% of fusion genes. All tools were error-prone, with high variability among the tools, identifying some fusion genes not present in the synthetic dataset. To better investigate the false-discovery rate of chimera detection, synthetic datasets free of fusion products, called negative datasets, were used. The negative datasets have different read lengths and quality scores, which allows the dependency of the tools on both these features to be detected. FusionMap, FusionFinder, MapSplice, deFuse and TopHat-fusion were error-prone; only FusionHunter's results were free of false positives. FusionMap gave the best compromise in terms of specificity on the negative dataset and sensitivity on the positive dataset., Conclusions: We have observed a dependency of the tools on read length, quality score and the number of reads supporting each chimera.
Thus, it is important to carefully select the software on the basis of the structure of the RNA-seq data under analysis. Furthermore, the sensitivity of chimera detection tools does not seem to be sufficient to provide results consistent with those obtained in normal tissues on the basis of fusion events extracted from published data.
- Published
- 2013
- Full Text
- View/download PDF
79. State-of-the-art fusion-finder algorithms sensitivity and specificity.
- Author
-
Carrara M, Beccuti M, Lazzarato F, Cavallo F, Cordero F, Donatelli S, and Calogero RA
- Subjects
- Chimera genetics, Humans, Neoplasms pathology, Sequence Analysis, RNA, Software, Gene Fusion, Neoplasms genetics, Oncogene Proteins, Fusion genetics, Translocation, Genetic genetics
- Abstract
Background: Gene fusions arising from chromosomal translocations have been implicated in cancer. RNA-seq has the potential to discover such rearrangements generating functional proteins (chimera/fusion). Recently, many methods for chimera detection have been published. However, the specificity and sensitivity of those tools have not been extensively investigated in a comparative way., Results: We tested eight fusion-detection tools (FusionHunter, FusionMap, FusionFinder, MapSplice, deFuse, Bellerophontes, ChimeraScan, and TopHat-fusion) to detect fusion events using synthetic and real datasets encompassing chimeras. A comparative analysis run only on synthetic data could generate misleading results, since we found no counterpart in the real dataset. Furthermore, most tools report a very high number of false-positive chimeras. In particular, the most sensitive tool, ChimeraScan, reports a large number of false positives that we were able to significantly reduce by devising and applying two filters to remove fusions not supported by fusion junction-spanning reads or encompassing large intronic regions., Conclusions: The discordant results obtained using synthetic and real datasets suggest that synthetic datasets encompassing fusion events may not fully capture the complexity of an RNA-seq experiment. Moreover, fusion-detection tools are still limited in sensitivity or specificity; thus, there is room for further improvement in fusion-finder algorithms.
- Published
- 2013
- Full Text
- View/download PDF
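The two false-positive filters mentioned in the abstract above (require junction-spanning read support; reject fusions encompassing large intronic regions) can be sketched as a simple post-processing step. The record fields and thresholds here are hypothetical, chosen only to make the logic concrete.

```python
def filter_fusions(fusions, min_spanning=2, max_intron=100_000):
    """Keep a candidate chimera only if it is supported by at least
    `min_spanning` junction-spanning reads AND does not encompass an
    overly large intronic region (a proxy for artifactual calls).
    Field names (`spanning_reads`, `intron_span`) are illustrative."""
    return [f for f in fusions
            if f["spanning_reads"] >= min_spanning
            and f["intron_span"] <= max_intron]

candidates = [
    {"name": "fusionA",   "spanning_reads": 12, "intron_span": 5_000},
    {"name": "artifact1", "spanning_reads": 0,  "intron_span": 2_000},
    {"name": "artifact2", "spanning_reads": 4,  "intron_span": 450_000},
]
kept = filter_fusions(candidates)
```

Such filters trade a little sensitivity for a large reduction in false positives from highly sensitive callers.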
80. Multi-level model for the investigation of oncoantigen-driven vaccination effect.
- Author
-
Cordero F, Beccuti M, Fornari C, Lanzardo S, Conti L, Cavallo F, Balbo G, and Calogero R
- Subjects
- Animals, Breast Neoplasms pathology, Cancer Vaccines immunology, Humans, Mice, Neoplasms immunology, Neoplasms metabolism, Neoplastic Stem Cells pathology, Receptor, ErbB-2, Cancer Vaccines therapeutic use, Models, Biological, Neoplasms pathology, Neoplasms therapy
- Abstract
Background: Cancer stem cell theory suggests that cancers are derived from a population of cells named Cancer Stem Cells (CSCs) that are involved in tumor growth and progression, and lead to a hierarchical structure characterized by differentiated cell populations. This cell heterogeneity affects the choice of cancer therapies, since many current cancer treatments have limited or no impact on the CSC population, while they show a positive effect on the differentiated cell populations., Results: In this paper we investigated the effect of vaccination on a cancer hierarchical structure through a multi-level model representing both population and molecular aspects. The population level is modeled by a system of Ordinary Differential Equations (ODEs) describing the cancer population's dynamics. The molecular level is modeled using the Petri Net (PN) formalism to detail part of the proliferation pathway. Moreover, we propose a new methodology which exploits the temporal behavior derived from the molecular level to parameterize the ODE system modeling the populations. Using this multi-level model we studied the effect of ErbB2-driven vaccination in breast cancer., Conclusions: We propose a multi-level model that describes the inter-dependencies between the population and genetic levels, and that can be efficiently used to estimate the efficacy of drug and vaccine therapies in cancer models, given the availability of molecular data on the cancer driving force.
- Published
- 2013
- Full Text
- View/download PDF
81. Optimizing a massive parallel sequencing workflow for quantitative miRNA expression analysis.
- Author
-
Cordero F, Beccuti M, Arigoni M, Donatelli S, and Calogero RA
- Subjects
- Algorithms, Databases, Genetic, Gene Expression Regulation, Genome, Human genetics, Humans, MicroRNAs metabolism, ROC Curve, Reference Standards, Sample Size, Sequence Alignment, Software, Gene Expression Profiling, High-Throughput Nucleotide Sequencing methods, MicroRNAs genetics, Workflow
- Abstract
Background: Massive Parallel Sequencing (MPS) methods can extend and improve the knowledge obtained by conventional microarray technology, both for mRNAs and for short non-coding RNAs such as miRNAs. The processing methods used to extract and interpret the information are an important aspect of dealing with the vast amounts of data generated by short-read sequencing. Although the number of computational tools for MPS data analysis is constantly growing, their strengths and weaknesses as parts of a complex analytical pipeline have not yet been well investigated., Primary Findings: A benchmark MPS miRNA dataset, resembling a situation in which miRNAs are spiked into biological replication experiments, was assembled by merging a publicly available MPS spike-in miRNA data set with MPS data derived from healthy donor peripheral blood mononuclear cells. Using this data set we observed that short-read counts are strongly underestimated for duplicated miRNAs when the whole genome is used as the reference. Furthermore, the sensitivity of miRNA detection depends strongly on the primary tool used in the analysis. Of the six aligners tested that are specifically devoted to miRNA detection, SHRiMP and MicroRazerS show the highest sensitivity. Differential expression estimation is quite efficient: of the five tools investigated, two (DESeq, baySeq) show very good specificity and sensitivity in the detection of differential expression., Conclusions: The results of our analysis allow the definition of a clear, simple and optimized analytical workflow for digital quantitative miRNA analysis.
- Published
- 2012
- Full Text
- View/download PDF
82. Large disclosing the nature of computational tools for the analysis of next generation sequencing data.
- Author
-
Cordero F, Beccuti M, Donatelli S, and Calogero RA
- Subjects
- Algorithms, Genome genetics, Humans, Computational Biology methods, High-Throughput Nucleotide Sequencing methods
- Abstract
Next-generation sequencing (NGS) technologies are rapidly changing the approach to complex genomic studies, opening the way to personalized drug development and personalized medicine. NGS technologies are characterized by a massive throughput of relatively short sequences (30-100 nucleotides), and they are currently the most reliable and accurate method for grouping individuals on the basis of their genetic profiles. The first and crucial step in sequence analysis is the conversion of millions of short sequences (reads) into valuable genetic information by mapping them to a known (reference) genome. New computational methods, specifically designed for the type and amount of data generated by NGS technologies, are replacing earlier widespread genome alignment algorithms, which are unable to cope with such massive amounts of data. This review provides an overview of the bioinformatics techniques that have been developed for mapping NGS data onto a reference genome, with a special focus on polymorphism rate and sequencing error detection. The different techniques have been tested on an appropriately defined dataset to investigate their relative computational costs and usability from a user's perspective. Since NGS platforms interrogate the genome using either the conventional nucleotide space or the more recent color space, this review considers techniques in both nucleotide and color space, emphasizing their similarities and differences.
- Published
- 2012
- Full Text
- View/download PDF
83. Bilateral posterior maxillary segmental osteotomy to rehabilitate edentulous mandibular area: case report.
- Author
-
Giannini D, Spinelli G, Ghilardi R, Beccuti ML, and Raffaini M
- Subjects
- Female, Humans, Middle Aged, Jaw, Edentulous rehabilitation, Mandible, Orthognathic Surgical Procedures methods
- Abstract
The purpose of this work was to describe a clinical case of reduced vertical height in both posterior sectors, due to maxillary dento-alveolar extrusion into a mandibular edentulous space following extractions that were not promptly replaced by prosthetic rehabilitation, eventually resolved with a bilateral posterior maxillary segmental osteotomy (PMSO). The surgical technique was performed under general anesthesia according to Kufner's version of Schuchardt's original description. In the light of the present outcomes, in severe clinical cases of dento-alveolar extrusion the PMSO can be considered the optimal solution, because of the quality and stability of the final result, the short treatment times, the limited morbidity and the modest compliance required of the patient.
- Published
- 2010
84. [Structure of the Datura arborea L. leaf].
- Author
-
BECCUTI M
- Subjects
- Alkaloids, Chemistry, Pharmaceutical, Datura stramonium, Plants, Medicinal, Solanaceae
- Published
- 1963
85. [Structure of the leaves of Cinnamomum camphora Nees and Eberm].
- Author
-
BECCUTI M
- Subjects
- Anatomy, Cinnamomum camphora, Plant Leaves, Plants
- Published
- 1963