47 results on '"Beccuti M"'
Search Results
2. Efficient and Settings-Free Calibration of Detailed Kinetic Metabolic Models with Enzyme Isoforms Characterization
- Author
-
Raposo, M, Ribeiro, P, Sério, S, Staiano, A, Ciaramella, A, Totis, N, Tangherloni, A, Beccuti, M, Cazzaniga, P, Nobile, M, Besozzi, D, Pennisi, M, Pappalardo, F, Totis N., Tangherloni A., Beccuti M., Cazzaniga P., Nobile M. S., Besozzi D., Pennisi M., and Pappalardo F.
- Abstract
Mathematical modeling and computational analyses are essential tools to understand and gain novel insights into the functioning of complex biochemical systems. In the specific case of metabolic reaction networks, which are regulated by many other intracellular processes, various challenging problems hinder the definition of compact and fully calibrated mathematical models, as well as the execution of computationally efficient analyses of their emergent dynamics. These problems especially occur when the model explicitly takes into account the presence and the effect of different isoforms of metabolic enzymes. Since the kinetic characterization of the different isoforms is most often unavailable, Parameter Estimation (PE) procedures are typically required to properly calibrate the model. To address these issues, in this work we combine the descriptive power of Stochastic Symmetric Nets, a parametric and compact extension of the Petri Net formalism, with FST-PSO, an efficient and settings-free meta-heuristic for global optimization that is suitable for the PE problem. To prove the effectiveness of our modeling and calibration approach, we investigate here a large-scale kinetic model of human intracellular metabolism. To efficiently execute the large number of simulations required by PE, we exploit LASSIE, a deterministic simulator that offloads the calculations onto the cores of Graphics Processing Units, thus allowing a drastic reduction of the running time. Our results attest that estimating isoform-specific kinetic parameters makes it possible to predict how the knock-down of specific enzyme isoforms affects the dynamic behavior of the metabolic network. Moreover, we show that, thanks to LASSIE, we achieved a speed-up of ~30× with respect to the same analysis carried out on Central Processing Units.
- Published
- 2020
3. P–241 Construction of a Machine Learning algorithm based on early morphokinetics for human blastocyst development prediction: a retrospective analysis of 575 cleavage-stage embryos
- Author
-
Canosa, S, Cordero, F, Beccuti, M, Licheri, N, Bergandi, L, Gennarelli, G, Benedetto, C, and Revelli, A
- Published
- 2021
4. The Drivers Behind Blockchain Adoption: The Rationality of Irrational Choices
- Author
-
Koens, T., Poll, E., Mencagli, G., Heras, D.B., Cardellini, V., Casalicchio, E., Jeannot, E., Wolf, F., Salis, A., Schifanella, C., Manumachu, R.R., Ricci, L., Beccuti, M., Antonelli, L., Sanchez, J.D.G., and Scott, S.L.
- Subjects
Digital Security
- Abstract
Contains fulltext: 200787.pdf (Publisher’s version) (Open Access). Euro-Par 2018: Parallel Processing Workshops - Euro-Par 2018 International Workshops, Turin, Italy, August 27-28, 2018
- Published
- 2018
5. High performance computing for haplotyping: Models and platforms
- Author
-
Mencagli, G, Heras, DB, Cardellini, V, Casalicchio, E, Jeannot, E, Wolf, F, Salis, A, Schifanella, C, Manumachu, RR, Ricci, L, Beccuti, M, Antonelli, L, Garcia Sanchez, JD, Scott, SL, Tangherloni, A, Rundo, L, Spolaor, S, Nobile, M, Merelli, I, Besozzi, D, Mauri, G, Cazzaniga, P, Liò, P, Tangherloni, Andrea, Rundo, Leonardo, Spolaor, Simone, Nobile, Marco S., Merelli, Ivan, Besozzi, Daniela, Mauri, Giancarlo, Cazzaniga, Paolo, and Liò, Pietro
- Abstract
The reconstruction of the haplotype pair for each chromosome is a hot topic in Bioinformatics and Genome Analysis. In Haplotype Assembly (HA), all heterozygous Single Nucleotide Polymorphisms (SNPs) have to be assigned to exactly one of the two chromosomes. In this work, we outline the state-of-the-art on HA approaches and present an in-depth analysis of the computational performance of GenHap, a recent method based on Genetic Algorithms. GenHap was designed to tackle the computational complexity of the HA problem by means of a divide-et-impera strategy that effectively leverages multi-core architectures. In order to evaluate GenHap’s performance, we generated different instances of synthetic (yet realistic) data exploiting empirical error models of four different sequencing platforms (namely, Illumina NovaSeq, Roche/454, PacBio RS II and Oxford Nanopore Technologies MinION). Our results show that the processing time generally decreases along with the read length, involving a lower number of sub-problems to be distributed on multiple cores.
- Published
- 2019
6. GPU Accelerated Analysis of Treg-Teff Cross Regulation in Relapsing-Remitting Multiple Sclerosis
- Author
-
Mencagli, G, Heras, DB, Cardellini, V, Casalicchio, E, Jeannot, E, Wolf, F, Salis, A, Schifanella, C, Manumachu, RR, Ricci, L, Beccuti, M, Antonelli, L, Garcia Sanchez, JD, Scott, SL, Cazzaniga, P, Pennisi, M, Besozzi, D, Nobile, M, Pernice, S, Russo, G, Tangherloni, A, Pappalardo, F, and Nobile, MS
- Abstract
The computational analysis of complex biological systems can be hindered by two main factors. First, modeling the system so that it can be easily understood and analyzed by non-expert users is not always possible, especially when dealing with systems of Ordinary Differential Equations. Second, when the system is composed of hundreds or thousands of reactions and chemical species, classic CPU-based simulators may not be able to derive the behavior of the system efficiently. To overcome these limitations, in this paper we propose a novel approach that combines the descriptive power of Stochastic Symmetric Nets–a Petri Net formalism that allows the modeler to describe the system in a parametric and compact manner–with LASSIE, a GPU-powered deterministic simulator that offloads onto the GPU the calculations required to execute many simulations by following both fine-grained and coarse-grained parallelization strategies. This pipeline has been applied to carry out a parameter sweep analysis of a relapsing-remitting multiple sclerosis model, aimed at understanding the role of possible malfunctions in the cross-balancing mechanisms that regulate peripheral tolerance of self-reactive T lymphocytes. From our experiments, LASSIE achieves around 97× speed-up with respect to the sequential execution of the same number of simulations.
- Published
- 2019
7. The Drivers Behind Blockchain Adoption: The Rationality of Irrational Choices
- Author
-
Mencagli, G., Heras, D.B., Cardellini, V., Casalicchio, E., Jeannot, E., Wolf, F., Salis, A., Schifanella, C., Manumachu, R.R., Ricci, L., Beccuti, M., Antonelli, L., Sanchez, J.D.G., Scott, S.L., Koens, T., and Poll, E.
- Abstract
Euro-Par 2018: Parallel Processing Workshops - Euro-Par 2018 International Workshops, Turin, Italy, August 27-28, 2018. Contains fulltext: 200787.pdf (Publisher’s version) (Open Access)
- Published
- 2018
8. GPU Powered Parameter Estimation of Large-scale kinetic metabolic model
- Author
-
Totis, N, Tangherloni, A, Beccuti, M, Cazzaniga, P, Nobile, M, Besozzi, D, Pennisi, M, and Pappalardo, F
- Published
- 2018
9. Decision diagrams for Petri nets: which variable ordering?
- Author
-
Amparore, E. G., Donatelli, S., Beccuti, M., Garbi, G., and Miner, Andrew
- Published
- 2017
10. D8: preliminary modeling framework
- Author
-
Kaaniche M., Beccuti M., Brasca C., Chiaradonna S., Donatelli S., and Di Giandomenico F.
- Subjects
Critical infrastructures ,Power systems ,Interdependencies modelling
- Abstract
This deliverable presents the preliminary version of the CRUTIAL modelling framework aimed at describing interdependencies related failures and evaluating their impact on the dependability and security of information and controlled electricity infrastructures, accounting for both accidental and malicious faults. Two main complementary objectives are followed: 1) the development of qualitative models describing cascading, escalating and common cause failures, where the infrastructures are modelled globally at a high abstraction level; and 2) the development of detailed hierarchical quantitative evaluation models of the infrastructures taking into account their internal architectures and the behaviour of their components resulting from the occurrence of electrical and ICT failures and recoveries.
- Published
- 2008
11. MET Exon 14 Skipping: A Case Study for the Detection of Genetic Variants in Cancer Driver Genes by Deep Learning
- Author
-
Maddalena Arigoni, Paolo M. Comoglio, Francesca Cordero, Vladimir Nosi, Alessandrì Luca, Silvia Benvenuti, Sara Riccardo, Raffaele A. Calogero, Marcella Cesana, Marco Beccuti, Davide Cacchiarelli, Lucio Di Filippo, Melissa Milan, Nosi, V., Luca, A., Milan, M., Arigoni, M., Benvenuti, S., Cacchiarelli, D., Cesana, M., Riccardo, S., Filippo, L. D., Cordero, F., Beccuti, M., Comoglio, P. M., and Calogero, R. A.
- Subjects
Neural Networks ,QH301-705.5 ,Exon ,Computational biology ,Article ,Catalysis ,Receptor tyrosine kinase ,Inorganic Chemistry ,Computer ,Deep Learning ,medicine ,biochemistry ,Humans ,Biology (General) ,Physical and Theoretical Chemistry ,QD1-999 ,Molecular Biology ,Genetic variant ,Spectroscopy ,biology ,genetic variants ,Organic Chemistry ,Alternative splicing ,Intron ,Cancer ,Genetic Variation ,General Medicine ,Exons ,medicine.disease ,Exon skipping ,Neural network ,Computer Science Applications ,Long interspersed nuclear element ,Chemistry ,Tumor progression ,Deep learning ,Genetic variants ,MET ,Neural Networks, Computer ,biology.protein ,Human
- Abstract
Background: Disruption of alternative splicing (AS) is frequently observed in cancer and might represent an important signature for tumor progression and therapy. Exon skipping (ES) represents one of the most frequent AS events, and in non-small cell lung cancer (NSCLC) MET exon 14 skipping was shown to be targetable. Methods: We constructed neural networks (NN/CNN) specifically designed to detect MET exon 14 skipping events using RNAseq data. Furthermore, for discovery purposes we also developed a sparsely connected autoencoder to identify uncharacterized MET isoforms. Results: The neural networks had a Met exon 14 skipping detection rate greater than 94% when tested on a manually curated set of 690 TCGA bronchus and lung samples. When globally applied to 2605 TCGA samples, we observed that the majority of false positives was characterized by a blurry coverage of exon 14, but interestingly they share a common coverage peak in the second intron and we speculate that this event could be the transcription signature of a LINE1 (Long Interspersed Nuclear Element 1)-MET (Mesenchymal Epithelial Transition receptor tyrosine kinase) fusion. Conclusions: Taken together, our results indicate that neural networks can be an effective tool to provide a quick classification of pathological transcription events, and sparsely connected autoencoders could represent the basis for the development of an effective discovery tool.
- Published
- 2021
12. Efficient and Settings-Free Calibration of Detailed Kinetic Metabolic Models with Enzyme Isoforms Characterization
- Author
-
Andrea Tangherloni, Daniela Besozzi, Paolo Cazzaniga, Francesco Pappalardo, Marco Beccuti, Marzio Pennisi, Niccoló Totis, Marco S. Nobile, Raposo, M, Ribeiro, P, Sério, S, Staiano, A, Ciaramella, A, Totis, N, Tangherloni, A, Beccuti, M, Cazzaniga, P, Nobile, M, Besozzi, D, Pennisi, M, and Pappalardo, F
- Subjects
0301 basic medicine ,Exploit ,Mathematical model ,Metabolic reaction network ,Settore INF/01 - Informatica ,Estimation theory ,Computer science ,Parameter Estimation ,Metabolic reaction networks ,Metabolic network ,INF/01 - INFORMATICA ,Petri net ,03 medical and health sciences ,030104 developmental biology ,GPU-powered simulations ,Graphics ,Biological system ,Global optimization ,GPU-powered simulation ,Parametric statistics
- Abstract
Mathematical modeling and computational analyses are essential tools to understand and gain novel insights into the functioning of complex biochemical systems. In the specific case of metabolic reaction networks, which are regulated by many other intracellular processes, various challenging problems hinder the definition of compact and fully calibrated mathematical models, as well as the execution of computationally efficient analyses of their emergent dynamics. These problems especially occur when the model explicitly takes into account the presence and the effect of different isoforms of metabolic enzymes. Since the kinetic characterization of the different isoforms is most often unavailable, Parameter Estimation (PE) procedures are typically required to properly calibrate the model. To address these issues, in this work we combine the descriptive power of Stochastic Symmetric Nets, a parametric and compact extension of the Petri Net formalism, with FST-PSO, an efficient and settings-free meta-heuristic for global optimization that is suitable for the PE problem. To prove the effectiveness of our modeling and calibration approach, we investigate here a large-scale kinetic model of human intracellular metabolism. To efficiently execute the large number of simulations required by PE, we exploit LASSIE, a deterministic simulator that offloads the calculations onto the cores of Graphics Processing Units, thus allowing a drastic reduction of the running time. Our results attest that estimating isoform-specific kinetic parameters makes it possible to predict how the knock-down of specific enzyme isoforms affects the dynamic behavior of the metabolic network. Moreover, we show that, thanks to LASSIE, we achieved a speed-up of ~30× with respect to the same analysis carried out on Central Processing Units.
- Published
- 2020
13. GPU Accelerated Analysis of Treg-Teff Cross Regulation in Relapsing-Remitting Multiple Sclerosis
- Author
-
Marco Beccuti, Paolo Cazzaniga, Daniela Besozzi, Simone Pernice, Francesco Pappalardo, Andrea Tangherloni, Giulia Russo, Marco S. Nobile, Marzio Pennisi, Mencagli, G, Heras, DB, Cardellini, V, Casalicchio, E, Jeannot, E, Wolf, F, Salis, A, Schifanella, C, Manumachu, RR, Ricci, L, Beccuti, M, Antonelli, L, Garcia Sanchez, JD, Scott, SL, Cazzaniga, P, Pennisi, M, Besozzi, D, Nobile, M, Pernice, S, Russo, G, Tangherloni, A, and Pappalardo, F
- Subjects
0301 basic medicine ,Multiple sclerosis, GPGPU computing, Petri nets, Parameter sweep analysis ,Settore INF/01 - Informatica ,Computer science ,Computer Science (all) ,INF/01 - INFORMATICA ,Petri nets ,Parallel computing ,Petri net ,GPGPU computing ,Multiple sclerosis ,Parameter sweep analysis ,Cross regulation ,Theoretical Computer Science ,03 medical and health sciences ,Formalism (philosophy of mathematics) ,030104 developmental biology ,0302 clinical medicine ,Relapsing remitting ,Ordinary differential equation ,030217 neurology & neurosurgery ,Parametric statistics
- Abstract
The computational analysis of complex biological systems can be hindered by two main factors. First, modeling the system so that it can be easily understood and analyzed by non-expert users is not always possible, especially when dealing with systems of Ordinary Differential Equations. Second, when the system is composed of hundreds or thousands of reactions and chemical species, classic CPU-based simulators may not be able to derive the behavior of the system efficiently. To overcome these limitations, in this paper we propose a novel approach that combines the descriptive power of Stochastic Symmetric Nets–a Petri Net formalism that allows the modeler to describe the system in a parametric and compact manner–with LASSIE, a GPU-powered deterministic simulator that offloads onto the GPU the calculations required to execute many simulations by following both fine-grained and coarse-grained parallelization strategies. This pipeline has been applied to carry out a parameter sweep analysis of a relapsing-remitting multiple sclerosis model, aimed at understanding the role of possible malfunctions in the cross-balancing mechanisms that regulate peripheral tolerance of self-reactive T lymphocytes. From our experiments, LASSIE achieves around 97× speed-up with respect to the sequential execution of the same number of simulations.
- Published
- 2019
14. High performance computing for haplotyping: Models and platforms
- Author
-
Ivan Merelli, Leonardo Rundo, Daniela Besozzi, Paolo Cazzaniga, Pietro Liò, Giancarlo Mauri, Simone Spolaor, Andrea Tangherloni, Marco S. Nobile, Mencagli, G, Heras, DB, Cardellini, V, Casalicchio, E, Jeannot, E, Wolf, F, Salis, A, Schifanella, C, Manumachu, RR, Ricci, L, Beccuti, M, Antonelli, L, Garcia Sanchez, JD, Scott, SL, Tangherloni, A, Rundo, L, Spolaor, S, Nobile, M, Merelli, I, Besozzi, D, Mauri, G, Cazzaniga, P, and Liò, P
- Subjects
0301 basic medicine ,Computer science ,High Performance Computing ,Single-nucleotide polymorphism ,Parallel computing ,Haplotype Assembly ,Future-generation sequencing ,Genome Analysis ,Master-Slave paradigm ,Theoretical Computer Science ,Genome ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI ,03 medical and health sciences ,Genome Analysis Haplotype Assembly ,Settore INF/01 - Informatica ,Haplotype ,Chromosome ,INF/01 - INFORMATICA ,Supercomputer ,Haplotype pair ,030104 developmental biology ,Genome Analysi ,Minion ,Nanopore sequencing
- Abstract
The reconstruction of the haplotype pair for each chromosome is a hot topic in Bioinformatics and Genome Analysis. In Haplotype Assembly (HA), all heterozygous Single Nucleotide Polymorphisms (SNPs) have to be assigned to exactly one of the two chromosomes. In this work, we outline the state-of-the-art on HA approaches and present an in-depth analysis of the computational performance of GenHap, a recent method based on Genetic Algorithms. GenHap was designed to tackle the computational complexity of the HA problem by means of a divide-et-impera strategy that effectively leverages multi-core architectures. In order to evaluate GenHap’s performance, we generated different instances of synthetic (yet realistic) data exploiting empirical error models of four different sequencing platforms (namely, Illumina NovaSeq, Roche/454, PacBio RS II and Oxford Nanopore Technologies MinION). Our results show that the processing time generally decreases along with the read length, involving a lower number of sub-problems to be distributed on multiple cores.
- Published
- 2019
15. CREDO: a friendly Customizable, REproducible, DOcker file generator for bioinformatics applications.
- Author
-
Alessandri S, Ratto ML, Rabellino S, Piacenti G, Contaldo SG, Pernice S, Beccuti M, Calogero RA, and Alessandri L
- Subjects
- Reproducibility of Results, Software, Computational Biology methods
- Abstract
Background: The analysis of large and complex biological datasets in bioinformatics poses a significant challenge to achieving reproducible research outcomes due to inconsistencies and the lack of standardization in the analysis process. These issues can lead to discrepancies in results, undermining the credibility and impact of bioinformatics research and creating mistrust in the scientific process. To address these challenges, open science practices such as sharing data, code, and methods have been encouraged. Results: CREDO, a Customizable, REproducible, DOcker file generator for bioinformatics applications, has been developed as a tool to moderate reproducibility issues by building and distributing docker containers with embedded bioinformatics tools. CREDO simplifies the process of generating Docker images, facilitating reproducibility and efficient research in bioinformatics. The crucial step in generating a Docker image is creating the Dockerfile, which requires incorporating heterogeneous packages and environments such as Bioconductor and Conda. CREDO stores all required package information and dependencies in a Github-compatible format to enhance Docker image reproducibility, allowing easy image creation from scratch. The user-friendly GUI and CREDO's ability to generate modular Docker images make it an ideal tool for life scientists to efficiently create Docker images. Overall, CREDO is a valuable tool for addressing reproducibility issues in bioinformatics research and promoting open science practices. (© 2024. The Author(s).)
- Published
- 2024
16. A single cell RNAseq benchmark experiment embedding "controlled" cancer heterogeneity.
- Author
-
Arigoni M, Ratto ML, Riccardo F, Balmas E, Calogero L, Cordero F, Beccuti M, Calogero RA, and Alessandri L
- Subjects
- Humans, Algorithms, Gene Expression Profiling methods, Proto-Oncogene Proteins genetics, Sequence Analysis, RNA methods, Single-Cell Gene Expression Analysis, Cell Line, Tumor, Benchmarking, Lung Neoplasms genetics
- Abstract
Single-cell RNA sequencing (scRNA-seq) has emerged as a vital tool in tumour research, enabling the exploration of molecular complexities at the individual cell level. It offers new technical possibilities for advancing tumour research with the potential to yield significant breakthroughs. However, deciphering meaningful insights from scRNA-seq data poses challenges, particularly in cell annotation and tumour subpopulation identification. Efficient algorithms are therefore needed to unravel the intricate biological processes of cancer. To address these challenges, benchmarking datasets are essential to validate bioinformatics methodologies for analysing single-cell omics in oncology. Here, we present a 10XGenomics scRNA-seq experiment, providing a controlled heterogeneous environment using lung cancer cell lines characterised by the expression of seven different driver genes (EGFR, ALK, MET, ERBB2, KRAS, BRAF, ROS1), leading to partially overlapping functional pathways. Our dataset provides a comprehensive framework for the development and validation of methodologies for analysing cancer heterogeneity by means of scRNA-seq. (© 2024. The Author(s).)
- Published
- 2024
17. Tamoxifen Activates Transcription Factor EB and Triggers Protective Autophagy in Breast Cancer Cells by Inducing Lysosomal Calcium Release: A Gateway to the Onset of Endocrine Resistance.
- Author
-
Boretto C, Actis C, Faris P, Cordero F, Beccuti M, Ferrero G, Muzio G, Moccia F, and Autelli R
- Subjects
- Tamoxifen pharmacology, Calcium, Dietary, Autophagy, Lysosomes, Calcium, Neoplasms
- Abstract
Among the several mechanisms accounting for endocrine resistance in breast cancer, autophagy has emerged as an important player. Previous reports have evidenced that tamoxifen (Tam) induces autophagy and activates transcription factor EB (TFEB), which regulates the expression of genes controlling autophagy and lysosomal biogenesis. However, the mechanisms by which this occurs have not been elucidated as yet. This investigation aims at dissecting how TFEB is activated and contributes to Tam resistance in luminal A breast cancer cells. TFEB was overexpressed and prominently nuclear in Tam-resistant MCF7 cells (MCF7-TamR) compared with their parental counterpart, and this was not dependent on alterations of its nucleo-cytoplasmic shuttling. Tam promoted the release of lysosomal Ca2+ through the major transient receptor potential cation channel mucolipin subfamily member 1 (TRPML1) and two-pore channels (TPCs), which caused the nuclear translocation and activation of TFEB. Consistently, inhibiting lysosomal calcium release restored the susceptibility of MCF7-TamR cells to Tam. Our findings demonstrate that Tam drives the nuclear relocation and transcriptional activation of TFEB by triggering the release of Ca2+ from the acidic compartment, and they suggest that lysosomal Ca2+ channels may represent new druggable targets to counteract the onset of autophagy-mediated endocrine resistance in luminal A breast cancer cells.
- Published
- 2023
18. CONNECTOR, fitting and clustering of longitudinal data to reveal a new risk stratification system.
- Author
-
Pernice S, Sirovich R, Grassi E, Viviani M, Ferri M, Sassi F, Alessandrì L, Tortarolo D, Calogero RA, Trusolino L, Bertotti A, Beccuti M, Olivero M, and Cordero F
- Subjects
- Humans, Animals, Cluster Analysis, Time Factors, Disease Models, Animal, Risk Assessment, Software
- Abstract
Motivation: The transition from evaluating a single time point to examining the entire dynamic evolution of a system is possible only in the presence of the proper framework. The strong variability of dynamic evolution makes the definition of an explanatory procedure for data fitting and clustering challenging. Results: We developed CONNECTOR, a data-driven framework able to analyze and inspect longitudinal data in a straightforward and revealing way. When used to analyze tumor growth kinetics over time in 1599 patient-derived xenograft growth curves from ovarian and colorectal cancers, CONNECTOR allowed the aggregation of time-series data through an unsupervised approach in informative clusters. We give a new perspective of mechanism interpretation, specifically, we define novel model aggregations and we identify unanticipated molecular associations with response to clinically approved therapies. Availability and Implementation: CONNECTOR is freely available under GNU GPL license at https://qbioturin.github.io/connector and https://doi.org/10.17504/protocols.io.8epv56e74g1b/v1. (© The Author(s) 2023. Published by Oxford University Press.)
- Published
- 2023
19. Stardust: improving spatial transcriptomics data analysis through space-aware modularity optimization-based clustering.
- Author
-
Avesani S, Viesi E, Alessandrì L, Motterle G, Bonnici V, Beccuti M, Calogero R, and Giugno R
- Subjects
- Algorithms, Cluster Analysis, Data Analysis, Transcriptome
- Abstract
Background: Spatial transcriptomics (ST) combines stained tissue images with spatially resolved high-throughput RNA sequencing. The spatial transcriptomic analysis includes challenging tasks like clustering, where a partition among data points (spots) is defined by means of a similarity measure. Improving clustering results is a key factor as clustering affects subsequent downstream analysis. State-of-the-art approaches group data by taking into account transcriptional similarity and some by exploiting spatial information as well. However, it is not yet clear how much the spatial information combined with transcriptomics improves the clustering result. Results: We propose a new clustering method, Stardust, that easily exploits the combination of space and transcriptomic information in the clustering procedure through a manual or fully automatic tuning of algorithm parameters. Moreover, a parameter-free version of the method is also provided where the spatial contribution depends dynamically on the expression distances distribution in the space. We evaluated the proposed method's results by analyzing ST data sets available on the 10x Genomics website and comparing clustering performances with state-of-the-art approaches by measuring the spots' stability in the clusters and their biological coherence. Stability is defined by the tendency of each point to remain clustered with the same neighbors when perturbations are applied. Conclusions: Stardust is an easy-to-use methodology that makes it possible to define how much spatial information should influence clustering on different tissues and achieves more stable results than state-of-the-art approaches. (© The Author(s) 2022. Published by Oxford University Press GigaScience.)
- Published
- 2022
20. Sparsely Connected Autoencoders: A Multi-Purpose Tool for Single Cell omics Analysis.
- Author
-
Alessandri L, Ratto ML, Contaldo SG, Beccuti M, Cordero F, Arigoni M, and Calogero RA
- Subjects
- Adenocarcinoma of Lung genetics, Humans, Lung Neoplasms genetics, Exome Sequencing, Adenocarcinoma of Lung pathology, Cluster Analysis, Computational Biology methods, Lung Neoplasms pathology, Machine Learning, Neural Networks, Computer, Single-Cell Analysis methods
- Abstract
Background: Biological processes are based on complex networks of cells and molecules. Single cell multi-omics is a new tool aiming to provide new insights into the complex network of events controlling the functionality of the cell. Methods: Since single cell technologies provide many sample measurements, they are the ideal environment for the application of Deep Learning and Machine Learning approaches. An autoencoder is composed of an encoder and a decoder sub-model. An autoencoder is a very powerful tool in data compression and noise removal. However, the decoder model remains a black box from which it is impossible to depict the contribution of the single input elements. We have recently developed a new class of autoencoders, called Sparsely Connected Autoencoders (SCA), which have the advantage of providing a controlled association among the input layer and the decoder module. This new architecture has the benefit that the decoder model is not a black box anymore and can be used to depict new biologically interesting features from single cell data. Results: Here, we show that the SCA hidden layer can grab new information usually hidden in single cell data, such as clustering on meta-features that are difficult (e.g. transcription factor expression) or technically impossible (e.g. miRNA expression) to depict in single cell RNAseq data. Furthermore, SCA representation of cell clusters has the advantage of simulating a conventional bulk RNAseq, which is a data transformation allowing the identification of similarity among independent experiments. Conclusions: In our opinion, SCA represents the bioinformatics version of a universal "Swiss-knife" for the extraction of hidden knowledgeable features from single cell omics data.
- Published
- 2021
21. GRAPES-DD: exploiting decision diagrams for index-driven search in biological graph databases.
- Author
-
Licheri N, Bonnici V, Beccuti M, and Giugno R
- Subjects
- Abstracting and Indexing, Algorithms, Databases, Factual, Vitis
- Abstract
Background: Graphs are mathematical structures widely used for expressing relationships among elements when representing biomedical and biological information. On top of these representations, several analyses are performed. A common task is the search of one substructure within one graph, called target. The problem is referred to as one-to-one subgraph search, and it is known to be NP-complete. Heuristics and indexing techniques can be applied to facilitate the search. Indexing techniques are also exploited in the context of searching in a collection of target graphs, referred to as the one-to-many subgraph problem. Filter-and-verification methods that use indexing approaches provide a fast pruning of target graphs or parts of them that do not contain the query. The expensive verification phase is then performed only on the subset of promising targets. Indexing strategies extract graph features at a sufficient granularity level for performing a powerful filtering step. Features are memorized in data structures allowing efficient access. Indexing size, querying time and filtering power are key points for the development of efficient subgraph searching solutions. Results: An existing approach, GRAPES, has been shown to have good performance in terms of speed-up for both one-to-one and one-to-many cases. However, it suffers in the size of the built index. For this reason, we propose GRAPES-DD, a modified version of GRAPES in which the indexing structure has been replaced with a Decision Diagram. Decision Diagrams are a broad class of data structures widely used to encode and manipulate functions efficiently. Experiments on biomedical structures and synthetic graphs have confirmed our expectation, showing that GRAPES-DD substantially reduces memory utilization compared to GRAPES without worsening the searching time. Conclusion: The use of Decision Diagrams for searching in biochemical and biological graphs is completely new and potentially promising thanks to their ability to compactly encode sets by exploiting their structure and regularity, and to manipulate entire sets of elements at once, instead of exploring each single element explicitly. Search strategies based on Decision Diagrams make indexing more affordable, for biochemical graphs and beyond, allowing us to potentially deal with huge and ever growing collections of biochemical and biological structures.
- Published
- 2021
- Full Text
- View/download PDF
22. MET Exon 14 Skipping: A Case Study for the Detection of Genetic Variants in Cancer Driver Genes by Deep Learning.
- Author
-
Nosi V, Luca A, Milan M, Arigoni M, Benvenuti S, Cacchiarelli D, Cesana M, Riccardo S, Di Filippo L, Cordero F, Beccuti M, Comoglio PM, and Calogero RA
- Subjects
- Genetic Variation genetics, Humans, Neural Networks, Computer, Deep Learning, Exons genetics
- Abstract
Background: Disruption of alternative splicing (AS) is frequently observed in cancer and might represent an important signature for tumor progression and therapy. Exon skipping (ES) represents one of the most frequent AS events, and in non-small cell lung cancer (NSCLC) MET exon 14 skipping was shown to be targetable. Methods: We constructed neural networks (NN/CNN) specifically designed to detect MET exon 14 skipping events using RNAseq data. Furthermore, for discovery purposes we also developed a sparsely connected autoencoder to identify uncharacterized MET isoforms. Results: The neural networks had a MET exon 14 skipping detection rate greater than 94% when tested on a manually curated set of 690 TCGA bronchus and lung samples. When globally applied to 2605 TCGA samples, we observed that the majority of false positives were characterized by a blurry coverage of exon 14, but interestingly they share a common coverage peak in the second intron and we speculate that this event could be the transcription signature of a LINE1 (Long Interspersed Nuclear Element 1)-MET (Mesenchymal Epithelial Transition receptor tyrosine kinase) fusion. Conclusions: Taken together, our results indicate that neural networks can be an effective tool to provide a quick classification of pathological transcription events, and sparsely connected autoencoders could represent the basis for the development of an effective discovery tool.
- Published
- 2021
- Full Text
- View/download PDF
23. Sparsely-connected autoencoder (SCA) for single cell RNAseq data mining.
- Author
-
Alessandri L, Cordero F, Beccuti M, Licheri N, Arigoni M, Olivero M, Di Renzo MF, Sapino A, and Calogero R
- Subjects
- Algorithms, Base Sequence genetics, Cluster Analysis, Humans, Neural Networks, Computer, Software, Systems Biology methods, Exome Sequencing methods, Data Mining methods, Sequence Analysis, RNA methods, Single-Cell Analysis methods
- Abstract
Single-cell RNA sequencing (scRNAseq) is an essential tool to investigate cellular heterogeneity. Thus, it would be of great interest to be able to disclose biological information belonging to cell subpopulations, which can be defined by clustering analysis of scRNAseq data. In this manuscript, we report a tool that we developed for the functional mining of single cell clusters based on Sparsely-Connected Autoencoder (SCA). This tool allows uncovering hidden features associated with scRNAseq data. We implemented two new metrics, QCC (Quality Control of Cluster) and QCM (Quality Control of Model), which allow quantifying the ability of SCA to reconstruct valuable cell clusters and to evaluate the quality of the neural network achievements, respectively. Our data indicate that the SCA encoded space, derived from different experimentally validated data (TF targets, miRNA targets, Kinase targets, and cancer-related immune signatures), can be used to grasp single cell cluster-specific functional features. In our implementation, SCA efficacy comes from its ability to reconstruct only specific clusters, thus indicating only those clusters where the SCA encoding space is a key element for cell aggregation. SCA analysis is implemented as a module in the rCASC framework and it is supported by a GUI to simplify its usage for biologists and medical personnel.
- Published
- 2021
- Full Text
- View/download PDF
24. Computational modeling of the immune response in multiple sclerosis using epimod framework.
- Author
-
Pernice S, Follia L, Maglione A, Pennisi M, Pappalardo F, Novelli F, Clerico M, Beccuti M, Cordero F, and Rolla S
- Subjects
- Algorithms, Daclizumab therapeutic use, Humans, Immunosuppressive Agents therapeutic use, Multiple Sclerosis, Relapsing-Remitting drug therapy, Multiple Sclerosis, Relapsing-Remitting pathology, Stochastic Processes, Immune System physiology, Models, Biological, Multiple Sclerosis, Relapsing-Remitting immunology, User-Computer Interface
- Abstract
Background: Multiple Sclerosis (MS) is currently the leading cause of non-traumatic disability in young adults in Europe, with more than 700,000 EU cases. Although huge strides have been made over the years, MS etiology remains partially unknown. Furthermore, the presence of various endogenous and exogenous factors can greatly influence the immune response of different individuals, making it difficult to study and understand the disease. This becomes more evident in a personalized setting, when medical doctors have to choose the best therapy for patient well-being. From this perspective, the use of stochastic models, capable of taking into consideration all the fluctuations due to unknown factors and individual variability, is highly advisable. Results: We propose a new model to study the immune response in relapsing remitting MS (RRMS), the most common form of MS, which is characterized by alternating episodes of symptom exacerbation (relapses) and periods of disease stability (remission). In this new model, both the peripheral lymph node/blood vessel and the central nervous system are explicitly represented. The model was created and analysed using Epimod, our recently developed general framework for modeling complex biological systems. Then the effectiveness of our model was shown by modeling the complex immunological mechanisms characterizing RRMS during its course and under DAC administration. Conclusions: Simulation results have proven the ability of the model to reproduce in silico the immune T cell balance characterizing the RRMS course and the DAC effects. Furthermore, they confirmed the importance of a timely intervention on the disease course.
- Published
- 2020
- Full Text
- View/download PDF
25. Impacts of reopening strategies for COVID-19 epidemic: a modeling study in Piedmont region.
- Author
-
Pernice S, Castagno P, Marcotulli L, Maule MM, Richiardi L, Moirano G, Sereno M, Cordero F, and Beccuti M
- Subjects
- Betacoronavirus isolation & purification, COVID-19, Carrier State diagnosis, Carrier State epidemiology, Coronavirus Infections diagnosis, Coronavirus Infections transmission, Disease Susceptibility diagnosis, Disease Susceptibility epidemiology, Humans, Italy epidemiology, Models, Theoretical, Pneumonia, Viral diagnosis, Pneumonia, Viral transmission, Quarantine, SARS-CoV-2, Communicable Disease Control methods, Coronavirus Infections epidemiology, Coronavirus Infections prevention & control, Pandemics prevention & control, Pneumonia, Viral epidemiology, Pneumonia, Viral prevention & control
- Abstract
Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of the coronavirus disease 19 (COVID-19), is a highly transmittable virus. Since the first person-to-person transmission of SARS-CoV-2 was reported in Italy on February 21st, 2020, the number of people infected with SARS-CoV-2 increased rapidly, mainly in northern Italian regions, including Piedmont. A strict lockdown was imposed on March 21st until May 4th, when a gradual relaxation of the restrictions started. In this context, computational models and computer simulations are one of the available research tools that epidemiologists can exploit to understand the spread of the diseases and to evaluate social measures to counteract, mitigate or delay the spread of the epidemic. Methods: This study presents an extended version of the Susceptible-Exposed-Infected-Removed-Susceptible (SEIRS) model accounting for population age structure. The infectious population is divided into three sub-groups: (i) undetected infected individuals, (ii) quarantined infected individuals and (iii) hospitalized infected individuals. Moreover, the strength of the government restriction measures and the related population response to these are explicitly represented in the model. Results: The proposed model allows us to investigate different scenarios of the COVID-19 spread in Piedmont and the implementation of different infection-control measures and testing approaches. The results show that the implemented control measures have proven effective in containing the epidemic, mitigating the potentially dangerous impact of a large proportion of undetected cases. We also forecast the optimal combination of individual-level measures and community surveillance to contain the new wave of COVID-19 spread after the re-opening of work and social activities. Conclusions: Our model is an effective tool useful to investigate different scenarios and to inform policy makers about the potential impact of different control strategies. This will be crucial in the upcoming months, when very critical decisions about easing control measures will need to be taken.
- Published
- 2020
- Full Text
- View/download PDF
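The compartment structure described in the abstract of record 25 can be illustrated with a minimal sketch. This is not the authors' model: it omits the age structure and the government-restriction terms, assumes that only undetected infected individuals transmit the infection, integrates with a plain forward Euler step, and uses made-up rate values. All names (`State`, `step`, `simulate`, `beta`, `sigma`, `gamma`, `omega`, `split`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class State:
    s: float               # susceptible
    e: float               # exposed
    i_undetected: float    # infectious, not detected
    i_quarantined: float   # infectious, detected and quarantined
    i_hospitalized: float  # infectious, hospitalized
    r: float               # removed (temporarily immune)

    def total(self) -> float:
        return (self.s + self.e + self.i_undetected
                + self.i_quarantined + self.i_hospitalized + self.r)


def step(state, dt, beta=0.4, sigma=0.2, gamma=0.1, omega=0.01,
         split=(0.6, 0.3, 0.1)):
    """One forward Euler step; all rate values are illustrative assumptions."""
    n = state.total()
    # Assumption: only undetected cases contribute to transmission.
    new_exposed = beta * state.s * state.i_undetected / n
    onset = sigma * state.e        # exposed individuals becoming infectious
    waning = omega * state.r       # loss of immunity (the final "S" in SEIRS)
    p_u, p_q, p_h = split          # must sum to 1 to conserve the population
    infectious = (state.i_undetected + state.i_quarantined
                  + state.i_hospitalized)
    return State(
        s=state.s + dt * (waning - new_exposed),
        e=state.e + dt * (new_exposed - onset),
        i_undetected=state.i_undetected
        + dt * (p_u * onset - gamma * state.i_undetected),
        i_quarantined=state.i_quarantined
        + dt * (p_q * onset - gamma * state.i_quarantined),
        i_hospitalized=state.i_hospitalized
        + dt * (p_h * onset - gamma * state.i_hospitalized),
        r=state.r + dt * (gamma * infectious - waning),
    )


def simulate(initial, days, dt=0.1, **rates):
    """Advance the model for the given number of days."""
    state = initial
    for _ in range(int(round(days / dt))):
        state = step(state, dt, **rates)
    return state


if __name__ == "__main__":
    start = State(s=9990.0, e=0.0, i_undetected=10.0,
                  i_quarantined=0.0, i_hospitalized=0.0, r=0.0)
    print(simulate(start, days=100))
```

Because the three infectious sub-groups only redistribute the same outflow from the exposed compartment, the total population stays constant, which is a quick sanity check for any variant of this sketch.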
26. A computational framework for modeling and studying pertussis epidemiology and vaccination.
- Author
-
Castagno P, Pernice S, Ghetti G, Povero M, Pradelli L, Paolotti D, Balbo G, Sereno M, and Beccuti M
- Subjects
- Adolescent, Child, Humans, Reproducibility of Results, Computational Biology methods, Computer Simulation standards, Vaccination methods, Whooping Cough epidemiology
- Abstract
Background: Emerging and re-emerging infectious diseases such as Zika, SARS, ncovid19 and Pertussis pose a compelling challenge for epidemiologists due to their significant impact on global public health. In this context, computational models and computer simulations are one of the available research tools that epidemiologists can exploit to better understand the spreading characteristics of these diseases and to decide on vaccination policies, human interaction controls, and other social measures to counter, mitigate or simply delay the spread of the infectious diseases. Nevertheless, the construction of mathematical models for these diseases and their solution remain challenging tasks, because little effort has been devoted to the definition of a general framework easily accessible even by researchers without advanced modelling and mathematical skills. Results: In this paper we describe a new general modeling framework to study epidemiological systems, whose novelties and strengths are: (1) the use of a graphical formalism to simplify the model creation phase; (2) the implementation of an R package providing a friendly interface to access the analysis techniques implemented in the framework; (3) a high level of portability and reproducibility granted by the containerization of all analysis techniques implemented in the framework; (4) a well-defined schema and related infrastructure to allow users to easily integrate their own analysis workflow in the framework. Then, the effectiveness of this framework is shown through a case study in which we investigate the pertussis epidemiology in Italy. Conclusions: We propose a new general modeling framework for the analysis of epidemiological systems, which exploits the Petri Net graphical formalism, the R environment, and Docker containerization to derive a tool easily accessible by any researcher even without advanced mathematical and computational skills. Moreover, the framework was implemented following the guidelines defined by the Reproducible Bioinformatics Project, so it guarantees reproducible analysis and simplifies the development of new user-defined workflows.
- Published
- 2020
- Full Text
- View/download PDF
27. Docker4Circ: A Framework for the Reproducible Characterization of circRNAs from RNA-Seq Data.
- Author
-
Ferrero G, Licheri N, Coscujuela Tarrero L, De Intinis C, Miano V, Calogero RA, Cordero F, De Bortoli M, and Beccuti M
- Subjects
- Animals, Humans, Databases, Nucleic Acid, RNA, Circular genetics, RNA-Seq, Software
- Abstract
Recent improvements in the cost-effectiveness of high-throughput technologies have allowed RNA sequencing of total transcriptomes suitable for evaluating the expression and regulation of circRNAs, a relatively novel class of transcript isoforms with suggested roles in transcriptional and post-transcriptional gene expression regulation, as well as their possible use as biomarkers, due to their deregulation in various human diseases. A limited number of integrated workflows exist for prediction, characterization, and differential expression analysis of circRNAs, none of them complying with computational reproducibility requirements. We developed Docker4Circ for the complete analysis of circRNAs from RNA-Seq data. Docker4Circ runs a comprehensive analysis of circRNAs in human and model organisms, including: circRNA prediction; classification and annotation using six public databases; back-splice sequence reconstruction; internal alternative splicing of circularizing exons; alignment-free circRNA quantification from RNA-Seq reads; and differential expression analysis. Docker4Circ makes circRNA analysis easier and more accessible thanks to: (i) its R interface; (ii) encapsulation of computational tasks into Docker images; (iii) the availability of a user-friendly Java GUI; and (iv) no need for advanced bash scripting skills for correct use. Furthermore, Docker4Circ ensures a reproducible analysis since all its tasks are embedded into a Docker image following the guidelines provided by the Reproducible Bioinformatics Project.
- Published
- 2019
- Full Text
- View/download PDF
28. A computational approach based on the colored Petri net formalism for studying multiple sclerosis.
- Author
-
Pernice S, Pennisi M, Romano G, Maglione A, Cutrupi S, Pappalardo F, Balbo G, Beccuti M, Cordero F, and Calogero RA
- Subjects
- Computational Biology, Disease Progression, Female, Humans, Immunosuppressive Agents therapeutic use, Pregnancy, Recurrence, Computer Simulation, Multiple Sclerosis, Relapsing-Remitting immunology, Multiple Sclerosis, Relapsing-Remitting physiopathology
- Abstract
Background: Multiple Sclerosis (MS) is an immune-mediated inflammatory disease of the Central Nervous System (CNS) which damages the myelin sheath enveloping nerve cells, thus causing severe physical disability in patients. Relapsing Remitting Multiple Sclerosis (RRMS) is one of the most common forms of MS in adults and is characterized by a series of neurologic symptoms, followed by periods of remission. Recently, many treatments were proposed and studied to counter RRMS progression. Among these drugs, daclizumab (commercial name Zinbryta), an antibody tailored against the Interleukin-2 receptor of T cells, exhibited promising results, but its efficacy was accompanied by an increased frequency of serious adverse events. Manifested side effects consisted of infections, encephalitis, and liver damage. Therefore daclizumab has been withdrawn from the market worldwide. Another interesting case of RRMS regards its progression in pregnant women, where a smaller incidence of relapses until the delivery has been observed. Results: In this paper we propose a new methodology for studying RRMS, which we implemented in GreatSPN, a state-of-the-art open-source suite for modelling and analyzing complex systems through the Petri Net (PN) formalism. This methodology exploits: (a) an extended Colored PN formalism to provide a compact graphical description of the system and to automatically derive a set of ODEs encoding the system dynamics and (b) the Latin Hypercube Sampling with PRCC index to calibrate ODE parameters for reproducing the real behaviours in healthy and MS subjects. To show the effectiveness of such methodology a model of RRMS has been constructed and studied. Two different scenarios of RRMS were thus considered. In the former scenario the effect of the daclizumab administration is investigated, while in the latter one RRMS was studied in pregnant women. Conclusions: We propose a new computational methodology to study RRMS disease. Moreover, we show that a model generated and calibrated according to this methodology is able to reproduce the expected behaviours.
- Published
- 2019
- Full Text
- View/download PDF
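Record 28 mentions Latin Hypercube Sampling as the way parameter values are drawn for calibration. The sampling step alone can be sketched as follows; this is a generic textbook construction, not the GreatSPN implementation, and it does not cover the PRCC analysis. The function name `latin_hypercube` and the parameter ranges in the usage lines are hypothetical.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples points so that, along every dimension, each of the
    n_samples equal-width strata of [low, high) contains exactly one point."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        strata = list(range(n_samples))
        rng.shuffle(strata)  # independent random pairing across dimensions
        # One uniform draw inside each stratum, in shuffled order.
        columns.append([low + (k + rng.random()) * width for k in strata])
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


if __name__ == "__main__":
    # Two hypothetical ODE rate constants with illustrative ranges.
    for point in latin_hypercube(5, [(0.0, 1.0), (10.0, 20.0)], seed=1):
        print(point)
```

Compared with plain uniform sampling, this guarantees that every sub-interval of each parameter range is visited once, which is why it is commonly paired with sensitivity indices when exploring ODE parameters.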
29. rCASC: reproducible classification analysis of single-cell sequencing data.
- Author
-
Alessandrì L, Cordero F, Beccuti M, Arigoni M, Olivero M, Romano G, Rabellino S, Licheri N, De Libero G, Pace L, and Calogero RA
- Subjects
- Cluster Analysis, Humans, Leukocytes, Mononuclear metabolism, Software, Sequence Analysis, RNA, Single-Cell Analysis, Workflow
- Abstract
Background: Single-cell RNA sequencing is essential for investigating cellular heterogeneity and highlighting cell subpopulation-specific signatures. Single-cell sequencing applications have spread from conventional RNA sequencing to epigenomics, e.g., ATAC-seq. Many related algorithms and tools have been developed, but few computational workflows provide analysis flexibility while also achieving functional (i.e., information about the data and the tools used are saved as metadata) and computational reproducibility (i.e., a real image of the computational environment used to generate the data is stored) through a user-friendly environment. Findings: rCASC is a modular workflow providing an integrated analysis environment (from count generation to cell subpopulation identification) exploiting Docker containerization to achieve both functional and computational reproducibility in data analysis. Hence, rCASC provides preprocessing tools to remove low-quality cells and/or specific bias, e.g., cell cycle. Subpopulation discovery can instead be achieved using different clustering techniques based on different distance metrics. Cluster quality is then estimated through the new metric "cell stability score" (CSS), which describes the stability of a cell in a cluster as a consequence of a perturbation induced by removing a random set of cells from the cell population. CSS provides better cluster robustness information than the silhouette metric. Moreover, rCASC's tools can identify cluster-specific gene signatures. Conclusions: rCASC is a modular workflow with new features that could help researchers define cell subpopulations and detect subpopulation-specific markers. It uses Docker for ease of installation and to achieve a computation-reproducible analysis. A Java GUI is provided to welcome users without computational skills in R. (© The Author(s) 2019. Published by Oxford University Press.)
- Published
- 2019
- Full Text
- View/download PDF
30. Integrative Analysis of Novel Metabolic Subtypes in Pancreatic Cancer Fosters New Prognostic Biomarkers.
- Author
-
Follia L, Ferrero G, Mandili G, Beccuti M, Giordano D, Spadi R, Satolli MA, Evangelista A, Katayama H, Hong W, Momin AA, Capello M, Hanash SM, Novelli F, and Cordero F
- Abstract
Background: Most patients with Pancreatic Ductal Adenocarcinoma (PDA) are not eligible for a curative surgical resection. For this reason there is an urgent need for personalized therapies. PDA is the result of complex interactions between the tumor molecular profile and metabolites produced by its microenvironment. Although recent studies have identified PDA molecular subtypes, its metabolic classification is still lacking. Methods: We applied an integrative analysis on transcriptomic and genomic data of glycolytic genes in PDA. Data were collected from public datasets and molecular glycolytic subtypes were defined using hierarchical clustering. The grade of purity of the cancer samples was assessed by estimating the different amount of stromal and immunological infiltrate among the identified PDA subtypes. Analyses of metabolomic data from a subset of PDA cell lines allowed us to identify the different metabolites produced by the metabolic subtypes. Sera of a cohort of 31 PDA patients were analyzed using a Q-TOF mass spectrometer to measure the amount of metabolic circulating proteins present before and after chemotherapy. Results: Our integrative analysis of glycolytic genes identified two glycolytic and two non-glycolytic metabolic PDA subtypes. Glycolytic patients develop disease earlier, have poor prognosis and low immune-infiltrated tumors, and are characterized by a gain in the chr12p13 genomic region. This gain results in the over-expression of GAPDH, TPI1, and FOXM1. PDA cell lines with the gain of chr12p13 are characterized by a higher lipid uptake and sensitivity to drugs targeting the fatty acid metabolism. Our sera proteomic analysis confirms that TPI1 serum levels increase in poor prognosis gemcitabine-treated patients. Conclusions: We identify four metabolic PDA subtypes with different prognosis outcomes which may have a pivotal role in setting personalized treatments. Moreover, our data suggest TPI1 as a putative prognostic PDA biomarker.
- Published
- 2019
- Full Text
- View/download PDF
31. Reproducible bioinformatics project: a community for reproducible bioinformatics analysis pipelines.
- Author
-
Kulkarni N, Alessandrì L, Panero R, Arigoni M, Olivero M, Ferrero G, Cordero F, Beccuti M, and Calogero RA
- Subjects
- Humans, MicroRNAs genetics, Reproducibility of Results, Software, User-Computer Interface, Workflow, Computational Biology methods
- Abstract
Background: Reproducibility of research is a key element of modern science and it is mandatory for any industrial application. It represents the ability to replicate an experiment independently of the location and the operator. Therefore, a study can be considered reproducible only if all used data are available and the exploited computational analysis workflow is clearly described. However, when reproducing a complex bioinformatics analysis today, the raw data and the list of tools used in the workflow may not be enough to guarantee the reproducibility of the results obtained. Indeed, different releases of the same tools and/or of the system libraries (exploited by such tools) might lead to sneaky reproducibility issues. Results: To address this challenge, we established the Reproducible Bioinformatics Project (RBP), which is a non-profit and open-source project, whose aim is to provide a schema and an infrastructure, based on Docker images and an R package, to provide reproducible results in Bioinformatics. One or more Docker images are then defined for a workflow (typically one for each task), while the workflow implementation is handled via R functions embedded in a package available in a GitHub repository. Thus, a bioinformatician participating in the project has firstly to integrate her/his workflow modules into Docker image(s), exploiting an Ubuntu Docker image developed ad hoc by RBP to make this task easier. Secondly, the workflow implementation must be realized in R according to an R-skeleton function made available by RBP to guarantee homogeneity and reusability among different RBP functions. Moreover, she/he has to provide the R vignette explaining the package functionality together with an example dataset which can be used to improve the user confidence in the workflow utilization. Conclusions: The Reproducible Bioinformatics Project provides a general schema and an infrastructure to distribute robust and reproducible workflows. Thus, it guarantees to final users the ability to repeat consistently any analysis independently of the UNIX-like architecture used.
- Published
- 2018
- Full Text
- View/download PDF
32. SeqBox: RNAseq/ChIPseq reproducible analysis on a consumer game computer.
- Author
-
Beccuti M, Cordero F, Arigoni M, Panero R, Amparore EG, Donatelli S, and Calogero RA
- Subjects
- Computational Biology methods, Reproducibility of Results, Chromatin Immunoprecipitation methods, Sequence Analysis, RNA methods, Software
- Abstract
Summary: Short-read sequencing technology has been used for more than a decade now. However, the analysis of RNAseq and ChIPseq data is still computationally demanding and simple access to raw data does not guarantee reproducibility of results between laboratories. To address these two aspects, we developed SeqBox, a cheap, efficient and reproducible RNAseq/ChIPseq hardware/software solution based on the NUC6I7KYK mini-PC (an Intel consumer game computer with a fast processor and a high performance SSD disk) and the Docker container platform. In SeqBox the analysis of RNAseq and ChIPseq data is supported by a friendly GUI. This gives scientists with or without scripting experience access to fast and reproducible analysis. Availability and Implementation: Docker container images, docker4seq package and the GUI are available at http://www.bioinformatica.unito.it/reproducibile.bioinformatics.html. Contact: beccuti@di.unito.it. Supplementary Information: Supplementary data are available at Bioinformatics online. (© The Author(s) 2017. Published by Oxford University Press.)
- Published
- 2018
- Full Text
- View/download PDF
33. Luminal breast cancer-specific circular RNAs uncovered by a novel tool for data analysis.
- Author
-
Coscujuela Tarrero L, Ferrero G, Miano V, De Intinis C, Ricci L, Arigoni M, Riccardo F, Annaratone L, Castellano I, Calogero RA, Beccuti M, Cordero F, and De Bortoli M
- Abstract
Circular RNAs are highly stable molecules present in all eukaryotes, generated by distinct transcript processing. We have exploited poly(A-) RNA-Seq data generated in our lab in MCF-7 breast cancer cells to define a compilation of exonic circRNAs more comprehensive than previously existing lists. Development of a novel computational tool, named CircHunter, allowed us to more accurately characterize circRNAs and to quantitatively evaluate their expression in publicly available RNA-Seq data from breast cancer cell lines and tumor tissues. We observed and confirmed, by ChIP analysis, that exons involved in circularization events display significantly higher levels of the histone post-translational modification H3K36me3 than non-circularizing exons. This result has potential impact on circRNA biogenesis since H3K36me3 has been implicated in alternative splicing mechanisms. By analyzing an Ago-HITS-CLIP dataset we also found that circularizing exons overlapped with an unexpectedly higher number of Ago binding sites than non-circularizing exons. Finally, we observed that a subset of MCF-7 circRNAs are specific to tumor versus normal tissue, while others can distinguish Luminal from other tumor subtypes, thus suggesting that circRNAs can be exploited as novel biomarkers and drug targets for breast cancer. Competing Interests: The authors declare that they have no competing interests.
- Published
- 2018
- Full Text
- View/download PDF
34. HashClone: a new tool to quantify the minimal residual disease in B-cell lymphoma from deep sequencing data.
- Author
-
Beccuti M, Genuardi E, Romano G, Monitillo L, Barbero D, Boccadoro M, Ladetto M, Calogero R, Ferrero S, and Cordero F
- Subjects
- Algorithms, Alleles, B-Lymphocytes pathology, Clone Cells, Humans, Reproducibility of Results, Lymphoma, B-Cell genetics, Neoplasm, Residual genetics
- Abstract
Background: Mantle Cell Lymphoma (MCL) is an aggressive B cell neoplasia accounting for about 6% of all lymphomas. The most common molecular marker of clonality in MCL, as in other B lymphoproliferative disorders, is the ImmunoGlobulin Heavy chain (IGH) rearrangement, occurring in B-lymphocytes. The patient-specific IGH rearrangement is extensively used to monitor the Minimal Residual Disease (MRD) after treatment through the standardized Allele-Specific Oligonucleotides Quantitative Polymerase Chain Reaction based technique. Recently, several studies have suggested that IGH monitoring through deep sequencing techniques can not only produce results comparable to Polymerase Chain Reaction-based methods, but might also surpass the classical technique in terms of feasibility and sensitivity. However, no standard bioinformatics tool is available at the moment for data analysis in this context. Results: In this paper we present HashClone, an easy-to-use and reliable bioinformatics tool that provides B-cell clonality assessment and MRD monitoring over time by analyzing data from Next-Generation Sequencing (NGS) techniques. The HashClone strategy is composed of three steps: the first and second steps implement an alignment-free prediction method that identifies a set of putative clones belonging to the repertoire of the patient under study. In the third step the IGH variable region, diversity region, and joining region identification is obtained by the alignment of rearrangements with respect to the international ImMunoGenetics information system database. Moreover, a graphical user interface for HashClone execution and clonality visualization over time facilitates the use of the tool and the interpretation of the results. The HashClone performance was tested on NGS data derived from MCL patients to assess the major B-cell clone in the diagnostic samples and to monitor the MRD in real and artificial follow-up samples. Conclusions: Our experiments show that in all the experimental settings, HashClone was able to correctly detect the major B-cell clones and to precisely follow them in several samples, showing better accuracy than the state-of-the-art tool.
- Published
- 2017
- Full Text
- View/download PDF
35. Dissecting the genomic activity of a transcriptional regulator by the integrative analysis of omics data.
- Author
-
Ferrero G, Miano V, Beccuti M, Balbo G, De Bortoli M, and Cordero F
- Subjects
- A549 Cells, Estrogen Receptor alpha genetics, Estrogen Receptor alpha metabolism, Forkhead Box Protein M1 genetics, Forkhead Box Protein M1 metabolism, Genomics statistics & numerical data, Humans, MCF-7 Cells, Neoplasms genetics, Neoplasms metabolism, Neoplasms pathology, Receptors, Glucocorticoid genetics, Receptors, Glucocorticoid metabolism, Chromatin Immunoprecipitation methods, Gene Expression Regulation, Neoplastic, Genomics methods, High-Throughput Nucleotide Sequencing methods
- Abstract
In the study of genomic regulation, strategies to integrate the data produced by Next Generation Sequencing (NGS)-based technologies in a meaningful ensemble are eagerly awaited and must continuously evolve. Here, we describe an integrative strategy for the analysis of data generated by chromatin immunoprecipitation followed by NGS which combines algorithms for data overlap, normalization and epigenetic state analysis. The performance of our strategy is illustrated by presenting the analysis of data relative to the transcriptional regulator Estrogen Receptor alpha (ERα) in MCF-7 breast cancer cells and of Glucocorticoid Receptor (GR) in A549 lung cancer cells. We went through the definition of reference cistromes for different experimental contexts, the integration of data relative to co-regulators and the overlay of chromatin states as defined by epigenetic marks in MCF-7 cells. With our strategy, we identified novel features of estrogen-independent ERα activity, including FoxM1 interaction, eRNAs transcription and a peculiar ontology of connected genes.
- Published
- 2017
- Full Text
- View/download PDF
36. Peculiar Genes Selection: A new features selection method to improve classification performances in imbalanced data sets.
- Author
-
Martina F, Beccuti M, Balbo G, and Cordero F
- Subjects
- Aged, Female, Gene Expression Profiling, Gene Ontology, Humans, Middle Aged, Neoplasms genetics, Reproducibility of Results, Transcription Factors metabolism, Vaccination, Algorithms, Computational Biology methods, Databases as Topic, Genes
- Abstract
High-throughput technologies provide genomic and transcriptomic data that are suitable for biomarker detection for classification purposes. However, the high dimension of the output of such technologies and the characteristics of the data sets analysed represent an issue for the classification task. Here we present a new feature selection method based on three steps to detect class-specific biomarkers in case of high-dimensional data sets. The first step detects the differentially expressed genes according to the experimental conditions tested in the experimental design, the second step filters out the features with low discriminative power, and the third step detects the class-specific features and defines the final biomarker as the union of the class-specific features. The proposed procedure is tested on two microarray datasets, one characterized by a strong imbalance between the size of classes and the other one where the size of classes is perfectly balanced. We show that, using the proposed feature selection procedure, the classification performance of a Support Vector Machine on the imbalanced data set reaches 82%, whereas other methods do not exceed 73%. Furthermore, in the case of the perfectly balanced dataset, the classification performance is comparable with other methods. Finally, the Gene Ontology enrichments performed on the signatures selected with the proposed pipeline confirm the biological relevance of our methodology. The package implementing Peculiar Genes Selection, 'PGS', is available for R users at: http://github.com/mbeccuti/PGS.
- Published
- 2017
- Full Text
- View/download PDF
37. Sequencing of 15 622 gene-bearing BACs clarifies the gene-dense regions of the barley genome.
- Author
-
Muñoz-Amatriaín M, Lonardi S, Luo M, Madishetty K, Svensson JT, Moscou MJ, Wanamaker S, Jiang T, Kleinhofs A, Muehlbauer GJ, Wise RP, Stein N, Ma Y, Rodriguez E, Kudrna D, Bhat PR, Chao S, Condamine P, Heinen S, Resnik J, Wing R, Witt HN, Alpert M, Beccuti M, Bozdag S, Cordero F, Mirebrahim H, Ounit R, Wu Y, You F, Zheng J, Simková H, Dolezel J, Grimwood J, Schmutz J, Duma D, Altschmied L, Blake T, Bregitzer P, Cooper L, Dilbirligi M, Falk A, Feiz L, Graner A, Gustafson P, Hayes PM, Lemaux P, Mammadov J, and Close TJ
- Subjects
- Molecular Sequence Data, Chromosomes, Artificial, Bacterial genetics, Genome, Plant genetics, Hordeum genetics
- Abstract
Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, because only 6278 bacterial artificial chromosomes (BACs) in the physical map were sequenced, fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15 622 BACs representing the minimal tiling path of 72 052 physical-mapped gene-bearing BACs. This generated ~1.7 Gb of genomic sequence containing an estimated 2/3 of all Morex barley genes. Exploration of these sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high recombination rates, there are also gene-dense regions with suppressed recombination. We made use of published map-anchored sequence data from Aegilops tauschii to develop a synteny viewer between barley and the ancestor of the wheat D-genome. Except for some notable inversions, there is a high level of collinearity between the two species. The software HarvEST:Barley provides facile access to BAC sequences and their annotations, along with the barley-Ae. tauschii synteny viewer. These BAC sequences constitute a resource to improve the efficiency of marker development, map-based cloning, and comparative genomics in barley and related crops. Additional knowledge about regions of the barley genome that are gene-dense but low in recombination is particularly relevant. (© 2015 The Authors The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.)
- Published
- 2015
- Full Text
- View/download PDF
38. The molecular landscape of colorectal cancer cell lines unveils clinically actionable kinase targets.
- Author
-
Medico E, Russo M, Picco G, Cancelliere C, Valtorta E, Corti G, Buscarino M, Isella C, Lamba S, Martinoglio B, Veronese S, Siena S, Sartore-Bianchi A, Beccuti M, Mottolese M, Linnebacher M, Cordero F, Di Nicolantonio F, and Bardelli A
- Subjects
- Anaplastic Lymphoma Kinase, Cell Line, Tumor, Cetuximab, Colorectal Neoplasms genetics, Genes, erbB-1, Genetic Heterogeneity, Humans, Molecular Targeted Therapy, Proto-Oncogene Proteins c-ret metabolism, Receptor Protein-Tyrosine Kinases genetics, Receptor, Fibroblast Growth Factor, Type 2 metabolism, Colorectal Neoplasms enzymology, ErbB Receptors antagonists & inhibitors, Receptor Protein-Tyrosine Kinases metabolism
- Abstract
The development of molecularly targeted anticancer agents relies on large panels of tumour-specific preclinical models closely recapitulating the molecular heterogeneity observed in patients. Here we describe the mutational and gene expression analyses of 151 colorectal cancer (CRC) cell lines. We find that the whole spectrum of CRC molecular and transcriptional subtypes, previously defined in patients, is represented in this cell line compendium. Transcriptional outlier analysis identifies RAS/BRAF wild-type cells, resistant to EGFR blockade, functionally and pharmacologically addicted to kinase genes including ALK, FGFR2, NTRK1/2 and RET. The same genes are present as expression outliers in CRC patient samples. Genomic rearrangements (translocations) involving the ALK and NTRK1 genes are associated with the overexpression of the corresponding proteins in CRC specimens. The approach described here can be used to pinpoint CRCs with exquisite dependencies to individual kinases for which clinically approved drugs are already available.
- Published
- 2015
- Full Text
- View/download PDF
39. Alternative splicing detection workflow needs a careful combination of sample prep and bioinformatics analysis.
- Author
-
Carrara M, Lum J, Cordero F, Beccuti M, Poidinger M, Donatelli S, Calogero RA, and Zolezzi F
- Subjects
- Exons genetics, Humans, RNA genetics, RNA, Ribosomal genetics, RNA, Ribosomal metabolism, Workflow, Alternative Splicing genetics, Computational Biology methods, Gene Library, Sequence Analysis, RNA methods
- Abstract
Background: RNA-Seq provides remarkable power in the area of biomarker discovery and disease characterization. Two crucial steps that affect RNA-Seq experiment results are Library Sample Preparation (LSP) and Bioinformatics Analysis (BA). This work describes an evaluation of the combined effect of LSP methods and BA tools in the detection of splice variants. Results: Different LSPs (TruSeq unstranded/stranded, ScriptSeq, NuGEN) allowed the detection of a large common set of splice variants. However, each LSP also detected a small set of unique transcripts that are characterized by a low coverage and/or FPKM. This effect was particularly evident using the low input RNA NuGEN v2 protocol. A benchmark dataset, in which synthetic reads as well as reads generated from standard (Illumina TruSeq 100) and low input (NuGEN) LSPs were spiked in, was used to evaluate the effect of LSP on the statistical detection of alternative splicing events (AltDE). Statistical detection of AltDE was done using Cuffdiff2 and RSEM-EBSeq as prototypes for splice variant-quantification, and DEXSeq as the prototype for exon-level analysis. Exon-level analysis performed slightly better than splice variant-quantification approaches, although at most only 50% of the spiked-in transcripts were detected. The performances of both splice variant-quantification and exon-level analysis improved when raising the number of input reads. Conclusion: Data derived from NuGEN v2 were not the ideal input for AltDE, especially when the exon-level approach was used. We observed that both splice variant-quantification and exon-level analysis performances were strongly dependent on the number of input reads. Moreover, the ribosomal RNA depletion protocol was less sensitive in detecting splicing variants, possibly due to the significant percentage of the reads mapping to non-coding transcripts.
- Published
- 2015
- Full Text
- View/download PDF
40. A versatile mathematical work-flow to explore how Cancer Stem Cell fate influences tumor progression.
- Author
-
Fornari C, Balbo G, Halawani SM, Ba-Rukab O, Ahmad AR, Calogero RA, Cordero F, and Beccuti M
- Subjects
- Animals, Apoptosis physiology, Cell Proliferation, Humans, Neoplastic Stem Cells cytology, Carcinogenesis pathology, Models, Biological, Neoplastic Stem Cells pathology
- Abstract
Background: Nowadays multidisciplinary approaches combining mathematical models with experimental assays are becoming relevant for the study of biological systems. Indeed, in cancer research multidisciplinary approaches are successfully used to understand the crucial aspects implicated in tumor growth. In particular, the Cancer Stem Cell (CSC) biology represents an area particularly suited to be studied through multidisciplinary approaches, and modeling has significantly contributed to pinpoint the crucial aspects implicated in this theory. More generally, to acquire new insights on a biological system it is necessary to have an accurate description of the phenomenon, such that making accurate predictions on its future behaviors becomes more likely. In this context, the identification of the parameters influencing model dynamics can be advantageous to increase model accuracy and to provide hints in designing wet experiments. Different techniques, ranging from statistical methods to analytical studies, have been developed. Their applications depend on case-specific aspects, such as the availability and quality of experimental data, and the dimension of the parameter space. Results: The study of a new model on the CSC-based tumor progression has been the motivation to design a new work-flow that helps to characterize possible system dynamics and to identify those parameters influencing such behaviors. In detail, we extended our recent model on CSC-dynamics creating a new system capable of describing tumor growth during the different stages of cancer progression. Indeed, tumor cells appear to progress through lineage stages like those of normal tissues, with their division auto-regulated by internal feedback mechanisms. These new features have introduced some non-linearities in the model, making it more difficult to study by analytical techniques alone. Our new work-flow, based on statistical methods, was used to identify the parameters which influence the tumor growth. The effectiveness of the presented work-flow was first verified on two well-known models and then applied to investigate our extended CSC model. Conclusions: We propose a new work-flow to study complex systems in a practical and informative way, allowing an easy identification, interpretation, and visualization of the key model parameters. Our methodology is useful to investigate possible model behaviors and to establish factors driving model dynamics. Analyzing our new CSC model guided by the proposed work-flow, we found that the deregulation of CSC asymmetric proliferation contributes to cancer initiation, in accordance with several lines of experimental evidence. Specifically, model results indicated that the probability of CSC symmetric proliferation is responsible for a switch-like behavior which discriminates between tumorigenesis and unsustainable tumor growth.
- Published
- 2015
- Full Text
- View/download PDF
41. Chimera: a Bioconductor package for secondary analysis of fusion products.
- Author
-
Beccuti M, Carrara M, Cordero F, Lazzarato F, Donatelli S, Nadalin F, Policriti A, and Calogero RA
- Subjects
- Animals, Molecular Sequence Annotation, Gene Fusion, Software
- Abstract
Summary: Chimera is a Bioconductor package that organizes, annotates, analyses and validates fusions reported by different fusion detection tools; the current implementation can deal with output from bellerophontes, chimeraScan, deFuse, fusionCatcher, FusionFinder, FusionHunter, FusionMap, mapSplice, Rsubread, tophat-fusion and STAR. The core of Chimera is a fusion data structure that can store fusion events detected with any of the aforementioned tools. Fusions are then easily manipulated with standard R functions or through the set of functionalities specifically developed in Chimera with the aim of supporting the user in managing fusions and discriminating false-positive results. (© The Author 2014. Published by Oxford University Press.)
- Published
- 2014
- Full Text
- View/download PDF
42. A mathematical-biological joint effort to investigate the tumor-initiating ability of Cancer Stem Cells.
- Author
-
Fornari C, Beccuti M, Lanzardo S, Conti L, Balbo G, Cavallo F, Calogero RA, and Cordero F
- Subjects
- Animals, Biomarkers metabolism, Breast Neoplasms genetics, Breast Neoplasms metabolism, CD24 Antigen genetics, Carcinogenesis genetics, Carcinogenesis metabolism, Carcinoma genetics, Carcinoma metabolism, Cell Line, Tumor, Disease Progression, Female, Gene Expression, Humans, Hyaluronan Receptors genetics, Mice, Mice, Inbred BALB C, Neoplasm Transplantation, Neoplastic Stem Cells metabolism, Spheroids, Cellular metabolism, Spheroids, Cellular pathology, Transplantation, Heterotopic, Breast Neoplasms pathology, Carcinogenesis pathology, Carcinoma pathology, Models, Statistical, Neoplastic Stem Cells pathology, Receptor, ErbB-2 genetics
- Abstract
The involvement of Cancer Stem Cells (CSCs) in tumor progression and tumor recurrence is one of the most studied subjects in current cancer research. The CSC hypothesis states that cancer cell populations are characterized by a hierarchical structure that affects cancer progression. Due to the complex dynamics involving CSCs and the other cancer cell subpopulations, a robust theory explaining their action has not been established yet. Some indications can be obtained by combining mathematical modeling and experimental data to understand tumor dynamics and to generate new experimental hypotheses. Here, we present a model describing the initial phase of ErbB2+ mammary cancer progression, which arises from a joint effort combining mathematical modeling and cancer biology. The proposed model represents a new approach to investigate the CSC-driven tumorigenesis and to analyze the relations among crucial events involving cancer cell subpopulations. Using in vivo and in vitro data we tuned the model to reproduce the initial dynamics of cancer growth, and we used its solution to characterize observed cancer progression with respect to mutual CSC and progenitor cell variation. The model was also used to investigate which association occurs among cell phenotypes when specific cell markers are considered. Finally, we found various correlations among model parameters which cannot be directly inferred from the available biological data, and these dependencies were used to characterize the dynamics of cancer subpopulations during the initial phase of ErbB2+ mammary cancer progression.
- Published
- 2014
- Full Text
- View/download PDF
43. Combinatorial pooling enables selective sequencing of the barley gene space.
- Author
-
Lonardi S, Duma D, Alpert M, Cordero F, Beccuti M, Bhat PR, Wu Y, Ciardo G, Alsaihati B, Ma Y, Wanamaker S, Resnik J, Bozdag S, Luo MC, and Close TJ
- Subjects
- Chromosomes, Artificial, Bacterial, Cloning, Molecular, Computational Biology methods, Computer Simulation, Genes, Plant, Genetic Markers genetics, Genomic Library, Genomics, Models, Genetic, Oryza genetics, Physical Chromosome Mapping, Species Specificity, Contig Mapping methods, Hordeum genetics, Sequence Analysis, DNA
- Abstract
For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as, in this case, gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
- Published
- 2013
- Full Text
- View/download PDF
44. State of art fusion-finder algorithms are suitable to detect transcription-induced chimeras in normal tissues?
- Author
-
Carrara M, Beccuti M, Cavallo F, Donatelli S, Lazzarato F, Cordero F, and Calogero RA
- Subjects
- Animals, Humans, Sequence Analysis, RNA methods, Algorithms, Gene Fusion, Software, Transcription, Genetic
- Abstract
Background: RNA-seq has the potential to discover genes created by chromosomal rearrangements. Fusion genes, also known as "chimeras", are formed by the breakage and re-joining of two different chromosomes. It is known that chimeras have been implicated in the development of cancer. A few past publications showed the presence of fusion events also in normal tissue, but with very limited overlap between their results. More recently, two fusion genes in normal tissues were detected using both RNA-seq and protein data. Due to heterogeneous results in identifying chimeras in normal tissue, we decided to evaluate the efficacy of state-of-the-art fusion finders in detecting chimeras in RNA-seq data from normal tissues. Results: We compared the performance of six fusion-finder tools: FusionHunter, FusionMap, FusionFinder, MapSplice, deFuse and TopHat-fusion. To evaluate the sensitivity we used a synthetic dataset of fusion-products, called positive dataset; in these experiments FusionMap, FusionFinder, MapSplice, and TopHat-fusion are able to detect more than 78% of fusion genes. All tools were error-prone with high variability among the tools, identifying some fusion genes not present in the synthetic dataset. To better investigate the false discovery chimera detection rate, synthetic datasets free of fusion-products, called negative datasets, were used. The negative datasets have different read lengths and quality scores, which allow detecting dependency of the tools on both these features. FusionMap, FusionFinder, MapSplice, deFuse and TopHat-fusion were error-prone. Only FusionHunter results were free of false positives. FusionMap gave the best compromise in terms of specificity in the negative dataset and of sensitivity in the positive dataset. Conclusions: We have observed a dependency of the tools on read length, quality score and on the number of reads supporting each chimera. Thus, it is important to carefully select the software on the basis of the structure of the RNA-seq data under analysis. Furthermore, the sensitivity of chimera detection tools does not seem to be sufficient to provide results consistent with those obtained in normal tissues on the basis of fusion events extracted from published data.
- Published
- 2013
- Full Text
- View/download PDF
45. State-of-the-art fusion-finder algorithms sensitivity and specificity.
- Author
-
Carrara M, Beccuti M, Lazzarato F, Cavallo F, Cordero F, Donatelli S, and Calogero RA
- Subjects
- Chimera genetics, Humans, Neoplasms pathology, Sequence Analysis, RNA, Software, Gene Fusion, Neoplasms genetics, Oncogene Proteins, Fusion genetics, Translocation, Genetic genetics
- Abstract
Background: Gene fusions arising from chromosomal translocations have been implicated in cancer. RNA-seq has the potential to discover such rearrangements generating functional proteins (chimera/fusion). Recently, many methods for chimera detection have been published. However, specificity and sensitivity of those tools were not extensively investigated in a comparative way. Results: We tested eight fusion-detection tools (FusionHunter, FusionMap, FusionFinder, MapSplice, deFuse, Bellerophontes, ChimeraScan, and TopHat-fusion) to detect fusion events using synthetic and real datasets encompassing chimeras. A comparison run only on synthetic data could generate misleading results, since we found no counterpart in the real datasets. Furthermore, most tools report a very high number of false positive chimeras. In particular, the most sensitive tool, ChimeraScan, reports a large number of false positives that we were able to significantly reduce by devising and applying two filters to remove fusions not supported by fusion junction-spanning reads or encompassing large intronic regions. Conclusions: The discordant results obtained using synthetic and real datasets suggest that synthetic datasets encompassing fusion events may not fully capture the complexity of RNA-seq experiments. Moreover, fusion detection tools are still limited in sensitivity or specificity; thus, there is room for further improvement in the fusion-finder algorithms.
- Published
- 2013
- Full Text
- View/download PDF
46. Multi-level model for the investigation of oncoantigen-driven vaccination effect.
- Author
-
Cordero F, Beccuti M, Fornari C, Lanzardo S, Conti L, Cavallo F, Balbo G, and Calogero R
- Subjects
- Animals, Breast Neoplasms pathology, Cancer Vaccines immunology, Humans, Mice, Neoplasms immunology, Neoplasms metabolism, Neoplastic Stem Cells pathology, Receptor, ErbB-2, Cancer Vaccines therapeutic use, Models, Biological, Neoplasms pathology, Neoplasms therapy
- Abstract
Background: Cancer stem cell theory suggests that cancers are derived from a population of cells named Cancer Stem Cells (CSCs) that are involved in the growth and in the progression of tumors, and lead to a hierarchical structure characterized by differentiated cell populations. This cell heterogeneity affects the choice of cancer therapies, since many current cancer treatments have limited or no impact at all on the CSC population, while they reveal a positive effect on the differentiated cell populations. Results: In this paper we investigated the effect of vaccination on a cancer hierarchical structure through a multi-level model representing both population and molecular aspects. The population level is modeled by a system of Ordinary Differential Equations (ODEs) describing the cancer population's dynamics. The molecular level is modeled using the Petri Net (PN) formalism to detail part of the proliferation pathway. Moreover, we propose a new methodology which exploits the temporal behavior derived from the molecular level to parameterize the ODE system modeling populations. Using this multi-level model we studied the ErbB2-driven vaccination effect in breast cancer. Conclusions: We propose a multi-level model that describes the inter-dependencies between population and genetic levels, and that can be efficiently used to estimate the efficacy of drug and vaccine therapies in cancer models, given the availability of molecular data on the cancer driving force.
- Published
- 2013
- Full Text
- View/download PDF
47. Optimizing a massive parallel sequencing workflow for quantitative miRNA expression analysis.
- Author
-
Cordero F, Beccuti M, Arigoni M, Donatelli S, and Calogero RA
- Subjects
- Algorithms, Databases, Genetic, Gene Expression Regulation, Genome, Human genetics, Humans, MicroRNAs metabolism, ROC Curve, Reference Standards, Sample Size, Sequence Alignment, Software, Gene Expression Profiling, High-Throughput Nucleotide Sequencing methods, MicroRNAs genetics, Workflow
- Abstract
Background: Massive Parallel Sequencing methods (MPS) can extend and improve the knowledge obtained by conventional microarray technology, both for mRNAs and short non-coding RNAs, e.g. miRNAs. The processing methods used to extract and interpret the information are an important aspect of dealing with the vast amounts of data generated from short read sequencing. Although the number of computational tools for MPS data analysis is constantly growing, their strengths and weaknesses as part of a complex analytical pipeline have not yet been well investigated. Primary Findings: A benchmark MPS miRNA dataset, resembling a situation in which miRNAs are spiked in biological replication experiments, was assembled by merging a publicly available MPS spike-in miRNA data set with MPS data derived from healthy donor peripheral blood mononuclear cells. Using this data set we observed that short read counts are strongly underestimated in the case of duplicated miRNAs if the whole genome is used as reference. Furthermore, the sensitivity of miRNA detection is strongly dependent on the primary tool used in the analysis. Among the six aligners tested, specifically devoted to miRNA detection, SHRiMP and MicroRazerS show the highest sensitivity. Differential expression estimation is quite efficient. Among the five tools investigated, two (DESeq, baySeq) show very good specificity and sensitivity in the detection of differential expression. Conclusions: The results provided by our analysis allow the definition of a clear and simple optimized analytical workflow for digital quantitative analysis of miRNAs.
- Published
- 2012
- Full Text
- View/download PDF