21 results on '"Italia De Feis"'
Search Results
2. Wavelet-based robust estimation and variable selection in nonparametric additive models
- Author
-
Umberto Amato, Anestis Antoniadis, Italia De Feis, and Irène Gijbels
- Subjects
Wavelet thresholding, Statistics and Probability, Variable selection, M-estimation, Nonconvex penalties, Contamination, Additive regression, Regression, Theoretical Computer Science, Computational Theory and Mathematics, Statistics, Probability and Uncertainty, Mathematics - Abstract
This article studies M-type estimators for fitting robust additive models in the presence of anomalous data. The components in the additive model are allowed to have different degrees of smoothness. We introduce a new class of wavelet-based robust M-type estimators for performing simultaneous additive component estimation and variable selection in such inhomogeneous additive models. Each additive component is approximated by a truncated series expansion of wavelet bases, making it feasible to apply the method to nonequispaced data and sample sizes that are not necessarily a power of 2. Sparsity of the additive components, together with sparsity of the wavelet coefficients within each component (group), results in a bi-level group variable selection problem. In this framework, we discuss robust estimation and variable selection. A two-stage computational algorithm, consisting of a fast accelerated proximal gradient algorithm of coordinate descent type and thresholding, is proposed. When using nonconvex redescending loss functions and appropriate nonconvex penalty functions at the group level, we establish optimal convergence rates of the estimates. We prove variable selection consistency under a weak compatibility condition for sparse additive models. The theoretical results are complemented with some simulations and real data analysis, as well as a comparison to other existing methods.
- Published
- 2021
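The robust estimation-plus-group-selection idea in the entry above can be sketched in a much simplified form. The code below is a minimal illustration, not the authors' algorithm: it pairs a Huber loss (a convex stand-in for the redescending losses the paper uses) with a group-level soft-thresholding prox, and it omits the wavelet construction and the within-group (bi-level) thresholding. All parameter values are illustrative.

```python
import numpy as np

def huber_grad(r, c=1.345):
    """Derivative of the Huber loss, applied elementwise to residuals."""
    return np.where(np.abs(r) <= c, r, c * np.sign(r))

def group_soft_threshold(b, t):
    """Proximal operator of t * ||b||_2 (zeroes the whole group when its norm is small)."""
    norm = np.linalg.norm(b)
    if norm <= t:
        return np.zeros_like(b)
    return (1.0 - t / norm) * b

def robust_group_fit(X, y, groups, lam=0.2, n_iter=500):
    """Proximal gradient descent on sum_i huber(y_i - x_i'b) + lam*n*sum_g ||b_g||_2.

    groups: list of column-index arrays, one per additive component's basis columns.
    """
    n, p = X.shape
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1/L for the smooth Huber part
    b = np.zeros(p)
    for _ in range(n_iter):
        b = b + step * (X.T @ huber_grad(y - X @ b))  # gradient step on the loss
        for g in groups:                               # groupwise prox step
            b[g] = group_soft_threshold(b[g], step * lam * n)
    return b
```

With an orthonormal wavelet design per component, the prox step would act on that component's wavelet coefficients exactly as here; null groups are set to exact zero, which is the variable-selection effect.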
3. Simultaneous nonparametric regression in RADWT dictionaries
- Author
-
Italia De Feis and Daniela De Canditiis
- Subjects
Statistics and Probability, Computer science, Grouped LASSO, Applied Mathematics, Wavelet transform, Estimator, Regression analysis, Resonance, Nonparametric regression, Methodology (stat.ME), Computational Mathematics, MSC 62G08, 62G20, 62H12, Lasso, Multichannel, RADWT, Algorithm, Statistics - Methodology - Abstract
A new technique for nonparametric regression of multichannel signals is presented. The technique is based on the use of the Rational-Dilation Wavelet Transform (RADWT), equipped with a tunable Q-factor able to provide sparse representations of functions with different oscillation persistence. In particular, two different frames are obtained by two RADWTs with different Q-factors that give sparse representations of functions with low and high resonance. It is assumed that the signals are measured simultaneously on several independent channels and that they share the low resonance component and the spectral characteristics of the high resonance component. A regression analysis is then performed by means of the grouped lasso penalty. Furthermore, a result of asymptotic optimality of the estimator is presented under reasonable assumptions, exploiting recent results on group-lasso-like procedures. Numerical experiments show the performance of the proposed method in different synthetic scenarios as well as in a real case example for the analysis and joint detection of sleep spindles and K-complex events in multiple electroencephalogram (EEG) signals.
- Published
- 2019
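The grouped-lasso penalty in the entry above couples, for each dictionary atom, the coefficients of all channels, so an atom is kept or discarded jointly across channels. For an orthonormal transform this reduces to one joint shrinkage step on the analysis coefficients; the sketch below shows only that core rule (the RADWT is an overcomplete frame, so the actual estimator needs an iterative algorithm on top of this prox; names and shapes are illustrative).

```python
import numpy as np

def multichannel_group_threshold(C, lam):
    """Grouped-lasso prox: C holds one row per dictionary atom and one column
    per channel; each atom's coefficients are shrunk (or killed) jointly."""
    norms = np.linalg.norm(C, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * C
```

An atom whose joint (across-channel) norm falls below `lam` is removed from every channel at once, which is how the shared sparsity pattern between channels is enforced.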
4. Interactome mapping defines BRG1, a component of the SWI/SNF chromatin remodeling complex, as a new partner of the transcriptional regulator CTCF
- Author
-
Sabrina Esposito, Camilla Rega, Claudia Angelini, Maria Teresa Gentile, Tioajiang Xiao, Maria Michela Marino, Italia De Feis, Rosita Russo, Gary Felsenfeld, Ilaria Baglivo, Angela Chambery, Mariangela Valletta, and Paolo V. Pedone
- Subjects
CCCTC-Binding Factor, Genomics and Proteomics, Computational biology, Insulator (genetics), Biology, Biochemistry, Interactome, Chromatin remodeling, protein-protein interaction, ChIP-Seq, BRG1, Cell Line, Tumor, Interactomics, Humans, mass spectrometry (MS), transcriptional regulation, Enhancer, Molecular Biology, proteomics, transcription factor, Zinc finger, Protein interaction, DNA Helicases, Nuclear Proteins, Cell Biology, CTCF, Chromatin Assembly and Disassembly, SWI/SNF, Chromatin, Multiprotein Complexes, Transcription Factors - Abstract
The highly conserved zinc finger CCCTC-binding factor (CTCF) regulates genomic imprinting and gene expression by acting as a transcriptional activator or repressor of promoters and insulator of enhancers. The multiple functions of CTCF are accomplished by co-association with other protein partners and are dependent on genomic context and tissue specificity. Despite the critical role of CTCF in the organization of genome structure, to date, only a subset of CTCF interaction partners have been identified. Here we present a large-scale identification of CTCF binding partners using affinity purification and high-resolution LC-MS/MS analysis. In addition to functional enrichment of specific protein families such as the ribosomal proteins and the DEAD box helicases, we identified novel high-confidence CTCF interactors that provide a still unexplored biochemical context for CTCF's multiple functions. One of the newly validated CTCF interactors is BRG1, the major ATPase subunit of the chromatin remodeling complex SWI/SNF, establishing a relationship between two master regulators of genome organization. This work significantly expands the current knowledge of the human CTCF interactome and represents an important resource to direct future studies aimed at uncovering molecular mechanisms modulating CTCF pleiotropic functions throughout the genome.
- Published
- 2019
5. Low and High Resonance Components Restoration in Multichannel Data
- Author
-
Daniela De Canditiis and Italia De Feis
- Subjects
Computer science, Component, Frame, Wavelet transform, Sparse approximation, Backfitting algorithm, Signal, Algorithm, Resonance, Communication channel - Abstract
A technique for the restoration of the low resonance and high resonance components of K independently measured signals is presented. The definition of the low and high resonance components is given by the Rational-Dilation Wavelet Transform (RADWT), a particular kind of finite frame that provides sparse representations of functions with different oscillation persistence. It is assumed that the signals are measured simultaneously on several independent channels and that in each channel the underlying signal is the sum of two components, the low resonance component and the high resonance component, both sharing some common characteristics between the channels. Component restoration is performed by means of a lasso-type penalty and a backfitting algorithm. Numerical experiments show the performance of the proposed method in different synthetic scenarios, highlighting the advantage of estimating the two components separately rather than together.
- Published
- 2020
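The "estimate the two components separately" idea can be illustrated with a toy backfitting loop. This is a hedged sketch, not the paper's RADWT-based method: an orthonormal DCT basis stands in for the high-resonance (sustained-oscillation) frame and the canonical basis for the low-resonance (transient, spiky) part; thresholds and the transform choice are illustrative.

```python
import numpy as np

def soft(x, t):
    """Elementwise soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def backfit_two_components(y, D, lam_smooth, lam_spike, n_iter=30):
    """Backfitting with two sparsifying systems: an orthonormal basis D
    (rows = atoms) for the oscillatory component and the canonical basis
    for the transient component. Each pass fits one component to the
    partial residual by thresholding in its own representation."""
    x_smooth = np.zeros_like(y)
    x_spike = np.zeros_like(y)
    for _ in range(n_iter):
        x_smooth = D.T @ soft(D @ (y - x_spike), lam_smooth)
        x_spike = soft(y - x_smooth, lam_spike)
    return x_smooth, x_spike
```

Because each component is thresholded in the system where it is sparse, the oscillatory part does not leak into the spike estimate and vice versa, which is the advantage of separate restoration noted in the abstract.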
6. The many facets of mathematics for the development of artificial intelligence in the 'AIM (Artificial Intelligence and Mathematics) - Fundamentals and beyond' seminars. The seminar series of the Istituto per le Applicazioni del Calcolo (IAC)
- Author
-
Italia De Feis, Stefania Giuffrida, and Flavio Lombardi
- Published
- 2022
7. An optimal interpolation scheme for surface and atmospheric parameters: applications to SEVIRI and IASI
- Author
-
Carmine Serio, Italia De Feis, and Guido Masiello
- Subjects
Spectroradiometer, Data assimilation, geostationary and polar satellites, Temporal resolution, Emissivity, Environmental science, 2D optimal interpolation, infrared radiances, Kalman filter, Satellite, Infrared Atmospheric Sounding Interferometer, Interpolation, Remote sensing - Abstract
In this paper, we present a 2-Dimensional (2D) Optimal Interpolation (OI) technique for spatially scattered infrared satellite observations, from which level 2 products have been obtained, in order to yield level 3, regularly gridded, data. The scheme derives from a Bayesian predictor-corrector scheme used in data assimilation and is based on Kalman filter estimation. It has been applied to 15-minute temporal resolution Spinning Enhanced Visible and Infrared Imager (SEVIRI) emissivity and temperature products and to Infrared Atmospheric Sounding Interferometer (IASI) retrievals of atmospheric ammonia (NH3), a gas affecting air quality. Results are exemplified for target areas over Italy. In particular, temperature retrievals have been compared with gridded data from MODIS (Moderate-resolution Imaging Spectroradiometer) observations. Our findings show that the proposed strategy is quite effective at filling gaps caused by data voids due, e.g., to clouds; it captures the daily cycle of surface parameters more efficiently and provides valuable information on NH3 concentration and variability in regions not yet covered by ground-based instruments.
- Published
- 2019
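The predictor-corrector scheme mentioned in the abstract can be sketched per grid cell. This is an illustration only, assuming a persistence forecast and independent cells (the real scheme propagates spatial covariances); parameter names and values are illustrative.

```python
import numpy as np

def kalman_grid_step(x, P, obs, obs_var, q):
    """One predict-correct step per grid cell for building gridded (L3) maps.
    Predict: persistence forecast with process-noise inflation q.
    Correct: scalar Kalman update wherever a level-2 retrieval exists;
    obs is NaN over data voids (e.g. cloudy cells), which are left untouched."""
    P = P + q                                # predict: random-walk variance inflation
    seen = ~np.isnan(obs)
    K = np.zeros_like(x)
    K[seen] = P[seen] / (P[seen] + obs_var)  # Kalman gain where data exist
    innov = np.where(seen, obs - x, 0.0)
    x = x + K * innov                        # correct toward the observation
    P = np.where(seen, (1.0 - K) * P, P)     # variance shrinks only where updated
    return x, P
```

Cells that are never observed simply accumulate forecast variance, so the output map carries its own uncertainty estimate alongside the gap-filled values.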
8. Additive model selection
- Author
-
Umberto Amato, Italia De Feis, and Anestis Antoniadis
- Subjects
Statistics and Probability, Mathematical optimization, Additive models, Dimension reduction, Penalization, Hypothesis test, Backfitting, Estimator, Feature selection, High dimensional, Oracle, Statistics, Probability and Uncertainty, Additive model, Statistical hypothesis testing, Mathematics - Abstract
We study sparse high-dimensional additive model fitting via penalization with sparsity-smoothness penalties. We review several existing algorithms that have been developed for this problem in the recent literature, highlighting the connections between them, and present some computationally efficient algorithms for fitting such models. Furthermore, using reasonable assumptions and exploiting recent results on group-LASSO-like procedures, we take advantage of several oracle results which yield asymptotic optimality of estimators for high-dimensional but sparse additive models. Finally, variable selection procedures are compared with some high-dimensional testing procedures available in the literature for testing the presence of additive components.
- Published
- 2016
9. Cancer Markers Selection Using Network-Based Cox Regression: A Methodological and Computational Practice
- Author
-
Antonella Iuliano, Italia De Feis, Pietro Liò, Annalisa Occhipinti, and Claudia Angelini
- Subjects
Computer science, Physiology, high-dimensionality, survival, Correlation, Permutation, Methods, cancer, Proportional hazards model, regularization, Identification, Cox model, network, gene expression, Cancer gene, Cancer biomarkers, Data mining - Abstract
International initiatives such as the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) are collecting multiple datasets at different genome-scales with the aim of identifying novel cancer biomarkers and predicting survival of patients. To analyze such data, several statistical methods have been applied, among them Cox regression models. Although these models provide a good statistical framework to analyze omic data, there is still a lack of studies that illustrate advantages and drawbacks in integrating biological information and selecting groups of biomarkers. In fact, classical Cox regression algorithms focus on the selection of a single biomarker, without taking into account the strong correlation between genes. Even though network-based Cox regression algorithms overcome such drawbacks, these approaches are less widely used within the life science community. In this article, we aim to provide a clear methodological framework on the use of such approaches in order to turn cancer research results into clinical applications. Therefore, we first discuss the rationale and the practical usage of three recently proposed network-based Cox regression algorithms (i.e., Net-Cox, AdaLnet, and fastcox). Then, we show how to combine existing biological knowledge and available data with such algorithms to identify networks of cancer biomarkers and to estimate survival of patients. Finally, we describe in detail a new permutation-based approach to better validate the significance of the selection in terms of cancer gene signatures and pathway/network identification. We illustrate the proposed methodology by means of both simulations and real case studies. Overall, the aim of our work is two-fold. Firstly, to show how network-based Cox regression models can be used to integrate biological knowledge (e.g., multi-omics data) for the analysis of survival data.
Secondly, to provide a clear methodological and computational approach for investigating cancer regulatory networks.
- Published
- 2016
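The network-penalized Cox models surveyed in this entry share a common structure: a Cox partial likelihood plus a quadratic penalty built on a graph Laplacian, which encourages connected genes to receive similar coefficients. A minimal sketch by plain gradient descent follows; it is not any of Net-Cox, AdaLnet, or fastcox (whose algorithms differ), and the step size, iteration count, and penalty weight are illustrative.

```python
import numpy as np

def network_cox_fit(X, time, event, L, lam=0.1, lr=None, n_iter=2000):
    """Gradient descent on the negative Cox log partial likelihood
    plus a graph-Laplacian penalty lam * b' L b (no tie handling;
    subjects are sorted so risk sets become suffix sums)."""
    n, p = X.shape
    order = np.argsort(time)
    X, event = X[order], np.asarray(event)[order]
    lr = lr or 1.0 / n
    b = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ b
        w = np.exp(eta - eta.max())                      # stabilized risk weights
        sw = np.cumsum(w[::-1])[::-1]                    # sum of w over each risk set
        swx = np.cumsum((w[:, None] * X)[::-1], axis=0)[::-1]
        # d(-loglik)/db = -sum over events of (x_i - weighted risk-set mean of x)
        grad = -np.sum(X[event] - swx[event] / sw[event, None], axis=0)
        grad += 2.0 * lam * (L @ b)                      # network smoothness penalty
        b -= lr * grad
    return b
```

The Laplacian term `2*lam*(L@b)` is what distinguishes this from a plain Cox fit: it shrinks differences between coefficients of genes that are neighbors in the network.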
10. Uncovering the Complexity of Transcriptomes with RNA-Seq
- Author
-
Alfredo Ciccodicola, Claudia Angelini, Italia De Feis, and Valerio Costa
- Subjects
Sequence analysis, Biotechnology, Health, Toxicology and Mutagenesis, Medicine, RNA-Seq, Review Article, Computational biology, Biology, Cell transcriptome, Genome, DNA sequencing, Transcriptome, Genetics, Animals, Humans, Genome-wide analysis, Molecular Biology, Gene, Gene expression, Massive parallel sequencing, Sequence Analysis, RNA, Gene Expression Profiling, Computational Biology, General Medicine, Nucleotide resolution, Gene Expression Regulation, Eukaryotic transcriptome, RNA, Molecular Medicine - Abstract
In recent years, the introduction of massively parallel sequencing platforms for Next Generation Sequencing (NGS) protocols, able to simultaneously sequence hundreds of thousands of DNA fragments, dramatically changed the landscape of genetic studies. RNA-Seq for transcriptome studies, ChIP-Seq for DNA-protein interactions, and CNV-Seq for large genome nucleotide variations are only some of the intriguing new applications supported by these innovative platforms. Among them, RNA-Seq is perhaps the most complex NGS application. Expression levels of specific genes, differential splicing, and allele-specific expression of transcripts can be accurately determined by RNA-Seq experiments to address many biology-related issues. All these attributes are not readily achievable by the previously widespread hybridization-based or tag-sequence-based approaches. However, the unprecedented level of sensitivity and the large amount of data produced by NGS platforms provide clear advantages as well as new challenges and issues. While this technology brings the power to make several new biological observations and discoveries, it also requires a considerable effort in the development of new bioinformatics tools to deal with these massive data files. This paper aims to give a survey of the RNA-Seq methodology, particularly focusing on the challenges that this application presents from both a biological and a bioinformatics point of view.
- Published
- 2010
11. Applications of Network-based Survival Analysis Methods for Pathways Detection in Cancer
- Author
-
Claudia Angelini, Pietro Liò, Italia De Feis, Antonella Iuliano, and Annalisa Occhipinti
- Subjects
Clustering high-dimensional data, Microarray, Proportional hazards model, Gene expression, Cancer, Computational biology, Biology, Gene, Survival analysis, Network analysis - Abstract
Gene expression data from high-throughput assays, such as microarrays, are often used to predict cancer survival. Available datasets consist of a small number of samples (n patients) and a large number of genes (p predictors). Therefore, the main challenge is to cope with the high dimensionality. Moreover, genes are co-regulated and their expression levels are expected to be highly correlated. In order to address these two issues, network-based approaches can be applied. In our analysis, we compared the most recent network-penalized Cox models for high-dimensional survival data, aimed at determining pathway structures and biomarkers involved in cancer progression.
- Published
- 2015
12. Pointwise convergence of Fourier regularization for smoothing data
- Author
-
Daniela De Canditiis and Italia De Feis
- Subjects
smoothing data, Pointwise convergence, Mathematical optimization, Applied Mathematics, Generalized Cross Validation, Regularization (mathematics), Linear subspace, Local convergence, Sobolev space, Computational Mathematics, Fourier analysis, Applied mathematics, Fourier regularization, Mean Squared Error, Fourier series, Smoothing, Mean Integrated Squared Error, Mathematics - Abstract
The classical smoothing data problem is analyzed in a Sobolev space under the assumption of white noise. A Fourier series method based on regularization endowed with generalized cross validation is considered to approximate the unknown function. This approximation is globally optimal, i.e., the mean integrated squared error reaches the optimal rate in the minimax sense. In this paper the pointwise convergence property is studied. Specifically, it is proved that the smoothed solution is locally convergent but not locally optimal. Examples of functions for which the approximation is subefficient are given. It is shown that optimality and superefficiency are possible when restricting to more regular subspaces of the Sobolev space.
- Published
- 2006
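The Fourier-regularization-with-GCV estimator analyzed in this entry can be sketched for equispaced data. The damping exponent, candidate grid, and test function below are illustrative choices, not the paper's exact setup.

```python
import numpy as np

def fourier_smooth_gcv(y, lambdas):
    """Fourier-series regularization: coefficient k is damped by
    1/(1 + lam*k^4), corresponding to a second-derivative roughness
    penalty; the smoothing level lam is chosen by generalized
    cross-validation (GCV)."""
    n = y.size
    c = np.fft.rfft(y)
    k = np.arange(c.size, dtype=float)
    best = None
    for lam in lambdas:
        d = 1.0 / (1.0 + lam * k ** 4)           # filter factors
        fit = np.fft.irfft(d * c, n)
        rss = np.sum((y - fit) ** 2)
        # trace of the linear smoother: each nonzero frequency counts twice
        # (conjugate pair), the Nyquist term once when n is even
        tr = d[0] + 2.0 * d[1:].sum() - (d[-1] if n % 2 == 0 else 0.0)
        gcv = n * rss / (n - tr) ** 2
        if best is None or gcv < best[0]:
            best = (gcv, lam, fit)
    _, lam, fit = best
    return fit, lam
```

The smoother is diagonal in the Fourier basis, so both the fit and the GCV trace are computed in O(n log n) per candidate lambda.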
13. Computational approaches for isoform detection and estimation: good and bad news
- Author
-
Claudia Angelini, Daniela De Canditiis, and Italia De Feis
- Subjects
Sequence Analysis, RNA, Gene Expression Profiling, Applied Mathematics, Computational biology, Biology, Bioinformatics, Biochemistry, Computer Science Applications, Data-driven, Annotation, Identification, Structural Biology, RNA Isoforms, False positives, Humans, DNA microarray, Precision and recall, Molecular Biology, Algorithms, Software, Research Article, Reference genome - Abstract
Background The main goal of whole transcriptome analysis is to correctly identify all transcripts expressed within a specific cell/tissue - at a particular stage and condition - to determine their structures and to measure their abundances. RNA-seq data promise to allow identification and quantification of the transcriptome at an unprecedented level of resolution, accuracy and low cost. Several computational methods have been proposed to achieve such purposes. However, it is still not clear which promises are already met and which challenges are still open and require further methodological developments. Results We carried out a simulation study to assess the performance of 5 widely used tools: CEM, Cufflinks, iReckon, RSEM, and SLIDE. All of them were used with default parameters. In particular, we considered the effect of three different scenarios: the availability of complete annotation, incomplete annotation, and no annotation at all. Moreover, comparisons were carried out using the methods in three different modes of action. In the first mode, the methods were forced to deal only with those isoforms that are present in the annotation; in the second mode, they were allowed to detect novel isoforms using the annotation as a guide; in the third mode, they operated in a fully data-driven way (although with the support of the alignment on the reference genome). In the latter modality, precision and recall are quite poor. On the contrary, results are better with the support of the annotation, even when it is not complete. Finally, the abundance estimation error often shows a very skewed distribution. The performance strongly depends on the true abundance of the isoforms. Lowly (and sometimes also moderately) expressed isoforms are poorly detected and estimated. In particular, lowly expressed isoforms are identified mainly if they are provided in the original annotation as potential isoforms.
Conclusions Both detection and quantification of all isoforms from RNA-seq data are still hard problems, affected by many factors. Overall, performance changes significantly depending on the mode of action and on the type of available annotation. Runs using complete or partial annotation are able to detect most of the expressed isoforms, even though the number of false positives is often high. Fully data-driven approaches require more attention, at least for complex eukaryotic genomes. Improvements are desirable especially for isoform quantification and for the detection of isoforms with low abundance.
- Published
- 2014
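The precision and recall figures discussed in this entry are, at the transcript level, straightforward to compute once the detected and truly expressed isoform sets are fixed; a small helper (function and identifier names are illustrative, not from the paper's benchmark code):

```python
def detection_precision_recall(detected, truth):
    """Transcript-level detection scores: a predicted isoform counts as a
    true positive only if it exactly matches an isoform that is really
    expressed; precision penalizes spurious predictions, recall misses."""
    detected, truth = set(detected), set(truth)
    tp = len(detected & truth)
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall
```

The strict exact-match criterion is what makes fully data-driven runs score poorly: a reconstruction that gets one exon boundary wrong contributes to neither numerator.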
14. Smoothing data with correlated noise via Fourier transform
- Author
-
Umberto Amato and Italia De Feis
- Subjects
Numerical Analysis, Mathematical optimization, General Computer Science, Applied Mathematics, Regularization (mathematics), Uncorrelated noise, Theoretical Computer Science, Fourier transform, Gaussian noise, Modeling and Simulation, Applied mathematics, Fourier domain, Smoothing, Mathematics - Abstract
The problem of smoothing data through a transform in the Fourier domain is analyzed in the case of correlated noise affecting the data. A regularization method and two GCV-type criteria are employed to solve the problem, in analogy with the case of uncorrelated noise. All convergence theorems stated for uncorrelated noise are generalized here to the case of correlated noise. Numerical experiments on significant test functions are shown.
- Published
- 2000
15. Multiple Clustering Solutions Analysis through Least-Squares Consensus Algorithms
- Author
-
Italia De Feis, Roberto Tagliaferri, Giancarlo Raiconi, Claudia Angelini, Ida Bifulco, and Loredana Murino
- Subjects
Fuzzy clustering, Data stream clustering, CURE clustering algorithm, Correlation clustering, Consensus clustering, Constrained clustering, Canopy clustering algorithm, Data mining, Cluster analysis, Mathematics - Abstract
Clustering is one of the most important unsupervised learning problems: it deals with finding a structure in a collection of unlabeled data. However, different clustering algorithms applied to the same dataset produce different solutions. In many applications the problem of multiple solutions becomes crucial, and providing a limited group of good clusterings is often more desirable than a single solution. In this work we propose Least Squares Consensus clustering, which allows a user to extrapolate a small number of different clustering solutions from an initial (large) set of solutions obtained by applying any clustering algorithm to a given dataset. Two different implementations are presented. In both cases, each consensus is accompanied by a measure of quality defined in terms of the least squares error, and a graphical visualization is provided in order to make the result immediately interpretable. Numerical experiments are carried out on both synthetic and real datasets.
- Published
- 2010
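The least-squares quality measure can be sketched via the ensemble's co-association matrix: a candidate consensus is scored by how far its 0/1 co-membership matrix lies from the average co-membership over the whole solution set. This is a simplified reading of the paper's criterion, with illustrative names:

```python
import numpy as np

def coassociation(labelings):
    """Fraction of the input clusterings that place points i and j together."""
    labelings = np.asarray(labelings)
    C = np.zeros((labelings.shape[1],) * 2)
    for lab in labelings:
        C += lab[:, None] == lab[None, :]
    return C / len(labelings)

def least_squares_score(labelings, consensus):
    """Least-squares quality of a candidate consensus: squared distance
    between its 0/1 co-membership matrix and the ensemble co-association
    matrix (lower = more representative of the solution set)."""
    consensus = np.asarray(consensus)
    M = (consensus[:, None] == consensus[None, :]).astype(float)
    return float(np.sum((M - coassociation(labelings)) ** 2))
```

Scoring several candidate consensus clusterings this way yields the "small group of good solutions" the abstract describes, each with an attached quality value.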
16. Combining Replicates and Nearby Species Data: A Bayesian Approach
- Author
-
Richard C. van der Wath, Italia De Feis, Pietro Liò, Viet-Anh Nguyen, and Claudia Angelini
- Subjects
Phylogenetic tree, Phylogenetic inference, Bayesian variable selection, Bayesian probability, Markov chain Monte Carlo, Biology, Biological noise, Data mining, Marginal distribution, Data integration - Abstract
Here we discuss the biological high-throughput data dilemma: how should we integrate replicated experiments and nearby-species data? Should we consider each species as a monadic source of data when replicated experiments are available or, vice versa, should we try to collect information from the large number of nearby species analyzed in different laboratories? In this paper we make and justify the observation that experimental replicates and phylogenetic data may be combined to strengthen the evidence for identifying transcriptional motifs and networks, which seems to be quite difficult using other currently available methods. In particular we discuss the use of phylogenetic inference and the potential of Bayesian variable selection procedures in data integration. In order to illustrate the proposed approach we present a case study considering sequences and microarray data from fungal species. We also focus on the interpretation of the results with respect to the problem of experimental and biological noise.
- Published
- 2010
17. Evaluation of a dimension‐reduction‐based statistical technique for Temperature, Water Vapour and Ozone retrievals from IASI radiances
- Author
-
Carmine Serio, Guido Masiello, Italia De Feis, Marco Matricardi, Umberto Amato, and Anestis Antoniadis
- Subjects
Ozone, Meteorology, Infrared, Dimensionality reduction, Infrared Atmospheric Sounding Interferometer, Atmosphere, Remote sensing, Geography, Chemistry, light interferometry, Spectral resolution, Water vapor - Abstract
Remote sensing of the atmosphere is changing rapidly thanks to the development of high-spectral-resolution infrared space-borne sensors. The aim is to provide increasingly accurate information on the lower atmosphere, as requested by the World Meteorological Organization (WMO), to improve the reliability and time span of weather forecasts as well as Earth monitoring. In this paper we show the results we have obtained on a set of Infrared Atmospheric Sounding Interferometer (IASI) observations using a new statistical strategy based on dimension reduction. Retrievals have been compared to time-space collocated ECMWF analyses for temperature, water vapor and ozone.
- Published
- 2009
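The dimension-reduction retrieval strategy can be caricatured as principal-component regression: compress the radiance spectra onto their leading components, then regress the atmospheric state on the scores. The sketch below is a generic PCR baseline, not the paper's exact estimator, and the linear forward model used in the test is purely synthetic.

```python
import numpy as np

def pca_retrieval(R_train, x_train, R_new, k=3):
    """Dimension-reduction retrieval sketch: project radiance spectra onto
    their leading k principal components, fit a linear regression of the
    geophysical state on the PC scores, then apply it to new spectra."""
    mu = R_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(R_train - mu, full_matrices=False)
    E = Vt[:k].T                                           # leading eigenvectors
    S = np.column_stack([np.ones(len(R_train)), (R_train - mu) @ E])
    coef, *_ = np.linalg.lstsq(S, x_train, rcond=None)     # intercept + k slopes
    S_new = np.column_stack([np.ones(len(R_new)), (R_new - mu) @ E])
    return S_new @ coef
```

Working in the k-dimensional score space rather than the full channel space is what keeps the regression stable when the spectra have thousands of highly correlated channels.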
18. Simultaneous temperature and water vapor profile from IASI radiances
- Author
-
Carmine Serio, Italia De Feis, Alberta M. Lubrano, and Guido Masiello
- Subjects
Channel reduction, Meteorology, Chemistry, Cluster analysis, Information loss, Inverse problem, Water vapor, Remote sensing - Abstract
IASI has 8461 potential channels to be exploited for the inversion of geophysical parameters. In this paper we analyze two different strategies for their reduction. The first looks for suitable spectral ranges where the inverse problem is as linear as possible; the second is based on cluster analysis. Our aim is to minimize the potential information loss, evaluated by directly comparing the retrieved temperature and water vapor profiles on a complete set of test atmospheres.
- Published
- 2001
19. Statistical approaches for the analysis of RNA-Seq and ChIP-seq data and their integration
- Author
-
Claudia Angelini and Italia De Feis
- Subjects
Workstation, Computer science, Probabilistic modeling, Wet laboratory, Bioinformatics, Data science, Pipeline (software), Data type, Bottleneck, Software - Abstract
The recent introduction of Next-Generation Sequencing (NGS) platforms, able to sequence hundreds of millions of DNA fragments simultaneously, has dramatically changed the landscape of genetic and genomic studies. However, to benefit from this novel sequencing technology, advanced laboratory and molecular biology expertise must be combined with a strong multidisciplinary background in data analysis. Moreover, since the output of an experiment consists of a huge amount of data, terabytes of storage and clusters of computers are required to manage the computational bottleneck. Recently, the Institute of Genetics and Biophysics (IGB) and the Istituto per le Applicazioni del Calcolo (IAC) started a close collaboration aimed at setting up a novel NGS facility in Naples that integrates both the wet laboratory and the bioinformatics core. The IGB acquired a SOLiD system (now at version 4) and nowadays provides the full wet laboratory capabilities and its molecular biology experience for a wide range of experiments. Our team at IAC provides experience in the use and development of computational methods for data analysis and is also equipped with a powerful cluster of workstations (http://lilligridbio.na.iac.cnr.it/wordpress/) capable of handling massive computational tasks. The research activities proceed in two directions: on one side, the effort of our group is devoted to the use of efficient software and the maintenance and development of bioinformatics pipelines for the specific applications required by the sequencing facility; on the other, our scientific interest is devoted to the development of innovative statistical techniques for NGS data analysis and to the implementation of novel algorithms on both CPU and GPU systems. Until now our group has been involved in the analysis of a series of independent studies on both RNA-seq and ChIP-seq. The experiments were conducted at the local sequencing facility by the groups of Dr. Ciccodicola (for the RNA-seq data) and Dr. Matarazzo (for the ChIP-seq data) at IGB-CNR, which are also members of the SEQAHEAD COST Action. In this context our ongoing activities are devoted to the implementation of specific pipelines on our local cluster and to the definition of a probabilistic approach that models both transcriptional profiles and chromatin profiles in terms of "signal plus noise". Since integrating ChIP-seq and RNA-seq data is expected to provide much more biological insight into the mechanisms involved in gene expression regulation than using either dataset alone, we will focus our attention on the integration of these data types in a unified statistical framework. In the light of these considerations, our group aims to contribute to the goals of the SEQAHEAD project by actively participating in the discussion on the development of novel statistical and computational methods for the analysis of RNA-Seq and ChIP-seq data and their integration, and in the development of educational programs on the statistical analysis of NGS data.
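The "signal plus noise" view of a sequencing profile can be made concrete with a simple two-component Poisson mixture, in which each genomic window's read count comes either from a low-rate background (noise) or from an enriched region (signal). The sketch below is an illustrative toy model fitted by EM, not the group's actual method; the function name, the initialisation scheme, and the simulated rates are our own assumptions.

```python
import numpy as np

def fit_poisson_mixture(counts, n_iter=200):
    """Fit a two-component Poisson mixture ("noise" vs "signal") by EM.

    Returns (pi_signal, lam_noise, lam_signal, posterior_signal), where
    posterior_signal[i] is the probability that window i is enriched.
    """
    counts = np.asarray(counts, dtype=float)
    # Crude initialisation: put the two rates on either side of the mean
    lam0 = 0.5 * counts.mean()
    lam1 = 2.0 * counts.mean() + 1.0
    pi1 = 0.1
    for _ in range(n_iter):
        # E-step: posterior probability that each window is "signal".
        # The log(count!) terms cancel in the ratio, so they are omitted.
        log_p0 = counts * np.log(lam0) - lam0
        log_p1 = counts * np.log(lam1) - lam1
        w1 = 1.0 / (1.0 + (1.0 - pi1) / pi1 * np.exp(log_p0 - log_p1))
        # M-step: update mixing weight and the two Poisson rates
        pi1 = w1.mean()
        lam1 = (w1 * counts).sum() / w1.sum()
        lam0 = ((1.0 - w1) * counts).sum() / (1.0 - w1).sum()
    return pi1, lam0, lam1, w1
```

On simulated data with well-separated rates (e.g. background Poisson(2) and enriched Poisson(20) windows), the posterior cleanly separates the two populations; real ChIP-seq counts are overdispersed, so a negative-binomial emission would be a more realistic choice.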
- Published
- 2012
20. Massive-scale RNA-Seq experiments in human genetic diseases
- Author
- Carmela Ziviello, Claudia Angelini, Margherita Scarpato, Roberta Esposito, Maria Rosaria Ambrosio, Alfredo Ciccodicola, Marianna Aprile, Valerio Costa, and Italia De Feis
- Subjects
Regulation of gene expression, Genetics, Transcriptome, Massive parallel sequencing, Gene regulatory network, RNA-Seq, Biology, Gene, Chromatin immunoprecipitation, DNA sequencing - Abstract
Since 2008, our research group has been actively working in the field of NGS, with particular attention to RNA-Seq as an innovative approach to understanding the cell transcriptome in disease states (Costa et al., 2010a). In particular, combining molecular biology and computational expertise, we have recently analysed by RNA-Seq (Costa et al., 2011), for the first time in Down syndrome (DS), the global transcriptome of endothelial progenitor cells (EPCs), which are morphologically and functionally impaired in DS (Costa et al., 2010b). After rRNA depletion followed by strand-specific sequencing, we measured expression even from lowly expressed genes and identified new regions of active transcription outside annotated loci, novel splice isoforms, and extended untranslated regions for known genes, potentially harbouring new microRNA targets or regulatory sites. However, although RNA-Seq provided a huge amount of useful data for DS, showing a genome-wide alteration of gene expression (not limited to HSA21 genes), the experiment revealed only a fraction of the underlying complexity, giving no information about the reasons for such global deregulation. Therefore, in this ongoing project we aim to study: 1) by ChIP-Seq, the binding maps of some (preliminarily selected) transcription factors (TFs), key players in gene expression modulation, and 2) by RNA-Seq, the related gene expression changes in the same cells. ChIP-Seq, which combines standard chromatin immunoprecipitation with massively parallel sequencing, allows the identification of DNA sequences bound by TFs in vivo, helping to decipher gene regulatory networks (Park, 2009). We believe that integrating RNA- and ChIP-Seq data will provide much more biological insight into gene expression regulation in DS cells, helping us to better understand some blood-related pathological aspects of the syndrome.
Our group is also participating in a large-scale collaborative industrial project aimed at developing a diagnostic kit for personalized therapeutic strategies in type 2 diabetic (T2D) patients resistant to conventional drug therapies. In particular, to elucidate some mechanisms of drug resistance, our group will perform massive-scale transcriptome analysis by RNA-Seq in a well-selected subset of individuals (~50), also collaborating with bioinformaticians on further data analysis. In the light of these considerations, and given the objectives of the COST Action BM1006, our group will contribute to the goals of the SEQAHEAD project by actively integrating into the newly formed European NGS network, providing its expertise in sequencing technologies with a particular contribution (protocols, experimental data, and pipelines for data analysis) to RNA-Seq.
- Published
- 2012
21. Regularized inverse algorithms for temperature and absorbing constituent profiles from radiance spectra
- Author
- Italia De Feis, Carmine Serio, and Umberto Amato
- Subjects
Troposphere, Geography, Ordinary least squares, Radiance, Inverse, Inverse problem, Generalized singular value decomposition, Algorithm, Regularization (mathematics), Smoothing - Abstract
The retrieval of temperature profiles from radiance data obtained from interferograms is an important problem in remote sensing of the atmosphere. The great amount of data to process and the ill-conditioning of the problem demand objective procedures able to reduce the retrieval error. In this paper we use the Generalized Singular Value Decomposition (GSVD), which can handle deficient-rank smoothing functionals, to regularize the problem, and the L-curve criterion to choose the optimal regularization parameter and hence the proper amount of smoothing. Some test problems of temperature inversion are carried out to examine the effectiveness of the methods considered; to this purpose we use indicators based on the bias and variance of the retrieved temperature. We show that the objective L-curve criterion does not perform fully satisfactorily in estimating the optimal regularization parameter, and therefore in reducing the output error optimally. In any case, the GSVD combined with the L-curve criterion proves effective in reducing the output error with respect to the ordinary least squares method. In particular, the reduction of variance over the troposphere and stratosphere is high for all tested cases; the reduction of bias depends on the first-guess profile, and an important role in the latter is played by the choice of the deficient-rank smoothing functional. © 1995 SPIE--The International Society for Optical Engineering.
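As an illustration of the kind of procedure the abstract describes, the sketch below applies Tikhonov-type regularization with a deficient-rank second-difference smoothing operator and selects the regularization parameter at the L-curve corner, i.e. the point of maximum discrete curvature of the (log residual norm, log solution seminorm) curve. For compactness it solves the normal equations directly for each parameter on a grid rather than using the GSVD as in the paper; the function name and the toy inverse problem in the usage note are our own assumptions.

```python
import numpy as np

def lcurve_tikhonov(A, b, lambdas):
    """Tikhonov-regularised least squares over a grid of parameters,
    min ||A x - b||^2 + lam^2 ||L x||^2, with L a deficient-rank
    second-difference smoothing operator; the returned solution is the
    one at the L-curve corner (maximum discrete curvature)."""
    n = A.shape[1]
    # Second-difference operator: (n-2) x n, rank-deficient by design
    # (it does not penalise constant or linear components of x)
    L = np.zeros((n - 2, n))
    for i in range(n - 2):
        L[i, i:i + 3] = [1.0, -2.0, 1.0]
    rho, eta, xs = [], [], []
    for lam in lambdas:
        x = np.linalg.solve(A.T @ A + lam ** 2 * (L.T @ L), A.T @ b)
        xs.append(x)
        rho.append(np.linalg.norm(A @ x - b))  # data misfit
        eta.append(np.linalg.norm(L @ x))      # roughness seminorm
    lr, le = np.log(rho), np.log(eta)
    # Discrete curvature of the parametric curve (lr, le)
    d1r, d1e = np.gradient(lr), np.gradient(le)
    d2r, d2e = np.gradient(d1r), np.gradient(d1e)
    kappa = (d1r * d2e - d1e * d2r) / (d1r ** 2 + d1e ** 2) ** 1.5
    corner = 1 + int(np.argmax(kappa[1:-1]))   # skip the grid endpoints
    return xs[corner], lambdas[corner], np.array(rho), np.array(eta)
```

On an ill-conditioned test problem (e.g. a Gaussian blur matrix with a smooth true profile and small additive noise), the corner solution reduces the output error by many orders of magnitude relative to the unregularised direct solve, mirroring the variance reduction reported in the abstract.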