153 results on '"unsupervised methods"'
Search Results
2. Identifying novel data-driven subgroups in congenital heart disease using multi-modal measures of brain structure
- Author
-
Vandewouw, Marlee M., Norris-Brilliant, Ami, Rahman, Anum, Assimopoulos, Stephania, Morton, Sarah U., Kushki, Azadeh, Cunningham, Sean, King, Eileen, Goldmuntz, Elizabeth, Miller, Thomas A., Thomas, Nina H., Adams, Heather R., Cleveland, John, Cnota, James F., Ellen Grant, P, Goldberg, Caren S., Huang, Hao, Li, Jennifer S., McQuillen, Patrick, Porter, George A., Roberts, Amy E., Russell, Mark W., Seidman, Christine E., Tivarus, Madalina E., Chung, Wendy K., Hagler, Donald J., Newburger, Jane W., Panigrahy, Ashok, Lerch, Jason P, Gelb, Bruce D., and Anagnostou, Evdokia
- Published
- 2024
- Full Text
- View/download PDF
3. Holistic Consistency for Subject-Level Segmentation Quality Assessment in Medical Image Segmentation
- Author
-
Zhang, Yizhe, Zhou, Tao, Chen, Qiang, Dou, Qi, Wang, Shuo, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Sudre, Carole H., editor, Mehta, Raghav, editor, Ouyang, Cheng, editor, Qin, Chen, editor, Rakic, Marianne, editor, and Wells, William M., editor
- Published
- 2025
- Full Text
- View/download PDF
4. Rapid Unsupervised Keyphrase Extraction from Single Document
- Author
-
Svetlana Popova, Vera Danilova, and John Cardiff
- Subjects
keyphrase extraction, unsupervised methods, stop words, natural language processing, Telecommunication, TK5101-6720
- Abstract
Keyphrases offer a concise representation of a document’s content. They are valuable for improving web search results and enhancing tasks such as document tagging, text classification, or summarization. This makes keyphrase extraction an essential component of text mining. Among the widely used constraints and features in existing keyphrase extraction methods, we identified several effective techniques that have not yet been used together: Part-of-Speech (PoS) restrictions, extended stop-word lists, and position-based features. To address this gap, we propose an approach that leverages automatically extracted extended stop-word lists combined with PoS restrictions in keyphrases, and incorporates positional criteria. The main goal of the work was to develop a fast keyphrase extraction algorithm built upon these three features. Experimental results on the INSPEC and SemEval 2010 datasets demonstrate the effectiveness of the proposed method.
- Published
- 2024
- Full Text
- View/download PDF
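As a rough illustration of the ideas named in the abstract of record 4, the sketch below splits text at stop words to form candidate phrases and ranks them by first position. It is not the authors' algorithm: the stop-word list, the `max_words` limit, and the "earlier is better" ranking are assumptions made for this example, and the PoS restriction is left out because it would need a tagger.

```python
import re

# Hypothetical, hand-written stop-word list (the paper extracts an extended list automatically).
STOP_WORDS = {"a", "an", "the", "of", "for", "and", "or", "in", "on", "to",
              "is", "are", "as", "such", "this", "that", "with", "they", "by"}


def candidate_phrases(text):
    """Return (phrase, token_position) pairs for maximal runs of non-stop-words."""
    candidates = []
    current, start = [], 0
    tokens = re.findall(r"[a-z][a-z-]*|[.,;:!?]", text.lower())
    for index, token in enumerate(tokens):
        if token in STOP_WORDS or not token[0].isalpha():
            # A stop word or punctuation mark closes the current candidate.
            if current:
                candidates.append((" ".join(current), start))
            current = []
        else:
            if not current:
                start = index
            current.append(token)
    if current:
        candidates.append((" ".join(current), start))
    return candidates


def extract_keyphrases(text, top_k=3, max_words=3):
    """Rank short candidates by first occurrence (assumed positional criterion)."""
    first_seen = {}
    for phrase, position in candidate_phrases(text):
        if len(phrase.split()) <= max_words and phrase not in first_seen:
            first_seen[phrase] = position
    return sorted(first_seen, key=first_seen.get)[:top_k]


sample = ("Keyphrase extraction is an essential component of text mining. "
          "Text mining supports document tagging and summarization.")
keyphrases = extract_keyphrases(sample)
print(keyphrases)
```

Splitting at stop words keeps the sketch to a single pass over the tokens, which is one plausible reason such features suit a fast extractor.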
5. Exploring the unifying concept of spondyloarthritis: a latent class analysis of the REGISPONSER registry.
- Author
-
Michelena, Xabier, Sepriano, Alexandre, Zhao, Sizheng Steven, López-Medina, Clementina, Collantes-Estévez, Eduardo, Font-Ugalde, Pilar, Juanola, Xavier, and Marzo-Ortega, Helena
- Subjects
*PERIPHERAL neuropathy, *CROSS-sectional method, *STATISTICAL models, *RESEARCH funding, *MEDICAL specialties & specialists, *BLOOD testing, *PSORIASIS, *ANKYLOSIS, *STRUCTURAL equation modeling, *DISEASE prevalence, *RESEARCH, *SPONDYLOARTHROPATHIES, *INFLAMMATION, *PHENOTYPES, *SACROILIAC joint, *BACKACHE, *NAIL diseases
- Abstract
Objectives: The aim of our study was to identify the potential distinct phenotypes within a broad SpA population. Methods: We conducted a cross-sectional study using the REGISPONSER registry, which has data from 31 specialist centres in Spain, including patients with SpA who have fulfilled the ESSG criteria. A latent class analysis (LCA) was performed to identify the latent classes underlying SpA according to a set of predefined clinical and radiographic features, independently of expert opinion. Results: In a population of 2319 SpA patients, a five-classes LCA model yielded the best fit. Classes named 'Axial with spine involvement' and 'Axial with isolated SI joint involvement' showed a primarily axial SpA phenotype defined by inflammatory back pain and high HLA-B27 prevalence. Patients in class 'Axial + peripheral' showed a similar distribution of manifest variables to previous classes but also had a higher likelihood of peripheral involvement (peripheral arthritis/dactylitis) and enthesitis, therefore representing a mixed (axial and peripheral) subtype. Classes 'Peripheral + psoriasis' and 'Axial + peripheral + psoriasis' were indicative of peripheral SpA (and/or PsA) with high likelihood of psoriasis, peripheral involvement, dactylitis, nail disease, and low HLA-B27 prevalence, while class 'Axial + peripheral + psoriasis' also exhibited increased probability of axial involvement both clinically and radiologically. Conclusion: The identification of five latent classes in the REGISPONSER registry with significant overlap between axial and peripheral phenotypes is concordant with a unifying concept of SpA. Psoriasis and related features (nail disease and dactylitis) influenced the phenotype of both axial and peripheral manifestations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. EpidermaQuant: Unsupervised Detection and Quantification of Epidermal Differentiation Markers on H-DAB-Stained Images of Reconstructed Human Epidermis.
- Author
-
Zamojski, Dawid, Gogler, Agnieszka, Scieglinska, Dorota, and Marczyk, Michal
- Subjects
*IMMUNOSTAINING, *K-means clustering, *FILAGGRIN, *PROTEIN analysis, KERATINOCYTE differentiation
- Abstract
The integrity of the reconstructed human epidermis generated in vitro can be assessed using histological analyses combined with immunohistochemical staining of keratinocyte differentiation markers. Technical differences during the preparation and capture of stained images may influence the outcome of computational methods. Due to the specific nature of the analyzed material, no annotated datasets or dedicated methods are publicly available. Using a dataset with 598 unannotated images showing cross-sections of in vitro reconstructed human epidermis stained with DAB-based immunohistochemistry reaction to visualize four different keratinocyte differentiation marker proteins (filaggrin, keratin 10, Ki67, HSPA2) and counterstained with hematoxylin, we developed an unsupervised method for the detection and quantification of immunohistochemical staining. The pipeline consists of the following steps: (i) color normalization; (ii) color deconvolution; (iii) morphological operations; (iv) automatic image rotation; and (v) clustering. The most effective combination of methods includes (i) Reinhard's normalization; (ii) Ruifrok and Johnston color-deconvolution method; (iii) proposed image-rotation method based on boundary distribution of image intensity; and (iv) k-means clustering. The results of the work should enhance the performance of quantitative analyses of protein markers in reconstructed human epidermis samples and enable the comparison of their spatial distribution between different experimental conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
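The final clustering step listed in the abstract of record 6 can be pictured with a small k-means sketch over per-pixel stain intensities. This is only an illustration under assumptions: the density values are made up, the data are reduced to one dimension, and the earlier pipeline steps (normalization, deconvolution, rotation) are not reproduced. The function name `kmeans_1d` is hypothetical.

```python
def kmeans_1d(values, k=2, iterations=50):
    """Lloyd's algorithm on scalar values; returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation: pick evenly spaced values from the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical DAB-channel optical densities for eight pixels (made-up numbers).
dab_density = [0.05, 0.10, 0.08, 0.90, 0.85, 0.12, 0.95, 0.07]
centers, labels = kmeans_1d(dab_density, k=2)

# Treat the cluster with the higher mean density as "stained".
stained_cluster = max(range(len(centers)), key=lambda c: centers[c])
stained_fraction = sum(1 for label in labels if label == stained_cluster) / len(labels)
print(centers, stained_fraction)
```

The fraction of pixels in the high-density cluster is one simple way to turn cluster labels into a quantity that can be compared across images.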
7. Detecting Offensive Language on Malay Social Media: A Zero-Shot, Cross-Language Transfer Approach Using Dual-Branch mBERT.
- Author
-
Guo, Xingyi, Adnan, Hamedi Mohd, and Abidin, Muhammad Zaiamri Zainal
- Subjects
MALAY language, ONLINE comments, LANGUAGE models, SOCIAL media, INTERGROUP relations, CITIZENS, SPEECH, VIRTUAL communities, DEAF children
- Abstract
Social media serves as a platform for netizens to stay informed and express their opinions through the Internet. Currently, the social media discourse environment faces a significant security threat—offensive comments. A group of users posts comments that are provocative, discriminatory, and objectionable, intending to disrupt online discussions, provoke others, and incite intergroup conflict. These comments undermine citizens' legitimate rights, disrupt social order, and may even lead to real-world violent incidents. However, current automatic detection of offensive language primarily focuses on a few high-resource languages, leaving low-resource languages, such as Malay, with insufficient annotated corpora for effective detection. To address this, we propose a zero-shot, cross-language unsupervised offensive language detection (OLD) method using a dual-branch mBERT transfer approach. Firstly, using the multi-language BERT (mBERT) model as the foundational language model, the first network branch automatically extracts features from both source and target domain data. Subsequently, Sinkhorn distance is employed to measure the discrepancy between the source and target language feature representations. By estimating the Sinkhorn distance between the labeled source language (e.g., English) and the unlabeled target language (e.g., Malay) feature representations, the method minimizes the Sinkhorn distance adversarially to provide more stable gradients, thereby extracting effective domain-shared features. Finally, offensive pivot words from the source and target language training sets are identified. These pivot words are then removed from the training data in a second network branch, which employs the same architecture. This process constructs an auxiliary OLD task. By concealing offensive pivot words in the training data, the model reduces overfitting and enhances robustness to the target language. 
In the end-to-end framework training, the combination of cross-lingual shared features and independent features culminates in unsupervised detection of offensive speech in the target language. The experimental results demonstrate that employing cross-language model transfer learning can achieve unsupervised detection of offensive content in low-resource languages. The number of labeled samples in the source language is positively correlated with transfer performance, and a greater similarity between the source and target languages leads to better transfer effects. The proposed method achieves the best performance in OLD on the Malay dataset, achieving an F1 score of 80.7%. It accurately identifies features of offensive speech, such as sarcasm, mockery, and implicit expressions, and showcases strong generalization and excellent stability across different target languages. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
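The abstract of record 7 uses the Sinkhorn distance to measure the gap between source-language and target-language feature representations. The sketch below shows only the distance computation, under assumptions: features are single scalars rather than mBERT vectors, sample weights are uniform, the cost is the squared difference, and `epsilon` and the iteration count are arbitrary. The adversarial minimization described in the abstract is not shown.

```python
import math


def sinkhorn_distance(source, target, epsilon=1.0, iterations=200):
    """Entropy-regularised transport cost between two sets of scalar features."""
    n, m = len(source), len(target)
    a = [1.0 / n] * n  # uniform weight on each source sample
    b = [1.0 / m] * m  # uniform weight on each target sample
    cost = [[(s - t) ** 2 for t in target] for s in source]
    kernel = [[math.exp(-c / epsilon) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iterations):
        # Alternately rescale columns and rows so the plan matches both marginals.
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan


# Hypothetical one-dimensional "features" for three samples per language.
source_features = [0.0, 1.0, 2.0]
aligned_target = [0.0, 1.0, 2.0]
shifted_target = [3.0, 4.0, 5.0]

d_aligned, plan_aligned = sinkhorn_distance(source_features, aligned_target)
d_shifted, _ = sinkhorn_distance(source_features, shifted_target)
print(d_aligned, d_shifted)
```

A feature extractor that makes the two languages' representations overlap would drive this quantity down, which is the intuition behind using it as a training signal.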
8. Deep Learning for Video Localization
- Author
-
Wu, Zuxuan, Jiang, Yu-Gang, Shen, Xuemin Sherman, Series Editor, Wu, Zuxuan, and Jiang, Yu-Gang
- Published
- 2024
- Full Text
- View/download PDF
9. Unsupervised Approaches in Anomaly Detection
- Author
-
Higuera, Juan Ramón Bermejo, Higuera, Javier Bermejo, Montalvo, Juan Antonio Sicilia, Crespo, Rubén González, Kacprzyk, Janusz, Series Editor, Jain, Lakhmi C., Series Editor, Nayak, Janmenjoy, editor, Naik, Bighnaraj, editor, S, Vimal, editor, and Favorskaya, Margarita, editor
- Published
- 2024
- Full Text
- View/download PDF
10. Benchmarking Unsupervised Keyword Extraction Algorithms from Online Senegalese News Articles
- Author
-
Landu, Tony Tona, Bousso, Mamadou, Loum, Mor Absa, Sawadogo, Ibrahim, Dia, Yoro, Sall, Ousmane, Faty, Lamine, Mache, Ramiyou Karim, Sylla, Mohamed, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Nagar, Atulya K., editor, Jat, Dharm Singh, editor, Mishra, Durgesh, editor, and Joshi, Amit, editor
- Published
- 2024
- Full Text
- View/download PDF
11. Unsupervised literature mining approaches for extracting relationships pertaining to habitats and reproductive conditions of plant species
- Author
-
Roselyn Gabud, Portia Lapitan, Vladimir Mariano, Eduardo Mendoza, Nelson Pampolina, Maria Art Antonette Clariño, and Riza Batista-Navarro
- Subjects
relation extraction, information extraction, unsupervised methods, rule-based methods, transformer models, biodiversity, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Introduction: Fine-grained, descriptive information on habitats and reproductive conditions of plant species is crucial in forest restoration and rehabilitation efforts. Precise timing of fruit collection and knowledge of species' habitat preferences and reproductive status are necessary, especially for tropical plant species that have short-lived recalcitrant seeds and those that exhibit complex reproductive patterns, e.g., species with supra-annual mass flowering events that may occur at irregular intervals. Planning for effective reforestation requires an understanding of plant regeneration, which can be aided by providing access to structured information, e.g., in knowledge bases, that spans years if not decades and covers a wide range of geographic locations. The content of such a resource can be enriched with literature-derived information on species' time-sensitive reproductive conditions and location-specific habitats. Methods: We sought to develop unsupervised approaches to extract relationships pertaining to habitats and their locations, and reproductive conditions of plant species and corresponding temporal information. Firstly, we handcrafted rules for a traditional rule-based pattern matching approach. We then developed a relation extraction approach building upon transformer models, i.e., the Text-to-Text Transfer Transformer (T5), casting the relation extraction problem as a question answering and natural language inference task.
We then propose a novel unsupervised hybrid approach that combines our rule-based and transformer-based approaches. Results: Evaluation of our hybrid approach on an annotated corpus of biodiversity-focused documents demonstrated an improvement of up to 15 percentage points in recall and the best performance over solely rule-based and transformer-based methods, with F1-scores ranging from 89.61% to 96.75% for reproductive condition - temporal expression relations, and from 85.39% to 89.90% for habitat - geographic location relations. Our work shows that even without training models on any domain-specific labeled dataset, we are able to extract relationships between biodiversity concepts from literature with satisfactory performance.
- Published
- 2024
- Full Text
- View/download PDF
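The rule-based half of the approach in record 11 can be pictured as pattern matching over sentences. The sketch below is a minimal, hypothetical rule for reproductive condition and temporal expression pairs; the pattern, the two condition words, and the sample sentences are all invented for illustration and are not the authors' handcrafted rules.

```python
import re

MONTH = (r"(?:January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")

# Hypothetical rule: "<condition> ... from/in/during <month> [to <month>]",
# matched within a single sentence (the pattern never crosses a full stop).
RULE = re.compile(
    rf"\b(?P<condition>flowering|fruiting)\b[^.]*?\b(?:from|in|during)\s+"
    rf"(?P<time>{MONTH}(?:\s+to\s+{MONTH})?)",
    re.IGNORECASE,
)


def extract_relations(text):
    """Return (reproductive_condition, temporal_expression) pairs found by the rule."""
    return [(match.group("condition").lower(), match.group("time"))
            for match in RULE.finditer(text)]


sample = ("Flowering was observed from March to May. "
          "The species grows on limestone ridges. "
          "Fruiting occurs in August.")
relations = extract_relations(sample)
print(relations)
```

Rules of this kind tend to be precise but miss unusual phrasings, which is consistent with the abstract's report that combining them with a transformer-based approach improved recall.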
12. Variable Selection in Binary Logistic Regression for Modelling Bankruptcy Risk
- Author
-
Pierri, Francesca, Kitsos, Christos P., editor, Oliveira, Teresa A., editor, Pierri, Francesca, editor, and Restaino, Marialuisa, editor
- Published
- 2023
- Full Text
- View/download PDF
13. Fake Product Review Detection Using Machine Learning
- Author
-
Santhosh Krishna, B. V., Rajalakshmi, B., Vijay, M., Reddy, Donapati Jaswanth, Abhishek, Bavanasi, Ashwini Reddy, C., Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Shaw, Rabindra Nath, editor, Paprzycki, Marcin, editor, and Ghosh, Ankush, editor
- Published
- 2023
- Full Text
- View/download PDF
14. Explainability for Clustering Models
- Author
-
Arora, Mahima, Chopra, Ankush, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Yusoff, Marina, editor, Hai, Tao, editor, Kassim, Murizah, editor, Mohamed, Azlinah, editor, and Kita, Eisuke, editor
- Published
- 2023
- Full Text
- View/download PDF
15. A Method for Workflow Segmentation and Action Prediction from Video Data - AR Content
- Author
-
Kumar, Abhishek, Agnihotram, Gopichand, Kumar, Surbhit, Sudidhala, Raja Sekhar Reddy, Naik, Pandurang, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Patel, Kanubhai K., editor, Santosh, K. C., editor, and Patel, Atul, editor
- Published
- 2023
- Full Text
- View/download PDF
16. Building Secured Software Defined Networks by Analyzing Anomaly Detection Algorithms on Various Attacks
- Author
-
Presilla, R., Kallimani, Jagadish S., Eswarawaka, Rajesh, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Fong, Simon, editor, Dey, Nilanjan, editor, and Joshi, Amit, editor
- Published
- 2023
- Full Text
- View/download PDF
17. Statistical characterization of vaccinated cases and deaths due to COVID-19: methodology and case study in South America
- Author
-
Carlos Martin-Barreiro, Xavier Cabezas, Víctor Leiva, Pedro Ramos-De Santis, John A. Ramirez-Figueroa, and Erwin J. Delgado
- Subjects
clustering analysis, data science, disjoint pca, k-means analysis, multivariate statistical analysis, R software, sars-cov2, unsupervised methods, Mathematics, QA1-939
- Abstract
Many studies have been performed in different regions of the world as a result of the COVID-19 pandemic. In this work, we perform a statistical study related to the number of vaccinated cases and the number of deaths due to COVID-19 in ten South American countries. Our objective is to group countries according to the aforementioned variables. Once the groups of countries are built, they are characterized based on common properties of countries in the same group and differences between countries that are in different groups. Countries are grouped using principal component analysis and K-means analysis. These methods are combined in a single procedure that we propose for the classification of the countries. Regarding both variables, the countries were classified into three groups. Political decisions, availability of resources, bargaining power with suppliers and health infrastructure among others are some of the factors that can affect both the vaccination process and the timely care of infected people to avoid death. In general, the countries acted in a timely manner in relation to the vaccination of their citizens with the exception of two countries. Regarding the number of deaths, all countries reached peaks at some point in the study period.
- Published
- 2023
- Full Text
- View/download PDF
18. Detecting Offensive Language on Malay Social Media: A Zero-Shot, Cross-Language Transfer Approach Using Dual-Branch mBERT
- Author
-
Xingyi Guo, Hamedi Mohd Adnan, and Muhammad Zaiamri Zainal Abidin
- Subjects
cross-language model, offensive language detection, mBERT, unsupervised methods, transfer learning, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
Social media serves as a platform for netizens to stay informed and express their opinions through the Internet. Currently, the social media discourse environment faces a significant security threat—offensive comments. A group of users posts comments that are provocative, discriminatory, and objectionable, intending to disrupt online discussions, provoke others, and incite intergroup conflict. These comments undermine citizens’ legitimate rights, disrupt social order, and may even lead to real-world violent incidents. However, current automatic detection of offensive language primarily focuses on a few high-resource languages, leaving low-resource languages, such as Malay, with insufficient annotated corpora for effective detection. To address this, we propose a zero-shot, cross-language unsupervised offensive language detection (OLD) method using a dual-branch mBERT transfer approach. Firstly, using the multi-language BERT (mBERT) model as the foundational language model, the first network branch automatically extracts features from both source and target domain data. Subsequently, Sinkhorn distance is employed to measure the discrepancy between the source and target language feature representations. By estimating the Sinkhorn distance between the labeled source language (e.g., English) and the unlabeled target language (e.g., Malay) feature representations, the method minimizes the Sinkhorn distance adversarially to provide more stable gradients, thereby extracting effective domain-shared features. Finally, offensive pivot words from the source and target language training sets are identified. These pivot words are then removed from the training data in a second network branch, which employs the same architecture. This process constructs an auxiliary OLD task. By concealing offensive pivot words in the training data, the model reduces overfitting and enhances robustness to the target language. 
In the end-to-end framework training, the combination of cross-lingual shared features and independent features culminates in unsupervised detection of offensive speech in the target language. The experimental results demonstrate that employing cross-language model transfer learning can achieve unsupervised detection of offensive content in low-resource languages. The number of labeled samples in the source language is positively correlated with transfer performance, and a greater similarity between the source and target languages leads to better transfer effects. The proposed method achieves the best performance in OLD on the Malay dataset, achieving an F1 score of 80.7%. It accurately identifies features of offensive speech, such as sarcasm, mockery, and implicit expressions, and showcases strong generalization and excellent stability across different target languages.
- Published
- 2024
- Full Text
- View/download PDF
19. Statistical characterization of vaccinated cases and deaths due to COVID-19: methodology and case study in South America.
- Author
-
Martin-Barreiro, Carlos, Cabezas, Xavier, Leiva, Víctor, Santis, Pedro Ramos-De, Ramirez-Figueroa, John A., and Delgado, Erwin J.
- Subjects
COVID-19 pandemic, PRINCIPAL components analysis, VACCINATION, BARGAINING power, MULTIVARIATE analysis
- Abstract
Many studies have been performed in different regions of the world as a result of the COVID-19 pandemic. In this work, we perform a statistical study related to the number of vaccinated cases and the number of deaths due to COVID-19 in ten South American countries. Our objective is to group countries according to the aforementioned variables. Once the groups of countries are built, they are characterized based on common properties of countries in the same group and differences between countries that are in different groups. Countries are grouped using principal component analysis and K-means analysis. These methods are combined in a single procedure that we propose for the classification of the countries. Regarding both variables, the countries were classified into three groups. Political decisions, availability of resources, bargaining power with suppliers and health infrastructure among others are some of the factors that can affect both the vaccination process and the timely care of infected people to avoid death. In general, the countries acted in a timely manner in relation to the vaccination of their citizens with the exception of two countries. Regarding the number of deaths, all countries reached peaks at some point in the study period. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
20. Characterizing Cardiovascular Risk Through Unsupervised and Interpretable Techniques
- Author
-
Calero-Díaz, Hugo, Chushig-Muzo, David, Soguero-Ruiz, Cristina, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Yin, Hujun, editor, Camacho, David, editor, and Tino, Peter, editor
- Published
- 2022
- Full Text
- View/download PDF
21. MultiGAN: Multi-domain Image Translation from OCT to OCTA
- Author
-
Pan, Bing, Ji, Zexuan, Chen, Qiang, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Yu, Shiqi, editor, Zhang, Zhaoxiang, editor, Yuen, Pong C., editor, Han, Junwei, editor, Tan, Tieniu, editor, Guo, Yike, editor, Lai, Jianhuang, editor, and Zhang, Jianguo, editor
- Published
- 2022
- Full Text
- View/download PDF
22. Data Synthesis and Iterative Refinement for Neural Semantic Parsing without Annotated Logical Forms
- Author
-
Wu, Shan, Chen, Bo, Han, Xianpei, Sun, Le, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Sun, Maosong, editor, Liu, Yang, editor, Che, Wanxiang, editor, Feng, Yang, editor, Qiu, Xipeng, editor, Rao, Gaoqi, editor, and Chen, Yubo, editor
- Published
- 2022
- Full Text
- View/download PDF
23. A Multilevel Clustering Method for Risky Areas in the Context of Avalanche Danger Management
- Author
-
Pagnier, Fanny, Pourraz, Frédéric, Coquin, Didier, Verjus, Hervé, Mauris, Gilles, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Ciucci, Davide, editor, Couso, Inés, editor, Medina, Jesús, editor, Ślęzak, Dominik, editor, Petturiti, Davide, editor, Bouchon-Meunier, Bernadette, editor, and Yager, Ronald R., editor
- Published
- 2022
- Full Text
- View/download PDF
24. Non-intrusive Load Monitoring and Its Application in Energy Flexibility Potential Extraction of Active Buildings
- Author
-
Azizi, Elnaz, Beheshti, Mohammad T. H., Bolouki, Sadegh, Vahidinasab, Vahid, editor, and Mohammadi-Ivatloo, Behnam, editor
- Published
- 2022
- Full Text
- View/download PDF
25. Topic Aware Contextualized Embeddings for High Quality Phrase Extraction
- Author
-
Venktesh, V., Mohania, Mukesh, Goyal, Vikram, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Hagen, Matthias, editor, Verberne, Suzan, editor, Macdonald, Craig, editor, Seifert, Christin, editor, Balog, Krisztian, editor, Nørvåg, Kjetil, editor, and Setty, Vinay, editor
- Published
- 2022
- Full Text
- View/download PDF
26. An Unsupervised Radar Degradation Fault Prediction Method Based on a Dynamically Updated Neural Network.
- Author
-
翟玉婷, 程占昕, and 房少军
- Abstract
The traditional radar fault detection method depends heavily on expert experience, consumes considerable manpower and material resources, causes over-repair, and cannot give advance warning of degradation faults. To overcome these shortcomings, an unsupervised radar degradation fault prediction method based on a dynamically updated neural network is proposed. First, historical data of peak power and operating frequency are collected by microwave measurement equipment. Second, the dynamically updated neural network is used to dynamically update the historical data and predict the subsequent data. Finally, the isolation forest method is applied to the predicted data for unsupervised fault detection. In this way, radar degradation fault prediction and early warning can be realized. The results show that the proposed method can predict degradation faults at least 10 time steps (100 minutes) in advance and raise real-time alarms. It can also predict radar degradation faults with small samples, no fault samples, no feature extraction, and no manually set thresholds. [ABSTRACT FROM AUTHOR]
- Published
- 2023
27. Generated Image Editing Method Based on Global-Local Jacobi Disentanglement for Machine Learning.
- Author
-
Zhang, Jianlong, Yu, Xincheng, Wang, Bin, and Chen, Chen
- Subjects
*GENOME editing, *MACHINE learning, *MATRIX decomposition, *SEARCH algorithms, *LATENT semantic analysis, *DECOMPOSITION method
- Abstract
Accurate semantic editing of generated images is extremely important for machine learning and sample enhancement of big data. To address semantic entanglement in the latent space of images generated by the StyleGAN2 network, we propose a generated image editing method based on global-local Jacobi disentanglement. For global disentanglement, we extract the weight matrix of the style layer in the pre-trained StyleGAN2 network, obtain the semantic attribute direction vector by eigendecomposition of the weight matrix, and finally use this direction vector as the initialization vector for the Jacobi orthogonal regularization search algorithm. Our method improves the speed of the Jacobi orthogonal regularization search algorithm with the proportion of effective semantic attribute editing directions. For local disentanglement, we design a local contrast regularized loss function to relax the semantic association between the local area and the non-local area, and use the Jacobi orthogonal regularization search algorithm to obtain a more accurate semantic attribute editing direction based on the local area prior MASK. The experimental results show that the proposed method achieves state-of-the-art (SOTA) results on semantic attribute disentanglement metrics and can discover more accurate editing directions than the mainstream unsupervised generated image editing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
28. Multi-modal Fake News Detection
- Author
-
Chakraborty, Tanmoy, Zhai, ChengXiang, Series Editor, de Rijke, Maarten, Series Editor, Belkin, Nicholas J., Editorial Board Member, Clarke, Charles, Editorial Board Member, Kelly, Diane, Editorial Board Member, Sebastiani, Fabrizio, Editorial Board Member, P, Deepak, Chakraborty, Tanmoy, Long, Cheng, and G, Santhosh Kumar
- Published
- 2021
- Full Text
- View/download PDF
29. Unsupervised Cross-Domain Person Re-Identification Method Based on Attention Block and Refined Clustering
- Author
-
Yan Hui, Xi Wu, Xiuhua Hu, Huan Liu, and Shijie You
- Subjects
Attention mechanism, cross-domain person re-identification, hybrid memory bank, refined clustering strategy, unsupervised methods, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Most clustering-based unsupervised cross-domain person re-identification methods suffer from weak feature discrimination, and clustering generates noisy pseudo-labels, which reduces accuracy. To solve these problems, this paper proposes an unsupervised cross-domain person re-identification method based on attention block and refined clustering. Firstly, ResNet50 is selected as the backbone network, and coordinate attention and triple attention are concatenated and embedded in ResNet50 to extract fine-grained features, perform feature aggregation, and mine fine-grained information. Secondly, a refined clustering strategy is proposed to achieve a coarse-to-fine clustering process by designing the measurement standards for clustering, determining its reliability, and eliminating noisy samples. Finally, the hybrid memory bank dynamically stores cluster centers and continues to update them with iterations, adapting to changes in clusters and performing invariant learning. The experimental results show that, compared with other typical methods, the proposed method improves rank-1 accuracy and mAP by 0.4% and 2.4%, respectively, on the target domain Market-1501 dataset, and by 0.4% and 1.1%, respectively, on the target domain DukeMTMC-ReID dataset.
- Published
- 2022
- Full Text
- View/download PDF
30. Application of Parallel Factor Analysis (PARAFAC) to the Regional Characterisation of Vineyard Blocks Using Remote Sensing Time Series.
- Author
-
Lopez-Fornieles, Eva, Brunel, Guilhem, Devaux, Nicolas, Roger, Jean-Michel, Taylor, James, and Tisseyre, Bruno
- Subjects
- *
REMOTE sensing , *TIME series analysis , *FACTOR analysis , *GEOLOGICAL statistics , *SOIL profiles , *VINEYARDS , *GRAPES - Abstract
Monitoring wine-growing regions and maximising the value of production based on their regional/local specificities requires accurate spatial and temporal monitoring. The increasing amount and variability of information from remote sensing data is a potential tool to address this challenge for the grape and wine industry. This article provides a first insight into the capacity of a multiway analysis method applied to Sentinel-2 time series to assess the value of simultaneously considering spectral and temporal information to highlight site-specific canopy evolution in relation to environmental factors and management practices, which present a large diversity at this regional scale. Parallel Factor Analysis (PARAFAC) was used as an unsupervised technique to recover pure spectra and temporal signatures from multi-way spectral imagery of vineyards in the Languedoc-Roussillon region in the south of France. The model was developed using a time series of Sentinel-2 satellite imagery collected over 4978 vineyard blocks between May 2019 and August 2020. From the Sentinel-2 (spectral and temporal) signal, the PARAFAC analysis allowed the identification of spectral and temporal profiles in the form of pure components, which corresponded to vegetation and soil. The PARAFAC analysis also identified that two of the pure spectra were strongly related to characteristics and dynamics of vineyard cultivation at a regional scale. A conceptual framework was proposed in order to simultaneously consider both vegetation and soil profiles and to summarise the mass of data accordingly. This methodology allowed the computation of a concentration index that characterised how close a field was to a vegetation or a soil profile over the season. The concentration indices were validated for the vegetation and the soil over two growing seasons (2019 and 2020) with geostatistical analysis.
A non-random distribution of the concentration index at the regional scale was assumed to highlight a strongly spatially organised phenomenon related to spatially organised environmental factors (soil, climate, training system, etc.). In a second step, spatial patterns of indices were subjected to the expertise of a panel of advisors of the wine industry in order to validate them in relation to vine-growing conditions. Results showed that the introduction of the PARAFAC method opened up the possibility to identify relevant spectro-temporal profiles for vine monitoring purposes. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
31. Improvement of Sleep Spindle Detection by Aggregation Techniques
- Author
-
Saifutdinova, Elizaveta, Dudysova, Daniela, Gerla, Vaclav, Lhotska, Lenka, Magjarevic, Ratko, Series Editor, Ładyżyński, Piotr, Associate Editor, Ibrahim, Fatimah, Associate Editor, Lackovic, Igor, Associate Editor, Rock, Emilio Sacristan, Associate Editor, Henriques, Jorge, editor, Neves, Nuno, editor, and de Carvalho, Paulo, editor
- Published
- 2020
- Full Text
- View/download PDF
32. Random Vibration Damage Detection for a Composite Beam Under Varying Non-measurable Conditions: Assessment of Statistical Time Series Robust Methods
- Author
-
Aravanis, Tryfon-Chrysovalantis, Sakellariou, John, Fassois, Spilios, and Wahab, Magd Abdel, editor
- Published
- 2020
- Full Text
- View/download PDF
33. Software Defect Prediction on Unlabelled Datasets: A Comparative Study
- Author
-
Ronchieri, Elisabetta, Canaparo, Marco, Belgiovine, Mauro, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Gervasi, Osvaldo, editor, Murgante, Beniamino, editor, Misra, Sanjay, editor, Garau, Chiara, editor, Blečić, Ivan, editor, Taniar, David, editor, Apduhan, Bernady O., editor, Rocha, Ana Maria A.C., editor, Tarantino, Eufemia, editor, Torre, Carmelo Maria, editor, and Karaca, Yeliz, editor
- Published
- 2020
- Full Text
- View/download PDF
34. Probabilistic Pocket Druggability Prediction via One-Class Learning.
- Author
-
Aguti, Riccardo, Gardini, Erika, Bertazzo, Martina, Decherchi, Sergio, and Cavalli, Andrea
- Subjects
DRUG discovery ,FORECASTING - Abstract
The choice of target pocket is a key step in a drug discovery campaign. This step can be supported by in silico druggability prediction. In the literature, druggability prediction is often approached as a two-class classification task that distinguishes between druggable and non-druggable (or less druggable) pockets (or voxels). Apart from obvious cases, however, the non-druggable class is conceptually ambiguous. This is because any pocket (or target) is only non-druggable until a drug is found for it. It is therefore more appropriate to adopt a one-class approach, which uses only unambiguous information, namely, druggable pockets. Here, we propose using the import vector domain description (IVDD) algorithm to support this task. IVDD is a one-class probabilistic kernel machine that we previously introduced. To feed the algorithm, we use customized DrugPred descriptors computed via NanoShaper. Our results demonstrate the feasibility and effectiveness of the approach. In particular, we can remove or mitigate biases chiefly due to the labeling. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
35. Probabilistic Pocket Druggability Prediction via One-Class Learning
- Author
-
Riccardo Aguti, Erika Gardini, Martina Bertazzo, Sergio Decherchi, and Andrea Cavalli
- Subjects
druggability prediction ,drug design ,machine learning ,unsupervised methods ,one-class classification ,import vector domain description ,Therapeutics. Pharmacology ,RM1-950 - Abstract
The choice of target pocket is a key step in a drug discovery campaign. This step can be supported by in silico druggability prediction. In the literature, druggability prediction is often approached as a two-class classification task that distinguishes between druggable and non-druggable (or less druggable) pockets (or voxels). Apart from obvious cases, however, the non-druggable class is conceptually ambiguous. This is because any pocket (or target) is only non-druggable until a drug is found for it. It is therefore more appropriate to adopt a one-class approach, which uses only unambiguous information, namely, druggable pockets. Here, we propose using the import vector domain description (IVDD) algorithm to support this task. IVDD is a one-class probabilistic kernel machine that we previously introduced. To feed the algorithm, we use customized DrugPred descriptors computed via NanoShaper. Our results demonstrate the feasibility and effectiveness of the approach. In particular, we can remove or mitigate biases chiefly due to the labeling.
- Published
- 2022
- Full Text
- View/download PDF
36. Self-Supervised Learning Methods for Label-Efficient Dental Caries Classification.
- Author
-
Taleb, Aiham, Rohrer, Csaba, Bergner, Benjamin, De Leon, Guilherme, Rodrigues, Jonas Almeida, Schwendicke, Falk, Lippert, Christoph, and Krois, Joachim
- Subjects
- *
DENTAL caries , *ELECTRONIC health records , *DEEP learning , *IMAGE analysis , *MACHINE learning , *SIGNAL convolution , *DENTAL education - Abstract
High annotation costs are a substantial bottleneck in applying deep learning architectures to clinically relevant use cases, substantiating the need for algorithms to learn from unlabeled data. In this work, we propose employing self-supervised methods. To that end, we trained with three self-supervised algorithms on a large corpus of unlabeled dental images, which contained 38K bitewing radiographs (BWRs). We then applied the learned neural network representations on tooth-level dental caries classification, for which we utilized labels extracted from electronic health records (EHRs). Finally, a holdout test-set was established, which consisted of 343 BWRs and was annotated by three dental professionals and approved by a senior dentist. This test-set was used to evaluate the fine-tuned caries classification models. Our experimental results demonstrate the gains obtained by pretraining models using self-supervised algorithms. These include improved caries classification performance (6 p.p. increase in sensitivity) and, most importantly, improved label-efficiency. In other words, the resulting models can be fine-tuned using few labels (annotations). Our results show that using as few as 18 annotations can produce ≥45% sensitivity, which is comparable to human-level diagnostic performance. This study shows that self-supervision can provide gains in medical image analysis, particularly when obtaining labels is costly. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
37. Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods.
- Author
-
Qi, Guo-Jun and Luo, Jiebo
- Subjects
- *
SUPERVISED learning , *PROBABILISTIC generative models , *BIG data , *GENERATIVE adversarial networks - Abstract
Representation learning with small labeled data has emerged in many problems, since the success of deep neural networks often relies on the availability of a huge amount of labeled data that is expensive to collect. To address this, many efforts have been made to train sophisticated models with few labeled data in an unsupervised and semi-supervised fashion. In this paper, we review the recent progress on these two major categories of methods. A wide spectrum of models will be categorized in a big picture, where we will show how they interplay with each other to motivate explorations of new ideas. We will review the principles of learning the transformation equivariant, disentangled, self-supervised and semi-supervised representations, all of which underpin the foundation of recent progress. Many implementations of unsupervised and semi-supervised generative models have been developed on the basis of these criteria, greatly expanding the territory of existing autoencoders, generative adversarial nets (GANs) and other deep networks by exploring the distribution of unlabeled data for more powerful representations. We will discuss emerging topics by revealing the intrinsic connections between unsupervised and semi-supervised learning, and propose future directions to bridge the algorithmic and theoretical gap between transformation equivariance for unsupervised learning and supervised invariance for supervised learning, and unify unsupervised pretraining and supervised finetuning. We will also provide a broader outlook of future directions to unify transformation and instance equivariances for representation learning, connect unsupervised and semi-supervised augmentations, and explore the role of the self-supervised regularization for many learning problems. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
38. An automated hypersphere-based healthy subspace method for robust and unsupervised damage detection via random vibration response signals.
- Author
-
Vamvoudakis-Stefanou, Kyriakos, Fassois, Spilios, and Sakellariou, John
- Subjects
RANDOM vibration ,STRUCTURAL health monitoring ,STRUCTURAL dynamics ,RANDOM numbers - Abstract
A novel, unsupervised, hypersphere-based healthy subspace method for robust damage detection under non-quantifiable uncertainty via a limited number of random vibration response sensors is postulated. The method is based on the approximate construction, within a proper feature space, of a healthy subspace representing the healthy structural dynamics under uncertainty as the union of properly selected hyperspheres. This is achieved via a fully automated algorithm eliminating user intervention, and thus subjective selections, or complex optimization procedures. The main asset of the proposed method lies in combining simplicity and full automation with high performance. Its performance is systematically assessed via two experimental case studies featuring various uncertainty sources and distinct healthy subspace geometries, while interesting comparisons with three well-known robust damage detection methods are also performed. The results indicate excellent detection performance, which also compares favorably to that of alternative methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
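The abstract of result 38 describes a healthy subspace built as a union of hyperspheres in feature space. The snippet below is only a minimal sketch of that general idea, not the authors' algorithm: the greedy cover, the function names `fit_healthy_spheres` and `is_healthy`, and the hand-set `radius` are all assumptions made for illustration (the paper states that its own procedure is fully automated).

```python
import numpy as np

def fit_healthy_spheres(healthy_features, radius):
    """Greedy cover (an assumption, not the paper's procedure):
    a healthy feature vector becomes a new sphere center only if
    no existing sphere of the given radius already contains it."""
    centers = []
    for x in healthy_features:
        if not centers:
            centers.append(x)
            continue
        distances = np.linalg.norm(np.asarray(centers) - x, axis=1)
        if distances.min() > radius:
            centers.append(x)
    return np.asarray(centers)

def is_healthy(x, centers, radius):
    """A test point is declared healthy if it lies inside the union
    of hyperspheres, i.e. within `radius` of at least one center."""
    distances = np.linalg.norm(centers - np.asarray(x), axis=1)
    return bool(distances.min() <= radius)
```

A test feature vector that falls outside every sphere would be flagged as potentially damaged; how the radius and centers are chosen is exactly the part the paper automates and this sketch does not.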
39. Pruning and repopulating a lexical taxonomy: experiments in Spanish, English and French
- Author
-
Nazar Rogelio, Balvet Antonio, Ferraro Gabriela, Marín Rafael, and Renau Irene
- Subjects
hypernymy detection ,language independent methods ,taxonomy induction ,unsupervised methods ,68w06 ,Science ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
In this paper we present the problem of a noisy lexical taxonomy and suggest two tasks as potential remedies. The first task is to identify and eliminate incorrect hypernymy links, and the second is to repopulate the taxonomy with new relations. The first task consists of revising the entire taxonomy and returning a Boolean for each assertion of hypernymy between two nouns (e.g. brie is a kind of cheese). The second task consists of recursively producing a chain of hypernyms for a given noun, until the most general node in the taxonomy is reached (e.g. brie → cheese → food → etc.). In order to achieve these goals, we implemented a hybrid hypernym-detection algorithm that incorporates various intuitions, such as syntagmatic, paradigmatic and morphological association measures as well as lexical patterns. We evaluate these algorithms individually and collectively and report findings in Spanish, English and French.
- Published
- 2020
- Full Text
- View/download PDF
40. Incorporation of Neighborhood Concept in Enhancing SOM Based Multi-label Classification
- Author
-
Saini, Naveen, Saha, Sriparna, Bhattacharyya, Pushpak, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Deka, Bhabesh, editor, Maji, Pradipta, editor, Mitra, Sushmita, editor, Bhattacharyya, Dhruba Kumar, editor, Bora, Prabin Kumar, editor, and Pal, Sankar Kumar, editor
- Published
- 2019
- Full Text
- View/download PDF
41. Batch and online variational learning of hierarchical Dirichlet process mixtures of multivariate Beta distributions in medical applications.
- Author
-
Manouchehri, Narges, Bouguila, Nizar, and Fan, Wentao
- Subjects
- *
BETA distribution , *DISTRIBUTION (Probability theory) , *ONLINE education , *LEUKOCYTE count , *MEDICAL personnel - Abstract
Thanks to significant developments in the healthcare industry, various types of medical data are generated. Analysing such valuable resources helps healthcare experts understand illnesses more precisely and provide better clinical services. Machine learning is one tool that could assist healthcare experts in achieving expressive interpretation and making proper decisions. As annotation of medical data is a costly and sensitive task that can be performed only by healthcare professionals, label-free methods could be significantly promising. Interpretability and evidence-based decisions are other concerns in medicine. These needs motivated us to propose a novel clustering method based on hierarchical Dirichlet process mixtures of multivariate Beta distributions. To learn it, we applied batch and online variational methods for finding the proper number of clusters as well as estimating model parameters at the same time. The effectiveness of the proposed models is evaluated on three real medical applications, namely oropharyngeal carcinoma diagnosis, osteosarcoma analysis, and white blood cell counting. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
42. A Novel Approach to the Unsupervised Extraction of Reliable Training Samples From Thematic Products.
- Author
-
Paris, Claudia and Bruzzone, Lorenzo
- Subjects
- *
THEMATIC maps , *PROBLEM solving , *REMOTE sensing , *PLURALITY voting , *INFORMATION resources , *LAND cover - Abstract
Supervised classification algorithms require a sufficiently large set of representative training samples to generate accurate land-cover maps. Collecting reference data is difficult, expensive, and infeasible at large scale. To solve this problem, this article introduces a novel approach that aims to extract reliable labeled data from existing thematic products. Although these products represent a potentially useful information source, their use is not straightforward. They are not completely reliable since they may present classification errors. They are typically aggregated at polygon level, where polygons do not necessarily correspond to homogeneous areas. Finally, there is usually a semantic gap between map legends and remote sensing (RS) data. In this context, we propose an approach that aims to: 1) perform a domain understanding to detect the discrepancies between the thematic map domain and the RS data domain; 2) use RS data contemporary to the map to decompose the thematic product from the semantic and spatial viewpoints; and 3) extract a database of informative and reliable training samples. The database of weakly labeled units is used for training an ensemble of classifiers on recent data whose results are then combined in a majority voting rule. Two sets of experimental results obtained on MS images by extracting training samples from a crop type map and the 2018 Corine Land Cover (CLC) map, respectively, confirm the effectiveness of the proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
43. On the functional model–based method for vibration-based robust damage detection: versions and experimental assessment.
- Author
-
Aravanis, Tryfon-Chrysovalantis, Sakellariou, John, and Fassois, Spilios
- Subjects
DRONE aircraft ,COMPOSITE structures ,STRUCTURAL health monitoring ,FUNCTIONAL assessment - Abstract
The problem of random vibration–based robust damage detection for structures operating under varying and non-measurable environmental and operating conditions is considered via a novel unsupervised functional model–based method. Two versions of the method are employed based on either the residual variance or uncorrelatedness (whiteness) of a proper functional model that incorporates the varying environmental and operating conditions in a scheduling vector. This article constitutes a proof-of-concept study in which a comprehensive laboratory assessment of the functional model–based method is undertaken using hundreds of experiments with a composite tail structure of an unmanned aerial vehicle and two early-stage damages under a considerable number of different environmental and operating conditions. Comparisons with two alternative state-of-the-art statistical time series type methods, that is, a multiple model–based method and a principal component analysis–based method, are also performed. The results indicate ideal detection performance for the functional model–based and multiple model–based methods, with the true positive rate reaching 100% at 0% false positive rate, but degraded performance for the principal component analysis–based method. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
44. An Unsupervised Approach to Leak Detection and Location in Water Distribution Networks
- Author
-
Quiñones-Grueiro Marcos, Verde Cristina, Prieto-Moreno Alberto, and Llanes-Santiago Orestes
- Subjects
water distribution networks ,leak location ,unsupervised methods ,principal component analysis ,demand model ,Mathematics ,QA1-939 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
The water loss detection and location problem has received great attention in recent years. In particular, data-driven methods have shown very promising results mainly because they can deal with uncertain data and the variability of models better than model-based methods. The main contribution of this work is an unsupervised approach to leak detection and location in water distribution networks. This approach is based on a zone division of the network, and it only requires data from a normal operation scenario of the pipe network. The proposition combines a periodic transformation and a data vector extension together with principal component analysis for leak detection. A reconstruction-based contribution index is used for determining the leak zone location. The Hanoi distribution network is employed as the case study for illustrating the feasibility of the proposal. Single leaks are emulated with varying outflow magnitudes at all nodes that represent less than 2.5% of the total demand of the network and between 3% and 25% of the node’s demand. All leaks can be detected within the time interval of a day, and the average classification rate obtained is 85.28% by using only data from three pressure sensors.
- Published
- 2018
- Full Text
- View/download PDF
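The abstract of result 44 relies on principal component analysis fitted only to normal-operation data. As a hedged illustration of that general technique (not the authors' pipeline, which also includes a periodic transformation, a data vector extension, and a reconstruction-based contribution index), the sketch below flags a sample when its squared prediction error (SPE) exceeds an empirical quantile of the SPE on normal data. The function names, the use of an empirical quantile, and the default value 0.99 are assumptions.

```python
import numpy as np

def fit_pca(normal_data, n_components):
    """Fit PCA on normal-operation data only (rows = samples)."""
    mean = normal_data.mean(axis=0)
    centered = normal_data - mean
    # Rows of vt are the principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def spe(samples, mean, components):
    """Squared prediction error: squared norm of the part of each
    sample that the retained principal components cannot reconstruct."""
    centered = np.atleast_2d(samples) - mean
    reconstructed = centered @ components.T @ components
    residual = centered - reconstructed
    return np.sum(residual ** 2, axis=1)

def fit_threshold(normal_data, mean, components, quantile=0.99):
    """Detection threshold taken as an empirical quantile of the SPE
    on normal data (an assumed, simple choice of control limit)."""
    return np.quantile(spe(normal_data, mean, components), quantile)
```

In use, pressure readings whose SPE exceeds the threshold would be treated as candidate leaks; locating the leak zone requires the contribution analysis described in the abstract, which this sketch does not attempt.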
45. Matching Seqlets: An Unsupervised Approach for Locality Preserving Sequence Matching.
- Author
-
Qiu, Jiayan, Wang, Xinchao, Fua, Pascal, and Tao, Dacheng
- Subjects
- *
HUMAN behavior , *FACIAL expression , *SPEECH perception , *PATTERN matching , *TASK analysis - Abstract
In this paper, we propose a novel unsupervised approach for sequence matching by explicitly accounting for the locality properties in the sequences. In contrast to conventional approaches that rely on frame-to-frame matching, we conduct matching using sequencelet or seqlet, a sub-sequence wherein the frames share strong similarities and are thus grouped together. The optimal seqlets and matching between them are learned jointly, without any supervision from users. The learned seqlets preserve the locality information at the scale of interest and resolve the ambiguities during matching, which are omitted by frame-based matching methods. We show that our proposed approach outperforms the state-of-the-art ones on datasets of different domains including human actions, facial expressions, speech, and character strokes. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
46. R/PY-SUMMA: An R/Python Package for Unsupervised Ensemble Learning for Binary Classification Problems in Bioinformatics.
- Author
-
Ahsen, Mehmet Eren, Vogel, Robert, and Stolovitzky, Gustavo A.
- Subjects
- *
PYTHON programming language , *FORECASTING , *ALGORITHMS , *GENE regulatory networks , *SOMATIC mutation , *BIOINFORMATICS - Abstract
The increasing availability of complex data in biology and medicine has promoted the use of machine learning in classification tasks to address important problems in translational and fundamental science. Two important obstacles, however, may limit the unraveling of the full potential of machine learning in these fields: the lack of generalization of the resulting models and the limited number of labeled data sets in some applications. To address these important problems, we developed an unsupervised ensemble algorithm called strategy for unsupervised multiple method aggregation (SUMMA). By virtue of being an ensemble method, SUMMA is more robust to generalization than the predictions it combines. By virtue of being unsupervised, SUMMA does not require labeled data. SUMMA receives as input predictions from a diversity of models and estimates their classification performance even when labeled data are unavailable. It then uses these performance estimates to combine these different predictions into an ensemble model. SUMMA can be applied to a variety of binary classification problems in bioinformatics including but not limited to gene network inference, cancer diagnostics, drug response prediction, somatic mutation, and differential expression calling. In this application note, we introduce the R/PY-SUMMA packages, available in R or Python, that implement the SUMMA algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
47. Video Object Segmentation and Tracking: A Survey.
- Author
-
Yao, Rui, Lin, Guosheng, Xia, Shixiong, Zhao, Jiaqi, and Zhou, Yong
- Abstract
Object segmentation and object tracking are fundamental research areas in the computer vision community. Both topics face common challenges, such as occlusion, deformation, motion blur, and scale variation. The former must also contend with heterogeneous objects, interacting objects, edge ambiguity, and shape complexity; the latter suffers from difficulties in handling fast motion, out-of-view targets, and real-time processing. Combining the two problems as Video Object Segmentation and Tracking (VOST) can overcome their respective difficulties and improve their performance. VOST can be applied to many practical tasks such as video summarization, high definition video compression, human computer interaction, and autonomous vehicles. This survey aims to provide a comprehensive review of the state-of-the-art VOST methods, classify these methods into different categories, and identify new trends. First, we broadly categorize VOST methods into Video Object Segmentation (VOS) and Segmentation-based Object Tracking (SOT). Each category is further classified into various types based on the segmentation and tracking mechanism. Moreover, we present some representative VOS and SOT methods of each time node. Second, we provide a detailed discussion and overview of the technical characteristics of the different methods. Third, we summarize the characteristics of the related video dataset and provide a variety of evaluation metrics. Finally, we point out a set of interesting future works and draw our own conclusions. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
48. Using Unstructured Profile Information for Gender Classification of Portuguese and English Twitter Users
- Author
-
Vicente, Marco, Carvalho, Joao P., Batista, Fernando, Diniz Junqueira Barbosa, Simone, Series editor, Chen, Phoebe, Series editor, Du, Xiaoyong, Series editor, Filipe, Joaquim, Series editor, Kara, Orhun, Series editor, Kotenko, Igor, Series editor, Liu, Ting, Series editor, Sivalingam, Krishna M., Series editor, Washio, Takashi, Series editor, Sierra-Rodríguez, José-Luis, editor, Leal, José-Paulo, editor, and Simões, Alberto, editor
- Published
- 2015
- Full Text
- View/download PDF
49. UV-Visible Spectrophotometry-Based Metabolomic Analysis of Cedrela Fissilis Velozzo (Meliaceae) Calluses - A Screening Tool for Culture Medium Composition and Cell Metabolic Profiles
- Author
-
Pilatti, Fernanda Kokowicz, Costa, Christopher, Rocha, Miguel, Maraschin, Marcelo, Viana, Ana Maria, Kacprzyk, Janusz, Series editor, Overbeek, Ross, editor, Rocha, Miguel P., editor, Fdez-Riverola, Florentino, editor, and De Paz, Juan F., editor
- Published
- 2015
- Full Text
- View/download PDF
50. Analysis of Negation Cues for Semantic Orientation Classification of Reviews in Spanish
- Author
-
Galicia-Haro, Sofía N., Palomino-Garibay, Alonso, Gallegos-Acosta, Jonathan, Gelbukh, Alexander, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Pichardo Lagunas, Obdulia, editor, Herrera Alcántara, Oscar, editor, and Arroyo Figueroa, Gustavo, editor
- Published
- 2015
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library