420 results for '"Self training"'
Search Results
2. Domain Adaptation for Medical Image Segmentation Using Transformation-Invariant Self-training
- Author
-
Ghamsarian, Negin, Gamazo Tejero, Javier, Márquez-Neila, Pablo, Wolf, Sebastian, Zinkernagel, Martin, Schoeffmann, Klaus, Sznitman, Raphael, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Greenspan, Hayit, editor, Madabhushi, Anant, editor, Mousavi, Parvin, editor, Salcudean, Septimiu, editor, Duncan, James, editor, Syeda-Mahmood, Tanveer, editor, and Taylor, Russell, editor
- Published
- 2023
- Full Text
- View/download PDF
3. DeMRC: Dynamically Enhanced Multi-hop Reading Comprehension Model for Low Data
- Author
-
Tang, Xiu, Xu, Yangchao, Lu, Xuefeng, He, Qiang, Fang, Jun, Chen, Junjie, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Chen, Weitong, editor, Yao, Lina, editor, Cai, Taotao, editor, Pan, Shirui, editor, Shen, Tao, editor, and Li, Xue, editor
- Published
- 2022
- Full Text
- View/download PDF
4. Self-Training-Transductive-Learning Broad Learning System (STTL-BLS): A model for effective and efficient image classification.
- Author
-
Yi, Lin, Lv, Di, Liu, Dinghao, Li, Suhuan, and Liu, Ran
- Subjects
-
CONVOLUTIONAL neural networks, IMAGE recognition (Computer vision), FEATURE extraction, TIME complexity, SOURCE code
- Abstract
A novel model called Self-Training-Transductive-Learning Broad Learning System (STTL-BLS) is proposed for image classification. The model consists of two key blocks: the Feature Block (FB) and the Enhancement Block (EB). The FB utilizes the Proportion of Large Values Attention (PLVA) technique and an Encoder for feature extraction. Multiple FBs are cascaded in the model to learn discriminative features. The EB enhances feature learning and prevents under-fitting on complex datasets. Additionally, an architecture that combines characteristics of the Broad Learning System (BLS) and gradient descent is designed for STTL-BLS, enabling the model to leverage the advantages of both BLS and Convolutional Neural Networks (CNNs). Moreover, a training algorithm (STTL) that combines self-training and transductive learning is presented to improve the model's generalization ability. Experimental results demonstrate that the accuracy of the proposed model surpasses all compared BLS variants and is comparable or even superior to deep networks: on small-scale datasets, STTL-BLS has an average accuracy improvement of 14.82 percentage points over the other models; on large-scale datasets, 12.95 percentage points. Notably, the proposed model exhibits low time complexity, with the shortest testing time on the small-scale datasets among all compared models: its average testing time is 46.4 s less than that of the other models. It proves to be a valuable additional solution for image classification tasks on both small- and large-scale datasets. The source code for this paper can be accessed at https://github.com/threedteam/sttl_bls. • Proportion of Large Values Attention (PLVA) is introduced for feature extraction. • The Enhancement Block is introduced for feature learning and to prevent under-fitting. • The designed architecture has the characteristics of both BLS and gradient descent. • Self-training and transductive learning are combined in the training algorithm. [ABSTRACT FROM AUTHOR]
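As an illustration of the STTL idea only: a minimal transductive self-training loop in which confident predictions on the unlabeled test pool are folded back into training. A generic scikit-learn classifier stands in for the paper's FB/EB architecture; the function name and threshold are assumptions, not the authors' code.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def transductive_self_training(X_train, y_train, X_test, rounds=5, thresh=0.95):
        clf = LogisticRegression(max_iter=1000)
        X, y = X_train, y_train
        for _ in range(rounds):
            clf.fit(X, y)
            proba = clf.predict_proba(X_test)
            keep = proba.max(axis=1) >= thresh          # confident test predictions
            if not keep.any():
                break
            # transductive step: the test samples themselves join the training set
            X = np.vstack([X_train, X_test[keep]])
            y = np.concatenate([y_train, clf.classes_[proba[keep].argmax(axis=1)]])
        return clf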
- Published
- 2024
- Full Text
- View/download PDF
5. Adapting OCR with Limited Supervision
- Author
-
Das, Deepayan, Jawahar, C. V., Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Bai, Xiang, editor, Karatzas, Dimosthenis, editor, and Lopresti, Daniel, editor
- Published
- 2020
- Full Text
- View/download PDF
6. Training deep neural networks with noisy clinical labels: toward accurate detection of prostate cancer in US data.
- Author
-
Javadi, Golara, Samadi, Samareh, Bayat, Sharareh, Sojoudi, Samira, Hurtado, Antonio, Eshumani, Walid, Chang, Silvia, Black, Peter, Mousavi, Parvin, and Abolmaesumi, Purang
- Abstract
Purpose: Ultrasound is the standard-of-care to guide the systematic biopsy of the prostate. During the biopsy procedure, up to 12 biopsy cores are randomly sampled from six zones within the prostate, where the histopathology of those cores is used to determine the presence and grade of the cancer. Histopathology reports only provide statistical information on the presence of cancer and do not normally contain fine-grain information of cancer distribution within each core. This limitation hinders the development of machine learning models to detect the presence of cancer in ultrasound so that biopsy can be more targeted to highly suspicious prostate regions. Methods: In this paper, we tackle this challenge in the form of training with noisy labels derived from histopathology. Noisy labels often result in the model overfitting to the training data, hence limiting its generalizability. To avoid overfitting, we focus on the generalization of the features of the model and present an iterative data label refinement algorithm to amend the labels gradually. We simultaneously train two classifiers, with the same structure, and automatically stop the training when we observe any sign of overfitting. Then, we use a confident learning approach to clean the data labels and continue with the training. This process is iteratively applied to the training data and labels until convergence. Results: We illustrate the performance of the proposed method by classifying prostate cancer using a dataset of ultrasound images from 353 biopsy cores obtained from 90 patients. We achieve area under the curve, sensitivity, specificity, and accuracy of 0.73, 0.80, 0.63, and 0.69, respectively. Conclusion: Our approach is able to provide clinicians with a visualization of regions that likely contain cancerous tissue to obtain more accurate biopsy samples. The results demonstrate that our proposed method produces superior accuracy compared to the state-of-the-art methods. [ABSTRACT FROM AUTHOR]
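A hedged sketch of this style of label refinement (not the authors' implementation: out-of-fold probabilities stand in for their two-classifier scheme with overfitting detection, and integer labels 0..K-1 are assumed):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_predict

    def refine_labels(X, y, n_iter=3, margin=0.30):
        y = y.copy()                        # noisy histopathology-derived labels
        for _ in range(n_iter):
            clf = RandomForestClassifier(n_estimators=200, random_state=0)
            proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")
            given = proba[np.arange(len(y)), y]            # prob. of the current label
            suspect = proba.max(axis=1) - given > margin   # confidently contradicted
            if not suspect.any():
                break
            y[suspect] = proba[suspect].argmax(axis=1)     # amend labels gradually
        clf.fit(X, y)
        return clf, y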
- Published
- 2022
- Full Text
- View/download PDF
7. Revisiting instance search: A new benchmark using cycle self-training.
- Author
-
Zhang, Yuqi, Liu, Chong, Chen, Weihua, Xu, Xianzhe, Wang, Fan, Li, Hao, Hu, Shiyu, and Zhao, Xin
- Subjects
-
GENERALIZATION
- Abstract
Instance search aims at retrieving a particular object instance from a set of scene images. Although studied in previous competitions like TRECVID, there has been limited literature and few datasets on this topic. In this paper, to overcome the generalization issue when arbitrary categories are involved in the search and to benefit from the large amount of unlabeled data, we propose a cycle self-training framework that trains the instance search pipeline with automatic supervision. Given the two-stage pipeline with a localization and a ranking module, the cycle self-training includes a ranker-guided localizer and a localizer-guided ranker, each carefully designed to handle the noisy labels that come with self-supervision. Furthermore, we build and release large-scale ground-truth annotations for instances to facilitate algorithm evaluation and analysis in this research topic, especially for small objects in complex backgrounds. The datasets are publicly available at https://github.com/instance-search/instance-search. Extensive experiments show the effectiveness of the proposed cycle self-training framework and its superior performance compared with other state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
8. G2L: A Global to Local Alignment Method for Unsupervised Domain Adaptive Semantic Segmentation.
- Author
-
Manh, Nguyen Viet, Nam, Kieu Dang, Sang, Dinh Viet, and Nguyen, Thi-Oanh
- Subjects
IMAGE registration, KNOWLEDGE transfer, FOURIER transforms, CONFIDENCE
- Abstract
Unsupervised domain adaptation (UDA) for semantic segmentation aims to transfer knowledge from a source dataset with dense pixel-level annotations to an unlabeled target dataset. However, the performance of UDA methods often suffers from domain shift, the discrepancy between the feature distributions of the two domains. Several attempts have been made to marginally align these distributions at the image level. However, due to the so-called category-level domain shift, such global alignments do not guarantee good separability of the deep features extracted from different categories in the target domain. As a result, the generated pseudo-labels can be noisy and thus poison the learning process on the target domain. Some recent methods focus on denoising the pseudo-labels online using category-wise information. This paper introduces a novel UDA method called Global-to-Local alignment (G2L) that leverages fine-grained adversarial training and a newly proposed chromatic Fourier transform to address the image-level domain shift from a global perspective. Next, our method deals with the category-level domain shift from a local view. Specifically, we propose a long-tail category rating strategy and apply dynamic confidence thresholds and category-wise priority weights when generating and denoising the pseudo-labels, so as to favor rare categories. Finally, self-distillation is used to boost the final segmentation results. Experiments on the popular benchmarks GTA5 → Cityscapes and SYNTHIA → Cityscapes show that our method yields superior accuracy compared with other state-of-the-art methods. [ABSTRACT FROM AUTHOR]
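A sketch of what per-category dynamic confidence thresholding can look like (our simplified reading of the idea; the paper's long-tail rating strategy and priority weights are richer than this):

    import numpy as np

    def pseudo_labels_dynamic(proba, base_thresh=0.9, rare_discount=0.2):
        """proba: (N, C) softmax outputs for target-domain samples/pixels."""
        pred = proba.argmax(axis=1)
        conf = proba.max(axis=1)
        freq = np.bincount(pred, minlength=proba.shape[1]) / len(pred)
        # rarer predicted categories get a lower threshold so they are not drowned out
        thresh = base_thresh - rare_discount * (1.0 - freq / (freq.max() + 1e-12))
        keep = conf >= thresh[pred]
        return np.where(keep, pred, -1)   # -1 marks samples left unlabeled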
- Published
- 2022
- Full Text
- View/download PDF
9. Entropy-aware self-training for graph convolutional networks.
- Author
-
Zhao, Gongpei, Wang, Tao, Li, Yidong, Jin, Yi, and Lang, Congyan
- Subjects
-
BOOSTING algorithms, RANDOM walks, ALGORITHMS
- Abstract
• An entropy-aggregation layer is proposed to strengthen the reasoning ability of GCNs. • An ingenious checking part based on self-training enhances node classification. • Sufficient experiments and analyses validate the superiority of ES-GCN. Recently, graph convolutional networks (GCNs) have achieved significant success in many graph-based learning tasks, especially node classification, due to their excellent ability in representation learning. Nevertheless, it remains challenging for GCN models to obtain satisfying predictions on graphs where only a few nodes have known labels. In this paper, we propose a novel entropy-aware self-training algorithm to boost semi-supervised node classification on graphs with little supervised information. Firstly, an entropy-aggregation layer is developed to strengthen the reasoning ability of GCN models. To the best of our knowledge, this is the first work to combine entropy-based random walk theory with GCN design. Furthermore, we propose an ingenious checking part that adds new nodes as supervision after each training round to enhance node prediction. In particular, the checking part is designed based on aggregated features, which is demonstrated to be more effective than previous methods and boosts node classification significantly. The proposed algorithm is validated on six public benchmarks in comparison with several state-of-the-art baseline algorithms, and the results illustrate its excellent performance. [ABSTRACT FROM AUTHOR]
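A generic sketch of an entropy-based checking step (our illustration, not the paper's exact design, which operates on aggregated features): unlabeled nodes whose predictions have the lowest entropy are added as new supervision after each round.

    import numpy as np

    def select_confident_nodes(proba, unlabeled_idx, k_per_class=10):
        ent = -(proba * np.log(proba + 1e-12)).sum(axis=1)   # prediction entropy
        pred = proba.argmax(axis=1)
        chosen = []
        for c in range(proba.shape[1]):
            cand = sorted((i for i in unlabeled_idx if pred[i] == c),
                          key=lambda i: ent[i])              # lowest entropy first
            chosen.extend(cand[:k_per_class])
        return chosen   # add these nodes with pseudo-label pred[i] for the next round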
- Published
- 2021
- Full Text
- View/download PDF
10. Improving Machine Reading Comprehension with Multi-Task Learning and Self-Training
- Author
-
Jianquan Ouyang and Mengen Fu
- Subjects
machine reading comprehension, Natural Language Processing, multi-task learning, Self Training, pre-trained model, Mathematics, QA1-939
- Abstract
Machine Reading Comprehension (MRC) is an AI challenge that requires machines to determine the correct answer to a question based on a given passage. Extractive MRC requires extracting an answer span for a question from a given passage, as in span-extraction tasks; in contrast, non-extractive MRC infers answers from the content of reference passages, covering Yes/No question answering and unanswerable questions. Due to the specificity of the two types of MRC tasks, researchers usually work on one type of task separately, but real-life applications often require models that can handle many different types of tasks in parallel. Therefore, to meet the comprehensive requirements of such applications, we construct a multi-task fusion training reading comprehension model based on the pre-trained BERT model. The model uses BERT to obtain contextual representations, which are then shared by three downstream sub-modules for span extraction, Yes/No question answering, and unanswerable questions. Next, we fuse the outputs of the three sub-modules into a new span-extraction output and use the fused cross-entropy loss function for global training. In the training phase, since our model requires a large amount of labeled training data, which is often expensive to obtain or unavailable in many tasks, we additionally use self-training to generate pseudo-labeled training data, improving the model's accuracy and generalization performance. We evaluated the model on the SQuAD 2.0 and CAIL2019 datasets. The experiments show that our model can efficiently handle different tasks. We achieved 83.2 EM and 86.7 F1 on the SQuAD 2.0 dataset and 73.0 EM and 85.3 F1 on the CAIL2019 dataset.
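A hedged PyTorch sketch of the shared-encoder, three-head design with a fused cross-entropy loss (the head names and loss weighting are our assumptions; in the paper the sequence representations come from BERT):

    import torch
    import torch.nn as nn

    class MultiTaskMRC(nn.Module):
        def __init__(self, hidden=768):
            super().__init__()
            self.span_head = nn.Linear(hidden, 2)      # start/end logits per token
            self.yesno_head = nn.Linear(hidden, 3)     # yes / no / span-type question
            self.answerable_head = nn.Linear(hidden, 2)

        def forward(self, seq_repr):                   # seq_repr: (B, T, H) from BERT
            start_end = self.span_head(seq_repr)       # (B, T, 2)
            cls = seq_repr[:, 0]                       # [CLS]-position vector
            return start_end, self.yesno_head(cls), self.answerable_head(cls)

    def fused_loss(start_end, yn_logits, ans_logits, starts, ends, yn, ans):
        ce = nn.CrossEntropyLoss()
        span = ce(start_end[..., 0], starts) + ce(start_end[..., 1], ends)
        return span + ce(yn_logits, yn) + ce(ans_logits, ans)   # global training signal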
- Published
- 2022
- Full Text
- View/download PDF
11. Few-Shot Learning and Self-Training for eNodeB Log Analysis for Service-Level Assurance in LTE Networks.
- Author
-
Aoki, Shogo, Shiomoto, Kohei, and Eng, Chin Lam
- Abstract
With increasing network topology complexity and the continuous evolution of new wireless technology, it is challenging to address network service outages with traditional methods. In long-term evolution (LTE) networks, a large number of base stations called eNodeBs are deployed to cover entire service areas spanning various kinds of geographical regions. Each eNodeB generates a large number of key performance indicators (KPIs). Hundreds of thousands of eNodeBs are typically deployed to cover a nation-wide service area, so operators need to handle hundreds of millions of KPIs. It is impractical to handle such a huge amount of KPI data manually, and automation of data processing is therefore desired. To improve network operation efficiency, a suitable machine learning technique is used to learn and classify individual eNodeBs into different states based on multiple performance metrics during a specific time window. However, supervised learning requires a large labeled dataset, and annotating the data costs considerable human labor and time. To mitigate these cost and time issues, we propose a method based on few-shot learning that uses the Prototypical Networks algorithm to complement the eNodeB state analysis. Using a dataset from a live LTE network consisting of thousands of eNodeBs, our experimental results show that the proposed technique provides high performance while using a small number of labeled samples. [ABSTRACT FROM AUTHOR]
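For reference, the core Prototypical Networks step is compact enough to sketch (a minimal numpy version, assuming each eNodeB time window has already been embedded as a feature vector):

    import numpy as np

    def prototypes(support_x, support_y):
        """Mean embedding per class from the few labeled examples."""
        classes = np.unique(support_y)
        return classes, np.stack([support_x[support_y == c].mean(axis=0)
                                  for c in classes])

    def classify(query_x, classes, protos):
        # squared Euclidean distance to each class prototype
        d = ((query_x[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2)
        return classes[d.argmin(axis=1)]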
- Published
- 2020
- Full Text
- View/download PDF
12. Harnessing the Power of Self-Training for Gaze Point Estimation in Dual Camera Transportation Datasets
- Author
-
Bhagat, Hirva Alpesh
- Abstract
This thesis proposes a novel approach for efficiently estimating gaze points in dual camera transportation datasets. Traditional methods for gaze point estimation are dependent on large amounts of labeled data, which can be both expensive and time-consuming to collect. Additionally, alignment and calibration of the two camera views present significant challenges. To overcome these limitations, this thesis investigates the use of self-learning techniques such as semi-supervised learning and self-training, which can reduce the need for labeled data while maintaining high accuracy. The proposed method is evaluated on the DGAZE dataset and achieves a 57.2% improvement in performance compared to the previous methods. This approach can prove to be a valuable tool for studying visual attention in transportation research, leading to more cost-effective and efficient research in this field.
- Published
- 2023
13. A Self-Adaptive Temporal-Spatial Self-Training Algorithm for Semisupervised Fault Diagnosis of Industrial Processes
- Author
-
Jinsong Zhao and Shaodong Zheng
- Subjects
Measure (data warehouse), Process (engineering), Property (programming), Computer science, Self adaptive, Work in process, Fault (power engineering), Computer Science Applications, Control and Systems Engineering, Benchmark (computing), Electrical and Electronic Engineering, Self training, Algorithm, Information Systems
- Abstract
Investigating process monitoring techniques is required to reduce the loss of property and life caused by industrial process accidents. Fault diagnosis, which attempts to determine the fault type, is a vital step in process monitoring because it helps operators respond to abnormal situations appropriately. Adequate data labels for training supervised fault diagnosis models are difficult to acquire in practice; however, semi-supervised methods, which are attracting increasing attention, can use unlabeled data. Self-labeled algorithms are an effective paradigm of semi-supervised methods, but their applications in industrial process fault diagnosis do not meet expectations, because they are prone to performance deterioration when handling industrial process data. To address this issue, a self-training algorithm with a modified confidence measure is proposed. The confidence measure is temporal-spatial: the temporal identities of the data are introduced into its definition and calculation, which makes the algorithm adaptable to industrial processes. The proposed algorithm is also self-adaptive, avoiding time-consuming hyper-parameter tuning. The benchmark Tennessee Eastman process data were used to evaluate the proposed algorithm, and the experimental results demonstrate its superiority over competing semi-supervised methods.
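One plausible shape for a temporal confidence measure (an assumption on our part; the paper's definition differs in detail): damp a sample's classifier confidence unless its temporal neighbors agree with it.

    import numpy as np

    def temporal_confidence(proba, window=5):
        """proba: (T, C) classifier outputs for consecutive process samples."""
        pred = proba.argmax(axis=1)
        conf = np.empty(len(pred))
        half = window // 2
        for t in range(len(pred)):
            lo, hi = max(0, t - half), min(len(pred), t + half + 1)
            agree = (pred[lo:hi] == pred[t]).mean()    # temporal agreement ratio
            conf[t] = agree * proba[t, pred[t]]        # lone spikes score low
        return conf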
- Published
- 2022
- Full Text
- View/download PDF
14. Effective training duration and frequency for lip-seal training in older people using a self-training instrument
- Author
-
Midori Ohta, Takeshi Oki, Kaoru Sakurai, Takayuki Ueda, and Tomofumi Takano
- Subjects
education, Swallowing, Medicine, Humans, Muscle Strength, General Dentistry, Aged, Cross-Over Studies, Training (meteorology), Continuous training, Crossover study, Lip, Test (assessment), Deglutition, Duration (music), Physical therapy, Female, Geriatrics and Gerontology, Older people, Self training
- Abstract
Objective: To determine the effects of training duration and frequency on lip-seal strength (LSS) in older people. Background: Lip-seal is important for speaking, eating and swallowing. LSS decreases after training ends; therefore, continuous training is essential. Materials and methods: Participants underwent resistance training of LSS. Regarding training duration, eight women aged ≥65 years participated in a crossover study with training A (direction: 1, duration: 50 seconds) and training B (directions: 3, duration: 3 minutes), performed daily for 4 weeks. Regarding training frequency, 40 women aged ≥65 years were divided into four groups based on frequency (every-day, every-other-day, once-a-week and control groups), and all groups except the control group performed training B for 4 weeks. LSS was measured at weeks 0, 2 and 4 using a digital strain gauge. Friedman's test was used, followed by the Steel-Dwass test (α = 0.05). Results: Regarding the effects of training duration, significant differences in LSS were noted between weeks 0 and 4 for training B, but no difference was noted for training A. Regarding training frequency, significant differences were observed between weeks 0 and 2 or 4 in the every-day and once-a-week groups. Significant differences were observed in the every-other-day group between weeks 0 and 4, and no difference in the control group. For all groups, median LSS was higher in week 2 or 4 than in week 0. Conclusion: Lip-seal training for 3 minutes per session every day, every other day or once a week for 4 weeks increased the LSS of older people.
- Published
- 2021
15. A novel semi-supervised self-training method based on resampling for Twitter fake account identification
- Author
-
Shouqiang Sun, Ziming Zeng, Jie Yin, Jingjing Sun, and Tingting Li
- Subjects
Computer science, Process (computing), Semi-supervised learning, Library and Information Sciences, Machine learning, Data set, Identification (information), Resampling, Classifier (linguistics), Labeled data, Artificial intelligence, Self training, Information Systems
- Abstract
Purpose: Twitter fake accounts refer to bot accounts created by third-party organizations to influence public opinion, spread commercial propaganda or impersonate others. Effective identification of bot accounts helps the public accurately judge the information being disseminated. However, in practical fake-account identification, manually labeling Twitter accounts is expensive and inefficient, and the labeled data are usually class-imbalanced. To this end, the authors propose a novel framework to solve these problems. Design/methodology/approach: In the proposed framework, the authors introduce the concept of semi-supervised self-training learning and apply it to a real Twitter account dataset from Kaggle. Specifically, the authors first train the classifier on the initial small amount of labeled account data, then use the trained classifier to automatically label large-scale unlabeled account data. Next, high-confidence instances are iteratively selected from the unlabeled data to expand the labeled data. Finally, an expanded Twitter account training set is obtained. It is worth mentioning that the resampling technique is integrated into the self-training process, and the data classes are balanced at the initial stage of each self-training iteration. Findings: The proposed framework effectively improves labeling efficiency and reduces the influence of class imbalance. It shows excellent identification results with 6 different base classifiers, especially for the initial small-scale labeled Twitter accounts. Originality/value: This paper provides novel insights into identifying Twitter fake accounts. First, the authors are the first to introduce a self-training method for automatically labeling Twitter accounts in a semi-supervised setting. Second, the resampling technique is integrated into the self-training process to effectively reduce the influence of class imbalance on identification.
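A hedged sketch of this kind of loop, with plain random oversampling standing in for whichever resampling technique the authors used (classifier choice and threshold are ours):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def balance(X, y, rng):
        classes, counts = np.unique(y, return_counts=True)
        idx = np.concatenate([rng.choice(np.where(y == c)[0], counts.max(), replace=True)
                              for c in classes])
        return X[idx], y[idx]

    def self_train(X_lab, y_lab, X_unlab, rounds=5, thresh=0.9, seed=0):
        rng = np.random.default_rng(seed)
        pool = np.ones(len(X_unlab), dtype=bool)     # unlabeled accounts still available
        clf = LogisticRegression(max_iter=1000)
        for _ in range(rounds):
            Xb, yb = balance(X_lab, y_lab, rng)      # rebalance classes every round
            clf.fit(Xb, yb)
            if not pool.any():
                break
            proba = clf.predict_proba(X_unlab[pool])
            take = proba.max(axis=1) >= thresh       # high-confidence instances only
            if not take.any():
                break
            ids = np.where(pool)[0][take]
            X_lab = np.vstack([X_lab, X_unlab[ids]])
            y_lab = np.concatenate([y_lab, clf.classes_[proba[take].argmax(axis=1)]])
            pool[ids] = False
        return clf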
- Published
- 2021
- Full Text
- View/download PDF
16. Entropy-aware self-training for graph convolutional networks
- Author
-
Tao Wang, Congyan Lang, Yi Jin, Yidong Li, and Gongpei Zhao
- Subjects
Theoretical computer science, Artificial Intelligence, Computer science, Cognitive Neuroscience, Node (networking), Entropy (information theory), Layer (object-oriented design), Random walk, Self training, Feature learning, Graph, Computer Science Applications
- Abstract
Recently, graph convolutional networks (GCNs) have achieved significant success in many graph-based learning tasks, especially node classification, due to their excellent ability in representation learning. Nevertheless, it remains challenging for GCN models to obtain satisfying predictions on graphs where only a few nodes have known labels. In this paper, we propose a novel entropy-aware self-training algorithm to boost semi-supervised node classification on graphs with little supervised information. Firstly, an entropy-aggregation layer is developed to strengthen the reasoning ability of GCN models. To the best of our knowledge, this is the first work to combine entropy-based random walk theory with GCN design. Furthermore, we propose an ingenious checking part that adds new nodes as supervision after each training round to enhance node prediction. In particular, the checking part is designed based on aggregated features, which is demonstrated to be more effective than previous methods and boosts node classification significantly. The proposed algorithm is validated on six public benchmarks in comparison with several state-of-the-art baseline algorithms, and the results illustrate its excellent performance.
- Published
- 2021
- Full Text
- View/download PDF
17. Semi-Supervised Self-Training of Hate and Offensive Speech from Social Media
- Author
-
Samira Sadaoui and Safa Alsafari
- Subjects
ComputingMethodologies_PATTERNRECOGNITION, Artificial Intelligence, Computer science, Applied psychology, Offensive, Social media, Self training
- Abstract
Improving Offensive and Hate Speech (OHS) classifiers' performance requires a large, confidently labeled textual training dataset. Our study devises a semi-supervised classification approach with ...
- Published
- 2021
- Full Text
- View/download PDF
18. Extracting Relations from Italian Wikipedia Using Self-Training
- Author
-
Siciliani, Lucia, Cassotti, Pierluigi, Basile, Pierpaolo, De Gemmis, Marco, Lops, Pasquale, and Semeraro, Giovanni
- Subjects
Computational Linguistics, Italian Wikipedia, open information extraction, Self training, Linguistics, Wikipedia, Language
- Abstract
This dataset contains relations extracted from the Italian Wikipedia by the WikiOIE framework. WikiOIE is based on UDPipe and the Universal Dependencies project for text processing. It easily allows customizing the information extraction (IE) approach to automatically extract triples (subject, predicate, object). This dataset contains relations extracted by a supervised approach based on self-training. The output of the extraction process is provided in JSON format. Version 2 of the dataset was extracted using an improved version of the learning algorithm. The files of version 2 are identified by the suffix "_reg" in the file name. More information and the Java code are available here: https://github.com/pippokill/WikiOIE Self-training approach: Lucia Siciliani, Pierluigi Cassotti, Pierpaolo Basile, Marco de Gemmis, Pasquale Lops, and Giovanni Semeraro. 2021. Extracting Relations from Italian Wikipedia using Self-Training. In Proceedings of the Eighth Italian Conference on Computational Linguistics (CLiC-it 2021). CEUR-WS. WikiOIE framework: Pierluigi Cassotti, Lucia Siciliani, Pierpaolo Basile, Marco de Gemmis, and Pasquale Lops. 2021. Extracting relations from Italian Wikipedia using unsupervised information extraction. In Proceedings of the 11th Italian Information Retrieval Workshop 2021 (IIR 2021). CEUR-WS.
- Published
- 2022
19. A Surgical Training Simulator for Quantitative Assessment of the Anastomotic Technique of Coronary Artery Bypass Grafting
- Author
-
Park, Y., Shinke, M., Kanemitsu, N., Yagi, T., Azuma, T., Shiraishi, Y., Kormos, R., Umezu, M., Magjarevic, R., editor, Nagel, J. H., editor, Lim, Chwee Teck, editor, and Goh, James C. H., editor
- Published
- 2009
- Full Text
- View/download PDF
20. A semi-supervised learning method for hyperspectral imagery based on self-training and local-based affinity propagation
- Author
-
Liguo Wang, Wenlong Zhu, Haizhu Pan, Cheng Li, Yanping Teng, Yanzhong Liu, and Haimiao Ge
- Subjects
Computer science, Hyperspectral imaging, Pattern recognition, Semi-supervised learning, Remote sensing (archaeology), General Earth and Planetary Sciences, Affinity propagation, Artificial intelligence, Self training
- Abstract
In hyperspectral remote sensing, the classification of hyperspectral imagery is an important issue of concern. However, obtaining sufficient labelled samples for the classification is hard work and...
- Published
- 2021
- Full Text
- View/download PDF
21. Graph Convolutional Network-based Model for Incident-related Congestion Prediction: A Case Study of Shanghai Expressways
- Author
-
Hui Li, Weishan Sun, Wenbin Wang, Xi Wang, and Yibo Chai
- Subjects
General Computer Science, Computer science, Management Information Systems, Transport engineering, Congestion prediction, Megacity, Traffic congestion, Obstacle, Graph (abstract data type), China, Self training
- Abstract
Traffic congestion has become a significant obstacle to the development of megacities in China. Although local governments have devoted many resources to constructing road infrastructure, it is still insufficient for the increasing traffic demand. As a first step toward optimizing real-time traffic control, this study uses the Shanghai Expressways as a case study to predict incident-related congestion. Our study proposes a graph convolutional network-based model to identify correlations in multi-dimensional sensor-detected data, while simultaneously taking into account environmental, spatiotemporal, and network features in predicting traffic conditions immediately after a traffic incident. The average accuracy, average AUC, and average F1 score of the predictive model are 92.78%, 95.98%, and 88.78%, respectively, on small-scale ground-truth data. Furthermore, we improve the predictive model's performance using semi-supervised learning by including more unlabeled data instances. As a result, the accuracy, AUC, and F1 score of the model increase by 2.69%, 1.25%, and 4.72%, respectively. The findings of this article have important implications for improving the management and development of expressways in Shanghai, as well as in other metropolitan areas in China.
- Published
- 2021
- Full Text
- View/download PDF
22. Attitudes toward overtime work and self‐training: A survey on obstetricians and gynecologists in Japan
- Author
-
Michinori Mayama, Chisato Kodera, Takayuki Enomoto, Michiko Kido, Tokumasa Suemitsu, Takuma Ohsuga, Masayuki Sekine, Yosuke Sugita, Takeshi Umazume, Kazutoshi Nakano, Yuto Maeda, Satoshi Nakagawa, Koji Nishijima, Takashi Murakami, Hidemichi Watari, Yukio Suzuki, Ayako Shibata, Makio Shozu, Nobuya Unno, Yohei Onodera, Jumpei Ogura, and Hiroaki Komatsu
- Subjects
Generation gap, education, Obstetrics and Gynecology, Questionnaire, Overtime work, Obstetrics, Attitude, Japan, Obstetrics and gynaecology, Gynecology, Surveys and Questionnaires, Family medicine, Humans, Medicine, Christian ministry, Quality (business), Self training, Welfare
- Abstract
Aim: The Ministry of Health, Labour and Welfare of Japan proposed a regulation of overtime work as part of its work-style reform. However, the regulation may deteriorate the quality of medical services due to the reduction in training time. Thus, this study aimed to reveal generation gaps in views on self-training and overtime work among members of the Japan Society of Obstetrics and Gynecology (JSOG). Methods: A web-based, self-administered questionnaire survey was conducted among members of the JSOG. In total, 1256 respondents were included in the analysis. Data were collected on age, sex, experience as a medical doctor, location of workplace, work style, type of main workplace, and number of full-time doctors in the main workplace. The study examined the respondents' attitudes toward overtime work and self-training. The respondents were categorized based on experience as a medical doctor. Results: By years of experience, 112 (8.9%), 226 (18.0%), 383 (30.5%), and 535 (42.6%) doctors had been working for ≤5, 6-10, 11-19, and ≥20 years, respectively. Although 54.5% of doctors with ≤5 years of experience expected the regulation on working hours to improve the quality of medical services, those with ≥20 years of experience expected potential deterioration. After adjusting for covariates, more years of experience were significantly associated with the expectation of deterioration in the quality of medical services. Conclusions: The study revealed a generation gap in views about self-training and overtime work among obstetricians and gynecologists in Japan.
- Published
- 2021
- Full Text
- View/download PDF
23. Perspective of Making Self-training Habit from Psychological Consideration and Practice
- Author
-
Hiroshi Bando, Akito Moriyasu, Hiroya Hanabusa, Mitsuru Murakami, and Makoto Takasugi
- Subjects
Protocol (science), Self-efficacy, Motivation, Rehabilitation, Perspective (graphical), Applied psychology, General Medicine, Sport psychology, Task (project management), Push-up, Self-training, Habit, Psychology, Self training
- Abstract
The authors and collaborators have continued clinical practice and research on rehabilitation and self-training, in which various problems have been found. Protocol: The author himself tried a home push-up self-training exercise for 2 months, which was successfully completed. Results: Positive changes were an increase from 94 to 96.5 cm in chest circumference and from 45 to 100 repetitions in continuous push-ups. Discussion: From the viewpoint of sport psychology, a close relationship among motivation, self-efficacy and performance has been observed. Self-efficacy can influence one's beliefs concerning accomplishing and continuing tasks, activities and effort. This report will hopefully become a reference for future practice and research development.
- Published
- 2021
- Full Text
- View/download PDF
24. An Effective Tumor Classification With Deep Forest and Self-Training
- Author
-
Lili Shen, Xiaojun Sun, and Zhanbo Chen
- Subjects
semi-supervised learning, Gene expression omnibus, General Computer Science, Process (engineering), Computer science, Tumor classification, Supervised learning, General Engineering, Sample (statistics), Machine learning, Field (computer science), TK1-9971, Random forest, ComputingMethodologies_PATTERNRECOGNITION, self-training, Robustness (computer science), deep forest, General Materials Science, Electrical engineering. Electronics. Nuclear engineering, Artificial intelligence, Self training
- Abstract
In recent years, tumor classification based on the Gene Expression Omnibus has attracted continuous attention in the field of bioinformatics. Integrating machine learning techniques is an efficient way to solve such problems. Generally, in order to obtain good performance in supervised learning tasks, a large number of labelled samples is required. However, in many cases, only a few labelled samples and abundant unlabelled samples exist in the training database, and labelling these unlabelled samples manually is difficult and expensive. Therefore, semi-supervised learning approaches have been proposed that utilize unlabelled samples to improve the performance of a model. However, noisy samples decrease the robustness of models in semi-supervised learning. We want a training style in which samples are used for training in order from high to low confidence; self-training meets this requirement, and the deep forest approach with the hyper-parameter settings used in this work obtains good accuracy. Therefore, in this paper, we present a novel semi-supervised learning approach with a deep forest model to increase the performance of tumor classification while employing unlabelled samples and minimizing cost; that is, an updated unlabelled-sample mechanism is investigated to expand the number of high-confidence pseudo-labelled samples. Multiple real-world experiments indicate that our proposed approach can obtain up to 0.96 accuracy and F1-score, and an AUC of 0.9798.
- Published
- 2021
- Full Text
- View/download PDF
25. PRPS-ST: A Protocol-Agnostic Self-training Method for Gene Expression–Based Classification of Blood Cancers
- Author
-
Christopher Rushton, Ryan D. Morin, Bruno M. Grande, David W. Scott, Aixiang Jiang, Jeffrey Tang, and Laura K. Hilton
- Subjects
Protocol (science), Computer science, Gene Expression, General Medicine, Computational biology, Data type, Article, Blood cancer, Class imbalance, Binary classification, Hematologic Neoplasms, Neoplasms, Gene expression, Humans, Enhanced sensitivity, Self training
- Abstract
Gene expression classifiers are gaining increasing popularity for stratifying tumors into subgroups with distinct biological features. A fundamental limitation shared by current classifiers is the requirement for comparable training and testing datasets. Here, we describe a self-training implementation of our probability ratio-based classification prediction score method (PRPS-ST), which facilitates the porting of existing classification models to other gene expression datasets. In comparison with gold standards, we demonstrate favorable performance of PRPS-ST in gene expression–based classification of diffuse large B-cell lymphoma (DLBCL) and B-lineage acute lymphoblastic leukemia (B-ALL) using a diverse variety of gene expression data types and preprocessing methods, including in classifications with a high degree of class imbalance. Tumors classified by our method were significantly enriched for prototypical genetic features of their respective subgroups. Interestingly, this included cases that were unclassifiable by established methods, implying the potentially enhanced sensitivity of PRPS-ST. Significance: The adoption of binary classifiers such as cell of origin (COO) has been thwarted, in part, by the challenges imposed by batch effects and the continual evolution of gene expression technologies. PRPS-ST resolves this by enabling classifiers to be ported across platforms while retaining high accuracy.
- Published
- 2020
- Full Text
- View/download PDF
26. A self-training hierarchical prototype-based approach for semi-supervised classification
- Author
-
Xiaowei Gu
- Subjects
Structure (mathematical logic), Information Systems and Management, Computer science, Process (engineering), Machine learning, Computer Science Applications, Theoretical Computer Science, Artificial Intelligence, Control and Systems Engineering, Benchmark (computing), Key (cryptography), Artificial intelligence, Self training, Software
- Abstract
This paper introduces a novel self-training hierarchical prototype-based approach for semi-supervised classification. The proposed approach firstly identifies meaningful prototypes from labelled samples at multiple levels of granularity and, then, self-organizes a highly transparent, multi-layered recognition model by arranging them in a form of pyramidal hierarchies. After this, the learning model continues to self-evolve its structure and self-expand its knowledge base to incorporate new patterns recognized from unlabelled samples by exploiting the pseudo-label technique. Thanks to its prototype-based nature, the overall computational process of the proposed approach is highly explainable and traceable. Experimental studies with various benchmark image recognition problems demonstrate the state-of-the-art performance of the proposed approach, showing its strong capability to mine key information from unlabelled data for classification.
- Published
- 2020
- Full Text
- View/download PDF
27. A Prediction Approach Based on Self-Training and Deep Learning for Biological Data
- Author
-
Mohamed Lamine Berkane, Mahmoud Boufaida, and Mohamed Nadjib Boufenara
- Subjects
Biological data, ComputingMethodologies_PATTERNRECOGNITION, Computer science, Deep learning, Artificial intelligence, Machine learning, Self training
- Abstract
With the exponential growth of biological data, labeling this kind of data becomes difficult and costly. Although unlabeled data are comparatively more plentiful than labeled ones, most supervised learning methods are not designed to use unlabeled data. Semi-supervised learning methods are motivated by the availability of large unlabeled datasets rather than a small amount of labeled examples. However, incorporating unlabeled data into learning does not guarantee an improvement in classification performance. This paper introduces an approach based on a semi-supervised learning model, namely self-training with a deep learning algorithm, to predict missing classes from labeled and unlabeled data. To assess the performance of the proposed approach, two datasets are used with four performance measures: precision, recall, F-measure, and area under the ROC curve (AUC).
- Published
- 2020
- Full Text
- View/download PDF
28. Tunnel condition assessment via cloud model‐based random forests and self‐training approach
- Author
-
Hehua Zhu, J. Woody Ju, Feng Guo, Mengqi Zhu, and Xueqin Chen
- Subjects
Computer science, Decision tree, Cloud computing, Machine learning, Computer Graphics and Computer-Aided Design, Condition assessment, Computer Science Applications, Random forest, Computational Theory and Mathematics, Artificial intelligence, CRFS, Self training, Civil and Structural Engineering
- Abstract
To proactively assess the losses caused by the deterioration of metro tunnels during the operational period, a new method, the cloud model‐based random forests (CRFs), is proposed to discu...
- Published
- 2020
- Full Text
- View/download PDF
29. A semi-supervised self-training method based on density peaks and natural neighbors
- Author
-
Junnan Li and Suwen Zhao
- Subjects
General Computer Science, Computer science, Decision tree, Pattern recognition, Computational intelligence, k-nearest neighbors algorithm, Support vector machine, ComputingMethodologies_PATTERNRECOGNITION, Artificial intelligence, Cluster analysis, Self training, Classifier (UML)
- Abstract
The semi-supervised self-training method is one of the successful methodologies of semi-supervised classification and can train a classifier by exploiting both labeled and unlabeled data. However, most self-training methods are limited by the distribution of the initial labeled data, rely heavily on parameters, and have poor predictive ability in the self-training process. To solve these problems, a novel self-training method based on density peaks and natural neighbors (STDPNaN) is proposed. In STDPNaN, an improved parameter-free density peaks clustering (DPCNaN) is first presented by introducing natural neighbors. The DPCNaN can reveal the real structure and distribution of data without any parameters, and then helps STDPNaN restore the real data space whether its distribution is spherical or non-spherical. An ensemble classifier is also employed to improve the predictive ability of STDPNaN in the self-training process. Intensive experiments show that (a) STDPNaN outperforms state-of-the-art methods in improving the classification accuracy of k-nearest neighbor, support vector machine, and classification and regression tree; (b) STDPNaN also outperforms comparison methods without any restriction on the number of labeled data; (c) the running time of STDPNaN is acceptable.
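For orientation, the two density-peaks quantities the method builds on (Rodriguez and Laio's rho and delta) are easy to sketch; a fixed cutoff distance dc is assumed here, whereas DPCNaN replaces it with parameter-free natural neighbors.

    import numpy as np
    from scipy.spatial.distance import cdist

    def density_peaks(X, dc=1.0):
        D = cdist(X, X)
        rho = (D < dc).sum(axis=1) - 1               # local density (self excluded)
        delta = np.empty(len(X))
        for i in range(len(X)):
            higher = np.where(rho > rho[i])[0]       # points with higher density
            delta[i] = D[i, higher].min() if len(higher) else D[i].max()
        return rho, delta                            # peaks: high rho AND high delta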
- Published
- 2020
- Full Text
- View/download PDF
30. Semi‐Supervised Learning
- Author
-
Gaurav Malik, Deepak Kumar Sharma, and Manish Devgan
- Subjects
Computer science, Artificial intelligence, Semi-supervised learning, Baum–Welch algorithm, Machine learning, Self training
- Published
- 2020
- Full Text
- View/download PDF
31. Self-training algorithm combining density peak and cut edge weight
- Author
-
Yang Liu
- Subjects
Computer science, Edge (geometry), Self training, Algorithm
- Published
- 2020
- Full Text
- View/download PDF
32. A boosting Self-Training Framework based on Instance Generation with Natural Neighbors for K Nearest Neighbor
- Author
-
Junnan Li and Qingsheng Zhu
- Subjects
Boosting (machine learning), Computer science, Machine learning, Ensemble learning, k-nearest neighbors algorithm, ComputingMethodologies_PATTERNRECOGNITION, Artificial Intelligence, Labeled data, Artificial intelligence, Self training, Classifier (UML)
- Abstract
The semi-supervised self-training method is one of the successful methodologies of semi-supervised classification. Mislabeling is the most challenging issue in self-training methods, and ensemble learning is one of the common techniques for dealing with it. Specifically, ensemble learning can solve or alleviate mislabeling by constructing an ensemble classifier that improves prediction accuracy in the self-training process. However, most ensemble learning methods may not perform well in self-training methods because it is difficult for them to train an effective ensemble classifier with a small amount of labeled data. Inspired by successful boosting methods, we introduce a new boosting self-training framework based on instance generation with natural neighbors (BoostSTIG) in this paper. BoostSTIG is compatible with most boosting methods and self-training methods. It can use most boosting methods to solve or alleviate the mislabeling of existing self-training methods by improving prediction accuracy in the self-training process. In addition, an instance-generation step with natural neighbors is proposed to enlarge the initial labeled data in BoostSTIG, which makes boosting methods more suitable for self-training methods. In experiments, we apply the BoostSTIG framework to 2 self-training methods and 4 boosting methods, and then validate BoostSTIG by comparing it with some state-of-the-art techniques on real datasets. Intensive experiments show that BoostSTIG can improve the performance of the tested self-training methods and train an effective k-nearest-neighbor classifier.
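A minimal sketch of the boosting-inside-self-training combination as we read it (AdaBoost chosen arbitrarily; the instance-generation step with natural neighbors is omitted):

    import numpy as np
    from sklearn.ensemble import AdaBoostClassifier

    def boosted_self_training(X_lab, y_lab, X_unlab, rounds=3, thresh=0.9):
        ens = AdaBoostClassifier(n_estimators=100)
        for _ in range(rounds):
            ens.fit(X_lab, y_lab)                    # boosted ensemble does the labeling
            if len(X_unlab) == 0:
                break
            proba = ens.predict_proba(X_unlab)
            take = proba.max(axis=1) >= thresh
            if not take.any():
                break
            X_lab = np.vstack([X_lab, X_unlab[take]])
            y_lab = np.concatenate([y_lab, ens.classes_[proba[take].argmax(axis=1)]])
            X_unlab = X_unlab[~take]
        return ens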
- Published
- 2020
- Full Text
- View/download PDF
33. Self-training and learning the waveform features of microseismic data using an adaptive dictionary
- Author
-
Quan Zhang, Hang Wang, Jinwei Fang, Guoyin Zhang, and Yangkang Chen
- Subjects
Microseism, Computer science, Process (computing), Geophysics, Hydraulic fracturing, Geochemistry and Petrology, Waveform, Unsupervised learning, Data mining, Dictionary learning, Self training
- Abstract
Microseismic monitoring is an indispensable technique for characterizing the physical processes caused by extraction or injection of fluids during hydraulic fracturing. Microseismic data, however, are often contaminated with strong random noise and have a low signal-to-noise ratio (S/N). The low S/N in most microseismic data severely affects the accuracy and reliability of source localization and source-mechanism inversion results. We have developed a new denoising framework to enhance the quality of microseismic data. We use the method of adaptive sparse dictionaries to learn the waveform features of the microseismic data by iteratively updating the dictionary atoms and sparse coefficients in an unsupervised way. Unlike most existing dictionary learning applications in the seismic community, we learn the features from 1D microseismic data, and thereby learn 1D features of the waveforms. We develop a sparse dictionary learning framework, prepare the training patches, and implement the algorithm to obtain favorable denoising performance. We use extensive numerical examples and real microseismic data examples to demonstrate the validity of our method. Results show that the features of microseismic waveforms can be learned to distinguish signal patches from noise patches even from a single channel of microseismic data. However, more training data can make the learned features smoother and better at representing useful signal components.
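A hedged sketch of the 1D patch-based dictionary-learning workflow using scikit-learn's off-the-shelf learner (the paper's adaptive algorithm and parameter choices differ; patch size, stride and sparsity here are illustrative):

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    def denoise_trace(trace, patch=64, stride=8, n_atoms=32, alpha=1.0):
        starts = np.arange(0, len(trace) - patch + 1, stride)
        P = np.stack([trace[s:s + patch] for s in starts])   # 1D training patches
        dico = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=alpha,
                                           transform_algorithm="omp",
                                           transform_n_nonzero_coefs=4)
        code = dico.fit(P).transform(P)                      # sparse coefficients
        recon = code @ dico.components_                      # denoised patches
        out = np.zeros(len(trace)); weight = np.zeros(len(trace))
        for k, s in enumerate(starts):                       # overlap-add averaging
            out[s:s + patch] += recon[k]
            weight[s:s + patch] += 1
        return out / np.maximum(weight, 1)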
- Published
- 2020
- Full Text
- View/download PDF
34. Divide-and-conquer ensemble self-training method based on probability difference
- Author
-
Tingting Li and Jia Lu
- Subjects
Divide and conquer algorithms, Structure (mathematical logic), General Computer Science, Generalization, Computer science, Process (computing), Computational intelligence, Pattern recognition, ComputingMethodologies_PATTERNRECOGNITION, Noise (video), Artificial intelligence, Classifier (UML), Self training
- Abstract
The self-training method can train an effective classifier by exploiting labeled and unlabeled instances. In the self-training process, high-confidence instances are usually selected iteratively and added to the training set for learning. Unfortunately, the structure information of high-confidence instances is so similar that it leads to local over-fitting during the iterations. To avoid this over-fitting phenomenon and improve the classification performance of self-training methods, a novel divide-and-conquer ensemble self-training framework based on probability difference is proposed. First, the probability difference of each instance is calculated from the category probabilities of the individual classifiers, and the low-fuzzy and high-fuzzy instances of each classifier are separated by this probability difference. Then, a divide-and-conquer strategy is adopted: the low-fuzzy instances agreed on by all classifiers are labeled directly, and the high-fuzzy instances are labeled manually. Finally, the labeled instances are added to the training set for iterative self-training. This method expands the training set by selecting low-fuzzy instances with accurate structure information and high-fuzzy instances with more comprehensive structure information, and it effectively improves the generalization performance of the method. The method is well suited to noisy datasets and can obtain structure information even from a few labeled instances. The effectiveness of the proposed method is verified by comparative experiments on University of California Irvine (UCI) datasets.
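The probability-difference split is simple to illustrate (our reading: instances whose top two ensemble-averaged class probabilities are close count as high-fuzzy):

    import numpy as np

    def split_by_fuzziness(probas, gap=0.3):
        """probas: list of (N, C) arrays, one per ensemble classifier."""
        avg = np.mean(probas, axis=0)
        top2 = np.sort(avg, axis=1)[:, -2:]
        diff = top2[:, 1] - top2[:, 0]      # probability difference
        low_fuzzy = diff >= gap             # confident: label automatically
        return low_fuzzy, ~low_fuzzy, avg.argmax(axis=1)   # high-fuzzy -> manual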
- Published
- 2020
- Full Text
- View/download PDF
35. STDS: self-training data streams for mining limited labeled data in non-stationary environment
- Author
-
Jafar Tanha, Arash Sharifi, Shirin Khezri, and Ali Ahmadi
- Subjects
Concept drift, Data stream mining, Computer science, Machine learning, ComputingMethodologies_PATTERNRECOGNITION, Data point, Artificial Intelligence, Labeled data, Artificial intelligence, Cluster analysis, Self training, Classifier (UML)
- Abstract
In this article, we focus on the classification problem of semi-supervised learning in non-stationary environments. Semi-supervised learning is the task of learning from both labeled and unlabeled data points. There are several approaches to semi-supervised learning in stationary environments that are not directly applicable to data streams. We propose a novel semi-supervised learning algorithm, named STDS. The proposed approach uses labeled and unlabeled data and employs an approach to handle concept drift in data streams. The main challenge in semi-supervised self-training for data streams is to find a proper selection metric for identifying a set of high-confidence predictions and a proper underlying base learner. We therefore propose an ensemble approach to find a set of high-confidence predictions based on clustering algorithms and classifier predictions. We then employ the Kullback-Leibler (KL) divergence to measure the distribution differences between sequential chunks in order to detect concept drift. When drift is detected, a new classifier is trained on the set of labeled data in the current chunk; otherwise, a percentage of high-confidence newly labeled data in the current chunk, chosen by the proposed selection metric, is added to the labeled data in the next chunk for updating the incremental classifier. The results of our experiments on a number of classification benchmark datasets show that STDS outperforms supervised methods and most other semi-supervised learning methods.
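A small sketch of chunk-to-chunk drift detection with KL divergence (per-feature histogram binning and the threshold are our assumptions):

    import numpy as np
    from scipy.stats import entropy

    def kl_drift(chunk_a, chunk_b, bins=20, threshold=0.5):
        """chunk_a, chunk_b: (N, d) arrays of consecutive stream chunks."""
        kls = []
        for j in range(chunk_a.shape[1]):
            lo = min(chunk_a[:, j].min(), chunk_b[:, j].min())
            hi = max(chunk_a[:, j].max(), chunk_b[:, j].max()) + 1e-9
            p, _ = np.histogram(chunk_a[:, j], bins=bins, range=(lo, hi))
            q, _ = np.histogram(chunk_b[:, j], bins=bins, range=(lo, hi))
            kls.append(entropy(p + 1, q + 1))   # +1 smoothing; entropy(p, q) = KL(p||q)
        return float(np.mean(kls)) > threshold  # True -> retrain instead of update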
- Published
- 2020
- Full Text
- View/download PDF
36. METHODOLOGY OF ORGANISING SELF-TRAINING OF PROSPECTIVE SINGERS FOR STAGING POPULAR PERFORMANCES
- Author
-
D. Lievit
- Subjects
Medical education, Psychology, Self training
- Published
- 2020
- Full Text
- View/download PDF
37. Improved well-log classification using semisupervised label propagation and self-training, with comparisons to popular supervised algorithms
- Author
-
Alison Malcolm, Michael W. Dunham, and J. Kim Welford
- Subjects
Computer science, Machine learning, ComputingMethodologies_PATTERNRECOGNITION, Geophysics, Geochemistry and Petrology, Artificial intelligence, Self training, Label propagation
- Abstract
Machine-learning techniques allow geoscientists to extract meaningful information from data in an automated fashion, and they are also an efficient alternative to traditional manual interpretation methods. Many geophysical problems have an abundance of unlabeled data and a paucity of labeled data, and the lithology classification of wireline data reflects this situation. Training supervised algorithms on small labeled data sets can lead to overtraining, and subsequent predictions for the numerous unlabeled data may be unstable. However, semisupervised algorithms are designed for classification problems with limited amounts of labeled data, and they are theoretically able to achieve better accuracies than supervised algorithms in these situations. We explore this hypothesis by applying two semisupervised techniques, label propagation (LP) and self-training, to a well-log data set and compare their performance to three popular supervised algorithms. LP is an established method, but our self-training method is a unique adaptation of existing implementations. The well-log data were made public through an SEG competition held in 2016. We simulate a semisupervised scenario with these data by assuming that only one of the 10 wells has labels (i.e., core samples), and our objective is to predict the labels for the remaining nine wells. We generate results from these data in two stages. The first stage is applying all the algorithms in question to the data as is (i.e., the global data), and the results from this motivate the second stage, which is applying all algorithms to the data when they are decomposed into two separate data sets. Overall, our findings suggest that LP does not outperform the supervised methods, but our self-training method coupled with LP can outperform the supervised methods by a notable margin if the assumptions of LP are met.
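Both techniques named in the title exist as off-the-shelf scikit-learn estimators, so a minimal version of the workflow can be sketched (feature engineering, the authors' custom self-training adaptation, and tuning are omitted; -1 marks uncored depths, per sklearn's convention):

    import numpy as np
    from sklearn.semi_supervised import LabelPropagation, SelfTrainingClassifier
    from sklearn.ensemble import RandomForestClassifier

    def semisupervised_facies(X, y):
        """X: wireline-log features; y: facies codes with -1 where unlabeled."""
        lp = LabelPropagation(kernel="knn", n_neighbors=7).fit(X, y)
        st = SelfTrainingClassifier(RandomForestClassifier(n_estimators=200),
                                    threshold=0.9).fit(X, y)
        return lp.transduction_, st.predict(X)   # labels inferred for every sample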
- Published
- 2020
- Full Text
- View/download PDF
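For readers who want to reproduce the general setup, both semisupervised techniques compared above are available off the shelf. The sketch below is a rough stand-in rather than the paper's implementation: it applies scikit-learn's LabelPropagation to a partially labeled feature matrix in which unlabeled samples are encoded as -1, then keeps only confident propagated labels as a self-training-style filter. The synthetic features, the kNN kernel settings, and the 0.9 cutoff are all assumptions.

import numpy as np
from sklearn.semi_supervised import LabelPropagation

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))          # placeholder wireline features (e.g. gamma ray, resistivity)
y = rng.integers(0, 3, size=200)       # placeholder lithology classes
y_partial = y.copy()
y_partial[20:] = -1                    # keep labels for one "well" only; -1 marks unlabeled

lp = LabelPropagation(kernel="knn", n_neighbors=7)
lp.fit(X, y_partial)

# Self-training-style filter: accept only confident propagated labels.
proba = lp.predict_proba(X[20:])
confident = proba.max(axis=1) >= 0.9
print(f"{confident.sum()} of {len(proba)} unlabeled samples pseudo-labeled")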
38. METHODS OF DEVELOPMENT AND USE OF MENTAL MAPS DURING SELF-TRAINING OF HIGHER EDUCATION PERSONS ENGAGED IN MARTIAL ARTS
- Author
-
O.O. Nesterenko, O.A. Samoilenko, S.V. Levchenko, V.P. Skliarenko, and S.I. Karpenko
- Subjects
Medical education ,Fuel Technology ,Martial arts ,Higher education ,business.industry ,Process Chemistry and Technology ,Mental mapping ,Economic Geology ,General Medicine ,Psychology ,business ,Self training - Published
- 2020
- Full Text
- View/download PDF
39. Chronological Self-Training for Real-Time Speaker Diarization
- Author
-
Dirk Padfield and Daniel J. Liebling
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Sound (cs.SD) ,Computer Science - Computation and Language ,Computer science ,Speech recognition ,Computer Science - Sound ,Machine Learning (cs.LG) ,Speaker diarisation ,Audio and Speech Processing (eess.AS) ,FOS: Electrical engineering, electronic engineering, information engineering ,Self training ,Computation and Language (cs.CL) ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Diarization partitions an audio stream into segments based on the voices of the speakers. Real-time diarization systems that include an enrollment step should limit enrollment training samples to reduce user interaction time. Although training on a small number of samples yields poor performance, we show that the accuracy can be improved dramatically using a chronological self-training approach. We studied the tradeoff between training time and classification performance and found that 1 second is sufficient to reach over 95% accuracy. We evaluated on 700 audio conversation files of about 10 minutes each from 6 different languages and demonstrated average diarization error rates as low as 10%.
Comment: 5 pages, 5 figures, ICASSP 2021
- Published
- 2022
- Full Text
- View/download PDF
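The core idea of chronological self-training, processing segments in time order and folding confident predictions back into the speaker models, can be sketched in a few lines. This is an illustrative reconstruction under our own assumptions (cosine similarity to per-speaker centroids and a fixed 0.7 threshold), not the authors' system.

import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def chronological_diarize(segments, enrollment, threshold=0.7):
    """segments: list of embeddings in time order; enrollment: {speaker: [embeddings]}."""
    labels = []
    for emb in segments:
        # Score each speaker by similarity to the centroid of its current examples.
        scores = {spk: cosine(emb, np.mean(examples, axis=0))
                  for spk, examples in enrollment.items()}
        best = max(scores, key=scores.get)
        labels.append(best)
        if scores[best] >= threshold:
            enrollment[best].append(emb)  # self-training: the model grows over time
    return labels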
40. THEORETICAL TRAINING OF CADETS OF HIGHER EDUCATION INSTITUTIONS OF THE MINISTRY OF INTERNAL AFFAIRS IN SPORTS AND PEDAGOGICAL DISCIPLINES
- Author
-
O. Arkhipova, E. Krestnikova, and D. Gladkikh
- Subjects
Officer ,Medical education ,Higher education ,business.industry ,Political science ,Foundation (evidence) ,Christian ministry ,business ,Training (civil) ,Self training ,Professional activity - Abstract
The purpose of the study was to assess the theoretical foundation of knowledge in sports and pedagogical disciplines among cadets of the Ministry of Internal Affairs universities, since this foundation determines their success in further professional activities. Theory not only motivates, programs, and regulates, but also controls the practical activities of the future police officer. The successful acquisition of certain knowledge and skills serves as a criterion for their entry into the general cultural repertoire of a specialist, expanding the opportunities for the development of their professional activity.
- Published
- 2021
- Full Text
- View/download PDF
41. Uncertainty-Aware Self-Training for Semi-Supervised Event Temporal Relation Extraction
- Author
-
Wei Bi, Jun Zhao, Yubo Chen, Xinyu Zuo, Pengfei Cao, and Kang Liu
- Subjects
Sample selection ,Event (computing) ,business.industry ,Computer science ,Process (engineering) ,Natural language understanding ,computer.software_genre ,Machine learning ,Relationship extraction ,Task (project management) ,Artificial intelligence ,business ,Self training ,Data Annotation ,computer - Abstract
Extracting event temporal relations is an important task for natural language understanding. Many works have been proposed for supervised event temporal relation extraction, which typically requires a large amount of human-annotated data for model training. However, data annotation for this task is very time-consuming and challenging. To this end, we study the problem of semi-supervised event temporal relation extraction. Self-training, a widely used semi-supervised learning method, can be applied to this problem; however, it suffers from noisy pseudo-labeling. In this paper, we propose an uncertainty-aware self-training framework (UAST) that quantifies model uncertainty to cope with pseudo-labeling errors. Specifically, UAST utilizes (1) an Uncertainty Estimation module to compute the model uncertainty for pseudo-labeling unlabeled data; (2) a Sample Selection with Exploration module to select informative samples based on uncertainty estimates; and (3) an Uncertainty-Aware Learning module to explicitly incorporate the model uncertainty into the self-training process. Experimental results indicate that our approach significantly outperforms previous state-of-the-art methods.
- Published
- 2021
- Full Text
- View/download PDF
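The abstract does not specify the uncertainty estimator, so the sketch below uses Monte Carlo dropout, one common choice, to score unlabeled samples by predictive entropy before pseudo-labeling. Treat it as a stand-in for the kind of computation an Uncertainty Estimation module performs, not as the paper's method; the pass count and selection rule are assumptions.

import torch

def mc_dropout_uncertainty(model, x, n_passes=10):
    """Predictive entropy over stochastic forward passes (dropout left active)."""
    model.train()  # keep dropout on, which is what makes the passes stochastic
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(n_passes)])
    mean = probs.mean(dim=0)
    entropy = -(mean * mean.clamp_min(1e-10).log()).sum(dim=-1)
    return mean, entropy

# Pseudo-label only the least-uncertain unlabeled samples, e.g.:
# mean, ent = mc_dropout_uncertainty(model, unlabeled_batch)
# keep = ent <= ent.kthvalue(k).values  # the k most confident samples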
42. Active learning algorithms for multitopic classification
- Author
-
Universitat Politècnica de Catalunya. Departament d'Enginyeria Telemàtica, Moreno Bilbao, M. Asunción, Ruiz Costa-Jussà, Marta, Bonafonte Pardàs, Guillem, Universitat Politècnica de Catalunya. Departament d'Enginyeria Telemàtica, Moreno Bilbao, M. Asunción, Ruiz Costa-Jussà, Marta, and Bonafonte Pardàs, Guillem
- Abstract
In this master thesis we develop a model that surpasses previous studies in detecting cyberbullying and other disorders that commonly affect teenagers. We analyze short sentences from social media using techniques that have not been studied in depth in language processing, in order to detect these problems. Deep learning is nowadays the common approach to text analysis. However, limited dataset size is one of the most common problems: it is not practical to dedicate thousands of hours of human labelling every time we want to create a new model. Different techniques have been used over the years to solve, or at least mitigate, this problem, for instance transfer learning or self-learning; one of the best-known remedies is data augmentation. In this thesis we make use of active learning and self-training to address the limited availability of labelled data, using unlabeled data to improve the performance of our models. The architecture of the model is composed of a BERT model plus a linear layer that projects the BERT sentence embedding onto the number of classes we want to detect. We take advantage of this already functional model to label new data that we then use to train our final model. Using noise techniques we modify the data so that the final model has to predict less structured data and learn from difficult scenarios. Thanks to this technique we were able to improve the results for some of the classes: for instance, the modified F-score increases by 7% for substance abuse (drugs, alcohol, etc.) and by 3% for disorders (anxiety, depression, and distress), while keeping the performance of the other classes.
- Published
- 2021
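The classifier architecture described above (a BERT encoder plus a linear layer projecting the sentence embedding onto the target classes) is straightforward to sketch with the Hugging Face transformers library; the model name, class count, and [CLS]-token pooling below are placeholder assumptions, not details taken from the thesis.

import torch
from transformers import AutoModel, AutoTokenizer

class BertClassifier(torch.nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_classes=4):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        self.head = torch.nn.Linear(self.bert.config.hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token as the sentence embedding
        return self.head(cls)

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tok(["example tweet"], return_tensors="pt", padding=True)
logits = BertClassifier()(batch["input_ids"], batch["attention_mask"])

In a self-training loop, the trained model's most confident predictions on unlabeled sentences would be added to the training set for the next round.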
43. Dual-Consistency Self-Training For Unsupervised Domain Adaptation
- Author
-
Jie Wang, Yasuto Yokota, Chaoliang Zhong, Masaru Ide, Cheng Feng, and Jun Sun
- Subjects
Dual consistency ,Domain adaptation ,Computer science ,business.industry ,Artificial intelligence ,Machine learning ,computer.software_genre ,business ,Self training ,computer - Published
- 2021
- Full Text
- View/download PDF
44. An Improved Self-Training Method for Positive Unlabeled Time Series Classification Using DTW Barycenter Averaging
- Author
-
Yabo Dong, Duanqing Xu, Tongbin Zuo, Jing Li, and Haowen Zhang
- Subjects
Time series classification ,Dynamic time warping ,Computer science ,Boundary (topology) ,TP1-1185 ,Biochemistry ,Article ,Analytical Chemistry ,Domain (software engineering) ,Set (abstract data type) ,self-training ,Cluster Analysis ,Humans ,Electrical and Electronic Engineering ,Instrumentation ,Sequence ,business.industry ,Chemical technology ,positive unlabeled time series classification ,Pattern recognition ,Atomic and Molecular Physics, and Optics ,ComputingMethodologies_PATTERNRECOGNITION ,dynamic time warping ,Labeled data ,Artificial intelligence ,business ,Self training ,DTW barycenter averaging - Abstract
Traditional supervised time series classification (TSC) tasks assume that all training data are labeled. However, in practice, manually labelling all unlabeled data can be very time-consuming and often requires the participation of skilled domain experts. In this paper, we are concerned with the positive unlabeled time series classification problem (PUTSC), which refers to automatically labelling a large unlabeled set U based on a small positive labeled set PL. Self-training (ST) is the most widely used method for solving the PUTSC problem and has attracted increased attention due to its simplicity and effectiveness. Existing ST methods simply employ the one-nearest-neighbor (1NN) rule to determine which unlabeled time series should be labeled. Nevertheless, we note that the 1NN rule might not be optimal for PUTSC tasks because it may be sensitive to initial labeled data located near the boundary between the positive and negative classes. To overcome this issue, we propose an exploratory methodology called ST-average. Unlike conventional ST-based approaches, ST-average labels the data using the average sequence calculated by the DTW barycenter averaging technique. Compared with any individual in the PL set, the average sequence is more representative. Our proposal is insensitive to the initial labeled data and is more reliable than existing ST-based methods. Moreover, we demonstrate that ST-average can naturally be combined with many existing techniques used in the original ST. Experimental results on public datasets show that ST-average performs better than related popular methods.
- Published
- 2021
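The heart of ST-average, labelling by distance to a DTW barycenter of the positive set instead of to any single labeled neighbor, can be sketched with the tslearn library. This is an illustration of the idea, not the authors' code; the greedy one-candidate-per-round loop in the trailing comment is an assumption.

import numpy as np
from tslearn.barycenters import dtw_barycenter_averaging
from tslearn.metrics import dtw

def st_average_step(positive_set, unlabeled_set):
    """Return the index of the unlabeled series closest to the PL barycenter."""
    barycenter = dtw_barycenter_averaging(np.asarray(positive_set))
    distances = [dtw(barycenter, series) for series in unlabeled_set]
    return int(np.argmin(distances))

# Each self-training round moves the best candidate from U into PL:
# i = st_average_step(PL, U); PL.append(U.pop(i))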
45. Personalized Simulated HUT as an At-home Prediction Model for Heart Rate Changes in Syncope Patients and At-home, Orthostatic Self-training Efficacy
- Author
-
Helmut Ahammer, Herbert F. Jelinek, Dahlia Hassan, Dominik Wehler, and Robert Krones
- Subjects
medicine.medical_specialty ,Orthostatic vital signs ,biology ,business.industry ,Internal medicine ,Heart rate ,Syncope (genus) ,Cardiology ,medicine ,biology.organism_classification ,business ,Self training - Abstract
Head-up tilt (HUT) testing supports the diagnosis of syncope by detecting abnormalities in heart rate and blood pressure changes. Home-based self-training can benefit neurocardiogenic patients if, during clinical HUT, heart rate decreases in the early stage of the upright position. However, HUT testing is not always possible in the hospital, as it is inconvenient and sometimes even risky for patients with cardiac abnormalities: it may trigger a loss of consciousness and arrhythmia. To address this, the current paper introduces a personalized HUT simulation to determine the efficacy of at-home training. To develop the model, Holter ECG recordings were obtained from 28 syncope patients and the simulated output was compared to clinical findings. The model aims to predict the heart rate changes associated with the simulated HUT that indicate the efficacy of an at-home program. Heart rate enters the model as a velocity variable, measured in liters per second against gravity. The results show that a decrease in heart rate early in the simulated HUT, as determined by the model, identifies with greater than 84% efficiency the syncope patients who will benefit from at-home training, and allows physicians to recommend home training during an online or telemedicine consultation.
Keywords — head-up tilt test, syncope, blood flow, heart rate prediction
Clinical Relevance — The cardiovascular model predicts the patient-specific efficacy of at-home tilt-training for patients diagnosed with syncope.
- Published
- 2021
- Full Text
- View/download PDF
46. Text Classification with Heterogeneous Data Using Multiple Self-Training Classifiers
- Author
-
Dong-Hoon Lee, Namgyu Kim, and William Xiu Shun Wong
- Subjects
Information Systems and Management ,Sociology and Political Science ,Computer science ,business.industry ,Artificial intelligence ,business ,Machine learning ,computer.software_genre ,Self training ,computer - Published
- 2019
- Full Text
- View/download PDF
47. Interpolative self-training approach for link prediction
- Author
-
Somayyeh Aghababaei and Masoud Makrehchi
- Subjects
Artificial Intelligence ,business.industry ,Computer science ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Machine learning ,computer.software_genre ,Link (knot theory) ,computer ,Self training ,Theoretical Computer Science - Published
- 2019
- Full Text
- View/download PDF
48. Deep Contextualized Self-training for Low Resource Dependency Parsing
- Author
-
Roi Reichart and Guy Rotman
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Linguistics and Language ,Computer Science - Computation and Language ,Low resource ,business.industry ,Computer science ,Communication ,lcsh:P98-98.5 ,computer.software_genre ,Machine Learning (cs.LG) ,Computer Science Applications ,Human-Computer Interaction ,Artificial Intelligence ,Dependency grammar ,Labeled data ,Artificial intelligence ,lcsh:Computational linguistics. Natural language processing ,business ,Computation and Language (cs.CL) ,computer ,Self training ,Natural language processing - Abstract
Neural dependency parsing has proven very effective, achieving state-of-the-art results on numerous domains and languages. Unfortunately, it requires large amounts of labeled data, which is costly and laborious to create. In this paper we propose a self-training algorithm that alleviates this annotation bottleneck by training a parser on its own output. Our Deep Contextualized Self-training (DCST) algorithm utilizes representation models trained on sequence labeling tasks that are derived from the parser's output when applied to unlabeled data, and integrates these models with the base parser through a gating mechanism. We conduct experiments across multiple languages, both in low resource in-domain and in cross-domain setups, and demonstrate that DCST substantially outperforms traditional self-training as well as recent semi-supervised training methods.
Comment: Accepted to TACL in September 2019
- Published
- 2019
- Full Text
- View/download PDF
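The gating mechanism mentioned in the abstract, which mixes the base parser's word representations with those of the auxiliary sequence-labeling models, might look like the following sketch. The dimensions are placeholders and this is our own minimal reconstruction, not the released DCST code.

import torch

class GatedCombination(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = torch.nn.Linear(2 * dim, dim)

    def forward(self, base_repr, aux_repr):
        # g in (0, 1) decides, per dimension, how much of each encoder to keep.
        g = torch.sigmoid(self.gate(torch.cat([base_repr, aux_repr], dim=-1)))
        return g * base_repr + (1 - g) * aux_repr

mix = GatedCombination(dim=256)
combined = mix(torch.randn(8, 10, 256), torch.randn(8, 10, 256))  # (batch, tokens, dim)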
49. Logistics optimisation of slab pre-marshalling problem in steel industry
- Author
-
Lixin Tang, Ying Meng, Jiyin Liu, Peixin Ge, and Ren Zhao
- Subjects
0209 industrial biotechnology ,021103 operations research ,Computer science ,business.industry ,Strategy and Management ,0211 other engineering and technologies ,02 engineering and technology ,Structural engineering ,Management Science and Operations Research ,Hybrid algorithm ,Industrial and Manufacturing Engineering ,Marshalling ,020901 industrial engineering & automation ,Stack (abstract data type) ,Group (periodic table) ,Slab ,business ,Self training - Abstract
We study the slab pre-marshalling problem of re-positioning slabs so that they are stored in the fewest possible stacks and each stack contains only slabs of the same group, which can b...
- Published
- 2019
- Full Text
- View/download PDF
50. Materials for Self-Training of Foreign Students in the Course 'Fundamentals of Linguistics'
- Author
-
Oksana Voloshina
- Subjects
Mathematics education ,Psychology ,Self training ,Course (navigation) - Published
- 2019
- Full Text
- View/download PDF