52,570 results
Search Results
2. Content-based quality evaluation of scientific papers using coarse feature and knowledge entity network.
- Author
-
Wang, Zhongyi, Zhang, Haoxuan, Chen, Haihua, Feng, Yunhe, and Ding, Junhua
- Subjects
MACHINE learning, SCIENCE education, COMPUTER science, PEER pressure, RANDOM forest algorithms - Abstract
Pre-evaluating scientific paper quality aids in alleviating peer review pressure and fostering scientific advancement. Although prior studies have identified numerous quality-related features, their effectiveness and representativeness of paper content remain to be comprehensively investigated. Addressing this issue, we propose a content-based interpretable method for pre-evaluating the quality of scientific papers. Firstly, we define the quality attributes of computer science (CS) papers as integrity, clarity, novelty, and significance, based on peer review criteria from 11 top-tier CS conferences. We formulate the problem as two classification tasks: Accepted/Disputed/Rejected (ADR) and Accepted/Rejected (AR). Subsequently, we construct fine-grained features from metadata and knowledge entity networks, including text structure, readability, references, citations, semantic novelty, and network structure. We empirically evaluate our method using the ICLR paper dataset, achieving optimal performance with the Random Forest model, yielding F1 scores of 0.715 and 0.762 for the two tasks, respectively. Through feature analysis and case studies employing SHAP interpretable methods, we demonstrate that the proposed features enhance the performance of machine learning models in scientific paper quality evaluation, offering interpretable evidence for model decisions. • Define four criteria for quality evaluation of scientific papers: integrity, clarity, novelty, and significance. • Propose a framework for quality evaluation of scientific papers based on coarse features and knowledge entity network. • An effective algorithm for measuring the novelty and significance of scientific papers based on knowledge entity networks. • Create and release a rigorous dataset, which could serve as the gold standard for quality evaluation of scientific papers. • Conduct extensive experiments to validate the effectiveness of the proposed framework. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
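The F1 scores reported above (0.715 and 0.762) are the harmonic mean of precision and recall. A minimal, self-contained sketch of how such a score is computed for the binary Accepted/Rejected (AR) task, using invented labels rather than the paper's data:

```python
def f1_score(y_true, y_pred, positive="accepted"):
    """Harmonic mean of precision and recall for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Hypothetical AR labels for eight papers (illustration only)
y_true = ["accepted", "rejected", "accepted", "accepted",
          "rejected", "rejected", "accepted", "rejected"]
y_pred = ["accepted", "rejected", "rejected", "accepted",
          "rejected", "accepted", "accepted", "rejected"]
print(round(f1_score(y_true, y_pred), 3))  # prints 0.75
```

In practice a library implementation (e.g. scikit-learn's `f1_score`) would be used; the point here is only that precision and recall trade off inside a single number.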
3. Comparing LSTM and GRU Models to Predict the Condition of a Pulp Paper Press
- Author
-
Antonio J. Marques Cardoso, Rui Assis, Balduíno César Mateus, Mateus Mendes, and José Torres Farinha
- Subjects
Technology, Multivariate statistics, Control and Optimization, Computer science, GRU, Energy Engineering and Power Technology, Machine learning, computer.software_genre, Predictive maintenance, predictive maintenance, LSTM, recurrent neural network, paper press, Autoregressive integrated moving average, Electrical and Electronic Engineering, Engineering (miscellaneous), Hyperparameter, Artificial neural network, Renewable Energy, Sustainability and the Environment, business.industry, Univariate, Statistical model, Recurrent neural network, Artificial intelligence, business, computer, Energy (miscellaneous) - Abstract
The accuracy of a predictive system is critical for predictive maintenance and for supporting the right decisions at the right times. Statistical models, such as ARIMA and SARIMA, are unable to describe the stochastic nature of the data. Neural networks, such as long short-term memory (LSTM) and the gated recurrent unit (GRU), are good predictors for univariate and multivariate data. The present paper describes a case study on pulp paper presses in which the performances of long short-term memory and gated recurrent units are compared across different hyperparameters. In general, gated recurrent units exhibit better performance, and the final results demonstrate that they are the best option for maximizing equipment availability.
- Published
- 2021
- Full Text
- View/download PDF
4. A co-training-based approach for the hierarchical multi-label classification of research papers
- Author
-
Khalil Drira, Hatem Bellaaj, Abir Masmoudi, Mohamed Jmaiel, Université de Sfax - University of Sfax, Équipe Services et Architectures pour Réseaux Avancés (LAAS-SARA), Laboratoire d'analyse et d'architecture des systèmes (LAAS), Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Institut National des Sciences Appliquées - Toulouse (INSA Toulouse), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées, Unité de Recherche en développement et contrôle d'applications distribuées (REDCAD), École Nationale d'Ingénieurs de Sfax | National School of Engineers of Sfax (ENIS), Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT)-Université de Toulouse (UT)-Institut National des Sciences Appliquées - Toulouse (INSA Toulouse), Institut National des Sciences Appliquées (INSA)-Université de Toulouse (UT)-Institut National des Sciences Appliquées (INSA)-Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université de Toulouse (UT)-Université Toulouse Capitole (UT Capitole), and Université de Toulouse (UT)
- Subjects
0209 industrial biotechnology, Computer science, 02 engineering and technology, Semi-supervised learning, Imbalanced data, [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE], Hierarchical multi-label classification, Machine learning, computer.software_genre, Theoretical Computer Science, Set (abstract data type), Consistency (database systems), [INFO.INFO-NI]Computer Science [cs]/Networking and Internet Architecture [cs.NI], 020901 industrial engineering & automation, Cardinality, Co-training, Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, Multi-label classification, Hierarchy (mathematics), business.industry, [INFO.INFO-WB]Computer Science [cs]/Web, Research papers classification, Bibliographic coupling, ComputingMethodologies_PATTERNRECOGNITION, Computational Theory and Mathematics, Control and Systems Engineering, 020201 artificial intelligence & image processing, [INFO.INFO-ET]Computer Science [cs]/Emerging Technologies [cs.ET], Artificial intelligence, business, computer - Abstract
This paper focuses on the problem of the hierarchical multi-label classification of research papers, which is the task of assigning the set of relevant labels for a paper from a hierarchy using reduced amounts of labelled training data. Specifically, we study leveraging unlabelled data, which are usually plentiful and easy to collect, in addition to the few available labelled ones, in a semi-supervised learning framework for achieving better performance. Thus, in this paper, we propose a semi-supervised approach for the hierarchical multi-label classification of research papers based on the well-known Co-training algorithm, which exploits content and bibliographic coupling information as two distinct views of the papers. In our approach, two hierarchical multi-label classifiers are learnt on different views of the labelled data and iteratively select their most confident unlabelled samples, which are then added to the labelled set. The success of our Co-training-based approach lies in two main components. The first is the use of two suggested selection criteria (i.e., Maximum Agreement and Labels Cardinality Consistency) that enforce selecting confident unlabelled samples. The second is the application of an oversampling method that rebalances the label distribution of the initial labelled set, which reduces the reinforcement of the label imbalance issue during Co-training. The proposed approach is evaluated using a collection of scientific papers extracted from the ACM digital library. The experiments performed show the effectiveness of our approach against several baseline methods.
- Published
- 2021
5. Optimization of the development of latent fingermarks on thermal papers
- Author
-
Laurent Tamisier, Pierre Ledroit, Marianne Malo, Damien Henrot, and Florine Hallez
- Subjects
Paper, Hot Temperature, Luminescent Agents, Time Factors, business.industry, Computer science, Ninhydrin, Acetates, Comparative trial, Machine learning, computer.software_genre, Pathology and Forensic Medicine, Magnetic powder, Acetone, Indans, Humans, Indicators and Reagents, Cyanoacrylates, Artificial intelligence, Dermatoglyphics, Powders, Volatilization, business, Law, computer - Abstract
Thermal papers, commonly used for printed receipts or lottery tickets, are omnipresent in our everyday life. They are regarded as semi-porous substrates, yet can be critical to analyze when looking for latent fingermarks due to their thermosensitivity. The aim of this study was to investigate a development sequence that combines the adequate detection techniques so as to maximize the chances of developing latent fingermarks left on these substrates. Different development methods have been compared on test substrates: black magnetic powder, Lumicyano™, thermal development, ninhydrin, and 1,2-indanedione/ZnCl2. Whitening stages and thermal development were focused on, tested, and optimized. The results of these preliminary tests enabled the study of three development sequences. These were subsequently compared to the one currently used in the Gendarmerie's laboratories, and during pseudo-operational comparative trials the best results were provided by one of these sequences, consisting of six stages.
- Published
- 2019
6. Notable Papers and New Directions in Sensors, Signals, and Imaging Informatics
- Author
-
William Hsu, Christian Baumgartner, and Thomas M. Deserno
- Subjects
Diagnostic Imaging, Biometry, Imaging informatics, Computer science, Image processing, Section 4: Sensor, Signal and Imaging Informatics, Health informatics, Machine Learning, Set (abstract data type), Medical imaging, Humans, medical informatics, Information retrieval, Sensors, business.industry, Reproducibility of Results, signals, Electroencephalography, Subject (documents), General Medicine, Informatics, imaging informatics, Synopsis, Neural Networks, Computer, Yearbook, business - Abstract
Summary Objective: To identify and highlight research papers representing noteworthy developments in signals, sensors, and imaging informatics in 2020. Method: A broad literature search was conducted on the PubMed and Scopus databases. We combined Medical Subject Heading (MeSH) terms and keywords to construct particular queries for sensors, signals, and imaging informatics. We only considered papers that had been published in journals providing at least three articles in the query response. Section editors then independently reviewed the titles and abstracts of preselected papers, assessing each on a three-point Likert scale. Papers were rated from 1 (do not include) to 3 (should be included) for each topical area (sensors, signals, and imaging informatics), and those with an average score of 2 or above were subsequently read and assessed again by two of the three co-editors. Finally, the top 14 papers with the highest combined scores were considered based on consensus. Results: The search for papers was executed in January 2021. After removing duplicates and conference proceedings, the query returned a set of 101, 193, and 529 papers for sensors, signals, and imaging informatics, respectively. We filtered out journals that had fewer than three papers in the query results, reducing the number of papers to 41, 117, and 333, respectively. From these, the co-editors identified 22 candidate papers with more than 2 Likert points on average, from which 14 candidate best papers were nominated after intensive discussion. At least five external reviewers then rated the remaining papers. The four finalist papers were selected using the composite rating of all external reviewers. These best papers were approved by consensus of the International Medical Informatics Association (IMIA) Yearbook editorial board. Conclusions: Sensors, signals, and imaging informatics is a dynamic field of intense research.
The four best papers represent advanced approaches for combining, processing, modeling, and analyzing heterogeneous sensor and imaging data. The selected papers demonstrate the combination and fusion of multiple sensors and sensor networks using electrocardiogram (ECG), electroencephalogram (EEG), or photoplethysmogram (PPG) with advanced data processing, deep and machine learning techniques, and present image processing modalities beyond state-of-the-art that significantly support and further improve medical decision making.
- Published
- 2021
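The two-stage preselection described in this abstract (keep only papers from journals contributing at least three query hits, then keep papers averaging 2 or higher on the three-point Likert scale) can be sketched as plain filtering logic. Journal names and ratings below are invented for illustration, not the yearbook's data:

```python
from collections import Counter

# Hypothetical query hits with per-editor Likert ratings (1-3)
papers = [
    {"title": "A", "journal": "J Sensors",  "ratings": [3, 2, 3]},
    {"title": "B", "journal": "J Sensors",  "ratings": [1, 2, 1]},  # below Likert cutoff
    {"title": "C", "journal": "J Sensors",  "ratings": [2, 2, 3]},
    {"title": "D", "journal": "Small Jrnl", "ratings": [3, 3, 3]},  # journal has < 3 hits
]

def preselect(papers, min_journal_papers=3, min_avg=2.0):
    """Apply the journal-size filter, then the average-Likert filter."""
    counts = Counter(p["journal"] for p in papers)
    return [
        p["title"]
        for p in papers
        if counts[p["journal"]] >= min_journal_papers
        and sum(p["ratings"]) / len(p["ratings"]) >= min_avg
    ]

print(preselect(papers))  # prints ['A', 'C']
```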
7. Fast FMCW Terahertz Imaging for In-Process Defect Detection in Press Sleeves for the Paper Industry and Image Evaluation with a Machine Learning Approach
- Author
-
Fabian Friederich, Raphael Hussung, Carsten Matheis, Joachim Jonuscheit, Maris Bauer, Uwe Matuschczyk, Peter Weichenberger, Jens Beck, Hermann Reichert, and Publica
- Subjects
Fabrication, Terahertz radiation, Computer science, TP1-1185, Molding (process), Biochemistry, Article, Rotational molding, Analytical Chemistry, Machine Learning, terahertz imaging, frequency-modulated continuous wave, Data acquisition, Nondestructive testing, Electrical and Electronic Engineering, Instrumentation, Image resolution, nondestructive testing, business.industry, Chemical technology, Pulp and paper industry, anomaly detection, Atomic and Molecular Physics, and Optics, paper industry, Anomaly detection, press sleeves, business - Abstract
We present a rotational terahertz imaging system for inline nondestructive testing (NDT) of press sleeves for the paper industry during fabrication. Press sleeves often consist of polyurethane (PU), which is deposited by rotational molding on metal barrels; the outer surface is then mechanically processed in several milling steps. Due to a stabilizing polyester fiber mesh inlay, small defects can form on the sleeve's backside already during the initial molding; however, they cannot be visually inspected until the whole production process is completed. We have developed a fast-scanning frequency-modulated continuous wave (FMCW) terahertz imaging system, which can be integrated into the manufacturing process to yield high-resolution images of the press sleeves and can therefore help to visualize hidden structural defects at an early stage of fabrication. This can save valuable time and resources during the production process. Our terahertz system can record images at 0.3 and 0.5 THz, and we achieve data acquisition rates of at least 20 kHz, exploiting the fast rotational speed of the barrels during production to yield sub-millimeter image resolution. The potential of automated defect recognition by a simple machine learning approach for anomaly detection is also demonstrated and discussed.
- Published
- 2021
8. Design of English Intelligent Simulated Paper Marking System
- Author
-
Wei Liu and Lina Yang
- Subjects
Structure (mathematical logic), 0209 industrial biotechnology, Multidisciplinary, Article Subject, General Computer Science, business.industry, Computer science, Rationality, QA75.5-76.95, 02 engineering and technology, Machine learning, computer.software_genre, Triage, Support vector machine, Normal distribution, 020901 industrial engineering & automation, Electronic computers. Computer science, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, Composition (language), Word (computer architecture), Sentence - Abstract
In this paper, we analyze the intelligent sub-design of a simulated paper marking system through an in-depth study. We propose a correlation-analysis-based quantification of N-element sense values and a rationality-enhancement-based score fitting algorithm for English essays. We also extract word features, sentence features, and chapter structure features from essays to fit English composition scores. Since not all students complete their essays according to the topic requirements, a triage scoring model is used to separate normal essays from low-scoring essays. Statistically, essay scores were found to follow an approximately normal distribution. The standard support vector regression algorithm is prone to data skewing problems, so this paper addresses the issue with a rationality enhancement method that assigns a corresponding penalty factor according to the distribution of the dataset. The results show that the proposed English essay score fitting algorithm improves prediction accuracy on part of the data and solves the problem of skewed data where scores follow a normal distribution. Finally, we design and implement an online mock examination system that incorporates the intelligent essay scoring system, enabling it to meet the needs of teachers and students for online examinations and intelligent scoring.
- Published
- 2021
9. Survey Paper on Automatic Detection of Fake Profile Using Machine Learning on Instagram
- Author
-
Komal Kharbikar, Anushree Awachat, Er. Pranay Meshram, Rutika Bhambulkar, and Puja Pokale
- Subjects
business.industry, Computer science, 0202 electrical engineering, electronic engineering, information engineering, 020207 software engineering, 020201 artificial intelligence & image processing, 02 engineering and technology, Artificial intelligence, business, Machine learning, computer.software_genre, computer - Abstract
With the arrival of the Internet and social media, while many people have benefited from the vast resources of information available, there has been an enormous rise in cyber-crime, particularly targeted at women. According to a 2019 report in the Economic Times, India witnessed a 457% rise in cybercrime in the five-year span between 2011 and 2016. Most speculate that this is due to the influence of social media such as Facebook, Instagram, and Twitter on our daily lives. While these certainly help in building a legitimate social network, creating user accounts on these sites usually requires just an email-id. A real-life person can create multiple fake IDs, and hence impostors can easily be made. Unlike the real-world situation, where multiple rules and regulations are imposed to identify oneself in a unique manner (for example, while issuing one's passport or driver's license), in the virtual world of social media, admission does not require such checks. In this paper, we study the different accounts of Instagram in particular and attempt to verify an account as fake or real using machine learning techniques, namely Logistic Regression and the Random Forest algorithm.
- Published
- 2021
10. Improved assessment of accuracy and performance indicators in paper-based ELISA
- Author
-
Luis Aparecido Milan, Carlos Alberto Mestriner, Thiago Mazzu-Nascimento, Diego Furtado Silva, Fabiana Cristina Donofrio, Giorgio Gianini Morbioli, and Emanuel Carrilho
- Subjects
Receiver operating characteristic, Computer science, business.industry, General Chemical Engineering, Point-of-care testing, 010401 analytical chemistry, General Engineering, Value (computer science), 02 engineering and technology, Paper based, 021001 nanoscience & nanotechnology, Machine learning, computer.software_genre, 01 natural sciences, 0104 chemical sciences, Analytical Chemistry, Software portability, ENSAIO CLÍNICO, Performance indicator, Sensitivity (control systems), Artificial intelligence, Macro, 0210 nano-technology, business, computer - Abstract
Paper-based devices are an excellent match for low-cost point-of-care testing (POCT) tools. Their user-friendliness, portability, and short time of analysis, coupled with ease of local manufacture, make these devices the best option for inexpensive diagnostic testing tools. However, despite all their positive features, these low-cost diagnostic devices must present good performance indicators, such as sensitivity, specificity, and accuracy. We developed and validated a paper-based ELISA for toxoplasmosis diagnosis through the detection of Toxoplasma gondii immunoglobulin G (IgG) antibodies in 100 human serum samples. From among the different ways to define the cut-off value, we chose Youden's J index (cut-off = 21.73 A.U.), which presented a higher sensitivity value. Our paper-based assay presented a sensitivity of 0.96, a specificity of 0.87, and a gray zone comprising 16 samples (±15% of the cut-off value, with 3 false positive outputs). The accuracy of the test was estimated using ROC curves (AUC = 0.97). We also created a macro in Microsoft Excel® to estimate the accuracy of the test (m-Accuracy) based on a non-parametric method, which yielded a value of 0.88, classifying our test as moderately to highly accurate. We also provide the m-Accuracy macro for download and the paper-based microplate designs for printing, in order to collaborate with the scientific community and facilitate further studies using this platform. The improvement of these diagnostic tools can bring this technology to those who need it, contributing to population health and well-being.
- Published
- 2017
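Youden's J index, used above to define the cut-off, is simply J = sensitivity + specificity − 1, maximized over candidate thresholds on the ROC curve. A minimal sketch with invented assay readings (arbitrary units, not the study's data):

```python
def youden_cutoff(scores, labels):
    """Return the threshold maximizing J = sensitivity + specificity - 1.

    labels: 1 = diseased, 0 = healthy; a sample is called positive
    when its score is >= the threshold. Assumes both classes occur.
    """
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Invented colorimetric intensities (A.U.); higher should mean IgG-positive
scores = [5, 8, 12, 18, 22, 25, 30, 35]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
t, j = youden_cutoff(scores, labels)
print(t, j)  # prints 22 1.0 for this cleanly separable toy data
```

On real, overlapping data J is below 1 and the chosen threshold trades sensitivity against specificity, which is exactly the balancing act the abstract describes.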
11. Three machine learning algorithms and their utility in exploring risk factors associated with primary cesarean section in low‐risk women: A methods paper
- Author
-
Jintong Hou and Rebecca R. S. Clark
- Subjects
Adult, medicine.medical_specialty, Adolescent, Computer science, media_common.quotation_subject, Oxytocin, Machine learning, computer.software_genre, Outcome (game theory), Article, Terminology, Machine Learning, Young Adult, 03 medical and health sciences, 0302 clinical medicine, Pregnancy, Risk Factors, Oxytocics, medicine, Humans, Obesity, 030212 general & internal medicine, Association (psychology), Function (engineering), General Nursing, media_common, 030504 nursing, Cesarean Section, business.industry, Rank (computer programming), Regression, Random forest, Cross-Sectional Studies, Female, Artificial intelligence, Outcomes research, 0305 other medical science, business, computer, Algorithm - Abstract
Machine learning, a branch of artificial intelligence, is increasingly used in health research, including nursing and maternal outcomes research. Machine learning algorithms are complex and involve statistics and terminology that are not common in health research. The purpose of this methods paper is to describe three machine learning algorithms in detail and provide an example of their use in maternal outcomes research. The three algorithms, classification and regression trees, least absolute shrinkage and selection operator, and random forest, may be used to understand risk groups, select variables for a model, and rank variables’ contribution to an outcome, respectively. While machine learning has plenty to contribute to health research, it also has some drawbacks, and these are discussed as well. In order to provide an example of the different algorithms’ function, they were used on a completed cross-sectional study examining the association of oxytocin total dose exposure with primary cesarean section. The results of the algorithms are compared to what was done or found using more traditional methods.
- Published
- 2021
12. Review Paper on Prediction of Heart Disease using Machine Learning Algorithms
- Author
-
Aadar Pandita
- Subjects
Heart disease, business.industry, Computer science, medicine, Artificial intelligence, business, medicine.disease, Machine learning, computer.software_genre, computer - Abstract
Heart disease has been one of the leading causes of death for quite some time now. About 31% of all deaths in the world every year are the result of cardiovascular diseases [1]. A majority of patients remain uninformed of their symptoms until quite late, while others find it difficult to minimise the effects of the risk factors that cause heart disease. Machine learning algorithms have been highly effective in producing accurate results, thereby preventing the onset of heart disease in many patients and reducing its impact in those already affected. They have helped medical researchers and doctors all over the world recognise patterns in patients, resulting in early detection of heart disease.
- Published
- 2021
13. Deep Learning-Enabled Point-of-Care Sensing Using Multiplexed Paper-Based Sensors
- Author
-
Hyou-Arm Joung, Omai B. Garner, Aydogan Ozcan, Dino Di Carlo, Artem Goncharov, Jesse Liang, Zachary S. Ballard, and Karina Nugroho
- Subjects
Analyte, Computer science, Coefficient of variation, Computer applications to medicine. Medical informatics, Real-time computing, R858-859.7, Medicine (miscellaneous), Health Informatics, 02 engineering and technology, lcsh:Computer applications to medicine. Medical informatics, 01 natural sciences, Multiplexing, Article, Health Information Management, Machine learning, Vertical flow, Sensitivity (control systems), Point of care, Assay systems, business.industry, Deep learning, 010401 analytical chemistry, Linearity, Diagnostic markers, Paper based, 021001 nanoscience & nanotechnology, Serum samples, 0104 chemical sciences, Computer Science Applications, Cardiovascular diseases, Medical test, Optical sensors, lcsh:R858-859.7, Artificial intelligence, 0210 nano-technology, business - Abstract
We present a deep learning-based framework to design and quantify point-of-care sensors. As its proof-of-concept and use-case, we demonstrated a low-cost and rapid paper-based vertical flow assay (VFA) for high sensitivity C-Reactive Protein (hsCRP) testing, a common medical test used for quantifying the degree of inflammation in patients at risk of cardio-vascular disease (CVD). A machine learning-based sensor design framework was developed for two key tasks: (1) to determine an optimal configuration of immunoreaction spots and conditions, spatially-multiplexed on a paper-based sensing membrane, and (2) to accurately infer the target analyte concentration based on the signals of the optimal VFA configuration. Using a custom-designed mobile-phone based VFA reader, a clinical study was performed with 85 human serum samples to characterize the quantification accuracy around the clinically defined cutoffs for CVD risk stratification. Results from blindly-tested VFAs indicate a competitive coefficient of variation of 11.2% with a linearity of R2 = 0.95; in addition to the success in the high-sensitivity CRP range (i.e., 0-10 mg/L), our results further demonstrate a mitigation of the hook-effect at higher CRP concentrations due to the incorporation of antigen capture spots within the multiplexed sensing membrane of the VFA. This paper-based computational VFA that is powered by deep learning could expand access to CVD health screening, and the presented machine learning-enabled sensing framework can be broadly used to design cost-effective and mobile sensors for various point-of-care diagnostics applications.
- Published
- 2019
14. Exam paper generation based on performance prediction of student group
- Author
-
Chenjie Mao, Changqin Huang, Tao He, and Zhengyang Wu
- Subjects
Information Systems and Management, Computer science, media_common.quotation_subject, 02 engineering and technology, Machine learning, computer.software_genre, Theoretical Computer Science, Task (project management), Artificial Intelligence, ComputingMilieux_COMPUTERSANDEDUCATION, 0202 electrical engineering, electronic engineering, information engineering, Performance prediction, Quality (business), media_common, business.industry, 05 social sciences, 050301 education, Computer Science Applications, Control and Systems Engineering, 020201 artificial intelligence & image processing, Artificial intelligence, Focus (optics), business, 0503 education, computer, Software, Student group - Abstract
Exam paper generation is an indispensable part of teaching. Existing methods focus on question extraction algorithms that require labels for each question. Obviously, manual labeling is inefficient and cannot avoid label bias. Furthermore, the quality of the exam papers generated by the existing methods is not guaranteed. To address these problems, we propose a novel approach to generating exam papers based on prediction of exam performance. Specifically, we update the initially generated questions one by one using dynamic programming, as well as in batches using genetic algorithms, and perform the prediction task using Deep Knowledge Tracing. Our approach considers the skill weight, difficulty, and distribution of exam scores. Experimental results indicate that our approach performed better than the two baselines. Furthermore, it can generate exam papers whose difficulty adapts closely to the expected levels, while keeping the distribution of the resulting student exam scores reasonable. In addition, our approach was evaluated in real learning scenarios and shows advantages.
- Published
- 2020
15. [Paper] Image Retrieval Based on Supervised Local Regression and Global Alignment with Relevance Feedback for Insect Identification
- Author
-
Takahiro Ogawa, Keisuke Maeda, Susumu Genma, and Miki Haseyama
- Subjects
relevance feedback, Computer science, business.industry, insect identification, Local regression, Relevance feedback, Insect identification, sLRGA, Machine learning, computer.software_genre, Computer Graphics and Computer-Aided Design, Signal Processing, Media Technology, Pairwise sequence alignment, Artificial intelligence, Image retrieval, business, computer - Abstract
A method for image retrieval based on supervised local regression and global alignment (sLRGA) with relevance feedback for insect identification is presented in this paper. Based on the novel sLRGA, an extended version of LRGA, the proposed method estimates ranking scores for image retrieval in such a way that the neighborhood structure of the database's feature space is optimally preserved while taking class information into consideration. This is the main contribution of this paper. By measuring the relevance between the query image and all of the images in the database, sLRGA realizes accurate image retrieval. Furthermore, when users assign positive/negative labels to retrieved images, the proposed method can improve retrieval performance by exploiting this query relevance information via both relevance feedback and sLRGA. This is the second contribution of this paper. Experimental results show the effectiveness of the proposed method.
- Published
- 2020
16. Brief Paper: Augmentation of Hidden Markov Chain for Complex Sequential Data in Context
- Author
-
Bong-Kee Sin
- Subjects
Model inference, business.industry, Computer science, Context (language use), Segmentation, Sequential data, Artificial intelligence, business, Machine learning, computer.software_genre, Hidden Markov model, computer - Published
- 2021
17. Expert System for the Identification of Review Papers Using Ensemble Learning
- Author
-
Ghulam Mustafa
- Subjects
Identification (information) ,business.industry ,Computer science ,Artificial intelligence ,computer.software_genre ,business ,Machine learning ,computer ,Ensemble learning ,Expert system - Published
- 2021
18. Optimizing a literature surveillance strategy to retrieve sound overall prognosis and risk assessment model papers
- Author
-
Peter LaVita, Patricia L. Kavanagh, Francine Frater, Alfonso Iorio, Rick Parrish, and Tamara Navarro
- Subjects
PubMed ,Computer science ,Mesh term ,business.industry ,media_common.quotation_subject ,Information Storage and Retrieval ,Health Informatics ,Guideline ,Prognosis ,Research and Applications ,Machine learning ,computer.software_genre ,Risk Assessment ,Sensitivity and Specificity ,Humans ,Quality (business) ,Artificial intelligence ,Risk assessment ,business ,computer ,media_common - Abstract
Objective Our aim was to develop an efficient search strategy for prognostic studies and clinical prediction guides (CPGs), optimally balancing sensitivity and precision while independent of MeSH terms, as relying on them may miss the most current literature. Materials and Methods We combined 2 Hedges-based search strategies, modified to remove MeSH terms for overall prognostic studies and CPGs, and ran the search on 269 journals. We read abstracts from a random subset of retrieved references until ≥ 20 per journal were reviewed and classified them as positive when fulfilling standardized quality criteria, thereby assembling a standard dataset used to calibrate the search strategy. We determined performance characteristics of our new search strategy against the Hedges standard and performance characteristics of published search strategies against the standard dataset. Results Our search strategy retrieved 16 089 references from 269 journals during our study period. One hundred fifty-four journals yielded ≥ 20 references and ≥ 1 prognostic study or CPG. Against the Hedges standard, the new search strategy had sensitivity/specificity/precision/accuracy of 84%/80%/2%/80%, respectively. Existing published strategies tested against our standard dataset had sensitivities of 36%–94% and precision of 5%–10%. Discussion We developed a new search strategy to identify overall prognosis studies and CPGs independent of MeSH terms. These studies are important for medical decision-making, as they identify specific populations and individuals who may benefit from interventions. Conclusion Our results may benefit literature surveillance and clinical guideline efforts, as our search strategy performs as well as published search strategies while capturing literature at the time of publication.
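The sensitivity/specificity/precision/accuracy figures reported above follow directly from standard confusion-matrix definitions; a minimal sketch (the counts below are invented for illustration, not the study's data):

```python
def retrieval_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, precision, and accuracy from confusion counts:
    tp = retrieved and relevant, fp = retrieved but not relevant,
    fn = missed relevant, tn = correctly not retrieved."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }
```

Note how a strategy can score high on sensitivity and accuracy while precision stays low, as in the 84%/80%/2%/80% result: when relevant references are rare, even a modest false-positive rate swamps the retrieved set.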
- Published
- 2021
19. Quantitative and Qualitative Approach of Scientific Paper Popularity By Naïve Bayes Classifier
- Author
-
Glauber Tadaiesky Marques, J. Felipe Almeida, Tobias Ribeiro Sombra, P.H.O.V. Campos, Otavio Andre Chase, Alex de Jesus Zissou, Paulo Cerqueira dos Santos Júnior, Emerson Cordeiro Morais, Rose Marie Santini, and Walmir Oliveira Couto
- Subjects
Naive Bayes classifier ,Computer science ,business.industry ,General Medicine ,Artificial intelligence ,Machine learning ,computer.software_genre ,business ,Popularity ,computer - Abstract
Usually, scientific research begins with the collection of data, for which online social media tools can be among the most rewarding and informative resources. The extensive amount of accessible information attracts users ranging from undergraduate students to postdocs. Searching for scientific themes has become popular due to the abundance of publications residing in scientific social networks such as Mendeley and ResearchGate. Articles are published on these media in the form of text for knowledge dissemination, scientific support, research, updates, etc., and are frequently uploaded after publication in a proceedings or journal. Data collected from such databases often contain high noise, and their analysis can be treated as a classification task, as it groups the presentation of a content into either good or bad. In this text, we present a quantitative and qualitative analysis of paper popularity in the Mendeley repository using a naive Bayes classifier.
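A multinomial naive Bayes classifier of the kind applied here fits in a few lines. The sketch below is illustrative only: the toy vocabulary and the "good"/"bad" popularity labels are assumptions, not the paper's data or feature set.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Fit a multinomial naive Bayes model on tokenized documents."""
    classes = sorted(set(labels))
    priors = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    vocab = set()
    for doc, y in zip(docs, labels):
        counts[y].update(doc)
        vocab.update(doc)
    loglik = {}
    for c in classes:
        # Laplace smoothing; one extra slot for unseen words.
        total = sum(counts[c].values()) + len(vocab) + 1
        loglik[c] = {w: math.log((counts[c][w] + 1) / total) for w in vocab}
        loglik[c]["<unk>"] = math.log(1 / total)
    return priors, loglik

def predict_nb(model, doc):
    """Return the class with the highest posterior log-probability."""
    priors, loglik = model
    return max(priors, key=lambda c: priors[c] +
               sum(loglik[c].get(w, loglik[c]["<unk>"]) for w in doc))
```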
- Published
- 2020
20. Categorisation of Computer Science Research Papers using Supervised Machine Learning Techniques
- Author
-
Sameerchand Pudaruth and Hemrajsingh Gheeseewan
- Subjects
Computer Networks and Communications ,Computer science ,business.industry ,Document classification ,Deep learning ,computer.software_genre ,Logistic regression ,Machine learning ,Computer Graphics and Computer-Aided Design ,Human-Computer Interaction ,Artificial Intelligence ,Management of Technology and Innovation ,Artificial intelligence ,business ,computer ,Information Systems - Published
- 2020
21. Reproducibility Companion Paper
- Author
-
Hong-Han Shuai, Wai Keung Wong, Xun Yang, Lizi Liao, Jinyoung Moon, Yunshan Ma, Tat-Seng Chua, and Yujuan Ding
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Computer science ,business.industry ,Artifact (software development) ,Python (programming language) ,Machine learning ,computer.software_genre ,Replication (computing) ,Computer Science - Information Retrieval ,Machine Learning (cs.LG) ,Multimedia (cs.MM) ,Trend analysis ,Artificial intelligence ,Time series ,business ,computer ,Computer Science - Multimedia ,Information Retrieval (cs.IR) ,computer.programming_language - Abstract
This companion paper supports the replication of the fashion trend forecasting experiments with the KERN (Knowledge Enhanced Recurrent Network) method that we presented in the ICMR 2020. We provide an artifact that allows the replication of the experiments using a Python implementation. The artifact is easy to deploy with simple installation, training and evaluation. We reproduce the experiments conducted in the original paper and obtain similar performance as previously reported. The replication results of the experiments support the main claims in the original paper.
- Published
- 2021
22. Application of Computational Intelligence Methods for the Automated Identification of Paper-Ink Samples Based on LIBS
- Author
-
Ozal Yildirim, Tomasz Łojewski, Paweł Pławiak, Krzysztof Rzecki, Tomasz Sośnicki, Małgorzata Król, U. Rajendra Acharya, Mateusz Baran, and Michał Niedźwiecki
- Subjects
Computer science ,Decision tree ,Computational intelligence ,02 engineering and technology ,lcsh:Chemical technology ,01 natural sciences ,Biochemistry ,Spectral line ,Article ,Analytical Chemistry ,computational intelligence methods ,Probabilistic neural network ,0202 electrical engineering, electronic engineering, information engineering ,Preprocessor ,artificial_intelligence_robotics ,lcsh:TP1-1185 ,Electrical and Electronic Engineering ,Spectroscopy ,Instrumentation ,LIBS ,Artificial neural network ,business.industry ,010401 analytical chemistry ,Pattern recognition ,paper-ink analysis ,Perceptron ,Atomic and Molecular Physics, and Optics ,0104 chemical sciences ,Random forest ,Support vector machine ,machine learning ,classification ,discrimination power ,020201 artificial intelligence & image processing ,Artificial intelligence ,business - Abstract
Laser-induced breakdown spectroscopy (LIBS) is an important analysis technique with applications in many industrial branches and fields of scientific research. Nowadays, the advantages of LIBS are impaired by its main drawback: the interpretation of obtained spectra and identification of observed spectral lines. This procedure is highly time-consuming since it is essentially based on the comparison of lines present in the spectrum with a literature database. This paper proposes the use of various computational intelligence methods to develop a reliable and fast classification of quasi-destructively acquired LIBS spectra into a set of predefined classes. We focus on the specific problem of classifying paper-ink samples into 30 separate, predefined classes. For each of the 30 classes (10 pens of each of 5 ink types combined with 10 sheets of 5 paper types plus empty pages), 100 LIBS spectra are collected. Four variants of preprocessing, seven classifiers (decision trees, random forest, k-nearest neighbor, support vector machine, probabilistic neural network, multi-layer perceptron, and generalized regression neural network), 5-fold stratified cross-validation, and a test on an independent set (for method evaluation) are employed. Our developed system yielded an accuracy of 99.08%, obtained using the random forest classifier. Our results clearly demonstrate that machine learning methods can reliably identify paper-ink samples based on LIBS at a faster rate.
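One of the seven classifiers evaluated, k-nearest neighbor, is simple enough to sketch. The three-channel "spectra" below are toy stand-ins for real LIBS intensity vectors, and the distance metric and k value are assumptions of the sketch, not the paper's configuration.

```python
import math
from collections import Counter

def knn_classify(train_spectra, train_labels, spectrum, k=3):
    """Majority vote among the k training spectra closest in Euclidean distance."""
    ranked = sorted(
        (math.dist(s, spectrum), y) for s, y in zip(train_spectra, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In the study's setup this would be wrapped in 5-fold stratified cross-validation over the 100 spectra per class, with the preprocessing variant chosen by the same protocol.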
- Published
- 2018
23. Recent trends in real estate research: a comparison of recent working papers and publications using machine learning algorithms
- Author
-
Bertram I. Steininger and Wolfgang Breuer
- Subjects
Economics and Econometrics ,LDA ,Computer science ,real estate ,Real estate ,recent trends ,Latent Dirichlet allocation ,symbols.namesake ,Order (exchange) ,0502 economics and business ,Relevance (information retrieval) ,Latent Dirichlet Allocation ,Business and International Management ,Potential impact ,R30 ,050208 finance ,05 social sciences ,Data science ,Ekonomi och näringsliv ,Editorial ,machine learning ,Economics and Business ,Section (archaeology) ,Human resource management ,C80 ,symbols ,Unsupervised learning ,050203 business & management ,Lead time ,C45 - Abstract
Journal of Business Economics (JBE) 90(7), 963-974 (2020). doi:10.1007/s11573-020-01005-w special issue: "Special Issue: Recent Trends in Real Estate Research / Issue editors: Wolfgang Breuer ; Bertram Steininger", Published by Springer, Berlin ; Heidelberg
- Published
- 2020
24. Survey Paper on Algorithms used for Sentiment Analysis
- Author
-
Meghashree K
- Subjects
Computer science ,business.industry ,Sentiment analysis ,Artificial intelligence ,Machine learning ,computer.software_genre ,business ,computer - Published
- 2020
25. Survey Paper on Fraud Detection in Medicare Using Machine Learning
- Author
-
S Muthulakshmi
- Subjects
Psychiatry and Mental health ,Clinical Psychology ,Computer science ,business.industry ,Artificial intelligence ,Pshychiatric Mental Health ,business ,Machine learning ,computer.software_genre ,computer - Published
- 2020
26. Machine learning approaches and databases for prediction of drug–target interaction: a survey paper
- Author
-
Maureen A. Sartor, Maryam Bagherian, Zaneta Nikolovska-Coleska, Kai Wang, Kayvan Najarian, and Elyas Sabeti
- Subjects
Databases, Factual ,AcademicSubjects/SCI01060 ,Computer science ,Process (engineering) ,Drug target ,Review Article ,computer.software_genre ,Machine learning ,Task (project management) ,Machine Learning ,03 medical and health sciences ,0302 clinical medicine ,Drug Discovery ,DTI software ,Humans ,Set (psychology) ,Molecular Biology ,030304 developmental biology ,0303 health sciences ,Database ,business.industry ,drug–target interaction prediction ,Computational Biology ,Key (cryptography) ,Artificial intelligence ,Erratum ,business ,computer ,DTI database ,030217 neurology & neurosurgery ,Information Systems - Abstract
The task of predicting the interactions between drugs and targets plays a key role in the process of drug discovery. There is a need to develop novel and efficient prediction approaches in order to avoid costly and laborious yet not-always-deterministic experiments to determine drug–target interactions (DTIs) by experiments alone. These approaches should be capable of identifying the potential DTIs in a timely manner. In this article, we describe the data required for the task of DTI prediction followed by a comprehensive catalog consisting of machine learning methods and databases, which have been proposed and utilized to predict DTIs. The advantages and disadvantages of each set of methods are also briefly discussed. Lastly, the challenges one may face in prediction of DTI using machine learning approaches are highlighted, and we conclude by shedding some light on important future research directions.
- Published
- 2020
27. Cross-Validation, Risk Estimation, and Model Selection: Comment on a Paper by Rosset and Tibshirani
- Author
-
Stefan Wager
- Subjects
Statistics and Probability ,Estimation ,Computer science ,business.industry ,Model selection ,05 social sciences ,Machine learning ,computer.software_genre ,01 natural sciences ,Cross-validation ,Task (project management) ,010104 statistics & probability ,0502 economics and business ,Range (statistics) ,Artificial intelligence ,0101 mathematics ,Statistics, Probability and Uncertainty ,business ,computer ,050205 econometrics - Abstract
How best to estimate the accuracy of a predictive rule has been a longstanding question in statistics. Approaches to this task range from simple methods like Mallow’s Cp to algorithmic techniques l...
- Published
- 2020
28. MLCAD: A Survey of Research in Machine Learning for CAD Keynote Paper
- Author
-
David Z. Pan, Yibo Lin, Jorg Henkel, Marilyn Wolf, Martin Rapp, Bei Yu, and Hussam Amrouch
- Subjects
business.industry ,Computer science ,Heuristic (computer science) ,Reliability (computer networking) ,media_common.quotation_subject ,DATA processing & computer science ,Brute-force search ,CAD ,02 engineering and technology ,Machine learning ,computer.software_genre ,Computer Graphics and Computer-Aided Design ,020202 computer hardware & architecture ,Set (abstract data type) ,Open research ,0202 electrical engineering, electronic engineering, information engineering ,Quality (business) ,Artificial intelligence ,Configuration space ,Electrical and Electronic Engineering ,ddc:004 ,business ,computer ,Software ,media_common - Abstract
Due to the increasing size of integrated circuits (ICs), their design and optimization phases (i.e., computer-aided design, CAD) grow increasingly complex. At design time, a large design space needs to be explored to find an implementation that fulfills all specifications and then optimizes metrics like energy, area, delay, reliability, etc. At run time, a large configuration space needs to be searched to find the best set of parameters (e.g., voltage/frequency) to further optimize the system. Both spaces are infeasible for exhaustive search, typically leading to heuristic optimization algorithms that find some trade-off between design quality and computational overhead. Machine learning (ML) can build powerful models that have successfully been employed in related domains. In this survey, we categorize how ML may be used and is used for design-time and run-time optimization and exploration strategies of ICs. A meta-study of published techniques unveils areas in CAD that are well-explored and underexplored with ML, as well as trends in the employed algorithms. We present a comprehensive categorization and summary of the state of the art on ML for CAD. Finally, we summarize remaining challenges and promising open research directions.
- Published
- 2022
- Full Text
- View/download PDF
29. Colorimetric detection on paper analytical device using machine learning
- Author
-
Basant Giri, Pravin Pokhrel, Bidur Khanal, and Bishesh Khanal
- Subjects
Analyte ,Artificial neural network ,business.industry ,Computer science ,HSL and HSV ,Color space ,Machine learning ,computer.software_genre ,Sample (graphics) ,Random forest ,Support vector machine ,RGB color model ,Artificial intelligence ,business ,computer - Abstract
Paper-based analytical devices (PADs) employing colorimetric detection and smartphone images have gained wider acceptance in a variety of measurement applications. The PADs are primarily meant to be used in field settings where assay and imaging conditions vary greatly, resulting in less accurate results. Recently, machine learning (ML) assisted models have been used in image analysis. We evaluated combinations of four ML models - logistic regression, support vector machine, random forest, and artificial neural network - and three image color spaces - RGB, HSV, and LAB - for their ability to accurately predict analyte concentrations. We used images of PADs taken at varying lighting conditions, with different cameras and users, for food color and enzyme inhibition assays to create training and test datasets. Prediction accuracy was higher for food color than for enzyme inhibition assays in most ML model and colorspace combinations. All models better predicted coarse-level classification than fine-grained concentration labels. ML models using the sample color along with a reference color increased predictive ability, likely because the reference color partially factored out the variation in ambient assay and imaging conditions. The best concentration label prediction accuracy obtained for food color was 0.966, using the ANN model and LAB colorspace. The accuracy for the enzyme inhibition assay was 0.908, using the SVM model and LAB colorspace. Appropriate model and colorspace combinations can be useful to analyze large numbers of samples on PADs as a powerful low-cost quick field-testing tool.
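The core idea of reference-color-aided colorimetric reading can be sketched with the standard library. Note the assumptions: the reference RGB values below are invented, the sketch matches against fixed reference colors by nearest-neighbor distance rather than training one of the paper's four ML models, and it uses HSV (available via stdlib `colorsys`) rather than the LAB space that performed best in the study.

```python
import colorsys
import math

# Hypothetical reference colors: mean spot RGB (0-1 range) per known concentration.
REFERENCE = {
    0.0: (0.95, 0.95, 0.95),  # blank spot
    0.5: (0.80, 0.55, 0.55),
    1.0: (0.60, 0.15, 0.15),  # fully developed spot
}

def predict_concentration(rgb):
    """Assign the concentration whose reference color is nearest in HSV space.
    (Sketch only: it ignores hue wraparound, which matters for reds near h=0.)"""
    hsv = colorsys.rgb_to_hsv(*rgb)
    return min(
        REFERENCE,
        key=lambda c: math.dist(hsv, colorsys.rgb_to_hsv(*REFERENCE[c])),
    )
```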
- Published
- 2021
30. Data Augmentation Applied to Machine Learning-Based Monitoring of a Pulp and Paper Process
- Author
-
Andréa Pereira Parente, R. O. M. Folly, Andrea Valdman, and Maurício B. de Souza
- Subjects
0209 industrial biotechnology ,Computer science ,Nearest neighbor search ,Bioengineering ,02 engineering and technology ,Machine learning ,computer.software_genre ,Fault detection and isolation ,Data-driven ,pulp and paper industry ,020901 industrial engineering & automation ,0202 electrical engineering, electronic engineering, information engineering ,Chemical Engineering (miscellaneous) ,Data collection ,Artificial neural network ,business.industry ,Process Chemistry and Technology ,Process (computing) ,neural networks ,Monte Carlo technique ,study case ,machine learning ,Data point ,Process safety ,data-driven ,FDD ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer - Abstract
Industrial archived process data represent a convenient source of information for data-driven models, such as artificial neural networks (ANN), that can be used for safety and efficiency improvement, like early or even predictive fault detection and diagnosis (FDD). Nonetheless, most of the data used for model generation are representative of the process nominal states and therefore are not enough for classification problems intended to determine abnormal process conditions. This work proposes the use of techniques to augment the original real data standards, eliminating the need for experiments that could jeopardize process safety. It uses the Monte Carlo technique to artificially increase the number of model inputs, coupled with nearest neighbor search (NNS) by geometric distances to consistently classify the generated patterns as normal or faulty. Finally, a radial basis function neural network is trained with the augmented data. The methodology was validated by a case study in which 3381 pulp and paper industrial data points were expanded to monitor the formation of particles in a recovery boiler. Only 5.8% of the original process data were examples of faulty conditions, but the new expanded and balanced data collection improved the classification performance of the neural network, allowing its future use for monitoring purposes.
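The augmentation scheme - Monte Carlo jitter plus nearest-neighbor labeling - can be sketched in a few lines. The noise scale, seed, and two-dimensional toy points are assumptions of the sketch; the study operated on real multivariate process measurements and fed the result to an RBF network.

```python
import math
import random

def augment(points, labels, n_new, noise=0.05, seed=0):
    """Monte Carlo augmentation: jitter randomly chosen originals with Gaussian
    noise, then label each synthetic point via its nearest original neighbor."""
    rng = random.Random(seed)
    new_pts, new_lbls = [], []
    for _ in range(n_new):
        base = points[rng.randrange(len(points))]
        cand = [v + rng.gauss(0, noise) for v in base]
        nearest = min(range(len(points)),
                      key=lambda i: math.dist(points[i], cand))
        new_pts.append(cand)
        new_lbls.append(labels[nearest])
    return new_pts, new_lbls
```

To rebalance a dataset with few faulty examples, `n_new` would be drawn mostly around the minority-class points.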
- Published
- 2019
31. On the Safety of Automotive Systems Incorporating Machine Learning Based Components: A Position Paper
- Author
-
Andrea Bondavalli, Paolo Lollini, Elvio Gilberto Amparore, Susanna Donatelli, Marco Botta, and Mohamad Gharib
- Subjects
Functional safety ,business.industry ,Computer science ,Reliability (computer networking) ,020207 software engineering ,02 engineering and technology ,Safety standards ,Machine learning ,computer.software_genre ,Sketch ,Software ,0202 electrical engineering, electronic engineering, information engineering ,Position paper ,Dependability ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer ,Verification and validation - Abstract
Machine learning (ML) components are increasingly adopted in many automated systems. Their ability to learn and work with novel input/incomplete knowledge and their generalization capabilities make them highly desirable solutions for complex problems. This has motivated the inclusion of ML techniques/components in products for many industrial domains, including automotive systems. Such systems are safety-critical, since their failure may cause death or injury to humans. Therefore, their safety must be ensured before they are used in their operational environment. However, existing safety standards and Verification and Validation (V&V) techniques do not properly address the special characteristics of ML-based components, such as non-determinism, non-transparency, and instability. This position paper presents the authors' view on the safety of automotive systems incorporating ML-based components, and it is intended to motivate and sketch a research agenda for extending a safety standard, namely ISO 26262, to address challenges posed by incorporating ML-based components in automotive systems.
- Published
- 2018
32. Editorial for Special Issue: 'Feature Papers of Forecasting'
- Author
-
Sonia Leva
- Subjects
2019-20 coronavirus outbreak ,Coronavirus disease 2019 (COVID-19) ,Computer science ,business.industry ,Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) ,lcsh:Mathematics ,Machine learning ,computer.software_genre ,lcsh:QA1-939 ,n/a ,Feature (computer vision) ,Artificial intelligence ,business ,lcsh:Science (General) ,computer ,lcsh:Q1-390 - Abstract
Nowadays, forecasting applications are receiving unprecedented attention thanks to their capability to improve decision-making processes by providing useful indications [...]
- Published
- 2021
33. Predicting rank for scientific research papers using supervised learning
- Author
-
Mohamed El Mohadab, Said Safi, and Belaid Bouikhalene
- Subjects
Process (engineering) ,Computer science ,0102 computer and information sciences ,02 engineering and technology ,Machine learning ,computer.software_genre ,01 natural sciences ,Ranking (information retrieval) ,Task (project management) ,0202 electrical engineering, electronic engineering, information engineering ,lcsh:T58.5-58.64 ,business.industry ,lcsh:Information technology ,Supervised learning ,Rank (computer programming) ,Computer Science Applications ,ComputingMethodologies_PATTERNRECOGNITION ,010201 computation theory & mathematics ,Classification methods ,020201 artificial intelligence & image processing ,Learning to rank ,Artificial intelligence ,business ,Precision and recall ,computer ,Software ,Information Systems - Abstract
Automatic data processing represents the future for the development of any system, especially in scientific research. In this paper, we describe one of the automatic classification methods applied to scientific research as a supervised learning task. Throughout the process, we identify the main features that play a significant role in predicting the new rank under the supervised learning setup. First, we provide an overview of the work that has been realized in ranking scientific research papers. Second, we evaluate and compare some state-of-the-art methods for classification by supervised, semi-supervised, and unsupervised learning. During the preliminary tests, we obtained good performance on a realistic corpus; we then compared performance metrics such as NDCG, MAP, GMAP, F-Measure, Precision, and Recall in order to identify the influential features in our work. Keywords: Scientific research, Ranking scientific research papers, Data mining, Supervised learning, Multilayer perceptron algorithm
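Of the ranking metrics compared above, NDCG is the least self-explanatory and worth a minimal definition. This is the textbook formulation, not code from the paper; the relevance grades in the example are arbitrary.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance grades:
    each grade is discounted by log2 of its (1-based) rank plus one."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(ranked_relevances):
    """Normalize DCG by the DCG of the ideal (descending) ordering,
    so a perfect ranking scores 1.0."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0
```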
- Published
- 2019
34. Paper-and-pencil questionnaires analysis: a new automated technique to reduce analysis time and errors
- Author
-
Aurelie Collado, Olivier Hue, Boris Cheval, and Clovis Chabert
- Subjects
Computer science ,business.industry ,Limiting ,Asset (computer security) ,Machine learning ,computer.software_genre ,Automated technique ,Code (cryptography) ,Artificial intelligence ,business ,Scale (map) ,Reference model ,computer ,Pencil (mathematics) ,Reliability (statistics) - Abstract
Background and Objective: Questionnaires are essential tools in many scientific fields, including health and medicine. However, the analysis of paper-and-pencil questionnaires is time consuming, error prone, and expensive, limiting its use in large cohort studies. Computer-based questionnaires might be a valuable alternative, but they may introduce bias, especially for sensitive questions, and they require programming skills. The aim of this study is to develop a reliable and adaptable open-source technique (i.e., LightQuest) to automatically analyse various types of scanned paper-and-pencil questionnaires with closed questions, including those with inverted scales. Methods: To evaluate the usefulness of LightQuest, the time needed by 7 experimenters to manually code 10 sets of 4 frequently used questionnaires, and the number of errors they made (i.e., reliability), were compared with the time and errors made using LightQuest. Results: LightQuest was twice as fast as the manual analysis, even when the time to create the reference model was taken into account (933 s vs. 1935 s, t(2)=8.81, p…) question⁻¹ for the manual technique versus 0.55 s.question⁻¹ for LightQuest (t(2)=22.5, p…). Conclusion: LightQuest demonstrated clear superiority both in time and in reliability. The script of this first open-source technique, which does not require programming skills, is downloadable in supplemental data and may become an asset for all studies using questionnaires.
- Published
- 2021
35. Automatic test suite generation for key-points detection DNNs using many-objective search (experience paper)
- Author
-
Donghwan Shin, Jun Wang, Fitash Ul Haq, Lionel C. Briand, and Thomas Stifter
- Subjects
FOS: Computer and information sciences ,Test data generation ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,Key-point detection ,Automotive industry ,02 engineering and technology ,Machine learning ,computer.software_genre ,Image (mathematics) ,Computer Science - Software Engineering ,Random search ,Search algorithm ,0202 electrical engineering, electronic engineering, information engineering ,Test suite ,Computer science [C05] [Engineering, computing & technology] ,business.industry ,deep neural network ,software testing ,020207 software engineering ,Sciences informatiques [C05] [Ingénierie, informatique & technologie] ,Software Engineering (cs.SE) ,Key (cryptography) ,many-objective search algorithm ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer ,Test data - Abstract
Automatically detecting the positions of key-points (e.g., facial key-points or finger key-points) in an image is an essential problem in many applications, such as driver's gaze detection and drowsiness detection in automated driving systems. With the recent advances of Deep Neural Networks (DNNs), Key-Points detection DNNs (KP-DNNs) have been increasingly employed for that purpose. Nevertheless, KP-DNN testing and validation have remained a challenging problem because KP-DNNs predict many independent key-points at the same time -- where each individual key-point may be critical in the targeted application -- and images can vary a great deal according to many factors. In this paper, we present an approach to automatically generate test data for KP-DNNs using many-objective search. In our experiments, focused on facial key-points detection DNNs developed for an industrial automotive application, we show that our approach can generate test suites to severely mispredict, on average, more than 93% of all key-points. In comparison, random search-based test data generation can only severely mispredict 41% of them. Many of these mispredictions, however, are not avoidable and should not therefore be considered failures. We also empirically compare state-of-the-art, many-objective search algorithms and their variants, tailored for test suite generation. Furthermore, we investigate and demonstrate how to learn specific conditions, based on image characteristics (e.g., head posture and skin color), that lead to severe mispredictions. Such conditions serve as a basis for risk analysis or DNN retraining. (To appear in ISSTA 2021.)
- Published
- 2021
36. Feature Paper in Environmental Chemistry and Technology
- Author
-
Daniela Varrica and Varrica D.
- Subjects
Technology ,Computer science ,business.industry ,Health, Toxicology and Mutagenesis ,Public Health, Environmental and Occupational Health ,Environment ,Machine learning ,computer.software_genre ,Editorial ,n/a ,Feature (computer vision) ,Medicine ,Artificial intelligence ,business ,computer - Abstract
Attention to the environment and its problems has undergone unprecedented growth in recent years [...]
- Published
- 2021
37. The quiet revolution in machine vision - A state-of-the-art survey paper, including historical review, perspectives, and future directions
- Author
-
Mark F. Hansen, Melvyn L. Smith, and Lyndon N. Smith
- Subjects
0209 industrial biotechnology ,General Computer Science ,Computer science ,Machine vision ,media_common.quotation_subject ,Control (management) ,Big data ,Centre for Machine Vision ,02 engineering and technology ,state-of-the-art ,Field (computer science) ,020901 industrial engineering & automation ,Engineering ,0202 electrical engineering, electronic engineering, information engineering ,Quality (business) ,Industrial Revolution ,media_common ,business.industry ,General Engineering ,deep learning ,machine vision ,Data science ,machine learning ,Key (cryptography) ,020201 artificial intelligence & image processing ,State (computer science) ,business - Abstract
Over the past few years, what might not unreasonably be described as a true revolution has taken place in the field of machine vision, radically altering the way many things had previously been done and offering new and exciting opportunities for those able to quickly embrace and master the new techniques. Rapid developments in machine learning, largely enabled by faster GPU-equipped computing hardware, has facilitated an explosion of machine vision applications into hitherto extremely challenging or, in many cases, previously impossible to automate industrial tasks. Together with developments towards an internet of things and the availability of big data, these form key components of what many consider to be the fourth industrial revolution. This transformation has dramatically improved the efficacy of some existing machine vision activities, such as in manufacturing (e.g. inspection for quality control and quality assurance), security (e.g. facial biometrics) and in medicine (e.g. detecting cancers), while in other cases has opened up completely new areas of use, such as in agriculture and construction (as well as in the existing domains of manufacturing and medicine). Here we will explore the history and nature of this change, what underlies it, what enables it, and the impact it has had - the latter by reviewing several recent indicative applications described in the research literature. We will also consider the continuing role that traditional or classical machine vision might still play. Finally, the key future challenges and developing opportunities in machine vision will also be discussed.
- Published
- 2021
38. Machine Learning for Paper Grammage Prediction Based on Sensor Measurements in Paper Mills
- Author
-
Hosny A. Abbas
- Subjects
FOS: Computer and information sciences ,Grammage ,Computer Science - Machine Learning ,Interface (Java) ,Generalization ,business.industry ,Computer science ,Process (computing) ,Machine learning ,computer.software_genre ,Automation ,Machine Learning (cs.LG) ,Control system ,Production (economics) ,Mill ,artificial_intelligence_robotics ,Artificial intelligence ,business ,computer - Abstract
Automation is at the core of modern industry. It aims to increase production rates, decrease production costs, and reduce human intervention in order to avoid human mistakes and time delays during manufacturing. On the other hand, human assistance is usually required to customize products and reconfigure control systems through a special process interface called a Human Machine Interface (HMI). Machine Learning (ML) algorithms can effectively be used to resolve this tradeoff between full automation and human assistance. This paper provides an example of the industrial application of ML algorithms to help human operators save their mental effort and avoid time delays and unintended mistakes for the sake of high production rates. Based on real-time sensor measurements, several ML algorithms have been tried to classify paper rolls according to paper grammage in a white paper mill. The performance evaluation shows that the AdaBoost algorithm is the best ML algorithm for this application, with classification accuracy (CA), precision, and recall of 97.1%. The generalization of the proposed approach for achieving cost-effective mill construction will be the subject of our future research.
- Published
- 2019
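The grammage-classification setup described in entry 38 boils down to boosting weak classifiers over sensor features. A minimal from-scratch sketch of AdaBoost with one-dimensional threshold stumps (the toy data layout is an assumption, not the paper's mill measurements):

```python
import math

def train_adaboost(X, y, n_rounds=10):
    """AdaBoost with one-dimensional threshold stumps; labels y in {-1, +1}.
    A from-scratch illustration, not the paper's actual implementation."""
    n = len(X)
    w = [1.0 / n] * n                    # uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        best = None                      # (weighted error, feature, threshold, polarity)
        for f in range(len(X[0])):
            for t in sorted({row[f] for row in X}):
                for pol in (1, -1):
                    err = sum(wi for wi, row, yi in zip(w, X, y)
                              if (pol if row[f] >= t else -pol) != yi)
                    if best is None or err < best[0]:
                        best = (err, f, t, pol)
        err, f, t, pol = best
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
        ensemble.append((alpha, f, t, pol))
        # up-weight misclassified samples, then renormalize
        w = [wi * math.exp(-alpha * yi * (pol if row[f] >= t else -pol))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of the stumps."""
    score = sum(alpha * (pol if x[f] >= t else -pol)
                for alpha, f, t, pol in ensemble)
    return 1 if score >= 0 else -1
```

In practice one would reach for a library implementation such as scikit-learn's `AdaBoostClassifier`; the loop above only illustrates the reweighting idea.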
39. Toward a Progress Indicator for Machine Learning Model Building and Data Mining Algorithm Execution: A Position Paper
- Author
-
Gang Luo
- Subjects
Computer science ,business.industry ,Geography, Planning and Development ,02 engineering and technology ,Machine learning ,computer.software_genre ,Data mining algorithm ,Article ,Task (project management) ,Data mining software ,Load management ,Software ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,General Earth and Planetary Sciences ,Position paper ,020201 artificial intelligence & image processing ,Artificial intelligence ,Software system ,business ,Model building ,computer ,Water Science and Technology - Abstract
For user-friendliness, many software systems offer progress indicators for long-duration tasks. A typical progress indicator continuously estimates the remaining task execution time as well as the portion of the task that has been finished. Building a machine learning model often takes a long time, but no existing machine learning software supplies a non-trivial progress indicator. Similarly, running a data mining algorithm often takes a long time, but no existing data mining software provides a non-trivial progress indicator. In this article, we consider the problem of offering progress indicators for machine learning model building and data mining algorithm execution. We discuss the goals and challenges intrinsic to this problem. Then we describe an initial framework for implementing such progress indicators and two advanced, potential uses of them, with the goal of inspiring future research on this topic.
- Published
- 2017
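The kind of progress indicator entry 39 argues for can be approximated, in its simplest form, by extrapolating the observed processing rate. A minimal sketch (the class name and the linear-extrapolation assumption are illustrative, not the article's framework):

```python
import time

class ProgressIndicator:
    """Estimates completed fraction and remaining time for a long-running
    task by extrapolating the processing rate observed so far."""

    def __init__(self, total_units):
        self.total = total_units
        self.done = 0
        self.start = time.monotonic()

    def advance(self, units=1):
        """Record that `units` more work units have finished."""
        self.done += units

    def fraction_done(self):
        return self.done / self.total

    def remaining_seconds(self):
        elapsed = time.monotonic() - self.start
        if self.done == 0:
            return float('inf')      # no rate information yet
        rate = self.done / elapsed   # units per second so far
        return (self.total - self.done) / rate
```

A non-trivial indicator for model building would replace `total_units` with a cost model of the learning algorithm (e.g. epochs times per-epoch cost), which is exactly the harder problem the article discusses.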
40. High Precision Digitization of Paper-Based ECG Records: A Step Toward Machine Learning
- Author
-
Ali El Hajj, Hassan Ghaziri, Ossama K. Abou Hassan, Lise Safatly, Mohammed Baydoun, and Hussain Isma'eel
- Subjects
lcsh:Medical technology ,020205 medical informatics ,Computer science ,Biomedical Engineering ,Image processing ,02 engineering and technology ,030204 cardiovascular system & hematology ,lcsh:Computer applications to medicine. Medical informatics ,Machine learning ,computer.software_genre ,QT interval ,Article ,Correlation ,03 medical and health sciences ,QRS complex ,0302 clinical medicine ,0202 electrical engineering, electronic engineering, information engineering ,medicine ,cardiovascular diseases ,PR interval ,MATLAB ,Digitization ,computer.programming_language ,medicine.diagnostic_test ,business.industry ,Matlab tool ,General Medicine ,Electrocardiogram ,image processing ,lcsh:R855-855.5 ,digitization ,lcsh:R858-859.7 ,Artificial intelligence ,business ,Electrocardiography ,computer - Abstract
Introduction: The electrocardiogram (ECG) plays an important role in the diagnosis of heart diseases. However, most patterns of diseases are based on old datasets and stepwise algorithms that provide limited accuracy. The diagnostic accuracy of the ECG can be improved by applying machine learning algorithms. This requires taking existing scanned or printed ECGs of old cohorts and transforming the ECG signal into its raw digital form (time in milliseconds, voltage in millivolts). Objectives: We present a MATLAB-based tool and algorithm that converts a printed or scanned format of the ECG into a digitized ECG signal. Methods: Thirty scanned ECG curves are utilized in our study. An image processing method is first implemented for detecting the ECG regions of interest and extracting the ECG signals. It is followed by serial steps that digitize and validate the results. Results: The validation demonstrates very high correlation values for several standard ECG parameters: PR interval 0.984 +/− 0.021 (p-value < 0.001), QRS interval 1 +/− SD (p-value < 0.001), QT interval 0.981 +/− 0.023 (p-value < 0.001), and RR interval 1 +/− 0.001 (p-value < 0.001). Conclusion: Digitized ECG signals from existing paper or scanned ECGs can be obtained with more than 95% precision. This makes it possible to utilize historic ECG signals in machine learning algorithms to identify patterns of heart diseases and aid in the diagnostic and prognostic evaluation of patients with cardiovascular disease.
- Published
- 2019
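The core digitization step described in entry 40, mapping trace pixels in a scanned image to (time, voltage) samples, can be sketched as follows. The grayscale image format, known baseline row, and fixed pixel scales are all simplifying assumptions relative to the paper's MATLAB pipeline:

```python
def digitize_trace(img, px_per_ms, px_per_mv, baseline_row, threshold=128):
    """For each image column, locate the dark trace pixels, take their
    centroid row, and convert the offset from the baseline into millivolts
    and the column index into milliseconds. `img` is a rectangular grayscale
    2-D list with 0 = black, 255 = white."""
    samples = []
    for col in range(len(img[0])):
        trace_rows = [r for r in range(len(img)) if img[r][col] < threshold]
        if not trace_rows:
            continue                                   # no ink in this column
        row = sum(trace_rows) / len(trace_rows)        # centroid of the trace
        t_ms = col / px_per_ms
        v_mv = (baseline_row - row) / px_per_mv        # rows grow downward
        samples.append((t_ms, v_mv))
    return samples
```

A real pipeline must first deskew the scan, remove the background grid, and calibrate the scales from the 1 mV reference pulse; this sketch covers only the pixel-to-sample conversion.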
41. 'Garbage In, Garbage Out' Revisited: What Do Machine Learning Application Papers Report About Human-Labeled Training Data?
- Author
-
Jamie Ip, Aayush Shah, Marsha Lotosh, Jenny Weng, R. Stuart Geiger, Rebekah Tang, and Dominique Cope
- Subjects
Social and Information Networks (cs.SI) ,FOS: Computer and information sciences ,Ground truth ,Computer Science - Machine Learning ,Computer science ,business.industry ,media_common.quotation_subject ,Best practice ,Computer Science - Social and Information Networks ,General Medicine ,Machine learning ,computer.software_genre ,Variety (linguistics) ,Task (project management) ,Machine Learning (cs.LG) ,Annotation ,Computer Science - Computers and Society ,Garbage in, garbage out ,Computers and Society (cs.CY) ,Quality (business) ,Social media ,Artificial intelligence ,business ,computer ,media_common - Abstract
Supervised machine learning, in which models are automatically derived from labeled training data, is only as good as the quality of that data. This study builds on prior work that investigated to what extent “best practices” around labeling training data were followed in applied ML publications within a single domain (social media platforms). In this paper, we expand by studying publications that apply supervised ML in a far broader spectrum of disciplines, focusing on human-labeled data. We report to what extent a random sample of ML application papers across disciplines give specific details about whether best practices were followed, while acknowledging that a greater range of application fields necessarily produces greater diversity of labeling and annotation methods. Because much of machine learning research and education only focuses on what is done once a “ground truth” or “gold standard” of training data is available, it is especially relevant to discuss issues around the equally important aspect of whether such data is reliable in the first place. This determination becomes increasingly complex when applied to a variety of specialized fields, as labeling can range from a task requiring little-to-no background knowledge to one that must be performed by someone with career expertise.
- Published
- 2021
- Full Text
- View/download PDF
42. Assessment of spatial abilities through paper-based and online tests
- Author
-
Andrea Kárpáti and Bernadett Babály
- Subjects
Psychiatry and Mental health ,business.industry ,Computer science ,Artificial intelligence ,Paper based ,business ,Machine learning ,computer.software_genre ,computer - Published
- 2015
43. Demonstration Paper: Monitoring Machine Learning Contracts with QoA4ML
- Author
-
Minh-Tri Nguyen and Hong-Linh Truong
- Subjects
Service (business) ,business.industry ,Computer science ,Service contract ,02 engineering and technology ,Machine learning ,computer.software_genre ,System monitoring ,Set (abstract data type) ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer - Abstract
Using machine learning (ML) services, both service customers and providers need to monitor the complex contractual constraints of an ML service, which are strongly related to ML models and data. Therefore, establishing and monitoring comprehensive ML contracts are crucial in ML serving. This paper demonstrates a set of features and utilities of the QoA4ML framework for ML contracts.
- Published
- 2021
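At its simplest, the contract monitoring described in entry 43 amounts to checking observed quality metrics against contractual bounds. A toy sketch (the metric names and bound format are assumptions, not the QoA4ML specification):

```python
def check_contract(metrics, contract):
    """Compare observed ML-service quality metrics against contractual
    (low, high) bounds and report the names of violated constraints.
    A missing metric also counts as a violation."""
    violations = []
    for name, (low, high) in contract.items():
        value = metrics.get(name)
        if value is None or not (low <= value <= high):
            violations.append(name)
    return violations
```

A monitoring loop would call this periodically on metrics sampled from the running service and alert on a non-empty result.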
44. Intrusion Detection System Classification Using Different Machine Learning Algorithms on KDD-99 and NSL-KDD Datasets - A Review Paper
- Author
-
Munther Abualkibash and Ravipati Rama Devi
- Subjects
Computer science ,business.industry ,020206 networking & telecommunications ,02 engineering and technology ,Intrusion detection system ,Machine learning ,computer.software_genre ,Term (time) ,Constant false alarm rate ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Anomaly detection ,Artificial intelligence ,False alarm ,Detection rate ,business ,Algorithm ,computer - Abstract
Intrusion Detection Systems (IDS) have been an effective way to achieve higher security by detecting malicious activities over the past couple of years. Anomaly detection is one type of intrusion detection. Current anomaly detection is often associated with high false alarm rates and only moderate accuracy and detection rates because it is unable to detect all types of attacks correctly. An experiment is carried out to evaluate the performance of different machine learning algorithms using the KDD-99 Cup and NSL-KDD datasets. Results show which approach performs better in terms of accuracy and detection rate, with a reasonable false alarm rate.
- Published
- 2019
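The comparison criteria in entry 44 (accuracy, detection rate, and false alarm rate) all derive from a binary confusion matrix. A minimal sketch, assuming the label encoding 1 = attack, 0 = normal:

```python
def ids_metrics(y_true, y_pred):
    """Compute the three metrics typically used to compare IDS classifiers:
    accuracy, detection rate (recall on attacks), and false alarm rate
    (fraction of normal traffic wrongly flagged). Labels: 1 = attack, 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    detection_rate = tp / (tp + fn) if tp + fn else 0.0
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0
    return accuracy, detection_rate, false_alarm_rate
```

Comparing classifiers on KDD-99 or NSL-KDD then reduces to computing this triple for each model's predictions on the held-out test split.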
45. Review Paper on Crops Disease Diagnosing Using Image-Based Deep Learning Mechanism
- Author
-
Pragya Lariya and Mukul Shrivastava
- Subjects
Computer science ,business.industry ,Deep learning ,Artificial intelligence ,Machine learning ,computer.software_genre ,business ,computer ,Mechanism (sociology) ,Image based - Published
- 2019
46. Review Paper on Object Detection using Deep Learning- Understanding different Algorithms and Models to Design Effective Object Detection Network
- Author
-
Jogi John
- Subjects
Computer science ,business.industry ,Deep learning ,Artificial intelligence ,Machine learning ,computer.software_genre ,business ,computer ,Object detection - Published
- 2019
47. Auto-Model: Utilizing Research Papers and HPO Techniques to Deal with the CASH problem
- Author
-
Hongzhi Wang, Jianzhong Li, Hong Gao, Chunnan Wang, and Tianyu Mu
- Subjects
FOS: Computer and information sciences ,Hyperparameter ,Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,business.industry ,Computer science ,media_common.quotation_subject ,02 engineering and technology ,Machine learning ,computer.software_genre ,Machine Learning (cs.LG) ,Task (project management) ,Statistical classification ,Artificial Intelligence (cs.AI) ,020204 information systems ,Cash ,Hyperparameter optimization ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,Configuration space ,business ,computer ,media_common - Abstract
In many fields, a large number of algorithms with completely different hyperparameters have been developed to address the same type of problem. Choosing the algorithm and hyperparameter setting correctly can greatly improve overall performance, but users often fail to do so due to a lack of knowledge. How to help users effectively and quickly select a suitable algorithm and hyperparameter settings for a given task instance is an important research topic, known as the CASH problem. In this paper, we design the Auto-Model approach, which makes full use of known information in related research papers and introduces hyperparameter optimization techniques to solve the CASH problem effectively. Auto-Model tremendously reduces the cost of algorithm implementations and the hyperparameter configuration space, and is thus capable of dealing with the CASH problem efficiently and easily. To demonstrate the benefit of Auto-Model, we compare it with the classical Auto-WEKA approach. The experimental results show that our proposed approach provides superior results and achieves better performance in a short time.
- Published
- 2020
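The CASH problem that entry 47 addresses, jointly selecting an algorithm and its hyperparameters, can be illustrated with a simple random-search baseline. This is not the Auto-Model method itself; the candidate names and the `evaluate` callback are hypothetical:

```python
import random

def random_cash_search(candidates, evaluate, n_trials=20, seed=0):
    """Baseline for the CASH problem: sample (algorithm, hyperparameters)
    configurations at random and keep the best-scoring one.
    `candidates` maps an algorithm name to {hyperparameter: [values]};
    `evaluate(algo, params)` returns a score to maximise."""
    rng = random.Random(seed)
    best = (None, None, float('-inf'))
    for _ in range(n_trials):
        algo = rng.choice(list(candidates))
        params = {name: rng.choice(values)
                  for name, values in candidates[algo].items()}
        score = evaluate(algo, params)
        if score > best[2]:
            best = (algo, params, score)
    return best
```

Approaches such as Auto-Model or Auto-WEKA improve on this baseline by pruning the configuration space with prior knowledge and by using model-based (rather than random) hyperparameter optimization.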
48. AxCell: Automatic Extraction of Results from Machine Learning Papers
- Author
-
Sebastian Ruder, Marcin Kardas, Ross Taylor, Robert Stojnic, Piotr Czapla, Sebastian Riedel, and Pontus Stenetorp
- Subjects
FOS: Computer and information sciences ,Computer Science - Computation and Language ,business.industry ,Computer science ,Machine Learning (stat.ML) ,020206 networking & telecommunications ,02 engineering and technology ,Machine learning ,computer.software_genre ,Pipeline (software) ,Task (project management) ,Statistics - Machine Learning ,0202 electrical engineering, electronic engineering, information engineering ,Code (cryptography) ,Table (database) ,020201 artificial intelligence & image processing ,Segmentation ,Extraction (military) ,Artificial intelligence ,State (computer science) ,business ,Computation and Language (cs.CL) ,computer - Abstract
Tracking progress in machine learning has become increasingly difficult with the recent explosion in the number of papers. In this paper, we present AxCell, an automatic machine learning pipeline for extracting results from papers. AxCell uses several novel components, including a table segmentation subtask, to learn relevant structural knowledge that aids extraction. When compared with existing methods, our approach significantly improves the state of the art for results extraction. We also release a structured, annotated dataset for training models for results extraction, and a dataset for evaluating the performance of models on this task. Lastly, we show that our approach is viable for semi-automated results extraction in production, suggesting that our improvements make this task practical for the first time. Code is available on GitHub.
- Published
- 2020
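As a toy illustration of the results-extraction goal in entry 48 (far simpler than AxCell's learned pipeline), metric-value pairs can be pulled from table cells with a regular expression; the `metric: value` cell format here is an assumption for the demo:

```python
import re

def extract_results(cells):
    """Pull (metric, value) pairs out of table-cell strings formatted like
    'F1: 0.762' or 'Acc = 91.2'. Cells that do not match are skipped.
    A hand-written stand-in for AxCell's learned cell classification."""
    results = {}
    for cell in cells:
        m = re.match(r"\s*([\w-]+)\s*[:=]\s*([0-9]*\.?[0-9]+)", cell)
        if m:
            results[m.group(1)] = float(m.group(2))
    return results
```

Real leaderboard tables rarely embed the metric name in the cell, which is why AxCell has to learn the association between header structure and cell values rather than rely on patterns like this.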
49. Hierarchical Bi-Directional Self-Attention Networks for Paper Review Rating Recommendation
- Author
-
Hao Peng, Congying Xia, Jianxin Li, Philip S. Yu, Zhongfen Deng, and Lifang He
- Subjects
FOS: Computer and information sciences ,Text corpus ,Computer Science - Machine Learning ,Computer Science - Computation and Language ,Computer science ,business.industry ,Deep learning ,Self attention ,02 engineering and technology ,Machine learning ,computer.software_genre ,Machine Learning (cs.LG) ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Leverage (statistics) ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Computation and Language (cs.CL) ,Encoder ,computer ,Sentence - Abstract
Review rating prediction of text reviews is a rapidly growing technology with a wide range of applications in natural language processing. However, most existing methods either use hand-crafted features or learn features using deep learning with a simple text corpus as input for review rating prediction, ignoring the hierarchies among the data. In this paper, we propose a Hierarchical bi-directional self-attention Network framework (HabNet) for paper review rating prediction and recommendation, which can serve as an effective decision-making tool for the academic paper review process. Specifically, we leverage the hierarchical structure of the paper reviews with three levels of encoders: a sentence encoder (level one), an intra-review encoder (level two) and an inter-review encoder (level three). Each encoder first derives a contextual representation of its level, then generates a higher-level representation, and after the learning process, we are able to identify useful predictors to make the final acceptance decision, as well as to help discover inconsistencies between numerical review ratings and the text sentiment conveyed by reviewers. Furthermore, we introduce two new metrics to evaluate models in data imbalance situations. Extensive experiments on a publicly available dataset (PeerRead) and our own collected dataset (OpenReview) demonstrate the superiority of the proposed approach compared with state-of-the-art methods. Accepted by COLING 2020.
- Published
- 2020
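The three-level hierarchy in entry 49 can be illustrated with a much simplified stand-in that replaces bi-directional self-attention with mean pooling; sentence embeddings are assumed precomputed:

```python
def mean_pool(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def encode_paper(reviews):
    """Toy stand-in for HabNet's hierarchy: sentence vectors are pooled
    into one vector per review (intra-review level), and review vectors
    are pooled into one vector per paper (inter-review level).
    `reviews` is a list of reviews, each a list of sentence embeddings."""
    review_vecs = [mean_pool(sentences) for sentences in reviews]  # level two
    return mean_pool(review_vecs)                                  # level three
```

In HabNet each pooling step is instead a learned self-attention encoder, so informative sentences and reviews are weighted more heavily than a plain average allows; the resulting paper vector feeds the final acceptance classifier.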
50. Automatic Paper-to-reviewer Assignment, based on the Matching Degree of the Reviewers
- Author
-
Toyohide Watanabe and Xinlian Li
- Subjects
Matching (statistics) ,Information retrieval ,Degree (graph theory) ,business.industry ,Computer science ,paper-to-reviewer assignment ,matching degree ,Machine learning ,computer.software_genre ,relevance degree ,Preference ,Hungarian algorithm ,expertise degree ,General Earth and Planetary Sciences ,Artificial intelligence ,business ,computer ,General Environmental Science - Abstract
A number of issues are involved in organizing a conference. Among them, assigning conference papers to reviewers is one of the most difficult tasks, and automating this assignment is the most crucial part. In this paper, we address the issue of paper-to-reviewer assignment, and we propose a method to model the reviewers based on the matching degree between the reviewers and the papers, combining a preference-based approach and a topic-based approach. We explain the assignment algorithm and show the evaluation results in comparison with the Hungarian algorithm.
- Published
- 2013
- Full Text
- View/download PDF
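The optimisation at the heart of entry 50, one-to-one matching of papers to reviewers so as to maximise the total matching degree, can be sketched by brute force for small instances; the Hungarian algorithm solves the same problem in polynomial time:

```python
from itertools import permutations

def assign_papers(match_degree):
    """One-to-one paper-to-reviewer assignment maximising total matching
    degree. `match_degree[p][r]` is paper p's matching degree with
    reviewer r (in the article, a combination of preference-based and
    topic-based scores). Brute force over permutations, so feasible only
    for small square instances; use the Hungarian algorithm for real sizes."""
    n = len(match_degree)
    best_perm, best_score = None, float('-inf')
    for perm in permutations(range(n)):
        score = sum(match_degree[p][perm[p]] for p in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return list(best_perm), best_score
```

For an `n x n` matrix this enumerates `n!` assignments, which is why production systems use `O(n^3)` algorithms such as the Hungarian method instead.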