22,099 results
Search Results
2. Data-based robust multiobjective optimization of interconnected processes: energy efficiency case study in papermaking.
- Author
-
Afshar P, Brown M, Maciejowski J, and Wang H
- Subjects
- Energy Transfer, Artificial Intelligence, Data Mining methods, Databases, Factual, Feedback, Models, Theoretical, Paper
- Abstract
Reducing energy consumption is a major challenge for "energy-intensive" industries such as papermaking. A commercially viable energy-saving solution is to employ data-based optimization techniques to obtain a set of "optimized" operational settings that satisfy certain performance indices. The difficulties are: 1) problems of this type are inherently multicriteria, in the sense that improving one performance index might compromise other important measures; 2) practical systems often exhibit unknown complex dynamics and several interconnections, which make the modeling task difficult; and 3) as the models are acquired from existing historical data, they are valid only locally, and extrapolations carry the risk of increasing process variability. To overcome these difficulties, this paper presents a new decision support system for robust multiobjective optimization of interconnected processes. The plant is first divided into serially connected units to model the process, product quality, energy consumption, and corresponding uncertainty measures. Then a multiobjective gradient descent algorithm is used to solve the problem in line with the user's preference information. Finally, the optimization results are visualized for analysis and decision making. In practice, if further iterations of the optimization algorithm are considered, the validity of the local models must be checked before proceeding. The method is implemented in a MATLAB-based interactive tool, DataExplorer, supporting a range of data analysis, modeling, and multiobjective optimization techniques. The proposed approach was tested in two U.K.-based commercial paper mills, where the aim was to reduce steam consumption and increase productivity while maintaining product quality by optimizing vacuum pressures in the forming and press sections. The experimental results demonstrate the effectiveness of the method.
- Published
- 2011
- Full Text
- View/download PDF
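The preference-guided multiobjective descent described in this abstract can be illustrated with a weighted-sum scalarization sketch (toy quadratic objectives; this is not the authors' DataExplorer implementation):

```python
import numpy as np

# Two toy objectives over operating settings x, standing in for, e.g.,
# steam consumption and a quality-loss measure.
f1 = lambda x: np.sum((x - 1.0) ** 2)
f2 = lambda x: np.sum((x + 1.0) ** 2)
g1 = lambda x: 2 * (x - 1.0)   # gradient of f1
g2 = lambda x: 2 * (x + 1.0)   # gradient of f2

def weighted_descent(w, x0, lr=0.1, steps=200):
    """Scalarize with preference weight w in [0, 1] and run gradient descent."""
    x = x0.copy()
    for _ in range(steps):
        x -= lr * (w * g1(x) + (1 - w) * g2(x))
    return x

# Sweeping w traces an approximate Pareto front, mirroring how user
# preference information steers the search.
for w in (0.2, 0.5, 0.8):
    x = weighted_descent(w, np.zeros(3))
    print(w, f1(x), f2(x))
```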
3. Looking at the fringes of MedTech innovation: a mapping review of horizon scanning and foresight methods.
- Author
-
Garcia Gonzalez-Moral S, Beyer FR, Oyewole AO, Richmond C, Wainwright L, and Craig D
- Subjects
- Humans, Consensus, Databases, Bibliographic, Databases, Factual, Artificial Intelligence, Data Mining
- Abstract
Objectives: Horizon scanning (HS) is a method used to examine signs of change and may be used in foresight practice. HS methods used for the identification of innovative medicinal products cannot be applied to medical technologies (MedTech) due to differences in development and regulatory processes. The aim of this study is to identify HS and other methodologies used for MedTech foresight in support of healthcare decision-making. Method: A mapping review was performed. We searched bibliographic databases including MEDLINE, Embase, Scopus, Web of Science, IEEE Xplore and Compendex Engineering Village, and grey literature sources such as Google, the CORE database and the International HTA database. Our searches identified 8888 records. After de-duplication, and manual and automated title, abstract and full-text screening, 49 papers met the inclusion criteria and were data extracted. Results: Twenty-five different single methods were identified, often used in combination; of these, only three were novel (appearing only once in the literature). Text mining or artificial intelligence solutions appear as early as 2012, often applied to patent and social media sources. The time horizon used in scanning was often not justified. Some studies regarded experts both as a source and as a method. Literature searching remains one of the most used methods for innovation identification. HS methods were vaguely reported, but often involved consulting with experts and stakeholders. Conclusion: Heterogeneous methodologies, sources and time horizons are used for HS and foresight of MedTech innovation, with little or no justification provided for their use. This review revealed an array of known methods being used in combination to overcome the limitations posed by single methods. The review also revealed inconsistency in methods reporting, with a lack of any consensus regarding best practice. Greater transparency in methods reporting and consistency in methods use would contribute to increased output quality to support informed, timely decision-making.
- Published
- 2023
- Full Text
- View/download PDF
4. A Hybrid Model Based on LFM and BiGRU Toward Research Paper Recommendation
- Author
-
Ziqing Nie, Xu Zhao, Chenkun Meng, Tie Feng, and Hui Kang
- Subjects
Word embedding, General Computer Science, Computer science, Feature vector, Feature extraction, Semantics, LFM, Matrix decomposition, Recommender systems, General Materials Science, BiGRU, user attention, Artificial neural network, Deep learning, General Engineering, Artificial intelligence, Data mining, Word (computer architecture) - Abstract
To improve the accuracy of user implicit rating prediction, we combine the traditional latent factor model (LFM) and the bidirectional gated recurrent unit (BiGRU) model to propose a hybrid model that deeply mines the latent semantics in the unstructured content of the text and generates a more accurate rating matrix. First, we utilize the user's historical behavior (favorites records) to build a user rating matrix and decompose the matrix to obtain the latent factor vectors of users and literature. We also apply the BERT model for word embedding of the research papers to obtain the sequence of word vectors. Then, we apply the BiGRU with the user attention mechanism to mine the textual content of the research papers and to generate new literature latent feature vectors that replace the original literature latent factor vectors decomposed from the rating matrix. Finally, a new rating matrix is generated to obtain users' ratings of non-interacted research papers and to generate the recommendation list according to the user latent factor vector. We design experiments on real datasets and verify that the research paper recommendation model is superior to traditional recommendation models in terms of precision, recall, F1-value, coverage, popularity and diversity.
- Published
- 2020
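The LFM half of this hybrid can be sketched as plain SGD matrix factorization; in the paper the item factors are then replaced by BiGRU-derived text vectors, which this minimal sketch omits (toy favorites matrix; negative sampling for implicit feedback is also omitted):

```python
import numpy as np

def lfm(R, k=8, lr=0.01, reg=0.02, epochs=50):
    """Plain latent factor model trained with SGD on observed cells of R."""
    rng = np.random.default_rng(0)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))   # user latent factors
    Q = rng.normal(scale=0.1, size=(n_items, k))   # item (paper) latent factors
    rows, cols = np.nonzero(R)                     # observed implicit feedback
    for _ in range(epochs):
        for u, i in zip(rows, cols):
            err = R[u, i] - P[u] @ Q[i]
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

R = np.zeros((4, 5))
R[0, 1] = R[1, 1] = R[2, 3] = 1.0   # toy favorites records
P, Q = lfm(R)
scores = P @ Q.T                    # predicted ratings for non-interacted papers
```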
5. Reproducibility Companion Paper
- Author
-
Zhenzhong Kuang, Xinke Li, Zekun Tong, Cise Midoglu, Yabang Zhao, Yuqing Liao, and Andrew Lim
- Subjects
Source code, Computer science, Deep learning, Point cloud, File format, Replication (computing), Photogrammetry, Benchmark (surveying), Segmentation, Artificial intelligence, Data mining - Abstract
This companion paper supports the replication of the paper "Campus3D: A Photogrammetry Point Cloud Benchmark for Outdoor Scene Hierarchical Understanding", which was presented at ACM Multimedia 2020. The supported paper's main purpose was to provide a photogrammetry point cloud dataset with hierarchical multi-labels to facilitate 3D deep learning. Based on this dataset and source code, we build a complete package to reimplement the proposed methods and experiments (i.e., the hierarchical learning framework and the benchmarks of the hierarchical semantic segmentation task). Specifically, this paper contains the technical details of the package, including file structure, dataset preparation, installation, and how the experiments are conducted. We also present the replicated experiment results and indicate our contributions to the original implementation.
- Published
- 2021
6. Image Matching Across Wide Baselines: From Paper to Practice
- Author
-
Yuhe Jin, Kwang Moo Yi, Pascal Fua, Eduard Trulls, Jiri Matas, Dmytro Mishkin, and Anastasiia Mishchuk
- Subjects
FOS: Computer and information sciences, Computer Science - Computer Vision and Pattern Recognition (cs.CV), benchmark, Artificial Intelligence, dataset, Structure from motion, local features, 3D reconstruction, stereo, Benchmarking, Pipeline (software), Pattern recognition (psychology), Metric (mathematics), Embedding, Computer Vision and Pattern Recognition, Data mining, Heuristics, performance, Software - Abstract
We introduce a comprehensive benchmark for local features and robust estimation algorithms, focusing on the downstream task -- the accuracy of the reconstructed camera pose -- as our primary metric. Our pipeline's modular structure allows easy integration, configuration, and combination of different methods and heuristics. This is demonstrated by embedding dozens of popular algorithms and evaluating them, from seminal works to the cutting edge of machine learning research. We show that with proper settings, classical solutions may still outperform the perceived state of the art. Besides establishing the actual state of the art, the conducted experiments reveal unexpected properties of Structure from Motion (SfM) pipelines that can help improve their performance, for both algorithmic and learned methods. Data and code are online at https://github.com/vcg-uvic/image-matching-benchmark, providing an easy-to-use and flexible framework for benchmarking local features and robust estimation methods, both alongside and against top-performing methods. This work provides a basis for the Image Matching Challenge: https://vision.uvic.ca/image-matching-challenge
- Published
- 2020
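A minimal OpenCV version of the kind of pipeline this benchmark evaluates, local feature matching followed by robust relative-pose estimation, looks roughly like this (image paths and intrinsics K are placeholder assumptions):

```python
import cv2
import numpy as np

# Two grayscale views of the same scene and the 3x3 camera intrinsics (assumed).
img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)
K = np.array([[700.0, 0, 320], [0, 700.0, 240], [0, 0, 1]])

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test on 2-NN matches.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.8 * n.distance]

pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

# Robust essential-matrix estimation, then the relative camera pose,
# which is the downstream quantity the benchmark scores.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
```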
7. A Year of Papers Using Biomedical Texts
- Author
-
Cyril Grouin and Natalia Grabar
- Subjects
Artificial Intelligence, social media, Section 10: Natural Language Processing, state-of-the-art review, Synopsis, Data Mining, Electronic Health Records, Information Storage and Retrieval, Natural Language Processing - Abstract
Summary. Objectives: Analyze papers published in 2019 within the medical natural language processing (NLP) domain in order to select the best works of the field. Methods: We performed an automatic and manual pre-selection of papers to be reviewed and finally selected the best NLP papers of the year. We also propose an analysis of the content of NLP publications in 2019. Results: Three best papers have been selected this year, including the generation of synthetic record texts in Chinese, a method to identify contradictions in the literature, and the BioBERT word representation. Conclusions: The year 2019 was very rich, and various NLP issues and topics were addressed by research teams. This shows the will and capacity of researchers to move towards robust and reproducible results. Researchers also prove to be creative in addressing original issues with relevant approaches.
- Published
- 2020
8. The New Version of the ANDDigest Tool with Improved AI-Based Short Names Recognition.
- Author
-
Ivanisenko TV, Demenkov PS, Kolchanov NA, and Ivanisenko VA
- Subjects
- PubMed, Databases, Factual, Proteins, Artificial Intelligence, Data Mining methods
- Abstract
The body of scientific literature continues to grow annually. Over 1.5 million abstracts of biomedical publications were added to the PubMed database in 2021. Therefore, developing cognitive systems that provide a specialized search for information in scientific publications based on subject area ontology and modern artificial intelligence methods is urgently needed. We previously developed a web-based information retrieval system, ANDDigest, designed to search and analyze information in the PubMed database using a customized domain ontology. This paper presents an improved ANDDigest version that uses fine-tuned PubMedBERT classifiers to enhance the quality of short name recognition for molecular-genetics entities in PubMed abstracts on eight biological object types: cell components, diseases, side effects, genes, proteins, pathways, drugs, and metabolites. This approach increased average short name recognition accuracy by 13%.
- Published
- 2022
- Full Text
- View/download PDF
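Applying a fine-tuned PubMedBERT-style sequence classifier at inference time can be sketched as below; the checkpoint path is hypothetical, and the label scheme (entity vs. not-entity) is an assumption, not the ANDDigest internals:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical fine-tuned checkpoint; the base model follows the public
# PubMedBERT release, but these classifier weights are assumed.
ckpt = "path/to/pubmedbert-short-name-classifier"
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForSequenceClassification.from_pretrained(ckpt)

text = "TP53 expression was reduced in the treated samples."
inputs = tok(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()   # e.g., 1 = mention of a tracked entity type
```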
9. A Systematic Literature Review for New Technologies in IT Audit.
- Author
-
Tanrıverdi, Nur Sena and Taşkın, Nazım
- Subjects
INFORMATION technology, MACHINE learning, AUDITING, ARTIFICIAL intelligence, DATA mining, NATURAL language processing
- Published
- 2023
- Full Text
- View/download PDF
10. Construction and Model Realization of Financial Intelligence System Based on Multisource Information Feature Mining.
- Author
-
Li J
- Subjects
- Intelligence, Software, Artificial Intelligence, Data Mining
- Abstract
Multisource information mining systems and related business intelligence technology are currently a hot research topic. However, current commercial applications are far from ideal, and much work remains before such systems can genuinely support decision-making, so a transition to them is at present best made only in the financial domain. This paper examines the multisource part of the information used in mining and introduces research hotspots in accounting informatization, the development status of intelligent financial analysis software, and the research and application status of data warehouses, data mining, and decision support systems. It then examines the specific composition and content of a financial information system using information mining to lay a solid foundation. The financial intelligence system comprises four parts: financial intelligent analysis, financial intelligent monitoring, financial intelligent decision-making, and financial intelligent early warning. The paper then examines the structure and processing of the financial intelligence system and proposes an operation strategy, low-risk integrated implementation strategies, and an ideal financial intelligence model, according to the current state of research and practical applications. According to the findings, the overall discrimination accuracy of the financial information system based on mining multisource information features is up to 95%, which is 42% higher than the traditional model. The development and use of financial information benefit from the realization and exploration of the financial intelligence system model.
- Published
- 2022
- Full Text
- View/download PDF
11. Two-stage approach to extracting visual objects from paper documents
- Author
-
Paweł Forczmański and Andrzej Markiewicz
- Subjects
Computer science, Image (mathematics), Visual Objects, AdaBoost, Process (computing), Bootstrapping, Pattern recognition, Object detection, Computer Science Applications, Identification (information), Hardware and Architecture, Cascade, Computer Vision and Pattern Recognition, Artificial intelligence, Data mining, Software - Abstract
In the paper we present an approach to the automatic detection and identification of important elements in paper documents, including stamps, logos, printed text blocks, signatures and tables. The presented approach consists of two stages. The first includes object detection by means of an AdaBoost cascade of weak classifiers using Haar-like features. The resulting image blocks are, at the second stage, subjected to verification based on selected features calculated from recently proposed low-level descriptors, combined with classifiers representing current machine-learning approaches. The training phase, for both stages, uses bootstrapping, i.e., an iterative process aimed at increasing accuracy. Experiments performed on a large set of digitized paper documents showed that the adopted strategy is useful and efficient.
- Published
- 2016
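The detect-then-verify structure can be sketched with OpenCV and scikit-learn; the cascade XML file, the 32x32 raw-pixel descriptor, and the placeholder training data are all assumptions standing in for the paper's trained models and low-level descriptors:

```python
import cv2
import numpy as np
from sklearn.svm import SVC

# Stage 1: AdaBoost cascade of Haar-like features (custom-trained XML assumed).
cascade = cv2.CascadeClassifier("stamp_cascade.xml")
page = cv2.imread("scanned_page.png", cv2.IMREAD_GRAYSCALE)
candidates = cascade.detectMultiScale(page, scaleFactor=1.1, minNeighbors=3)

# Stage 2: verify each candidate block with a trained classifier.
def describe(block):
    """Stand-in descriptor: a resized, normalized raw-pixel vector."""
    block = cv2.resize(block, (32, 32)).astype(np.float32) / 255.0
    return block.ravel()

rng = np.random.default_rng(0)
svm = SVC()
# Placeholder training data; in reality the SVM is fitted on labelled blocks.
svm.fit(rng.random((10, 32 * 32)), rng.integers(0, 2, 10))

X = [describe(page[y:y + h, x:x + w]) for (x, y, w, h) in candidates]
accepted = ([box for box, ok in zip(candidates, svm.predict(X)) if ok == 1]
            if len(X) else [])
```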
12. Monitoring the performance of the paper making process
- Author
-
A.J. Morris, Elaine Martin, and Y. Bissessur
- Subjects
Engineering, Artificial neural network, Applied Mathematics, Process (computing), Condition monitoring, Machine learning, Fault detection and isolation, Computer Science Applications, Paper machine, Control and Systems Engineering, Feature (machine learning), Process control, Data mining, Artificial intelligence, Electrical and Electronic Engineering, Representation (mathematics) - Abstract
This paper presents two approaches for monitoring the performance of the paper making process. The first uses a neural network vibration-based condition monitoring system for providing advance warning of press felt problems. The characteristics of this system include spectral analysis through acceleration enveloping, a peak detection algorithm as a feature extractor and a neural network classifier. The second approach makes use of multivariate statistical techniques. This involves the generation of a multivariate statistical representation of nominal process behaviour. Process malfunctions are then identified by deviations from the developed model. Both approaches are successfully applied to an industrial paper machine.
- Published
- 1999
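The second approach in this abstract, flagging deviations from a multivariate statistical model of nominal behaviour, is commonly realized with PCA and a Hotelling T-squared control limit; a minimal sketch on synthetic data (the specific limit and component count are illustrative choices, not the paper's):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_nominal = rng.normal(size=(500, 10))      # nominal process data (placeholder)

pca = PCA(n_components=3).fit(X_nominal)

def t2(X):
    """Hotelling's T^2 statistic in the retained principal subspace."""
    scores = pca.transform(X)
    return np.sum(scores ** 2 / pca.explained_variance_, axis=1)

limit = np.percentile(t2(X_nominal), 99)    # empirical control limit
X_new = rng.normal(loc=0.5, size=(20, 10))  # possibly faulty observations
alarms = t2(X_new) > limit                  # deviations flag a malfunction
```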
13. Strategic technological determinant in smart destinations: obtaining an automatic classification of the quality of the destination
- Author
-
Díaz-González, Sergio, Torres, Jesus M., Parra-López, Eduardo, and Aguilar, Rosa M.
- Published
- 2022
- Full Text
- View/download PDF
14. Exploring cell tower data dumps for supervised learning-based point-of-interest prediction (industrial paper)
- Author
-
Mingxuan Yuan, Yan Lyu, Victor C. S. Lee, Yanhua Li, Ran Wang, Chi-Yin Chow, and Sarana Nutanong
- Subjects
Ground truth, Point of interest, Computer science, Mobile broadband, Geography, Planning and Development, Supervised learning, Machine learning, Data set, Application domain, Histogram, Key (cryptography), Data mining, Artificial intelligence, Information Systems - Abstract
Exploring massive mobile data for location-based services becomes one of the key challenges in mobile data mining. In this paper, we investigate a problem of finding a correlation between the collective behavior of mobile users and the distribution of points of interest (POIs) in a city. Specifically, we use large-scale cell tower data dumps collected from cell towers and POIs extracted from a popular social network service, Weibo. Our objective is to make use of the data from these two different types of sources to build a model for predicting the POI densities of different regions in the covered area. An application domain that may benefit from our research is a business recommendation application, where a prediction result can be used as a recommendation for opening a new store/branch. The crux of our contribution is the method of representing the collective behavior of mobile users as a histogram of connection counts over a period of time in each region. This representation ultimately enables us to apply a supervised learning algorithm to our problem in order to train a POI prediction model using the POI data set as the ground truth. We studied 12 state-of-the-art classification and regression algorithms; experimental results demonstrate the feasibility and effectiveness of the proposed method.
- Published
- 2015
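The core representation, a per-region histogram of connection counts fed to a supervised learner with POI density as the target, can be sketched as follows (hourly binning and the regressor choice are assumptions; the paper studied 12 different algorithms):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Each region is a histogram of cell-tower connection counts over 24 hourly
# bins (the feature layout here is illustrative).
n_regions = 200
X = rng.poisson(lam=50, size=(n_regions, 24)).astype(float)
# Synthetic POI density; in the paper, Weibo POI data is the ground truth.
y = X[:, 8:18].mean(axis=1) * 0.1 + rng.normal(size=n_regions)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:150], y[:150])
print(model.score(X[150:], y[150:]))   # evaluate on held-out regions
```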
15. A Novel Metadata Based Multi-Label Document Classification Technique.
- Author
-
Sajid, Naseer Ahmed, Ahmad, Munir, Rahman, Atta-ur, Zaman, Gohar, Ahmed, Mohammed Salih, Ibrahim, Nehad, Ahmed, Mohammed Imran B., Krishnasamy, Gomathi, Alzaher, Reem, Alkharraa, Mariam, AlKhulaifi, Dania, AlQahtani, Maryam, Salam, Asiya A., Saraireh, Linah, Gollapalli, Mohammed, and Ahmed, Rashad
- Subjects
INDEXING, METADATA, DATA mining, ARTIFICIAL intelligence, MACHINE learning - Abstract
From the beginning, research and its publication have been an ever-growing phenomenon, and with the emergence of web technologies, the growth rate is overwhelming. On a rough estimate, more than thirty thousand research journals issue around four million papers annually on average. Search engines, indexing services, and digital libraries search for such publications over the web. Nevertheless, retrieving the most relevant articles for a user request remains elusive, mainly because articles are not appropriately indexed according to hierarchies of granular subject classification. To overcome this issue, researchers are investigating new techniques for the classification of research articles, especially when the complete article text is not available (as is the case for non-open-access articles). The proposed study aims to investigate multilabel classification over the available metadata in the best possible way and to assess to what extent metadata-based features can perform in contrast to content-based approaches. In this regard, novel techniques for multilabel classification have been proposed, developed, and evaluated on metadata such as the title and keywords of the articles. The proposed technique has been assessed on two diverse datasets: articles from the Journal of Universal Computer Science (J.UCS) and a benchmark dataset comprising articles published by the Association for Computing Machinery (ACM). The proposed technique yields encouraging results in contrast to the state-of-the-art techniques in the literature.
- Published
- 2023
- Full Text
- View/download PDF
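Multilabel classification over title-plus-keywords metadata can be sketched with a standard one-vs-rest setup (toy documents and labels; the paper's own techniques are more elaborate):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Title + keywords as the only input, per the metadata-based setting.
docs = ["deep learning for image segmentation; neural networks",
        "database indexing structures; b-trees",
        "convolutional networks for object detection; vision"]
labels = [["AI", "CV"], ["DB"], ["AI", "CV"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)          # binary indicator matrix per label

clf = make_pipeline(TfidfVectorizer(),
                    OneVsRestClassifier(LogisticRegression(max_iter=1000)))
clf.fit(docs, Y)
pred = mlb.inverse_transform(
    clf.predict(["indexing large databases; query processing"]))
print(pred)
```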
16. A Survey Paper on Image Classification and Methods of Image Mining
- Author
-
Sri Khetwat Saritha and Sandeep Pandey
- Subjects
Automatic image annotation, Contextual image classification, Computer science, Pattern recognition, Data mining, Artificial intelligence, Image (mathematics) - Published
- 2017
17. Reducing Errors from the Electronic Transcription of Data Collected on Paper Forms: A Research Data Case Study
- Author
-
Monika M. Wahi, David V. Parks, Robert Skeate, and Steven B. Goldin
- Subjects
Paper, Medical Records Systems, Computerized, Computer science, Health Informatics, Case Report, Data entry, Medical Records, User-Computer Interface, Software, Reliability study, Surveys and Questionnaires, Humans, Software system, Research data, Electronic Data Processing, Medical Errors, Optical character recognition, Artificial intelligence, Data mining, Forms and Records Control, Transcription (software), Natural language processing - Abstract
We conducted a reliability study comparing single data entry (SE) into a Microsoft Excel spreadsheet with entry using the existing forms (EF) feature of the Teleforms software system, in which optical character recognition is used to capture data from paper forms designed in non-Teleforms software programs. We compared the transcription of data from multiple paper forms from over 100 research participants, representing almost 20,000 data entry fields. Error rates for SE were significantly lower than those for EF, so we chose SE for data entry in our study. Data transcription strategies from paper to electronic format should be chosen based on evidence from formal evaluations, and their design should be contemplated during the paper forms development stage.
- Published
- 2008
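The comparison boils down to field-level error rates against an adjudicated gold standard; a toy sketch (values are invented):

```python
def error_rate(entries, gold):
    """Fraction of transcribed fields that disagree with the gold standard."""
    errors = sum(e != g for e, g in zip(entries, gold))
    return errors / len(gold)

gold = ["34", "M", "120/80", "no"]
single_entry = ["34", "M", "120/80", "no"]     # SE transcription (toy data)
existing_forms = ["34", "M", "126/80", "no"]   # EF/OCR transcription (toy data)

print("SE error rate:", error_rate(single_entry, gold))
print("EF error rate:", error_rate(existing_forms, gold))
```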
18. Review Paper: A Comparative Study on Partitioning Techniques of Clustering Algorithms
- Author
-
Rohit Srivastava and Gopi Gandhi
- Subjects
Fuzzy clustering, Computer science, Document classification, Correlation clustering, Machine learning, Field (computer science), Data set, Set (abstract data type), Text mining, Pattern recognition (psychology), Artificial intelligence, Data mining, Cluster analysis - Abstract
Clustering plays a vital role in research in the field of data mining. Clustering is the process of partitioning a set of data into meaningful subclasses called clusters. It helps users to understand the natural grouping or cluster structure in a data set. It is a form of unsupervised classification, which means it has no predefined classes. This paper presents a study of various partitioning techniques of clustering algorithms, comparing them by highlighting their individual advantages. Applications of cluster analysis include economic science, document classification, pattern recognition, image processing, and text mining. No single algorithm is efficient enough to crack problems from different fields; hence, this study presents several algorithms that can be used according to one's requirements. In this paper, three well-known partitioning-based methods - k-means, k-medoids and CLARANS - are compared, and the study explores the behaviour of these three methods.
- Published
- 2014
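Of the three methods surveyed, only k-means ships with scikit-learn (k-medoids/PAM and CLARANS live in other packages); a minimal k-means run with a silhouette check for the number of clusters:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

for k in (2, 3, 4):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # Higher silhouette suggests better-separated clusters for this k.
    print(k, silhouette_score(X, labels))
```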
19. Notes about the paper entitled 'A hybridized K-means clustering approach for high dimensional dataset'
- Author
-
SM Pérez-Plaza, M Muñoz-Marquez, AJ Arriaza-Gómez, and F Fernández-Palacin
- Subjects
Computer science, Principal component analysis, k-means clustering, Initialization, Pattern recognition, Artificial intelligence, High dimensional, Data mining, Cluster analysis - Abstract
In the paper "A hybridized K-means clustering approach for high dimensional dataset", Dash, Mishra, Rash and Acharya presented a new version of the k-means algorithm in which principal component analysis (PCA) is applied before running the k-means algorithm with a new initialization method. The authors compare the results obtained using HKMCA with PCA against the results of the original k-means, but a direct comparison is not valid, as this paper shows. Keywords: cluster analysis, k-means algorithm, principal component analysis, hybridized k-means algorithm. International Journal of Engineering, Science and Technology, Vol. 6, No. 1, 2014, pp. 20-26.
- Published
- 2014
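The PCA-then-k-means setup under discussion can be written as a two-step pipeline; note that clustering quality measured in the reduced PC space is not directly comparable with quality measured in the original space, which is the crux of the note's objection:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

X, _ = make_blobs(n_samples=500, n_features=50, centers=4, random_state=1)

# Reduce the high-dimensional data first, then cluster in the PC subspace.
pipe = make_pipeline(PCA(n_components=5),
                     KMeans(n_clusters=4, n_init=10, random_state=1))
labels = pipe.fit_predict(X)
```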
20. Survey Paper of Encrypted Data Hiding using Skin Tone Detection
- Author
-
Sachin Patel, Rekha D. Kalambe, and Rakesh Pandit
- Subjects
Biometrics, Steganography, Computer science, Data security, Cryptography, Encryption, Information hiding, Key (cryptography), Computer vision, Artificial intelligence, Data mining - Abstract
Steganography is the art of hiding the existence of data in another transmission medium to achieve secret communication. It does not replace cryptography but rather boosts security using its obscuring features. In this paper we survey steganography and cryptography techniques that provide highly secure skin tone data hiding. The biometric characteristic used to apply steganography is the skin tone region of images. Important data is embedded within the skin region of an image, which provides an excellent secure location for data hiding; for this, skin tone detection needs to be performed. The data hiding steps can be applied by cropping an image interactively. Cropping an image improves security compared with hiding data in the whole uncropped image, so the cropped region works as a key at the decoding end. A cryptographic algorithm is used to convert the secret messages to an unreadable form before embedding, which provides a strong backbone for data security. This survey paper focuses on illuminating techniques for securing data or messages with authenticity and non-repudiation. With this object-oriented steganography, skin tone objects are tracked in the image with higher security and satisfactory PSNR. Modern steganography's goal is to keep its mere presence undetectable.
- Published
- 2013
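A stripped-down version of the skin-region embedding idea: detect skin with a common YCrCb range heuristic, then LSB-embed only inside that mask. The thresholds are illustrative assumptions, and this sketch omits the surveyed schemes' cropping key and encryption stages:

```python
import cv2
import numpy as np

img = cv2.imread("cover.png")             # BGR cover image (assumed path)

# Common YCrCb skin-range heuristic (illustrative thresholds).
ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)
mask = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))

bits = [int(b) for byte in b"secret" for b in format(byte, "08b")]
ys, xs = np.nonzero(mask)                 # embed only inside the skin region
stego = img.copy()
for (y, x), bit in zip(zip(ys, xs), bits):
    # Replace the LSB of the blue channel with one payload bit.
    stego[y, x, 0] = (stego[y, x, 0] & 0xFE) | bit
```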
21. A Survey Paper on Trajectory Pattern Mining for Pattern Matching Query
- Author
-
S. M. Shinde and S. R. Ghule
- Subjects
Transportation planning, Computer science, Machine learning, Graph (abstract data type), Data set, Location-based service, Trajectory pattern, Data mining, Artificial intelligence, Pattern matching, Sequential Pattern Mining - Abstract
A large amount of information about moving objects on road networks is being collected with the help of various recent technologies. Tracking these moving objects on road networks is becoming important because of its applications in various areas. Classification has been used for classifying various kinds of data sets, such as graphs and text documents; however, there is a lack of study on data such as trajectories on road networks. Data mining techniques, especially sequential pattern mining, can be used to extract frequent spatio-temporal patterns; the length of sequential patterns needs to be confined to ensure high efficiency. After extracting frequent sequential patterns from trajectories, classification can be applied to classify patterns, which provides useful information in applications such as city and transportation planning; road construction, design, and maintenance; and marketing. In this paper, a whole-pattern-matching query concept is adopted after classification to find the total traffic volume on a given trajectory edge. At the same time, the user can find the number of vehicles moving in one or both directions on that particular trajectory. Keywords: sequential patterns, frequent-pattern-based classification, location-based services, pattern matching.
- Published
- 2014
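The frequent-pattern-extraction step can be illustrated with naive support counting of short ordered subsequences over trajectories represented as edge-id sequences (toy data; real sequential pattern miners such as PrefixSpan are far more efficient):

```python
from collections import Counter
from itertools import combinations

# Trajectories as sequences of road-network edge ids (toy data).
trajectories = [["e1", "e2", "e3"], ["e1", "e3", "e4"],
                ["e2", "e3", "e4"], ["e1", "e2", "e4"]]

def frequent_patterns(trajs, max_len=2, min_support=2):
    """Count ordered (not necessarily contiguous) subsequences up to max_len,
    with support = number of trajectories containing the pattern."""
    counts = Counter()
    for t in trajs:
        seen = set()
        for n in range(1, max_len + 1):
            seen.update(combinations(t, n))  # preserves order within t
        counts.update(seen)
    return {p: c for p, c in counts.items() if c >= min_support}

print(frequent_patterns(trajectories))
```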
22. Review paper on adapting data stream mining concept drift using ensemble classifier approach
- Author
-
Nilima Motghare and Arvind Mewada
- Subjects
Data stream, Concept drift, Data stream mining, Computer science, Machine learning, Data stream clustering, Data mining, Artificial intelligence, Cluster analysis, Classifier (UML) - Abstract
Data streams are massive, fast-changing and infinite in nature. It is very natural that large amounts of unlabeled data and small amounts of labeled data are available in data stream environments; storing and labeling all data is considered expensive and impractical. The objective is to label a small portion of the stream data and analyze the data online without storing it. Concept drift, concept evolution and stream evolution are also major challenges that occur while working with data streams, and online data stream active learning is needed to tackle them. Classification and clustering are two techniques widely used to extract patterns from large data streams; a classification model must continuously adapt itself to the most recent concept. Hence, this paper gives an overview of various ensemble-based classification techniques in the field of data stream mining.
- Published
- 2014
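A common chunk-based ensemble pattern for drifting streams, train one member per labelled chunk, weight members by accuracy on the newest chunk, and let stale members fall out, can be sketched like this (a generic illustration, not a specific algorithm from the review; it assumes every chunk contains all classes):

```python
from collections import deque
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class ChunkEnsemble:
    """Keep the k most recent chunk-trained classifiers, weighted by accuracy."""
    def __init__(self, k=5):
        self.members = deque(maxlen=k)   # old members drop out as concepts drift

    def partial_update(self, X, y):
        # Re-weight existing members on the newest labelled chunk.
        for m in self.members:
            m["w"] = (m["clf"].predict(X) == y).mean()
        clf = DecisionTreeClassifier().fit(X, y)
        self.members.append({"clf": clf, "w": 1.0})

    def predict(self, X):
        # Weighted soft vote; call partial_update at least once first.
        votes = sum(m["w"] * m["clf"].predict_proba(X) for m in self.members)
        return np.argmax(votes, axis=1)
```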
23. Methods and Applications of Data Mining in Business Domains.
- Author
-
Amrit, Chintan and Abdi, Asad
- Subjects
DATA mining, DEEP learning, ARTIFICIAL neural networks, MACHINE learning, ARTIFICIAL intelligence, DECISION support systems - Abstract
These papers collectively showcase the adaptability and effectiveness of data mining techniques, making substantial contributions to the broader realm of "Methods and Applications of Data Mining in Business Domains". In a business context, the challenge is that one would like to see (i) how the algorithms can be repeatable in the real world, (ii) how the patterns mined can be utilized by the business, and (iii) how the resulting model can be understood and utilized in the business environment [1]. Additionally, they provide insights into factors influencing the adoption of business intelligence systems (BISs) in small and medium-sized enterprises (SMEs) [26], and conduct a systematic literature review on AI-based methods for automating business processes and decision support [27]. This Special Issue invited researchers to contribute original research in the field of data mining, particularly in its application to diverse domains, like healthcare, software development, logistics, and human resources.
- Published
- 2023
- Full Text
- View/download PDF
24. Intelligent computer vision system for segregating recyclable waste papers
- Author
-
Mohammad Osiur Rahman, Aini Hussain, Mahammad A. Hannan, Hassan Basri, and Edgar Scavino
- Subjects
Sorting algorithm, Process (engineering), Computer science, General Engineering, Sorting, Feature selection, Machine learning, Computer Science Applications, Identification (information), Artificial Intelligence, Quality (business), Artificial intelligence, Data mining, Throughput (business) - Abstract
This article explores the application of image processing techniques in recyclable waste paper sorting. In recycling, waste papers are segregated into various grades as they are subjected to different recycling processes. Highly sorted paper streams facilitate high-quality end products and save processing chemicals and energy. From 1932 to 2009, different mechanical and optical paper sorting methods were developed to meet the paper sorting demand. Still, in many countries, including Malaysia, waste papers are sorted into different grades manually. Because of inadequate throughput and some major drawbacks of mechanical paper sorting systems, the popularity of optical paper sorting systems has increased. Automated paper sorting systems offer significant advantages over human inspection in terms of worker fatigue, throughput, speed, and accuracy. This research attempts to develop a smart vision sensing system that is able to separate different grades of paper using first-order features. To construct a template database, a statistical approach with intra-class and inter-class variation techniques is applied to the feature selection process. Finally, the k-nearest neighbor (KNN) algorithm is applied for paper grade identification. The remarkable achievement of the method is the accurate identification and dynamic sorting of all grades of paper using simple image processing techniques.
- Published
- 2011
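The KNN-on-first-order-features step can be sketched directly; the specific statistics and the placeholder images are assumptions standing in for the camera input and template database:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def first_order_features(gray):
    """Simple first-order statistics of a grayscale paper image."""
    g = gray.astype(float).ravel()
    return [g.mean(), g.std(), np.median(g), g.min(), g.max()]

rng = np.random.default_rng(0)
# Placeholder images for three paper grades; real input comes from the camera.
X = [first_order_features(rng.integers(0, 256, (64, 64))) for _ in range(60)]
y = [i % 3 for i in range(60)]   # grade labels 0, 1, 2

knn = KNeighborsClassifier(n_neighbors=5).fit(X[:45], y[:45])
print(knn.score(X[45:], y[45:]))
```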
25. Bio-Potential Signal Extraction from Multi-Channel Paper Recorded Charts
- Author
-
Ali Almejrad
- Subjects
Signal processing, Engineering, Scanner, Multidisciplinary, Channel (digital image), Image processing, Signal, Chart, Computer data storage, Waveform, Computer vision, Artificial intelligence, Data mining - Abstract
Problem Statement: Almost all modern biomedical equipment that records biopotential activity has digital output, but paper chart records are still a must. The volume of these records is significant and increasing rapidly. Keeping a patient's bio-signal chart makes quick assessment easy but creates problems for data storage, archiving, data interchange and communication. Approach: The solution to these problems is to convert the paper records to digital form. In this study, a method for bio-potential signal extraction from single- or multi-channel paper recorded charts using image processing techniques is developed. Results: After scanning the paper charts and converting them into images using a commercial scanner, the developed algorithm is applied to eliminate the background of the scanned chart, from any single-channel or multi-channel recording device, using binary neighborhood morphological operations, and then to convert the extracted waveform image into quantized values representing the recorded waveform. The extracted signal is then filtered to remove the high-frequency effects that result from the morphological operations. A correlation and frequency analysis procedure is then conducted to verify the result against a known sampled waveform. Conclusion: Conversion of chart paper to digital values that represent only the biopotential waveform, with other irrelevant information eliminated, has been achieved. This results in patient records occupying less space and makes further data processing and manipulation easy.
- Published
- 2011
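A minimal single-channel version of the described pipeline, binarize, clean with a morphological operation, quantize one sample per column, then smooth, might look like this (file path, kernel sizes and smoothing window are assumptions):

```python
import cv2
import numpy as np

gray = cv2.imread("chart_scan.png", cv2.IMREAD_GRAYSCALE)  # scanned chart (assumed)

# Binarize so the dark trace is foreground, then remove thin gridline
# residue with a morphological opening (binary neighbourhood operation).
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
bw = cv2.morphologyEx(bw, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))

# Quantize: one sample per image column, the mean row of trace pixels.
h, w = bw.shape
signal = np.full(w, np.nan)
for x in range(w):
    ys = np.nonzero(bw[:, x])[0]
    if ys.size:
        signal[x] = h - ys.mean()   # flip so larger values mean higher amplitude

# Light smoothing to suppress artefacts introduced by the morphology.
valid = ~np.isnan(signal)
signal[valid] = np.convolve(signal[valid], np.ones(5) / 5, mode="same")
```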
26. Hybrid wet paper coding mechanism for steganography employing n-indicator and fuzzy edge detector
- Author
-
T. Hoang Ngan Le, Jung-San Lee, and Chin-Chen Chang
- Subjects
Guard (information security), Theoretical computer science, Steganography, Computer science, Image quality, Applied Mathematics, Fuzzy logic, Computational Theory and Mathematics, Artificial Intelligence, Robustness (computer science), Information hiding, Signal Processing, Embedding, Computer Vision and Pattern Recognition, Data mining, Electrical and Electronic Engineering, Statistics, Probability and Uncertainty, Coding (social sciences) - Abstract
Data hiding techniques can facilitate security and the safe transmission of important information in the digital domain, which generally requires a high embedding payload and good stego image quality. Recently, a steganographic framework known as wet paper coding has been utilized as an effective strategy in image hiding to achieve the requirements of high embedding payload, good quality and robust security. In this paper, besides employing this mechanism as a fundamental stage, we take advantage of two novel techniques, namely, an efficient n-indicator and a fuzzy edge detector. The first increases the robustness of the proposed system against detection or tracing by statistical methods, while allowing a receiver without knowledge of the secret data positions to retrieve the embedded information. The second improves the payload and enhances the quality of the stego image. The experimental results show that the proposed scheme outperforms previous schemes in its ability to reduce the conflict among the three steganography requirements.
- Published
- 2010
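For orientation only, here is a much-simplified edge-adaptive LSB sketch. It borrows just the "embed where the edge detector permits" idea; it is not the wet paper coding construction, which uses syndrome coding so the receiver needs no knowledge of the wet/dry positions, and a plain Sobel magnitude stands in for the fuzzy edge detector:

```python
import cv2
import numpy as np

img = cv2.imread("cover.png", cv2.IMREAD_GRAYSCALE)   # assumed cover image

# Edge strength as a stand-in for the paper's fuzzy edge detector.
edges = (cv2.Sobel(img, cv2.CV_64F, 1, 0) ** 2 +
         cv2.Sobel(img, cv2.CV_64F, 0, 1) ** 2)
dry = np.argsort(edges.ravel())[::-1]   # strongest edges first: changeable pixels

bits = [int(b) for byte in b"payload" for b in format(byte, "08b")]
stego = img.ravel().copy()
for idx, bit in zip(dry, bits):
    stego[idx] = (stego[idx] & 0xFE) | bit   # LSB replacement at dry positions
stego = stego.reshape(img.shape)
```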
27. Intelligent extraction of reservoir dispatching information integrating large language model and structured prompts.
- Author
-
Yang, Yangrui, Chen, Sisi, Zhu, Yaping, Liu, Xuemei, Ma, Wei, and Feng, Ling
- Subjects
LANGUAGE models, ARTIFICIAL intelligence, RESERVOIRS, DATA mining, MERGERS & acquisitions, FLOOD control - Abstract
Reservoir dispatching regulations are a crucial basis for reservoir operation, and using information extraction technology to extract entities and relationships from heterogeneous texts to form triples can provide structured knowledge support for professionals in making dispatch decisions and intelligent recommendations. Current information extraction technologies require manual data labeling, consuming a significant amount of time. As the number of dispatch rules increases, this method cannot meet the need for timely generation of dispatch plans during emergency flood control periods. Furthermore, utilizing natural language prompts to guide large language models in completing reservoir dispatch extraction tasks also presents challenges of cognitive load and instability in model output. Therefore, this paper proposes an entity and relationship extraction method for reservoir dispatch based on structured prompt language. Initially, a variety of labels are refined according to the extraction tasks, then organized and defined using the Backus–Naur Form (BNF) to create a structured format, thus better guiding large language models in the extraction work. Moreover, an AI agent based on this method has been developed to facilitate operation by dispatch professionals, allowing for the quick acquisition of structured data. Experimental verification has shown that, in the task of extracting entities and relationships for reservoir dispatch, this AI agent not only effectively reduces cognitive burden and the impact of instability in model output but also demonstrates high extraction performance (with F1 scores for extracting entities and relationships both above 80%), offering a new solution approach for knowledge extraction tasks in other water resource fields.
- Published
- 2024
- Full Text
- View/download PDF
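The BNF-structured prompting idea can be sketched as follows; the grammar, label names, and the `call_llm` stub are all hypothetical illustrations, not the paper's actual schema or client:

```python
# `call_llm` is a hypothetical stand-in for whatever chat-completion client is used.
BNF_SCHEMA = """
<triple>   ::= "(" <entity> "," <relation> "," <entity> ")"
<entity>   ::= <reservoir> | <water-level> | <discharge-rule>
<relation> ::= "has_limit" | "triggers" | "applies_to"
"""

def build_prompt(text: str) -> str:
    """Wrap the source text with a BNF grammar that constrains the output shape."""
    return ("Extract reservoir-dispatch entities and relations from the text below.\n"
            "Output one <triple> per line, following this BNF exactly:\n"
            f"{BNF_SCHEMA}\nText: {text}")

def parse_triples(response: str):
    """Parse '(a,b,c)' lines back into [a, b, c] lists."""
    return [line.strip("() ").split(",")
            for line in response.splitlines() if line.startswith("(")]

# triples = parse_triples(call_llm(build_prompt(regulation_text)))
```

Constraining the output grammar is what makes the responses machine-parseable and reduces the output instability the abstract mentions.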
28. Variable Selection and Grouping in a Paper Machine Application
- Author
-
Esko Juuso, Timo Ahola, and Kauko Leiviskä
- Subjects
Engineering, Computer Networks and Communications, Computer science, Feature selection, General Medicine, Machine learning, Fuzzy logic, Computer Science Applications, Paper machine, Computational Theory and Mathematics, Industrial systems, Artificial intelligence, Sensitivity (control systems), Data mining - Abstract
This paper describes the possibilities of variable selection in large-scale industrial systems, introducing knowledge-based, data-based and model-based methods for this purpose. As an example, a Case-Based Reasoning application for evaluating web break sensitivity in a paper machine is introduced. The application uses the Linguistic Equations approach and basic fuzzy logic. The indicator combines information from on-line measurements with expert knowledge and provides a continuous indication of break sensitivity. The web break sensitivity characterizes the current operating situation at the paper mill and gives new information to the operators. Together with information on the most important variables, this prediction gives operators enough time to react to a changing operating situation.
- Published
- 2007
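The Linguistic Equations idea, scaling each measurement to a common range and combining the scaled values linearly into an index, can be illustrated as below; the variables, ranges and weights are invented for illustration, since the real indicator is expert-tuned:

```python
import numpy as np

def scale(x, lo, hi):
    """Map a measurement to [-2, 2], the working range used by
    linguistic-equation models."""
    return np.clip(4 * (x - lo) / (hi - lo) - 2, -2, 2)

def break_sensitivity(basis_weight, wire_speed, drainage):
    """Toy linear combination of scaled variables (weights are assumptions)."""
    z = (0.5 * scale(basis_weight, 40, 120)
         - 0.3 * scale(wire_speed, 800, 1600)
         + 0.2 * scale(drainage, 0, 100))
    return (z + 2) / 4   # 0 = low break sensitivity, 1 = high

print(break_sensitivity(60, 1200, 45))
```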
29. Quantitative knowledge presentation models of traditional Chinese medicine (TCM): A review.
- Author
-
Chu X, Sun B, Huang Q, Peng S, Zhou Y, and Zhang Y
- Subjects
- Humans, Artificial Intelligence, Biomedical Research methods, Data Mining methods, Medicine, Chinese Traditional methods
- Abstract
Modern computer technology sheds light on new ways of innovating Traditional Chinese Medicine (TCM). One method that receives increasing attention is the quantitative research method, which makes use of data mining and artificial intelligence technology, as well as mathematical principles, in research on rationales, the academic viewpoints of famous TCM doctors, dialectical treatment by TCM, clinical technology of TCM, the patterns of TCM prescriptions, the clinical curative effects of TCM, and other aspects. This paper reviews the methods, means, progress and achievements of quantitative research on TCM. In the core database of the Web of Science, "Traditional Chinese Medicine", "Computational Science" and "Mathematical Computational Biology" are selected as the main retrieval fields, with a retrieval time interval from 1999 to 2019, to collect relevant literature. It is found that researchers from the China Academy of Chinese Medical Sciences, Zhejiang University, the Chinese Academy of Sciences and other institutes have opened up new methods of research on TCM since 2009, with quantitative methods and knowledge presentation models. The adopted tools mainly consist of text mining, knowledge discovery, TCM database technologies, data mining and drug discovery through TCM calculation, etc. In the future, research on quantitative models of TCM will focus on solving the heterogeneity and incompleteness of big TCM data, establishing standardized treatment systems, and promoting the modernization and internationalization of TCM.
- Published
- 2020
- Full Text
- View/download PDF
30. Research on Industry 4.0 and on key related technologies in Vietnam: A bibliometric analysis using Scopus.
- Author
-
Pham‐Duc, Binh, Tran, Trung, Le, Hien‐Thu‐Thi, Nguyen, Nhi‐Thi, Cao, Ha‐Thi, and Nguyen, Tien‐Trung
- Subjects
INDUSTRY 4.0, ARTIFICIAL intelligence, DEEP learning, COMPUTER science education, BIBLIOMETRICS, DATA mining - Abstract
Bibliometric analysis was performed to study the development of publications related to Industry 4.0 and its key technologies in Vietnam. Comparisons with data from other ASEAN countries and with global data were made to identify distinctive characteristics of the Industry 4.0 literature from Vietnam. The collection of 1,470 retrieved papers was analysed to answer seven research questions. Our results highlight some valuable insights into the Industry 4.0 literature in Vietnam. The number of Industry 4.0 papers in Vietnam increased rapidly in recent years, mostly focused on Computer Science, Engineering, and Mathematics. Iran, China, and South Korea were Vietnam's most productive partner countries in Industry 4.0. Machine learning, artificial intelligence, big data, deep learning, Internet of things, neural networks, and data mining were among the most popular Industry 4.0 research themes in Vietnam. Vietnam ranked third among 10 Southeast Asian countries by number of published Industry 4.0 papers, but the gap with the two top countries was large. Compared to the global data, the annual growth rate of Industry 4.0 papers in Vietnam and other Southeast Asian countries was lower. Findings from this work can help other scholars establish potential future research lines related to Industry 4.0 in Vietnam.
- Published
- 2021
- Full Text
- View/download PDF
31. A Reliable Classification Method for Paper Currency Based on the Non-Linear PC
- Author
-
Sigeru Omatu, Toshihisa Kosaka, and Ali Ahmadi
- Subjects
Engineering, Learning vector quantization, Machine learning, Nonlinear system, Currency, Classification methods, Data mining, Artificial intelligence, Electrical and Electronic Engineering, Reliability (statistics) - Published
- 2003
32. Input variable selection in time-critical knowledge integration applications: A review, analysis, and recommendation paper
- Author
-
Alireza Mousavi, Stefan Poslad, and Siamak Tavakoli
- Subjects
Input variable selection, Computer science, Context (language use), Feature selection, Supervisory control and data acquisition, Data acquisition, SCADA, Artificial Intelligence, Knowledge integration, Sensitivity (control systems), Dimensionality reduction, Time-critical control, Data mining, State (computer science), Sensitivity analysis, Information Systems
The purpose of this research is twofold: first, to undertake a thorough appraisal of existing Input Variable Selection (IVS) methods within the context of time-critical and computation-resource-limited dimensionality reduction problems; second, to demonstrate improvements to, and the application of, a recently proposed time-critical sensitivity analysis method called EventTracker in an environmental-science industrial use case, i.e., sub-surface drilling. Producing time-critical, accurate knowledge about the state of a system (effect) under computational and data acquisition (cause) constraints is a major challenge, especially if the knowledge required is critical to the system operation, where the safety of operators or the integrity of costly equipment is at stake. Understanding and interpreting a chain of interrelated events, predicted or unpredicted, that may or may not result in a specific state of the system is the core challenge of this research. The main objective is then to identify which set of input data signals has a significant impact on the set of system state information (i.e., output). Through a cause-effect analysis technique, the proposed technique supports the filtering of unsolicited data that can otherwise clog up the communication and computational capabilities of a standard supervisory control and data acquisition system. The paper analyzes the performance of input variable selection techniques from a series of perspectives. It then expands the categorization and assessment of sensitivity analysis methods in a structured framework that takes into account the relationship between inputs and outputs, the nature of their time series, and the computational effort required. The outcome of this analysis is that established methods have limited suitability for time-critical variable selection applications. By way of a geological drilling monitoring scenario, the suitability of the proposed EventTracker sensitivity analysis method for high-volume and time-critical input variable selection problems is demonstrated.
- Published
- 2013
- Full Text
- View/download PDF
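A generic filter-style IVS pass, ranking candidate input signals by their dependence on the output and forwarding only the most informative ones, can be illustrated as below (this is a mutual-information stand-in on synthetic data, not EventTracker, which is event-based):

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 12))                      # candidate SCADA input signals
y = 2 * X[:, 3] - X[:, 7] + rng.normal(size=1000)    # system-state output (synthetic)

mi = mutual_info_regression(X, y, random_state=0)
keep = np.argsort(mi)[::-1][:4]   # forward only the most informative inputs
print(keep, mi[keep])
```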
33. An Intelligent Test Paper Generating Algorithm Based on Maximum Coverage of Knowledge Points
- Author
-
Nie Er, Min Wu, Yinhuang Le, Hui Zhang, and Man Yang
- Subjects
College English, Computer science, Process (engineering), Diagnostic test, Machine learning, Test (assessment), Knowledge hierarchy, Data mining, Artificial intelligence, Algorithm - Abstract
In order to meet the requirements of the College English Diagnostic Test System developed by our team, and after analyzing the pros and cons of current test paper generating algorithms, the MCKP algorithm, which is based on maximum coverage of knowledge points (MCKP), is proposed. In this paper, the knowledge hierarchy of the College English Test (CET-4) of China is constructed to provide the foundation for MCKP. Then, given three main parameters, namely the required number of items, the mastery rates of knowledge points, and the testing frequencies of knowledge points, the theory and implementation of the MCKP algorithm are elaborated. Finally, experiments indicate a relatively high success rate and the effectiveness of the test paper generation process. The MCKP algorithm can therefore provide an essential improvement in test paper generation. Keywords: maximum coverage of knowledge points (MCKP); knowledge hierarchy of CET-4; mastery rates; testing frequencies; success rate
- Published
- 2013
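The maximum-coverage core of such an algorithm is typically approximated greedily: at each step pick the item covering the most still-uncovered knowledge points. A minimal sketch (toy item bank; the paper's MCKP additionally weighs mastery rates and testing frequencies):

```python
def generate_paper(items, required_n):
    """Greedy max coverage: each pick adds the item covering the most
    uncovered knowledge points."""
    covered, paper = set(), []
    pool = dict(items)   # item id -> set of knowledge points it covers
    for _ in range(required_n):
        best = max(pool, key=lambda i: len(pool[i] - covered), default=None)
        if best is None:
            break
        covered |= pool.pop(best)
        paper.append(best)
    return paper, covered

items = {"q1": {"kp1", "kp2"}, "q2": {"kp2", "kp3"},
         "q3": {"kp4"}, "q4": {"kp1", "kp4"}}
print(generate_paper(items, 3))
```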
34. Review paper on Error Correcting Output Code Based on Multiclass Classification
- Author
-
Irfan Poladi and Hitesh Ishwardas
- Subjects
Multiclass classification, Computer science, Error correcting, Code (cryptography), Pattern recognition, Artificial intelligence, Data mining - Published
- 2012
35. Some issues on scalable feature selection (an extended version of the paper presented at the Fourth World Congress of Expert Systems: Application of Advanced Information Technologies, held in Mexico City in March 1998)
- Author
-
Rudy Setiono and Huan Liu
- Subjects
Computer science, Feature extraction, General Engineering, Feature selection, Machine learning, Computer Science Applications, Randomized algorithm, k-nearest neighbors algorithm, Artificial Intelligence, Feature (computer vision), Scalability, Feature (machine learning), Artificial intelligence, Data mining, Feature learning - Abstract
Feature selection determines relevant features in the data. It is often applied in pattern classification, data mining, as well as machine learning. A special concern for feature selection nowadays is that the size of a database is normally very large, both vertically and horizontally. In addition, feature sets may grow as the data collection process continues. Effective solutions are needed to accommodate the practical demands. This paper concentrates on three issues: large number of features, large data size, and expanding feature set. For the first issue, we suggest a probabilistic algorithm to select features. For the second issue, we present a scalable probabilistic algorithm that expedites feature selection further and can scale up without sacrificing the quality of selected features. For the third issue, we propose an incremental algorithm that adapts to the newly extended feature set and captures 'concept drifts' by removing features from previously selected and newly added ones. We expect that research on scalable feature selection will be extended to distributed and parallel computing and have impact on applications of data mining and machine learning.
- Published
- 1998
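A probabilistic (Las Vegas filter style) feature selection loop, evaluating random subsets and keeping the best-scoring one, can be sketched as below; the wrapper-style cross-validation scoring is an illustrative choice, not necessarily the paper's evaluation measure:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=30,
                           n_informative=5, random_state=0)
rng = np.random.default_rng(0)

best_subset, best_score = None, -np.inf
for _ in range(100):   # sample random subsets instead of exhaustive search
    size = rng.integers(3, 10)
    subset = rng.choice(X.shape[1], size=size, replace=False)
    score = cross_val_score(LogisticRegression(max_iter=1000),
                            X[:, subset], y, cv=3).mean()
    # Prefer higher scores; break ties by smaller subsets.
    if score > best_score or (score == best_score
                              and len(subset) < len(best_subset)):
        best_subset, best_score = subset, score

print(sorted(best_subset), best_score)
```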
36. High speed paper currency recognition by neural networks
- Author
-
S. Omatu and F. Takeda
- Subjects
Artificial neural network, Computer Networks and Communications, Computer science, Time delay neural network, General Medicine, Computer Science Applications, Data set, Set (abstract data type), Artificial Intelligence, Pattern recognition (psychology), Data mining, Time series, Fourier series, Software - Abstract
In this paper a new technique is proposed to improve the recognition ability and the transaction speed for classifying Japanese and US paper currency. Two types of data sets, time series data and Fourier power spectra, are used in this study. In both cases, they are used directly as inputs to the neural network. Furthermore, we also present a new evaluation method for recognition ability. Meanwhile, a technique is proposed to reduce the input scale of the neural network without degrading recognition. This technique uses only a subset of the original data set, obtained using random masks. The recognition ability when using the large data set versus a reduced data set is discussed. In addition, the results of using a reduced data set of the Fourier power spectra and of the time series data are compared.
- Published
- 1995
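The combination of Fourier power spectra, a random input mask, and a neural network classifier can be sketched as follows (synthetic placeholder signals; the network size and mask length are illustrative assumptions):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n, length = 200, 256
signals = rng.normal(size=(n, length))      # placeholder banknote sensor scans
labels = rng.integers(0, 4, n)              # toy currency classes

# Fourier power spectra as inputs.
spectra = np.abs(np.fft.rfft(signals, axis=1)) ** 2

# Random mask: keep a fixed random subset of spectral bins to shrink
# the network's input scale.
mask = rng.choice(spectra.shape[1], size=32, replace=False)
X = spectra[:, mask]

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
clf.fit(X[:150], labels[:150])
print(clf.score(X[150:], labels[150:]))
```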
37. Survey Paper on Clustering based Segmentation Approach to Detect Brain Tumour from MRI Scan
- Author
-
Upasana Gaikwad, Kanika Debbarma, and Silkesha Thigale
- Subjects
Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Brain tumor, Pattern recognition, Image processing, Image segmentation, medicine.disease, computer.software_genre, Fuzzy logic, Component (UML), medicine, Segmentation, Data mining, Artificial intelligence, Focus (optics), Cluster analysis, business, Literature survey, computer - Abstract
In the last few years, many image processing techniques have been presented to perform brain tumor detection tasks. These include the content-based retrieval technique, the component labelling algorithm, and the fuzzy c-means algorithm. Image processing has grown rapidly over this period, and image segmentation is applied to extract meaningful information from the available data. This survey paper focuses on a literature review of segmentation-based image processing models with efficient performance.
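Of the approaches the survey covers, fuzzy c-means is the most algorithmically self-contained; below is a minimal Python sketch of it applied to pixel intensities, where each pixel receives a membership degree per cluster. The synthetic intensities are a stand-in for a real MRI slice.

```python
# A minimal fuzzy c-means sketch of clustering-based segmentation.
# Replace the synthetic pixel intensities with a flattened MRI slice.
import numpy as np

def fuzzy_c_means(x, c=3, m=2.0, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    u = rng.random((len(x), c))
    u /= u.sum(axis=1, keepdims=True)        # memberships sum to 1 per pixel
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ x) / w.sum(axis=0)  # fuzzily weighted cluster centers
        d = np.abs(x[:, None] - centers[None, :]) + 1e-12
        u = 1.0 / (d ** (2 / (m - 1)))       # standard FCM membership update
        u /= u.sum(axis=1, keepdims=True)
    return centers, u

rng = np.random.default_rng(1)
pixels = np.concatenate([rng.normal(mu, 5, 500) for mu in (30, 100, 200)])
centers, memberships = fuzzy_c_means(pixels)
labels = memberships.argmax(axis=1)          # hard segmentation from memberships
print("cluster centers:", np.sort(centers).round(1))
```

The soft memberships are what distinguish this from plain k-means: boundary pixels between tumour and healthy tissue retain graded assignments instead of being forced into one region.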
- Published
- 2015
38. Forecasting rock trencher performance using fuzzy logic (a shorter version of this paper was presented at the 36th US Rock Mechanics Symposium, New York)
- Author
-
M. Alvarez Grima and P. N. W. Verhoef
- Subjects
Engineering, Fuzzy expert system, business.industry, Fuzzy set, Excavation, Geotechnical Engineering and Engineering Geology, computer.software_genre, Fuzzy logic, Rock cutting, Rock engineering, Trench, Artificial intelligence, Stage (hydrology), Data mining, business, computer - Abstract
To study the performance of rock cutting trenchers, data on the excavation and tool consumption rate of one type of trencher, the Vermeer T-850, were gathered on 16 sites. The data assembled were compared with the rock characteristics by studying the trench geology and performing rock engineering tests on samples. This study aims at more reliable predictions, by developing better methods to handle the data, which are commonly of an imprecise nature. To reach this goal, fuzzy set theory has been selected and successfully implemented. A Fuzzy Expert System model has been developed to predict the bit consumption and the excavation rate of the T-850 trencher. The results obtained so far are promising and the model is in the verification stage.
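The kind of fuzzy expert system the abstract describes can be illustrated with a minimal Mamdani-style sketch in Python: imprecise rock properties in, a defuzzified excavation rate out. The membership functions, rule base and all numbers below are illustrative assumptions, not the calibrated T-850 model.

```python
# A minimal Mamdani fuzzy inference sketch for trencher performance.
# Membership functions and rules are hypothetical.
import numpy as np

def tri(x, a, b, c):
    # triangular membership function peaking at b
    return np.maximum(np.minimum((x - a) / (b - a + 1e-12),
                                 (c - x) / (c - b + 1e-12)), 0.0)

def predict_excavation_rate(ucs_mpa):
    # fuzzify the input: rock strength as weak / medium / strong
    weak = tri(ucs_mpa, 0, 10, 40)
    medium = tri(ucs_mpa, 20, 50, 80)
    strong = tri(ucs_mpa, 60, 100, 150)
    rate = np.linspace(0, 120, 241)  # output universe, m3/h
    high, moderate, low = tri(rate, 70, 100, 120), tri(rate, 30, 60, 90), tri(rate, 0, 15, 40)
    # rules: weak rock -> high rate, medium -> moderate, strong -> low;
    # Mamdani inference clips each output set by its rule's firing strength
    aggregated = np.maximum.reduce([np.minimum(weak, high),
                                    np.minimum(medium, moderate),
                                    np.minimum(strong, low)])
    # centroid defuzzification
    return (rate * aggregated).sum() / (aggregated.sum() + 1e-12)

print(f"predicted rate at UCS 35 MPa: {predict_excavation_rate(35.0):.1f} m3/h")
```

The appeal of this formulation for trenching data is exactly the one the paper states: field observations of rock mass quality are imprecise, and fuzzy sets let that imprecision propagate into the prediction rather than being discarded.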
- Published
- 1999
39. Towards classifying species in systems biology papers using text mining
- Author
-
Qi Wei and Nigel Collier
- Subjects
Computer science, Systems biology, ved/biology.organism_classification_rank.species, lcsh:Medicine, computer.software_genre, General Biochemistry, Genetics and Molecular Biology, Task (project management), Text mining, Type (biology), lcsh:Science (General), Model organism, lcsh:QH301-705.5, Organism, Medicine(all), Biochemistry, Genetics and Molecular Biology(all), business.industry, ved/biology, lcsh:R, MeSH Headings, General Medicine, Biomedical text mining, lcsh:Biology (General), Artificial intelligence, Data mining, business, computer, Natural language processing, lcsh:Q1-390, Research Article - Abstract
Background: In recent years high-throughput methods have led to a massive expansion in the free-text literature on molecular biology. Automated text mining has developed as an application technology for formalizing this wealth of published results into structured database entries. However, database curation as a task is still largely done by hand, and although there have been many studies on automated approaches, problems remain in how to classify documents into top-level categories based on the type of organism being investigated. Here we present a comparative analysis of state-of-the-art supervised models that are used to classify both abstracts and full-text articles for three model organisms. Results: Ablation experiments were conducted on a large gold-standard corpus of 10,000 abstracts and full papers containing data on three model organisms (fly, mouse and yeast). Among the eight learner models tested, the best model achieved an F-score of 97.1% for fly, 88.6% for mouse and 85.5% for yeast using a variety of features that included gene name, organism frequency, MeSH headings and term-species associations. We noted that term-species associations were particularly effective in improving classification performance. The benefit of using full-text articles over abstracts was consistently observed across all three organisms. Conclusions: By comparing various learner algorithms and features we present an optimized system that automatically detects the major focus organism in full-text articles for fly, mouse and yeast. We believe the method will be extensible to other organism types.
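The document-level classification task can be sketched with a standard bag-of-words pipeline in Python. The toy corpus, labels and linear model below are stand-ins for the paper's 10,000-document gold standard and its richer feature set (gene names, MeSH headings, term-species associations).

```python
# A minimal sketch of organism-focus document classification.
# The six-document corpus is a toy stand-in; results on it prove nothing
# beyond the pipeline's mechanics.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = ["the drosophila wing disc expresses hedgehog",
        "murine knockout mice show altered liver phenotype",
        "saccharomyces cerevisiae budding yeast cell cycle",
        "fly embryo segmentation genes in drosophila",
        "mouse model of cardiac hypertrophy in mice",
        "yeast two-hybrid screen in cerevisiae"]
labels = ["fly", "mouse", "yeast", "fly", "mouse", "yeast"]

# tf-idf unigrams/bigrams feeding a linear classifier
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(docs, labels)
print(model.predict(["a genome-wide screen in budding yeast"]))
```

The paper's term-species associations would enter this pipeline as additional engineered features alongside the tf-idf vectors; per the abstract, that is where most of the performance gain came from.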
- Published
- 2011
40. Memetic Differential Evolution Frameworks in Filter Design for Defect Detection in Paper Production
- Author
-
Ville Tirronen and Ferrante Neri
- Subjects
Engineering, Process (engineering), business.industry, Particle swarm optimization, Image processing, computer.software_genre, Filter design, Differential evolution, Memetic algorithm, Data mining, Artificial intelligence, Adaptation (computer science), business, computer, Contraposition (traditional logic) - Abstract
This chapter studies and analyzes Memetic Differential Evolution (MDE) frameworks for designing digital filters that aim at detecting paper defects produced during an industrial process. MDE frameworks employ Differential Evolution (DE) as an evolutionary framework and a list of local searchers adaptively coordinated by a control scheme. Here, three different variants of MDE are considered and their features and performance are compared. The binomial explorative features of the DE framework, in contraposition to the exploitative features of the local searchers, are analyzed in detail in light of the stagnation-prevention problem typical of DE. Much emphasis in this chapter is given to the various adaptation systems and to their applicability to this image processing problem.
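The structure of a memetic DE can be shown in a short Python sketch: a DE/rand/1/bin evolutionary loop with a simple hill-climbing local searcher applied periodically to the best individual. The sphere objective, rates, and the fixed every-20-generations coordination rule are illustrative; the chapter's frameworks adaptively coordinate several local searchers against a filter-design fitness.

```python
# A minimal memetic Differential Evolution sketch (DE/rand/1/bin plus a
# hill-climbing local searcher). Objective and parameters are stand-ins.
import numpy as np

def sphere(x):
    return float(np.sum(x ** 2))  # stand-in for the filter-design fitness

def memetic_de(f, dim=10, pop_size=30, gens=200, F=0.7, CR=0.9, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-5, 5, (pop_size, dim))
    fit = np.array([f(x) for x in pop])
    for g in range(gens):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = a + F * (b - c)              # DE/rand/1 mutation
            cross = rng.random(dim) < CR          # binomial crossover
            cross[rng.integers(dim)] = True       # guarantee one mutated gene
            trial = np.where(cross, mutant, pop[i])
            if f(trial) < fit[i]:                 # greedy one-to-one selection
                pop[i], fit[i] = trial, f(trial)
        if g % 20 == 0:                           # memetic step: refine the best
            i = fit.argmin()
            for _ in range(50):
                cand = pop[i] + rng.normal(0, 0.1, dim)
                if f(cand) < fit[i]:
                    pop[i], fit[i] = cand, f(cand)
    return pop[fit.argmin()], fit.min()

best, value = memetic_de(sphere)
print(f"best fitness: {value:.6f}")
```

The interplay the chapter analyzes is visible here in miniature: the DE loop explores broadly, while the local searcher exploits around the incumbent and counteracts population stagnation.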
- Published
- 2009
41. Formal Medical Knowledge Representation Supports Deep Learning Algorithms, Bioinformatics Pipelines, Genomics Data Analysis, and Big Data Processes.
- Author
-
Dhombres F and Charlet J
- Subjects
- Big Data, Computational Biology, Data Analysis, Genomics, Knowledge Management, Semantics, Artificial Intelligence, Biological Ontologies, Data Mining, Deep Learning, Genetic Association Studies
- Abstract
Objective: To select, present, and summarize the best papers published in 2018 in the field of Knowledge Representation and Management (KRM)., Methods: A comprehensive and standardized review of the medical informatics literature was performed to select the most interesting papers published in 2018 in KRM, based on PubMed and ISI Web Of Knowledge queries., Results: Four best papers were selected among the 962 publications retrieved following the Yearbook review process. The research areas in 2018 were mainly related to the ontology-based data integration for phenotype-genotype association mining, the design of ontologies and their application, and the semantic annotation of clinical texts., Conclusion: In the KRM selection for 2018, research on semantic representations demonstrated their added value for enhanced deep learning approaches in text mining and for designing novel bioinformatics pipelines based on graph databases. In addition, the ontology structure can enrich the analyses of whole genome expression data. Finally, semantic representations demonstrated promising results to process phenotypic big data., Competing Interests: Disclosure The authors report no conflicts of interest in this work., (Georg Thieme Verlag KG Stuttgart.)
- Published
- 2019
- Full Text
- View/download PDF
42. Research on Influencing Factors of Technological Innovation in Industrial Clusters Based on Data Mining and Artificial Intelligence Technology.
- Author
-
Yang, Yaliu
- Subjects
INDUSTRIAL clusters ,DATA mining ,ARTIFICIAL intelligence ,INNOVATIONS in business ,INDUSTRIALIZATION ,TECHNOLOGICAL innovations - Abstract
At present, realizing the upgrading and sustainable development of industrial clusters has become an urgent problem. To address it, this paper draws on data mining and artificial intelligence technology and combines the actual needs of industrial clusters and technological innovation to construct an analysis model of the factors influencing technological innovation in industrial clusters. Building on a complex network foundation, the paper regards the activity of the innovation network as the joint result of the presence index, closeness index, betweenness index, agglomeration index and path length index, and combines the actual situation of the cluster to construct an evaluation system for innovation network activity. In addition, the PROMETHEE method based on the cloud model is used to evaluate the activity of the innovation network under study. Finally, experiments are designed to verify the performance of the constructed algorithm model. The research results show that the proposed system model is effective. [ABSTRACT FROM AUTHOR]
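The outranking step at the heart of the abstract can be illustrated with a minimal PROMETHEE II sketch in Python. The scores, weights and the "usual criterion" preference function are all assumptions; the paper additionally couples PROMETHEE with a cloud model to handle uncertainty.

```python
# A minimal PROMETHEE II sketch: rank clusters by net outranking flow over
# the five activity indices the paper names. All numbers are illustrative.
import numpy as np

scores = np.array([[0.8, 0.6, 0.7, 0.5, 0.4],   # cluster A: five activity indices
                   [0.5, 0.9, 0.6, 0.7, 0.6],   # cluster B
                   [0.7, 0.5, 0.8, 0.6, 0.5]])  # cluster C
weights = np.array([0.25, 0.2, 0.2, 0.2, 0.15])  # assumed index weights

n = len(scores)
phi = np.zeros(n)  # net flow = positive flow minus negative flow
for i in range(n):
    for j in range(n):
        if i == j:
            continue
        # usual criterion: preference is 1 wherever i strictly beats j
        pref = (scores[i] > scores[j]).astype(float)
        pi_ij = float(weights @ pref)    # weighted aggregated preference
        phi[i] += pi_ij / (n - 1)        # i's positive flow
        phi[j] -= pi_ij / (n - 1)        # j's negative flow
print("net outranking flows:", phi.round(3))  # higher = more active network
```

A complete ranking then follows directly from sorting the net flows, which is what makes PROMETHEE II convenient for comparing clusters on several indices at once.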
- Published
- 2023
- Full Text
- View/download PDF
43. Best papers from the 12th Pacific-Asia conference on knowledge discovery and data mining (PAKDD2008)
- Author
-
Takashi Washio, Kai Ming Ting, and Einoshin Suzuki
- Subjects
Causal induction, business.industry, Computer science, computer.software_genre, Knowledge acquisition, Data warehouse, Human-Computer Interaction, Data visualization, Knowledge extraction, Artificial Intelligence, Hardware and Architecture, Information system, Software mining, Review process, Data mining, business, computer, Software, Information Systems - Abstract
The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) has been held every year since 1997. PAKDD 2008, the 12th in the series, was held in Osaka, Japan on May 20–23, 2008. PAKDD is a leading international conference in the area of data mining. It provides an international forum for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all KDD-related areas, including data mining, data warehousing, machine learning, databases, statistics, knowledge acquisition, automatic scientific discovery, data visualization, causal induction and knowledge-based systems. We received a total of 312 research papers from 34 countries and regions in Asia, Australia, North America, South America, Europe and Africa. Approximately 11.9% of these submissions were accepted as long papers, 12.8% as regular papers, and 11.5% as short papers, following rigorous reviews by two or three reviewers, discussions among the reviewers under the supervision of an area chair, and judgment by the Program Committee Co-chairs. The highest-rated papers in the review process were further evaluated by the Program Committee Co-chairs, and two papers were selected for the Best Paper Award and the Best Paper Runner-up Award. In addition to these award-winning papers, two outstanding papers were also nominated for this special issue. The authors of the four selected papers were asked to substantially extend the contents of their papers after the
- Published
- 2010
44. AI assistant trawls papers for hidden info.
- Author
-
Harris, Mark
- Subjects
SCIENTIFIC literature ,ARTIFICIAL intelligence ,DATA mining - Abstract
The article discusses the artificial intelligence (AI) tools that data mine scientific papers in order to develop new scientific ideas, including the AI tool known as Semantic Scholar developed by the Allen Institute for Artificial Intelligence (AI2) in Seattle, Washington.
- Published
- 2015
- Full Text
- View/download PDF
45. Novel Tool for Complete Digitization of Paper Electrocardiography Data
- Author
-
Amit J. Shah, Lakshminarayan Ravichandran, C. Harless, Srini Tridandapani, Carson A. Wick, and James H. McClellan
- Subjects
optical character recognition, lcsh:Medical technology, Computer science, electrocardiography, Feature extraction, Biomedical Engineering, lcsh:Computer applications to medicine. Medical informatics, computer.software_genre, Grayscale, Signal, Article, electronic medical records, cardiovascular diseases, Digitization, Pixel, business.industry, Signal reconstruction, Pattern recognition, General Medicine, Optical character recognition, Thresholding, lcsh:R855-855.5, lcsh:R858-859.7, Artificial intelligence, Data mining, business, computer - Abstract
Objective: We present a Matlab-based tool to convert electrocardiography (ECG) information from paper charts into digital ECG signals. The tool can be used for long-term retrospective studies of cardiac patients to study evolving features with prognostic value. Methods and procedures: To perform the conversion, we: 1) detect the graphical grid on ECG charts using grayscale thresholding; 2) digitize the ECG signal based on its contour using a column-wise pixel scan; and 3) use template-based optical character recognition to extract patient demographic information from the paper ECG in order to interface the data with the patients' medical records. To validate the digitization technique: 1) correlations between the digital signals and the signals digitized from paper ECG are computed, and 2) clinically significant ECG parameters are measured and compared from both the paper-based ECG signals and the digitized ECG. Results: The validation demonstrates a correlation value of 0.85–0.9 between the digital ECG signal and the signal digitized from the paper ECG. There is a high correlation in the clinical parameters between the ECG information from the paper charts and the digitized signal, with intra-observer and inter-observer correlations of 0.8–0.9.
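Step 2 of the procedure, the column-wise pixel scan, is simple enough to sketch in Python (the paper's tool is Matlab-based, but the idea carries over directly). The synthetic "scanned chart" below stands in for a real grayscale image, and grid removal and OCR are omitted.

```python
# A minimal sketch of column-wise trace digitization: threshold the image
# to isolate dark trace pixels, then take one y-value per image column.
import numpy as np

def digitize_trace(gray, threshold=0.5):
    mask = gray < threshold                  # dark pixels are the trace
    n_rows, n_cols = gray.shape
    signal = np.full(n_cols, np.nan)
    for col in range(n_cols):
        rows = np.flatnonzero(mask[:, col])
        if rows.size:
            # mean row index of trace pixels -> signal amplitude;
            # flip so that "up on paper" means a larger value
            signal[col] = n_rows - rows.mean()
    return signal

# synthetic chart: a white page with a dark sine-wave trace drawn on it
rows, cols = 200, 1000
page = np.ones((rows, cols))
trace_rows = (100 - 60 * np.sin(np.linspace(0, 6 * np.pi, cols))).astype(int)
page[trace_rows, np.arange(cols)] = 0.0
signal = digitize_trace(page)
print(f"recovered {np.isfinite(signal).sum()} of {cols} samples")
```

One sample per column means the sampling rate of the recovered signal is fixed by the scan resolution and the chart's paper speed, which is why grid detection (step 1) is needed to calibrate the time and voltage axes.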
- Published
- 2013
46. Data Mining Algorithm Based on Fusion Computer Artificial Intelligence Technology.
- Author
-
Yingqian Bai, Kepeng Bao, and Tao Xu
- Subjects
ARTIFICIAL intelligence ,DATA mining ,ALGORITHMS ,DISTRIBUTED databases ,ENTROPY (Information theory) - Abstract
INTRODUCTION: The paper constructs a massive data mining model of distributed spatiotemporal databases for the Internet of Things, and proposes a homologous data fusion method based on information entropy. The storage space required by the tree structure is reduced by constructing a data schema tree for the merged data set. Secondly, the optimal dynamic support degree is obtained using a neural network and a genetic algorithm, and frequent items in the Internet of Things data are mined to normalize the clustered feature data against the threshold value. Experiments show that the F-measure of the data mining algorithm improves efficiency by 15.64% and 18.25%, respectively, compared with two other approaches from the literature, and that RI increases by 21.17% and 26.07%, respectively. [ABSTRACT FROM AUTHOR]
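One standard way to realize "fusion based on information entropy" is the classical entropy weight method, sketched below in Python: indicators whose values vary more across records (lower entropy) are judged more informative and receive larger fusion weights. The IoT records are toy values, and this is only one plausible reading of the abstract's method, not the paper's full pipeline.

```python
# A minimal entropy-weight sketch: derive per-indicator weights from
# Shannon entropy, then fuse each record into a single weighted score.
import numpy as np

def entropy_weight(X):
    # X: records x indicators, all values positive
    p = X / X.sum(axis=0, keepdims=True)                       # column distributions
    e = -(p * np.log(p + 1e-12)).sum(axis=0) / np.log(len(X))  # entropy in [0, 1]
    d = 1.0 - e                                                # degree of diversification
    return d / d.sum()                                         # normalized weights

X = np.array([[0.9, 0.30, 0.71],
              [0.8, 0.29, 0.24],
              [0.7, 0.31, 0.55],
              [0.6, 0.30, 0.12]])
w = entropy_weight(X)
fused = X @ w  # entropy-weighted fused score per record
print("indicator weights:", w.round(3))
print("fused scores:", fused.round(3))
```

Note that the nearly constant second indicator receives almost no weight: it carries little discriminating information, which is exactly the behaviour entropy-based fusion is meant to capture.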
- Published
- 2023
- Full Text
- View/download PDF
47. Detecting sarcasm in customer tweets: an NLP based approach
- Author
-
Mukherjee, Shubhadeep and Bala, Pradip Kumar
- Published
- 2017
- Full Text
- View/download PDF
48. Best papers from the Fifth International Conference on Advanced Data Mining and Applications (ADMA 2009)
- Author
-
Xue Li, Qiang Yang, Jian Pei, João Gama, and Ronghuai Huang
- Subjects
Human-Computer Interaction, Information retrieval, Artificial Intelligence, Hardware and Architecture, Computer science, Data mining, computer.software_genre, computer, Data science, Software, Information Systems - Published
- 2011
49. Editorial: Special Issue on Data Mining, Machine Learning and Decision Support Systems in Health Care.
- Author
-
Valls, Aida, Alsinet, Teresa, and Moreno, Antonio
- Subjects
DECISION support systems ,MACHINE learning ,MEDICAL care ,DATA mining ,ARTIFICIAL intelligence ,DEEP learning - Published
- 2023
- Full Text
- View/download PDF
50. Representing and querying now-relative relational medical data.
- Author
-
Anselma L, Piovesan L, Stantic B, and Terenziani P
- Subjects
- Databases, Factual, Humans, Practice Guidelines as Topic, Reaction Time, Time Factors, Artificial Intelligence, Data Mining methods, Decision Support Systems, Clinical, Decision Support Techniques, Electronic Health Records, Medical Informatics methods
- Abstract
Temporal information plays a crucial role in medicine. Patients' clinical records are intrinsically temporal. Thus, in Medical Informatics there is an increasing need to store, support and query temporal data (particularly in relational databases), in order, for instance, to supplement decision-support systems. In this paper, we show that current approaches to relational data have remarkable limitations in the treatment of "now-relative" data (i.e., data holding true at the current time). This can severely compromise their applicability in general, and specifically in the medical context, where "now-relative" data are essential to assess the current status of the patients. We propose a theoretically grounded and application-independent relational approach to cope with now-relative data (which can be paired, e.g., with different decision support systems) overcoming such limitations. We propose a new temporal relational representation, which is the first relational model coping with the temporal indeterminacy intrinsic in now-relative data. We also propose new temporal algebraic operators to query them, supporting the distinction between possible and necessary time, and Allen's temporal relations between data. We exemplify the impact of our approach, and study the theoretical and computational properties of the new representation and algebra., (Copyright © 2018 Elsevier B.V. All rights reserved.)
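As a concrete illustration of the representation problem the abstract raises, here is a minimal Python/SQLite sketch in which a NULL valid-time end encodes a now-relative tuple ("holds until the current time"), and a query for the patient's current status evaluates that end against the moving current date. The schema and data are hypothetical, and the paper's model is considerably richer (temporal indeterminacy, possible vs. necessary time, Allen's relations).

```python
# A minimal sketch of now-relative valid-time tuples in a relational store:
# valid_to = NULL means "true from valid_from until now". Schema is assumed.
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE therapy (
    patient    TEXT,
    drug       TEXT,
    valid_from TEXT NOT NULL,
    valid_to   TEXT              -- NULL encodes a now-relative end
)""")
conn.executemany("INSERT INTO therapy VALUES (?,?,?,?)",
    [("p1", "heparin",  "2018-01-10", "2018-02-01"),
     ("p1", "warfarin", "2018-02-01", None),        # still ongoing
     ("p2", "aspirin",  "2018-03-05", None)])

# facts holding true at the current time: the interval has started, and it
# is either open-ended (now-relative) or has not yet closed
today = date.today().isoformat()
rows = conn.execute("""SELECT patient, drug FROM therapy
                       WHERE valid_from <= ?
                         AND (valid_to IS NULL OR ? < valid_to)""",
                    (today, today)).fetchall()
print(rows)
```

The key point the paper makes is that this naive NULL encoding leaves the semantics to each query writer; its proposed model instead gives now-relative data a principled representation with algebraic operators whose answers change correctly as the current time advances.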
- Published
- 2018
- Full Text
- View/download PDF