42 results for "Friedhelm Schwenker"
Search Results
2. Co-creative Drawing with One-Shot Generative Models
- Author
- Friedhelm Schwenker and Sabine Wieluch
- Subjects
One-shot learning, Artificial neural network, Computer science, Machine learning, Artificial intelligence, Generative models, Transformer (machine learning model)
- Abstract
This paper presents and evaluates co-creative drawing scenarios in which a user provides a small hand-drawn pattern that is then interactively extended with the support of a trained neural model. We show that one-shot-trained Transformer neural networks can generate stroke-based images and that these trained models can successfully be used for design-assistance tasks.
- Published
- 2021
3. Two to Trust: AutoML for Safe Modelling and Interpretable Deep Learning for Robustness
- Author
- Mohammadreza Amirian, Lukas Tuggener, Ricardo Chavarriaga, Yvan Putra Satyawan, Frank-Peter Schilling, Friedhelm Schwenker, and Thilo Stadelmann
- Subjects
Automated deep learning, Adversarial attacks, AutoDL, 005: Computer programming, programs and data, 006: Special computer methods
- Abstract
With great power comes great responsibility. The success of machine learning, especially deep learning, in research and practice has attracted a great deal of interest, which in turn necessitates increased trust. Sources of mistrust include matters of model genesis ("Is this really the appropriate model?") and interpretability ("Why did the model come to this conclusion?", "Is the model safe from being easily fooled by adversaries?"). In this paper, two partners for the trustworthiness tango are presented: recent advances and ideas, as well as practical applications in industry, in (a) automated machine learning (AutoML), a powerful tool to optimize deep neural network architectures and fine-tune hyperparameters, which promises to build models in a safer and more comprehensive way; and (b) interpretability of neural network outputs, which addresses the vital question regarding the reasoning behind model predictions and provides insights to improve robustness against adversarial attacks.
- Published
- 2021
4. Fuzzy-Based Pseudo Segmentation Approach for Handwritten Word Recognition Using a Sequence to Sequence Model with Attention
- Author
- Samir Malakar, Friedhelm Schwenker, Ram Sarkar, and Rajdeep Bhattacharya
- Subjects
Sequence to sequence models, Computer science, Pattern recognition, Convolutional neural network, Handwriting recognition, Word recognition, Feature (machine learning), Segmentation, Artificial intelligence, Encoder
- Abstract
Sequence to sequence models have shown significant progress in the field of handwriting recognition. The recent trend is to feed these models from a convolutional neural network (CNN) that acts as a generic feature extractor for handwritten text images. The input to the CNN is usually either a sequence of patches extracted from the text image or the output of a segmentation algorithm that breaks the text image into individual characters. However, patching conveys little information about character boundaries in the image, and segmentation-based approaches often suffer from over- and under-segmentation. To this end, we propose a fuzzy-based pseudo segmentation approach for handwritten word recognition using a sequence to sequence model. We use a fuzzy triangular function that generates column-wise weights based on the distance of the nearest data pixel from the top of the text image, so probable segmentation regions are assigned higher weights than other regions. These weights are superimposed on the original text image, and the modified input is fed patch-wise to the CNN. The extracted features are encoded by the first part of a sequence to sequence model and then decoded to obtain the sequence of characters in the input. An attention mechanism ensures that the decoder focuses on the appropriate section of the encoder's output while generating each character. Our method is tested on the IAM word database after skew and slant correction of the words. The model architecture is optimized through exhaustive numerical simulations on the IAM database and shows promising results.
- Published
- 2021
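The fuzzy column-weighting idea in entry 4 can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the function name, the binary-image input, and the choice of peak distance are assumptions.

```python
import numpy as np

def column_weights(img, peak=None):
    """Toy sketch of fuzzy column weighting for pseudo segmentation.
    img: 2-D binary array, 1 = ink pixel, 0 = background.
    Each column gets a triangular-membership weight that grows with the
    distance from the top row to the nearest ink pixel, so likely
    inter-character gaps receive the highest weights."""
    h, w = img.shape
    peak = h if peak is None else peak   # distance of full membership (assumed)
    dists = np.full(w, float(h))         # ink-free columns get the maximum
    for c in range(w):
        rows = np.nonzero(img[:, c])[0]
        if rows.size:
            dists[c] = rows[0]
    return np.clip(dists / peak, 0.0, 1.0)

# two toy "characters" separated by an empty gap in columns 2-3
word = np.zeros((4, 6), dtype=int)
word[1:, 0:2] = 1
word[2:, 4:6] = 1
weights = column_weights(word)
```

The gap columns carry the largest weight, which is the cue the paper superimposes on the image before patch-wise CNN feature extraction.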
5. Personalized k-fold Cross-Validation Analysis with Transfer from Phasic to Tonic Pain Recognition on X-ITE Pain Database
- Author
- Steffen Walter, Ziad Yasser, Friedhelm Schwenker, Yara Samaha, and Youssef Wally
- Subjects
Artificial neural network, Database, Computer science, Pain management, Cross-validation, Random forest, Tonic (physiology), Binary classification, Affective computing
- Abstract
Automatic pain recognition is currently one of the most interesting challenges in affective computing, as it has great potential to improve pain management for patients in a clinical environment. In this work, we analyse automatic pain recognition using binary classification with personalized k-fold cross-validation analysis, an approach that trains on the labels of specific subjects and validates on the labels of other subjects, on the Experimentally Induced Thermal and Electrical (X-ITE) Pain Database, using both a random forest and a dense neural network model. The effectiveness of each approach is inspected on the phasic electro, phasic heat, tonic electro, and tonic heat subsets individually. Afterward, phasic-to-tonic transfer was performed by training models on the phasic electro dataset and testing them on the tonic electro dataset. Our outcomes and evaluations indicate that the electro datasets always perform better than the heat datasets, and that personalized scores outperform normal scores. Moreover, dense neural networks performed better than random forests in the transfer from phasic electro to tonic electro and showed promising performance in the personalized transfer.
- Published
- 2021
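The "personalized" protocol in entry 5 hinges on splitting by subject rather than by sample. A minimal sketch of such a subject-wise split, assuming a simple round-robin assignment of subjects to folds (the paper's exact fold construction is not specified here):

```python
def subject_folds(subject_ids, k):
    """Yield (train_idx, val_idx) pairs where each fold holds out whole
    subjects, so validation samples always come from people unseen
    during training."""
    subjects = sorted(set(subject_ids))
    groups = [set(subjects[i::k]) for i in range(k)]
    for held_out in groups:
        train = [i for i, s in enumerate(subject_ids) if s not in held_out]
        val = [i for i, s in enumerate(subject_ids) if s in held_out]
        yield train, val

ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]
splits = list(subject_folds(ids, k=2))
```

A random forest or dense network would then be fit on `train` and scored on `val`; the phasic-to-tonic transfer replaces the validation fold with a different dataset altogether.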
6. Introducing Bidirectional Ordinal Classifier Cascades Based on a Pain Intensity Recognition Scenario
- Author
- Peter Bellmann, Friedhelm Schwenker, Ludwig Lausser, and Hans A. Kestler
- Subjects
Computer science, Classifier, Error correcting output codes, Mean absolute error, Artificial intelligence, Machine learning, Pain intensity
- Abstract
Ordinal classifier cascades (OCCs) are popular machine learning tools in the area of ordinal classification. OCCs constitute specific classification ensemble schemes that work in a sequential manner: each ensemble member either provides the architecture's final prediction or passes the current input to the next member. In the current study, we first confirm that the direction of an OCC can have a high impact on the distribution of its predictions. Subsequently, we introduce and analyse our proposed bidirectional combination of OCCs. More precisely, based on a person-independent pain intensity scenario, we provide an ablation study that evaluates different OCCs as well as several popular error correcting output codes (ECOC) models. The outcomes show that our straightforward approach significantly outperforms common OCCs with respect to the accuracy and mean absolute error performance measures. Moreover, our results indicate that, while the proposed bidirectional OCCs are less complex in general, they can compete with and even outperform most of the analysed ECOC models.
- Published
- 2021
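The cascade mechanism described in entry 6, where each member either emits the final prediction or passes the input on, can be sketched with plain threshold deciders. The thresholds and the agreement-flag combination are illustrative assumptions, not the paper's construction:

```python
def occ_predict(x, deciders, classes):
    """One-directional ordinal classifier cascade: the i-th decider either
    claims the input for classes[i] or defers to the next member; the
    last class is the fallback."""
    for decide, label in zip(deciders, classes[:-1]):
        if decide(x):
            return label
    return classes[-1]

classes = ["low", "medium", "high"]               # ordinal class structure
asc = [lambda x: x < 1.0, lambda x: x < 2.0]      # ascending direction
desc = [lambda x: x >= 2.0, lambda x: x >= 1.0]   # descending direction

def bidirectional(x):
    """Run both directions and report the ascending prediction plus an
    agreement flag (a naive stand-in for the paper's combination)."""
    a = occ_predict(x, asc, classes)
    d = occ_predict(x, desc, classes[::-1])
    return a, a == d
```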
7. Experimental Analysis of Bidirectional Pairwise Ordinal Classifier Cascades
- Author
- Peter Bellmann, Hans A. Kestler, Friedhelm Schwenker, and Ludwig Lausser
- Subjects
Sequence, Computer science, Machine learning, Support vector machine, Data set, Classifier, Pairwise comparison, Artificial intelligence
- Abstract
Ordinal classifier cascades (OCCs) are basic machine learning tools in the field of ordinal classification (OC) that consist of a sequence of classification models (CMs). Each CM is trained on a specific subtask of the initial OC task. OCC architectures make use of a data set's ordinal class structure by simply arranging the CMs with respect to the corresponding class order (e.g., small - medium - large). Recently, we proposed bidirectional OCC (bOCC) architectures that combine two basic one-directional OCCs, based on a person-independent pain intensity recognition scenario, in combination with support vector machines. In the current study, we further analyse the effectiveness of bOCC architectures. To this end, we evaluate our proposed approach on different OC benchmark data sets. Additionally, we analyse the proposed bOCCs in combination with two different classification models. Our outcomes indicate that it seems beneficial, in general, to replace basic pairwise one-directional OCCs with the pairwise bOCC architecture.
- Published
- 2021
8. Feature Extraction: A Time Window Analysis Based on the X-ITE Pain Database
- Author
- Tobias Ricken, Steffen Walter, Peter Bellmann, Adrian Steinert, and Friedhelm Schwenker
- Subjects
Database, Computer science, Feature extraction, Time windows, Signal, Random forest, Binary classification
- Abstract
In this work, we analyse different temporal feature extraction window approaches in combination with short-time heat and electrical pain stimuli. We focus on the physiological signals of the Experimentally Induced Thermal and Electrical (X-ITE) Pain Database. Each of our proposed approaches is evaluated with leave-one-subject-out cross-validation using the random forest method. Moreover, the effectiveness of each physiological signal is inspected separately as well as with a feature fusion approach, for different binary and four-class classification tasks. Our outcomes indicate that a shifted temporal feature extraction window significantly increases classification performance when pain is induced by thermal stimuli. The outcomes differ significantly, however, when participants are exposed to electrical pain stimuli: for short-time electrical stimuli, the best results are obtained without temporal shifts of the feature extraction windows.
- Published
- 2020
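The shifted-window comparison in entry 8 amounts to extracting the same statistics from windows placed at different offsets after stimulus onset. A sketch with an assumed three-statistic feature set and a synthetic delayed response:

```python
import numpy as np

def window_features(signal, onset, length, shift):
    """Extract simple statistics from a window of `length` samples placed
    `shift` samples after stimulus `onset` (toy sketch of the shifted
    feature-extraction window; feature set and naming are assumptions)."""
    seg = signal[onset + shift : onset + shift + length]
    return {"mean": float(np.mean(seg)),
            "std": float(np.std(seg)),
            "range": float(np.max(seg) - np.min(seg))}

# flat baseline followed by a delayed physiological-looking ramp
sig = np.concatenate([np.zeros(10), np.linspace(0, 1, 10)])
at_onset = window_features(sig, onset=8, length=5, shift=0)  # at the stimulus
shifted = window_features(sig, onset=8, length=5, shift=5)   # into the response
```

With a delayed response, the shifted window captures more of the signal change, which is the effect the paper reports for thermal stimuli.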
9. Pain Intensity Recognition - An Analysis of Short-Time Sequences in a Real-World Scenario
- Author
- Peter Bellmann, Patrick Thiam, and Friedhelm Schwenker
- Subjects
Hospital setting, Computer science, Decision tree, Heat pain, Pain detection, Machine learning, Upper and lower bounds, Pain intensity, Artificial intelligence
- Abstract
Pain intensity recognition still constitutes a challenging classification task. In this work, we focus on the physiological signals of the publicly available BioVid Heat Pain Database, which was collected at Ulm University and consists of recordings of healthy test subjects exposed to various short-time heat stimuli. The results reported in the literature, which are based on those short-time sequences, do not justify the implementation of automated pain detection systems, due to unsatisfactory accuracy rates. In the current study, we show that the outcomes stated in the literature most likely represent lower bound estimations. For this purpose, we transfer the classification task provided by the BioVid Heat Pain Database to a real-world scenario. More precisely, according to an expected hospital setting, we analyse the automated pain intensity recognition approach in combination with different sets of short-time sequences. Our outcomes indicate that in real-world applications, where the detection of pain intensity is based on more than one single short-time sequence, the accuracy values can be significantly improved. The classification performance of bagged decision tree ensembles is evaluated in a person-independent scenario.
- Published
- 2020
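Entry 9's main point, that deciding on several short-time sequences beats deciding on one, is the usual accuracy gain from fusing repeated noisy predictions. A simulation sketch with an assumed 70%-accurate base classifier:

```python
import random

def majority(labels):
    """Fuse per-sequence predictions by majority vote (a sketch of the
    idea; the paper's exact fusion scheme may differ)."""
    return max(set(labels), key=labels.count)

random.seed(0)

def noisy_classifier(true_label, acc=0.7):
    """Simulated per-sequence binary classifier with fixed accuracy."""
    return true_label if random.random() < acc else 1 - true_label

# accuracy from one sequence vs. a majority vote over five sequences
single = sum(noisy_classifier(1) == 1 for _ in range(1000)) / 1000
fused = sum(majority([noisy_classifier(1) for _ in range(5)]) == 1
            for _ in range(1000)) / 1000
```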
10. Using Mask R-CNN for Image-Based Wear Classification of Solid Carbide Milling and Drilling Tools
- Author
- Friedhelm Schwenker, Jasmin Dalferth, and Sven Winkelmann
- Subjects
Image classification, Computer science, Pattern recognition, Image segmentation, Multiclass classification, Feature (computer vision), Bounding box, Feature pyramid, Artificial intelligence, Tool wear
- Abstract
In order to ensure high productivity and quality in industrial production, early identification of tool wear is needed. Within the context of Industry 4.0, we integrate wear monitoring of solid carbide milling and drilling cutters automatically into the production process. To this end, we propose to analyze wear types with image instance segmentation using Mask R-CNN with a feature pyramid and bounding box regression. Our approach recognizes the five most important wear types: flank wear, crater wear, fracture, built-up edge, and plastic deformation. While other methods use image classification and assign only one wear type to each image, our model can detect multiple wear types. Over 35 models with different hyperparameter settings were trained on 5,000 labeled images to establish a reliable classifier. The results show up to 82.03% accuracy and a benefit for overlapping wear types, which is crucial for using the model in production.
- Published
- 2020
11. Detection of Bat Acoustics Signals Using Voice Activity Detection Techniques with Random Forests Classification
- Author
- Adrián Tonatiuh Ruiz, Santiago Martínez Balvanera, Everardo Robredo, Julian Equihua, Friedhelm Schwenker, and Günther Palm
- Subjects
Feature engineering, Voice activity detection, Receiver operating characteristic, Computer science, Deep learning, Detector, Pattern recognition, Statistical model, Convolutional neural network, Random forest, Artificial intelligence
- Abstract
Bats are indicators for ecosystem health, and therefore the determination of bat activity and species abundance provides essential information for biodiversity research and conservation monitoring. In this study, we propose a computational method for the detection of bat echolocation calls. The method uses feature engineering and consists of a statistical model-based Voice Activity Detector combined with a Random Forests classifier (VAD+RF). Using an open-access library (www.batdetective.org), we trained and tested our method and compared it to other existing detection methods, including a detector based on deep neural networks as well as commercial detection systems. To visualize detector performance over the full range of possible class distributions and misclassification costs, we calculated Cost Curves and \(F_1\)-measure curves. Results show that the detection power of VAD+RF is comparable to methods based on deep learning. Based on the results, we give recommendations to improve future designs of bat call detectors.
- Published
- 2019
12. Filter Method Ensemble with Neural Networks
- Author
- Agneet Chatterjee, Rajonya De, Friedhelm Schwenker, Anuran Chakraborty, and Ram Sarkar
- Subjects
Ensemble methods, Artificial neural network, Computer science, Pattern recognition, Feature selection, Mutual information, Perceptron, Artificial intelligence, Classifier
- Abstract
The main concept behind designing a multiple classifier system is to combine a number of classifiers such that the resulting system outperforms the individual classifiers by pooling together their decisions. Combining relatively simple pattern recognition models with limited individual performance is common in the literature. Such an ensemble performs better when each learner is trained well and the learners follow different working principles, which adds diversity to the ensemble. In this paper, we first select three optimal subsets of features using three different filter methods, namely Mutual Information (MI), Chi-square, and ANOVA F-test. With the selected features, we build three learning models using a Multi-layer Perceptron (MLP) based classifier. The class membership values provided by these three classifiers for each sample are concatenated and fed to a final MLP based classifier. Experiments on five UCI Machine Learning Repository datasets, namely Arrhythmia, Ionosphere, Hill-Valley, Waveform, and Horse Colic, show the effectiveness of the proposed ensemble model.
- Published
- 2019
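The pipeline in entry 12 (three filter selections, three MLPs, concatenated class memberships, final MLP) maps directly onto scikit-learn. A sketch on synthetic stand-in data, scored on the training set purely for illustration; the feature counts and network sizes are assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2, f_classif, mutual_info_classif
from sklearn.neural_network import MLPClassifier

# toy stand-in data (chi2 requires non-negative features)
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X = X - X.min()

# stage 1: one filter method + one MLP per feature subset
stage1 = []
for score in (mutual_info_classif, chi2, f_classif):
    sel = SelectKBest(score, k=8).fit(X, y)
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                        random_state=0).fit(sel.transform(X), y)
    stage1.append((sel, clf))

# stage 2: concatenate the class-membership outputs, feed a final MLP
Z = np.hstack([clf.predict_proba(sel.transform(X)) for sel, clf in stage1])
meta = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500,
                     random_state=0).fit(Z, y)
acc = meta.score(Z, y)
```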
13. Deep Learning Algorithms for Emotion Recognition on Low Power Single Board Computers
- Author
- Friedhelm Schwenker, Venkatesh Srinivasan, and Sascha Meudt
- Subjects
Artificial neural network, Computer science, Human-computer interaction, Deep learning, Deep neural networks, Inference, Artificial intelligence, Emotion recognition, Computing systems
- Abstract
In human-computer interaction, a computer should have the ability to communicate with humans. One of the communication skills a computer requires is recognizing the emotional state of the human. With state-of-the-art computing systems equipped with graphical processing units, a deep neural network can be trained on any publicly available dataset to learn emotion estimation in a single network. In a real-time application, inference with such a network may not require the high computational power that training does.
- Published
- 2019
14. Visualizing Facial Expression Features of Pain and Emotion Data
- Author
- Jan Sellner, Friedhelm Schwenker, and Patrick Thiam
- Subjects
Sadness, Facial expression, Amusement, Feature (machine learning), Anger, Psychology, Disgust, Cognitive psychology
- Abstract
Pain and emotions reveal important information about the state of a person and are often expressed via the face. Most of the time, systems which analyse these states consider only one type of expression. For pain, the medical context is a common scenario for automatic monitoring systems and it is not unlikely that emotions occur there as well. Hence, these systems should not confuse both types of expressions. To facilitate advances in this field, we use video data from the BioVid Heat Pain Database, extract Action Unit (AU) intensity features and conduct first analyses by creating several feature visualizations. We show that the AU usage pattern is more distinct for the pain, amusement and disgust classes than for the sadness, fear and anger classes. For the former, we present additional visualizations which reveal a clearer picture of the typically used AUs per expression by highlighting dependencies between AUs (joint usages). Finally, we show that the feature discrimination quality varies heavily across the 64 tested subjects.
- Published
- 2019
15. Evolutionary Algorithms for the Design of Neural Network Classifiers for the Classification of Pain Intensity
- Author
- Eugene Semenkin, Viktor Kessler, Iana S. Polonskaia, Alina Skorokhod, Danila Mamontov, and Friedhelm Schwenker
- Subjects
Recurrent neural network, Artificial neural network, Computer science, Genetic algorithm, Evolutionary algorithm, Feedforward neural network, Genetic programming, Pattern recognition, Artificial intelligence, Pain intensity
- Abstract
In this paper we present a study on multi-modal pain intensity recognition based on video and bio-physiological sensor data. The newly recorded SenseEmotion dataset, consisting of 40 individuals each subjected to three gradually increasing levels of painful heat stimuli, has been used for the evaluation of the proposed algorithms. We propose and evaluate evolutionary algorithms for the design and adaptation of the structure of deep artificial neural network architectures. Feedforward and recurrent neural networks were considered for optimisation using a Self-Configuring Genetic Algorithm (SelfCGA) and Self-Configuring Genetic Programming (SelfCGP).
- Published
- 2019
16. Trace and Detect Adversarial Attacks on CNNs Using Feature Response Maps
- Author
- Friedhelm Schwenker, Mohammadreza Amirian, and Thilo Stadelmann
- Subjects
Model interpretability, Computer science, Computer Vision and Pattern Recognition (cs.CV), Convolutional neural network, Adversarial attacks, Network architecture, Pattern recognition, Feature visualization, Artificial intelligence
- Abstract
The existence of adversarial attacks on convolutional neural networks (CNN) questions the fitness of such models for serious applications. The attacks manipulate an input image such that misclassification is evoked while still looking normal to a human observer -- they are thus not easily detectable. In a different context, backpropagated activations of CNN hidden layers -- "feature responses" to a given input -- have been helpful to visualize for a human "debugger" what the CNN "looks at" while computing its output. In this work, we propose a novel detection method for adversarial examples to prevent attacks. We do so by tracking adversarial perturbations in feature responses, allowing for automatic detection using average local spatial entropy. The method does not alter the original network architecture and is fully human-interpretable. Experiments confirm the validity of our approach for state-of-the-art attacks on large-scale models trained on ImageNet.
- Published
- 2018
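The detection statistic in entry 16, average local spatial entropy of a feature response map, can be sketched as follows. The window size, binning, and the synthetic clean/perturbed maps are assumptions for illustration:

```python
import numpy as np

def local_entropy(fmap, win=4, bins=8):
    """Average local spatial entropy of a 2-D feature response map.
    Adversarial perturbations tend to spread energy across the map,
    which raises this value relative to a focused, clean response."""
    h, w = fmap.shape
    ents = []
    for r in range(0, h - win + 1, win):
        for c in range(0, w - win + 1, win):
            patch = fmap[r:r + win, c:c + win].ravel()
            hist, _ = np.histogram(patch, bins=bins,
                                   range=(fmap.min(), fmap.max()))
            p = hist / hist.sum()
            p = p[p > 0]
            ents.append(-(p * np.log2(p)).sum())
    return float(np.mean(ents))

rng = np.random.default_rng(0)
clean = np.zeros((16, 16))
clean[6:10, 6:10] = 1.0                      # focused feature response
noisy = clean + 0.5 * rng.random((16, 16))   # perturbed, spread-out response
```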
17. A \(k\)-Nearest Neighbor Based Algorithm for Multi-Instance Multi-Label Active Learning
- Author
- Guenther Palm, Adrián Tonatiuh Ruiz, Friedhelm Schwenker, and Patrick Thiam
- Subjects
Data labeling, Computer science, Machine learning, Multi-instance multi-label learning, Oracle, k-nearest neighbors algorithm, Text categorization, Active learning, Artificial intelligence, Classifier
- Abstract
Multi-instance multi-label learning (MIML) is a machine learning framework in which each object is represented by multiple instances and associated with multiple labels. This relatively new approach has achieved success in various applications, particularly those involving learning from complex objects. Because of the complexity of MIML, the cost of data labeling increases drastically with the desired model performance. In this paper, we introduce a MIML active learning approach to reduce the labeling costs of MIML data without compromising model performance. Based on a query strategy, we select and request from the oracle the label set of the most informative object. Our approach is formulated in a pool-based scenario and uses Miml-\(k\)nn as the base classifier. This MIML classifier is based on the \(k\)-Nearest Neighbor algorithm and has achieved superior performance in different data domains. We propose novel query strategies and also implement previously used query strategies for MIML learning. Finally, we conduct an experimental evaluation on various benchmark datasets and demonstrate that these approaches achieve significantly better results than without active selection for all datasets on various evaluation criteria.
- Published
- 2018
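The pool-based selection step in entry 17, requesting the label set of the most informative object, can be illustrated with a margin-based uncertainty criterion. This is a generic stand-in, not one of the paper's MIML query strategies:

```python
import numpy as np

def query_most_informative(probs):
    """Pool-based query strategy sketch: pick the object whose predicted
    class memberships are least confident, i.e. with the smallest margin
    between its top two class probabilities."""
    part = np.sort(probs, axis=1)
    margin = part[:, -1] - part[:, -2]
    return int(np.argmin(margin))

# predicted class memberships for three unlabeled pool objects
pool = np.array([[0.90, 0.05, 0.05],
                 [0.40, 0.35, 0.25],
                 [0.34, 0.33, 0.33]])
pick = query_most_informative(pool)
```

The selected object would be sent to the oracle for labeling and added to the training set before the next query round.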
18. Selecting Features from Foreign Classes
- Author
- Ludwig Lausser, Friedhelm Schwenker, Robin Szekely, Hans A. Kestler, and Viktor Kessler
- Subjects
Computer science, Supervised learning, Feature selection, Pattern recognition, Decision rule, Feature (machine learning), Artificial intelligence, Transfer learning
- Abstract
Supervised learning algorithms restrict the training of classification models to the classes of interest. Other related classes are typically neglected in this process and are not involved in the final decision rule. Nevertheless, the analysis of these foreign samples and their labels might provide additional information on the classes of interest. Revealing common patterns in foreign classification tasks might lead to the identification of structures suitable for the original classes; this principle is used in the field of transfer learning. In this work, we investigate the use of foreign classes for the feature selection process of binary classifiers. While the final classification model is trained according to the traditional supervised learning scheme, its feature signature is designed for separating a pair of foreign classes. We systematically analyse these classifiers in \(10 \times 10\) cross-validation experiments on microarray datasets with multiple diagnostic classes. For each evaluated classification model, we observed foreign feature combinations that outperformed at least 90% of the feature sets designed for the original diagnostic classes on at least 88.9% of all datasets.
- Published
- 2018
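Entry 18's scheme, designing the feature signature on a foreign class pair while training the final model on the classes of interest, can be sketched with Iris as a stand-in for the microarray data. The class roles are assumed purely for illustration:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

# treat versicolor-vs-virginica as the "foreign" pair and
# setosa-vs-versicolor as the binary classes of interest (assumed setup)
X, y = load_iris(return_X_y=True)
foreign = np.isin(y, [1, 2])
target = np.isin(y, [0, 1])

# feature signature designed on the foreign task ...
sel = SelectKBest(f_classif, k=2).fit(X[foreign], y[foreign])
# ... final model trained on the original classes with that signature
clf = LogisticRegression(max_iter=1000).fit(sel.transform(X[target]), y[target])
acc = clf.score(sel.transform(X[target]), y[target])
```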
19. Classification of Mammograms Using Convolutional Neural Network Based Feature Extraction
- Author
- Achim Ibenthal, Taye Girma Debelee, Günther Palm, Mohammadreza Amirian, and Friedhelm Schwenker
- Subjects
Computer science, Screening mammography, Feature extraction, Image processing, Pattern recognition, Convolutional neural network, Breast cancer, Principal component analysis, Mammography, Artificial intelligence
- Abstract
Breast cancer is one of the most common cancers among women worldwide and the second most common cause of cancer death after lung cancer. Automatic breast cancer detection and classification might enhance patients' survival rate by enabling earlier treatment. In this paper, a convolutional neural network (CNN) based feature extraction method is proposed. The feature dimensionality is reduced using Principal Component Analysis (PCA), and the reduced features are given to a K-Nearest Neighbors (KNN) classifier to label mammograms as normal or abnormal using 10-fold cross-validation. The experimental results of the proposed approach on the Mammography Image Analysis Society (MIAS) and Digital Database for Screening Mammography (DDSM) datasets were found to be promising compared to previous studies in the areas of image processing, artificial intelligence, and CNNs, with accuracies of 98.75\(\%\) and 98.90\(\%\) on the MIAS and DDSM datasets respectively.
- Published
- 2018
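The classification stage of entry 19, dimensionality reduction with PCA followed by KNN under 10-fold cross-validation, is easy to sketch. Scikit-learn's digits are used as a stand-in for the CNN-extracted mammogram features; the component and neighbor counts are assumptions:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# digits standing in for CNN feature vectors of mammogram patches
X, y = load_digits(return_X_y=True)

# PCA for dimensionality reduction, then a KNN classifier, 10-fold CV
pipe = make_pipeline(PCA(n_components=20), KNeighborsClassifier(n_neighbors=5))
scores = cross_val_score(pipe, X, y, cv=10)
mean_acc = scores.mean()
```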
20. Multi-classifier-Systems: Architectures, Algorithms and Applications
- Author
- Peter Bellmann, Patrick Thiam, and Friedhelm Schwenker
- Subjects
Boosting (machine learning), Data stream mining, Computer science, Machine learning, Random forest, Decision fusion, Artificial intelligence, Classifier
- Abstract
In this work, multi-classifier systems (MCS) are discussed. Several fixed and trainable aggregation rules are presented, and the most famous examples of MCS, namely bagging and boosting, are explained. Diversity between the base classifiers is crucial for building accurate MCS; several criteria to measure diversity in MCS are defined, and a motivation for diversity measures based on the base classifiers' outputs is given. A case study on pain intensity estimation based on physiological data streams is conducted, in which different MCS and fusion approaches are evaluated. The case study uses two data sets, with four and five pain levels respectively, induced in the test subjects under strictly controlled conditions. The aim of the case study is to implement an automatic pain intensity estimation system and analyse its effectiveness.
- Published
- 2018
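The fixed aggregation rules mentioned in entry 20 can each be stated in a few lines. A sketch of three common rules; the toy probability matrix shows that the rules need not agree:

```python
import numpy as np

def mean_rule(probs):
    """Average the class-membership estimates, then pick the best class.
    probs: array of shape (n_classifiers, n_classes)."""
    return int(np.argmax(probs.mean(axis=0)))

def product_rule(probs):
    """Multiply the class-membership estimates, then pick the best class."""
    return int(np.argmax(probs.prod(axis=0)))

def majority_vote(probs):
    """Each classifier casts a hard vote; the most frequent class wins."""
    votes = probs.argmax(axis=1)
    return int(np.bincount(votes).argmax())

# three base classifiers' class-membership estimates for one sample
P = np.array([[0.60, 0.40],
              [0.55, 0.45],
              [0.20, 0.80]])
```

Here two weakly-confident votes for class 0 are overruled by one confident vote for class 1 under the mean and product rules, while the majority vote still picks class 0 -- a small example of why the choice of fusion rule matters.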
21. Multimodal Affect Recognition in the Context of Human-Computer Interaction for Companion-Systems
- Author
- Michael Glodek, Martin Schels, Patrick Thiam, Ingo Siegert, Markus Kächele, Sascha Meudt, Miriam Schmidt-Wack, Gerald Krell, Ronald Böck, Andreas Wendemuth, and Friedhelm Schwenker
- Subjects
Facial expression ,Modalities ,Computer science ,05 social sciences ,Technical systems ,Context (language use) ,02 engineering and technology ,Human–computer interaction ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,0501 psychology and cognitive sciences ,Emotion recognition ,Multiple modalities ,Affect (linguistics) ,050107 human factors ,Gesture - Abstract
In general, humans interact with each other using multiple modalities; the main channels are speech, facial expressions, and gestures. Bio-physiological data such as biopotentials can also convey valuable information that helps interpret the communication. A Companion-System can use these modalities to achieve efficient human-computer interaction (HCI). To do so, the multiple sources need to be analyzed and combined in technical systems. However, so far only few studies have been published on the fusion of three or more such modalities. This chapter addresses the necessary processing steps in the development of a multimodal system applying fusion approaches.
- Published
- 2017
22. Audio-Visual Recognition of Pain Intensity
- Author
-
Viktor Kessler, Patrick Thiam, Guenther Palm, Friedhelm Schwenker, and Steffen Walter
- Subjects
Modality (human–computer interaction) ,Discriminative model ,Computer science ,Speech recognition ,Audio visual ,Recognition system ,Feature set ,Intensity (physics) ,Communication channel ,Random forest - Abstract
In this work, a multi-modal pain intensity recognition system based on both audio and video channels is presented. The system is assessed on a newly recorded dataset consisting of several individuals, each subjected to three gradually increasing levels of painful heat stimuli under controlled conditions. The assessment consists of extracting a multitude of features from each modality, followed by an evaluation of the discriminative power of each extracted feature set. Finally, several fusion architectures, involving early and late fusion, are assessed. The temporal availability of the audio channel is taken into consideration during the assessment of the fusion architectures.
- Published
- 2017
23. Fusion Architectures for Multimodal Cognitive Load Recognition
- Author
-
Sascha Meudt, Daniel Kindsvater, and Friedhelm Schwenker
- Subjects
Relation (database) ,Markov chain ,Computer science ,05 social sciences ,020207 software engineering ,02 engineering and technology ,Kalman filter ,Term (time) ,Task (project management) ,Human–computer interaction ,0202 electrical engineering, electronic engineering, information engineering ,0501 psychology and cognitive sciences ,State (computer science) ,050107 human factors ,Cognitive load ,Gesture - Abstract
Knowledge about the user's emotional state is important to achieve human-like, natural Human Computer Interaction (HCI) in modern technical systems. Humans rely on implicit signals such as body gestures and posture, vocal changes (e.g. pitch) and facial expressions when communicating. We investigate the relation between these signals and human emotion, specifically when completing easy or difficult tasks. Additionally, we include physiological data, which also reflect changes in cognitive load. We focus on discriminating between mental overload and mental underload, which can be useful, for example, in an e-tutoring system. Mental underload is a new term describing the state a person is in when completing a dull or boring task. It is shown how to select suitable features and build unimodal classifiers, which are then combined into a multimodal mental load estimate using Markov Fusion Networks (MFN) and Kalman Filter Fusion (KFF).
- Published
- 2017
24. Multi-modal Information Processing in Companion-Systems: A Ticket Purchase System
- Author
-
Günther Palm, Ingo Siegert, Sascha Meudt, Felix Schüssel, Klaus Dietmayer, Heiko Neumann, Gerald Krell, Andreas Wendemuth, Thilo Hörnle, Ayoub Al-Hamadi, Stephan Reuter, Georg Layher, Miriam Schmidt, Sebastian Handrich, and Friedhelm Schwenker
- Subjects
Computer science ,Human–computer interaction ,Component (UML) ,Ticket ,Information processing ,Context (language use) ,State (computer science) ,Dialog box ,Purchasing ,Gesture - Abstract
We demonstrate a successful multimodal dynamic human-computer interaction (HCI) in which the system adapts to the current situation and the user's state, using the scenario of purchasing a train ticket. This scenario demonstrates that Companion-Systems face the challenge of analyzing and interpreting explicit and implicit observations obtained from sensors under changing environmental conditions. In a dedicated experimental setup, a wide range of sensors was used to capture the situative context and the user, comprising video and audio capturing devices, laser scanners, a touch screen, and a depth sensor. Explicit signals describe a user's direct interaction with the system, such as interaction gestures, speech and touch input. Implicit signals are not directly addressed to the system; they comprise the user's situative context, gestures, speech, body pose, facial expressions and prosody. Both the multimodally fused explicit signals and the interpreted information from implicit signals steer the application component, which was kept deliberately robust. The application offers stepwise dialogs gathering the most relevant information for purchasing a train ticket, where the dialog steps are sensitive and adaptable during processing to the interpreted signals and data. We further highlight the system's potential for a fast-track ticket purchase when several pieces of information indicate a hurried user.
- Published
- 2017
25. Bimodal Recognition of Cognitive Load Based on Speech and Physiological Changes
- Author
-
Friedhelm Schwenker, Sascha Meudt, and Dennis Held
- Subjects
Speech recognition ,Emotional intelligence ,020207 software engineering ,02 engineering and technology ,Interpersonal communication ,Identification (information) ,Action (philosophy) ,Classification result ,Component (UML) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Psychology ,Cognitive load ,Cognitive psychology - Abstract
An essential component of interaction between humans is reacting, through emotional intelligence, to the emotional state of the counterpart and responding appropriately; this kind of behavior results in successful interpersonal communication. The first step towards achieving this goal within HCI is the identification of these emotional states.
- Published
- 2017
26. Emotion Recognition from Speech
- Author
-
Friedhelm Schwenker, Günther Palm, Andreas Wendemuth, Ingo Siegert, Ronald Böck, and Bogdan Vlasenko
- Subjects
Computer science ,business.industry ,Process (engineering) ,media_common.quotation_subject ,Emotion classification ,02 engineering and technology ,computer.software_genre ,Variety (linguistics) ,020303 mechanical engineering & transports ,0203 mechanical engineering ,0202 electrical engineering, electronic engineering, information engineering ,Natural (music) ,020201 artificial intelligence & image processing ,Quality (business) ,Artificial intelligence ,business ,Control (linguistics) ,Cluster analysis ,computer ,Natural language processing ,Spoken language ,media_common - Abstract
Spoken language is one of the main interaction patterns in human-human as well as in natural, companion-like human-machine interaction. Speech conveys content, but also emotions and interaction patterns that determine the nature and quality of the user's relationship to his counterpart. Hence, we consider emotion recognition from speech in the wider sense of its application in Companion-Systems. This requires a dedicated annotation process to label emotions and to describe their temporal evolution in view of a proper regulation and control of a system's reaction. This problem is peculiar to naturalistic interactions, where the emotional labels are no longer given a priori. This calls for generating and measuring a reliable ground truth, where the measurement is closely related to the usage of appropriate emotional features and classification techniques. Further, acted and naturalistic spoken data have to be available in operational form (corpora) for the development of emotion classification; we address the difficulties arising from the variety of these data sources. Speaker clustering and speaker adaptation will also improve the emotional modeling. Additionally, a combination of acoustic affective evaluation and the interpretation of non-verbal interaction patterns will lead to a better understanding of and reaction to user-specific emotional behavior.
- Published
- 2017
27. Active Multi-Instance Multi-Label Learning
- Author
-
Friedhelm Schwenker and Robert Retz
- Subjects
Training set ,business.industry ,Computer science ,Feature vector ,Machine learning ,computer.software_genre ,Multi instance multi label ,Annotation ,ComputingMethodologies_PATTERNRECOGNITION ,Hausdorff distance ,Artificial intelligence ,Benchmark data ,business ,Cluster analysis ,computer ,Classifier (UML) - Abstract
Multi-instance multi-label learning (MIML), introduced by Zhou and Zhang, is a comparatively new framework in machine learning with two special characteristics: firstly, each example is represented by a set of feature vectors (a bag of instances), and secondly, a bag of instances may belong to many classes (a multi-label). Thus, an MIML classifier receives a bag of instances and produces a multi-label; for classifier training, the training set has the same MIML structure. Labeling a data set is always cost-intensive, especially in an MIML framework. In order to reduce labeling costs it is important to structure the annotation process such that the most informative examples are labeled at the beginning, and less informative or uninformative data towards the end of the annotation phase. Active learning is one approach to this kind of problem. In this work we focus on the MIMLSVM algorithm in combination with the k-Medoids clustering algorithm to transform the multi-instance representation into a single-instance one. As clustering distance measures we consider variants of the Hausdorff distance, namely the median- and average-based Hausdorff distances. Finally, active learning strategies derived from the single-instance scenario are investigated in the MIML setting and evaluated on a benchmark data set.
- Published
- 2016
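The average-based Hausdorff distance between two bags of instances, one of the variants named in the abstract above, can be sketched roughly as follows; this is a simplified reading of the measure (averaging each point's distance to the closest point in the other bag), not necessarily the authors' exact definition:

```python
import numpy as np

def avg_hausdorff(bag_a, bag_b):
    # Average-based Hausdorff distance between two bags of instances:
    # mean, over all points of both bags, of the distance from each point
    # to its closest point in the other bag.
    A, B = np.asarray(bag_a, float), np.asarray(bag_b, float)
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # pairwise distances
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(A) + len(B))

# Two single-instance bags in the plane, for illustration only.
bag_a = [[0.0, 0.0]]
bag_b = [[3.0, 4.0]]
dist = avg_hausdorff(bag_a, bag_b)
```

Such a bag-level distance is what allows k-Medoids to cluster bags directly and thereby map the multi-instance representation to a single-instance one.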
28. Going Further in Affective Computing: How Emotion Recognition Can Improve Adaptive User Interaction
- Author
-
Felix Schüssel, Günther Palm, Sascha Meudt, Miriam Schmidt-Wack, Michael Weber, Friedhelm Schwenker, and Frank Honold
- Subjects
Multimedia ,Computer science ,Local binary patterns ,Joins ,020206 networking & telecommunications ,02 engineering and technology ,computer.software_genre ,Outcome (game theory) ,Task (project management) ,Human–computer interaction ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Emotion recognition ,State (computer science) ,Architecture ,Affective computing ,computer - Abstract
This article joins the fields of emotion recognition and human-computer interaction. While much work has been done on recognizing emotions, they are hardly used to improve a user's interaction with a system. Although the fields of affective computing and especially serious games already make use of detected emotions, they tend to provide application- and user-specific adaptations only at the task level. We present an approach for utilizing recognized emotions to improve the interaction itself, independent of the underlying application at hand. Examining the state of the art in emotion recognition research and building on the architecture of Companion-Systems, a generic approach for determining the main cause of an emotion within the history of interactions is presented, allowing a specific reaction and adaptation. Such an approach could lead to systems that use emotions to improve not only the outcome of a task but the interaction itself, in order to be truly individual and empathic.
- Published
- 2016
29. On Gestures and Postural Behavior as a Modality in Ensemble Methods
- Author
-
Friedhelm Schwenker, Sascha Meudt, and Heinke Hihn
- Subjects
Focus (computing) ,Modality (human–computer interaction) ,Relation (database) ,Computer science ,Speech recognition ,02 engineering and technology ,Ensemble learning ,Task (project management) ,Discriminative model ,Human–computer interaction ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Natural (music) ,020201 artificial intelligence & image processing ,Gesture - Abstract
Knowledge about the user's emotional state is important to achieve human-like, natural HCI in modern technical systems. Humans rely on body gestures and posture when communicating. We investigate the relation between gestures and human emotion, specifically when completing tasks. The main focus of this work lies on discriminating between mental overload and mental underload, which can be useful, for example, in an e-tutoring system. Mental underload is a new term describing the state a person is in when completing a dull or boring task. It is shown how to select suitable features, such as gestures, movement and postural behavior, and these features are investigated with regard to their discriminative power. After feature selection, a multiple classifier system is designed, trained and evaluated.
- Published
- 2016
30. Using Radial Basis Function Neural Networks for Continuous and Discrete Pain Estimation from Bio-physiological Signals
- Author
-
Mohammadreza Amirian, Markus Kächele, and Friedhelm Schwenker
- Subjects
Fusion scheme ,Mahalanobis distance ,Radial basis function network ,business.industry ,Computer science ,Pattern recognition ,02 engineering and technology ,Experimental validation ,Machine learning ,computer.software_genre ,03 medical and health sciences ,0302 clinical medicine ,Radial basis function neural ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Radial basis function ,Artificial intelligence ,business ,computer ,030217 neurology & neurosurgery - Abstract
In this work we present extensions of Radial Basis Function networks that improve their ability for discrete and continuous pain intensity estimation. Besides proposing a mid-level fusion scheme, the use of standardization and unconventional loss functions is covered. We show that RBF networks can be improved in this way and present extensive experimental validation on a multi-modal dataset to support our findings.
- Published
- 2016
31. Emotion Recognition in Speech with Deep Learning Architectures
- Author
-
Markus Kächele, Friedhelm Schwenker, and Mehmet Erdal
- Subjects
business.industry ,Computer science ,Deep learning ,Speech recognition ,Feed forward neural ,02 engineering and technology ,010501 environmental sciences ,01 natural sciences ,0202 electrical engineering, electronic engineering, information engineering ,Deep neural networks ,020201 artificial intelligence & image processing ,Artificial intelligence ,Hidden layer ,Emotion recognition ,business ,Raw data ,Classifier (UML) ,0105 earth and related environmental sciences - Abstract
Deep neural networks (DNNs) have become very popular for learning abstract high-level representations from raw data. This has led to improvements in several classification tasks, including emotion recognition in speech. Besides its use as a feature learner, a DNN can also be used as a classifier. In either case it is a challenge to determine the number of hidden layers and the number of neurons in each layer. In this work the architecture of a DNN is determined by a restricted grid search with the aim of recognizing emotion in human speech. Because speech signals are essentially time series, the data is transformed into an appropriate format for use as input to deep feed-forward neural networks without losing much time-dependent information. Furthermore, the Elman network is examined. The results show that by maintaining time-dependent information in the data, better classification accuracies can be achieved with deep architectures.
- Published
- 2016
32. Active Learning for Speech Event Detection in HCI
- Author
-
Guenther Palm, Patrick Thiam, Friedhelm Schwenker, and Sascha Meudt
- Subjects
Computer science ,Active learning (machine learning) ,Event (computing) ,Speech recognition ,Supervised learning ,Sampling (statistics) ,02 engineering and technology ,Mixture model ,01 natural sciences ,Support vector machine ,010104 statistics & probability ,Radial basis function kernel ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Anomaly detection ,0101 mathematics - Abstract
In this work, a pool-based active learning approach combining outlier detection methods with uncertainty sampling is proposed for speech event detection. Events in this case are atypical utterances (e.g. laughter, heavy breathing) occurring sporadically during a Human Computer Interaction (HCI) scenario. The proposed approach uses rank aggregation to select informative speech segments that have previously been ranked by different outlier detection techniques combined with an uncertainty sampling technique. The uncertainty sampling method is based on the distance to the decision boundary of a Support Vector Machine with a Radial Basis Function kernel trained on the available annotated samples. Extensive experimental results demonstrate the effectiveness of the proposed approach.
- Published
- 2016
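The uncertainty-sampling step in the abstract above can be illustrated with a small sketch: unlabeled segments are ranked by the magnitude of a decision-function score (distance to the SVM boundary), smallest first. The scores below are invented, and the rank-aggregation step with the outlier detectors is omitted:

```python
import numpy as np

def uncertainty_ranking(scores):
    # Rank unlabeled samples by uncertainty: the smaller the absolute
    # decision-function value (distance to the boundary), the more
    # informative the sample is for labeling.
    return np.argsort(np.abs(np.asarray(scores)))

# Hypothetical SVM decision-function outputs for five unlabeled segments.
scores = [2.1, -0.1, 0.8, -1.7, 0.05]
query_order = uncertainty_ranking(scores)
```

The first indices in `query_order` would be presented to the annotator first; in the paper this ranking is further aggregated with outlier-detection rankings.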
33. Machine Learning Driven Heart Rate Detection with Camera Photoplethysmography in Time Domain
- Author
-
Friedhelm Schwenker, Markus Kächele, Viktor Kessler, Sascha Meudt, and Günther Palm
- Subjects
Signal processing ,Artificial neural network ,Mean squared error ,business.industry ,Computer science ,Gaussian ,010103 numerical & computational mathematics ,Machine learning ,computer.software_genre ,01 natural sciences ,k-nearest neighbors algorithm ,010309 optics ,symbols.namesake ,Multilayer perceptron ,Photoplethysmogram ,0103 physical sciences ,symbols ,Computer vision ,Time domain ,Artificial intelligence ,0101 mathematics ,business ,computer - Abstract
Measuring biosignals such as the heart rate in non-medical applications is gaining increasing importance. With camera-based photoplethysmography (PPG) it is possible to measure the heart rate remotely with the built-in webcam of any tablet or laptop. Recent research with machine-learning-based methods has shown great success compared to signal-processing-based methods. In this paper, we use k-nearest neighbors (kNN) and a multilayer perceptron (MLP) with an alternative representation of the input vector. Estimating the quality of peaks with a Gaussian distribution further improves the detection. Overall we improve the root mean square error (RMSE) from 23.97 to 8.62.
- Published
- 2016
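The RMSE criterion reported in the abstract above is computed as follows; the heart-rate values here are invented for illustration:

```python
import numpy as np

def rmse(estimates, reference):
    # Root mean square error between estimated and reference heart rates (bpm).
    e = np.asarray(estimates, float) - np.asarray(reference, float)
    return float(np.sqrt(np.mean(e ** 2)))

# Hypothetical per-window heart-rate estimates vs. an ECG reference.
est = [72.0, 75.0, 71.0, 80.0]
ref = [70.0, 74.0, 73.0, 76.0]
err = rmse(est, ref)
```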
34. Monte Carlo Based Importance Estimation of Localized Feature Descriptors for the Recognition of Facial Expressions
- Author
-
Friedhelm Schwenker, Günther Palm, and Markus Kächele
- Subjects
Facial expression ,business.industry ,Local binary patterns ,Computer science ,Perspective (graphical) ,Pattern recognition ,Machine learning ,computer.software_genre ,Convolutional neural network ,Support vector machine ,Discriminative model ,Face (geometry) ,Feature (machine learning) ,Artificial intelligence ,business ,computer - Abstract
The automated and exact identification of facial expressions in human-computer interaction scenarios is a challenging but necessary task for a machine learning system to recognize human emotions. The human face consists of regions whose elements contribute to individual expressions in different ways. This work aims to shed light on the importance of specific facial regions, providing information that can be used to discriminate between different facial expressions from a statistical pattern recognition perspective. A sampling-based classification approach is used to reveal informative locations in the face. The results are expression-sensitive importance maps that indicate regions of high discriminative power, which can be used for various applications.
- Published
- 2015
35. Multimodal Data Fusion for Person-Independent, Continuous Estimation of Pain Intensity
- Author
-
Mohammadreza Amirian, Friedhelm Schwenker, Markus Kächele, Philipp Werner, Patrick Thiam, Günther Palm, and Steffen Walter
- Subjects
Fusion ,Modalities ,Mean squared error ,Computer science ,business.industry ,Local binary patterns ,Pattern recognition ,Intensity (physics) ,Task (project management) ,Feature (computer vision) ,Artificial intelligence ,business ,Focus (optics) ,psychological phenomena and processes - Abstract
In this work, a method is presented for the continuous estimation of pain intensity based on fusion of bio-physiological and video features. The focus of the paper is to analyse which modalities and feature sets are suited best for the task of recognizing pain levels in a person-independent setting. A large set of features is extracted from the available bio-physiological channels (ECG, EMG and skin conductivity) and the video stream. Experimental validation demonstrates which modalities contribute the most to a robust prediction and the effects when combining them to improve the continuous estimation given unseen persons.
- Published
- 2015
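Late fusion of continuous per-modality predictions, one of the approaches compared in the abstract above, can be sketched as a (weighted) average of modality outputs; the modality names, values and weights below are hypothetical:

```python
import numpy as np

def late_fusion(modality_preds, weights=None):
    # Late fusion: combine continuous per-modality pain-intensity
    # predictions by a (weighted) average across modalities.
    P = np.asarray(modality_preds, float)  # shape (n_modalities, n_samples)
    if weights is None:
        weights = np.ones(len(P)) / len(P)  # default: plain average
    w = np.asarray(weights, float)
    return w @ P

# Hypothetical continuous pain-intensity predictions per modality.
video = [0.2, 1.4, 2.9]
ecg   = [0.4, 1.0, 2.5]
fused = late_fusion([video, ecg])
```

In a trainable variant, the weights themselves would be learned from validation data rather than fixed.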
36. Audio-Visual User Identification in HCI Scenarios
- Author
-
Sascha Meudt, Markus Kächele, Andrej Schwarz, and Friedhelm Schwenker
- Subjects
Activity recognition ,Computer science ,Speech recognition ,Audio visual ,Segmentation ,Input device ,Transcription (software) ,Affective computing ,Mobile device ,Classifier (UML) - Abstract
Modern computing systems are usually equipped with various input devices such as microphones or cameras, and hence the user of such a system can easily be identified. User identification is important in many human computer interaction (HCI) scenarios, such as speech recognition, activity recognition, transcription of meeting room data or affective computing. Here personalized models may significantly improve the performance of the overall recognition system. This paper deals with audio-visual user identification. The main processing steps are segmentation of the relevant parts from video and audio streams, extraction of meaningful features and construction of the overall classifier and fusion architectures. The proposed system has been evaluated on the MOBIO dataset, a benchmark database consisting of real-world recordings collected from mobile devices, e.g. cell-phones. Recognition rates of up to 92 % could be achieved for the proposed audio-visual classifier system.
- Published
- 2015
37. On Annotation and Evaluation of Multi-modal Corpora in Affective Human-Computer Interaction
- Author
-
Patrick Thiam, Günther Palm, Sascha Meudt, Viktor Kessler, Markus Kächele, Friedhelm Schwenker, Michael Glodek, Martin Schels, and Stephan Tschechne
- Subjects
business.industry ,Computer science ,Process (engineering) ,computer.software_genre ,Affect (psychology) ,Fuzzy logic ,Annotation ,Modal ,Software ,Artificial intelligence ,business ,Affective computing ,computer ,Natural language processing - Abstract
In this paper, we discuss the topic of affective human-computer interaction from a data driven viewpoint. This comprises the collection of respective databases with emotional contents, feasible annotation procedures and software tools that are able to conduct a suitable labeling process. A further issue that is discussed in this paper is the evaluation of the results that are computed using statistical classifiers. Based on this we propose to use fuzzy memberships in order to model affective user state and endorse respective fuzzy performance measures.
- Published
- 2015
38. uulmMAD – A Human Action Recognition Dataset for Ground-Truth Evaluation and Investigation of View Invariances
- Author
-
Michael Glodek, Felix Heilemann, Florian Gawrilowicz, Georg Layher, Heiko Neumann, Friedhelm Schwenker, and Günther Palm
- Subjects
Ground truth ,Computer science ,business.industry ,Perspective (graphical) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Action (philosophy) ,Pattern recognition (psychology) ,Benchmark (computing) ,Artificial intelligence ,Noise (video) ,Focus (optics) ,business ,Baseline (configuration management) - Abstract
Human action recognition has recently gained increasing attention in pattern recognition. However, many datasets in the literature focus on a limited number of target-oriented properties. In this work, we present a novel dataset, named uulmMAD, which has been created to benchmark state-of-the-art action recognition architectures addressing multiple properties, e.g. high-resolution cameras, perspective changes, realistic cluttered background and noise, overlap of action classes, different execution speeds, variability in subjects and their clothing, and the availability of a pose ground-truth. The uulmMAD was recorded using three synchronized high-resolution cameras and an inertial motion capturing system. Each subject performed fourteen actions at least three times in front of a green screen. Selected actions were recorded in four variants, i.e. normal, pausing, fast and decelerating. The data has been post-processed to separate the subject from the background. Furthermore, the camera and motion capturing data have been mapped onto each other, and 3D avatars have been generated to further extend the dataset. The avatars have also been used to emulate self-occlusion in pose recognition when using a time-of-flight camera. In this work, we analyze the uulmMAD using a state-of-the-art action recognition architecture to provide first baseline results. The results emphasize the unique characteristics of the dataset. The dataset will be made publicly available upon publication of the paper.
- Published
- 2015
39. Bio-Visual Fusion for Person-Independent Recognition of Pain Intensity
- Author
-
Markus Kächele, Philipp Werner, Ayoub Al-Hamadi, Günther Palm, Steffen Walter, and Friedhelm Schwenker
- Subjects
Facial expression ,Fusion ,Modalities ,Computer science ,business.industry ,Speech recognition ,Feature selection ,Artificial intelligence ,Experimental validation ,Multiple classification ,business - Abstract
In this work, multi-modal fusion of video and biopotential signals is used to recognize pain in a person-independent scenario. For this purpose, participants were subjected to painful heat stimuli under controlled conditions. Subsequently, a multitude of features have been extracted from the available modalities. Experimental validation suggests that the cues that allow the successful recognition of pain are highly similar across different people and complementary in the analysed modalities to an extent that fusion methods are able to achieve an improvement over single modalities. Different fusion approaches (early, late, trainable) are compared on a large set of state-of-the art features for the biopotentials and video channels in multiple classification experiments.
- Published
- 2015
40. Multimodal Pattern Recognition of Social Signals in Human-Computer-Interaction
- Author
-
Friedhelm Schwenker, Louis-Philippe Morency, and Stefan Scherer
- Subjects
business.industry ,Computer science ,Pattern recognition (psychology) ,Pattern recognition ,Artificial intelligence ,business
- Published
- 2015
41. Artificial Neural Networks in Pattern Recognition
- Author
-
Friedhelm Schwenker, Ching Y. Suen, and Neamat El Gayar
- Subjects
Engineering ,Artificial neural network ,business.industry ,Pattern recognition (psychology) ,Library science ,Pattern recognition ,Artificial intelligence ,Large range ,business - Abstract
This book constitutes the refereed proceedings of the 6th IAPR TC3 International Workshop on Artificial Neural Networks in Pattern Recognition, ANNPR 2014, held in Montreal, QC, Canada, in October 2014. The 24 revised full papers presented were carefully reviewed and selected from 37 submissions for inclusion in this volume. They cover a large range of topics in the field of learning algorithms and architectures, and discuss the latest research, results, and ideas in these areas.
- Published
- 2014
42. Decision Tree-Based Multiple Classifier Systems: An FPGA Perspective
- Author
-
Mario Barbareschi, Salvatore Del Prete, Francesco Gargiulo, Antonino Mazzeo, Carlo Sansone, Friedhelm Schwenker, Fabio Roli, and Josef Kittler
- Subjects
Boosting (machine learning) ,Computer science ,business.industry ,Decision tree ,Machine learning ,computer.software_genre ,Multiple classifier ,Random forest ,ComputingMethodologies_PATTERNRECOGNITION ,Short latency ,Artificial intelligence ,business ,Field-programmable gate array ,Classifier (UML) ,computer - Abstract
Combining a hardware approach with a multiple classifier method can greatly improve system performance: a multiple classifier system can enhance the classification accuracy with respect to a single classifier, and a hardware implementation leads to systems able to classify samples with high throughput and short latency. To the best of our knowledge, no paper in the literature takes the multiple classifier scheme into account as an additional design parameter, mainly because of the lack of efficient hardware combiner architectures. In order to fill this gap, in this paper we first propose a novel approach for an efficient hardware implementation of the majority voting combining rule. We then illustrate a design methodology to suitably embed in a digital device a multiple classifier system having Decision Trees as base classifiers and a majority voting rule as combiner. Bagging, Boosting and Random Forests are taken into account. We prove the effectiveness of the proposed approach on two real case studies related to Big Data issues.
- Published
- 2015
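The majority-voting combining rule that the abstract above maps to hardware has a compact software reference form; the vote vector below is invented for illustration:

```python
from collections import Counter

def majority_vote(predictions):
    # Combine the labels of the base classifiers with the majority-voting
    # rule: return the label predicted by the most classifiers.
    return Counter(predictions).most_common(1)[0][0]

sample_votes = [1, 0, 1, 1, 0]  # five decision trees voting on one sample
label = majority_vote(sample_votes)
```

A hardware combiner implements the same rule, e.g. as a population count over the base classifiers' one-bit outputs, which is what makes the majority vote attractive for an FPGA design.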