53 results on '"Matthias Wolff"'
Search Results
2. Ecological network analysis metrics: The need for an entire ecosystem approach in management and policy
- Author
-
Matthias Wolff, Brian D. Fath, Ulrike Schückel, Ragnhild Asmus, Victor N. de Jonge, Stuart R. Borrett, Ursula M. Scharler, Alessandro Ludovisi, Harald Asmus, Dan Baird, Nathalie Niquil, Towson University [Towson, MD, United States], University of Maryland System, Alfred-Wegener-Institut, Helmholtz-Zentrum für Polar- und Meeresforschung (AWI), University of Stellenbosch Business School [Cape Town] (USB ), University of North Carolina [Wilmington] (UNC), University of North Carolina System (UNC), University of Hull [United Kingdom], Università degli Studi di Perugia = University of Perugia (UNIPG), Biologie des Organismes et Ecosystèmes Aquatiques (BOREA), Université de Caen Normandie (UNICAEN), Normandie Université (NU)-Normandie Université (NU)-Muséum national d'Histoire naturelle (MNHN)-Institut de Recherche pour le Développement (IRD)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Université des Antilles (UA), Centre National de la Recherche Scientifique (CNRS), University of KwaZulu-Natal [Durban, Afrique du Sud] (UKZN), Landesbetrieb für Küstenschutz, Nationalpark und Meeresschutz Schleswig-Holstein [Husum, Allemagne] (LKN.SH), Leibniz Centre for Tropical Marine Research (ZMT), Università degli Studi di Perugia (UNIPG), and University of KwaZulu-Natal (UKZN)
- Subjects
0106 biological sciences ,State variable ,Index (economics) ,010504 meteorology & atmospheric sciences ,Computer science ,[SDE.MCG]Environmental Sciences/Global Changes ,Ecological network analysis ,Management, Monitoring, Policy and Law ,Aquatic Science ,Oceanography ,01 natural sciences ,Marine and coastal environment ,Ecosystem approach ,Trophic length ,0105 earth and related environmental sciences ,Trophic level ,business.industry ,010604 marine biology & hydrobiology ,Environmental resource management ,Cycling ,Food web ,15. Life on land ,[SDE.ES]Environmental Sciences/Environmental and Society ,Average path length ,6. Clean water ,13. Climate action ,Metric (unit) ,[SDE.BE]Environmental Sciences/Biodiversity and Ecology ,business
- Abstract
In this paper, we identified seven ecological network analysis (ENA) metrics that, in our opinion, have high potential to provide useful and practical information for environmental decision-makers and stakeholders. Measurement and quantification of the network indicators require that an ecosystem-level assessment be implemented. The ENA metrics convey the status of the ecological system state variables, and mostly, the flows and relations between the various nodes of the network. The seven metrics are: 1) Average Path Length (APL), 2) Finn Cycling Index (FCI), 3) Mean Trophic Level (MTL), 4) Detritivory to Herbivory ratio (D:H), 5) Keystoneness, 6) Structural Information (SI), and 7) Flow-based Information indices. The procedure for calculating each metric is detailed along with a short evaluation of their potential assessment of environmental status.
- Published
- 2019
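As a rough illustration of two of the metrics named in the abstract above, the following sketch computes an Average Path Length and a Detritivory to Herbivory ratio for a toy three-compartment network. The formulas are assumptions based on common ENA usage (APL as total system throughflow divided by total boundary input; D:H as the flows from detritus to consumers divided by the flows from primary producers to consumers), since the abstract itself does not state them. The compartment names and flow values are invented.

```python
# Toy flow network; all names and values are invented for illustration.
# flows[(src, dst)] is the flow of energy or matter from src to dst.
flows = {
    ("plants", "grazers"): 30.0,
    ("plants", "detritus"): 50.0,
    ("grazers", "detritus"): 10.0,
    ("detritus", "grazers"): 20.0,
}
# Input entering each compartment from outside the system.
boundary_input = {"plants": 100.0, "grazers": 0.0, "detritus": 0.0}


def throughflow(node):
    """Inflow-based throughflow: boundary input plus all internal inflows."""
    internal = sum(f for (src, dst), f in flows.items() if dst == node)
    return boundary_input[node] + internal


def average_path_length():
    """Assumed definition: total system throughflow / total boundary input."""
    total_throughflow = sum(throughflow(n) for n in boundary_input)
    return total_throughflow / sum(boundary_input.values())


def detritivory_herbivory_ratio(producers, detritus_pools):
    """Assumed definition: detritus-to-consumer flows / producer-to-consumer flows."""
    non_consumers = set(producers) | set(detritus_pools)
    detritivory = sum(
        f for (src, dst), f in flows.items()
        if src in detritus_pools and dst not in non_consumers
    )
    herbivory = sum(
        f for (src, dst), f in flows.items()
        if src in producers and dst not in non_consumers
    )
    return detritivory / herbivory


apl = average_path_length()
dh = detritivory_herbivory_ratio({"plants"}, {"detritus"})
print(f"APL = {apl:.2f}, D:H = {dh:.2f}")
```

Both quantities are simple ratios over the flow table, which is why a full ecosystem-level flow budget is needed before either can be computed.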
3. Machine learning for anomaly assessment in sensor networks for NDT in aerospace
- Author
-
Ivan Kraljevski, Frank Duckhorn, Constanze Tschöpe, and Matthias Wolff
- Subjects
business.industry ,Computer science ,Calibration (statistics) ,Anomaly (natural sciences) ,010401 analytical chemistry ,non-destructive testing ,Machine learning ,computer.software_genre ,01 natural sciences ,0104 chemical sciences ,Support vector machine ,machine learning ,Feature (computer vision) ,Nondestructive testing ,Anomaly detection ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Hidden Markov model ,ultrasonic transducers ,Instrumentation ,Wireless sensor network ,computer
- Abstract
We investigated and compared various algorithms in machine learning for anomaly assessment with different feature analyses on ultrasonic signals recorded by sensor networks. The following methods were used and compared in anomaly detection modeling: hidden Markov models (HMM), support vector machines (SVM), isolation forest (IF), and reconstruction autoencoders (AEC). They were trained exclusively on sensor signals of the intact state of structures commonly used in various industries, like aerospace and automotive. The signals obtained on artificially introduced damage states were used for performance evaluation. Anomaly assessment was evaluated and compared using various classifiers and feature analysis methods. We introduced novel methodologies for two processes. The first was the dataset preparation with anomalies. The second was the detection and damage severity assessment utilizing the intact object state exclusively. The experiments proved that robust anomaly detection is practically feasible. We were able to train accurate classifiers which had a considerable safety margin. Precise quantitative analysis of damage severity will also be possible when calibration data become available during exploitation or by using expert knowledge.
- Published
- 2021
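The idea in the abstract above of training only on the intact state and then flagging deviations can be sketched with a deliberately simple stand-in. The example below is not one of the methods compared in the paper (HMM, SVM, isolation forest, autoencoder); it uses a plain z-score on a single scalar feature, and the feature values and the threshold of 3 are assumptions chosen for illustration.

```python
import statistics


def fit_intact_model(intact_features):
    """Summarise the intact state by the mean and standard deviation of one feature."""
    mean = statistics.fmean(intact_features)
    std = statistics.stdev(intact_features)
    return mean, std


def anomaly_score(x, model):
    """Distance from the intact mean, in units of the intact standard deviation."""
    mean, std = model
    return abs(x - mean) / std


def is_anomalous(x, model, threshold=3.0):
    """Flag a measurement whose score exceeds the (assumed) threshold."""
    return anomaly_score(x, model) > threshold


# Invented feature values recorded on an undamaged structure.
intact = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 0.97]
model = fit_intact_model(intact)

for value in (1.01, 1.50):
    print(value, "anomalous" if is_anomalous(value, model) else "normal")
```

Because no damaged examples enter the fit, the score only says how unusual a signal is relative to the intact state; mapping it to damage severity would need the calibration data the abstract mentions.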
4. Convolutional Autoencoders for Health Indicators Extraction in Piezoelectric Sensors
- Author
-
Ivan Kraljevski, Constanze Tschoepe, Frank Duckhorn, and Matthias Wolff
- Subjects
0209 industrial biotechnology ,business.industry ,Computer science ,Piezoelectric sensor ,Feature extraction ,Pattern recognition ,02 engineering and technology ,Health indicator ,020901 industrial engineering & automation ,Component (UML) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Extraction (military) ,Artificial intelligence ,Hidden Markov model ,business
- Abstract
We present a method for extracting health indicators from piezoelectric sensors applied in the case of microfluidic valves. Convolutional autoencoders were trained on the normal operating conditions and tested on signals of different valves. The results of the model performance evaluation, as well as the qualitative presentation of the indicator plots for each tested component, showed that the approach is capable of detecting features that correspond to increasing component degradation. The extracted health indicators are the prerequisite and input for reliable remaining useful life prediction.
- Published
- 2020
5. Acoustic Resonance Testing of Glass IV Bottles
- Author
-
Ivan Kraljevski, Matthias Wolff, Constanze Tschoepe, Yong Chul Ju, and Frank Duckhorn
- Subjects
business.industry ,Computer science ,Deep learning ,Acoustics ,020208 electrical & electronic engineering ,02 engineering and technology ,Nondestructive testing ,otorhinolaryngologic diseases ,0202 electrical engineering, electronic engineering, information engineering ,Detection performance ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Acoustic resonance
- Abstract
In this paper, acoustic resonance testing on glass intravenous (IV) bottles is presented. Different machine learning methods were applied to distinguish acoustic observations of bottles with defects from the intact ones. Due to the very limited number of available specimens, the question arises whether deep learning methods can achieve similar or even better detection performance compared with traditional methods.
- Published
- 2020
6. Behavioral Control of Cognitive Agents Using Database Semantics and Minimalist Grammars
- Author
-
Ingo Schmitt, Peter beim Graben, Günther Wirsching, Ronald Römer, Markus Huber, and Matthias Wolff
- Subjects
Database ,Knowledge representation and reasoning ,Computer science ,010102 general mathematics ,Cognition ,06 humanities and the arts ,0603 philosophy, ethics and religion ,Semantics ,computer.software_genre ,01 natural sciences ,Rule-based machine translation ,060302 philosophy ,Semiotics ,Relevance (information retrieval) ,0101 mathematics ,Control (linguistics) ,computer ,Meaning (linguistics)
- Abstract
Knowledge representation and processing, learning and adjusting knowledge models, communication and interaction as well as problem solving are important skills of cognitive agents. But, in order to share the learned knowledge with other communication participants, the agent must pay attention to maintaining its own functionality when interacting with the physical environment. To comply with this requirement, the meaning of perceptions and the consequences of actions must be understood. This suggests that even the non-verbal exchange of information is based on the foundations of semiotics. In this work, we show that we can model non-verbal interactions with the same linguistic means as verbal communication. We demonstrate this by solving the bidirectional translation problem of symbolic sequences into semantics through the use of minimalist grammars. To verify this approach, we turn back to the well-known example of a cognitive mouse agent living in a maze world and model the interaction and behavioral control using database semantics in a fully deterministic setting. Finally, we propose a unifying perspective for non-verbal interaction and verbal communication as well as justify its relevance for Cognitive Infocommunications.
- Published
- 2019
7. Reinforcement Learning of Minimalist Numeral Grammars
- Author
-
Markus Huber, Ronald Römer, Werner Meyer, Peter beim Graben, and Matthias Wolff
- Subjects
FOS: Computer and information sciences ,Parsing ,Computer Science - Computation and Language ,Minimalist grammar ,Mental lexicon ,Computer science ,business.industry ,Computer Science - Artificial Intelligence ,computer.software_genre ,Lexicon ,Semantics ,Linguistic competence ,030507 speech-language pathology & audiology ,03 medical and health sciences ,Artificial Intelligence (cs.AI) ,Rule-based machine translation ,Artificial intelligence ,0305 other medical science ,business ,Computation and Language (cs.CL) ,computer ,Generative grammar ,Natural language processing
- Abstract
Speech-controlled user interfaces facilitate the operation of devices and household functions for laymen. State-of-the-art language technology scans the acoustically analyzed speech signal for relevant keywords that are subsequently inserted into semantic slots to interpret the user's intent. In order to develop proper cognitive information and communication technologies, simple slot-filling should be replaced by utterance meaning transducers (UMT) that are based on semantic parsers and a mental lexicon, comprising syntactic, phonetic and semantic features of the language under consideration. This lexicon must be acquired by a cognitive agent during interaction with its users. We outline a reinforcement learning algorithm for the acquisition of the syntactic morphology and arithmetic semantics of English numerals, based on minimalist grammar (MG), a recent computational implementation of generative linguistics. Number words are presented to the agent by a teacher in the form of utterance meaning pairs (UMP) where the meanings are encoded as arithmetic terms from a suitable term algebra. Since MG encodes universal linguistic competence through inference rules, thereby separating innate linguistic knowledge from the contingently acquired lexicon, our approach unifies generative grammar and reinforcement learning, hence potentially resolving the still pending Chomsky-Skinner controversy. Comment: 13 pages, 1 figure
- Published
- 2019
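To make the notion of an utterance meaning pair from the abstract above more concrete, the sketch below pairs English number words with arithmetic terms and evaluates those terms. The nested-tuple encoding, the operator set and the example pairs are assumptions made for illustration; the abstract does not specify the term algebra, and this does not implement the minimalist grammar or the reinforcement learning step.

```python
def evaluate(term):
    """Evaluate an arithmetic term given as an int or a nested (op, left, right) tuple."""
    if isinstance(term, int):
        return term
    op, left, right = term
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    if op == "*":
        return a * b
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical utterance meaning pairs: (number word, arithmetic term).
umps = [
    ("seven", 7),
    ("twenty-one", ("+", ("*", 2, 10), 1)),
    ("three hundred", ("*", 3, 100)),
]

values = {utterance: evaluate(meaning) for utterance, meaning in umps}
for utterance, value in values.items():
    print(f"{utterance!r} denotes {value}")
```

Keeping the meaning as a structured term rather than a plain integer preserves how the numeral is composed, which is the information a learner of numeral morphology would need.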
8. Assessing the exploitation status of main fisheries resources in Ghana’s reservoirs based on reconstructed catches and a length-based bootstrapping stock assessment method
- Author
-
Seth Mensah Abobi, Matthias Wolff, Tobias Mildenberger, and Jeppe Kolding
- Subjects
0106 biological sciences ,Stock assessment ,Computer science ,010604 marine biology & hydrobiology ,Bootstrapping (linguistics) ,BFSA ,010501 environmental sciences ,Aquatic Science ,01 natural sciences ,Ghana ,Fishery ,Length-based indicators ,Reservoirs ,TropFishR ,0105 earth and related environmental sciences ,Water Science and Technology
- Abstract
Abobi SM, Mildenberger TK, Kolding J, Wolff M. 2019. Assessing the exploitation status of main fisheries resources in Ghana’s reservoirs based on reconstructed catches and a length-based bootstrapping stock assessment method. Lake Reserv Manage. 35:415–434. The cichlid species Oreochromis niloticus, Sarotherodon galilaeus, and Coptodon zillii, which are among the most exploited resources in the small-scale fisheries of the Tono, Bontanga, and Golinga reservoirs in northern Ghana, were assessed based on length frequency samples. Growth, mortality, exploitation status, stock size, and relative yield per recruit reference points were determined using bootstrapping fish stock assessment (BFSA), a novel framework that allows for the estimation of uncertainties around the life-history parameters and reference levels (e.g., L∞, K, and F0.1). The results suggest that the 3 species studied are heavily exploited in all 3 reservoirs, but with no alarming signs of overexploitation. The fishing effort at Golinga is comparatively low as a result of insignificant fishing during the agriculture season, which relates to low exploitation rates. Sarotherodon galilaeus and C. zillii have the highest and lowest biomass (t/km²), respectively, in all 3 reservoirs. The small shallow reservoir (Golinga) has the highest biomass of the target resources per unit area. According to a second assessment approach, based on length-based indicators, all species at Bontanga and the O. niloticus and S. galilaeus populations at Golinga have spawning stock biomasses below 40% of the unfished biomass. This points to possible ongoing recruitment overfishing of those species in the 2 reservoirs and suggests that a further increase in fishing effort should be prevented. Further monitoring of these fisheries will be needed for the improvement of assessments and thus management advice.
- Published
- 2019
- Full Text
- View/download PDF
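The abstract above relies on bootstrapping to attach uncertainty to estimated quantities. The sketch below shows the general resampling idea with a nonparametric percentile interval for the mean length of a sample. It is not the BFSA framework and does not use TropFishR; the length values, the number of resamples and the choice of statistic are assumptions for illustration.

```python
import random
import statistics


def bootstrap_interval(sample, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(sample), resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = sorted(
        stat(rng.choices(sample, k=len(sample))) for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Invented fish lengths in cm.
lengths_cm = [12.5, 14.0, 15.5, 13.0, 16.5, 18.0, 14.5, 17.0, 15.0, 13.5]

mean_length = statistics.fmean(lengths_cm)
lo, hi = bootstrap_interval(lengths_cm, statistics.fmean)
print(f"mean length {mean_length:.2f} cm, 95% interval [{lo:.2f}, {hi:.2f}] cm")
```

In a full assessment the resampled statistic would be a fitted parameter such as L∞ or K rather than a mean, but the interval is read off the sorted bootstrap estimates in the same way.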
9. Bridging between load-flow and Kuramoto-like power grid models: A flexible approach to integrating electrical storage units
- Author
-
Matthias Wolff, Katrin Schmietendorf, Pedro G. Lind, Joachim Peinke, Oliver Kamps, and Philipp Maass
- Subjects
Bridging (networking) ,Quality management ,Computer science ,media_common.quotation_subject ,General Physics and Astronomy ,FOS: Physical sciences ,Inertia ,Electric power system ,Wind energy ,Mathematical Physics ,media_common ,Wind power ,business.industry ,Applied Mathematics ,Power grids ,Statistical and Nonlinear Physics ,Control engineering ,Grid ,Nonlinear Sciences - Adaptation and Self-Organizing Systems ,Renewable energy ,Nonlinear system ,Descriptive statistics ,Energy production ,Electrical engineering ,Mathematical modeling ,business ,Adaptation and Self-Organizing Systems (nlin.AO)
- Abstract
In future power systems, electrical storage will be the key technology for balancing feed-in fluctuations. With an increasing share of renewables and a reduction of system inertia, the focus of research expands towards short-term grid dynamics and collective phenomena. Against this backdrop, Kuramoto-like power grids have been established as a sound mathematical modeling framework bridging between the simplified models from nonlinear dynamics and the more detailed models used in electrical engineering. However, they have a blind spot concerning grid components that cannot be modeled by oscillator equations, and hence do not allow storage-related issues to be investigated from scratch. We remove this shortcoming by bringing together Kuramoto-like and algebraic load-flow equations. This is a substantial extension of the current Kuramoto framework with arbitrary grid components. Based on this concept, we provide a solid starting point for the integration of flexible storage units, making it possible to address current problems like smart storage control, optimal siting and rough cost estimations. For demonstration purposes, we here consider a wind power application with realistic feed-in conditions. We show how to implement basic control strategies from electrical engineering, give insights into their potential with respect to frequency quality improvement and point out their limitations by maximum capacity and finite-time response. Comment: 12 pages, 6 figures
- Published
- 2019
10. Quantum-Based Modelling of Database States
- Author
-
Günther Wirsching, Matthias Wolff, and Ingo Schmitt
- Subjects
Type theory ,Database ,Computer science ,Linear algebra ,Order (ring theory) ,computer.software_genre ,computer ,Database design ,Quantum ,Data type ,AND gate ,Complex data structures
- Abstract
Database design of real-world scenarios requires complex data structures in order to adequately model complex real-world objects. Complex data structures can be constructed by a recursive use of elementary data types and data type constructors. The mathematics behind quantum mechanics provides us with an interesting theory combining concepts from linear algebra, probability calculus, and logic. In order to make the mathematics of quantum mechanics available for database structures and states, we develop a mapping of concepts from the type theory of databases to the mathematics of quantum mechanics.
- Published
- 2019
11. Towards a Quantum Mechanical Model of the Inner Stage of Cognitive Agents
- Author
-
Ronald Römer, Ingo Schmitt, Matthias Wolff, Günther Wirsching, Markus Huber, and Peter beim Graben
- Subjects
Cognitive science ,Cognitive systems ,Computer science ,Action planning ,Observable ,Relevance (information retrieval) ,Cognition ,Fantasy ,Quantum ,Veridicality
- Abstract
We present a model, inspired by quantum field theory, of the so-called inner stage of technical cognitive agents. The inner stage represents all knowledge of the agent. It allows for planning of actions and for higher cognitive functions like coping and fantasy. Using the example of a cognitive mouse agent living in a maze world, we discuss learning, action planning, and attention in a fully deterministic setting, assuming a totally observable world. We explain the relevance of our approach to cognitive infocommunications.
- Published
- 2018
12. Power grid stability under perturbation of single nodes: Effects of heterogeneity and internal nodes
- Author
-
Pedro G. Lind, Philipp Maass, and Matthias Wolff
- Subjects
Computer science ,media_common.quotation_subject ,General Physics and Astronomy ,Perturbation (astronomy) ,FOS: Physical sciences ,Inertia ,Topology ,01 natural sciences ,010305 fluids & plasmas ,Electric power system ,0103 physical sciences ,Electronics ,010306 general physics ,Computer Science::Distributed, Parallel, and Cluster Computing ,Mathematical Physics ,media_common ,business.industry ,Applied Mathematics ,Kuramoto model ,Statistical and Nonlinear Physics ,Grid ,Nonlinear Sciences - Adaptation and Self-Organizing Systems ,Electric power transmission ,Electricity ,business ,Adaptation and Self-Organizing Systems (nlin.AO)
- Abstract
Non-linear equations describing the time evolution of frequencies and voltages in power grids exhibit fixed points of stable grid operation. The dynamical behaviour after perturbations around these fixed points can be used to characterise the stability of the grid. We investigate both probabilities of return to a fixed point and times needed for this return after perturbation of single nodes. Our analysis is based on an IEEE test grid and the second-order swing equations for voltage phase angles $\theta_j$ at nodes $j$ in the synchronous machine model. The perturbations cover all possible changes $\Delta\theta$ of voltage angles and a wide range of frequency deviations in a range $\Delta f=\pm1$ Hz around the common frequency $\omega=2\pi f=\dot\theta_j$ in a synchronous fixed point state. Extensive numerical calculations are carried out to determine, for all node pairs $(j,k)$, the return times $t_{jk}(\Delta\theta,\Delta \omega)$ of node $k$ after a perturbation of node $j$. We find that for strong perturbations of some nodes, the grid does not return to its synchronous state. If returning to the fixed point, the times needed for the return are strongly different for different disturbed nodes and can reach values up to 20 seconds and more. When homogenising transmission line and node properties, the grid always returns to a synchronous state for the considered perturbations, and the longest return times have a value of about 4 seconds for all nodes. The neglect of reactances between points of power generation (internal nodes) and injection (terminal nodes) leads to an underestimation of return probabilities. Comment: 19 pages, 9 figures
- Published
- 2018
- Full Text
- View/download PDF
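The notion of a return time after a perturbation, as used in the abstract above, can be illustrated on the smallest possible case: one generator and one consumer coupled by a single line. The sketch below integrates a second-order swing-type equation for their phase difference with an explicit Euler scheme and reports when the state stays within a tolerance of the fixed point. The two-node reduction, the dimensionless parameters P, K and alpha, the tolerance and the settling criterion are all assumptions for illustration; this is not the IEEE test grid or the synchronous machine model of the paper.

```python
import math


def simulate_return_time(delta0, omega0, P=1.0, K=2.0, alpha=1.0,
                         dt=0.01, t_max=60.0, tol=1e-3):
    """Return (return_time, final_delta) for a generator-consumer pair.

    delta is the phase difference, omega its rate of change. The assumed model is
        delta'' = 2*P - alpha*delta' - 2*K*sin(delta),
    whose stable fixed point is delta = asin(P / K), omega = 0.
    return_time is None if the state has not settled by t_max. Settling on an
    equivalent fixed point shifted by 2*pi also counts as "not returned" here.
    """
    delta_fix = math.asin(P / K)
    delta, omega = delta0, omega0
    t = 0.0
    last_outside = 0.0  # last time the state was outside the tolerance band
    while t < t_max:
        acc = 2.0 * P - alpha * omega - 2.0 * K * math.sin(delta)
        delta += dt * omega  # explicit Euler: both updates use the old state
        omega += dt * acc
        t += dt
        if abs(delta - delta_fix) > tol or abs(omega) > tol:
            last_outside = t
    settled = last_outside < t_max - 1.0  # inside the band for the final time unit
    return (last_outside if settled else None), delta


fixed_point = math.asin(0.5)
t_ret, final_delta = simulate_return_time(fixed_point + 0.5, 0.0)
print(f"return time: {t_ret}, final phase difference: {final_delta:.4f}")
```

Scanning `delta0` and `omega0` over a grid of perturbations and recording which runs return `None` would give a crude analogue of the return probabilities discussed in the abstract.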
13. A Cognitive User Interface for a Multi-modal Human-Machine Interaction
- Author
-
Frank Duckhorn, Markus Huber, Werner Meyer, Matthias Wolff, and Constanze Tschöpe
- Subjects
Computer science ,business.industry ,Wireless network ,Interface (computing) ,Service provider ,Encryption ,law.invention ,File server ,Touchscreen ,Home automation ,law ,Human–computer interaction ,business ,Gesture
- Abstract
We developed a hardware-based cognitive user interface to help inexperienced users with little affinity for technology gain easy access to smart home devices. The interface is able to interact with the user via speech, gestures, or touchscreen. By learning from the user’s behavior, it can adapt to each individual. In contrast to most commercial products, our solution keeps all data required for operation internally and is connected to other UCUI devices only via an encrypted wireless network. By design, no data ever leave the system to file servers of third-party service providers. In this way, we ensure the privacy protection of the user.
- Published
- 2018
14. A Fock Space Toolbox and Some Applications in Computational Cognition
- Author
-
Ingo Schmitt, Matthias Wolff, Günther Wirsching, Ronald Römer, Markus Huber, and Peter beim Graben
- Subjects
Semantics (computer science) ,Computer science ,05 social sciences ,Linear operators ,Computational cognition ,02 engineering and technology ,050105 experimental psychology ,Toolbox ,Quantum logic ,Fock space ,Algebra ,Computer Science::Mathematical Software ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,0501 psychology and cognitive sciences ,Computational linguistics ,MATLAB ,computer ,computer.programming_language
- Abstract
We present a Matlab toolbox, called “FockBox”, handling Fock spaces and objects associated with Fock spaces: scalars, ket and bra vectors, and linear operators. We give brief application examples from computational linguistics, semantics processing, and quantum logic, demonstrating the use of the toolbox.
- Published
- 2018
15. An embedded system for acoustic pattern recognition
- Author
-
Peter Bluthgen, Christian Richter, Frank Duckhorn, Constanze Tschöpe, and Matthias Wolff
- Subjects
Digital signal processor ,Signal processing ,Computer science ,business.industry ,010401 analytical chemistry ,Bandwidth (signal processing) ,Process (computing) ,Pattern recognition ,01 natural sciences ,Signal ,Flash memory ,0104 chemical sciences ,Filter (video) ,Pattern recognition (psychology) ,Artificial intelligence ,business ,Field-programmable gate array
- Abstract
We present a miniaturized universal hardware module for acoustic pattern recognition in various types of multichannel sensor signals. The module implements configurable signal analysis (signal transforms, filter banks, statistical transforms) and a GMM-HMM recognizer. The main hardware components are an XC7A75T FPGA performing almost all the computations, a TMS320C6746 digital signal processor organizing the data flow and executing the automata search (decoding), 3×1 Gibit DDR2 memory and 4 Gibit flash memory. The module can be plugged into a host device and be connected to signal acquisition hardware. It can process input signals with up to approx. 250 kHz bandwidth for continuous input and up to approx. 3 MHz for burst input.
- Published
- 2017
16. Intelligent signal processing on a miniaturized hardware module
- Author
-
Constanze Tschöpe, Christian Richter, Matthias Wolff, Frank Duckhorn, and Peter Bluthgen
- Subjects
Signal processing ,Digital signal processor ,Computer science ,Filter (video) ,business.industry ,Bandwidth (signal processing) ,business ,Field-programmable gate array ,Signal ,Host (network) ,Computer hardware
- Abstract
We present a miniaturized universal hardware module for acoustic pattern recognition in various types of multi-channel sensor signals. The module implements configurable signal analysis (signal transforms, filter banks, statistical transforms) and a GMM-HMM recognizer. The main hardware components are an XC7A75T FPGA performing almost all the computations, a TMS320C6746 digital signal processor organizing the data flow and executing the automata search (decoding), 3×1 Gibit DDR2 memory and 4 Gibit flash memory. The module can be plugged into a host device and be connected to signal acquisition hardware. It can process input signals with up to approx. 250 kHz bandwidth for continuous input and up to approx. 3 MHz for burst input.
- Published
- 2017
17. Denormalized quantum density operators for encoding semantic uncertainty in cognitive agents
- Author
-
Ronald Römer, Günther Wirsching, Ingo Schmitt, and Matthias Wolff
- Subjects
Theoretical computer science ,Relational database ,Computer science ,Probabilistic logic ,Hilbert space ,020206 networking & telecommunications ,02 engineering and technology ,Decision problem ,symbols.namesake ,Encoding (memory) ,0202 electrical engineering, electronic engineering, information engineering ,symbols ,Representation (mathematics) ,Quantum ,Randomness
- Abstract
The design of a cognitive agent requires a behaviour control of actions and observations for exploring an unknown world. Typically, observations are influenced by a certain degree of randomness which can be modeled as probabilities. In our scenario we let a mouse explore a maze with walls, boundaries and a random portal. All observations are stored and managed in a so-called inner stage. As a decision problem, we want to be able to plan actions and to predict their resulting observations. In our approach we develop models of the inner stage based on concepts of probabilistic databases and their mapping to denormalized density matrices which are known from quantum mechanics. Density matrices provide a compact representation of the powerful but unwieldy many-world-semantics. We show that density matrices make the many-world-semantics feasible and are well suited to model the inner stage. We propose algorithms for learning and predicting action results.
- Published
- 2017
18. TropFishR: an R package for fisheries analysis with length-frequency data
- Author
-
Tobias Mildenberger, Marc H Taylor, and Matthias Wolff
- Subjects
0106 biological sciences ,Stock assessment ,Computer science ,010604 marine biology & hydrobiology ,Ecological Modeling ,Length frequency ,Fish stock ,010603 evolutionary biology ,01 natural sciences ,Toolbox ,Fishery ,R package ,Virtual population analysis ,Production model ,Ecology, Evolution, Behavior and Systematics ,Stock (geology)
- Abstract
1. The R package TropFishR is a new analysis toolbox compiling single-species stock assessment methods specifically designed for data-limited fisheries analysis using length-frequency data. 2. It includes methods for (i) estimating biological stock characteristics such as growth and mortality parameters, (ii) exploring technical aspects of the fisheries (e.g. exploitation rate and selectivity characteristics), (iii) assessing size and composition of a fish stock by means of virtual population analysis (VPA), and (iv) assessing stock status with yield prediction and production models. 3. This paper introduces the package and demonstrates the functionality of a selection of its core methods. 4. TropFishR modernises traditional stock assessment methods by easing application and development and by combining it with advanced statistical approaches.
- Published
- 2017
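The growth parameters mentioned in the abstract above are commonly those of the von Bertalanffy growth function, L(t) = L∞ · (1 − exp(−K · (t − t0))). The sketch below evaluates that function and its inverse in Python. It is an assumption that this is the growth model meant here, the parameter values are invented, and the code does not reproduce or call any TropFishR function.

```python
import math


def vbgf_length(age, l_inf, k, t0=0.0):
    """Expected length at a given age under the von Bertalanffy growth function."""
    return l_inf * (1.0 - math.exp(-k * (age - t0)))


def vbgf_age(length, l_inf, k, t0=0.0):
    """Inverse of vbgf_length: the age at which the expected length is reached."""
    if not 0.0 <= length < l_inf:
        raise ValueError("length must lie in [0, l_inf)")
    return t0 - math.log(1.0 - length / l_inf) / k


# Invented parameters: asymptotic length 30 cm, growth coefficient 0.5 per year.
L_INF, K = 30.0, 0.5

for age in (1, 2, 5):
    print(f"age {age} yr: {vbgf_length(age, L_INF, K):.1f} cm")
```

The inverse form is what length-based methods lean on: it converts an observed length class into a relative age so that length-frequency data can stand in for age data.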
19. Towards coping and imagination for cognitive agents
- Author
-
Günther Wirsching, Matthias Wolff, and Ronald Römer
- Subjects
Cognitive science ,Coping (psychology) ,Cognitive systems ,Computer science ,business.industry ,Repertoire ,Cognition ,Artificial intelligence ,business ,Electronic mail ,Automaton
- Abstract
Current technical cognitive systems cannot react to unforeseen situations. In particular, they fail if their repertoire of actions is insufficient to solve a problem or to overcome a barrier. In this paper we propose a novel ‘coping apparatus’ for cognitive agents which is capable of conceiving entirely new actions, simulating their prospective effect in the world and learning from their actual effect. The apparatus also provides the agent with ‘fantasy’, which allows it to imagine unseen world states and to develop a strategy for trying to reach them.
- Published
- 2015
20. Intelligente Signalverarbeitung 2
- Author
-
Rüdiger Hoffmann and Matthias Wolff
- Subjects
Computer science
- Published
- 2015
21. Toward Spontaneous Speech Synthesis—Utilizing Language Model Information in TTS
- Author
-
Matthias Eichner, Matthias Wolff, R. Hoffmann, and S. Werner
- Subjects
Speech production ,Acoustics and Ultrasonics ,Computer science ,business.industry ,Speech recognition ,Speech synthesis ,Pronunciation ,Speech processing ,computer.software_genre ,Speech shadowing ,Naturalness ,Factored language model ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Language model ,Electrical and Electronic Engineering ,business ,computer ,Software ,Natural language processing
- Abstract
State-of-the-art speech synthesis systems achieve a high overall quality. However, synthesized speech still lacks naturalness. To produce more natural and colloquial synthetic speech, our research focuses on integration of effects present in spontaneous speech. Conventional speech synthesis systems do not consider the probability of a word in its context. Recent investigations on corpora of natural speech showed that words that are very likely to occur in a given context are pronounced less accurately and faster than improbable ones. In this paper three approaches are introduced to model this effect found in spontaneous speech. The first algorithm changes the speaking rate directly by shortening or lengthening the syllables of a word depending on the language model probability of that word. Since probable words are not only pronounced faster but also less accurately this approach was extended by selecting appropriate pronunciation variants of a word according to the language model probability. This second algorithm changes the local speaking rate indirectly by controlling the grapheme-phoneme conversion. In a third stage, a pronunciation sequence model was used to select the appropriate variants according to their sequence probability. In listening experiments test participants were asked to rate the synthesized speech in the categories colloquial impression and naturalness. Our approaches achieved a significant improvement in the category colloquial impression. However, no significantly higher naturalness could be observed. The observed effects will be discussed in detail.
- Published
- 2004
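The first algorithm described in the abstract above shortens or lengthens the syllables of a word depending on its language model probability. The sketch below shows one way such a mapping could look: a clamped linear function of the log-probability. The functional form, the reference probability, the strength and the clamping limits are all hypothetical; the abstract does not give the actual mapping used by the authors.

```python
import math


def duration_factor(word_prob, ref_prob=0.01, strength=0.05,
                    min_factor=0.8, max_factor=1.2):
    """Map a language model probability to a duration scaling factor.

    Words more probable than ref_prob get a factor below 1 (spoken faster),
    less probable words a factor above 1 (spoken slower). All constants are
    hypothetical.
    """
    surprise = math.log(ref_prob) - math.log(word_prob)
    factor = 1.0 + strength * surprise
    return max(min_factor, min(max_factor, factor))


def scale_syllables(durations_ms, word_prob):
    """Apply the word-level factor to every syllable duration of the word."""
    factor = duration_factor(word_prob)
    return [d * factor for d in durations_ms]


# Invented syllable durations (ms) for a two-syllable word.
syllables = [180.0, 220.0]
print(scale_syllables(syllables, word_prob=0.5))     # highly predictable word
print(scale_syllables(syllables, word_prob=0.0001))  # unexpected word
```

Clamping keeps the change in local speaking rate bounded, so that very rare or very frequent words do not produce implausible durations.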
22. Intelligente Signalverarbeitung 1
- Author
-
Rüdiger Hoffmann and Matthias Wolff
- Subjects
Computer science
- Published
- 2014
23. Analysis-by-synthesis approach for acoustic model adaptation
- Author
-
Frank Duckhorn, Ivan Kraljevski, Rüdiger Hoffmann, Guntram Strecha, Yitagessu Birhanu Gebremedhin, and Matthias Wolff
- Subjects
Voice activity detection ,Computer science ,Speech recognition ,Speech coding ,Speech technology ,Acoustic model ,Speech synthesis ,PSQM ,Speech processing ,computer.software_genre ,Linear predictive coding ,computer - Abstract
This paper presents an analysis-by-synthesis approach for acoustic model adaptation. Using artificial speech data to adapt speech recognition systems has the potential to address the problem of data sparseness, to avoid speech recordings in real conditions, and to allow a large number of development cycles for Automatic Speech Recognition (ASR) systems in a shorter time. The proposed adaptation framework uses a unified ASR and synthesis system to produce artificial adaptation speech signals. To confirm the usability of the proposed approach, several experiments were performed in which the artificial speech data was coded and decoded by different speech and waveform coders and the acoustic model used for synthesis was adapted for each coder. The recognition results show that the proposed method can be used successfully for assessing and improving the performance of speech recognition systems, not only for evaluating and adapting to coded-speech effects, but also for other environmental conditions.
- Published
- 2013
24. Cross-Language Acoustic Modeling for Macedonian Speech Technology Applications
- Author
-
Rüdiger Hoffmann, Ivan Kraljevski, Matthias Wolff, Slavcho Chungurski, Guntram Strecha, and Oliver Jokisch
- Subjects
ComputingMethodologies_PATTERNRECOGNITION ,Computer science ,Speech recognition ,Speech technology ,Maximum a posteriori estimation ,Acoustic model ,Speech synthesis ,Bootstrapping (linguistics) ,Macedonian language ,Transcription (software) ,computer.software_genre ,Hidden Markov model ,computer - Abstract
This paper presents a cross-language development method for speech recognition and synthesis applications for the Macedonian language. A unified system for speech recognition and synthesis trained on German language data was used for acoustic model bootstrapping and adaptation. Both knowledge-based and data-driven approaches to source and target language phoneme mapping were used for initial transcription and labeling of a small amount of recorded speech. The recognition experiments on the source language acoustic model with the target language dataset showed significant degradation in recognition performance. Acceptable performance was achieved after maximum a posteriori (MAP) model adaptation with a limited amount of target language data, making the model suitable for small to medium vocabulary speech recognition applications. The same unified system was used again to train a new, separate acoustic model for HMM-based synthesis. Qualitative analysis showed that, despite the low quality of the available recordings and sub-optimal phoneme mapping, HMM synthesis produces perceptually good and intelligible synthetic speech.
- Published
- 2013
25. An Approach to Intelligent Signal Processing
- Author
-
Rüdiger Hoffmann and Matthias Wolff
- Subjects
Signal processing ,Finite-state machine ,Unification ,Computer science ,business.industry ,SIGNAL (programming language) ,computer.software_genre ,Multidimensional signal processing ,Artificial intelligence ,Architecture ,business ,Hidden Markov model ,Audio signal processing ,computer - Abstract
This paper describes an approach to intelligent signal processing. First we propose a general signal model which applies to speech, music, biological, and technical signals. We formulate this model mathematically using a unification of hidden Markov models and finite state machines. Then we name tasks for intelligent signal processing systems and derive a hierarchical architecture which is capable of solving them. We show the close relationship of our approach to cognitive dynamic systems. Finally we give a number of application examples.
- Published
- 2012
26. A new epsilon filter for efficient composition of weighted finite-state transducers
- Author
-
Matthias Wolff, Frank Duckhorn, and Rüdiger Hoffmann
- Subjects
Transducer ,Computer science ,Filter (video) ,Acoustics ,Finite state ,Composition (combinatorics) - Published
- 2011
27. Food intake recognition conception for wearable devices
- Author
-
Wolf-Joachim Fischer, Matthias Wolff, and Sebastian Päßler
- Subjects
Human food ,Hearing aid ,Food intake ,Computer science ,Microphone ,business.industry ,medicine.medical_treatment ,digestive, oral, and skin physiology ,Wearable computer ,Human–computer interaction ,otorhinolaryngologic diseases ,medicine ,Recognition algorithm ,Hidden Markov model ,business ,ComputingMilieux_MISCELLANEOUS ,Wearable technology - Abstract
Obesity is a growing healthcare challenge. Objective automated methods of food intake monitoring are necessary to face this challenge in the future. A method for non-invasive monitoring of human food intake behavior by the evaluation of chewing and swallowing sounds has been developed. A wearable food intake sensor has been created by integrating an in-ear microphone and a reference microphone in a hearing aid case. A concept for food intake monitoring requiring low computational cost is presented. After the detection of food intake activity periods, signal recognition algorithms based on Hidden Markov Models distinguish several types of food based on the properties of their chewing sounds. The algorithms are developed using manually labeled records of the food intake sounds of 40 participants.
- Published
- 2011
28. Speech synthesis using HMM based diphone inventory encoding for low-resource devices
- Author
-
Matthias Wolff and Guntram Strecha
- Subjects
Computer science ,business.industry ,Speech recognition ,Speech coding ,Codebook ,Pattern recognition ,Speech synthesis ,Diphone ,computer.software_genre ,Encoding (memory) ,Artificial intelligence ,Concatenative synthesis ,Hidden Markov model ,business ,computer ,Data compression - Abstract
In this paper we describe the compression of diphone inventories used by the acoustic synthesis of a concatenative synthesis system. The inventory compression is based on a codebook drawn from the Gaussian mean vectors of phoneme HMMs. There are two encoding/synthesis schemes, a speaker dependent and a speaker independent one. The advantage of the latter is the potential common use of the HM-models by a recognizer and a synthesizer. We describe the steps to encode the inventories as well as the acoustic synthesis using them. Using the proposed method a diphone inventory with 1175 units can be compressed down to 19 kB. We will show that the synthesis quality with HMM-encoded inventories matches the quality of synthesis with AMR- or SPEEX-encoded inventories at noticeably smaller inventory sizes.
- Published
- 2011
29. Pattern recognition for sensor signals
- Author
-
Matthias Wolff and Constanze Tschöpe
- Subjects
Support vector machine ,Signal processing ,Syntax (programming languages) ,Relation (database) ,Computer science ,business.industry ,Pattern recognition (psychology) ,Pattern recognition ,Artificial intelligence ,Object (computer science) ,business ,Hidden Markov model ,Signal - Abstract
In this paper we propose a universal strategy for the automatic interpretation of sensor signals. We focus on acoustic signals; however, any time series may be used. We assume that changes in an object's state cause a typical and reproducible change in the characteristics of the acquired sensor signal. In such cases we can train pattern recognizers based on Hidden Markov Models or Support Vector Machines with data recordings of different object states and use these classifiers to assess the state of identical or similar objects. Our approach assumes that the sensor signals consist of elementary signal events and some kind of syntax defining their temporal relation (much like a musical score defines the temporal relation between notes). It is capable of automatically determining both the elementary events and their syntax from the training data. We present experimental results from seven different applications from the fields of non-destructive testing, bio signal processing, and music signal processing.
- Published
- 2009
30. Fuzzy Multiscale Region Growing for Segmentation of MR Images of the Human Brain
- Author
-
André Brechmann, Anja Perlich, Frederik Maucksch, Matthias Wolff, Karin Engel, and Klaus D. Toennies
- Subjects
business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Human brain ,Fuzzy logic ,Software ,medicine.anatomical_structure ,Neuroimaging ,Multi resolution ,Region growing ,medicine ,Computer vision ,Segmentation ,Artificial intelligence ,Mr images ,business - Abstract
We propose an automatic region growing technique for the segmentation of the cerebral cortex and white matter in MRI data. Our method exploits general anatomical knowledge and uses an iterative multi resolution scheme for the estimation of intensity distributions to compensate for artifacts within the data. We present a comparison to segmentation results created by the neuroimaging software Brainvoyager QX and show advantages of our approach based on a qualitative and quantitative evaluation.
- Published
- 2009
31. A hybrid speech signal based algorithm for pitch marking using finite state machines
- Author
-
Oliver Jokisch, Rüdiger Hoffmann, Matthias Wolff, Hussein Hussein, Frank Duckhorn, and Guntram Strecha
- Subjects
Task (computing) ,Finite-state machine ,Computer science ,Speech recognition ,Stage (hydrology) ,Speech processing ,Algorithm ,Signal ,Selection (genetic algorithm) - Abstract
Pitch marking is a major task in speech processing. Thus, an accurate detection of pitch marks (PM) is required. In this paper, we propose a hybrid method for pitch marking that combines the outputs of two different speech signal based pitch marking algorithms (PMA). We use a finite state machine (FSM) to represent and combine the pitch marks. The hybrid PMA is implemented in four stages: preprocessing, alignment, selection and postprocessing. In the alignment stage, the preprocessed pitch marks are shifted to a local minimum of the speech signal and the confidence score for every pitch mark is calculated. The confidence scores are used as transition weights for the FSM. The PMA outputs are combined into a single sequence of pitch marks. The more accurate pitch marks with the highest confidence score are chosen in the selection stage. A PM reference database contains 10 minutes of speech including manually adjusted PM. The evaluation results indicate that the proposed hybrid method outperforms not only the single PMAs but also other current state-of-the-art algorithms, which have been evaluated on a second reference database containing 44 speakers.
- Published
- 2008
32. Experiments in acoustic structural health monitoring of airplane parts
- Author
-
R. Hoffmann, R. Schubert, H. Neunubel, Matthias Wolff, E. Schulze, and Constanze Tschöpe
- Subjects
business.product_category ,Computer science ,Speech recognition ,Acoustics ,Carbon fibers ,Fibre-reinforced plastic ,Airplane ,Support vector machine ,visual_art ,visual_art.visual_art_medium ,Ultrasonic sensor ,Structural health monitoring ,Hidden Markov model ,business ,Focus (optics) ,Structural acoustics - Abstract
In this preliminary study we investigate the application of statistical classifiers for structural health monitoring of materials commonly used in airplanes. Our approach is based on the propagation of ultrasonic guided waves through materials like aluminum or carbon fiber reinforced plastic (CFRP). When the material gets damaged, the sound propagation changes. There are two ways of detecting these changes: we can use a physical model of the wave propagation or we can use a statistical approach. In this paper we focus on the latter. We present results using classifiers based on Hidden Markov Models (HMM) and Support Vector Machines (SVM). We compare these results to acoustic travel time tomography as a representative of the physical model based methods.
- Published
- 2008
33. Analysis of Verbal and Nonverbal Acoustic Signals with the Dresden UASR System
- Author
-
Matthias Wolff, Rüdiger Hoffmann, and Matthias Eichner
- Subjects
Structure (mathematical logic) ,Finite-state machine ,Computer science ,business.industry ,Speech recognition ,Speech synthesis ,computer.software_genre ,Speech processing ,Field (computer science) ,Focus (linguistics) ,Nonverbal communication ,Artificial intelligence ,business ,computer ,Analysis method ,Natural language processing - Abstract
During the last few years, a framework for the development of algorithms for speech analysis and synthesis was implemented. The algorithms are connected to common databases on the different levels of a hierarchical structure. This framework which is called UASR (Unified Approach for Speech Synthesis and Recognition) and some related experiments and applications are described. Special focus is directed to the suitability of the system for processing nonverbal signals. This part is related to the analysis methods which are addressed in the COST 2102 initiative now. A potential application field in interaction research is discussed.
- Published
- 2007
34. Automatic Decision Making in SHM Using Hidden Markov Models
- Author
-
Constanze Tschöpe and Matthias Wolff
- Subjects
Structure (mathematical logic) ,Signal classification ,Empirical research ,Knowledge management ,business.industry ,Computer science ,E-learning (theory) ,Distance education ,Continuing education ,Usability ,business ,Popularity ,Social psychology - Abstract
The institutional effect is used to explain and predict MBA students' decisions to adopt e-learning in their studies. A composite model including four constructs, namely, perceived usefulness, perceived ease of use, sense of popularity and convenience, was formed and tested in the empirical study. This study found that "normatively appropriate" and "taken-for-granted structure" of institutions were the most important principles for adult students when making e-learning decisions.
- Published
- 2007
35. Elastic lists for facet browsers
- Author
-
Constanze Tschöpe and Matthias Wolff
- Subjects
Information management ,Sequence ,Computer science ,business.industry ,Condition monitoring ,Machine learning ,computer.software_genre ,Field (computer science) ,Null (SQL) ,Artificial intelligence ,Structural health monitoring ,Data mining ,Explicit knowledge ,Hidden Markov model ,business ,computer - Abstract
Decision making and classification methods are very important in the field of structural health monitoring and life cycle prediction. We introduce an approach based on sequence classifiers which can be applied to several applications without any explicit knowledge of structures. To illustrate the concept, we explain the method by means of a specific example. This allows us to demonstrate our approach in detail without being too abstract.
- Published
- 2007
36. The harming part of room acoustics in automatic speech recognition
- Author
-
Rüdiger Hoffmann, Matthias Wolff, Rico Petrick, and Kevin Lohde
- Subjects
Computer science ,Speech recognition ,Room acoustics - Published
- 2007
37. Auscultatory Blood Pressure Measurement using HMMs
- Author
-
R. Hoffmann, H. Hussein, Ulrich Kordon, Constanze Tschöpe, Matthias Wolff, and Matthias Eichner
- Subjects
Signal processing ,Blood pressure ,Stethoscope ,Computer science ,law ,Noise (signal processing) ,Speech recognition ,Korotkoff sounds ,law.invention - Abstract
This paper reports on a study of applying an HMM-based labeler along with a tailored feature extraction to Korotkoff sounds. These sounds can be heard through a stethoscope during the auscultatory blood pressure measurement usually done at medical practices. While this method works well when the patient is at rest, interfering noise from muscles and joints causes major problems when the subject is engaged in activities such as sports or fitness exercises. We propose a signal processing and classification method to overcome these difficulties and present first promising results.
- Published
- 2007
38. Pronunciation Variant Selection for Spontaneous Speech Synthesis-Listening Effort As a Quality Parameter
- Author
-
R. Hoffmann, Matthias Wolff, and S. Werner
- Subjects
Computer science ,Speech recognition ,Grapheme ,Active listening ,Speech synthesis ,Pronunciation ,Intelligibility (communication) ,computer.software_genre ,computer ,Spontaneous speech - Abstract
In previous works (see for instance S. Werner et al. (2004)) we introduced different duration control methods in speech synthesis. The most outstanding approach is to control the grapheme to phoneme conversion (and thus indirectly control the speaking rate) by selecting (reduced) pronunciation variants according to a pronunciation variant sequence model. Listeners would only accept long synthesized utterances if the listening effort is nearly the same as when listening to natural speech. To evaluate the quality of the variant synthesis compared to the canonical one (as the state-of-the-art system), we performed a listening test with two different synthesis systems. The variant synthesis applying a pronunciation variant sequence model achieved a significantly lower listening effort and a higher overall rating (MOS) compared to the canonical synthesis. We also show that the listening effort can act as a quality parameter for a speech sample. The rating for the listening effort is correlated with the rating of the naturalness and intelligibility of the synthesized speech sample.
- Published
- 2006
39. Voice characteristics conversion for TTS using reverse VTLN
- Author
-
Matthias Wolff, Matthias Eichner, and R. Hoffmann
- Subjects
Normalization (statistics) ,Signal processing ,ComputingMethodologies_PATTERNRECOGNITION ,Voice activity detection ,Computer science ,Speech recognition ,Speech synthesis ,Loudspeaker ,Speaker recognition ,computer.software_genre ,computer ,Vocal tract ,Voice analysis - Abstract
In the past, several approaches have been proposed for voice conversion in TTS systems. Mostly, conversion is done by modification of the spectral properties and pitch to match a certain target voice. This conversion causes distortions that deteriorate the quality of the synthesized speech. In this paper we investigate a very simple and straightforward method for voice conversion. It generates a new voice from the source speaker instead of generating a certain target speaker's voice. For application in TTS systems it is often sufficient to synthesize new voices that sound sufficiently different to be distinguishable from each other. This is done by applying a spectral warping technique that is commonly used for speaker normalization in speech recognition systems called vocal tract length normalization (VTLN). Due to the low requirements of resources this method is especially suited for embedded systems.
- Published
- 2004
40. Modeling pronunciation variation for spontaneous speech synthesis
- Author
-
Matthias Wolff, R. Hoffmann, Matthias Eichner, and S. Werner
- Subjects
business.industry ,Computer science ,Speech recognition ,Speech synthesis ,Context (language use) ,Pronunciation ,computer.software_genre ,Variation (linguistics) ,Natural (music) ,Active listening ,Artificial intelligence ,Control (linguistics) ,business ,computer ,Natural language processing ,Spontaneous speech - Abstract
Integration of pronunciation modeling into speech synthesis makes synthetic speech more natural and colloquial. Modeling pronunciation variation, as one observable effect in spontaneous speech, is a step towards spontaneous speech synthesis. In previous works (see Proc. ICASSP, vol.1, p.417-20, Orlando, FL, USA, 2002 and Proc. ICASSP, Hong Kong, PR China, Apr. 2003) we introduced different duration control methods in speech synthesis. These methods are based on the observation that words which are very likely to occur in a given context are pronounced faster and less accurately than improbable ones (see D. Jurafsky et al., Proc. ICASSP, vol.2, p.801-4, Salt Lake City, USA, 2001). Therefore we use the probability of a word in its context either to control the local speaking rate directly or to select appropriate pronunciation variants in order to change the local speaking rate. Extending these methods by a pronunciation sequence model, we incorporate knowledge about how well two subsequent variants fit together. Using this proposed algorithm we could further improve the natural and colloquial listening impressions.
- Published
- 2004
41. Speech-Enabled Services in a Web-based e-Learning Environment
- Author
-
Matthias Wolff, R. Hoffmann, Matthias Eichner, M. Göcks, and M. Kühne
- Subjects
Web standards ,Web 2.0 ,Web development ,Computer science ,business.industry ,Educational technology ,Services computing ,Web application security ,Atomic and Molecular Physics, and Optics ,World Wide Web ,Web application ,Electrical and Electronic Engineering ,business ,WS-Policy - Published
- 2004
42. Voice activation using prosodic features
- Author
-
Marco Kühne, Matthias Eichner, Rüdiger Hoffmann, and Matthias Wolff
- Subjects
ComputingMethodologies_PATTERNRECOGNITION ,Computer science ,Speech recognition ,Active listening ,Fundamental frequency ,Hidden Markov model ,Word (computer architecture) ,Energy (signal processing) ,Constant false alarm rate - Abstract
In this paper we propose a voice activation method based on prosodic keyword verification. In current voice activation systems features like the fundamental frequency contour have not been considered so far. Normally a continuous listening word spotter is used to detect a certain predefined keyword. We conducted an experiment which shows that people emphasize this keyword when they address a recognizer. To capture the prosodic information we trained an HMM on the fundamental frequency and energy contour of the keyword. The prosodic model is used to verify the keyword hypotheses of a phonetic recognizer. We investigated the performance of the prosodic model to distinguish between the same keyword spoken in command and non-command phrases. The introduction of the prosodic information significantly reduced the false alarm rate whereas the detection rate was only slightly degraded.
- Published
- 2004
43. Classification of non-speech acoustic signals using structure models
- Author
-
R. Hoffmann, Matthias Eichner, Matthias Wolff, Constanze Tschöpe, and D. Hentschel
- Subjects
Markov chain ,business.industry ,Computer science ,Speech recognition ,Feature extraction ,Markov process ,Condition monitoring ,Pattern recognition ,Class discrimination ,Field (computer science) ,Adaptive filter ,symbols.namesake ,Computer Science::Sound ,symbols ,Artificial intelligence ,business ,Hidden Markov model - Abstract
Non-speech acoustic signals are widely used as the input of systems for non-destructive testing. In this rapidly growing field, the signals have an increasing complexity leading to the fact that powerful models are required. Methods like DTW and HMM, which are established in speech recognition, have been successfully used but are not sufficient in all cases. We propose the application of generalized structured Markov graphs (SMG). We describe a task independent structure learning technique which automatically adapts the models to the structure of the test signals. We demonstrate that our solution outperforms hand-tuned HMM structures in terms of class discrimination by two case studies using data from real applications.
- Published
- 2004
44. Integrating speech enabled services in a Web-based e-learning environment
- Author
-
R. Hoffmann, S. Werner, Matthias Eichner, and Matthias Wolff
- Subjects
Computer science ,business.industry ,Speech technology ,Speech synthesis ,Static web page ,Dynamic web page ,computer.software_genre ,World Wide Web ,Web page ,Web application ,Web service ,business ,computer ,Java applet - Abstract
We investigate the deployment possibilities of speech enabled services in a Web based e-learning environment. The integration of speech technology is realized with a client/server architecture. Therefore, the services speech synthesis, speech recognition, and speaker verification are installed at a central SpeechServer. The client uses a Java applet (SpeechApplet), which is integrated in an HTML page. It takes the user's input (e.g. speech or text input) and activates the according service at the SpeechServer. The SpeechApplet is easy to integrate into existing Web pages and aims at a simple JavaScript interface for communication between the Web page and the applet. In this paper we introduce this system, explain different modules, and discuss first evaluation results of these technologies.
- Published
- 2004
45. Towards spontaneous speech synthesis - LM based selection of pronunciation variants
- Author
-
Matthias Eichner, S. Werner, Matthias Wolff, and R. Hoffmann
- Subjects
Cued speech ,Motor theory of speech perception ,Speech production ,Computer science ,business.industry ,Speech recognition ,Chinese speech synthesis ,Speech technology ,Speech corpus ,Speech synthesis ,Intelligibility (communication) ,Pronunciation ,computer.software_genre ,Speech shadowing ,Artificial intelligence ,business ,computer ,Utterance ,Natural language processing ,Speech error - Abstract
State-of-the-art speech synthesis systems achieve a high overall quality. However, the synthesized speech still lacks naturalness. To make speech synthesis more natural and colloquial we are trying to integrate effects that are observable in spontaneous speech. In a previous paper we introduced a new approach for duration control in speech synthesis that uses the probability of a word in its context to control the local speaking rate within the utterance. This idea is based on the observation that words that are very likely to occur in a given context are pronounced faster than improbable ones. Since probable words are not only pronounced faster but also less accurately, we extend this approach by selecting appropriate pronunciation variants to realize the change in the local speaking rate.
- Published
- 2003
46. Speech synthesis using stochastic Markov graphs
- Author
-
Matthias Eichner, R. Hoffmann, S. Ohnewald, and Matthias Wolff
- Subjects
symbols.namesake ,Markov chain ,Computer science ,Speech recognition ,symbols ,Markov process ,Speech synthesis ,Intelligibility (communication) ,Markov model ,computer.software_genre ,computer - Abstract
Speech synthesis systems based on concatenation of natural speech segments achieve a high quality in terms of naturalness and intelligibility. However, in many applications such systems are not easy to apply because of the huge demand for storage capacity. Speech synthesis systems based on HMMs could be an alternative to concatenative speech synthesis systems but do not yet achieve the quality needed for use in applications. In one of our research projects we investigate the possibility of combining speech synthesis and speech recognition into a unified system using the same databases and similar algorithms for synthesis and recognition. In this context we examine the suitability of stochastic Markov graphs (SMGs) instead of HMMs to improve the performance of such synthesis systems. The paper describes the training procedure we used to train the SMGs, explains the synthesis process and introduces an algorithm for state selection and state duration modeling. We focus particularly on issues which arise when using SMGs instead of HMMs.
- Published
- 2002
47. Automatic learning of numeral grammars for multi-lingual speech synthesizers
- Author
-
A. Wachtler, G. Flach, M. Holzapfel, Matthias Wolff, and C. Just
- Subjects
Grammar ,Computer science ,business.industry ,media_common.quotation_subject ,Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing) ,Graph theory ,computer.software_genre ,Unary numeral system ,Tree-adjoining grammar ,Numeral system ,Rule-based machine translation ,Graph (abstract data type) ,Artificial intelligence ,L-attributed grammar ,business ,computer ,Natural language ,Natural language processing ,media_common - Abstract
Presented is a trainable data-driven method of deriving numerals from number strings. The concept is based on learning a graph model for numeral grammars and a graph search which is capable of extracting numeral words from the grammar according to given number strings. Because this method separates code and data it is universal and applicable to every language. No a priori knowledge about formal descriptions of numeral grammars is required. Due to the underlying graph concept, the algorithm is able to automatically generate grammars for number to numeral translations with a minimal effort. The only required input information is a set of pairs of number strings and appropriate numerals. This method is useful for an easy implementation of knowledge bases for further languages in the framework of a multi-lingual speech synthesis system.
- Published
- 2002
48. Data-driven generation of pronunciation dictionaries in the German Verbmobil project: discussion of experimental results
- Author
-
Matthias Eichner and Matthias Wolff
- Subjects
Computer science ,business.industry ,Speech recognition ,Realization (linguistics) ,Orthographic transcription ,Pronunciation ,computer.software_genre ,language.human_language ,Data-driven ,German ,language ,Artificial intelligence ,business ,computer ,Word (computer architecture) ,Natural language ,Natural language processing - Abstract
In the framework of the German Verbmobil project we developed a procedure for the automatic, data-driven generation of pronunciation dictionaries for speech recognition systems. In most recognizers, only simple dictionaries containing the canonical pronunciation form are used. They represent the correct pronunciation, but in most cases the canonical pronunciation does not match the actual realization of the word. To solve this problem we chose an approach to derive pronunciation variants automatically from a speech database. The training algorithm is based on a canonical dictionary which is compiled into a graph representation in a first stage. Pronunciation variants are then learned from a training sample consisting of speech signal and its orthographic transcription. The authors focus on the experimental results obtained in the Verbmobil framework and introduce methods to evaluate pronunciation dictionaries generated by the training procedure.
- Published
- 2002
49. Framework design and implementation of Web-based tutorials in spoken language engineering
- Author
-
Matthias Wolff and R. Hoffmann
- Subjects
Multimedia ,Computer science ,business.industry ,Speech technology ,Educational technology ,Speech corpus ,Speech synthesis ,computer.software_genre ,Speech processing ,Web application ,Speech analytics ,business ,computer ,Spoken language - Abstract
Education in spoken language engineering may be supported very effectively by Web based methods. The paper describes at first some problems which arise if Web based tutorials are to feature online speech input and interactive usage. A framework is defined which is aimed at handling these problems. The architecture and the interfaces are explained in more detail. Finally, we describe a tutorial on speech synthesis which was designed using this framework.
- Published
- 2002
50. Data Driven Generation of Pronunciation Dictionaries
- Author
-
Matthias Wolff, Matthias Eichner, and Rüdiger Hoffmann
- Subjects
business.industry ,Computer science ,Speech recognition ,SIGNAL (programming language) ,Realization (linguistics) ,Pronunciation ,Orthographic transcription ,computer.software_genre ,Focus (linguistics) ,Data-driven ,Artificial intelligence ,business ,computer ,Word (computer architecture) ,Factor graph ,Natural language processing - Abstract
In the framework of the German Verbmobil project we developed a procedure for the automatic, data-driven generation of pronunciation dictionaries for speech recognition systems. In most recognizers only simple dictionaries containing the canonical pronunciation form are used. They represent the correct pronunciation, but in most cases the canonical pronunciation does not match the actual realization of the word. To solve this problem we chose an approach to derive pronunciation variants automatically from a speech database. The training algorithm is based on a canonical dictionary which is compiled into a graph representation in a first stage. Pronunciation variants are then learned from a training sample consisting of a speech signal and its orthographic transcription. In this paper we focus on the experimental results obtained in the Verbmobil framework and introduce methods to evaluate pronunciation dictionaries generated by the training procedure.
- Published
- 2000