Author: "Santitham Prom-On" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Santitham Prom-On"' showing total 92 results

1. Estimating Underlying Articulatory Targets of Thai Vowels by Using Deep Learning Based on Generating Synthetic Samples From a 3D Vocal Tract Model and Data Augmentation

Author: Thanat Lapthawan, Santitham Prom-On, Peter Birkholz, and Yi Xu
Subjects: Acoustic-to-articulatory inversion, articulatory model, articulatory target acquisition, deep learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Representation learning is one of the fundamental issues in modeling articulatory-based speech synthesis using target-driven models. This paper proposes a computational strategy for learning underlying articulatory targets from a 3D articulatory speech synthesis model using a bi-directional long short-term memory recurrent neural network based on a small set of representative seed samples. Using a seeding set from VocalTractLab, a larger training set was generated that provided richer contextual variations for the model to learn. The deep learning model for acoustic-to-target mapping was then trained to model the inverse relation of the articulation process. This method allows the trained model to map the given acoustic data onto the articulatory target parameters which can then be used to identify the distribution based on linguistic contexts. The model was evaluated based on its effectiveness in mapping acoustics to articulation, and the perceptual accuracy of speech reproduced from the articulation estimated from the recorded speech by native Thai speakers. The model achieved more than 80% phoneme classification accuracy in the listening test conducted with 25 native Thai speakers. The results indicate that the model can accurately imitate speech with a high degree of phonemic precision.
Published: 2022

2. Trim Loss Optimization in Paper Production Using Reinforcement Artificial Bee Colony

Author: Suthida Fairee, Charoenchai Khompatraporn, Booncharoen Sirinaovakul, and Santitham Prom-On
Subjects: Stock cutting, optimization, swarm intelligence, artificial bee colony algorithm, pulp and paper industry, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: In paper production, a jumbo reel is cut into multiple intermediate rolls, and each intermediate roll is then sheeted as finished goods. This problem is called a cutting stock problem and is proven to be NP-hard. The objective is to minimize material waste or trim loss from all the cuttings. If an intermediate roll is not entirely used for its associated order, the roll itself could become dead stock. We use the concept of universal sizes of intermediate rolls to eliminate the dead stock. A pre-defined number of universal sizes of intermediate rolls is used to serve all the orders. The problem is solved using the Reinforcement Artificial Bee Colony algorithm with an Integer Linear Programming subroutine. The proposed approach is then tested with a set of 1,055 orders and 127 different sizes of sheet papers from a paper manufacturer. The results reveal that our method outperforms other algorithms. Our method achieves a total trim loss of 3.51%, compared to the trim loss reported by the industry of at least 5%. This approach not only reduces the number of partially cut rolls, but also decreases the number of jumbo reels needed to serve all the orders. Therefore, both the inventory cost and material cost can be reduced.
Published: 2020

3. Corrigendum: Economy of Effort or Maximum Rate of Information? Exploring Basic Principles of Articulatory Dynamics

Author: Yi Xu and Santitham Prom-on
Subjects: maximum rate of information, economy of effort, stiffness, peak velocity, target approximation, Psychology, BF1-990
Published: 2020

4. Economy of Effort or Maximum Rate of Information? Exploring Basic Principles of Articulatory Dynamics

Author: Yi Xu and Santitham Prom-on
Subjects: maximum rate of information, economy of effort, stiffness, peak velocity, target approximation, Psychology, BF1-990
Abstract: Economy of effort, a popular notion in contemporary speech research, predicts that dynamic extremes such as the maximum speed of articulatory movement are avoided as much as possible and that approaching the dynamic extremes is necessary only when there is a need to enhance linguistic contrast, as in the case of stress or clear speech. Empirical data, however, do not always support these predictions. In the present study, we considered an alternative principle: maximum rate of information, which assumes that speech dynamics are ultimately driven by the pressure to transmit information as quickly and accurately as possible. For empirical data, we asked speakers of American English to produce repetitive syllable sequences such as wawawawawa as fast as possible by imitating recordings of the same sequences that had been artificially accelerated and to produce meaningful sentences containing the same syllables at normal and fast speaking rates. Analysis of formant trajectories shows that dynamic extremes in meaningful speech sometimes even exceeded those in the nonsense syllable sequences but that this happened more often in unstressed syllables than in stressed syllables. We then used a target approximation model based on a mass-spring system of varying orders to simulate the formant kinematics. The results show that the kind of formant kinematics found in the present study and in previous studies can only be generated by a dynamical system operating with maximal muscular force under strong time pressure and that the dynamics of this operation may hold the solution to the long-standing enigma of greater stiffness in unstressed than in stressed syllables. We conclude, therefore, that maximum rate of information can coherently explain both current and previous empirical data and could therefore be a fundamental principle of motor control in speech production.
Published: 2019

5. Reinforcement learning for solution updating in Artificial Bee Colony.

Author: Suthida Fairee, Santitham Prom-On, and Booncharoen Sirinaovakul
Subjects: Medicine, Science
Abstract: In the Artificial Bee Colony (ABC) algorithm, the employed bee and onlooker bee phases update candidate solutions by changing a value in one dimension, a procedure dubbed the one-dimension update process. For problems in which the number of dimensions is very high, the one-dimension update process can degrade solution quality and convergence speed. This paper proposes a new algorithm, using reinforcement learning for solution updating in the ABC algorithm, called R-ABC. After an employed bee updates a solution, the new solution results in positive or negative reinforcement applied to the solution dimensions in the onlooker bee phase. Positive reinforcement is given when the candidate solution from the employed bee phase provides a better fitness value. The more often a dimension provides a better fitness value when changed, the higher the value of update becomes in the onlooker bee phase. Conversely, negative reinforcement is given when the candidate solution does not provide a better fitness value. The performance of the proposed algorithm is assessed on eight basic numerical benchmark functions in four categories with 100, 500, 700, and 900 dimensions, seven CEC2005 shifted functions with 100, 500, 700, and 900 dimensions, and six CEC2014 hybrid functions with 100 dimensions. The results show that the proposed algorithm provides solutions that are significantly better than those of all other algorithms for all tested dimensions on the basic benchmark functions. On the CEC2005 shifted functions, the number of solutions provided by the R-ABC algorithm that are significantly better than those of other algorithms increases as the number of dimensions increases. The R-ABC algorithm is at least comparable to the state-of-the-art ABC variants on the CEC2014 hybrid functions.
Published: 2018

6. Biomarkers for Refractory Lupus Nephritis: A Microarray Study of Kidney Tissue

Author: Thitima Benjachat, Pumipat Tongyoo, Pornpen Tantivitayakul, Poorichaya Somparn, Nattiya Hirankarn, Santitham Prom-On, Prapaporn Pisitkun, Asada Leelahavanichkul, Yingyos Avihingsanon, and Natavudh Townamchai
Subjects: lupus nephritis, biomarker, microarrays, gene expression, chronic kidney disease, Biology (General), QH301-705.5, Chemistry, QD1-999
Abstract: The prognosis of severe lupus nephritis (LN) is very different among individual patients. None of the current biomarkers can be used to predict the development of refractory LN. Because kidney histology is the gold standard for diagnosing LN, the authors hypothesize that molecular signatures detected in kidney biopsy tissue may have predictive value in determining the therapeutic response. Sixty-seven patients with biopsy-proven severely active LN by International Society of Nephrology/Renal Pathology Society (ISN/RPS) classification III/IV were recruited. Twenty-three kidney tissue samples were used for RNA microarray analysis, while the remaining 44 samples were used for validation by real-time polymerase chain reaction (PCR) gene expression analysis. From hundreds of differential gene expressions in refractory LN, 12 candidates were selected for validation based on gene expression levels as well as relevant functions. The candidate biomarkers were members of the innate immune response molecules, adhesion molecules, calcium-binding receptors, and paracellular tight junction proteins. S100A8, ANXA13, CLDN19 and FAM46B were identified as the best kidney biomarkers for refractory LN, and COL8A1 was identified as the best marker for early loss of kidney function. These new molecular markers can be used to predict refractory LN and may eventually lead to novel molecular targets for therapy.
Published: 2015

7. Thai Question Text-To-SQL Parsing Using Transformer.

Author: Natthawat Tungruethaipak and Santitham Prom-on
Published: 2024

8. Development of the Topic Tagging System for Thai and English-Translated Web Contents.

Author: Thitiworada Amsa-nguan, Nawakarn Leerattanachote, Ponlawat Suparat, Piyanit Wepulanon, and Santitham Prom-on
Published: 2024

9. Waterline Detection and Water Level Estimation Based on HED Edge Detection.

Author: Punn Kiriwong, Wipada Glahan, Jidapa Thongnirun, Supakorn Siddhichai, Kitti Koonsanit, Phitchakorn Watcharanurak, Santitham Prom-on, and Kharittha Jangsamsi
Published: 2024

10. Z-coref: Thai Coreference and Zero Pronoun Resolution.

Author: Poomphob Suwannapichat, Sansiri Tarnpradab, and Santitham Prom-on
Published: 2024

11. Model-Based Exploration of Linking Between Vowel Articulatory Space and Acoustic Space.

Author: Anqi Xu, Daniel R. van Niekerk, Branislav Gerazov, Paul Konstantin Krug, Santitham Prom-on, Peter Birkholz, and Yi Xu
Published: 2021

12. Topic Modeling Enhancement using Word Embeddings.

Author: Siriwat Limwattana and Santitham Prom-on
Published: 2021

13. Evoc-Learn - High quality simulation of early vocal learning.

Author: Yi Xu, Anqi Xu, Daniel R. van Niekerk, Branislav Gerazov, Peter Birkholz, Paul Konstantin Krug, Santitham Prom-on, and Lorna F. Halliday
Published: 2022

14. Effects of Facial Movements to Expressive Speech Productions: A Computational Study.

Author: Santitham Prom-on and Metita Onsri
Published: 2019

15. Acoustic-to-Articulatory Inversion of a Three-dimensional Theoretical Vocal Tract Model Using Deep Learning-based Model.

Author: Thanat Lapthawan and Santitham Prom-on
Published: 2019

16. Simulating vocal learning of spoken language: Beyond imitation

Author: Daniel R. van Niekerk, Anqi Xu, Branislav Gerazov, Paul K. Krug, Peter Birkholz, Lorna Halliday, Santitham Prom-on, and Yi Xu
Subjects: Linguistics and Language, Communication, Modeling and Simulation, Computer Vision and Pattern Recognition, Language and Linguistics, Software, Computer Science Applications
Published: 2023

17. Data exchange protocol for healthcare service in Thailand.

Author: Ruedeemart Jessadapattharakul, Santitham Prom-on, Chularat Tanprasert, and Tiranee Achalakul
Published: 2015

18. Estimating vocal tract shapes of Thai vowels from contextual vowel variation.

Author: Santitham Prom-on, Peter Birkholz, and Yi Xu
Published: 2014

19. DOM: A big data analytics framework for mining Thai public opinions.

Author: Santitham Prom-on, Sirapop Na Ranong, Patcharaporn Jenviriyakul, Thepparit Wongkaew, Nareerat Saetiew, and Tiranee Achalakul
Published: 2014

20. Mora-based pre-low raising in Japanese pitch accent.

Author: Albert Lee, Yi Xu, and Santitham Prom-on
Published: 2013

21. Training an articulatory synthesizer with continuous acoustic data.

Author: Santitham Prom-on, Peter Birkholz, and Yi Xu
Published: 2013

22. Simulating Post-L F0 Bouncing by Modeling Articulatory Dynamics.

Author: Santitham Prom-on, Yi Xu, and Fang Liu
Published: 2011

23. Pathway-Based Microarray Analysis with Negatively Correlated Feature Sets for Disease Classification.

Author: Pitak Sootanan, Asawin Meechai, Santitham Prom-on, and Jonathan Hoyin Chan
Published: 2011

24. Modelling Extreme Tonal Reduction in Taiwan Mandarin Based on Target Approximation.

Author: Chierh Cheng, Yi Xu, and Santitham Prom-on
Published: 2011

25. Functional Modeling of Tone, Focus and Sentence Type in Mandarin Chinese.

Author: Santitham Prom-on, Fang Liu, and Yi Xu
Published: 2011

26. Articulatory-functional modeling of speech prosody: a review.

Author: Yi Xu and Santitham Prom-on
Published: 2010

27. Microarray-Based Disease Classification Using Pathway Activities with Negatively Correlated Feature Sets.

Author: Pitak Sootanan, Santitham Prom-on, Asawin Meechai, and Jonathan Hoyin Chan
Published: 2010

28. Unsupervised Algorithms for Population Classification and Ancestry Informative Marker Selection.

Author: Apaporn Rodpan, Pongsakorn Wangkumhang, Anunchai Assawamakin, Santitham Prom-on, and Sissades Tongsima
Published: 2010

29. Pathway Activity Inferences with Negatively Correlated Features for Pancreatic Cancer Classification.

Author: Pitak Sootanan, Santitham Prom-on, Asawin Meechai, and Jonathan Hoyin Chan
Published: 2009

30. Identifying Functional Modules Using MST-Based Weighted Gene Co-Expression Networks.

Author: Atthawut Chanthaphan, Santitham Prom-on, Asawin Meechai, and Jonathan Hoyin Chan
Published: 2009

31. Identifying Disease Susceptible DNA Regions Using Underlying Odds Ratio Contour Analysis.

Author: Santitham Prom-on, Jonathan Hoyin Chan, Asawin Meechai, Wallaya Jongjaroenprasert, and Boonsong Ongphiphadhanakul
Published: 2008

32. Pitch target analysis of Thai tones using quantitative target approximation model and unsupervised clustering.

Author: Santitham Prom-on
Published: 2008

33. Sample Filtering Relief Algorithm: Robust Algorithm for Feature Selection.

Author: Thammakorn Saethang, Santitham Prom-on, Asawin Meechai, and Jonathan Hoyin Chan
Published: 2008

34. Computational modelling of double focus in American English.

Author: Fang Liu, Yi Xu, Santitham Prom-on, and Douglas H. Whalen
Published: 2015

35. Quantitative Target Approximation Model: Simulating Underlying Mechanisms of Tones and Intonations.

Author: Santitham Prom-on, Yi Xu, and Bundit Thipakorn
Published: 2006

36. Pre-low raising in Cantonese and Thai: Effects of speech rate and vowel quantity

Author: Santitham Prom-on, Albert Lee, and Yi Xu
Subjects: Variation (linguistics), Acoustics and Ultrasonics, Arts and Humanities (miscellaneous), Duration (music), Vowel, Realization (linguistics), Audiology, Psychology, Raising (linguistics), Speech rate
Abstract: Although pre-low raising (PLR) has been extensively studied as a type of contextual tonal variation, its underlying mechanism is barely understood. This paper explored the effects of phonetic vs phonological duration on PLR in Cantonese and Thai and examined how speech rate and vowel quantity interact with its realization in these languages, respectively. The results for Cantonese revealed that PLR always occurred before a large falling excursion (i.e., high-low); in other tonal contexts, it was observed more often in faster speech. In the Thai corpus, PLR also occurred before large falling excursions, and there was more PLR in short vowels. These results are discussed in terms of possible accounts of the underlying mechanism of PLR.
Published: 2021

37. Modelling English diphthongs with dynamic articulatory targets

Author: Anqi Xu, Branislav Gerazov, Daniel van Niekerk, Paul Konstantin Krug, Santitham Prom-on, Peter Birkholz, and Yi Xu
Abstract: The nature of English diphthongs has been much disputed. By now, the most influential account argues that diphthongs are phoneme entities rather than vowel combinations. However, mixed results have been reported regarding whether the rate of formant transition is the most reliable attribute in the perception and production of diphthongs. Here, we used computational modelling to explore the underlying forms of diphthongs. We tested the assumption that diphthongs have dynamic articulatory targets by training an articulatory synthesiser with a three-dimensional (3D) vocal tract model to learn English words. An automatic phoneme recogniser was constructed to guide the learning of the diphthongs. Listening experiments by native listeners indicated that the model succeeded in learning highly intelligible diphthongs, providing support for the dynamic target assumption. The modelling approach paves a new way for validating hypotheses of speech perception and production.
Published: 2022

38. Trim Loss Optimization in Paper Production Using Reinforcement Artificial Bee Colony

Author: Santitham Prom-on, Booncharoen Sirinaovakul, Charoenchai Khompatraporn, and Suthida Fairee
Subjects: Mathematical optimization, General Computer Science, Computer science, swarm intelligence, General Engineering, Paper production, Stock cutting, Inventory cost, Trim, Artificial bee colony algorithm, pulp and paper industry, Cutting stock problem, General Materials Science, Electrical engineering. Electronics. Nuclear engineering, Reinforcement, Integer programming, optimization, TK1-9971
Abstract: In paper production, a jumbo reel is cut into multiple intermediate rolls, and each intermediate roll is then sheeted as finished goods. This problem is called a cutting stock problem and is proven to be NP-hard. The objective is to minimize material waste or trim loss from all the cuttings. If an intermediate roll is not entirely used for its associated order, the roll itself could become dead stock. We use the concept of universal sizes of intermediate rolls to eliminate the dead stock. A pre-defined number of universal sizes of intermediate rolls is used to serve all the orders. The problem is solved using the Reinforcement Artificial Bee Colony algorithm with an Integer Linear Programming subroutine. The proposed approach is then tested with a set of 1,055 orders and 127 different sizes of sheet papers from a paper manufacturer. The results reveal that our method outperforms other algorithms. Our method achieves a total trim loss of 3.51%, compared to the trim loss reported by the industry of at least 5%. This approach not only reduces the number of partially cut rolls, but also decreases the number of jumbo reels needed to serve all the orders. Therefore, both the inventory cost and material cost can be reduced.
Published: 2020

39. The PENTA Model: Concepts, Use, and Implications

Author: Yi Xu, Santitham Prom-on, and Fang Liu
Published: 2022

40. Author Response to the Commentary: Multiple Layers of Meanings Can Be Linked to Surface Prosody without Direct Mapping

Author: Yi Xu, Santitham Prom-on, and Fang Liu
Published: 2022

41. Model-Based Exploration of Linking Between Vowel Articulatory Space and Acoustic Space

Author: Santitham Prom-on, Anqi Xu, Paul Konstantin Krug, Daniel van Niekerk, Peter Birkholz, Yi Xu, and Branislav Gerazov
Subjects: Acoustic space, Speech production, Computer science, Perception, Speech recognition, Vowel, British English, Space (commercial competition), Vocal tract
Abstract: While the acoustic vowel space has been extensively studied in previous research, little is known about the high-dimensional articulatory space of vowels. The articulatory imaging techniques are limited to tracking only a few key articulators, leaving the rest of the articulators unmonitored. In the present study, we attempted to develop a detailed articulatory space obtained by training a 3D articulatory synthesizer to learn eleven British English vowels. An analysis-by-synthesis strategy was used to acoustically optimize vocal tract parameters that represent twenty articulatory dimensions. The results show that tongue height and retraction, larynx location and lip roundness are the most perceptually distinctive articulatory dimensions. Yet, even for these dimensions, there is a fair amount of articulatory overlap between vowels, unlike the fine-grained acoustic space. This method opens up the possibility of using modelling to investigate the link between speech production and perception.
Published: 2021

42. Detecting Text Semantic Similarity by Siamese Neural Networks with MaLSTM in Thai Language

Author: Nathaphop Sundarabhogin, Natthawat Tungruethaipak, Santitham Prom-on, and Natkanok Poksappaiboon
Subjects: Artificial neural network, Computer science, Frequently asked questions, Big data, Semantics, Semantic similarity, Similarity (network science), Encyclopedia, Artificial intelligence, Natural language processing
Abstract: This paper proposes a model to detect text semantic similarity in Thai using a Siamese neural network with MaLSTM. Because the intent of a text is varied and hard to analyze, comparing two sentences for semantic similarity is challenging. Our project aims to find similar intent between two questions by comparing frequently asked questions (FAQs) with input questions from customers via the Facebook Messenger of computer engineering at the King Mongkut's University of Technology Thonburi (CPE KMUTT). The data were gathered manually from Pantip, the CPE KMUTT FAQ, and Wikipedia to create a dataset and data corpus. Although the overall performance of the model is not as we expected, the model shows promising results, as it can detect intent similarity between two questions. Results suggest that, with more data, the model should have great potential for finding semantic similarity.
Published: 2021

43. Thai Tokenization with Attention Mechanism

Author: Santitham Prom-on and Jednipat Atiwetsakun
Subjects: Computer science, Deep learning, Lexical analysis, Text segmentation, Context (language use), Visualization, Segmentation, Artificial intelligence, Sentence, Natural language processing, Word (computer architecture)
Abstract: Word segmentation is an important preprocessing step in natural language processing applications, particularly in languages with no demarcation indicators, including Thai. A simple method like dictionary-based segmentation does not consider the context of the sentence. This paper proposes an attention-based deep learning approach for Thai word segmentation. With the help of attention, the model can learn character correlations across the entire sentence without vanishing or exploding gradient problems and tokenize them into word vectors. The goal of this research is to test three different types of attention mechanisms to determine their effectiveness for word tokenization. The visualization of attention for each attention mechanism is also included as an outcome.
Published: 2021

44. The Relations Between Implementation Date of Policies and The Spreading of COVID-19

Author: Narawit Tubtimtoe, Unchalisa Taetragool, Watcharin Sirinaovakul, Thanason Eiamyingsakul, and Santitham Prom-on
Subjects: Public economics, Coronavirus disease 2019 (COVID-19), Computer science, Big data, Pandemic, Public policy, Disease cluster, Variety (cybernetics)
Abstract: COVID-19 is undeniably one of the worst incidents in the 21st century. A wide variety of factors affect the spreading of COVID-19. This paper studies how public policy implementation might affect the onset and the spread of COVID-19 cases. Cluster analysis was employed to identify data patterns associated with the policy implementation profiles. The results suggest that the effectiveness of policy adoption relates to the onset of COVID-19 spreading. This also indicates that the decisions of public administrators were critical in the latter stage of the pandemic situation management.
Published: 2020

45. On the Network and Topological Analyses of Legal Documents using Text Mining Approach

Author: Supawit Somsakul and Santitham Prom-on
Subjects: Structure (mathematical logic), Information retrieval, Text mining, Computer science, Document and text processing, Legal aspects of computing, Network science, Document retrieval
Abstract: This paper presents a computational study of Thai legal documents using a text mining and network analytic approach. The Thai legal system relies heavily on existing judicial rulings. Thus, legal documents contain complex relationships and require careful examination. The objective of this study is to use text mining to model relationships between these legal documents and draw useful insights. The study found a structure of document relationships in the form of a network that reflects meaningful relations between legal documents. This can potentially be developed further into a document retrieval system based on how documents are related in the network.
Published: 2020

46. Corrigendum: Economy of Effort or Maximum Rate of Information? Exploring Basic Principles of Articulatory Dynamics

Author: Santitham Prom-on and Yi Xu
Subjects: maximum rate of information, BF1-990, Dynamics (mechanics), target approximation, stiffness, Correction, peak velocity, Psychology, economy of effort, Statistical physics, General Psychology, Maximum rate
Published: 2020

47. PENTATrainer2: A hypothesis-driven prosody modeling tool

Author: Santitham Prom-on and Yi Xu
Subjects: Scheme (programming language), Parallel encoding, Process (engineering), Computer science, Mandarin Chinese, Annotation, Stochastic optimization, Artificial intelligence, Prosody, Invariant (computer science), Natural language processing
Abstract: Prosody is an essential aspect of speech, as it carries both lexical and non-lexical information. A conventional approach for studying speech prosody is to collect and analyze F0 data based on certain hypotheses and then develop a theory based on the observation as the final conclusion of the study. This process is, however, far from complete, as the developed theory has not actually been tested for its ability to predict actual acoustic data. This paper presents PENTATrainer2, a prosody modeling tool based on the parallel encoding and target approximation framework. PENTATrainer2 can facilitate prosody studies in testing hypotheses and theories against speech data by using an automatic analysis-by-synthesis and stochastic learning algorithm. Users can flexibly design the annotation scheme based on their own hypotheses and determine whether the hypothesized categories can lead to accurate synthetic F0 contours. PENTATrainer2 consists of three main components: multi-layer annotation, target approximation, and stochastic optimization. First, acoustic data are annotated in parallel layers, each of which corresponds to a functional category that may affect F0 contours. These layers are then compiled into unique functional combinations. The combinations represent underlying invariant representations of communicative functions and their interaction with each other. Target approximation parameters of each combination are then learned through analysis-by-synthesis and stochastic optimization. Pilot tests of PENTATrainer2 have been conducted on Thai, Mandarin and English. The results demonstrate not only high accuracy of the synthesized F0 contours but also distinctive contrasts in the distribution of pitch target parameters. This indicates the effectiveness of PENTATrainer2 in modeling speech prosody.
Published: 2019

48. Acoustic-to-Articulatory Inversion of a Three-dimensional Theoretical Vocal Tract Model Using Deep Learning-based Model

Author: Santitham Prom-on and Thanat Lapthawan
Subjects: Articulatory synthesis, Mean squared error, Computer science, Deep learning, Speech recognition, Networking & telecommunications, Engineering and technology, Formant, Vowel, Electrical engineering, electronic engineering, information engineering, Artificial intelligence & image processing, Artificial intelligence, Mel-frequency cepstrum, Sentence, Vocal tract
Abstract: This paper presents an acoustic-to-articulatory mapping of a three-dimensional theoretical vocal tract model using deep learning methods. Prominent deep learning-based network structures are explored and evaluated for their suitability in capturing the relationship between acoustic and articulatory-oriented vocal tract parameters. The dataset was synthesized from VocalTractLab, a three-dimensional theoretical articulatory synthesizer, in the form of pairs of acoustic signals, represented by Mel-frequency cepstral coefficients (MFCCs), and articulatory signals, represented by 23 vocal tract parameters. The sentence structures used in the dataset generation were both monosyllabic and disyllabic vowel articulations. Models were evaluated using the root-mean-square error (RMSE) and R-squared (R2). The deep artificial neural network (DNN) architecture, regularized using batch normalization, achieves the best performance for both inversion tasks: RMSE of 0.015 and R2 of 0.970 for monosyllabic vowels, and RMSE of 0.015 and R2 of 0.975 for disyllabic vowels. A comparison between the formants of sounds produced from the inverted articulatory parameters and those of the original synthesized sounds shows no statistically significant difference between the original and estimated parameters. The results indicate that the deep learning-based model effectively estimates articulatory parameters in the three-dimensional space of a vocal tract model.
Published: 2019

49. Thai Named Entity Recognition Using Bi-LSTM-CRF with Word and Character Representation

Author: Suphanut Thattinaphanich and Santitham Prom-on
Subjects: Artificial neural network, Computer science, Representation (systemics), Networking & telecommunications, Engineering and technology, Character (mathematics), Recurrent neural network, Named-entity recognition, Electrical engineering, electronic engineering, information engineering, Task analysis, Artificial intelligence & image processing, Artificial intelligence, Word (computer architecture), Natural language processing, Sentence
Abstract: Named Entity Recognition (NER) is a useful tool in many natural language processing tasks for identifying and extracting unique entities such as persons, locations, organizations and times. In English and Chinese, NER has been thoroughly researched and can be applied in practical settings. Its development in Thai is still limited because of scarce resources and language difficulties such as the lack of boundary indicators for words, phrases and sentences. In this paper, we present an application of Bi-LSTM-CRF with word- and character-level representation to address this problem. First, we prepared the texts by tokenizing each sentence into words. We then prepared the word representation and the Bi-LSTM character representation. Finally, we built a recurrent neural network combined with a CRF to learn the text sequence and extract the knowledge needed to recognize named entities. Our model was evaluated on the open-source NER corpus from the Facebook group ThaiNLP. Our model yielded precision, recall, and F1 of 91.79%, 91.51% and 91.65%, respectively.
Published: 2019

50. The Design and Implementation of an Image Processing Framework with a Graphical Programming Interface for Low-Ended FPGAs

Author: Santitham Prom-on and Nuntipat Narkthong
Subjects: Computer science, Interface (computing), Design tool, Image processing, Engineering and technology, Computer hardware & architecture, Microcontroller, Microprocessor, Embedded system, Electrical engineering, electronic engineering, information engineering, Artificial intelligence & image processing, Field-programmable gate array, Visual programming language, Camera module
Abstract: Image processing and computer vision algorithms are very computationally intensive and cannot be implemented on the low-power microcontrollers used in embedded and small IoT devices. Performing computation in the cloud is not practical for always-on real-time operations, and a high-end microcontroller or a microprocessor consumes too much power and is generally too expensive to fit into these small systems. In this paper, we propose a new programming workflow that uses reusable hardware modules and a graphical programming interface to implement a complete image processing system on an FPGA, which overcomes the steep learning curve of traditional FPGA design tools. The design can be deployed onto low-end FPGAs at a price point similar to that of a mid-range microcontroller. To demonstrate our proposed framework, we have implemented a number of image transformation operations on an FPGA development board with a Lattice iCE40 Ultra Plus FPGA and a tiny camera module. Results show that our designs can fit in a low-end FPGA while performing 5.5-345x faster and consuming 2.4-4x less power than the current state-of-the-art microcontrollers used in small embedded and IoT devices.
Published: 2019
