13 results on '"Tumisho Billson Mokgonyane"'
Search Results
2. Automatic Speaker Recognition System based on Optimised Machine Learning Algorithms.
- Author
-
Tumisho Billson Mokgonyane, Tshephisho Joseph Sefara, Thipe Isaiah Modipa, and Madimetja Jonas D. Manamela
- Published
- 2019
- Full Text
- View/download PDF
3. Gender Identification in Sepedi Speech Corpus
- Author
-
Tumisho Billson Mokgonyane and Tshephisho Joseph Sefara
- Subjects
Identification (information), Audio signal, Computer science, Multilayer perceptron, Speech recognition, Feature extraction, Speech corpus, Feature selection, Convolutional neural network, Data modeling - Abstract
Gender identification is the task of determining the gender of a speaker from the audio signal. Most gender identification systems are developed using datasets from well-resourced languages, and little attention has been paid to under-resourced African languages. This paper presents the development of a gender identification system using a Sepedi speech dataset of 55.7 hours, with 30776 male and 28337 female samples. We build gender identification systems using machine learning models trained as a multilayer perceptron (MLP), a convolutional neural network (CNN), and a long short-term memory (LSTM) network. Mid-term features are extracted from time-domain, frequency-domain and cepstral-domain features, and normalised using the Z-score normalisation technique. XGBoost is used as a feature selection method to select important features. On data with seen speakers, MLP achieved an F-score and accuracy of 94%, while LSTM and CNN each achieved an F-score and accuracy of 97%. We further evaluated the models on data with unseen speakers, where all models achieved good F-scores and accuracy.
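The Z-score normalisation step described in the abstract can be sketched as follows; the data here is a synthetic stand-in for the extracted mid-term features, not the Sepedi corpus itself:

```python
import numpy as np

def zscore_normalise(X):
    # Z-score normalisation: per feature, subtract the mean and divide by
    # the standard deviation so each feature has mean 0 and std 1.
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant features
    return (X - mu) / sigma

# Synthetic stand-in for a matrix of mid-term features (rows = utterances).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(200, 10))
Xn = zscore_normalise(X)
```

After normalisation every column of `Xn` has (sample) mean 0 and standard deviation 1, which puts features measured on very different scales on an equal footing before feature selection and model training.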
- Published
- 2021
4. A Cross-platform Interface for Automatic Speaker Identification and Verification
- Author
-
Thipe Isaiah Modipa, Tshephisho Joseph Sefara, Madimetja Jonas Manamela, and Tumisho Billson Mokgonyane
- Subjects
Identification (information), Computer science, Interface (Java), Speech recognition, Multilayer perceptron, Feature extraction, Audio analyzer, Language technology, Classifier (linguistics), Speaker recognition - Abstract
Automatic speaker recognition, the task of automatically identifying and/or verifying the identity of a speaker from a recording of a speech sample, has been studied for many years. Speaker recognition technologies have improved in recent years and are becoming inexpensive and reliable methods for identifying and verifying people. Although automatic speaker recognition research now spans over 50 years, little of it addresses low-resourced South African indigenous languages. In this paper, a multilayer perceptron (MLP) classifier model is trained and deployed on a graphical user interface for real-time identification and verification of native Sepedi speakers. Sepedi is a low-resourced language spoken by the majority of residents in the Limpopo province of South Africa. The data used to train the speaker recognition system is obtained from the NCHLT (National Centre for Human Language Technology) project. A total of 34 short-term acoustic features of speech are extracted using the pyAudioAnalysis library, and Scikit-learn is used to train the MLP classifier model, which performs well with an accuracy of 95%. The GUI, developed with Qt Creator and PyQt4, obtained a true acceptance rate (TAR) of 66.67% and a true rejection rate (TRR) of 13.33%.
- Published
- 2021
5. Emotional Speaker Recognition based on Machine and Deep Learning
- Author
-
Tshephisho Joseph Sefara and Tumisho Billson Mokgonyane
- Subjects
Support vector machine, Artificial neural network, Computer science, Speech recognition, Multilayer perceptron, Deep learning, Deep neural networks, Artificial intelligence, Speaker recognition, Convolutional neural network, Random forest - Abstract
Speaker recognition is a method that recognises a speaker from the characteristics of their voice. Speaker recognition technologies have been widely used in many domains. Most speaker recognition systems are trained on clean, neutral recordings; however, their performance tends to degrade when recognising emotional speech. This paper presents an emotional speaker recognition system trained with machine and deep learning algorithms using time, frequency and spectral features on the emotional speech database acquired from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). We trained and compared the performance of five machine learning models (Logistic Regression, Support Vector Machine, Random Forest, XGBoost, and k-Nearest Neighbours) and three deep learning models (Long Short-Term Memory network, Multilayer Perceptron, and Convolutional Neural Network). After evaluation, the deep neural networks outperformed the machine learning models, attaining the highest accuracy of 92% and surpassing state-of-the-art models in emotional speaker detection from speech signals.
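Training and comparing a bank of classifiers, as the abstract describes, follows a standard pattern in Scikit-learn. This is a minimal sketch on synthetic two-class features; the real work uses RAVDESS audio and additionally XGBoost and the deep models:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Synthetic features standing in for time/frequency/spectral descriptors
# of emotional speech; RAVDESS itself must be downloaded separately.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(60, 20)) for c in (0, 2)])
y = np.repeat([0, 1], 60)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "RF": RandomForestClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
}
# Mean cross-validated accuracy per model, for side-by-side comparison.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}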
- Published
- 2020
6. The Effects of Acoustic Features of Speech for Automatic Speaker Recognition
- Author
-
Tumisho Billson Mokgonyane, Thipe Isaiah Modipa, Moses Sebaka Masekwameng, Tshephisho Joseph Sefara, and Madimetja Jonas Manamela
- Subjects
Support vector machine, Kernel (linear algebra), Computer science, Speech recognition, Language technology, Feature extraction, Identity (object-oriented programming), Sample (statistics), Perceptron, Speaker recognition - Abstract
Automatic speaker recognition is the task of automatically determining or verifying the identity of a speaker from a recording of his or her speech sample, and it has been studied for many decades. One of the steps that most significantly influences speaker recognition performance is feature extraction. Acoustic features of speech have been researched by many researchers around the world; however, limited research has been conducted on African indigenous languages, South African official languages in particular. This paper presents the effects of acoustic features of speech on the performance of speaker recognition systems, focusing on South African low-resourced languages. The study investigates acoustic features using the National Centre for Human Language Technology (NCHLT) Sepedi speech data. Time-domain, frequency-domain and cepstral-domain features are evaluated on four machine learning algorithms: K-Nearest Neighbours (K-NN), two kernel-based Support Vector Machines (SVM), and a Multilayer Perceptron (MLP). The results show that performance is poor for time-domain features, good for frequency-domain features, and better still for cepstral-domain features. The combination of all three feature sets yields the highest accuracy and F1 score of 98%.
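The evaluate-each-domain-then-combine protocol can be sketched on synthetic data. The three feature groups are invented stand-ins, with the cepstral group made the most discriminative to mirror the reported trend; concatenation is the simple combination strategy assumed here:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for three acoustic feature domains; class separation
# is chosen so the cepstral group is the most informative.
rng = np.random.default_rng(7)
n = 200
y = np.repeat([0, 1], 100)
sep = {"time": 0.2, "frequency": 0.8, "cepstral": 1.5}
groups = {name: rng.normal(size=(n, 8)) + y[:, None] * s
          for name, s in sep.items()}

def accuracy(X):
    # Held-out accuracy of a K-NN classifier on one feature matrix.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, random_state=0, stratify=y)
    return KNeighborsClassifier().fit(X_tr, y_tr).score(X_te, y_te)

per_domain = {name: accuracy(X) for name, X in groups.items()}
combined = accuracy(np.hstack(list(groups.values())))
```

`per_domain` shows the per-domain ranking and `combined` the accuracy when all three feature sets are concatenated, the same comparison the paper performs on real Sepedi speech.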
- Published
- 2020
7. Effects of Language Modelling for Sepedi-English Code-Switched Speech in Automatic Speech Recognition System
- Author
-
Mercy Mosibudi Mogale, Madimetja Jonas Manamela, Tumisho Billson Mokgonyane, Moses Sebaka Masekwameng, and Thipe Isaiah Modipa
- Subjects
Computer science, Speech recognition, Languages of Africa, Speech coding, Code (semiotics), Data modeling, Language modelling, Language model, Smoothing - Abstract
Speech is the primary means of communication among people. Spoken dialogue systems provide a means for people to interact with computer systems, and the automatic speech recognition (ASR) system forms part of a spoken dialogue system. Such systems perform well for European languages, but more challenges are encountered in recognising South African languages. In this study, we investigate appropriate approaches for developing language models for the recognition of Sepedi-English code-switched speech and their effect on ASR. The SRI Language Modeling (SRILM) toolkit was used to develop the language models, and the Kaldi speech recognition toolkit was used to build the ASR system and evaluate the effects of the smoothing techniques. We evaluated four smoothing techniques, namely Good-Turing (GT), Witten-Bell (WB), Modified Kneser-Ney (MKN), and Laplace (LP) smoothing. Witten-Bell smoothing was found to outperform the other three techniques for Sepedi-English code-switched data in both language modelling and ASR.
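Witten-Bell smoothing, the winner here, interpolates the maximum-likelihood bigram estimate with the unigram distribution, weighting the backoff by T(h), the number of distinct word types seen after each history h. A minimal bigram sketch on a toy token stream (the paper itself uses SRILM, not hand-rolled code):

```python
from collections import Counter, defaultdict

def witten_bell_bigram(tokens):
    bigrams = Counter(zip(tokens, tokens[1:]))
    unigrams = Counter(tokens)
    total = sum(unigrams.values())
    followers = defaultdict(set)
    history_count = Counter()
    for (h, w), c in bigrams.items():
        followers[h].add(w)
        history_count[h] += c

    def prob(w, h):
        # P_WB(w|h) = (c(h,w) + T(h) * P_uni(w)) / (c(h) + T(h)),
        # where T(h) = number of distinct types observed after h.
        T = len(followers[h])
        p_uni = unigrams[w] / total
        if history_count[h] + T == 0:
            return p_uni  # unseen history: back off to the unigram model
        return (bigrams[(h, w)] + T * p_uni) / (history_count[h] + T)

    return prob

corpus = "the cat sat the cat ran the dog sat".split()  # toy token stream
p = witten_bell_bigram(corpus)
total_mass = sum(p(w, "the") for w in set(corpus))
```

Seen continuations like "cat" after "the" keep most of their mass, unseen ones like "sat" after "the" receive a small unigram-weighted share, and the probabilities over the vocabulary still sum to one.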
- Published
- 2020
8. Automatic Speaker Recognition System based on Optimised Machine Learning Algorithms
- Author
-
Madimetja Jonas Manamela, Thipe Isaiah Modipa, Tumisho Billson Mokgonyane, and Tshephisho Joseph Sefara
- Subjects
Progress in artificial intelligence, Automatic speaker recognition, Artificial neural network, Computer science, Machine learning, Speaker recognition, k-nearest neighbors algorithm, Random forest, Support vector machine, Artificial intelligence, Classifier (UML), Algorithm - Abstract
Speaker recognition is a technique that automatically identifies a speaker from a recording of their voice. Speaker recognition technologies are taking a new direction due to progress in artificial intelligence and machine learning, and have been widely used in many domains. Research in the field of speaker recognition now spans over 50 years, and in that time a great deal of progress has been made towards improving the accuracy of such systems through more successful machine learning algorithms. This paper presents the development of an automatic speaker recognition system based on optimised machine learning algorithms, where the algorithms are tuned for improved performance. Five classifier models, namely Support Vector Machines, K-Nearest Neighbours, Random Forest, Logistic Regression, and Artificial Neural Networks, are trained and compared. The Artificial Neural Network obtained the best accuracy of 96.03%, outperforming the KNN, SVM, RF and LR classifiers.
- Published
- 2019
9. HMM-based Speech Synthesis System incorporated with Language Identification for Low-resourced Languages
- Author
-
Madimetja Jonas Manamela, Thipe Isaiah Modipa, Tshephisho Joseph Sefara, and Tumisho Billson Mokgonyane
- Subjects
Language identification, Computer science, Mean opinion score, Languages of Africa, Foreign language, Word error rate, Speech synthesis, Intelligibility (communication), Artificial intelligence, Hidden Markov model, Natural language processing - Abstract
Text-to-speech (TTS) synthesis systems are beneficial for learning new or foreign languages. Such systems are currently available for major languages but not for low-resourced languages, and their scarcity may hinder the learning of low-resourced languages in particular. The development of language-specific systems such as TTS and language identification (LID) has an important role in mitigating the historical linguistic effects of discrimination and domination imposed on low-resourced indigenous languages. This paper presents the development of a multi-language LID+TTS synthesis system that generates audio for input text in the predicted language for four South African languages, namely Tshivenda, Sepedi, Xitsonga and isiNdebele. On the front end, the LID module detects the language of the input text before the TTS synthesis module produces the output audio. The LID module, trained on a 4-million-word dataset, achieved 99% accuracy, outperforming state-of-the-art systems. A robust method for building TTS voices, the hidden Markov model approach, is used to build new voices in the selected languages. Voice quality is measured using the mean opinion score and word error rate metrics, with positive results on the understandability, naturalness, pleasantness, intelligibility and overall impression of the newly created TTS voices. The system is available as a website service.
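A text-based LID front end like the one described is commonly built from character n-grams. A hedged sketch with two invented toy "languages" (the real system is trained on 4 million words of Tshivenda, Sepedi, Xitsonga and isiNdebele text; the classifier choice here is an assumption):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Two invented toy "languages" with disjoint character patterns stand in
# for real multilingual training text.
lang_a = ["lo ka ri vona", "ri lo ka", "vona ri ka lo", "ka lo ri vona ka"]
lang_b = ["zum bek tol", "tol zum bek zum", "bek tol zum", "zum tol bek tol"]
texts = lang_a + lang_b
labels = ["A"] * len(lang_a) + ["B"] * len(lang_b)

# Character n-grams are a standard, language-agnostic signal for text LID.
lid = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
    MultinomialNB())
lid.fit(texts, labels)
pred = lid.predict(["ka ri lo", "bek zum"])
```

In the full pipeline, the predicted label would select which HMM-based TTS voice synthesises the input text.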
- Published
- 2019
10. The Effects of Data Size on Text-Independent Automatic Speaker Identification System
- Author
-
Thipe Isaiah Modipa, Madimetja Jonas Manamela, Tshephisho Joseph Sefara, and Tumisho Billson Mokgonyane
- Subjects
Progress in artificial intelligence, Support vector machine, Computer science, Speech recognition, Multilayer perceptron, Perceptron, Speaker recognition, Field (computer science), Utterance, k-nearest neighbors algorithm - Abstract
Speaker recognition is a technique that automatically identifies a speaker from a recording of their speech utterance. Speaker recognition technologies are taking a new direction due to rapid progress in artificial intelligence, and research in the field has shown fruitful results. There is, however, not much work done for African indigenous languages that have limited speech data resources. This paper examines how data size affects the accuracy of automatic speaker recognition models, focusing on Sepedi, one of the under-resourced South African languages. The speech data is acquired from the South African Centre for Digital Language Resources. Four machine learning models, namely Support Vector Machines (SVM), K-Nearest Neighbours (KNN), Multilayer Perceptrons (MLP) and Logistic Regression (LR), are trained under four data-size settings. LR performed best with a top accuracy of 91%, while SVM showed the largest gain, a 4% increase in accuracy, as data size increased.
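The four-data-size-settings protocol amounts to training the same model on growing subsets and scoring on a fixed test set. A minimal sketch on synthetic speaker features (the subset fractions are illustrative, not the paper's actual splits):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic speaker features; accuracy is measured as the training set grows.
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(loc=i, scale=2.0, size=(200, 10)) for i in range(3)])
y = np.repeat(np.arange(3), 200)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

accuracies = []
for frac in (0.1, 0.25, 0.5, 1.0):  # four data-size settings
    n = int(len(X_tr) * frac)
    clf = LogisticRegression(max_iter=1000).fit(X_tr[:n], y_tr[:n])
    accuracies.append(clf.score(X_te, y_te))
```

Keeping the test set fixed while only the training fraction changes isolates the effect of data size, which is the comparison the paper reports per model.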
- Published
- 2019
11. Automatic Speaker Recognition System based on Machine Learning Algorithms
- Author
-
Thipe Isaiah Modipa, Tumisho Billson Mokgonyane, Mercy Mosibudi Mogale, Tshephisho Joseph Sefara, Madimetja Jonas Manamela, and Phuti J. Manamela
- Subjects
Computer science, Perceptron, Machine learning, Speaker recognition, Cross-validation, Random forest, Support vector machine, Multilayer perceptron, Artificial intelligence, Classifier (UML), Graphical user interface - Abstract
Speaker recognition is a technique used to automatically recognise a speaker from a recording of their voice or speech utterance. Speaker recognition technology has improved over recent years and has become an inexpensive and reliable method for person identification and verification. Research in the field now spans over five decades and has shown fruitful results; however, little work has addressed South African indigenous languages. This paper presents the development of an automatic speaker recognition system that incorporates classification and recognition of Sepedi home-language speakers. Four classifier models, namely Support Vector Machines, K-Nearest Neighbours, Multilayer Perceptrons (MLP) and Random Forest (RF), are trained using the WEKA data mining tool. Auto-WEKA is applied to determine the best classifier model together with its best hyper-parameters, and the performance of each model is evaluated in WEKA using 10-fold cross-validation. MLP and RF yielded good accuracy, surpassing the state of the art with accuracies of 97% and 99.9% respectively; the RF model was then deployed on a graphical user interface for development testing.
- Published
- 2019
12. The Automatic Recognition of Sepedi Speech Emotions Based on Machine Learning Algorithms
- Author
-
Tshepisho J Sefara, Tumisho Billson Mokgonyane, Madimetja Jonas Manamela, Thipe Isaiah Modipa, and Phuti J. Manamela
- Subjects
Support vector machine, Sadness, Statistical classification, Computer science, Emotion classification, Feature extraction, Happiness, Speech corpus, Affective computing, Algorithm - Abstract
Over the past years, speech emotion recognition (SER) studies have gained much interest in the fields of affective computing and human-computer interaction (HCI), with the aim of improving the interaction between humans and machines. This paper discusses an SER system that classifies and recognises six basic emotions (anger, sadness, disgust, fear, happiness, and neutral) from speech spoken in Sepedi, one of South Africa's official languages. Speech recordings were collected from Sepedi speakers and TV drama broadcasts to create emotional speech corpora. Thirty-four speech features were then extracted from the corpora, using the pyAudioAnalysis tool, to train and compare different algorithms using 10-fold cross-validation. The experiments were conducted using the WEKA data-mining software. The results showed that Auto-WEKA outperforms all the standard algorithms (SVM, KNN and MLP), and the recorded speech corpus yielded better recognition accuracy than the TV broadcast speech corpus.
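Short-term feature extraction of the kind pyAudioAnalysis performs can be illustrated with two classic time-domain descriptors, zero-crossing rate and short-term energy, computed per frame over a sliding window (the frame sizes are common defaults, assumed here rather than taken from the paper):

```python
import numpy as np

def short_term_features(signal, fs, win=0.05, step=0.025):
    # Zero-crossing rate and short-term energy per frame: two of the
    # time-domain descriptors extracted by tools like pyAudioAnalysis.
    n, s = int(win * fs), int(step * fs)
    feats = []
    for start in range(0, len(signal) - n + 1, s):
        frame = signal[start:start + n]
        # ZCR: fraction of consecutive sample pairs whose sign flips.
        zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
        energy = np.mean(frame ** 2)
        feats.append((zcr, energy))
    return np.array(feats)

fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)  # a 440 Hz test tone, 1 second long
feats = short_term_features(tone, fs)
```

For a pure 440 Hz tone the per-sample zero-crossing rate comes out near 2·440/16000 ≈ 0.055 and the energy near 0.5, the mean square of a unit sine, which makes the two columns easy to sanity-check.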
- Published
- 2018
13. Development of a speech-enabled basic arithmetic m-learning application for foundation phase learners
- Author
-
Madimetja Jonas Manamela, Tumisho Billson Mokgonyane, Thipe Isaiah Modipa, Tshephisho Joseph Sefara, and Phuti J. Manamela
- Subjects
Computer program, Process (engineering), Computer science, Speech synthesis, Software, Numeracy, M-learning, Reading (process), Arithmetic, Spoken language - Abstract
In very simple terms, speech synthesis is the process of generating spoken language by machine from text input; text-to-speech is the specific form that takes raw text as input and aims to mimic the human process of reading. Computer-assisted learning (CAL) can be defined as learning or teaching through the use of computers with packaged knowledge-content learning materials; it involves a computer program or file developed specifically for educational purposes. Mobile learning, or "m-learning", is the ability to obtain or provide educational content on personal pocket devices such as PDAs, smartphones and mobile phones; as an educational activity it makes sense only when the technology in use facilitates and supports mobility in learning. In this paper, we discuss the development of a mathematical computer-assisted learning mobile application that integrates a text-to-speech synthesis module for South African low-resourced languages, initially targeting Sepedi. The system is aimed at assisting mathematically illiterate persons and foundation-phase learners to learn and understand the representation and articulation of mathematical expressions involving the four basic arithmetic operations (addition, subtraction, multiplication, and division), and it incorporates a few numeracy functions. The results of experiments conducted with the prototype CAL system show that 80% of participants were impressed by the developed mobile application. There is a great need to enhance the development of software applications that support teaching and learning at the foundation phase of education in South Africa.
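The core of such an application is turning an arithmetic expression into a sentence a TTS engine can speak. A hypothetical toy sketch in English (the actual application verbalises expressions in Sepedi, and the helper names here are invented for illustration):

```python
import operator

# Hypothetical helper: verbalise a simple arithmetic expression as text
# that a TTS module could then synthesise. English stands in for Sepedi.
WORDS = ["zero", "one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "ten", "eleven", "twelve"]
OPS = {"+": ("plus", operator.add), "-": ("minus", operator.sub),
       "*": ("times", operator.mul), "/": ("divided by", operator.floordiv)}

def speak_expression(expr):
    # Expects "a op b" with small whole-number operands and result.
    a, op, b = expr.split()
    word, fn = OPS[op]
    result = fn(int(a), int(b))
    return f"{WORDS[int(a)]} {word} {WORDS[int(b)]} equals {WORDS[result]}"

sentence = speak_expression("3 + 4")  # "three plus four equals seven"
```

Feeding `sentence` to the TTS synthesis module is what lets the learner hear both the expression and its answer articulated, covering all four basic operations.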
- Published
- 2017
Discovery Service for Jio Institute Digital Library