31 results on '"Data mining"'
Search Results
2. ENHANCING ECOSYSTEM BIODIVERSITY THROUGH AIR POLLUTION CONCENTRATIONS PREDICTION USING SUPPORT VECTOR REGRESSION APPROACHES.
- Author
- SOLEHAH, Syaidatul Umairah, Abidin, Aida Wati Zainan, WARRIS, Saiful Nizam, SHAZIAYANI, Wan Nur, OSMAN, Balkish Mohd, IBRAHIM, Nurain, NOOR, Norazian Mohamed, and UL-SAUFIE, Ahmad Zia
- Subjects
- AIR pollution, AIR quality indexes, AIR quality monitoring, AIR pollutants, ENVIRONMENTAL quality, STANDARD deviations
- Abstract
Air is the most crucial element for the survival of life on Earth. The air we breathe has a profound effect on our ecosystem biodiversity. Consequently, it is always prudent to monitor the air quality in our environment. There are few ways can be done in predicting the air pollution index (API) like data mining. Therefore, this study aimed to evaluate three types of support vector regression (linear, SVR, libSVR) in predicting the air pollutant concentration and identify the best model. This study also would like to calculate the API by using the proposed model. The secondary daily data is used in this study from year 2002 to 2020 from the Department of Environment (DoE) Malaysia which located at Petaling Jaya monitoring station. There are six major pollutants that have been focusing in this work like PM10, PM2.5, SO2, NO2, CO, and O3. The root mean square error (RMSE), mean absolute error (MAE) and relative error (RE) were used to evaluate the performance of the regression models. Experimental results showed that the best model is linear SVR with average of RMSE = 5.548, MAE = 3.490, and RE = 27.98% because had the lowest total rank value of RMSE, MAE, and RE for five air pollutants (PM10, PM2.5, SO2, CO, O3) in this study. Unlikely for NO2, the best model is support vector regression (SVR) with RMSE = 0.007, MAE = 0.006, and RE = 20.75% in predicting the air pollutant concentration. This work also illustrates that combining data mining with air pollutants prediction is an efficient and convenient way to solve some related environment problems. The best model has the potential to be applied as an early warning system to inform local authorities about the air quality and can reliably predict the daily air pollution events over three consecutive days. Besides, good air quality plays a significant role in supporting biodiversity and maintaining healthy ecosystems. [ABSTRACT FROM AUTHOR]
- Published
- 2023
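For reference, two of the error measures named in the abstract of record 2 (RMSE and MAE) follow their standard definitions, sketched here in Python. The sample values are hypothetical and are not taken from the study. The abstract does not define its relative error (RE), so that measure is left out rather than guessed.

```python
import math


def rmse(observed, predicted):
    """Root mean square error: square root of the mean squared residual."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical daily concentrations (illustrative only, not study data).
observed = [40.0, 55.0, 62.0, 48.0]
predicted = [42.0, 50.0, 65.0, 48.0]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"MAE  = {mae(observed, predicted):.3f}")
```

Because RMSE squares each residual before averaging, it penalises large misses more heavily than MAE, which may be one reason a study would report both.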
3. Analyzing the influence of internet of things (IoT) data mining in medical tourism industry.
- Author
- Rashid, Intan Maizura Abd, Rawabdeh, Mohammad, Qudah, Anas Al, Samah, Irza Hanie Abu, Amlus, Mohammad Harith, and Wan Husain, Wan Ahmad Fauzi
- Subjects
- *MEDICAL tourism, *TOURISM, *INTERNET of things, *DATA mining, *TOURIST attractions, *DIVERSIFICATION in industry
- Abstract
The striking around the world transmission of COVID-19 has empowered numerous governments to meddle to maintain a strategic distance from the spread of the contamination. Medical tourism could be a developing marvel with arrangement suggestions for wellbeing frameworks, especially of goal nations. Medical tourism, a quickly developing advertise, has been recognized by numerous nations as a potential segment for financial diversification. In spite of the fact that Malaysia stands out as one of the best goals of therapeutic tourism, examination with respect to its competitiveness has been constrained and contract in scope. An increasing number of tourist destinations and hotels are using new technology and solutions to promote their products and services. IoT represents a great opportunity for tourism and hospitality to increase customer satisfaction while simultaneously reducing operational costs. This case ponder takes a quantitative approach to distinguish and analyze the IoT time-series data variables that position Malaysia as a competitive medical tourism goal. Based on an all-encompassing approach, this ponder has appeared that coordination of differing procedures for restorative tourism improvement with sound government arrangements and proactive administration hones has driven to significant positive results towards the shared victory of tourism and healthcare segments of Malaysia. [ABSTRACT FROM AUTHOR]
- Published
- 2023
4. Unfolding emotions for creating happiness and quality of life in Malaysia's low‐income community using text mining.
- Author
- Shuhidan, Shuhaida Mohamed, Lokman, Anitawati Mohd, Hamidi, Saidatul Rahah, Kadir, Shamsiah Abd, Syahirah, Sharifah, and Alam, Md. Mahmudul
- Subjects
- *SOCIAL groups, *WELL-being, *EDUCATION, *COMMUNITIES, *FAMILIES, *LABOR supply, *SURVEYS, *QUALITY of life, *QUESTIONNAIRES, *GOVERNMENT policy, *RESEARCH funding, *POVERTY, *EMOTIONS, *DATA analysis software, *GOVERNMENT aid, *POLICY sciences, *DATA mining, *TRANSPORTATION
- Abstract
Determining the true reasons for happiness that exist in different social groups is important, especially in the case of those groups where poverty strongly dominates people's well‐being and happiness. To understand the factors that determine happiness among poor communities, this study collected data in the form of 1,793 responses from low‐income communities, who constitute 40% of Malaysia's workforce. To ascertain the feelings and emotional responses of the target group, primary source data were collected based on a questionnaire survey utilising the Lokman Emotion Importance Quadrant (LEIQ™) model. The data were analysed using text mining techniques based on R‐Language software. According to the findings, happiness in Malaysian low‐income communities is primarily shaped by concerns about family matters such as success, care issues, education and so on. Interestingly, government financial support, money, income and work matters are only fifth in importance, and lastly, the least emphasis was put on the public transportation issue. To ensure the true happiness and wellbeing of people who are struggling financially, policymakers must provide not only financial assistance but also ensure a better future for their families: better healthcare, better incomes, keeping a roof over their heads, etc. This study will be useful to the government, development agencies, NGOs, international organisations and other stakeholders wanting to realise the United Nations' Sustainable Development Goals (SDGs) in Malaysia and similarly affected countries. Please refer to the Supplementary Material section to find this article's Community and Social Impact Statement. [ABSTRACT FROM AUTHOR]
- Published
- 2023
5. The integration of non-destructive testing courses into University academic curriculum: Review on Malaysian context.
- Author
- Sulaiman, Fauziah and Eldy, Elnetthra Folly
- Subjects
- *NONDESTRUCTIVE testing, *CURRICULUM planning, *DATA mining, *CURRICULUM
- Abstract
The non-destructive testing (NDT) module is an alternative instructional course developed and provided in the industry since the techniques have become more attractive in science and engineering. There is a demand to integrate the NDT instructions module into the university curriculum. This paper proposes integrating courses and other elements into the bachelor's curriculum at the universities level. The method has looked into three countries that established NDT programs at the University's level, e.g., Germany, Singapore, and Ukraine, where the guidelines from the NDT program structure were referred. The findings and concepts are different from the well-known courses offered in the private non-academic industrial company. Other than the application, it should address the development of new techniques and applications in a research-driven environment. While distorted elements of NDT are included in most programs at universities in Malaysia, it is uncoordinated. There is a close relation to topics like measuring techniques, data mining, and statistics, and the first elements of NDT can be integrated into Bachelor courses. This paper summarizes relevant scholarly articles, identifies methods and ideas, and the need for further action leads to recommendations for an NDT curriculum design. Thus, some references to structure NDT courses in universities are also discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2023
6. Evaluation of wildlife distribution data in Southeast Asia: Public data mining and ecological modelling of Malaysia's primate.
- Author
- NAJMUDDIN, MOHD FAUDZIR, HARIS, HIDAYAH, SUFAHANI, SULIADI FIRDAUS, ABDULLAH, NAZIRAH MOHAMAD, MD-ZAIN, BADRUL MUNIR, LOKMAN, MOHD ILHAM NORHAKIM, SAHIMI, HANI NABILIA MUHD, ABD GHANI, SITI NOR HUDA, and ABDUL-LATIFF, MUHAMMAD ABU BAKAR
- Subjects
- *CHI-squared test, *ANIMAL populations, *SOCIAL media, *ECOLOGICAL models, *DATA mining, *DATA distribution, *PRIMATES
- Abstract
Primate population assessment relies heavily on ecological techniques such as camera traps, distance sampling, and passive acoustic detection. Although these techniques have proven effective in estimating the number of individuals and primate groups, challenges arise when attempting to scale them up to larger areas such as districts, states, or national levels. This study aims to evaluate the reliability of public responses in determining the primate population at district level. Specifically, an online survey was designed to collect information from the public regarding the primate population within their respective areas of knowledge, focusing on Muar district in Johor, Malaysia. This district comprises of 12 sub-districts ('Mukims'). The survey was conducted over two consecutive months (February to March 2021). A combination of online social media platforms and face-to-face interviews was employed to distribute the survey and successfully gathered responses from 257 respondents. The data collected encompassed species availability, habitat availability, individual primate counts and human-primate conflicts. Chi-squared analysis, point biserial correlation, descriptive statistics and alpha diversity indices were used to analyse the data. The results indicated that Macaca fascicularis reported the highest number of individuals and was predominantly present across all sub-districts. This species also frequently engaged in conflicts with humans, primarily as agricultural pests and residential nuisances. The method described herein proved effective in providing reliable tests for species presence-absence, and collecting human-wildlife conflict data with specific geographical locations. However, further improvements in the design are required to more accurately evaluate the primate populations. 
Nonetheless, this method serves as a promising pilot approach for estimating wildlife populations at the state level, offering a rapid, cost-effective, and extensive alternative to traditional ecological techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2023
7. Prediction of Daily Air Pollutants Concentration and Air Pollutant Index Using Machine Learning Approach.
- Author
- Mustakim, Nurul A’isyah, Ul-Saufie, Ahmad Zia, Shaziayani, Wan Nur, Noor, Norazian Mohamad, and Mutalib, Sofianita
- Subjects
- AIR pollutants, AIR quality monitoring, PARTICULATE matter, AIR pollution, RANDOM forest algorithms, CARBON monoxide, MACHINE learning
- Abstract
The major air pollutants in Malaysia that contribute to air pollution are carbon monoxide, sulfur dioxide, nitrogen dioxide, ozone, and particulate matter. Predicting the air pollutants concentration can help the government to monitor air quality and provide awareness to the public. Therefore, this study aims to overcome the problem by predicting the air pollutants concentration for the next day. This study focuses on an industrial, the Petaling Jaya monitoring station in Selangor. The data is obtained from the Department of Environment, which contains the dataset from 2004 to 2018. Subsequently, this study is conducted to construct predictive modeling that can predict the air pollutants concentrations for the next day using a tree-based approach. From the comparison of the three models, a random forest is a best-proposed model. The results of PM10 concentration prediction for the random forest is the best performance which is shown by RMSE (15.7611–19.0153), NAE (0.6508–0.8216), and R² (0.346–0.5911). For SO2, the RMSE was 0.0016–0.0017, the NAE was 0.7056–0.8052, and the R² was 0.3219–0.4676. The RMSE (0.0062–0.0075), the NAE (0.7892–0.9591), and the R² (0.0814–0.3609) for NO2. The RMSE (0.3438–0.3975), NAE (0.7387–0.9015), and R² (0.2005–0.4399) for CO were all within acceptable limits. For O3, the RMSE was 0.0051–0.0057, the NAE was 0.8386–0.9263, and the R² was 0.1379–0.2953. The API calculation results indicate that PM10 is a significant pollutant in representing the API. [ABSTRACT FROM AUTHOR]
- Published
- 2023
8. Data mining in retail banking: A case study of branch classification.
- Author
- Chong, S. H., Chin, J. F., and Loh, W. P.
- Subjects
- *RETAIL banking, *DATA mining, *ONLINE banking, *BRANCH banks, *BANKING industry
- Abstract
As customers increasingly shift to online banking services, retail banks constantly review their retail banking network strategies to keep abreast with the trend. The paper studies data mining (DM) classification to determine the branch status of a case study involving 242 retail bank branches in Malaysia and 74 attributes of these branches. The likelihood of a branch being open or close soon was rated in an expert survey. The value is converted into a variable termed as aggregate closure possibility and discretized differently into the target classes of two balanced and unbalanced datasets. In the unbalanced dataset, attribute selection was applied only on instances randomly extracted from the majority classes. Then, two experiments were carried out to identify the best performing classifiers into these datasets. The results show that the decision table classifier produces superior performance over other classifiers. Attribute selection and instance reduction entailed more efficient datasets for data mining. Despite the mildly affected overall classification accuracy, the results are acceptable as there was no impact on the prediction of the critical class (To_Close). The study demonstrates the DM technology to emulate expert decision-making in the retail banking sector. [ABSTRACT FROM AUTHOR]
- Published
- 2022
9. The Performance of Classification Method in Telco Customer Trouble Ticket Dataset.
- Author
- Fauzy Che Yayah, Ghauth, Khairil Imran, and Choo-Yee Ting
- Subjects
- PROCESS capability, DATA integration, TICKETS, BIG data, CUSTOMER satisfaction, DATA mining, ELECTRONIC data processing
- Abstract
A customer trouble ticketing system (CTT) is an organization's tool to track the detection, reporting, and resolution of tickets submitted by customers. It also comprises a summary of the issue reported, the status of the ticket, the incident information, and the approach that was previously utilized to resolve the problems. The technician's skill set and experience rely solely on completing the task without the right direction on which area to focus on first. As a result of this manual classification of a trouble ticket, it will be necessary to build methodologies for predicting future resolution codes. The research for this report is mainly focused on one of the telco companies in Malaysia. This study result assists the telco engineer, and the specialists resolve each issue in a very short amount of time. Additionally, the classification of the trouble ticket resolution code method used in this study will indicate the characteristics of each issue that is being investigated. The relationship between events is feasible to discover by exploring the root cause. It is critical to establish a link between recent events and events in the previous. Because of current data mining limitations, the study needs to be more comprehensive. Data processing methods are being implemented within big data platforms to overcome the limitation of data scalability, enhance classification accuracy, and increase computation speed. The research work will continue to progress in the direction of big data centricity. Some of the most effective approaches for big data integration and machine learning will be discussed in this paper. Throughout the experiment, any problems will be explained, as well as the solutions to each situation. A wide range of research subjects will be discussed, including construction classification models for trouble tickets. To achieve reasonable accuracy, a few customized transformations are required. 
The data set's custom parameter optimization process will further increase the classification trouble ticket's efficiency. However, greater processing capacity is necessitated to use multiple parallel classifiers such as Bayes, Decision-Tree, and Rule-Based with help of big data frameworks such as Spark. According to the study, an increase of 8% classification performance substantially influences service recovery time, customer satisfaction, and preventative maintenance expenses in the telco industry. [ABSTRACT FROM AUTHOR]
- Published
- 2022
10. Cost centric data mining for radiology procedures at teaching hospital in Malaysia
- Author
- Ibrahim, Roszita, Mohd Aman, Azana Hafizah, Nur, Amrizal Muhd, and Aljunid, Syed Mohamed
- Published
- 2020
11. Association rule mining for identification of port state control patterns in Malaysian ports.
- Author
- Osman, Mohd Tarmizi, Yuli, Chen, Li, Tian, and Senin, Syahrul Fithry
- Subjects
- *ASSOCIATION rule mining, *APRIORI algorithm, *SCHOOL inspections (Educational quality)
- Abstract
Port State Control (PSC) inspection data is used for determining the inspection pattern of PSC in Malaysia and identifying the relationship between the inspection place, flag state, number of deficiency, detention result, and ship risk profile. Based on 8,089 inspection reports from 2015 to 2019, the mining association rule is proposed as a learning approach due to its determination pattern in the information bank. The learning of association rules of PSC inspections is performed primarily on the Apriori Algorithm, in order to produce alluring rules. Inspection patterns of Malaysian ports revealed that flag state, ship risk profile, and inspection place generally lead to no detention result, as well as zero deficiency recorded on-board. The reported quantity of detention was significantly related to the high number of deficiencies raised for ships registered under blacklisted countries. Furthermore, the analysis of deficiency discovered the pattern of inspection at Malaysian ports is frequently related to zero and a low number of deficiencies raised by inspectors. Lastly, five major ports were selected for providing a useful rule to help PSC officers in organising an effective inspection plan. A similar approach can also be used for other ports beyond Malaysia for comparative analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2021
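As background for record 11, association rules are usually ranked by support and confidence. The sketch below computes only those two measures; it does not reproduce the Apriori candidate-generation step or the study's data, and the item labels (flag, risk, detained) are hypothetical stand-ins for the inspection attributes listed in the abstract.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for transaction in transactions if itemset <= transaction)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the whole rule divided by support of its antecedent."""
    antecedent_support = support(transactions, antecedent)
    if antecedent_support == 0:
        raise ValueError("antecedent never occurs in the transactions")
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / antecedent_support


# Hypothetical inspection records, each a set of attribute=value items.
inspections = [
    {"flag=A", "risk=low", "detained=no"},
    {"flag=A", "risk=low", "detained=no"},
    {"flag=B", "risk=high", "detained=yes"},
    {"flag=A", "risk=high", "detained=no"},
]

rule_support = support(inspections, {"risk=low", "detained=no"})
rule_confidence = confidence(inspections, {"risk=low"}, {"detained=no"})
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

An Apriori-style miner would apply a minimum-support threshold to prune itemsets before rules like this one are scored.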
12. A comparative study of machine learning techniques for suicide attempts predictive model.
- Author
- Nordin, Noratikah, Zainol, Zurinahni, Mohd Noor, Mohd Halim, and Fong, Chan Lai
- Subjects
- *SUICIDE risk factors, *SUPPORT vector machines, *ACADEMIC medical centers, *MACHINE learning, *RACE, *RANDOM forest algorithms, *RISK assessment, *COMPARATIVE studies, *SUICIDAL ideation, *SEVERITY of illness index, *MENTAL depression, *DESCRIPTIVE statistics, *PREDICTION models, *PSYCHOLOGY & religion, *RECEIVER operating characteristic curves, *PREDICTIVE validity, *ALGORITHMS, *DATA mining, *EVALUATION
- Abstract
Current suicide risk assessments for predicting suicide attempts are time consuming, of low predictive value and have inadequate reliability. This paper aims to develop a predictive model for suicide attempts among patients with depression using machine learning algorithms as well as presents a comparative study on single predictive models with ensemble predictive models for differentiating depressed patients with suicide attempts from non-suicide attempters. We applied and trained eight different machine learning algorithms using a dataset that consists of 75 patients diagnosed with a depressive disorder. A recursive feature elimination was used to reduce the features via three-fold cross validation. An ensemble predictive models outperformed the single predictive models. Voting and bagging revealed the highest accuracy of 92% compared to other machine learning algorithms. Our findings indicate that history of suicide attempt, religion, race, suicide ideation and severity of clinical depression are useful factors for prediction of suicide attempts. [ABSTRACT FROM AUTHOR]
- Published
- 2021
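The voting ensemble mentioned in record 12 can be pictured as a hard majority vote over the labels produced by several base classifiers. The sketch below shows only that combining step; the three classifiers and their outputs are hypothetical, and the abstract does not say whether the study used hard or soft voting.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among one sample's predictions."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of three classifiers for four patients
# (1 = suicide attempt predicted, 0 = no attempt predicted).
model_outputs = [
    [1, 0, 1, 0],  # classifier A
    [1, 1, 0, 0],  # classifier B
    [0, 1, 1, 0],  # classifier C
]

# zip(*...) regroups the rows so that each tuple holds one patient's votes.
ensemble = [majority_vote(votes) for votes in zip(*model_outputs)]
print(ensemble)
```

Using an odd number of binary classifiers, as here, avoids ties in the vote.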
13. Application of Data Mining Techniques on Tourist Expenses in Malaysia.
- Author
- Cai Miao and Tan Shi An
- Subjects
- DATA mining, CORPORATE profits, QUALITY of service, TOURISTS, ECONOMIC development
- Abstract
Copyright of Baghdad Science Journal is the property of Republic of Iraq Ministry of Higher Education & Scientific Research (MOHESR) and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2021
14. Tourism Knowledge Discovery through Data Mining Techniques.
- Author
- Jamil, Jastini Mohd. and Mohd Shaharanee, Izwan Nizal
- Subjects
- *DATA mining, *TOURISM, *ROUGH sets, *DECISION trees, *HOTELKEEPERS
- Abstract
Tourism industry in Malaysia has been customarily thought and advanced towards universal markets since its early stages arrange in 1960s. Currently, study about tourism knowledge discovery is very little being addressed. The previous studies are still insufficient to extract important insights from tourism data within Malaysia context. Therefore, this paper aims to analyze profiles of tourists using data mining decision tree techniques where several combinations of the number of branches (2 and 3 branches) and different target splitting rules (Entropy, Gini, and Probability Chi-square) have been applied on comprehensive survey data and to find out the best performing algorithm among the six models for tourism knowledge discovery. Results show that there are a various type of tourists with each group having different patterns or rules. This research study can be very helpful for tourist association, hospitality and hotel managers. [ABSTRACT FROM AUTHOR]
- Published
- 2019
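Two of the splitting rules named in record 14, Entropy and Gini, are standard impurity measures, and a short Python sketch may help show how a candidate split is scored. The labels below are hypothetical tourist categories, not the survey data, and the chi-square probability criterion from the abstract is not covered here.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Share of each class among the labels at a node."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def entropy(labels):
    """Shannon entropy in bits; 0 for a pure node."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def gini(labels):
    """Gini impurity; 0 for a pure node."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def weighted_impurity(branches, measure):
    """Size-weighted average impurity of the branches produced by a split."""
    total = sum(len(branch) for branch in branches)
    return sum(len(branch) / total * measure(branch) for branch in branches)


# Hypothetical node with two tourist types and a two-branch split of it.
parent = ["leisure", "leisure", "business", "business"]
two_way_split = [["leisure", "leisure"], ["business", "business"]]

print(entropy(parent), gini(parent))
print(weighted_impurity(two_way_split, entropy))
```

A tree builder would pick the split with the largest drop from the parent's impurity to the weighted impurity of its branches. The same function scores a three-branch split if three lists are passed.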
15. Review of the Extraction and Isolation Methods for Anaxagorea javanica Plant.
- Author
- Ali, Auf
- Subjects
- *DRUG discovery, *ETHYL acetate, *NORMAL-phase chromatography, *DATA mining, *COLUMN chromatography, *DRUG development, *ACID-base imbalances
- Abstract
Introduction: The promotion of natural products for their potential therapeutic properties has gained increasing attention in recent years. Anaxagorea javanica, commonly known as 'Sekobang Kechil' or 'Larak Lecek' in Malaysia, is an underutilised medicinal plant with a long history of use in traditional medicine. In this report, our aim is to provide comprehensive information on its extraction, isolation and purification to guide future studies for the design and development of new relevant drugs from A. javanica. Methods: Using PRISMA, we conducted a review of multiple literatures to acquire information on the extraction methods of A. javanica from various electronic databases (PubMed, PubMed Central, Science Direct, and Google Scholar). We used the search words with the combinations of the name of the plant, "A. javanica" and the word "extraction" and "isolation". Another search term that we used are the combinations of the name of compounds and the word "purification" such as "copyrine alkaloid purification" etc. Results: Based on the results obtained, alkaloids respond positively in acid-base extraction and then later purified using silica-gel column chromatography. Solvents such as n-hexane, 95% ethanol, hot ethanol have been found to work best for extracting alkaloids, followed by partitioning with chloroform and water. Solvents such as 1,2-dichloroethane, chloroform, diethyl ether and benzene work the best for extracting copyrine alkaloids. Sequential extraction using dichloromethane, ethyl acetate and methanol, and followed by repeated chromatography over silica gel has shown to be effective for extracting terpenes. The compounds in the leaves have shown a positive response in dichloromethane, ethyl acetate and water. Conclusion: Drug discovery and drug design are lengthy processes. By reviewing the extraction methods for this plant, we can streamline one aspect of the process toward designing treatments for diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2023
16. Personal bankruptcy prediction using decision tree model.
- Author
- Nor, Sharifah Heryati Syed, Ismail, Shafinar, and Yap, Bee Wah
- Subjects
- *PERSONAL bankruptcy, *DECISION trees, *ECONOMICS, *BUSINESS enterprises
- Abstract
Purpose - Personal bankruptcy is on the rise in Malaysia. The Insolvency Department of Malaysia reported that personal bankruptcy has increased since 2007, and the total accumulated personal bankruptcy cases stood at 131,282 in 2014. This is indeed an alarming issue because the increasing number of personal bankruptcy cases will have a negative impact on the Malaysian economy, as well as on the society. From the aspect of individual's personal economy, bankruptcy minimizes their chances of securing a job. Apart from that, their account will be frozen, lost control on their assets and properties and not allowed to start any business nor be a part of any company's management. Bankrupts also will be denied from any loan application, restricted from travelling overseas and cannot act as a guarantor. This paper aims to investigate this problem by developing the personal bankruptcy prediction model using the decision tree technique. Design/methodology/approach - In this paper, bankrupt is defined as terminated members who failed to settle their loans. The sample comprised of 24,546 cases with 17 per cent settled cases and 83 per cent terminated cases. The data included a dependent variable, i.e. bankruptcy status (Y = 1(bankrupt), Y = 0 (non-bankrupt)) and 12 predictors. SAS Enterprise Miner 14.1 software was used to develop the decision tree model. Findings - Upon completion, this study succeeds to come out with the profiles of bankrupts, reliable personal bankruptcy scoring model and significant variables of personal bankruptcy. Practical implications - This decision tree model is possible for patent and income generation. Financial institutions are able to use this model for potential borrowers to predict their tendency toward personal bankruptcy. Originality/value - This decision tree model is able to facilitate and assist financial institutions in evaluating and assessing their potential borrower. It helps to identify potential defaulting borrowers. 
It also can assist financial institutions in implementing the right strategies to avoid defaulting borrowers. [ABSTRACT FROM AUTHOR]
- Published
- 2019
17. Analysis of Mobile Service Providers Performance Using Naive Bayes Data Mining Technique.
- Author
- Burhanuddin, M. A., Ismail, Ronizam, Izzaimah, Nurul, Mohammed, Ali Abdul-Jabbar, and Zainol, Norzaimah
- Subjects
- DATA mining, TELECOMMUNICATION, TELECOMMUNICATIONS services, BIG data, CUSTOMER feedback, TELECOMMUNICATION satellites
- Abstract
Recently, the mobile service providers have been growing rapidly in Malaysia. In this paper, we propose analytical method to find best telecommunication provider by visualizing their performance among telecommunication service providers in Malaysia, i.e. TM Berhad, Celcom, Maxis, U-Mobile, etc. This paper uses data mining technique to evaluate the performance of telecommunication service providers using their customers feedback from Twitter Inc. It demonstrates on how the system could process and then interpret the big data into a simple graph or visualization format. In addition, build a computerized tool and recommend data analytic model based on the collected result. From prepping the data for pre-processing until conducting analysis, this project is focusing on the process of data science itself where Cross Industry Standard Process for Data Mining (CRISP-DM) methodology will be used as a reference. The analysis was developed by using R language and R Studio packages. From the result, it shows that Telco 4 is the best as it received highest positive scores from the tweet data. In contrast, Telco 3 should improve their performance as having less positive feedback from their customers via tweet data. This project bring insights of how the telecommunication industries can analyze tweet data from their customers. Malaysia telecommunication industry will get the benefit by improving their customer satisfaction and business growth. Besides, it will give the awareness to the telecommunication user of updated review from other users. [ABSTRACT FROM AUTHOR]
- Published
- 2018
18. THE LEARNING OF MULTIVARIATE ADAPTIVE REGRESSION SPLINES (MARS) MODEL IN RAINFALL-RUNOFF PROCESSES AT PAHANG RIVER CATCHMENT.
- Author
- HALID, D. A., ATAN, I., JAAFAR, J., ASHAARI, Y., MOHAMED, S. N., SAMSUDIN, M. B., and BAKI, A.
- Subjects
- *MARS (Planet), *SPLINES, *RUNOFF, *CORPORATE finance, *DATA mining, *WATERSHEDS
- Abstract
Recently, a novel data mining technique, Multivariate Adaptive Regression Splines (MARS) has begun attracted attention from several hydrological researchers because their application is relatively new in modelling hydrological processes. The power of this approach has been proven in variety learning problems such as financial analysis, species distributions modelling, and doweled pavement performance modelling. Therefore, the objective of this paper is to investigate the performance of MARS model in capture the rainfall-runoff processes at river catchment of Malaysia. Pahang River has been selected as area of study. 30-years data set of daily rainfall and runoff at upstream tributaries of Pahang River were used to developed and validate the capability of MARS model in flood prediction. The effect of different length of record data to performance of MARS model was also examined by arranged the data into 5-years data set, 10 years data set, 20 years data set, and 30 years data set. All these data sets used 1-year data of 2003 for validation process while the others were applied for calibration. Simulation results showed that MARS model was able to learn the rainfall-runoff processes in Pahang River catchment and the model performance improved due to the longer period of data. [ABSTRACT FROM AUTHOR]
- Published
- 2018
19. Natural language processing in narrative breast radiology reporting in University Malaya Medical Centre.
- Author
- Tan WM, Ng WL, Ganggayah MD, Hoe VCW, Rahmat K, Zaini HS, Mohd Taib NA, and Dhillon SK
- Subjects
- Humans, Malaysia, Universities, Data Mining, Natural Language Processing, Radiology
- Abstract
Radiology reporting is narrative, and its content depends on the clinician's ability to interpret the images accurately. A tertiary hospital, such as anonymous institute, focuses on writing reports narratively as part of training for medical personnel. Nevertheless, free-text reports make it inconvenient to extract information for clinical audits and data mining. Therefore, we aim to convert unstructured breast radiology reports into structured formats using natural language processing (NLP) algorithm. This study used 327 de-identified breast radiology reports from the anonymous institute. The radiologist identified the significant data elements to be extracted. Our NLP algorithm achieved 97% and 94.9% accuracy in training and testing data, respectively. Henceforth, the structured information was used to build the predictive model for predicting the value of the BIRADS category. The model based on random forest generated the highest accuracy of 92%. Our study not only fulfilled the demands of clinicians by enhancing communication between medical personnel, but it also demonstrated the usefulness of mineable structured data in yielding significant insights.
- Published
- 2023
20. Regression Analysis on Agent Roles in Personal Knowledge Management Processes.
- Author
-
Ismail, Shahrinaz, Ahmad, Mohd Sharifuddin, and Hassan, Zainuddin
- Subjects
- KNOWLEDGE management, INFORMATION services management, DATA mining, INFORMATION resources management, REGRESSION analysis
- Abstract
Across the literature, there is a gradual development of research on personal knowledge management (PKM) from theoretical to technical perspective on managing personal knowledge over the computer and internet technologies. This research domain has aroused the interest of researchers in Malaysia with the introduction of a PKM model to accommodate the understanding of PKM among knowledge workers. In this model, called the GUSC model, there are four main processes, which are get knowledge, understand knowledge, share knowledge and connect to knowledge sources. This model entails four cognitive enablers that are proposed to mediate the PKM processes. This paper analyses the quantitative data on the GUSC model to further understand the roles of software agents in mediating human's PKM processes. It also analyses the role of cognitive enablers as mediating factors for the PKM processes, which are seen as potential strong notions for software agency. The results of the analysis show the significance of 'connect' as a role of an agent that depends on the rest of the factors. Based on the quantitative findings, it is recommended that the GUSC model is used to conceptualise an agent-based system, with cognitive enablers to determine the appropriate BDI structure for the system. [ABSTRACT FROM AUTHOR]
- Published
- 2013
21. Sentiment Mining of Malay Newspaper (SAMNews) Using Artificial Immune System.
- Author
-
Puteh, Mazidah, Isa, Norulhidayah, Puteh, Sayani, and Redzuan, Nur Amalina
- Subjects
- ARTIFICIAL intelligence, SENTIMENT analysis, DATA mining, ALGORITHMS, NEWSPAPERS
- Abstract
There is a sheer volume of rich web resources such as digital newspapers, e-forums, blogs, Facebook and Twitter. Mining these digital text resources may reveal interesting knowledge to the respective individuals or organizations. Text mining and sentiment mining (or sentiment analysis) are parts of a new area in sentiment research. Sentiment mining for Malay Newspaper (SAMNews) is built on an artificial immune system technique called the negative selection algorithm, which is able to intelligently classify the sentiment of newspaper sentences by polarity (positive, negative or neutral). The sentiment analysis in this project used 1000 sentences from newspapers to evaluate the average accuracy: 900 sentences served as the training data and another 100 as the testing data. The accuracy achieved was 88.5%. In the future, a comparative study on Artificial Immune System and other techniques or algorithms can be carried out to enhance the performance of the sentiment classification model. [ABSTRACT FROM AUTHOR]
- Published
- 2013
22. Identifying places of interest for tourists! Using knowledge discovery
- Author
-
Aghdam, Atae Rezaei, Kamalpour, Mostafa, and Sim, Alex Tze Hiang
- Published
- 2014
23. Sensitivity of missing values in classification tree for large sample.
- Author
-
Hasan, Norsida, Adam, Mohd Bakri, Mustapha, Norwati, and Abu Bakar, Mohd Rizam
- Subjects
- *STATISTICS, *DATA mining, *SENSITIVITY analysis, *PATTERN perception, *MATHEMATICAL models, *PROBABILITY theory
- Abstract
Missing values, whether in predictor or in response variables, are a very common problem in statistics and data mining. Cases with missing values are often ignored, which results in loss of information and possible bias. The objective of our research was to investigate the sensitivity of the classification tree model to missing data for large samples. Data were obtained from one of the high-level educational institutions in Malaysia. Students' background data were randomly eliminated and a classification tree was used to predict students' degree classification. The results showed that, for large samples, the structure of the classification tree was sensitive to missing values, especially for samples containing more than ten percent missing values. [ABSTRACT FROM AUTHOR]
- Published
- 2012
24. ID 63. Characteristics of Electronic Cigarette and Vape Users in Malaysia: Lessons from Decision Tree Analysis.
- Author
-
Kartiwi, Mira, Mohamed, Mohamad Haniki Nik, Rahman, Jamalludin Ab, Draman, Samsul, and Rahman, Norny Syafinaz Ab
- Subjects
- *ELECTRONIC cigarettes, *DECISION making, *DECISION trees, *SMOKING cessation, *DATA mining, *ELECTRONICS in surveying
- Abstract
Introduction: The use of electronic cigarette and vape (ECV) among adults has been rapidly increasing in Malaysia. Objectives: The primary objective of this paper is to understand the characteristics of ECV users in Malaysia by assessing their perceptions and demographic variables. The influence of perceptions and demographic variables on the current status of ECV use was assessed. The predictor variables included in this study were seven demographic variables (i.e., age, gender, race, residence, marital status, occupation and education) and twenty variables on the perception of ECV use. An Induction Decision Tree (ID3) algorithm, one of the renowned data mining techniques, was used in this study. Materials and Methods: A number of simulations were carried out on the dataset, which was extracted from the National Electronic Cigarette Survey (NECS) 2016. Results: The most critical variable identified in this study was gender, indicating that the decision to use ECV differs significantly between males and females. Conclusion: The findings of this study would contribute towards strategizing public health campaigns on smoking cessation. [ABSTRACT FROM AUTHOR]
- Published
- 2020
25. Application of a neuro-fuzzy model to landslide-susceptibility mapping for shallow landslides in a tropical hilly area
- Author
-
Oh, Hyun-Joo and Pradhan, Biswajeet
- Subjects
- *FUZZY systems, *LANDSLIDES, *SOIL maps, *MOUNTAINS, *GEOGRAPHIC information systems, *DATA mining, *REMOTE sensing
- Abstract
This paper presents landslide-susceptibility mapping using an adaptive neuro-fuzzy inference system (ANFIS) in a geographic information system (GIS) environment. In the first stage, landslide locations from the study area were identified by interpreting aerial photographs, supported by an extensive field survey. In the second stage, landslide-related conditioning factors such as altitude, slope angle, plan curvature, distance to drainage, distance to road, soil texture and stream power index (SPI) were extracted from the topographic and soil maps. Then, landslide-susceptible areas were analyzed by the ANFIS approach and mapped using the landslide-conditioning factors. In particular, various membership functions (MFs) were applied for the landslide-susceptibility mapping and their results were compared with the field-verified landslide locations. Additionally, the receiver operating characteristics (ROC) curves for all landslide susceptibility maps were drawn and the area-under-the-curve values were calculated. The ROC curve technique is based on plotting model sensitivity (true positive fraction values calculated for different threshold values) versus model specificity (true negative fraction values) on a graph. Landslide test locations that were not used during the ANFIS modeling were used to validate the landslide susceptibility maps. The validation results revealed that the susceptibility maps constructed by the ANFIS predictive models using triangular, trapezoidal, generalized bell and polynomial MFs produced reasonable results (84.39%), which can be used for preliminary land-use planning. Finally, the authors concluded that ANFIS is a very useful and effective tool in regional landslide susceptibility assessment. [Copyright Elsevier]
- Published
- 2011
26. Organizational demographic variables and preliminary KM implementation success
- Author
-
Chong, Chin Wei, Chong, Siong Choy, and Lin, Binshan
- Subjects
- *KNOWLEDGE management, *TELECOMMUNICATION, *INFORMATION services management, *INFORMATION retrieval, *DATA mining, *QUESTIONNAIRES, *MATHEMATICAL variables
- Abstract
This paper investigates the effect of organizational demographic variables on successful knowledge management (KM) implementation, which so far has not been thoroughly reported in the KM literature. For meaningful results to be generated, four organizational demographic variables, namely functional areas, years of KM involvement, KM development stage, and degree of knowledge intensity, were moderated against a comprehensive set of KM activities, which comprise KM preliminary success factors, KM strategies and KM processes, with organizational performance. The respondents were middle managers working in the telecommunication industry in Malaysia. Based on the data collected from 289 respondents using a structured questionnaire, the results reveal that all four demographic characteristics interacted with the degree of implementation of the KM activities, while three of the characteristics, with the exception of functional areas, show significant relationships with organizational performance. The contributions of the paper, along with the implications of the results, are discussed and interpreted to provide guidance to organizations for improved business performance through KM implementation success. [Copyright Elsevier]
- Published
- 2010
27. KM implementation in Malaysian telecommunication industry: An empirical analysis.
- Author
-
Chong Chin Wei, Chong Siong Choy, and Heng Ping Yeow, Paul
- Subjects
- KNOWLEDGE management, TELECOMMUNICATION, INFORMATION science, DATA mining, ORGANIZATIONAL structure, BUSINESS planning, METHODOLOGY, QUESTIONNAIRES
- Abstract
Purpose: This paper aims to assess the perceived importance (PI) and actual implementation (AI) of five preliminary knowledge management (KM) success factors, i.e. business strategy, organizational structure, knowledge team, knowledge audit, and knowledge map, in the Malaysian telecommunication industry. Design/methodology/approach: A questionnaire survey was conducted on telecommunication organizations located in the capital of Malaysia. Data were analyzed using indices and parametric statistics. Findings: The results show that the organizations are aware of the importance of all the KM factors but fall short of implementation. The implemented factors consist of business strategy, organizational structure, and knowledge team. Knowledge audit and knowledge map are perceived as important but are the least implemented factors. Research limitations/implications: This study was conducted in only one industry in Malaysia. Furthermore, it focuses on the preliminary success factors of KM implementation rather than on learning and knowledge utilization. Practical implications: Telecommunication organizations have to overcome resource problems and enhance their implementation level in order to narrow the gaps for effective, full-scale KM implementation at a later stage. Such viable practice will significantly help the industry not only to compete more effectively within Malaysia, but also to position itself as a global player. Originality/value: This study is perhaps one of the first to address the preliminary steps to be dealt with prior to KM implementation. Moreover, it attempts to compare the PI and AI of the five proposed success factors, which has received very little attention to date. [ABSTRACT FROM AUTHOR]
- Published
- 2006
28. Performance profiling of the unit trust funds in Malaysia with data mining techniques.
- Author
-
Khairuddin AF, Ng KH, and Khor KC
- Subjects
- Data Mining, Income, Malaysia, Financial Management, Investments
- Abstract
Background: Millennials are exposed to many investment opportunities, and they have shown their interest in gaining more income via investments. One popular investment avenue is unit trusts. However, analysing unit trusts' financial data and gaining valuable insights may not be as simple because not everyone has the required financial knowledge and adequate time to perform in-depth analytics on the numerous financial data. Furthermore, it is not easy to compile the performance of each unit trust available in Malaysia. The primary objective of this research is to identify unit trust funds that provide higher returns than their average peers via performance profiling. Methods: This research proposed a performance profiling on Malaysia unit trust funds using the two data mining techniques, i.e., Expectation Maximisation (EM) and Apriori, to assist amateur retail investors to choose the right unit trust based on their risk tolerance. EM clustered the unit trust funds in Malaysia into several groups based on their annual financial performances. This was then followed by finding the rules associated with each cluster by applying Apriori. The resulted rules shall serve the purpose of profiling the clustered unit trust funds. Retail investors can then select their preferred unit trust funds based on the performance profile of the clusters. Results: The yearly average total return of the financial year 2018 and 2019 was used to evaluate unit trust funds' performance in the clusters. The evaluation results indicated that the profiling could provide valuable and insightful information to retail investors with varying risk appetites. Conclusions: This research has demonstrated that the financial performance profiling of unit trust funds could be acquired via data mining approaches. 
This valuable information is crucial to unit trust investors for selecting suitable funds in investment. Competing Interests: No competing interests were disclosed. (Copyright: © 2021 Khairuddin AF et al.)
- Published
- 2021
29. VIEWPOINT Developing an integrated data management system for general practice.
- Author
-
Doshi, Hemendra Kumar
- Subjects
- *COMPUTER software, *DATA mining, *FAMILY medicine, *TECHNOLOGY, *MEDICAL practice
- Abstract
Developing a software program to manage data in a general practice setting is complicated. Vision Integrated Medical System is an example of an integrated management system that was developed by general practitioners, within a general practice, to offer a user-friendly system with multitasking capabilities. The present report highlights the reasons behind the development of this system and how it can assist day-to-day practice. [ABSTRACT FROM AUTHOR]
- Published
- 2003
30. Developing an Ensemble Predictive Safety Risk Assessment Model: Case of Malaysian Construction Projects.
- Author
-
Sadeghi H, Mohandes SR, Hosseini MR, Banihashemi S, Mahdiyar A, and Abdullah A
- Subjects
- Accidents, Occupational prevention & control, Humans, Malaysia, Workplace standards, Construction Industry methods, Construction Industry standards, Occupational Health, Risk Assessment
- Abstract
Occupational Health and Safety (OHS)-related injuries are vexing problems for construction projects in developing countries, mostly due to poor managerial-, governmental-, and technical safety-related issues. Though some studies have been conducted on OHS-associated issues in developing countries, research on this topic remains scarce. A review of the literature shows that presenting a predictive assessment framework through machine learning techniques can add much to the field. As for Malaysia, despite the ongoing growth of the construction sector, there has not been any study focused on OHS assessment of workers involved in construction activities. To fill these gaps, an Ensemble Predictive Safety Risk Assessment Model (EPSRAM) is developed in this paper as an effective tool to assess the OHS risks related to workers on construction sites. The developed EPSRAM is based on the integration of neural networks with fuzzy inference systems. To show the effectiveness of the EPSRAM developed, it is applied to several Malaysian construction case projects. This paper contributes to the field in several ways, through: (1) identifying major potential safety risks, (2) determining crucial factors that affect the safety assessment for construction workers, (3) predicting the magnitude of identified safety risks accurately, and (4) predicting the evaluation strategies applicable to the identified risks. It is demonstrated how EPSRAM can provide safety professionals and inspectors concerned with well-being of workers with valuable information, leading to improving the working environment of construction crew members.
- Published
- 2020
31. Characterization of spatial patterns in river water quality using chemometric pattern recognition techniques.
- Author
-
Gazzaz NM, Yusoff MK, Ramli MF, Aris AZ, and Juahir H
- Subjects
- Data Mining, Malaysia, Environmental Monitoring methods, Rivers, Water Pollutants, Chemical analysis, Water Quality
- Abstract
This study employed three chemometric data mining techniques (factor analysis (FA), cluster analysis (CA), and discriminant analysis (DA)) to identify the latent structure of a water quality (WQ) dataset pertaining to Kinta River (Malaysia) and to classify eight WQ monitoring stations along the river into groups of similar WQ characteristics. FA identified the WQ parameters responsible for variations in Kinta River's WQ and accentuated the roles of weathering and surface runoff in determining the river's WQ. CA grouped the monitoring locations into a cluster of low levels of water pollution (the two uppermost monitoring stations) and another of relatively high levels of river pollution (the mid-, and down-stream stations). DA confirmed these clusters and produced a discriminant function which can predict the cluster membership of new and/or unknown samples. These chemometric techniques highlight the potential for reasonably reducing the number of WQVs and monitoring stations for long-term monitoring purposes. (Copyright © 2012 Elsevier Ltd. All rights reserved.)
- Published
- 2012