1,837 results
Search Results
2. (Position paper) Characterizing the Behavior of Small Producers in Smart Grids: A Data Sanity Analysis.
- Author
-
Stefan, Maria, Gutierrez, Jose, Barlet, Pere, Prieto, Eduardo, Gomis, Oriol, and Olsen, Rasmus L.
- Subjects
ELECTRIC power distribution grids, DATA analysis, ENERGY consumption, FORECASTING, CONSUMER profiling, LOAD forecasting (Electric power systems)
- Abstract
Renewable energy production throughout low-voltage grids has gradually increased in electrical distribution systems, thereby introducing small energy producers - prosumers. This paradigm challenges the traditional unidirectional energy distribution flow to include dispersed power production from renewables. To understand how energy usage can be optimized in the dynamic electrical grid, it is important to understand the behavior of prosumers and their impact on the grid's operational procedures. The main focus of this study is to investigate how grid operators can obtain an automatic data-driven system for low-voltage electrical grid management, by analyzing the available grid topology and time-series consumption data from a real-life test area. The aim is to show how different consumer profiles, clustering and prediction methods contribute to grid-related operations. Ultimately, this work is intended to inform future research directions that can improve the trade-off between systematic, scalable data models and software computational challenges. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
3. A Prospective Randomized Blister Prevention Trial Assessing Paper Tape in Endurance Distances (Pre-TAPED).
- Author
-
Lipman, Grant S., Ellis, Mark A., Lewis, Erica J., Waite, Brandee L., Lissoway, John, Chan, Garrett K., and Krabak, Brian J.
- Subjects
RANDOMIZED controlled trials, PHYSICAL fitness, FOOT injuries, ULTRAMARATHON running, DATA analysis, BLISTERS
- Abstract
Objective: Friction foot blisters are a common injury, occurring in up to 39% of marathoners; they are the most common injury in adventure racing and account for more than 70% of medical visits in multi-stage ultramarathons. The goal of the study was to determine whether paper tape could prevent foot blisters in ultramarathon runners. Methods: This prospective randomized trial was undertaken during RacingThePlanet 155-mile (250-km), 7-day self-supported ultramarathons in China, Australia, Egypt, Chile, and Nepal in 2010 and 2011. Paper tape was applied prerace to one randomly selected foot, with the untreated foot acting as its own control. The study end point was development of a hot spot or blister at any location on either foot. Results: One hundred thirty-six participants were enrolled, with 90 (66%) having complete data for analysis. There were 36% women, with a mean age of 40 ± 9.4 years (range, 25–40 years) and pack weight of 11 ± 1.8 kg (range, 8–16 kg). All participants developed blisters, with 89% occurring by day 2 and 59% located on the toes. No protective effect was observed for the intervention (47 versus 35; 52% versus 39%; P = .22), and fewer blisters occurred around the tape on the experimental foot than under the tape (23 vs 31; 25.6% versus 34.4%); yet 84% of study participants, when queried, said they would choose paper tape for blister prevention in the future. Conclusions: Although paper tape was not found to be significantly protective against blisters, the intervention was well tolerated with high user satisfaction. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
4. Araştırmanın Yöntem Bölümüne İlişkin Öz Yeterlik Ölçeği Geliştirilmesi [Development of a Self-Efficacy Scale for the Methods Section of Research].
- Author
-
GÖK, Bilge, ATALAY KABASAKAL, Kübra, and ÇETİN, Sevda
- Subjects
SELF-efficacy in students, GRADUATE students, GRADUATE study in education, GRADUATE education, DATA analysis, CRONBACH'S alpha
- Abstract
Copyright of Ilkogretim Online is the property of Ilkogretim Online and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2015
- Full Text
- View/download PDF
5. Layered Evaluation of Multi-Criteria Collaborative Filtering for Scientific Paper Recommendation.
- Author
-
Manouselis, Nikos and Verbert, Katrien
- Subjects
MULTIPLE criteria decision making, COMPUTER algorithms, INFORMATION theory, DATA analysis, END users (Information technology), PERFORMANCE evaluation
- Abstract
Recommendation algorithms have been researched extensively to help people deal with the abundance of information. In recent years, the incorporation of multiple relevance criteria has attracted increased interest. Such multi-criteria recommendation approaches are researched as a paradigm for building intelligent systems that can be tailored to multiple interest indicators of end-users – such as combinations of implicit and explicit interest indicators in the form of ratings or ratings on multiple relevance dimensions. Nevertheless, evaluation of these recommendation techniques in the context of real-life applications still remains rather limited. Previous studies dealing with the evaluation of recommender systems have outlined that the performance of such algorithms is often dependent on the dataset, and indicate the importance of carrying out careful testing and parameterization. Especially when looking at large-scale datasets, it becomes very difficult to deploy evaluation methods that may help in assessing the effect that different system components have on the overall design. In this paper, we study how layered evaluation can be applied for the case of a multi-criteria recommendation service that we plan to deploy for paper recommendation using the Mendeley dataset. The paper introduces layered evaluation and suggests two experiments that may help assess the components of the envisaged system separately. [Copyright © Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
6. Characterization of the indoor radio propagation channel at 2.4 GHz.
- Author
-
Wysocki, Tadeusz A. and Zepernick, Hans-Jürgen
- Subjects
DATA transmission systems, DATA analysis, STATISTICS, MICROWAVES, ATTENTION
- Abstract
The unlicensed industrial, scientific, and medical (ISM) band at 2.4 GHz has gained increased attention recently due to the high data rate communication systems developed to operate in this band. The paper presents measurement results of fading characteristics, multipath parameters, and background interference for these frequencies. Some statistical analysis of the measured data is presented. The paper provides information that may be useful in the design and deployment of communication systems operating in the 2.4 GHz ISM band, such as those compliant with the IEEE 802.11 standard and the Bluetooth open wireless standard. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Estimation of the constant-stress model with bathtub-shaped failure rates under progressive type-I interval censoring scheme.
- Author
-
Sief, Mohamed, Liu, Xinsheng, Alsadat, Najwan, and Abd El-Raheem, Abd El-Raheem M.
- Subjects
MAXIMUM likelihood statistics, ACCELERATED life testing, CONFIDENCE intervals, PARAMETER estimation, MARKOV chain Monte Carlo, DATA analysis
- Abstract
This paper investigates constant-stress accelerated life tests interrupted by a progressive type-I interval censoring regime. We provide a model based on the Chen distribution with a constant shape parameter and a log-linear connection between the scale parameter and stress loading. Inferential methods, whether classical or Bayesian, are employed to address model parameters and reliability attributes. Classical methods involve the estimation of model parameters through maximum likelihood and midpoint techniques. Bayesian approximations are achieved via the utilization of the Metropolis–Hastings algorithm, Tierney-Kadane procedure, and importance sampling methods. Furthermore, we engage in a discourse on the estimation of confidence intervals, making references to both asymptotic confidence intervals and credible intervals. To conclude, we furnish a simulation study, a corresponding discussion, and supplement these with an analysis of real data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. A Data-Driven Assessment Model for Metaverse Maturity.
- Author
-
Mincong Tang, Jie Cao, Zixiang Fan, Dalin Zhang, and Pandelica, Ionut
- Subjects
SHARED virtual environments, WEIGHING instruments, DATA analysis
- Abstract
The rapid development of the metaverse has sparked extensive discussion on how to estimate its development maturity using quantifiable indicators, which can offer an assessment framework for governing the metaverse. Currently, the measurable methods for assessing the maturity of the metaverse are still in the early stages. Data-driven approaches, which depend on the collection, analysis, and interpretation of large volumes of data to guide decisions and actions, are becoming more important. This paper proposes a data-driven approach to assess the maturity of the metaverse based on K-means-AdaBoost. This method automatically updates the indicator weights based on the knowledge acquired from the model, thereby significantly enhancing the accuracy of model predictions. Our approach assesses the maturity of metaverse systems through a thorough analysis of metaverse data and provides strategic guidance for their development. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Stakeholders of Cardiovascular Innovation Ecosystems in Germany: A First Level Analysis and an Example.
- Author
-
Kirichenko, Stanislav, Koumpis, Adamantios, and Beyan, Oya
- Subjects
SCIENTIFIC literature, MULTILEVEL models, ECOSYSTEMS, ACQUISITION of data, DATA analysis
- Abstract
This paper aims to provide a first attempt at analyzing innovation ecosystems for cardiovascular pathologies in Germany through the use of a stakeholder model. We present essential stakeholders for the development and deployment of innovations in the field of cardiovascular research and medicine, and the primary functions they fulfill in the context of these innovation ecosystems. The adopted approach consists of the implementation of a multilevel system model for analyzing stakeholders in this particular field. Data acquisition took place through a systematic literature review of multiple articles and studies. Data analysis phases were executed until a sufficient amount of data had been gathered, ensuring consistency across various sources. We demonstrate that innovation ecosystems in cardiovascular medicine involve interconnected networks of stakeholders across different fields. Moreover, through an investigation of innovation ecosystems of cardiovascular pathologies, particularly in Germany, we present the functions undertaken by each stakeholder, which are essential for participation in the innovation ecosystems. The findings presented in this paper hold the potential to bring a better understanding of cardiovascular pathology innovation ecosystems in Germany. This assertion is substantiated through a comprehensive examination of relevant scientific literature. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
10. Adversarial Attack-Based Robustness Evaluation for Trustworthy AI.
- Author
-
Eungyu Lee, Yongsoo Lee, and Taejin Lee
- Subjects
ARTIFICIAL intelligence, MALWARE, CYBERTERRORISM, DATA analysis, ROBUST control
- Abstract
Artificial Intelligence (AI) technology has been extensively researched in various fields, including the field of malware detection. AI models must be trustworthy to introduce AI systems into critical decision-making and resource protection roles. The problem of robustness to adversarial attacks is a significant barrier to trustworthy AI. Although various adversarial attack and defense methods are actively being studied, there is a lack of research on robustness evaluation metrics that serve as standards for determining whether AI models are safe and reliable against adversarial attacks. An AI model's robustness level cannot be evaluated by traditional evaluation indicators such as accuracy and recall; additional indicators are necessary to evaluate robustness against adversarial attacks. In this paper, a Sophisticated Adversarial Robustness Score (SARS) is proposed for AI model robustness evaluation. SARS uses various factors, in addition to the ratio of perturbed features and the size of the perturbation, to evaluate robustness accurately. This evaluation indicator reflects aspects that are difficult to evaluate using traditional indicators. Moreover, the level of robustness can be evaluated by considering the difficulty of generating adversarial samples through adversarial attacks. This paper proposes using SARS, calculated based on adversarial attacks, to identify data groups with robustness vulnerabilities and to improve robustness through adversarial training. Through SARS, it is possible to evaluate the level of robustness, which can help developers identify areas for improvement. To validate the proposed method, experiments were conducted using a malware dataset. Through adversarial training, it was confirmed that SARS increased by 70.59% and the recall reduction rate improved by 64.96%.
Through SARS, it is possible to evaluate whether an AI model is vulnerable to adversarial attacks and to identify vulnerable data types. In addition, it is expected that improved models can be achieved by improving resistance to adversarial attacks via methods such as adversarial training. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
11. The Practice of Predictive Identification: Optimising for Organisational Needs.
- Author
-
Jansen, Fieke
- Subjects
PREDICTIVE policing, CRIME, ELECTRONIC data processing, DATA analysis, CRIMINALS
- Abstract
The advent of predictive policing systems demonstrates an increased interest in more novel forms of data processing for the purpose of crime control. This paper draws on interviews with police practitioners in the Netherlands and the UK to deconstruct the rationalities that are embedded within the turn to predictive identification. Debates on predictive policing have predominantly centred data in the analysis of the institutional and societal implications of prediction, linking its use to the premise of efficiency and accuracy and foregrounding issues around bias and discrimination. Yet, little is known about its actual practice. In policing, I find that studying data as practice surfaces new insights into the relationship between risk and the ways in which crime priorities are operationalised and the security mandate of the state is negotiated. Drawing on Harcourt's (2008) observation that the desire to predict crime says more about the police than it does about a potential offender, I argue that predictive identification is not about prediction, nor about efficiency, but rather about optimisation. Here, datafication serves to overcome self-defined organisational challenges within the police. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Self-adaptive attribute weighted neutrosophic c-means clustering for biomedical applications.
- Author
-
Liu, Zhe, Qiu, Haoye, and Letchmunan, Sukumar
- Subjects
SYNTHETIC genes, GENE expression, OUTLIER detection, DATA analysis
- Abstract
The applications of clustering in biomedicine are pervasive and ubiquitous. A typical example is gene expression data analysis, where clustering is emerging as a powerful solution for uncovering cancer-related insights. Neutrosophic c-means (NCM) clustering has advantages over other conventional clustering methods in characterizing the uncertainty and imprecision caused by cluster overlap and in identifying outliers. Nonetheless, NCM and its derivatives treat the contribution of each attribute to a cluster equally. In biomedical applications, genes (i.e. attributes) should take on different importance in identifying different clusters. In this paper, we first propose a self-adaptive attribute-weighted neutrosophic c-means (AWNCM) clustering method to overcome the above defects. Moreover, a new objective function is designed to obtain the optimal neutrosophic partition, cluster centers, and attribute weights. Since AWNCM tends to be more effective on spherical data, we further develop a kernelized version of AWNCM, called KAWNCM, to better handle the clustering of more complex (i.e. non-spherical) data. We employ an iterative optimization strategy to obtain the optimal solutions for AWNCM and KAWNCM. The advantage of AWNCM and KAWNCM is that they improve performance by learning the importance of each attribute to the cluster while maintaining an efficient solution to cluster overlap and outliers. Extensive experimental results on synthetic data and gene expression data demonstrate the feasibility and effectiveness of the proposed methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. An Approach Towards Reducing Training Time of the Input Doubling Method via Clustering for Middle-Sized Data Analysis.
- Author
-
Izonin, Ivan, Tkachenko, Roman, Yemets, Kyrylo, Gregus, Michal, Tomashy, Yevhen, and Pliss, Iryna
- Subjects
DATA augmentation, BIOMEDICAL engineering, DATA analysis, ELECTRONIC data processing, ALGORITHMS
- Abstract
Intellectual analysis of small and middle-sized datasets through machine learning tools presents challenges in various application domains. Existing methods fail to provide sufficient accuracy, and their utilization is accompanied by a range of issues during data analysis. This paper proposes the improvement of the input doubling method for middle-sized data analysis. The existing method employs an augmentation procedure where the augmented data sample increases quadratically. This imposes several limitations on the method's usage for middle-sized data analysis. The authors propose enhancing this method by introducing an additional clustering procedure during data augmentation. The training algorithms and application methods are described, and a visualization of the main steps of its operation is provided. Modeling is performed on two medium-sized datasets. Optimal parameters for the improved method are selected, demonstrating its high efficiency. Specifically, significant reductions in the volumes of augmented datasets (8-9 times for both datasets respectively) are achieved, accompanied by substantial reductions in the training procedure duration of the method (more than 100 and 260 times for both datasets respectively), while maintaining high accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
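The clustering idea in the abstract above can be illustrated with a minimal sketch; this is our own toy code under assumed details, not the authors' implementation. Plain input doubling concatenates every pair of training vectors, so the augmented set grows quadratically; restricting pairs to samples in the same cluster shrinks it roughly by the number of clusters.

```python
import numpy as np

def input_doubling(X, labels=None):
    """Augment a dataset by concatenating pairs of samples.

    With labels=None every (i, j) pair is formed (quadratic growth);
    with cluster labels, pairs are restricted to the same cluster,
    which is the size-reduction idea described in the abstract.
    """
    n = len(X)
    pairs = []
    for i in range(n):
        for j in range(n):
            if labels is None or labels[i] == labels[j]:
                pairs.append(np.concatenate([X[i], X[j]]))
    return np.array(pairs)

X = np.random.default_rng(0).normal(size=(6, 3))
labels = np.array([0, 0, 0, 1, 1, 1])       # assumed cluster assignments
full = input_doubling(X)                    # 6*6 = 36 augmented rows
reduced = input_doubling(X, labels)         # 2 * 3*3 = 18 augmented rows
```

With two equal-size clusters the augmented set is halved; with k balanced clusters it shrinks by roughly a factor of k, which is consistent with the 8–9x reductions the abstract reports.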
14. A study on the effect of the die transverse angle and the part rotational feed angle in the cold radial forging process of rods.
- Author
-
Roushan, M. Fattahpoor, Afrasiab, H., and Jafari Talookolaei, R. A.
- Subjects
PRODUCT quality, RESIDUAL stresses, FINITE element method, NONLINEAR analysis, DATA analysis
- Abstract
Radial forging is an efficient and high-precision process for manufacturing rotary parts such as shafts, axles, and gun barrels. While this process has been extensively investigated in the literature, the effect of some parameters, such as the die transverse angle and the part rotational feed angle, has not been adequately studied, since simulating several steps of this process requires a full three-dimensional model, which is labor-intensive and time-consuming. To bridge this gap, in this paper a three-dimensional nonlinear finite element model has been developed to analyze the effects of the die transverse angle and part rotational feed angle in this process. To address the lack of reliable experimental data in the literature, an innovative approach has been proposed and used for validation of the developed finite element model. It has been observed that dies with transverse angles of 155° and 165° provide the best performance in producing a part geometry close to the desired shape. However, the most uniform residual stress distribution is obtained in forging with a die with a smaller transverse angle. Furthermore, to improve the shape and quality of the final product, the part rotational feed angle should be reduced as much as possible. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Efficient Explanation and Evaluation Methodology Based on Hybrid Feature Dropout.
- Author
-
Jingang Kim, Suengbum Lim, and Taejin Lee
- Subjects
ARTIFICIAL intelligence, DEEP learning, ERROR detection (Information theory), FEATURE extraction, DATA analysis
- Abstract
AI-related research is conducted in various ways, but the reliability of AI prediction results is currently insufficient, so expert decisions are indispensable for tasks that require essential decision-making. XAI (eXplainable AI) is studied to improve the reliability of AI. However, each XAI methodology shows different results on the same dataset and the same model. This means that XAI results must be given meaning, and a great deal of noise emerges. This paper proposes an HFD (Hybrid Feature Dropout)-based XAI and evaluation methodology. The proposed XAI methodology can mitigate shortcomings such as incorrect feature weights and impractical feature selection. There are few XAI evaluation methods; this paper proposes four evaluation criteria that can give practical meaning. As a result of verification with a malware dataset (Data Challenge 2019), we confirmed better results than other XAI methodologies on the 4 evaluation criteria. Since the efficiency of interpretation is verified with a reasonable XAI evaluation standard, the practicality of the XAI methodology will be improved. In addition, the usefulness of the XAI methodology will be demonstrated to enhance the reliability of AI, helping to apply AI results to essential tasks that require expert decision-making. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
16. A Modified PointNet-Based DDoS Attack Classification and Segmentation in Blockchain.
- Author
-
Jieren Cheng, Xiulai Li, Xinbing Xu, Xiangyan Tang, and Sheng, Victor S.
- Subjects
BLOCKCHAINS, IMAGE segmentation, SOCIAL impact, DATA analysis, ACCURACY
- Abstract
With the rapid development of blockchain technology, the number of distributed applications continues to increase, so ensuring the security of the network has become particularly important. However, due to their decentralized nature, blockchain networks are vulnerable to distributed denial-of-service (DDoS) attacks, which can lead to service outages, causing serious economic losses and social impacts. The research questions in this paper mainly include two aspects: first, the classification of DDoS, which refers to detecting whether blockchain nodes are suffering DDoS attacks, that is, detecting the data of nodes in parallel; second, the problem of DDoS segmentation, that is, determining which type of DDoS attack each of multiple simultaneous pieces of data belongs to. To solve these problems, this paper proposes a modified PointNet (MPointNet) for the classification and type segmentation of DDoS attacks. A dataset containing multiple DDoS attack types was constructed using the CIC-DDoS2019 dataset, and the model was trained, validated, and tested accordingly. The results show that the proposed DDoS attack classification method has high performance and can be used in the actual blockchain security maintenance process. The accuracy of classification tasks reached 99.65%, and the accuracy of type segmentation tasks reached 85.47%. Therefore, the method proposed in this paper has high application value in the classification and segmentation of DDoS attacks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
17. An analysis of Open Data Scoring System towards Data Science for Sustainability in Industry 4.0.
- Author
-
Castro, Hélio, Costa, Filipe, Ferreira, Tânia, Ávila, Paulo, Cruz-Cunha, Manuela, Ferreira, Luís, Putnik, Goran D., and Bastos, João
- Subjects
INDUSTRY 4.0, DATA science, SUSTAINABILITY, SCIENCE & industry, DATA analysis, BUSINESSPEOPLE
- Abstract
In a data-driven society, data inclusion and data access play a significant role in societal development. A so-called democratization of data through open access, Open Data, must be nurtured by countries to empower their citizens, entrepreneurs, companies, industries, academics, and organizations in general. The Open Data Scoring System is an evaluation system that ranks countries in 22 categories of data openness, divided across the 3 pillars of sustainability. In this paper, we present the importance of Industry 4.0 and its relation to sustainability, and the role of Data Science in Industry 4.0 assuming an Open Design approach. Then, an analysis is made considering the Gross Domestic Product (GDP) of the most relevant countries worldwide, the USA and China, concerning the six (6) highest-ranked categories of data openness of these countries, supported by the Open Data Scoring System from 2015 to 2020. Our findings reveal that in the USA and China the main categories comprise seven (7) economic, five (5) social, and two (2) environmental sustainability categories, respectively. A correlation and co-occurrence analysis of open data scoring worldwide reveals that the most significant categories are four (4) economic, one (1) social, and two (2) environmental. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Unsupervised K-means Analysis of Tuberculosis Data in Brazil: Identifying High Prevalence States and Temporal Trends.
- Author
-
Rossini, Angelo, Alves, Domingos, Cassão, Vitor, and Brandão Miyoshi, Newton Shydeo
- Subjects
K-means clustering, DATA analysis, TRENDS, TIME series analysis, DATABASES, TUBERCULOSIS
- Abstract
This paper aims to demonstrate the findings obtained through the analysis and application of an unsupervised K-means algorithm on the SINAN database from 2001 to 2022 in Brazil, with the objective of understanding which states have the highest number of tuberculosis cases and identifying similarities among them that may contribute to a higher case rate relative to the local population. We will begin with a brief historical introduction, followed by an overview of the characteristics related to tuberculosis transmission. Subsequently, we will discuss the results obtained from the year-to-year analysis of the collected data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
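For readers unfamiliar with the method named in the title above, a bare-bones K-means (Lloyd's algorithm) run on hypothetical state-level case-rate vectors looks like this; the data below are invented for illustration and are not SINAN records:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=42):
    """Plain Lloyd's algorithm: assign each point to its nearest
    center, then recompute each center as its cluster's mean."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Pairwise distances from every point to every center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels, centers

# Hypothetical two-year case-rate vectors for six "states":
# two with high incidence, four with low incidence.
rates = np.array([[90.0, 95.0], [88.0, 92.0],
                  [10.0, 12.0], [11.0, 9.0], [12.0, 13.0], [9.0, 10.0]])
labels, centers = kmeans(rates, k=2)
```

With well-separated groups like these, the two high-incidence rows end up in one cluster and the four low-incidence rows in the other, which is the kind of "high prevalence states" grouping the abstract describes.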
19. Using machine learning to understand driving behavior patterns.
- Author
-
Valente, Jorge, Ramalho, Cláudia, Vinha, Pedro, Mora, Carlos, and Jardim, Sandra
- Subjects
PYTHON programming language, MACHINE learning, MOTOR vehicle driving, SUPPORT vector machines, PROGRAMMING languages, SCIENTIFIC computing
- Abstract
Driver behavior is one of the principal factors associated with road accidents. Machine learning technology has been successfully applied to identifying driving styles and recognizing unsafe behaviors. In this paper, the development of an Android mobile application (DriverAlert) is described, with the aim of collecting data from mobile phone sensors to identify certain patterns and understand drivers' behaviors. Additional information was recorded regarding weather and traffic, using public APIs to complement the data directly collected from the vehicle. Four machine learning models (K-Means, Agglomerative Hierarchical clustering, Random Forest, and Support Vector Machines) were tested and compared to identify different driver profiles. The native mobile application, named DriverAlert, was developed to support data collection and make the data available, through an online dashboard, to drivers and researchers. Due to the tools and libraries it offers, the Python language was used, as it is a powerful programming language for workloads in data science, machine learning, and scientific computing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. GLC-Frame: A Framework and Library for Exploration of Multidimensional Data with General Line Coordinates.
- Author
-
Luque, Leandro, Antonini, Antonella, Ganuza, Maria Lujan, and Castro, Silvia
- Subjects
DATA libraries, DATA visualization, DATA analysis, LIBRARIES, LOSSLESS data compression
- Abstract
Copyright of Journal of Computer Science & Technology (JCS&T) is the property of Journal of Computer Science & Technology and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
21. Intelligent Image Text Detection via Pixel Standard Deviation Representation.
- Author
-
Guia, Sana Sahar, Laouid, Abdelkader, Hammoudeh, Mohammad, and Kara, Mostafa
- Subjects
ARTIFICIAL intelligence, PIXELS, STANDARD deviations, MACHINE learning, DATA analysis
- Abstract
Artificial intelligence has been applied in several domains. Despite the advantages of using artificial intelligence techniques, some crucial limitations prevent them from being implemented in specific domains and locations. Accuracy, the poor quality of gathered data, and processing time are considered major concerns in implementing machine learning techniques, particularly on low-end smart devices. This paper introduces a novel pre-treatment technique dedicated to image text detection that uses the image's pixel divergence and similarity to reduce the image size. Mitigating the image size while keeping its features improves the model training time with an acceptable accuracy rate. The mitigation is achieved by gathering similar image pixels into one pixel based on calculated values of the standard deviation σ, where two pixels are considered similar if they have approximately the same σ values. The work proposes a new pipeline approach that reduces the size of the image in the input and intermediate layers of a deep learning model, operating on pixels merged using standard deviation values instead of the whole image. The experimental results prove that this technique significantly improves the performance of existing text detection methods, particularly in challenging scenarios such as low-end IoT devices with low-contrast or noisy backgrounds. Compared with other techniques, the proposed technique can be exploited for text detection in IoT-gathered multimedia data with reasonable accuracy in a short computation time. Evaluation on the MSRA-TD500 dataset demonstrates the remarkable performance of our approach, Standard Deviation Network (σNet), with precision and recall values of 93.8% and 85.6%, respectively, outperforming recent research results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
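The σ-based pixel merging described in the abstract above can be sketched roughly as follows. This is our own simplified interpretation (tile-wise merging with an assumed block size and tolerance), not the authors' σNet pipeline:

```python
import numpy as np

def sigma_downsample(img, block=2, tol=8.0):
    """Collapse each block x block tile of a grayscale image into one
    pixel: tiles whose standard deviation is below tol (i.e. whose
    pixels are "similar") are replaced by their mean, while
    high-variance tiles keep their max to preserve text contrast."""
    h, w = img.shape
    h -= h % block
    w -= w % block
    tiles = img[:h, :w].reshape(h // block, block, w // block, block)
    tiles = tiles.transpose(0, 2, 1, 3).reshape(h // block, w // block, -1)
    sigma = tiles.std(axis=-1)
    out = np.where(sigma <= tol, tiles.mean(axis=-1), tiles.max(axis=-1))
    return out

img = np.arange(64, dtype=float).reshape(8, 8)  # synthetic 8x8 "image"
small = sigma_downsample(img)                   # reduced to 4x4
```

Each tile of this smooth gradient has a low σ (about 4.0), so every tile collapses to its mean; a tile straddling a sharp text edge would exceed the tolerance and keep its bright value instead.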
22. Evaluating changes in business distribution within urban rail transit hubs in Beijing via Point of Interest (POI) data analysis (2008–2020).
- Author
-
Wan, Bo, Wan, Dongyang, and Sheng, Qiang
- Subjects
URBANIZATION, DATA analysis, CITIES & towns, URBAN growth
- Abstract
This paper, set against the backdrop of expanding urban rail networks and dynamic urban development, focuses on the distribution and evolution of commercial Points of Interest (POIs) within the central urban rail transit areas of Beijing. The study examines data from four different years—2008, 2013, 2017, and 2020—to observe the temporal evolution of commercial entities. It identifies stable explanatory variables affecting the distribution and evolution of commercial POIs, which include rail transit accessibility, characteristics of the working and residential population distribution around stations, and the construction intensity in the vicinity of station areas. Through statistical analysis and model building, relatively stable linear regression equations were established, with R² values generally maintained above 0.5 (except for 2017). The study advances our understanding of the influence of rail transit on urban commercial spaces and how this influence shifts with temporal and urban developmental changes. It elucidates the correlation between changes in the number of businesses and spatial configuration, offering insights and information for urban planners and policy makers. This research also serves as a model for exploring the interplay between urban rail transit and commercial spaces in other major cities. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Secondary use of data for data analysis: A case of modeling medical data for treatment analysis and assessment.
- Author
-
Kaloyanova, Kalinka and Kaloyanov, Kaloyan
- Subjects
SECONDARY analysis ,DATA modeling ,THERAPEUTICS ,APPLICATION stores ,ACQUISITION of data - Abstract
Collecting and structuring data on which to perform specific data analyses is a long and complicated process. It requires substantial advance planning of what data and what data characteristics will be needed for the study, and it is important to specify the format in which they should be collected. On the other hand, huge amounts of data are already stored in a variety of applications. This data is clean and consistent, and in many cases it can be used for new analyses different from the original purposes for which the information was collected. This is particularly true for data in healthcare: medical information accumulated in multiple systems can be used for further research and analysis. This paper presents a methodology, appropriate data models, and relevant software decisions that support the reuse and exploration of medical data for specific treatment assessments based on data routinely collected from clinical systems. Some initial results are presented and discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Cloud computing for big data analytics: How cloud computing can handle processing large amounts of data and improve real-time data analytics.
- Author
-
Galego, Nuno Miguel Carvalho, Martinho, Domingos Santos, and Duarte, Nelson Martins
- Subjects
BIG data ,CLOUD computing ,CLOUD storage ,DATA analytics ,COMPUTING platforms ,DATA analysis - Abstract
With the increasing volume, velocity, and variety of data generated by various sources, big data has become a critical challenge for many organizations. Cloud computing provides an efficient and cost-effective solution to address the challenges of big data analysis. In this paper, we review the state-of-the-art cloud computing technologies for big data analysis, including cloud storage, cloud computing platforms, and cloud-based big data analytics tools. We also identify the major challenges and solutions for cloud-based big data analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Predictive Data Analysis: Leveraging RNN and LSTM Techniques for Time Series Dataset.
- Author
-
Agarwal, Harsh, Mahajan, Ginika, Shrotriya, Anita, and Shekhawat, Deepika
- Subjects
DEEP learning ,STOCK price forecasting ,TIME series analysis ,STOCK price indexes ,RECURRENT neural networks ,DATA analysis - Abstract
Recurrent neural networks (RNNs) and long short-term memory (LSTM) models have demonstrated tremendous effectiveness in modeling time series data due to their ability to capture temporal dependencies. Stock market data is an example of time series data. Stock price data is highly volatile and dynamic, making the stock market one of the most turbulent areas to invest in. The choice to buy or sell stocks is heavily influenced by external circumstances and by statistical analysis of previous stock performance. Stock price index forecasting has been a major area of research for many years, and many machine learning and deep learning methods have been proposed to simplify this hard process, but with limited success so far. In this paper, the application of RNNs and LSTMs to a stock price dataset to predict future values is explored. We start with RNNs to predict the values, but one of their major challenges is the "vanishing gradient" problem, which makes it difficult for the network to learn long-term dependencies in the data. To overcome this, long short-term memory (LSTM) was used, which eliminates the vanishing gradient problem. Data were normalized and divided into time steps to determine the relationship between past and future values and make accurate predictions. The outcomes show that the MLS LSTM method significantly outperforms existing algorithms, achieving a prediction accuracy of 98.1% on the training data set and 91.97% on the testing data set. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
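The normalization and time-step framing the abstract describes can be sketched as follows. This illustrates only the data preparation (the window length is an assumed parameter), not the paper's MLS LSTM model:

```python
import numpy as np

def minmax(series):
    """Scale a series to [0, 1], the normalization step the abstract describes."""
    s = np.asarray(series, dtype=float)
    return (s - s.min()) / (s.max() - s.min())

def make_windows(series, n_steps):
    """Slice a 1-D series into overlapping windows of n_steps past values
    (inputs) paired with the next value (target): the standard supervised
    framing used before feeding the data to an RNN/LSTM."""
    X, y = [], []
    for i in range(len(series) - n_steps):
        X.append(series[i:i + n_steps])
        y.append(series[i + n_steps])
    return np.array(X), np.array(y)
```

Each row of `X` carries the past `n_steps` values from which the model must predict the corresponding entry of `y`.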
26. A Consumer Behavior Analysis Framework toward Improving Market Performance Indicators: Saudi's Retail Sector as a Case Study.
- Author
-
Alawadh, Monerah and Barnawi, Ahmed
- Subjects
CONSUMER behavior ,BEHAVIORAL assessment ,BIG data ,CONSUMERS ,TRANSACTION records ,DATA analysis - Abstract
Studying customer behavior and anticipating future trends is a challenging task, as customer behavior is complex and constantly evolving. To effectively anticipate future trends, businesses need to analyze large amounts of data, use sophisticated analytical techniques, and stay up-to-date with the latest research and industry trends. In this paper, we propose a comprehensive framework to identify trends in consumer behavior using multiple layers of processing, including clustering, classification, and association rule learning. The aim is to help a major retailer in Saudi Arabia better understand customer behavior by utilizing the power of big data analysis. The proposed framework is presented as being generalized to gain insight into the generated big data and enable data-driven decision-making in other relevant domains. We developed this framework in collaboration with a large supermarket chain in Saudi Arabia, which provided us with over 1,000,000 sales transaction records belonging to around 30,000 of their loyal customers. In this study, we apply our proposed framework to those data as a case study and present our initial results of consumer clustering and association rules for each cluster. Moreover, we analyze our findings to figure out how we can further utilize intelligence to predict customer behavior in clustered groups. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
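The association-rule layer of such a framework can be illustrated with a toy single-antecedent miner over transaction lists. The support and confidence thresholds below are illustrative assumptions, not the paper's settings:

```python
from itertools import combinations
from collections import Counter

def association_rules(transactions, min_support=0.5, min_conf=0.7):
    """Mine simple A -> B rules from item pairs: keep pairs whose joint
    support clears min_support, then emit each direction whose confidence
    (joint count / antecedent count) clears min_conf."""
    n = len(transactions)
    item_counts = Counter(i for t in transactions for i in set(t))
    pair_counts = Counter(p for t in transactions
                          for p in combinations(sorted(set(t)), 2))
    rules = []
    for (a, b), c in pair_counts.items():
        if c / n < min_support:
            continue
        for ante, cons in ((a, b), (b, a)):
            conf = c / item_counts[ante]
            if conf >= min_conf:
                rules.append((ante, cons, round(conf, 2)))
    return rules
```

In practice each cluster's transactions would be mined separately, yielding per-cluster rule sets as the abstract describes.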
27. A Hybrid Feature Selection Based on Fisher Score and SVM-RFE for Microarray Data.
- Author
-
Hamla, Hind and Ghanem, Khadoudja
- Subjects
FEATURE selection ,TUMOR classification ,TUMOR diagnosis ,DATA analysis ,DIAGNOSIS - Abstract
Copyright of Informatica (03505596) is the property of Slovene Society Informatika and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
28. A Practical Approach of Data Visualization from Geographic Information Systems by Using Mobile Technologies.
- Author
-
Vasilev, Julian, Petrov, Pavel, and Jordanov, Jordan
- Abstract
The purpose of this paper is to demonstrate a practical approach to data visualization using mobile technologies and desktop geographic information systems. To do so, we analyze public data on Black Sea water pollution at Varna's beaches, using real data in real time. Geographic information systems (GISs) are used for visualization, and statistical software is used for correlation analysis. In our approach, the use of open-source software products is essential, namely QGIS and PSPP. The methodology uses a single dataset; after extract-transform-load procedures, additional analysis is conducted. Mobile technologies are used to demonstrate the results of the correlation analysis to a wide audience, and data visualization aids understanding. A similar visualization approach may be applied to other subjects and fields of interest. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Improving Short-Term Traffic Flow Prediction using Grey Relational Analysis for Data Filtering and Stacked LSTM Modeling.
- Author
-
Zhizhu Wu, Mingxia Huang, Zhibo Xing, and Tao Yang
- Subjects
GREY relational analysis ,TRAFFIC flow ,TRAFFIC congestion ,DATA analysis ,PREDICTION models - Abstract
Traffic flow prediction is one of the critical measures for alleviating traffic congestion. Traffic flow prediction research has made some achievements, but deficiencies remain. To address the problems of low prediction accuracy, poor real-time performance, and high data dimensionality, this paper proposes a new traffic flow prediction method that employs Grey Relational Analysis (GRA) to detect the correlation between detection points, remove insignificant or uncorrelated traffic flow data points, and hence reduce the data dimensionality of the prediction model. Multiple Long Short-Term Memory (LSTM) models are then stacked to establish the traffic flow prediction model, considering that traffic flow is affected by multi-dimensional spatiotemporal factors, and incorporating vehicle speed, occupancy, and traffic volume as inputs. We conducted experiments on real datasets, and the results showed that our GRA-SLSTM model improved prediction accuracy by 3.6% compared to other models, while reducing model prediction time by 27.33%. The proposed model's generalization ability is validated by predicting other detection points, which provides significant references for traffic flow prediction research and practical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
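The GRA filtering step scores how closely each candidate detection point's series tracks the target series; low-scoring points are then dropped to reduce dimensionality. A minimal sketch (ρ = 0.5 is the customary distinguishing coefficient, and the sequences are assumed pre-normalized):

```python
import numpy as np

def grey_relational_grade(ref, candidates, rho=0.5):
    """Grey relational grade of each candidate sequence against a reference:
    pointwise deviations are mapped to coefficients in (0, 1] and averaged,
    so an identical sequence scores 1.0."""
    ref = np.asarray(ref, dtype=float)
    candidates = np.atleast_2d(np.asarray(candidates, dtype=float))
    delta = np.abs(candidates - ref)                  # pointwise deviations
    dmin, dmax = delta.min(), delta.max()
    coef = (dmin + rho * dmax) / (delta + rho * dmax)  # grey relational coefficients
    return coef.mean(axis=1)                           # one grade per candidate
```

Detection points whose grade falls below a chosen cutoff would be excluded from the prediction model's inputs.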
30. A theoretical framework based on activity theory and structuration theory.
- Author
-
Nehemia, Monica and Iyamu, Tiko
- Subjects
STRUCTURATION theory ,ACTOR-network theory ,INFORMATION storage & retrieval systems ,DATA analysis - Abstract
The last two decades have witnessed a surge of sociotechnical theories used to examine information systems (IS) research, largely in order to increase rigor. This necessitates innovative and useful ways of applying the theories in IS research. Inevitably, it leads to the complementary use of some of the theories: activity theory (AT) and actor-network theory (ANT), and structuration theory (ST) and ANT, have been combined in IS studies in recent years. Despite these efforts, there is a need to combine more theories, to cover extended scopes and increase rigor in the phenomenon being studied. This paper proposes a theoretical framework that combines AT and ST for data analysis and interpretation of findings in IS research. The paper advances the application of sociotechnical theories to underpin IS research through a theoretical framework that combines AT and ST's dimension of change. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
31. Innovation in Uruguay between 1970 and 2018: an approach through patent data.
- Author
-
Bianchi, Carlos, Galaso, Pablo, Palomeque, Sergio, Picasso, Santiago, and Rodríguez Miranda, Adrián
- Subjects
PATENT offices ,REGIONAL development ,FILES (Records) ,PATENTS ,DATA analysis ,SPATIO-temporal variation - Abstract
Copyright of Revista Española de Documentación Científica is the property of Consejo Superior de Investigaciones Cientificas and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF
32. Intelligent Analysis Algorithm for Hidden Danger Identification of Intelligent Network Monitoring System from the Perspective of Big Data.
- Author
-
Xu, Fang, Chen, Qiang, Liu, Qi, and Li, Ning
- Subjects
BIG data ,ALGORITHMS ,HAZARDS ,INFORMATION dissemination ,INTELLIGENT networks ,DATA analysis ,IDENTIFICATION - Abstract
Big data has become the mainstream of information dissemination today and plays an important role in the field of intelligent network monitoring. How to improve hidden danger identification in intelligent network monitoring systems, and its accuracy, is a pressing issue. Therefore, in the context of big data, this paper uses an intelligent analysis algorithm to identify and analyze the hidden dangers of an intelligent network monitoring system. This paper mainly applies the methods of experimental setup, fuzzy analysis, and data comparison to carry out relevant research on intelligent analysis of hidden danger identification for intelligent network monitoring systems. The experimental data show that the accuracy obtained by the proposed method in hidden danger identification is above 50% but below 80%; therefore, the application of the background difference method needs further improvement. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
33. Academic Early Warning Model for Students Based on Big Data Analysis.
- Author
-
Kun Wang
- Subjects
BIG data ,DATA analysis ,PEARSON correlation (Statistics) ,COLLEGE students ,NONLINEAR equations ,WARNINGS - Abstract
How to identify in advance and help college students with academic difficulties is an important topic for education departments and universities. An academic early warning system based on big data analysis comprehensively analyzes the learning, life, and psychological data of college students, effectively identifies potential academic problems, and helps teachers and student managers take measures in advance to improve education quality. Existing academic warning models for college students based on big data analysis often have defects, such as data quality issues, lack of key variables, nonlinear problems, and human factors. Therefore, this paper studies an academic early warning model for college students based on big data analysis. After elaborating the key points of collecting the model's data, this paper explains the rationale for calculating the Pearson correlation coefficient of the collected big data. This paper constructs an academic early warning model for college students based on a deep self-coding network, provides the construction process, and explains its working principle. After optimizing the model parameters, this paper analyzes the model reconstruction error using a sliding-window statistical method and further improves the prediction ability and generalization performance of the deep self-coding network model, thus obtaining higher academic early warning accuracy. The experimental results verified that the constructed model was effective. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
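The Pearson correlation screening the abstract refers to is the standard statistic, computable in a few lines; a self-contained sketch:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient: covariance of x and y divided by
    the product of their standard deviations, ranging from -1 to 1."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Variables whose correlation with academic outcomes is near zero would be candidates for exclusion before model construction.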
34. Fuzzy cluster-scaled principal component analysis for mixed data and its application of educational effect based on device selection.
- Author
-
Sato-Ilic, Mika and Ilic, Peter
- Subjects
PRINCIPAL components analysis ,FUZZY clustering technique ,DATA analysis ,TEST scoring - Abstract
This paper proposes a fuzzy cluster-scaled principal component analysis (fuzzy cluster-scaled PCA) for mixed data, which consists of both numerical and categorical data with respect to quantitative and qualitative variables. The fuzzy cluster-scaled PCA was originally proposed for high-dimension, low-sample-size (HDLSS) data, in which the number of variables (or dimensions) is much larger than the number of objects (or samples); in that case, the target data is numerical only. The fuzzy cluster-scaled PCA proposed in this paper can be applied to the mixed type of HDLSS data by utilizing the feature of the fuzzy cluster-scaled correlation, which is decomposed into two parts: the first part is the correlation of classification structures between variables, and the second part is the correlation between variables. The first part can then be adapted to the categorical data, and the second part can be used for the numerical data through the same objects. Several numerical examples, using data from a survey on the relationship between students' choice of devices and examination scores/class marks to measure educational effect, show the better performance of the proposed method and clarify insightful, important information on educational effectiveness. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
35. From Data to Wisdom: A Review of Applications and Data Value in the context of Small Data.
- Author
-
Werner, Jonas, Beisswanger, Philipp, Schürger, Christoph, Klaiber, Marco, and Theissler, Andreas
- Subjects
BIG data ,MACHINE learning ,RESEARCH personnel ,WISDOM ,DATA analysis ,DATA visualization - Abstract
Small data and big data are distinct approaches to data analysis and utilization in various applications. While big data has been the focus of many research and business efforts for more than ten years, small data is increasingly being recognized as having potential value in certain settings. We systematically review the literature and conclude that small data can indeed be valuable in certain scenarios. This paper incorporates the data value perspective of small data within various application areas. For this, we apply the data-information-knowledge-wisdom (DIKW) hierarchy to categorize papers and findings, and discuss the papers from the viewpoint of "data value". Our review identifies various contexts where small data can be used to create value, such as data pre-processing, classification tasks, anomaly detection, forecasting, and decision support. We also highlight industries that may be particularly promising areas for practitioners and researchers focused on small data. In addition, we provide an overview of methods and tools for small data analysis, including statistical techniques, visualization, and machine learning algorithms. Finally, based on our results, we suggest that further research should focus on small data analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. Construction of College Students' Management Informatization Ecosystem Based on Data Analysis Technology.
- Author
-
Yinghui Xi and Lei Zhu
- Subjects
ECOSYSTEM management ,COLLEGE students ,BIG data ,DATA analysis ,PSYCHOLOGY of students ,DEEP learning ,MULTISENSOR data fusion - Abstract
A college students' management informatization ecosystem can help college students better manage information and resources, thus improving learning and living efficiency. It is important to introduce data technology into the college student management informatization ecosystem, which can better support the extraction of student behavior fluctuation information and more targeted student management decisions. Current research lacks consideration of future development trends, such as the impact of applying technologies including artificial intelligence, big data, and the Internet of Things. This paper studies the construction of a college students' management informatization ecosystem based on data analysis technology. Firstly, a data fusion scheme for the ecosystem is designed and its underlying ideas are given. For the big data on students' learning and living behaviors, this paper digs deep into the value of the data, extracting historical student behavior characteristics and providing a basis for short-term student behavior prediction and for planning and implementing management decisions. Based on historical student behavior data and its characteristic data, the XGBoost algorithm is used to quantify and rank the importance of each factor influencing student behavior fluctuations. Experimental results verify the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
37. The Current Perceptions About Instructional Tools in Education Towards Adoption of Virtual Reality Among Undergraduate Students.
- Author
-
Farsi, Ghaliya Al, Yusof, Azmi bin Mohd., Bin Rusli, Mohd Ezanee, and AlSinani, Maryam
- Subjects
UNDERGRADUATES ,THEMATIC analysis ,LEARNING ,DATA analysis - Abstract
Virtual reality (VR) has emerged as a major tool in research and education development. However, many challenges arise for students during instruction and learning. The learning process, the placement of support assessment variables, and the behavioral intention to continue using VR within the learning spectrum are crucial to its success in the educational sector. The goal of this study is to inquire into the degree to which VR is now being utilized in the field of education. Thematic analysis was used to analyze the data, since it provided an approach applicable across the interviews; multiple methods were combined in this study. The study polled 32 teachers and analyzed the results with advanced analysis tools. All of the proposed points were determined using the data analysis. The initial round of interview questions aimed to discover more about the participants' backgrounds, including work titles and gender. The survey-interview results in this paper show that participants are able to use VR as an instructional tool. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
38. MODELING AND ANALYZING USER BEHAVIOR RISKS IN ONLINE SHOPPING PROCESSES BASED ON DATA-DRIVEN AND PETRI-NET METHODS.
- Author
-
Wangyang Yu, Zhuojing Ma, Xiaojun Zhai, Yuke Zhou, Weiwei Zho, and Yuan Liu
- Subjects
ONLINE shopping ,AT-risk behavior ,PETRI nets - Abstract
With the rapid spread of e-commerce and e-payment, an increasing number of people choose online shopping over traditional shopping. However, malicious user behaviors have a significant influence on the security of users' accounts and property. To guarantee the security of the shopping environment, a method based on Complex Event Processing (CEP) and Colored Petri nets (CPN) is proposed in this paper. CEP is a data-driven technology that can correlate and process a large amount of data according to Event Patterns, and CPN is a formal model that can simulate and verify the specifications of online shopping processes. In this work, we first define a modeling scheme to depict the user behaviors and Event Patterns of online shopping processes based on CPN. The Event Patterns can be constructed and verified by formal methods, which guarantees their correctness. The Event Patterns are then translated into the Event Pattern Language (EPL) according to the corresponding algorithms. Finally, the EPLs can be inserted into the complex event processing engine to analyze users' behavior flows in real time. We validate the effectiveness of the proposed method through case studies. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
39. Hybrid Approach for Privacy Enhancement in Data Mining Using Arbitrariness and Perturbation.
- Author
-
Murugeshwari, B., Rajalakshmi, S., and Sudharson, K.
- Subjects
DATA mining ,DATA security ,STATISTICAL sampling ,PERTURBATION theory ,DATA analysis - Abstract
Imagine numerous clients, each with personal data; individual inputs are severely corrupted, and a server is concerned only with the collective, statistically essential facets of this data. In several data mining methods, privacy has become highly critical. As a result, various privacy-preserving data analysis technologies have emerged. Hence, we use a randomization process to reconstruct composite data attributes accurately, and privacy measures to estimate how much distortion is required to guarantee privacy. There are several viable privacy protections; however, determining which one is best is still a work in progress. This paper discusses the difficulty of measuring privacy while also offering numerous random sampling procedures and results on statistical and categorized data. Furthermore, this paper investigates the use of arbitrariness with perturbations in privacy preservation. According to the research, arbitrary objects (most notably random matrices) have "predicted" frequency patterns. It shows how to recover crucial information from a sample damaged by a random number using an arbitrary lattice spectral selection strategy. This filtering framework posits, and extensive practical findings indicate, that sparse data distortions preserve relatively modest privacy protection in various situations. As a result, the research framework is efficient and effective in maintaining data privacy and security. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
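The randomization idea in the abstract, where each client reports a noisy value and the server recovers only aggregate statistics, can be sketched with additive Laplace noise. The noise scale and client count below are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(values, scale=1.0):
    """Each client adds independent zero-mean Laplace noise before
    reporting, so no individual report can be trusted as the true value."""
    return values + rng.laplace(0.0, scale, size=len(values))

def reconstruct_mean(perturbed):
    """The server recovers only the aggregate: because the noise is
    zero-mean, it cancels out over many clients and the sample mean
    still estimates the true mean."""
    return perturbed.mean()
```

The larger the client population, the tighter the reconstructed aggregate, while each individual report stays obscured.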
40. Discussion about "estimation of curie depth, geothermal gradient and near-surface heat flow from spectral analysis of aeromagnetic data in the loum-minta area (centre-east cameroon)".
- Author
-
Mouzong, Marcelin, Donald Njiteu Tchoukeu, Cyrille, Som Mbang, Constantin, Charles, Basseka, and Etame, Jacques
- Subjects
DATA analysis ,HEAT flux ,GEOLOGICAL modeling ,SEISMIC anisotropy - Abstract
This paper reviews the results obtained from recent studies of the crustal architecture in Cameroon. It also discusses the study of [1], "Estimation of Curie depth, geothermal gradient and near-surface heat flow from spectral analysis of aeromagnetic data in the Loum-Minta area (Centre-East Cameroon)". The paper published by [1] is an application of the spectral method for the determination of Curie depths, followed by interpretations of the local thermal structure (thermal gradient, heat flux) of the crust. However, some outliers are notable and appear, on first analysis, to stem from a poor meshing of the data using windows of 27.75 × 27.75 km, as well as from the failure to take into account various parameters (seismic, geological, tectonic) that would help better calibrate the results. With this communication, we would like to point out and clarify some errors in the paper published by [1], based on a global review of the thermal behaviour of the crust. It is strongly recommended that information on regional geology and seismicity be used to estimate the Curie depths, and that other independent data be integrated to determine the thermal and mechanical behaviour of the crust in Cameroon. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
41. Big Data Analysis and Forecast of Employment Position Requirements for College Students.
- Author
-
Yong Wei, Yinle Zheng, and Ning Li
- Subjects
PYTHON programming language ,BIG data ,COLLEGE students ,NATURAL language processing ,DATA analysis ,EMPLOYMENT forecasting ,RANDOM fields ,PROGRAMMING languages - Abstract
With the help of natural language processing and machine learning, we can analyze the online recruitment texts posted by employer companies and mine the requirement features of these employment positions, which is meaningful work with practical value. However, existing methods for analyzing online recruitment information are too simple to handle the mass of data on the Internet, so this paper studies the analysis and forecasting of employment position requirements for college students based on big data analysis. First, the recruitment information of companies is preprocessed: keywords in the recruitment text are extracted by the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm in the Python Chinese word segmentation toolkit, words in the recruitment text are segmented using the open-source tool Word2Vec, and unregistered words in the recruitment text are identified based on the Conditional Random Field (CRF) model. Then, this paper compares the occurrence probability of skills learned by college students in a given company's employment positions with the occurrence probability of those skills across all employment positions on the website, so as to find the core skills learned by college students that match the job positions required by companies. Finally, this paper builds an XGBoost model to forecast employment position requirements for college students and verifies the validity of the model with experimental results. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
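The TF-IDF keyword-extraction step can be sketched in a few lines. This is a bare-bones illustration over whitespace-tokenized postings, not the paper's Chinese-segmentation pipeline:

```python
import math
from collections import Counter

def tfidf_keywords(docs, top_k=3):
    """Score each term in each posting by term frequency times inverse
    document frequency, then return the top-scoring terms per posting.
    Terms appearing in every posting get an IDF of zero and drop out."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter(t for doc in tokenized for t in set(doc))  # document frequency
    out = []
    for doc in tokenized:
        tf = Counter(doc)
        scores = {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        out.append([t for t, _ in sorted(scores.items(), key=lambda x: -x[1])[:top_k]])
    return out
```

A ubiquitous term like "sql" in every posting scores zero, while posting-specific skills rise to the top.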
42. Computer data analysis to estimate the effects of government effectiveness and voice accountability on Research and Development in Latin American countries (2002-2021).
- Author
-
Ortega, José Torres, Rosa, Jorge Ortega De La, and Albor, Gustavo Rodriguez
- Subjects
RESEARCH & development ,POLITICAL participation ,HUMAN voice ,COMPUTERS ,ECONOMIC expansion - Abstract
This paper examines the relationship among Research and Development (R&D) expenditure, government effectiveness, and voice accountability in 15 Latin American countries. R&D expenditure is a crucial driver of innovation and economic growth, and effective governance and voice accountability are essential for promoting transparency and citizen participation in decision-making processes. The study utilizes a panel dataset from 2002 to 2021 and employs computer data analysis to estimate the effects of changes in government effectiveness and voice accountability on R&D expenditure. The findings reveal a positive impact of government effectiveness and voice accountability on R&D expenditure. A 1% increase in government effectiveness is associated with a 0.618% increase in research development expenditure, while a 1% rise in voice accountability corresponds to a 0.625% increase. These results highlight the importance of effective governance and citizen engagement in fostering R&D investment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. The proletarianization of teachers as an effect of neoliberal policies in education.
- Author
-
Calvo García, Guadalupe, García Gómez, Teresa, and Vázquez Recio, Rosa
- Subjects
EDUCATIONAL standards ,PROLETARIANIZATION ,NEOLIBERALISM ,TEACHERS ,CURRICULUM evaluation ,DATA analysis - Abstract
Copyright of Revista Electrónica Interuniversitaria De Formación del Profesorado is the property of Asociacion Universitaria de Formacion del Profesorado (AUFOP) and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
44. A Principled Distributional Approach to Trajectory Similarity Measurement and its Application to Anomaly Detection.
- Author
-
Wang, Yufan, Wang, Zijing, Ting, Kai Ming, and Shang, Yuanyi
- Subjects
ANOMALY detection (Computer security) ,UNIQUENESS (Mathematics) ,DATA analysis ,MATHEMATICAL models ,MATHEMATICAL analysis - Abstract
This paper aims to solve two enduring challenges in existing trajectory similarity measures: computational inefficiency and the absence of the ‘uniqueness’ property that should be guaranteed in a distance function: dist(X, Y) = 0 if and only if X = Y, where X and Y are two trajectories. In this work, we present a novel approach utilizing a distributional kernel for trajectory representation and similarity measurement, based on the kernel mean embedding framework. To the best of our knowledge, this is the first time a distributional kernel has been used for trajectory representation and similarity measurement. Our method does not rely on the point-to-point distances used in most existing trajectory distances. Unlike prevalent learning and deep learning approaches, our method requires no learning. We show the generality of this new approach in anomalous trajectory and sub-trajectory detection. We identify that the distributional kernel has (i) a data-dependent property and the ‘uniqueness’ property, which are the key factors that lead to its superior task-specific performance, and (ii) a runtime orders of magnitude faster than existing distance measures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
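The kernel mean embedding idea in this abstract can be sketched concretely: treat each trajectory as a set of points, embed it as the mean of kernel feature maps, and compare embeddings. The sketch below uses a Gaussian kernel and the (biased) MMD² estimate as the distance; this is an illustrative stand-in, not the paper's data-dependent kernel or implementation:

```python
import numpy as np

def gauss_gram(A, B, gamma=1.0):
    # Gram matrix of the Gaussian kernel between two point sets.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(X, Y, gamma=1.0):
    # Squared distance between the kernel mean embeddings of X and Y.
    # With a characteristic kernel this is zero iff the two empirical
    # point distributions coincide -- the 'uniqueness' property.
    return (gauss_gram(X, X, gamma).mean()
            - 2.0 * gauss_gram(X, Y, gamma).mean()
            + gauss_gram(Y, Y, gamma).mean())

traj_a = np.array([[0.0, 0.0], [1.0, 0.1], [2.0, 0.2]])
traj_b = np.array([[0.0, 2.0], [1.0, 2.1], [2.0, 2.2]])
print(mmd2(traj_a, traj_a))  # 0 for identical trajectories
print(mmd2(traj_a, traj_b))  # > 0 for different trajectories
```

Note that no point-to-point alignment (as in DTW or Hausdorff distances) is needed: the cost is dominated by the Gram-matrix computations.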
45. On Mitigating the Utility-Loss in Differentially Private Learning: A New Perspective by a Geometrically Inspired Kernel Approach.
- Author
-
Kumar, Mohit, Moser, Bernhard A., and Fischer, Lukas
- Subjects
MACHINE learning ,PRIVACY ,MEDICAL sciences ,DATA analysis ,ACCURACY - Abstract
The privacy-utility tradeoff remains one of the fundamental issues of differentially private machine learning. This paper introduces a geometrically inspired kernel-based approach to mitigate the accuracy-loss issue in classification. In this approach, a representation of the affine hull of the given data points is learned in a Reproducing Kernel Hilbert Space (RKHS). This leads to a novel distance measure that hides privacy-sensitive information about individual data points and improves the privacy-utility tradeoff by significantly reducing the risk of membership inference attacks. The effectiveness of the approach is demonstrated through experiments on the MNIST dataset, the Freiburg groceries dataset, and a real biomedical dataset, and the approach is verified to remain computationally practical. The application of the approach to federated learning is also considered, and the accuracy loss due to the data being distributed is observed to be marginal at most. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
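The geometric object in this abstract, the affine hull of data points in an RKHS, admits a compact kernel-trick computation. The sketch below (an illustrative assumption about the construction, not the paper's exact method) computes the squared RKHS distance from a query point to the affine hull of a data set using only kernel evaluations, by solving the KKT system of the constrained least-squares problem:

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    # Gaussian (RBF) kernel matrix between point sets A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def affine_hull_dist2(X, x, gamma=0.5):
    # Squared RKHS distance from phi(x) to the affine hull of {phi(x_i)}:
    # minimise ||phi(x) - sum_i w_i phi(x_i)||^2  s.t.  sum_i w_i = 1,
    # solved via the bordered KKT system [[K, 1], [1^T, 0]] [w; mu] = [kx; 1].
    n = len(X)
    K = rbf(X, X, gamma)                  # Gram matrix of the data
    kx = rbf(X, x[None, :], gamma)[:, 0]  # kernel values k(x_i, x)
    A = np.block([[K, np.ones((n, 1))], [np.ones((1, n)), np.zeros((1, 1))]])
    w = np.linalg.solve(A, np.append(kx, 1.0))[:n]
    return float(rbf(x[None, :], x[None, :], gamma)[0, 0]
                 - 2.0 * kx @ w + w @ K @ w)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(affine_hull_dist2(X, X[0]))                  # ~0: data point is in the hull
print(affine_hull_dist2(X, np.array([3.0, 3.0])))  # > 0: far from the hull
```

The distance depends on the data only through the Gram matrix, which is what makes it a candidate for hiding information about individual points.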
46. Iterative Train Scheduling under Disruption with Maximum Satisfiability.
- Author
-
Lemos, Alexandre, Gouveia, Filipe, Monteiro, Pedro T., and Lynce, Ines
- Subjects
COMPUTER algorithms ,ENERGY consumption ,DATA analysis - Abstract
This paper proposes an iterative Maximum Satisfiability (MaxSAT) approach to solving train scheduling optimization problems. The generation of railway timetables is known to be intractable even for a single track. We consider hundreds of trains on interconnected multi-track railway networks with complex connections between trains. Furthermore, the proposed algorithm is incremental, which reduces the impact of time discretization. The performance of our approach is evaluated on the real-world Swiss Federal Railway (SBB) Crowd Sourcing Challenge benchmark and the Periodic Event Scheduling Problem benchmark (PESPLib). The proposed approach is shown to be, on average, twice as fast as the best existing solution for the SBB instances. In addition, we achieve a significant improvement over SAT-based solutions on the PESPLib instances. We also analyze real schedule data from Switzerland and the Netherlands to create a disruption generator based on probability distributions. The novel incremental algorithm solves the train scheduling problem under disruptions with better performance than traditional algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
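The MaxSAT framing in this abstract separates hard clauses (physical feasibility) from weighted soft clauses (timetable preferences). A toy brute-force version of that encoding (a deliberately tiny sketch with hypothetical variables, nothing like the paper's incremental encoding or a real MaxSAT solver) makes the idea concrete:

```python
import itertools

# Booleans x[t][s] mean "train t uses the shared track segment in slot s".
# Hard clauses enforce feasibility; soft clauses prefer the published slot.
TRAINS, SLOTS = 2, 3
preferred = [1, 1]  # a disruption forces both trains to want slot 1

def hard_ok(x):
    # Hard clauses: exactly one slot per train, no two trains share a slot.
    if any(sum(x[t]) != 1 for t in range(TRAINS)):
        return False
    return all(sum(x[t][s] for t in range(TRAINS)) <= 1 for s in range(SLOTS))

best = None
for bits in itertools.product([0, 1], repeat=TRAINS * SLOTS):
    x = [bits[t * SLOTS:(t + 1) * SLOTS] for t in range(TRAINS)]
    if not hard_ok(x):
        continue  # hard clauses can never be violated
    # Soft-clause weight: one unit per train running in its preferred slot.
    score = sum(x[t][preferred[t]] for t in range(TRAINS))
    if best is None or score > best[0]:
        best = (score, x)

print(best)  # at most one train can keep its preferred slot
```

A real MaxSAT solver replaces the exhaustive loop with clause learning, and the paper's incremental variant refines the time discretization between solver calls rather than re-encoding from scratch.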
47. conDENSE: Conditional Density Estimation for Time Series Anomaly Detection.
- Author
-
Moore, Alex and Morelli, Davide
- Subjects
ANOMALY detection (Computer security) ,TIME series analysis ,DEEP learning ,AUTOREGRESSIVE models ,DATA analysis - Abstract
In recent years, deep learning methods based on reconstruction errors have facilitated huge improvements in unsupervised anomaly detection. These methods make the limiting assumption that the greater the distance between an observation and a prediction, the lower the likelihood of that observation. In this paper we propose conDENSE, a novel anomaly detection algorithm that does not use reconstruction errors but instead performs conditional density estimation with masked autoregressive flows. By directly estimating the likelihood of the data, our model moves beyond approximating expected behaviour with a single point estimate, as is the case in reconstruction-error models. We show how conditioning on a dense representation of the current trajectory, extracted from a variational autoencoder with a gated recurrent unit (GRU-VAE), produces a model that is suitable for periodic datasets while also improving performance on non-periodic ones. Experiments on 31 time series, including real-world anomaly detection benchmark datasets and synthetically generated data, show that the model can outperform state-of-the-art deep learning methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
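The core shift described in this abstract, scoring anomalies by conditional likelihood rather than reconstruction error, can be sketched with a much simpler density model. Below, a conditional Gaussian p(x_t | x_{t-1}) stands in for the paper's masked autoregressive flow conditioned on a GRU-VAE embedding (the data and model here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(400)
series = np.sin(2 * np.pi * t / 50) + rng.normal(0, 0.05, len(t))
series[300] += 2.0  # injected anomaly

prev, curr = series[:-1], series[1:]
# Fit x_t ~ a * x_{t-1} + b with Gaussian residuals.
a, b = np.polyfit(prev, curr, 1)
resid = curr - (a * prev + b)
sigma = resid.std()
# Negative log-likelihood of each step under the conditional Gaussian:
# low likelihood (high NLL) flags the anomaly directly, with no
# reconstruction step involved.
nll = 0.5 * (resid / sigma) ** 2 + np.log(sigma * np.sqrt(2 * np.pi))
print(int(np.argmax(nll)) + 1)  # index of the least likely observation
```

The flow-based model in the paper plays the same role as this Gaussian, but can represent multimodal and heteroscedastic conditional densities that a single point estimate (or a single Gaussian) cannot.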
48. AUTOMATIC ANALYSIS OF X (TWITTER) DATA FOR SUPPORTING DEPRESSION DIAGNOSIS.
- Author
-
Królak, Aleksandra, Wiktorski, Tomasz, and Żmudzińska, Aleksandra
- Subjects
MENTAL depression ,SENTIMENT analysis ,SOCIAL media ,DIAGNOSIS ,SCHEDULING ,DATA scrubbing ,USER-generated content - Abstract
Depression is an increasingly common problem that often goes undiagnosed. The aim of this paper was to determine whether an analysis of tweets can serve as a proxy for assessing depression levels in society. The work applied keyword-based sentiment analysis, enhanced to exclude informational tweets about depression or about recovery. The results identified the words used most often in the posts and the emotional polarity of the tweets. A schedule of user activity was mapped out, and trends in the daily activity of users were analyzed. The identified X (Twitter) activity related to depression corresponded well with reports on persons with depression and with statistics on suicide deaths. It could therefore be construed that people with undiagnosed depression express their feelings on social media more often, looking in this way for help with their emotional problems. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
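The keyword-plus-filtering pipeline described in this abstract can be sketched in a few lines. The lexicons and filtering rules below are hypothetical placeholders, not those used in the study: tweets mentioning depression are kept only if they look personal rather than informational, then scored with a tiny polarity lexicon:

```python
# Hypothetical lexicons for illustration only.
INFORMATIONAL = {"awareness", "study", "article", "statistics", "recovery"}
NEGATIVE = {"hopeless", "alone", "exhausting", "worthless"}
POSITIVE = {"better", "hope", "grateful"}

def words(tweet):
    # Crude tokenizer: lowercase, strip simple punctuation, split.
    return set(tweet.lower().replace(",", " ").replace(".", " ").split())

def is_personal(tweet):
    # Keep depression-related tweets that carry no informational markers.
    w = words(tweet)
    return "depression" in w and not (w & INFORMATIONAL)

def polarity(tweet):
    # Positive minus negative lexicon hits.
    w = words(tweet)
    return len(w & POSITIVE) - len(w & NEGATIVE)

tweets = [
    "New article on depression awareness week",  # informational: filtered out
    "feeling hopeless and alone, this depression is exhausting",
    "my depression feels a bit better today, there is hope",
]
kept = [t for t in tweets if is_personal(t)]
print([(t, polarity(t)) for t in kept])
```

The study's enhancement, excluding informational and recovery tweets, corresponds to the `is_personal` filter; aggregating polarity over time would then yield the activity schedule the abstract describes.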
49. The Factors Influencing Psychological Distress Among Striking Workers in Nigeria in the Post-COVID Pandemic Era.
- Author
-
Ewah, Patrick Ayi, Womboh, Idoo, Awhen, Peter Agba, and Dan, Felicia Agbor-Obun
- Subjects
EMPLOYEE psychology ,STATISTICS ,CROSS-sectional method ,MULTIPLE regression analysis ,MANN Whitney U Test ,STRIKES & lockouts ,PHYSICAL activity ,CRONBACH'S alpha ,SOCIAL classes ,DESCRIPTIVE statistics ,QUESTIONNAIRES ,STATISTICAL sampling ,DATA analysis ,PSYCHOLOGICAL distress ,COVID-19 pandemic - Abstract
Objectives: Withholding workers' salaries for months as a punishment for engaging in a strike may negatively affect their psychological wellbeing. This study assessed the correlations among physical activity, psychological distress, and socioeconomic status, and explored the factors influencing psychological distress among striking workers in Nigeria. Methods: This cross-sectional face-to-face and online study conveniently sampled a total of 234 lecturers aged 27-69 years. Sociodemographic, physical, socioeconomic, and psychological distress data were assessed with the University Strike Physical and Psychological Distress Questionnaire (USPAPDQ). Data analysis included descriptive statistics, multiple linear regression, Spearman's correlation, and the Mann-Whitney U test, with P<0.05 as the level of significance. Results: The mean age, number of papers, and exercise frequency and duration were 45.4±10.36 years, 4±5.82, 2.19±1.63 days/week, and 30.49±29.82 minutes/day, respectively. A significant inverse relationship was established between anxiety and age (r=-0.27; P<0.01), contemplating changing one's job (r=-0.40; P<0.01), number of children (r=-0.19; P<0.01), academic rank (r=-0.27; P<0.01), and frequency (r=-0.18; P<0.01) and duration (r=-0.16; P=0.02) of exercise. The significant predictors of anxiety were marital status (β=-0.207, P<0.01), contemplating changing one's job if the strike continues (β=-0.198, P<0.01), seeing anything positive about the strike (β=0.178, P<0.01), and number of children (β=-0.193, P<0.01). The significant predictors of depression were an alternate source of income (β=0.126, P=0.04), contemplating changing one's job if the strike continues (β=-0.149, P=0.03), seeing anything positive about the strike (β=0.118, P=0.05), and time (hours) spent watching television (β=0.124, P=0.03).
Discussion: Overall, the significant negative predictors of psychological distress were marital status, contemplating changing one's job, and number of children. The positive predictors were seeing anything positive about the strike, an alternate source of income, and time spent watching television. The government may prevent the recurrence of strikes by honouring existing agreements. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
50. Intelligent identification and real-time warning method of diverse complex events in horizontal well fracturing.
- Author
-
YUAN, Bin, ZHAO, Mingze, MENG, Siwei, ZHANG, Wei, and ZHENG, He
- Subjects
HORIZONTAL wells ,HYDRAULIC fracturing ,DEEP learning ,ARTIFICIAL neural networks ,DATA analysis - Abstract
Existing approaches for identifying events in horizontal well fracturing are laborious, time-consuming, inaccurate, and incapable of real-time warning. By improving data analysis and deep learning algorithms, together with an analysis of the data and information from horizontal well fracturing in shale gas reservoirs, this paper presents a method for intelligent identification and real-time warning of diverse complex events in horizontal well fracturing. An identification model for "point" events in fracturing is established based on the Att-BiLSTM neural network, along with the broad learning system (BLS) and the BP neural network; it realizes the intelligent identification of the start/end of fracturing, formation breakdown, instantaneous shut-in, and other events, with an accuracy of over 97%. An identification model for "phase" events in fracturing is established based on an enhanced Unet++ network; it realizes the intelligent identification of pump ball, pre-acid treatment, temporary plugging fracturing, sand plugging, and other events, with an error of less than 0.002. Moreover, a real-time prediction model for fracturing pressure is built based on the Att-BiLSTM neural network, realizing real-time warning of diverse events in fracturing. The proposed method provides intelligent, efficient, and accurate identification of fracturing events to support decision-making. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
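The "point" events in this abstract, such as instantaneous shut-in, appear in the treating-pressure series as abrupt changes. A classical threshold sketch on synthetic pressure data (a deliberately simple stand-in, not the paper's Att-BiLSTM model) shows the signal such a detector works from:

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic treating pressure: steady pumping, then an abrupt drop
# representing an instantaneous shut-in event.
pressure = np.concatenate([
    np.full(100, 60.0),  # ~60 MPa during pumping
    np.full(50, 42.0),   # lower pressure after shut-in
]) + rng.normal(0, 0.3, 150)

# A "point" event shows up as a large first difference of the series.
diff = np.diff(pressure)
events = np.flatnonzero(np.abs(diff) > 5.0)  # threshold on the step change
print(events)  # sample index where the shut-in occurs
```

The learned model in the paper replaces this fixed threshold with a sequence classifier, which is what allows it to separate many event types (formation breakdown, shut-in, sand plugging) that all perturb the same pressure channel.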