93,743 results for "Data processing"
Search Results
2. Teaching Case: How Popular Is Your Name? Finding the Popularity of US Names Using Big Data Visualization
- Author
- Frank Lee and Alex Algarra
- Abstract
Exploratory data analysis (EDA), data visualization, and visual analytics are essential for understanding and analyzing complex datasets. In this project, we explored these techniques and their applications in data analytics. The case discusses Tableau, a powerful data visualization tool, and Google BigQuery, a cloud-based data warehouse that enables users to store, query, and analyze large datasets. It also explores the benefits and applications of both tools and their integration with other platforms and services. The project offers an introductory insight into Tableau's functionalities, employing a data file from the US Census Bureau accessed via Google BigQuery.
- Published
- 2024
3. Using R for Multivariate Meta-Analysis on Educational Psychology Data: A Method Study
- Author
- Gamon Savatsomboon, Prasert Ruannakarn, Phamornpun Yurayat, Ong-art Chanprasitchai, and Jibon Kumar Sharma Leihaothabam
- Abstract
Using R to conduct univariate meta-analyses is becoming common in published research. R can also conduct multivariate meta-analysis (MMA), but newcomers to both R and MMA may find the combination daunting: R is not easy for those unfamiliar with coding, and MMA is a topic of advanced statistics. If this holds, it can be viewed as a practice gap: researchers are not able to use R to conduct MMA in practice. This is problematic. This paper alleviates that gap by illustrating how to use R (the metaSEM package) to conduct MMA on educational psychology data. The metaSEM package is used to obtain the required MMA text outputs; because it cannot produce the required graphical outputs, the metafor package is also used, as a complement, to generate them. Ultimately, we hope that our audience will be able to apply what they learn from this method paper to conduct MMA using R in their teaching, research, and publication.
- Published
- 2024
4. SBS Feature Selection and AdaBoost Classifier for Specialization/Major Recommendation for Undergraduate Students
- Author
- Nesrine Mansouri, Mourad Ab, and Makram Soui
- Abstract
Selecting an undergraduate major or specialization is a crucial decision for students, since it considerably impacts their educational and career paths. Moreover, their decisions should match their academic background, interests, and goals so that they can pursue their passions and explore various career paths with motivation. However, the decision remains challenging because students are often unfamiliar with the job market and the skills it demands, and finding the proper placement in a major is not straightforward. An automatic recommendation system can therefore help guide students toward the right decision. In this context, we developed a machine learning model to predict and recommend suitable specializations for undergraduate students according to the job market and students' academic history. Two hundred twenty-five student records were used in this work. The proposed approach encompasses four major steps: data preprocessing to clean, scale, and prepare the data for training and avoid suboptimal results; oversampling to equalize the class distribution and prevent the model from becoming biased or generalizing poorly; feature selection using Sequential Backward Selection (SBS) to extract the relevant features, improve the outcomes, and reduce the risk of noise; and training on the selected subset with the AdaBoost classifier. We deployed a genetic algorithm to optimize the classifier's hyperparameters and maximize results. The findings of this study compare favorably with existing models, with an accuracy of 98.1%. The proposed model can reliably guide undergraduate students through decisions about selecting their major.
- Published
- 2024
- Full Text
- View/download PDF
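The pipeline this abstract describes (backward feature selection feeding a boosted classifier) can be sketched with scikit-learn. This is not the authors' code: the synthetic data stands in for the 225 student records, and all parameter values are illustrative.

```python
# Minimal sketch (not the authors' implementation): Sequential Backward
# Selection (SBS) followed by an AdaBoost classifier, on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for the 225 student records used in the paper.
X, y = make_classification(n_samples=225, n_features=10, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

ada = AdaBoostClassifier(n_estimators=50, random_state=0)
model = make_pipeline(
    StandardScaler(),                                  # clean/scale step
    SequentialFeatureSelector(ada, n_features_to_select=5,
                              direction="backward"),   # SBS step
    ada,                                               # final classifier
)
model.fit(X_tr, y_tr)
acc = model.score(X_te, y_te)
print(f"held-out accuracy: {acc:.3f}")
```

The paper's oversampling step and genetic-algorithm hyperparameter search are omitted here for brevity.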
5. Profile of State Data Capacity in 2019 and 2020: Statewide Longitudinal Data Systems (SLDS) Survey Descriptive Statistics. Stats in Brief. NCES 2022-051
- Author
- National Center for Education Statistics (NCES) (ED/IES), Meholick, Sarah, Honey, Rose, and LaTurner, Jason
- Abstract
Statewide longitudinal data systems (SLDSs) can enable researchers, policymakers, and practitioners to identify and understand important relationships and trends across the education-to-workforce continuum. A well-developed SLDS can increase state and territory governments' ability to establish more informed and equitable policies, enable agency leaders to act more strategically, and help practitioners make more data-informed decisions. The SLDS Survey was created to capture information about the data capacity of states' and territories' SLDSs across these varying circumstances. In addition to inventorying whether a given data type, link, or use is in place, the SLDS Survey explores the development of SLDSs and their varying degrees of implementation. By providing standard measures for various aspects of data capacity, the SLDS Survey helps stakeholders understand and assess the ability of SLDSs to store, manage, link, and use key data types across the preschool-through-workforce (P-20W+) spectrum. This Statistics in Brief provides aggregate data from the 2019 and 2020 administrations of the SLDS Survey. The primary focus is the 2020 SLDS Survey, with comparisons to results from the 2019 SLDS Survey. The brief addresses the following four research questions: (1) What types of K-12 data are included in the statewide longitudinal data system (SLDS)?; (2) What is the capacity for linking K-12 student data in the SLDS to other data, and how are the data linked?; (3) Are data dictionaries published publicly, and are data aligned to the Common Education Data Standards (CEDS)?; and (4) How do states and territories use data for reporting and decision-making?
- Published
- 2023
6. Co-ML: Collaborative Machine Learning Model Building for Developing Dataset Design Practices
- Author
- Tiffany Tseng, Matt J. Davidson, Luis Morales-Navarro, Jennifer King Chen, Victoria Delaney, Mark Leibowitz, Jazbo Beason, and R. Benjamin Shapiro
- Abstract
Machine learning (ML) models are fundamentally shaped by data, and building inclusive ML systems requires significant considerations around how to design representative datasets. Yet, few novice-oriented ML modeling tools are designed to foster hands-on learning of dataset design practices, including how to design for data diversity and inspect for data quality. To this end, we outline a set of four data design practices (DDPs) for designing inclusive ML models and share how we designed a tablet-based application called Co-ML to foster learning of DDPs through a collaborative ML model building experience. With Co-ML, beginners can build image classifiers through a distributed experience where data is synchronized across multiple devices, enabling multiple users to iteratively refine ML datasets in discussion and coordination with their peers. We deployed Co-ML in a 2-week-long educational AIML Summer Camp, where youth ages 13-18 worked in groups to build custom ML-powered mobile applications. Our analysis reveals how multi-user model building with Co-ML, in the context of student-driven projects created during the summer camp, supported development of DDPs including incorporating data diversity, evaluating model performance, and inspecting for data quality. Additionally, we found that students' attempts to improve model performance often prioritized learnability over class balance. Through this work, we highlight how the combination of collaboration, model testing interfaces, and student-driven projects can empower learners to actively engage in exploring the role of data in ML systems.
- Published
- 2024
- Full Text
- View/download PDF
7. Mathematics Intelligent Tutoring Systems with Handwritten Input: A Scoping Review
- Author
- Luiz Rodrigues, Filipe Dwan Pereira, Marcelo Marinho, Valmir Macario, Ig Ibert Bittencourt, Seiji Isotani, Diego Dermeval, and Rafael Mello
- Abstract
Intelligent Tutoring Systems (ITSs) have been widely used to enhance math learning, and teachers' involvement is prominent in achieving their full potential. Usually, ITSs depend on direct interaction between students and a computer. Recently, researchers have started exploring handwritten input (e.g., from paper sheets) to provide equitable access to ITSs' benefits. However, research on math ITSs' ability to handle handwritten input is limited and, to the best of our knowledge, no study has summarized the state of the art. This article fills that gap with a scoping review of the handwriting recognition methods, characteristics, and applications of math ITSs compatible with handwritten input. Based on a search of 11 databases, we found eight primary studies that met our criteria. Notably, all of the ITSs depend on receiving handwritten input from a touchscreen interface rather than recognizing solutions developed on paper. We also found that most ITSs focus on similar audiences (e.g., English-speaking students), subjects (e.g., algebraic questions), and applications (e.g., in-class use to understand student perceptions). Thus, toward enabling equitable access to ITSs, we propose the ITS Unplugged concept (i.e., ITSs that (i) run on low-cost, resource-restricted devices with little to no internet connection and (ii) receive and return information in the format target users usually use) and contribute a research agenda concerning the challenges of developing such ITSs.
- Published
- 2024
- Full Text
- View/download PDF
8. Innovations in Exploring Sequential Process Data
- Author
- Esther Ulitzsch, Qiwei He, and Steffi Pohl
- Abstract
This is an editorial for the special issue "Innovations in Exploring Sequential Process Data" in the journal Zeitschrift für Psychologie. Process data refer to log files generated by human-computer interactive items. They document the entire process performed by a test-taker to complete a task, including keystrokes, mouse clicks, and the associated time stamps. Due to their vast potential for providing in-depth insights into how test-takers approached the administered tasks, process data from educational assessments have gained increasing attention in the psychometric, psychological, and educational science literature. However, data that support detailed documentation of test-takers' cognitive and behavioral processes, above and beyond the mere time required for solving the task, are commonly complex and versatile, and leveraging their potential is therefore not straightforward. Examples of such data are keystroke data, clickstream data, navigation behaviors, video streams, and eye-tracking data, to name a few. This topical issue highlights the potential and plurality of this still-developing field of research. It illustrates the large variety of sequential process data that can be obtained from computerized questionnaire and test administrations, the broad array of research questions and measurement challenges that can be tackled by making use of their rich information and sequential nature, and how these data can be used for theory building and evaluation alike. It further showcases how these complex data types can be synthesized from inherently contrasting viewpoints: drawing on exploratory machine learning and data mining techniques on the one hand and perceiving them through the lens of cognitive theory on the other.
- Published
- 2024
- Full Text
- View/download PDF
9. How to Open Science: Debugging Reproducibility within the Educational Data Mining Conference
- Author
- Haim, Aaron, Gyurcsan, Robert, Baxter, Chris, Shaw, Stacy T., and Heffernan, Neil T.
- Abstract
Despite increased efforts to assess the adoption of open science and the robustness of reproducibility in sub-disciplines of education technology, there is a lack of understanding of why some research is not reproducible. Prior work has taken the first step toward assessing the reproducibility of research but has assumed certain constraints that hinder its discovery. Thus, the purpose of this study was to replicate previous work on papers within the proceedings of the "International Conference on Educational Data Mining" to accurately report on which papers are reproducible and why. Specifically, we examined 208 papers, attempted to reproduce them, documented reasons for reproducibility failures, and asked authors to provide additional information needed to reproduce their studies. Our results showed that of the 12 papers that were potentially reproducible, only one successfully reproduced all analyses, and another two reproduced most of the analyses. The most common cause of failure was unstated library dependencies, followed by non-seeded randomness. [For the complete proceedings, see ED630829. Additional funding for this paper was provided by the U.S. Department of Education's Graduate Assistance in Areas of National Need (GAANN).]
- Published
- 2023
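The two failure modes named in this abstract have small, conventional remedies. The sketch below is illustrative, not taken from the papers studied: it shows seeding every source of randomness so a rerun reproduces the same numbers.

```python
# Illustrative fix for one common reproducibility failure: pin a seed for
# every random number generator used by the analysis.
import random
import numpy as np

SEED = 42                      # declare the seed once, at the top
random.seed(SEED)              # Python's built-in RNG
np.random.seed(SEED)           # NumPy's global RNG

sample = [random.random() for _ in range(3)]
noise = np.random.normal(size=3)
print(sample, noise)           # identical on every rerun
```

The other common failure, unstated library dependencies, is conventionally addressed by committing a `requirements.txt` or `environment.yml` alongside the analysis code.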
10. Investigating the Importance of Demographic Features for EDM-Predictions
- Author
- Cohausz, Lea, Tschalzev, Andrej, Bartelt, Christian, and Stuckenschmidt, Heiner
- Abstract
Demographic features are commonly used in Educational Data Mining (EDM) research to predict at-risk students. Yet the practice has to be considered highly problematic, both because of the data's sensitive nature and because (historic and representation) biases likely exist in the training data, raising strong fairness concerns. At the same time, despite their frequent use, the value of demographic features for prediction accuracy remains unclear. In this paper, we systematically investigate the importance of demographic features for at-risk prediction using several publicly available datasets from different countries. We find strong evidence that including demographic features does not lead to better-performing models as long as some study-related features exist, such as performance or activity data. Additionally, we show that models nonetheless place importance on these features when they are included in the data, although this is not necessary for accuracy. These findings, together with our discussion, strongly suggest that at-risk prediction should not include demographic features. Our code is available at: https://anonymous.4open.science/r/edm-F7D1. [For the complete proceedings, see ED630829.]
- Published
- 2023
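The paper's central claim (demographic features add nothing once study-related features are present) can be illustrated in miniature. This sketch uses synthetic data, not the paper's datasets: the label depends only on performance features, and the demographic columns carry no signal.

```python
# Illustrative sketch (synthetic data, not the paper's datasets): compare
# at-risk prediction accuracy with and without demographic features when
# study-related features carry all of the signal.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 1000
performance = rng.normal(size=(n, 3))            # grades, activity, etc.
demographics = rng.integers(0, 2, size=(n, 2))   # sensitive attributes, no signal
at_risk = (performance.sum(axis=1) + rng.normal(scale=0.3, size=n) < -1).astype(int)

X_full = np.hstack([performance, demographics])
acc_full = cross_val_score(RandomForestClassifier(random_state=0),
                           X_full, at_risk, cv=5).mean()
acc_study = cross_val_score(RandomForestClassifier(random_state=0),
                            performance, at_risk, cv=5).mean()
print(f"with demographics: {acc_full:.3f}, without: {acc_study:.3f}")
```

On data like this the two accuracies are nearly identical, mirroring the paper's finding that dropping demographic features costs nothing.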
11. A New Era for Data Analysis in Qualitative Research: ChatGPT!
- Author
- Mert Sen, Sevval Nur Sen, and Tugrul Gökmen Sahin
- Abstract
Today, the use of software for qualitative research analysis is rapidly becoming widespread among researchers. Researchers manage large data sets using features such as editing data, transcribing, creating codes, and searching within data. However, while software supports the mechanics of data analysis, interpreting the essence of the data remains the researcher's task. ChatGPT, an AI language model released by OpenAI, offers features such as text editing, generation, and summarization. In this research, considering these characteristics, we asked whether ChatGPT can be used for data analysis in qualitative research. A case study design, a qualitative research method, was used. The data consist of interview texts from two participants in an unpublished study. The texts were put through a qualitative analysis process using ChatGPT-4. The analysis was conducted in two ways: with codes, categories, and themes specified in advance, and without. In conclusion, we found that ChatGPT can create codes, categories, and themes, quote directly from the text, interpret data sets, and analyze the meaning at their core. In this context, the usability of ChatGPT for data analysis in qualitative research is discussed.
- Published
- 2023
12. Creating a Clear Vision for Rural Healthcare: A Data Analysis Exercise
- Author
- Christine Ladwig, Taylor Webber, and Dana Schwieger
- Abstract
Data is a powerful tool for the healthcare industry to use for managing, analyzing, and reporting on critical events in the field. The analysis of broad, salient data files aids healthcare businesses in uncovering hidden patterns, market trends, and customer preferences; these details may then be used to improve the quality and delivery of care to patients in an organization's community. In this case, students use simple data mining procedures to investigate issues a healthcare organization faces regarding regional and national population patterns, directions for facility and service expansion, and prospective staffing changes. The exercise highlights the use of data analysis as a planning tool for a mid-sized rural hospital with limited resources and may be used in an undergraduate or graduate level management information systems or healthcare information systems course to illustrate data analysis and visualization concepts, reporting, and data driven strategy development.
- Published
- 2023
13. A Data Pipeline for E-Large-Scale Assessments: Better Automation, Quality Assurance, and Efficiency
- Author
- Ryan Schwarz, H. Cigdem Bulut, and Charles Anifowose
- Abstract
The increasing volume of large-scale assessment data poses a challenge for testing organizations seeking to manage data and conduct psychometric analyses efficiently. Traditional psychometric software presents barriers, such as a lack of functionality for managing data and running the various standard psychometric analyses efficiently, and these challenges have made the desired research and analysis outcomes costly to achieve. To address them, we designed and implemented a modernized data pipeline that allows psychometricians and statisticians to efficiently manage data, conduct psychometric analyses, generate technical reports, and perform quality assurance to validate the required outputs. The pipeline has proven to scale to large databases, decrease human error by reducing manual processes, make complex workloads efficiently repeatable, ensure high-quality outputs, and reduce the overall costs of psychometric analysis of large-scale assessment data. This paper aims to support the modernization of current psychometric analysis practices. We share details of the workflow design and functionality of our modernized data pipeline, which provides a universal interface to large-scale assessments. Methods for developing non-technical, user-friendly interfaces are also discussed.
- Published
- 2023
14. An Examination of Digital Geography Games and Their Effects on Mathematical Data Processing and Social Studies Education Skills
- Author
- Demirci, Ömer and Ineç, Zekeriya Fatih
- Abstract
This study aims to advance the development of students' mathematical processing skills by suggesting the use of digital geography games. This includes an analysis of such games' contribution to the standard mathematics curriculum in areas such as data processing, as well as their contribution to social studies curricula in areas such as map literacy, location analysis, problem solving, and other skills related to understanding tables, graphs, and diagrams. The study also examines expert findings that relate to these dynamics. A case study of a digital geography game called Gezgin, developed by Ineç (2021), is examined in this research using qualitative research approaches. Under this framework, data obtained by critical case sampling from six experts through semi-structured online interviews were collected with cloud technologies and examined using content analysis. The findings showed that the use of Gezgin can help develop skills related to data processing, problem solving, problem formulation, general mathematical processes, the ability to transfer mathematics to real life, and the ability to create and correctly interpret tables and graphs. The findings also indicate that the real-life context provided by the game supports educational development in map literacy, location analysis, problem solving, and the creation and interpretation of graphs, tables, and diagrams for social studies curricula. Expert opinions about Gezgin were mostly positive, especially with respect to its interdisciplinary structure and rich content, but some of its functions were found to be limited.
- Published
- 2023
15. Documentation for the 2017-18 National Teacher and Principal Survey. NCES 2022-718
- Author
- National Center for Education Statistics (NCES) (ED/IES), Cox, Shawna, Gilary, Aaron, Simon, Dillon, and Thomas, Teresa
- Abstract
The National Center for Education Statistics (NCES) sponsors the National Teacher and Principal Survey (NTPS) on behalf of the U.S. Department of Education in order to collect data on public and private elementary and secondary schools in the United States. The NTPS is a large-scale, nationally representative sample survey of K-12 public and private schools and the principals/administrators and teachers who staff them. The NTPS replaced the Schools and Staffing Survey (SASS) in 2015; SASS had historically collected the information necessary to form a complete picture of elementary and secondary education in the United States. The NTPS has a different structure and sample from previous administrations of SASS; however, it maintains the same focus on schools and their teachers and administrators that was traditionally held by the SASS. Like SASS, the NTPS provides a wide range of opportunities for analysis and reporting on elementary and secondary educational issues. The 2017-18 NTPS consisted of three questionnaires for three target populations: a school questionnaire, a principal questionnaire, and a teacher questionnaire. This documentation report provides information about all phases of the NTPS, from survey questionnaire revisions to survey data collection and all phases of data processing.
- Published
- 2022
16. Taking the Next Step in Exploring the 'Literary Digest' 1936 Poll
- Author
- Beth Chance, Andrew Kerr, and Jett Palmer
- Abstract
While many instructors are aware of the "Literary Digest" 1936 poll as an example of biased sampling methods, this article details potential further explorations for the "Digest's" 1924-1936 quadrennial U.S. presidential election polls. Potential activities range from lessons in data acquisition, cleaning, and validation, to basic data literacy and visualization skills, to exploring one or more methods of adjustment to account for bias based on information collected at that time. Students can also compare how those methods would have performed. One option could be to give introductory students a first look at the idea of "sampling adjustment" and how this principle can be used to account for difficulties in modern polling, but the context is rich in other opportunities that can be discussed at various times in the course or in more advanced sampling courses.
- Published
- 2024
- Full Text
- View/download PDF
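The "sampling adjustment" idea this abstract mentions can be shown with a toy post-stratification calculation. The numbers below are made up for illustration, not the actual 1936 data: a stratum that leans toward candidate A is overrepresented in the sample, and reweighting each stratum to its assumed population share pulls the estimate back down.

```python
# Toy post-stratification sketch (hypothetical numbers): reweight a biased
# sample so each stratum counts in proportion to its population share.
population_share = {"car_owner": 0.55, "non_owner": 0.45}    # assumed frame
sample = {"car_owner": {"n": 800, "candidate_A": 0.65},      # overrepresented
          "non_owner": {"n": 200, "candidate_A": 0.40}}      # underrepresented

n_total = sum(s["n"] for s in sample.values())
raw = sum(s["n"] * s["candidate_A"] for s in sample.values()) / n_total
adjusted = sum(population_share[g] * sample[g]["candidate_A"] for g in sample)
print(f"raw estimate: {raw:.4f}, post-stratified: {adjusted:.4f}")
```

Here the raw sample estimate is 0.60 for candidate A, while the post-stratified estimate is 0.5375, the kind of correction students can explore against the Digest's actual polling error.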
17. Testing Computational Assessment of Idea Novelty in Crowdsourcing
- Author
- Kai Wang, Boxiang Dong, and Junjie Ma
- Abstract
On crowdsourcing ideation websites, companies can easily collect large numbers of ideas. Screening such a volume of ideas is costly and challenging, necessitating automatic approaches. Automatically evaluating idea novelty would be particularly useful, since companies commonly seek novel ideas. Four computational approaches were tested, based on Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), term frequency-inverse document frequency (TF-IDF), and Global Vectors for Word Representation (GloVe), respectively. These approaches were applied to three sets of ideas, and the computed novelty scores, along with crowd evaluations, were compared with human expert evaluations. The computational methods do not differ significantly in their correlation coefficients with expert ratings, although the TF-IDF-based measure achieved a correlation above 0.40 in two of the three tasks. Crowd evaluation outperforms all of the computational methods. Overall, our results show that the tested computational approaches do not yet match human judgment well enough to replace it.
- Published
- 2024
- Full Text
- View/download PDF
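One of the four approaches, the TF-IDF-based measure, can be sketched compactly. This is not the authors' implementation; it scores an idea's novelty as one minus its mean TF-IDF cosine similarity to every other idea in the pool, and the three ideas are invented examples.

```python
# Sketch of a TF-IDF novelty score (not the authors' implementation):
# novelty = 1 - mean cosine similarity to all other ideas.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

ideas = [
    "a phone case with a built-in bottle opener",
    "a phone case with a built-in stand",
    "a solar powered backpack that charges devices",
]
tfidf = TfidfVectorizer().fit_transform(ideas)
sim = cosine_similarity(tfidf)
np.fill_diagonal(sim, 0.0)                 # ignore self-similarity
novelty = 1.0 - sim.sum(axis=1) / (len(ideas) - 1)
for score, idea in sorted(zip(novelty, ideas), reverse=True):
    print(f"{score:.2f}  {idea}")
```

The backpack idea shares no vocabulary with the two phone-case ideas, so it receives the highest novelty score, which is the intuition behind distance-based novelty measures.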
18. Optimizing Large-Scale Educational Assessment with a 'Divide-and-Conquer' Strategy: Fast and Efficient Distributed Bayesian Inference in IRT Models
- Author
- Sainan Xu, Jing Lu, Jiwei Zhang, Chun Wang, and Gongjun Xu
- Abstract
With the growing attention on large-scale educational testing and assessment, the ability to process substantial volumes of response data becomes crucial. Current estimation methods within item response theory (IRT), despite their high precision, often pose considerable computational burdens with large-scale data, leading to reduced computational speed. This study introduces a novel "divide and conquer" parallel algorithm built on the Wasserstein posterior approximation concept, aiming to enhance computational speed while maintaining accurate parameter estimation. This algorithm enables drawing parameters from segmented data subsets in parallel, followed by an amalgamation of these parameters via Wasserstein posterior approximation. Theoretical support for the algorithm is established through asymptotic optimality under certain regularity assumptions. Practical validation is demonstrated using real-world data from the Programme for International Student Assessment. Ultimately, this research proposes a transformative approach to managing educational big data, offering a scalable, efficient, and precise alternative that promises to redefine traditional practices in educational assessments. [This paper will be published in "Psychometrika."]
- Published
- 2024
- Full Text
- View/download PDF
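The amalgamation step of such a divide-and-conquer scheme can be illustrated in one dimension. This is a simplification, not the paper's algorithm: for a scalar parameter whose subset posteriors are approximately Gaussian, the Wasserstein-2 barycenter is again Gaussian, with mean and standard deviation equal to the averages of the subset means and standard deviations. The subset draws here are simulated rather than produced by IRT sampling.

```python
# Sketch of the Wasserstein amalgamation step only (a 1-D simplification):
# average the subset-posterior means and standard deviations.
import numpy as np

rng = np.random.default_rng(1)
# Draws from K = 3 subset posteriors for one item parameter (simulated).
subposterior_draws = [rng.normal(loc=m, scale=s, size=5000)
                      for m, s in [(0.9, 0.22), (1.1, 0.18), (1.0, 0.20)]]

means = np.array([d.mean() for d in subposterior_draws])
sds = np.array([d.std(ddof=1) for d in subposterior_draws])
barycenter_mean = means.mean()    # ~1.0 for these simulated subsets
barycenter_sd = sds.mean()        # ~0.2
print(f"combined posterior: N({barycenter_mean:.2f}, {barycenter_sd:.2f}^2)")
```

Because each subset is sampled in parallel and only these summary statistics are combined, the approach scales with the number of data shards rather than the full sample size.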
19. Design and Practice of Japanese Interactive Teaching Systems in Colleges and Universities under the Background of Big Data
- Author
- Hong Xiao
- Abstract
Against the background of big data, this paper introduces a blended teaching model into the secondary vocational Japanese oral classroom and explores whether the model improves oral Japanese learning and teaching effectiveness. To make the research scientific and effective, the paper draws on a large body of literature. First, the purpose, content, and methods of the research are clarified; second, the relevant literature is reviewed and summarized. A SPOC-based blended teaching design was then developed, and teaching experiments were carried out accordingly. Finally, the data collected during the experiments were compared and analyzed, and the research results were verified against objective data. The research shows that the use of a Japanese language teaching system enhances students' learning experience, promotes effective use of time, and improves overall learning outcomes.
- Published
- 2024
- Full Text
- View/download PDF
20. Scalable and Versatile Hardware Acceleration of Graph Neural Networks
- Author
- Sudipta Mondal
- Abstract
Graph neural networks (GNNs) are vital for analyzing real-world problems (e.g., network analysis, drug interaction, electronic design automation, e-commerce) that use graph models. However, efficient GNN acceleration faces multiple challenges related to the high and variable sparsity of input feature vectors, the power-law degree distribution of the adjacency matrix, and maintaining load-balanced computation with minimal random memory accesses. This thesis addresses the problem of building fast, energy-efficient inference and training accelerators for GNNs, covering both static and dynamic graphs. For inference, this thesis proposes GNNIE, a versatile GNN inference accelerator capable of handling a diverse set of GNNs, including graph attention networks (GATs), graph convolutional networks (GCNs), GraphSAGE, GINConv, and DiffPool. It mitigates workload imbalance by (i) splitting vertex feature operands into blocks, (ii) reordering and redistributing computations, and (iii) using a novel "flexible MAC" architecture. To maximize on-chip data reuse and reduce random DRAM fetches, GNNIE adopts a novel graph-specific, degree-aware caching policy. GNNIE attains substantial speedups over a CPU (7197x), a GPU (17.81x), and prior works, e.g., HyGCN (5x) and AWB-GCN (1.3x), across multiple datasets on GCN, GAT, GraphSAGE, and GINConv. For training GNNs on large graphs, this research develops a GNNIE-based multicore accelerator. A novel feature vector segmentation approach is proposed to scale to large graphs using small on-chip buffers, and a multicore-specific, graph-specific caching scheme is implemented to reduce off-chip and on-chip communication and alleviate random DRAM accesses. Experiments over multiple large datasets and multiple GNNs demonstrate an average training speedup and energy efficiency improvement of 17x and 322x, respectively, over DGL on a GPU, and a speedup of 14x with 268x lower energy over the GPU-based GNNAdvisor approach.
Overall, this research tackles the scalability and versatility issues of building GNN accelerators while delivering significant speedups and energy efficiency. Finally, this thesis addresses the acceleration of dynamic graph neural networks (DGNNs), which play a crucial role in applications such as social network analytics and urban traffic prediction that require inference on graph-structured data whose connectivity and features evolve over time. The proposed platform integrates the GNN and Recurrent Neural Network (RNN) components of DGNNs, providing a unified platform for capturing spatial and temporal information, respectively. The contributions encompass optimized cache reuse strategies, a novel caching policy, and an efficient pipelining mechanism. Evaluation across multiple graph datasets and multiple DGNNs demonstrates average energy efficiency gains of 8393x, 183x, and 87x-10x, and inference speedups of 1796x, 77x, and 21x-2.4x, over an Intel Xeon Gold CPU, an NVIDIA V100 GPU, and prior state-of-the-art DGNN accelerators, respectively. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone (800-521-0600). Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
- Published
- 2024
21. Computational Shoeprint Analysis for Forensic Science
- Author
- Samia Shafique
- Abstract
Shoeprints are a common type of evidence found at crime scenes and are regularly used in forensic investigations. However, their utility is limited by the lack of reference footwear databases that cover the large and growing number of distinct shoe models. Additionally, existing methods for matching crime-scene shoeprints to reference databases cannot effectively employ deep learning techniques due to a lack of training data. Moreover, these methods typically rely on comparing crime-scene shoeprints with clean reference prints instead of more detailed tread depth maps. To address these challenges, we break down the problem into two parts. First, we leverage shoe tread images sourced from online retailers to predict their corresponding depth maps, which are then thresholded to generate prints, thus constructing a comprehensive reference database. Next, we use a section of this database to train a retrieval network that matches query crime-scene shoeprints to tread depth maps. Extensive experimentation across multiple datasets demonstrates the state-of-the-art performance achieved by both the database creation and retrieval steps, validating the effectiveness of our proposed methodology. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone (800-521-0600). Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
- Published
- 2024
22. Evaluating a Web-Based System for Tracking Public Health Practice Experiences: User Perceptions, Challenges, and Recommendations for Technology Improvement
- Author
-
Matthew McGrievy
- Abstract
The purpose of this descriptive research study was to evaluate the web-based practice experience tracking system at a school of public health (SPH) at a large southeastern university. Applied practice experiences (APEs) are a key component of public health education, and schools and programs of public health in the United States must provide documentation of APEs and practice-related activities to meet accreditation standards. A descriptive research evaluation examined how users perceive and use the existing web-based system designed to document APEs in the SPH. Research questions investigated what factors influence the use of the existing system, identified challenges users face when using the system, and focused on recommendations for improvement in tracking practice-based experiences using the web-based system. The research design is a case study that uses a convergent mixed methods approach in which qualitative and quantitative data were collected simultaneously from students, faculty, and practitioner partners of the SPH. Semi-structured interviews were used to gather qualitative data from 8 participants, and a survey was used to gather quantitative data from 82 respondents. The Unified Theory of Acceptance and Use of Technology (UTAUT) and the User Burden Scale (UBS) served as the theoretical basis for the semi-structured interviews and the survey. Quantitative and qualitative results indicated overall positive perceptions toward constructs related to perceived usefulness, ease-of-use, social influence, and user burden. Overall attitude toward the system was rated most negatively by participants. Research findings can serve as a guide for other schools and programs that are required to document and report on practice-based learning to meet accreditation requirements. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone (800) 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
- Published
- 2024
23. The Impact of Specialised Translator Training and Professional Experience on Legal Translation Quality Assurance: An Empirical Study of Revision Performance
- Author
-
Fernando Prieto Ramos and Diego Guzmán
- Abstract
The relevance of translation and law degrees as pathways to professional legal translation is the subject of persistent debate, but there is limited research on the relationship between legal translators' backgrounds and competence levels in practice. This study compares the revision performance of several groups of institutional translators (44 in total) according to their academic backgrounds (legal translation specialisation, translation degrees with no legal specialisation, law degrees or other degrees) and legal translation experience (more or less than three years). The scores of justified, missing and over-corrections, among other indicators, corroborate the crucial impact of legal translation specialisation and subject-matter knowledge in ensuring legal translation quality, while experience can serve to partially fill certain training deficits. Qualified translators with a legal specialisation stood out as the most efficient revisers, followed by law graduates, translation graduates without a legal specialisation and other translators. A subsequent holistic assessment of the revised target text yielded results globally in line with the revision scores, as well as mixed perceptions of the target text as potential machine output. The findings support the added value of legal translator training, and are of relevance for translator recruitment and workflow management. They also challenge the rationale behind ISO 20771:2020 qualification requirements.
- Published
- 2024
- Full Text
- View/download PDF
24. A Study on the Relationship between Deep Learning and Statistical Models
- Author
-
Il Do Ha
- Abstract
Recently, deep learning has become a pervasive tool in prediction problems for structured and/or unstructured big data in various areas including science and engineering. In particular, deep neural network models (i.e. a basic core model of deep learning) can be viewed as an extension of statistical models by going through the incorporation of hidden layers. In this paper, we study the relationship between both models in terms of model structures and model learning. For this purpose, we also compare the predictive performances of both models, with two practical examples.
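The paper's framing, deep networks as statistical models extended with hidden layers, can be made concrete with a minimal sketch: a network with no hidden layer and a sigmoid output is exactly logistic regression. The data, learning rate, and epoch count below are invented for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Gradient descent on a 'network' with no hidden layer: y ~ sigmoid(w*x + b).
    Inserting hidden layers between x and the output turns this statistical
    model into a deep neural network."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            grad = p - y  # derivative of the log-loss w.r.t. the pre-activation
            w -= lr * grad * x
            b -= lr * grad
    return w, b

xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b = train_logistic(xs, ys)
print(sigmoid(w * -2.0 + b), sigmoid(w * 2.0 + b))  # low vs. high probability
```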
- Published
- 2024
- Full Text
- View/download PDF
25. Comparing Optimization Practices across Engineering Learning Contexts Using Process Data
- Author
-
Jennifer L. Chiu, James P. Bywater, Tugba Karabiyik, Alejandra Magana, Corey Schimpf, and Ying Ying Seah
- Abstract
Despite an increasing focus on integrating engineering design in K-12 settings, relatively few studies have investigated how to support students to engage in systematic processes to optimize the designs of their solutions. Emerging learning technologies such as computational models and simulations enable rapid feedback to learners about their design performance, as well as the ability to research how students may or may not be using systematic approaches to the optimization of their designs. This study explored how middle school, high school, and pre-service students optimized the design of a home for energy efficiency, size, and cost using facets of fluency, flexibility, closeness, and quality. Results demonstrated that students with successful designs tended to explore the solution space with designs that met the criteria, with relatively lower numbers of ideas and fewer tightly controlled tests. Optimization facets did not vary across different student levels, suggesting the need for more emphasis on supporting quantitative analysis and optimization facets for learners in engineering settings.
- Published
- 2024
- Full Text
- View/download PDF
26. FPGA Hardware Kit for Remote Training Platforms
- Author
-
Muhammad Alhammami
- Abstract
This paper outlines the development of a Hardware Development Kit (HDK) for a remote training platform on FPGA devices, designed to provide university students pursuing degrees in electronic and informatics engineering (at the bachelor's, master's, and PhD levels) with the tools they need to learn and develop systems related to artificial intelligence, digital signal processing, image processing, digital systems, and integrated and embedded systems using an FPGA chip. Through the internet or a local network, students can engage with the platform to conduct engineering experiments. The HDK is equipped with a Raspberry Pi, a screen, a camera, and LEDs to facilitate the transfer of experiment results and to aid in assessing the validity of lab experiment execution. The Raspberry Pi plays a crucial role in the HDK, providing control and monitoring, remote access, data processing, a user interface, and integration with other possible systems.
- Published
- 2024
- Full Text
- View/download PDF
27. 'The Punched Cards Were Sent Yesterday, We Hope They Arrive Undamaged.' Computers and International Large-Scale Assessments during the 1960s and 1970s
- Author
-
Joakim Landahl
- Abstract
This article explores the history of digital testing technology. Using an organisation that pioneered the use of international large-scale assessments -- the International Association for the Evaluation of Educational Achievement (IEA) -- I discuss the role of computers, punched cards, answer cards and scanning machines as an example of transnational collaboration in the social sciences. In doing so, I particularly look into temporal and spatial dimensions of collaboration. The temporal dimension had to do with ideals of speed and durability, whereas the spatial dimension had to do with questions of where data processing should occur and how data could travel across boundaries. The analysis goes beyond simplistic assumptions where digital technology is seen merely as a tool that facilitates data processing. Instead, I stress the complexity of using computers and emphasize the materiality of digital technology, essential for understanding how data moved around in an era before the World Wide Web.
- Published
- 2024
- Full Text
- View/download PDF
28. The Discovery of Knowledge in Educational Databases: A Literature Review with Emphasis on Preprocessing and Postprocessing
- Author
-
Garcia, Léo Manoel Lopes da Silva, Lara, Daiany Francisca, Gomes, Raquel Salcedo, and Cazella, Silvio Cézar
- Abstract
In educational data mining (EDM), preprocessing is an arduous and complex task and must promote an appropriate treatment of data to solve each specific educational problem. In the same way, the parameters used in the evaluation of postprocessing results are decisive in the interpretation of the results and decision-making in the future. These two steps have as much influence on obtaining good results in EDM as the algorithms used. However, in the dissemination of the results of studies on this topic, emphasis is placed only on the evaluation of the algorithms used. Thus, the present study sought to carry out a systematic review of the literature on this topic, focusing on the exploration of the preprocessing performed and on the metrics for evaluating the results. In many studies, the description and evaluation of the preprocessing are negligible, and several metrics are used to evaluate the algorithms without a proper explanation of what each metric means for reaching the proposed objective.
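As a small illustration of the review's point that evaluation should report several metrics, each answering a different question about a classifier, here is a minimal self-contained sketch for a binary predictor (labels invented):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive).
    Accuracy: fraction correct overall; precision: how trustworthy a positive
    prediction is; recall: how many true positives were found; F1: their
    harmonic mean."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```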
- Published
- 2022
29. The Construction and Application of Regional Education Quality Monitoring Databases: A Case Study of Suzhou's Education Quality Monitoring
- Author
-
Shen, Jian and Luo, Qiang
- Abstract
The development of school education depends on the quality of the education provided, which is a key metric for assessing the effectiveness of schools in developing talent. Building specialized, intelligent education quality monitoring (EQM) databases is crucial for speeding up EQM progress in the big data era. This article examines the development of regional EQM databases in the areas of operational procedure and logical structure based on the idea of data lakes, using the development of databases for the EQM data analysis system (DAS) in Suzhou City as a case study. The goal of this study is to assist in addressing the current issues with regional EQM data processing and ensuring EQM's successful implementation.
- Published
- 2022
30. Gaining an Insight into Learner Satisfaction in MOOCs: An Investigation through Blog Mining
- Author
-
Ustaoğlu, Mehmi and Kukul, Volkan
- Abstract
MOOCs can be considered a powerful alternative in extraordinary situations where people cannot reach formal education. In recent years, the widespread use of the internet worldwide and especially the COVID-19 pandemic have increased the need of people for MOOCs. However, in order to increase the effectiveness of MOOCs and to provide a better learning environment, the need to evaluate MOOCs has arisen. One of the indicators of quality in online learning is student satisfaction. Accordingly, this research aims to reveal learner satisfaction in MOOCs. The most important indicator for measuring this satisfaction in MOOCs is user comments. In this study, 39101 comments of the participants in 960 MOOCs were examined by using text mining techniques within the framework of satisfaction.
- Published
- 2022
31. New Materialist Network Approaches in Science Education: A Method to Construct Network Data from Video
- Author
-
Turkkila, Miikka, Lavonen, Jari, Salmela-Aro, Katariina, and Juuti, Kalle
- Abstract
Lately, new materialism has been proposed as a theoretical framework to better understand material-dialogic relationships in learning, and concurrently network analysis has emerged as a method in science education research. This paper explores how to include materiality in network analysis and reports the development of a method to construct network data from video. The approaches, (1) information flow, (2) material semantic and (3) material engagement, were identified based on the literature on network analysis and new materialism in science education. The method was applied and further improved with a video segment from an upper secondary school physics lesson. The example networks from the video segment show that network analysis is a potential research method within the materialist framework and that the method allows studies into the material and dialogic relationships that emerge when students are engaged in investigations in school.
- Published
- 2022
32. Recent Advances in Predictive Learning Analytics: A Decade Systematic Review (2012-2022)
- Author
-
Sghir, Nabila, Adadi, Amina, and Lahmer, Mohamm
- Abstract
The last few years have witnessed an upsurge in the number of studies using Machine and Deep learning models to predict vital academic outcomes based on different kinds and sources of student-related data, with the goal of improving the learning process from all perspectives. This has led to the emergence of predictive modelling as a core practice in Learning Analytics and Educational Data Mining. The aim of this study is to review the most recent research body related to Predictive Analytics in Higher Education. Articles published during the last decade between 2012 and 2022 were systematically reviewed following PRISMA guidelines. We identified the outcomes frequently predicted in the literature as well as the learning features employed in the prediction and investigated their relationship. We also deeply analyzed the process of predictive modelling, including data collection sources and types, data preprocessing methods, Machine Learning models and their categorization, and key performance metrics. Lastly, we discussed the relevant gaps in the current literature and the future research directions in this area. This study is expected to serve as a comprehensive and up-to-date reference for interested researchers intended to quickly grasp the current progress in the Predictive Learning Analytics field. The review results can also inform educational stakeholders and decision-makers about future prospects and potential opportunities.
- Published
- 2023
- Full Text
- View/download PDF
33. 2008/18 Baccalaureate and Beyond Longitudinal Study (B&B:08/18). Data File Documentation. NCES 2021-141
- Author
-
National Center for Education Statistics (NCES) (ED/IES), RTI International, Cominole, Melissa, Ritchie, Nichole Smith, and Cooney, Jennifer
- Abstract
This publication describes the methods and procedures used for the 2008/18 Baccalaureate and Beyond Longitudinal Study (B&B:08/18). The B&B graduates, who completed the requirements for a bachelor's degree during the 2007-08 academic year, were first surveyed as part of the 2008 National Postsecondary Student Aid Study (NPSAS:08), and then followed up with in 2009 and 2012. The 2018 follow-up is the third and final survey for the B&B:08 cohort, conducted 10 years after completion of the bachelor's degree. This report details the methodology and outcomes of the B&B:08/18 student survey data collection and administrative records matching. [For "Baccalaureate and Beyond (B&B:08/18): First Look at the 2018 Employment and Educational Experiences of 2007-08 College Graduates. First Look. NCES 2021-241," see ED609932.]
- Published
- 2021
34. Liability Framework for Cognitive Computing in Healthcare: Standing at the Crossroad
- Author
-
Saripan, Hartini, Mohd Shith Putera, Nurus Sakinatul Fikriah, Abdullah, Sarah Munirah, Abu Hassan, Rafizah, and Abd Ghadas, Zuhairah Ariff
- Abstract
Digitization across the healthcare industry has witnessed the advent of emerging Cognitive Computing (CC) healthcare technologies that improve diagnostic accuracy and efficiency, predict illnesses, automate routine healthcare tasks, and refine processes and care beyond human capabilities. Increased adoption of this technology can be attributed to its ability to process enormous amounts of data promptly in addressing specific queries and producing customized intelligent recommendations. While CC's transformative technologies offer profound benefits to the healthcare industry, they also carry an unpredictable burden of risk and mistakes with damaging consequences to patients. At this juncture, CC's legal place in healthcare is largely undefined, as the applicable liability framework is ambiguous. CC fits into the traditional liability rules in a piecemeal manner; however, a single theory of recovery sufficiently addressing the potential liability questions arising from a computer system capable of practicing medicine and possessing the ability of parsing through enormous data for better patient outcomes is absent. The present research therefore sets out to chart the analysis of cases involving emerging medical technologies comparable to CC, in the hope of examining ways in which the traditional theories of liability are projected to develop in adapting to this novel contrivance. Doctrinal and case study methods formed the integrated qualitative approach adopted by this research to examine the deployment of emerging medical technologies akin to CC and the bearing it has on the imposition of liability in the United States. CC's potential contributions to healthcare are revolutionary, but its legal repercussions are just as alarming and therefore demand more discussion in addressing the concerns.
- Published
- 2021
35. ADL DAU Sandbox. Final Report
- Author
-
Office of the Secretary of Defense (SECDEC) (DOD), Office of the Under Secretary of Defense for Personnel & Readiness (USDP&R), Advanced Distributed Learning (ADL) Initiative, Schatz, Sae, Feemster, Val, and Tompkins, Juli
- Abstract
In May of 2020, the Advanced Distributed Learning Initiative (ADL) and the Defense Acquisition University (DAU) partnered with USALearning/PowerTrain to create a practical application of ADL's 2019 TLA Reference Implementation. The purpose of this project was to recreate key components of the Reference Implementation with commercially-available, open source, and customized solutions to demonstrate the value of organizations choosing to adopt the architecture to improve their ability to track competency-based learning. [This report was created with USALearning and PowerTrain.]
- Published
- 2021
36. AI-Based Diagnostic Assessment System: Integrated With Knowledge Map in MOOCs
- Author
-
Lee, Chia-An, Huang, Nen-Fu, Tzeng, Jian-Wei, and Tsai, Pin-Han
- Abstract
Massive open online courses offer a valuable platform for efficient and flexible learning. They can improve teaching and learning effectiveness by enabling the evaluation of learning behaviors and the collection of feedback from students. The knowledge map approach constitutes a suitable tool for evaluating and presenting students' learning performance levels. This study proposes an artificial-intelligence-based knowledge assessment system that integrates knowledge maps to determine students' familiarity with and mastery of course contents. This study employs a structural approach encompassing data collection, data preprocessing, model training, testing, and evaluation. The system can then customize the knowledge maps and recommend videos according to the knowledge nodes. Students consequently dedicate additional time to studying concepts with which they are unfamiliar and adjust their learning efforts accordingly. After teachers and teaching assistants have captured students' performance metrics and idiosyncratic weaknesses through knowledge maps, teachers can modify the teaching materials. Through the use of education data mining and learning analytics, our system can benefit both teachers and online learners. We hope that the proposed system provides a more personalized and intelligent online learning environment within which students can learn in a more efficient and flexible manner.
- Published
- 2023
- Full Text
- View/download PDF
37. (Re)Politicising Data-Driven Education: From Ethical Principles to Radical Participation
- Author
-
Knox, Jeremy
- Abstract
This paper examines ways in which the ethics of data-driven technologies might be (re)politicised, particularly where educational institutions are involved. The recent proliferation of principles, guidelines, and frameworks for ethical 'AI' (artificial intelligence) have emerged from a plethora of organisations in recent years, and seem poised to impact educational governance. This trend will be firstly shown to align with a narrow form of ethics--deontology--and overlook other potential ways ethical reasoning might contribute to thinking about 'AI'. Secondly, the attention to ethical principles will be suggested to focus excessively on the technology itself, with the effect of masking political concerns for equity and justice. Thirdly and finally, the paper will propose a more radical form of participation in ethical decision-making that not only challenges the assumption of universal consensus, but also draws more authentically on the capacities for debate, contestation, and exchange inherent in the educational institution.
- Published
- 2023
- Full Text
- View/download PDF
38. A Hybridized Deep Learning Strategy for Course Recommendation
- Author
-
Deepak, Gerard and Trivedi, Ishdutt
- Abstract
Recommender systems have been actively used in many areas such as e-commerce and movie and video suggestions, and have proven highly useful to their users. In online learning platforms, however, recommender systems remain underrated and underused: the collaborative approach often lacks personalisation, while the content-based approach does not work well for new users. The authors therefore propose a hybrid course recommender system that combines the content-based and collaborative approaches and tackles their individual limitations. They treat recommendation as a sequential problem and use RNNs to solve it, with each recommendation seen as the next course in the sequence. The results suggest the system outperforms other recommender systems when LSTMs are used instead of plain RNNs. The authors develop and test the recommendation system on a Kaggle dataset of users, using historical data and the search histories of different users to group similar users.
- Published
- 2023
- Full Text
- View/download PDF
39. Overreliance on Inefficient Computer-Mediated Information Retrieval Is Countermanded by Strategy Advice That Promotes Memory-Mediated Retrieval
- Author
-
Patrick P. Weis and Wilfried Kunde
- Abstract
With ubiquitous computing, problems can be solved using more strategies than ever, though many strategies feature subpar performance. Here, we explored whether and how simple advice regarding when to use which strategy can improve performance. Specifically, we presented unfamiliar alphanumeric equations (e.g., A + 5 = F) and asked whether counting up the alphabet from the left letter by the indicated number resulted in the right letter. In an initial choice block, participants could engage in one of three cognitive strategies: (a) internal counting, (b) internal retrieval of previously generated solutions, or (c) computer-mediated external retrieval of solutions. Participants belonged to one of two groups: they were either instructed to first try internal retrieval before using external retrieval, or received no specific use instructions. In a subsequent internal block with identical instructions for both groups, external retrieval was made unavailable. The 'try internal retrieval first' instruction in the choice block led to pronounced benefits (d = 0.76) in the internal block. Benefits were due to facilitated creation and retrieval of internal memory traces and possibly also due to improved strategy choice. These results showcase how simple strategy advice can greatly help users navigate cognitive environments. More generally, our results also imply that uninformed use of external tools (i.e., technology) can bear the risk of not developing and using even more superior internal processing strategies.
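The verification task in the study is simple to state in code. This sketch (function name invented) mirrors the paper's example A + 5 = F by counting up the alphabet:

```python
# Sketch of the alphanumeric verification task: does counting up the
# alphabet from the left letter by the indicated number yield the
# right-hand letter?

def holds(left, step, right):
    """True if, e.g., holds('A', 5, 'F'): A -> B -> C -> D -> E -> F."""
    return ord(left) + step == ord(right)

print(holds("A", 5, "F"))  # True
print(holds("C", 4, "H"))  # False (C + 4 = G)
```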
- Published
- 2023
- Full Text
- View/download PDF
40. Exploring Learner and Task Characteristics during Information Visualization Comprehension: Toward Adaptive Infographics
- Author
-
Kristine Zlatkovic
- Abstract
New forms of visualizations are transforming how people interact with data. This dissertation explored how undergraduates learn with infographics. The following questions guided this research: (i) What do we know about the factors influencing the processing of data visualizations? (ii) How do task-level and learner-level characteristics impact the visual processing and comprehension of infographics? (iii) Can machine learning be used to reliably predict the visual processing and comprehension of infographics using task-level and learner-level characteristics? Systematic review of the literature has shown that data visualization comprehension includes perceptive and conceptual processes, which are influenced by learning task and its complexity, strategies used to convey data, individual learner differences in previous experiences, cognitive and attentional characteristics. A study was conducted with 51 undergraduates in an eye-tracking laboratory at a major southeastern university. The learning task included using infographics with verbal and visual data representations to find answers to questions of three levels of complexity. Learners' working memory, visual search and inhibitory control abilities were evaluated as measures of individual differences in cognition. The results suggest that learners become more engaged and produce slightly more accurate results when they learn using verbal infographics. Complex tasks that require learners to make inferences by connecting newly acquired knowledge with prior knowledge always produce the least accurate results. Regardless of the data representation format, learners' visuospatial working memory and goal-oriented visual search ability significantly influence comprehension. On the contrary, low verbal working memory and inhibitory control hinder the processing of verbal infographics. 
Further, a random forest algorithm predicted infographics comprehension with 88.04% accuracy with learner-level and task-level characteristics contributing to the predictive performance of the model. Learners' visual processing was explored using gaze-based saliency maps. Machine learning generated less than 50% accuracy using saliency map predictions. Yet, statistical tests revealed that both task-level and learner-level characteristics are significantly associated with saliency maps. This study implies that machine learning and the proposed saliency maps may contribute to the development of adaptive infographics based on learner-level and task-level characteristics. This study contributes insights to existing knowledge on data visualization comprehension and proposes new approaches to enhance learning with information visualizations. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone (800) 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
- Published
- 2023
41. 20 Years of Interactive Tasks in Large-Scale Assessments: Process Data as a Way towards Sustainable Change?
- Author
-
Stadler, Matthias, Brandl, Laura, and Greiff, Samuel
- Abstract
Background: Over the last 20 years, educational large-scale assessments have undergone dramatic changes, moving away from simple paper-pencil assessments to innovative, technology-based assessments. This comprehensive switch has led to some rather technical improvements such as identifying early guessing or improving standardization. Objectives: At the same time, process data on student interaction with items has been shown to carry value for obtaining, reporting, and interpreting additional results on student skills in international comparisons. In fact, on the basis of innovative simulated assessment environments, news about student rankings, under- and overperforming countries, and novel ideas on how to improve educational systems are prominently featured in the media. Yet few of these efforts have been used in a sustainable way to create new knowledge (i.e., on a scientific level), to improve learning and instruction (i.e., on a practical level), and to provide actionable advice to political stakeholders (i.e., on a policy level). Methods: This paper will adopt a meta-perspective and discuss recent and current developments with a focus on these three perspectives. There will be a particular emphasis on new assessment environments that have been recently employed in large-scale assessments. Results and Conclusions: Most findings remain very task specific. We propose necessary steps that need to be taken in order to yield sustainable change from analysing process data on all three levels. Implications: New technologies might be capable of closing the research-policy-practitioner gap when it comes to utilizing the results from large-scale assessments to increase the quality of education around the globe, but this will require a more systematic approach towards researching them.
- Published
- 2023
- Full Text
- View/download PDF
42. PIILO: An Open-Source System for Personally Identifiable Information Labeling and Obfuscation
- Author
-
Holmes, Langdon, Crossley, Scott, Sikka, Harshvardhan, and Morris, Wesley
- Abstract
Purpose: This study aims to report on an automatic deidentification system for labeling and obfuscating personally identifiable information (PII) in student-generated text. Design/methodology/approach: The authors evaluate the performance of their deidentification system on two data sets of student-generated text. Each data set was human-annotated for PII. The authors evaluate using two approaches: per-token PII classification accuracy and a simulated reidentification attack design. In the reidentification attack, two reviewers attempted to recover student identities from the data after PII was obfuscated by the authors' system. In both cases, results are reported in terms of recall and precision. Findings: The authors' deidentification system recalled 84% of student name tokens in their first data set (96% of full names). On the second data set, it achieved a recall of 74% for student name tokens (91% of full names) and 75% for all direct identifiers. After the second data set was obfuscated by the authors' system, two reviewers attempted to recover the identities of students from the obfuscated data. They performed below chance, indicating that the obfuscated data presents a low identity disclosure risk. Research limitations/implications: The two data sets used in this study are not representative of all forms of student-generated text, so further work is needed to evaluate performance on more data. Practical implications: This paper presents an open-source and automatic deidentification system appropriate for student-generated text with technical explanations and evaluations of performance. Originality/value: Previous study on text deidentification has shown success in the medical domain. This paper develops on these approaches and applies them to text in the educational domain.
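Per-token evaluation as described above reduces to overlap between gold and predicted PII tokens. A minimal sketch follows (data and label names invented; this is not PIILO's actual implementation):

```python
def token_prf(gold, predicted):
    """Precision and recall over (token_position, label) pairs marking PII."""
    tp = len(gold & predicted)  # tokens labeled correctly
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

gold = {(0, "NAME"), (1, "NAME"), (7, "EMAIL")}
pred = {(0, "NAME"), (7, "EMAIL"), (9, "NAME")}
print(token_prf(gold, pred))  # both 2/3: one missed token, one false alarm
```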
- Published
- 2023
- Full Text
- View/download PDF
43. The Use of Data Mining to Achieve the Objectives of Open-Ended Inquiry in the Context of IB Biology Classrooms
- Author
-
Mindorff, David
- Abstract
Practical work involving laboratory experiments is widely agreed to be an essential component of secondary science education. Practical work encompasses a broad range of activity types, and the different forms are not equal in terms of cognitive demand and learning benefit. Inquiry-based investigations provide experience of cognitive apprenticeship, an opportunity for developing important and unique academic dispositions, and exposure to communities of practice. There are several barriers to the successful implementation of inquiry investigations, and achievement often falls short of the intended outcome in multiple educational jurisdictions. Database mining as the foundational method of an inquiry investigation can address these challenges. The International Baccalaureate Diploma requires submission of a written record of an inquiry-based investigation as part of its assessment scheme. Two hundred samples of International Baccalaureate inquiry investigations were analyzed to ascertain the frequency of use of database mining as the method, and whether this approach differed in its compliance with the inquiry model and its success against the assessment criteria. Results show no difference in assessment outcome, with slight improvement in one domain, and no difference in the frequency of successful compliance with the model. These results have implications for distance learning and for scaling up inquiry investigations at universities without a significant drain on resources. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone: 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
- Published
- 2023
44. High School Longitudinal Study of 2009 (HSLS:09) Postsecondary Education Transcript Study and Student Financial Aid Records Collection. Data File Documentation. NCES 2020-004
- Author
-
National Center for Education Statistics (ED), RTI International, Duprey, Michael A., Pratt, Daniel J., Wilson, David H., Jewell, Donna M., Brown, Derick S., Caves, Lesa R., Kinney, Satkartar K., Mattox, Tiffany L., Ritchie, Nichole Smith, Rogers, James E., Spagnardi, Colleen M., and Wescott, Jamie D.
- Abstract
This data file documentation accompanies new data files for the High School Longitudinal Study of 2009 (HSLS:09) Postsecondary Education Transcript Study and Student Financial Aid Records Collection (PETS-SR). HSLS:09 follows a nationally representative sample of students who were ninth-graders in fall 2009 from high school into postsecondary education and the workforce. The PETS-SR data collection was conducted between spring 2017 and fall 2018, approximately 4 years after high school graduation for most of the cohort. These data allow researchers to examine postsecondary coursetaking experiences and financial aid awards for the subset of fall 2009 ninth-graders who enrolled in postsecondary education after high school. [For "High School Longitudinal Study of 2009 (HSLS:09) Postsecondary Education Transcript Study and Student Financial Aid Records Collection. Data File Documentation. Appendices. NCES 2020-004," see ED607373.]
- Published
- 2020
45. Proceedings of the International Conference on Educational Data Mining (EDM) (13th, Online, July 10-13, 2020)
- Author
-
International Educational Data Mining Society, Rafferty, Anna N., Whitehill, Jacob, Romero, Cristobal, and Cavalli-Sforza, Violetta
- Abstract
The 13th iteration of the International Conference on Educational Data Mining (EDM 2020) was originally arranged to take place in Ifrane, Morocco. Due to the SARS-CoV-2 (coronavirus) pandemic, EDM 2020, like most other academic conferences in 2020, had to move to a purely online format. To facilitate efficient transmission of presentations, all paper presenters pre-recorded their presentation as a video and hosted it on YouTube with closed-captioning (CC). The official theme of this year's conference is Improving Learning Outcomes for All Learners. The theme comprises two parts: (1) Identifying actionable learning or teaching strategies that can be used to "improve" learning outcomes, not just predict them; (2) Using EDM to promote more "equitable" learning across diverse groups of learners, and to benefit underserved communities in particular. This year's conference features three invited talks: Alina von Davier, Chief Officer at ACTNext; Abelardo Pardo, Professor and Dean of Programs (Engineering) at UniSA STEM, University of South Australia; and Kobi Gal, Associate Professor at the Department of Software and Information Systems Engineering at Ben-Gurion University of the Negev, and Reader at the School of Informatics at the University of Edinburgh.
- Published
- 2020
46. Assessment Big Data in Nigeria: Identification, Generation and Processing in the Opinion of the Experts
- Author
-
Esomonu, Nkechi Patricia-Mary, Esomonu, Martins Ndibem, and Eleje, Lydia Ijeoma
- Abstract
As a result of the increasing complexity of assessing all aspects of human behaviour, a lot of data is generated on individual learners, from teachers, and from the system. What qualifies as big data in assessment in Nigeria? This research identifies the sources of assessment big data in Nigeria, investigates how the big data are generated and processed, and identifies the problems of generating and processing assessment big data in Nigeria. Through a purposive sampling technique, forty-five experts in education assessment and research were selected. The instruments for data collection were interviews and documents. The data collected were analysed using descriptive statistics to answer the five research questions that guided the research. The results showed that internal and external examinations and assessments from secondary schools, and coursework results in universities, were identified by more than 95.5% of the experts interviewed as the major sources of assessment data in Nigeria. The major problem in generating and processing assessment big data, in the experts' opinion, is low awareness of the need for and advantages of assessment big data, with the highest mean rating (4.29[plus or minus]0.76). Much data goes unanalysed, and a lot of information is lost. Recommendations were made, amongst others, that stakeholders create awareness of the importance of big data in the modern education system to improve learners' performance.
- Published
- 2020
47. The Application of Multimedia Information Fusion Technology in the Construction of University Intelligent Libraries
- Author
-
Nan Pang
- Abstract
Multimedia information fusion is currently a popular technology for data processing. The productivity it brings is dozens of times that of traditional data processing, and the information it collects and processes is not only efficient and accurate but also highly scalable. With the continuous development of college education in China, the construction of intelligent libraries in colleges and universities has gradually come to the fore. The establishment of a university intelligent library is a process of integration and common progress between disciplines. This article starts from the current situation of university libraries, studies and analyzes intelligent libraries at home and abroad, and finally combines existing multimedia information fusion technology to analyze information flow at the theoretical level as an important idea in current university intelligent library construction.
- Published
- 2024
- Full Text
- View/download PDF
48. Diving Deep into Dissertations: Analyzing Graduate Students' Methodological and Data Practices to Inform Research Data Services and Subject Liaison Librarian Support
- Author
-
Swygart-Hobaugh, Mandy, Anderson, Raeda, George, Denise, and Glogowski, Joel
- Abstract
We present findings from an exploratory quantitative content analysis case study of 156 doctoral dissertations from Georgia State University that investigates doctoral student researchers' methodology practices (used quantitative, qualitative, or mixed methods) and data practices (used primary data, secondary data, or both). We discuss the implications of our findings for provision of data support services provided by the Georgia State University Library's Research Data Services (RDS) Team and subject liaison librarians in the areas of instructional services, data software support and licensing advocacy, collection development, marketing/outreach, and professional development/expansion.
- Published
- 2022
- Full Text
- View/download PDF
49. An Analysis of Learners' Programming Skills through Data Mining
- Author
-
Zhang, Wei, Zeng, Xinyao, Wang, Jihan, Ming, Daoyang, and Li, Panpan
- Abstract
Programming skills (PS) are indispensable abilities in the information age, but current research on PS cultivation focuses mainly on teaching methods and lacks analysis of program features to explore differences in learners' PS and guide programming learning. This study therefore aims to explore horizontal differences and vertical changes in the PS of learners aged 18 to 25, and to facilitate the discovery of programming features and behaviors that guide the acquisition of PS, through statistical analysis and cluster analysis of 2,400 Python programs across four programming tasks. The research found that the characteristics and main differences of PS are reflected in function calls, interactive loops, and the nesting of multiple structures. The transition from simple to medium-difficulty programming tasks is the most important link in programming learning. Furthermore, the research showed that differences in program structure are the core and foundation: differences in the type and quantity of simple structures, nested structures, and mixed use of structures are regular, and are important factors in determining whether a program runs efficiently and whether a programming task can be solved. Finally, some heuristic ideas were put forward to help learners optimize programs and overcome programming difficulties, which is of great guiding significance to PS learning.
- Published
- 2022
- Full Text
- View/download PDF
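A cluster analysis over program features, as described in the abstract above, might be sketched as a toy k-means over two invented per-program features (function-call count and maximum nesting depth). Everything below is illustrative, under stated assumptions; it is not the authors' actual method or data:

```python
# Toy k-means (k=2) grouping programs by hypothetical structural features:
# (number of function calls, maximum nesting depth).

def dist2(a, b):
    # Squared Euclidean distance between two feature tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(pts):
    # Component-wise mean of a non-empty list of tuples.
    n = len(pts)
    return tuple(sum(p[i] for p in pts) / n for i in range(len(pts[0])))

def kmeans(points, k=2, iters=20):
    centers = points[:k]  # naive initialization from the first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: dist2(p, centers[c]))
            clusters[j].append(p)
        # Recompute each center; keep the old one if its cluster emptied.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

programs = [(2, 1), (3, 1), (2, 2),    # "simple": few calls, shallow nesting
            (9, 4), (10, 5), (8, 4)]   # "complex": many calls, deep nesting
centers, clusters = kmeans(programs)
```

With real data the features would come from parsing each program (for example with Python's `ast` module), and a library implementation such as scikit-learn's `KMeans` would replace this toy loop.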
50. Data Cleansing: An Omission from Data Analytics Coursework
- Author
-
Snyder, Johnny
- Abstract
Quantitative decision making (management science, business statistics) textbooks rarely address data cleansing issues; rather, they come with neat, clean, well-formatted data sets for the student to analyze. However, with a majority of the data analyst's time spent on gathering, cleaning, and pre-conditioning data, students need to be trained in what to look for when generating or receiving data. At a minimum, a critical scan of the data set needs to be performed to look for errors before data analysis can begin.
- Published
- 2019
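The critical scan Snyder recommends can be illustrated with a minimal Python check. The dataset, column names, and validation rules below are all hypothetical, invented for illustration:

```python
# Minimal data-cleansing scan: flag common problems (missing values,
# implausible values, duplicate keys, non-numeric fields) before analysis.
rows = [
    {"id": "1", "age": "34",  "score": "88.5"},
    {"id": "2", "age": "",    "score": "91.0"},  # missing age
    {"id": "3", "age": "210", "score": "77.2"},  # implausible age
    {"id": "2", "age": "29",  "score": "abc"},   # duplicate id, bad score
]

def scan(rows):
    issues = []
    seen_ids = set()
    for i, row in enumerate(rows):
        if row["id"] in seen_ids:
            issues.append((i, "duplicate id"))
        seen_ids.add(row["id"])
        if not row["age"]:
            issues.append((i, "missing age"))
        elif not 0 < int(row["age"]) < 120:
            issues.append((i, "implausible age"))
        try:
            float(row["score"])
        except ValueError:
            issues.append((i, "non-numeric score"))
    return issues

for line, problem in scan(rows):
    print(f"row {line}: {problem}")
```

In practice such checks would run over the full incoming file (for example via `csv.DictReader`) before any descriptive statistics are computed.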