688 results for "Heffernan, Neil"
Search Results
2. Using Auxiliary Data to Boost Precision in the Analysis of A/B Tests on an Online Educational Platform: New Data and New Results
- Author
Sales, Adam C., Prihar, Ethan B., Gagnon-Bartsch, Johann A., and Heffernan, Neil T.
- Abstract
Randomized A/B tests within online learning platforms represent an exciting direction in learning sciences. With minimal assumptions, they allow causal effect estimation without confounding bias and exact statistical inference even in small samples. However, often experimental samples and/or treatment effects are small, A/B tests are underpowered, and effect estimates are overly imprecise. Recent methodological advances have shown that power and statistical precision can be substantially boosted by coupling design-based causal estimation to machine-learning models of rich log data from historical users who were not in the experiment. Estimates using these techniques remain unbiased and inference remains exact without any additional assumptions. This paper reviews those methods and applies them to a new dataset including over 250 randomized A/B comparisons conducted within ASSISTments, an online learning platform. We compare results across experiments using four novel deep-learning models of auxiliary data and show that incorporating auxiliary data into causal estimates is roughly equivalent to increasing the sample size by 20% on average, or as much as 50-80% in some cases, relative to t-tests, and by about 10% on average, or as much as 30-50%, compared to cutting-edge machine learning unbiased estimates that use only data from the experiments. We show that the gains can be even larger for estimating subgroup effects, hold even when the remnant is unrepresentative of the A/B test sample, and extend to post-stratification population effects estimators.
- Published
- 2023
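The method reviewed in item 2 hinges on a simple mechanism: train a model on historical ("remnant") users who were never in the experiment, predict each experimental student's outcome, and run the usual design-based estimator on the prediction residuals. A minimal sketch of that residualization step, with function and variable names of my own choosing:

```python
import numpy as np

def residualized_ate(y, z, y_hat):
    """Difference-in-means treatment effect estimate computed on residuals.

    y     -- observed outcomes for students in the A/B test
    z     -- 0/1 randomized treatment assignment
    y_hat -- predictions from a model trained ONLY on historical,
             non-experimental users, so the adjustment cannot leak
             treatment information into the estimate
    """
    r = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    z = np.asarray(z)
    t, c = r[z == 1], r[z == 0]
    ate = t.mean() - c.mean()
    se = np.sqrt(t.var(ddof=1) / t.size + c.var(ddof=1) / c.size)
    return ate, se
```

Because y_hat is a fixed function of pre-treatment data, randomization alone keeps the estimate unbiased regardless of model quality; a better model simply shrinks the residual variance and hence the standard error.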
3. How to Open Science: Debugging Reproducibility within the Educational Data Mining Conference
- Author
Haim, Aaron, Gyurcsan, Robert, Baxter, Chris, Shaw, Stacy T., and Heffernan, Neil T.
- Abstract
Despite increased efforts to assess the adoption rates of open science and robustness of reproducibility in sub-disciplines of education technology, there is a lack of understanding of why some research is not reproducible. Prior work has taken the first step toward assessing reproducibility of research, but has assumed certain constraints which hinder its discovery. Thus, the purpose of this study was to replicate previous work on papers within the proceedings of the "International Conference on Educational Data Mining" to accurately report on which papers are reproducible and why. Specifically, we examined 208 papers, attempted to reproduce them, documented reasons for reproducibility failures, and asked authors to provide additional information needed to reproduce their study. Our results showed that out of 12 papers that were potentially reproducible, only one successfully reproduced all analyses, and another two reproduced most of the analyses. The most common cause of failure was unstated library dependencies, followed by non-seeded randomness. [For the complete proceedings, see ED630829. Additional funding for this paper was provided by the U.S. Department of Education's Graduate Assistance in Areas of National Need (GAANN).]
- Published
- 2023
4. Modeling and Analyzing Scorer Preferences in Short-Answer Math Questions
- Author
Zhang, Mengxue, Heffernan, Neil, and Lan, Andrew
- Abstract
Automated scoring of student responses to open-ended questions, including short-answer questions, has great potential to scale to a large number of responses. Recent approaches for automated scoring rely on supervised learning, i.e., training classifiers or fine-tuning language models on a small number of responses with human-provided score labels. However, since scoring is a subjective process, these human scores are noisy and can be highly variable, depending on the scorer. In this paper, we investigate a collection of models that account for the individual preferences and tendencies of each human scorer in the automated scoring task. We apply these models to a short-answer math response dataset where each response is scored (often differently) by multiple different human scorers. We conduct quantitative experiments to show that our scorer models lead to improved automated scoring accuracy. We also conduct quantitative experiments and case studies to analyze the individual preferences and tendencies of scorers. We found that scorers can be grouped into several clear clusters, each with distinct features, which we analyze in detail. [For the complete proceedings, see ED630829.]
- Published
- 2023
5. Knowledge Tracing over Time: A Longitudinal Analysis
- Author
Lee, Morgan P., Croteau, Ethan, Gurung, Ashish, Botelho, Anthony F., and Heffernan, Neil T.
- Abstract
The use of Bayesian Knowledge Tracing (BKT) models in predicting student learning and mastery, especially in mathematics, is a well-established and proven approach in learning analytics. In this work, we report on our analysis examining the generalizability of BKT models across academic years, a degradation often attributed to "detector rot." We assess the generalizability of Knowledge Tracing (KT) models by comparing model performance in predicting student knowledge within an academic year and across academic years. Models were trained on data from two popular open-source curricula available through Open Educational Resources. We observed that the models were generally highly performant in predicting student learning within an academic year, although some academic years generalized better than others. We posit that Knowledge Tracing models are relatively stable in performance across academic years yet can still be susceptible to systemic changes and shifts in underlying learner behavior. As indicated by the evidence in this paper, we posit that learning platforms leveraging KT models need to be mindful of systemic changes or drastic changes in certain user demographics. [For the complete proceedings, see ED630829. Additional funding was provided by the U.S. Department of Education's Graduate Assistance in Areas of National Need (GAANN) program.]
- Published
- 2023
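For readers unfamiliar with the model underlying item 5, a single BKT step combines a Bayesian update on the observed response with a learning transition. A schematic implementation (parameter values are illustrative, not the paper's fitted values):

```python
def bkt_step(p_know, correct, slip=0.10, guess=0.20, learn=0.15):
    """One Bayesian Knowledge Tracing update: condition the mastery
    probability on the observed response, then apply the learning transition."""
    if correct:
        evidence = p_know * (1 - slip)
        posterior = evidence / (evidence + (1 - p_know) * guess)
    else:
        evidence = p_know * slip
        posterior = evidence / (evidence + (1 - p_know) * (1 - guess))
    return posterior + (1 - posterior) * learn

# Trace mastery across a response sequence (1 = correct, 0 = incorrect).
p = 0.30  # p(L0), prior probability of mastery
for response in [1, 0, 1, 1]:
    p = bkt_step(p, response)
```

"Detector rot" would show up as parameters fitted on one academic year predicting responses less well when applied to a later year's data.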
6. Auto-Scoring Student Responses with Images in Mathematics
- Author
Baral, Sami, Botelho, Anthony, Santhanam, Abhishek, Gurung, Ashish, Cheng, Li, and Heffernan, Neil
- Abstract
Teachers often rely on the use of a range of open-ended problems to assess students' understanding of mathematical concepts. Beyond traditional conceptions of student open-ended work, commonly in the form of textual short-answer or essay responses, the use of figures, tables, number lines, graphs, and pictographs are other examples of open-ended work common in mathematics. While recent developments in areas of natural language processing and machine learning have led to automated methods to score student open-ended work, these methods have largely been limited to textual answers. Several computer-based learning systems allow students to take pictures of hand-written work and include such images within their answers to open-ended questions. However, few if any existing solutions support the auto-scoring of student hand-written or drawn answers to questions. In this work, we build upon an existing method for auto-scoring textual student answers and explore the use of OpenAI/CLIP, a deep learning embedding method designed to represent both images and text, as well as Optical Character Recognition (OCR) to improve model performance. We evaluate the performance of our method on a dataset of student open responses that contains both text- and image-based responses, and find a reduction of model error in the presence of images when controlling for other answer-level features. [For the complete proceedings, see ED630829.]
- Published
- 2023
7. Effective Evaluation of Online Learning Interventions with Surrogate Measures
- Author
Prihar, Ethan, Vanacore, Kirk, Sales, Adam, and Heffernan, Neil
- Abstract
There is a growing need to empirically evaluate the quality of online instructional interventions at scale. In response, some online learning platforms have begun to implement rapid A/B testing of instructional interventions. In these scenarios, students participate in a series of randomized experiments that evaluate problem-level interventions in quick succession, which makes it difficult to discern the effect of any particular intervention on their learning. Therefore, distal measures of learning such as posttests may not provide a clear understanding of which interventions are effective, which can lead to slow adoption of new instructional methods. To help discern the effectiveness of instructional interventions, this work uses data from 26,060 clickstream sequences of students across 31 different online educational experiments exploring 51 different research questions, together with the students' posttest scores, to create and analyze different proximal surrogate measures of learning that can be used at the problem level. Through feature engineering and deep learning approaches, next-problem correctness was determined to be the best surrogate measure. As more data from online educational experiments are collected, model-based surrogate measures can be improved, but for now, next-problem correctness is an empirically effective proximal surrogate measure of learning for analyzing rapid problem-level experiments. [For the complete proceedings, see ED630829. Additional funding for this paper was provided by the U.S. Department of Education's Graduate Assistance in Areas of National Need (GAANN).]
- Published
- 2023
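Item 7's conclusion, that next-problem correctness is an effective proximal surrogate, implies a very lightweight analysis loop for rapid problem-level experiments. A sketch under assumed array names (mine, not the paper's):

```python
import numpy as np

def surrogate_effect(z, next_correct):
    """Two-proportion comparison of next-problem correctness by condition.

    z            -- 0/1 condition assignment per student
    next_correct -- 0/1 correctness on the problem following the intervention
    """
    z = np.asarray(z)
    y = np.asarray(next_correct, dtype=float)
    p1, p0 = y[z == 1].mean(), y[z == 0].mean()
    n1, n0 = (z == 1).sum(), (z == 0).sum()
    pooled = y.mean()
    se = np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    return p1 - p0, (p1 - p0) / se  # effect and normal-approximation z-score
```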
8. Math-LLMs: AI Cyberinfrastructure with Pre-trained Transformers for Math Education
- Author
Zhang, Fan, Li, Chenglu, Henkel, Owen, Xing, Wanli, Baral, Sami, Heffernan, Neil, and Li, Hai
- Published
- 2024
- Full Text
- View/download PDF
9. Exploring Common Trends in Online Educational Experiments
- Author
Prihar, Ethan, Syed, Manaal, Ostrow, Korinn, Shaw, Stacy, Sales, Adam, and Heffernan, Neil
- Abstract
As online learning platforms become more ubiquitous throughout various curricula, there is a growing need to evaluate the effectiveness of these platforms and the different methods used to structure online education and tutoring. Towards this endeavor, some platforms have performed randomized controlled experiments to compare different user experiences, curriculum structures, and tutoring strategies in order to ensure the effectiveness of their platform and personalize the education of the students using it. These experiments are typically analyzed on an individual basis in order to reveal insights on a specific aspect of students' online educational experience. In this work, the data from 50,752 instances of 30,408 students participating in 50 different experiments conducted at scale within the online learning platform ASSISTments were aggregated and analyzed for consistent trends across experiments. By combining common experimental conditions and normalizing the dependent measures between experiments, this work has identified multiple statistically significant insights on the impact of various skill mastery requirements, strategies for personalization, and methods for tutoring in an online setting. This work can help direct further experimentation and inform the design and improvement of new and existing online learning platforms. The anonymized data compiled for this work are hosted by the Open Science Foundation and can be found at https://osf.io/59shv/. [This paper was published in: "Proceedings of the 15th International Conference on Educational Data Mining," edited by A. Mitrovic and N. Bosch, International Educational Data Mining Society, 2022, pp. 27-38.]
- Published
- 2022
- Full Text
- View/download PDF
10. Student Perception on the Effectiveness of On-Demand Assistance in Online Learning Platforms
- Author
Haim, Aaron and Heffernan, Neil T.
- Abstract
Studies have shown that on-demand assistance, additional instruction given on a problem per student request, improves student learning in online learning environments. Students may have opinions on whether an assistance was effective at improving student learning. As students are the driving force behind the effectiveness of assistance, there could exist a correlation between students' perceptions of effectiveness and the computed effectiveness of the assistance. This work conducts a survey asking secondary education students whether a given assistance is effective in solving a problem in an online learning platform. It then provides a cursory glance at the data to view whether a correlation exists between student perception and the measured effectiveness of an assistance. Over a three-year period, approximately 22,000 responses were collected from nearly 4,400 students. Initial analyses of the survey suggest no significant relationship between student perception and the computed effectiveness of an assistance, regardless of whether the student participated in the survey. All data and analysis conducted can be found on the Open Science Foundation website. [This paper was published in: "Proceedings of the 15th International Conference on Educational Data Mining," edited by A. Mitrovic and N. Bosch, International Educational Data Mining Society, 2022, pp. 734-37.]
- Published
- 2022
- Full Text
- View/download PDF
11. Automatic Interpretable Personalized Learning
- Author
Prihar, Ethan, Haim, Aaron, Sales, Adam, and Heffernan, Neil
- Abstract
Personalized learning stems from the idea that students benefit from instructional material tailored to their needs. Many online learning platforms purport to implement some form of personalized learning, often through on-demand tutoring or self-paced instruction, but to our knowledge none have a way to automatically explore for specific opportunities to personalize students' education nor a transparent way to identify the effects of personalization on specific groups of students. In this work we present the Automatic Personalized Learning Service (APLS). The APLS uses multi-armed bandit algorithms to recommend the most effective support to each student that requests assistance when completing their online work, and is currently used by ASSISTments, an online learning platform. The first empirical study of the APLS found that Beta-Bernoulli Thompson Sampling, a popular and effective multi-armed bandit algorithm, was only slightly more capable of selecting helpful support than randomly selecting from the relevant support options. Therefore, we also present Decision Tree Thompson Sampling (DTTS), a novel contextual multi-armed bandit algorithm that integrates the transparency and interpretability of decision trees into Thompson Sampling. In simulation, DTTS overcame the challenges of recommending support within an online learning platform and was able to increase students' learning by as much as 10% more than the current algorithm used by the APLS. We demonstrate that DTTS is able to identify qualitative interactions that not only help determine the most effective support for students, but that also generalize well to new students, problems, and support content. The APLS using DTTS is now being deployed at scale within ASSISTments and is a promising tool for all educational learning platforms. [This paper was published in: "Proceedings of the Ninth ACM Conference on Learning @ Scale (L@S '22), June 1-3, 2022, New York City, NY, USA," ACM, 2022.]
- Published
- 2022
- Full Text
- View/download PDF
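As context for item 11, the Beta-Bernoulli Thompson Sampling baseline the APLS started from can be written in a few lines; DTTS then replaces the flat per-arm posteriors with a decision tree over student and problem context. A minimal sketch of the baseline only:

```python
import random

class BetaBernoulliTS:
    """Thompson Sampling over K support options with Beta(1, 1) priors."""
    def __init__(self, n_arms):
        self.wins = [1] * n_arms    # Beta alpha parameters
        self.losses = [1] * n_arms  # Beta beta parameters

    def choose(self):
        # Sample one plausible success rate per arm; play the best sample.
        draws = [random.betavariate(w, l) for w, l in zip(self.wins, self.losses)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, reward):
        # reward is binary, e.g. next-problem correctness after the support.
        if reward:
            self.wins[arm] += 1
        else:
            self.losses[arm] += 1
```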
12. Automatic Short Math Answer Grading via In-Context Meta-Learning
- Author
Zhang, Mengxue, Baral, Sami, Heffernan, Neil, and Lan, Andrew
- Abstract
Automatic short answer grading is an important research direction in the exploration of how to use artificial intelligence (AI)-based tools to improve education. Current state-of-the-art approaches use neural language models to create vectorized representations of students' responses, followed by classifiers to predict the score. However, these approaches have several key limitations, including i) they use pre-trained language models that are not well-adapted to educational subject domains and/or student-generated text and ii) they almost always train one model per question, ignoring the linkage across questions and resulting in a significant model storage problem due to the size of advanced language models. In this paper, we study the problem of automatic short answer grading for students' responses to math questions and propose a novel framework for this task. First, we use MathBERT, a variant of the popular language model BERT adapted to mathematical content, as our base model and fine-tune it on the downstream task of student response grading. Second, we use an in-context learning approach that provides scoring examples as input to the language model to provide additional context information and promote generalization to previously unseen questions. We evaluate our framework on a real-world dataset of student responses to open-ended math questions and show that our framework (often significantly) outperforms existing approaches, especially for new questions that are not seen during training. [For the full proceedings, see ED623995.]
- Published
- 2022
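The in-context approach in item 12 amounts to packing scored exemplars into the model input alongside the response to be graded, so one fine-tuned model can serve unseen questions. A sketch of the input assembly only (the separator and field labels here are assumptions, not the paper's exact format):

```python
def build_scoring_input(query_response, scored_examples, sep=" [SEP] "):
    """Concatenate (response, score) exemplars with the query response,
    forming the text a fine-tuned MathBERT-style classifier would consume."""
    parts = [f"response: {r} score: {s}" for r, s in scored_examples]
    parts.append(f"response: {query_response} score:")
    return sep.join(parts)

examples = [("2x + 3 = 7, so x = 2", 4), ("x = 7 - 3", 2)]
print(build_scoring_input("subtract 3 then divide by 2, x = 2", examples))
```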
13. Automatic Short Answer Grading in College Mathematics Using In-Context Meta-learning: An Evaluation of the Transferability of Findings
- Author
Smalenberger, Michael, Sohrabi, Elham, Zhang, Mengxue, Baral, Sami, Smalenberger, Kelly, Lan, Andrew, and Heffernan, Neil
- Published
- 2024
- Full Text
- View/download PDF
14. Promoting Open Science in Artificial Intelligence: An Interactive Tutorial on Licensing, Data, and Containers
- Author
Haim, Aaron, Hutt, Stephen, Shaw, Stacy T., and Heffernan, Neil T.
- Published
- 2024
- Full Text
- View/download PDF
15. The Effectiveness of AI Generated, On-Demand Assistance Within Online Learning Platforms
- Author
Haim, Aaron, Worden, Eamon, and Heffernan, Neil T.
- Published
- 2024
- Full Text
- View/download PDF
16. Can Large Language Models Generate Middle School Mathematics Explanations Better Than Human Teachers?
- Author
Wang, Allison, Prihar, Ethan, Haim, Aaron, and Heffernan, Neil
- Published
- 2024
- Full Text
- View/download PDF
17. Comparing Different Approaches to Generating Mathematics Explanations Using Large Language Models
- Author
Prihar, Ethan, Lee, Morgan, Hopman, Mia, Tauman Kalai, Adam, Vempala, Sofia, Wang, Allison, Wickline, Gabriel, Murray, Aly, and Heffernan, Neil
- Abstract
Large language models have recently been able to perform well in a wide variety of circumstances. In this work, we explore the ability of large language models, specifically GPT-3, to write explanations for middle-school mathematics problems, with the goal of eventually using this process to rapidly generate explanations for the mathematics problems of new curricula as they emerge, shortening the time to integrate new curricula into online learning platforms. To generate explanations, two approaches were taken. The first approach attempted to summarize the salient advice in tutoring chat logs between students and live tutors. The second approach attempted to generate explanations using few-shot learning from explanations written by teachers for similar mathematics problems. After explanations were generated, a survey was used to compare their quality to that of explanations written by teachers. We test our methodology using the GPT-3 language model. Ultimately, the synthetic explanations were unable to outperform teacher-written explanations. In the future, more powerful large language models may be employed, and GPT-3 may still be effective as a tool to augment teachers' process for writing explanations, rather than as a tool to replace them. The explanations, survey results, analysis code, and a dataset of tutoring chat logs are all available at https://osf.io/wh5n9/.
- Published
- 2023
18. Leveraging Natural Language Processing to Support Automated Assessment and Feedback for Student Open Responses in Mathematics
- Author
Botelho, Anthony, Baral, Sami, Erickson, John A., Benachamardi, Priyanka, and Heffernan, Neil T.
- Abstract
Background: Teachers often rely on the use of open-ended questions to assess students' conceptual understanding of assigned content. Particularly in the context of mathematics, teachers use these types of questions to gain insight into the processes and strategies adopted by students in solving mathematical problems beyond what is possible through more close-ended problem types. While these types of problems are valuable to teachers, the variation in student responses to these questions makes it difficult, and time-consuming, to evaluate and provide directed feedback. It is a well-studied concept that feedback, both in terms of a numeric score but more importantly in the form of teacher-authored comments, can help guide students as to how to improve, leading to increased learning. It is for this reason that teachers need better support not only in assessing students' work but also in providing meaningful and directed feedback to students. Objectives: In this paper, we seek to develop, evaluate, and examine machine learning models that support automated open response assessment and feedback. Methods: We build upon the prior research in the automatic assessment of student responses to open-ended problems and introduce a novel approach that leverages student log data combined with machine learning and natural language processing methods. Utilizing sentence-level semantic representations of student responses to open-ended questions, we propose a collaborative filtering-based approach to both predict student scores as well as recommend appropriate feedback messages for teachers to send to their students. Results and Conclusion: We find that our method outperforms previously published benchmarks across three different metrics for the task of predicting student performance. Through an error analysis, we identify several areas where future works may be able to improve upon our approach.
- Published
- 2023
- Full Text
- View/download PDF
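Item 18's pipeline embeds each response at the sentence level and scores it by comparison with already-graded responses. The nearest-neighbor flavor of that idea can be sketched as follows (the paper's learned collaborative-filtering model is more sophisticated, and it also ranks candidate feedback messages):

```python
import numpy as np

def similarity_score(query_vec, graded_vecs, graded_scores, k=5):
    """Predict a score as the similarity-weighted mean of the k most
    similar graded responses, using cosine similarity of embeddings."""
    graded_scores = np.asarray(graded_scores, dtype=float)
    sims = graded_vecs @ query_vec / (
        np.linalg.norm(graded_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    top = np.argsort(sims)[-k:]
    weights = np.clip(sims[top], 0.0, None) + 1e-9
    return float(np.average(graded_scores[top], weights=weights))
```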
19. Toward Personalizing Students' Education with Crowdsourced Tutoring
- Author
Prihar, Ethan, Patikorn, Thanaporn, Botelho, Anthony, Sales, Adam, and Heffernan, Neil T.
- Abstract
As more educators integrate their curricula with online learning, it is easier to crowdsource content from them. Crowdsourced tutoring has been proven to reliably increase students' next problem correctness. In this work, we confirmed the findings of a previous study in this area, with stronger confidence margins than previously, and revealed that only a portion of crowdsourced content creators had a reliable benefit to students. Furthermore, this work provides a method to rank content creators relative to each other, which was used to determine which content creators were most effective overall, and which content creators were most effective for specific groups of students. When exploring data from TeacherASSIST, a feature within the ASSISTments learning platform that crowdsources tutoring from teachers, we found that while overall this program provides a benefit to students, some teachers created more effective content than others. Despite this finding, we did not find evidence that the effectiveness of content reliably varied by student knowledge-level, suggesting that the content is unlikely to be suitable for personalizing instruction based on student knowledge alone. These findings are promising for the future of crowdsourced tutoring as they help provide a foundation for assessing the quality of crowdsourced content and investigating content for opportunities to personalize students' education. [This paper was published in: "L@S '21, June 22-25, 2021, Virtual Event, Germany," ACM, 2021.]
- Published
- 2021
- Full Text
- View/download PDF
20. The Effect of an Intelligent Tutor on Performance on Specific Posttest Problems
- Author
Sales, Adam, Prihar, Ethan, Heffernan, Neil, and Pane, John F.
- Abstract
This paper drills deeper into the documented effects of the Cognitive Tutor Algebra I and ASSISTments intelligent tutoring systems by estimating their effects on specific problems. We start by describing a multilevel Rasch-type model that facilitates testing for differences in the effects between problems and precise problem-specific effect estimation without the need for multiple comparisons corrections. We find that the effects of both intelligent tutors vary between problems: the effects are positive for some, negative for others, and undeterminable for the rest. Next, we explore hypotheses explaining why effects might be larger for some problems than for others. In the case of ASSISTments, there is no evidence that problems that are more closely related to students' work in the tutor displayed larger treatment effects. [For the full proceedings, see ED615472.]
- Published
- 2021
21. Improving Automated Scoring of Student Open Responses in Mathematics
- Author
Baral, Sami, Botelho, Anthony F., Erickson, John A., Benachamardi, Priyanka, and Heffernan, Neil T.
- Abstract
Open-ended questions in mathematics are commonly used by teachers to monitor and assess students' deeper conceptual understanding of content. Student answers to these types of questions often exhibit a combination of language, drawn diagrams and tables, and mathematical formulas and expressions that supply teachers with insight into the processes and strategies adopted by students in formulating their responses. While these student responses help to inform teachers on their students' progress and understanding, the amount of variation in these responses can make it difficult and time-consuming for teachers to manually read, assess, and provide feedback to student work. For this reason, there has been a growing body of research in developing AI-powered tools to support teachers in this task. This work seeks to build upon this prior research by introducing a model that is designed to help automate the assessment of student responses to open-ended questions in mathematics through sentence-level semantic representations. We find that this model outperforms previously-published benchmarks across three different metrics. With this model, we conduct an error analysis to examine characteristics of student responses that may be considered to further improve the method. [For the full proceedings, see ED615472.]
- Published
- 2021
22. A Novel Algorithm for Aggregating Crowdsourced Opinions
- Author
Prihar, Ethan and Heffernan, Neil
- Abstract
Similar content has tremendous utility in classroom and online learning environments. For example, similar content can be used to combat cheating, track students' learning over time, and model students' latent knowledge. These different use cases for similar content all rely on different notions of similarity, which makes it difficult to determine contents' similarities. Crowdsourcing is an effective way to identify similar content in a variety of situations by providing workers with guidelines on how to identify similar content for a particular use case. However, crowdsourced opinions are rarely homogeneous and therefore must be aggregated into what is most likely the truth. This work presents the Dynamically Weighted Majority Vote method, a novel algorithm that combines aggregating workers' crowdsourced opinions with estimating the reliability of each worker. This method was compared to the traditional majority vote method in both a simulation study and an empirical study, in which opinions on seventh grade mathematics problems' similarity were crowdsourced from middle school math teachers and college students. In both the simulation and the empirical study, the Dynamically Weighted Majority Vote method outperformed the traditional majority vote method, suggesting that this method should be used instead of majority vote in future crowdsourcing endeavors. [For the full proceedings, see ED615472.]
- Published
- 2021
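Item 22's algorithm couples two estimates, the consensus label and each worker's reliability, that are each defined in terms of the other. One natural way to sketch that coupling is alternating updates (the paper's exact weighting scheme may differ):

```python
import numpy as np

def weighted_consensus(votes, n_iters=10):
    """Alternate between (a) a weighted majority vote and (b) re-weighting
    each worker by agreement with the current consensus.

    votes -- (n_workers, n_items) array of binary opinions, no missing data
    """
    votes = np.asarray(votes)
    weights = np.ones(votes.shape[0])
    for _ in range(n_iters):
        consensus = (np.average(votes, axis=0, weights=weights) >= 0.5).astype(int)
        weights = (votes == consensus).mean(axis=1) + 1e-9  # worker reliability
    return consensus, weights
```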
23. Implementing and Evaluating ASSISTments Online Math Homework Support at Large Scale over Two Years: Findings and Lessons Learned
- Author
Feng, Mingyu, Heffernan, Neil, Collins, Kelly, Heffernan, Cristina, and Murphy, Robert F.
- Abstract
Math performance continues to be an important focus for improvement. The most recent National Report Card in the U.S. suggested student math scores declined in the past two years, possibly due to the COVID-19 pandemic and related school closures. We report on the implementation of a math homework program that leverages AI-based one-to-one technology in 32 schools for two years as part of a randomized controlled trial in diverse settings of the state of North Carolina in the US. The program, called "ASSISTments," provides feedback to students as they solve homework problems and automatically prepares reports for teachers about student performance on daily assignments. The paper describes the sample, the study design, and the implementation of the intervention, including the recruitment effort, the training and support provided to teachers, and the approaches taken to assess teachers' progress and improve implementation fidelity. Analysis of data collected during the study suggests that (a) treatment teachers changed their homework review practices as they used ASSISTments, and (b) the usage of ASSISTments was positively correlated with student learning outcomes.
- Published
- 2023
24. Effectiveness of Crowd-Sourcing On-Demand Assistance from Teachers in Online Learning Platforms
- Author
Patikorn, Thanaporn and Heffernan, Neil T.
- Abstract
It has been shown in multiple studies that expert-created on-demand assistance, such as hint messages, improves student learning in online learning environments. However, there is also evidence that certain types of assistance may be detrimental to student learning. In addition, creating and maintaining on-demand assistance is hard and time-consuming. In the 2017-2018 academic year, 132,738 distinct problems were assigned inside ASSISTments, but only 38,194 of those problems had on-demand assistance. In order to take on-demand assistance to scale, we needed a system that is able to gather new on-demand assistance and allows us to test and measure its effectiveness. Thus, we designed and deployed TeacherASSIST inside ASSISTments. TeacherASSIST allowed teachers to create on-demand assistance for any problem as they assigned problems to their students. TeacherASSIST then redistributed on-demand assistance by one teacher to students outside of their classrooms. We found that teachers inside ASSISTments had created 40,292 new instances of assistance for 25,957 different problems in three years. There were 14 teachers who created more than 1,000 instances of on-demand assistance. We also conducted two large-scale randomized controlled experiments to investigate how on-demand assistance created by one teacher affected students outside of their classes. Students who received on-demand assistance on one problem showed a statistically significant improvement in performance on the next problem. The students' improvement in this experiment confirmed our hypothesis that crowd-sourced on-demand assistance was sufficient in quality to improve student learning, allowing us to take on-demand assistance to scale. [This paper was published in: "L@S '20, August 12-14, 2020, Virtual Event, USA," ACM, 2020, pp. 115-124.]
- Published
- 2020
- Full Text
- View/download PDF
25. Reinforcement Learning for Education: Opportunities and Challenges
- Author
Singla, Adish, Rafferty, Anna N., Radanovic, Goran, and Heffernan, Neil T.
- Subjects
Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract
This survey article has grown out of the RL4ED workshop organized by the authors at the Educational Data Mining (EDM) 2021 conference. We organized this workshop as part of a community-building effort to bring together researchers and practitioners interested in the broad areas of reinforcement learning (RL) and education (ED). This article aims to provide an overview of the workshop activities and summarize the main research directions in the area of RL for ED.
- Published
- 2021
26. MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education
- Author
Shen, Jia Tracy, Yamashita, Michiharu, Prihar, Ethan, Heffernan, Neil, Wu, Xintao, Graff, Ben, and Lee, Dongwon
- Subjects
Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract
Since the introduction of the original BERT (i.e., BASE BERT), researchers have developed various customized BERT models with improved performance for specific domains and tasks by exploiting the benefits of transfer learning. Due to the nature of mathematical texts, which often use domain-specific vocabulary along with equations and math symbols, we posit that the development of a new BERT model for mathematics would be useful for many mathematical downstream tasks. In this resource paper, we introduce our multi-institutional effort (i.e., two learning platforms and three academic institutions in the US) toward this need: MathBERT, a model created by pre-training the BASE BERT model on a large mathematical corpus ranging from pre-kindergarten (pre-k), to high-school, to college graduate level mathematical content. In addition, we select three general NLP tasks that are often used in mathematics education: prediction of knowledge component, auto-grading open-ended Q&A, and knowledge tracing, to demonstrate the superiority of MathBERT over BASE BERT. Our experiments show that MathBERT outperforms prior best methods by 1.2-22% and BASE BERT by 2-8% on these tasks. In addition, we build a mathematics-specific vocabulary 'mathVocab' to train with MathBERT. We discover that MathBERT pre-trained with 'mathVocab' outperforms MathBERT trained with the BASE BERT vocabulary (i.e., 'origVocab'). MathBERT is currently being adopted at the participating learning platforms: Stride, Inc., a commercial educational resource provider, and ASSISTments.org, a free online educational platform. We release MathBERT for public usage at: https://github.com/tbs17/MathBERT., Comment: Accepted by NeurIPS 2021 MATHAI4ED Workshop (Best Paper)
- Published
- 2021
27. Classifying Math KCs via Task-Adaptive Pre-Trained BERT
- Author
Shen, Jia Tracy, Yamashita, Michiharu, Prihar, Ethan, Heffernan, Neil, Wu, Xintao, McGrew, Sean, and Lee, Dongwon
- Subjects
Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract
Educational content labeled with proper knowledge components (KCs) is particularly useful to teachers or content organizers. However, manually labeling educational content is labor intensive and error-prone. To address this challenge, prior research proposed machine learning based solutions to auto-label educational content with limited success. In this work, we significantly improve prior research by (1) expanding the input types to include KC descriptions, instructional video titles, and problem descriptions (i.e., three types of prediction task), (2) doubling the granularity of the prediction from 198 to 385 KC labels (i.e., a more practical setting but a much harder multinomial classification problem), (3) improving the prediction accuracies by 0.5-2.3% using Task-adaptive Pre-trained BERT, outperforming six baselines, and (4) proposing a simple evaluation measure by which we can recover 56-73% of mispredicted KC labels. All codes and data sets in the experiments are available at: https://github.com/tbs17/TAPT-BERT
- Published
- 2021
28. Precise Unbiased Estimation in Randomized Experiments using Auxiliary Observational Data
- Author
Gagnon-Bartsch, Johann A., Sales, Adam C., Wu, Edward, Botelho, Anthony F., Erickson, John A., Miratrix, Luke W., and Heffernan, Neil T.
- Subjects
Statistics - Applications
- Abstract
Randomized controlled trials (RCTs) are increasingly prevalent in education research, and are often regarded as a gold standard of causal inference. Two main virtues of randomized experiments are that they (1) do not suffer from confounding, thereby allowing for an unbiased estimate of an intervention's causal impact, and (2) allow for design-based inference, meaning that the physical act of randomization largely justifies the statistical assumptions made. However, RCT sample sizes are often small, leading to low precision; in many cases RCT estimates may be too imprecise to guide policy or inform science. Observational studies, by contrast, have strengths and weaknesses complementary to those of RCTs. Observational studies typically offer much larger sample sizes, but may suffer confounding. In many contexts, experimental and observational data exist side by side, allowing the possibility of integrating "big observational data" with "small but high-quality experimental data" to get the best of both. Such approaches hold particular promise in the field of education, where RCT sample sizes are often small due to cost constraints, but automatic collection of observational data, such as in computerized educational technology applications, or in state longitudinal data systems (SLDS) with administrative data on hundreds of thousands of students, has made rich, high-dimensional observational data widely available. We outline an approach that allows one to employ machine learning algorithms to learn from the observational data, and use the resulting models to improve precision in randomized experiments. Importantly, there is no requirement that the machine learning models are "correct" in any sense, and the final experimental results are guaranteed to be exactly unbiased. Thus, there is no danger of confounding biases in the observational data leaking into the experiment., Comment: Forthcoming in Journal of Causal Inference. Replication materials at https://osf.io/d9ujq/ . Results differ very slightly from previous versions due to changes made in the process of making the analysis replicable. For details, compare https://github.com/adamSales/rebarLoop/tree/ReplicateArxiv2-2023 (previous version) to https://github.com/adamSales/rebarLoop/tree/docker (current version)
- Published
- 2021
29. Investigating Patterns of Tone and Sentiment in Teacher Written Feedback Messages
- Author
Baral, Sami, Botelho, Anthony F., Santhanam, Abhishek, Gurung, Ashish, Erickson, John, and Heffernan, Neil T.
- Published
- 2023
- Full Text
- View/download PDF
30. How to Open Science: Promoting Principles and Reproducibility Practices Within the Artificial Intelligence in Education Community
- Author
Haim, Aaron, Shaw, Stacy T., and Heffernan, Neil T.
- Published
- 2023
- Full Text
- View/download PDF
31. More Powerful A/B Testing Using Auxiliary Data and Deep Learning
- Author
Sales, Adam C., Prihar, Ethan, Gagnon-Bartsch, Johann, Gurung, Ashish, and Heffernan, Neil T.
- Abstract
Randomized A/B tests allow causal estimation without confounding but are often under-powered. This paper uses a new dataset, including over 250 randomized comparisons conducted in an online learning platform, to illustrate a method combining data from A/B tests with log data from users who were not in the experiment. Inference remains exact and unbiased without additional assumptions, regardless of the deep-learning model's quality. In this dataset, incorporating auxiliary data improves precision consistently and, in some cases, substantially.
- Published
- 2022
- Full Text
- View/download PDF
32. Active Learning for Student Affect Detection
- Author
Yang, Tsung-Yen, Baker, Ryan S., Studer, Christoph, Heffernan, Neil, and Lan, Andrew S.
- Abstract
"Sensor-free" detectors of student affect that use only student activity data and no physical or physiological sensors are cost-effective and have potential to be applied at large scale in real classrooms. These detectors are trained using student affect labels collected from human observers as they observe students learn within intelligent tutoring systems (ITSs) in real classrooms. Due to the inherent diversity of student activity and affect dynamics, observing the affective states of some students at certain times is likely to be more informative to the affect detectors than observing others. Therefore, a carefully-crafted observation schedule may lead to more meaningful observations and improved affect detectors. In this paper, we investigate whether active (machine) learning methods, a family of methods that adaptively select the next most informative observation, can improve the efficiency of the affect label collection process. We study several existing active learning methods and also propose a new method that is ideally suited for the problem setting in affect detection. We conduct a series of experiments using a real-world student affect dataset collected in real classrooms deploying the ASSISTments ITS. Results show that some active learning methods can lead to high-quality affect detectors using only a small number of highly informative observations. We also discuss how to deploy active learning methods in real classrooms to improve the affect label collection process and thus sensor-free affect detectors. [For the full proceedings, see ED599096.]
- Published
- 2019
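The simplest member of the active learning family studied in item 32 is uncertainty sampling: always observe the student about whose affect the current detector is least certain. A sketch (the paper's proposed method additionally respects classroom observation constraints):

```python
import numpy as np

def most_uncertain_student(affect_probs):
    """Return the index of the student whose predicted affect distribution
    has maximum entropy, i.e. the most informative next observation.

    affect_probs -- (n_students, n_affect_states) predicted probabilities
    """
    affect_probs = np.asarray(affect_probs, dtype=float)
    entropy = -(affect_probs * np.log(affect_probs + 1e-12)).sum(axis=1)
    return int(entropy.argmax())
```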
33. Save Your Strokes: Chinese Handwriting Practice Makes for Ineffective Use of Instructional Time in Second Language Classrooms
- Author
-
Lu, Xiwen, Ostrow, Korinn S., and Heffernan, Neil T.
- Abstract
Handwriting practice is the most time-consuming activity for learners of Chinese as a foreign language (CFL). CFL instructors report allocating at least one third of their course time to handwriting practice although it prevents students from engaging in meaningful communication, especially in the earliest stages of learning. Given that the amount of time students spend in a college course is relatively fixed, the preregistered study presented herein examines the best use of students' time when the primary goals are word acquisition and communication. This work replicates a pilot study examining CFL word recognition in an online learning environment (ASSISTments) and the effects of supplemental handwriting practice. We examined word acquisition and recognition while manipulating condition (no-handwriting practice and with-handwriting practice) and posttest test point (1 [immediate], 2 [1 day delay], and 3 [1 week delay]). Two-way repeated measures analyses of variance revealed significant main effects for both condition and posttest test point in online and on-paper measures of word recognition and handwriting. Potential implications for CFL instruction and directions for future work are discussed.
- Published
- 2019
34. Precise unbiased estimation in randomized experiments using auxiliary observational data
- Author
Gagnon-Bartsch, Johann A., Sales, Adam C., Wu, Edward, Botelho, Anthony F., Erickson, John A., Miratrix, Luke W., and Heffernan, Neil T.
- Subjects
education research; A/B testing; data integration; 62-08; 62D20; 62P25; Mathematics; QA1-939; Probabilities. Mathematical statistics; QA273-280
- Abstract
Randomized controlled trials (RCTs) admit unconfounded design-based inference – randomization largely justifies the assumptions underlying statistical effect estimates – but often have limited sample sizes. However, researchers may have access to big observational data on covariates and outcomes from RCT nonparticipants. For example, data from A/B tests conducted within an educational technology platform exist alongside historical observational data drawn from student logs. We outline a design-based approach to using such observational data for variance reduction in RCTs. First, we use the observational data to train a machine learning algorithm predicting potential outcomes using covariates and then use that algorithm to generate predictions for RCT participants. Then, we use those predictions, perhaps alongside other covariates, to adjust causal effect estimates with a flexible, design-based covariate-adjustment routine. In this way, there is no danger of biases from the observational data leaking into the experimental estimates, which are guaranteed to be exactly unbiased regardless of whether the machine learning models are “correct” in any sense or whether the observational samples closely resemble RCT samples. We demonstrate the method in analyzing 33 randomized A/B tests and show that it decreases standard errors relative to other estimators, sometimes substantially.
- Published
- 2023
- Full Text
- View/download PDF
35. Achieving User-Side Fairness in Contextual Bandits
- Author
Huang, Wen, Labille, Kevin, Wu, Xintao, Lee, Dongwon, and Heffernan, Neil
- Subjects
Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract
Personalized recommendation based on multi-arm bandit (MAB) algorithms has been shown to lead to high utility and efficiency as it can dynamically adapt the recommendation strategy based on feedback. However, unfairness can arise in personalized recommendation. In this paper, we study how to achieve user-side fairness in personalized recommendation. We formulate our fair personalized recommendation as a modified contextual bandit and focus on achieving fairness for the individual who is being recommended an item, as opposed to achieving fairness for the items being recommended. We introduce and define a metric that captures the fairness in terms of rewards received for both the privileged and protected groups. We develop a fair contextual bandit algorithm, Fair-LinUCB, that improves upon the traditional LinUCB algorithm to achieve group-level fairness of users. Our algorithm detects and monitors unfairness while it learns to recommend personalized videos to students to achieve high efficiency. We provide a theoretical regret analysis and show that our algorithm has a slightly higher regret bound than LinUCB. We conduct numerous experimental evaluations to compare the performances of our fair contextual bandit to that of LinUCB and show that our approach achieves group-level fairness while maintaining a high utility., Comment: 12 pages
- Published
- 2020
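Fair-LinUCB in item 35 builds on the standard LinUCB arm, a ridge-regression reward model plus a confidence bonus. A sketch of that base arm follows; the group-fairness adjustment Fair-LinUCB layers on top is represented only as a placeholder argument, not the paper's actual mechanism:

```python
import numpy as np

class LinUCBArm:
    """One LinUCB arm: ridge regression of reward on context features."""
    def __init__(self, n_features, alpha=1.0):
        self.A = np.eye(n_features)    # X'X + I, regularized design matrix
        self.b = np.zeros(n_features)  # X'y
        self.alpha = alpha             # exploration strength

    def score(self, x, fairness_adjustment=0.0):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b         # ridge estimate of reward weights
        return theta @ x + self.alpha * np.sqrt(x @ A_inv @ x) + fairness_adjustment

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x
```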
36. Context-Aware Attentive Knowledge Tracing
- Author
Ghosh, Aritra, Heffernan, Neil, and Lan, Andrew S.
- Subjects
Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract
Knowledge tracing (KT) refers to the problem of predicting future learner performance given their past performance in educational applications. Recent developments in KT using flexible deep neural network-based models excel at this task. However, these models often offer limited interpretability, thus making them insufficient for personalized learning, which requires using interpretable feedback and actionable recommendations to help learners achieve better learning outcomes. In this paper, we propose attentive knowledge tracing (AKT), which couples flexible attention-based neural network models with a series of novel, interpretable model components inspired by cognitive and psychometric models. AKT uses a novel monotonic attention mechanism that relates a learner's future responses to assessment questions to their past responses; attention weights are computed using exponential decay and a context-aware relative distance measure, in addition to the similarity between questions. Moreover, we use the Rasch model to regularize the concept and question embeddings; these embeddings are able to capture individual differences among questions on the same concept without using an excessive number of parameters. We conduct experiments on several real-world benchmark datasets and show that AKT outperforms existing KT methods (by up to $6\%$ in AUC in some cases) on predicting future learner responses. We also conduct several case studies and show that AKT exhibits excellent interpretability and thus has potential for automated feedback and personalization in real-world educational settings., Comment: Published in KDD 2020
- Published
- 2020
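The distinctive piece of AKT in item 36 is its monotonic attention: similarity scores are discounted by an exponential function of temporal distance before normalization, so older interactions contribute less. In log space that is a subtraction, as this sketch shows (AKT's real distance measure is context-aware rather than raw position, and the decay rate is learned):

```python
import numpy as np

def decayed_attention(scores, distances, theta=0.5):
    """Softmax over similarity scores with exponential time decay:
    subtracting theta * distance before exponentiation multiplies each
    attention weight by exp(-theta * distance)."""
    logits = np.asarray(scores) - theta * np.asarray(distances)
    e = np.exp(logits - logits.max())  # numerically stabilized softmax
    return e / e.sum()
```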
37. Immediate text-based feedback timing on foreign language online assignments: How immediate should immediate feedback be?
- Author
Lu, Xiwen, Wang, Wei, Motz, Benjamin A., Ye, Weibing, and Heffernan, Neil T.
- Published
- 2023
- Full Text
- View/download PDF
38. The great challenges and opportunities of the next 20 years
- Author
Rodrigo, Maria Mercedes T., Vassileva, Julita, Lane, H. Chad, Brusilovsky, Peter, Sosnovsky, Sergey, Biswas, Gautam, Lester, James C., Mizoguchi, Riichiro, Prihar, Ethan, Heffernan, Neil, Mostow, Jack, Frasson, Claude, and Dimitrova, Vania
- Published
- 2023
- Full Text
- View/download PDF
39. Achieving User-Side Fairness in Contextual Bandits
- Author
Huang, Wen, Labille, Kevin, Wu, Xintao, Lee, Dongwon, and Heffernan, Neil
- Published
- 2022
- Full Text
- View/download PDF
40. Using Big Data to Sharpen Design-Based Inference in A/B Tests
- Author
Sales, Adam C., Botelho, Anthony, Patikorn, Thanaporn, and Heffernan, Neil T.
- Abstract
Randomized A/B tests in educational software are not run in a vacuum: often, reams of historical data are available alongside the data from a randomized trial. This paper proposes a method to use this historical data, which is often high-dimensional and longitudinal, to improve causal estimates from A/B tests. The method proceeds in three steps: first, fit a machine learning model to the historical data predicting students' outcomes as a function of their covariates. Then, use that model to predict the outcomes of the randomized students in the A/B test. Finally, use design-based methods to estimate the treatment effect in the A/B test, using prediction errors in place of outcomes. This method retains all of the advantages of design-based inference while, under certain conditions, yielding more precise estimators. This paper gives a theoretical condition under which the method improves statistical precision and demonstrates it using a deep learning algorithm to help estimate effects in a set of experiments run inside ASSISTments. [For the full proceedings, see ED593090.]
- Published
- 2018
41. Studying Affect Dynamics and Chronometry Using Sensor-Free Detectors
- Author
Botelho, Anthony F., Baker, Ryan S., Ocumpaugh, Jaclyn, and Heffernan, Neil T.
- Abstract
Student affect has been found to correlate with short- and long-term learning outcomes, including college attendance as well as interest and involvement in Science, Technology, Engineering, and Mathematics (STEM) careers. However, there still remain significant questions about the processes by which affect shifts and develops during the learning process. Much of this research can be split into affect dynamics, the study of the temporal transitions between affective states, and affective chronometry, the study of how an affect state emerges and dissipates over time. Thus far, these affective processes have been primarily studied using field observations, sensors, or student self-report measures; however, these approaches can be coarse, and obtaining finer-grained data produces challenges to data fidelity. Recent developments in sensor-free detectors of student affect, utilizing only the data from student interactions with a computer-based learning platform, open an opportunity to study affect dynamics and chronometry at moment-to-moment levels of granularity. This work presents a novel approach, applying sensor-free detectors to study these two prominent problems in affective research. [For the full proceedings, see ED593090.]
- Published
- 2018
42. Decision Tree Modeling of Wheel-Spinning and Productive Persistence in Skill Builders
- Author
Kai, Shimin, Almeda, Ma. Victoria, Baker, Ryan S., Heffernan, Cristina, and Heffernan, Neil
- Abstract
Research on non-cognitive factors has shown that persistence in the face of challenges plays an important role in learning. However, recent work on wheel-spinning, a type of unproductive persistence where students spend too much time struggling without achieving mastery of skills, shows that not all persistence is uniformly beneficial for learning. For this reason, it becomes increasingly pertinent to identify the key differences between unproductive and productive persistence toward informing interventions in computer-based learning environments. In this study, we use a classification model to distinguish between productive persistence and wheel-spinning in ASSISTments, an online math learning platform. Our results indicate that there are two types of students who wheel-spin: first, students who do not request any hints in at least one problem but request more than one bottom-out hint across any 8 problems in the problem set; second, students who never request two or more bottom-out hints across any 8 problems, do not request any hints in at least one problem, but who engage in relatively short delays between solving problems of the same skill. These findings suggest that encouraging students to both engage in spaced practice and use bottom-out hints sparingly is likely helpful for reducing their wheel-spinning and improving learning. These findings also provide insight on when students are struggling and how to make students' persistence more productive.
- Published
- 2018
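Item 42's classifier can be reproduced in spirit with a shallow decision tree over engineered behavior features. The features and toy values below are hypothetical stand-ins for the hint-usage and spacing features the abstract describes:

```python
from sklearn.tree import DecisionTreeClassifier

# Columns: no-hint problem indicator, bottom-out hints across 8 problems,
# mean delay between same-skill problems (seconds). Values are invented.
X = [[1, 3, 15.0], [0, 0, 600.0], [1, 2, 30.0], [0, 1, 480.0]]
y = [1, 0, 1, 0]  # 1 = wheel-spinning, 0 = productive persistence

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(tree.predict([[1, 4, 20.0]]))  # classify a new student-skill pair
```

A depth-limited tree keeps the decision rules human-readable, which is what lets the paper state its two wheel-spinning profiles in plain language.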
43. Human-Centered Learning Engineering for the Emerging Intelligence Augmentation Economy
- Author
Goodell, Jim and Heffernan, Neil
- Published
- 2022
- Full Text
- View/download PDF
44. Exploring Selective College Attendance and Middle School Cognitive and Non-cognitive Factors Within Computer-Based Math Learning
- Author
San Pedro, Maria Ofelia Z., Baker, Ryan S., Bowers, Alex J., and Heffernan, Neil T.
- Published
- 2022
- Full Text
- View/download PDF
45. Enhancing Auto-scoring of Student Open Responses in the Presence of Mathematical Terms and Expressions
- Author
Baral, Sami, Seetharaman, Karthik, Botelho, Anthony F., Wang, Anzhuo, Heineman, George, and Heffernan, Neil T.
- Published
- 2022
- Full Text
- View/download PDF
46. Deep Learning or Deep Ignorance? Comparing Untrained Recurrent Models in Educational Contexts
- Author
Botelho, Anthony F., Prihar, Ethan, and Heffernan, Neil T.
- Published
- 2022
- Full Text
- View/download PDF
47. Toward Improving Effectiveness of Crowdsourced, On-Demand Assistance from Educators in Online Learning Platforms
- Author
Haim, Aaron, Prihar, Ethan, and Heffernan, Neil T.
- Published
- 2022
- Full Text
- View/download PDF
48. Exploring Fairness in Automated Grading and Feedback Generation of Open-Response Math Problems
- Author
Gurung, Ashish and Heffernan, Neil T.
- Published
- 2022
- Full Text
- View/download PDF
49. Feedback Design Patterns for Math Online Learning Systems
- Author
Inventado, Paul Salvador, Scupelli, Peter, Heffernan, Cristina, and Heffernan, Neil
- Abstract
Increasingly, computer-based learning systems are used by educators to facilitate learning. Evaluations of several math learning systems show that they result in significant student learning improvements. Feedback provision is one of the key features of math learning systems that contributes to their success. We have recently been uncovering feedback design patterns as part of a larger pattern language for math problems and learning support in online learning systems. In this paper, we present three feedback design patterns developed by applying a data-driven design pattern methodology to a large educational dataset collected from actual student data in a math online learning system. These design patterns can help teachers, learning designers, and other stakeholders construct effective feedback for interactive learning activities that facilitate student learning.
- Published
- 2017
50. Using a Single Model Trained across Multiple Experiments to Improve the Detection of Treatment Effects
- Author
Patikorn, Thanaporn, Selent, Douglas, Heffernan, Neil T., Beck, Joseph E., and Zou, Jian
- Abstract
In this work, we describe a new statistical method to improve the detection of treatment effects in interventions. We call our method TAME (Trained Across Multiple Experiments). TAME takes advantage of multiple experiments with similar designs to create a single model. We use this model to predict the outcome of the dependent variable in unseen experiments. We use the predictive accuracy of the model on the conditions of the experiment to determine if the treatment had a statistically significant effect. We validated the effectiveness of our model using a large-scale simulation study, where we showed that our model can detect treatment effects with 10% more statistical power than an ANOVA in certain settings. We also applied our model to real data collected from the ASSISTments online learning platform and showed that the treatment effects detected by our model were comparable to the effects detected by the ANOVA. [For the full proceedings, see ED596512.]
- Published
- 2017
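Item 50's core move is to fit one predictive model across many similar experiments and then treat systematic prediction error within a condition as evidence of a treatment effect. A loose sketch of that logic (TAME's actual model and test statistic are more involved):

```python
import numpy as np
from scipy import stats

def residual_gap_test(y, z, y_hat):
    """Compare prediction errors between conditions of one held-out
    experiment: a model trained on other experiments that misses
    systematically for treated students signals a treatment effect."""
    err = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    z = np.asarray(z)
    return stats.ttest_ind(err[z == 1], err[z == 0])
```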