The 4th International Conference on Educational Data Mining (EDM 2011) brings together researchers from computer science, education, psychology, psychometrics, and statistics to analyze large datasets in order to answer educational research questions. The conference, held in Eindhoven, The Netherlands, July 6-9, 2011, follows the three previous editions (Pittsburgh 2010, Cordoba 2009 and Montreal 2008) and a series of workshops within the AAAI, AIED, EC-TEL, ICALT, ITS, and UM conferences. The growth of e-learning resources such as interactive learning environments, learning management systems, intelligent tutoring systems, and hypermedia systems, as well as the establishment of state databases of student test scores, has created large repositories of data that can be explored to understand how students learn. The EDM conference focuses on data mining techniques for using these data to address important educational questions. The broad collection of research disciplines ensures cross-fertilization of ideas, with the central questions of educational research serving as a unifying focus. This publication presents the following papers: (1) Social Information Discovery (Barry Smyth); (2) On exploration and mining of data in educational practice (Erik-Jan van der Linden, Martijn Wijffelaars, Thomas Lammers); (3) EDM and the 4th Paradigm of Scientific Discovery--Reflections on the 2010 KDD Cup Competition (John Stamper); (4) Factorization Models for Forecasting Student Performance (Nguyen Thai-Nghe, Tomas Horvath and Lars Schmidt-Thieme); (5) Analyzing Participation of Students in Online Courses Using Social Network Analysis Techniques (Reihaneh Rabbany Khorasgani, Mansoureh Takaffoli and Osmar Zaiane); (6) A Machine Learning Approach for Automatic Student Model Discovery (Nan Li, Noboru Matsuda, William W. Cohen and Kenneth R. Koedinger); (7) Conditions for effectively deriving a Q-Matrix from data with Non-negative Matrix Factorization (Michel C.
Desmarais); (8) Student Translations of Natural Language into Logic: The Grade Grinder Translation Corpus Release 1.0 (Dave Barker-Plummer, Richard Cox and Robert Dale); (9) Instructional Factors Analysis: A Cognitive Model For Multiple Instructional Interventions (Min Chi, Kenneth Koedinger, Geoff Gordon, Pamela Jordan and Kurt Vanlehn); (10) The Simple Location Heuristic is Better at Predicting Students' Changes in Error Rate Over Time Compared to the Simple Temporal Heuristic (A.F. Nwaigwe and K.R. Koedinger); (11) Items, skills, and transfer models: which really matters for student modeling? (Y. Gong and J.E. Beck); (12) Avoiding Problem Selection Thrashing with Conjunctive Knowledge Tracing (K.R. Koedinger, P.I. Pavlik Jr., J. Stamper, T. Nixon and S. Ritter); (13) Less is More: Improving the Speed and Prediction Power of Knowledge Tracing by Using Less Data (Bahador Nooraei, Zachary Pardos, Neil T. Heffernan and Ryan S.J.D. Baker); (14) Analysing frequent sequential patterns of collaborative learning activity around an interactive tabletop (R. Martinez Maldonado, K. Yacef, Judy Kay, A. Kharrufa and A. Al-Qaraghuli); (15) Acquiring Item Difficulty Estimates: a Collaborative Effort of Data and Judgment (K. Wauters, P. Desmet and W. Van Den Noortgate); (16) Spectral Clustering in Educational Data Mining (Shubhendu Trivedi, Zachary A. Pardos, Gabor Sarkozy and Neil T. Heffernan); (17) Does Time Matter? Modeling the Effect of Time with Bayesian Knowledge Tracing (Yumeng Qiu, Yingmei Qi, Hanyuan Lu, Zachary Pardos and Neil Heffernan); (18) Learning classifiers from a relational database of tutor logs (Jack Mostow, Jose Gonzalez-Brenes and Bao Hong Tan); (19) A Framework for Capturing Distinguishing User Interaction Behaviors in Novel Interfaces (S. Kardan and C. Conati); (20) How to Classify Tutorial Dialogue? Comparing Feature Vectors vs.
Sequences (Jose Gonzalez-Brenes, Jack Mostow and Weisi Duan); (21) Automatically Detecting a Student's Preparation for Future Learning: Help Use is Key (Ryan S.J.D. Baker, Sujith M. Gowda and Albert T. Corbett); (22) Ensembling Predictions of Student Post-Test Scores for an Intelligent Tutoring System (Zachary A. Pardos, Sujith M. Gowda, Ryan S.J.D. Baker and Neil T. Heffernan); (23) Improving Models of Slipping, Guessing, and Moment-By-Moment Learning with Estimates of Skill Difficulty (Sujith M. Gowda, Jonathan P. Rowe, Ryan S.J.D. Baker, Min Chi and Kenneth R. Koedinger); (24) A Method for Finding Prerequisites Within a Curriculum (Annalies Vuong, Tristan Nixon and Brendon Towle); (25) Estimating Prerequisite Structure From Noisy Data (Emma Brunskill); (26) What can closed sets of students and their marks say? (Dmitry Ignatov, Serafima Mamedova, Nikita Romashkin, and Ivan Shamshurin); (27) How university entrants are choosing their department? Mining of university admission process with FCA taxonomies (Nikita Romashkin, Dmitry Ignatov and Elena Kolotova); (28) What's an Expert? Using learning analytics to identify emergent markers of expertise through automated speech, sentiment and sketch analysis (Marcelo Worsley and Paulo Blikstein); (29) Using Logistic Regression to Trace Multiple Subskills in a Dynamic Bayes Net (Yanbo Xu and Jack Mostow); (30) Monitoring Learners' Proficiency: Weight Adaptation in the Elo Rating System (K. Wauters, P. Desmet and W. Van Den Noortgate); (31) Modeling students' activity in online discussion forums: a strategy based on time series and agglomerative hierarchical clustering (G. Cobo, D. Garcia, E. Santamaria, J.A. Moran, J. Melenchon and C.
Monzo); (32) Prediction of Perceived Disorientation in Online Learning Environment with Random Forest Regression (Gokhan Akcapinar, Erdal Cosgun and Arif Altun); (33) Analysing Student Spatial Deployment in a Computer Laboratory (Vladimir Ivancevic, Milan Celikovic and Ivan Lukovic); (34) Predicting School Failure Using Data Mining (C. Marquez-Vera, C. Romero and S. Ventura); (35) A Dynamical System Model of Microgenetic Changes in Performance, Efficacy, Strategy Use and Value during Vocabulary Learning (P. Pavlik Jr. and S. Wu); (36) Desperately Seeking Subscripts: Towards Automated Model Parameterization (J. Mostow, Y. Xu and M. Munna); (37) Automatic Generation of Proof Problems in Deductive Logic (B. Mostafavi, T. Barnes and M. Croy); (38) Comparison of Traditional Assessment with Dynamic Testing in a Tutoring System (Mingyu Feng, Neil T. Heffernan, Zachary A. Pardos and Cristina Heffernan); (39) Evaluating a Bayesian Student Model of Decimal Misconceptions (G. Goguadze, S. Sosnovsky, S. Isotani and B. Mclaren); (40) Exploring user data from a game-like math tutor: a case study in causal modeling (D. Rai and J. E. Beck); (41) Goal Orientation and Changes of Carelessness over Consecutive Trials in Science Inquiry (A. Hershkovitz, R.S.J.D. Baker, J. Gobert and M. Wixon); (42) Towards improvements on domain-independent measurements for collaborative assessment (Antonio R. Anaya and Jesus G. Boticario); (43) A Java desktop tool for mining Moodle data (R. Pedraza-Perez, C. Romero and S. Ventura); (44) Using data mining in a recommender system to search for learning objects in repositories (A. Zapata-Gonzalez, V.H. Menendez, M.E. Prieto-Mendez and C. Romero); (45) E-learning Web Miner: A data mining application to help instructors involved in virtual courses (Diego Garcia-Saiz and M.E. Zorrilla Pantaleon); (46) Computerized Coding System for Life Narratives to Assess Students' Personality Adaption (Q. He, B.P. Veldkamp and G.J. 
Westerhof); (47) Partially Observable Sequential Decision Making for Problem Selection in an Intelligent Tutoring System (Emma Brunskill and Stuart Russell); (48) Mining Teaching Behaviors from Pedagogical Surveys (J. Barracosa and C. Antunes); (49) Variable Construction and Causal Modeling of Online Education Messaging Data: Initial Results (S. Fancsali); (50) The Hospital Classrooms Environments Challenge (Carina Gonzalez and Pedro A. Toledo); (51) Combining study of complex network and text mining analysis to understand growth mechanism of communities on SNS (Osamu Yamakawa, Takahiro Tagawa, Hitoshi Inoue, Koichi Yastake and Takahiro Sumiya); (52) Logistic Regression in a Dynamic Bayes Net Models Multiple Subskills Better! (Yanbo Xu and Jack Mostow); (53) Studying problem-solving strategies in the early stages of learning programming (E. Cambranes-Martinez and J. Good); (54) Brick: Mining Pedagogically Interesting Sequential Patterns (Anjo Anjewierden, Hannie Gijlers, Nadira Saab and Robert De Hoog); (55) Intelligent evaluation of social knowledge building using conceptual maps with MLN (L. Moreno, C.S. Gonzalez, R. Estevez and B. Popescu); (56) Identifying Influence Factors on Students Success by Subgroup Discovery (F. Lemmerich, M. Ifland and F. Puppe); (57) Analyzing University Data for Determining Student Profiles and Predicting Performance (D. Kabakchieva, K. Stefanova and V. Kisimov); (58) The EDM Vis Tool (Matthew Johnson, Michael Eagle, Leena Joseph and Tiffany Barnes); (59) Towards Modeling Forgetting and Relearning in ITS: Preliminary Analysis of ARRS Data (Y. Wang and N.T. Heffernan); (60) Quality Control and Data Mining Techniques Applied to Monitoring Scaled Scores (A.A. Von Davier); (61) eLAT: An Exploratory Learning Analytics Tool for Reflection and Iterative Improvement of Technology Enhanced Learning (A.L. Dyckhoff, D. Zielke, M.A. Chatti and U. Schroeder); (62) Predicting graduate-level performance from undergraduate achievements (J. 
Zimmermann, K.H. Brodersen, J.-P. Pellet, E. August and J.M. Buhmann); (63) Mining Assessment and Teaching Evaluation Data of Regular and Advanced Stream Students (Irena Koprinska); (64) Investigating Usage of Resources in LMS with Specific Association Rules (A. Merceron); (65) Towards Parameter-Free Data Mining: Mining Educational Data with "yacaree" (Jose L. Balcazar, Diego Garcia-Saiz and Marta E. Zorrilla); (66) Factors Impacting Novice Code Comprehension in a Tutor for Introductory Computer Science (Leigh Ann Sudol-DeLyser and Jonathan Steinhart); (67) Investigating the Transitions between Learning and Non-learning Activities as Students Learn Online (P.S. Inventado, R. Legaspi, M. Suarez and M. Numao); (68) Learning parameters for a knowledge diagnostic tools in orthopedic surgery (S. Lalle and V. Luengo); (69) Problem Response Theory and its Application for Tutoring (P. Jarusek and R. Pelanek); and (70) Towards Better Understanding of Transfer in Cognitive Models of Practice (Michael V. Yudelson, Philip I. Pavlik, Jr. and Kenneth R. Koedinger). Individual papers contain tables, figures, footnotes and references.