Peer-reviewed journals
Data driven automatic feedback generation in the iList intelligent tutoring system. Technology, Instruction, Cognition, and Learning (TICL), Special Issue on Role of Data in Instructional Processes, vol. 10(1), pp. 5-26, 2015.
Abstract. Based on our empirical studies of effective human tutoring, we developed an Intelligent Tutoring System, iList, that helps students learn linked lists, a challenging topic in Computer Science education. The iList system can provide several forms of feedback to students. Feedback is automatically generated thanks to a Procedural Knowledge Model extracted from the history of interaction of students with the system. This model allows iList to provide effective reactive and proactive procedural feedback while a student is solving a problem. We tested five different versions of iList, differing in the level of feedback they can provide, in multiple classrooms, with a total of more than 200 students. The evaluation study showed that iList is effective in helping students learn; students liked working with the system; and the feedback generated by the most sophisticated versions of the system is helpful in keeping students on the right path.
Affect detection from non-stationary physiological data using ensemble classifiers. Evolving Systems, November 2014.
Abstract. Affect detection from physiological signals has received considerable attention. One challenge is that physiological measures exhibit considerable variations over time, making classification of future data difficult. The present study addresses this issue by providing insights on how diagnostic physiological features of affect change over time. Affective physiological data (Electrocardiogram, Electromyogram, Skin Conductivity, and Respiration) was collected from four participants over five sessions each. The classification performance of a number of training strategies, under different conditions of feature selection and engineering, was compared using an adaptive classifier ensemble algorithm. Analysis of the performance of individual physiological channels for affect detection is also provided. The key result is that using a pooled feature set for affect detection is more accurate than using day-specific features. A decision fusion strategy, which combines decisions from classifiers trained on individual channel data, outperformed a feature fusion strategy. Results also show that the performance of the ensemble is affected by the choice of the base classifier and the alpha factor used to update the member classifiers of the ensemble. Finally, the corrugator and zygomatic facial EMGs were found to be more reliable measures for detecting the valence component of affect compared to other channels.
Supporting Computer Science curriculum: Exploring and learning linked lists with iList. IEEE Transactions on Learning Technologies, Special Issue on Real-World Applications of Intelligent Tutoring Systems, vol. 2(2), pp. 107-120, April-June 2009.
Abstract. We developed two versions of a system, called iList, that helps students learn linked lists, an important topic in Computer Science curricula. The two versions of iList differ in the level of feedback they can provide to the students, specifically in the explanation of syntax and execution errors. The system has been fielded in multiple classrooms in two institutions. Our results indicate that iList is effective, is considered interesting and useful by the students, and its performance is getting closer to the performance of human tutors. Moreover, the system is being developed in the context of a study of human tutoring, which is guiding the evolution of iList with empirical evidence of effective tutoring.
Be brief, and they shall learn: Generating concise language feedback for a computer tutor. International Journal of AI in Education, vol. 18(4), pp. 317-345, 2008.
Abstract. To investigate whether more concise Natural Language feedback improves learning, we developed two Natural Language generators (DIAG-NLP1 and DIAG-NLP2), to provide feedback in an Intelligent Tutoring System that teaches troubleshooting. We systematically evaluated them in a three way comparison that included the original system, which generates overly repetitive feedback. We found that DIAG-NLP2, the generator which intuitively produces the best, corpus-based language, does engender the most learning. Distinguishing features of the more effective feedback are: it obeys Grice's maxim of brevity, it is more directive and uses a specific type of referring expressions. Interestingly, simpler ways of restructuring the original repetitive feedback as done in DIAG-NLP1, such as exploiting the hierarchical structure of the domain, were not effective. Since the design of interfaces to Intelligent Tutoring Systems often includes verbal feedback, we suggest that: if the number of different contexts in which verbal feedback is provided is high, such feedback should be based on corpus studies, and generated by techniques more sophisticated than template filling.
Peer-reviewed conferences
Behavior and Learning of Students using Worked-out Examples in a Tutoring System. In ITS 2016, 13th International Conference on Intelligent Tutoring Systems, Zagreb, Croatia, June 2016.
Abstract. Worked-out examples have been shown to increase learning gains over problem solving alone. These increases are even greater in novices and those who are learning algorithmic topics, such as those in Computer Science. We have integrated this strategy into our Intelligent Tutoring System and evaluated it on undergraduate students learning the linked list data structure. Although the strategy is promising, we have identified behavioral differences between high and low gainers: spending less time on an example and quitting examples prematurely led to greater learning.
Integrating Support for Collaboration in a Computer Science Intelligent Tutoring System. In ITS 2016, 13th International Conference on Intelligent Tutoring Systems, Zagreb, Croatia, June 2016.
Abstract. Calls for widespread Computer Science (CS) education have been issued from the White House down and have been met with increased enrollment in CS undergraduate programs. Yet, these programs often suffer from high attrition rates. One successful approach to addressing the problem of low retention has been a focus on group work and collaboration. This paper details the design of a collaborative ITS (CIT) for foundational CS concepts including basic data structures and algorithms. We investigate the benefit of collaboration to student learning while using the CIT. We compare learning gains of our prior work in a non-collaborative system versus two methods of supporting collaboration in the collaborative-ITS. In our study of 60 students, we found significant learning gains for students using both versions. We also discovered notable differences related to student perception of tutor helpfulness which we will investigate in subsequent work.
Incorporating Analogies and Worked Out Examples as Pedagogical Strategies in a Computer Science Tutoring System. In SIGCSE 2016, the 47th ACM Technical Symposium on Computer Science Education, Memphis, TN, March 2016.
Abstract. Analogies and worked out examples are effective means of instruction in a wide variety of learning environments. However, the extent of their effectiveness in Computer Science (CS) education has not been fully explored. We extended our intelligent tutoring system (ITS) for CS data structures, ChiQat-Tutor, to incorporate worked out examples and analogy as teaching strategies. We compare three versions of the system: one that uses standard worked out examples, one that uses analogical worked out examples, and one that uses a pure analogical explanation with separate worked out examples. A study with 66 students showed that students using the standard worked out examples had greater learning gains than students in both analogy conditions. We also found that analogy can be less effective for students with higher prior knowledge. Additionally, we show that some interaction patterns highly correlate with student gains. Overall, the system implementation and results represent a step towards exploring the use of well-established instructional strategies in a computer science ITS.
Collab-ChiQat: A Collaborative Remaking of a Computer Science Intelligent Tutoring System. In CSCW 2016, Computer-Supported Cooperative Work and Social Computing, San Francisco, CA, March 2016.
Abstract. This paper focuses on the motivation, design, and initial prototype implementation of Collab-ChiQat. Collab-ChiQat is a collaborative reconceptualization of an existing intelligent tutoring system for Computer Science Education originally intended for one-to-one student-system tutoring. Collab-ChiQat allows students to work as pair programmers as they solve coding problems for linked lists, a foundational and difficult-to-grasp CS concept. The work is unique in its comparison of how system structuring of collaboration affects both learning and actual collaboration. In one condition, students are left to themselves with no system feedback regarding their collaborative behavior. In a second condition, the collaboration is semi-structured, meaning students receive a visualization of their participation and other metrics.
Student Behavior with Worked-out Examples in a Computer Science Intelligent Tutoring System. In ICEduTech 2015, International Conference on Educational Technologies, Florianopolis, Santa Catarina, Brazil, November 2015. Best paper award.
Abstract. The computing industry is currently facing a huge deficit in talent entering the industry. Even though enrollment in Computer Science (CS) degrees is climbing, many students drop out due to hurdles in grasping fundamental concepts. ChiQat-Tutor, our new CS Intelligent Tutoring System (ITS), helps students overcome difficulties by teaching CS concepts such as data structures. Content aside, an important aspect is how this material is taught. Here we show our work on using Worked-out Examples (WOE) in the linked list tutorial of our ITS. Contrary to previous literature, we found that WOEs are not a silver bullet, and may not be an effective teaching strategy for some students.
A Hybrid Model for Teaching Recursion. In SIGITE 2015, 16th Annual Conference on Information Technology Education, Chicago, IL, October 2015.
Abstract. Novice programmers struggle to understand the concept of recursion, partly because of unfamiliarity with recursive activities, difficulty with visualizing program execution, and difficulty understanding its back flow of control. In this paper we discuss the conceptual and program visualization approaches to teaching recursion. We also introduce our approach to teaching recursion in the ChiQat-Tutor system that relies on ideas from both approaches. ChiQat-Tutor will help Computer Science students learn recursion, develop accurate mental models of recursion, and serve as an effective visualization tool with which hidden contexts of recursion can become evident.
Worked-out Example in a Computer Science Intelligent Tutoring System. In SIGITE 2015, 16th Annual Conference on Information Technology Education, Chicago, IL, October 2015. Lightning talk.
Abstract. Our CS Intelligent Tutoring System (ITS), ChiQat-Tutor, aims at aiding students in overcoming the initial difficulties in CS education, such as learning data structures. Here, we show our work on utilizing Worked-out Examples (WOE) in our linked list lesson. Despite being a promising strategy, we find that it can be detrimental to student growth.
A Study of Analogy in Computer Science Tutorial Dialogues. In CSEDU 2015, 7th International Conference on Computer Supported Education, Lisbon, Portugal, May 2015.
Abstract. Analogy plays an important role in learning, but its role in teaching Computer Science has hardly been explored. We annotated and analyzed analogy in a corpus of tutoring dialogues on Computer Science data structures. Via linear regression analysis, we established that the presence of analogy and of specific dialogue acts within analogy episodes correlate with learning. We have integrated our findings in our ChiQat-Tutor system, and are currently evaluating the effect of analogy within the system.
A Scalable Intelligent Tutoring System Framework for Computer Science Education. In CSEDU 2015, 7th International Conference on Computer Supported Education, Lisbon, Portugal, May 2015.
Abstract. Computer Science is a difficult subject with many fundamentals to be taught, usually involving a steep learning curve for many students. It is some of these initial challenges that can turn students away from computer science. We have been developing a new Intelligent Tutoring System, ChiQat-Tutor, that focuses on tutoring of Computer Science fundamentals. Here, we outline the system under development, while bringing particular attention to its architecture and how it attains the primary goals of being easily extensible and providing a low barrier of entry to the end user. The system is broadly broken down into lessons, teaching strategies, and utilities, which work together to promote seamless integration of components. We also cover currently developed components in the form of a case study, and detail our experience of deploying the system in an undergraduate Computer Science classroom, which led to learning gains on par with prior work.
ChiQat-Tutor: An Integrated Environment for Learning Recursion. In ITS-AIEDCS 2014, 12th International Conference on Intelligent Tutoring Systems (ITS), 2nd Workshop on AI-supported Education for Computer Science (AIEDCS), Honolulu, HI, June 2014. Short paper.
Abstract. Novice Computer Science (CS) students struggle learning recursion for reasons such as unfamiliarity with recursive thinking and difficulty in visualizing program execution. Many tasks in CS require a thorough understanding of recursion. We introduce the recursion module of ChiQat-Tutor, an environment for learning CS algorithms and data structures. ChiQat-Tutor uses the pedagogical tool of Recursion Graphs to help students visualize, manipulate, and learn recursive processes.
Affect Detection and Classification from Non-Stationary Physiological Data. In ICMLA 2013, IEEE 12th International Conference on Machine Learning and Applications, Miami, FL, December 2013.
Abstract. Affect detection from physiological signals has received a great deal of attention recently. One arising challenge is that physiological measures are expected to exhibit considerable variations or non-stationarities over multi-day or multi-session recordings. These variations pose challenges to effectively classifying affective states from future physiological data. The present study collects affective physiological data (electrocardiogram (ECG), electromyogram (EMG), skin conductivity (SC), and respiration (RSP)) from four participants over five sessions each. The study provides insights on how diagnostic physiological features of affect change over time. We compare the classification performance of two feature sets, pooled features (obtained from pooled day data) and day-specific features, using an updatable classifier ensemble algorithm. The study also provides an analysis of the performance of individual physiological channels for affect detection. Our results show that using a pooled feature set for affect detection is more accurate than using day-specific features. The corrugator and zygomatic facial EMGs were more reliable measures for detecting valence than arousal compared to ECG, RSP and SC over the span of multi-session recordings. It is also found that corrugator EMG features and a fusion of features from all physiological channels have the highest affect detection accuracy for both valence and arousal.
Predicting Students' Performance and Problem Solving Behavior from iList Log Data. In ICCE 2013, 21st International Conference on Computers in Education, Bali, Indonesia, November 2013.
Abstract. In this paper, we analyze data gathered from students' interactions with iList, an intelligent tutoring system that teaches linked lists to computer science (CS) undergraduates. A number of features have been extracted from the log files and used to (a) build predictive models of students' performance and (b) analyze temporal aspects of students' problem solving behavior. Our results suggest that it is possible to build predictive models of performance with an accuracy of 87% by using logistic regression. The results also show that it is more likely a student will perform a step correctly if s/he spends more time on it.
Worked Out Examples in Computer Science Tutoring. In AIED 2013, 16th International Conference on Artificial Intelligence in Education, Memphis, TN, July 2013. Short paper.
Abstract. We annotated and analyzed Worked Out Examples (WOEs) in a corpus of tutoring dialogues on Computer Science data structures. We found that some dialogue moves that occur within WOEs, or sequences thereof, correlate with learning. Features of WOEs such as length also correlate with learning for some data structures. These results will be used to augment the tutorial tactics available to iList, an ITS that helps students learn linked lists.
Towards Improving Programming Habits to Create Better Computer Science Course Outcomes. In ITiCSE 2013, 18th ACM International Conference on Innovation and Technology in Computer Science Education, Canterbury, UK, July 2013.
Abstract. We examine a large dataset collected by the Marmoset system in a CS2 course. The dataset gives us a richly detailed portrait of student behavior because it combines automatically collected program snapshots with unit tests that can evaluate the correctness of all snapshots. We find that students who start earlier tend to earn better scores, which is consistent with the findings of other researchers. We also detail the overall work habits exhibited by students. Finally, we evaluate how students use release tokens, a novel mechanism that provides feedback to students without giving away the code for the test cases used for grading, and gives students an incentive to start coding earlier. We find that students seem to use their tokens quite effectively to acquire feedback and improve their project score, though we do not find much evidence suggesting that students start coding particularly early.
Exploring effective dialogue act sequences in one-on-one Computer Science tutoring dialogues. In ACL-HLT 2011, 49th Annual Meeting of the Association for Computational Linguistics, Workshop on Innovative Use of NLP for Building Educational Applications, Portland, OR, June 2011.
Abstract. We present an empirical study of one-on-one human tutoring dialogues in the domain of Computer Science data structures. We are interested in discovering effective tutoring strategies, which we frame as discovering which Dialogue Act (DA) sequences correlate with learning. We employ multiple linear regression to discover the strongest models that explain why students learn during one-on-one tutoring. Importantly, we define "flexible" DA sequences, in which extraneous DAs can easily be discounted. Our experiments reveal several cognitively plausible DA sequences which significantly correlate with learning outcomes.
The use of evidence in the change making process of Computer Science educators. In SIGCSE 2011, the 42nd ACM Technical Symposium on Computer Science Education, Dallas, TX, March 2011.
Abstract. This paper explores the issue of what kind of evidence triggers changes in the teaching practice of Computer Science educators, and how educators evaluate the effectiveness of those changes. We interviewed 14 Computer Science instructors from three different institutions. Our study indicates that changes are mostly initiated from instructors' intuition, informal discussion with students, and anecdotal evidence.
Generating proactive feedback to help students stay on track. In ITS 2010, The 10th International Conference on Intelligent Tutoring Systems, Pittsburgh, PA, June 2010. Short paper.
Abstract. In a tutoring system based on an exploratory environment, it is also important to provide direct guidance to students. We endowed iList, our linked list tutor, with the ability to generate proactive feedback using a procedural knowledge model automatically constructed from the interaction of previous students with the system. We compared the new version of iList with its predecessors and human tutors. Our evaluation shows that iList is effective in helping students learn.
I learn from you, you learn from me: How to make iList learn from students. In AIED 2009, The 14th International Conference on Artificial Intelligence in Education, Brighton, UK, July 2009.
Abstract. We developed a new model for iList, our system that helps students learn linked lists. The model is automatically extracted from past student data, and allows iList to track students' problem-solving behavior in order to provide targeted feedback. We evaluated the new model both intrinsically and extrinsically. We show that the model can match most student actions after a relatively small sequence of observations, and that iList can effectively use the new student tracker to provide feedback and help students learn.
Towards explaining effective tutorial dialogues. In CogSci 2009, The Annual Meeting of the Cognitive Science Society, Amsterdam, The Netherlands, 2009.
Abstract. We present a study of human tutorial dialogues in a core Computer Science domain that: focuses on individual tutoring sessions, rather than on contrasting different types of tutors; uses multiple regression analysis to correlate features of those sessions with learning outcomes; and highlights the effects of two types of tutor moves that have not been studied in depth so far, direct instruction and positive feedback.
The role of positive feedback in Intelligent Tutoring Systems. In ACL 2008, The 46th Annual Meeting of the Association for Computational Linguistics, Student Research Workshop, Columbus, OH, June 2008.
Abstract. The focus of this study is positive feedback in one-on-one tutoring, its computational modeling, and its application to the design of more effective Intelligent Tutoring Systems. A data collection of tutoring sessions in the domain of basic Computer Science data structures has been carried out. A methodology based on multiple regression is proposed, and some preliminary results are presented. A prototype Intelligent Tutoring System on linked lists has been developed and deployed in a college-level Computer Science class.
Learning linked lists: Experiments with the iList system. In ITS 2008, The 9th International Conference on Intelligent Tutoring Systems, Montreal, Canada, June 2008.
Abstract. This paper presents the first experiments with an Intelligent Tutoring System in the domain of linked lists, a fundamental topic in Computer Science. The system has been deployed in an introductory college-level Computer Science class, and engendered significant learning gains. A constraint-based approach has been adopted in the design and implementation of the system. We describe the system architecture, its current functionalities, and the future directions of its development.
Simple but effective feedback generation to tutor abstract problem solving. In INLG 2008, 5th International Natural Language Generation Conference, Salt Fork, OH, June 2008.
Abstract. To generate natural language feedback for an intelligent tutoring system, we developed a simple planning model with a distinguishing feature: its plan operators are derived automatically, on the basis of the association rules mined from our tutorial dialog corpus. Automatically mined rules are also used for realization. We evaluated 5 different versions of a system that tutors on an abstract sequence learning task. The version that uses our planning framework is significantly more effective than the other four versions. We compared this version to the human tutors we employed in our tutorial dialogs, with intriguing results.
I saw TREE trees in the park: How to correct real-word spelling mistakes. In LREC 2008, Sixth International Conference on Language Resources and Evaluation, Marrakech, Morocco, May 2008.
Abstract. This paper presents a context sensitive spell checking system that uses mixed trigram models, and introduces a new empirically grounded method for building confusion sets. The proposed method has been implemented, tested, and evaluated in terms of coverage, precision, and recall. The results show that the method is effective.
Beyond the code-and-count analysis of tutoring dialogues. In AIED 2007, 13th International Conference on Artificial Intelligence in Education, Marina Del Rey, CA, July 2007.
Abstract. In this paper, we raise a methodological issue concerning the empirical analysis of tutoring dialogues: The frequencies of tutoring moves do not necessarily reveal their causal efficacy. We propose to develop coding schemes that are better informed by theories of learning; stop equating higher frequencies of tutoring moves with effectiveness; and replace ANOVAs and chi-squares with multiple regression. As motivation for our proposal, we will present an initial analysis of tutoring dialogues, in the domain of introductory Computer Science.
A mixed trigrams approach for context sensitive spell checking. In CICLing-2007, Eighth International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, February 2007.
Abstract. This paper addresses the problem of real-word spell checking, i.e., the detection and correction of typos that result in real words of the target language. This paper proposes a methodology based on a mixed trigrams language model. The model has been implemented, trained, and tested with data from the Penn Treebank. The approach has been evaluated in terms of hit rate, false positive rate, and coverage. The experiments show promising results with respect to the hit rates of both detection and correction, even though the false positive rate is still high.
The problem of ontology alignment on the web: A first report. In EACL 2006, 11th conference of the European Association of Computational Linguistics, Workshop on Web as Corpus, Trento, Italy, April 2006.
Abstract. This paper presents a general architecture and four algorithms that use Natural Language Processing for automatic ontology matching. The proposed approach is purely instance based, i.e., only the instance documents associated with the nodes of ontologies are taken into account. The four algorithms have been evaluated using real world test data, taken from the Google and LookSmart online directories. The results show that NLP techniques applied to instance documents help the system achieve higher performance.
Aggregation improves learning: Experiments in Natural Language Generation for Intelligent Tutoring Systems. In ACL 2005, Proceedings of the 43rd Meeting of the Association for Computational Linguistics, Ann Arbor, MI, 2005.
Abstract. To improve the interaction between students and an intelligent tutoring system, we developed two Natural Language generators, which we systematically evaluated in a three-way comparison that included the original system as well. We found that the generator which intuitively produces the best language does engender the most learning. Specifically, it appears that functional aggregation is responsible for the improvement.
Natural Language Generation for Intelligent Tutoring Systems: A case study. In AIED 2005, 12th International Conference on Artificial Intelligence in Education, Amsterdam, The Netherlands, 2005.
Abstract. To investigate whether Natural Language feedback improves learning, we developed two different feedback generation engines, which we systematically evaluated in a three-way comparison that included the original system as well. We found that the system which intuitively produces the best language does engender the most learning. Specifically, it appears that presenting feedback at a more abstract level is responsible for the improvement.
Theses
Automatic Modeling of Procedural Knowledge and Feedback Generation in a Computer Science Tutoring System. PhD thesis, University of Illinois at Chicago, 2009.
Abstract. This research takes place in the larger context of the study of one-on-one tutoring, a form of instruction that has been shown to be very effective. I conducted a study of human tutoring in the domain of Computer Science data structures, to understand which features and strategies of human tutoring are important for learning. I developed an Intelligent Tutoring System, iList, that helps students learn linked lists. One of the main advances in iList is the presence of a Procedural Knowledge Model automatically extracted from student data. This model allows iList to provide effective reactive and proactive procedural feedback while a student is solving a problem. I tested five different versions of iList, differing in the level of feedback they can provide, in multiple classrooms, with a total of more than 200 students. The evaluation study showed that iList is effective in helping students learn; students liked working with the system; and the feedback generated by the most sophisticated versions of the system is helpful in keeping the students on the right path.
DIAG-NLP: Improving Intelligent Tutoring Systems with Natural Language Generation Technology. Master's thesis, University of Illinois at Chicago, 2003.
Abstract. The latest generation of Intelligent Tutoring Systems proved to be effective in helping students acquire knowledge. Natural Language Processing technology could be beneficial in improving these highly interactive applications. This thesis describes the latest development of the DIAG-NLP project, a research study conducted to verify the effectiveness of simple Natural Language Generation techniques in Intelligent Tutoring Systems. A rigorous corpus-based methodology has been applied, and multiple prototypes have been implemented and compared. The preliminary results obtained are encouraging and provide some basis for future research.