Here you can find a short description of the most recent research projects I have been working on.
Davide Fossati
Research
A Computational Model for Formative Assessment in Computer Science Education
The concept of formative assessment, originally expressed by educational psychologists in the late sixties, formalizes the idea of testing the knowledge of students in order to personalize instructional intervention and enhance their learning opportunities. This is in contrast to the traditional practice of summative assessment, where the main purpose of testing is to assign grades and rank student performance. Although summative assessment is certainly important for practical reasons, the inclusion of more formative assessment in instructional practice could significantly improve the effectiveness of teaching.
Formative assessment, although very elegant in theory, can be difficult to apply in practice. One of the reasons is that a teacher may need to collect large quantities of data, thoroughly analyze it to discover important trends and patterns, and use the findings to design appropriate teaching responses at the level of individual students and at the scale of the entire classroom. This process may be overwhelming and unrealistically expensive in terms of teacher and class time.
Impressive advances in the fields of Data Mining and Artificial Intelligence, combined with the insights provided by Cognitive and Educational Psychology, make it possible to envision the creation of Intelligent Assessment Systems (IASs), tools that can help teachers collect and analyze data to facilitate formative assessment. Such tools would noninvasively monitor students' learning processes by creating appropriate tests and automatically evaluating them; store, filter, and classify relevant data; and automatically or semi-automatically discover important patterns that are promptly reported to teachers so they can make informed decisions about what to do next. This could be seen as an educational equivalent of the Management Information Systems that are successfully used in business.
In this new project, I am currently developing methods and tools to help instructors, in particular those teaching large undergraduate courses in Computer Science, analyze and discover influential patterns in student grades. For example, if we are able to discover that "not understanding topics X and Y will most likely lead to trouble with topic Z," and at some point during the semester a student is showing these misunderstandings, then we can take appropriate instructional actions before it's too late.
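As a rough illustration of the kind of pattern described above, the sketch below estimates how often trouble with topic Z follows trouble with topics X and Y. The function name `rule_confidence`, the topic labels, and the data are all hypothetical; this is a minimal sketch of a simple conditional-frequency estimate, not the method used in the project.

```python
def rule_confidence(records, antecedents, consequent):
    """Estimate P(trouble with consequent | trouble with every antecedent).

    Each record maps a topic name to True if the student struggled with it.
    Returns None when no student matches the antecedents.
    """
    matching = [r for r in records if all(r[t] for t in antecedents)]
    if not matching:
        return None
    return sum(1 for r in matching if r[consequent]) / len(matching)


# Hypothetical grade data: True means the student struggled with the topic.
records = [
    {"X": True, "Y": True, "Z": True},
    {"X": True, "Y": True, "Z": True},
    {"X": True, "Y": False, "Z": False},
    {"X": False, "Y": True, "Z": False},
    {"X": True, "Y": True, "Z": False},
    {"X": False, "Y": False, "Z": False},
]

confidence = rule_confidence(records, ["X", "Y"], "Z")
print(f"Confidence of (X, Y) -> Z: {confidence:.2f}")
```

A rule with high confidence and enough supporting students could then be used to flag a student who shows the antecedent misunderstandings early in the semester.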
iList: Intelligent Tutoring System for Computer Science
For a hands-on experience with the iList project, you can visit www.digitaltutor.net.
The main goals of this project are to discover the characteristics that make one-on-one tutoring an effective form of instruction, and to use these characteristics to design and implement effective Intelligent Tutoring Systems (ITSs). In particular, we are interested in the feedback that ITSs can provide to students. To achieve these goals, the project involved several tasks: collection and analysis of human tutorial data; definition of computational models of tutorial strategies and feedback; design and implementation of an Intelligent Tutoring System; and deployment and evaluation of the system.
We conducted a study of human tutoring in the domain of Computer Science data structures, to understand which features and strategies of human tutoring are important for learning. We developed an Intelligent Tutoring System, iList, that helps students learn linked lists. One of the main features of iList is a Procedural Knowledge Model that is automatically extracted from previous student data. This model allows iList to provide effective reactive and proactive procedural feedback while a student is solving a problem.
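To give a feel for how a procedural model might be extracted from previous student data, the sketch below counts step-to-step transitions in successful past solutions and uses them for reactive and proactive feedback. The trace format, the step names, and the functions `build_model`, `reactive_feedback`, and `proactive_hint` are all assumptions made for illustration; the text does not specify the form of iList's actual Procedural Knowledge Model.

```python
from collections import Counter, defaultdict


def build_model(traces):
    """Count transitions between consecutive steps in successful traces.

    Each trace is a (steps, solved) pair; unsuccessful traces are ignored.
    """
    model = defaultdict(Counter)
    for steps, solved in traces:
        if not solved:
            continue
        for current, following in zip(steps, steps[1:]):
            model[current][following] += 1
    return model


def reactive_feedback(model, current, attempted):
    """Return "ok" if the attempted step followed `current` in a successful trace."""
    if attempted in model.get(current, {}):
        return "ok"
    return "warning"


def proactive_hint(model, current):
    """Suggest the step that most often followed `current`, or None if unknown."""
    followers = model.get(current)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical traces of students inserting a node into a linked list.
traces = [
    (["start", "alloc_node", "set_data", "link_next", "done"], True),
    (["start", "alloc_node", "link_next", "set_data", "done"], True),
    (["start", "alloc_node", "set_data", "link_next", "done"], True),
    (["start", "link_next", "done"], False),
]

model = build_model(traces)
print(reactive_feedback(model, "start", "link_next"))
print(proactive_hint(model, "alloc_node"))
```

In this toy setup, reactive feedback responds to a step the student has just attempted, while a proactive hint suggests a likely next step before the student acts.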
We tested five different versions of iList, differing in the level of feedback they provide, in multiple classrooms, with a total of more than 200 students. The evaluation study showed that iList is as effective as human tutors in helping students learn; students liked working with the system; and the feedback generated by the most sophisticated versions of the system helps keep the students on the right path.

Research methodology

CSCoding, a tool we developed to annotate video-recorded tutoring sessions

Screenshot of iList

Example of Procedural Knowledge Model automatically generated by iList
Links
www.digitaltutor.net - home of iList.
Relevant papers
- Davide Fossati. Automatic Modeling of Procedural Knowledge and Feedback Generation in a Computer Science Tutoring System. PhD Dissertation, University of Illinois at Chicago. July 2009.
- Davide Fossati, Barbara Di Eugenio, Stellan Ohlsson, Christopher Brown, Lin Chen, and David Cosejo. I learn from you, you learn from me: How to make iList learn from students. AIED 2009, The 14th International Conference on Artificial Intelligence in Education. Brighton, UK. July 2009.
- Davide Fossati, Barbara Di Eugenio, Christopher Brown, Stellan Ohlsson, David Cosejo, and Lin Chen. Supporting Computer Science curriculum: Exploring and learning linked lists with iList. IEEE Transactions on Learning Technologies, Special Issue on Real-World Applications of Intelligent Tutoring Systems, vol. 2 no. 2, pp. 107-120. April-June 2009.
- Barbara Di Eugenio, Davide Fossati, Stellan Ohlsson, and David Cosejo. Towards explaining effective tutorial dialogues. CogSci 2009, The Annual Meeting of the Cognitive Science Society. Amsterdam, The Netherlands. July 2009.
- Davide Fossati, Barbara Di Eugenio, Christopher Brown, and Stellan Ohlsson. Learning Linked Lists: Experiments with the iList System. ITS 2008, The 9th International Conference on Intelligent Tutoring Systems. Montreal, Canada. June 2008. Nominated for best paper award.
- Davide Fossati. The role of positive feedback in Intelligent Tutoring Systems. ACL 2008, The 46th Annual Meeting of the Association for Computational Linguistics, Student Research Workshop. Columbus, OH. June 2008.
- Stellan Ohlsson, Barbara Di Eugenio, Bettina Chow, Davide Fossati, Xin Lu, and Trina C. Kershaw. Beyond the code-and-count analysis of tutoring dialogues. AIED 2007, The 13th International Conference on Artificial Intelligence in Education. Marina Del Rey, CA. July 2007.
DIAG-NLP: Natural Language Generation for Intelligent Tutoring Systems
Intelligent Tutoring Systems (ITSs) are effective tools that help students learn. We believe that natural language interfaces to ITSs can play an important role in improving the effectiveness of such systems. To investigate that hypothesis, we developed natural language generators that manipulate the feedback provided by Vivids-DIAG, an ITS that helps students learn how to troubleshoot complex mechanical systems. We found that the version of the system that generates language in which the core concepts are aggregated in a principled way engenders more learning. This more effective language is based on a corpus study, in which human tutors interacted with students through the DIAG interface.

The furnace system in the DIAG home heating troubleshooting simulation
Relevant papers
- Barbara Di Eugenio, Davide Fossati, Susan Haller, Dan Yu, and Michael Glass. Be brief, and they shall learn: Generating Concise Language Feedback for a Computer Tutor. International Journal of AI in Education, 18(4). 2008.
- Barbara Di Eugenio, Davide Fossati, Dan Yu, Susan Haller, and Michael Glass. Natural language generation for intelligent tutoring systems: A case study. AIED 2005, Artificial Intelligence in Education. Amsterdam, The Netherlands. July 2005. Nominated for best paper award.
- Barbara Di Eugenio, Davide Fossati, Dan Yu, Susan Haller, and Michael Glass. Aggregation improves learning: Experiments in natural language generation for intelligent tutoring systems. ACL 2005, The 43rd Annual Meeting of the Association for Computational Linguistics. Ann Arbor, MI. June 2005.
- Davide Fossati. DIAG-NLP: Improving Intelligent Tutoring System with Natural Language Generation Technology. Master's Thesis, University of Illinois at Chicago. 2003.
Context Sensitive Spell Checking
The spell checkers included in modern word processors do a very good job of finding misspelled words that do not exist in a dictionary, but they have a hard time catching typos that happen to produce another valid word of the selected language. For example, a traditional English spell checker would not catch the mistake in a sentence like "I saw TREE trees in the park," where "tree" was written when "three" was intended, because "tree" is itself a valid English word. The only way to catch such a mistake is to take context into account. Our approach uses a statistical model based on mixed trigrams to capture the context of a given word, and uses this model to detect and, where possible, correct real-word spelling mistakes.

Example of misspelling detection process
Relevant papers
- Davide Fossati and Barbara Di Eugenio. I saw TREE trees in the park: How to correct real-word spelling mistakes. LREC 2008, Sixth International Conference on Language Resources and Evaluation. Marrakech, Morocco. May 2008.
- Davide Fossati and Barbara Di Eugenio. A mixed trigrams approach for context sensitive spell checking. CICLing-2007, Eighth International Conference on Intelligent Text Processing and Computational Linguistics. Mexico City, Mexico. February 2007.
Ontology Alignment on the Web
Many information systems use taxonomies and ontologies to allow them to make inferences and organize data for better retrieval performance. Since different systems usually have different ontologies, the integration of heterogeneous systems requires that such ontologies be aligned. One example of this problem is the matching of categories used by different web portals to classify web documents. We worked on an approach, based on simple Natural Language Processing techniques, that can be used to automatically align those categories by analyzing the documents associated with them. We tested the approach on a subset of the Google and LookSmart web directories and obtained promising results.
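One simple way to align categories by analyzing their documents is to compare word distributions. The sketch below builds a bag-of-words vector for each category and matches every category in one directory to the most similar category in the other using cosine similarity. The directory contents, category names, and function names are hypothetical, and the choice of cosine similarity is an assumption; the text does not specify which techniques the matcher uses.

```python
import math
from collections import Counter


def category_vector(documents):
    """Build a bag-of-words vector from all documents in a category."""
    words = Counter()
    for doc in documents:
        words.update(doc.lower().split())
    return words


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def align(directory_a, directory_b):
    """Map each category in directory_a to its most similar category in directory_b."""
    vectors_b = {name: category_vector(docs) for name, docs in directory_b.items()}
    alignment = {}
    for name_a, docs in directory_a.items():
        vector_a = category_vector(docs)
        alignment[name_a] = max(vectors_b, key=lambda n: cosine(vector_a, vectors_b[n]))
    return alignment


# Hypothetical directories: category name -> text of the documents filed under it.
directory_a = {
    "Sports/Soccer": ["soccer league match goals", "soccer world cup teams"],
    "Computers/Programming": ["python programming tutorial code", "java code compiler"],
}
directory_b = {
    "Recreation/Football": ["football soccer match teams goals"],
    "Technology/Software": ["software code programming python java"],
}

alignment = align(directory_a, directory_b)
for source, target in alignment.items():
    print(f"{source} -> {target}")
```

This sketch always returns a best match, even when the similarity is zero; a practical matcher would likely apply a similarity threshold and term weighting.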

One of the algorithms implemented in our matcher

Ontologies used in our experiments
Relevant paper
- Davide Fossati, Gabriele Ghidoni, Barbara Di Eugenio, Isabel Cruz, Huiyong Xiao, and Rajen Subba. The problem of ontology alignment on the web: A first report. EACL 2006, 11th Conference of the European Chapter of the Association for Computational Linguistics, Workshop on Web as Corpus. Trento, Italy. April 2006.