ISILA dataset
The data collected within the scope of the ISILA project have been curated and released as an open dataset on Zenodo so that the learning analytics community can reuse them.
This dataset brings together learning activity data from multiple Learning Management Systems (LMSs)—including Canvas, Moodle, and LAMS—with complementary data collected through surveys and other external tools. All data have been transformed into the Experience API (xAPI) format in an effort to enable interoperability and unified analysis across heterogeneous sources.
The dataset illustrates both the potential and the practical challenges of applying a common standard, such as xAPI, to integrate learning data. Although xAPI provides a shared structure for representing learning experiences, differences in how LMS platforms generate, structure, and semantically encode events lead to inconsistencies that require extensive preprocessing, mapping, and interpretation. Variations in logging granularity, event naming conventions, user identifiers, and contextual metadata highlight the complexity of achieving true interoperability, even when a standard is in place.
In addition to LMS-derived traces, the dataset incorporates weekly survey responses and data from auxiliary tools, offering a richer, multimodal view of learner activity and experience. The integration of these sources required alignment across differing temporal resolutions, data schemas, and levels of abstraction.
This dataset is intended to support research on learning analytics, data interoperability, and educational data standardization. It provides a realistic example of the challenges involved in merging multi-platform educational data and may serve as a benchmark for developing methods in data harmonization, semantic alignment, and cross-system analytics.
Data description
The file isila_all.csv contains the activities in xAPI format with the additional fields University and Course for ease of use.
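As a first look at the file, the sketch below counts rows per University and Course pair using only the Python standard library. It relies solely on the two column names stated above. The function name count_rows_per_course is a hypothetical helper, and the assumption that the file is comma-separated with a header row has not been checked against the published file.

```python
import csv
from collections import Counter


def count_rows_per_course(lines):
    """Count rows per (University, Course) pair.

    `lines` is any iterable of text lines, such as an open file handle.
    Assumes a comma-separated file with a header row that contains the
    University and Course columns named in the description.
    """
    reader = csv.DictReader(lines)
    return Counter((row["University"], row["Course"]) for row in reader)
```

A typical call would open isila_all.csv with open("isila_all.csv", newline="", encoding="utf-8") and pass the handle to count_rows_per_course; the returned Counter offers most_common() to list the largest courses first.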
The file isila_weekly.csv contains the data aggregated per week, including the following fields:
- University: institution offering the course
- Course: course identifier/name
- Week: normalized week index (starting from 1)
- actor.id: anonymized learner identifier
- Concise SRL survey measures: the following variables are weekly average scores derived from the Concise SRL survey.
  - Efficient: average score for perceived efficiency
  - TaskValue: average score for perceived task value
  - Monitoring: average score for monitoring strategies
  - Goals: average score for goal setting
  - Effort: average score for effort regulation
  - Environment: average score for environmental structuring
  - Help: average score for help-seeking behavior
  - Social: average score for social learning aspects
  - TimeMgmt: average score for time management
  - Motivation: average score for motivation
  - Anxiety: average score for anxiety
  - Enjoyment: average score for enjoyment
  - Feedback: average score for feedback perception
  - Metacognition: average score for metacognitive strategies
- Total_events: total number of recorded events in the week
- Days_per_week: number of distinct active days within the week
- Total_sessions: number of identified sessions (sessions are defined using a 15-minute inactivity threshold, i.e., 900 seconds)
- Mean_Sess_Len: average session duration
- Total_time: total time spent across all sessions in the week
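The activity fields above can be approximated from raw event timestamps. The following is a minimal sketch, not the project's actual pipeline: it takes one learner's events for one week and applies the 900-second inactivity threshold stated in the field list. Several choices are assumptions made for illustration: the function name weekly_session_metrics, measuring a session as the time between its first and last event (so a single-event session lasts zero seconds), treating a gap of exactly 900 seconds as the same session, and reporting durations in seconds.

```python
from datetime import datetime, timedelta

# 15-minute inactivity threshold, as stated in the field description.
SESSION_GAP = timedelta(seconds=900)


def weekly_session_metrics(timestamps):
    """Summarise one learner's event timestamps for a single week.

    Assumptions (not confirmed by the dataset documentation): a session
    lasts from its first to its last event, and a new session starts only
    when the gap to the previous event is strictly greater than SESSION_GAP.
    """
    events = sorted(timestamps)
    if not events:
        return {"Total_events": 0, "Days_per_week": 0, "Total_sessions": 0,
                "Mean_Sess_Len": 0.0, "Total_time": 0.0}

    durations = []  # one entry per session, in seconds
    start = prev = events[0]
    for current in events[1:]:
        if current - prev > SESSION_GAP:
            # Close the running session and open a new one.
            durations.append((prev - start).total_seconds())
            start = current
        prev = current
    durations.append((prev - start).total_seconds())  # close the last session

    total = sum(durations)
    return {
        "Total_events": len(events),
        "Days_per_week": len({event.date() for event in events}),
        "Total_sessions": len(durations),
        "Mean_Sess_Len": total / len(durations),
        "Total_time": total,
    }
```

Values computed this way may differ from those in isila_weekly.csv if the original aggregation handled single-event sessions or boundary gaps differently.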
The shared dataset can be used to explore a wide range of research questions in learning analytics, educational data mining, and technology-enhanced learning. A unique feature of this dataset is that it combines behavioral traces from multiple learning platforms with self-reported measures. As such, it enables analyses that link observable activity patterns with learners’ perceptions, strategies, and affective states.
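One simple way to link the two kinds of data is to correlate a weekly activity measure with a weekly survey score. The sketch below computes a Pearson correlation between two columns of an isila_weekly.csv-style file, using Total_events and Motivation as defaults. The helper names pearson and activity_vs_survey are hypothetical, and the handling of missing values (rows with empty or non-numeric cells are skipped) is an assumption about how the file encodes them.

```python
import csv
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def activity_vs_survey(lines, activity="Total_events", survey="Motivation"):
    """Correlate an activity column with a survey column.

    `lines` is an iterable of CSV text lines with a header row. Rows where
    either value is missing or non-numeric are skipped (an assumption about
    how missing survey responses are stored).
    """
    xs, ys = [], []
    for row in csv.DictReader(lines):
        try:
            x, y = float(row[activity]), float(row[survey])
        except (TypeError, ValueError):
            continue
        if math.isnan(x) or math.isnan(y):
            continue
        xs.append(x)
        ys.append(y)
    return pearson(xs, ys)
```

This pools all learner-weeks into one coefficient and ignores that the same learner appears in several weeks; a multilevel or within-person analysis would likely be more appropriate for substantive conclusions.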
The multi-platform nature of the dataset makes it suitable for studying interoperability and data integration challenges. Researchers can use it to investigate how differences in data structures, event semantics, and logging practices across LMSs impact downstream analyses, and to develop or evaluate methods for data harmonization and standardization using xAPI.
Full link
López-Pernas, S., Jovic, J., Conde, M. A., Yordanova, T., Jovanovic, J., Elmoazen, R., Georgiev, A., Konstantinov, O., Pavlović, O., Riego del Castillo, V., Rodríguez-Sedano, F. J., Grujic, A., Ivanova, M., & Saqr, M. (2025). ISILA Piloting curated xAPI dataset [Data set]. Zenodo. https://doi.org/10.5281/ZENODO.18793674