How Psychometric Analysis Turns EdTech Data into Real Evidence of Learning
Think Your EdTech Has Impact? Without Psychometric Validation, You Might Be Guessing
As highlighted in this recent UNESCO blog, many impact claims by EdTech founders are overstated or simply unsupported by evidence. One major reason for this gap is the absence of psychometric analysis: the rigorous process that ensures an assessment or evaluation instrument truly measures what it claims to measure.
Without psychometric validation, it becomes nearly impossible to distinguish between EdTech solutions that merely seem effective and those that genuinely are.
Many EdTech products today include built-in quizzes, progress trackers, or dashboards that provide instant feedback on user performance. These tools can be engaging and informative, but they are rarely psychometrically sound.
Take an early reading app, for example. After a student finishes a story, the app might offer a short quiz to test vocabulary or comprehension. The provider might then compare pre- and post-quiz results across users and conclude that the app “improves reading by 40%.”
From a marketing standpoint, that sounds impressive. But from a research standpoint, it’s deeply problematic. Why? Because the assessment itself may not be reliable or valid. It might be measuring familiarity with the app’s interface rather than actual reading comprehension. It might include items that don’t cover the full skill domain or that are too easy to answer after simple repetition.
Psychometric evaluation is the scientific backbone of meaningful educational measurement. It examines whether an instrument, be it a quiz, rubric, or in-app test, actually captures the construct it claims to measure (such as critical thinking, literacy, or problem-solving), and whether it does so consistently and reliably.
At its core, psychometric analysis focuses on two key properties: reliability and validity.
1. Reliability
Reliability ensures that the tool produces consistent results under consistent conditions. If you measure the same student twice, you should get roughly the same result.
Internal consistency, often measured through Cronbach’s alpha, checks whether items within a test are correlated and collectively measure one coherent construct.
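To make this concrete, here is a minimal sketch of how Cronbach's alpha can be computed from first principles in Python. The quiz matrix below is entirely hypothetical, invented here for illustration; a real analysis would of course use your own item-level data:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = scores.shape[1]                         # number of items
    item_vars = scores.var(axis=0, ddof=1)      # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical quiz data: 5 students x 4 items, each scored 0-3
quiz = np.array([
    [3, 2, 3, 3],
    [2, 2, 1, 2],
    [3, 3, 3, 2],
    [0, 1, 1, 0],
    [2, 1, 2, 2],
])
print(f"alpha = {cronbach_alpha(quiz):.2f}")  # values above ~0.7 are conventionally acceptable
```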
Inter-rater reliability assesses whether human scorers apply criteria consistently, a crucial factor in rubric-based evaluations and teacher-rated assessments.
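A common way to quantify inter-rater agreement for categorical rubric scores is Cohen's kappa, which corrects raw agreement for chance. A short sketch using scikit-learn, with hypothetical ratings from two raters:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical rubric ratings (1-4) from two raters on the same 8 essays
rater_a = [3, 2, 4, 3, 1, 2, 4, 3]
rater_b = [3, 2, 3, 3, 1, 2, 4, 2]
print(f"kappa = {cohen_kappa_score(rater_a, rater_b):.2f}")
```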
Test–retest reliability evaluates whether results remain stable over time, providing confidence that performance changes are not just random variation.
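Test–retest reliability is typically summarized as the correlation between the two administrations. A quick sketch with SciPy, again on invented scores:

```python
from scipy.stats import pearsonr

# Hypothetical scores for the same 6 students, two weeks apart
test = [12, 18, 9, 15, 20, 11]
retest = [13, 17, 10, 14, 19, 12]
r, p = pearsonr(test, retest)
print(f"test-retest r = {r:.2f} (p = {p:.3f})")
```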
2. Validity
While reliability is about consistency, validity is about accuracy: whether the tool measures what it’s supposed to measure.
There are several dimensions of validity, all essential in EdTech evaluations:
Content validity ensures that assessment items adequately cover the learning domain. For instance, a tool measuring socio-emotional skills should assess facets such as empathy, collaboration, and self-awareness, not just one narrow slice of the domain.
Construct validity uses statistical methods like factor analysis to verify that the test structure aligns with theoretical expectations. Do the results actually reflect the skill or competency being measured?
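As one hedged illustration of this idea, the sketch below fits a two-factor model with scikit-learn's FactorAnalysis on simulated item responses. The assumed structure (three vocabulary items and three comprehension items) is baked into the fake data for demonstration; in a real study the factor structure would come from your theory of the construct:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Simulated item-response matrix: 200 students x 6 items,
# where items 0-2 target vocabulary and items 3-5 target comprehension
rng = np.random.default_rng(0)
vocab = rng.normal(size=(200, 1))
comp = rng.normal(size=(200, 1))
X = np.hstack([
    vocab + rng.normal(scale=0.5, size=(200, 3)),  # items 0-2
    comp + rng.normal(scale=0.5, size=(200, 3)),   # items 3-5
])

# If the test is construct-valid, a two-factor solution should load
# items 0-2 and items 3-5 on separate factors
fa = FactorAnalysis(n_components=2, rotation="varimax", random_state=0).fit(X)
print(np.round(fa.components_.T, 2))  # rows = items, columns = factor loadings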
Criterion validity compares the EdTech-generated scores with external benchmarks, such as standardized assessments, teacher evaluations, or long-term academic performance.
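In its simplest form, criterion validity is checked by correlating the tool's scores with the external benchmark. Here is a sketch using SciPy's Spearman rank correlation, which is robust to differences in score scales; all numbers are hypothetical:

```python
from scipy.stats import spearmanr

# Hypothetical: in-app quiz scores vs. an external standardized reading test
app_scores = [62, 75, 58, 90, 81, 70, 66, 85]
benchmark = [60, 78, 55, 88, 84, 72, 61, 90]
rho, p = spearmanr(app_scores, benchmark)
print(f"criterion validity rho = {rho:.2f} (p = {p:.3f})")
```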
Psychometric analysis of your tool is likely to be requested by funders who understand the EdTech field, because for them psychometric rigor is a risk-management tool. When evaluation instruments lack psychometric credibility, funding decisions become guesswork: funders may end up investing in tools that show impressive internal “learning gains” but fail to hold up under independent evaluation, leading to wasted capital and lost opportunities for real system change.
At our Centre, we work with EdTech organizations to conduct reliability and validity analyses, ensuring their evaluation tools meet the standards of educational research. Read about the three case studies in our latest report. If you’d like your solution to be part of our next round of psychometric case studies, reach out to our research team.