Beyond the Hype: Measuring the Real Effectiveness of AI Learning Tools

Written by Jessica Smagler | Jun 25, 2026

Proving that students are learning - especially in new and innovative programs - is harder than it sounds. And the rapid proliferation of AI tools has made this more urgent, not less. Most AI tools promise transformative outcomes but provide little evidence to back them up. For institutions trying to make responsible decisions about what to adopt and whom to trust, the question isn't just "does this work?" - it's "how would we even know?"

As an AI learning company working with institutions across higher education, we've had to think hard about what meaningful evidence looks like and how to build toward it when rigorous outcome data takes time to accumulate.

What we've found is that measuring the impact of a genuinely new kind of educational technology isn't a single leap to a finish line. It's a progression from early signals to deeper evidence, and each stage has real value if you know what it can and can't tell you. We think of it as four stages: Engagement & Confidence, Formative Signals, Persistence & Achievement, and Sustained & Verified Outcomes.

This is a framework built from practice, developed alongside institutions doing this work in real conditions. We offer it as an approach that can help any institution navigate the evidence question more clearly, whatever tools they're evaluating.

Engagement & Confidence: Early Signs That Something Is Working

Engagement and confidence are not learning outcomes, but they are valuable prerequisites. This is especially true when introducing new modalities. Before you can measure what students have learned, you need to know whether they are showing up, staying engaged, and experiencing the instruction as credible and useful. Research in educational psychology is consistent on this: time-on-task and perceived relevance are preconditions for learning. Students who are disengaged aren't learning, regardless of how good the content is. And students who feel confused or unsupported tend to disengage.

Early confidence data at Kyron was encouraging: more than 80% of learners reported feeling more confident after a Kyron lesson and wanted to see more of them in their courses. When a learning tool builds confidence, students are more likely to keep engaging with it. Meanwhile, at one fully online partner university, students were spending over 22 minutes on each Kyron module compared to roughly 3 minutes for traditional video content. Students who spend roughly seven times as long with content are, at a minimum, giving learning a chance.
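For readers who want to check the arithmetic behind the "seven times" figure, here is a minimal sketch. The function name is hypothetical, and the inputs are the rounded values quoted above (22 and 3 minutes), so the result is approximate rather than a precise measurement.

```python
def time_on_task_ratio(new_minutes: float, baseline_minutes: float) -> float:
    """Return how many times as long learners spend on the new format."""
    if baseline_minutes <= 0:
        raise ValueError("baseline_minutes must be positive")
    return new_minutes / baseline_minutes


# Rounded figures from the text: "over 22 minutes" vs. "roughly 3 minutes".
ratio = time_on_task_ratio(22, 3)
print(f"Time-on-task ratio: about {ratio:.1f}x")
```

Because both inputs are rounded, the ratio is best read as "about seven times" and not as an exact multiplier.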

Formative Signals: Seeing Inside the Learning Experience

Formative signals start to tell you whether learning is actually happening. And this is where AI tools, if designed well, have a meaningful advantage over many other modalities.

A textbook can't tell you where a student got confused. A video can't surface a misconception. But an AI tutor, by its very nature, is witnessing student thinking in real time - the questions students ask, the reasoning they attempt, the points where they struggle, and the moments where something clicks. The question is whether a given tool is designed to make that visible and actionable.

Institutions should be asking this directly of any AI learning tool they evaluate: what formative insight does your platform generate, and how does it get into the hands of instructors?

At Kyron, formative insight is central to how the platform works. Learner misconceptions are surfaced to instructors at both the individual and section level. Instructors can access full transcripts of student interactions, seeing exactly how each learner reasoned through a problem, where they needed scaffolding, and how their understanding evolved. And we use those same interaction patterns internally to continuously improve the learner experience.

This kind of data is the bridge between early engagement signals and the outcome measures that ultimately matter. It won't tell you whether students passed - but it will tell you a great deal about whether they're on track.

Persistence & Achievement: Proof That Learning Is Happening

Persistence and achievement are where the framework starts to deliver on its promise. Persistence - whether students stay enrolled, continue engaging, and complete what they started - is one of the most consequential measures in higher education, particularly for the populations most at risk of stopping out. Achievement measures whether they actually learned: grades, pass rates, competency demonstrations.

These are the outcomes institutions care most about. They take time to accumulate, but when they do come, they are the most direct answer to the question this whole framework is designed to answer: is this working?

The evidence across our institutional partners is compelling. At one partner institution, students who engaged with Kyron earned higher grades on case study assignments, a statistically significant difference, and showed stronger persistence rates across multiple health information technology courses. At a community college partner, the pass rate for a gateway English course rose from 68% to 72% - a gain of four percentage points that broke the 70% threshold for the first time in the institution's history. And at a non-profit workforce development organization, Kyron's integration into an HR track led to a 15% increase in course completion and a 20% increase in learner retention.
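When reading figures like these, it helps to separate a change in percentage points from a relative change, since the two can look very different for the same result. The sketch below uses the 68% and 72% pass rates quoted above; the function names are hypothetical and exist only for illustration.

```python
def percentage_point_change(before_pct: float, after_pct: float) -> float:
    """Absolute difference between two rates, in percentage points."""
    return after_pct - before_pct


def relative_change_pct(before_pct: float, after_pct: float) -> float:
    """Change expressed as a percentage of the starting rate."""
    if before_pct == 0:
        raise ValueError("before_pct must be non-zero")
    return (after_pct - before_pct) / before_pct * 100


# Pass rates from the text: 68% before, 72% after.
points = percentage_point_change(68, 72)
relative = relative_change_pct(68, 72)
print(f"{points:.0f} percentage points, or about {relative:.1f}% relative to the starting rate")
```

When a result is reported only as "an X% increase," it is worth asking which of the two measures is meant.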

These are not anecdotes. They are the validation that the earlier signals were pointing in the right direction.

Sustained & Verified Outcomes: Building Evidence That Holds

Programs that can offer solid persistence and achievement data are in a great position to start thinking about even more sophisticated evidence. That might mean longitudinal tracking - following the same cohorts over time to see whether gains persist and compound. It might mean quasi-experimental designs that allow for more rigorous comparisons across sections or populations. Or it might mean pursuing independent, third-party validation that makes findings credible beyond a single institutional context.
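One common quasi-experimental comparison is difference-in-differences, which subtracts the change in a comparison group from the change in the group that used the tool. The sketch below is illustrative only: the function name is hypothetical, the comparison-section pass rates are made-up numbers, and this is not a description of any analysis Kyron or its partners have run.

```python
def difference_in_differences(
    treated_before: float,
    treated_after: float,
    comparison_before: float,
    comparison_after: float,
) -> float:
    """Change in the treated group minus change in the comparison group."""
    treated_change = treated_after - treated_before
    comparison_change = comparison_after - comparison_before
    return treated_change - comparison_change


# HYPOTHETICAL pass rates (percent), invented purely for illustration.
estimate = difference_in_differences(
    treated_before=68.0,
    treated_after=72.0,
    comparison_before=67.0,   # assumed, not real data
    comparison_after=68.0,    # assumed, not real data
)
print(f"Estimated effect: {estimate:.1f} percentage points")
```

The estimate is only as credible as the assumption that both groups would have followed similar trends without the tool, which is why comparison sections need to be chosen carefully.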

At Kyron, this is exactly where we are headed. In the coming months, we will be incorporating an in-app assessment tool that will allow institutions to measure learning gains over time directly within the platform. We're also deepening our understanding of dosage: how much Kyron does a learner need to see meaningful gains? And we're pursuing third-party validation to ensure our findings are as rigorous as students deserve.

AI in higher education will only be as good as our willingness to hold it accountable. That means building measurement frameworks that are honest about what early signals can and can't tell us, and patient enough to follow the evidence all the way to outcomes that actually change student trajectories. The institutions that do this well won't just make better decisions about technology. They'll be better positioned to serve the students who need them most.
