It’s time to get over the accuracy of predictive models in higher education

How should we approach the evaluation of predictive models in higher education?

It is easy to fall into the trap of thinking that the goal of a predictive algorithm is to be as accurate as possible. But, as I have explained previously, the desire to increase the accuracy of a model for its own sake is one that fundamentally misunderstands the purpose of predictive analytics. The goal of predictive analytics in identifying at-risk students is not to ‘get it right,’ but rather to inform action. Accuracy is definitely important here, but it is not the most important thing, and getting hung up on academic conversations about a model can actually obscure its purpose and impede the progress we are able to make in support of student success.

Let’s take a hypothetical example. Consider a model with an accuracy of 70% in predicting a student’s chances of completing a course with a grade of C or higher. A confusion matrix representing this might look something like this:
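The original figure is not reproduced here. As a minimal sketch, the Python below rebuilds a confusion matrix that is consistent with the figures quoted in the rest of this post: 45 students flagged, 30 of them truly at risk, 15% of the cohort missed, and 40 correctly predicted to pass. The cohort of 100 students, the class name ConfusionMatrix, and its field names are assumptions made for illustration; the counts are inferred from the text, not taken from the original chart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary 'at risk of not earning a C or higher' prediction."""

    true_pos: int   # flagged as at risk, and did not earn a C or higher
    false_pos: int  # flagged as at risk, but earned a C or higher
    false_neg: int  # predicted to pass, but did not earn a C or higher
    true_neg: int   # predicted to pass, and earned a C or higher

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg

    @property
    def flagged(self) -> int:
        # Students the advisor would contact if acting only on the model.
        return self.true_pos + self.false_pos

    @property
    def accuracy(self) -> float:
        # Share of all students whose outcome the model predicted correctly.
        return (self.true_pos + self.true_neg) / self.total

    @property
    def precision(self) -> float:
        # Share of flagged students who were truly at risk.
        return self.true_pos / self.flagged

    @property
    def missed_share(self) -> float:
        # Share of the whole cohort that was at risk but not flagged.
        return self.false_neg / self.total


# Assumed counts for a hypothetical cohort of 100 students.
matrix = ConfusionMatrix(true_pos=30, false_pos=15, false_neg=15, true_neg=40)

print(f"accuracy:        {matrix.accuracy:.0%}")
print(f"students flagged: {matrix.flagged}")
print(f"flagged and truly at risk: {matrix.precision:.0%}")
print(f"missed (share of cohort):  {matrix.missed_share:.0%}")
```

Under these assumed counts the model is right 70 times out of 100, flags 45 students, and misses 15% of the cohort. The implied share of students at risk is 45%, a little below the round 50% used for the random-outreach comparison.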

Too much emphasis on model accuracy can lead to a kind of paralysis, or hesitation to reach out to students for fear that the model has misclassified them. Institutions might worry about students falling through the cracks because the model predicted they would pass when they actually failed. But what is worse? Acting wrongly? Or not acting at all?

Let’s consider this from the perspective of the academic advisor. In the absence of a predictive model, advisors may approach their task from one of the following two perspectives.

  • No proactive outreach – this is the traditional walk-in model of academic advising. We know that the students who are most likely to seek out an academic advisor are also among the most likely to succeed anyway. What this means is that an academic advisor will probably only see some portion of 40 students in the above scenario, and make very little impact since those students would probably do just fine without them.
  • Proactively reach out to everyone – we know that proactive advising works, so why not try and reach everyone? This would obviously be amazing! But institutions simply do not have the capacity to do this very well. With average advising loads of 450 students or more, it is impossible for advisors to reach out to all their students in time to ensure that they are on track and remain on track each semester. If an advisor only had the ability to see 50 students before week six of the semester, selected at random, only 25 of the students seen (50%) would actually have been in need of academic support.

Compare the results of each of these scenarios with the results of an advisor who only reaches out to students that the algorithm has identified as being at risk of failure. In this case, an advisor would only need to see 45 students, which means that they have more time available to meet with each of them. True, only 30 of these students would truly be at risk of failing, but this is significantly greater than the number of at-risk students they would otherwise be able to meet with. There is, of course, no harm in meeting with students who are not actually at risk. Complemented by additional information about student performance and engagement, a trained academic advisor could also further triage students flagged as being at risk, and communicate with instructors to increase the accuracy and impact of outreach attempts.

What about the students who fall through the cracks? The students that the model predicts would be successful but who actually fail the course? This is obviously something we’d like to avoid, but 15% is far lower than the 60% that fall through in a traditional advising context, and the 25% that fall through using the scattershot approach. Of course, this is an artificial example, describing an advisor who only makes outreach decisions on the basis of the recommendation produced by the predictive model. In actual fact, however, through a system like Blackboard Predict, advisors and faculty have access to additional information and communications tools to help them to fine-tune their judgments and increase the accuracy and impact of outreach efforts even further.

What I hope this example underscores is that predictive analytics should be viewed as simply a tool. Prediction is not prophecy. It is an opportunity to have a conversation. Accuracy is important, but not to the point that modeling efforts get in the way of the actual interventions that drive student success. It is understandable that institutions might worry that a perceived lack of model accuracy could erode faculty and advisors’ confidence in the model and prevent them from taking action. It is therefore incredibly important that misunderstandings about the nature of prediction, predictive modeling, and action be addressed from the outset so that time and resources can be committed where they will make the greatest impact: in the development and implementation of high impact advising practices that use predictive analytics as a valuable source of information alongside others, including the kind of wisdom that comes through experience.


This is the third in a series of blog posts on misunderstandings about predictive analytics that get in the way of high impact campus adoption. For the first two posts in this series, check out ‘What are predictive analytics? And why does the idea create so much confusion?’ and ‘Predictive analytics are not social science: A common misunderstanding with major consequences for higher education.’

What are predictive analytics? And why does the idea create so much confusion?

The greatest barrier to the widespread impact of predictive analytics in higher education is adoption. No matter how great the technology is, if people don’t use it effectively, any potential value is lost.

In the early stages of predictive analytics implementations at colleges and universities, a common obstacle comes in the form of questions that arise from some essential misunderstandings about data science and predictive analytics.  Without a clear understanding of what predictive analytics are, how they work, and what they do, it is easy to establish false expectations.  When predictive analytics fail to live up to these expectations, the result is disappointment, frustration, poor adoption, and a failure to fully actualize their potential value for student success.

This post is the first in a series of posts addressing common misunderstandings about data science that can have serious consequences for the success of an educational data or learning analytics initiative in higher education.  The most basic misunderstanding that people have is about the language of prediction. What do we mean by ‘predictive’ analytics, anyway?

Why is the concept of ‘Predictive Analytics’ so confusing?

The term ‘predictive analytics’ is used widely, not just in education, but across all knowledge domains. We use the term because everyone else uses it, but it is actually pretty misleading.

I have written about this at length elsewhere, but in a nutshell the term ‘prediction’ has a long history of being associated with a kind of mystical access to true knowledge about future events in a deterministic universe.  The history of the term is important, because it explains why many people get hung up on issues of accuracy, as if the goal of predictive analytics was to become something akin to the gold standard of a crystal ball.  It also explains why others are immediately creeped out by conversations about predictive analytics in higher education, because the term ‘prediction’ carries with it a set of pretty heavy metaphysical and epistemological connotations.  It is not uncommon in discussions of ethics and AI in higher education to hear comparisons between predictive analytics and the world of the film Minority Report (which is awesome), in which government agents are able to intervene and arrest people for crimes before they were committed.  In these conversations, however, it is rarely remembered that Minority Report predictions were quasi-magical in origin, whereas predictive analytics involve computational power applied to incomplete information.

Predictive analytics are not magic, even if the language of prediction sets us up to think of it in this way.  In The Signal and the Noise, Nate Silver suggests that we can begin to overcome this confusion by using the language of forecasting instead.  Where the goal of prediction is to be correct, the goal of a forecast is to be prepared.  I watch the weather channel, not because I want to know what the weather is going to be like, but because I want to know whether I need to pack an umbrella.

In higher education, it is unlikely that we will stop talking about predictive analytics any time soon.  But it is important to shift our thinking and set our expectations along the lines of forecasting.  When it comes to the early identification of at-risk students, our aim is not to be 100% accurate, and we are not making deterministic claims about a particular student’s future behavior.  What we are doing is providing a forecast based on incomplete information about groups of students in the past so that instructors and professional advisors can take action. The goal of predictive analytics in higher education is to offer students an umbrella when the sky turns grey and there is a strong chance of rain.

Why the National Student Clearinghouse matters, and why it should matter more

In analytics circles, it is common to quote Peter Drucker: “What gets measured gets managed.” By quantifying our activities, it becomes possible to measure the impact of decisions on important outcomes, and optimize processes with a view to continual improvement.  With analytics, there comes a tremendous opportunity to make evidence-based decisions where before there was only anecdote.

But there is a flip side to all this.  Where measurement and management go hand in hand, the measurable can easily limit the kinds of things we think of as important.  Indeed, this is what we have seen in recent years around the term ‘student success.’  As institutions have gained more access to their own institutional data, they have gained tremendous insight into the factors contributing to things like graduation and retention rates.  Graduation and retention rates are easy to measure, because they don’t require access to data outside of institutions, and so retention and graduation have become the de facto metrics for student success.  Because colleges and universities can easily report on these things, they are also easy to incorporate into rankings of educational quality, accreditation standards, and government statistics.

But are institutional retention and graduation rates actually the best measures of student success? Or are they simply the most expedient given limitations on data collection standards?  What if we had greater visibility into how students flowed into and out of institutions?  What if we could reward institutions for effectively preparing their students for success at other institutions despite a failure to retain high numbers through to graduation?  In many ways, limited data access between institutions has led to conceptions of student success and a system of incentives that foster competition rather than cooperation, and may in fact create obstacles to the success of non-traditional students.  These are the kinds of questions that have recently motivated a bipartisan group of senators to introduce a bill that would lift a ban on the federal collection of employment and graduation outcomes data.

More than 98% of US institutions provide data and have access to the National Student Clearinghouse.  For years, the National Student Clearinghouse (NSC) has provided a rich source of information about the flow of students between institutions in the U.S., but colleges and universities often struggle with making this information available for easy analysis.  Institutions see the greatest benefit from access to NSC data when they combine it with other institutional data sources, and especially demographic and performance information stored in their student information systems.  This kind of integration is helpful, not only for understanding and mitigating barriers to enrollment and progression, but also as institutions work together to understand the kinds of data that are important to them.  As argued in a recent article in Politico, external rating systems have a significant impact on setting institutional priorities and, in so doing, may have the effect of promoting systematic inequity on the basis of class and other factors. As we see at places like Georgia State University, the more data that an institution has at its disposal, and the more power it has to combine multiple data sources, the more it can align its measurement practices with its own values, and do what’s best for its students.

 

Should edtech vendors stop selling ‘predictive analytics’? A response to Tim McKay

Pedantic rants about the use and misuse of language are a lot of fun. We all have our soap boxes, and I strongly encourage everyone to hop on theirs from time to time. But when we enter into conversations around the use and misuse of jargon, we must always keep two things in mind: (1) conceptual boundaries are fuzzy, particularly when common terms are used across different disciplines, and (2) our conceptual commitments have serious consequences for how we perceive the world.

Tim McKay recently wrote a blog post called Hey vendors! Stop calling what you’re selling colleges and universities “Predictive Analytics”. In this piece, McKay does two things. First, he tries to strongly distinguish the kind of ‘predictive analytics’ work done by vendors from the kind of ‘real’ prediction that is done within his own native discipline, which is astronomy. Second, on the basis of this distinction, he asserts that what analytics companies are calling ‘predictive analytics’ are actually not predictive at all. All of this is to imply what he later says explicitly in a tweet to Mike Sharkey: the language of prediction in higher ed analytics is less about helpfully describing the function of a particular tool, and more about marketing.

What I’d like to do here is to unpack Tim’s claims, and in so doing, soften the kind of strong antagonism that he erects between vendors and the rest of the academy, which is not particularly productive as vendors, higher educational institutions, government, and others seek to work together to promote student success, both in the US and abroad.

What is predictive analytics?

A hermeneutic approach

Let’s begin with defining analytics. Analytics is simply the visual display of quantitative information in support of human decision-making. That’s it. In practice, we see the broad category of analytics sub-divided in a wide variety of ways: by domain (e.g., website analytics), by content area (e.g., learning analytics, supply chain analytics), and by intent (e.g., the common distinction between descriptive, predictive, and prescriptive analytics).

Looking specifically at predictive analytics, it is important not to take the term out of context. In the world of analytics, the term ‘predictive’ always refers to intent. Since analytics is always in the service of human decision-making, it always involves factors that are subject to change on the basis of human activity. Hence, ‘predictive analytics’ involves the desire to anticipate and represent some likely future outcome that is subject to change on the basis of human intervention. When considering the term ‘predictive analytics,’ then, it is important not to consider ‘predictive’ in a vacuum, separate from related terms (descriptive and prescriptive) and the concept of analytics, of which predictive analytics is a type. Pulling a specialized term out of one domain and evaluating it on the terms of another is unfair and is only possible under the presumption that language is static and ontologically bound to specific things.

So, when Tim McKay talks about scientific prediction and complains that predictive analytics do not live up to the rigorous standards of the former, he is absolutely right. But he is right because the language of prediction is deployed in two very different ways. In McKay’s view, scientific prediction involves applying one’s knowledge of governing rules to determine some future state of affairs with a high degree of confidence. In contrast, predictive analytics involves creating a mathematical model that anticipates a likely state of affairs based on observable quantitative patterns in a way that makes no claim to understanding how the world works. Scientific prediction, in McKay’s view, involves an effort to anticipate events that cannot be changed. Predictive analytics involves events that can be changed, and in many cases should be changed.

The distinction that McKay notes is indeed incredibly important. But, unlike McKay, I’m not particularly bothered by the existence of this kind of ambiguity in language. I’m also not particularly prone to lay blame for this kind of ambiguity at the feet of marketers, but I’ll address this later.

An Epistemological Approach

One approach to dealing with the disconnect between scientific prediction and predictive analytics is to admit that there is a degree of ambiguity in the term ‘prediction,’ to adopt a hermeneutic approach, and be clear that the term is simply being deployed relative to a different set of assumptions. In other words, science and analytics are both right.

Another approach, however, might involve looking more carefully at the term ‘prediction’ itself and reconciling science and analytics by acknowledging that the difference is a matter of degree, and that they are both equally legitimate (and illegitimate) in their respective claims to the term.

McKay is actually really careful in the way that he describes scientific prediction. To paraphrase, scientific prediction involves (1) accurate information about a state of affairs (ex., the solar system), and (2) an understanding of the rules that govern changes in that state of affairs (ex., laws of gravity, etc). As McKay acknowledges, both our measurements and understanding of the rules of the universe are imperfect and subject to error, but when it comes to something like predicting an eclipse, the information we have is good enough that he is willing to “bet you literally anything in my control that this will happen – my car, my house, my life savings, even my cat. Really. And I’m prepared to settle up on August 22nd.”

Scientific prediction is inductive. It involves the creation of models that adequately describe past states of affairs, an assumption that the future will behave in very much the same way as the past, and some claim about a future event. It’s a systematic way of learning from experience.  McKay implies that explanatory scientific models are the same as the ‘rules that govern,’ but I feel like his admission that ‘Newton’s law of gravity is imperfect but quite adequate’ admits that they are not in fact the same. Our models might be adequate approximations of the rules, but the rules themselves are eternally out of our reach (a philosophical point that has been borne out time and time again in the history of science).

Scientific prediction involves the creation of a good enough model that, in spite of errors in measurement and assuming that the patterns of the past will persist into the future, we are able to predict something like a solar eclipse with an incredibly high degree of probability. What if I hated eclipses? What if they really ground my gears? If I had enough time, money, and expertise, might it not be possible for me to…

…wait for it…

…build a GIANT LASER and DESTROY THE MOON?!

Based on my experience as an arm-chair science fiction movie buff, I think the answer is yes.

How is this fundamentally different from how predictive analytics works? Predictive analytics involves the creation of mathematical models based on past states of affairs, an admission that models are inherently incomplete and subject to error in measurement, an assumption that the future will behave in ways very similar to the past, and an acknowledgement that predicted future states of affairs might change with human (or extraterrestrial) intervention. Are the models used to power predictive analytics in higher education as accurate as those we have to predict a solar eclipse? Certainly not. Is the data collected to produce predictive models of student success free from error? Hardly. But these are differences in degree rather than differences in the thing itself. By this logic, both predictive analytics and scientific prediction function in the exact same way. The only difference is that the social world is way more complex than the astronomical world.

So, if scientific predictions are predictive, then student risk predictions are predictive as well. The latter might not be as accurate as the former, but the process and assumptions are identical for both.

An admission

It is unfortunate that, even as he grumbles about how the term ‘predictive’ is used in higher education analytics, McKay doesn’t offer a better alternative.

I’ll admit at this point that, with McKay, I don’t love the term ‘predictive.’ I feel like it is either too strong (in that it assumes some kind of god-like vision into the future) or too weak (in that it is used so widely in common speech and across disciplines that it ceases to have a specific meaning). With Nate Silver, I much prefer the term ‘forecast,’ especially in higher education.

In The Signal and the Noise, Silver notes that the terms ‘prediction’ and ‘forecast’ are used differently in different fields of study, and often interchangeably. In seismology, however, the two terms have very specific meanings: “A prediction is a definitive and specific statement about when and where an earthquake will strike: a major earthquake will hit Kyoto, Japan on June 28…whereas a forecast is a probabilistic statement usually over a longer time scale: there is a 60 percent chance of an earthquake in Southern California over the next thirty years.”

There are two things to highlight in Silver’s discussion. First, the term ‘prediction’ is used differently and with varying degrees of rigor depending on the discipline. Second, if we really want to make a distinction, then what we call prediction in higher ed analytics should really be called forecasting. In principle, I like this a lot. When we produce a predictive model of student success, we are forecasting, because we are anticipating an outcome with a known degree of probability. When we take these forecasts and visualize them for the purpose of informing decisions, are we doing ‘forecasting analytics’? ‘forecastive analytics’? ‘forecast analytics’? I can’t actually think of a related term that I’d like to use on a regular basis. Acknowledging that no discipline owns the definition of ‘prediction,’ I’d far rather preserve the term ‘predictive analytics’ in higher education since it both rolls off the tongue, and already has significant momentum within the domain.

Is ‘predictive analytics’ a marketing gimmick?

Those who have read my book will know that I like conceptual history. When we look at the history of the concept of prediction, we find that it has Latin roots and significantly predates the scientific revolution. Quoting Silver again:

The words predict and forecast are largely used interchangeably today, but in Shakespeare’s time, they meant different things.  A prediction was what a soothsayer told you […]

The term forecast came from English’s Germanic roots, unlike predict which is from Latin. Forecasting reflected the new Protestant worldliness rather than the otherworldliness of the Holy Roman Empire. Making a forecast typically implied planning under conditions of uncertainty. It suggested having prudence, wisdom, and industriousness, more like the way we currently use the word foresight.

The term ‘prediction’ has a long and varied history. Its meaning is slippery. But what I like about Silver’s summary of the term’s origins is that it essentially takes it off the table for everyone except those who presume a kind of privileged access to the divine. In other words, using the language of prediction might actually be pretty arrogant, regardless of your field of study, since it presumes to have both complete information and an accurate understanding of the rules that govern the universe. Prediction is an activity reserved for gods, not men.

Digressions aside, the greatest issue that I have with McKay’s piece is that it uses the term ‘prediction’ as a site of antagonism between vendors and the academy. If we bracket all that has been said, and for a second accept McKay’s strong definition of ‘prediction,’ it is easy to demonstrate that vendors are not the only ones misusing the term ‘predictive analytics’ in higher education. Siemens and Baker deploy the term in their preface to the Cambridge Handbook of the Learning Sciences. Manuela Ekowo and Iris Palmer from New America comfortably make use of the term in their recent policy paper on The Promise and Peril of Predictive Analytics in Higher Education. EDUCAUSE actively encourages the adoption of the term ‘predictive analytics’ through large numbers of publications including the Sept/Oct 2016 edition of the EDUCAUSE Review, which was dedicated entirely to the topic. The term appears in the ‘Journal of Learning Analytics,’ and is used in the first edition of the Handbook of Learning Analytics published by the Society for Learning Analytics Research (SoLAR). University administrators use the term. Government officials use the term. The examples are too numerous to cite (a search for “predictive analytics in higher education” in Google Scholar yields about 58,700 results). If we want to establish the true definition of ‘prediction’ and judge every use by this gold standard, then it is not simply educational technology vendors who should be charged with misuse. If there is a problem with how people are using the term, it is not a vendor problem: it is a problem of language, and of culture.

I began this essay by stating that we need to keep two things in mind when we enter into conversations about conceptual distinctions:  (1) conceptual boundaries are fuzzy, particularly when common terms are used across different disciplines, and (2) our conceptual commitments have serious consequences for how we perceive the world.  By now, I hope that I have demonstrated that the term ‘prediction’ is used in a wide variety of ways depending on context and intention.  That’s not a bad thing.  That’s just language.  A serious consequence of McKay’s discussion of how ed tech vendors use the term ‘predictive analytics’ is that it tacitly pits vendors against the interests of higher education, and of students, more generally.  Not only is such a sweeping implication unfair, but it is also unproductive.  It is the shared task of colleges, universities, vendors, government, not-for-profits, and others to work together in support of the success of students in the 21st century.  The language of student success is coalescing in such a way as to make possible a common vision and concerted action around a set of shared goals.  The term ‘predictive analytics’ is just one of many evolving terms that make up our contemporary student success vocabulary, and is evidence of an important paradigm shift in how we view higher education in the US.  Instead of quibbling about the ‘right’ use of language, we should instead recognize that language is shaped by values, and so work together to ensure that the words we use reflect the kinds of outcomes we collectively wish to bring about.

Five strategies for succeeding with data in higher education

What important steps can you take to increase the success of your analytics project on campus?

At the 2017 Blackboard Analytics Symposium, A. Michael Berman, VP for Technology & Innovation at CSU Channel Islands and Chief Innovation Officer for California State University, took a different approach to answering this question. Instead of asking about success, he asked about failure. What would project management look like if we set out to fail from the very beginning? As it turns out, the result looks pretty similar to a lot of well-meaning data projects we see today.

 

What important lessons can we learn from failure?

#1. Set clear goals

Setting clear goals is not easy, but it is an important first step to successfully completing any project. If you don’t know what you are setting out to do, you won’t know when you are done, and you won’t know if you succeeded. Setting clear goals is hard work, not only because it requires careful thinking, but also because it involves communication and consensus. Clear communication of well-defined goals creates alignment, but it also invites disagreement as different stakeholders want to achieve different things. Goal setting is a group exercise that involves bringing key stakeholders together to agree on a set of shared outcomes so you can all succeed together.

#2. Gain executive support

Garnering the support of executive champions is a crucial and often overlooked step. All too often, academic technology units are prevented from scaling otherwise innovative practices simply because no one in leadership knows about them. Support from leadership means access to resources. It means advocacy. It also means accountability.

#3. Think beyond the tech

IT projects are never about technology. They are always about solving specific problems for particular groups of people. For the most part, the people that are served by an analytics project have no interest in what “ETL” means, or what a “star schema” is. All they know is that they lack access to important information. What many IT professionals fail to appreciate is the fact that their language is foreign to a lot of people, and that using overly technical language often serves to compound the very problems they are trying to solve. Access to information without understanding is worse than no access at all.

#4. Maximize communication

Communication is important in two respects. It is important to the health of your analytics project because it ensures alignment and fosters momentum around the clearly defined goals that justified the project in the first place. But it is also important once the project is complete. The completion of an analytics project marks the beginning, not the end. If you wait until the project is complete before engaging your end users, you have an uphill battle ahead of you that is fraught with squandered opportunity. With a goal of ensuring widespread adoption once the analytics project is completed, it’s important to share information, raise awareness, and start training well in advance so that your users are ready and excited to dig in and start seeing results as soon as possible.

#5. Celebrate success

It’s easy to think of celebration as a waste of company resources. People come to work to do a job. They get paid for doing their job. What other reward do people need? But IT projects, and analytics projects in particular, are never ‘done.’ And they are never about IT or analytics. They are about people. Celebration needs to be built into a project in order to punctuate a change in state, and propel the project from implementation into adoption. In the absence of this kind of punctuation, projects never really feel complete, and a lack of closure inhibits the exact kind of excitement that is crucial to achieve widespread adoption.


Originally posted to blog.blackboard.com

Ethics and Predictive Analytics in Higher Education

In March 2017, Manuela Ekowo and Iris Palmer co-authored a report for New America that offered five guiding practices for the ethical use of predictive analytics in higher education.  This kind of work is really important.  It acknowledges that, to the extent that analytics in higher education is meant to have an impact on human behavior, it is a fundamentally ethical enterprise.

Work like the recent New America report is not merely about educational data science.  It is an important facet of educational data science itself.

Are we doing ethics well?

But ethics is hard.  Ethics is not about generating a list of commandments.  It is not about cataloging common opinion.  It is about carefully establishing a set of principles on the basis of which it becomes possible to create a coherent system of knowledge and make consistent judgements in specific situations.

Unfortunately, most work on the ethics of analytics in higher education lacks this kind of rigor.  Instead, ethical frameworks are the result of a process of pooling opinions in such a way as to strike a balance between the needs of a large number of stakeholders including students, institutions, the economy, the law, and public opinion.  To call this approach ethics is to confuse the good with the expedient.

Where should we begin?

An ethical system worthy of the name needs to begin with a strong conception of the Good.  Whether stated or implied, the most common paradigm is essentially utilitarian, concerned with maximizing benefit for the greatest number of people.  The problem with this approach, however, is that it can only ever concern perceived benefit.  People are famously bad at knowing what is good for them.

A benefit of this utilitarian approach, of course, is that it allows us to avoid huge epistemological and metaphysical minefields.  In the absence of true knowledge of the good, we can lean on the wisdom of crowds.  By pooling information about perceived utility, so the theory goes, we can approximate something like the good, or at least achieve enough consensus to mitigate conflict as much as possible.

But what if we were more audacious?  What if our starting point was not the pragmatic desire to reduce conflict, but rather an interest in fostering the fullest expression of our potential as humans?  As it specifically pertains to the domain of educational data analytics, what if we abandoned that instrumental view of student success as degree completion?  What if we began with the question of what it means to be human, and wrestled with the ways in which the role of ‘student’ is compatible and incompatible with that humanity?

Humane data ethics in action

Let’s consider one example of how taking human nature seriously affects how we think about analytics technologies.  As the Italian humanist Pier Paolo Vergerio observed, all education is autodidactic.  When we think about teaching and learning, the teacher has zero ability to confer knowledge.  It is always the learner’s task to acquire it.  True, it is possible to train humans just as we can train all manner of other creatures (operant and classical forms of conditioning are incredibly powerful), but this is not education.  Education is a uniquely human capability whereby we acquire knowledge (with the aim of living life in accord with the Good).  Teachers do not educate.  Teachers do not ‘teach.’  Rather, it is the goal of the teacher to establish the context in which students might become actively engaged as learners.

What does this mean for Education?  Viewed from this perspective, it is incumbent on us as educators to create contexts that bring students to an awareness of themselves as learners in the fullest sense of the word.  It is crucial that we develop technologies that highlight the student’s role as autodidact.  Our technologies need to help bring students to self-knowledge at the same time as they create robust contexts for knowledge acquisition (in addition to providing opportunities for exploration, discovery, experimentation, imagination and other humane attributes).

It is in large part this humanistic perspective that has informed my excitement about student-facing dashboards.  As folks like John Fritz have talked about, one of the great things about putting data in the hands of students is that it furthers institutional goals like graduation and retention as a function of promoting personal responsibility and self-regulated learning.  In other words, by using analytics first and foremost with an interest in helping students to understand and embrace themselves as learners in the fullest sense of the term, we cultivate virtues that translate into degree completion, but also career success and life satisfaction.

In my opinion, analytics (predictive or otherwise) are most powerful when employed with a view to maximizing self-knowledge and the fullest expression of human capability rather than as a way to constrain human behavior to achieve institutional goals.  I am confident that such a virtuous and humanistic approach to educational data analytics will also lead to institutional gains (as indeed we have seen at places like Georgia State University), but worry that where values and technologies are not aligned, both human nature and institutional outcomes are bound to suffer.

Using Analytics to Meet the Needs of Students in the 21st Century

The following is excerpted from a keynote address that I delivered on November 8, 2016 at Texas A&M at Texarkana for its National Distance Education Week Mini-Conference.


Right now in the US, nearly a quarter of all undergraduate students — 4.5 million — are both first generation and low income.

Of these students, only 11% earn a bachelor’s degree in under 6 years. That’s compared to the rest of the population, which sees students graduate at a national rate of 55%. What this means is that 89% of first-generation, low-income students stop out, perpetuating a widespread pattern of socio-economic inequality.

Since 1970, bachelor’s degree attainment among those in the top income quartile in the US has steadily increased from 40.2% to 82.4% in 2009. By contrast, those in the bottom two income quartiles have seen only slight improvements: under an 8 point increase for the bottom two quartiles combined. In the US, a bachelor’s degree means a difference in lifetime earnings of more than 66% compared to those with only a high school diploma.