We live in an increasingly quantified world.
Advances in electronic database architectures, a rapid increase in computing power, and the development of sophisticated strategies for analyzing massive amounts of data have converged to produce analytics as a distinct approach to solving complex problems in a variety of fields. Sitting at the intersection of data warehousing and data mining, analytics has been used in sciences such as physics, biology, and climate science since the 1970s. It is used extensively in business as a tool for optimizing processes, and by marketers as a way of targeting advertisements to particular audiences. We are living at the very beginning of an era of data-driven decision-making, in which sensors are capable of capturing data about nearly every part of our lives, in which massive data warehouses are capable of storing and making this data accessible, and in which computing power and sophisticated data mining techniques are capable of providing feedback to stakeholders in near real-time.
Analytics arrived late to the learning sciences. The first journal dedicated to the use of analytics in the learning sciences, the Journal of Educational Data Mining, only began publication in 2009 (Baker & Siemens, 2013). Since then, however, the field of education has been transformed, as institutions increasingly seek to leverage the power of their existing databases in order to improve efficiency and increase student success by optimizing their learning environments. EDUCAUSE has called learning analytics a ‘Game Changer’. The 2013 Higher Education Edition of the NMC Horizon Report lists data-driven learning and assessment as a mid-range trend likely to take three to five years to create substantial change.
What is Learning Analytics?
In its most common and general formulation, learning analytics is defined as “the measurement, collection, analysis and reporting of data about learners and their contexts, for the purpose of understanding and optimizing learning and the environments in which it occurs” (Long & Siemens, 2011). To those actively involved in the use of learning analytics, it represents a powerful set of tools and techniques that will increasingly allow administrators and instructors to identify at-risk students in order to develop strategies that increase their chances of success. In higher education, the field of learning analytics consists primarily of individuals representing two different perspectives. On the one hand, administrators and university information technology (IT) professionals are interested in leveraging their existing IT infrastructure in order to track student activity and increase student success, usually defined either in terms of retention or grade performance. The primary publication venue for this group is EDUCAUSE and its various publications. On the other hand, there are data scientists who are interested in optimizing learning within particular learning environments. Although there is a growing number of organizations and journals dedicated to learning analytics from this perspective, the most established and prominent venues for scholarly communication in this field are the Journal of Educational Data Mining and the International Conference on Learning Analytics and Knowledge. The orientation of these two communities is symptomatic of the history of analytics more generally, which comes out of IT and business intelligence on the one hand (the orientation of EDUCAUSE), and out of data science on the other (the educational data mining perspective).
What is currently largely lacking from the field of learning analytics, however, is a humanistic viewpoint, a perspective that would first ask deeper questions about the task of education, and only afterward inquire after how best to accomplish that task.
What is Humane Education?
Humanists of the early Italian Renaissance — specifically Vergerio, Bruni, Piccolomini, Guarino, and Vegio — are remarkable because they were first and foremost teachers, and pursued philosophy only as a secondary activity. For these thinkers, philosophy was not an important activity in and of itself, but rather served an important supporting function in the service of the larger task of education. Looking to this Humanist tradition, I will focus on three main themes that address the ‘what,’ the ‘how,’ and the ‘why’ of education. In answer to the ‘what’ question, humane education is concerned with cultivating those habits and sensibilities required in order to be responsive to particular situations. For the Humanists of the Italian Renaissance, education involved cultivating ingenium, or the capacity to rally disparate and apparently unrelated elements and put them together in such a way as to address the demands of a particular here and now. The ‘what’ of humane education is not a content or a method, but rather an openness to the world, a practice of embracing information from multiple and diverse sources in order to be prudent at a moment that calls for action. Second, the ‘how’ of education involves eloquence. Through eloquence, the student is not taught, but rather seduced into a personal relationship with a world of knowledge. It is a mistake to think that the teacher is in any way responsible for the student’s learning, for learning only ever takes place in the learner. The teacher-student relationship is not one in which some datum is effectively transferred from one mind to another. Instead, the teacher functions as a liaison between the student and knowledge with the task of showing, modeling, and inspiring. Lastly, for the Humanists of the Italian Renaissance, who were tremendously influenced by Cicero’s De Oratore, the general aim of education is truth, by which they meant knowledge of the whole.
On the other hand, and more specifically, they gave a special place to self-knowledge as necessary in order to arrive at an understanding of the whole. Giambattista Vico, for example, insisted that “knowledge of oneself is for everyone the greatest incentive to acquire the universe of learning in the shortest period of time.” From the humane perspective, the end of education as a relationship between teacher and student is merely to bring students into an understanding of themselves in such a way as to make learning possible. In order to maximize students’ potential for learning, students must first come to an understanding of themselves as ingenious agents responsible for their own learning activity.
Why Humane Education?
The question of the compatibility of learning analytics and humane education is important for three reasons. First, as mentioned above, in spite of the interdisciplinary aspirations of the emerging field of learning analytics, the humanities are currently almost entirely unrepresented. In fact, within some circles (particularly within the educational data mining community), the kind of non-experimental research that is characteristic of studies in the humanities is treated with relative disdain. If the field of learning analytics seeks to make prescriptive judgments about learning in general, and learning is something that takes place within the sciences and humanities alike, then the field is doing itself a disservice by excluding the humanities from the conversation. Furthermore, since institutional decisions with consequences for the future of higher education are increasingly data-driven, a failure on the part of humanists to take a critical interest is to give up their place at the table.
Second, the role of the humanities is to ask, not just how best to accomplish a particular end, but rather to interrogate the end itself. The language of optimization pervades the field of learning analytics, but there is not a clear consensus within the field about what an optimal state might look like (except, perhaps, the two most common definitions of success: (1) retention through to degree, and (2) a grade of C- or higher). In other words, there is a strong sense that there is a standard of success, but little reflection on what that standard is, and why it should be adopted as the end of education.
Lastly, and most importantly, in the absence of a humanistic perspective, the assumptions underlying the field of learning analytics put it at odds with the demands of education in the twenty-first century. Bauman, Thomas and Brown, Davidson, and a growing contingent of others observe that the social landscape has seen a radical shift since the 1950s, that technological advancement and globalization have thrust us into a world of constant change. In this new social and technological milieu, what is called for on the part of individuals is exactly the kind of ingenious activity demanded by the Humanists of the Italian Renaissance. With respect both to the conception of human nature put forth by these thinkers, and to the skills necessary to survive and thrive in the 21st century, education must be humane in the sense described above. The problem with learning analytics, however, and with the data-driven approach to problem solving in general, is that it tends to undermine the very creativity that education needs to cultivate. In a recent article in Wired Magazine, Felix Salmon notes that in business and politics, with respect to systems of people, ‘quants’ can arrive at highly efficacious insights that have a tremendous amount of predictive power, but that algorithm-powered systems have a way of encouraging people to change their behaviors in perverse ways, in order to provide a system with more of what it is designed to measure. In other words, proxy variables quickly become mistaken for the concepts they are meant to represent. As a consequence, predictive systems end up rewarding conformity and discouraging innovative behaviors that actually produce enduring value.
The question, ‘is learning analytics compatible with humane education,’ is not a question about the use of learning analytics in the humanities. It is rather a question about the use of learning analytics in general. If learning analytics is incompatible with humane education, then it ought to be severely restricted in scope, if not jettisoned entirely, as a technique that can only undermine self-knowledge and human flourishing. What I would like to suggest, however, is that learning analytics and humane education are not incompatible at all. The problem is not with analytics itself, but rather emerges only with a lack of reflection upon what it might mean to incorporate analytics as a meaningful part of a complete philosophy of teaching and learning.
UPDATE 31 January 2017: This blog post was written during 2014. Since that time, Blackboard has made several very important and strategic hires including Mike Sharkey, John Whitmer, and others who are not only well-regarded data scientists, but also passionate educators. Since 2014, Blackboard has become a leader in educational data science, conducting generalizable research to arrive at insights with the potential to make a significant impact on how we understand teaching and learning in the 21st century. Blackboard has changed. Blackboard is now committed to high quality research in support of rigorously defensible claims to efficacy. Blackboard is not in the business of selling magic beans. Blackboard is also not the only company doing excellent work in this way. As this article continues to be read and shared, I still believe it has value. But it should be noted that the concerns that I express here are a reflection of the state of a field and industry still in its infancy. The irony it describes is still present to be sure, and we should all work to increase our data literacy so that we can spot the magic beans where they exist, but it should also be noted that educational technology companies are not enemies. Teachers, researchers, and edtech companies alike are struggling together to understand the impact of their work on student success. Appreciating that fact, and working together in a spirit of honesty and openness, is crucial to the success of students and institutions of higher education in the 21st century.
— Timothy D. Harfield (@tdharfield) August 13, 2014
The learning analytics space is currently dominated, not by scholars, but rather by tool developers and educational technology vendors with a vested interest in getting their products to market as quickly as they possibly can. The tremendous irony of these products is that, on the one hand, they claim to enable stakeholders (students, faculty, administration) to overcome the limitations of anecdotal decision-making and achieve a more evidence-based approach to teaching and learning. On the other hand, however, the effectiveness of the vast majority of learning analytics tools is untested. In other words, vendors insist upon the importance of evidence-based (i.e. data-driven) decision-making, but rely upon anecdotal evidence in support of claims with regard to the value of their analytics products.
In the above presentation, Kent Chen (former Director of Market Development for Blackboard Analytics) offers a startlingly honest account of the key factors motivating the decision to invest in Learning Analytics:
Analytics, I believe, revolves around two key fundamental concepts. The first of these fundamental concepts is a simple question: is student activity a valid indicator of student success? And this question is really just asking, is the amount of work that a student puts in a good indicator of whether or not that student is learning? Now this is really going to be the leap of faith, the jumping off point for a lot of our clients.
Is learning analytics based on a leap of faith? If this is actually the case, then the whole field of learning analytics is premised on a fallacy. Specifically, it begs the question by assuming its conclusion in its premises: “we can use student activity data to predict student success, because student activity data is predictive of student success.” Indeed, we can see this belief in ‘faith as first principle’ in the Blackboard Analytics product itself, which famously fails to report on its own use.
Fortunately for Chen (and for Blackboard Analytics), he’s wrong. During the course of Emory’s year-long pilot of Blackboard Analytics for Learn, we were indeed able to find small but statistically significant correlations between several measures of student activity and success (defined as a grade of C or higher). Our own findings provisionally support the (cautious) use of student course accesses and interactions as heuristics on the basis of which an instructor can identify at-risk students. When it comes to delivering workshops to faculty at Emory, our findings are crucial, not only to making a case in defense of the value of learning analytics for teaching and course design, but also as we discuss how those analytics might most effectively be employed. In fact, analytics is valuable as a way of identifying contexts in which embedded analytic strategies (i.e. student-facing dashboards) might have no, or even negative, effects, and it is incumbent upon institutional researchers and course designers to use the data they have in order to evaluate how to use that data most responsibly. Paradoxically, one of the greatest potential strengths of learning analytics is that it provides us with insight into the contexts and situations where analytics should not be employed.
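The kind of analysis described above can be sketched in a few lines. To be clear, the numbers below are invented for illustration, not Emory's actual data; the only assumptions carried over from the text are the activity measure (course accesses) and a binary success flag (a grade of C or higher). With a binary outcome, the Pearson correlation is the point-biserial correlation.

```python
# Hypothetical sketch: correlating an LMS activity measure with course
# success (grade of C or higher). All data here are invented.
from statistics import mean, pstdev

# (course_accesses, passed) pairs for ten hypothetical students
students = [
    (12, 0), (45, 1), (30, 1), (8, 0), (60, 1),
    (22, 0), (55, 1), (40, 1), (15, 0), (35, 1),
]

def pearson_r(xs, ys):
    """Pearson correlation; with a binary y this is the point-biserial r."""
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

accesses = [a for a, _ in students]
passed = [p for _, p in students]
r = pearson_r(accesses, passed)
print(f"point-biserial r = {r:.2f}")
```

In this toy sample the correlation is strongly positive; in practice, as noted above, such correlations tend to be small (if statistically significant), which is exactly why they should be treated as heuristics for identifying at-risk students rather than as proof that activity causes learning.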
I should be clear that I use Blackboard Analytics as an example here solely for the sake of convenience. In Blackboard’s case, the problem is not so much a function of the product itself (which is a data model that is often mistaken for a reporting platform), but rather of the fact that the company doesn’t understand the product’s full potential, which leads to investment in the wrong areas of product development, cliched marketing, and unsophisticated consulting practices. The same use of anecdotal evidence to justify data-driven approaches to decision-making is endemic to the learning analytics space, dominated as it is by educational technology vendors clamoring to make hay from learning analytics while the sun is shining.
I should also say that these criticisms do not necessarily apply to learning analytics researchers (like those involved with the Society for Learning Analytics Research and scholars involved in educational data mining). This is certainly not to say that researchers do not have their own sets of faith commitments (we all do, as a necessary condition of knowledge in general). Rather, freed from the pressure to sell a product, this group is far more reflective about how they understand concepts. As a community, the fields of learning analytics and educational data mining are constantly grappling with questions about the nature of learning, the definition(s) of student success, how concepts are best operationalized, and how specific interventions might be developed and evaluated. To the extent that vendors are not engaged in this kind of reflective activity — that immediate sales trump understanding — it might be argued that vendors are giving ‘learning analytics’ a bad name, since they and the learning analytics research community are engaged in fundamentally different activities. Or perhaps the educational data science community has made the unfortunate decision to adopt a name for its activity that is already terribly tainted by the tradition of ‘decision-support’ in business, which is itself nothing if not dominated by a similar glut of vendors using a faith in data to sell their magic beans.
This week, Ryan Baker posted a link to a piece, co-written with George Siemens, that is meant to function as an introduction to the fields of Educational Data Mining (EDM) and Learning Analytics (LA). “Educational Data Mining and Learning Analytics” is a book chapter primarily concerned with methods and tools, and does an excellent job of summarizing some of the key similarities and differences between the two fields in this regard. However, in spite of the fact that the authors make a point of explicitly stating that EDM and LA are distinctly marked by an emphasis on making connections to educational theory and philosophy, the theoretical content of the piece is unfortunately quite sparse.
The tone of this work actually brings up some concerns that I have about EDM/LA as a whole. The authors observe that EDM and LA have been made possible, and have in fact been fueled, by (1) increases in technological capacity and (2) advances in business analytics that are readily adaptable to educational environments.
“The use of analytics in education has grown in recent years for four primary reasons: a substantial increase in data quantity, improved data formats, advances in computing, and increased sophistication of tools available for analytics”
The authors also make a point of highlighting the centrality of theory and philosophy in informing methods and interpretation.
“Both EDM and LA have a strong emphasis on connection to theory in the learning sciences and education philosophy…The theory-oriented perspective marks a departure of EDM and LA from technical approaches that use data as their sole guiding point”
My fear, however, which seems justified in light of the imbalance between theory and method in this chapter (a work meant to introduce, summarize, and so represent the two fields), is that the tools and methods that the fields have adopted, along with the technological- and business-oriented assumptions (and language) that those methods imply, have actually had a tendency to drive their educational philosophy. From their past work, I get the sense that Baker and Siemens would both agree that the educational / learning space differs markedly from the kind of spaces we encounter in IT and business more generally. If this is the case, I would like to see more reflection on the nature of those differences, and then to see various statistical and machine learning methods evaluated in terms of their relevance to educational environments as educational environments.
As a set of tools for “understanding and optimizing learning and the environments in which it occurs” (solaresearch.org), learning analytics should be driven, first and foremost, by an interest in learning. This means that each EDM/LA project should begin with a strong conception of what learning is, and of the types of learning that it wants to ‘optimize’ (a term that is, itself, imported from technical and business environments into the education/learning space, and which is not at all neutral). To my mind, however, basic ideas like ‘learning’ and ‘education’ have not been sufficiently theorized or conceptualized by the field. In the absence of such critical reflection on the nature of education, and on the extent to which learning can in fact be measured, it is impossible to say exactly what it is that EDM/LA are taking as their object. How can we measure something if we do not know what it is? How can we optimize something unless we know what it is for? In the absence of critical reflection, and of maintaining a constant eye on our object, it becomes all too easy to consider our object as if its contours are the same as the limits of our methods, when in actual fact we need to be vigilant in our appreciation of just how much of the learning space our methods leave untouched.
If it is true that the field of learning analytics has emerged as a result of, and is driven by, advancements in machine learning methods, computing power, and business intelligence, then I worry about the risk of mistaking the cart for the horse and, in so doing, becoming blind to the possibility that our horse might actually be a mule—an infertile combination of business and education, which is also neither.