Big Data in K-12: Attack of the Recommendation Engines – Part I

Big Data Meets Education

A wave of K-12 entrepreneurial initiatives sees the application of “big data” as the key to instructional technology’s Holy Grail—intelligent real-time differentiated instruction akin to working one-on-one with a brilliant personal instructor. Investors, aware of the powerful strides made in recommendation engines by Internet giants Google, Amazon, LinkedIn, Netflix, and Zynga, as well as for a host of military and commercial applications, see in big data education’s “next big thing.” In this and my next article, I’m going to explore what’s happening in this arena and in voice recognition technology, which, if you look under the hood, can be thought of as being driven by the same advances in data science and recommendation engines. These articles are based in part on my recent View From the Catbird Seat presentation at EdNET 2012. Read on to see what threats and opportunities this new frontier represents for your own organization.

Big Ideas and Market Drivers

Outside of education, big data has been growing up fast because of significant developments in data analytics and meaning and reasoning engines. These have overlapped with advancements in voice recognition technology. Separately and together these are seen as making possible much more powerful computer-based adaptive learning.

Most of us realize the critical role of recommendation engines for firms like Google, Amazon, Netflix, and Pandora. You may be less aware that online games giant Zynga is a massive user of big data technology for real-time optimization of game parameters to maximize income. Back in 2008, Chris Anderson of Wired enthused, “The power of computers and volume of data will soon mean there’ll no longer be much need for theory or even scientific method.” For him and others, strides made in applying big data in commerce, political analysis, and military apps presage a new world of smarter digital systems. The idea penetrated the public imagination even further when, in February 2011, IBM’s Watson computer beat expert human competitors on Jeopardy, and again with Apple’s introduction of Siri for the iPhone.

In education, schools are experiencing an explosion of educational data. Online and hybrid learning models are expanding the data pool. Common Core alone will be creating massive volumes of student data needing analysis. Cory Linton, Executive VP, Operations, of School Improvement Network told me, “NCLB waivers, indirectly requiring the use of value-added growth models for evaluating student gains and teacher effectiveness, complex and poorly understood multivariate phenomena, will also be a K-12 driver for big data analyses.”

Civitas Learning’s website echoes the philosophy of many of the firms looking to catch the big data wave: “More than ever, institutions need to be better served by technology. The latest advancements—big data, predictive analytics, machine learning, and recommendation engines—have transformed other industries. If we can use these technologies to change the way people buy TVs, play games online, and rent movies, surely we can use them to transform education.”

Where We’re Going

Exactly where big data fits in K-12 remains to be seen as numerous entrepreneurial initiatives seek to convert its power elsewhere into improved educational outcomes, but some consequences are already becoming clear:

  • Data interoperability – To play in tomorrow’s data-rich digital environment, products will need to “play nicely” with others, putting a premium on data interchange protocols, such as IMS, SIFA, SCORM, and content tagging schemes, such as the Learning Registry (LR), Shared Learning Infrastructure (SLI), and Learning Resource Metadata Initiative (LRMI).
  • Not just “textbooks” – For digital instructional resources, schools will want adaptive real-time assessment and remediation. An example—more on this later—would be the digital “textbook” for developmental math coming from the Pearson partnership with Knewton, which can be thought of as an adaptive platform controlling content plus curriculum and assessment. Continuously evolving “learning analytics platforms” will aggregate and understand student performance in ways that improve instruction.
  • Value-added growth models – Will ingest a wide range of performance and demographic data to assess student achievement gains and how they are related to the work of individual teachers and instructional programs.

Companies to Watch

Companies making big bets on the power of big data models for education include Amplify (Wireless Generation’s parent), DreamBox Learning, McGraw-Hill (with LearnSmart), Grockit, Knewton, Junyo, Civitas Learning, Revolution K12, and Pierson Labs. Ramona Pierson, co-founder of Pierson Labs, previously founded Synaptic Mash (subsequently acquired by Promethean), another educational system empowered by “data mining.” The Pierson Labs’ website describes their forthcoming products as “big data tools, building on cutting-edge technologies like Hadoop, Cassandra, Clojure, Scala, Cascalog, and Hive that leverage algorithms based on the most advanced versions of machine learning technology.”

Investors Aplenty

Whether big data tools can significantly advance the educational achievement needle remains to be seen (we’ll look at some preliminary results in a minute), but what is already clear is that investors love it. Prominent examples are Knewton, which has raised $54 million and signed a significant co-development and distribution deal with Pearson, and Grockit, which has raised $24 million to date, in part related to its new big data initiative. LearnSmart is a part of McGraw-Hill. CarnegieSpeech, a prominent voice technology firm using a big data engine, has raised $3.4 million. Several other educational big data initiatives are self-funded to some extent by key stakeholders of commercial big data firms. Junyo’s founder and CEO, Steve Schoettler, was a co-founder of Zynga. Manish and Ketan Kothari, who founded AlphaSmart and sold it to Renaissance Learning, have launched Root-1, an educational games firm, and brought in Vibhu Mittal, who spent ten years at Google working on recommendation engines.

Data Science Grows Up

Owen Lawlor, Director, Strategic Technology, Victory Productions, a data technology expert, told me that “data science has advanced from a ‘rearview mirror’ orientation, exemplified by products from Oracle and SAP, to real-time interpretation and predictive modeling. Significant advances have been made in analysis of structured (e.g., numerical) and unstructured data (e.g., natural language text from social media).” According to Lawlor, “80% of world’s data is unstructured log files, mostly natural language, and there aren’t enough skilled people to make adequate use of it [without data science support].” Sophisticated tools, such as multivariate statistics, Bayesian analysis, predictive analytics, machine learning, and artificial intelligence, as well as subject-specific concept maps (the context-sensitive “ontology” that lets machines or, for that matter, humans, interpret language) empower interpretive analysis by computer-based meaning and reasoning engines. Simple examples are the way search engines deal with misspellings and automatically supply semantic equivalents of words and phrases input by users. More sophisticated systems are used in discourse analysis tools, essay graders, and social media trackers. Even more advanced applications, such as computer-supported collaborative learning, are under study at research institutions, such as Carnegie Mellon University and SRI International (SRI). Developments in cloud computing, parallel processing, and mobility are increasingly bringing this big data technology to the masses.

Inside the Magic Boxes: Knewton and Grockit

Let’s have a quick look at these two K-12 companies to see how big data technology plays into their business plans.

Knewton, founded in 2008 by a former Kaplan executive, envisions online courseware in which curricula are atomized into fundamental concepts mapped onto a multidimensional grid through which each student follows a trajectory, perhaps unique, based on mastery of competencies necessary to advance to successful completion. In other words, according to Jessica Shabin, Marketing Associate, Knewton’s courseware utilizes extensive concept mapping of the hierarchy of concepts for which mastery is needed to accomplish a subsequent or higher-order concept. The two basic ideas are that (1) a user’s failure to accomplish a concept can be attributed to failure to correctly utilize one or more of its prerequisite concepts, and (2) as more and more students take the course, big data analyses of their paths to mastery probabilistically identify how to support student advancement, providing an ever more potent personalized adaptive learning experience for each user. Knewton and Pearson have agreed to partner on creating a suite of Pearson MyLab/Mastering offerings in developmental mathematics, English, and writing, powered by Knewton’s Adaptive Learning Platform™. By carefully monitoring student response, the Platform compiles “engagement metrics” used to further personalize the learning experience by assigning students to study groups and presenting material in a format best suited for the individual (e.g., video, textbook, Socratic steps, etc.). It also supports the instructor by turning data into graphics depicting what types of problems the class needs the most additional practice with and how each student is performing in each assignment. Shabin added that authoring these courses is extremely labor-intensive since the concept maps are very intricate and complex. Knewton hopes to automate more of this in the future. A more comprehensive presentation of the logic behind Knewton’s adaptive learning platform and its underlying recommendation engine is here.
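For readers who like to look under the hood, the first of those two ideas (tracing a failed concept back through the concepts it depends on) can be illustrated with a toy sketch. This is not Knewton's engine: the concept names, the PREREQUISITES map, and the candidate_gaps function are all hypothetical, and a real platform would weigh the candidates probabilistically using data from many students rather than simply listing them.

```python
from collections import deque

# Hypothetical concept map (an assumption for illustration only):
# each concept lists the concepts a student must master before it.
PREREQUISITES = {
    "solve_linear_equations": ["combine_like_terms", "inverse_operations"],
    "combine_like_terms": ["integer_arithmetic"],
    "inverse_operations": ["integer_arithmetic"],
    "integer_arithmetic": [],
}


def candidate_gaps(failed_concept, mastered, prerequisites=PREREQUISITES):
    """Return the unmastered concepts that failed_concept depends on.

    Walks the concept map breadth-first, so the concepts closest to the
    failed one come first in the returned list.
    """
    gaps = []
    seen = {failed_concept}
    queue = deque(prerequisites.get(failed_concept, []))
    while queue:
        concept = queue.popleft()
        if concept in seen:
            continue  # already visited via another path
        seen.add(concept)
        if concept not in mastered:
            gaps.append(concept)
        # Keep walking: a deeper gap may sit behind a mastered concept.
        queue.extend(prerequisites.get(concept, []))
    return gaps


if __name__ == "__main__":
    # A student who has only shown mastery of inverse operations.
    print(candidate_gaps("solve_linear_equations", {"inverse_operations"}))
```

Even in this tiny form, the sketch hints at why authoring is so labor-intensive: every dependency in the map has to be spelled out by someone before the software can reason about it.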

Does it work? During the 2011-2012 school year, Knewton and Arizona State University (ASU) piloted the new developmental math course with 6,523 students. The results, while not dramatic, represent important gains in ASU’s eyes. According to Julia Rosen, Associate Vice Provost, ASU Online and Extended Campus:

  • 75% of the students completed the course (an approximately 17% gain).
  • 45% completed course work early.
  • The school experienced a 9% decrease in withdrawal rate compared with its historical average.

Grockit, founded in 2007 by Farb Nivi, a former Teacher of the Year for The Princeton Review and academic director at Kaplan, has made its mark as an SAT prep social learning platform. It derives appeal by organizing students into virtual study groups based on research showing that when students study with others, they spend significantly more time solving problems and are more likely to get answers correct than when they study alone. Last May, building on that success, Grockit expanded to adaptive social learning by launching Learnist, “powered by Netflix-like recommendation engine algorithms that match group participants and serve as an expert tutor based on users’ strengths and weaknesses.” Learnist has been described by Mike Isaac of All Things D as a “wiki-like mash up between Pinterest and Wikipedia. Users find content from across the web—videos, news stories, music, Soundcloud links, and what have you—and post it to a personal board that other users can follow. Teachers curate multimedia lessons for students to follow.” In other words, Grockit wants to re-configure online content, such as YouTube videos, Wikipedia entries, and ebooks into ordered lesson plans. A key resource is Grockit Answers, a new tool that allows a user to start a Q&A from any video on YouTube. According to Audrey Watters, Hack Education blogger, “Rather than having to wade through the comments on YouTube itself, the new tool lets the user watch the video via a separate interface. There, the user can ask questions that are time-stamped, so that the question is tied to a specific spot in the video. When others watch that video, those questions pop up for them at that very spot, and if they can help, they can provide an answer. It’s a nifty way to combine both synchronous and asynchronous feedback for learners (and it’s a good way, too, for teachers to get data about what point in a video students are getting ‘stuck’).” As for the business model, Grockit Academy will be a place where students can learn together and teach each other. It will start with math and English curricula for 8th to 12th grades. The group learning is free, but if students (or their parents) want reports or adaptive learning algorithms to help them get smarter, there’s a $29.99 monthly fee.
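The time-stamped Q&A idea that Watters describes for Grockit Answers is simple enough to sketch in a few lines. What follows is only an illustration under my own assumptions, not Grockit's code: the Question and VideoThread classes, their method names, and the 30-second bucket size are all hypothetical.

```python
from bisect import insort
from collections import Counter
from dataclasses import dataclass, field


@dataclass(order=True)
class Question:
    """A question pinned to a moment in a video (hypothetical structure)."""
    timestamp_s: float                      # position in the video, in seconds
    text: str = field(compare=False)
    answers: list = field(default_factory=list, compare=False)


class VideoThread:
    """Holds the questions for one video, ordered by their timestamps."""

    def __init__(self):
        self.questions = []

    def ask(self, timestamp_s, text):
        question = Question(timestamp_s, text)
        insort(self.questions, question)    # keep the list sorted by time
        return question

    def due(self, previous_s, current_s):
        """Questions to pop up as playback moves from previous_s to current_s."""
        return [q for q in self.questions
                if previous_s < q.timestamp_s <= current_s]

    def stuck_points(self, bucket_s=30):
        """Count questions per time bucket, a rough signal of where viewers stall."""
        return Counter(int(q.timestamp_s // bucket_s) * bucket_s
                       for q in self.questions)


if __name__ == "__main__":
    thread = VideoThread()
    thread.ask(95, "Why does the sign flip here?")
    thread.ask(12, "What does 'coefficient' mean?")
    thread.ask(100, "Can you repeat that step?")
    print([q.text for q in thread.due(90, 96)])
    print(thread.stuck_points())
```

The stuck_points tally is the teacher-facing half of the idea: a cluster of questions around one stretch of a video suggests where students are getting lost.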

Voice Recognition, Hurdles, and Implications

Next month I’ll delve into why today’s advances in voice recognition technology, most popularly evidenced by the iPhone’s Siri platform, are another consequence of big data’s advances and how they’re finding early use in K-12 instructional products. Finally, we’ll look at the business and technology hurdles for applying all this in K-12 and what it will mean for publishers, platforms firms, and service providers.

Dr. Nelson Heller is President of The HellerResults Group, a global strategic consultancy serving business and non-profits seeking growth opportunities in the education market. He is the founder of The Heller Reports newsletters and EdNET: The Educational Networking Conference, both started in 1989. The EdNET News Alert, successor to The Heller Reports publications and now published by MDR, reaches over 31,000 education executives worldwide every week and features a regular column from The HellerResults Group each month. You can learn more about Nelson and his industry leadership at The HellerResults Group. If you need strategic insight, business partners, international connections, stronger boards, keynoters, or entrepreneurial savvy and want the insight of 30 years at the business and technology crossroads of the education market, you can reach Nelson at 858-720-1914, by email at nelson@hellerresults.com, and on Twitter @NelsonHeller.