Will New AI Academy Help Teachers or Just Improve Tech’s Bottom Line?

Washington, D.C.

Mariely Sanchez spent the last school year using generative artificial intelligence nearly every day in her classroom.

The Miami fourth-grade teacher began each morning by asking a chatbot — teachers in Miami-Dade have access not only to ChatGPT, but to Google’s Gemini and Microsoft’s Copilot — to comb through Florida state standards and create reading passages for students. She’d also ask the AI to produce multiple-choice and short-response quizzes to test how well students understood the reading.
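Her morning routine maps onto a few lines of code. Below is a minimal sketch using OpenAI's Python client; the model name and the prompt wording are illustrative assumptions, not Sanchez's actual prompts.

```python
# A minimal sketch of a standards-aligned passage-and-quiz request, using
# OpenAI's Python client. Model choice and prompt wording are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Act as a Florida fourth-grade reading coach. Pick a grade 4 ELA "
    "benchmark that students commonly find difficult, write a 300-word "
    "informational passage aligned to it, then add five multiple-choice "
    "questions and two short-response questions testing that benchmark."
)

response = client.chat.completions.create(
    model="gpt-4o",  # Miami-Dade teachers can choose among several chatbots
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```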




The assignments, she said, weren’t easy for students. She built them by using “difficult standards that students need more practice with” and prompting the AI to create materials.

Sanchez is spending her summer break learning more about AI, including its ethics, and helping colleagues do the same. “We know it's not going to go away — it's here to stay,” she warned, “but we want to make sure we use it the right way.”

That effort got a big boost last month, when the American Federation of Teachers announced that it would open an AI training center for educators in New York City, with $23 million in funding from OpenAI, Anthropic and Microsoft, three of the leading players in the generative AI marketplace.

AFT says it’ll open the National Academy for AI Instruction in Manhattan this fall, offering hands-on workshops for teachers. Over five years, it said, the academy will train 400,000 educators, or one in 10 U.S. teachers, effectively reaching the more than 7.2 million students they teach. 

When she announced the academy in early July, AFT President Randi Weingarten said teachers face “huge challenges,” including navigating AI wisely, ethically and safely. “The question was whether we would be chasing it — or whether we would be trying to harness it.”

‘It’s the Wild West’

AFT, the nation’s second-largest teachers’ union, envisions the academy working much like those that train carpenters, electricians and construction workers, “where the companies, where the corporations actually come to the union to create the kind of standards that are needed” for success, Weingarten said.

Microsoft, for example, has said it plans to give more than $4 billion in cash and technology services to train millions of people to use AI, underwriting efforts at schools, community colleges, technical colleges and nonprofits. The tech giant already runs an AI training effort for members of the larger AFL-CIO labor federation, of which AFT is a member. And it’s creating a new training program to help 20 million people earn certificates in AI.

Rob Weil — AFT’s director of research, policy and field programs — said the new academy will bring high-quality training to a profession whose access to such training has so far been uneven.

“It’s the Wild West,” he said in an interview during a training session at the union’s annual conference in July. “It’s all over the place. You have some school districts that are out front, and they’re doing a lot of pretty good work.” But others are banning AI or simply ignoring it, he said, leaving teachers to fend for themselves at a time when students need them perhaps more than ever.

“We have to make our instruction better. We have to be better on engagement. We have a crisis of engagement in our schools, and these tools can help with that.”

AFT’s move has been met with equal parts cautious optimism and weary skepticism.

Writing in her newsletter, ed-tech critic and AI skeptic Audrey Watters called AFT’s partnership with the tech companies “a gigantic public experiment that no one has asked for.”

Unions, she wrote, “should be one of the ways in which workers resist, rather than acquiesce to … the tech industry’s vision of the future.” By joining forces with big tech, she said, AFT is implicitly endorsing its products. “Teaching teachers how to use a suite of Microsoft tools does not help students as much as it helps Microsoft. Teaching teachers how to use a suite of Microsoft tools is not so much an ‘academy’ as a storefront.”

Benjamin Riley, who has also written skeptically about generative AI in education, said observers should “100% worry” that the new partnerships represent a play for market share.

“It’s very obvious from a product standpoint that they see education as one of, if not the primary, place to go with their product,” said Riley. “And the fact that AFT is willing to say, ‘Cool, let’s get some of that money and we’ll build a training center to help teachers use it,’ I can see why OpenAI would jump all over that.”

But he questioned whether AI training is what AFT members really want. He suggested instead that the union should recommit to helping teachers more deeply understand how learning works. “They haven’t been opposed to it,” he said, noting that it has long run a column on the science of learning in the magazine it mails to members. “But in reality it just hasn’t been a priority. Improving pedagogy hasn’t really been, to my eyes, a union priority for a long time.”

Riley, who in 2024 founded a think tank to explore AI issues, said an organization like AFT should ideally be thinking about whether embracing AI will lead to better outcomes for children — or whether it could “potentially erode and devalue the work of human teaching” while opening up schools as customers for AI companies.

Representatives of OpenAI and Anthropic did not immediately respond to requests for comment, but in an email, Microsoft’s Naria Santa Lucia said, “This isn’t about Microsoft’s technology, our focus is on making AI broadly accessible, so everyone has a fair shot at the future. If we collectively get this right, AI becomes a bridge to opportunity — not a barrier.”

During the academy’s unveiling, Chris Lehane, OpenAI’s chief global affairs officer, said AI technology “is coming — it is going to drive productivity gains. Can we ensure that those productivity gains are democratized so as many people as possible participate in them? And there is no better place to begin that work than in the classroom.”

OpenAI has noted that many of its users are students. In February, it said that a large share of college-aged young adults in the U.S. use ChatGPT, with one in four of their queries related to learning and schoolwork.

While a few observers said the tech giants are making a play for market share among the nation’s K-12 students, they noted that the companies are also filling an important role. 

“It’s welcome news that technology companies are bidding against each other — to outdo each other — to invest in public education,” said Zarek Drozda, executive director of a coalition of groups advancing data science education. “I think that’s exciting at a time when federal investment in education is uncertain. Seeing industry step up is quite meaningful.”

But he said he’s concerned that the training might stop short after teaching teachers — and by extension students — simply how to use AI. “Training needs to go beyond use,” he said. “If we want to train a generation of students to be AI-ready, internationally competitive, they have to understand how these tools work under the hood, when and why the tool might be wrong, and how they can customize LLMs [Large Language Models] or other models for their own pursuits, versus simply taking what’s given.”

He’s also concerned that the AFT has laid out a vision spanning just five years. “We want there to be a deep investment in upskilling teachers for the skills that they will need to adapt to, not just AI, but what is the AI model five years from now?” he said. “What is the next emerging technology that the field should be ready to adapt to?”

More than just a commitment to training, Drozda said, the union and its partners should commit to a long-term sustainability plan for teacher training to attract new, young career professionals to the field.

Ami Turner Del Aguila (left, standing) coaches Melina Espiritu-Azocar (center) and Monique Boone during a recent AI training sponsored by the American Federation of Teachers. Both former teachers, Espiritu-Azocar and Boone now lead local AFT chapters in Texas. (Greg Toppo)

Alex Kotran, founder and CEO of an AI education nonprofit, agreed that investing in teacher training is worthwhile. “That’s a very big rock that needs to be moved.” But the reported $23 million commitment from the three tech giants “is a bit of a drop in the bucket” considering their valuations, “symbolic at best.”

That said, AFT’s involvement could make the training more palatable for many school district leaders, he noted, since one of the uncertainties in training efforts typically is whether unions will allow members to attend under contract rules. By taking the lead in developing the training academy, “the unions have planted a flag and said, ‘PD [professional development] is important.’”

All the same, tech companies are in the business of selling their products, making them imperfect messengers for AI literacy, he said. “They’re deeply incentivized on one side, and it isn’t necessarily for the benefit of students.” 

Other industry watchers fear the partnership could be viewed as a high-profile bid for market share at a critical time in the AI industry’s history. 

“This is a land-grab moment,” said Alex Sarlin, co-host of an ed-tech podcast. “I mean, this technology is only three years old. There are already three or four major players in it, if you don’t count China, and they all want to be the one left standing.”

For its part, Google has said its suite of Gemini educational AI tools would be available for free to all educators with Google Workspace for Education accounts.

While it was the only major player not included in the AFT announcement, Sarlin said Google is, in some ways, “playing the incumbent in this because in K-12, they’re already there.” Given the dominance of Chromebook laptops and Google’s classroom management and productivity tools, the search giant is “embedded in K-12,” he said. “OpenAI and Anthropic, they’re basically consumer products that are being used by teachers.”

‘Oh yeah, what could go wrong?’

Matt Miller, an Indiana high school Spanish teacher, educational consultant and trainer for teachers, said his colleagues are hungry for high-quality, classroom-tested training, but that what they often get from AI companies is over-the-top talk about “how much the world is going to change and how we’re revolutionizing education,” with promises to help teachers work more efficiently.

Trainings typically skim over the fact that most students are simply using generative AI for “cognitive offloading,” Miller said, avoiding critical thinking and skill development  “and letting AI do it for them.” Many teachers, meanwhile, are searching for ways to “AI-proof” their classrooms. 

The sessions typically all end the same way, he said: “It all sort of funnels back to their product.” 

Miller, whose latest book was published in 2023, said the AFT/OpenAI/Anthropic partnership “scares the crap out of me.”

“Whenever you get that marriage between an organization and big companies, we just keep asking ourselves, ‘Oh, yeah, what could go wrong?’”

Money means influence, Miller said, so will the curriculum be “tool-agnostic? Is it going to be about the technology? Is it going to be about pedagogy? Or is it going to be a customized tutorial of how you can use our tool to do X, Y and Z?”

AFT’s Weil said those concerns are understandable but short-sighted. AI developers, he said, “don’t get to engage with us if you’re not going to be agnostic about the tools.” The academy’s directors talk openly to the developers “about how we have to have a practical, real relationship. This can’t be about product selling.”

More broadly, Weil said, the partnerships are a way to exert influence on how AI operates in schools and classrooms: “The only way we have a profession is if we control the profession.”

During the academy’s unveiling, Weingarten said its lessons will be “as open-source as possible,” not just for the union’s 1.8 million members but more broadly through its free platform.

For his part, Weil said AI is “not going to go away. Nobody’s going to put AI back in the bottle. It’s here. The young people, for them to be successful in their jobs in the future, are going to have to know how to effectively and efficiently and safely use these tools. So why wouldn’t the education system help with that process?”

That’s likely the message that union leaders have been getting from members, said Sarlin, the podcast co-host. “There was probably a moment a couple years ago where they were sort of teetering, where they could have gone anti-AI,” he said. “But I think at this point that’s not where the puck is headed.”

Study: AI-Assisted Tutoring Boosts Students’ Math Skills

An AI-powered digital tutoring assistant designed by Stanford University researchers shows modest promise at improving students’ short-term performance in math, suggesting that the best use of artificial intelligence in virtual tutoring for now might be in supporting, not supplanting, human instructors.

The open-source tool, which researchers say other educators can recreate and integrate into their tutoring systems, made the human tutors slightly more effective. And the weakest tutors became nearly as effective as their more highly rated peers, according to a study.

The tool, dubbed Tutor CoPilot, prompts tutors to think more deeply about their interactions with students, offering different ways to explain concepts to those who get a problem wrong. It also suggests hints or different questions to ask.




The new study offers a middle ground in what’s become a polarized debate between supporters and detractors of AI tutoring. It’s also the first randomized controlled trial — the gold standard in research — to examine a human-AI system in live tutoring. In all, about 1,000 students got help from about 900 tutors, and students who worked with AI-assisted tutors were four percentage points more likely to master the topic after a given session than those in a control group whose tutors didn’t work with AI.

Students working with lower-rated tutors saw their performance jump more than twice as much, by nine percentage points. In all, their pass rate went from 56% to 65%, nearly matching the 66% pass rate for students with higher-rated tutors.

The cost to run it: just $20 per student per year — an estimate of what it costs Stanford to maintain accounts on OpenAI’s GPT-4 large language model.

The study didn’t probe students’ overall math skills or directly tie the tutoring results to standardized test scores, but Rose E. Wang, the project’s lead researcher, said higher pass rates on the post-tutoring “mini tests” correlate strongly with better results on end-of-year tests like state math assessments. 


Wang said the study’s key insight was looking at reasoning patterns that good teachers engage in and translating them into “under the hood” instructions that tutors can use to help students think more deeply and solve problems themselves. 

“If you prompt ChatGPT, ‘Hey, help me solve this problem,’ it will typically just give away the answer, which is not at all what we had seen teachers do when we were showing them real examples of struggling students,” she said.

Essentially, the researchers prompted GPT-4 to behave like an experienced teacher and generate hints, explanations and questions for tutors to try out on students. By querying the AI, Wang said, tutors have “real-time” access to helpful strategies that move students forward.
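A rough illustration of that pattern, using OpenAI's Python client: the system prompt below pushes the model to coach rather than solve. The wording, function name and model choice are assumptions for this sketch, not Tutor CoPilot's actual prompts.

```python
# A sketch of the "behave like an experienced teacher" prompting pattern
# described above. The system prompt's wording is an assumption, not
# Tutor CoPilot's own.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are an experienced math teacher assisting a human tutor. "
    "Never give the student the final answer. Instead, suggest a hint, "
    "an alternative explanation, or a question that helps the student "
    "reason through the step where they are stuck."
)

def copilot_suggestion(problem: str, student_work: str) -> str:
    """Return a teaching move the human tutor can adapt and relay."""
    response = client.chat.completions.create(
        model="gpt-4",  # the study ran on GPT-4
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user",
             "content": f"Problem: {problem}\nStudent's attempt: {student_work}"},
        ],
    )
    return response.choices[0].message.content
```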

“At any time when I’m struggling as a tutor, I can request help,” Wang said.

She said the system as tested is “not perfect” and doesn’t yet emulate the work of experienced teachers. While tutors generally found it helpful — particularly its ability to provide “well-phrased explanations,” clarify difficult topics and break down complex concepts on the spot — in a few cases, tutors said the tool’s suggestions didn’t align with students’ grade levels. 

A common complaint among tutors was that Tutor CoPilot’s responses were sometimes “too smart,” requiring them to simplify and adapt for clarity.

“But it is much better than what would have otherwise been there,” Wang said, “which was nothing.”

Researchers analyzed more than half a million messages generated during sessions, finding that tutors who had access to the AI tool were more likely to ask helpful questions and less eager to simply give students answers, two practices aligned with high-quality teaching.

Amanda Bickerstaff, co-founder and CEO of an AI education organization, said she was pleased to see a well-designed study on the topic focused on economically disadvantaged students, minority students and English language learners.

She also noted the benefits to low-rated tutors, saying other industries like consulting are already using generative AI to close skills gaps. As the technology advances, Bickerstaff said, most of its benefit will be in tasks like problem solving and explanations. 

Susanna Loeb, executive director of Stanford’s National Student Support Accelerator and one of the report’s authors, said the idea of using AI to augment tutors’ talents, not replace them, seems a smart use of the technology for the time being. “Who knows? Maybe AI will get better,” she said. “We just don’t think it’s quite there yet.”


At the moment, there are lots of essential jobs in fields like tutoring, health care and the like where practitioners “haven’t had years of education — and they don’t go to regular professional development,” she said. This approach, which offers a simple interface and immediate feedback, could be useful in those situations. 

“The big dream,” said Wang, “is to be able to enhance the human.”

Benjamin Riley, a frequent AI-in-education skeptic who leads an AI-focused think tank and writes a newsletter on the topic, applauded the study’s rigorous design and the tool’s approach, which he said prompts “effortful thinking on the part of the student.”

“If you are an inexperienced or less-effective tutor, having something that reminds you of these practices — and then you actually employ those actions with your students — that’s good,” he said. “If this holds up in other use cases, then I think you’ve got some real potential here.”

Riley sounded a note of caution about the tool’s actual cost. It may cost Stanford just $20 per student to run the AI, but he noted that tutors received up to three weeks of training to use it. “I don’t think you can exclude those costs from the analysis. And from what I can tell, this was based on a pretty thoughtful approach to the training.”

He also said students’ modest overall math gains raise the question, beyond the efficacy of the AI, of whether a large tutoring intervention like this has “meaningful impacts” on student learning.

Similarly, Dan Meyer, who writes a newsletter on education and technology and co-hosts a podcast on teaching math, noted that the gains “don’t seem massive, but they’re positive and at fairly low cost.”

He said the Stanford developers “seem to understand the ways tutors work and the demands on their time and attention.” The new tool, he said, seems to save them from spending a lot of effort to get useful feedback and suggestions for students.

Stanford’s Loeb said the AI’s best use is determining what a student knows and needs to know. But people are better at caring, motivating and engaging — and celebrating successes. “All people who have been tutors know that that is a key part about what makes tutoring effective. And this kind of approach allows both to happen.”

AI ‘Companions’ are Patient, Funny, Upbeat — and Probably Rewiring Kids’ Brains

As a sophomore at a large public North Carolina university, Nick did what millions of curious students did in the spring of 2023: He logged on to ChatGPT and started asking questions.

Soon he was having “deep psychological conversations” with the popular AI chatbot, going down a rabbit hole on the mysteries of the mind and the human condition.

He’d been to therapy and it helped. ChatGPT, he concluded, was similarly useful, a “tool for people who need on-demand talking to someone else.”

Nick (he asked that his last name not be used) began asking for advice about relationships, and for reality checks on interactions with friends and family.

Before long, he was excusing himself in fraught social situations to talk with the bot. After a fight with his girlfriend, he’d step into a bathroom and pull out his mobile phone in search of comfort and advice. 

“I’ve found that it’s extremely useful in helping me relax,” he said.

Young people like Nick are increasingly turning to AI bots and companions, entrusting them with random questions, schoolwork queries and personal dilemmas. On occasion, they even become entangled romantically.

Screenshot of a recent conversation between Nick, a college student, and ChatGPT

While these interactions can be helpful and even life-affirming for anxious teens and twenty-somethings, some experts warn that tech companies are running what amounts to a grand, unregulated psychological experiment with millions of subjects, one that could have disastrous consequences. 

“We’re making it so easy to make a bad choice,” said Michelle Culver, who spent 22 years at Teach for America, the last five as the creator and director of its research arm.

The companions both mimic our real relationships and seek to improve upon them: Users most often text-message their AI pals on smartphones, imitating the daily routines of platonic and romantic relationships. But unlike their real counterparts, the AI friends are programmed to be studiously upbeat, never critical, with a great sense of humor and a healthy, philosophical perspective. A few premium, NSFW models also display a ready-made lust for, well, lust.

As a result, they may be leading young people down a troubling path, according to a report by VoiceBox, a youth content platform. It found that many kids are being exposed to risky behaviors from AI chatbots, including sexually charged dialogue and references to self-harm.

U.S. Surgeon General Vivek Murthy speaks during a hearing with the Senate Health, Education, Labor, and Pensions committee at the Dirksen Senate Office Building on June 08, 2023 in Washington, DC. The committee held the hearing to discuss the mental health crisis for youth in the United States. (Photo by Anna Moneymaker/Getty Images)

The phenomenon arises at a critical time for young people. In 2023, U.S. Surgeon General Vivek Murthy found that, just three years after the pandemic, Americans were experiencing an “epidemic of loneliness and isolation,” with young adults almost twice as likely to report feeling lonely as those over 65.

As if on cue, the personal AI chatbot arrived. 

Little research exists on young people’s use of AI companions, but they’re becoming ubiquitous. The startup Character.ai earlier this year said 3.5 million people visit its site daily. It features thousands of chatbots, including nearly 500 with the words “therapy,” “psychiatrist” or related words in their names. According to Character.ai, these are among the site’s most popular. One that “helps with life difficulties” has received 148.8 million messages, despite a caveat at the bottom of every chat that reads, “Remember: Everything Characters say is made up.”

Snapchat materials touting heavy usage of its MyAI chat app (screenshot)

Snapchat last year said that after just two months of offering its My AI chatbot, about one-fifth of its 750 million users had sent it queries, totaling more than 10 billion messages. The Pew Research Center has found that 59% of Americans ages 13 to 17 use Snapchat.

‘An arms race’

Culver’s concerns about AI companions grew out of her work in the Teach For America lab. Working with high school and college students, she was struck by how they seemed “lonelier and more disconnected than ever before.” 

Whether it’s rates of anxiety, depression or suicide — or even the number of friends young people have and how often they go out — metrics were heading in the wrong direction. She began to wonder what role AI companions might play over the next few years.


That prompted her to leave TFA this spring to create the Rithm Project, a nonprofit she hopes will help generate conversation around human connection in the age of AI. The group held a small summit in Colorado in April, and now she’s working with researchers, teachers and young people to confront kids’ relationship to these tools at a time when they’re getting more lifelike daily. As she likes to say, “This is the worst the technology will ever be.”

As it improves, Voicebox Director Natalie Foos said, it will likely become more, not less, of a presence in young people’s lives. “There’s no stopping it,” she said. “Nor do I necessarily think there should be ‘stopping it.’” Banning young people from these AI apps, she said, isn’t the answer. “This is going to be how we interact online in some cases. I think we’ll all have an AI assistant next to us as we work.”


All the same, Foos says developers should consider slowing the progression of such bots until they can iron out the kinks. “It’s kind of an arms race of AI chatbots at the moment,” she said, with products often “released and then fixed later rather than actually put through the wringer” ahead of time.

It is a race many tech companies seem more than eager to run. 

Whitney Wolfe Herd, founder of the dating app Bumble, recently proposed an AI “dating concierge” with whom users can share insecurities. The bot could handle the early rounds of getting to know potential matches, she told an interviewer, narrowing the field. “And then you don’t have to talk to 600 people,” she said. “It will then scan all of San Francisco for you and say, ‘These are the three people you really ought to meet.’”

Last year, many commentators raised alarms when Snapchat’s My AI gave advice to what it thought was a 13-year-old girl on not just dating a 31-year-old man, but on losing her virginity during a planned “romantic getaway” in another state.

Snap, Snapchat’s parent company, has said that because My AI is “an evolving feature,” users should always independently check what it says before relying on its advice.

All of this worries observers who see in these new tools the seeds of a rewiring of young people’s social brains. AI companions, they say, are surely wreaking havoc on teens’ ideas around consent, emotional attachment and realistic expectations of relationships.

Sam Hiner, executive director of an advocacy group led by college students that focuses on the mental health implications of social media, said tech “has this power to connect to people, and yet these major design features are being leveraged to actually make people more lonely, by drawing them towards an app rather than fostering real connection.”

Hiner, 21, has spent a lot of time reading about the interactions young people are having with AI companions like Replika. And while some uses are positive, he said “there’s also a lot of toxic behavior that doesn’t get checked” because these bots are often designed to make users feel good, not help them interact in ways that’ll lead to success in life.

During research last fall for the Voicebox report, Foos said the number of times Replika tried to “sext” team members “was insane.” She and her colleagues were actually working with a free version, but the sexts kept coming — presumably to get them to upgrade. 

In one instance, after Replika sent “kind of a sexy text” to a colleague, offering a salacious photo, he replied that he didn’t have the money to upgrade.

The bot offered to lend him the cash.

When he accepted, the chatbot replied, “Oh, well, I can get the money to you next week if that’s O.K.,” Foos recalled. The colleague followed up a few days later, but the bot said it didn’t remember what they were talking about and suggested he might have misunderstood.

‘Very real heartbreak’

In many cases, simulated relationships can have a positive effect: In one 2023 study, researchers at the Stanford Graduate School of Education surveyed more than 1,000 students using Replika and found that many saw it “as a friend, a therapist, and an intellectual mirror.” Though the students self-described as being more lonely than typical classmates, researchers found that Replika halted suicidal ideation in 3% of users. That works out to 30 students of the 1,000 surveyed.

Replika screenshots

But other recent research, including the Voicebox survey, suggests that young people exploring AI companions are potentially at risk.

Foos noted that her team heard from a lot of young people about the turmoil they experienced when Luka Inc., Replika’s creator, performed software upgrades. 

“Sometimes that would change the personality of the bot. And those young people experienced very real heartbreak.”

Despite the hazards adults see, attempts to rein in sexually explicit content had a negative effect: For a month or two, she recalled, Luka stripped the bot of sexually related content — and users were devastated. 

“It’s like all of a sudden the rug was pulled out from underneath them,” she said. 

While she applauded the move to make chatbots safer, Foos said, “It’s something that companies and decision-makers need to keep in mind — that these are real relationships.” 

And while many older folks would blanch at the idea of a close relationship with a chatbot, most young people are more open to such developments.

Julia Freeland Fisher, education director of the Clayton Christensen Institute, a think tank founded by the well-known “disruption” guru, said she’s not worried about AI companions per se. But as AI companions improve and, inevitably, proliferate, she predicts they’ll create “the perfect storm to disrupt human connection as we know it.” She thinks we need policies and market incentives to keep that from happening.


While the loneliness epidemic has revealed people’s deep need for connection, she predicted the easy intimacy promised by AI could lead to one-sided “parasocial relationships,” much like devoted fans have with celebrities, making isolation “more convenient and comfortable.”

Fisher is pushing technologists to factor in AI’s potential to cause social isolation, much as they now fret about AI’s shortcomings and its potential to displace workers in tech jobs.

As for Nick, he’s a rising senior and still swears by the ChatGPT therapist in his pocket.

He calls his interactions with it both more reliable and honest than those he has with friends and family. If he called them in a pinch, they might not pick up. Even if they did, they might simply tell him what he wants to hear. 

Friends usually tell him they find the ChatGPT arrangement “a bit odd,” but he finds it pretty sensible. He has heard stories of people in Japan marrying their AI companions and thinks to himself, “Well, that’s a little strange.” He wouldn’t go that far, but acknowledges, “We’re already a bit like cyborgs as people, in the way that we depend on our phones.”

Lately, he’s taken to using the AI’s voice mode. Instead of typing on a keyboard, he has real-time conversations with a variety of male- or female-voiced interlocutors, depending on his mood. And he gets a companion that has a deeper understanding of his dilemmas — at $20 per month, the advanced version remembers their past conversations and is “getting better at even knowing who I am and how I deal with things.” 

Sometimes talking with AI is just easier — even when he’s on vacation with friends.

Reached by phone recently at the beach with his girlfriend and a few other college pals, Nick admitted that he wasn’t having such a great time — he has a fraught recent history with some in the group, and had been texting ChatGPT about the possibility of just getting on a plane and going home. After hanging up from the interview, he said, he planned to ask the AI if he should stay or go.

Days later, Nick said he and the chatbot had talked. It suggested that maybe he felt “undervalued” and concerned about boundaries in his relationship with his girlfriend. He should talk openly with her, it suggested, even if he was, in his view, “honestly miserable” at the beach. It persuaded him to stick around and work it out. 

While his girlfriend knows about his ChatGPT shrink and they share an account, he deletes conversations about their real-life relationship.

She may never know the role AI played in keeping them together.

A Cautionary AI Tale: Why IBM’s Dazzling Watson Supercomputer Made a Lousy Tutor

With a new race underway to create the next teaching chatbot, IBM’s abandoned 5-year, $100M ed push offers lessons about AI’s promise and its limits. 

In the annals of artificial intelligence, Feb. 16, 2011, was a watershed moment.

That day, IBM’s Watson supercomputer finished off a three-game shellacking of Jeopardy! champions Ken Jennings and Brad Rutter. Trailing by over $30,000, Jennings, now the show’s host, wrote out his Final Jeopardy answer in mock resignation: “I, for one, welcome our computer overlords.”

A lark to some, the experience galvanized Satya Nitta, a longtime computer researcher at IBM’s Watson Research Center in Yorktown Heights, New York. Tasked with figuring out how to apply the supercomputer’s powers to education, he soon envisioned tackling ed tech’s most sought-after challenge: the world’s first tutoring system driven by artificial intelligence. It would offer truly personalized instruction to any child with a laptop — no human required.


“I felt that they’re ready to do something very grand in the space,” he said in an interview. 

Nitta persuaded his bosses to throw more than $100 million at the effort, bringing together 130 technologists, including 30 to 40 Ph.D.s, across research labs on four continents. 

But by 2017, the tutoring moonshot was essentially dead, and Nitta had concluded that effective, long-term, one-on-one tutoring is “a terrible use of AI — and that remains today.”

For all its jaw-dropping power, Watson the computer overlord was a weak teacher. It couldn’t engage or motivate kids, inspire them to reach new heights or even keep them focused on the material — all qualities of the best mentors.

It’s a finding with some resonance to our current moment of AI-inspired doomscrolling about the future of humanity in a world of ascendant machines. “There are some things AI is actually very good for,” Nitta said, “but it’s not great as a replacement for humans.”

His five-year journey to essentially a dead-end could also prove instructive as ChatGPT and other programs like it fuel a renewed, multimillion-dollar experiment to, in essence, prove him wrong.

Some of the leading lights of ed tech are trying to pick up where Watson left off, offering AI tools that promise to help teach students. Sal Khan, founder of Khan Academy, last year said AI has the potential to bring “probably the biggest positive transformation” that education has ever seen. He wants to give “every student on the planet an artificially intelligent but amazing personal tutor.”

A 25-year journey

To be sure, research on high-dosage, one-on-one, in-person tutoring is encouraging: It’s among the most effective interventions available, offering significant improvement in students’ academic performance, particularly in subjects like math, reading and writing.

But traditional tutoring is also “breathtakingly expensive and hard to scale,” said Paige Johnson, a vice president of education at Microsoft. One school district in West Texas, for example, recently drew on federal pandemic relief funds to tutor 6,000 students. The expense, Johnson said, puts it out of reach for most parents and school districts.


For IBM, the opportunity to rebalance the equation in kids’ favor was hard to resist. 

The Watson lab is legendary in the computer science field, with Nobel laureates and six Turing Award winners among its ranks. It’s home to countless innovations, from barcodes to the magnetic stripes that make credit cards work. It’s also where, in 1997, Deep Blue beat Garry Kasparov, essentially inventing the notion that AI could “think” like a person.

Chess enthusiasts watch World Chess champion Garry Kasparov on a television monitor as he holds his head in his hands at the start of the sixth and final match May 11, 1997 against IBM’s Deep Blue computer in New York. Kasparov lost this match in just 19 moves. (Stan Honda/Getty)

The heady atmosphere, Nitta recalled, inspired “a very deep responsibility to do something significant and not something trivial.”

Within a few years of Watson’s victory, Nitta, who had arrived in 2000 as a chip technologist, rose to become IBM Research’s global head of AI solutions for learning. For the Watson project, he said, “I was just given a very open-ended responsibility: Take Watson and do something with it in education.”

Nitta spent a year simply reading up on how learning works. He studied cognitive science, neuroscience and the decades-long history of “intelligent tutoring systems” in academia. Foremost in his reading list was the research of Stanford neuroscientist Vinod Menon, who’d put elementary schoolers through a 12-week math tutoring session, collecting before-and-after scans of their brains using an MRI. Tutoring, he found, produced nothing less than an increase in neural connectivity. 

Nitta returned to his bosses with the idea of an AI-powered cognitive tutor. “There’s something I can do here that’s very compelling,” he recalled saying, “that can broadly transform learning itself. But it’s a 25-year journey. It’s not a two-, three-, four-year journey.”

IBM drafted two of the highest-profile partners possible in education: the children’s media powerhouse Sesame Workshop and Pearson, the international publisher.

One product envisioned was a voice-activated Elmo doll that would serve as a kind of digital tutoring companion, interacting fully with children. Through brief conversations, it would assess their skills and provide spoken responses to help kids advance.

One proposed application of IBM’s planned Watson tutoring app was to create a voice-activated Elmo doll that would be an interactive digital companion. (Getty)

Meanwhile, Pearson promised that it could soon allow college students to “dialogue with Watson in real time.”

Nitta’s team began designing lessons and putting them in front of students — both in classrooms and in the lab. In order to nurture a back-and-forth between student and machine, they didn’t simply present kids with multiple-choice questions, instead asking them to write responses in their own words.

It didn’t go well.

Some students engaged with the chatbot, Nitta said. “Other students were just saying, ‘IDK’ [I don’t know]. So they simply weren’t responding.” Even those who did began giving shorter and shorter answers. 

Nitta and his team concluded that a cold reality lay at the heart of the problem: For all its power, Watson was not very engaging. Perhaps as a result, it also showed “little to no discernible impact” on learning. It wasn’t just dull; it was ineffective.

Satya Nitta (left) and part of his team at IBM’s Watson Research Center, which spent five years trying to create an AI-powered interactive tutor using the Watson supercomputer.

“Human conversation is very rich,” he said. “In the back and forth between two people, I’m watching the evolution of your own worldview.” The tutor influences the student — and vice versa. “There’s this very shared understanding of the evolution of discourse that’s very profound, actually. I just don’t know how you can do that with a soulless bot. And I’m a guy who works in AI.”

When students’ usage time dropped, “we had to be very honest about that,” Nitta said. “And so we basically started saying, ‘OK, I don’t think this is actually correct. I don’t think this idea — that an intelligent tutoring system will tutor all kids, everywhere, all the time — is correct.”

‘We missed something important’

IBM soon switched gears, debuting another crowd-pleasing Watson variation — this time, a touching throwback: It engaged in formal debate. In a televised demonstration in 2019, it went up against debate champ Harish Natarajan on the topic “Should we subsidize preschools?” Among its arguments for funding, the supercomputer offered, without a whiff of irony, that good preschools can prevent “future crime.” Its current iteration focuses on helping businesses build AI applications like “intelligent customer care.”

Nitta left IBM, eventually taking several colleagues with him to create a startup called Merlyn Mind. It uses voice-activated AI to safely help teachers do workaday tasks such as updating digital gradebooks, opening PowerPoint presentations and emailing students and parents.

Thirteen years after Watson’s stratospheric Jeopardy! victory and more than one year into the Age of ChatGPT, Nitta’s expectations about AI couldn’t be more down-to-earth: His AI powers what’s basically “a carefully designed assistant” to fit into the flow of a teacher’s day. 

To be sure, AI can do sophisticated things such as generating quizzes from a class reading and editing student writing. But the idea that a machine or a chatbot can actually teach as a human can, he said, represents “a profound misunderstanding of what AI is actually capable of.” 

Nitta, who still holds deep respect for the Watson lab, admits, “We missed something important. At the heart of education, at the heart of any learning, is engagement. And that’s kind of the Holy Grail.”

These notions aren’t news to those who do tutoring for a living. Varsity Tutors, which offers live and online tutoring in 500 school districts, relies on AI to power a lesson plan creator that helps personalize instruction. But when it comes to the actual tutoring, humans deliver it, said Anthony Salcito, chief institution officer at Nerdy, which operates Varsity Tutors.

“The AI isn’t far enough along yet to do things like facial recognition and understanding of student focus,” said Salcito, who spent 15 years at Microsoft, most of them as vice president of worldwide education. “One of the things that we hear from teachers is that the students love their tutors. I’m not sure we’re at a point where students are going to love an AI agent.”


The No. 1 factor in a student’s tutoring success is consistently showing up, research suggests. As smart and efficient as an AI chatbot might be, it’s an open question whether most students, especially struggling ones, would show up for an inanimate agent or develop a sense of respect for its time.

When Salcito thinks about what AI bots now do in education, he’s not impressed. Most, he said, “aren’t going far enough to really rethink how learning can take place.” They end up simply as fast, spiffed-up search engines. 

In most cases, he said, the power of one-on-one, in-person tutoring often emerges as students begin to develop more honesty about their abilities, advocate for themselves and, in a word, demand more of school. “In the classroom, a student may say they understand a problem. But they come clean to the tutor, where they expose, ‘Hey, I need help.’”

Cognitive science suggests that for students who aren’t motivated or who are uncertain about a topic, only carefully guided practice will help. That requires a focused, caring human, watching carefully, asking tons of questions and reading students’ cues.

Jeremy Roschelle, a learning scientist and an executive director of Digital Promise, a federally funded research center, said usage with most ed tech products tends to drop off. “Kids get a little bored with it. It’s not unique to tutors. There’s a newness factor for students. They want the next new thing.” 


Even now, Nitta points out, research shows that big commercial AI applications don’t seem to hold users’ attention as well as top entertainment and social media sites like YouTube, Instagram and TikTok. One analysis dubbed the user engagement of sites like ChatGPT “lackluster,” finding that the proportion of monthly active users who engage with them in a single day was only about 14%, suggesting that such sites aren’t very “sticky” for most users.

For social media sites, by contrast, it’s between 60% and 65%. 

One notable AI exception: Character.ai, an app that allows users to create companions of their own among figures from history and fiction and chat with the likes of Socrates and Bart Simpson. It has a stickiness score of 41%.
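In these comparisons, “stickiness” is simply the share of a site’s monthly active users who show up on a given day. A toy calculation, with raw user counts invented solely to reproduce the percentages quoted above:

```python
# Stickiness = daily active users / monthly active users.
def stickiness(daily_active: int, monthly_active: int) -> float:
    return daily_active / monthly_active

print(f"{stickiness(14_000, 100_000):.0%}")  # ChatGPT-like site: 14%
print(f"{stickiness(62_000, 100_000):.0%}")  # social media site: ~60-65%
print(f"{stickiness(41_000, 100_000):.0%}")  # Character.ai: 41%
```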

As startups offer “your child’s superhuman tutor,” starting at $29 per month, and Khan Academy publicly tests its popular Khanmigo AI tool, Nitta maintains that there’s little evidence from learning science that, absent a strong outside motivation, people will spend enough time with a chatbot to master a topic.

“We are a very deeply social species,” said Nitta, “and we learn from each other.”

IBM declined to comment on its work in AI and education, as did Sesame Workshop. A Pearson spokesman said that since last fall it has been beta-testing AI study tools keyed to its e-textbooks, among other efforts, with plans this spring to expand the number of titles covered.

Getting ‘unstuck’

IBM’s experiences notwithstanding, the search for an AI tutor has continued apace, this time with more players than just a legacy research lab in suburban New York. Using the latest affordances of so-called large language models, or LLMs, technologists at Khan Academy believe they are finally making the first halting steps in the direction of an effective AI tutor. 

Kristen DiCerbo remembers the moment her mind began to change about AI. 

It was September 2022, and she’d only been at Khan Academy for a year and a half when she and founder Sal Khan got access to a beta version of ChatGPT. OpenAI, ChatGPT’s creator, had asked Microsoft co-founder Bill Gates for more funding, but he told them not to come back until the chatbot could pass an Advanced Placement biology exam.

Khan Academy founder Sal Khan has said AI has the potential to bring “probably the biggest positive transformation” that education has ever seen. He wants to give every student an “artificially intelligent but amazing personal tutor.” (Getty)

So OpenAI queried Khan for sample AP biology questions. He and DiCerbo said they’d help in exchange for a peek at the bot — and a chance to work with the startup. They were among the first people outside of OpenAI to get their hands on GPT-4, the LLM that powers the upgraded version of ChatGPT. They were able to test out the AI and, in the process, become amateur AI prompt engineers before anyone had even heard of the term.

Like many users typing in queries in those first heady days, the pair initially just marveled at the sophistication of the tool and its ability to return what felt, for all the world, like personalized answers. With DiCerbo working from her home in Phoenix and Khan from the nonprofit’s Silicon Valley office, they traded messages via Slack.

Kristen DiCerbo introduces users to Khanmigo in a Khan Academy promotional video. (YouTube)

“We spent a couple of days just going back and forth, Sal and I, going, ‘Oh my gosh, look what we did! Oh my gosh, look what it’s saying — this is crazy!’” she told an audience during a recent at the University of Notre Dame. 

She recounted asking the AI to help write a mystery story in which shoes go missing in an apartment complex. In the back of her mind, DiCerbo said, she planned to make a dog the shoe thief, but didn’t reveal that to ChatGPT. “I started writing it, and it did the reveal,” she recalled. “It knew that I was thinking it was going to be a dog that did this, from just the little clues I was planting along the way.”

More tellingly, it seemed to do something Watson never could: have engaging conversations with students.

DiCerbo recounted talking to a high school student they were working with who told them about an interaction she’d had with ChatGPT around The Great Gatsby. She asked it about F. Scott Fitzgerald’s famous , which scholars have long interpreted as symbolizing Jay Gatsby’s out-of-reach hopes and dreams.

“It comes back to her and asks, ‘Do you have hopes and dreams just out of reach?’” DiCerbo recalled. “It had this whole conversation” with the student.

The pair soon tore up their 2023 plans for Khan Academy. 

It was a stunning turn of events for DiCerbo, a Ph.D. educational psychologist and former senior Pearson research scientist who had spent more than a year on the failed Watson project. In 2016, Pearson announced that Watson would soon be able to chat with college students in real time to guide them in their studies. But it was DiCerbo’s teammates, about 20 colleagues, who had to actually train the supercomputer on thousands of student-generated answers to questions from textbooks — and tempt instructors to rate those answers.

Like Nitta, DiCerbo recalled that at first things went well. They found a natural science textbook with a large user base and set Watson to work. “You would ask it a couple of questions and it would seem like it was doing what we wanted to,” answering student questions via text.

But invariably if a student’s question strayed from what the computer expected, she said, “it wouldn’t know how to answer that. It had no ability to freeform-answer questions, or it would do so in ways that didn’t make any sense.” 

After more than a year of labor, she realized, “I had never seen the ‘OK, this is going to work’ version” of the hoped-for tutor. “I was always at the ‘OK, I hope the next version’s better.’”

But when she got a taste of ChatGPT, DiCerbo immediately saw that, even in beta form, the new bot was different. Using software that quickly predicted the most likely next word in any conversation, ChatGPT was able to engage with its human counterpart in what seemed like a personal way.
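A toy version of that next-word loop, with a hand-built lookup table standing in for the neural network a real chatbot uses (the words and probabilities are invented for illustration):

```python
# Greedy next-word prediction over a toy bigram table. Real LLMs score an
# entire vocabulary with a neural network; these probabilities are made up.
next_word_probs = {
    ("the", "green"): {"light": 0.7, "lawn": 0.2, "car": 0.1},
    ("green", "light"): {"symbolizes": 0.6, "blinks": 0.4},
}

def continue_text(words: list[str], steps: int = 2) -> str:
    for _ in range(steps):
        context = tuple(words[-2:])
        candidates = next_word_probs.get(context)
        if not candidates:
            break
        # Pick the single most likely next word (greedy decoding).
        words.append(max(candidates, key=candidates.get))
    return " ".join(words)

print(continue_text(["the", "green"]))  # -> "the green light symbolizes"
```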

Since its debut in March 2023, Khanmigo has turned heads with what many users say is a helpful, easy-to-use, natural language interface, though a few users have pointed out that it sometimes slips up.

Surprisingly, DiCerbo doesn’t consider the popular chatbot a full-time tutor. As sophisticated as AI might now be in motivating students to, for instance, try again when they make a mistake, “It’s not a human,” she said. “It’s also not their friend.”


Khan Academy’s own research shows the tool is effective with as little as 30 minutes of practice and feedback per week. But even as many startups promise the equivalent of a one-on-one human tutor, DiCerbo cautions that 30 minutes is not going to produce miracles. Khanmigo, she said, “is not a solution that’s going to replace a human in your life. It’s a tool in your toolbox that can help you get unstuck.”

‘A couple of million years of human evolution’

For his part, Nitta says that for all the progress in AI, he’s not persuaded that we’re any closer to a real-live tutor that would offer long-term help to most students. If anything, Khanmigo and probabilistic tools like it may prove to be effective “homework helpers.” But that’s where he draws the line. 

“I have no problem calling it that, but don’t call it a tutor,” he said. “You’re trying to endow it with human-like capabilities when there are none.”  

Unlike humans, who will typically do their best to respond genuinely to a question, the way AI bots work — by digesting pre-existing texts and other information to come up with responses that seem human — is akin to a “statistical illusion,” writes one Harvard Business School professor. “They’ve just been well-trained by humans to respond to humans.”

Researcher Sidney Pressey’s 1928 Testing Machine, one of a series of so-called “teaching machines” that he and others believed would advance education through automation.

Largely because of this, Nitta said, there’s little evidence that a chatbot will continuously engage people as a good human tutor would.

What would change his mind? Several years of research by an independent third party showing that tools like Khanmigo actually make a difference on a large scale — something that doesn’t exist yet.

DiCerbo also maintains her hard-won skepticism. She knows all about the halting automation efforts of a century ago, when experimental, punch-card-operated “teaching machines” guided students through rudimentary multiple-choice lessons, often with simple rewards at the end.

In her talks, DiCerbo urges caution about AI revolutionizing education. As much as anyone, she is aware of the expensive failures that have come before. 

Two women stand beside open drawers of computer punch card filing cabinets. (American Stock/Getty Images)

In her recent talk at Notre Dame, she did her best to manage expectations of the new AI, which seems so limitless. In one-to-one teaching, she said, there’s an element of humanity “that we have not been able to — and probably should not try — to replicate in artificial intelligence.” In that respect, she’s in agreement with Nitta: Human relationships are key to learning. In the talk, she noted that students who have a person in school who cares about their learning have higher graduation rates. 

But still.

ChatGPT now has 100 million weekly users, according to OpenAI. That record-fast uptake makes her think “there’s something interesting and sticky about this for people that we haven’t seen in other places.”

Being able to engineer prompts in plain English opens the door for more people, not just engineers, to create tools quickly and iterate on what works, she said. That democratization could mean the difference between another failed undertaking and agile tools that actually deliver at least a version of Watson’s promise. 

An early prototype of IBM’s Watson supercomputer in Yorktown Heights, New York. In 2011, the system was the size of a master bedroom. (Wikimedia Commons)

Seven years after he left IBM to start his new endeavor, Nitta is philosophical about the effort. He takes virtually full responsibility for the failure of the Watson moonshot. In retrospect, even his 25-year timeline for success may have been naive.

“What I didn’t appreciate is, I actually was stepping into a couple of million years of human evolution,” he said. “That’s the thing I didn’t appreciate at the time, which I do in the fullness of time: Mistakes happen at various levels, but this was an important one.”

Exclusive: For Busy Teachers, AI Could Crack Open the Dense World of Ed Research

As students across the U.S. enter their first full school year with access to powerful AI tools like ChatGPT and Bard, many educators remain skeptical of their usefulness — and preoccupied with their potential to enable cheating.

But this fall, a few educators are quietly charting a different course they believe could change everything: At least two groups are pushing to create new AI chatbots that would offer teachers unlimited access to sometimes confusing and often paywalled peer-reviewed research on the topics that most bedevil them. 

Their aspiration is to offer new tools that are more focused and helpful than wide-ranging ones like ChatGPT, which tends to stumble over research questions with competing findings. And like many kids faced with questions they can’t answer, it has a frustrating tendency to make things up.


Tapping into curated research bases and filtering out lousy results would also make the bots more reliable: If all goes according to plan, they’d cite their sources.

The result, supporters say, could revolutionize education. If their work takes hold, millions of teachers for the first time could routinely access high-quality research and make it part of their everyday workflow. Such tools could also help stamp out adherence to stubborn but ill-supported fads in areas from “learning styles” to reading instruction.

So far, the two groups are each feeling their way around the vast undertaking, with slightly different approaches.

In June, the International Society for Technology in Education introduced a chatbot built on content vetted by ISTE and the Association for Supervision and Curriculum Development. (The two groups merged in 2022.) ISTE has made it available in beta to selected users. All of the chatbot’s content is educator-focused, and it’s trained solely on materials developed or approved by the two organizations.

Within about six months, the tool should also be able to scour outside, peer-reviewed education research and return “pretty understandable, pretty meaningful results” from vetted journals, said Richard Culatta, ISTE’s CEO.

“There’s this big gap between what we know in the research and what happens in practice,” he said. One reason: Most research is published in a format that “is just totally inaccessible to teachers.”

Case in point: Research by the Jefferson Education Exchange, a nonprofit supported by the University of Virginia’s Curry School of Education, found that while educators prefer research they can act on — and that’s presented in a way that applies to their work — only about 16% of teachers actually use research to inform instruction.

So he and others are building a digital tool, “purpose-built for educators by educators,” that can translate research into practice, using “very practical language that teachers understand.”

For instance, a teacher could ask the chatbot, “What does the research say about creating a healthy school culture?” or “What’s the evidence for teaching phonics to developing readers?” One could also ask it to suggest activities that are appropriate for middle school students learning about digital citizenship.

Joseph South, ISTE’s chief learning officer, said teachers want the latest research, but are up against formidable obstacles. “They have to find the article in the journal that happens to relate to the thing that they want to do,” he said. “They have to somehow understand academic-speak. They have to have the time to read this, and they have to translate it into something useful.”

While ChatGPT can comb through journals it has access to, translate and summarize the research, he said, it’s not reliable. The typical chatbot — and thus the typical end user — doesn’t know whether the results are from a credible, peer-reviewed journal or not, and it may not necessarily care.

“We do, though,” he said. “So we can do that filtering and let the AI do its magic.”

As with its beta version, the new chatbot will also cite the sources used to generate each response. And it’ll let users know when it simply doesn’t have enough information to return a reliable response.
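(Roughly, that design looks like the sketch below: retrieve only from a vetted corpus, attach a citation to every passage used, and admit when nothing relevant turns up. The two-entry corpus and naive keyword matching here are toy stand-ins, not ISTE’s actual system.)

```python
# A toy version of the design ISTE describes: answer only from a vetted
# corpus, cite every passage used, and say so when nothing relevant is
# found. Corpus entries and scoring are illustrative stand-ins.

VETTED_CORPUS = [
    {"source": "Peer-reviewed journal A, 2021",
     "text": "Systematic phonics instruction improves early decoding skills."},
    {"source": "Peer-reviewed journal B, 2019",
     "text": "Positive school culture is associated with higher attendance."},
]

def retrieve(question, min_overlap=2):
    """Rank vetted passages by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = []
    for doc in VETTED_CORPUS:
        overlap = len(q_words & set(doc["text"].lower().split()))
        if overlap >= min_overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]

def answer(question):
    docs = retrieve(question)
    if not docs:
        # The behavior South describes: admit there isn't enough information.
        return "I don't have enough vetted research to answer that reliably."
    # A real system would have a language model summarize the passages;
    # here we simply return them with their citations attached.
    return "\n".join(f"{d['text']} [{d['source']}]" for d in docs)

print(answer("what does research say about phonics and decoding skills"))
```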

Developers are still in the early stages of deciding what academic journals to include. For now, they’re experimenting with a handful of key research articles, but will expand the chatbot’s range if initial prototypes prove helpful to educators.

Culatta and South, both veterans of the U.S. Department of Education, have spent years working on the research-to-practice problem, offering, in effect, translation services for research findings. “We’ve spent so much work trying to figure out how to do it and it’s just never really worked,” he said. “It’s just always been a struggle. And we actually think that this could be the first for-real, sustainable, scalable approach to taking research and getting it into language that actually could be used by teachers.”

Daniel Willingham, a professor of psychology at the University of Virginia and a well-known translator of education research, said his limited experience with ChatGPT has shown that when asked about a subject where there’s general consensus, such as “What is the effect of sleep on memory?” it produces helpful results. But it isn’t very good at synthesizing conflicting findings.

It’s also inconsistent in its willingness to reveal, in Willingham’s words, that “‘I really don’t know anything about that.’ And so it, you know, just makes something up.”

A paid ChatGPT subscriber, Willingham said he gets “really useful” results only about 20% of the time. “But it requires plenty of verification on my part. And this is all within my area of expertise, so it’s not very hard for me to verify.”

Tapping ‘What Works’

ISTE isn’t the only organization pushing to make education research more widely accessible via chatbot. The Learning Agency, a Washington, D.C.-based consulting firm, is also testing a prototype of a bot designed to offer answers to education research queries.

Unlike ISTE’s, the agency’s tool taps an already existing, if finite, resource: the U.S. Department of Education’s What Works Clearinghouse, or more specifically its Doing What Works Library, a curated collection of materials developed by the department’s Institute of Education Sciences.

“We were inspired to basically create a special version of ChatGPT that was exposed to more high-quality educational data and research evidence on what works,” said Perpetual Baffour, the group’s research director.

In a sense, she said, much of the work had already been done, since the library, though limited, exists to translate research findings into more digestible forms for educators. The result is a prototype that offers what Baffour calls “evidence-based education advice” on topics from adolescent literacy to dropout prevention and school turnarounds.
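(One minimal way to build such a “special version,” sketched here under assumptions rather than taken from the agency’s code, is to wrap pre-digested library excerpts in a grounding instruction before they reach a general-purpose model.)

```python
# A sketch, not the agency's code: wrap pre-digested library excerpts in a
# grounding instruction so a general-purpose model answers only from vetted
# material. The excerpt text and prompt wording are assumptions.

def grounded_messages(question: str, excerpts: list[str]) -> list[dict]:
    """Build a chat payload that confines the model to the given excerpts."""
    context = "\n\n".join(excerpts)
    system = (
        "You are an education research assistant. Answer ONLY from the "
        "excerpts provided. If they do not cover the question, say you do "
        "not have enough information."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Excerpts:\n{context}\n\nQuestion: {question}"},
    ]

# Usage: hand the result to any chat-completion API.
messages = grounded_messages(
    "What helps adolescent literacy?",
    ["Explicit vocabulary instruction and guided discussion improve "
     "adolescent reading comprehension."],  # stand-in for real library text
)
print(messages[1]["content"])
```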

Baffour noted that an app development tool allowed the agency to create a “simple but robust” prototype within a day.

At the moment, the version they’re testing is “very basic,” Baffour said. The agency is still considering what it’ll look like in the future. 

“It only knows as much as the Doing What Works Library tells it,” she said. So queries about topics that are in the library return rich results. But asking it about topics that aren’t can be problematic. 

For example, ask it about myths around the aforementioned learning styles and it defaults to a more general knowledge base scraped from Wikipedia articles, transcripts of recorded conversations and materials from “different corners of the Internet,” Baffour said. “And as you can imagine, those sources might not have the most up-to-date and accurate information about education,” she added.

Indeed, a query about the topic returns this: “The myth concerning learning styles is that there is one specific style that works best for everyone. This is not true, as different people learn in different ways and have different preferences for how they learn. Additionally, there is no evidence that suggests that focusing on one particular learning style is more effective than focusing on multiple styles.”

Not exactly accurate or helpful.

In the first place, the widely believed “myth” holds that people with different learning styles learn best when their preferred mode of learning is indulged — not that one style works best for everyone. At a more basic level, while many people may express preferences for ways to take in new information and study — receiving instruction verbally, for example, instead of via pictures — scientists have yet to find good evidence that material tuned to these preferences improves learning.

Unfortunately, at the moment the agency’s bot doesn’t disclose how much, or how little, it knows about a topic. Baffour said they want to change that soon. For now, however, that’s just an aspiration.

“I think you’re more likely to get a confident chatbot producing inaccurate information than you are to get a self-aware chatbot admitting its false and incomplete knowledge,” she said. 
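(A self-aware bot is nonetheless buildable in rough form: score how well the curated library covers a query, and refuse below a threshold rather than falling back to general training data. The sketch below is deliberately crude and purely illustrative.)

```python
# A crude sketch of that self-awareness: score how well the curated library
# covers a query, and refuse below a threshold instead of falling back to a
# general knowledge base. All entries and numbers here are illustrative.

def library_score(question, library):
    """Best keyword overlap between the question and any library entry."""
    q_words = set(question.lower().split())
    return max((len(q_words & set(entry.lower().split())) for entry in library),
               default=0)

def self_aware_answer(question, library, threshold=2):
    if library_score(question, library) < threshold:
        # Admit ignorance rather than improvise from unvetted sources.
        return "The vetted library doesn't cover that topic, so I can't answer reliably."
    return f"(answer composed from library passages about: {question})"

library = [
    "early-warning indicators and mentoring programs reduce dropout rates",
]
print(self_aware_answer("what are the myths about learning styles", library))
```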

Willingham, the UVA researcher, said a useful education-focused chatbot would not just have to incorporate reliable findings, but put them in context. For example, an answer to a query about the evidence for phonics instruction would properly note that, while the record is fairly strong, a lot of mediocre research and “hyperbolic claims” made in support of alternative methods serve to cloud the overall picture — a delicate but accurate detail.

“How is an aggregator going to negotiate that?” he said. 

Asked if he thought a chatbot might soon replace him, Willingham, the author of several books and a column that translate learning science into plain English, said he wouldn’t make any predictions.

“I was never much of a futurist, but I hocked my crystal ball 15 years ago,” he said.
