This is Platformer, a newsletter about the intersection of Silicon Valley and democracy, written by Casey Newton and Zoë Schiffer. Sign up here.
Today, let’s talk about an update to Bard, Google’s answer to ChatGPT, and how it addresses one of the most pressing problems with chatbots today: their tendency to make things up.
From the day chatbots arrived last year, their creators warned us not to trust them. The text generated by tools like ChatGPT is not drawn from a database of established facts. Instead, chatbots are predictive: they make probabilistic guesses about which words seem right, based on the huge corpus of text their underlying large language models were trained on.
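To make that concrete, here is a deliberately toy sketch of next-word prediction in Python. The vocabulary and probabilities below are invented purely for illustration; a real model scores an enormous vocabulary with a neural network rather than a lookup table.

```python
import random

# A toy next-word table standing in for a real language model. In an
# actual LLM these probabilities come from training on a huge corpus;
# the words and numbers here are invented for illustration only.
NEXT_WORD_PROBS = {
    "radiohead": [("released", 0.45), ("won", 0.35), ("formed", 0.20)],
    "won":       [("six", 0.50), ("nine", 0.30), ("numerous", 0.20)],
    "six":       [("grammy", 0.70), ("brit", 0.30)],
    "nine":      [("brit", 0.60), ("grammy", 0.40)],
}

def generate(start: str, max_words: int = 4) -> str:
    """Extend a prompt by repeatedly sampling a plausible next word."""
    words = [start]
    for _ in range(max_words):
        options = NEXT_WORD_PROBS.get(words[-1])
        if not options:
            break
        choices, weights = zip(*options)
        # The model picks what is probable, not what is true -- which is
        # why fluent output can still be factually wrong.
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("radiohead"))
```

Run it a few times and the output changes with each sampling, which is exactly the point: nothing in this process consults a source of truth.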
As a result, chatbots are often “confidently wrong,” to use the industry term. And this can fool even highly educated people, as we saw this year in the case of the lawyer who filed a legal brief generated by ChatGPT – without realizing that every case it cited had been fabricated from scratch.
This situation explains why I consider chatbots mostly useless as research assistants. They will tell you anything you want, often within seconds, but in most cases without citing their work. As a result, you end up spending a lot of time checking their answers to see whether they are true, which often defeats the purpose of using them.
When it launched earlier this year, Google’s Bard came with a “Google It” button that sent your query to the company’s search engine. This made it a little faster to get a second opinion on the chatbot’s output, but it still put the burden on you to determine what is true and what is false.
Starting this week, however, Bard will work a little harder on your behalf. After the chatbot answers one of your queries, pressing the Google button will “double-check” its response. Here’s how the company explained it in a blog post:
When you click on the “G” icon, Bard will read the response and evaluate whether there is content across the web to substantiate it. When a statement can be evaluated, you can click the highlighted phrases and learn more about supporting or contradicting information found by Search.
When you double-check a response, many of the sentences in the answer will turn green or brown. Sentences highlighted in green are linked to cited web pages; hover over one and Bard will show you the source of the information. Sentences highlighted in brown indicate that Bard doesn’t know where the information came from, flagging a possible mistake.
When I double-checked Bard’s answer to my question about the history of the band Radiohead, for example, it returned plenty of sentences highlighted in green that squared with my own knowledge. But it also turned this sentence brown: “They have won numerous awards, including six Grammy Awards and nine Brit Awards.” Hovering over the words showed that the Google search had turned up contradictory information; in fact, Radiohead have (criminally) never won a single Brit Award, let alone nine of them.
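Google hasn’t said how double-checking works under the hood, but the behavior it describes maps onto a simple pipeline: split the response into statements, search the web for each one, and label each statement by whether the results support it. Here’s a minimal sketch of that idea in Python; every name in it, including the canned fake_search stand-in for a real search call, is hypothetical rather than Bard’s actual implementation.

```python
from dataclasses import dataclass

@dataclass
class CheckedSentence:
    text: str
    status: str         # "supported" (green) or "unverified" (brown)
    sources: list[str]  # URLs that back the claim, if any were found

def double_check(answer: str, search) -> list[CheckedSentence]:
    """Label each sentence of an answer using web search results.

    `search` is a stand-in for a real web search call: it takes one
    sentence and returns (supporting_urls, contradicting_urls).
    """
    checked = []
    for sentence in answer.split(". "):
        supporting, contradicting = search(sentence)
        if supporting and not contradicting:
            status = "supported"    # would be highlighted green
        else:
            status = "unverified"   # no support found, or a contradiction
        checked.append(CheckedSentence(sentence, status, supporting))
    return checked

# Tiny demo with a canned "search engine" for the Radiohead example above.
def fake_search(sentence: str) -> tuple[list[str], list[str]]:
    if "Brit" in sentence:
        return [], ["https://example.com/brit-awards"]   # contradicted
    return ["https://example.com/grammys"], []           # supported

answer = "Radiohead have won six Grammy Awards. They have won nine Brit Awards."
for result in double_check(answer, fake_search):
    print(result.status, "|", result.text)
```

The hard parts in practice – splitting claims cleanly, judging whether a page actually supports a sentence – are exactly where a real system would need a model in the loop.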
“I’m going to tell you about a tragedy that happened in my life,” Jack Krawczyk, a senior director of product at Google, told me in an interview last week.
Krawczyk had cooked swordfish at home and the resulting smell seemed to permeate the entire house. He used Bard to look for ways to get rid of it and then double-checked the results to separate fact from fiction. It turns out that deep cleaning the kitchen wouldn’t solve the problem, as the chatbot had originally said. But placing bowls of baking soda around the house might help.
If you’re wondering why Google doesn’t double-check answers like this before showing them to you, same here. Krawczyk told me that, given the wide variety of ways people use Bard, double-checking is often unnecessary. (You wouldn’t typically ask it to double-check a poem you wrote, or an email you composed, and so on.)
And while double-checking represents a clear step forward, it often still requires you to chase down all those citations and make sure Bard is interpreting those search results correctly. At least when it comes to research, we humans are still holding the AI’s hand as much as it is holding ours.
Still, it’s a welcome development.
“We may have created the first language model that admits it’s made a mistake,” Krawczyk told me. And given what’s at stake as these models improve, ensuring AI models accurately confess their errors should be a high priority for the industry.
Bard received another big update on Tuesday: It can now connect to Gmail, Docs, Drive, and several other Google products, including YouTube and Maps. Extensions, as they are called, allow you to search, summarize, and ask questions about documents you have stored in your Google account in real time.
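Google hasn’t documented how Extensions are wired up internally, but conceptually they resemble the familiar tool-calling pattern: decide a query needs live data, route it to a service-specific function, and summarize what comes back. Here’s a rough sketch of that routing idea, with entirely hypothetical names and stubbed-out services in place of real APIs.

```python
from typing import Callable

# Hypothetical registry of extensions. In a real system each entry would
# call the service's API; here they just return canned strings.
EXTENSIONS: dict[str, Callable[[str], list[str]]] = {
    "gmail":   lambda q: [f"(stub) Gmail messages matching: {q}"],
    "drive":   lambda q: [f"(stub) Drive documents matching: {q}"],
    "youtube": lambda q: [f"(stub) YouTube videos matching: {q}"],
}

def route(prompt: str) -> list[str]:
    """Pick an extension for a prompt via naive keyword matching.

    A real assistant would let the model itself choose the tool;
    keywords keep this sketch self-contained.
    """
    lowered = prompt.lower()
    if "email" in lowered or "inbox" in lowered:
        return EXTENSIONS["gmail"](prompt)
    if "video" in lowered:
        return EXTENSIONS["youtube"](prompt)
    return EXTENSIONS["drive"](prompt)

print(route("show me some videos on getting started in interior design"))
```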
For now, Extensions are limited to personal accounts, which drastically limits their usefulness, at least for me. Sometimes they are interesting as an alternative way of browsing the web; Bard did a good job, for example, when I asked it to show me some good videos on how to get started in interior design. (The fact that you can play those videos inline in Bard’s answer window is a nice touch.)
But Extensions also get a lot wrong, and there is no button to press here to improve the results. When I asked Bard to find the oldest email from a friend I’d been exchanging messages with on Gmail for 20 years, it showed me a message from 2021. When I asked which messages in my inbox might need a quick response, Bard suggested a spam message with the subject line “Is it possible to print without problems with HP Instant Ink.”
It works best in scenarios where Google can make money. Ask it to plan an itinerary for a trip to Japan that includes flight and hotel information, and you’ll get a nice selection of options from which Google can take a cut.
Over time, I imagine third-party extensions will come to Bard, just as they already have with ChatGPT. (There they are called plugins.) The promise of being able to get things done on the web through a conversational interface is enormous, even if the current experience is so-so.
The long-term question is to what extent AI will eventually be able to check its own work. Today, the task of steering chatbots toward the correct answer still falls heavily on the person writing the prompt. For now, we need tools that push AIs to cite their sources. Over time, though, we can expect more of that work to be handled by the tools themselves, without us always having to ask.