The hype machine is real with Generative AI and ChatGPT, which are seemingly ubiquitous in technology today. So it’s not surprising that we’re starting to hear rumors of a new, improved Siri. In fact, 9to5Mac has already discovered a new natural language system.
Do you speak my language?
The claim is that Siri on tvOS 16.4 beta has a new “Siri Natural Language Generation” framework. As described, it doesn’t sound impressive, as it seems to be mostly focused on telling (dad?) jokes, but it may also let you use natural language to set timers. It is codenamed “Bobcat.”
This whisper follows a recent New York Times report on Apple’s AI summit in February. That report claimed the event saw some focus on the kind of generative content and large language models (LLMs) used by ChatGPT. It also said Apple engineers are “actively testing” language-generating concepts, trying out new ones every week as Apple tries to push AI forward.
So, is Apple building a ChatGPT competitor? Not really, according to Bloomberg.
“Hey Siri, how do you spell ‘catch up’?”
While Siri seemed incredibly advanced when it first appeared, development hasn’t kept pace, giving Apple’s brash voice assistant echoes of MobileMe and Ping. Like both of those Apple fails, Siri had promise it never quite delivered on, and it now lags behind assistants from Google and Amazon, despite being a bit more private.
Siri’s lack of contextual sense means it’s really only good at what it’s trained to do, which limits its capabilities; GPT seems to leave it in the dust. With the recent GPT-4 update, OpenAI is innovating quickly. We can already see that this has lit a fire under the big tech companies. Microsoft has adopted ChatGPT in Bing, Google is rapidly moving forward with PaLM development, and Amazon is busy with AWS Chatbot (the latter is now integrated into Microsoft Teams).
Apple – and Siri – seem to be on the back foot.
Not the only one
Of course, Siri is not the only machine intelligence (MI) Apple works on. In some domains, such as accessibility and image enlargement, it has achieved insanely good examples of MI done right. But somehow Siri still makes mistakes.
I’m not entirely sure how Apple’s Steve Jobs would have handled that – I can’t see him being happy when his HomePod tells him it can’t find his Dylan tracks. The difference between the two voice-activated AIs is that I could ask GPT to create a picture of him throwing that smart speaker against the wall.
Partly this is because of the way Siri is built.
How they created Siri
Siri is like a huge database of answers for different knowledge domains, supplemented with Spotlight search results and natural language interpretation so you can talk to it. When a request is made, Siri checks to see if it understands the question and then uses deep/machine learning algorithms to find the right answer. To get that answer, it makes a numerical assessment (confidence score) of the likelihood of it having the correct answer.
What this means is that when you ask Siri a question, it first takes a quick look to see if it’s a simple request (“turn on the lights”) that it can quickly fulfill with what it already knows, or if it needs to consult the larger database. Then it does what you ask (sometimes), gets you the data you need (often), or tells you it doesn’t understand you or asks you to change a setting hidden somewhere on your system (too often).
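The flow described above can be sketched in a few lines of Python. This is a toy illustration only, not Apple’s implementation: the command table, the answer database, the word-overlap scoring function and the 0.6 threshold are all invented for the example.

```python
from dataclasses import dataclass

# Hypothetical stand-ins: none of these names or values come from Apple.
FAST_COMMANDS = {
    "turn on the lights": "Okay, the lights are on.",
    "set a timer": "Timer set.",
}
ANSWER_DATABASE = {
    "capital of portugal": "Lisbon is the capital of Portugal.",
    "tallest mountain": "Mount Everest is the tallest mountain above sea level.",
}
CONFIDENCE_THRESHOLD = 0.6  # assumed cut-off for "confident enough"


@dataclass
class Reply:
    text: str
    confidence: float


def score(query, key):
    """Toy confidence score: fraction of the key's words found in the query."""
    query_words = set(query.lower().split())
    key_words = key.split()
    return sum(word in query_words for word in key_words) / len(key_words)


def answer(query):
    normalized = query.lower().strip(" ?!.")
    # Step 1: a simple, known request is fulfilled straight away.
    if normalized in FAST_COMMANDS:
        return Reply(FAST_COMMANDS[normalized], 1.0)
    # Step 2: otherwise consult the larger database and keep the best match.
    best_key = max(ANSWER_DATABASE, key=lambda key: score(normalized, key))
    confidence = score(normalized, best_key)
    # Step 3: answer only when the confidence score clears the threshold.
    if confidence >= CONFIDENCE_THRESHOLD:
        return Reply(ANSWER_DATABASE[best_key], confidence)
    return Reply("Sorry, I don't understand.", confidence)


if __name__ == "__main__":
    for question in ("Turn on the lights",
                     "What is the capital of Portugal?",
                     "Play my Dylan tracks"):
        print(question, "->", answer(question).text)
```

Even in this toy version, anything that is not already in one of the two tables falls through to the “I don’t understand” reply, which is the limitation the rest of this piece is about.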
In theory, Siri is only as good as its database, meaning the more answers packed into it, the better and more effective it gets.
However, there is a problem. As explained by former Apple engineer John Burkey, the way Siri is built means engineers have to rebuild the entire database to upgrade it. That is a process that can take up to six weeks.
This lack of real learning makes Siri and other voice assistants “dumb as a rock,” according to Microsoft CEO Satya Nadella. You’d expect him to say something like that, of course, since Microsoft has invested billions in OpenAI, the maker of ChatGPT, whose technology it is incorporating into its products.
Generative AI, on the other hand
Generative AI (the kind of intelligence used in ChatGPT, Midjourney, Dall-E and Stable Diffusion) also uses natural language, proprietary databases and search results, but can also use algorithms to create original-looking content such as audio, images, or text.
You can ask it a question and it will search all available data and make a few decisions to spin a result.
Now, as has been pointed out quite often since people started exploring the technology, those results aren’t always great or original, but they usually seem convincing. The ability to ask it to generate deepfake videos and photos takes this even further.
In use, one way to tell the difference between the two AI models is to think about what they can achieve.
So while you might be able to ask Siri for a map of Lisbon, Portugal, or even for directions to anywhere on that map, Generative AI lets you make more nuanced requests: ask which parts of the city it recommends, have it write a story set in that city, or even have it create an eerily accurate fake photo of you sitting in that really nice bar in Largo dos Trigueiros.
It’s pretty obvious which AI is the most impressive.
What happens now?
It doesn’t have to be that way. Developers have managed to create apps that add ChatGPT to Apple’s products. watchGPT, which was recently renamed “Petey – AI Assistant” for trademark reasons, is a good example.
Apple is unlikely to want to give such a competitively important technology to third parties, so it will likely continue to work on its own solution, but this could take years – during which Siri still can’t open that cabin door.
Considering that GPT-4 costs up to 12 cents per thousand tokens, though, it is highly unlikely that Apple will weave it directly into its operating systems. With an installed base of over a billion users, this would be hugely expensive, and Microsoft is already there.
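To put rough numbers on that, here is a back-of-the-envelope sketch in Python. Only the 12-cent rate and the billion-user installed base come from the figures above; the amount of usage per user per day is an invented assumption, so treat the result as an order-of-magnitude illustration rather than an estimate of Apple’s real costs.

```python
PRICE_PER_THOUSAND_USD = 0.12    # upper-end GPT-4 rate quoted above
INSTALLED_BASE = 1_000_000_000   # "over a billion users"
UNITS_PER_USER_PER_DAY = 1_000   # hypothetical: one thousand billable units each day


def daily_cost(users, units_per_user, price_per_thousand):
    """Daily bill in US dollars if every user consumes units_per_user each day."""
    return users * units_per_user / 1000 * price_per_thousand


if __name__ == "__main__":
    cost = daily_cost(INSTALLED_BASE, UNITS_PER_USER_PER_DAY, PRICE_PER_THOUSAND_USD)
    print(f"Roughly ${cost:,.0f} per day, or ${cost * 365:,.0f} per year")
```

Under those assumptions the bill comes to about $120 million a day. Changing the usage figure scales the total proportionally, but at this user count even light usage adds up quickly.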
It’s in that context that Apple may simply opt to make it easy for its developers to add support for OpenAI’s technology to the apps they create, effectively passing the cost on to them and their customers.
That might help in the short term, but I’m confident this is a red rag to a bull for Apple’s machine intelligence teams. They will now be twice as determined to develop further innovation in the natural language processing that lies at the heart of both technologies.
But at this stage they seem to be falling behind in terms of implementation. Although appearances, as GPT-generated images show, can be deceptive.
Please follow me on Mastodon, or join me in the AppleHolic’s bar & grill and Apple Discussions groups on MeWe.
Copyright © 2023 IDG Communications, Inc.