The buzz and excitement around generative AI models continues to grow as their capabilities rapidly expand. However, their ability to generate content is just the starting point. From my perspective, their emergent reasoning capability, coupled with their intelligent use of tools, is what will make this technology truly transformational, and will mark the start of a new technology epoch.
In recent years we’ve experienced a small number of technological advances that have been truly transformative. My simple test for whether a technology has had this effect is how hard it is to imagine life before it existed. The internet has certainly been transformative. Try explaining to the younger generation what life was like before ‘the net’ - it’s quite hard to articulate. The same is true for mobile phones. While their use as a portable telephone has had limited impact, our ability to access the internet almost anywhere is something we’ve become largely dependent on.
I am becoming increasingly confident that we are on the cusp of AI having exactly this transformative effect on our lives, both professional and personal. And this transformation isn’t due to the incredibly impressive content generation capabilities we’re seeing today.
This is something I’d like to explore in this post.
As an aside, people who know me would almost certainly describe me as being quite conservative in my views on technology. As an industry we have a long history of making over-inflated claims that fail to deliver, something I’ve looked to counter with our Beyond the Hype podcast. With this in mind, declaring a (possible) new technology epoch is not something I take lightly, or for that matter have ever done before!
Generative AI
Artificial Intelligence (AI) is a discipline that has gradually grown in both capability and utility over the past few decades. AI-powered tools are becoming mainstream; examples include improved speech recognition, translation, and surprisingly powerful image editing tools that allow you to simply highlight the section of an image that you’d like the tool to replace. However, the advances in the past couple of years, spearheaded by OpenAI, have put us on what feels like a very different path.
At the forefront of this change is the concept of generative AI. Put simply, this is AI that can generate large quantities of creative content that is comparable in quality to the content generated by humans. We’ve seen generative AI create imagery (DALL-E), code (Copilot), text (GPT-3) and hold conversations (ChatGPT) with human levels of capability. OpenAI is not the only player in this field; it just holds the lead at the moment. There are a number of notable competitors (both commercial and open source) playing catch-up.
The quality of generated output is quite stunning, but there are some equally (if not more) notable qualities of this technology:
- General purpose - historically AI models have been single purpose, with often costly and time-consuming training required for a specific task, for example language translation or sentiment analysis. The recent generative AI models are multi-purpose. GPT-3 can turn its hand to a wide range of text-based tasks (summarisation, translation, classification, creation of prose) without any additional training.
- Ease of human interaction - these multi-purpose models are not driven by developer APIs. Instead, their primary input is text. You issue commands in text (in what is termed a ‘prompt’) and the model does its best to understand your instructions. I noted earlier that the output of generative AI models is quite human-like; the natural language interface makes their input quite human-like too.
I believe it is these qualities that made ChatGPT such a runaway success. It is a tool that anyone can use. And use they did, with it becoming the most rapidly adopted ‘product’ ever.
It’s not hard to see how much of an impact generative AI is going to have. There are a great many content creation tasks where this technology will be able to produce a suitable result more quickly or cheaply. A recent report by Goldman Sachs looks at the impact this will have across various industries, highlighting the tasks that are more exposed to being replaced by automation, and which jobs entail more of these tasks and so could be considered at risk.
The title of this blog post talks about AI moving from tool to platform. It is this transition I’d like to explore.
Most of the commentary and analysis I’ve read looks at generative AI from the perspective of it being a tool. This implies it will be used in a targeted fashion, replacing certain tasks where it can achieve similar (or better) results more cheaply or quickly. This is an acceleration of the successes AI was already enjoying, but doesn’t feel like a new technology epoch.
To see what may be ahead of us, we need to view this technology from a different perspective.
AI becomes the platform
While the ability of generative AI to create novel content is both impressive and of tangible value, these models are starting to exhibit capabilities that could be far more powerful in the long-term.
A research team from Microsoft recently published a paper, “Sparks of Artificial General Intelligence: Early experiments with GPT-4”, based on findings gleaned from an early-access version of the model developed and trained by OpenAI. As well as a significant increase in the model’s generative capabilities, they raise other points which I feel indicate a new direction:
- Reasoning capabilities - users of ChatGPT have already noted that it shows nascent reasoning and problem-solving capabilities. Early criticisms that these models are simply stochastic parrots appear to miss the point.
- Use of tools - the capabilities of generative AI models are constrained in a number of ways. For example, their training datasets have a cut-off date, which leaves gaps in their knowledge, and they have performed quite poorly at basic mathematics tasks. However, these AI models are now able to use external tools (APIs, calculators, databases), plugging these gaps.
For a great example of these two features working together, take a look at the recent Wolfram blog post where the AI model ‘intelligently’ uses Wolfram APIs to answer questions and solve problems.
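To make the idea of tool use a little more concrete, here is a minimal sketch in Python. It assumes the model emits a small structured request instead of guessing at the arithmetic, and that the surrounding software runs the named tool and hands back the result. The request format, the TOOLS registry and the run_tool_call function are all hypothetical names invented for illustration; they are not the interface used by OpenAI or Wolfram.

```python
import ast
import operator

# Arithmetic operators the illustrative calculator tool is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def calculator(expression: str):
    """Evaluate a basic arithmetic expression without using eval()."""

    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


# Hypothetical registry of tools the model is allowed to call.
TOOLS = {"calculator": calculator}


def run_tool_call(call: dict):
    """Dispatch a request of the assumed form {"tool": name, "input": text}."""
    tool = TOOLS.get(call["tool"])
    if tool is None:
        raise KeyError(f"unknown tool: {call['tool']}")
    return tool(call["input"])


# A model that is unreliable at multiplication might emit this request
# rather than producing the digits itself (assumed output, for illustration).
model_output = {"tool": "calculator", "input": "1234 * 5678"}
print(run_tool_call(model_output))  # prints 7006652
```

The point of the sketch is the division of labour: the model decides which tool to use and what to ask it, while the tool does the part the model is weak at.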
The ability of these models to both reason and use tools is what I feel will push us into the next technology epoch, not their ability to generate text or images.
Interestingly, their ability to ‘use tools’ is something that these models have had for a long time. It’s just that we (or perhaps I) hadn’t noticed.
The pieces started to fall into place for me when I listened to this excellent interview with Bill Gates hosted by Kevin Scott (Microsoft CTO). Kevin described one of the known weaknesses of ChatGPT, where it struggles to count the instances of a set of words in a passage of text (due to the token-generation approach used by large language models). He then asked ChatGPT to write a Python program to solve this problem, and the result was perfect, the first time.
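For illustration, a program of the kind described might look something like the sketch below. This is not the code ChatGPT produced in the interview; the function name count_words and the sample passage are my own assumptions, and it simply shows how small the task becomes once it is handed to a computer.

```python
import re
from collections import Counter


def count_words(text: str, targets: list[str]) -> dict[str, int]:
    """Count how often each target word appears in text, ignoring case."""
    # Split the passage into lowercase words, dropping punctuation.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {target: counts[target.lower()] for target in targets}


passage = "The cat sat on the mat. The dog sat too."
print(count_words(passage, ["the", "sat", "dog"]))
# prints {'the': 3, 'sat': 2, 'dog': 1}
```

Counting is exact here because the program works on whole words, whereas the language model works on tokens.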
Tools have been an important part of human evolution, with computers arguably being one of the most important tools we have created to date. We use computers to undertake tasks that we struggle to tackle ourselves, like counting the occurrences of words in a passage of text! ChatGPT is already able to write computer programs, which means it can already use one of the most powerful tools we have at our disposal … the computer. At the moment we are constraining its ability, in that it passes the computer program to us to execute, but we haven’t given it direct access … yet.
With AI no longer being the tool itself, it can become the orchestrator, the reasoning engine. It is this shift which I feel turns AI into an entirely new platform.
Imagining where this will take us is the hard part. It is a new paradigm which we haven’t seen before. But let’s explore a few brief examples …
My mobile phone has a whole collection of somewhat useful applications. Other than the social media ones, they tend to be quite task focussed: travel apps, food delivery apps, hotel / holiday apps. Stepping back a little, most of these are just a simple user interface on top of an API. In the future I think these apps will entirely disappear. Instead we will converse with our AI platform, describing what we want to achieve (“I’m a bit hungry, I fancy a takeaway”) and having a rich conversation about our preferences (which the AI will no doubt learn), with the platform itself directly accessing the Just Eat API (for example) and submitting the order for us.
However, we shouldn’t be limited by the applications we already see on our phones. The ability of these models to write computer programs will mean the creation of single-use bespoke applications. We will be able to describe problems which the platform is able to solve by creating applications on our behalf. We’ll probably not even see the code it writes.
There are also so many apps on my computer desktop that I doubt I will ever want to open again. As an example, I have always struggled with Photoshop. It is way too complicated (for me). In the future, I’ll describe what I want to create to my personal AI assistant and it will use Photoshop (or similar APIs) entirely in the background to achieve this goal.
There are a great many things we’ll need to figure out if AI really does become central to our lives. These include ethical concerns, the role of regulation, existential risks (to humanity) and a great many more besides. Despite all these unanswered questions, I am very excited about what is just around the corner.