To begin with a quote regularly (and erroneously) attributed to Henry Ford: “If I had asked my customers what they wanted, they would have said a faster horse.”

A lot of the hype surrounding the latest developments in Generative AI (GenAI) focuses on its potential to carry out particular tasks much faster than humans. It’s an understandable human impulse to look for ways to speed up time-consuming (boring?) manual work and be more productive. However, AI has so much more potential. Processes need to be rethought, and AI should also be applied to new tasks that are unfeasible for humans.

In this blog post, I’ll reflect on recent discussions with civil servants and our own research to consider how AI might increase productivity and offer new capabilities. At the same time, I’ll explore the necessary checks and balances on how far AI should be applied in delivering services to UK citizens.

Harnessing a faster horse

As you would imagine, it didn’t take long for the developers of the latest GenAI models to see their applicability to software development. With the advent of tools like GitHub Copilot and ChatGPT, we undertook research at Scott Logic to explore their qualitative and quantitative impacts. In his blog ‘If software development were a race, AI wins every time’, my colleague Colin Eberhardt summarises the results; in our experiment, tasks were completed 37% faster.

Similar examples of productivity increases were discussed in two private roundtable sessions hosted by the Institute for Government (IfG) in partnership with Scott Logic, exploring the opportunities and risks of harnessing AI in providing public services. Given the scale at which central government has to operate, several compelling GenAI use cases were mentioned. These included processing large volumes of government correspondence, and analysing responses to public consultations. In the latter case, there’s a pilot underway called Consult, which explains, “A consultation attracting 30,000 responses requires a team of around 25 analysts for 3 months to analyse the data and write the report […] If we can apply automation in a way that is fair, effective and accountable, we could save most of [the £80m spent on consultations per year].”
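
To make that use case more concrete, here’s a minimal sketch of what LLM-assisted thematic tagging of consultation responses might look like. It uses the OpenAI Python client; the model name, theme list and prompt are illustrative assumptions of mine, not details of the Consult pilot.

```python
# Illustrative sketch of LLM-assisted thematic tagging of consultation
# responses. The themes, prompt and model choice are assumptions for
# illustration -- not details of the actual Consult pilot.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

THEMES = ["funding", "accessibility", "environmental impact", "other"]

def tag_response(response_text: str) -> str:
    """Ask the model to assign one predefined theme to a response."""
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[
            {"role": "system",
             "content": "Classify the consultation response into exactly "
                        f"one of these themes: {', '.join(THEMES)}. "
                        "Reply with the theme name only."},
            {"role": "user", "content": response_text},
        ],
        temperature=0,  # keep classification output as stable as possible
    )
    return completion.choices[0].message.content.strip()

responses = [
    "Bus routes must be step-free for wheelchair users.",
    "Who pays for this? Council budgets are already stretched.",
]
themes = {text: tag_response(text) for text in responses}
print(themes)
```

Analysts could then review responses grouped by theme rather than reading each one cold – the point being to redirect human effort, not remove it.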

So, there are very good reasons to want a faster horse. However, it’s inevitably not that simple. One critical factor is defining what good looks like, across a range of vectors, so that the quality and suitability of the GenAI’s outputs can be measured against a human’s. We recently supported a client in doing exactly this while developing a GenAI paralegal. Unlike development with ‘traditional’ software, it can’t be “one and done” with non-deterministic AI. Once a model has been trained, ongoing monitoring and retraining are required to maintain its reliability and accuracy; we’ve recently supported a team at DWP to build its capability in this area.
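
As a toy illustration of that kind of monitoring, the sketch below scores a model’s outputs against human-approved reference answers and flags when quality drifts below a threshold. The similarity metric and threshold are deliberately simplistic stand-ins for the richer evaluation vectors (factual accuracy, tone, legal correctness and so on) that a real project would define.

```python
# Toy sketch of ongoing quality monitoring for a GenAI system: score
# live outputs against human-approved reference answers and alert when
# quality drifts. The metric and threshold are illustrative assumptions.
from difflib import SequenceMatcher

QUALITY_THRESHOLD = 0.75  # assumed acceptance bar, set per use case

def similarity(output: str, reference: str) -> float:
    """Crude proxy for output quality: textual similarity to a
    human-written reference. A real evaluation would use a richer
    set of metrics than string similarity."""
    return SequenceMatcher(None, output.lower(), reference.lower()).ratio()

def evaluate(samples: list[tuple[str, str]]) -> float:
    """Average quality score over (model_output, human_reference) pairs."""
    scores = [similarity(out, ref) for out, ref in samples]
    return sum(scores) / len(scores)

batch = [
    ("The claimant may appeal within 28 days.",
     "You can appeal within 28 days."),
    ("Payments resume next month.",
     "Your payments will restart next month."),
]
score = evaluate(batch)
if score < QUALITY_THRESHOLD:
    print(f"Quality drift detected ({score:.2f}); trigger review/retraining")
```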

As I said earlier, it’s not all about speed. Colin says in his blog, “These tools don’t represent an incremental improvement to our existing toolkit, instead they are genuinely disruptive.” The government must look beyond immediate efficiency gains to rethink its processes and harness AI to tasks that would be unfeasible for humans to do.

Tapping into AI’s superhuman abilities

Take the policy-making process. As a roundtable participant explained, this process has remained largely unchanged over the decades, but AI could play a transformative role. With its ability to digest and analyse vast amounts of data, AI could radically improve the process of gathering the evidence on which policy is based and identifying gaps in it. In addition, AI’s ability to extrapolate future trends from historical data could transform impact analysis, helping predict the potential outcomes of new policies – and this could be a dynamic process. As one of the roundtable’s participants suggested, it would be possible to run rapid experiments in the early stages of policy-making to demonstrate the causality between interventions and their impacts. In this way, you could provide confidence in the metrics by which the policy’s effectiveness would be measured.
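
To give a flavour of what such a rapid experiment might involve, here’s a minimal sketch of a permutation test that checks whether an observed difference in an outcome metric between a control group and an intervention group is likely to be real or just noise. The data and metric are invented purely for illustration.

```python
# Hedged sketch of a rapid policy experiment: compare an outcome metric
# between a control group and an intervention group with a simple
# permutation test. The figures below are invented for illustration.
import random
import statistics

control = [52, 48, 50, 47, 53, 49]  # e.g. % uptake without intervention
treated = [58, 61, 55, 60, 57, 59]  # % uptake with intervention

observed = statistics.mean(treated) - statistics.mean(control)

# Shuffle group labels many times; count how often chance alone
# produces a difference as large as the one we observed.
pooled = control + treated
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    resampled = (statistics.mean(pooled[len(control):])
                 - statistics.mean(pooled[:len(control)]))
    if resampled >= observed:
        extreme += 1

p_value = extreme / trials
print(f"effect = {observed:.1f} points, p ~= {p_value:.4f}")
```

A low p-value lends confidence that the intervention, not chance, drove the change – exactly the kind of evidence that could underpin the metrics mentioned above.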

The extraordinary power of AI opens up new use cases that would be inconceivable for humans in terms of the time and resources required. For example, medical prescription errors result in a significant number of deaths each year. The government’s Incubator for AI sought expressions of interest for a pilot project to trial the use of pharmacy data to flag suboptimal prescription profiles and concerning cases. In the wider health sector, there are already well-known use cases that exploit AI’s superhuman capabilities – for example, the use of image analysis and pattern recognition in the early diagnosis of conditions such as cancer and heart disease.
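
Purely to illustrate the flagging idea – and emphatically not as a description of the actual pilot – a first pass over pharmacy data might look like the rule-based sketch below; a real system would use clinically validated rules and learned models, always with clinician review.

```python
# Illustrative sketch only: a rule-based pass over pharmacy records to
# flag potentially concerning prescription profiles. The drug pairs are
# invented for illustration -- a real system would use clinically
# validated rules and/or learned models, with clinician review.
import pandas as pd

# Hypothetical interaction pairs a pharmacist might want surfaced
RISKY_PAIRS = {
    frozenset({"warfarin", "aspirin"}),
    frozenset({"methotrexate", "trimethoprim"}),
}

records = pd.DataFrame({
    "patient_id": [1, 1, 2, 2],
    "drug": ["Warfarin", "Aspirin", "Metformin", "Ramipril"],
})

def flag_patient(drugs: pd.Series) -> bool:
    """Flag a patient if any prescribed pair is on the risky list."""
    prescribed = set(drugs.str.lower())
    return any(pair <= prescribed for pair in RISKY_PAIRS)

flags = records.groupby("patient_id")["drug"].apply(flag_patient)
print(flags[flags].index.tolist())  # patients needing pharmacist review
```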

The importance of explainability

So, AI has the potential to accelerate some processes, transform others, and create whole new processes that were previously inconceivable. In all cases, transparency will be of critical importance in securing buy-in from citizens. For that reason, the ‘explainability’ of AI will be of greater importance in the public sector than in other sectors.

Explainability refers to the ability to understand and interpret how an AI system arrives at its outputs or decisions. In the government context, it is a key component in supporting accountability for AI-assisted decisions and in helping to identify and mitigate biases.
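
To make the concept tangible on a simple, fully inspectable case (explaining large GenAI models is considerably harder), here’s a hedged sketch of one established explainability technique: permutation feature importance, which asks how much a model’s accuracy degrades when each input feature is scrambled.

```python
# Minimal sketch of one common explainability technique: permutation
# feature importance on a classical model. This illustrates the
# principle on a simple case, not on a large generative model.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much accuracy drops --
# features whose shuffling hurts most matter most to the decision.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
ranked = sorted(zip(X.columns, result.importances_mean),
                key=lambda pair: -pair[1])
for name, importance in ranked[:5]:
    print(f"{name}: {importance:.3f}")
```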

However, it’s a complex area and research is still underway into explainability techniques. In the recent hype around GenAI, the focus has been much more on demonstrating what it can do, rather than how it does it. For now, public sector projects will need to weigh up the pros and cons of using AI products from vendors (where the models are less likely to be explainable) versus developing their own models. As the Ada Lovelace Institute suggests, it might also be wise for the government to be ‘fast followers’, “adopting established technologies and practices once they have been tried and tested,” rather than trying to be at the cutting edge.

As important as explainability will be to the adoption of AI in the public sector, it isn’t a silver bullet. Particularly while the techniques are still maturing, explainability will depend on people trained to interpret the explanations of a model’s outputs, with sufficient knowledge of both the context and how the model works.

Beyond the hype, human involvement remains vital

The discussions at the IfG roundtables returned time and again to the conclusion that human involvement in most AI-assisted processes will remain vital for the foreseeable future. Transparency and accountability are intrinsic to the running of government services and, as things stand, AI can only go so far in supporting these. And even where a use case may be technically feasible and apparently straightforward – e.g., automating replies to correspondence – there’s a larger context that the government must consider; as a roundtable participant stated, automating anything changes its meaning. A recipient of a fully AI-generated piece of correspondence may feel a greater sense of disconnection from government as a result.

In the meantime, as my colleague Colin says in his blog, making the most of these tools will be a learning curve. Training to support AI literacy will be important. One of the roundtable participants pointed with concern to a survey of public sector professionals undertaken by the Alan Turing Institute for Data Science and AI and the General Medical Council. It revealed that one-fifth of respondents were using GenAI in their work, with a further 40% wanting to use it. A very high proportion of those using AI believed they understood how it worked and trusted its outputs, yet fewer than a third felt they had received clear guidance on its use.

While civil servants progress along that learning curve, I think it’s entirely possible that, far from stealing jobs, AI will create a range of new roles and responsibilities. Government will need people with AI-specific expertise in data quality management, security and privacy protection, system auditing, bias detection and mitigation, and oversight and governance. As AI technologies mature and AI literacy grows, civil servants will become quick judges of which use cases are suited to the application of AI and which are not. As my colleagues explain on this episode of our podcast, it may never be possible to de-risk GenAI sufficiently to use it for certain tasks – in its current forms, at least.

However, we at Scott Logic share the roundtable participants’ broad optimism about the positive impact AI will have on public service delivery, and it was great to hear the combination of ambition, pragmatism and judicious caution in the discussions.

Our work in the public sector

If you’d like to know more about our work with government, visit our Transforming the public sector page.