There are several possible approaches to forecasting benefit spending. In this box we outline the key issues the modelling of disability benefits needs to address, the three approaches we have used to forecast spending, and the strengths and limitations of each, concluding that a combination of approaches is better than reliance on any single one.

Since working-age disability benefits reform was first announced in 2010, we have used three different approaches to forecasting its cost: a microsimulation model; an aggregate ‘bottom-up’ approach focusing on claims, inflows, outflows and benefit amounts; and a ‘top-down’ approach focusing on the prevalence of benefit receipt in the population, disaggregated by age and sex. In many of our forecasts, we have used more than one of these approaches.

We assess our fiscal forecasting models regularly against five criteria:ᵃ

  • accuracy – how well does the model match outturns?
  • plausibility – how well do the model outputs align with theory and experience?
  • transparency – how easily can the model outputs be understood and scrutinised?
  • effectiveness – how well does the model capture the tax or spending system?
  • efficiency – is the model capable of providing outputs to short deadlines?

As well as meeting these criteria, the challenges posed by the PIP reform, and experience here and with other areas of welfare spending, suggest the modelling infrastructure also needs to:

  • Integrate the past and the forecast. Although PIP is a ‘new’ benefit, information from the DLA system is nevertheless valuable. Long average durations spent in receipt of disability benefits mean that fully representative PIP caseload data will not be available for many years, so currently the DLA experience is the best information available. This can be adjusted for known differences where that is supported by evidence. Because of the migration of claimants from DLA to PIP, which will continue indefinitely in the case of children reaching age 16, integration of DLA and PIP forecasting models will remain desirable to ensure consistency of assumptions across them.
  • Account for continuing claims among pension-age adults. As with DLA in the 1990s, the number of PIP claims among pension-age adults will rise rapidly as the benefit matures and claims made by working-age adults continue past the cut-off age for new claims. This will be an important contributor to caseload and spending growth in the coming years. This is illustrated by the rise in PIP claims among those aged 65 and over between May 2017 and May 2018 – up 52 per cent – outstripping overall caseload growth of 29 per cent as the PIP rollout continues. The number of 69-year-olds in receipt of PIP increased from 3,700 in May 2017 to 57,000 in May 2018. A similar rise in 70-year-olds in receipt of PIP can be expected between 2018 and 2019, and so on.
  • Provide the necessary outputs for dependent forecasts. Box 3.1 outlines the interdependencies between different benefits that together provide support for disabled people. In forecasting these benefits it is necessary to ensure that the methodology enables us to assess and model these interactions. In particular, this has implications for the way benefit awards are modelled.
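
The second requirement above, that claims continue past the cut-off age for new claims, can be illustrated with a minimal cohort-ageing sketch. It is not the model used in our forecasts: the function name, the `retention` parameter and all figures are hypothetical, chosen only to show why caseloads at each successive age rise as the benefit matures.

```python
def age_forward(caseload_by_age, new_awards_by_age, cutoff_age, retention=1.0):
    """Roll a caseload by single year of age forward by one year.

    caseload_by_age:   {age: claimants} at the start of the year
    new_awards_by_age: {age: new awards}; ignored at or above cutoff_age
    retention:         assumed share of claimants still in receipt a year later
    All inputs are illustrative assumptions, not outturn data.
    """
    next_year = {}
    # Existing claimants grow one year older and keep their award.
    for age, claimants in caseload_by_age.items():
        next_year[age + 1] = claimants * retention
    # New awards are only possible below the cut-off age.
    for age, awards in new_awards_by_age.items():
        if age < cutoff_age:
            next_year[age] = next_year.get(age, 0.0) + awards
    return next_year


# Hypothetical example: nobody can make a new claim at 65 or over,
# yet the caseload at 65 and 66 is populated by ageing claimants.
print(age_forward({64: 100, 65: 50}, {64: 100, 65: 100}, cutoff_age=65))
```

In this sketch, the oldest age with a material caseload moves up by one year each year, which is the mechanism behind the rise among 69-year-olds described above.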

Microsimulation modelling

Initially PIP was forecast using DWP’s ‘integrated forecasting model’ (INFORM), a dynamic microsimulation model that projects forward data recorded in the ‘work and pensions longitudinal study’ (WPLS), which encompasses most benefits. The likelihood of a claimant changing benefit status in each month was modelled based on past experience. New PIP claims were ‘cloned’ in the model based on recent claims in proportions set by assumption. The model covered DLA in full – including child claims – despite INFORM being a working-age forecasting model. From our December 2012 forecast, PIP was modelled based on the likelihood of a DLA claimant receiving PIP, using the evidence then available (which, as Chapter 4 set out, has not proved reliable). The model met most of the required criteria, being fully integrated and delivering a forecast by award rate, and worked well prior to PIP introduction. But the size and complexity of INFORM – and the fact that it was not tailored to the needs of PIP forecasting – meant there were some significant disadvantages to its use:

  • ‘Black box’ processing meant it was not obvious how changes to input assumptions resulted in forecast outputs. Links with changes in other benefits were insufficiently clear. Exploring and clarifying these linkages was not feasible during the compressed timetables and resource pressures that characterise a Budget forecast process.
  • Relative inflexibility in changing the structure, such as splitting DLA inflow assumptions between children and adults when the two started to show substantially different trends.
  • Inability to apply time-varying assumptions, which became important in the new system, particularly when significant processing backlogs arose.
  • Difficulty in overriding unrepresentative data generated by the processing backlogs.
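
The general idea of dynamic microsimulation, in which each claimant's benefit status is rolled forward month by month using transition probabilities, can be sketched as follows. This is a toy illustration and not a representation of INFORM: the three statuses, the transition probabilities and the function names are all hypothetical assumptions.

```python
import random

# Hypothetical monthly transition probabilities (illustrative only,
# not estimated from the WPLS or any other data source).
TRANSITIONS = {
    "DLA": {"DLA": 0.97, "PIP": 0.02, "none": 0.01},
    "PIP": {"PIP": 0.99, "none": 0.01},
    "none": {"none": 1.0},
}


def count(population):
    """Number of simulated individuals in each status."""
    return {status: population.count(status) for status in TRANSITIONS}


def step(status, rng):
    """Draw next month's status for one individual."""
    options = TRANSITIONS[status]
    return rng.choices(list(options), weights=list(options.values()))[0]


def simulate(population, months, seed=0):
    """Roll every individual forward month by month; return monthly counts."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    history = [count(population)]
    for _ in range(months):
        population = [step(status, rng) for status in population]
        history.append(count(population))
    return history


history = simulate(["DLA"] * 1000 + ["PIP"] * 500, months=12)
print(history[0], history[-1])
```

Even in this toy version, the forecast emerges from many individual random draws, which gives some sense of why it can be hard to trace how a change in an input assumption feeds through to the outputs of a much larger model.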

Aggregate ‘bottom-up’ modelling

As PIP-specific data became available, in a different format and with more functionality than the WPLS, use of INFORM was superseded by aggregate ‘bottom-up’ modelling based on the new data. This was independent of other benefits, and modelled the ‘customer journey’ through the system from claim registration to initial decision, reconsideration and appeal, and ultimately exit from PIP. Additional modelling using the same data estimated the amount of arrears paid. This model enabled the forecast to be compared against the emerging data almost in real time, and allowed the key assumptions on provider capacity and claim success rates to be included directly in the model. This modelling evolved as more data became available, particularly on the managed migration from DLA to PIP and on award reviews. The main disadvantages were:

  • The modelling was heavily dependent on duration assumptions, but there is very little information to inform how PIP durations will evolve in the longer term.
  • The large number of detailed assumptions required meant that it was very difficult to take a holistic view of the caseload and of the interdependencies between assumptions.
  • The different sources of data meant that the DLA and PIP models were entirely separate.
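
The ‘customer journey’ logic can be illustrated with a minimal stock-flow sketch. It is a heavy simplification under stated assumptions: a single decision stage stands in for initial decision, reconsideration and appeal, and the parameter names (`capacity`, `success_rate`, `exit_rate`) and all values are hypothetical rather than taken from the actual model.

```python
def bottom_up(months, new_claims, capacity, success_rate, exit_rate,
              caseload=0.0, backlog=0.0):
    """Project the claims backlog and caseload month by month.

    new_claims:   claims registered each month (assumed constant)
    capacity:     maximum decisions the assessment provider can make per month
    success_rate: share of decided claims that result in an award
    exit_rate:    share of the caseload leaving each month (a duration assumption)
    """
    path = []
    for _ in range(months):
        backlog += new_claims
        decided = min(backlog, capacity)  # capacity constrains decisions
        backlog -= decided
        awards = decided * success_rate
        exits = caseload * exit_rate
        caseload += awards - exits
        path.append({"backlog": backlog, "caseload": caseload})
    return path


# Hypothetical example: registrations exceed capacity, so a backlog builds.
for month in bottom_up(3, new_claims=100, capacity=80,
                       success_rate=0.5, exit_rate=0.01):
    print(month)
```

The sketch shows why the first disadvantage matters: in the long run the caseload settles where awards equal exits, so the projected level depends directly on the assumed exit rate, the quantity about which least is known.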

‘Top-down’ prevalence approach

The third approach, used more recently, is a ‘top-down’ prevalence approach – projecting forward receipt of PIP and DLA by age for adults aged 16 to 64. This approach proved better than the other two for taking an overall view of future caseload growth, with the main judgement being about how prevalence would evolve over time, implicitly reflecting underlying trends in disability and the impact of unspecified future legal cases in expanding coverage of disability benefits. This type of model is more appropriate for forecasting a broad path of caseloads over the medium term, but it proved difficult to calibrate to outturn data or to determine whether differences between forecast and outturn reflected temporary deviations from the assumed medium-term path or shifts to a lower or higher trend.
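
A prevalence calculation of this kind can be sketched in a few lines. The age bands, population figures, prevalence rates and the single `prevalence_growth` judgement below are all hypothetical, and the population is held flat for simplicity, which an actual forecast would not assume.

```python
def top_down_caseload(population, prevalence, prevalence_growth, years):
    """Caseload path from population multiplied by prevalence, by age band.

    population:        {age_band: people}, held constant here for simplicity
    prevalence:        {age_band: share in receipt} in the base year
    prevalence_growth: assumed annual proportional growth in prevalence
    """
    path = []
    for t in range(years + 1):
        factor = (1 + prevalence_growth) ** t
        total = sum(population[band] * prevalence[band] * factor
                    for band in population)
        path.append(total)
    return path


# Hypothetical inputs: two age bands and 2 per cent a year prevalence growth.
population = {"16-44": 1_000_000, "45-64": 600_000}
prevalence = {"16-44": 0.04, "45-64": 0.10}
print(top_down_caseload(population, prevalence, 0.02, years=5))
```

The whole medium-term path rests on the single growth judgement, which makes the approach transparent but gives it little to say about month-to-month deviations from that path.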

In the aggregate ‘bottom-up’ and ‘top-down’ models, average amounts were forecast separately, although with only four rates payable it would be possible to integrate them into the ‘top-down’ approach fully. Our October 2018 forecast was the first time we had included specific assumptions about the trends in individual rates of PIP paid. Further work is needed to consider how these evolve as the caseload matures and average durations rise.
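
As a rough sketch of how assumptions about individual rates could feed through to average amounts and spending, the example below weights each of the four rates by the share of the caseload receiving it. The weekly rates and shares are round, hypothetical numbers rather than actual PIP rates or outturn data, and the function names are illustrative.

```python
def average_weekly_award(shares, weekly_rates):
    """Average award per claimant: each rate weighted by the share receiving it.

    A claimant can receive one daily living rate and one mobility rate,
    so the shares across the four rates may sum to more than one.
    """
    return sum(shares[rate] * weekly_rates[rate] for rate in weekly_rates)


def annual_spending(caseload, shares, weekly_rates, weeks=52):
    """Approximate annual spending as caseload times average award times weeks."""
    return caseload * average_weekly_award(shares, weekly_rates) * weeks


# Hypothetical weekly rates (pounds) and shares of the caseload.
weekly_rates = {"daily_living_standard": 60.0, "daily_living_enhanced": 90.0,
                "mobility_standard": 25.0, "mobility_enhanced": 60.0}
shares = {"daily_living_standard": 0.5, "daily_living_enhanced": 0.4,
          "mobility_standard": 0.3, "mobility_enhanced": 0.3}
print(average_weekly_award(shares, weekly_rates))
print(annual_spending(1_000_000, shares, weekly_rates))
```

Letting the shares vary by year would be one way to represent trends in the mix of rates as the caseload matures.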

The main lesson from this experience is that no single model ticks all the necessary boxes. In an environment still subject to significant change, running a detailed ‘bottom-up’ model alongside a simpler ‘top-down’ model is likely to be appropriate, with useful insights to be gained from operating each methodology separately. We will work with DWP to improve each of these models, in particular to achieve greater integration between different aspects of the forecast.

ᵃ More detail is given in Chapter 4 of our October 2016 Forecast evaluation report.