Does census data still have value almost ten years later?

2021 is on the horizon and that date means only one thing for small area data fans…the 2021 Census is nearly upon us! We have been eagerly awaiting an updated census, which will help give us an insight into the profound social and demographic changes that have occurred over the last decade. However, it will not be until 2023 that we can start getting our hands on some of the major data outputs at small-area level.

In the meantime, the 2011 Census continues to provide an (increasingly out-of-date) picture of the characteristics and challenges in local communities.

So why do those of us engaged in quantitative socio-economic research keep coming back to using census data?

We get a lot of questions about this and our Head of Research, Stefan Noble, has collated some of his thoughts below.

Can we use more recent survey data instead of the census?

Although Census 2011 data is becoming increasingly out of date, the census has one major advantage: it dwarfs other large-scale surveys in both size and scope.

The census aims to capture 100% of the population rather than a small subset. In addition, statistical techniques were used to capture hard-to-measure populations in hidden households (including homeless people and people living in ‘beds in sheds’). As a result, it was possible to produce extremely granular data, down to the smallest neighbourhood geographies, relatively robustly. This allows variation across small communities and local neighbourhoods to be measured, identifying hidden pockets of need and specific characteristics.

Other surveys do not have the resources or timeframes to produce outputs at nearly the same scale. To put it in context, the second-largest survey, the Annual Population Survey, has a sample size of 320,000, which would lead to sampling errors that are too large to produce reliable statistics below Local Authority level.
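To give a rough sense of why sample size matters, the sketch below computes an approximate 95% margin of error for an estimated proportion. It assumes simple random sampling and a sample spread evenly across areas, which real surveys do not achieve. The area counts (400 local authorities, 40,000 neighbourhoods) are illustrative round numbers and are not official figures. Only the sample size of 320,000 comes from the text above.

```python
import math

def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate 95% margin of error for a proportion (simple random sampling)."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

SURVEY_SAMPLE = 320_000          # sample size quoted in the text
ASSUMED_LOCAL_AUTHORITIES = 400  # illustrative round number, not an official count
ASSUMED_NEIGHBOURHOODS = 40_000  # illustrative round number, not an official count

per_authority = SURVEY_SAMPLE // ASSUMED_LOCAL_AUTHORITIES   # 800 respondents
per_neighbourhood = SURVEY_SAMPLE // ASSUMED_NEIGHBOURHOODS  # 8 respondents

# Margin of error around a true rate of 10% at each geography
for label, n in [("local authority", per_authority),
                 ("neighbourhood", per_neighbourhood)]:
    moe = margin_of_error(0.10, n)
    print(f"{label}: n={n}, margin of error = +/- {moe * 100:.1f} percentage points")
```

Under these assumptions, a 10% rate would be estimated to within roughly 2 percentage points at local authority level, but only to within roughly 21 percentage points for a single neighbourhood, which is far too wide to be useful.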

In fact, where survey data is presented at neighbourhood level it is typically modelled down to that level using data from Census 2011. Small area estimation techniques use socio-demographic data from Census 2011 to estimate the prevalence at that small area level – so in effect, a lot of the local variation observed in these measures is derived from Census 2011 data.
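As a simplified illustration of the idea (not the method used by any particular data publisher), the sketch below applies survey-based rates for each age group to census counts for each area. All figures, area names and age bands are hypothetical. Real small area estimation models are considerably more sophisticated, but the sketch shows how local variation in the result comes entirely from the census population mix.

```python
# Hypothetical national survey rates for some characteristic, by age group
survey_rate_by_age = {"16-34": 0.10, "35-64": 0.20, "65+": 0.40}

# Hypothetical census population counts by age group for two small areas
census_counts = {
    "Area A": {"16-34": 500, "35-64": 300, "65+": 200},
    "Area B": {"16-34": 200, "35-64": 300, "65+": 500},
}

def synthetic_estimate(counts, rates):
    """Estimate an area's prevalence by weighting survey rates by census counts."""
    total_population = sum(counts.values())
    expected_cases = sum(counts[group] * rates[group] for group in counts)
    return expected_cases / total_population

for area, counts in census_counts.items():
    estimate = synthetic_estimate(counts, survey_rate_by_age)
    print(f"{area}: estimated prevalence {estimate:.0%}")
```

The survey rates are identical for both areas, so the difference between the two estimates is driven only by the census age profile. If that profile has changed since the census, the modelled estimate inherits the error.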

So, even though there may be more recent surveys looking at socio-demographic issues across the country, this doesn’t automatically make them preferable, as robustness and reliability may be compromised.

What questions can only the census answer?

Unlike surveys, administrative data (data collected by governments or other organisations for non-statistical reasons) can be both suitably robust and more timely, giving a more up-to-date picture of neighbourhoods than the census. Therefore, where possible, we would recommend using administrative data when analysing socio-economic patterns and performance in small areas. However, there are some themes where there is comparatively little administrative data, and so we are reliant on census data to fill the gap.

In 2014, the National Statistician recommended that the ONS should make increased use of administrative data, both to enhance the statistics from the 2021 Census and to improve statistics between censuses. This project is called the Administrative Data Census. Excellent progress has been made on some themes, with high-quality administrative data sources deemed suitable for capturing some measures of population, migration and housing size. However, outside of the census, various themes have poor coverage, poor composite quality, or both, including:

  • Adult skills and qualifications
  • Commuting flows
  • Informal care
  • Economic inactivity
  • Self-reported health
  • Country of birth and ethnicity
  • Language proficiency
  • Religion
  • Year since last worked
  • Industry/occupation by place of residence
  • Car ownership

The census remains the only robust source of information on these themes at local level and for this reason, we would recommend continuing to use this data when analysing these themes.

My area has completely changed since 2011, is using census data still appropriate?

Some areas have seen considerable change over the last 10 years with new housing developments, large influxes of people from elsewhere or changing socio-economic circumstances. 

Administrative data can be used to measure the impacts of some of these changes (see other sources below). Where there is limited administrative data, it may be possible to draw on locally collected data from smaller surveys conducted by the Local Authority to get an understanding of some of the more recent local changes. However, in the absence of suitable local or administrative data, the census remains the best estimate of a neighbourhood’s characteristics. In these cases, using census data is still justifiable. However, you may want to provide a ‘health warning’ with your analysis, so that users are aware that census data has been used.

Do other data publishers provide age and gender breakdowns?

The census remains the best source of information to identify where individuals experience multiple disadvantages such as worklessness and low skills, or poor health and lack of access to transport. As census data is sourced from households’ responses to questions across a range of themes, it is possible to link these responses. 
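A small sketch of why linked responses matter is shown below. The two areas and their records are entirely hypothetical. Each record is one person, with a flag for worklessness and a flag for low skills. Both areas have identical totals on each theme taken separately, but only linked records reveal how many people experience both.

```python
# Each hypothetical record is (is_workless, has_low_skills) for one person
area_a = [(True, True), (True, True), (False, False), (False, False)]
area_b = [(True, False), (True, False), (False, True), (False, True)]

def summarise(records):
    """Return counts of workless people, low-skilled people, and people who are both."""
    workless = sum(1 for is_workless, _ in records if is_workless)
    low_skills = sum(1 for _, has_low_skills in records if has_low_skills)
    both = sum(1 for is_workless, has_low_skills in records
               if is_workless and has_low_skills)
    return workless, low_skills, both

for name, records in [("Area A", area_a), ("Area B", area_b)]:
    workless, low_skills, both = summarise(records)
    print(f"{name}: workless={workless}, low skills={low_skills}, both={both}")
```

Separate datasets on worklessness and skills would show the two areas as identical, while the linked records show that multiple disadvantage is concentrated in Area A and absent in Area B.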

It is also possible to examine socio-economic data for a detailed set of demographic breakdowns (e.g. age, gender, ethnicity) to provide a definitive source of information for key equalities groups.

While some administrative datasets have age and gender breakdowns, it is rare for other sources of data to contain breakdowns by ethnicity or nationality, which is a key gap when looking at identifying inequalities across communities.

What other data sources are available?

There is a growing number of useful administrative data sources that provide insight into how our local communities are changing between census periods.

  • The Department for Work and Pensions benefits data is a key source for measuring participation in the labour market and can also be used to identify poor health and disability
  • HMRC data on PAYE taxes and tax credits can be used to identify income levels and children in poverty
  • The Valuation Office Agency provides useful data on the size and type of properties
  • ONS publish population estimates using a range of administrative sources 
  • The Inter Departmental Business Register (IDBR) and Business Register and Employment Survey (BRES) provide counts of jobs and enterprises by industry

There are also a number of themes not covered by the census at all, for which there are alternative sources, including:

  • criminal offences (collected by Police UK)
  • property transactions (collected by HM Land Registry)
  • data on broadband provision (from Ofcom)
  • births and deaths (collected by ONS)
  • people with specific health conditions (collected by NHS England) 

What’s next?

The quality and breadth of administrative data continue to improve, with work under way to match administrative data from multiple sources and provide a richer picture at a more granular level. One example is the work to explore developing an All Education Dataset for England (AEDE). There is also scope to make use of alternative data sources, such as mobile phone data to track commuting flows, or ethnicity estimates based on name records in consumer databases, which could supplement and update census data in the future.

But for now, the census remains a vital tool for understanding the characteristics of neighbourhoods, as long as its limitations are understood.

Stefan Noble, Head of Research
