4  ECHILD Overview

4.1 Key features

  • Population-based cohort of children and young people in England born from 01/09/1984 to date (with annual updates).
  • Longitudinal data from birth to mid-adulthood or the most recent year available, e.g., children born in 1984 will be aged 38 in HES records in 2022.
  • Pseudonymised datasets that do NOT include any identifiable information (name, address, postcode, date of birth, Unique Pupil Numbers or NHS numbers):
    • De-identified NHSE Hospital Episode Statistics administrative datasets (listed in Chapter 6).
    • DfE National Pupil Database, including the Children Looked After Return and Children in Need Census.
    • ONS Mortality data.
  • Linkage rates:
    • Of births in 1996/97, 94% of children in the National Pupil Database linked to a hospital record.
    • Of births in 2004/05, 98% of children in the National Pupil Database linked to a hospital record.
  • Linked data for an estimated 20 million individuals.

ECHILD includes information from Hospital Episode Statistics (HES), including mortality data provided by the Office for National Statistics (ONS). It also contains education and social care information from the National Pupil Database (NPD). Two pseudonymised IDs are included, one specific to HES data, the other specific to NPD, and individuals are linked across multiple records over time using these IDs. Both the HES IDs and NPD IDs are mapped together and stored within a ‘Pseudonymised Bridging file’ located within the ONS SRS (Mc Grath-Lone, Libuy, Harron, et al., 2021).
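As a rough illustration of how a bridging file of this kind can be used, the sketch below joins toy health and education records through a two-column lookup. The field names (`hes_id`, `npd_id`), the record layouts and all values are hypothetical and chosen only for the example; the actual variable names and file structure in the ONS SRS may differ.

```python
# Toy bridging file: each row pairs one health-side ID with one education-side ID.
# All IDs and field names here are invented for illustration.
bridge = [("H001", "N101"), ("H002", "N102")]

hes_episodes = [
    {"hes_id": "H001", "admission_year": 2010},
    {"hes_id": "H001", "admission_year": 2015},
    {"hes_id": "H003", "admission_year": 2012},  # no bridging row, so not linked
]
npd_records = [
    {"npd_id": "N101", "academic_year": "2009/10"},
    {"npd_id": "N102", "academic_year": "2009/10"},
]


def link_records(bridge, hes_episodes, npd_records):
    """Join health episodes to education records via the bridging pairs."""
    hes_to_npd = dict(bridge)

    # Group education records by their pseudonymised ID for fast lookup.
    npd_by_id = {}
    for rec in npd_records:
        npd_by_id.setdefault(rec["npd_id"], []).append(rec)

    linked = []
    for episode in hes_episodes:
        npd_id = hes_to_npd.get(episode["hes_id"])
        if npd_id is None:
            continue  # individual has no education-side match
        for rec in npd_by_id.get(npd_id, []):
            linked.append({**episode, **rec})
    return linked


linked = link_records(bridge, hes_episodes, npd_records)
for row in linked:
    print(row)
```

In this toy data, only the individual with a bridging row and records on both sides appears in the linked output; records without a match on the other side are dropped, which is one of several possible join choices.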

ECHILD can only be accessed by approved researchers in the ONS SRS, and researchers are not permitted to try to re-identify individuals. Furthermore, any results of analyses (tables or figures) are checked by ONS staff for potential disclosure risk before they can be exported from the ONS SRS.

An overview of how the ECHILD database is structured can be found in Chapter 5, with further details about the linkage algorithm provided in Appendix B and information about the ‘Attribute Data’ for HES and NPD provided in Chapter 6.

4.2 Education dataset pseudo-identifier

When a pupil first attends a state-funded school in England, e.g., nursery or primary school, or has an education, health and care (EHC) plan put in place, they are allocated a ‘Unique Pupil Number’ (UPN), which remains with the pupil throughout their school career regardless of any change in school or local authority (Department for Education, 2019). Social care data is included in the NPD only for children who have a UPN. Children who receive social care before school entry but never during their school years are therefore not included in ECHILD. UPNs facilitate the transfer of school-based education and attainment data between schools, local authorities and central government and are stored within the NPD. Within ECHILD, a nationally unique and anonymised child-level identifier called the Anonymised Pupil Matching Reference (aPMR) can be used to link data across different years of data collection (Jay, Mc Grath-Lone and Gilbert, 2019).

4.3 Healthcare dataset pseudo-identifier

The pseudonymised linkage spine is generated by NHS England. NHS England receives real-world identifiers provided by the DfE (name, date of birth, sex, postcode) and the aPMR. The real-world identifiers from education and NHS healthcare data are linked separately from any health or education information (NHS England, 2023h). For each matched pair of identifiers from education and health, NHS England attaches a pseudonymised ID called a ‘Token Person ID’ (TPI) (NHS England, 2024d). The TPI is created specifically for ECHILD and cannot be used to identify anyone.
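The separation described above can be pictured with a toy sketch: identifiers are compared on their own, and only a pseudonymised token plus the dataset-specific IDs leave the matching step. This is not NHS England's linkage algorithm (see Appendix B for that). Exact matching on the full identifier tuple, the random token format, the `hes_id` field and the function name `assign_tokens` are all assumptions made for illustration, and the people listed are fictional.

```python
import secrets


def assign_tokens(education_ids, health_ids):
    """Toy matching step that works on identifiers only.

    education_ids: list of (apmr, identifiers) pairs
    health_ids:    list of (hes_id, identifiers) pairs
    identifiers:   (name, date_of_birth, sex, postcode) tuple

    Assumption: a pair matches only when the identifier tuples are identical.
    """
    health_lookup = {identifiers: hes_id for hes_id, identifiers in health_ids}

    spine = []
    for apmr, identifiers in education_ids:
        hes_id = health_lookup.get(identifiers)
        if hes_id is None:
            continue  # no health record with the same identifiers
        # The token is random, so it carries no information about the person.
        spine.append(
            {"token": secrets.token_hex(8), "apmr": apmr, "hes_id": hes_id}
        )
    return spine


# Fictional identifiers, used only to exercise the function.
education_ids = [
    ("A1", ("Jo Bloggs", "2004-10-01", "F", "AB1 2CD")),
    ("A2", ("Sam Smith", "2005-03-12", "M", "EF3 4GH")),
]
health_ids = [
    ("H9", ("Jo Bloggs", "2004-10-01", "F", "AB1 2CD")),
]

spine = assign_tokens(education_ids, health_ids)
print(spine)
```

The output rows contain only the token and the two pseudonymised IDs; names, dates of birth and postcodes are used for matching and then discarded.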