Linking Historical Records and Contemporary Administrative Data

18 October 2024

Richard Zijdeman, Head of the Data & Augmentation department (IISG), senior researcher, SSHOC-NL Task Lead.

The Historical Sample of the Netherlands Database (HSNDB) has supported research into historical demographic developments for several decades. Many of these developments have echoed through Dutch society over the years and still shape (knowledge of) the societal behaviour of individuals living today. While social scientists have studied this ‘long arm’ of history, they have mostly done so in isolation from contemporary data. Substantively, this is a big loss. Connecting historical and contemporary data provides a longer time span over which to study variability in, for example, inequality. Moreover, the persons in historical datasets are the forebears of those in contemporary datasets, and linking them provides rich information on within-family and between-family patterns in outcomes such as health and longevity.

As part of ODISSEI, data from the HSNDB dataset Historical Sample of the Netherlands (HSN) has been linked with the current Statistics Netherlands microdata and catalogued in the ODISSEI Portal. This connects the historical research conducted using the HSN with contemporary society and the outcomes studied by social scientists, while allowing social scientists to understand the historical antecedents of the behaviours that they study. The linkage creates a meaningful infrastructural bridge between ODISSEI and CLARIAH, the National Infrastructure for Humanities research, and supports a wide range of interdisciplinary research and exciting new research lines in demography, economics, sociology and socio-genomics. The work has been carried out by the International Institute for Social History (IISG) and Statistics Netherlands (CBS).

The HSN sample is based on birth certificates from the period 1812-1922. To achieve the link with Statistics Netherlands microdata, the records of all HSN research persons from the birth cohorts 1900-1922 (n=20,000) have been extended with the life courses of their children, most of whom were born between 1925 and 1960. ODISSEI then linked four generations of families that lived in the Netherlands between 1850 and 2023. The linking was based on the combination of the birth date of the person and the birth dates of his or her parents. A test showed that 85 to 90% of the persons from the second generation could be linked immediately to the Statistics Netherlands microdata. More extensive linkage procedures, which included additional variables such as names and partners, were then needed to resolve ambiguities (twins, double links) and errors.
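
To make the idea of linking on a combination of birth dates concrete, the sketch below shows one possible way to do such a match in Python. It is an illustration only, not the procedure used by IISG and CBS: the Person record, its field names, the link_records function and all example dates are hypothetical, and the real linkage drew on additional variables and checks. A historical person is linked only when the combination of the three birth dates is unique on both sides; twins and double links are set aside for a later, more extensive step.

```python
from collections import defaultdict
from typing import NamedTuple


class Person(NamedTuple):
    # Hypothetical record layout; real HSN and CBS files differ.
    person_id: str
    birth_date: str          # ISO date string, e.g. "1931-04-12"
    father_birth_date: str
    mother_birth_date: str


def link_key(p):
    """Combine the person's birth date with both parents' birth dates."""
    return (p.birth_date, p.father_birth_date, p.mother_birth_date)


def link_records(historical, contemporary):
    """Split historical persons into linked, ambiguous and unlinked."""
    # Index the contemporary records by their linkage key.
    index = defaultdict(list)
    for c in contemporary:
        index[link_key(c)].append(c.person_id)

    # Count how often each key occurs on the historical side (e.g. twins).
    hist_counts = defaultdict(int)
    for h in historical:
        hist_counts[link_key(h)] += 1

    linked, ambiguous, unlinked = {}, {}, []
    for h in historical:
        key = link_key(h)
        candidates = index.get(key, [])
        if not candidates:
            unlinked.append(h.person_id)
        elif len(candidates) == 1 and hist_counts[key] == 1:
            linked[h.person_id] = candidates[0]      # unique on both sides
        else:
            ambiguous[h.person_id] = list(candidates)  # needs more variables
    return linked, ambiguous, unlinked


# Invented example data: h2 and h3 are twins, h4 has no counterpart.
historical = [
    Person("h1", "1930-01-01", "1901-05-05", "1903-06-06"),
    Person("h2", "1932-02-02", "1902-07-07", "1904-08-08"),
    Person("h3", "1932-02-02", "1902-07-07", "1904-08-08"),
    Person("h4", "1935-03-03", "1905-09-09", "1906-10-10"),
]
contemporary = [
    Person("c1", "1930-01-01", "1901-05-05", "1903-06-06"),
    Person("c2", "1932-02-02", "1902-07-07", "1904-08-08"),
    Person("c3", "1932-02-02", "1902-07-07", "1904-08-08"),
    Person("c5", "1940-04-04", "1910-11-11", "1911-12-12"),
]

linked, ambiguous, unlinked = link_records(historical, contemporary)
print(linked, ambiguous, unlinked)
```

In this toy example only h1 is linked directly. The twins h2 and h3 end up in the ambiguous group, which is the kind of case where extra variables such as names and partners would be needed.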

The result of this work is a linkage file that researchers working in the Statistics Netherlands Secure Remote Access Environment can access in order to incorporate links to this historical population sample into their analyses. ODISSEI provides opportunities to help bear the cost of using the Access Environment. The resulting linkage files are already being used in a range of research projects, including investigations into the intergenerational transmission of mortality over very long periods.

Photo by Zetong Li on Unsplash