The ODISSEI Observatory provides continuity for long-standing, excellent data collections in the social sciences and links them to the ODISSEI infrastructure.

It is a work stream in the ODISSEI Roadmap project. The other work streams are the Data Facility, Laboratory and Hub.

The Observatory consists of four tasks:

  1. Data collection
  2. Linking to HSN data
  3. Linking to the OSSC
  4. Media Content Analysis Lab

2.1 Data Collection

The coordination of all ODISSEI data collections by a single data infrastructure opens up possibilities for greater substantive and methodological synergies, the coordinated application of new technologies, the implementation of shared standards, and the efficient distribution of resources. This can reduce costs, improve data quality, and support multidisciplinarity. This is particularly true with regard to international data infrastructures such as SHARE, GGP, ESS, LIS, and EVS. ODISSEI will coordinate the Dutch representation in these infrastructures and seek to align the work within ODISSEI.

ODISSEI will also conduct a full assessment of international data commitments and evaluate participation on a biennial basis (2021 and 2023). This process is currently disjointed and inefficient. Statistics Netherlands is mandated to provide certain data to Eurostat under European regulation, and there are also statistical requirements from United Nations agencies such as the International Labour Organization and the United Nations Population Division. There are further requirements as part of membership of the OECD and participation in studies such as PISA (Programme for International Student Assessment) and PIAAC (Programme for the International Assessment of Adult Competencies). Finally, there are commitments of the research community to international research infrastructures such as SHARE-ERIC, ESS-ERIC, GGP, EuroCohort, LIS, and EVS. ODISSEI will develop an evaluation framework for the sustainability of the data, examine the data landscape, and report to the ODISSEI Supervisory Board on its recommendations for data coordination.

ODISSEI will address high fieldwork costs in the Netherlands by developing a cost reduction strategy for fieldwork that identifies potential efficiencies throughout the data collection process in survey infrastructures. ODISSEI will commission a strategic assessment of the face-to-face fieldwork process and systematically identify the cost components within the current tender requirements. Specifically, the workload management and caseload constraints that data collections apply within tender requirements are preventing fieldwork agencies from deploying more cost-efficient face-to-face operations. Some of these requirements have high scientific merit, but others have limited scientific merit and significant consequences for cost. The results of the assessment will then be fed into a working group consisting of leading data collections. This working group will be tasked with developing guidelines for future data collections and facilitating the cost reduction strategies outlined in the assessment. Based on the report and its findings, ODISSEI will provide a single tendering framework with clear guidelines on how calls for tenders are to be structured. This will be used for data collections from January 2022 onward.

Whilst face-to-face fieldwork costs are increasing rapidly, the feasibility of high-quality online surveys is improving. Online surveys have been constrained by the degree to which they reach the whole population and the degree to which they can be applied in international contexts. ODISSEI will address these issues by developing protocols for adopting web-only designs. These will specify mitigation strategies for the negative effects of online-only data collection, specifically for hard-to-reach sub-populations and the international comparability of the data. The next stage in development will be to set out the conditions and timetable for this transition. These will then be used to inform and structure future data investments within ODISSEI.

Project team Data collection: Marike Knoef (Leiden University – Task leader).

Questions regarding Data collection? Contact Tom Emery (ODISSEI Coordination Team).

2.2 Historical Sample of the Netherlands

Data from the Historical Sample of the Netherlands (HSN) will be linked with the current Statistics Netherlands microdata and catalogued in the ODISSEI Portal (subtask 2.2b). This will connect the historical research conducted using the HSN with contemporary society and the outcomes studied by social scientists, and will strengthen links with CLARIAH. The HSN sample is based on birth certificates from the period 1812-1922. To achieve the link with Statistics Netherlands microdata, all HSN research persons from the birth cohorts 1900-1922 (n=20,000) will be extended with the life courses of their children. Most of these children were born between 1925 and 1960. In total, ODISSEI will link at least four generations of families that lived in the Netherlands between 1850 and 2023. The linking will be done on the basis of the combination of the birth date of the person and the birth dates of his or her parents. A test showed that 85 to 90% of the persons from the second generation could be immediately linked to the Statistics Netherlands microdata. However, more extensive linkage procedures are necessary to resolve ambiguities (twins, double links) and errors, by including more variables in the procedure (e.g. names and partners). The linking of this data will create a meaningful infrastructural bridge between ODISSEI and CLARIAH, and the resulting links will be findable via the ODISSEI Portal.
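
The linkage step described above can be illustrated with a minimal sketch. It is not the procedure used by the HSN or Statistics Netherlands; the field names (`birth_date`, `father_birth_date`, `mother_birth_date`, `id`) and the sample records are hypothetical and serve only to show how matching on the combination of a person's birth date and the parents' birth dates yields immediate links, while twins and double links remain ambiguous and would need extra variables such as names or partners.

```python
from collections import defaultdict


def link_records(hsn_records, register_records):
    """Link records on (own birth date, father's birth date, mother's birth date).

    Returns three results:
      linked    - HSN id -> register id, where the key is unique on both sides
      ambiguous - HSN id -> candidate register ids (e.g. twins, double links)
      unlinked  - HSN ids with no register candidate at all
    All field names are hypothetical, chosen for illustration only.
    """
    def key(rec):
        return (rec["birth_date"], rec["father_birth_date"], rec["mother_birth_date"])

    hsn_by_key = defaultdict(list)
    for rec in hsn_records:
        hsn_by_key[key(rec)].append(rec["id"])

    reg_by_key = defaultdict(list)
    for rec in register_records:
        reg_by_key[key(rec)].append(rec["id"])

    linked, ambiguous, unlinked = {}, {}, []
    for k, hsn_ids in hsn_by_key.items():
        reg_ids = reg_by_key.get(k, [])
        if len(hsn_ids) == 1 and len(reg_ids) == 1:
            # One-to-one match: can be linked immediately.
            linked[hsn_ids[0]] = reg_ids[0]
        elif reg_ids:
            # Several candidates on either side: needs more variables to resolve.
            for hsn_id in hsn_ids:
                ambiguous[hsn_id] = list(reg_ids)
        else:
            unlinked.extend(hsn_ids)
    return linked, ambiguous, unlinked


# Invented example data: H2 and H3 are twins, H4 has no register match.
hsn = [
    {"id": "H1", "birth_date": "1930-05-01", "father_birth_date": "1902-03-10", "mother_birth_date": "1905-07-22"},
    {"id": "H2", "birth_date": "1932-01-15", "father_birth_date": "1901-11-02", "mother_birth_date": "1904-04-30"},
    {"id": "H3", "birth_date": "1932-01-15", "father_birth_date": "1901-11-02", "mother_birth_date": "1904-04-30"},
    {"id": "H4", "birth_date": "1940-09-09", "father_birth_date": "1910-01-01", "mother_birth_date": "1912-02-02"},
]
register = [
    {"id": "R1", "birth_date": "1930-05-01", "father_birth_date": "1902-03-10", "mother_birth_date": "1905-07-22"},
    {"id": "R2", "birth_date": "1932-01-15", "father_birth_date": "1901-11-02", "mother_birth_date": "1904-04-30"},
    {"id": "R3", "birth_date": "1932-01-15", "father_birth_date": "1901-11-02", "mother_birth_date": "1904-04-30"},
]

linked, ambiguous, unlinked = link_records(hsn, register)
```

In this toy example only H1 is linked directly; the twins H2 and H3 each keep two register candidates, which is the kind of case where additional variables would be brought into the procedure.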

Project team HSN: Kees Mandemakers (IISG – Task leader).

Questions regarding HSN? Contact Tom Emery (ODISSEI Coordination Team).

2.3 Linking to the OSSC

Data from ODISSEI data collections will be linked in the ODISSEI Secure Supercomputer (OSSC). Many data collections in the social sciences have begun to include genotype information or other measures such as MRI scans and recordings from wearables. These can be pooled to achieve larger sample sizes for studying rare conditions or small and hard-to-reach sub-populations. The ODISSEI Secure Supercomputer is required to accommodate and link multiple sensitive data sources. The Netherlands Twin Registry will take the lead in linking the ODISSEI Secure Supercomputer project to a diverse range of ODISSEI data collections. Running new and innovative genetic analyses on multiple traits, longitudinal outcomes, and gene-environment interactions will allow insights into the role of genetics in social phenomena within Dutch society. The aim is to set up standard operating procedures (SOPs) for running parallel analyses within multiple ODISSEI data collections whose data are linked to other sensitive data sources, such as administrative data from Statistics Netherlands. Because of privacy, informed consent, and linkage issues, combining data from different data collections must be done in the secure access facility that ODISSEI provides. Combining data across data collections will substantially enhance statistical power, which is essential in many fields of research.

Questions regarding Linking to the OSSC? Contact Tom Emery (ODISSEI Coordination Team).

2.4 Media Content Analysis Lab

The ODISSEI Media Content Analysis Lab will offer systematic analysis of large corpora of digital media content. A major challenge in digital text analysis is that copyright and GDPR restrictions make it very hard to share media content data. The Media Content Analysis Lab seeks to tackle this challenge by providing a workbench where researchers can share data and analyses under strict rights management, without the need to read or export the data themselves. Currently, several Dutch initiatives exist that facilitate retrieval, storage, and analysis of media content, i.e. the infrastructure for content analysis [INCA] at the University of Amsterdam (Trilling et al., 2018) and the Amsterdam Content Analysis Toolkit [AMCaT] at VU Amsterdam (Van Atteveldt et al., 2014). Their use, however, requires considerable programming skills and is currently tailored to a limited set of applications. Through the Media Content Analysis Lab, the large amount of longitudinal media content data (i.e. online news data; social media data) currently available within ASCoR at the University of Amsterdam will be made available to researchers at ODISSEI member organisations. It will also be possible to deposit new data, or to use specific tools on data stored elsewhere. First, an initial set of pre-existing tools will be identified, ranging from scrapers that collect content from specific sources and tools that extract meta information to scripts for topic modelling, sentiment analysis, and various forms of machine learning and deep learning (for a concise overview see Boumans & Trilling, 2018). Next, suitable existing textual corpora will be added to the database. An interface will be developed that offers researchers the opportunity to register as users and to apply existing tools to existing datasets. Moreover, the opportunity to apply the content analytical tools to one's own datasets will be added. Researchers from different social science disciplines will be actively invited and encouraged to do so and will receive technical support when needed.

Project team Media Content Analysis Lab: Rens Vliegenthart (University of Amsterdam – Task leader).

Questions regarding Media Content Analysis Lab? Contact Tom Emery (ODISSEI Coordination Team).