The ODISSEI Portal combines metadata from a wide variety of research data repositories into a single interface, allows advanced semantic queries to support findability, and facilitates data access.
The ODISSEI Portal’s catalogue will include metadata from most data collections that are relevant to the social science community in the Netherlands. Currently, the prototype of the Portal includes (a) metadata from the microdata catalogue of Statistics Netherlands (CBS), (b) metadata from most datasets developed in the ODISSEI Laboratory (LISS panel), (c) metadata from the DANS-Easy SSH data catalogue, and (d) metadata from the social science records in DataverseNL. In the future, metadata from all datasets developed in the ODISSEI Observatory (EVS, GGP, SHARE, ESS, NTR, HSN) will also be added.
The Portal will also extend the coverage of available data by including key research datasets that are currently not findable via the available repositories. This dataset extension task is a joint effort of data stewards at DANS, the ODISSEI team of Data Scouts at the Observatory, and the ODISSEI Data Manager.
To enable semantic search and linking across datasets from different providers, the metadata needs to be harmonised. To achieve this, the Portal uses a pipeline to ingest the machine-readable metadata. Detailed information on the ingestion process is attached to each metadata record in the Portal. The pipeline consists of a number of microservices, published on the ODISSEI GitHub. Metadata can also be exported from the Portal in different formats (including DDI and JSON).
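The harmonisation step can be illustrated with a minimal Python sketch. It is not the Portal's actual pipeline: the provider names, field names, and the `harmonise` and `export_json` helpers are all hypothetical, chosen only to show how records with different field names might be mapped onto one common schema, annotated with ingestion details, and exported as JSON.

```python
import json

# Hypothetical field mappings: provider-specific name -> common name.
# Real providers and their schemas will differ.
FIELD_MAPS = {
    "provider_a": {"naam": "title", "omschrijving": "description"},
    "provider_b": {"study_title": "title", "abstract": "description"},
}


def harmonise(record, provider):
    """Map one provider record onto the common schema and attach ingestion details."""
    mapping = FIELD_MAPS[provider]
    harmonised = {
        common: record[source]
        for source, common in mapping.items()
        if source in record
    }
    # Keep a trace of what happened during ingestion alongside the record.
    harmonised["ingestion"] = {
        "provider": provider,
        "mapped_fields": sorted(f for f in record if f in mapping),
        "unmapped_fields": sorted(f for f in record if f not in mapping),
    }
    return harmonised


def export_json(records):
    """Serialise harmonised records as a JSON string."""
    return json.dumps(records, indent=2, sort_keys=True)


records = [
    harmonise({"naam": "Example register", "omschrijving": "Example description"}, "provider_a"),
    harmonise({"study_title": "Example panel wave", "mode": "web"}, "provider_b"),
]
print(export_json(records))
```

Both example records end up with a `title` key regardless of the source field name, and the `ingestion` entry records which source fields were mapped and which were left out.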
Currently, existing tools for findability in the social sciences are limited in that they only identify specific terms or synonyms (e.g. United Kingdom question bank). The ODISSEI Portal will extend and improve search functionality by using semantic queries that enable broader probabilistic matching and link functions over an enriched knowledge graph representation of the FAIR metadata catalogue. This takes into account the context of specific terms that are crucial in social research. By using rich and extensible data structures developed within the linked data community, ODISSEI will evolve the relatively flat metadata catalogues in use today into the highly interlinked, graph-based structures needed to conduct advanced semantic searches. The richness of the metadata catalogue does not rely solely on the high manual documentation and curation standards, such as DDI, that already exist across ODISSEI-associated data. The social sciences are fortunate to have advanced automated metadata capture that documents the data collection process, principally through survey software. In addition to the manual documentation and curation standards, the Portal takes advantage of automatic and semi-automatic metadata enrichment and entity linking to enrich the available curated information. Where relevant, there will be alignment with standards used in CESSDA (CESSDA Metadata Model) at the European level and the Open Science Framework.
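To make the difference between term matching and a graph-based search concrete, here is a small Python sketch. It is a deliberate simplification and not the Portal's search implementation: it uses plain dictionaries instead of a knowledge graph, does exact rather than probabilistic matching, and the concept hierarchy, dataset names, and the `expand` and `search` helpers are invented for illustration.

```python
from collections import deque

# Hypothetical concept hierarchy: concept -> narrower concepts.
NARROWER = {
    "employment": ["unemployment", "self-employment"],
    "unemployment": ["youth unemployment"],
}

# Hypothetical datasets, each tagged with concepts.
DATASETS = {
    "Survey A": {"youth unemployment"},
    "Survey B": {"self-employment"},
    "Survey C": {"housing"},
}


def expand(term):
    """Return the term plus all narrower concepts reachable from it."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in NARROWER.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def search(term):
    """Find datasets tagged with the term or any of its narrower concepts."""
    concepts = expand(term)
    return sorted(name for name, tags in DATASETS.items() if tags & concepts)


print(search("employment"))  # ['Survey A', 'Survey B']
print(search("housing"))     # ['Survey C']
```

A plain keyword match on "employment" would find neither Survey A nor Survey B here, because neither is tagged with that exact term; following the links between concepts is what surfaces them.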
The Portal also facilitates automatic and semi-automatic data access policy management between the producers and users of research datasets. Unclear data licensing or access policies are currently an obstacle to open science and the application of the FAIR principles, even for research datasets that are available as open data. ODISSEI will enrich its research data catalogue with explicit information on licensing and access policies that is as detailed as possible, preferably in a machine-readable format. The owners of each dataset will be able to provide the Portal with metadata describing what the policy for obtaining access entails. The access process varies between data providers: for instance, Statistics Netherlands requires that users be affiliated with an authorised research institute, and using its data involves formalities and costs, whereas other research data are often freely available for download to anyone around the globe.
For datasets with machine-readable access policy metadata, the ODISSEI Data Access Broker, an automated system that is closely connected to the Portal, will be able to assist the user. For example, the Access Broker will be able to send a data access request to the data owner, initiate a federated authentication session, or redirect researchers to the landing page of an open dataset. If a dataset does not yet have fully machine-readable access policy metadata, the ODISSEI Data Manager based at EUR will help the data owner and researcher with the access process.
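The routing described here can be sketched as a simple dispatch on the access policy. This is an illustrative assumption about how such a broker might decide, not the design of the ODISSEI Data Access Broker: the `AccessType` values, the `Dataset` fields, the `next_step` function, and the example URL are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AccessType(Enum):
    """Hypothetical machine-readable access policy categories."""
    OPEN = "open"
    FEDERATED_LOGIN = "federated_login"
    ON_REQUEST = "on_request"


@dataclass
class Dataset:
    title: str
    landing_page: str
    # None means no fully machine-readable access policy is available yet.
    access_type: Optional[AccessType] = None


def next_step(dataset: Dataset) -> str:
    """Choose the next action for a user who wants access to the dataset."""
    if dataset.access_type is None:
        # No machine-readable policy: fall back to human support.
        return "refer to the Data Manager"
    if dataset.access_type is AccessType.OPEN:
        return f"redirect to {dataset.landing_page}"
    if dataset.access_type is AccessType.FEDERATED_LOGIN:
        return "initiate a federated authentication session"
    return "send a data access request to the data owner"


open_data = Dataset("Open survey", "https://example.org/open-survey", AccessType.OPEN)
undocumented = Dataset("Legacy study", "https://example.org/legacy-study")

print(next_step(open_data))     # redirect to https://example.org/open-survey
print(next_step(undocumented))  # refer to the Data Manager
```

Treating a missing policy as its own case keeps the fallback to the Data Manager explicit, rather than guessing at an access route.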
Once the data owner reaches an agreement with the user, the owner allows the ODISSEI Data Access Broker to make the data available to the user. This can be achieved, for instance, by transferring the data to the designated analysis environment, typically the ODISSEI Secure Supercomputer (in the case of large, complex or sensitive data held at CBS) or the user's own computer (in the case of small and/or open data).
Status: In September 2022, a first version of the Portal was launched. It contains the metadata of over 4000 datasets, including those from the CBS Microdata catalogue, the LISS panel from Centerdata, and social science data deposited at DANS. The Portal will be further developed until 2024 by enriching data with existing word lists and linked open data to further improve findability. We will also add metadata from a large number of providers, with the goal of making all Dutch social science metadata available in one interface. In addition, the Data Access Broker will be added, which allows researchers to request data access from multiple data owners at once.
Project team ODISSEI Portal:
- VU: Jacco van Ossenbruggen (Task Leader), Elena Beretta, Margherita Martorana, Ronald Siebes
- DANS: Ricarda Braukmann, Wim Hugo, Thomas van Erven, Fjodor van Rijsselberg, Michiel Zuidema, Slava Tykhonov, Eko Indarto, Laura Huis in ’t Veld
- SURF: Jorik Van Kemenade, Freek Dijkstra