ODISSEI Lunch Lecture: From PDF to Knowledge Graph: parsing CBS metadata

The ODISSEI Lunch Lecture series highlights innovations in social science. 

In this lunch lecture, Chang Sun (Maastricht University) will present her project “Knowledge Graph for metadata of CBS Microdata”. Currently, the metadata are available only in Dutch and in PDF format. This project uses text mining and semantic web technologies to convert the descriptions of all CBS microdata sets into a single knowledge graph with comprehensive metadata in Dutch and English. Researchers can easily query the metadata, explore the relations among multiple datasets, and find the variables they need. For example, if a researcher searches for the dataset “Age at Death” in the Health and Well-being category, all information related to this dataset will appear, including keywords and variable names. The “Age at Death” dataset has the keyword “Death”. This keyword leads to other datasets such as “Date of Death”, “Cause of Death”, and “Production statistics Health and welfare” from the Population, Business, and Health and Well-being categories. This will save considerable time and cost for both data requesters and data maintainers.
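The keyword-based navigation in this example can be illustrated with a minimal Python sketch. It is not the project's implementation: the triple layout and the predicate names `hasKeyword` and `inCategory` are assumptions made for illustration, and the assumption that every listed dataset carries the keyword “Death” is inferred from the description above.

```python
from collections import defaultdict

# Triples (subject, predicate, object). The predicate names "hasKeyword" and
# "inCategory" are hypothetical, chosen only for this illustration.
TRIPLES = [
    ("Age at Death", "inCategory", "Health and Well-being"),
    ("Age at Death", "hasKeyword", "Death"),
    ("Date of Death", "hasKeyword", "Death"),
    ("Cause of Death", "hasKeyword", "Death"),
    ("Production statistics Health and welfare", "hasKeyword", "Death"),
]


def related_datasets(triples, dataset):
    """Return the datasets that share at least one keyword with `dataset`."""
    keywords_of = defaultdict(set)    # dataset -> its keywords
    datasets_with = defaultdict(set)  # keyword -> datasets carrying it
    for subject, predicate, obj in triples:
        if predicate == "hasKeyword":
            keywords_of[subject].add(obj)
            datasets_with[obj].add(subject)

    related = set()
    for keyword in keywords_of[dataset]:
        related |= datasets_with[keyword]
    related.discard(dataset)  # a dataset is not related to itself
    return sorted(related)


print(related_datasets(TRIPLES, "Age at Death"))
```

In a semantic web setting the same lookup would typically be expressed as a query over RDF triples; plain tuples are used here only to keep the sketch self-contained.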

Chang Sun is a PhD student at the Institute of Data Science at Maastricht University. She obtained her master's degree in Artificial Intelligence in 2017 and then started her PhD research in the data science domain. Currently, she is working on privacy-preserving data mining and federated/distributed machine learning technologies to solve the problem of analysing sensitive data across multiple independent data parties. She is also developing a personal data vault platform where people can take full control of their own data, in order to strengthen and extend the (re-)use of personal data while maximally protecting individuals’ privacy.

Register here.

Date 18 May 2021