Effy Xue Li
Post-Doctoral Researcher, CWI Amsterdam

Hi! I’m Effy; you can also call me 李雪 [lǐ xuě]. I am a postdoctoral researcher at CWI Amsterdam, in the TRL lab, led by dr. ir. Madelon Hulsebos. I am interested in automating contextual data science, with an emphasis on data management.

I did my PhD at the University of Amsterdam in the INDE lab, supervised by Prof. Paul Groth and Dr. Jan-Christoph Kalo. During my PhD, I did an internship at MotherDuck on data wrangling with LLMs. Previously, I was an AI resident at Microsoft Research Cambridge, UK. Before that, I did my Master’s at the University of Edinburgh.

My PhD topic was Knowledge Graph Construction from Conversational Data. Specifically, I looked into how we can better utilize (large) language models to extract information such as entities and relations and, furthermore, how to make (large) language models more data-efficient, robust, and adaptable. My PhD research was funded by and situated in the NWO project IN-SIGHT.it.

I am originally from China and have lived in the UK and the Netherlands. When I am not in front of my laptop, you can often find me bouldering (at a complete beginner level), cycling, or drinking coffee (like everyone else :D). If you are keen to collaborate or exchange ideas over a coffee, you can reach out to me at effy DOT li AT cwi DOT nl.

Interests

  • LLMs for Data Management
  • LLMs for Information Extraction
  • Automated Data Science

Academia

CWI Amsterdam
2025-04 - present
University of Amsterdam
2020-10 - 2025-04
Ph.D. Knowledge Graph Construction
University of Edinburgh
2017 - 2018
M.Sc. Artificial Intelligence
supervised by Prof. Shay Cohen.
Guangzhou University
2012 - 2016
B.Eng. Electronic Information Engineering
graduated with a top 1% GPA and a National Scholarship

Recent Publications

How different is different? Systematically identifying distribution shifts and their impacts in NER datasets, 2024, Language Resources and Evaluation (LREC)
Xue Li, Paul Groth
Towards Efficient Data Wrangling with LLMs using Code Generation, 2024, DEEM@SIGMOD'24
Xue Li, Till Döhmen
Do Instruction-tuned Large Language Models Help with Relation Extraction?, 2023, ISWC 2023 LM-KBC workshop
Xue Li, Fina Polat, Paul Groth
Knowledge-centric Prompt Composition for Knowledge Base Construction from Pre-trained Language Models, 2023, ISWC 2023 LM-KBC workshop
Xue Li, Author Name
The Challenges of Cross-Document Coreference Resolution for Email, 2021, Proceedings of the 11th Knowledge Capture Conference
Xue Li, Sara Magliacane, Paul Groth

Teaching and Supervision

2023 October - December: Teaching Assistant, NLP1, UvA
2023 January - July: Master thesis supervision. Student: Casper Smit; Title: Graphical information for cross-document coreference resolution in emails.
2023 April - May: Teaching Assistant, Data and Knowledge, UvA
2022 January - June: Master thesis supervision. Student: Timothy Dorr; Title: Not All Names Are WEIRD-Identifying Name Origin Bias in Named Entity Recognition Tasks on Email Data
2022 February - April: Teaching Assistant, Causal Data Science, UvA