Effy Xue Li
Final-year Ph.D. candidate, University of Amsterdam

Hi! I’m Effy; you can also call me 李雪 [lǐ xuě]. I am currently a PhD student at the University of Amsterdam in the INDE lab, supervised by Prof. Paul Groth and Dr. Jan-Christoph Kalo. I recently did an internship at MotherDuck on LLMs for Data Management. Previously I was an AI resident at Microsoft Research Cambridge, UK. Before that I did my Master’s at the University of Edinburgh.

My PhD topic is Knowledge Graph Construction from Conversational Data. Specifically, I look into how we can better utilize (large) language models to extract information such as entities and relations, and furthermore, how to make (large) language models more data-efficient, robust and adaptable. I am also interested in efficiently using LLMs for Data Management. My PhD research is funded by and situated in the NWO project IN-SIGHT.it.

I am originally from China and have lived in the UK and the Netherlands. When I am not in front of my laptop, you can often find me bouldering (at a complete beginner level), cycling or drinking coffee (like everyone else :D). If you are keen to collaborate or exchange ideas over a coffee, you can reach out to me at x DOT li3 AT uva DOT nl.

Interests

  • Knowledge graph construction from conversations
  • LLMs for Information Extraction
  • LLMs for Data Management
  • Domain adaptation for NLP

Academia

University of Amsterdam
2020-10 - present
Ph.D. Knowledge Graph Construction
University of Edinburgh
2017 - 2018
M.Sc. Artificial Intelligence
supervised by Prof. Shay Cohen.
Guangzhou University
2012 - 2016
B.Eng. Electronic Information Engineering
graduated with a top 1% GPA and a National Scholarship

News

  • I was at VLDB'24 in Guangzhou, China. Here is my trip report, August 2024.
  • Our paper titled ‘How different is different? Systematically identifying distribution shifts and their impacts in NER datasets’ got accepted to Language Resources and Evaluation (LREC)! July 2024.
  • Our paper titled ‘Towards Efficient Data Wrangling with LLMs using Code Generation’ got accepted to DEEM@SIGMOD'24! I will be in Chile to present our work. May 2024.
  • Our paper titled ‘Do Instruction-tuned Large Language Models Help with Relation Extraction?’ got accepted to the ISWC LM-KBC workshop! August 2023.
  • Our team placed 2nd in Track 2 of the LM-KBC challenge! August 2023.

Recent Publications

How different is different? Systematically identifying distribution shifts and their impacts in NER datasets, 2024, Language Resources and Evaluation (LREC)
Xue Li, Paul Groth
Towards Efficient Data Wrangling with LLMs using Code Generation, 2024, DEEM@SIGMOD'24
Xue Li, Till Döhmen
Do Instruction-tuned Large Language Models Help with Relation Extraction?, 2023, ISWC 2023 LM-KBC workshop
Xue Li, Fina Polat, Paul Groth
Knowledge-centric Prompt Composition for Knowledge Base Construction from Pre-trained Language Models, 2023, ISWC 2023 LM-KBC workshop
Xue Li, Author Name
The Challenges of Cross-Document Coreference Resolution for Email, 2021, Proceedings of the 11th Knowledge Capture Conference
Xue Li, Sara Magliacane, Paul Groth

Teaching and Supervision

2023 October - December: Teaching Assistant, NLP1, UvA
2023 January - July: Master thesis supervision. Student: Casper Smit; Title: Graphical information for cross-document coreference resolution in emails.
2023 April - May: Teaching Assistant, Data and Knowledge, UvA
2022 January - June: Master thesis supervision. Student: Timothy Dorr; Title: Not All Names Are WEIRD-Identifying Name Origin Bias in Named Entity Recognition Tasks on Email Data
2022 February - April: Teaching Assistant, Causal Data Science, UvA