Kickoff workshop
The kickoff workshop is planned for September 23-24, 2021 (Thursday/Friday).
We ask all workshop participants to register via the form. The Zoom link will be sent before the workshop.
Schedule of the workshop
All times are Central European Summer Time (CEST).
Day 1 (Sept. 23) – 3:00-6:00 pm
3 pm: Welcome, opening remarks
3:15 pm: Project presentation
3:30-3:40 pm: Introduction of the Advisory Board
3:40-3:55 pm: Short break and get-together
4:00 pm: Keynote by Arman Cohan: "Facilitating scientific knowledge discovery through improved representation learning and extreme summarization"
4:40-4:45 pm: Short break
4:45 pm: Panel 1: Open Issues in Mining Scientific Publications
6:00 pm: End of day 1
Day 2 (Sept. 24) – 3:00-6:00 pm
3 pm: Welcome back
3:15 pm: Panel 2: Entity and dataset linking in scientific texts
4:30-4:45 pm: Short break
4:45 pm: Breakout sessions
5:45 pm: Concluding remarks
Keynote
The video recording of the keynote talk is online: Keynote on YouTube, 38 min.
Title: Facilitating scientific knowledge discovery through improved representation learning and extreme summarization
Abstract: As the pace of scientific publication continues to increase, technologies to help users to search, discover, and understand the scientific literature have become critical. In this talk I will discuss two of our works in this direction that specifically facilitate discovery of relevant scientific information. First I’ll present SPECTER, a representation learning model for scientific papers that leverages the citation graph along with the power of Transformers in encoding textual information. SPECTER paper embeddings result in significant improvements in many downstream applications, including recommendations, user feeds, citation ranking, and peer review assistant tools. In the second part of the talk I will discuss TL;DR, an extreme summarization dataset and model for scientific papers that provides a single sentence summary of an entire scientific paper. Our model uses a simple scaffolding strategy to leverage the title of papers during training and is able to achieve substantial improvements on this low-resource and challenging task.
Bio: Arman Cohan is a Research Scientist at the Allen Institute for AI and an affiliate Assistant Professor at the University of Washington. His research primarily focuses on representation learning, language modeling and transfer learning methods in NLP, as well as their applications in the scientific and health domains. He obtained his PhD at Georgetown University in 2018, and his research has been recognized with a best paper award at EMNLP 2017, an honorable mention at COLING 2018, and the Harold N. Glassman Distinguished Doctoral Dissertation award in 2019.