Prof. Steffen Eger - Multimodal Scientific Content Generation with LLMs

Event details
Friday 9 May 2025 - 11:00am to 12:00pm
Description
Title: Multimodal Scientific Content Generation with LLMs
Abstract: Generating scientific images (e.g., scientific figures) by hand is often a time-consuming, laborious task, and popular graphics languages such as TikZ come with a steep learning curve. Automating this process promises to facilitate and accelerate scientific multimodal content production. In this talk, I will discuss our recent approaches to the problem. In particular, I will talk about (i) AutomaTikZ (ICLR 2024), which addresses the problem of generating scientific figures from textual instructions, (ii) DeTikZify (NeurIPS 2024), which generates scientific figures from images or sketches, and (iii) ScImage (ICLR 2025), which provides a template-based benchmark for evaluating multimodal LLMs on instruction-based figure generation. I will also talk about our ongoing research in this context, which, for example, addresses the misalignment problem between scientific images and their corresponding captions.
This talk may be of interest beyond NLP, especially to those working on scientific visualization, multimodal learning, or large language models. All are welcome!
Location: In-person at Ada Lovelace (108), Regent Court and online via Google Meet
Short Bio: Since 2024, Steffen Eger has been a Full Professor at the University of Technology Nuremberg, Germany, where he leads the Natural Language Learning & Generation (NLLG) Lab (https://nl2g.github.io/). Before that, he was an Interim Professor at Bielefeld University and a Group Leader at TU Darmstadt and the University of Mannheim. His research covers various aspects of Natural Language Processing (NLP), in particular the evaluation of text generation and interdisciplinary problems at the intersection of NLP, the digital humanities, and the social sciences.