Summary:
This seminar will examine why image captions need to be evaluated for both linguistic fluency and semantic coherence with the visual content they describe. The discussion will highlight advances in automated caption evaluation and address the limitations of current assessment metrics, particularly their focus on English and their lack of multilingual inclusivity. The session will also introduce enhancements to the CLIPScore metric that aim to improve its interpretability and reliability in practical applications. Using a model-agnostic conformal risk control framework, the presentation will cover the calibration of CLIPScore distributions, with a focus on detailed assessment of individual captioning errors and on constructing more accurate and reliable scoring intervals.
Speaker Profile:
Gonçalo Gomes holds a Master's degree in Data Science and Engineering from Instituto Superior Técnico, Universidade de Lisboa. Currently, he is a second-year PhD student and a junior researcher at the Human Language Technologies Lab of INESC-ID, as well as at SARDINE Labs of Instituto de Telecomunicações (IT). His research revolves around the creation of comprehensive and trustworthy evaluation frameworks for vision-and-language technologies, with a particular emphasis on fostering inclusive AI frameworks adaptable to non-English contexts.