Towards GeoAI Foundation Models: Unlocking Spatial Knowledge through Multimodal Learning

Friday 1st November 13:00-14:00 Maths 311B

Abstract

As AI continues to evolve, the integration of diverse data sources, such as images and text, offers exciting new possibilities for understanding our world. This talk introduces an innovative framework that combines geospatial knowledge with multimodal foundation models (e.g., GPT-4 Vision) to enhance geospatial analytics. By enabling AI to interpret both visual and linguistic information with spatial awareness, we unlock AI’s ability to tackle complex geospatial tasks, such as geo-localizing images, understanding land use, and predicting urban perceptions.

At the core of this framework is a cutting-edge technique—spatially explicit contrastive learning—which fine-tunes models to reason about geo-locations and spatial patterns, providing deeper insights from diverse geospatial datasets. This approach sets a new benchmark for AI in geospatial applications, enabling us to scale insights across cities and landscapes with unprecedented accuracy. As we push the boundaries of GeoAI, this fusion of multimodal learning and spatial reasoning paves the way for smarter, more informed decision-making in urban planning, environmental sustainability, and beyond.
