Paid Voices vs. Public Feeds: Interpretable Cross-Platform Theme Modeling of Climate Discourse

Samantha Sudhoff*, Pranav Perumal, Zhaoqing Wu, Tunazzina Islam*. Under Review.

[arXiv]

Abstract

Climate discourse online plays an important role in shaping public understanding of climate change and influencing political and policy debates. However, climate communication unfolds across structurally different online environments. Paid advertising platforms host targeted, institutionally produced messaging, while public social media platforms reflect largely organic, user-driven discussion. Despite these differences, computational studies typically analyze each environment in isolation, limiting our ability to compare institutional messaging and public discourse within a unified analytical framework. In this work, we conduct a comparative analysis of climate discourse across paid advertisements on Meta (previously known as Facebook) and public posts on Bluesky from July 2024 to September 2025. To support this comparison, we develop an interpretable thematic discovery pipeline that clusters texts by semantic similarity and leverages large language models (LLMs) to generate concise, human-interpretable theme labels. We evaluate the resulting themes against standard topic modeling baselines using both human judgments and an LLM-based evaluator, and further probe their utility through downstream stance prediction and theme-guided retrieval tasks. Using the induced themes, we analyze differences in thematic prevalence across platforms and examine how discourse shifts around major real-world events. Our findings show that paid advertising and public social media discourse exhibit systematic differences in thematic structure, stance alignment, and temporal responsiveness. While our empirical analysis focuses on climate communication, the proposed framework provides a general approach for comparative narrative analysis across heterogeneous communication environments.
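As a rough illustration of the pipeline described above (cluster texts by semantic similarity, label each cluster, then compare theme prevalence across platforms), the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: bag-of-words count vectors stand in for learned sentence embeddings, a greedy threshold-based pass stands in for whatever clustering method the paper uses, and a frequent-term heuristic stands in for the LLM labeling step. The function names, the stopword list, and the `threshold` value are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for the sketch.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "is", "our", "on"}


def embed(text):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(texts, threshold=0.3):
    """Greedy single-pass clustering (an assumption, not the paper's method).

    Each text joins the most similar existing cluster if the similarity to
    that cluster's summed vector reaches `threshold`; otherwise it starts
    a new cluster.
    """
    clusters = []
    for i, text in enumerate(texts):
        vec = embed(text)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            best["centroid"].update(vec)
            best["members"].append(i)
    return clusters


def label_cluster(texts, k=2):
    """Placeholder for the LLM labeling step: the k most frequent terms."""
    counts = Counter()
    for t in texts:
        counts.update(embed(t))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    top = sorted(counts, key=lambda term: (-counts[term], term))[:k]
    return " / ".join(top)


def theme_prevalence(platforms, clusters):
    """Share of each platform's texts that fall into each theme."""
    totals = Counter(platforms)
    shares = {}
    for theme_id, c in enumerate(clusters):
        counts = Counter(platforms[i] for i in c["members"])
        shares[theme_id] = {p: counts[p] / totals[p] for p in totals}
    return shares


if __name__ == "__main__":
    # Invented example records; not data from the paper.
    records = [
        ("meta_ads", "Support clean solar energy jobs"),
        ("meta_ads", "Solar energy jobs power our future"),
        ("bluesky", "Wildfire smoke is terrible today"),
        ("bluesky", "Wildfire smoke again this week"),
        ("bluesky", "Solar energy is getting cheap"),
    ]
    platforms = [p for p, _ in records]
    texts = [t for _, t in records]
    themes = cluster(texts)
    for theme_id, c in enumerate(themes):
        print(theme_id, label_cluster([texts[i] for i in c["members"]]), c["members"])
    print(theme_prevalence(platforms, themes))
```

In a realistic setting, `embed` would be replaced by a sentence-embedding model and `label_cluster` by a prompt to an LLM that summarizes representative texts from each cluster. The prevalence table is normalized per platform so that platforms with very different volumes can still be compared.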