E-flux Scraping

Building an exhibition network dataset from E-flux programs and archives.

[Figure: E-flux exhibition network visualization]

Overview

This project automates data collection from E-flux listings to build a network of exhibitions, institutions, and artists. It focuses on reliable parsing and entity reconciliation for downstream network analysis.

The resulting dataset supports research on cultural diffusion and collaboration patterns across global art institutions.

Key features

  • Scraped and normalized thousands of exhibition listings.
  • Built entity matching to connect venues, artists, and shows.
  • Created a cleaned graph-ready dataset for analytics workflows.
  • Documented repeatable scraping and validation steps.
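
The entity-matching step listed above can be illustrated with a minimal sketch. It is not the project's actual matcher: the function names `normalize_name` and `match_entity`, the 0.9 similarity threshold, and the sample venue names are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Return a comparison key: accents stripped, lowercased, punctuation removed."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks so "Musée" and "Musee" compare equal.
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace punctuation with spaces, then collapse runs of whitespace.
    cleaned = re.sub(r"[^\w\s]", " ", unaccented.lower())
    return re.sub(r"\s+", " ", cleaned).strip()


def match_entity(name, canonical_names, threshold=0.9):
    """Return the best-matching canonical name, or None if nothing is close enough.

    The threshold is a hypothetical default; a real pipeline would tune it
    against hand-labelled matches.
    """
    key = normalize_name(name)
    best, best_score = None, 0.0
    for candidate in canonical_names:
        score = SequenceMatcher(None, key, normalize_name(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


# Hypothetical venue names, used only to exercise the functions.
venues = ["Musée d’Art Moderne", "Kunsthalle Basel"]
matched = match_entity("Musee d'Art Moderne", venues)
unmatched = match_entity("Unknown Space", venues)
```

Here `matched` resolves to the accented canonical spelling, while `unmatched` is `None` because no candidate clears the threshold. Returning `None` instead of the nearest candidate keeps uncertain matches out of the graph, so they can be reviewed by hand.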

Technical approach

The dataset is produced by Python scraping pipelines with structured data validation and kept current by scheduled refresh jobs. Outputs are stored in a reproducible format for analysis.
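
As a rough sketch of what structured validation and a graph-ready output could look like, the example below checks a scraped record and turns it into edges. The `Listing` fields, the `validate` and `to_edges` helpers, and the sample record are assumptions for illustration. They are not the project's actual schema or data.

```python
from dataclasses import dataclass
from datetime import date

REQUIRED_FIELDS = ("title", "venue", "artists", "start", "end")


@dataclass(frozen=True)
class Listing:
    """One validated exhibition listing (hypothetical schema)."""
    title: str
    venue: str
    artists: tuple
    start: date
    end: date


def validate(raw: dict) -> Listing:
    """Check a scraped record and convert it to a typed Listing.

    Raises ValueError if a required field is missing or empty, if a date
    is not ISO formatted (YYYY-MM-DD), or if the end precedes the start.
    """
    missing = [field for field in REQUIRED_FIELDS if not raw.get(field)]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    start = date.fromisoformat(raw["start"])
    end = date.fromisoformat(raw["end"])
    if end < start:
        raise ValueError("end date precedes start date")
    return Listing(
        title=raw["title"].strip(),
        venue=raw["venue"].strip(),
        artists=tuple(a.strip() for a in raw["artists"]),
        start=start,
        end=end,
    )


def to_edges(listing: Listing) -> list:
    """Emit one (artist, venue, exhibition title) edge per artist."""
    return [(artist, listing.venue, listing.title) for artist in listing.artists]


# Invented sample record, used only to exercise the functions.
sample = {
    "title": " Example Show ",
    "venue": "Example Kunsthalle",
    "artists": ["Artist A", "Artist B "],
    "start": "2020-01-10",
    "end": "2020-03-01",
}
edges = to_edges(validate(sample))
```

Rejecting malformed records before they become edges keeps the graph free of dangling or mis-dated nodes, and the flat edge list can be written to CSV or loaded into a graph library as-is.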

Results & impact

The E-flux dataset expanded the available coverage of art-world networks and enabled new research on institutional influence and collaboration.

Links & resources