Network science & data
PainterPalette
The largest historical painters dataset, revealing how artistic movements evolve through networks.
Overview
PainterPalette is the largest public dataset of historical painters, built by merging WikiArt, Wikidata, and Art500k records into a single structured resource. It combines data collection, entity resolution, and graph analytics to expose how creative movements diffuse across time and geography.
The dataset powers research into cultural diffusion and collaboration, with analyses of painter networks, style networks, and co-exhibition graphs presented at NetSci 2025 and related workshops.
| Metric | Details |
|---|---|
| Entries | 10,000+ historical painters |
| Attributes | 40+ structured fields per profile |
| Sources | WikiArt, Wikidata, Art500k |
| Graphs | Co-location, co-exhibition, and style-influence networks |
Key features
- Extracted and cleaned painter data with a Wikidata SPARQL wrapper.
- Designed relational and Neo4j schemas for cultural graph analytics.
- Built ETL pipelines in Python and KNIME for repeatable updates.
- Analyzed painter co-location and co-exhibition networks at scale.
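As a rough illustration of the co-exhibition analysis in the last item, the sketch below projects exhibition rosters onto weighted painter-to-painter edges and computes a weighted degree per painter. It uses only the Python standard library. The function names (`co_exhibition_edges`, `weighted_degree`), the input shape, and the toy identifiers are assumptions made for this example, not the project's actual code or data.

```python
from collections import Counter
from itertools import combinations


def co_exhibition_edges(exhibitions):
    """Project {exhibition: [painters]} onto weighted painter-painter edges.

    The weight of an edge is the number of exhibitions two painters shared.
    """
    weights = Counter()
    for painters in exhibitions.values():
        # sorted(set(...)) drops duplicate names and gives each pair one
        # canonical orientation, so (a, b) and (b, a) are never both counted.
        for a, b in combinations(sorted(set(painters)), 2):
            weights[(a, b)] += 1
    return weights


def weighted_degree(weights):
    """Sum the edge weights incident to each painter."""
    degree = Counter()
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return degree


# Toy input with placeholder identifiers (hypothetical, not dataset records).
exhibitions = {
    "exhibition_1": ["painter_a", "painter_b", "painter_c"],
    "exhibition_2": ["painter_a", "painter_b"],
    "exhibition_3": ["painter_c", "painter_d"],
}

edges = co_exhibition_edges(exhibitions)
degree = weighted_degree(edges)

for (a, b), w in sorted(edges.items()):
    print(f"{a} -- {b}: {w} shared exhibition(s)")
```

The same projection applies to co-location if rosters are keyed by place (and, where available, period) instead of exhibition. An edge list in this form can be exported to CSV for tools such as Gephi.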
Technical approach
The pipeline uses Python, MySQL, and Neo4j to reconcile entities across sources and generate reproducible network metrics. Visualizations are produced with Gephi and with custom notebooks prepared for research presentations.
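To make the entity-reconciliation step more concrete, here is a minimal sketch of one common approach: normalize painter names, block candidate pairs on birth year, and accept the best fuzzy match above a threshold. The record layout, the `wa:`/`wd:` identifiers, the function names, and the 0.9 threshold are all assumptions for illustration. The project's actual matching rules are not described here and may differ.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name):
    """Strip accents, lowercase, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def match_records(left, right, threshold=0.9):
    """For each left record, find the most similar right record.

    Candidates must share the same birth year (a simple blocking rule);
    a pair is kept only if its name similarity reaches the threshold.
    """
    matches = []
    for l_rec in left:
        best, best_score = None, 0.0
        l_name = normalize_name(l_rec["name"])
        for r_rec in right:
            if l_rec["birth_year"] != r_rec["birth_year"]:
                continue  # blocked: different birth years
            score = SequenceMatcher(
                None, l_name, normalize_name(r_rec["name"])
            ).ratio()
            if score > best_score:
                best, best_score = r_rec, score
        if best is not None and best_score >= threshold:
            matches.append((l_rec["id"], best["id"], best_score))
    return matches


# Toy records; the ids are hypothetical placeholders, not real source keys.
source_a = [
    {"id": "wa:1", "name": "Édouard  Manet", "birth_year": 1832},
    {"id": "wa:2", "name": "Claude Monet", "birth_year": 1840},
]
source_b = [
    {"id": "wd:1", "name": "Edouard Manet", "birth_year": 1832},
]

matches = match_records(source_a, source_b)
for left_id, right_id, score in matches:
    print(f"{left_id} <-> {right_id} (similarity {score:.2f})")
```

Blocking on birth year keeps similar names such as Manet and Monet apart, but it will miss records with missing or conflicting dates, so a production pipeline would likely need additional fields and manual review of borderline scores.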
Results & impact
PainterPalette now supports ongoing cultural analytics research and has been adopted by collaborators for exploratory studies in art history and network science.