ATRAPA: Active travel infrastructure changes – Data update 🚶‍♀️🚴🏿‍♂️📊

Data collection status

New official data for Barcelona!

  • Roadway and pavement surfaces (AMB): Area level (polygons); no historical data.

  • Cyclable streets and bike lanes (2015): Street level (vector); definitions being clarified.

National road historical datasets (Sweden, Netherlands)

No clear distinction between new infrastructure and data updates.

Access to Planet’s education and research program

Global satellite imagery (since 2016), 3–5 m/pixel, likely unsuitable for detailed pedestrian infrastructure.

OSM validation method: Stratified sampling approach

  • Study scope: Validate OSM data (bike lanes, pedestrian streets) in 7 cities (Barcelona, Milan, Warsaw, Ljubljana, Utrecht, Malmö, Paris) for three periods (e.g. 2015, 2019, and 2023).

  • Reference dataset: Google Street View (GSV).

  • Sampling: ~60 census tracts per city (stratified: central, peripheral, high-density, low-density).

  • Image collection: Up to 13,000 GSV images via API, randomly selected within the stratified tracts (GEMOTT 🖥️).

  • Validation (MTurk):

    • 5 users classify each image by answering "Yes" or "No" to: ✅ Bike Lane, ✅ Pedestrian Area, ✅ Both.

    • At least 3 out of 5 users must agree.

  • Metrics:

    • Accuracy: Proportion of correctly mapped features (true positives).

    • Completeness: Proportion of real-world features present in OSM.

    • SCI (Spatial Completeness Index): Measures variation in completeness.

  • Reliability: Metrics determine which cities, intervals, and infrastructure types have reliable data (based on thresholds in prior studies).
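The validation and metric steps above can be sketched as follows. This is a minimal illustration, not the project's implementation. The function names, the per-image record layout (`in_osm`, `votes`), and the reading of accuracy as "share of OSM-mapped features confirmed by the annotators" and completeness as "share of annotator-confirmed features that are present in OSM" are assumptions. SCI is left out because its formula is not given here.

```python
from collections import Counter

MIN_AGREEMENT = 3  # "at least 3 out of 5 users must agree"


def majority_label(votes, min_agreement=MIN_AGREEMENT):
    """Return the agreed Yes/No answer (True/False), or None without consensus."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agreement else None


def build_samples(images):
    """Keep only images with a consensus; pair OSM status with the consensus."""
    samples = []
    for img in images:
        consensus = majority_label(img["votes"])
        if consensus is not None:
            samples.append({"in_osm": img["in_osm"], "observed": consensus})
    return samples


def accuracy(samples):
    """Assumed definition: confirmed share of the features mapped in OSM."""
    mapped = [s for s in samples if s["in_osm"]]
    return sum(s["observed"] for s in mapped) / len(mapped) if mapped else None


def completeness(samples):
    """Assumed definition: share of confirmed features that OSM contains."""
    observed = [s for s in samples if s["observed"]]
    return sum(s["in_osm"] for s in observed) / len(observed) if observed else None


# Made-up example votes for one infrastructure type (illustration only).
example_images = [
    {"in_osm": True, "votes": [True, True, True, False, True]},
    {"in_osm": True, "votes": [True, True, False, True, True]},
    {"in_osm": True, "votes": [True, True, True, True, True]},
    {"in_osm": True, "votes": [False, False, True, False, False]},
    {"in_osm": False, "votes": [True, True, True, False, False]},
    {"in_osm": False, "votes": [True, True, True, True, False]},
    {"in_osm": False, "votes": [True, False]},  # incomplete task, no consensus
]
example_samples = build_samples(example_images)
print(accuracy(example_samples), completeness(example_samples))
```

With five Yes/No answers one side always reaches three votes, so the "no consensus" branch only matters when some answers are missing or a third answer option is introduced.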

  • Simulated results:
    City        Interval    Type         Accuracy   Completeness   SCI      Reliable
    Barcelona   2015–2019   Bike Lanes   91% ✅     88% ✅         9% ✅    ✅ Yes
    Barcelona   2015–2019   Pedestrian   87% ✅     84% ✅         12% ✅   ✅ Yes
    Paris       2019–2023   Bike Lanes   78% ❌     72% ❌         18% ❌   ❌ No
    Warsaw      2015–2019   Bike Lanes   75% ❌     70% ❌         10% ✅   ❌ No
    Milan       2019–2023   Bike Lanes   89% ✅     90% ✅         11% ✅   ✅ Yes
    Milan       2019–2023   Pedestrian   65% ❌     60% ❌         19% ❌   ❌ No
    Ljubljana   2019–2023   Bike Lanes   55% ❌     50% ❌         22% ❌   ❌ No

  • Estimated cost:

    • MTurk: ~€614 (5 users × €0.01 × 13,000 images).

    • Google Street View API: ~€23 (2 lots of 6,000 images).

    • Total: ~€637.

  • Risk: Low data quality in most cases (low accuracy, completeness, or high SCI).
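The "Reliable" column of the simulated results can be reproduced with a simple rule, sketched below. The three thresholds are placeholders chosen only because they match the ticks and crosses in the simulated table; the actual thresholds from prior studies are not listed in these notes. Requiring all three checks to pass is likewise an assumption (it is consistent with the Warsaw row, which passes SCI but is marked unreliable).

```python
from dataclasses import dataclass

# Placeholder thresholds (assumptions, not taken from prior studies).
MIN_ACCURACY = 0.80
MIN_COMPLETENESS = 0.80
MAX_SCI = 0.15


@dataclass(frozen=True)
class Result:
    city: str
    interval: str
    infra_type: str
    accuracy: float
    completeness: float
    sci: float

    def is_reliable(self) -> bool:
        """Reliable only if every metric passes its (assumed) threshold."""
        return (
            self.accuracy >= MIN_ACCURACY
            and self.completeness >= MIN_COMPLETENESS
            and self.sci <= MAX_SCI
        )


# Simulated values copied from the table above (not real measurements).
SIMULATED = [
    Result("Barcelona", "2015–2019", "Bike Lanes", 0.91, 0.88, 0.09),
    Result("Barcelona", "2015–2019", "Pedestrian", 0.87, 0.84, 0.12),
    Result("Paris", "2019–2023", "Bike Lanes", 0.78, 0.72, 0.18),
    Result("Warsaw", "2015–2019", "Bike Lanes", 0.75, 0.70, 0.10),
    Result("Milan", "2019–2023", "Bike Lanes", 0.89, 0.90, 0.11),
    Result("Milan", "2019–2023", "Pedestrian", 0.65, 0.60, 0.19),
    Result("Ljubljana", "2019–2023", "Bike Lanes", 0.55, 0.50, 0.22),
]

for r in SIMULATED:
    flag = "Yes" if r.is_reliable() else "No"
    print(f"{r.city:<10} {r.interval} {r.infra_type:<11} reliable: {flag}")
```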

Alternative approach (idea – still to discuss)

  • Full extrinsic validation in Barcelona (GSV + MTurk).

  • Use intrinsic OSM indicators (contributor count, version count, tag richness) for other cities.
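The intrinsic indicators listed above need only the OSM edit history, no reference imagery. The sketch below assumes a hypothetical record layout (each feature carries a `versions` list, oldest first, with `user` and `tags` per version) and assumed definitions: distinct contributors across all features, mean versions per feature, and mean number of tags on the latest version. How these would be turned into a quality judgement is still open.

```python
from statistics import mean


def intrinsic_indicators(features):
    """Summarise edit history for a set of OSM features.

    Assumes every feature has at least one version (hypothetical layout).
    """
    if not features:
        return {"contributors": 0, "mean_versions": 0.0, "mean_tags": 0.0}
    contributors = {v["user"] for f in features for v in f["versions"]}
    return {
        "contributors": len(contributors),
        "mean_versions": mean(len(f["versions"]) for f in features),
        "mean_tags": mean(len(f["versions"][-1]["tags"]) for f in features),
    }


# Made-up example: one cycleway edited twice, one pedestrian street edited once.
example_features = [
    {
        "versions": [
            {"user": "alice", "tags": {"highway": "cycleway"}},
            {
                "user": "bob",
                "tags": {"highway": "cycleway", "surface": "asphalt", "oneway": "yes"},
            },
        ]
    },
    {"versions": [{"user": "alice", "tags": {"highway": "pedestrian"}}]},
]
print(intrinsic_indicators(example_features))
```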

Expert consultation and feedback

  • HeiGIT team (Kirsten): Doubted the reliability of historical OSM data (esp. pre-2017, pedestrian/cycling), but found the comparison with external data such as Street View interesting and useful.

  • Aaron Hipp: Supported the validation approach; highlighted issues with inconsistent definitions across cities.

  • Robin Lovelace: Supported the validation approach; could help with tagging-consistency issues and with the osmextract R package.

  • Ane Rahbek Vierø: Found the validation study very interesting but challenging, mainly due to tagging inconsistencies and the lack of historical data for pedestrians and cyclists. Provided useful references.

  • Angelo Guevara?

✅ General consensus: Experts agree that a research paper on this validation method would be highly valuable.

Next steps and future plans

  • Immediate actions:

    • Continue collecting official data (focus on other cities).

    • Check whether Planet’s satellite imagery is suitable for assessing active travel infrastructure.

    • Refine the OSM validation method and set up the validation (MTurk) task.

    • Reply to HeiGIT with our refined method and perhaps meet with them.

  • Paper development: