
How a BBC Data Unit Scraped Airport Noise Complaints

I’d wondered for a while why no one who had talked about scraping at conferences had actually demonstrated the procedure. It seemed to me to be one of the most sought-after skills for any investigative journalist.

Then I tried to do so myself in an impromptu session at the first Data Journalism Conference in Birmingham (#DJUK16) and found out why: it’s not as easy as it’s made to look.

To anyone new to data journalism, a scraper is as close to magic as you get with a spreadsheet and no wand.

Numbers and text on page after page after page after page just effortlessly start to appear neatly in a spreadsheet you can sort, filter and interrogate.

You can even leave the scraper running while you ring a contact or just make a cup of tea.

Scraping Heathrow’s Noise Complaints

I used a fairly rudimentary scraper to gather three years’ worth of noise complaint data from the Heathrow Airport website. With the third runway very much on the news agenda that week, I wanted to quickly get an idea of how much of an issue noise already was.

The result was this story, which was widely picked up by other outlets.

But how did I do it?

Complaints data for each day of the year was published at a separate URL. Compiling the spreadsheet by hand would have taken me hours, or even days.
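The article doesn’t show the actual scraper, so here is a minimal sketch of the general approach: loop over every date in the range, fetch that day’s page, pull out the complaint count, and append it to a spreadsheet. The URL pattern and the CSS class used to find the figure are assumptions for illustration, not Heathrow’s real page structure.

```python
import csv
from datetime import date, timedelta

import requests
from bs4 import BeautifulSoup

# Hypothetical URL pattern -- the real Heathrow pages may be laid out differently.
BASE_URL = "https://www.heathrow.com/noise/complaints/{day}"


def scrape_day(day):
    """Fetch one day's page and return the complaint count (assumed page layout)."""
    response = requests.get(BASE_URL.format(day=day.isoformat()), timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Assumption: the daily total sits in an element with class "complaint-total".
    cell = soup.find(class_="complaint-total")
    return int(cell.get_text(strip=True)) if cell else None


def scrape_range(start, end, outfile="noise_complaints.csv"):
    """Loop over every day in the range and write one row per day to a CSV."""
    with open(outfile, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["date", "complaints"])
        day = start
        while day <= end:
            writer.writerow([day.isoformat(), scrape_day(day)])
            day += timedelta(days=1)


if __name__ == "__main__":
    # Three years of daily pages, one request each.
    scrape_range(date(2013, 1, 1), date(2015, 12, 31))
```

Once running, a loop like this can be left to work through hundreds of pages on its own, which is the “make a cup of tea” step described above.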

