Balance of Data

Dec 1, 2013

Cordelia Webb, GiGL Volunteer

Armed with only a passion for numbers and a particular interest in wildlife statistics, I arrived at GiGL for my week’s work experience not entirely sure what I would be expected to do.

After settling in, I was presented with the number of records for flowering plants, bats, butterflies, mammals (excluding bats) and reptiles for open spaces in the London boroughs of Islington and Enfield. It was interesting to compare Islington, the borough with the least open green space in London, with Enfield, a more suburban borough. I went on to add data for Lambeth, another inner-city borough, but in the south of the city. I found that flowering plants, the biggest taxonomic group in GiGL’s database, are universally well recorded, while there are consistently far fewer data for reptiles and mammals. Comparing these, along with bats and butterflies, which occupy the middle ground of data coverage, made for some interesting results.

Having found that 81% of the records for Lambeth were for flowering plants compared to only 37% in Enfield, I decided to take an overview of the different taxonomic groups recorded in the boroughs’ open spaces and was interested to note that three groups (birds, flowering plants and butterflies) accounted for over 90% of all the records in all three boroughs.

Reptile data also showed some unexpected patterns. All 19 records for reptiles in Islington were from one site, while there were no records for reptiles in Lambeth open spaces at all. This led me to assess the sources of those data.

The GLA habitat and open space survey of the London boroughs was the source of 75% of the survey records for Lambeth that I examined. In Enfield, 83% of all the flowering plant records had come from this one survey.

Cordelia’s graph showing records from three boroughs.

This really showed how uneven data collection is across London. For Enfield, Islington and Lambeth, the vast majority of records that GiGL hold are for three taxonomic groups and many of these, at least for flowering plants, are from one source. It was a potent reminder that statistics are dependent on the quantity and quality of the information on which they are based. From my work, it would be very easy to state that 81% of all wildlife in Lambeth are flowering plants, and yet this is obviously far from the truth. Data collection is just as important as the analysis of the data itself, if not more so.

I had been very excited about coming into GiGL and was very sad when it ended, but I find myself even more inspired by wildlife statistics. Seeing the breadth of GiGL’s records has inspired me to go out to my own garden and park and street and start to record what I see, not only so that I can play with the data myself, but also so that I can help make London’s wildlife statistics more complete.

All that is left to say is a huge thank you to all at GiGL for giving me a chance to play with data for a whole week, and also for making it as enjoyable as they did. You might just see me come back asking for more!

Cordelia Webb is a high school student, and a maths and data enthusiast. She got in touch with GiGL to ask if she could join us for a week of work experience. Volunteer opportunities in the GiGL office are usually limited as we have restricted office space, but Cordelia’s free week corresponded perfectly with some staff annual leave, so we were very pleased to welcome her to the team.