GiGL holds a lot of data. But what exactly does all that information represent? Matt Davies, GiGL’s Data Manager, is beginning to find out what lies beneath.

Over the last few years the GiGL database has grown to become a considerable resource with approximately 1.3 million species records. As our data holdings have grown, so too has our technical expertise and the potential to use species and habitat data alongside social, geographic and other data. GiGL data are now used for ever more complex projects, from creating strategic planning GIS models such as the Capital Woodlands project to assessing BAP habitat condition.

It’s great to be able to contribute to such vital projects, but doing so is not without its complexities. Ever more diverse demands on the data mean we have to fully understand what constitutes the data that underpin these projects. The whole of the GiGL Partnership needs to ensure that the database accurately reflects what’s on the ground by identifying gaps, mobilising appropriate datasets and conducting new surveys.

The time has come for us to assess and better understand what is contained within our database. The time has come for a data audit. Over the next few months GiGL will be contacting data providers with detailed assessments of the species information that is available for their area or taxon of interest. For now, here is a sneak preview.

Geographic Coverage

It is clear from even the briefest glance at the bubble diagram that the geographic coverage of the database varies considerably from borough to borough. I think this reflects differences in recorder effort rather than actual differences in biodiversity. Richmond and Greenwich are two of the better recorded boroughs, largely because The Royal Parks has mobilised data via the GiGL Royal Parks Officer.

Taxonomic Coverage

On closer inspection, the bubble diagram also reveals that the taxonomic breakdown for each borough varies considerably. The dataset for the majority of boroughs is dominated by plant records, thanks to the GLA habitat survey. This is particularly evident in Brent, where the Barn Hill Conservation Group have submitted an incredible 59,540 records from meadow surveys. In contrast, Wandsworth, Richmond and Camden all have significant numbers of records for other taxonomic groups, reflecting the effort from borough and Royal Parks staff.

The relatively high proportion of butterfly records in north west London boroughs is thanks to the Middlesex branch of Butterfly Conservation. Taxonomic coverage also varies because only some of the London Natural History Society recorders (including those recording plants, birds, fungi, lichens and Lepidoptera) have so far contributed data to GiGL.

Figure: Treemap showing the proportion of records provided by each contributor and how many of those are protected species records.

Protected Species & Data Ownership

GiGL data continue to be used by consultants and planners assessing planning applications. Of particular interest are protected species that carry weight in the planning system. Protected species account for 7.5% of the overall database, some 94,455 records. The treemap shows that these are provided to GiGL in varying quantities by each group of data providers. While the GLA is the largest data provider overall, contributing 505,056 records, less than 1% of those records are of protected species. By comparison, volunteer sources contribute 392,777 records, 10% of which are of protected species. The difference clearly reflects the original motivation for the surveys conducted by the two groups. The data we hold have been collected for many purposes, which often differ from their current uses.
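For readers who want to check the arithmetic behind these figures, here is a minimal sketch in Python. It uses only the numbers quoted above. The variable and function names are illustrative and are not part of any GiGL system, and because the quoted percentages are rounded, the derived counts are approximate.

```python
# Figures quoted in the article.
PROTECTED_RECORDS = 94_455         # protected species records in the database
PROTECTED_SHARE = 0.075            # quoted as 7.5% of the overall database
GLA_RECORDS = 505_056              # records contributed by the GLA
VOLUNTEER_RECORDS = 392_777        # records from volunteer sources
VOLUNTEER_PROTECTED_SHARE = 0.10   # quoted as 10% (rounded)
GLA_PROTECTED_SHARE_LIMIT = 0.01   # quoted as "less than 1%"


def implied_total(part: int, share: float) -> int:
    """Estimate the size of the whole from a part and its quoted share."""
    return round(part / share)


def share_of(part: int, whole: int) -> float:
    """Return part as a fraction of whole."""
    return part / whole


# Database size implied by the protected species figures (an estimate).
total_records = implied_total(PROTECTED_RECORDS, PROTECTED_SHARE)

# Each provider group's share of that implied total.
gla_share = share_of(GLA_RECORDS, total_records)
volunteer_share = share_of(VOLUNTEER_RECORDS, total_records)

# Approximate protected species counts for each group.
volunteer_protected = round(VOLUNTEER_RECORDS * VOLUNTEER_PROTECTED_SHARE)
gla_protected_limit = GLA_RECORDS * GLA_PROTECTED_SHARE_LIMIT  # upper bound

print(f"Implied database size: {total_records:,}")
print(f"GLA share of database: {gla_share:.1%}")
print(f"Volunteer share of database: {volunteer_share:.1%}")
print(f"Volunteer protected records (approx.): {volunteer_protected:,}")
print(f"GLA protected records: fewer than {gla_protected_limit:,.0f}")
```

The implied total of about 1.26 million records is consistent with the "approximately 1.3 million" quoted earlier. On these assumptions, the GLA provides roughly four in ten of all records and volunteers roughly three in ten, yet volunteers supply several times more protected species records than the GLA.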

Another noteworthy fact shown by the treemap is the relatively small size of the ‘other’ category, given that it includes all records from Natural England, The Environment Agency and London Wildlife Trust, amongst others. Data provided by boroughs make up only 9% of the overall database, but this masks the fact that some boroughs, such as Wandsworth, are much more proactive at mobilising records than others. It also masks the fact that some datasets provided by boroughs originate from local nature enthusiasts or volunteers. However, the majority of the records from the general public have been collected via surveys run by London Wildlife Trust, People’s Trust for Endangered Species and some of the boroughs.

Temporal Coverage

The doughnut diagram shows that 78% of records are from the last 10 years. While this is good for creating a baseline of biodiversity in London, it’s not so useful for longer-term trend analysis. It is, nonetheless, interesting to note that 724 records are pre-1900, many of which were recorded by Charles Darwin.

We need to understand the data resource that informs both routine planning application reports and more complex uses of data, such as habitat condition assessment. Some of the limits of our data are already evident from this brief overview. I hope this has stimulated your thoughts about how accurately GiGL data reflect biodiversity in your area, taxonomic group or timescale of interest, and how the GiGL Partnership might plug the holes. I also hope that the new data visualisation techniques demonstrated here will allow us to communicate more effectively with all data contributors, and that everyone in the GiGL Partnership will engage with the data audit to help improve the overall resource.

If you would like to discuss mobilising your data, please contact Matt, GiGL Data Manager, at matt@gigl.org.uk.