It is all very well me moaning – that gets nobody anywhere
and generates a feeling of ‘what a miserable old git’. So, if one is going to
moan, one really must do something positive to justify the moans. Hopefully the
effort I make for the HRS goes some way to addressing this imbalance!
The key to resolving the real (or perceived) issues with
increasing demands to generate more data is to try to understand who wants the
data and the uses to which they wish to put them. At a strategic level it seems to me that
there are a small number of critical uses:
- An ongoing audit of the composition of the UK’s wildlife assets – what is new and what might have been lost.
- Where is that wildlife? Data are needed at a variety of scales depending upon uses; these may range from 10km resolution for national maps to 1km, 100m or even 1m resolution for more local usage.
- Where are the conservation priorities? If based on IUCN criteria, what is needed is data that are sufficiently refined to inform analysis for Red Lists, both national and local. However, we also have BAP and Section 41/42 species to consider.
- Why is our wildlife asset expanding/declining? Use of data to inform the debate about how wildlife assets are changing in response to a wide variety of environmental factors.
- Reporting on the status of a small sub-set of our wildlife for national/international audit and site safeguard purposes.
There may be other obvious reasons for needing data, but
these five bullet points seem to me to encapsulate the main issues. There are,
within those headlines, several common themes so the headline reasons might be
trimmed further.
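To illustrate the point about scale in the second bullet, here is a minimal sketch of how the same records look at different resolutions. Everything in it is an assumption made for illustration: the records are invented, the coordinates are simply eastings and northings in metres, and the function names `grid_cell` and `occupied_squares` are hypothetical rather than taken from any recording package.

```python
from collections import defaultdict


def grid_cell(easting, northing, resolution_m):
    """Return the south-west corner of the square of side resolution_m containing the point."""
    return (easting // resolution_m * resolution_m,
            northing // resolution_m * resolution_m)


def occupied_squares(records, resolution_m):
    """Count the distinct squares occupied by each species at the chosen resolution."""
    squares = defaultdict(set)
    for species, easting, northing in records:
        squares[species].add(grid_cell(easting, northing, resolution_m))
    return {species: len(cells) for species, cells in squares.items()}


# Invented records: (species, easting in metres, northing in metres)
records = [
    ("Species A", 512345, 198765),
    ("Species A", 512399, 198710),
    ("Species A", 518020, 191250),
    ("Species B", 512350, 198770),
]

for resolution_m in (10_000, 1_000, 100, 1):
    print(resolution_m, occupied_squares(records, resolution_m))
```

The three records of Species A fall in one 10km square but in two 1km squares, so a record captured only at 10km resolution can feed a national map but cannot be refined later for local use, whereas a precisely located record can always be coarsened.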
Data needs
If there are defined reasons for needing the data, then the
next stage is to determine what data are needed. Do we simply want any old
records, or do we want something more structured? Well, ideally, science would
be best served by data collected under stratified random sampling such as that employed
in the Breeding Bird Survey. Data collected according to other set protocols
such as those employed by WeBS counts, the Rothamsted Moth Survey or Butterfly
Transects are also very powerful.
The main drawback of these structured programmes is that
they may overlook a proportion of our wildlife, so we also need something else:
a way of ensuring that highly localised and specialised species are recorded on
at least an intermittent basis. Structured programmes address a tiny fraction
of the 50,000+ organisms known from the UK; so, an alternative is needed. This
is where the use of ad-hoc or ‘opportunistic’ data comes into play. Such data
might include records of protected species or casual sightings from gardens,
but they can (and often do) involve something more useful.
So, what makes data really useful?
Meaningful interpretation relies on data that have been
accumulated over a long timescale. A single record of an animal or plant is
meaningless unless there is something to place it into context. For example, a
record of a rare beetle from a given site might or might not imply a breeding
population. A single record of the same beetle from a site where it has
been recorded on 100 previous occasions suggests that there has been a resident
population and that this population is still present (to some degree).
So, for data to be robust and useful they need to be part of a much bigger picture. That
picture might be created by a single person visiting a single site for a given
timeframe; or it might be the same site visited by multiple people over the
same timeframe. Crucially, if everybody who visits the site records a full list
of what they see and can reliably identify, the sum of those data becomes very
powerful. This is the principle that underpins BirdTrack, for example. It can
be used to look at trends, when combined with data from other locations too. We
have thus established two further critical points:
- To be most useful, submission of complete species lists needs to be encouraged – rather than just the report of a single supposedly rare species. The wider list provides the local and longer-term context.
- Combinations of complete lists by different recorders can be used to investigate trends. The power of the data increases with the numbers of recorders making submissions, so the number of recorders submitting full lists becomes a critical differentiator.
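As a rough illustration of why complete lists matter, the sketch below calculates a simple reporting rate: the proportion of complete lists in each year that include a given species. It is only a sketch under stated assumptions. The lists, recorder names and species are invented, `reporting_rate` is a hypothetical helper, and this is not a description of how BirdTrack itself analyses its data.

```python
from collections import defaultdict

# Invented complete lists: (year, recorder, set of species recorded on that visit)
complete_lists = [
    (2016, "recorder_1", {"Species A", "Species B"}),
    (2016, "recorder_2", {"Species A"}),
    (2017, "recorder_1", {"Species A", "Species B"}),
    (2017, "recorder_2", {"Species B"}),
    (2017, "recorder_3", {"Species B", "Species C"}),
]


def reporting_rate(lists, species):
    """Return, for each year, the proportion of complete lists that include the species."""
    totals = defaultdict(int)  # number of complete lists submitted in each year
    hits = defaultdict(int)    # number of those lists that contain the species
    for year, _recorder, seen in lists:
        totals[year] += 1
        if species in seen:
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


for name in ("Species A", "Species B"):
    print(name, reporting_rate(complete_lists, name))
```

The calculation only works because each list is assumed to be complete: a list that omits a species then counts as evidence that it was not seen. A stand-alone record of a single rarity supplies no such denominator, and the more recorders who submit full lists, the less the rate for any one year depends on a single person.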
We then reach the issue of composition of species lists. If
lists are compiled by recorders with a limited grasp of a given group, they
will inevitably be short. Much longer lists will be supplied by specialists in
that group, and the combined lists provided by those specialists will be
considerably more powerful because they provide so much more contextual
information.
Real data needs
We have therefore arrived at the critical stage in
developing a strategic approach to data collection. What we need is for all
recorders to record everything they see, but for them also to develop
sufficient specialism to provide important context. If a dataset comprises
records compiled by generalists then it will be heavily skewed towards
the common and easily recognisable. If, on the other hand, the data are supplied
by specialists who cannot be bothered with the common and easily identified,
then there will be a different skew. Neither is helpful!
We need, however, to inject an element of practical reality into this
analysis. There are currently lots of generalists and relatively few
specialists, so the data are inevitably skewed. We need to change this
imbalance by focussing on why there are so few specialists and what is
preventing people from deepening and broadening their coverage. I submit that at
least part of the problem lies in 30+ years of the mantra ‘take nothing but
photographs, leave nothing but footprints’. There is a new generation that is
naturally resistant to taking specimens (quite understandably). It will be a
brave leader to take on this challenge, but without such an approach, there
will always be an imbalance in the taxonomic coverage.
There is hope, however. What is needed is a higher profile
effort to show how data can be used and to show the power of mass data
collection. Over the last few years I have tried to do this in my blog, but one
person will have little effect unless they are influential. So, if we want a ‘Citizen
Science Revolution’ we really need to educate the potential contributors, so
they understand what is important and what is not important. Those that get the
general message ‘more volume’ will help to address numerical targets, whilst
those who get the message ‘depth and breadth’ will help to make a real
difference.
The trick is, how to shift effort towards better structured
data assembly without alienating those who want to contribute but don’t want to
become dedicated recorders. One obvious way is to promote the adoption of a
local ‘patch’ and to encourage regular/constant effort at differing levels of
intensity.
Hi Roger, while looking through citations of data on GBIF I came across this paper: https://academic.oup.com/sysbio/advance-article-abstract/doi/10.1093/sysbio/syy044/5034972?redirectedFrom=fulltext
Might be of interest. Unfortunately, I don't have an Oxford account to read it in full. You may have more luck..?