An exchange with a member of the UK Hoverflies Facebook Page over the last
couple of days prompted me to put some thoughts down on the relative benefits
of different approaches to biological recording and the way in which it has
developed with the advent of digital photography.
When Stuart Ball and I took on the Hoverfly Recording Scheme (HRS) in 1991, the
data were mainly supplied on record cards that had to be entered into a
database. The Biological Records Centre (BRC) at Monks Wood did
that job, whilst the scheme organisers acted as the interface with recorders
and checked the data to ensure they made sense. Unfortunately, BRC were not
sufficiently well funded to keep pace with the volume of data and a big backlog
developed. In the case of the HRS this was approximately 2 cubic metres of
cards. In the five years after Stuart and I took the job on, we did the job of
data entry and gathering machine-readable data. I think we can say we were the
first scheme to do this. That effort generated 375,000 records but was
organised so that data management took place in the winter and we were able to
concentrate on field work in the summer.
A developing paradigm
Things changed with the advent of digital photography, improved internet access
and, of course, such advances as the WILDGuide, which made hoverflies accessible
to a much wider audience. Nevertheless, until around 2012 the vast bulk of our
activities centred upon traditional interactions with recorders, many of whom
had been contributing for 30-40 years and were personally known to us. Over 50%
of incoming data came from fewer than 25 people.
The FB
page generates about 25,000 records per year and in addition has helped to
develop about a dozen people who now do their own IDs and submit data as
spreadsheets - maybe a further 5,000 records. Historically, the HRS attracted
around 20-25,000 records a year through traditional routes - not least the
maybe 2,000 records a year that I generated myself from fieldwork. That
traditional resource is now advancing in age - most of the top 25 recorders
(who have contributed about 50% of all records) have been involved for over 20
years and several for over 40 years. In the last few years we have lost two, and a
third is far from well and unable to record. That means we have got to grow a new
generation. We are doing this through two avenues:
- Regular training events that Stuart and I run across the length and breadth of the UK (from Lerwick and Kirkwall to Exeter, Studland and West Sussex). We run between 5 and 8 such courses per year, but probably only generate about 1 new recorder like our old guard from about every 50 trainees; so perhaps one per year.
- Interaction with recorders on the web. Facebook has proven to be exceptionally effective in this respect. FB not only helps with ID skills but also helps to develop the wider recording skills and an understanding of what data are needed. That said, we must also accept that a very high number of contributors are (or were) first and foremost photographers who wanted IDs for their shots. Importantly, the spread of involvement has widened considerably and this has made a huge difference to the recording scheme.
New challenges
Working with photographic recording brings with it
a completely new set of challenges, not least expectations. Many contributors
happily accept that not all photos can be identified, but occasionally they
express frustration. I withdrew from one forum after getting abuse because I
was not prepared to put names to most photos of the genus Syrphus, which is far from straightforward, even from specimens.
The problem of Syrphus identification
crops up again from time to time and is frustrating for everybody, especially
when it is also one of the most frequently photographed genera. We can only do
what is possible, and I'm afraid that there is also an issue of best use of
resources.
I take the view that it is unwise to call oneself
an 'expert' - one is setting oneself up for a fall. So I tend to use the term
specialist and accept that I too have a great deal to learn. The difference
between me and the relative novice is that I am acutely aware of the pitfalls,
and have probably fallen into a good many holes of my own making! That is how
we learn. But, using the term 'expert', the number available to provide
identifications is extremely small - well below 20 across the country and
probably fewer than 10 who can make a reliable job of it.
Data harvesting
The big question then arises as to how data should
be harvested. Should one extract data directly from FB posts or should one
direct contributors to iRecord? I have probably built a rod for my own back by
harvesting directly from FB, but I do have sound reasons for doing this:
- A very substantial number of FB members started either as photographers who wish to know what their subject matter is, or who enjoy sharing their experiences with others who are interested from the perspective of getting a good shot. As such contributing to iRecord or another medium is not their highest priority - we would lose a great deal of data if I did not extract from this site.
- There are considerable advantages to compiling a dataset that has been checked by a small group of the more reliable specialists. This improves confidence that the data are robust, provided one does not simply discard partially identified records, which are needed to provide perspective; hence I extract all records.
- I extract a great deal of additional data that often gets overlooked by recorders: the gender of the animal, morphs, abundance, behaviour and flower visits (not the plant the animal was sitting on). It is a comprehensive dataset.
- I think the page would be a far less effective resource without the feedback that I manage to post on trends in species abundance or record numbers. If we are to generate a new cohort of recorders (and hopefully replacements for the existing team) then we must educate and mentor people.
- The impact of FB can be seen from the attached graph (it will be bigger still in 2016, as we are dealing with about 50% more records than in 2015).
Figure 1. Numbers of records held within the HRS database, separated according to origin: NBN data are held separate to the main HRS datase
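As an illustration only, the kind of data line described in the list above could be sketched in Python as follows. The class name, field names and example values are assumptions made for this sketch and are not the actual HRS spreadsheet layout; the one point it tries to show is that partially identified records (such as 'Syrphus sp.') are kept alongside full identifications rather than discarded.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class ExtractedRecord:
    """One record extracted from a photo post (hypothetical field names)."""
    taxon: str                            # species, or a partial ID such as "Syrphus sp."
    date: str                             # date of the observation
    location: str                         # site name or grid reference as posted
    recorder: str
    source_url: str                       # link back to the original post
    sex: Optional[str] = None             # None when it cannot be determined
    morph: Optional[str] = None
    abundance: Optional[int] = None
    behaviour: Optional[str] = None
    flower_visited: Optional[str] = None  # only when the animal was at the flower

    @property
    def identified_to_species(self) -> bool:
        # Assumption for this sketch: partial IDs end in "sp." or "agg."
        return not self.taxon.endswith(("sp.", "agg."))


def to_csv(records):
    """Write every record, including partial identifications, as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(ExtractedRecord)]
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))  # missing values become empty cells
    return buffer.getvalue()


# Invented example values, for illustration only.
example = [
    ExtractedRecord("Episyrphus balteatus", "2016-07-14", "Site A", "A. Recorder",
                    "https://example.org/post/1", sex="female",
                    flower_visited="Rubus fruticosus"),
    ExtractedRecord("Syrphus sp.", "2016-07-14", "Site A", "A. Recorder",
                    "https://example.org/post/2"),
]
print(to_csv(example))
```

Keeping the optional fields empty rather than guessing them means that a record with no flower visit logged is not confused with one where the animal was merely sitting on the plant.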
But, what about iRecord?
This was
built as part of a wider initiative to increase biological recording activity.
It has an admirable objective, but starts from the principle that recording
scheme organisers are there to validate records. In theory this is the case,
but most RS organisers took the job on many years ago when things were less
complicated - they maintained a database, did their own recording and gathered
in records from a relatively small cohort of reliable recorders, most of whom
they knew individually. iRecord is very impersonal and photo ID is an art that
has to be developed - not everybody is willing to do this and relatively few RS
organisers have signed up to iRecord - much to the frustration of the Country
Agencies who want the data.
- There are quite a few contributors who post a set of photos that are all of different animals that they lump under the same species name - that has to be disentangled.
- Records often lack detail - when I extract data from this page I also log the gender, flower visits, behaviour etc. Posts on FB often lack this or say 'on rose' when they mean 'sitting on the leaves of a rose bush' and not 'visiting the flower of a rose' - there is a huge difference in the value of such data and as there is interest in pollinators my approach is providing a far more robust dataset.
- A fair few records are misidentified - there is one regular contributor who rarely achieves 50% correct and seems not to have learned at all in the past 2 years.
- Where records are not accompanied by photos one gets no real feel for the actual skill of the recorder. This is illustrated by people whose data cover Syrphus - lots of records without photos but the odd one with a photo that clearly cannot be taken to species (e.g. males of poor resolution). At that point one must be wary of the overall quality of the data from that person. These have to be dealt with - iRecord is not a particularly good interactive medium and FB is far better in this respect.
- The dataset that emerges is a mish-mash of occasional records and records from one or two more advanced recorders, so there is not much chance of advancing the science of recording. It is compounded by problems with individual recorders going back through their diaries and adding records that they submitted to the RS many years ago as a record card and that I have already computerised - so I am doing a lot of repeat work for relatively little return.
Where do we go from here?
We have
seen a paradigm shift in the way biological recording works. The internet and
digital photography have changed the relationship between recording schemes and
contributors. It has brought a plethora of benefits, but has also exposed
significant weaknesses in the system. The most worrying weakness is the
relatively small number of people with sufficient experience to deal with
identification, coupled with raised expectations that they will provide their
time in line with demand. Unfortunately, there are limits to what they can do
or are willing to do. Some are not
particularly computer literate, and do not have the spare time to respond in
line with the immediacy of modern life. Others deal with groups of organisms
that require dissection or high magnification and checking numerous characters
that are difficult or impossible to depict in photographs. Others still just
don't want their lives ruled by a computer: a comment that resonates is 'I like
the fieldwork but I don't want to become involved in administration'.
Thus, my
view is that we are witnessing a turning point in biological recording. If we
want to use interactive media, then we have got to grow capacity to respond to
demand. My guess is that the resident specialists on the HRS FB page jointly
contribute over 2,000 hours a year to this one medium. It has achieved a huge
amount but there have to be limits to what can be done with existing capacity.
So, we must grow new capacity - which of course depends upon the same limited
cohort of specialists! We will get there I think, but we must also ask for
expectations to be tempered:
We won't
ever manage to identify everything posted as a photograph, and we probably will
never fulfil all the aspirations of data users. Nevertheless, the UK is in a
far better position than anywhere else in the world, with the possible
exception of the Netherlands, where biological recording is also well served by
the non-vocational ethos.
Wow 😲
As the person whose comments sparked this column, I am torn between the guilt of seeing Roger spend more hours of the night writing this (please look after your health and personal needs, Roger) and hoping that raising some questions has helped to clarify and make explicit some aspects of Roger's philosophy.
Of course, as an amateur contributor, one can feel disappointed when a set of clear images, which may have taken considerable time to capture, select, crop and post, can only be identified to genus or aggregate level.
That disappointment/frustration could be mitigated if the message were delivered differently. If one of the key uses of the data is to prove the importance of hoverflies as pollinators, then I presume that the exact species is less important (in that respect).
If instead of the response being just 'Syrphus sp.', it were 'Please record as Syrphus sp.' or 'Thanks - this will be recorded as Syrphus sp', the observer would still feel they were contributing to scientific knowledge. Otherwise it feels like one has wasted the time and there is no value to the submission.
I don't think there is any way to have a pre-typed range of responses for use on Facebook (except cut and paste from a separate document), so there would be a little more to type, which multiplied over many records.... With iRecord, there is a range of verification responses built in and these could be extended if everyone agreed and resource was available.
That is a fair comment Paul, but I'm afraid that the team also has a problem with the amount of time we put in: at the moment it is not unusual to be working to well after midnight and then starting again before 8 in the morning. Sometimes we have to limit what we say just to keep on top of the workload - which gets to the point of impossibility. There is precious little chance of me or others looking after our health if we are then expected to expand what we do still further. Sorry - I will see what I can do but I won't promise, as it can be a nightmare - one deals with one post only to return to the list and find that three more have arrived - often this continues well into the early hours. It is like trying to sprint for five hours solid.
It seems to me that there is also an issue of the preservation of the data. There is no guarantee that FB records will be accessible or retained in a useable form into the medium to long-term future, but in contrast iRecord is intended as a long-term data archive.
Not so. Each record has been extracted and turned into a data line on a spreadsheet that is then uploaded into the HRS database, which then goes onto the NBN. The URL is retained within this record. True, the post may disappear from FB, but this is no different to a museum beetle eating a specimen or the loved ones of the deceased entomologist putting his/her collection in a skip or on the bonfire (which has happened many times).