Saturday, 29 April 2017

Verification of records

Over the last couple of days, there have been busy threads on both the BWARS and NFBR Facebook groups concerning record verification. It all started with a question concerning the distribution of Anthophora plumipes, a relatively easily recognised solitary bee that flies in early spring and whose distribution is well-known.

The map that was produced on the NBN (Figure 1) showed a vastly wider distribution than the reliable data compiled by BWARS (Figure 2), which are not currently available through the NBN. The difference is obvious! The data in question were compiled by the ‘Great British Bee Count’ run by ‘Friends of the Earth’. In essence, the data are junk as they stand! Sadly, an awful lot of well-meaning people have been encouraged to participate in what seems to me to be little more than a publicity stunt. No thought seems to have been given to data verification or to the impact poor data can have on the work and outputs of long-established biological recording schemes.

Figure 1. Distribution of Anthophora plumipes according to data collected by the 'Great British Bee Count'

Figure 2. Distribution of Anthophora plumipes based on verified data compiled by BWARS.

In fairness to FoE, they do seem to have recognised the problem and I believe have linked up with Buglife to do something about it. Someone at Buglife recently contacted me to seek my views on whether the project should extend into hoverflies and whether I would be willing to verify the data. I said NO on both counts. Why? Surely I should be getting involved?

My rationale is simple. The Great British Bee Count swamped BWARS with utterly unreliable data and they were neither able nor willing to take on the job of verification; I don’t blame them as it is not the simple job people sometimes think. It is not just a question of getting a specialist to sit down and check a few photos; it is weeks or possibly months of work that is tedious and frustrating. Also, is it making best use of skills developed over several years or, in most cases, tens of years? My answer is emphatically NO.

I seem to recall that FoE’s rationale for starting the Great British Bee Count was that there were inadequate data on bee distribution and that it needed more effort from the general public. That was pretty naive. The issue should not start with data availability, although it is fair to say that coverage of most invertebrate taxa is much poorer than for vascular plants, birds or mammals. The big issue starts with the complexities of identification and the skills needed to become competent with exceedingly difficult identification. Acquiring these skills takes time and patience. I spent maybe a decade doing aculeate Hymenoptera, and still do the odd few specimens. I don’t consider myself an ‘expert’ but can make a reasonable job of separating out the majority of regularly encountered species when I sit down for a couple of days and work through a block of specimens within a single genus. Likewise, I now feel I can cope with Hoverfly identification from photos, but it has taken me ten years to reach that point (and I am still learning).

If we want bigger datasets, the starting point has therefore GOT to be growing skills. It is a very slow process but is the best use of specialist time if one looks at a long-term strategy to improve our knowledge. That is why the HRS has been running training courses for a decade or more. Thanks to OPAL grants we can take the courses to the places where they are needed, and we do so regularly. Even so, I reckon that at best we convert 5% of the people who do our courses into serious recorders; and of that cohort, probably only 10% will go on to have the necessary expertise to take on the task of data verification. On those figures, perhaps one course attendee in 200 ends up as a potential verifier. For most it is a hobby and one that has to fit in amongst a plethora of other activities and responsibilities. It is the rare individual who can devote time to developing the skills that are needed to take on the challenging task of ID from photographs and data verification.

So, the message is clear. If we want more data we have got to engage constructively with people who want to learn. FoE’s initiative will generate a lot of interest, and hopefully it will get a new generation of young people sufficiently enthused to take up the net and pooter, microscope and keys. That is the real benefit of the initiative; not the generation of a block of data that has already wasted a certain amount of specialist time dealing with the ensuing kerfuffle when the data they publish are so obviously inaccurate.

On a broader theme, it is also worth reflecting that there is a great deal of naivety about the potential of 'Citizen Science' to solve shortfalls in data. The Great British Bee Count has been helpful in showing the pitfalls of badly designed initiatives and the need for researchers to be very careful about the datasets that can be relied upon. It also shows why recording scheme organisers have to be vigilant in evaluating those records that they receive. There are lots of tests that can help to verify a record, but in the end the only sure way is to examine a specimen on a pin and place that specimen in a suitable repository for re-evaluation as and when necessary!

Monday, 17 April 2017

An early spring?

Each spring, observers often remark upon whether plants and animals are emerging earlier or later than in previous years. In the last 30 years, the general impression has been that springs are getting earlier and this impression is reinforced by the data. Amongst the hoverflies, there are several whose emergence has shifted by as much as two or three weeks, with some emerging at crazily early dates. But does the first reported emergence actually tell us very much?

In reality, a one-off event is inconsequential; it is far more important to look at the overall phenology of a species or a group of organisms. And, when one looks at phenology, it is not the first and last dates that are important; it is the degree to which peak numbers shift that tells the full story. Thus, analysts get very frustrated when recorders say ‘I’ll give you first and last dates but I cannot be bothered with anything else’. Without the supporting context, first and last dates are utterly meaningless.

Making sense of the data

Monitoring photographic data compiled by recorders who are relatively unselective is a great way of developing data on readily recognisable and useful indicators of seasonal change. The Hoverfly Recording Scheme has been doing this for around ten years, but it is only since the advent of the UK Hoverflies Facebook group that the volume of data has reached a level where the data are sufficiently robust to look at differences early in the year. In the past, one would have had to wait for a year or more for relevant data to arrive. Now, we have the data almost immediately to hand and can start to interpret the impressions of observers almost on a ‘real-time’ basis.

This year, the overall impression has been that spring got going very early. Was that really the case? I thought it was worth looking at a suite of indicator species to find out. Initially I compiled a long list of species that looked to have emerged earlier than usual. This was rapidly whittled down to just three species because many of the potentially early species are reported in relatively low numbers. They are not really very useful because the reports depend entirely upon chance. Records of widespread and abundant species provide a much more solid basis for analysis because many more people will see and report them.


For this analysis I took three species that fit my criteria of being abundant, easily recognisable and widely reported. They are: Epistrophe eligans, Leucozona lucorum and Dasysyrphus albostriatus.
As 2017 has only just started, the median date for these species cannot be calculated. The median date for early emergence can, however. So, I compiled a table of the first three dates for each species in 2014 to 2017. From this, I calculated the median date for each year for each species and then ranked them according to date of median emergence (Figure 1). This initial analysis shows that the early emergence dates for 2017 are indeed earlier than in previous years, with two out of three species ranking first and the other lying second in the rankings.
Figure 1. Tabulation of first to third dates for three early hoverflies
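
For anyone wanting to repeat the tabulation, the calculation can be sketched in a few lines of Python. This is only an illustration under assumptions: the dates below are made-up placeholders rather than scheme records, the function names `early_median` and `rank_years` are hypothetical, and ties between years are not handled. Dates are compared as (month, day) pairs so that different years can be set side by side.

```python
from datetime import date
from statistics import median_low

# Placeholder dates for ONE species (invented for illustration, not real records):
# the first three report dates in each year.
first_three = {
    2014: [date(2014, 4, 2), date(2014, 4, 5), date(2014, 4, 9)],
    2015: [date(2015, 4, 10), date(2015, 4, 12), date(2015, 4, 15)],
    2016: [date(2016, 4, 1), date(2016, 4, 6), date(2016, 4, 8)],
    2017: [date(2017, 3, 20), date(2017, 3, 25), date(2017, 3, 29)],
}


def early_median(dates):
    """Median calendar day, as (month, day), of a set of early dates.

    With three dates this is simply the middle one.
    """
    return median_low((d.month, d.day) for d in dates)


def rank_years(per_year):
    """Rank years by their median early date; rank 1 is the earliest year."""
    medians = {year: early_median(ds) for year, ds in per_year.items()}
    ordered = sorted(medians, key=medians.get)
    return {year: position + 1 for position, year in enumerate(ordered)}


ranks = rank_years(first_three)
for year in sorted(ranks):
    print(year, early_median(first_three[year]), "rank", ranks[year])
```

Repeating this for each of the three species gives one ranking per species, which is the kind of comparison the table is making.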

Can one get any other ideas on the degree to which this year is early when compared with previous years? My answer to this was to look at the spread of dates for each year, taking the median dates for the three species and creating a second median. This is probably statistically wrong, but as a crude analysis it helps to paint a picture (Figure 2). Taking these dates, a further median can be created between the earliest and latest medians of the combination of three species. This date is 31 March. One can look at the degree to which each combined median varies from the central date (Figure 3) which suggests that 2017 is possibly as much as 7 days earlier than the median for the previous 3 years.
Figure 2. Median dates for the combination of species over the period 2014 to 2017

Figure 3. Variation in median dates from the median for the period 2014 to 2017
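
The 'median of medians' step can be sketched in the same way. Again this is a rough illustration, not the exact working behind the figures: the per-species medians are invented placeholders, the names `combined_median` and `central_and_offsets` are hypothetical, and the 'central date' is taken here to be the midpoint between the earliest and latest yearly medians, which is one possible reading of the description above. A fixed non-leap reference year is assumed so that calendar days from different years sit on one axis.

```python
from datetime import date
from statistics import median_low

REF_YEAR = 2001  # arbitrary non-leap year, used only to line up (month, day) pairs

# Placeholder per-species median days (month, day) for each year, invented for
# illustration: one entry per species.
species_medians = {
    2014: [(4, 5), (4, 8), (4, 3)],
    2015: [(4, 12), (4, 9), (4, 14)],
    2016: [(4, 6), (4, 2), (4, 7)],
    2017: [(3, 25), (3, 28), (3, 22)],
}


def combined_median(medians):
    """Median of the per-species median days for a single year."""
    return median_low(medians)


def central_and_offsets(yearly):
    """Return the midpoint between the earliest and latest yearly medians,
    plus each year's offset from that midpoint in days (negative = earlier)."""
    days = {year: date(REF_YEAR, m, d) for year, (m, d) in yearly.items()}
    earliest, latest = min(days.values()), max(days.values())
    central = earliest + (latest - earliest) // 2
    offsets = {year: (day - central).days for year, day in days.items()}
    return central, offsets


yearly = {year: combined_median(m) for year, m in species_medians.items()}
central, offsets = central_and_offsets(yearly)
for year in sorted(offsets):
    print(year, yearly[year], f"{offsets[year]:+d} days from central date")
```

As the text says, stacking medians like this is statistically crude; the sketch simply makes the arithmetic explicit so that it can be rerun as more records arrive.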


Is this believable? Time will tell, but my general impression is that the species lists for 2017 contain animals that would not have been seen for at least two weeks further into the season even within the last ten years. If one compares with 20 years ago, the evidence is very strong that hoverfly emergence has advanced by several weeks and that the field season is getting longer. In some cases, it is likely that unless you get out early, some short-lived species will have come and gone before you have mobilised!