Monday, 22 May 2017

Feedback on noteworthy records

A thread posted on the NFBR Facebook group concerning noteworthy records provides food for thought. The important line is: 'You collect a species in a group with which you are not very familiar. Straight away you wonder - is it an interesting record? Would anyone be interested in hearing about it?' Responses to the thread frequently emphasised that all records are interesting, but of course the main point is whether the level of interest is enough to encourage an outsider to contribute.

The occasional record of something interesting may be an incentive to do more, but that really depends upon feedback. So, the critical issue is for Recording Schemes to make sure that contributors get regular general feedback, and individual feedback whether they find something unusual or the organism is fairly commonplace.

Feedback I have had from recorders suggests that they are discouraged when we, the specialists, don't put enough effort into giving the feedback that they want. The common complaint concerns iRecord where lots of posts do not get looked at because the scheme organiser(s) is/are not willing to participate or has/have not got the capacity to do verification on a daily basis. I am very guilty of that! If pictures are posted on Facebook pages, the common complaint is that nobody looks at them or, in my case, that I don't spend adequate time explaining why something is, or is not, what its contributor thinks it might be. One contributor took the trouble to express their frustrations (paraphrased): 'I put a lot of work into cropping these photographs and you could not be bothered to comment more than "Syrphus sp."'

We need to think about this. In the case of that particular contributor, I now try to write a bit more on each post. It adds to my time commitment, whilst not necessarily improving the numbers or quality of records entering the recording scheme!

The challenge we face is, therefore, how to provide the necessary feedback to enthuse potential new specialists, whilst avoiding burn-out amongst the existing specialists. I am afraid there is no simple answer. What we can say is that modern communication has raised expectations. Recording Schemes need to provide updated maps almost in 'real time'. We don't have the luxury of working for several years to produce an atlas – we really need that atlas to be on-line and regularly updated. Likewise, we need to engage on interactive media and make ourselves available to provide advice on a daily or even hourly basis.

These demands are a different paradigm to the days when a scheme organiser had the summer out in the field, spent the winter checking specimens and corresponded with those contributors that sent in record cards or sought help with identification of problem specimens. They were far gentler times. Today, organisers of schemes that generate large volumes of records must expect to spend several hours a day providing advice and verifying records. I don't notice a great rush of people who have the skills to do this and are willing to take on the job. I do see a gradual growth in skills and the development of a small cohort of people who will be able to take on aspects of mentoring that are essential. Mentoring is a skill in itself and we need to avoid pitfalls such as elitism or dogmatism.

This takes me to the nub of the problem. Where there are lots of capable specialists who are prepared to spend their time helping with ID and providing feedback, it may be possible to do more to provide the necessary encouragement. In many cases, however, there are very small numbers of people capable of providing technical advice and consequently the demand often outstrips the capacity to provide. Increasing interest in a group of organisms does not necessarily lead to a commensurate growth of specialists – that takes many years!

In the case of hoverflies there is no chance of providing a specialist County Recorder for all counties. I recall that when I tried to encourage one very capable recorder to take a more prominent role in the Hoverfly Recording Scheme the response was: 'I like the fieldwork but don't want to take on the administration'. Wise man! But we do need people who are willing (and able) to take on the administration.

Somehow we must cross this hurdle, but we must do so without sacrificing quality.

Sunday, 7 May 2017

It is worth reflecting on progress

Stuart Ball and I took on the Hoverfly Recording Scheme in 1991 and issued our first progress report in March 1992. At that time we had a huge pile of record cards to digitise and there were innumerable datasets to trawl and incorporate into a single entity. So, the first coverage maps comprised a small part of the data: around 100,000 records. The maps we published (Maps 1 & 2) showed the scale of the job to come. Not only were there around 2 cubic metres of record cards to digitise, but the records all had to be checked and erroneous elements resolved. As the early maps show, grid reference errors were commonplace (and still are!).
Maps 1 & 2. Maps from the first newsletter in 1992: overall coverage at 10km resolution and the numbers of species per 10km square using a graduated scale.

The most recent coverage maps that I have access to are for the middle of 2015. They provide a clear picture of the progress that has been made. At this point, the dataset comprised around 940,000 records, most grid reference errors had been resolved and the maps look pretty respectable (Maps 3 & 4).

Maps 3 & 4. Overall coverage at 10km resolution and graduated scale for the numbers of species in 2015. The overall coverage map is graduated with clear circles and grey circles reflecting older records. Black spots represent the most recent records.

Both sets of maps provide important context for the achievements of the UK Hoverflies Facebook group in 2016. The final pair of maps show the coverage achieved from Facebook, iSpot and iRecord in 2016. These data and a substantial number of other datasets have yet to be completely incorporated into the database, so I cannot show precisely what was achieved in 2016. Nevertheless, these maps show some important achievements; not least how much new coverage has been achieved in south Wales (Maps 5 & 6).

Maps 5 & 6. Overall coverage and numbers of species recorded from Facebook, iSpot and Flickr in 2016.

Each year there will be variation in coverage. New recorders will emerge and others will cease activity. Over time the maps will be filled in. So, I think we should look very positively on what was achieved in 2016. The maps for 2017 will be different, and when all of the 2016 data are incorporated I am sure we will see much more extensive coverage. The big challenge remains the need to recruit active recorders in the least well-covered areas.

Do we need a new approach to Hoverfly ID?

For a long while, I have felt that we are missing something with hoverfly ID in the UK, and especially in the case of live animal photography.

Stubbs and Falk has served us very well for over 30 years, but it comes from another era in which it was assumed that species identification would be based on specimens. Today, that is no longer the case, as photography has advanced and the number of people recording hoverflies has changed out of all expectations. The HRS was one of the first recording schemes to recognise the potential of photography and the Resident Team on the Facebook page have been active in this field for perhaps ten years. As such, we are pretty experienced. Nevertheless, we regularly encounter good photographs that we cannot take to species. Sometimes we are assisted by Gerard Pennards, whom we really ought to consider as a member of the team. Gerard brings a much-needed and valued European perspective.

This experience brings me to the nub of the challenge we face. Our current understanding of the UK fauna is based on a sub-sample of the European fauna. Although in places the keys do take account of possible European species, they don't tackle the fundamental problem of how to distinguish tricky photographs where you cannot change the angles to see critical characters. Sometimes we need to see a bit more, and sometimes, perhaps, there are perfectly good features that we don't recognise because our keys are geared to a restricted fauna. European keys often add in couplets to sort out overlaps that don't occur in the UK. This was brought home to me this weekend with a shot of a Parasyrphus that might be either P. malinellus or P. lineola. The UK key simply concentrates on leg colour but van Veen makes a further separation based on antennal colour. That could be very useful when checking photographs. There are lots of other places where such splits might be helpful.

I therefore think we need to be working towards a new approach to identification of hoverflies in Britain. The current guides serve us well, but we might just do that little bit better and might find ways of improving the level of identification from photographs. That is not to say that we don't need taxonomy based on specimens; clearly there are many places where we cannot avoid the need for specimens. BUT, I think we might just enter a new paradigm if we start to tackle the question of ID from a live animal and photographic perspective.

All I need now is the time to think out the approach! I think it is the sort of thing that could be a really nice opportunity for collaboration between UK and European specialists. There is no doubt the UK would learn a great deal from our European counterparts, but we would also bring important experience to the table.

Tuesday, 2 May 2017

Data verification - how is it done?

A post on the NFBR Facebook group today raised an interesting question: 'Do BWARS, or any other recording scheme or BRC, publish any guidance on verification for their species group?'

I have never tackled this problem directly but have given some guidance to novices about self-verification after making an identification. Once you have made your diagnosis, it is important to check some basic data:
  • Does the animal conform to the detailed descriptions given in major monographs such as Stubbs & Falk or van Veen?
  • Does the species occur in your general vicinity? It may not have been previously recorded from your 10km square but if you live in NW Scotland and the determination you have made is of a species that is confined to southern England, you can be pretty sure your ID is either wrong or the record is highly aberrant. Range is important!
  • Does the date recorded coincide with the core phenology range? If not, there must be doubt, although there are exceptions.
  • Is the species common or rare? Bear in mind that rarity does not mean that you won't find such an animal, but the vast majority of records are of common species, and many rare species have peculiar habitat associations. If the species is rare, make extra checks and seek the view of a specialist.
  • Can a firm identification be made using the medium you have chosen? If you are basing ID on photographs, you need to be aware that a lot of hoverflies can only be identified with certainty using microscopic characters that either don't show in photos or are on a part of the animal that cannot be accessed using conventional live-animal techniques (e.g. the male genitalia).
  • Are there other species with which it might be confused?
  • Does the habitat match the descriptions in the guide book?
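For anyone who keeps their records electronically, some of these checks can be written down as a simple routine. What follows is only a minimal sketch: the `SpeciesProfile` fields, the region labels and the example species are all invented for illustration and are not taken from any recording scheme's database. A warning is a prompt to look again, not a rejection of the record.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SpeciesProfile:
    """Hypothetical reference data for one species (not real scheme data)."""
    name: str
    known_regions: set        # broad regions with previous records
    flight_months: range      # core phenology, as month numbers
    is_rare: bool
    needs_microscope: bool    # critical characters not visible in photos


def self_check(profile, region, seen_on, from_photo):
    """Return a list of warnings; an empty list means no check was tripped."""
    warnings = []
    if region not in profile.known_regions:
        warnings.append("outside known range")
    if seen_on.month not in profile.flight_months:
        warnings.append("outside core flight period")
    if profile.is_rare:
        warnings.append("rare species: make extra checks and ask a specialist")
    if from_photo and profile.needs_microscope:
        warnings.append("cannot be confirmed from a photograph")
    return warnings


# Invented species and regions, purely for illustration.
example = SpeciesProfile(
    name="Examplus fictus",
    known_regions={"SE England", "SW England"},
    flight_months=range(5, 9),   # May to August inclusive
    is_rare=False,
    needs_microscope=True,
)

for warning in self_check(example, "NW Scotland", date(2017, 3, 14), from_photo=True):
    print(warning)
```

The remaining questions in the list (confusion species, habitat, fit to the monograph descriptions) need judgement and a good reference book, so they are deliberately left out of the sketch.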

So, how do I verify records?

If a record is accompanied by a photograph I can start by saying:
  • Is the animal sufficiently well depicted to make an ID? We do see a fair selection of photos that are at low resolution and cannot be used to look at key features; many photos are also from angles that don't capture the critical features, and some are obscured by glare and colour casts that give rise to uncertainty.
  • If it is potentially identifiable, can I conclude the genus? If so:
  • Can I take this animal to the correct couplet in the key? If so:
  • Can I arrive at a firm conclusion either at specific level or for a species pair (e.g. Platycheirus scutatus sl. or s.s.)?
  • If the affirmative can be given for the above then I can either verify the diagnosis made by the recorder or make my own diagnosis.
Where records are not accompanied by a photograph (or supported by a specimen) I follow a pretty standard routine:
  • Is the recorder and their abilities known to me? If so, how experienced are they? For experienced recorders that are a known quantity I probably don't need to do much for common and readily identified species. If not, it is helpful to see a full list of their records to get perspective of what they are recording. Each time I see a dataset I build a picture of the competency of the recorder.
  • Do the records coincide with the known distribution and phenology for the relevant species? If on the limits of these ranges then one might want to check further.
  • If a single record then obviously one cannot do much more than say 'is this record potentially believable?' In many cases one has to use a significant element of trust. To my mind this is not really verification; it is just a broad-scale quality assurance process that in no way says that all records are correct. We all make mistakes, so all datasets are likely to contain glitches. For the most part, a low level of mistakes has little or no impact on the reliability of the data.
  • Are there tricky species? If so, I may go back and ascertain the presence of a specimen. If there is no specimen and the recorder has relatively low levels of experience then the record may not be reliable. An awful lot of verification is based on trust and a knowledge of individual recorders, so that is something that only happens over time.
  • And, in knowing recorders I often ask myself how bold they are in their diagnosis - the more caution I see, the greater assurance I have that records are likely to be reliable if not perfect.

So what about verification of iRecord?

I get frustrated with iRecord and much prefer spreadsheets. My reasons are focussed on data that are not supported by specimens or photographs. Analysing a spreadsheet is a relatively quick process, whereas trying to sort out the odd few records entered intermittently on iRecord involves a lack of context.

When I receive a spreadsheet I quickly scan for certain indicators. For example, if I know that a recorder does not take specimens and their lists contain species that cannot be done from photos or in the field, then the data are suspect. Does the list contain 'Chinnery' mistakes? There are glitches in Chinnery that are giveaways that the data are suspect - I won't give these tricks out because they are so useful to me! Are there records of females that can only be taken to aggregate but are listed as a segregate?

Similarly, do the lists contain habitat indicator species that are clearly out of range? A classic is to see inland dry sites with lists that include species such as Platycheirus immarginatus (it does happen and shows that the data have been created using the pictures in Stubbs & Falk).
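A scan of this kind could in principle be run over a spreadsheet automatically. The sketch below is illustrative only: the column names (`species`, `sex`, `habitat`) are assumed, and the lookup sets are placeholders with made-up entries, apart from the Platycheirus immarginatus example mentioned above. A real verifier's lists would come from their own experience.

```python
import csv
import io

# Placeholder lookups to be filled in by the verifier. Only the
# Platycheirus immarginatus entry is taken from the text above.
NEEDS_SPECIMEN = {"Genus speciesA"}
FEMALE_AGGREGATE_ONLY = {"Genus speciesB"}
HABITAT_MISMATCH = {("Platycheirus immarginatus", "inland dry")}


def screen(rows, recorder_takes_specimens):
    """Yield (row, reason) for each record that deserves a second look."""
    for row in rows:
        species, sex, habitat = row["species"], row["sex"], row["habitat"]
        if species in NEEDS_SPECIMEN and not recorder_takes_specimens:
            yield row, "needs a specimen, but recorder does not take them"
        if species in FEMALE_AGGREGATE_ONLY and sex == "female":
            yield row, "female listed as a segregate"
        if (species, habitat) in HABITAT_MISMATCH:
            yield row, "habitat indicator species at an unlikely site"


# A tiny invented spreadsheet standing in for a recorder's submission.
sheet = io.StringIO(
    "species,sex,habitat\n"
    "Genus speciesA,male,woodland\n"
    "Genus speciesB,female,grassland\n"
    "Platycheirus immarginatus,male,inland dry\n"
    "Genus speciesC,male,garden\n"
)
flagged = list(screen(csv.DictReader(sheet), recorder_takes_specimens=False))
for row, reason in flagged:
    print(row["species"], "->", reason)
```

A flag only marks a record for a closer look; deciding what to do with it still rests on knowing the recorder and seeing the whole list in context.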

There is no guaranteed way of ensuring correct identifications and with lists it is a matter of trust. Having spent a lot of time developing data from photographs, I now have a big dataset that has improved the parameters for species phenology. I also know a lot more about the potential problems that people encounter - the list of mistakes is huge, as my previous post on iRecord data has shown.

And, there is a lesson for us all

On one occasion I accepted a record that was not supported by a photograph because it came from the right sort of place and the species was not terribly difficult to ID. Shortly afterwards, the originator posted a photo of the animal and extolled the virtues of iRecord as a means of confirming the identity of a specimen. The photo and the determination did not match - this made me look a fool! It just goes to show that unless the verifier checks a piece of chitin on a pin, the record can only be said to be likely to be reliable and not a correct ID!

Monday, 1 May 2017

Dealing with species complexes

One of the big problems we have when identifying specimens from photographs is the level of resolution needed to make a firm identification. In many species the critical characters are subtle or simply cannot be depicted from anything but the most awkward and unlikely angles for live animal shots. Therefore, if we cannot make a firm identification we will often go only as far as genus (occasionally Tribe); or we will go as far as the point where a particular species split occurred.

The main reference point for species splits (in Britain) is the first edition of Stubbs and Falk (1983), which established the foundations of the modern species list. At that time some 250 species were known, with various ideas presented about possible splits (species A, B, C etc). Over time, some of those ideas have been confirmed as reliable new species and splits in what was then regarded as a single species have been made. One or two have been lumped (e.g. Baccha elongata and B. obscuripennis).

The problem in recording terms is what to do with pre- and post-split data. Junking all the pre-split data is not wise – it will find its way back in again and may contaminate the data for the original species. So, unless original voucher specimens have been re-examined and a new diagnosis has been made, we allocate these records as sensu lato (in the broad sense), often notated as sl. or agg.

There then comes the problem of what to do when species complexes are presented as photographic records? In many cases we simply cannot make the relevant split from the photograph so one option would be to log at generic level. This is not very satisfactory, however, because we can be reasonably sure about its identity if basing the identification on the earliest edition of Stubbs & Falk. So, I log such specimens as sl.

The main splits that are relevant here are:

Cheilosia albitarsis to C. albitarsis and C. ranunculi
Platycheirus clypeatus to P. clypeatus, P. europaeus, P. occultus and P. ramsarensis
Platycheirus peltatus to P. peltatus and P. nielseni
Platycheirus scutatus to P. aurolateralis, P. scutatus and P. splendidus
Xanthogramma pedissequum to X. pedissequum and X. stackelbergi (and X. dives in Europe)
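The logging rule described above can be sketched as a small lookup. This is an illustration rather than the scheme's actual database logic: the function name `log_name` and the `split_confirmed` flag are my own, and the table contains only the British splits listed here (X. dives is left out because it is noted as European).

```python
# Splits listed above: original (pre-split) name -> current segregates.
SPLITS = {
    "Cheilosia albitarsis": [
        "Cheilosia albitarsis", "Cheilosia ranunculi"],
    "Platycheirus clypeatus": [
        "Platycheirus clypeatus", "Platycheirus europaeus",
        "Platycheirus occultus", "Platycheirus ramsarensis"],
    "Platycheirus peltatus": [
        "Platycheirus peltatus", "Platycheirus nielseni"],
    "Platycheirus scutatus": [
        "Platycheirus aurolateralis", "Platycheirus scutatus",
        "Platycheirus splendidus"],
    "Xanthogramma pedissequum": [
        "Xanthogramma pedissequum", "Xanthogramma stackelbergi"],
}

# Reverse lookup: each segregate points back to its pre-split name.
SEGREGATE_TO_AGGREGATE = {
    segregate: aggregate
    for aggregate, segregates in SPLITS.items()
    for segregate in segregates
}


def log_name(name, split_confirmed):
    """Return the name to log for a record.

    split_confirmed is True only when the split has actually been made,
    e.g. a voucher specimen has been re-examined.
    """
    aggregate = SEGREGATE_TO_AGGREGATE.get(name)
    if aggregate is None or split_confirmed:
        return name                  # not in a listed complex, or split made
    return f"{aggregate} sl."        # fall back to the broad sense


print(log_name("Cheilosia ranunculi", split_confirmed=False))
print(log_name("Cheilosia ranunculi", split_confirmed=True))
print(log_name("Episyrphus balteatus", split_confirmed=False))
```

Keeping the pre-split name with an explicit "sl." suffix means old and photographic records stay in the dataset without being mistaken for the narrowly defined species.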

Those apart, we think that at some point Dasysyrphus venustus will get split into at least two species – hence we are careful here. There has also been a lot of uncertainty about the status of D. hilaris which is almost identical to D. venustus – hence if there is no face shot we lump these together and say venustus agg. – but for data purposes I log as Dasysyrphus sp. When this one splits we won't be able to do it from photos because the main useful characters are on the sternites - there will be at least three species - D. venustus, D. hilaris and at least one additional species but potentially two or more.

Then there comes Melanostoma – we just cannot be sure what will be doable once this is sorted out! We had thought that a recent review using DNA had eliminated the developing theme of at least five species within M. mellinum and two or more in M. scalare, but M. mellarium has been added to the mixture and I (or others) have yet to do a careful analysis of specimens to be sure what we have.

Finally, we have the problem of groups of species that Alan Stubbs recognised as groups based around a central name. This is mostly used in Cheilosia where there are a number of species that share a common obvious feature such as projecting hairs on the face, or hairy/bare eyes etc. In those Alan has erected 'groups' and a similar approach has been adopted by van Veen (see my post on identification guides). In our diagnoses, we may well say that the depicted specimen is likely to be a member of the grossa, pagana, variabilis etc. groups but we don't give an aggregated name because there is no certainty of which couplet (split between two species sharing most but not all of the same characters) one can get to.

Much of this more complicated taxonomy is well-known to those Dipterists who have used the various editions of Stubbs & Falk, but it will not be familiar to newcomers using the WILDGuide. Unfortunately, there is only so much space in such a volume and it was only ever designed as an introductory guide rather than a comprehensive replacement for Stubbs & Falk. If you progress to the microscope then you will need Stubbs & Falk.

Stubbs & Falk is arguably obsolescent because there have been a dozen or so new additions to the British fauna since 2002 when it was last updated. We have said that we will provide a supplement, but as yet that has not happened – it is one of several major jobs on our list – maybe next winter!

Saturday, 29 April 2017

Verification of records

Over the last couple of days, there have been busy threads on both the BWARS and NFBR Facebook groups concerning record verification. It all started with a question concerning the distribution of Anthophora plumipes, a relatively easily recognised solitary bee that flies in early spring and whose distribution is well-known.

The map that was produced on the NBN (Figure 1) showed a vastly wider distribution than is shown in reliable data compiled by BWARS (Figure 2) but not currently available through the NBN. The difference is obvious! The data in question were compiled by the ‘Great British Bee Count’ run by ‘Friends of the Earth’. In essence, the data are junk as they stand! Sadly, an awful lot of well-meaning people have been encouraged to participate in what seems to me to be little more than a publicity stunt. No thought seems to have been given to data verification or to the impact poor data can have on the work and outputs of long-established biological recording schemes.

Figure 1. Distribution of Anthophora plumipes according to data collected by the 'Great British Bee Count'

Figure 2. Distribution of Anthophora plumipes based on verified data compiled by BWARS.

In fairness to FoE, they do seem to have recognised the problem and I believe have linked up with Buglife to do something about it. I was recently contacted by someone at Buglife to seek my views on whether the project should extend into hoverflies and whether I would be willing to verify the data. I said NO on both counts. Why? Surely I should be getting involved?

My rationale is simple. The Great British Bee Count swamped BWARS with utterly unreliable data and they were neither able nor willing to take on the job of verification; I don’t blame them as it is not the simple job people sometimes think. It is not just a question of getting a specialist to sit down and check a few photos; it is weeks or possibly months of work that is tedious and frustrating. Also, is it making best use of skills developed over several years or, in most cases, tens of years? My answer is emphatically NO.

I seem to recall that FoE’s rationale for starting the Great British Bee Count was that there was inadequate data on bee distribution and that it needed more effort from the general public. That was pretty naive. The issue should not start with data availability, although it is fair to say that coverage of most invertebrate taxa is much poorer than for vascular plants, birds or mammals. The big issue starts with the complexities of identification and the skills needed to become competent with exceedingly difficult identification. Acquiring these skills takes time and patience. I spent maybe a decade doing aculeate Hymenoptera, and still do the odd few specimens. I don’t consider myself an ‘expert’ but can make a reasonable job of separating out the majority of regularly encountered species when I sit down for a couple of days and work through a block of specimens within a single genus. Likewise, I now feel I can cope with Hoverfly identification from photos, but it has taken me ten years to reach that point (and I am still learning).

If we want bigger datasets, the starting point has therefore GOT to be growing skills. It is a very slow process but is the best use of specialist time if one looks at a long-term strategy to improve our knowledge. That is why the HRS has been running training courses for a decade or more. Thanks to OPAL grants we can take the courses to the places where they are needed, and we do so regularly. Even so, I reckon that at best we convert 5% of the people who do our courses into serious recorders; and of that cohort, probably only 10% will go on to have the necessary expertise to take on the task of data verification. For most it is a hobby and one that has to fit in amongst a plethora of other activities and responsibilities. It is the rare individual who can devote time to developing the skills that are needed to take on the challenging task of ID from photographs and data verification.
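To put those two percentages together: 10% of 5% is 0.5%, or roughly one potential verifier for every 200 people trained. The short calculation below works this through for a purely hypothetical intake of 1,000 course attendees (the post does not give an attendance figure), treating both rates as the 'at best' estimates above.

```python
def expected_yield(trainees, pct_serious=5, pct_verifiers=10):
    """Apply the two 'at best' conversion rates in sequence."""
    serious = trainees * pct_serious / 100       # become serious recorders
    verifiers = serious * pct_verifiers / 100    # of those, could verify data
    return serious, verifiers


# 1,000 attendees is an invented figure, used only to make the sum concrete.
serious, verifiers = expected_yield(1000)
print(f"{serious:.0f} serious recorders, {verifiers:.0f} potential verifiers")
```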

So, the message is clear. If we want more data we have got to engage constructively with people who want to learn. FoE’s initiative will generate a lot of interest, and hopefully it will get a new generation of young people sufficiently enthused to take up the net and pooter, microscope and keys. That is the real benefit of the initiative; not the generation of a block of data that has already wasted a certain amount of specialist time dealing with the ensuing kerfuffle when the data they publish are so obviously inaccurate.

On a broader theme, it is also worth reflecting that there is a great deal of naivety about the potential of 'Citizen Science' to solve shortfalls in data. The Great British Bee Count has been helpful in showing the pitfalls of badly designed initiatives and the need for researchers to be very careful about the datasets that can be relied upon. It also shows why recording scheme organisers have to be vigilant in evaluating those records that they receive. There are lots of tests that can help to verify a record, but in the end the only sure way is to examine a specimen on a pin and place that specimen in a suitable repository for re-evaluation as and when necessary!