Sunday, 24 August 2014
Records from photographs - a conundrum
At the start of this week I hit a brick wall in terms of the effort I make to extract data from websites. It is a pretty huge task these days, but when I started six years ago it was in its infancy. The growth in photographic recording has been massive and these days I am unable to keep on top of it without working a minimum of 50 hours a week. That is unsustainable as I find it getting in the way of my ability to earn a living! So I posted on various forums that I was going to cut my commitment at Christmas and would restrict my involvement to one or two forums.
I was looking to see what expressions of interest there might be to help out. The result has been very helpful, with an excellent software engineer volunteering to build a data extraction bot and website (he has one up and beta-testing already!). Two other people have offered help. On the whole, comments have been very positive but I did get one response that raised all the questions that are raised by taxonomic specialists – what is the value to be had from ad-hoc photographic records?
This is a really important question and one that deserves careful analysis because in my view the national datasets are increasingly skewed towards such data. Certainly that is the case for hoverflies and I suspect that a similar situation will obtain elsewhere. Does it matter? And, if it does not matter, what are the benefits of growing a bigger network of recorders who perhaps only record part of the fauna?
At one time I might have held similar views to those expressed above, but I have given a lot of thought to the issue and have concluded that on the whole the benefits vastly outweigh the drawbacks. My reasoning is as follows:
One can take a highly insular approach to recording and confine recording schemes to the outputs of a very small number of recorders who cover all taxa within a particular family. Alternatively, one can absorb all records and recognise that the dataset will be disproportionately skewed towards those species that people see and can identify without resorting to taking a specimen and examining it under a microscope. The two approaches yield very different data profiles, and in the past the outputs of key recorders would have dominated the dataset (for about 30 years the dataset was dominated by just 20 very active recorders). Many of those recorders are no longer very active and the dataset is now growing from a new cohort of recorders, rather fewer of whom cover all taxa. Thus, the HRS dataset now fits much closer to the dataset emerging from photos. All the same, provided one has a clear picture of who records in particular ways, one can split the data according to technique and analyse it accordingly. So there is no real problem from a data management perspective.
There then comes the issue of rarity or difficulty of ID. Whilst the occasional record of a 'rarity' might be of some interest, it is of limited value when wanting to analyse trends – you need an awful lot of records to do much trend analysis, and by its very nature rarity precludes such analysis. In actual fact one does find from some photographs that there are more of certain species than we might think – e.g. Palloptera muliebris turns up far more frequently in photos than it does in my net (I think I've seen it twice in 30 years!). The data for many hovers and larger Brachycera have contributed to the various species status reviews. So, if we judge datasets on rarity then maybe they are not covering all taxa, but in actual fact photographers do see species that the specialists rarely see – for example I reckon that there are more photographic records of Arctophila superbiens this year than will come from specialists.
I would then suggest that 'common' species are often the bellwether of changes in the wider countryside, so big datasets of species that people can identify may actually tell us quite a lot about the natural world. We can do this with a variety of hoverflies from photos – changes in emergence periods and in distribution. The data are still too limited to look at trends but they are improving.
Finally, I think we must look at what one is trying to do when engaging with photographic recorders. We have to be realistic that this is the biggest cohort of natural historians and is increasing in influence. We either engage and hope to show how there remains a need for sound recording by collection, or we shrink into a box and fight people off when attacked. I favour the former and that is why I put effort in. What is more, if the biological recording community is to remain active and relevant, the photographic community is a very big constituency so we need to engage and to show what can and cannot be done with data accumulated this way.
So, am I wasting my time? Well, if some people think that is the case then they don't have to get involved. But we rely on a very small band of people to make the detailed datasets and those alone will not provide some of the data that are important. If by outreach we pick up the occasional person who gets more deeply involved, or converts to using a microscope (there have been a few), then there is a future for sound taxonomic recording. If we fail to do that outreach and to show value to what people are doing, not only will recording diminish as the current generation pops its clogs, capacity to generate new competency will also diminish. If I were a politician reading some comments I would be thinking – why bother with this lot – they are not inclusive and are negative. If I got too positive a message I might develop too many expectations.
My view and approach is to look for a level of input that demonstrates both the value and the limitations, but the main limitation is the lack of specialist capacity. I have previously written about the need to develop more expert capacity, and I believe that developing the recording effort via engagement with photographers is a valid way of doing so. It may not be 'ideal' but then if you wait for the ideal situation it is unlikely to happen.
It is also worth bearing in mind that the sort of engagement I have made means that large numbers of people have developed an interest in Diptera at some level. Some will buy the books - e.g. the revised Larger Brachycera book. We need to drum up interest in order to sell enough books to make them economically viable. If we don't then such books will not get published. Increasing interest helps to sell the Wildguide and that in turn generates an income to produce guides such as the hoped-for Scathophagid book (we have donated the proceeds to Dipterists Forum). Likewise, that interest may help to generate the case for a Diptera Wildguide - and for that my database may be essential to source relevant photos. So, it is not a simple question of limited records; there is a strategic case too.
There is a genuine need for debate about the value of different recording techniques, but the most important issue is to think about what positive benefits can be accrued. If we don't make an effort to extract data and to engage to encourage, then we are missing a time-limited opportunity, as the people who can provide the taxonomic expertise are aging and we need to grow a new constituency of specialists to provide the detailed taxonomic advice.