In the past ten years, several models have been developed to
make use of ‘ad-hoc’ or ‘opportunistic’ data. They are regularly used in analyses
of trends in Britain’s wildlife and are the black boxes behind the banner
headlines of x or y changes in the abundance of Britain’s wildlife
(mostly declines). The processes are complicated, so my
description is necessarily brief and open to correction by those in the know.
However, for basic purposes of explaining how different datasets perform, the
following may be useful:
These models take existing data and use them to predict
where a given species might occur. To do so, they develop a list of the species
that occur in surrounding squares that contain similar land-cover
characteristics. The lists will comprise a mixture of those species that might
be expected almost everywhere, those that are more specialised but are still
widespread and abundant, and scarcer species that have more demanding
ecological parameters.
The completeness of coverage of surrounding squares will
determine the degree to which a model can predict presence or absence. It has
been assumed that models will smooth out irregularities in recording effort,
but I have felt for a very long time that they will be affected by the
composition of species lists. If the list is complete, there is more chance of
predicting the presence of scarcer species or of species that are difficult to
identify. On the other hand, incomplete lists will make it more difficult for
the model to identify critical ecological factors and species will not be
predicted.
Crucially, a test of whether a list is complete will depend
upon those species that occur consistently across the landscape. There are
arguably three classes of species that fall into this category:
- readily recognisable species that almost everybody records;
- species that are difficult to find but are still very widespread and are therefore less well recorded; and
- species that are very widespread, but difficult to identify and hence are under-recorded.
If a species list contains all the above species, it can be
assumed that it is comprehensively recorded. The shorter the list of these ‘constant’
species, the less well it is recorded. The problem that dogs these models is
the issue of completeness of coverage. So, inevitably, if coverage is weak, the
models will have trouble predicting presence or absence. This shows up quite
well in models covering, say, the west coast of Scotland where there has been
very little recording at any time. At the moment, I am unconvinced that we
really know what the constants are amongst the taxonomically challenging parts
of our wildlife.
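To make the idea of a completeness test concrete, here is a minimal sketch. It is my own illustration and not how any of the published models actually work: it simply scores a square by the fraction of the expected 'constant' species that appear on its list. The species names and the choice of constants are placeholders invented for the example.

```python
def completeness_score(recorded, constants):
    """Fraction of the expected 'constant' species present in a square's list."""
    constants = set(constants)
    if not constants:
        raise ValueError("need at least one constant species")
    return len(constants & set(recorded)) / len(constants)


# Hypothetical constants, one from each of the three classes listed above.
CONSTANTS = {"easy_common_sp", "elusive_common_sp", "tricky_common_sp"}

# Hypothetical species lists for two squares.
squares = {
    "well_recorded": {"easy_common_sp", "elusive_common_sp",
                      "tricky_common_sp", "scarce_sp"},
    "casual_only": {"easy_common_sp", "scarce_sp"},
}

scores = {name: completeness_score(spp, CONSTANTS)
          for name, spp in squares.items()}

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

The square with only casual records scores low even though it holds a scarce species, which is the point: a short list of constants signals weak coverage, whatever else is on the list.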
So, the question then arises:
What can we do to improve the accuracy of predictive models?
Readers who use BirdTrack will be aware that the system
requires the recorder to say whether they are submitting a complete or partial list.
If your list only notes the rare and unusual, it is not included in the analysis,
and likewise if there were species that you were unable to identify then the
list is incomplete and should not be included in the analysis. BirdTrack takes
opportunistic recording one step closer to providing the robust data that
occupancy models need to deliver reliable results.
In most other taxonomic groups, ‘opportunistic’ data are a
hotchpotch of complete lists and casual single records. All have an
important role to play because they all help to fill in little parts of the jigsaw.
But, of course, if a visit is made to a site and only part of what was seen is
reported, then the model only has part of the species list to work with. Repeat
visits by a range of recorders will fill in some of the gaps over time, but
unless the range of recorders includes people who tackle the tricky species,
the lists will always be incomplete, and the model will inevitably have less to
work on.
So, if we want to improve the accuracy of predictive models,
the answer is quite simple. We need to improve overall coverage, both in terms
of geographical extent and in terms of depth of species composition. This is
one reason why a general call for more recording may not have the desired
effect; indeed, it could compound model shortfalls by focussing on a larger
volume of the easily identified species and give the impression that more
challenging species are declining or declining at a faster rate than they
actually are.
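A little arithmetic shows how this can happen. The sketch below uses a deliberately naive reporting-rate index (records of a species divided by total visits), which is not how the occupancy models themselves work, and the numbers are made up for illustration.

```python
def reporting_rate(records_of_species, total_visits):
    """Naive index: share of visits on which the species was reported."""
    if total_visits <= 0:
        raise ValueError("total_visits must be positive")
    return records_of_species / total_visits


# Hypothetical 'before': 100 visits by recorders who tackle everything;
# a hard-to-identify species is reported on 20 of them.
before = reporting_rate(20, 100)

# Hypothetical 'after': the same 100 visits plus 100 new visits by recorders
# who only log the easy species, so the hard species gains no new records.
after = reporting_rate(20, 200)

apparent_change = after / before - 1.0
print(f"before={before:.2f} after={after:.2f} change={apparent_change:.0%}")
```

Nothing has happened to the species itself in this example, yet the naive index halves. Knowing which lists were complete is what would allow an analysis to tell the difference.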
I have shown in previous posts how the trend for Portevinia maculata has shifted sharply
upward since photographic recording became the preferred recording medium. The Portevinia maculata
model, however, illustrates a second issue. The species was probably greatly
under-recorded and is now much better recorded. So, the army of recorders who
have looked for it and added new squares have made an important contribution to
our knowledge of its true distribution. So, there are definite benefits from
certain increases in recorder effort.
It therefore follows that if one of the significant
objectives of biological recording is to improve our knowledge of the
distribution and status of Britain’s wildlife, we need to think about how to
improve the data that underpin these predictive models. These models were used
to produce the maps in the WILDGuide to hoverflies and doubtless in other
guides too. So, there are also benefits to the avid recorder if the models are
improved - the next generation of guide books should be more accurate!
Thus, rather than a general cry for more data, I think the
new cry should be: complete lists, please! Or, if you are not one for retaining
specimens, do please try to ensure that your coverage is as complete as
possible. We have seen a strong shift in this direction in the UK Hoverflies
Facebook group and it is much welcomed. I think this shift illustrates two
important points:
- More active group members have developed the ability to create such lists; and
- These members have developed the key skill of logging all observations rather than just a checklist of the unusual.
Whatever your interest in wildlife recording, it is worth
thinking about the added value of full species lists. They will make a
difference.
With occupancy models, we can estimate the total number of species (or taxa) present, for example by using multi-season, multi-species occupancy models. Not only can these models help with false absences, but they can help with false positives too. Improving coverage and species identification will increase accuracy when there is an associated increase in the number of surveys, and similarly it will help to record the common as well as the rare species.
A really important consideration is the number of repeat surveys at our sampling sites of interest: the accuracy of the estimates can be improved more effectively by repeating more surveys at a single sampling location rather than by spreading fewer surveys across many sampling locations. The exact number of surveys required will depend on the detection probabilities of the species involved, their occupancy rates and available resources.
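As a rough illustration of why repeat surveys matter, the sketch below computes the chance of detecting a species at least once at a site it really occupies. It assumes the same per-visit detection probability on every visit and that visits are independent, which is a simplification; the value 0.3 is made up.

```python
def prob_detected_at_least_once(p, visits):
    """Chance of one or more detections at an occupied site over `visits` visits.

    Assumes a constant per-visit detection probability `p` and independent visits.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if visits < 0:
        raise ValueError("visits must be non-negative")
    return 1.0 - (1.0 - p) ** visits


# Hypothetical species that is detected on 30% of visits to an occupied site.
for k in (1, 2, 5, 10):
    print(k, round(prob_detected_at_least_once(0.3, k), 3))
```

With these assumed numbers, a single visit misses the species most of the time, whereas five visits give roughly an 83% chance of at least one record. The harder a species is to detect, the more visits are needed before an absence from the list means much.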
Another key feature of occupancy models is that they allow us to account for the different abilities and detection rates of different observers. This has the potential to help with observers that are more likely to record some species than others.
Darryl MacKenzie et al. (2018) have published a second edition of Occupancy Estimation and Modelling, which is a useful update on these methods, and there are some good packages in R that can be used to help with any analysis.
Apologies, I was unable to comment on the UK Hoverflies Facebook post, so hope it is ok to discuss on here.