Saturday, 20 August 2016

Making sense of data for July & August 2016

Further to my last post, I had a look at the same dataset (i.e. 01 July to 13 August) to see how the data for the same period in each of the previous three years compares with 2016.

A total of 126 'species' were recorded over this period between 2013 and 2016:




Year    Number of species    Number of records
2016    106                  5652
2015    98                   4796
2014    92                   2915
2013    92                   1585


Making something of this is not straightforward as the yearly figures are so different. The simplest way of expressing the data is as a proportion of the total records. For the purposes of this quick analysis it will have to suffice, although more complicated statistics might be applied. More detailed investigation will have to wait until we have a much bigger dataset covering many more years.
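
As a rough illustration of what "a proportion of the total records" means in practice, here is a minimal Python sketch. The species counts are made-up placeholders, not figures from the dataset, and the function name is simply one I have chosen for the example.

```python
def proportions(species_counts):
    """Return each species' records as a percentage of that year's total."""
    total = sum(species_counts.values())
    return {species: 100 * n / total for species, n in species_counts.items()}

# Hypothetical counts for a single year (illustrative only, not real records).
counts_2016 = {
    "Episyrphus balteatus": 300,
    "Eristalis tenax": 150,
    "Syritta pipiens": 50,
}

pct_2016 = proportions(counts_2016)
print(pct_2016["Episyrphus balteatus"])  # 60.0
```

Expressing each year this way puts years with very different totals on the same scale, although a rise in one species will necessarily push the percentages of the others down.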

Top 50 species in 2016


A relatively small proportion of species make up the bulk of the records so when they are organised in rank order, largest to smallest, for 2016, one is well into the tail of occasional records by the time one gets to the 50th most abundant species (Figure 1).

Figure 1. The 50 most frequently recorded hoverflies from 01 July to 13 August organised by rank order for 2016 as a % of the total records for the recording period that year. Orange highlights the highest year proportion and blue the lowest.
Taking the top 50 in 2016 as a cut-off point, I then looked at which year was the best for each species and which was the least productive, and converted each into a tabulation by larval biology to generate Figures 2 & 3. (Note: I lumped Myathropa florea into the wetland guild because it often breeds in locations that might strictly be considered to be wet features rather than dead wood habitat.)
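
The tabulation described above could be sketched along the following lines. This is only an outline under stated assumptions: the species names, percentages and guild assignments are placeholders, and ties between years are resolved by taking the first year encountered.

```python
from collections import Counter

# Hypothetical % of each year's records per species (placeholder values).
pct = {
    "Species A": {2013: 4.0, 2014: 6.5, 2015: 5.0, 2016: 3.0},
    "Species B": {2013: 2.0, 2014: 1.5, 2015: 2.5, 2016: 3.5},
    "Species C": {2013: 1.0, 2014: 3.0, 2015: 2.0, 2016: 0.5},
}

# Hypothetical larval-biology guild for each species.
guild = {
    "Species A": "wetland",
    "Species B": "aphidophagous",
    "Species C": "wetland",
}

def tally_by_guild(pct, guild, pick):
    """Count species per (year, guild), where 'pick' (max or min) selects the year."""
    tally = Counter()
    for species, by_year in pct.items():
        year = pick(by_year, key=by_year.get)
        tally[(year, guild[species])] += 1
    return tally

best = tally_by_guild(pct, guild, max)    # best year per species, by guild
worst = tally_by_guild(pct, guild, min)   # poorest year per species, by guild
```

With these placeholder values, `best[(2014, "wetland")]` is 2 because both wetland species peak in 2014.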

Figure 2. Best representation in each year according to larval biology

Figure 3. Poorest representation in each year according to larval biology

 

What do the data tell us?

At first glance, one might say 'not much', but there are subtle and possibly important differences:

  • Wetland and aphidophagous species are the two guilds that dominate the most frequently recorded species list. It seems that in years when wetland species do well, aphidophages fare less well, proportionately. Of course, it may simply be that when wetland species numbers rise, the ranking of other species is diluted and they apparently fare less well. A more comprehensive statistical analysis is probably needed to investigate this, together, perhaps, with a much bigger dataset.
  • 2014 was a particularly good year for species associated with Hymenoptera; either as scavengers and parasitoids in social wasp and bee nests, or associated with ant-attended root aphids (if larval ecology conforms to current thinking).
  • The data used in this analysis cover the later stages of the hoverfly season, and certain guilds are probably under-represented because they emerge earlier in the year. This is particularly the case for saproxylic (dead wood) species. More detailed analysis will be possible when the full 2016 dataset has been assembled.
  • There has been a greater than threefold increase in the volumes of data assembled using photographic recording, and as such the dataset is starting to be sufficiently big to be a powerful tool in its own right. Good coverage of species is achieved, but it seems likely that there is under-representation of some guilds, especially plant feeders within the genus Cheilosia.
Some bigger questions emerge from this rather crude analysis:
  1. To what extent are differences in the proportions of hoverflies recorded in an individual year a reflection of environmental factors within that year?
  2. Conversely, to what extent does the weather (or other environmental factors) in preceding years impact upon numbers recorded in any given year?
At the moment neither question can be answered with certainty. Nevertheless, most of the commonest hoverflies are species that have several broods each year. It therefore seems likely that the weather in preceding months rather than years will play a dominant role in governing individual species' abundance. We know that this happens in species such as Rhingia campestris and it seems equally likely that the same will obtain for both wetland and aphidophagous species.

Hoverfly associates of social Hymenoptera normally have a single generation each year. Therefore, good years for adult hoverflies probably reflect the breeding success of the preceding year rather than the year in question. This raises the important point that disentangling the ecology of one guild of animals may be dependent upon good data for another group of organisms. Are there sufficient data to say how well bumblebees, ants and wasps fare each year? It strikes me that some simple system for monitoring absolute numbers of bumblebees and social wasps might provide important data to help to investigate this relationship. That gets me thinking that perhaps we need more join-up between the HRS and BWARS.









Monday, 15 August 2016

A bit of analysis

Yesterday, in a post on the UK Hoverflies Facebook Page I tried to make some sense of what is happening to hoverfly populations this year. It generated several useful comments that got me thinking further. In that post, some basic graphs were presented in which the general differences between the datasets for the period 1 July to 13 August in 2015 and 2016 were illustrated (Figures 1-4). My initial thoughts (anecdotally) were that recorder effort had changed, with much more data coming from a smaller core of very active recorders and that whilst recorder effort had risen, the numbers of species per recorder had dropped. My interpretation of this was that the data lent some credence to the belief that hoverfly numbers were substantially down in 2016.
Figure 1. Basic data for 1 July to 13 August 2015 showing total numbers of daily records, recorders and species recorded.

Figure 2. Basic data for 1 July to 13 August 2016 showing total numbers of daily records, recorders and species recorded.

Figure 3. Comparisons of numbers of records per recorder in the periods 1 July to 13 August in 2015

Figure 4. Comparisons of numbers of records per recorder in the periods 1 July to 13 August in 2016
It was pointed out in comments that this might not be the case and that the data could simply be skewed by the numbers of new recruits who only posted shots of the commonest species. This morning, I therefore took a further look at data for the same period (plus one day). The results were illuminating (Table 1)!



                                          2014            2015            2016
Total records                             2974            4843            5891
Numbers of recorders each year            653             539             456
Top 10 recorder contributions each year   809 = 27.2%     1311 = 27.07%   2049 = 34.78%
Top 20 recorder contributions each year   1053 = 35.4%    1927 = 39.79%   2899 = 49.21%
  Table 1. Breakdown of photographic data for the period 01 July to 14 August in successive years 2014 to 2016
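
The percentages in Table 1 are simply the top recorders' records divided by the year's total. A short Python sketch shows the arithmetic; the function `top_n_share` and the per-recorder list passed to it are hypothetical, while the `table1` figures are the Top 10 and total values from the table.

```python
def top_n_share(records_per_recorder, n):
    """Percentage of all records contributed by the n most prolific recorders."""
    ranked = sorted(records_per_recorder, reverse=True)
    return 100 * sum(ranked[:n]) / sum(ranked)

# Hypothetical per-recorder counts, just to show the function in use.
example_share = top_n_share([10, 5, 3, 2], 2)  # top two recorders: 15 of 20 records

# (top 10 records, total records) for each year, taken from Table 1.
table1 = {2014: (809, 2974), 2015: (1311, 4843), 2016: (2049, 5891)}
shares = {year: round(100 * top / total, 2) for year, (top, total) in table1.items()}
```

Here `shares` reproduces the Top 10 row of the table (27.2, 27.07 and 34.78).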

The data suggest that there has actually been a drop in recorder activity, but that those who are active have contributed vastly more records. To some extent I think the latter is true. There is a very nice group of regular contributors who now record almost daily from their garden or wildlife patch. This is the essence of the Garden Hoverfly Recording Scheme that Alan Stubbs pioneered 25 years ago. It is precisely what Stuart and I have been hoping to generate. The data are certainly skewed by some changes in recorder activity. For example, one major recorder in 2014 changed to a slightly different technique and although we have his data on a separate spreadsheet they don't appear in this analysis. Were they to be included, the trend would be much more pronounced. In addition, several other very active recorders in 2015 have switched to maintaining their own spreadsheets for submission monthly or at the end of the year. Again, their data would accentuate the change that has happened.

So, what has happened to recorder activity? This is best summed up by looking at the composition of the top 30 recorders within the photographic dataset that I have used. The recruitment/loss process is shown in Table 2. It is greatly encouraging and bodes well for the future of hoverfly recording!



                                Number of recorders
Contributing all three years    15
Contributing in 2015 & 2016     12
Contributing in 2014 only       1
Contributing in 2016 only       2
  Table 2. History of the top 30 recorder contributions to the photographic dataset between 2014 and 2016
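
One way a breakdown like Table 2 could be derived is with set operations on the recorder names for each year. The sketch below assumes a set of names per year; the single-letter names are placeholders and the exact definition of the top 30 is not reproduced here.

```python
# Hypothetical recorder names per year (placeholders, not real recorders).
top = {
    2014: {"A", "B", "C"},
    2015: {"A", "B", "D"},
    2016: {"A", "B", "D", "E"},
}

all_three = top[2014] & top[2015] & top[2016]           # present in every year
only_2015_2016 = (top[2015] & top[2016]) - top[2014]    # recruited in 2015, still active
only_2014 = top[2014] - top[2015] - top[2016]           # lost after 2014
only_2016 = top[2016] - top[2014] - top[2015]           # new in 2016
```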


Tuesday, 9 August 2016

Presenting records via Facebook



Use of the UK Hoverflies Facebook group as a way of getting photos checked has grown quite remarkably. It is a great way of facilitating interactions between observers and the Recording Scheme and is now generating a phenomenal number of records: there were over 4,000 gathered in July from web-based sources, with the majority coming from this group. Obviously such large numbers mean that data come in all sorts of formats and there is quite a big job turning FB posts into a spreadsheet, so I thought some guidance on record/photo submission might be helpful.
My spreadsheet for web-based data has the following headings:
Column header: Notes

Species name: I record full identifications and partial identifications in two different spreadsheets. The first comprises those records that can be taken fully to species or to an aggregate name where there has been a split and the Resident Team refer to the aggregated name. For example, if we cannot put a firm name to a specimen that would once have been recorded as Xanthogramma pedissequum I record this as Xanthogramma pedissequum s.l.
If we can only get to genus or tribe, then these go on a separate sheet.

Date: This is arranged dd/mm/yyyy e.g. 31/07/2016; if we cannot get a full date then it will be recorded as either 'July 2016 or as '2016. The latter two options are far from ideal but can be used.

Grid reference: Ideally a four or six-figure OS grid reference. A lot of grid reference finders take the resolution to the 1 metre level i.e. 10-figure level, which in most cases is probably false accuracy if we move around to get shots and records. Nevertheless, if that is what the recorder has used then it is logged.

Location name: If a location name has not been supplied then I look the site up on Streetmap and find a logical name if I can.

Recorder: This is the name of the person who makes the post. Where I know that people use various aliases then I include these too (some people have up to four aliases).

Determiner: If a photo is published then I use my own name unless I feel it appropriate to defer to somebody else - either Ian Andrews or Joan Childs, or occasionally Gerard Pennards. Just occasionally I use the recorder's name when he/she has posted an awkward angled shot of an obvious species and almost certainly knows what they saw.
I run a separate spreadsheet for records that come in without a photo to back them up and in these cases the determiner is the name of the recorder unless there are good reasons to use another name.

Stage: Essentially male/female/larva but occasionally it is not possible to determine gender and in these cases the annotation is 'adult'. I use 'adult' for pairs in cop and make a note in the notes section that this was a pair in cop. Likewise where there is a nuptial stack of Eristalis nemorum I note the numbers of adults in the abundance column.
I run separate lines for males and females, except where there are mating pairs or nuptial stacks.

Abundance: This is my best estimate of how many individuals of the gender concerned were reported or photographed.

Source: This is the URL for the Facebook post or for the iSpot or Flickr post. If several posts depict animals from the same location on the same day then I add additional URLs separated by &.

Notes: Here I log observations by recorders and items such as flower visit records. I do not log the leaves an animal was on and make clear that the flower involved (if visited) was a potential pollination event by saying 'at x' or 'visiting y'. This column is useful for behavioural notes. Where I am unsure of the plant associated with the hoverfly I usually put 'at ???' or 'at ?Sweet Pea' or 'at umbellifer'.
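
For anyone who keeps their own records electronically, the column headings above map naturally onto a simple record structure. The Python sketch below is only an illustration: the class name, field names, recorder, determiner and URL are all placeholders of my own choosing, and the real spreadsheet may be organised differently.

```python
from dataclasses import dataclass, astuple

@dataclass
class WebRecord:
    """One spreadsheet row; field names are hypothetical, order follows the headings."""
    species: str
    date: str          # dd/mm/yyyy where the full date is known
    grid_ref: str
    location: str
    recorder: str
    determiner: str
    stage: str         # male / female / larva / adult
    abundance: int
    source: str        # one or more URLs, separated by " & "
    notes: str = ""

    def as_row(self) -> str:
        """Join the fields with tabs so the row can be pasted into a spreadsheet."""
        return "\t".join(str(value) for value in astuple(self))

record = WebRecord(
    species="Eristalis arbustorum",
    date="09/08/2016",
    grid_ref="TQ123678",
    location="Queen Elizabeth II Reservoir, West Molesey",
    recorder="A. Recorder",                  # placeholder name
    determiner="A. Determiner",              # placeholder name
    stage="adult",
    abundance=1,
    source="https://example.org/post/1",     # placeholder URL
    notes="at hogweed",
)
row = record.as_row()
```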

What is the best format for a post?

I usually copy and paste details from the FB post, so there are things that can be done to help.

A really nice example of presentation might be along the lines of:

I had an excellent day in the garden with lots of hoverflies. Here are my photos:
Episyrphus balteatus (2 photos, same specimen); Eristalis arbustorum at hogweed; Eristalis pertinax; Eristalis tenax (2 photos of separate animals); Platycheirus albimanus male and female. both visiting vipers bugloss
Not photographed:
Syritta pipiens 1 male; Volucella zonaria 1 female at buddleia
Observation details:
Queen Elizabeth II Reservoir, West Molesey
TQ123678
09/08/2016

The above example is fine - I can cut and paste the basic site details straight into the spreadsheet. I usually then copy the species list into Word and run find and replace to turn it into a list with tabs to separate numbers and observations - the above would then look like this:

Episyrphus balteatus      1    adult     2 photos, same specimen
Eristalis arbustorum      1    adult     at hogweed
Eristalis pertinax        1    adult
Eristalis tenax           1    adult     2 photos of separate animals
Platycheirus albimanus    1    male      visiting vipers bugloss
Platycheirus albimanus    1    female    visiting vipers bugloss

I can then cut and paste each column into the spreadsheet.
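
The same find-and-replace step could be scripted. The following Python sketch is a simplified stand-in for the Word routine, not a description of it: it assumes species are separated by semicolons and that the first two words of each item are the binomial name. It does not split "male and female" entries onto separate lines, and the default abundance of 1 and stage of 'adult' are assumptions that would still need checking by eye.

```python
def split_species_list(text):
    """Turn 'Genus species note; Genus species (note)' into (name, note) pairs."""
    rows = []
    for item in text.split(";"):
        words = item.strip().split(None, 2)
        if len(words) < 2:
            continue  # skip fragments that lack a full binomial name
        name = " ".join(words[:2])
        # Anything after the binomial is kept as a note, minus enclosing brackets.
        note = words[2].strip("() ") if len(words) > 2 else ""
        rows.append((name, note))
    return rows

post = ("Episyrphus balteatus (2 photos, same specimen); "
        "Eristalis arbustorum at hogweed; Eristalis pertinax")

rows = split_species_list(post)
# Tab-separated lines: name, assumed abundance, assumed stage, note.
lines = ["\t".join((name, "1", "adult", note)) for name, note in rows]
```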

Some hints on what not to do

Species names
It helps not to abbreviate the generic name - I simply have to type this so it is not a huge problem, but it does add to the job.

Grid references:
One of the biggest problems we have is grid references. Stuart has found that the advent of GPS has exacerbated rather than resolved this problem. Somewhere between 5 and 10% of all records have some sort of grid reference problem.
The worst I have to deal with are lat long data, which people submit in all sorts of permutations - they can take a fair while to sort out and make sure that the correct OS reference has been determined.

The other regular problems are:

  • people leaving off the 100km letters (e.g. TQ)
  • people using lower case - I turn these into upper case to make sure that the spreadsheet is uniform and simple to read. It is a small niggle but does add to the time involved in creating a complete record.
  • Separating the grid reference using commas, lines etc e.g. TQ28/68 or TQ28,67 etc - I have to remove these. I regularly run a find and replace to clean all of these sorts of glitches from the data.
  • people who place the location out to sea, either accidentally or deliberately. We do get the occasional marine record from oil rigs and ships, but these are usually noted in the post.
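
Several of the glitches listed above are mechanical and could be caught with a few lines of code. The Python sketch below is an assumption-laden illustration rather than the clean-up actually used: it assumes British National Grid references with a two-letter prefix and an even number of digits, and it does not check whether the letters form a real 100km square or whether the point falls out at sea.

```python
import re

def clean_grid_ref(raw):
    """Upper-case, strip separators, and check the basic OS grid reference shape."""
    ref = re.sub(r"[^A-Za-z0-9]", "", raw).upper()
    match = re.fullmatch(r"([A-Z]{2})(\d+)", ref)
    if not match:
        raise ValueError(f"missing 100km letters or stray characters: {raw!r}")
    digits = match.group(2)
    if len(digits) % 2 or not 2 <= len(digits) <= 10:
        raise ValueError(f"expected an even number of digits (2 to 10): {raw!r}")
    return ref

cleaned = [clean_grid_ref(r) for r in ("tq28/68", "TQ28,67", "tq 123 678")]
```

A reference with no letters, such as "123678", raises a `ValueError` so that it can be queried with the recorder rather than guessed at.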

I should also say that it does help to separate records from different sites onto different posts - I am a 'bear of small brain' and having a post with records from several sites does cause confusion and transcription errors.

What do the resulting data look like? 

Example section of photographic spreadsheet for 2016.