Tuesday, 5 December 2017

Whose data is it?

The Hoverfly Recording Scheme gets fairly regular requests for access to the dataset, mainly from University 'pollinator' groups. In general, we are happy to oblige and as a result the HRS data get used in all manner of ways. This is absolutely right - when we engage with recorders via Facebook we make it clear that the data are used in this way, so I hope nobody is in any doubt that we are assembling a dataset for wider usage and for the 'common good'.

Nevertheless, there is a fine dividing line between making data available for research purposes and simply being seen as the source of data upon which to build research proposals. Today, I got a request for access to the data from a PhD student who remarked that his PhD proposal was highly dependent upon access to the HRS dataset. Nobody had talked to us previously, so this came like a bolt out of the blue! OK, we will make the data available - after all, we are simply the custodians of the data and NOT the owners. We must not be protective apart from making sure that the data are used wisely and in the common good.

The problem I start to have is that the job of running a recording scheme for a popular group has evolved into pretty much a full-time occupation. I do not dare have a day off between about the end of February and the end of November and can anticipate putting in between 6 and 10 hours daily during the summer months. I know other schemes find it difficult to keep up with the demands on their time, so I cannot be the only one that feels the change. If running a scheme has become this demanding, there is a need to ask what motivates the recording scheme organiser and what will either:
  • motivate them to keep going; or
  • de-motivate them and lead to a loss of scheme activity?
Now, I am a well-known 'grumpy old git' and there will be those who say 'Morris is moaning again'. But, you can be sure that if I am vocal there are several others who won't put their names to what I say but will be quietly saying 'thanks Roger for saying what I don't want to say in case it affects my chances of a job/promotion/honours etc.'

I've got nothing to lose - my career has hit the rocks and in eight months' time I will be able to draw my pension so I simply have to survive until then! So, I will say what others might be more reluctant to say!

So, what motivates me?

These days, my main motivation is to try to make sure that by the time I pop my clogs there is somebody to take over from me, Stuart and everybody else. A huge investment of time and emotional capital has gone into building the HRS from a pretty shaky base into one of the biggest invertebrate datasets in the UK (and probably one of the biggest Diptera datasets in the World). That investment will be wasted if we have no successors.

The other thing that motivates me is that after all these years running the scheme (26 years now since 1991) we now have a long enough data run to start to do some nice analytical work and to publish some interesting papers. I WANT to do just that - after all, I was trained as a scientist, I have a scientist's mind and I want to do something meaningful with the data. BUT, I must remember that we are simply custodians of the data and NOT the owners.

I reckon we should be aiming to retire from the front-line of running the scheme around 2021 (30 years tenure) and I would like to think that by then we will have produced a decent run of papers; but to do so we must pull our fingers out (that means me!)

And what de-motivates me?

I have to say that I have become increasingly frustrated to get the impression that recording schemes are looked upon by all and sundry as a source of free data. That starts with the biodiversity industry that is always looking for new ways to increase the volume of biodiversity data without stopping to think about who will compile it, verify it, generate the enthusiasm amongst recorders, validate records and extract records. In practice, it has meant that an awful lot of schemes have turned from a private passion into an Albatross - you cannot drop it without there being dire consequences for something that you have invested half your life in (well almost) but if you don't drop it you have to invest even more because the demands are increasing.

To then find that the academic World sees us as simply a data resource, builds PhD or other grant bids based on access to the data we compile, but does not bother to talk to us first is somewhat irksome to say the least. To then see papers emerging in which the data come from us but the credits go to the academics is deeply frustrating. It is of course 'Citizen Science' - that great unwashed with no scientific expertise providing the great scientists with the material to produce their latest papers.

I also become increasingly demoralised to encounter ever-increasing attacks on anybody who has the temerity to post a photograph of a preserved specimen or to talk about specimens as 'material'. Why should I have to spend part of my time defending the collection of data that is the only facility available to show mankind the folly of our actions? It is as if the worst part of mankind is that which lacks morality and is actually prepared to generate reliable and meaningful data. Far better to rant at Governments without reliable data and then rant because you've been shot down for lack of legally admissible information!

And the moral of the story?

I think it is time that the agenda changed from 'how do we motivate recorders to produce more data' to 'how do we maintain and improve the morale of the people that keep the recording schemes going?'

Rant over - but hopefully it sparks a meaningful debate!


  1. If it's taking up so much of your time, and you always allow anybody and everybody to use the data, why not just publish the whole lot on NBN? Then those who want it can download it without bothering you.

    If the interest is in parameters that NBN doesn't store, it could still be made available online - needn't even be set up as a database - just a zipped CSV file download.

    If you are going to keep the personal touch, I'd suggest you start asking for a voluntary contribution towards the running of the recording group, at least from those unconnected with recording. Needn't be much, but people appreciate things more if they've paid for them!

    Malcolm Storey

    1. Malcolm, you will see my replies below - yet again you seem to have concluded that I'm an imbecile that has not got a clue!

  2. We are in the process of validating the data before it goes back on the NBN Malcolm. It was on the NBN as a dataset to 2005 but we have a MASSIVE job going through and cleaning the data so that it is reliable. Just 'going and publishing on the NBN' is oh so simple isn't it! Actually it is not. And, I was not complaining about the time it takes to deal with these requests - the time involved is the time it takes to extract and validate records plus running the scheme from an administrative perspective. Please don't think I am creating obstacles by simply not using technologies - that is the techie view - it is there so there is no problem: somebody still has to make the system work and it is the same people who do all the other jobs. ALSO, you might not be aware that access to the NBN is for non-academic usage - academics still have to seek permission from schemes to download the data - it is not a 'free to use service'.

  3. Oh and it sounds simple to start charging - well actually it is anything but - for a start you then need a dedicated bank account, auditors, accounts etc. That is another tier of bureaucracy that takes up the time (and possibly money) of the scheme organiser - perhaps it is time you took on organising and running a big scheme!