
Hans Pfeiffenberger on Trust, Reliability & Quality

Wednesday, February 16th, 2011

Hans Pfeiffenberger on Riding the Wave

The edifice of scientific knowledge must be preserved for the long term, so it needs solid foundations. Today's knowledge is conveyed through articles, papers and events, as well as datasets, which raises the question of whether datasets are any less reliable than books and articles. Are there sound reasons for treating them differently? The EC-commissioned report Riding the Wave – How Europe can gain from the rising tide of scientific data (October 2010) lists challenges related to trust that need to be overcome:

  • How can we make informed judgements about whether certain data are authentic and can be trusted?
  • How can we judge which repositories we can trust?
  • How can appropriate access to and use of resources be granted or controlled?

On data publication and access, the report asks:

  • How can data producers be rewarded for publishing data?
  • How can we know who has deposited what data and who is re-using them – or who has the right to access data that are restricted in some way?
  • How do we deal with the various ‘filters’ that different disciplines use when choosing and describing data? And what about differences in these attitudes within a discipline, or from one time to another?

Sample of initiatives

The International Polar Year is a good example both of the generation of significant amounts of data and of the issues at stake. Its mission is to take a data snapshot of the polar caps for re-use in decades to come, bringing together 50,000 participants from 63 nations with a budget of roughly €1 billion.

DataCite is a Digital Object Identifier (DOI) registration agency for research data. It is now considering asking data repositories for some form of certification, i.e. evidence that a repository is a professional organisation with a policy that enables it to deliver permanently on its technological promises. In this way, DataCite will foster global interoperability on one specific policy issue.
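To make the DOI machinery concrete, here is a minimal sketch in Python (assuming the third-party requests library; the DOI shown is purely illustrative) of retrieving the metadata that DataCite records for a dataset DOI through its public REST API, which is what makes registered datasets citable and discoverable in a uniform way:

    # Minimal sketch: look up a dataset DOI via the public DataCite REST API.
    # The DOI below is illustrative; substitute any DataCite-registered DOI.
    import requests

    doi = "10.5438/0012"
    resp = requests.get(f"https://api.datacite.org/dois/{doi}",
                        headers={"Accept": "application/vnd.api+json"})
    resp.raise_for_status()

    attrs = resp.json()["data"]["attributes"]
    print(attrs.get("titles"))     # dataset title(s)
    print(attrs.get("publisher"))  # publishing repository
    print(attrs.get("url"))        # landing page the DOI resolves to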

The ICSU World Data System is about global interoperability on a number of policy issues:

  • Long-term availability: handing data over when a repository defaults would be so much easier if DOIs were employed
  • Open Access, which makes things so much easier; but what about data on endangered species, or social science data?
  • It operates by accreditation and is considering certification, though which certification remains an open question

The Situation Today

There are lots of data repositories today…

  • Most operate as projects, on a best-effort basis
  • They are highly incompatible with regard to, e.g., (access) protocols, supported formats, content quality (QA, granularity, and so on), and rights/licensing
  • Interoperability at a global scale is therefore hard, if not impossible
  • Integrating data is problematic (high- and low-quality data should not be mixed)
  • Trust in long-term availability is lacking

Digital Data Library = a data repository with a policy.

Conclusions
The most important elements for the stability of the knowledge architecture of science are:

  • Quality of each building block: quality assurance, encoding of quality indicators, provenance
  • Persistent availability and accessibility of each block: handover/mirroring and persistent IDs
  • Checksums, to verify that each block remains intact (see the fixity-check sketch below)
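Checksums are the simplest machinery behind such integrity guarantees. As a minimal sketch (the choice of SHA-256 and the file name are illustrative assumptions, not details from the talk), a repository can record a digest when data are deposited and recompute it during later audits to detect silent corruption:

    import hashlib

    def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
        """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # Hypothetical usage: record the digest at deposit time ...
    recorded = sha256_of("dataset.nc")
    # ... and recompute on every audit; a mismatch signals corruption.
    assert sha256_of("dataset.nc") == recorded, "fixity check failed"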

The talk is available on the dedicated web page.
