Data documentation

Introduction

The data documentation is based on small JSON-LD documents, each documenting a single resource. Examples of resources can be a dataset, an instrument, a sample, etc. All resources are uniquely identified by their IRI.
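
As a rough illustration of what such a document might look like, the sketch below builds a minimal JSON-LD description of a dataset as a plain Python dict and round-trips it through the standard `json` module. It does not use the tripper.dataset API. The IRI, title and description are hypothetical values chosen for the example; only the DCAT and Dublin Core Terms namespace IRIs are real.

```python
import json

# Hypothetical IRI identifying the resource (an assumption for illustration).
DATASET_IRI = "http://example.com/datasets/sem-image-001"

# A minimal JSON-LD document describing a single resource.
doc = {
    "@context": {
        # Namespace prefixes for the DCAT and Dublin Core Terms vocabularies.
        "dcat": "http://www.w3.org/ns/dcat#",
        "dcterms": "http://purl.org/dc/terms/",
    },
    "@id": DATASET_IRI,        # the IRI that uniquely identifies the resource
    "@type": "dcat:Dataset",   # the kind of resource being documented
    "dcterms:title": "SEM image of sample 001",        # hypothetical value
    "dcterms:description": "Example dataset entry.",   # hypothetical value
}

# Serialise to JSON text and parse it back again.
serialized = json.dumps(doc, indent=2)
restored = json.loads(serialized)
```

Because the document is ordinary JSON, it can be stored, versioned and exchanged with any JSON tooling, while the `@context` gives each key an unambiguous meaning.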

The primary focus of the tripper.dataset module is to document datasets such that they are consistent with the DCAT vocabulary, while remaining easy to extend with additional semantic meaning provided by other ontologies. It is also easy to add other types of resources, like people, instruments and samples, and to relate the datasets to them.

The tripper.dataset module provides a Python API for documenting resources at all four levels of data documentation, including:

  • Cataloguing: Storing and accessing documents based on their IRI and data properties. (Addressed FAIR aspects: findability and accessibility).
  • Structural documentation: The structure of a dataset. Provided via DLite data models. (Addressed FAIR aspects: interoperability).
  • Contextual documentation: Relations between resources, i.e. linked data. Enables contextual search. (Addressed FAIR aspects: findability and reusability).
  • Semantic documentation: Describes what the resource is, using ontologies. In combination with structural documentation, it maps the properties of a data model to ontological concepts. (Addressed FAIR aspects: findability, interoperability and reusability).
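
To make the idea of contextual search more concrete, the sketch below models a triplestore as a plain Python list of (subject, predicate, object) tuples and finds all datasets related to a given resource. It is a minimal illustration, not the tripper.dataset API: the `match` and `datasets_related_to` helpers, the `http://example.com/` IRIs and the `fromSample` predicate are assumptions made up for this example.

```python
# Real vocabulary IRIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
DCTERMS_CREATOR = "http://purl.org/dc/terms/creator"

# Hypothetical namespace and predicate, used only for illustration.
EX = "http://example.com/"
FROM_SAMPLE = EX + "fromSample"

# A toy in-memory "triplestore": each entry is (subject, predicate, object).
triples = [
    (EX + "dataset1", RDF_TYPE, DCAT_DATASET),
    (EX + "dataset1", DCTERMS_CREATOR, EX + "alice"),
    (EX + "dataset1", FROM_SAMPLE, EX + "sample1"),
    (EX + "dataset2", RDF_TYPE, DCAT_DATASET),
    (EX + "dataset2", DCTERMS_CREATOR, EX + "bob"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def datasets_related_to(triples, predicate, target):
    """Return sorted IRIs of datasets linked to `target` via `predicate`."""
    datasets = {s for s, _, _ in match(triples, p=RDF_TYPE, o=DCAT_DATASET)}
    related = {s for s, _, _ in match(triples, p=predicate, o=target)}
    return sorted(datasets & related)


# Contextual search: which datasets were obtained from sample1?
found = datasets_related_to(triples, FROM_SAMPLE, EX + "sample1")
```

In this toy model, cataloguing corresponds to looking up triples by subject IRI, while contextual documentation corresponds to following the relations between resources.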

The figure below illustrates how a dataset is documented in a triplestore.

Documentation of a dataset

Resource types

The tripper.dataset module includes the following set of predefined resource types:

Future releases will support adding custom resource types.