What does provenance mean?
Provenance is the history of how the data was created and modified. Provenance information should record, for example, any modifications or corrections made to the data, whether the data was split into parts, and whether it was combined with other datasets.
Data provenance information can include the following:
Data Creation & Source Information
- Origin: Who created or collected the data? (e.g., researcher, institution, automated system)
- Collection Date & Time: When was the data collected/generated?
- Data Sources: If the dataset is derived from other sources, list them with citations.
Data Processing & Transformation
- Processing Steps: What modifications, cleaning, or transformations were applied?
- Software & Tools: Any tools, scripts, or software used for data processing (including versions).
- Intermediate Data: If applicable, describe intermediate datasets created before the final version.
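One possible way to capture processing steps, tools, and intermediate data is to append a small record to a log each time a step runs. The sketch below is only an illustration: the function name `record_step`, the field names, and the file and script names are all hypothetical, not part of any standard.

```python
from datetime import datetime, timezone


def record_step(log, description, tool, tool_version, inputs, outputs):
    """Append one processing step to a provenance log (a plain list of dicts)."""
    log.append({
        "description": description,          # what was done to the data
        "tool": tool,                        # script or software used
        "tool_version": tool_version,        # version of that tool
        "inputs": list(inputs),              # datasets read by this step
        "outputs": list(outputs),            # datasets written by this step
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return log


# Hypothetical two-step pipeline; cleaned.csv is the intermediate dataset.
steps = []
record_step(steps, "Removed rows with missing values", "clean.py", "1.2.0",
            inputs=["raw.csv"], outputs=["cleaned.csv"])
record_step(steps, "Aggregated to monthly means", "aggregate.py", "0.4.1",
            inputs=["cleaned.csv"], outputs=["monthly.csv"])
```

Because each step lists its inputs and outputs, the chain from the raw data through the intermediate dataset to the final version can be traced back from the log.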
Data Contributors & Roles
- Roles & Responsibilities: Define contributions, e.g., who curated, analyzed, or published the data.
Data Changes
- Version Number: Identify the version of the dataset (e.g., v1.0, v2.1).
- Change History: Document modifications, corrections, or updates to the dataset.
- Timestamps for Changes: When were updates made?
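The items above could be combined into a single machine-readable record stored alongside the dataset. The following is a minimal sketch under stated assumptions: the class names `ProvenanceRecord` and `Change`, the field names, and all example values are hypothetical and do not follow any particular metadata standard.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class Change:
    version: str       # version that this change produced
    description: str   # what was modified, corrected, or updated
    timestamp: str     # when the change was made (ISO 8601)


@dataclass
class ProvenanceRecord:
    creator: str                                       # who created or collected the data
    collected_on: str                                  # collection date
    sources: list = field(default_factory=list)        # cited source datasets
    contributors: dict = field(default_factory=dict)   # name -> role
    version: str = "v1.0"
    changes: list = field(default_factory=list)        # change history

    def add_change(self, version, description, timestamp=None):
        """Record a change and move the dataset to the new version."""
        ts = timestamp or datetime.now(timezone.utc).isoformat()
        self.changes.append(Change(version, description, ts))
        self.version = version


# Hypothetical example values.
record = ProvenanceRecord(
    creator="Example Institute",
    collected_on="2024-03-01",
    sources=["Example upstream dataset (hypothetical citation)"],
    contributors={"A. Researcher": "curation"},
)
record.add_change("v1.1", "Corrected unit error in one column",
                  "2024-04-15T09:00:00+00:00")

# Serialize so the record can be stored next to the data files.
provenance_json = json.dumps(asdict(record), indent=2)
```

Keeping the version number, the change description, and the timestamp together in one entry makes it easier to see which update produced which version.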