FAQ: Data catalogue

Frequently asked questions about the University of Helsinki Data catalogue, metadata and preservation of research data.

What is considered research data?

Research data may include: 

  • Raw data, which consists of unprocessed information from experiments, sensors, or interviews.
  • Processed data, which has been cleaned or analysed.
  • Methods and protocols, which document how data is collected and processed. 

In the context of the University of Helsinki Data catalogue, the term “research data” includes all of these and many more:

  • Code and software, such as scripts and programs used for analysis.
  • Source data, referring to external datasets used in research.
  • Simulation and modelling data, generated from computational models and simulations.
  • Images and visualisations, such as medical scans and charts that support analysis.
  • Audio and video data, including interviews and field recordings.
  • Survey and interview data, consisting of responses and transcripts.
  • Spreadsheets and statistical data, which store numerical information.
  • Metadata, which provides descriptive details about datasets.

How can I search for research data in Data catalogue?

The research data described in Data catalogue can be found through Data catalogue's own interface and through search engines. In addition, metadata records from Data catalogue are exported to other metadata services, such as Research.fi.

  • Data catalogue: You can search for datasets with the quick search (free-text search) or by exploring the content of the various repositories. The Data catalogue repository structure supports browse-type searching. You can also refine your search with filters, such as year of publication, creator, subject, and data availability.
     
  • Search engines: Data from Data catalogue can be found, for example, through Google. It is therefore important to provide a rich description of your research data, as this will also improve its discoverability.

How can I publish my research data in Data catalogue?

We recommend that you publish your research data in a suitable repository (read more in the FAQ for data preservation: Where can I publish my data?). If it is a repository from which we automatically import data into Data catalogue, your data will appear there after the next transfer. Just remember to enter the University of Helsinki as your affiliation (organisation information).

If the repository in question is not harvested into Data catalogue, or if the data cannot be published, for example because it is sensitive or contains trade secrets, you can submit its metadata to Data catalogue using the form on the right.

Can I describe accumulating data in Data catalogue?

Yes. You can also update the metadata later if necessary.

Can I describe a larger collection of data in Data catalogue?

When describing the data, it is a good idea to divide it into meaningful sets. What these are depends on the case, but data relating to a single research article or project, for example, often form a clear unit. A set of data can also be formed according to the type of data, for example, the data produced by a single measuring instrument over a certain period of time. The more precisely the different parts of a dataset are described, the easier they are to find and reuse. Datasets such as "All data from our research team 2012-2015" are unlikely to be very useful, but if this is the only way to describe the data, it is still better than not describing the data at all.

My data contains sensitive information. Can I publish its metadata in Data catalogue?

Yes, you can! When you describe your data in Data catalogue, you are also complying with the research funders' requirement to follow the FAIR principles.

Although the metadata do not include the data itself, please take care not to accidentally disclose sensitive information when describing the data. For example, if you have interviewed people from two universities whose location has been anonymised due to the sensitivity of the topic, the location must not be disclosed in the metadata either. In this case, it would be a good idea to refer to the universities in the same way as in the research article, e.g., Uni1 and Uni2. If you are wondering how to describe sensitive data, feel free to ask us for help!

There is no filter for faculty in Data catalogue. Why?

Some of the metadata records entered in Data catalogue contain information on the faculty where the data was produced. You can find this information in the keyword field. By clicking on the name of the faculty, you will get a list of metadata records that have the same faculty information. You can also search for datasets by faculty name using Data catalogue's search function.

Most research data repositories do not contain the faculty information. Most of Data catalogue's content is automatically imported from such data repositories. If Data catalogue had a filter for faculty, it would only be able to filter data from those metadata entries that have information about the faculty. The result would give only a small fraction of the data produced by the faculty and found in Data catalogue, thus misleading the user. Because of this, there is no filter for faculty in Data catalogue.
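The effect can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the records, the field names `produced_by` and `faculty_keyword`, and the filter function are invented and do not describe how Data catalogue is implemented.

```python
# Illustrative sketch with invented records. "produced_by" is the faculty that
# actually produced the data (not known to the catalogue); "faculty_keyword" is
# what the metadata record contains, which is empty for most harvested records.
records = [
    {"title": "Dataset A", "produced_by": "Faculty of Science", "faculty_keyword": "Faculty of Science"},
    {"title": "Dataset B", "produced_by": "Faculty of Science", "faculty_keyword": None},
    {"title": "Dataset C", "produced_by": "Faculty of Science", "faculty_keyword": None},
    {"title": "Dataset D", "produced_by": "Faculty of Arts", "faculty_keyword": None},
]


def filter_by_faculty(items, faculty):
    """A hypothetical faculty filter: it can only match records that carry the keyword."""
    return [r["title"] for r in items if r["faculty_keyword"] == faculty]


def actually_produced_by(items, faculty):
    """Ground truth for comparison; a real catalogue has no access to this."""
    return [r["title"] for r in items if r["produced_by"] == faculty]


shown = filter_by_faculty(records, "Faculty of Science")
actual = actually_produced_by(records, "Faculty of Science")
print(f"Filter shows {len(shown)} of {len(actual)} datasets")
```

In this made-up example the filter returns one dataset although three were produced by the faculty, which is the kind of incomplete result a faculty filter would give.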

Will Data catalogue automatically generate a new DOI for my dataset? How can I avoid unnecessary DOIs?

If you manually describe the research data in Data catalogue, you will get a DOI for the metadata, even if the dataset itself already has an existing DOI. The DOI assigned by Data catalogue refers to the metadata (in Data catalogue), not, for example, to the data published in a repository.

The dataset I have published in Zenodo is not in Data catalogue. Why?

Let's start with the most common reason: did you enter the University of Helsinki as your affiliation (organisation information)? Since Zenodo has publications from a number of different universities, our data transfer relies on this information.

If the affiliation information is in order (and there are no typos!), check the following: did you save the data recently? We transfer data from Zenodo at certain intervals, so please wait a while. If the data still does not appear in Data catalogue, please send us an email (datasupport@helsinki.fi).
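To show why a missing or misspelled affiliation keeps a dataset out of the transfer, here is a minimal sketch of an affiliation-based selection. It is not the actual harvesting logic of Data catalogue: the function name, the matching rule (a case-insensitive substring match) and the sample records are assumptions for illustration. The `creators` list with `name` and `affiliation` fields follows the general shape of Zenodo record metadata.

```python
# Hypothetical sketch: select records whose creators list the University of Helsinki.
# The real matching rule is not documented here; this version assumes a
# case-insensitive comparison after trimming whitespace.

TARGET_AFFILIATION = "university of helsinki"


def has_target_affiliation(record: dict) -> bool:
    """Return True if any creator's affiliation mentions the target organisation."""
    creators = record.get("metadata", {}).get("creators", [])
    for creator in creators:
        affiliation = (creator.get("affiliation") or "").strip().lower()
        if TARGET_AFFILIATION in affiliation:
            return True
    return False


# Made-up sample records for illustration only.
records = [
    {"metadata": {"title": "Dataset A", "creators": [
        {"name": "Researcher, One", "affiliation": "University of Helsinki"}]}},
    {"metadata": {"title": "Dataset B", "creators": [
        {"name": "Researcher, Two", "affiliation": "Univrsity of Helsinki"}]}},  # typo
    {"metadata": {"title": "Dataset C", "creators": [
        {"name": "Researcher, Three"}]}},  # affiliation missing
]

harvested = [r["metadata"]["title"] for r in records if has_target_affiliation(r)]
print(harvested)
```

Under these assumptions only "Dataset A" is selected: the record with the typo and the record without an affiliation are both skipped.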

I am not listed as a creator of the data I have produced, even though the data can be found in Data catalogue. What can I do?

The problem may be that the automatic harvesting of metadata, i.e. the data transfer from the repository, was not fully successful. Please let us know (datasupport@helsinki.fi) and we will correct the problem. Please indicate in your message which metadata record is in question.

I have requested data via the "Request access" button in Data catalogue but have not heard anything for a week. Who should I contact next?

Please email us at datasupport@helsinki.fi.

Can I delete my research metadata in Data catalogue?

In principle, nothing is deleted from Data catalogue, as the aim is to keep the metadata of the research data for as long as possible. However, metadata can be edited. If necessary, please contact Data Support (datasupport@helsinki.fi).

Can I propose a new data repository to be harvested in Data catalogue?

Yes, you can! You can suggest it using the form on the right. We'll investigate every suggestion, but please note that we might not be able to harvest your favourite repository, because not all repositories' metadata standards match our technical requirements for metadata. For example, they might lack the affiliation information, in which case we can’t separate data produced at the University of Helsinki from data produced elsewhere.

Why is my data not visible in my Tuhat profile?

Currently, information from Data catalogue is not automatically transferred to Tuhat, and you cannot add information there yourself. We are exploring the possibility of adding information about the produced data to Tuhat profiles. In the meantime, we recommend adding information about published data to ORCID.