These guidelines are based on the European Research Area (ERA) Forum stakeholders' document ‘Living guidelines on the responsible use of generative AI in research’.
Researchers at the University of Helsinki must follow the General Principles for the Use of Artificial Intelligence (on the intranet, requires login). Please read these principles before proceeding.
In addition, the guidelines outlined here do not replace, but complement, the TENK Guidelines for the Responsible Conduct of Research. GenAI is merely a tool—normal research ethics still apply, and common sense goes a long way in using such tools responsibly and ethically.
The Center for Information Technology at the University of Helsinki has also created practical guidelines for Generative AI at the University, which include guidance on using tools such as Copilot, CurreChat (‘ChatGPT for the University of Helsinki’), and Kontra (machine translation between English, Swedish, and Finnish). Those practical guidelines also offer direction on using external GenAI solutions such as ChatGPT and image-creating applications such as Midjourney. We envision new tools being provided to the University community over time.
Please note that these guidelines on the use of GenAI in research are intended for all types of generative AI, and will be updated as necessary. At a later stage, the University will consider adding to these guidelines instructions on the development of GenAI solutions.
Researchers are responsible for their own use of GenAI. When using GenAI, be aware of the limitations related to bias (the data used to train an AI model are not necessarily balanced, transparent, or ethical) and the risk of incorrect, inaccurate, or irrelevant outputs. Tools such as ChatGPT aim to produce well-written text, not necessarily factual or fact-checked text. You must therefore always verify the output of GenAI, for instance by consulting reliable sources of information, because you as the user are responsible for the contents and outputs. The University assumes responsibility for the technical elements of the tools it provides, but its responsibility is limited to providing researchers with instructions on using the tool in question.
When substantially using GenAI in your research, indicate the name of the tool and the version used, the date of use, how the tool was used, and for what step or purpose. This helps preserve the reproducibility of research and allows others to evaluate the accuracy of your results. It is also helpful in allowing the research community to share best practices.
When processing personal data, follow the GDPR and the University's data protection instructions (in Flamma, requires login), and comply with copyright legislation and confidentiality agreements. It is against copyright law to use copyrighted material, such as copyrighted publications, as input to GenAI tools. Do not use sensitive or confidential data or unpublished results as input, since there is a risk that the data will be ‘leaked’. Finally, before beginning a project using GenAI tools, a data protection impact assessment is required if you collect or use data related to individuals.
In addition, consider the consequences and assumptions of your research from the viewpoint of the ethics of AI (see, e.g., UNESCO's Recommendation on the Ethics of Artificial Intelligence). Such considerations include, for instance, the fair distribution of benefits and costs, equitable inclusion, freedom from unfair bias and discrimination, and the prevention of unjustifiable coercion, subordination, and manipulation. Security, safety, and privacy issues, as well as ethical questions related to accountability, responsible distribution, transparency, and social and other impacts (including environmental), are also important.
As a general rule of thumb: The more you rely on a GenAI tool, the more verification, control, and accountability are required from you.
You can use GenAI for brainstorming, but you cannot rely on the accuracy or novelty of the results generated by the tool. GenAI relies on the data provided when the model was trained, so you cannot expect novel, creative ideas from it. You can use GenAI to find sources of information and to help with your literature review, but the tool can only provide pointers: it is your responsibility to make sure the sources exist and are relevant to your work. GenAI can also be used to summarise or translate a given text, but remember not to feed copyrighted material into a tool. Keep in mind that GenAI tools are prone to producing incorrect or biased outputs, and they can reproduce existing works in your text; it is therefore the author's responsibility to review all outputs. Finally, if your inputs are not openly accessible materials, make sure the tool you use does not leak them.
You can use GenAI to help produce analyses, or to synthesise or summarise data, if you understand the limits and ethical implications of the tool. Be clear about how you use GenAI, and remember to note the specific tool you used, its version, the date of use, and how it was used. To allow for the reproducibility of your results, and for reporting if needed, save the inputs (i.e., prompts) and outputs. The results from a GenAI tool can be treated in the same way as research data.
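As a minimal sketch of how prompts and outputs could be saved for reproducibility, the following Python snippet records the details named above (tool, version, date, purpose, prompt, output) in a plain-text JSON Lines file. The function and field names are illustrative assumptions, not a University-mandated format, and the example values are hypothetical placeholders.

```python
# Illustrative sketch only: one way to save GenAI prompts and outputs for reproducibility.
# The field names and file layout are assumptions, not a required format.
import json
from datetime import datetime, timezone
from pathlib import Path

def log_genai_interaction(prompt: str, output: str, tool: str, version: str,
                          purpose: str, log_file: str = "genai_log.jsonl") -> None:
    """Append one prompt/output pair, with the details named above, to a JSON Lines file."""
    record = {
        "tool": tool,                                     # name of the GenAI tool
        "version": version,                               # model or tool version used
        "date": datetime.now(timezone.utc).isoformat(),   # date (and time) of use
        "purpose": purpose,                               # step or purpose the tool was used for
        "prompt": prompt,                                 # the input given to the tool
        "output": output,                                 # the output received
    }
    with Path(log_file).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Hypothetical example use:
# log_genai_interaction("Summarise this draft paragraph ...", "The paragraph argues that ...",
#                       tool="CurreChat", version="2024-05", purpose="summarising draft text")
```

A log of this kind can be stored alongside your other research data and referenced, where relevant, in the data management plan.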
Remember to reflect on the use of GenAI if it is used in your research, and ensure that its ethical implications are considered in the request for an ethical review statement.
Please check and follow the GenAI policy of the funding agency. If no GenAI policy is stated, you can always ask the agency about its use. Be transparent about your use: describe in the application how you have used GenAI; if you use GenAI in your research, outline the details of its use. If you are going to employ AI in the research you plan, remember to mention it in the methods section and, as relevant, to take that use into account in the Data Management Plan.
You can use GenAI to help prepare a questionnaire, for instance. Keep in mind that GenAI uses existing information. Thus, your questions might appear in previous work. Brainstorming and data aggregation can also be useful during this step if the issues previously mentioned are also considered. GenAI cannot be used, however, to generate your actual research data.
You can use GenAI to find and choose suitable tools for data curation or data analysis, including non-GenAI tools.
You can use GenAI to generate computer code, but you must check and test it yourself.
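As a minimal sketch of what ‘check and test it yourself’ can mean in practice, the example below pairs a hypothetical AI-generated helper function with a small test written by the researcher. Both the function and the test are illustrative assumptions, not part of these guidelines.

```python
# Illustrative sketch only: verifying AI-generated code with your own test before using it.
# The function below stands in for code a GenAI tool might produce; the test is written by you.

def mean(values: list[float]) -> float:
    """Hypothetical AI-generated helper: arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def test_mean() -> None:
    # Check ordinary input, a single value, and negative numbers.
    assert mean([1.0, 2.0, 3.0]) == 2.0
    assert mean([5.0]) == 5.0
    assert mean([-1.0, 1.0]) == 0.0

if __name__ == "__main__":
    test_mean()
    print("All checks passed.")
```

In practice, your tests should also cover the edge cases relevant to your own data and analysis, such as empty inputs or missing values.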
When reporting and publishing your research (e.g., scientific papers, conference presentations, press releases, and other talks), check for and follow the GenAI policy of the dissemination forum, if one exists. If no GenAI policy is stated, ask about the acceptable use of GenAI or examine similar publications.
In general, GenAI cannot be listed as a co-author. You can use GenAI to improve the grammar and wording of your text or to create visualisations of your data and results, but you must ensure that the outputs accurately reflect your findings. Describe in detail which GenAI tools have been used substantially in the research process and in writing. Some researchers have used GenAI to prepare a literature review with poor results (e.g., non-existent references and limited viewpoints). You can use GenAI to help with editing (such as producing LaTeX code). Note that if you use GenAI to produce text or illustrations, the output may rely on copyrighted material, and you may inadvertently commit plagiarism.
In any accompanying metadata, provide information about the use of GenAI.
GenAI tools can be used at almost any level of the research process. Therefore, you should add the necessary information on its use in the appropriate locations. The details included, however, remain up to the author’s discretion. For example, you may describe your use of GenAI in the research plan if you plan to use it to help you answer your research question or in the data management plan if you plan to use GenAI to help analyse your data.
The EU has issued a new regulation concerning artificial intelligence, the AI Act. The AI Act does not apply if an AI system is developed and put into service solely for scientific research and development purposes. However, the regulation does apply if the system is tested in real-world conditions or if you are planning to bring an AI system to market. If you have any plans to commercialise an AI tool, please contact the University's Legal Services as early as possible.
Generative AI: artificial intelligence that generates data (e.g., text or images).
Bias: skewed, incomplete, or inaccurate outputs that result from the data used to train an AI model. Training data can be skewed, for example by excluding people of colour or minority populations. As an example, the data used for ChatGPT have a language and geographical bias: they skew towards English-language texts and publications, predominantly from the US.
Data leak or leaking: data made accessible outside its intended environment, typically unintentionally or by accident.
LaTeX code: LaTeX is typesetting software in which documents are written as code; among other things, it supports customised styles, colours, and languages.
Synthesised or synthetic data: artificially generated data which attempt to imitate real data.