Image: Smolenskarkivet, Riksarkivet

Published: 2021-01-19

Enabling research – research data for future use

FEATURE Join in on a conversation with Professor Per Ambrosiani from the Department of Language Studies and Sanna Isabel Ulfsparre, librarian at the Umeå University Library. The conversation covers what counts as data, and how you make research data accessible. How does it feel to make data accessible to the world and how does it work?

Text: Susanne Sjöberg, Sanna Isabel Ulfsparre

Per Ambrosiani, professor of Russian, has since 2014 worked primarily with Docent Elisabeth Löfstrand at Stockholm University to make seventeenth-century Russian documents in the so-called Smolensk Archives accessible. The work has been funded by Riksbankens jubileumsfond, an independent foundation that promotes and supports research in the humanities and social sciences. The aim was to make the material accessible to a wider audience by building a digital catalogue at the Swedish National Archives in Stockholm.

The documents themselves have an interesting history: they were produced during the siege and takeover of the city of Smolensk in 1609–1611, were then brought to Poland as a war trophy, and were later moved, again as a war trophy, to Sweden in the middle of the 1600s. The documents are also important from a language-history perspective, since the 1,000 sheets spread across 75 scrolls were written in a Russian handwriting style characteristic of the period, called skoropis’.

Such cohesive material is interesting as research data in several domains, and a considerable number of the researchers who could find it useful are active beyond the borders of Sweden. Per Ambrosiani contacted the Department of Scholarly Communication at the Umeå University Library in order to make the database accessible through as many channels as possible and in the best ways possible. Once the connection with the library was established, discussions also started on how to make the research material available in a way that makes it useful for future research.

“Already from the start of the project, the objective was to make a digital catalogue available at the Swedish National Archives,” says Per Ambrosiani. “We hope that it can be used by more disciplines and result in further research projects in the future. Both historians and philologists can be interested in the material, but in varying ways.”

Many of the documents have been transcribed by the project participants in order to make them understandable also to people who do not read skoropis’.

“Those who can’t read the texts themselves have to trust that we have transcribed them correctly,” says Per Ambrosiani and continues: “But it’s worth noting that it’s difficult to transcribe texts of this character with full precision. The texts are sometimes rather difficult to read, and they are written using linguistic conventions that have changed over the past centuries. This is why we have also made the texts accessible as scanned copies so that those who can read skoropis’ can view the originals and make their own assessments.”

On the question of how it feels when your data becomes openly accessible to people beyond the project, he answers:

“It feels good. At the same time, it makes you vulnerable as you may reveal ‘your incompetence’. Nevertheless, academia is used to this, and you get the same sensation every time you publish your work. It’s just something you have to get used to.”

What is research data?

It can be difficult to define what counts as research data when you are tasked with organising complex material with huge potential. Organising and describing the material to the highest standard is key to making it findable, understandable and, if possible, reusable by others.

“Raw material, photos, transcriptions, process descriptions, metadata – anything that can be studied is to be regarded as potential research data when describing such resources and making them accessible,” says Sanna Isabel Ulfsparre. “In that work, the focus must be on others being able to find the material, regardless of whether they work in the same domain as the person making the material accessible or in a completely different field.”

What is important is primarily that researchers get access to a comprehensive description of the material and a chance to get acquainted with what type of research data is available. Sometimes that is sufficient, whereas at other times, a researcher needs access to the raw material. For those purposes, a description of how to access the resources also needs to be available.

For various reasons, it is not always possible to make all material accessible to the same extent as the contents of the Smolensk Archives. Sometimes this depends on the format, and at other times it is due to ethical considerations – when the material contains sensitive personal data, for instance. The challenge is to organise data so that it is structured, searchable and lucid, and to describe every relevant aspect in a comprehensible way. There are established ways of achieving this, but doing so usually requires specialist skills that lie outside the researchers’ own area of focus.

“We are not librarians or archivists, so we don’t automatically know how to make our data accessible in the best way possible. But those who possess that skill usually don’t know how this type of document is to be read and interpreted. Combining these skills is when we’re most successful,” says Per Ambrosiani.

Disseminating research data

One of the main areas of development in research support is building infrastructures and standardised ways of making research data accessible and disseminating it. The development is driven by research policy targets and by funding bodies’ expectations that as much research output as possible is made as accessible as possible. Research data is increasingly counted as valuable research output in itself, whereas the focus has previously been on publications such as articles and books.

One objective is to make research data FAIR – that is to say, Findable when searched, Accessible (at least by detailed metadata), Interoperable with other data and between software and systems, and Reusable. In Sweden, this is carried out by registering research data in the national research data catalogue with the Swedish National Data Service (SND).

“Researchers can register their databases with SND. You’re the person who knows your material best and can describe it in detail in the fields provided in their service, Doris. When the registration is complete, it will be submitted to us at the library to be checked. Therefore, we may contact you with follow-up questions, or if we notice that something is missing,” says Sanna Isabel Ulfsparre.

When the registration is finally published, a permanent link, a DOI (Digital Object Identifier), is also created. A DOI is a link that persists over time. This makes it possible, for instance, to link research data to the publications that report results based on it, and for others to refer to the research data in their own work.
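As a rough illustration of what such a permanent link looks like in practice, the sketch below turns a DOI into a resolvable link through the public doi.org resolver and stores that link in a publication record. The DOI `10.1234/example-dataset`, the record fields and the function name `doi_to_url` are placeholders invented for this example; they do not describe the Smolensk Archives registration or the fields used by SND.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a DOI into a persistent link via the doi.org resolver."""
    doi = doi.strip()
    # Accept a few common spellings and reduce them to the bare DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # A DOI starts with "10." and has a prefix and a suffix separated by "/".
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    return DOI_RESOLVER + quote(doi, safe="/")


# Hypothetical records: a dataset and a publication that refers to it.
dataset = {"title": "Example dataset (placeholder)", "doi": "10.1234/example-dataset"}
publication = {
    "title": "Example article (placeholder)",
    "related_dataset": doi_to_url(dataset["doi"]),
}

print(publication["related_dataset"])
```

Because the publication stores the persistent link rather than the dataset's current web address, the reference can stay valid even if the dataset is later moved to another storage location.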

One of the objectives of the infrastructures now being built is to increase the spread of research data, partly by making metadata machine-readable, and partly by letting other databases fetch and display that metadata. Primary data can, if need be, remain stable and hidden in a secure storage site, because the crown jewel in the dissemination process is the metadata description.

A catalogue like the national research data catalogue is hence a source of information for databases and researchers across the world. Something similar is already in place for other publications. For instance, other databases gather metadata about our university’s publications from the DiVA database. In that way, publications become searchable in many places, but only need to be registered once.
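The idea of registering once and being found in many places can be sketched in a few lines. The example below is a simplified illustration, not how SND or DiVA actually exchange metadata: the record fields, the class `DatasetRecord` and the functions `export_records` and `harvest` are assumptions made up for this sketch. It shows a source catalogue exporting its records as machine-readable JSON and two other catalogues fetching that same export.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class DatasetRecord:
    """A hypothetical, minimal metadata description of a dataset."""
    identifier: str          # persistent link, e.g. a DOI link
    title: str
    creators: list
    keywords: list = field(default_factory=list)
    access: str = "open"     # the data may be restricted while the metadata stays open


def export_records(records):
    """Serialise records to JSON so that other systems can read them."""
    return json.dumps([asdict(r) for r in records], ensure_ascii=False)


def harvest(exported_json, catalogue):
    """Copy exported records into another catalogue, keyed by identifier."""
    items = json.loads(exported_json)
    for item in items:
        catalogue[item["identifier"]] = item
    return len(items)


# The record is registered once, in the source catalogue (placeholder values).
source = [
    DatasetRecord(
        identifier="https://doi.org/10.1234/example-dataset",
        title="Example dataset (placeholder)",
        creators=["Example Researcher"],
        keywords=["example"],
    )
]

exported = export_records(source)

# Two independent catalogues fetch the same description.
catalogue_a, catalogue_b = {}, {}
harvest(exported, catalogue_a)
harvest(exported, catalogue_b)

print(sorted(catalogue_a) == sorted(catalogue_b))
```

Keying the harvested records by their persistent identifier means that a repeated harvest updates an existing entry instead of creating a duplicate.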

By using permanent links, we can link research data with related publications and other related research data and disseminate the knowledge of it in a way reminiscent of citations and references to other research publications. For the Smolensk Archives this means an opportunity for the database to be kept active.

“We who have been working in the project are hoping that the Swedish National Archives can take over the responsibility so that it can live on. When new publications that refer to the material appear, we would like these publications to be visible also in connection with the database,” says Per Ambrosiani.

“This type of update of new publications and datasets can take place through connections via metadata in research data catalogues, and through publication platforms. In that way, you don’t need to go into the research database and make edits yourself,” says Sanna Isabel Ulfsparre.

The SND research data catalogue is built with a European context in mind, focusing primarily on interoperability with EU infrastructures.

“There is a rather small group of researchers in Sweden interested in this material,” says Per Ambrosiani. “That makes it even more important to reach out, so that everyone finds out that this material is available now.”

The Smolensk Archives is also interesting for researchers in Russia, and the Russian historian Adrian Selin at the National Research University Higher School of Economics in Saint Petersburg has also made a very important contribution to the work with the database.

“When several researchers from various countries collaborate in a project like this, you can use each other’s local knowledge to find good ways of spreading the data,” says Sanna Isabel Ulfsparre.

“In future, we may get a global system where databases fetch metadata from each other, but we’re not quite there yet,” she says.

The future of research data

Everything indicates that publishing and making research data accessible is here to stay. What does this mean for the future of researchers?

“Young researchers want this in their qualifications. Publishing has become a currency, and this is also a form of publication,” says Per Ambrosiani. “At the same time, making use of and organising research data has rarely been seen as important from a promotional perspective.”

“I believe that when funding bodies start to demand openness, publishing research data openly will become a useful qualification. But it’s not a straight road ahead, and it’s a change of culture that takes time,” says Sanna Isabel Ulfsparre. “We have also noticed that publishers are starting to request links between articles and research data descriptions.”

“If the ongoing project can find ways of organising research data in a conscious and well-structured manner, it reduces the strain of transferring information to a metadata description once the time is ripe for publication. Researchers need to process their research data during the project anyway. So the trick is to be conscious and methodical about it. A data management plan is a good help in getting this done in a consistent way over time.”

Link to the database in the Swedish National Archives

Information about how the material may be used and spread

The text in this article is published open access.

You are required to include the following license information, including links, when making use of the material:

"Enabling research – research data for future use" by Susanne Sjöberg and Sanna Isabel Ulfsparre is licensed under CC BY 2.0

If changes are made in any way, make sure that this is clearly stated in connection to the license information.

Parts of the material not explicitly mentioned in this information are not included in the license.