Under the Public Records Act 2005, all government departments are required to create and maintain full and accurate records of their business. Traditionally, these records were kept as paper documents, but many public records are now born-digital and held only in electronic formats.
Datasets, including official statistics, make up an unknown proportion of the born-digital public records currently in existence. As a category of public record, they are not well understood when it comes to maintenance and preservation. Among other things, datasets:
• contain dense stores of information in tabular form, which often provide a very precise picture of an aspect of the nation at a point in time
• require large amounts of technical metadata to make them usable
• are normally created and stored in file formats such as SPSS, Stata and SAS, for which preservation tools and software packages are not readily available
• often have complex confidentiality issues that affect how they may be used.
The status of digital datasets across the Official Statistics System (OSS) is currently unclear, and no high-level stocktake has been done to assess the number and size of datasets. The file formats and media used to store them are unknown, as is the quality of the metadata that describes them.
To tackle these and other issues, Archives New Zealand is collaborating with Statistics NZ to plan the future direction for the preservation of and access to all public record datasets.
Euan Cochrane from Statistics NZ is on a two-month secondment at Archives New Zealand and is working with Evelyn Wareham, Manager, Digital Sustainability Programme, on the public sector datasets project.
“Through this project, Archives NZ and Statistics NZ will have an information base for further discussions about roles and responsibilities in the long-term retention and preservation of datasets across the public sector,” Evelyn says.
“We will provide recommendations for government agencies to consider when deciding the next steps for the preservation of their digital datasets.”
While on secondment, Euan is researching how other organisations around the world preserve datasets and interviewing six government agencies to gather information about the datasets they hold.
“This research will enable us to develop a high-level view of the current issues around maintaining long-term access to statistical datasets, datasets in business systems and research datasets across the OSS and the public sector,” says Euan.
“It will also help us identify issues that need to be addressed for datasets to be successfully retained in the long term.”
The research will also enable the production of a ‘typology’ of datasets, including definitions of the different types of dataset, and an analysis of proposals for the next steps in preserving datasets and keeping them usable in the future.
A fact sheet with useful tips on preserving, and providing future access to, public record datasets will also be produced for all government agencies to follow.
For more information, contact evelyn.wareham@archives.govt.nz.