In the Spring Budget 2020, the Chancellor of the Exchequer, Rishi Sunak, unveiled a commitment to improve data sharing across government. The Data Standards Authority has since released guidance on how government should share metadata using a set of open standards. This short read looks at the guidance, unpicks its meaning and considers what it could mean in the future.
In early August, over three months after the Spring Budget, the Data Standards Authority (DSA) released its first guidance on how government should share and publish metadata. The first directive encouraged civil servants to use Dublin Core schemas when sharing metadata across government. Dublin Core schemas are specifications that describe the structure and composition of metadata in a formal, machine-readable language. The Dublin Core Metadata Initiative currently supports schemas in two languages, XML Schema (XMLS) and RDF Schema (RDFS). These set out how metadata should be stored in documents of each type so that organisations can conform to them, which makes understanding and interpreting the metadata a simpler and quicker task.
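To make this concrete, the sketch below builds a small metadata record in XML using element names from the Dublin Core element set (title, creator, date, format, language). The `metadata` wrapper element, the `build_dc_record` helper and all the field values are assumptions made for illustration. They are not taken from the DSA guidance, and a real record would follow whichever schema the publishing organisation has adopted.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return a wrapper element with one dc:* child per (term, value) pair.

    The <metadata> wrapper is a hypothetical container chosen for this sketch.
    """
    root = ET.Element("metadata")
    for term, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{term}")
        child.text = value
    return root


# Illustrative values only; not a real government dataset.
record = build_dc_record({
    "title": "Road traffic counts 2019",
    "creator": "Example Department",
    "date": "2020-08-01",
    "format": "text/csv",
    "language": "en",
})

dc_xml = ET.tostring(record, encoding="unicode")
print(dc_xml)
```

Because every organisation uses the same element names, a recipient can read the title or date of a dataset without first learning the sender's internal conventions.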
For publishing data on gov.uk or data.gov.uk, the guidance for civil servants is different. By following the schema.org Dataset schema, government officials can describe datasets in a consistent, predictable way, allowing both users and search engines to find, use and understand what a dataset holds. Schema.org is a collaborative community initiative whose mission is to “create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond.” Schema.org vocabularies are maintained by an active open source community through GitHub, which makes them easier for developers to use and understand.
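As an illustration of what such a description can look like, the sketch below assembles a schema.org `Dataset` description as JSON-LD, the form commonly embedded in a web page for search engines to read. The property names (`name`, `description`, `license`, `publisher`, `distribution`) come from the schema.org vocabulary. The `dataset_jsonld` helper, the example.org URLs and the dataset details are hypothetical, and the guidance itself may ask for more or different properties.

```python
import json


def dataset_jsonld(name, description, url, publisher, licence_url, download_url):
    """Build a minimal schema.org Dataset description as a Python dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": licence_url,
        "publisher": {"@type": "Organization", "name": publisher},
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": "text/csv",
                "contentUrl": download_url,
            }
        ],
    }


# Illustrative values only; the URLs are placeholders.
doc = dataset_jsonld(
    name="Road traffic counts 2019",
    description="Daily vehicle counts at fixed roadside sites.",
    url="https://example.org/datasets/road-traffic-counts-2019",
    publisher="Example Department",
    licence_url="https://example.org/licence",
    download_url="https://example.org/datasets/counts.csv",
)

jsonld_text = json.dumps(doc, indent=2)

# JSON-LD is typically embedded in a page inside a script element like this.
print(f'<script type="application/ld+json">\n{jsonld_text}\n</script>')
```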
The final set of guidance published in early August covered how to use metadata to describe CSV data. CSV (comma-separated values) is a very common plain-text format for tabular data on the web, in which each line stores one data record. The DSA recommends that government employees use CSVW (CSV on the Web) to describe data stored in CSV format. CSVW is a World Wide Web Consortium (W3C) standard that uses metadata to describe the contents of tabular data, such as column names and data types. According to the DSA, this standard will improve the way CSV files are accessed and understood: the metadata lets users merge more than one CSV file and load data into a data store, where it is easier to carry out queries and analysis. Another benefit of CSVW is that it reduces the number of mistakes made when software auto-detects column types, which should save the government both time and money.
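The point about column types can be sketched in a few lines. Below, a small metadata document in the shape CSVW uses (a `tableSchema` with a list of `columns`, each carrying a `datatype`) tells a reader how to interpret each column, so nothing has to be guessed. This is a deliberately simplified reader written for illustration, not a full CSVW processor. The file name, column names, data and the `read_typed` helper are all assumptions, and only three datatypes are handled.

```python
import csv
import io
from datetime import date

# Hypothetical CSVW-style metadata describing a file called counts.csv.
metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "counts.csv",
    "tableSchema": {
        "columns": [
            {"name": "site_id", "datatype": "string"},
            {"name": "count_date", "datatype": "date"},
            {"name": "vehicles", "datatype": "integer"},
        ]
    },
}

# Only the datatypes used in this sketch are mapped to Python converters.
CONVERTERS = {
    "string": str,
    "integer": int,
    "date": date.fromisoformat,
}


def read_typed(csv_text, meta):
    """Parse CSV text, converting each cell using the declared datatype."""
    columns = meta["tableSchema"]["columns"]
    types = {col["name"]: CONVERTERS[col["datatype"]] for col in columns}
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{key: types[key](value) for key, value in row.items()} for row in reader]


# Illustrative data: site identifiers have leading zeros that must be kept.
csv_text = (
    "site_id,count_date,vehicles\n"
    "007,2019-06-01,1523\n"
    "042,2019-06-02,1498\n"
)

rows = read_typed(csv_text, metadata)
print(rows[0])
```

Software that guesses types might read the `site_id` value `007` as the number 7 and lose the leading zeros. Declaring the column as a string in the metadata avoids that kind of mistake.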
This was the first major piece of guidance published by the DSA, which, according to its charter, aims to make data sharing easier and more effective across government. The metadata guidance is therefore likely to be only the beginning, with more robust and explicit direction to come. For businesses, following the advice given by the DSA may make it easier to share data with government and to publish data on government websites. It could also ease the transition to working with government and thereby save both time and money in the future, placing businesses that are in line with government at a competitive advantage over those that are not. So make that small change and look at the advice, as it might benefit you not only now, but in the future.
Sources: gov.uk, ukauthority.com, schema.org, dublincore.org, howtogeek.com. All accessed [02/09/2020]