This document describes our policy on open data.
The policy also includes our Code of Conduct for those who wish to use our data.
This document also describes how we utilise data that is already open and available.
This document is for anyone interested in how we produce, publish and use open data.
This is the first issue of this policy. It was first published in July 2015.
Feedback should be provided via firstname.lastname@example.org
This policy is owned by OCVA (Oxfordshire Community & Voluntary Action).
This policy is published under Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
This policy is also available for reuse via GitHub, courtesy of MACC
Company / Charity number(s): 1108504 / 5363946
Registered office: The Old Courthouse, Floyds’ Row, Oxford, OX1 1SS
What is open data and why are we working with it?
We subscribe to the following definition of open data:
Open data is data that can be freely used, reused and redistributed by anyone.
It is subject only, at most, to the requirement to attribute and sharealike.
Open data are the building blocks of open knowledge.
Open knowledge is what open data becomes when it’s useful, usable and used.
We are committed to a strong community and voluntary sector in Oxfordshire. We believe that open data can assist with this ambition by making our actions more transparent.
When publishing data about our work, we have three main aims:
In this context, we aim to operate an open data policy that is robust and practical, leading to greater engagement in the issues faced by the community and voluntary sector in Oxfordshire.
What is meant by open data at OCVA?
When considering open data, we commit to the following:
When making our data openly available, we maintain a set of expectations, known as our Code of Conduct. If you utilise open data published by us, we request that you consider this.
In our work with voluntary sector organisations across Oxfordshire, we will provide advice and guidance in terms of publishing and using open data.
This policy provides information relevant to each of these commitments. We describe the key aspects, actions and mechanisms that we use to deliver our open data policy.
How will we publish open data?
The act of providing open data is to publish and share. We understand that this involves responsibility and due diligence.
When we publish data openly, our aim is to ensure it is of a quality to be accessed, used and understood. In doing so, we place the following expectations on our data publication.
Data that is made openly available is often poorly structured, out of date or otherwise of low quality. This policy is intended to mitigate such problems through the criteria below.
|Principle:||Our open data will respect privacy.|
|Best practice:||We will always ensure our open data is free from identifiers that could be linked to an individual person or organisation. We do this by ensuring our open data set only contains data on organisations who have explicitly consented to our publicly sharing their data. We also do this by ensuring that contact information is not included. Map points are included. These are based on postcodes. Postcodes themselves are not included in the data. Upon providing explicit consent, organisations are advised to act with caution regarding where on a map their group shows up (is pinned).|
|Principle:||Our open data will be comprehensive for the subject.|
|Best practice:||We will always quality assure our data, in terms of the level of completeness and readiness for publication. We will not knowingly publish data that is incomplete for the relevant focus and/or time period.|
|Principle:||Our open data will be relevant and succinct for the subject.|
|Best practice:||We will always consider the size, scope and spread of our data, to make it useful for those who may want to access it. We will not publish open data that is overly large, or that lacks the logic, lookups or additional materials needed to use it.|
|Principle:||Our open data will be interoperable.|
|Best practice:||We will not publish data that involves jargon or acronyms that are not documented.|
|Principle:||Our open data will be presented in an open and standard format.|
|Best practice:||We will publish data in common, accessible and standard formats such as CSV and XML. We will not publish open data in bespoke, redundant or proprietary formats.|
We are discussing a shared standard for peer organisations across the UK. (This includes MACC and NAVCA.)
|Principle:||Our open data will be appropriately licensed.|
|Best practice:||We will always issue an open licence with our open datasets. Our default is a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) licence. We will not publish data that is subject to a restrictive licence.|
|See also:||Open licence|
|Principle:||Our open data will always be well documented.|
|Best practice:||We will always provide notes and guidance to accompany our datasets. We will not purposefully provide data that is poorly described, or requires sector knowledge to comprehend.|
|Principle:||We will publish data in open, accessible and consistent ways.|
|Best practice:||We will always publish data using a consistent method, making it accessible to all. We will not publish data with passwords or access restrictions, or in places that are not signposted.|
|Principle:||Our open data will be timely.|
|Best practice:||We will always provide regular and timely updates to relevant open datasets. We will not miss updates to our relevant datasets, or let our data go “stale”.|
|Principle:||We welcome feedback and discussion about our open data.|
|Best practice:||We will always make it clear how to provide feedback on our open data, and any resultant actions. We will not publish data without a feedback mechanism.|
|See also:||Contact us|
How will we utilise open data?
Alongside publishing our own datasets, we wish to take an active role in the analysis and discussion of the insights that can be gleaned.
For this reason, we will establish the following as part of our open data policy.
Alongside the formal notes and documentation on our datasets, we will post to our main website newsblog, providing narrative and information on data being released and updated.
Any data visualisations or progress with our open data tool project will also be shared on our main website newsblog.
When we utilise other datasets in our commentary and analysis, we will always provide clear attribution and guidance as to the source of the data, and any actions we may have taken. Any tools we build will also provide clear attribution.
What do we expect from those who use our data?
We encourage others to access, use and discuss our open data. We strive towards a strong community and voluntary sector in Oxfordshire, and value the contributions and insights that can be gleaned through use of data.
When doing so, we would hope the following basic Code of Conduct is observed:
Many of the datasets we publish are succinct and easily available for download. When accessing our data, we request that you do not place unnecessary burden on our servers by making repeated data requests over a short period of time.
When using our data, we request that our licence is observed.
When producing any material that uses our data, please ensure an attribution to OCVA is included.
When making use of our data, always state any calculation or analysis steps you have applied that are not present in the source.
When using our data, you must not:
We encourage discussion of our data, and the uses of it. In doing so, we request you are respectful of others.
If you spot any mistakes, errors or points for clarification, please provide feedback via our designated channels.
We also encourage requests and ideas for new data that we may publish. Again, please do so via our feedback channels.
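As an illustration of the request above not to make repeated data requests over a short period of time, a script that downloads our data could pause between requests. The following is a minimal Python sketch, not something we provide or require. The class name `PoliteThrottle` and the five-second default gap are assumptions made for the example; this policy does not set a specific limit.

```python
import time


class PoliteThrottle:
    """Enforce a minimum gap between successive requests.

    The 5 second default is an assumption for illustration only.
    """

    def __init__(self, min_interval_seconds=5.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval_seconds
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._last_request = None

    def wait(self):
        """Pause if the previous request was too recent.

        Returns the number of seconds spent waiting.
        """
        waited = 0.0
        if self._last_request is not None:
            elapsed = self._clock() - self._last_request
            remaining = self.min_interval - elapsed
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last_request = self._clock()
        return waited
```

Calling `wait()` immediately before each download keeps requests at least the chosen interval apart.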
The following workflows and checklists are used by our staff in the preparation, publication and update of open data. These are linked to our open data best practices, detailed in our open data policy. Over time, we will update and enhance these workflows.
When preparing any data for publication, we would always undertake the following:
|Does the data contain names of individuals?||If yes, then remove|
|Does the data contain any unique identifiers that can be used to retrieve personal information from external systems?||If yes, then remove (postcodes are used to create latitude/longitude points. Those opting in to sharing their data in our data set are encouraged to review what the map point for their organisation should be.)|
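The privacy checks above could be applied to a single record along the following lines. This is a minimal Python sketch only: the field names (`contact_name`, `email`, `phone`, `postcode`) and the postcode lookup are hypothetical, as this policy does not define a schema or name the tools used.

```python
# Hypothetical column names; the policy does not define a schema.
PERSONAL_FIELDS = {"contact_name", "email", "phone"}


def prepare_record(record, postcode_to_point):
    """Return a copy of `record` with personal fields removed.

    The postcode itself is never copied across. Where the lookup knows
    the postcode, a latitude/longitude map point is added instead.
    """
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in PERSONAL_FIELDS and key != "postcode"
    }
    point = postcode_to_point.get(record.get("postcode"))
    if point is not None:
        cleaned["latitude"], cleaned["longitude"] = point
    return cleaned


# Made-up postcode and coordinates, for illustration only.
lookup = {"XX1 1XX": (51.0, -1.0)}
raw = {
    "name": "Example Group",
    "contact_name": "A. Person",
    "email": "a.person@example.org",
    "postcode": "XX1 1XX",
}
print(prepare_record(raw, lookup))
```

The sketch removes fields by name rather than by inspecting values, so it only helps once a person has reviewed which columns hold personal information.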
When preparing data for publication, we would always undertake the following:
|For any dataset, consider the overall physical file size||If over 10MB, then check contents and consider further segmentation|
|For any dataset, check the column headers and data labels are legible.||If not, provide lookup file and note in data release table|
|For aggregated datasets, check that aggregations are explained and logged.||Ensure these are documented in data release table|
|For any dataset, check that time periods used are in accordance with common standards (eg: financial quarters, calendar months)||If there is a bespoke date range, then detail in data release table|
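The first check in the table above, on physical file size, is simple to automate. The following Python sketch assumes that 10MB means ten million bytes; the function name `needs_segmentation_review` is hypothetical and is not part of any OCVA tooling.

```python
import os

# 10MB taken as decimal megabytes; this interpretation is an assumption.
SIZE_LIMIT_BYTES = 10 * 1000 * 1000


def needs_segmentation_review(path, limit=SIZE_LIMIT_BYTES):
    """Return True when the file at `path` is larger than `limit` bytes."""
    return os.path.getsize(path) > limit
```

A result of `True` only flags the file for a person to check its contents and consider further segmentation, as the checklist describes; it does not split the file.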
When preparing data for publication, we will always consider the following:
|For geographic areas used within datasets, provide the code alongside the name.||Applicable to: Ward|
|Provide and/or signpost data users to the latest lookup of any codes used.||In the case of administrative geographic regions, refer to authoritative sources such as Ordnance Survey, Office for National Statistics and the NHS.|
|When using internal / OCVA specific codes, ensure that a lookup and/or explanation is provided.||Log this in the data release table.|
When preparing data for publication, we will always undertake the following:
|For spreadsheets and tabular data, release in standard open formats.||Release as: OpenDocument Format for spreadsheets (.ods); Comma-Separated Values for flat files (.csv)|
|When working with other data standards and systems, ensure that the format is open and accessible.||Consider XML, JSON or RDF formats as open. Check with standard or publication organisation.|
|Avoid publishing data in closed or proprietary formats, or in any format that makes the data inaccessible.|
When publishing data, we will always ensure a relevant licence is provided.
Our default licence is a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
|Does the dataset fit within the default licence for OCVA?||If yes, then ensure this is stated in the data release web page / documentation. If no, then select an alternative, and document accordingly.|
|Does the data contain any information that is derived from other sources?||If yes, then detail these sources in the data release table. If there may be an issue with these derivations, then seek advice.|
When publishing – and updating – data, we will always seek to ensure that this is done in a timely manner.
|For datasets that are updated periodically, ensure this takes place within an acceptable timeframe.||If data publication is outside of these thresholds, update/add to data release table.|
|Ensure that relevant older data can be accessed after an update – that it is not deleted or destroyed.||For ongoing statistics, ensure the new time period data is made available alongside other periods. Where data must be overwritten, document in data release table.|
When publishing data openly, we will always check the following:
|When creating data files, use the file naming convention detailed in the data standard used.||See openvcs.github.io for the latest guidance. For example, in openvcs-v02-ocva.csv, ‘openvcs’ identifies the data format, ‘v02’ identifies the version of openvcs in use (version 0.2 in this case) and ‘ocva’ identifies the name of the publishing organisation (here OCVA, Oxfordshire Community & Voluntary Action). We suggest not putting the date in the file name, because that would change the file name with each release. The date of the data export can instead be given on the related web page, or in the text of the link to the file.|
|When hosting data files, always ensure that the end URL is accessible, and free from any security barriers, passwords or blocks.||If there is an issue in terms of accessing the URL to the data file, seek advice.|
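The file naming convention in the checklist above can be checked programmatically. The Python sketch below is based only on the single example openvcs-v02-ocva.csv; it assumes the version is always written as two digits (major, then minor) and that names are lower case. Those assumptions, and the helper names `build_filename` and `parse_filename`, are ours, so openvcs.github.io remains the authority.

```python
import re

# Pattern inferred from the one example in this policy (an assumption).
FILENAME_PATTERN = re.compile(
    r"^(?P<format>[a-z]+)-v(?P<version>\d{2})-(?P<org>[a-z0-9]+)\.csv$"
)


def build_filename(org, major, minor, data_format="openvcs"):
    """Build a name such as 'openvcs-v02-ocva.csv' (no date included)."""
    name = f"{data_format}-v{major}{minor}-{org.lower()}.csv"
    if FILENAME_PATTERN.match(name) is None:
        raise ValueError(f"cannot build a valid file name from: {name!r}")
    return name


def parse_filename(name):
    """Split a file name into its format, version and organisation."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised file name: {name!r}")
    digits = match.group("version")
    return {
        "format": match.group("format"),
        "version": f"{digits[0]}.{digits[1]}",
        "org": match.group("org"),
    }


print(build_filename("OCVA", 0, 2))
print(parse_filename("openvcs-v02-ocva.csv"))
```

Because the pattern has no place for a date, a name with a date appended is rejected, which matches the suggestion to keep the file name stable between releases.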
When seeking, collecting and receiving feedback on our data, we will always consider the following:
|When an “Issue” is posted via email, respond accordingly – even if no acknowledgement.||Where the issue can be progressed, respond accordingly. If no action can be applied immediately, respond accordingly.|
|When feedback is received via settings such as face-to-face meetings or workshops, consider how best to add this to existing feedback.||Where appropriate, create a new Issue for the relevant dataset, attributing the source of the observation / remark.|
|When comments are made about our usage of open data, respond according to the Code of Conduct.|
When using datasets published by other organisations, we will always provide the following attribution details within any material we produce, unless the use is within a data tool we build. (For details on how usage is attributed within our tools, please refer to the user or publication notes for those tools.)
|Name of the dataset utilised||eg: Adult Learning Centres|
|Publisher of dataset||eg: Oxfordshire County Council|
|Source URL (from where the data can be retrieved)||eg: oxfordshire.org/open or ocva.org.uk/open|
|Notes on usage||Any notes on actions undertaken that result in the source data being changed or modified.|
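The four attribution items above map naturally onto a small record. The following Python sketch shows one possible way to hold them and render a line of attribution text. The class name `Attribution`, its field names and the wording of the output are assumptions for illustration, not a prescribed format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribution:
    """The attribution items listed in the table above."""

    dataset_name: str
    publisher: str
    source_url: str
    usage_notes: str = ""

    def as_text(self):
        """Render the attribution as a single line of text."""
        text = (
            f"Source: {self.dataset_name}, published by "
            f"{self.publisher} ({self.source_url})."
        )
        if self.usage_notes:
            text += f" Notes on usage: {self.usage_notes}"
        return text


# Values taken from the examples given in the table above.
example = Attribution(
    dataset_name="Adult Learning Centres",
    publisher="Oxfordshire County Council",
    source_url="oxfordshire.org/open",
)
print(example.as_text())
```

Keeping the usage notes optional reflects that they only apply when the source data has been changed or modified.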
|Groups data||Aggregate statistics on groups registered with OCVA||Sample first published July 2015.||Quarterly|
|Funding data||Details of funding provided by us to other organisations||Not yet available.||Quarterly|