Agencies considering the adoption of open data standards should involve a trusted standard development organization with permanence, customer focus, and sufficient user involvement

This paper analyzes several case studies of data standard development and management, recommends a set of best practices from them, and evaluates how implementable the best practices are with a new data standard creation project.

Date Posted
03/04/2019
Identifier
2019-L00867

Making Open Transportation Data Useful and Accessible: Recommendations for Good Practices in Open Data Standards Management

Summary Information

Open data policies and advocacy have resulted in an explosion in the number of available datasets across a variety of sectors. Some groups in the open data movement have moved beyond the initial push to get as much data online as quickly as possible. Rather than measuring success by increased access alone, they measure it by how open data is used to answer questions and solve particular problems.

Going one step further, researchers have been investigating not just whether open data is used, but by whom, and have suggested that a new digital divide is emerging between those who have the skills and infrastructure to make use of the data for their benefit and those who do not. These researchers have suggested that, in addition to the data itself, quality documentation, technology access, and technical assistance must also be free and open in order to empower those not already empowered.

Effective data standards are critical for ensuring that open data is useful and accessible to its intended audience. Historically, however, standards-making organizations have used processes that are not sufficiently agile, accessible, or relevant to the variety and quantity of data being introduced by the multitude of recent open government data policies and initiatives.

Lessons Learned

Develop and manage standards through a trusted source with permanence, customer focus, and sufficient user involvement.


  • Additional accredited standard development organizations (SDOs) are needed to cover the areas of expertise where open data is expected to be released. Also needed is a directory of SDOs by domain to help prevent standards from being redeveloped. If no SDO is available, develop the standard within the context of the project but keep its base organization separate, be prepared to be flexible, and involve users’ input.



Leverage existing data vocabularies.

  • Though the existing data standards that the industry has coalesced around may not be perfect, try to reuse existing data vocabulary and software by extending a standard rather than creating a new one.



Size the standard development process for the audience.

  • Overly rigorous and detailed processes and standards documentation will likely deter all but the most technical and determined individuals in niche work. Instead, use tools and terminology that are already familiar to the industry.



Evaluate the standard at the right pace and use rigorous methods.

  • Use semantic versioning and be explicit about pre-releases. Use version control tools such as GitHub to write, manage, and collaborate on the data standard without much initial setup time.
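
As a rough illustration of what semantic versioning with explicit pre-releases can look like for a data standard, the sketch below parses version strings and orders them so that a pre-release such as 1.0.0-rc.1 ranks below the final 1.0.0. The function names are hypothetical, and the pattern is a simplified reading of the Semantic Versioning rules (for example, it does not reject leading zeros), not a complete implementation.

```python
import re

# Simplified pattern for MAJOR.MINOR.PATCH[-prerelease][+build].
_SEMVER = re.compile(
    r"(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+[0-9A-Za-z.-]+)?"
)


def parse_version(text):
    """Return (major, minor, patch, prerelease-or-None) for a version string."""
    match = _SEMVER.fullmatch(text)
    if match is None:
        raise ValueError(f"not a semantic version: {text!r}")
    major, minor, patch, pre = match.groups()
    return int(major), int(minor), int(patch), pre


def sort_key(text):
    """Key that orders version strings by semantic-versioning precedence."""
    major, minor, patch, pre = parse_version(text)
    if pre is None:
        # A final release ranks above every pre-release of the same number.
        pre_key = (1, ())
    else:
        # Numeric identifiers compare as integers and rank below text ones.
        pre_key = (0, tuple(
            (0, int(part), "") if part.isdigit() else (1, 0, part)
            for part in pre.split(".")
        ))
    return (major, minor, patch, pre_key)


def is_prerelease(text):
    """True when the version carries a pre-release tag such as -beta.2."""
    return parse_version(text)[3] is not None


def is_breaking_change(old, new):
    """A major-version bump signals a backward-incompatible change."""
    return parse_version(new)[0] > parse_version(old)[0]


releases = ["1.0.0", "1.0.0-rc.1", "1.0.0-beta.2", "0.9.0"]
ordered = sorted(releases, key=sort_key)
```

Publishing draft versions of a standard with a pre-release tag lets data producers tell at a glance that fields may still change before the final release.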

Limit unnecessary tools and libraries.

  • File formats offer many options that improve data access and compressibility but may not be very accessible to your audience. Oftentimes, a basic format such as a CSV-based standard can meet everyone’s needs and be more accessible to users.
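
To make the accessibility point concrete, the following minimal sketch reads a small CSV file using only the Python standard library, with no third-party tools. The column names and sample rows are invented for illustration and do not come from any particular standard.

```python
import csv
import io

# Hypothetical sample data; a real file would be opened with open(path, newline="").
SAMPLE = """stop_id,stop_name,stop_lat,stop_lon
S1,Main St & 1st Ave,47.6097,-122.3331
S2,"Union Station, Bay 2",47.5984,-122.3301
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows(SAMPLE)
for row in rows:
    # Every value arrives as a string; convert explicitly where numbers are needed.
    print(row["stop_id"], row["stop_name"], float(row["stop_lat"]))
```

The same file also opens directly in a spreadsheet program, which is part of what makes a CSV-based standard approachable for less technical users.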



Diligently document the standard and process.

  • Use Readme files that include version, date updated, authors, changelog, known issues, and that indicate which fields are mandatory or optional.





Balance flexibility while limiting vocabulary dispersion.

  • Too much flexibility in mandatory fields results in widely varying data and limits interoperability. Specify different fields as mandatory for different applications, choosing the sets that support the broadest number of uses.
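
One way to picture application-specific mandatory fields is a small core set that every record must carry, plus an additional set per application. The sketch below assumes hypothetical field and profile names chosen only for illustration.

```python
# Hypothetical field names; a real standard would define its own.
CORE_FIELDS = {"route_id", "stop_id"}

# Extra mandatory fields for each (hypothetical) application profile.
PROFILE_FIELDS = {
    "trip_planning": {"arrival_time", "departure_time"},
    "accessibility": {"wheelchair_boarding"},
}


def required_fields(profile):
    """Core fields plus the fields the given application needs."""
    return CORE_FIELDS | PROFILE_FIELDS[profile]


def missing_fields(record, profile):
    """Return the mandatory fields that are absent or empty in a record."""
    return sorted(
        name for name in required_fields(profile)
        if record.get(name) in (None, "")
    )


record = {
    "route_id": "10",
    "stop_id": "S1",
    "arrival_time": "08:15:00",
    "departure_time": "",
}
trip_planning_gaps = missing_fields(record, "trip_planning")
accessibility_gaps = missing_fields(record, "accessibility")
```

Keeping the core set small preserves interoperability across publishers, while the per-application sets state exactly what each use requires.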



Develop structure and tools that catch and limit errors.

  • Limiting the number of times something is duplicated reduces the opportunity for conflict and limits the number of places where data must be edited. Using number-string lookups for common elements saves small amounts of space and input time but makes it harder for readers to spot errors when they are just looking over the data. Include a validator for the standard that users can test against to reduce incoming errors.
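
A validator along these lines could be sketched as follows. It assumes two hypothetical tables, one that defines each stop exactly once and one that refers to stops by identifier, and it uses readable text values rather than numeric codes for a controlled vocabulary. All table, field, and value names are illustrative assumptions.

```python
# Readable vocabulary values are easier to proofread than numeric codes.
ALLOWED_BOARDING = {"accessible", "not_accessible", "unknown"}


def validate(stops, stop_times):
    """Return a list of human-readable error messages (empty when valid)."""
    errors = []
    known_ids = set()

    for row_number, stop in enumerate(stops, start=1):
        stop_id = stop.get("stop_id") or ""
        if not stop_id:
            errors.append(f"stops row {row_number}: missing stop_id")
            continue
        if stop_id in known_ids:
            # Each stop should be defined once so edits happen in one place.
            errors.append(f"stops row {row_number}: duplicate stop_id {stop_id!r}")
        known_ids.add(stop_id)
        boarding = stop.get("boarding")
        if boarding not in ALLOWED_BOARDING:
            errors.append(
                f"stops row {row_number}: unknown boarding value {boarding!r}"
            )

    for row_number, visit in enumerate(stop_times, start=1):
        stop_id = visit.get("stop_id")
        if stop_id not in known_ids:
            # References must point at a stop defined in the stops table.
            errors.append(
                f"stop_times row {row_number}: stop_id {stop_id!r} is not defined in stops"
            )
    return errors


stops = [
    {"stop_id": "S1", "boarding": "accessible"},
    {"stop_id": "S1", "boarding": "accessible"},
    {"stop_id": "S2", "boarding": "2"},
]
stop_times = [{"stop_id": "S1"}, {"stop_id": "S9"}]
problems = validate(stops, stop_times)
```

Messages that name the table and row let data producers fix problems before publishing, instead of leaving consumers to discover them later.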



Promote your standard to make sure the industry knows about it.

  • No data standard is useful in isolation. Use websites, listservs, conferences and other media resources to make sure people know about your standard and why it was created.