Converting NTSB database downloads
The National Transportation Safety Board (NTSB) maintains a database of aviation accidents and incidents that can be accessed by the general public on the NTSB web site. That database contains information about accidents and selected incidents within the United States, its territories and possessions, and in international waters. The database also includes events involving US-registered aircraft that occur outside US territory.
While the database is publicly accessible on the NTSB web site, users are limited in how they can work with the information. Records can be displayed on the web site, or the results of a search can be downloaded. However, the downloads come as a text or XML file rather than a file type that common spreadsheet or database programs can use directly. Users must first transform the data into a form that can be analyzed, and the NTSB provides no resources for that purpose.
Using the R statistical programming language, AirSafe.com has created two programs that convert either the text file or the XML file version of the output into a CSV file that can be used by widely available spreadsheet and data analysis programs. AirSafe.com has made the following resources available to the public: the R programs that perform the text-to-CSV and XML-to-CSV transformations, and a CSV file containing a download of the entire database of over 77,000 records from 1982 to 2015.
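To illustrate what this kind of transformation involves, here is a minimal sketch written in Python rather than R. It is not the AirSafe.com code. It assumes that the text download is pipe-delimited with a header row, and that the XML download stores each record as a `ROW` element whose attributes hold the fields. Both layouts are assumptions that should be checked against an actual download, and all function names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def text_to_rows(text, delimiter="|"):
    """Split delimited text into a header list and a list of row lists.

    Assumes the first non-blank line is the header (an assumption about
    the download layout, not something confirmed by the NTSB).
    """
    lines = [line for line in text.splitlines() if line.strip()]
    rows = [[field.strip() for field in line.split(delimiter)] for line in lines]
    return rows[0], rows[1:]


def xml_to_rows(xml_text):
    """Collect the attributes of every ROW element into a header and rows.

    Assumes each record is a ROW element with one attribute per field.
    """
    root = ET.fromstring(xml_text)
    # Compare the local tag name only, so any XML namespace is ignored.
    records = [el.attrib for el in root.iter()
               if el.tag.rsplit("}", 1)[-1] == "ROW"]
    # Build the header from attribute names in first-seen order.
    header = []
    for record in records:
        for key in record:
            if key not in header:
                header.append(key)
    # Records that lack an attribute get an empty string in that column.
    rows = [[record.get(key, "") for key in header] for record in records]
    return header, rows


def rows_to_csv(header, rows):
    """Serialize a header and rows as CSV text, quoting fields as needed."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()
```

With a file on disk, the text path could be used as `header, rows = text_to_rows(open(path).read())`, followed by writing `rows_to_csv(header, rows)` to a `.csv` file. Using the `csv` module for output matters because some fields, such as locations written as "city, state", contain commas and must be quoted.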
Who would find these resources useful?
Several kinds of groups and individuals may find some or all of the above resources useful, including the following:
- Current users of the NTSB online database who need or want to analyze the database using tools that are not provided on the NTSB web site.
- Aviation professionals or organizations of aviation professionals who want a better understanding of the historical risks associated with their professions.
- Organizations that currently rely on the summary aviation safety statistics created by the NTSB, and that may need to create customized summary statistics.
- Journalists and news media organizations.
- Data scientists or data science students who are seeking out authentic and publicly available data for teaching, training, or research purposes.
Find out more
Below are additional details, including the full report with example outputs from the NTSB database, as well as links to all the code used to make the data transformations.
- Completed analysis (HTML)
- Completed analysis (on RPubs)
- Code for converting text files
- Code for converting XML files
- Code for creating the final report
In addition to the links to the full report above, below are links to the data and other resources used to complete this analysis:
- Raw NTSB database output (text)
- Raw NTSB database output (XML)
- NTSB data dictionary for raw data
- Processed NTSB data (CSV)
- Data dictionary for processed data
http://www.airsafe.com/analyze/ntsb-database.htm -- Revised: 17 February 2016