Introduction
Since 1950, local National Weather Service offices have tracked reports of severe and hazardous weather. These reports are compiled in a large online repository, the National Centers for Environmental Information (NCEI) Storm Events Database. They serve as a record of the toll that weather takes across the U.S., logging fatalities, injuries, and damage. They are also a great way to identify events to study, and that is where my Storm Events Filter becomes relevant.
The Storm Events Filter
The Storm Events Filter is a prompt-based Python program, available on GitHub. It helps the user identify cases based on a combination of geographic, temporal, and hazard criteria. It arose from a need: I was looking to identify marginal supercells for a research project. I wanted to find records in the Storm Events database associated with hail between 1″ and 1.75″ where no tornadoes or thunderstorm wind reports of 80 knots or greater occurred within 250 km of the hail report. Putting this dataset together by combing through the database manually, or even with the assistance of AI, looked to be a daunting task. With the Storm Events Filter, this search becomes trivial.
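To make the search criteria concrete, here is a minimal sketch of the logic in plain Python. It is not the Storm Events Filter's actual code: the `Report` class, the function names, and the event-type strings are all assumptions made for illustration. It also checks distance only; any time window between reports would be a further assumption.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Report:
    """Hypothetical, simplified storm report (not the NCEI schema)."""
    event_type: str   # e.g. "Hail", "Tornado", "Thunderstorm Wind"
    magnitude: float  # inches for hail, knots for wind, 0.0 otherwise
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_disqualifying(report):
    """Any tornado, or thunderstorm wind of 80 knots or greater."""
    if report.event_type == "Tornado":
        return True
    return report.event_type == "Thunderstorm Wind" and report.magnitude >= 80.0


def marginal_hail_reports(reports, radius_km=250.0):
    """Keep 1-1.75 inch hail reports with no disqualifying report nearby."""
    blockers = [r for r in reports if is_disqualifying(r)]
    kept = []
    for r in reports:
        if r.event_type != "Hail" or not (1.0 <= r.magnitude <= 1.75):
            continue
        if any(haversine_km(r.lat, r.lon, b.lat, b.lon) <= radius_km
               for b in blockers):
            continue
        kept.append(r)
    return kept
```

Comparing every hail report against every tornado and wind report is quadratic in the number of reports, which is fine for a sketch but would likely call for a spatial index or pre-filtering by date on the full database.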

To run the Storm Events Filter, one must simply download the repository from GitHub, install a couple of commonly used Python packages, and download the raw NCEI comma-separated value (CSV) files. Once the program knows where to find the storm reports, constructing a dataset of cases is as simple as running the program and following the prompts. I provide a walk-through video for the Storm Events Filter on YouTube.
Users can search within the entire continental U.S., or by region (state-based), by sub-region (county-based), by one or more states, or by one or more counties. Storm reports can be identified based on their spatial and temporal proximity to a weather balloon sounding or a radar site. Reports can be confined to a specific range of years, months, days, or times of day. A search function scans the report text for user-supplied terms such as “supercell”, “microburst”, or specific town names.
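As a rough illustration of how year, state, and keyword filters of this kind can be combined, here is a small sketch using only the standard library. The `filter_rows` function is hypothetical and is not taken from the Storm Events Filter. The column names (`YEAR`, `STATE`, `EVENT_NARRATIVE`) are modelled on the NCEI details files but should be checked against the header of the files you download, and the sample rows are made up.

```python
import csv
import io


def filter_rows(rows, years=None, states=None, keywords=()):
    """Return rows matching every filter that was supplied.

    years    -- container of ints (e.g. a range), or None for any year
    states   -- set of upper-case state names, or None for any state
    keywords -- terms searched case-insensitively in the narrative;
                a row matches if it contains at least one of them
    """
    selected = []
    for row in rows:
        if years is not None and int(row["YEAR"]) not in years:
            continue
        if states is not None and row["STATE"].upper() not in states:
            continue
        narrative = row.get("EVENT_NARRATIVE", "").lower()
        if keywords and not any(k.lower() in narrative for k in keywords):
            continue
        selected.append(row)
    return selected


# Made-up rows standing in for a downloaded CSV file.
SAMPLE = """YEAR,STATE,EVENT_TYPE,EVENT_NARRATIVE
2011,OKLAHOMA,Hail,A supercell produced golf ball size hail.
2015,KANSAS,Thunderstorm Wind,A microburst downed several trees.
2019,OKLAHOMA,Hail,Quarter size hail fell near town.
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
hits = filter_rows(rows, years=range(2010, 2017),
                   keywords=["supercell", "microburst"])
```

Leaving a filter at its default skips that check entirely, so the same function covers anything from a single-state keyword search to a search across all years and states.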

The most robust functionality of the Storm Events Filter lies in its ability to home in on various combinations of hazard types of varying intensity. The user is prompted to select the primary hazard for which the program will scan. The user can set multiple, independent hazards, each of which may be tied to secondary hazards. For the search that I described above, the user would set the primary hazard to hail, magnitude 1-1.75″, with one secondary hazard of tornado (set to exclude) and a second secondary hazard of wind, set to exclude reports of 80 knots and above.
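One way to picture a primary hazard with attached secondary hazards is as a small declarative structure. The sketch below is an assumption about how such rules could be represented, not the program's internal design: `HazardRule`, `HazardQuery`, and the event-type strings are hypothetical names. It also assumes that a secondary hazard not marked as an exclusion must be present nearby, and that the caller has already restricted `nearby` to reports within the chosen distance.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Event = Tuple[str, float]  # (event_type, magnitude)


@dataclass
class HazardRule:
    """Hypothetical rule: an event type with an optional magnitude range."""
    event_type: str
    min_mag: Optional[float] = None
    max_mag: Optional[float] = None
    exclude: bool = False  # only meaningful for secondary hazards

    def matches(self, event_type: str, magnitude: float) -> bool:
        if event_type != self.event_type:
            return False
        if self.min_mag is not None and magnitude < self.min_mag:
            return False
        if self.max_mag is not None and magnitude > self.max_mag:
            return False
        return True


@dataclass
class HazardQuery:
    """A primary hazard plus any number of secondary hazards."""
    primary: HazardRule
    secondary: List[HazardRule] = field(default_factory=list)

    def accepts(self, candidate: Event, nearby: List[Event]) -> bool:
        if not self.primary.matches(*candidate):
            return False
        for rule in self.secondary:
            found = any(rule.matches(*event) for event in nearby)
            if rule.exclude and found:
                return False  # an excluded hazard occurred nearby
            if not rule.exclude and not found:
                return False  # a required hazard is missing
        return True


# The marginal-supercell search described above, in this representation.
query = HazardQuery(
    primary=HazardRule("Hail", min_mag=1.0, max_mag=1.75),
    secondary=[
        HazardRule("Tornado", exclude=True),
        HazardRule("Thunderstorm Wind", min_mag=80.0, exclude=True),
    ],
)
```

For example, `query.accepts(("Hail", 1.5), [("Thunderstorm Wind", 60.0)])` passes, while adding a nearby `("Tornado", 0.0)` or an 80-knot wind report causes the same hail report to be rejected.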
Applications of the Storm Events Filter
The intent behind the Storm Events Filter is to facilitate the construction of specific datasets that represent a constrained set of environmental conditions for the deep interrogation of atmospheric processes. By identifying cases with a common set of geographical and temporal bounds that produced a similar range of hazards, the filtering process limits the range of atmospheric behavior, with the goal of simplifying how researchers understand the complex interactions that give rise to weather hazards.
The Storm Events Filter lays important groundwork for an even more sophisticated tool that I’m currently developing. This new program utilizes the process-based framework described in a recent post and is intended to support both real-time forecasting and the interrogation of archived cases. I am hoping that it will be released publicly by the end of March.