CareSet Collects COVID Collaborators
Last night, HHS released COVID-19 capacity data at the hospital level. Previously, this data had only been available at the state level.
CareSet had early access to the hospital data in order to help create the Facility COVID PUF Community Frequently Asked Questions document. It is available on our GitHub page. If you have questions about the new capacity data, please create an issue on that GitHub repository. We will answer any questions we can. If there are questions we cannot answer, we will route them through the data team at the White House COVID Task Force, who, along with HHS and the CDC, maintain the HHS Protect database where this data is stored.
State vs. Hospital level data
Previously, with only state-level data, it was not possible to see which specific hospitals were being hit hardest by COVID-19. For instance, imagine a state with five large hospitals. Statewide, roughly 50% of ICU beds might be filled. A pattern like the following shows the problem with data aggregated at this level:
- Hospital A: 50 ICU Beds, 10 ICU beds filled: 20% capacity
- Hospital B: 50 ICU Beds, 20 ICU beds filled: 40% capacity
- Hospital C: 20 ICU Beds, 10 ICU beds filled: 50% capacity
- Hospital D: 30 ICU Beds, 30 ICU beds filled: 100% capacity
- Hospital E: 10 ICU Beds, 15 “ICU” beds filled: 150% capacity
In this example, the fact that several hospitals had spare capacity did not prevent two hospitals from being overwhelmed. In short, the previous versions of hospital capacity data showed forests, not specific trees. One hospital can easily be in a very different situation from the other hospitals in its city or state.
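The arithmetic behind this example can be checked with a short Python sketch. The figures are the hypothetical ones from the list above, not real HHS data, and the names used here (`hospitals`, `occupancy`, `at_or_over`) are illustrative only.

```python
# Hypothetical figures from the example list above: (ICU beds, ICU beds filled).
hospitals = {
    "A": (50, 10),
    "B": (50, 20),
    "C": (20, 10),
    "D": (30, 30),
    "E": (10, 15),
}


def occupancy(beds, filled):
    """Return filled beds as a fraction of available beds."""
    return filled / beds


# State-level view: pool every hospital's beds before dividing.
total_beds = sum(beds for beds, _ in hospitals.values())
total_filled = sum(filled for _, filled in hospitals.values())
state_occupancy = occupancy(total_beds, total_filled)

# Hospital-level view: compute the same ratio for each facility.
per_hospital = {
    name: occupancy(beds, filled) for name, (beds, filled) in hospitals.items()
}

# Hospitals that are full or over their stated ICU capacity.
at_or_over = [name for name, rate in per_hospital.items() if rate >= 1.0]

print(f"State-level: {state_occupancy:.0%}")
for name, rate in per_hospital.items():
    print(f"Hospital {name}: {rate:.0%}")
print("At or over capacity:", at_or_over)
```

The pooled figure comes out at 85 of 160 beds, or about 53%, while the per-hospital figures show that Hospitals D and E are at or over capacity. The pooled number hides exactly the facilities that need attention.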
Making an FAQ and checking it twice
In advance of the release, our focus was primarily on error checking the new dataset. We checked that variables had reasonable names and that the data structure made sense. As we did this, we and the other organizations that served as beta testers of this data release created the FAQ. We would especially like to thank researchers from the University of Minnesota COVID-19 Hospitalization Tracking Project and volunteers from COVID Exit Strategy for their help.
Developing the FAQ was our focus, but we also wanted to show one quick data visualization. The video above was created by Aaron Qin and Almir Mavliutov, two of our data quality engineers. It is a good example of the kind of data analysis that is not possible with aggregate data. The analysis, a “bar chart race”, shows the number of confirmed COVID patients at hospitals in several Texas cities over time. With data at this granular level, it is easy to see when a city gets “hit” with COVID.