Methodology

Background

The UCLA Law COVID Behind Bars Data Project, launched in March 2020, tracks the spread and impact of the novel coronavirus in American carceral facilities and advocates for greater transparency and accountability around the pandemic response of the carceral system.

Our team of more than 100 staff and volunteer researchers gathers and presents data about COVID-19 in prisons, jails, youth facilities, and immigration detention centers across the United States. We also collect information about pandemic-related prison and jail releases, legal filings and court orders bearing on the safety of incarcerated people, and grassroots organizing campaigns and fundraisers.

Since the beginning of the pandemic, the U.S. public’s ability to assess the extent and impact of viral spread has been limited by shortcomings in data reporting. Data concerning America’s carceral institutions — which confine over two million people — is particularly hard to come by, and this opacity costs lives. The crowding and subpar healthcare systems in carceral facilities make them hotspots for viral spread, and the people who work and are incarcerated in them do not have the option to socially distance.

Epidemiological data reported in these settings are limited, often vaguely defined, and difficult to compare between jurisdictions. To save lives and prevent the potential long-term health impacts of COVID-19 infection, advocates and policymakers need data to demonstrate and respond to the urgency of this public health crisis as well as the need for significant decarceration to limit the spread and protect those who are medically vulnerable.

To make publicly accessible the limited data that carceral agencies report, the UCLA Law COVID Behind Bars Data Project collects and centralizes detailed, facility-level data on COVID-19 infections and related deaths among incarcerated people and staff in the U.S. The project also gathers other data, such as population levels in carceral facilities, which are critical to contextualizing COVID-19 infection numbers and do not exist elsewhere in a unified and contemporaneous dataset.

Overview of our core dataset

Using custom web scrapers, we collect data twice weekly from all 50 state carceral agencies, the District of Columbia’s Department of Corrections, the Federal Bureau of Prisons (BOP), and U.S. Immigration and Customs Enforcement (ICE), as well as from several county jail systems, youth detention centers, and state psychiatric hospitals.

Our core dataset includes:

  • the cumulative number of COVID-19 infections among both incarcerated people and staff;
  • the number of active COVID-19 infections among both incarcerated people and staff;
  • the cumulative number of COVID-19-related deaths among both incarcerated people and staff;
  • the cumulative number of COVID-19 tests administered to incarcerated people;
  • and the cumulative number of incarcerated people and staff who have received COVID-19 vaccinations.

However, carceral agencies vary dramatically in what they report publicly. For example, as of June 2021, the West Virginia Division of Corrections and Rehabilitation reports data for all core variables for both populations, while the Mississippi Department of Corrections reports only cumulative and active cases among incarcerated people.

Further, we aim to collect and report these data at the facility level, where available. However, jurisdictions vary in whether they report data disaggregated by facility. For data availability for state and federal agencies, please see our Data Reporting & Quality Scorecard.

When data are not available publicly, we make every effort to obtain missing information through original public records requests. In some cases, we also partner with other organizations that gather data directly from agencies. We make our data available on GitHub. Our scraper production code is also available on GitHub, along with behindbarstools, an R package our team has been developing that includes a variety of functions to help pull, clean, wrangle, and visualize our data.

Core variables: definitions and considerations

Cumulative Cases

  • Residents.Confirmed: The number of incarcerated people who have ever tested positive for COVID-19.
  • Staff.Confirmed: The number of staff who have ever tested positive for COVID-19.
  • Considerations: While agencies generally report “cumulative cases” as the number of individuals who were ever at a facility who have ever been infected with COVID-19, some agencies, such as the Ohio Department of Rehabilitation and Correction and the Federal Bureau of Prisons, report “cumulative cases” as the number of individuals currently incarcerated who have been infected with COVID-19. Therefore, the reported values can decrease as people are transferred or released. Another important consideration is that, though testing data are not always made available, we know that testing practices vary widely by carceral agency. As a result, true case counts are likely higher than reported, and the extent of this underdetection is extremely variable. Finally, not all agencies report data on staff COVID-19 cases. Some jurisdictions leave it to staff members’ discretion whether to report positive test results they receive from community healthcare providers. As a result, the number of staff cases reported may be lower even than the number detected by testing.
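
Because some agencies count only currently incarcerated people, a "cumulative" series can fall from one collection date to the next. The following is a minimal sketch of one way a data user might flag such drops before analysis. The function name and the sample values are hypothetical and are not part of our scrapers or of behindbarstools.

```python
from typing import List, Optional, Tuple


def find_cumulative_decreases(
    counts: List[Optional[int]],
) -> List[Tuple[int, int, int]]:
    """Return (index, previous_value, current_value) for each drop in a series.

    None marks a collection date with no reported value and is skipped.
    """
    drops = []
    previous = None
    for i, value in enumerate(counts):
        if value is None:
            continue
        if previous is not None and value < previous:
            drops.append((i, previous, value))
        previous = value
    return drops


# Hypothetical Residents.Confirmed values for one facility over six dates.
series = [10, 14, 14, None, 12, 15]
print(find_cumulative_decreases(series))  # [(4, 14, 12)]
```

A flagged drop does not necessarily indicate an error. It may simply reflect transfers or releases at an agency that reports cases among the currently incarcerated population.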

Active Cases

  • Residents.Active: The total number of incarcerated individuals who have an active COVID-19 infection and have not been deemed recovered.
  • Staff.Active: The total number of staff who have an active COVID-19 infection and have not been deemed recovered.
  • Considerations: Though testing data are not always made available, we know that testing practices vary widely by carceral agency. As a result, true case counts are likely higher than reported, and the extent of this underdetection is extremely variable. And again, not all agencies report data on staff COVID-19 cases. Some jurisdictions leave it to staff members’ discretion whether to report positive test results they receive from community healthcare providers. As a result, the number of staff cases reported may be lower even than the number detected by testing.

Deaths

  • Residents.Deaths: The total number of incarcerated people who have died with or from COVID-19.
  • Staff.Deaths: The total number of staff who have died with or from COVID-19.
  • Considerations: Agencies differ in the categories of deaths they report as COVID-19-related. Some agencies include all deaths suspected of being related to COVID-19, and some include only those with a positive test result. Some jurisdictions do not include deaths that occur after or while someone has COVID-19 if the medical provider or examiner declares an alternative cause of death, such as a heart attack. There have also been instances where jurisdictions have not counted the deaths of people who died of COVID-19 after being released, even when they contracted the virus inside a facility. We are currently investigating suspected undercounts in COVID-19-related deaths.

Tests

  • Residents.Tadmin: The total number of tests administered to incarcerated people.
  • Residents.Tested: The total number of incarcerated people who have been tested for COVID-19 at least once.
  • Considerations: Some agencies report only the number of people tested, rather than the total number of tests administered, but we do not post that variable on our website because it is less meaningful than the number of tests administered; reporting only the number of people tested obscures the regularity of testing. However, we do make all publicly reported testing data available in our raw dataset on GitHub. Very few agencies report reliable testing data for staff, and even those that do often exclude tests that staff receive in the community, making reported data difficult to interpret. As a result, we do not report staff testing data.

Vaccinations

  • Residents.Initiated: The total number of incarcerated people who have received at least one dose of a COVID-19 vaccine.
  • Staff.Initiated: The total number of staff who have received at least one dose of a COVID-19 vaccine.
  • Residents.Completed: The total number of incarcerated individuals who are fully vaccinated.
  • Staff.Completed: The total number of staff who are fully vaccinated.
  • Considerations: Carceral agencies vary widely in what data they report about vaccinations and how they report them. In a June 2021 blog post, we outlined the challenges posed by the lack of consistency in reporting of vaccination data. One notable challenge involves differences in who agencies include in the incarcerated populations they report vaccine data for. These populations certainly include those who were vaccinated while in and remain in custody, but may or may not include those who have since been released and/or those who received a vaccine before entering the facility.

How We Calculate Rates of Infection and Death Among Incarcerated People

Our website displays approximate rates per population for cumulative cases, active cases, and deaths among incarcerated people at the jurisdiction and facility levels, where possible. Rates provide necessary context for understanding the severity of the COVID-19 situation within a particular facility or jurisdiction. For example, 600 cases in a facility detaining 1,000 people represent more significant viral spread than do the same number of cases in a facility detaining 5,000 people.

Where we have sufficient facility-level data, we calculate these rates by dividing the total number of infections or deaths at a facility (numerator) by the population of people incarcerated at that facility at a point in time (denominator). On each state page, we also aggregate the numerators and denominators for all federal, state, and immigration facilities to approximate rates by jurisdiction type within a particular state. For prisons and jails, we collected our population data from agency websites and through public records requests to carceral agencies; we filled in gaps with the Homeland Infrastructure Foundation-Level Data (HIFLD) Prison Boundaries dataset maintained by the Department of Homeland Security.
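
The calculation described above can be sketched as follows. This is a simplified illustration rather than our production code. The Facility type, the function names, and the choice to skip facilities missing a numerator or denominator are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Optional


class Facility(NamedTuple):
    name: str
    jurisdiction: str          # e.g. "state", "federal", "immigration"
    count: Optional[int]       # numerator, e.g. Residents.Confirmed
    population: Optional[int]  # denominator: point-in-time population


def facility_rate(count: Optional[int], population: Optional[int]) -> Optional[float]:
    """Rate per person; None when a value is missing or population is zero."""
    if count is None or not population:
        return None
    return count / population


def jurisdiction_rates(facilities: Iterable[Facility]) -> Dict[str, float]:
    """Sum numerators and denominators per jurisdiction type, then divide."""
    totals = defaultdict(lambda: [0, 0])
    for f in facilities:
        if f.count is None or not f.population:
            continue  # assumption: skip facilities missing either value
        totals[f.jurisdiction][0] += f.count
        totals[f.jurisdiction][1] += f.population
    return {j: num / den for j, (num, den) in totals.items()}


# Hypothetical facilities within one state.
facilities = [
    Facility("Facility A", "state", 600, 1000),
    Facility("Facility B", "state", 600, 5000),
    Facility("Facility C", "federal", None, 800),
]
for f in facilities:
    print(f.name, facility_rate(f.count, f.population))
print(jurisdiction_rates(facilities))
```

With these made-up numbers, Facility A has a rate of 0.6 and Facility B a rate of 0.12, while the aggregated state rate is 1,200 / 6,000 = 0.2. Summing numerators and denominators before dividing weights each facility by its population, which a simple average of facility rates would not do.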

Because the majority of carceral agencies do not regularly report up-to-date facility-level population figures and we are reporting cumulative counts since the start of the pandemic, we chose to align our population denominators with data reflecting that point in time where possible: the most recently reported facility populations as of February 2020. These figures do not account for subsequent releases, intakes, and movement between facilities. For facilities without available population totals from February 2020, we used the last reported population data available for the facility. We document whether we are pulling our population denominators for each facility from public records, typically sourced from around February 2020, or from HIFLD data, which typically have population values up to five years old.
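
The denominator selection described above can be sketched as a simple fallback chain. The type, function, and source labels below are hypothetical, and the ordering of the last-reported figure ahead of HIFLD is our reading of the text for illustration rather than a documented rule.

```python
from typing import NamedTuple, Optional, Tuple


class PopulationSources(NamedTuple):
    feb_2020: Optional[int]       # agency-reported population as of February 2020
    last_reported: Optional[int]  # otherwise, the last agency-reported population
    hifld: Optional[int]          # HIFLD Prison Boundaries population


def choose_denominator(p: PopulationSources) -> Tuple[Optional[int], Optional[str]]:
    """Return (population, source label) from the first usable source."""
    candidates = (
        (p.feb_2020, "public records (February 2020)"),
        (p.last_reported, "public records (last reported)"),
        (p.hifld, "HIFLD"),
    )
    for value, source in candidates:
        if value is not None and value > 0:
            return value, source
    return None, None


# Hypothetical facility with no February 2020 figure.
print(choose_denominator(PopulationSources(None, 950, 1200)))
# (950, 'public records (last reported)')
```

Returning the source label alongside the value mirrors the practice of documenting where each facility's denominator comes from.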

ICE makes quarterly detention center population data available for download on its Detention Management webpage. To calculate COVID-19 rates in immigration detention, we use the agency’s reported average daily population for each facility during Q1 of Fiscal Year 2021 as the denominators.

Because the data we collect are reported in aggregate rather than at the individual case level, turnover in facility populations makes it impossible for us to determine whether people who have been infected with COVID-19 at some point during the pandemic are still among those incarcerated (and, if so, whether they remain at the same facility). As a result, the universe of people for the infection and death counts (numerators) is not the same as the universe for the population counts (denominators). At some facilities with large outbreaks, this may eventually mean that numerators exceed denominators. We are currently gathering more detailed time-series population data from jurisdictions that report them; these data are also available on our GitHub.

We do not have reliable data for staffing levels at all facilities. It is also difficult to determine who each agency includes in its definition of “staff,” how many staff are actually present in each facility (versus on leave or working in administrative offices), and whether staff work within one or multiple facilities. Due to these complexities, we are not currently providing rates for staff, but we plan to do so in the future.

How to Cite Our Core Dataset

Citations for academic publications and research reports:

Sharon Dolovich, Aaron Littman, Kalind Parish, Grace DiLaura, Chase Hommeyer, Michael Everett, Hope Johnson, Neal Marquez, and Erika Tyagi. UCLA Law COVID Behind Bars Data Project: Prison/Jail Cases and Deaths Dataset [date you downloaded the data]. UCLA School of Law, uclacovidbehindbars.org.

Citations for media outlets, policy briefs, and online resources:

UCLA Law COVID Behind Bars Data Project, uclacovidbehindbars.org.

Data licensing:

Our data are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. That means that you must give appropriate credit, provide a link to the license, and indicate if changes were made. You may not use our data for commercial purposes, which means anything primarily intended for or directed toward commercial advantage or monetary compensation. If you remix, transform, or build upon our data, you must distribute your contributions under the same license.

Data on Legal Filings and Court Orders

Our project collaborates with Bronx Defenders, Columbia Law School’s Center for Institutional and Social Change, and Zealous to collect legal documents from around the country related to COVID-19 and incarceration. Together, we organize and code them into the jointly managed Health is Justice litigation hub for public defenders, litigators, and other advocates. The majority of the legal documents in the Health is Justice litigation hub are federal court opinions, but we are expanding to state legal filings, declarations, and exhibits.

In addition to the Health is Justice litigation hub, our project also manages additional data self-reported by advocates regarding COVID-19-related legal filings involving incarcerated youth (via this form) and individuals in immigration detention (via this form).

Our data reflect only a subset of filings and are by no means exhaustive.

Data on Prison and Jail Releases

We collect data on jurisdictions across the U.S. that have released people from adult prison and jail custody in response to the COVID-19 pandemic. Variables we collect include the official or agency authorizing release, the number of people released, and details on the conditions for release, among others. Our data collection relies on published articles, reports, and data demonstrating realized and pandemic-related releases. We also perform strategic outreach to various legal and advocacy groups to request pertinent data via this form. For the most part, we only include release efforts where the data source includes some sort of programmatic description of who is being released (e.g., people with technical violations of parole, people charged with non-violent crimes, etc.). Though we maintain the most complete dataset of COVID-19-related release efforts, it reflects only a subset of efforts and is by no means exhaustive.

Data on Grassroots and Community Organizing Efforts

Our team collects data on grassroots and community organizing efforts by incarcerated people, their families, community-based organizations, nonprofits, and advocates aimed at influencing government agencies to protect the lives of people incarcerated in prisons, jails, and detention centers against the threats posed by COVID-19.

In the context of this project, we define grassroots organizing as efforts and actions planned by, for, and with incarcerated people. We define community organizing as efforts and actions planned by community-based organizations.

We do not include efforts for which we have no basis to believe that the action was connected to a health and safety risk posed by COVID-19 in a carceral facility, or efforts that are too individualized to be considered an organizing effort.

The Perilous Chronicle’s List of Prisoner Actions is a vital source for our data, as are social media platforms such as Twitter, where organizers often promote their efforts. For example, our team has learned about many efforts by following the hashtag #FreeThemAll, commonly used to amplify demands by organizers inside and outside facilities to release incarcerated individuals to prevent COVID-19 outbreaks. We also gather data from news reports and social media to enhance our existing understanding of efforts happening on the ground. Finally, we ask organizers to self-report their efforts through this form. Our grassroots and community organizing data reflect only a subset of efforts and are by no means exhaustive.