To counter what they see as unprecedented political interference in one of the world’s most respected statistical agencies, prominent statisticians are urging the U.S. Census Bureau to be much more transparent about how it is now processing the billions of bits of information it has collected from a truncated 2020 census.
A series of actions by President Donald Trump’s administration has jeopardized the agency’s ability to deliver an accurate count of the U.S. population later this year, a task force of the American Statistical Association (ASA) concludes in a report released this week. So, to maintain public trust in this year’s census, the task force recommends the agency invite an independent group of researchers to pore over the data. The team would then issue a public report on whether the Census Bureau has met its goal of “counting everyone once, and only once, and in the right place.”
“We are doing our best to support the Census Bureau because they have been put in a very difficult situation,” says ASA President Rob Santos, who co-chaired the task force. “They don’t have full control of their operations.”
Members of the task force, which includes three former Census directors, believe there’s no time to waste. The bureau ended field operations last night, more than 2 weeks earlier than scheduled. The agency announced the move only hours after the Supreme Court rejected an effort by civil rights groups and local officials to extend fieldwork through the end of this month. Those groups had argued that the abbreviated timetable could mean failing to count a disproportionate number of residents from minority, immigrant, and low-income communities. But on 13 October the high court gave the administration permission to stop the count.
The Census Bureau’s next deadline is 31 December, when it must give Trump a tally of the U.S. population that will be used to determine how many seats each state gets in the 435-seat House of Representatives over the next decade. Census data also determine how the federal government distributes some $1.5 trillion annually and are widely used by researchers and businesses to understand demographic trends.
That schedule gives the Census Bureau barely 10 weeks for its “postcollection” analysis of the vast trove of information it has amassed since the 2020 census went live on 1 April. The daunting analysis effort includes checking the accuracy of data submitted by those who responded on their own, as well as those who gave answers to someone knocking on their door. (Roughly one-third of all households do not self-respond.) For residents who provided only partial information or none at all, the agency tries to fill the gaps using data already on file with other government agencies. Those data, called administrative records, also require a thorough vetting.
Scientists worry the narrow window doesn’t give the Census Bureau enough time to abide by its usual high standards. The bureau’s “current plan for quality assessment is unknown,” the task force notes, and “the compressed schedule has eliminated many quality-control steps” the agency had once expected to take.
One casualty could be the way Census officials verify the accuracy of the computer code used to process data from different sources, says Nancy Potok, co-chair of the task force and a former deputy Census director. “In previous censuses we used two programming teams, working independently, to look for errors and find bugs,” she explains. “That process takes months, and there are always mistakes. But they may decide not to double up this time because they won’t have time to fix the problem.”
“The census is a chain of many, many operations,” she adds. “And it’s only as strong as its weakest link.”
What does 99.9% mean?
The 2020 decennial census is a $15 billion operation that has unfolded over 10 years. And it has already endured what the report calls “a perfect storm of adverse circumstances.”
The COVID-19 pandemic and unusually active wildfire and hurricane seasons disrupted field operations for its army of 500,000 enumerators. There have also been politically driven upheavals. In 2019, the Supreme Court rejected the Trump administration’s last-minute attempt to add a citizenship question. And this year, three political appointees took up high-level posts at the traditionally nonpartisan agency. Their arrival heightened concerns that Commerce Secretary Wilbur Ross, whose department includes the Census Bureau, is calling the shots on many operational matters normally left to the agency’s director, Steven Dillingham.
In April, Ross asked Congress for an additional 4 months to carry out field operations. But the administration dropped that request in July, triggering outcries from both stakeholders and Democrats in the House, which had approved the extension.
Ross’s claim this week that the 2020 census has met its goals because “99.9% of housing units have been accounted for” is the latest example of political interference, according to the report.
“It’s a meaningless number” that can’t be used as a measure of data quality, says Potok, who retired in January as the government’s chief statistician. For example, she notes, enumerators might mark an address as counted even if they have repeatedly failed to obtain any information from residents. Or they may move on after collecting only partial information about an address (say, the number of people living there but not their age, gender, and nationality) from a neighbor or building superintendent. (In census lingo, such sources are called “proxies.”)
Experts also say focusing on the percentage of housing units reached, rather than the quality of the information collected at each address, can actually be damaging. That’s because efforts to drive up that number could have led to “operational shortcuts that will jeopardize the quality of the count.”
Santos believes Ross is making the 99.9% claim in hopes of rebutting comments made earlier this year by senior Census officials asserting that the agency needs more time to complete its work. “Ross says it’s possible [to have a quality census despite a truncated schedule],” Santos says. But internal government emails that have been made public “say that he’s living in a fantasy world,” he adds. “It’s sophistry.”
How to monitor quality
There are ways for the Census Bureau to assess how close it has come to conducting a complete and accurate count, the report says. But the effort needs to begin immediately and should involve outside experts.
In particular, the report includes a two-page list of “quality measures” the bureau can apply as it performs the dozens of steps needed to generate an accurate final count. For example, it can calculate the percentage of addresses enumerated by proxy or through administrative records, processes that tend to produce lower-quality data. It can also take additional steps to verify whether an address that generated no data on residents is actually vacant or does not exist.
The report proposes applying similar metrics to the “postcollection” phase of the census. Here, analysts can judge quality by looking at factors such as the percentage of records that lack a full name or date of birth, the number of duplicate enumerations, and how much information is being imputed. (Imputation means making an educated guess about the demographic characteristics of the occupants based on indirect information such as the type of housing unit or demographic composition of the neighborhood.)
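As a rough illustration of how such postcollection indicators could be tallied, here is a minimal Python sketch. The record layout, field names, and sample values are invented for this example and do not reflect the Census Bureau's actual files or the task force's exact definitions.

```python
from collections import Counter

# Hypothetical person records; every field name and value here is an assumption.
records = [
    {"address_id": "A1", "name": "Ana Ruiz", "dob": "1980-02-14", "source": "self", "imputed": False},
    {"address_id": "A2", "name": None, "dob": None, "source": "proxy", "imputed": True},
    {"address_id": "A3", "name": "Lee Chen", "dob": None, "source": "admin_records", "imputed": False},
    {"address_id": "A1", "name": "Ana Ruiz", "dob": "1980-02-14", "source": "enumerator", "imputed": False},
]


def quality_indicators(records):
    """Summarize a few of the quality measures described in the text."""
    total = len(records)
    if total == 0:
        raise ValueError("no records to assess")

    # Records lacking a full name or a date of birth.
    missing = sum(1 for r in records if not r["name"] or not r["dob"])

    # Treat identical (name, dob) pairs as possible duplicate enumerations.
    # Real matching would be far more careful; this is a simplification.
    keys = Counter((r["name"], r["dob"]) for r in records if r["name"] and r["dob"])
    duplicates = sum(count - 1 for count in keys.values() if count > 1)

    imputed = sum(1 for r in records if r["imputed"])
    proxy_or_admin = sum(1 for r in records if r["source"] in {"proxy", "admin_records"})

    return {
        "pct_missing_name_or_dob": 100 * missing / total,
        "duplicate_enumerations": duplicates,
        "pct_imputed": 100 * imputed / total,
        "pct_proxy_or_admin": 100 * proxy_or_admin / total,
    }


summary = quality_indicators(records)
```

No single figure in `summary` settles the question of quality. The point is that each indicator can flag a place where analysts should dig deeper.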
Individuals who answer the census questionnaire on their own generally provide the most accurate data, the report notes. They also reduce the number of addresses that need to be visited, which saves money. So boosting self-response rates is a top goal for every census.
The overall self-response rate for the 2020 census is 66.8%, a hair above the 66.5% final rate for 2010. But self-response rates vary greatly by state and geographic region, and even within a single metropolitan area. Researchers say posting a single number for a state—as the Census Bureau has done daily since early August—is not a good way to indicate the overall quality of the census.
To illustrate that point, the task force compared self-response rates in 2010 and 2020. It found that more census tracts (each an area comprising a few city blocks) had self-response rates above 80% or below 60% this year than in 2010.
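The comparison can be pictured with a short Python sketch. The tract-level rates below are made up for illustration and are not the task force's data. They show how two years can have nearly the same average while one has far more tracts in the tails.

```python
from statistics import mean


def tail_shares(rates, low=60.0, high=80.0):
    """Return the mean rate and the share of tracts below `low` or above `high`."""
    n = len(rates)
    below = sum(1 for r in rates if r < low)
    above = sum(1 for r in rates if r > high)
    return {"mean": mean(rates), "share_below": below / n, "share_above": above / n}


# Hypothetical self-response rates (percent) for five tracts in each year.
rates_2010 = [62.0, 65.0, 66.0, 68.0, 71.0]
rates_2020 = [48.0, 55.0, 67.0, 82.0, 85.0]

summary_2010 = tail_shares(rates_2010)
summary_2020 = tail_shares(rates_2020)
```

In this toy data the two means differ by about one percentage point, yet none of the 2010 tracts falls outside the 60% to 80% band while four of the five 2020 tracts do.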
Census officials need to be aware of that variation when processing data from tracts with low self-response rates, because those data might be of lower quality, the task force says. But Census officials haven’t said whether they have taken, or will take, steps to address that issue.
Failing to acknowledge and address the variation in data quality from differing self-response rates could have real-world consequences, statisticians say. “Nobody wants an undercount,” Potok says. “But if the response rate were the same in every geographic area, then an undercount wouldn’t be such a problem. However, the census is carving up a fixed pie. There are only 435 House seats, for example, and if one state gets more [as a result of a flawed count], another gets fewer.”
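Potok's fixed-pie point can be made concrete with a small Python sketch. The article does not describe the apportionment formula; the sketch assumes the method of equal proportions (Huntington-Hill), which is the method used to divide House seats. The three states, their populations, and the 11-seat chamber are hypothetical.

```python
import heapq
from math import sqrt


def apportion(populations, total_seats):
    """Assign seats by the method of equal proportions (Huntington-Hill)."""
    if total_seats < len(populations):
        raise ValueError("need at least one seat per state")

    # Every state starts with one seat.
    seats = {state: 1 for state in populations}

    # Priority for a state's next seat is pop / sqrt(n * (n + 1)),
    # where n is its current seat count. Negate for Python's min-heap.
    heap = [(-pop / sqrt(2), state) for state, pop in populations.items()]
    heapq.heapify(heap)

    for _ in range(total_seats - len(populations)):
        _, state = heapq.heappop(heap)
        seats[state] += 1
        n = seats[state]
        heapq.heappush(heap, (-populations[state] / sqrt(n * (n + 1)), state))
    return seats


# Hypothetical populations for three made-up states sharing 11 seats.
accurate = {"A": 5_000_000, "B": 4_000_000, "C": 1_000_000}
# Same states, but state A is undercounted by 3%.
undercounted = {"A": 4_850_000, "B": 4_000_000, "C": 1_000_000}

seats_accurate = apportion(accurate, 11)
seats_undercounted = apportion(undercounted, 11)
```

With the accurate counts, state A holds six seats and state B four. A 3% undercount confined to state A moves one seat from A to B, even though nothing about B's count has changed.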
Such issues are driving the task force’s sense of urgency in asking Census officials to let outside experts review the files. “There is no one indicator that tells you everything is fine,” Potok says. “But if one of these indicators suggests that something doesn’t look right, then you need to dig deeper to find the problem.”
Sending a message
The report says most of the data needed to do that deep dive are already available because this year, for the first time, Census officials conducted field operations digitally. Enumerators received their daily assignments electronically and used their mobile devices to record and transmit their data.
A digitized census gives outside researchers enough time to assess the quality of the 2020 census and make recommendations to the Census Bureau before the final count is sent to the White House, Santos says.
At that point, Dillingham would have several options for dealing with any serious flaws. “He could tell the president he cannot submit a count, or that he doesn’t believe the numbers are accurate,” Potok says. “Or he could say that we’ve identified problems with the count that Congress might want to look into.”
Santos doesn’t think any of those alternative scenarios is likely, because all of them would require buy-in from Ross, who is Dillingham’s boss. Nor does he expect the Census Bureau to take up the task force’s offer for outside help, which was made in a letter to Dillingham and Ross. (The Census Bureau had not responded to queries from ScienceInsider as this story went to press.)
Even so, the task force hopes policymakers will recognize the value of having an independent assessment of the 2020 census. “We wanted to send a message to the Census Bureau to preserve the data it has collected and make it available to researchers so they can assess how things went,” Santos says. “We think that’s what needs to be done to restore public confidence in the census.”