Gone too far. This graph could suggest that some patients arrive at a hospital before stroke symptoms appear.

Exorcising Extrapolation

BARCELONA, SPAIN--Your experiment resulted in a cloud of data points. What to do? Plot them on a graph and calculate the best-fitting line, of course. But statisticians agree that the line shouldn't extend beyond the range of known points without signaling to the reader that they've entered hypothetical territory--by switching to a dotted line, for instance. Such signposts go missing all the time in four leading medical journals, says Yen-Hong Kuo, a biostatistician at the Jersey Shore Medical Center in Neptune, New Jersey.
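
The convention described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Kuo's method or any journal's rule: the helper names `fit_line` and `line_segments` are hypothetical, the data are made up, and the fit is an ordinary least-squares line. The solid segment covers the observed range of x; anything beyond it is labeled as dashed.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def line_segments(xs, ys, plot_min, plot_max):
    """Split the fitted line into a solid part (inside the observed x range)
    and dashed parts (extrapolation beyond that range)."""
    slope, intercept = fit_line(xs, ys)
    lo, hi = min(xs), max(xs)

    def point(x):
        return (x, slope * x + intercept)

    segments = [("solid", point(lo), point(hi))]
    if plot_min < lo:
        segments.insert(0, ("dashed", point(plot_min), point(lo)))
    if plot_max > hi:
        segments.append(("dashed", point(hi), point(plot_max)))
    return segments


# Made-up illustrative data, not taken from any of the papers surveyed.
xs = [2.0, 3.0, 5.0, 7.0, 8.0]
ys = [3.0, 5.0, 9.0, 13.0, 15.0]

segments = line_segments(xs, ys, plot_min=0.0, plot_max=10.0)
for style, start, end in segments:
    print(style, start, end)
```

Each tuple could be handed to a plotting library with the matching line style. With these invented numbers the fitted line falls below zero at the left edge of the plot, and that happens only inside a dashed segment, so a reader would know the value is hypothetical.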

Kuo scoured every issue of The Journal of the American Medical Association (JAMA), The New England Journal of Medicine, The Lancet, and the British Medical Journal published in the first half of 2000, looking for scatter plots with lines drawn through them. Of 37 such plots, almost 60% had a line running beyond the actual data points, and none of those flagged the extension, Kuo reported here on 14 September at the Fourth International Congress on Peer Review in Biomedical Publication. In four cases, the lines extended so far that the graphs made no sense. One JAMA paper, for instance, suggested that patients arrived at emergency rooms before the onset of stroke symptoms (see graph), while a study published in The Lancet implied that patients secrete a negative amount of protein in their urine.

Such gaffes may be statistical misdemeanors, says Kuo, but they can be confusing, or even dangerous if they lead doctors to choose the wrong treatment. And medical journal editors at the meeting acknowledged that they should do a better job. But bad best-fit lines seem impossible to erase, says Barbara Hawkins, an epidemiologist specializing in ophthalmology at Johns Hopkins University. "It's one of my pet peeves," says Hawkins. "I always point it out when I review a paper, but it doesn't always get fixed."

