SAN JOSE, CALIFORNIA—In the study of regional dialects, Twitter turns out to be a world of its own. By analyzing roughly 100 million tweets and their accompanying GPS data, computational linguist Jacob Eisenstein of the Georgia Institute of Technology in Atlanta searches for geographical patterns in word use. The maps above, from 2012, show a few examples of the variation he’s discovered. As he explained in a session today at the annual meeting of AAAS (which publishes Science), some of these variations are predictable: The plural pronoun “yinz” (as in, “I’ll see yinz later”) and the adverb “hella” (“That movie was hella long”) occur in tight clumps around Pittsburgh and around northern California, respectively. If you use “frfr,” it’s likely you’re tweeting from the American South, where the phrase “for real, for real” is most common. Eisenstein also notes that the use of region-specific words seems to vary with the size of a user’s intended audience. Tweets that contain a hashtag, a way of reaching more readers, are less likely to contain these “local variables” than tweets that start with another user’s handle, which are visible only to followers of both the tweeter and the recipient. But the research has also raised a new question, Eisenstein says. It makes sense that Twitter language would vary when it reflects the way people actually speak. But why do abbreviations unique to the Web, such as “lls” (“laughing like shit”), also show variation, in this case clustering around Maryland? He’s still looking for a legit explanation.