A University of Maryland-led research team was recently awarded a $3 million grant from the National Science Foundation to study ethical practices in the collection and usage of personal data from online sources.
Pervasive Data Ethics for Computational Research, or PERVADE, will take about four years to complete, according to the project's website. While the project is being led by this university's information studies college, it is a collaborative effort among six institutions.
This university received just under $1 million of the grant, said Katie Shilton, a professor at this university and the principal investigator on the grant. The rest of the money was divided among the other five institutions, she said: the University of California, Irvine; the University of Colorado, Boulder; Princeton University; the University of Wisconsin, Milwaukee; and the Data and Society Research Institute.
The PERVADE study will use surveys and interviews to understand how people feel about their online data being used for research and how researchers can approach the issue ethically.
"What we're hoping to come out with are sets of best practices, as well as research support tools, to help researchers make decisions about how and when to use all of this big data for research," Shilton said.
To explain the study, Shilton cited an incident in which Facebook collaborated with psychology researchers to determine whether showing users happier posts on their news feeds would cause them to post happier things themselves.
Although the study's intervention into people's personal lives was "minimal," Shilton said, many individuals were very upset to find that they had unknowingly been part of an experiment.
"What kind of research is okay?" Shilton asked, posing one of the questions that the PERVADE study hopes to answer. "Are [people] okay with being part of a cancer prevention study, but not a psychological experiment?"
Regardless of personal opinions on the collection of social media data for research, Shilton said, the topic is an extremely important one for all users of technology, given the increasing relevance of social media and personal data in the digital age.
"There's more and more data available about individuals because of online and connected devices," Shilton said, explaining that researchers in fields such as psychology, sociology or health now have access to a "trove" of personal data that can give them valuable information about people's beliefs and habits.
Much of the data in question is technically public, such as tweets or Facebook posts, but Shilton added that "that definition gets really messy" because "public" can be defined in different ways.
Tweets and Facebook posts are considered secondary data, which means it is lawful for researchers to use them without the consent or knowledge of the subject, Shilton said. Primary data, on the other hand, is collected directly through first-hand research.
While social media is treated as secondary data, Shilton said she thinks this perception is shifting.
Tweets are often more personal, and the thought process behind posting them is different from that behind publishing a newspaper op-ed, which is also considered secondary data, Shilton added.
"The more we study the way people use Twitter, the more we realize, you know, these are really different things," she said. "These new forms of data sort of push on the blind spots in current regulations."
Adam Hemmeter, a junior economics and materials science and engineering major, said he can see the benefits of researchers using personal data for their studies, but he believes that determining limits on the usage of personal data for research "is very critical."
"As technology becomes more integrated into our lives, our lives will inherently become less and less private … we'll have to do our best to protect that," Hemmeter said.
Kevin Schechter, a senior computer science major, said he is not bothered by the idea of his personal data being used for research.
"I don't think I would mind too much as long as my name, profile info, stuff like that, wasn't publicly published," Schechter said, adding that the PERVADE study would provide useful insights into the field of research.