QM Speaker: Hugh McCague

When:
January 27, 2020 @ 10:00 AM – 11:30 AM
Where:
Endler Room (BSB 164)
4700 Keele St
North York
ON M3J 1P3
Cost:
Free

Title: Some Emerging Data Sources and Their Statistical Implications:
Wiki Surveys, Google Searches, Tweets, and More

Abstract: Survey research has been characterized as now passing into its third era. The first era centred on in-person and paper questionnaire surveys. The second era centred on telephone interviews. The third era, which we are now entering, involves social media, commercial transactions, device monitoring, and other digital sources of data. Issues of ethics, legality, reliability, and validity are major concerns in the use of these data sources, as was the case with the methods of the two earlier eras. These areas of concern are under active investigation, with some apparent successes. The challenge in assessing the new approaches is to be both critical and open to learning. I will review some examples of recent research and applications involving data collection from Wiki Surveys, Google searches, and Twitter. As with the movement from the first era to the second, part of the drive into the third era is the much reduced cost of data collection, combined with the ability to gather much larger amounts of data. The concern, of course, is the quality and representativeness of the data, even with much larger sample sizes. Declining participation rates in many traditional surveys are also a factor in seeking new data sources. In addition, the different social media data sources have their individual strengths, and can be collaborative with respondents and revealing in ways not possible in traditional surveys. While the data collection methods of the earlier two eras are becoming less common as the third era develops, those earlier methods are still being employed and adapted to allow an optimal integrative approach with innovative statistical and qualitative methods.

Additional resources available here!

Slides available here!

Selected Bibliography

Scholarly

DiGrazia J, McKelvey K, Bollen J, Rojas F (2013) More Tweets, More Votes: Social Media as a Quantitative Indicator of Political Behavior. PLoS ONE 8(11): e79449. https://doi.org/10.1371/journal.pone.0079449

Salganik MJ, Levy KEC (2015) Wiki Surveys: Open and Quantifiable Social Data Collection. PLoS ONE 10(5): e0123483. https://doi.org/10.1371/journal.pone.0123483

Yang S, Santillana M, Kou S (2015) Accurate estimation of influenza epidemics using Google search data via ARGO. Proceedings of the National Academy of Sciences, 112: 14473-14478. https://www.pnas.org/content/112/47/14473

Popular

Stephens-Davidowitz S (2017) Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are. New York: HarperCollins.

Brief Biography

Dr. Hugh McCague works as a statistician in the Institute for Social Research, the Statistical Consulting Service, and the Statistics Canada Research Data Centre at York University. His earlier work experience includes the Research Division of the former Ontario Hydro, the Management Sciences section of Bell Canada, and the Department of Medicine at McMaster University. A substantial part of his work has been on survey sampling and analysis, including the creation and use of survey weights for statistical modelling.