A Workflow for Collecting Web Data

Fields of Gold: Generating Relevant and Credible Insights Via Web Scraping and APIs

Abstract

Marketing researchers increasingly use web scraping and Application Programming Interfaces (APIs) to collect publicly available data from the internet. While guidance on the technicalities of collecting web data is abundant, many of the design decisions involved in collecting web data have remained largely neglected. A lack of awareness and understanding of these design decisions, among both authors and reviewers, threatens the credibility of research findings based on web data. To address these issues, this article develops a systematic workflow that guides researchers through the different stages of collecting web data. Throughout, the authors discuss how various design decisions affect the relevance and credibility of research findings. The workflow is accompanied by a comprehensive review of the use of web data in marketing research, identifying common themes in how web data has enriched past work. Finally, the authors highlight promising avenues for how future work might leverage web data to address important marketing questions and disseminate research findings.

Keywords: web scraping, application programming interfaces (APIs), field data, research methods, workflow, open science, open data