Finding Astroturf in 22M Regulatory Comments Using Natural Language Processing



This project used natural language processing techniques to analyze the nearly 23 million public comments that the Federal Communications Commission received in response to its proposal to repeal net neutrality rules (the Title II classification of internet services).

Using word-vector encodings of the public comments, I found a class of 1.3 million (!) "mail-merge" comments that previous analyses had not identified.
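
To give a feel for how vector encodings can surface "mail-merge" comments, here is a minimal sketch rather than the pipeline used in this project. It assumes plain term-count vectors as a stand-in for trained word vectors and a greedy cosine-similarity grouping in place of a real clustering algorithm. The function names, the 0.6 threshold, and the sample comments are all hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Stand-in encoding (assumption): a sparse term-count vector.

    The project used word-vector encodings; counts keep this sketch
    dependency-free while preserving the idea of comparing comments
    as vectors.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_similar(comments, threshold=0.6):
    """Greedy grouping: a comment joins the first group whose
    representative (its first member) is at least `threshold` similar;
    otherwise it starts a new group. Returns lists of comment indices.
    """
    groups = []  # each entry: (representative_vector, member_indices)
    for index, text in enumerate(comments):
        vec = vectorize(text)
        for representative, members in groups:
            if cosine(representative, vec) >= threshold:
                members.append(index)
                break
        else:
            groups.append((vec, [index]))
    return [members for _, members in groups]


# Made-up comments: the first two differ only by swapped-in words,
# which is the pattern a mail-merge template produces.
comments = [
    "I urge the FCC to repeal the Title II rules",
    "I strongly urge the FCC to repeal the Title II regulations",
    "Please keep net neutrality protections in place",
]
groups = group_similar(comments)
print(groups)
```

Comments generated from one template with synonyms swapped in land close together in vector space even though they are not exact duplicates, so they fall into the same group. At the scale of millions of comments, pairwise comparison like this would be too slow, and a proper clustering method over dense embeddings would be the more realistic choice.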

Blog post here.

I've also worked on a follow-up survey to this study with the Startup Policy Lab in San Francisco. When we reached out to the commenters who had provided an email address, 88% of the pro-repeal respondents said that they had not submitted the original comment.

Preliminary blog post about these results here.