How does misinformation spread in a social network?
False information spreads easily on social media. Traditionally, machine learning (ML) methods for misinformation prediction have used natural language processing (NLP) tools. However, NLP tools struggle at this task: tweets may have too little text to classify confidently, and for rare events like the COVID pandemic, there may not be much data on which to train a model.
In Summer 2022, two students of mine worked on a project to detect misinformation in a dataset of COVID+5G misinformation. The goal was to determine whether misinformation spreads in a distinctive way. This matters because any given piece of misinformation can be detected after the fact from examples, but when something unprecedented is happening (like COVID!!) we won’t have any data about what misinfo on that topic might look like.
This project ended up successfully reproducing some prior work on using Graph Neural Networks to predict whether a given tweet contained misinformation or not. However, we did not develop any approaches that outperformed this baseline.
Here’s our poster:
Simay Cural ’24
Simay worked on extracting motifs from graphs.
The goal was to see whether misinformation graphs contain particular substructures that graphs of true information do not.
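To give a flavor of what motif extraction can look like, here is a minimal sketch in Python. It is not the code from the project: the edge convention (an edge u -> v means "v reshared u"), the three motif types, and the function names `count_three_node_motifs` and `motif_profile` are all assumptions made for illustration, and the two toy graphs are made up.

```python
from collections import defaultdict
from math import comb


def count_three_node_motifs(edges):
    """Count three simple 3-node motifs in a directed graph.

    edges: iterable of (source, target) pairs. An edge u -> v is assumed
    to mean "v reshared u" (an illustrative convention, not the project's).
    """
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        successors[u].add(v)
        predecessors[v].add(u)

    counts = {"fan_out": 0, "fan_in": 0, "chain": 0}
    for node in set(successors) | set(predecessors):
        outs = successors.get(node, set())
        ins = predecessors.get(node, set())
        # fan_out: one node reshared by two different nodes
        counts["fan_out"] += comb(len(outs), 2)
        # fan_in: one node resharing two different nodes
        counts["fan_in"] += comb(len(ins), 2)
        # chain: u -> node -> w with u != w, so drop reciprocal pairs
        counts["chain"] += len(ins) * len(outs) - len(ins & outs)
    return counts


def motif_profile(edges):
    """Normalize motif counts so graphs of different sizes can be compared."""
    counts = count_three_node_motifs(edges)
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: value / total for name, value in counts.items()}


if __name__ == "__main__":
    # Two made-up toy cascades: a broadcast and a word-of-mouth chain.
    broadcast = [("a", "b"), ("a", "c"), ("a", "d")]
    word_of_mouth = [("a", "b"), ("b", "c"), ("c", "d")]
    print(motif_profile(broadcast))
    print(motif_profile(word_of_mouth))
```

Comparing profiles like these across many labeled graphs is one way to ask whether some substructures show up more often in one class than the other. The sketch itself makes no claim about which shape misinformation actually takes.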
Simay has gone on to pursue an M.S. in Computer Science at Columbia.
Max Perozek ’23
Max worked on reproducing prior work using graph neural networks. He also worked on all of the data pre-processing and cleaning.
Max is currently a data scientist at Whoop, a fitness wearable and software company.