Using Reddit to Detect Landslides

Climate change has led to more frequent and more intense rainfall events, and one side effect of intense rain is more landslides. These dangerous events can be difficult to study on a large scale, and in fact, individual landslides often go undetected by researchers and responders. NASA’s multi-partner landslide team has been collecting global landslide information since 2007, but manually scouring media reports, citizen science data, and other sources is very time-consuming.

A team of master’s students at the University of British Columbia recently came up with a novel strategy that could help fill in these gaps. As part of a group project, they used a computer science technique called “natural language processing” to review the contents of the social media website Reddit for landslide information.

Natural language processing gives computers the ability to review words and groups of words and extract relevant data from websites. This technique helped the team determine when a landslide happened by teaching computers to interpret phrases like “last Thursday morning.” (The program was also taught to disregard phrases like “landslide victories” and references to the Fleetwood Mac song “Landslide.”) The team used this process to scrape Reddit for landslide locations and supporting information, such as the conditions that triggered the landslide, casualties, the time of the event, and more.
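To make the two ideas above concrete, here is a minimal Python sketch of how a program might discard off-topic uses of the word “landslide” and turn a phrase like “last Thursday” into a calendar date. It is not the students’ actual code. The function names, the keyword patterns, and the rule that “last Thursday” means the most recent Thursday before the post date are all assumptions made for illustration, and a real system would likely rely on trained language models rather than hand-written rules.

```python
import re
from datetime import date, timedelta

# Hypothetical patterns for figurative or musical uses of "landslide".
FALSE_POSITIVES = re.compile(
    r"landslide\s+(victory|victories|win|wins)|fleetwood\s+mac",
    re.IGNORECASE,
)

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
LAST_WEEKDAY = re.compile(
    r"\blast\s+(" + "|".join(WEEKDAYS) + r")\b", re.IGNORECASE
)


def mentions_real_landslide(text):
    """Return True if the text mentions a landslide and matches no false-positive pattern."""
    if "landslide" not in text.lower():
        return False
    return FALSE_POSITIVES.search(text) is None


def resolve_last_weekday(text, posted_on):
    """Convert 'last <weekday>' into a date, counted back from the post date.

    Assumption: 'last Thursday' means the most recent Thursday strictly
    before the day the post was written.
    """
    match = LAST_WEEKDAY.search(text)
    if match is None:
        return None
    target = WEEKDAYS.index(match.group(1).lower())
    # date.weekday() numbers Monday as 0, matching the WEEKDAYS list.
    days_back = (posted_on.weekday() - target) % 7 or 7
    return posted_on - timedelta(days=days_back)


post = "A landslide blocked the highway last Thursday morning."
if mentions_real_landslide(post):
    print(resolve_last_weekday(post, date(2022, 3, 15)))
```

The post date matters because a phrase like “last Thursday” only has meaning relative to when it was written, so a program needs each post’s timestamp to date the event.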

More information about this work can be found in the NASA Applied Sciences story “University Students Use Reddit to Help NASA Find Landslides.”