Natural Language Processing
Statistical Modeling of Word Rank Evolution
The goal is to model word rank evolution using a Wright-Fisher-inspired model. Google Ngram data for eight languages are analyzed and compared with a model that simulates drift evolution. The time-series dynamics of word ranks are investigated by adjusting the model parameters and comparing the simulated output with the language data.
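As a rough illustration of the drift idea, the sketch below resamples a fixed-size corpus each generation in proportion to the current word counts and records each word's rank. It is a minimal neutral Wright-Fisher sketch, not the model from the preprint: the function name `wright_fisher_ranks`, the Zipf-like starting counts, and the tie-breaking rule are all assumptions made for the example.

```python
import random
from collections import Counter


def wright_fisher_ranks(init_counts, generations, seed=0):
    """Simulate neutral drift and return the rank of every word per generation.

    init_counts: starting count for each word (hypothetical vocabulary).
    Returns a list with one entry per generation; entry[w] is the rank of
    word w, where rank 1 is the most frequent word.
    """
    rng = random.Random(seed)
    counts = list(init_counts)
    n_tokens = sum(counts)          # corpus size stays fixed across generations
    vocab = range(len(counts))
    history = []
    for _ in range(generations):
        # Rank words by descending count; ties are broken by word index.
        order = sorted(vocab, key=lambda w: (-counts[w], w))
        ranks = [0] * len(counts)
        for rank, word in enumerate(order, start=1):
            ranks[word] = rank
        history.append(ranks)
        # Drift step: draw the next generation's tokens with probability
        # proportional to the current counts (multinomial resampling).
        sample = rng.choices(vocab, weights=counts, k=n_tokens)
        tally = Counter(sample)
        counts = [tally.get(w, 0) for w in vocab]
    return history


# Usage: 20 hypothetical words with Zipf-like starting counts.
zipf_counts = [max(1, round(1000 / (i + 1))) for i in range(20)]
rank_history = wright_fisher_ranks(zipf_counts, generations=50, seed=42)
print(rank_history[0][:5])   # ranks of the first five words at the start
print(rank_history[-1][:5])  # ranks of the same words after drift
```

A word whose count drifts to zero can never return in this sketch, since there is no mutation or innovation term; a fuller model would need one.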
Preprint (submitted to PLOS ONE and under peer review):
Analysis of Twitter Texts
The goal is to uncover the discourse and evolution behind certain hashtag social movements using NLP methods, machine learning algorithms, and language models. Twitter data was collected and processed using NLTK and several Python tools.
Performance Analysis on Question Answering
Collaborators: Sam Nguyen & Juanita Ordonez
The objective was to fine-tune and evaluate three language models (BERT, ALBERT, and Longformer) on DuoRC, a question answering data set built from movie plots with narrative structure. Because narrative texts are long and complex, the models must do more than extract an answer: they need to perform complex reasoning and reading comprehension to infer answers to the questions.
Machine Learning Application on Opacity
Collaborators: Robert C. Blake & Ben C. Yee
The objective was to encode opacity, a material property describing how much radiation can pass through a material, into a neural network that serves as a surrogate model for an existing atomic physics code.
Twitter Network Analysis of the California Camp Fire
Collaborators: Maia Powell & Matthew Mondares
The goal was to explore the spread of information generated by Twitter bots during the 2018 California Camp Fire disaster using user-user and hashtag co-occurrence networks. Twitter bots are accounts that automate repetitive, simple tweeting. Most of them post, repost, or like other tweets to spread information faster than human users, in service of an unknown large-scale goal.
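To make the hashtag co-occurrence network concrete, here is a minimal sketch that builds a weighted edge list: two hashtags are linked whenever they appear in the same tweet, and the edge weight counts how many tweets contain both. The function name `hashtag_cooccurrence` and the sample hashtags are hypothetical and are not taken from the project's data or code.

```python
from collections import Counter
from itertools import combinations


def hashtag_cooccurrence(tweets):
    """Count how often each pair of hashtags appears in the same tweet.

    tweets: iterable of hashtag lists, one list per tweet.
    Returns a Counter mapping (hashtag_a, hashtag_b) to a co-occurrence
    count, with each pair stored in alphabetical order.
    """
    edges = Counter()
    for hashtags in tweets:
        # Lowercase and de-duplicate so "#CampFire" and "#campfire" match
        # and a hashtag repeated within one tweet is counted once.
        unique = sorted({tag.lower() for tag in hashtags})
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


# Usage with made-up tweets (illustrative only).
sample_tweets = [
    ["#CampFire", "#Paradise"],
    ["#campfire", "#paradise", "#wildfire"],
    ["#wildfire"],
]
for (a, b), weight in hashtag_cooccurrence(sample_tweets).most_common():
    print(a, b, weight)
```

The resulting edge list can be loaded into any graph library for centrality or community analysis; a user-user network can be built the same way by pairing accounts that interact instead of hashtags.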
Predictive Modeling of Flood Susceptibility
Collaborators: Madeline Brown, Ritesh Sharma, & Umesh Krishnamurthy
This project modeled flood risk from multiple factors, including scale, demographics, risk perceptions, topology, soil moisture, and precipitation. The broader goal was to develop a model that makes real-time predictions to alert and inform communities of flood risks.
Modeling Spider Predation
The goal of this study was to model the predation movements of the spider species Anelosimus studiosus using stochastic differential equations.
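For readers unfamiliar with stochastic differential equations, the sketch below simulates one with the Euler-Maruyama scheme. The equations used in the study are not given here, so the example assumes an Ornstein-Uhlenbeck process, dX = theta * (mu - X) dt + sigma dW, as a stand-in for a position that is pulled toward a target (for instance, prey at position mu) while being perturbed by noise. All names and parameter values are hypothetical.

```python
import math
import random


def euler_maruyama_ou(x0, theta, mu, sigma, dt, steps, seed=0):
    """Simulate dX = theta * (mu - X) dt + sigma dW with Euler-Maruyama.

    x0: starting position; mu: position the process is pulled toward;
    theta: strength of the pull; sigma: noise intensity.
    Returns the simulated path as a list of steps + 1 positions.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(steps):
        # Brownian increment over a step of length dt: Normal(0, sqrt(dt)).
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + theta * (mu - x) * dt + sigma * dw
        path.append(x)
    return path


# Usage: start at 5.0 and drift toward an assumed target at 0.0.
path = euler_maruyama_ou(x0=5.0, theta=1.0, mu=0.0, sigma=0.3, dt=0.01, steps=1000)
print(path[0], path[-1])
```

Setting `sigma` to zero reduces the scheme to the forward Euler method for the deterministic part, which is a quick way to sanity-check the drift term.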