Blog Posts
Selected posts I have contributed to the Web Science and Digital Libraries (WS-DL) research blog.
Technical Posts
Research findings, tools, and technical investigations on web archiving and social media.
Summer Project as a Google Summer of Code (GSoC) Contributor
In summer 2025, I was a Google Summer of Code (GSoC) contributor with the TV News Archive at the Internet Archive. Under the mentorship of Dr. Sawood Alam, I worked on detecting social media content in TV news through logo and post screenshot detection.
A Battle of Opinions: Tools vs. Humans (and Humans vs. Humans) in Sentiment Analysis
We analyzed the sentiment of 100 tweets using three sentiment analysis tools (TextBlob, VADER, and a RoBERTa-base model) and six human raters, finding that consensus was hard to reach—even among humans—and that RoBERTa showed the best alignment with human ratings.
Tracking Political Trends Around US Presidential Election
Team Titans won the PhD category of ODU's CSGS Hackathon 2024 by analyzing hashtags that trended on X around the 2024 U.S. presidential election, using keyword analysis, sentiment analysis, and topic modeling.
Tools for Using the Internet Archive's TV News Archive as Research Data
A survey of tools built on top of the Internet Archive's TV News Archive—including Third Eye, GDELT Television Explorer, GDELT AI Television Explorer, and GDELT TV News Visual Explorer—for analyzing and visualizing broadcast television news.
Lil Tay's Fake Death Announcement: The Quest for a Deleted Instagram Post in Web Archives
When Lil Tay's Instagram account posted a false death announcement in August 2023, we traced the deleted post through web archives and found one replayable memento on Archive.is, although the Internet Archive held no successful copy.
PDFServer - Our Summer Internship at LANL
A recap of our 2022 summer internship at Los Alamos National Laboratory, where we developed PDFServer—a prototype to extract metadata, generate social cards, and robustify links from scientific PDF documents for LANL's RASSTI workflow.
Digging Up a Gem Through the Web Archives
Using the Wayback Machine to rediscover mementos of my high school brass band days at Devi Balika Vidyalaya, Colombo—starting from a Googlewhack for "himarsha dbvsbb" and digging into archived newspaper features.
Creation Time and Published Time Are Not the Same: Estimating the Instagram Epoch
While studying how to extract the published datetime from Instagram post URLs, we uncovered a discrepancy between the published time in the HTML and the creation time embedded in the post shortcode, with implications for sorting events by Instagram publishing time.
Twitter Rewrites Your URLs, but Assumes You'll Never Rewrite Theirs: More Problems Replaying Archived Twitter
An investigation into how Twitter rewrites outbound URLs and the additional replay challenges this creates when trying to faithfully reproduce archived Twitter pages.
Twitter Added Labels On Its Old User Interface
Twitter retroactively added warning labels to tweets on its old user interface, creating mismatches between what was originally archived and what users see when replaying older Twitter mementos.
New Twitter UI: Replaying Archived Twitter Pages That Never Existed
Twitter's new user interface can cause archived Twitter pages to replay as layouts that never existed at crawl time, introducing further fidelity problems for web archive replay.
How Well Is Instagram Archived?
Using Katy Perry's Instagram account as a case study, we found that only about one-third of her individual posts were archived in public web archives—far behind Twitter and Facebook—despite Instagram's massive user base.
Twitter Was Already Difficult To Archive, Now It's Worse!
Changes to Twitter made an already difficult archiving problem harder, both for capturing Twitter content and for replaying it in web archives.
Professional Highlights
Conference trips, teaching, internships, career milestones, and reflections.
Teaching Database Concepts for Senior Undergraduate and Graduate Students at ODU
In Spring 2026, I taught CS 450/550: Database Concepts at ODU—a fully online course with live Zoom sessions. This post shares my first experience teaching a senior undergraduate/graduate course, from course setup and grading to student feedback that shaped the semester.
An Internship Experience With the Internet Archive as a Google Summer of Code Contributor
A reflection on my Google Summer of Code 2025 experience at the Internet Archive—working with mentors on detecting social media content in TV news, pivoting from traditional ML to LLMs, and learning to collaborate in open source.
ACM Capital Region Celebration of Women in Computing (CAPWIC) 2025 Trip Report
Trip report from CAPWIC 2025 at George Washington University, where eight ODU CS Ph.D. students—including five from WS-DL—attended keynotes, research sessions, and presented posters.
Trust and Influence Program Review Meeting 2024 Trip Report
Trip report from the AFOSR 2024 Trust and Influence Program Review in Dayton, Ohio, attended as part of our Minerva Research Initiative grant on innovating interdisciplinary methods for hard-to-reach environments.
A Milestone Achieved: Completing my Master's Degree and Advancing to PhD Candidacy
Reflecting on completing my MS thesis, "Supporting Account-based Queries for Archived Instagram Posts," and advancing to PhD candidacy in the WS-DL research group at ODU.
Data Science High School Summer Camp 2021
Recap of ODU's 10-day data science summer camp for Norfolk high school students, a hybrid onsite/virtual program that combined lessons on Python, NumPy, and data visualization with hands-on activities.
Himarsha Jayanetti (Computer Science Master's Student)
My introduction to the WS-DL blog as a new ODU Master's student from Sri Lanka, researching access patterns of robots vs. humans in the Internet Archive with collaborator Kritika Garg.
