Penn CBC GenAI Student Survey
Survey analysis examining student usage and perceptions of generative AI tools at Penn.
How the safety behavior of medical language models degrades across multi-turn clinical conversations.
- CIS 700 Agentic AI final project at Penn
- builds on PatientSafetyBench
- measures degradation across conversation length and prompt structure
Enhancing Claude 3.5 Sonnet Reliability on SWE-bench Verified
Reproduced Claude 3.5 Sonnet on SWE-bench Verified and tested whether prompt interventions could shift the failure distribution.
- 20% baseline resolve rate on a 50-task sample
- tried two prompt interventions: a strict TDD protocol and a simplified self-verification rule
- TDD protocol resolved 0 tasks because the agent stalled before editing
- simplified protocol held the 20% resolve rate but caused a surge in destructive over-deletions
Step budget and prompt design are not independent levers. Tightening one without adjusting the other can cancel out the gains.
Image2GPS
Predicting GPS coordinates from images of Penn's campus by running the full ML pipeline end to end.
- collected and labeled the dataset by hand around campus
- compared ResNet-18, ResNet-50, and ConvNeXt + k-NN architectures
- best model cut average geodesic error by 54% relative to the baseline
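
For readers unfamiliar with the metric, here is a minimal sketch of how an average geodesic error and a relative reduction could be computed. It assumes the haversine great-circle formula and a spherical Earth; the project's actual distance formula is not stated here, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mean_geodesic_error_m(predicted, actual):
    """Average haversine distance between paired (lat, lon) predictions and labels."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and the same length")
    total = sum(haversine_m(*p, *a) for p, a in zip(predicted, actual))
    return total / len(predicted)


def relative_reduction(baseline_error, model_error):
    """Fraction by which model_error improves on baseline_error (0.54 means 54%)."""
    return (baseline_error - model_error) / baseline_error
```

Under these assumptions, a "54% cut" corresponds to `relative_reduction(baseline, model)` returning 0.54, whatever the absolute errors are.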
Distributed Web Search Engine
My first distributed systems project in Java: a crawler, KV store, indexer, PageRank job, and search server built on the Flame framework.
- deployed on AWS EC2
- crawled 250,000+ pages
- implemented PageRank, TF-IDF ranking, and a search UI with spell-check
code available on request
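
As an illustration of the ranking step only, here is a minimal single-machine sketch of PageRank by power iteration. It is not the project's code, which is in Java on Flame and is not reproduced here; the damping factor of 0.85, the even redistribution of rank from pages with no outgoing links, and the function name are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank.

    links maps each page to a list of pages it links to.
    Returns a dict of page -> rank; ranks sum to 1.
    """
    # Collect every page, including ones that only appear as link targets.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    if not nodes:
        return {}

    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}

    for _ in range(max_iter):
        # Rank held by dangling pages (no out-links) is spread evenly over all pages.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}

        # Each page passes an equal share of its damped rank to every out-link.
        for u, targets in links.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share

        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break

    return rank
```

For example, `pagerank({"a": ["c"], "b": ["c"], "c": []})` ranks `c` above `a` and `b`, since both other pages link to it. A distributed version would partition the per-page shares across workers, which this sketch does not attempt.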
Community Data Elements for Spinal Cord Injury Research
First-author paper in Experimental Neurology on how community-based data standards affect reporting quality and interoperability in SCI research.
- systematically annotated 39 public datasets against 17 required data elements
- found that introducing standards lifted reporting rates from 43% to 93%
- co-developed a systematic annotation protocol now published on protocols.io
scPyDR
Python package for dimensionality reduction and visualization of single-cell RNA-seq data.
- built with two collaborators as a UCSD bioinformatics course project
- simpler implementations of Scanpy's PCA and UMAP for benchmarking
- command-line tool with configurable filtering and normalization, benchmarked against Scanpy on public 10x Genomics data
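
To show the kind of computation a simplified PCA involves, here is a minimal sketch using NumPy's singular value decomposition. It is not scPyDR's implementation; the function name, signature, and return values are assumptions for illustration, and it expects an already filtered and normalized cells-by-genes matrix.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA via SVD on a (cells x genes) matrix.

    Returns (scores, components, explained_variance):
      scores:             cells projected onto the top components
      components:         principal axes, one per row
      explained_variance: sample variance captured by each component
    """
    X = np.asarray(X, dtype=float)
    # Center each gene (column) so components describe variance, not the mean.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, components, explained_variance
```

The sign of each component is arbitrary in an SVD, so a comparison against another implementation such as Scanpy's would need to match components up to sign.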