Headshot of Chace with GT tennis courts in background.

Chace Caven

CS + Math

Georgia Institute of Technology

LinkedIn · GitHub · CV

Hi.

My goal is to invent artificial general intelligence.

I pursue this goal by investing my time in machine learning, neuroscience, and early education.

Outside of work and research, I play club tennis, compete in hackathons, read books, and go on side quests to improve my own visual system.

My main strength is knowing what to do when there is nobody there to tell me what to do. The majority of my projects are self-directed, and I actively practice managing a full schedule and innovating under pressure.

Photos: Chace with the Georgia Tech Club Tennis team; Chace wearing a Pupil Labs Neon eye tracker.

Here are some of my projects and interests.

Generative models of spatial and visual information

Early in my career, I built drones. For a drone to be autonomous, it needs to perform simultaneous localization and mapping (SLAM). I was particularly interested in visual SLAM.

There are algorithms, such as ORB-SLAM, that (1) take in a stream of images, (2) extract hard-coded visual features from each image, (3) track those features over time, and use the tracks to (4) estimate the camera's trajectory and (5) build a sparse point cloud of the environment.
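To make steps (2) and (3) concrete, here is a minimal OpenCV sketch of the feature-extraction and matching front end. The file names are placeholders for two consecutive camera frames; this is an illustration of the general pipeline, not ORB-SLAM itself.

```python
import cv2

# Detect up to 1000 ORB features per frame (step 2).
orb = cv2.ORB_create(nfeatures=1000)

# Placeholder paths for two consecutive grayscale frames from the camera stream (step 1).
img0 = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

kp0, des0 = orb.detectAndCompute(img0, None)
kp1, des1 = orb.detectAndCompute(img1, None)

# Match feature descriptors between frames to track them over time (step 3).
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des0, des1)

# A full SLAM system would feed these correspondences into pose estimation (step 4)
# and triangulation to grow a sparse point cloud of the environment (step 5).
print(f"{len(matches)} feature tracks between the two frames")
```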

This information is useful to me, a human being who understands how to interpret point clouds and measure distances through space. It is not as useful to a downstream algorithm. In practice, researchers build another algorithm to process and interpret point clouds.

This project is about creating a new method for performing SLAM without building point clouds or plotted coordinate trajectories. Instead, I use machine learning to build a generative model of vision and spatial information. Read the report here.

Evolving small liquid neural networks to uncover neural population dynamics

This is my ongoing project in a research lab at Georgia Tech led by Dr. Chris Rozell.

The brain is an immensely complicated system. For mobile, movement-based tasks, our tools are limited: we can introduce stimuli and observe behavior, and we can "listen" in with a mobile electroencephalogram (EEG). I am interested in decoding EEG signals with machine learning techniques.

There's a problem: unlike in speech processing, only a tiny amount of EEG data is available, which rules out most deep learning techniques. Interpretability is also a major goal: neuroscientists want to use a trained model to uncover neural dynamics that could help diagnose or treat conditions.

How do you create a system small enough not to overfit but still capable of modeling complex interactions? I'm interested in a specific type of sparse recurrent model called neural circuit policies: the idea is to optimize the structure of a wiring graph whose dynamics directly model the EEG data.
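As a rough illustration of how a wiring graph constrains a recurrent model (this is a toy PyTorch cell, not the actual liquid time-constant dynamics of neural circuit policies):

```python
import torch
import torch.nn as nn

class SparseRNNCell(nn.Module):
    """Toy recurrent cell whose recurrent weights are gated by a fixed wiring graph."""
    def __init__(self, n_in, n_hidden, wiring_mask):
        super().__init__()
        self.w_in = nn.Linear(n_in, n_hidden)
        self.w_rec = nn.Parameter(torch.randn(n_hidden, n_hidden) * 0.1)
        self.register_buffer("mask", wiring_mask)  # {0,1} adjacency, n_hidden x n_hidden

    def forward(self, x, h):
        # Only connections allowed by the wiring graph contribute to the state update.
        rec = h @ (self.w_rec * self.mask)
        return torch.tanh(self.w_in(x) + rec)

# Example: an 8-unit hidden state with ~25% random connectivity.
mask = (torch.rand(8, 8) < 0.25).float()
cell = SparseRNNCell(n_in=4, n_hidden=8, wiring_mask=mask)
h = torch.zeros(1, 8)
h = cell(torch.randn(1, 4), h)
```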

How do you optimize the structure of a graph? Graph structure isn't differentiable, so instead of gradient descent I am going back to my roots and using NEAT (NeuroEvolution of Augmenting Topologies), a genetic algorithm for structure discovery. I hope to release my first preprint in 2025, so stay tuned.
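For a sense of what an evolution loop looks like, here is a sketch using the neat-python package. The config path and fitness function are placeholders (a toy next-sample prediction task with a single-input, single-output network); the real project scores candidate wirings on how well they model EEG recordings.

```python
import neat

def eval_genomes(genomes, config):
    """Assign each candidate wiring a fitness score (placeholder scoring shown)."""
    signal = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5] * 10
    for genome_id, genome in genomes:
        net = neat.nn.RecurrentNetwork.create(genome, config)
        net.reset()
        error = 0.0
        # Placeholder task: predict the next sample of a toy periodic signal.
        for x, target in zip(signal, signal[1:]):
            (pred,) = net.activate([x])
            error += (pred - target) ** 2
        genome.fitness = -error

# "neat_config.ini" is a placeholder; it defines the population size, node and
# connection mutation rates, and the number of inputs/outputs.
config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     "neat_config.ini")
population = neat.Population(config)
winner = population.run(eval_genomes, 50)   # evolve for 50 generations
```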

Neurons that fire together, wire together: exploring backpropagation alternatives

I am interested in creating the digital equivalent of a neuron: a single unit that can be duplicated, wired up, and activated by sensory information. Ideally, the unit has some "learning" behavior that doesn't require backpropagating through the entire network. Removing the backpropagation step also removes the distinction between training and testing.

The main challenge is devising an objective function for each unit. The only information available is (1) the incoming activations, (2) the weights of the network, and (3) the output activations.

What if the objective is simply to pattern-match the incoming activations? That is, determine a "vocabulary" of patterns and optimize it to cover the incoming data. In this project, I design the mathematical formulation of such an objective and apply it to a small one-layer convolutional network.

Here's a link to the Google Colab notebook: fire_together_wire_together.ipynb.
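To give a flavor of the idea, here is a generic competitive-learning sketch in NumPy (not the exact objective from the notebook above): each unit holds one vocabulary entry, its activation is its similarity to the incoming patch, and the best-matching entry is nudged toward that patch using only local information, with no backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, patch_dim = 16, 25                  # 16 vocabulary entries over flattened 5x5 patches
vocab = rng.normal(size=(n_units, patch_dim))
vocab /= np.linalg.norm(vocab, axis=1, keepdims=True)

def local_update(patch, vocab, lr=0.05):
    """Purely local update: the best-matching entry moves toward the input patch."""
    patch = patch / (np.linalg.norm(patch) + 1e-8)
    scores = vocab @ patch                   # unit activations = similarity to the patch
    winner = int(np.argmax(scores))
    vocab[winner] += lr * (patch - vocab[winner])
    vocab[winner] /= np.linalg.norm(vocab[winner])
    return scores

for _ in range(1000):                        # random patches stand in for image data here
    local_update(rng.normal(size=patch_dim), vocab)
```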

Self-supervised pretraining of audio spectrogram transformers

Recorded speech can, in theory, be decomposed into three attributes: the speech itself (words, emotion, pacing), the speaker (i.e., the physical structures that produced the sounds), and the background noise.

This project was all about designing machine learning architectures and objectives to "pre-train" a model on 100,000 hours of unlabeled speech audio so that it generalizes better on tasks where only 100 hours of labeled audio exist.
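As a generic illustration of this family of objectives (not the internship's actual architecture or loss), here is a toy masked-reconstruction objective over spectrogram frames in PyTorch: hide a fraction of the frames, encode what remains, and score the model only on how well it reconstructs the hidden frames.

```python
import torch
import torch.nn as nn

class TinySpectrogramMAE(nn.Module):
    """Toy masked-reconstruction objective over spectrogram frames."""
    def __init__(self, n_mels=80, d_model=128):
        super().__init__()
        self.proj = nn.Linear(n_mels, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2)
        self.decoder = nn.Linear(d_model, n_mels)

    def forward(self, spec, mask_ratio=0.5):
        # spec: (batch, frames, n_mels)
        mask = torch.rand(spec.shape[:2], device=spec.device) < mask_ratio
        corrupted = spec.masked_fill(mask.unsqueeze(-1), 0.0)   # hide masked frames
        recon = self.decoder(self.encoder(self.proj(corrupted)))
        # The loss is computed only on the frames that were hidden from the encoder.
        return ((recon - spec) ** 2)[mask].mean()

loss = TinySpectrogramMAE()(torch.randn(2, 200, 80))
loss.backward()
```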

In addition to machine learning methods, I wrote a custom data engine that streams audio from the internet, performs augmentations, and dispatches training commands 413x faster than real-time. The entire pipeline is written in safe, synchronous Rust (the best programming language).

This project took place during a summer internship, so the source code is not available.