Design, Architecture,
Engineering, & Mathematics
About
Final-year student in computer science and mathematics at Dartmouth College, with particular interests in deep learning, category theory, functional programming, and design.
Starting in July 2024, I will be joining Meta as a software engineer working with the Instagram team to create new experiences for its global user base.
I carry a keen sense of responsibility for my work, including:
- Ethical and responsible data collection and warehousing.
It is important to know what to collect, how, and why, and most importantly how to respect user privacy and copyright where applicable. I have relevant experience and coursework in data mining and the ethics therein.
Proper warehousing, be it in data lakes or SQL/NoSQL databases, is also critical. I have experience working with SQL databases (MySQL, PostgreSQL), NoSQL databases (MongoDB, Firebase), and vector databases (Pinecone), and I am working to better understand the underlying architectures and implementations of SQL databases.
- Ethical analysis, interpretation and usage.
We are in an exciting period in which the potential of deep learning methods, especially transformers, is being realized. I am interested in the applications of novel neural network architectures and ideas to real-world problems. I have experience working with neural networks in computer vision, natural language processing, and reinforcement learning.
I am particularly excited to explore the interplay between these cutting-edge fields and data ethics.
- Presentation and use in production.
Research output is only useful once it can be turned into a product and presented to an end user. This requires identifying existing gaps, then designing and implementing innovative solutions that address them. I have experience designing and building front-end, user-facing applications, and I am experienced with React, Vue, and related frameworks (Nuxt, Next, Astro).
- Teaching
They say the best way to learn is to teach others.
At Dartmouth, I got the exciting opportunity to work closely with multiple professors in tinkering with and teaching their courses, as well as holding office hours to help students debug their code and understand the material better. The courses included Systems Engineering, Artificial Intelligence, Database Systems, Data Structures, and Fullstack Web Development.
- I also have experience building systems with a focus on efficiency and high performance using C, C++, Rust, and Haskell.
I am also working on some stuff I am excited about over at entendr and recently started learning Racket because of its language-oriented development features.
Are you excited about any of these things?
Do reach out!
Education
Double Major in Mathematics and Computer Science.
Relevant coursework in mathematics includes vector calculus, linear algebra, differential equations, abstract algebra, real analysis, logic, cryptography, and game theory.
Relevant coursework in computer science includes data structures & OOP, algorithms, theory of computation, computer systems, computer architecture, database systems, full-stack web development, machine learning, artificial intelligence, deep learning, computer vision, data-mining, and physical computing.
Participated in a 10-week summer fellowship organized by Y Combinator that engages current and aspiring startup founders with experienced founders and investors to cultivate a strong acumen for entrepreneurship.
Work Experience
Joining the Instagram team starting July 2024 to work on infrastructure supporting 1.4 billion Instagram users.
Assisted in the instruction and facilitation of various courses:
Course | Term | Instructor |
---|---|---|
Fullstack Web Dev | 24S | Tim Tregubov |
Systems | 24W | Charles Palmer |
Systems | 23F | Charles Palmer |
Systems | 22S | Xia Zhou |
Systems | 22W | Charles Palmer |
Systems | 21F | Temi Prioleau |
Artificial Intelligence | 23F | Devin Balkcom |
Database Systems | 23X | Adam Goldstein |
OOP | 23S | Devin Balkcom |
OOP | 23W | Tim Pierson |
Worked with Professor Alberto Quattrini Li and graduate student Mingi Jeong at the Dartmouth Reality and Robotics Lab.
Research work focused on applications of modern deep learning and computer vision techniques in robotics for more robust waterline detection.
At the time, Meta was pivoting its machine learning strategy to be more efficient with limited data in light of cross-site tracking restrictions introduced by Apple in iOS 14.5.
To facilitate this transition:
- I prototyped a new concept for a closed-loop ML pipeline from Facebook and Instagram apps to internal data-stores, to machine learning workflows, and eventually relaying feedback to the apps.
- I developed a command-line and web utility (in C++ and PHP) for monitoring and controlling the data throughput of the ML pipeline.
The tool enabled machine learning engineers to fine-tune the data rates that they desired for their models, depending on the scale of experimentation.
Copia re-imagines e-commerce for developing communities in East Africa by providing solutions for the restrictions of the region, such as ordering products via text message and paying via mobile money.
During my time there:
- I developed a new widget for the Copia website that simplified the order-tracking process for orders made via text message.
- I also worked with the engineering team that maintained Copia's database and internal APIs. Technologies used included Python, Django, and PostgreSQL.
It was estimated that more than 80 percent of Copia's two million customers ordered via text message at least once a month.
Compsight specializes in building tech solutions for SMEs and other small players in Kenya that otherwise would not have leeway to build entire engineering departments.
Featured Projects
Optical Flow
Experimentation with different techniques for tracking optical flow
in a video sequence. Optical flow is the apparent motion of objects,
either due to the motion of the objects themselves or the motion of the
camera.
Techniques implemented include:
- Lucas-Kanade method for estimating optical flow by assuming that the flow is constant in a local neighborhood.
- Lucas-Kanade inverse-compositional method for estimating optical flow by assuming that the flow is constant in a local neighborhood and that the flow is smooth across the image.
- Matthews-Baker inverse-compositional method for estimating optical flow by assuming that the flow is constant in a local neighborhood and that the flow is smooth across the image.
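To give a flavor of the first technique, here is a minimal sketch of a single Lucas-Kanade step in plain Python. It is not the code from the report: the function names, the list-of-rows image format, and the synthetic test frame are all assumptions made for illustration. It assumes the flow is constant inside a small window and solves the resulting 2x2 least-squares system.

```python
def lucas_kanade_window(prev, curr, row, col, half=2):
    """Estimate the flow (u, v) of the window centered at (row, col).

    prev and curr are grayscale frames stored as lists of rows.
    u is the horizontal (column) shift and v is the vertical (row) shift.
    Returns None when the system is singular (the aperture problem).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            ix = (prev[r][c + 1] - prev[r][c - 1]) / 2.0  # spatial gradient in x
            iy = (prev[r + 1][c] - prev[r - 1][c]) / 2.0  # spatial gradient in y
            it = curr[r][c] - prev[r][c]                  # temporal gradient
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    # Solve [[sxx, sxy], [sxy, syy]] @ [u, v] = -[sxt, syt] in closed form.
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None
    u = (sxy * syt - syy * sxt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def synthetic_frame(shift_x=0.0, shift_y=0.0, size=11):
    """Hypothetical test image: a smooth bowl, optionally shifted."""
    mid = size // 2
    return [[(c - mid - shift_x) ** 2 + 2.0 * (r - mid - shift_y) ** 2
             for c in range(size)] for r in range(size)]


prev_frame = synthetic_frame()
curr_frame = synthetic_frame(shift_x=0.3, shift_y=-0.2)
flow = lucas_kanade_window(prev_frame, curr_frame, row=5, col=5)
print(flow)  # close to (0.3, -0.2) for this synthetic example
```

A practical tracker would iterate this step, warp the image between iterations, and usually work over an image pyramid; the sketch shows only the core solve.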
You can view the report here.
Augmented Reality with Planar Homographies
Experimentation with using planar homographies to overlay images on top of
other images. The goal is to create an augmented reality effect by
transforming the perspective of the overlay image to match the perspective
of the background image.
Techniques implemented include:
- Image feature descriptors, such as BRIEF and ORB, for detecting and describing keypoints in images.
- RANSAC algorithm for estimating the homography between two images.
- Perspective Transform for transforming the perspective of an image to match the perspective of another image.
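As an illustration of the last two items, the sketch below estimates a homography from exactly four point correspondences and applies it to a point. It is a simplified stand-in rather than the project's code: the helper names are hypothetical, and it assumes the bottom-right entry of the homography can be normalized to 1. A RANSAC loop would typically call an estimator like this on random samples of four matches and keep the homography with the most inliers.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_4_points(src, dst):
    """Return the 3x3 homography H (with H[2][2] = 1) mapping src to dst."""
    a, b = [], []
    for (x, y), (X, Y) in zip(src, dst):
        # Each correspondence contributes two linear equations in h0..h7.
        a.append([x, y, 1, 0, 0, 0, -X * x, -X * y])
        b.append(X)
        a.append([0, 0, 0, x, y, 1, -Y * x, -Y * y])
        b.append(Y)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(H, point):
    """Apply the perspective transform H to a 2D point."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]  # homogeneous scale factor
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


# Map the unit square onto an arbitrary quadrilateral (made-up coordinates).
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
target = [(10, 20), (110, 30), (120, 140), (5, 120)]
H = homography_from_4_points(corners, target)
print(warp_point(H, (0.5, 0.5)))  # where the center of the square lands
```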
You can view the report here.
Artificial Intelligence (AI) is a hot topic, especially in light of recent improvements in its capabilities. This research project studies societal attitudes toward AI, both at present and as they have evolved over the years, as a way to understand how different events have shaped the public's perception of AI. We use topic modeling, sentiment analysis, and Procrustes analysis to analyze relationships across time periods and extract insight into the changing story of artificial intelligence. Here's the studied dataset and the project report.
Collaborative project with Aimen Abdulaziz and Angelic McPherson.
I wrote a high-performance web scraper in Haskell to scrape 17,000+ articles from online technology websites such as DeepMind, MIT Tech Review, OpenAI, Singularity Hub, and TechCrunch.
The scraper uses Arrows and other functional programming patterns to ensure concurrency and efficiency. The dataset is open-source and available on HuggingFace.
Collaborative project with Aimen Abdulaziz and Angelic McPherson.
Generative Pre-trained Transformer
For fun, I implemented the Generative Pre-trained Transformer (GPT) architecture in PyTorch. GPT is a language model that uses a transformer architecture to generate text. I trained the model on the tiny Shakespeare dataset and used it to generate text in the style of Shakespeare.
Transformers are a type of neural network architecture that employs techniques such as:
- Self-attention, which allows the model to learn the relationships between different parts of the input data.
- Positional encoding, which gives the model information about the order of the input tokens, since attention by itself is order-agnostic.
- Multi-head attention, which runs several attention operations in parallel so that the model can learn different kinds of relationships between parts of the input data.
- Residual connections, which add each layer's input to its output so that the layer only needs to learn the difference between the two.
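To make the first item concrete, here is a minimal sketch of single-head causal self-attention, the variant a GPT-style model uses so that each position only attends to earlier positions. It is written in plain Python rather than PyTorch so that it runs without dependencies, and it is not the project's implementation: the function names are hypothetical, and the learned query, key, and value projections are assumed to have been applied already.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention where position t sees positions 0..t."""
    d = len(keys[0])
    outputs = []
    for t, q in enumerate(queries):
        # Similarity of the current query to every key up to and including t.
        scores = [sum(qi * ki for qi, ki in zip(q, keys[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        weights = softmax(scores)
        # The output is the attention-weighted average of the visible values.
        outputs.append([sum(w * values[s][j] for s, w in enumerate(weights))
                        for j in range(len(values[0]))])
    return outputs


# With all-zero queries the weights are uniform, so each output is simply
# the running mean of the values seen so far.
vals = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
zeros = [[0.0, 0.0] for _ in vals]
print(causal_self_attention(zeros, zeros, vals))
```

Multi-head attention would run several copies of this with different projections and concatenate the results.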
Adversarial Training for Neural Networks
Experimentation with various adversarial training techniques for neural networks. Adversarial training is useful for improving the robustness of neural networks to adversarial attacks, which often appear as noise-like perturbations in the input data. Techniques explored include:
- Data augmentation, which helps the model have more data to learn from. New samples are generated by randomly cropping and/or flipping some images. We also add perturbations to a random subset of images, which helps the model learn to be robust to noise.
- Dropout, which entails randomly blocking a subset of the neurons in the network from transmitting information during training. This helps large models avoid overfitting and thus generalize better.
- Ensemble learning, which entails training multiple models on the same dataset and then averaging their predictions. This helps predictions generalize better by mitigating the effect of any single model overfitting.
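Two of these ideas are small enough to sketch in a few lines of plain Python. The snippet below is illustrative only and not the project's code: the function names are hypothetical, the dropout shown is the common "inverted" variant (kept activations are rescaled during training), and the ensemble simply averages per-class probabilities.

```python
import random


def dropout(activations, p, rng, training=True):
    """Zero each activation with probability p and rescale the survivors."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)  # dropout is disabled at inference time
    keep = 1.0 - p
    # Dividing by keep preserves the expected value of each activation.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def average_predictions(predictions):
    """Average the per-class probabilities predicted by several models."""
    n = len(predictions)
    return [sum(column) / n for column in zip(*predictions)]


rng = random.Random(0)  # seeded for repeatability
print(dropout([1.0, 2.0, 3.0, 4.0], p=0.5, rng=rng))
print(average_predictions([[0.2, 0.8], [0.6, 0.4]]))
```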
Relational Database App
A relational database with a Python frontend and a MySQL backend, including various triggers and SQL automations.
Collaborative project with Ke Lou.
Logisim Processor
A fully functional 16-bit CPU implemented in Logisim. All CPU components including the ALU, registers, control unit, program counter, RAM, micro-sequencer, finite-state machine, and IO are implemented from basic gates.
Intelligent Chess Bot
A chess bot that uses various strategies including minimax, alpha-beta pruning, iterative deepening, transposition tables, move ordering, null-move pruning, aspiration windows, and quiescence search to maximize its outcome against an opponent in chess.
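The core of such a bot is minimax with alpha-beta pruning. The sketch below shows the idea on a toy game tree instead of a chess board: nested lists are positions, numbers are leaf evaluations, and the `visited` list is a hypothetical addition used only to show which leaves get evaluated. It is a generic illustration, not the bot's code.

```python
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf"),
              visited=None):
    """Return the minimax value of node, skipping branches that cannot matter."""
    if not isinstance(node, list):  # leaf: a static evaluation score
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimizer already has a better option
                break
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:  # the maximizer already has a better option
            break
    return best


tree = [[3, 5], [2, 9]]
seen = []
print(alphabeta(tree, maximizing=True, visited=seen))  # 3
print(seen)  # [3, 5, 2]; the leaf 9 is pruned
```

Good move ordering matters because pruning is most effective when the strongest moves are searched first.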
Tiny Search Engine
A hyper-efficient search engine that crawls webpages (whose domain can be restricted to a given subset) and indexes them, then handles user queries on the contents of the collection of pages, with results ranked by frequency. It also supports query modifiers such as AND, OR, and NOT.
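The indexing and querying steps can be sketched with an inverted index that maps each word to per-page counts. This is a simplified illustration with hypothetical function names, not the engine's actual code. In particular, the scoring rules (minimum count for AND, summed counts for OR) are assumptions about how "ranked by frequency" could work.

```python
from collections import Counter, defaultdict


def build_index(pages):
    """Map each word to a Counter of {page_id: occurrences}."""
    index = defaultdict(Counter)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word][page_id] += 1
    return index


def match_all(index, words):
    """AND: pages containing every word, scored by the rarest word's count."""
    postings = [index.get(w, Counter()) for w in words]
    common = set(postings[0]).intersection(*postings[1:])
    return {page: min(p[page] for p in postings) for page in common}


def match_any(index, words):
    """OR: pages containing any word, scored by the summed counts."""
    scores = Counter()
    for w in words:
        scores.update(index.get(w, Counter()))
    return dict(scores)


def exclude(scores, index, word):
    """NOT: drop pages that contain the given word."""
    banned = index.get(word, {})
    return {page: s for page, s in scores.items() if page not in banned}


def ranked(scores):
    """Sort by descending score, breaking ties by page id."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up pages standing in for crawled content.
pages = {
    "a": "rust and haskell and rust",
    "b": "haskell arrows",
    "c": "rust compilers",
}
index = build_index(pages)
print(ranked(match_any(index, ["rust", "haskell"])))
# [('a', 3), ('b', 1), ('c', 1)]
```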