Projects


:star: Natural Language Understanding for Twitter Campaign Monitoring

:link:

Developed a data-driven pipeline to analyze and categorize public discourse on social media using state-of-the-art NLP models. I implemented and evaluated various techniques—including BERT, GloVe embeddings, Word Mover’s Distance, and short-text clustering—to monitor political narratives and campaign performance. The system provides insights into public sentiment, enabling organizations to optimize their communication strategies based on automated analysis of large-scale tweet datasets.

:star: Paparazzi Tester

:link:

I developed a test generation and execution framework for the Paparazzi autopilot system. This tool includes a flight data recorder to capture realistic telemetry data, which I utilized for my graduate thesis research. I designed the framework to automate the collection of diverse flight datasets, facilitating more robust testing and validation of autopilot behaviors.

:star: Haplophysh

:link:

Conducted research on deep learning-based phishing detection, developing an end-to-end experimental framework for data acquisition, automated processing, and model evaluation. I architected a fully automated data engineering pipeline that ingested, transformed, and merged heterogeneous data from multiple sources into a unified format, designed to adapt to changing source schemas seamlessly. On the modeling side, I designed and compared several neural network architectures—including RNNs (GRU) and CNNs with multi-level embeddings—using Keras's functional API to evaluate their effectiveness across diverse, large-scale phishing datasets.

More details are available in the repository's readme and the 1-page extended abstract.

:star: Inferring State Models from Black-box Software Using Hybrid Deep Neural Networks

:link:

Developed as an MSc project in collaboration with MicroPilot Inc, this research focused on generating state models for UAV autopilot software to enhance testing and verification. I designed a hybrid deep neural network combining convolutional and recurrent layers to predict system states from sensor readings and servo outputs using a black-box approach, ensuring the method remains generalizable across different autopilot systems.

The project involved building a custom data pipeline using TensorFlow's Dataset API to handle large-scale, non-standardized industrial datasets efficiently. I implemented custom Keras layers for sequence masking and developed specialized loss functions to address the unique challenges of telemetry data evaluation. The research demonstrated that high-fidelity state models could be inferred without requiring access to the underlying source code.
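As a rough illustration of why masking matters when scoring variable-length telemetry sequences, the sketch below computes accuracy over real timesteps only and skips padded ones. It is a plain-Python stand-in, not the project's Keras layers or loss functions; the `PAD` marker, the `masked_accuracy` name, and the toy state labels are all hypothetical.

```python
from typing import Sequence

PAD = -1  # hypothetical label value marking padded timesteps


def masked_accuracy(
    predicted: Sequence[Sequence[int]],
    actual: Sequence[Sequence[int]],
    pad: int = PAD,
) -> float:
    """Fraction of correctly predicted states, ignoring padded timesteps."""
    correct = 0
    total = 0
    for pred_seq, true_seq in zip(predicted, actual):
        for pred_state, true_state in zip(pred_seq, true_seq):
            if true_state == pad:
                continue  # padded step: contributes nothing to the score
            total += 1
            correct += int(pred_state == true_state)
    return correct / total if total else 0.0


# Toy batch: the second sequence has only two real timesteps.
actual_states = [[0, 0, 1, 1], [2, 2, PAD, PAD]]
predicted_states = [[0, 1, 1, 1], [2, 2, 0, 0]]
accuracy = masked_accuracy(predicted_states, actual_states)
```

Without the mask, the two padded steps of the second sequence would be counted as errors and distort the score, which is the kind of problem masking is meant to avoid.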

This work was published at the ASE '20 conference. A follow-up study, "Deep state inference: Toward behavioral model inference of black-box software systems", was published in IEEE Transactions on Software Engineering (TSE '21). The project was further extended to include hyperparameter optimization and transfer learning, details of which are available in my thesis.

iHealth Card

iHealth Card is a personal, card-shaped USB device designed to store medical records in a distributed and encrypted manner, providing a scalable healthcare solution for regions with limited internet infrastructure. I developed the companion Android application using Java and Kotlin, allowing healthcare workers to securely access and manage patient data directly from the device.

Tweeeeter

I trained an LSTM-based recurrent neural network with character embeddings on the corpus of my own tweets to generate new tweets in my writing style.

:star: Sequence 2 Script

:link:

This project started during the Neuro Nexus hackathon in March 2019. We developed a tool to assist laboratories and clinicians in translating pharmacogenomic testing results into clinically useful recommendations. The recommendations are based on expert guidelines developed by the Clinical Pharmacogenetics Implementation Consortium (CPIC) and the Royal Dutch Association for the Advancement of Pharmacy - Pharmacogenetics Working Group (DPWG).

As the technical lead, I coordinated the initial brainstorming sessions, defined project goals and milestones, and managed task allocation. We built an online tool with an intuitive interface for clinicians. I architected and implemented the backend using Python and deployed it as a serverless application on AWS Lambda. I also managed the integration of the front-end and the data pipeline, ensuring the project met its objectives and deadlines.

This experience provided significant growth in leadership and technical architecture, particularly in leveraging AWS services for rapid prototype development in a high-pressure environment.

:star: Nivad Cloud

:link:

As a co-founder and the technical lead of Nivad Cloud, I architected and developed the initial infrastructure for a Backend-as-a-Service (BaaS) platform. Our flagship product was a hardened, secure in-app purchase system that successfully served thousands of production applications. As the platform evolved, I led the technical design and development across the entire stack, including the backend API, web-based management dashboards, and various client-side SDKs.

Beyond the technical implementation, I was instrumental in the startup's growth, leading efforts to secure seed funding and scale the team. This experience allowed me to transition from a hands-on developer to a technical leader, where I focused on hiring top talent, delegating responsibilities, and fostering a robust product mindset. I oversaw the expansion of our product line and ensured that our architecture remained scalable and secure as our customer base and team size grew.

This role was pivotal in shaping my approach to engineering leadership, balancing entrepreneurial agility with the rigorous architectural standards required for high-availability commercial services.

Crowd Summarizer

:link:

For my BSc. capstone project, I developed a gamified crowdsourcing platform for code description data collection. I built a web-based system that enabled users to perform data labeling and peer-verification, integrated with mechanisms for scoring, achievement badges, and a competitive leaderboard to ensure high engagement and data quality.

I deployed the platform on OpenShift and successfully facilitated the collection and verification of thousands of data labels from hundreds of active users. I also created a step-by-step technical tutorial for the platform's maintenance and formally recognized top-performing participants for their contributions to the research dataset.

:star: Mini Google

:link:

Mini Google is a specialized search engine for academic papers that crawls ResearchGate to index titles, authors, and abstracts using an Elasticsearch backend. I implemented the crawling system, search interface, and ranking algorithms.

The system features clustering based on author co-authorship and a custom PageRank implementation to rank papers based on citation graphs, providing relevant search results through a Flask-based web interface.
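To give a sense of how citation-based ranking works, here is a minimal PageRank sketch using power iteration. It is an illustration under assumptions, not the project's implementation: the `pagerank` function, the damping factor of 0.85, the fixed iteration count, and the toy graph are all hypothetical choices.

```python
from typing import Dict, List


def pagerank(
    citations: Dict[str, List[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Rank papers by power iteration over a citation graph.

    `citations` maps each paper id to the ids of the papers it cites.
    """
    # Collect every paper that appears as a source or as a target.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    n = len(papers)
    if n == 0:
        return {}

    ranks = {paper: 1.0 / n for paper in papers}
    for _ in range(iterations):
        # Papers that cite nothing spread their rank evenly over all papers.
        dangling = sum(ranks[p] for p in papers if not citations.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_ranks = {paper: base for paper in papers}
        for source, cited in citations.items():
            if not cited:
                continue
            share = damping * ranks[source] / len(cited)
            for target in cited:
                new_ranks[target] += share
        ranks = new_ranks
    return ranks


# Hypothetical toy graph: A and B both cite C, and C cites D.
toy_graph = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
scores = pagerank(toy_graph)
```

In this toy graph, papers that are cited (C and D) end up with higher scores than papers that only cite others (A and B), and the scores sum to 1.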

Inverted Indexing for query answering + Semantic classification on IMDb and MEDLINE

Developed an information retrieval and text classification system as part of a large-scale project involving multiple components implemented from scratch. I built a custom inverted indexing and query engine in Java, covering tokenization, stemming, and TF-IDF-based ranking. Additionally, I implemented a classification pipeline to evaluate KNN and Naïve Bayes algorithms on the IMDb and MEDLINE datasets, focusing on the end-to-end process of data preprocessing, vectorization, and performance evaluation.
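The core idea behind an inverted index with TF-IDF ranking can be sketched in a few lines. The original system was written in Java and included stemming; the Python sketch below is a simplified illustration that skips stemming, assumes each document id is added only once, and uses hypothetical names (`InvertedIndex`, `tokenize`) and toy documents.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens (no stemming)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    def __init__(self) -> None:
        # term -> {doc_id: term frequency in that document}
        self.postings: Dict[str, Dict[int, int]] = defaultdict(dict)
        self.doc_count = 0

    def add(self, doc_id: int, text: str) -> None:
        """Index one document; each doc_id is assumed to be added once."""
        for term, tf in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = tf
        self.doc_count += 1

    def search(self, query: str) -> List[Tuple[int, float]]:
        """Return (doc_id, score) pairs sorted by descending TF-IDF score."""
        scores: Dict[int, float] = defaultdict(float)
        for term in tokenize(query):
            docs = self.postings.get(term)
            if not docs:
                continue  # term not in the index
            # Rare terms get a higher inverse document frequency.
            idf = math.log(self.doc_count / len(docs))
            for doc_id, tf in docs.items():
                scores[doc_id] += tf * idf
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy documents for illustration only.
index = InvertedIndex()
index.add(1, "Deep learning for phishing detection")
index.add(2, "Phishing emails and phishing websites")
index.add(3, "Autopilot state inference")
results = index.search("phishing detection")
```

Here document 1 ranks first because it matches both query terms, including the rarer term "detection", while document 2 matches only "phishing".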

Gold Hunters AI challenges

:link:

I implemented all the socket and networking code and parts of the game UI for this project in Java. We created a multi-agent game for ACM ICPC participants as an extra contest to which they could optionally submit code. In the Gold Hunters game, two groups of gold-seeking agents explore a map in search of hidden treasures and occasionally fight each other. We developed two versions, with slightly different objectives and a number of improvements, for two rounds of the regional contest in 2014 and 2015.
