Hi, my name is

Haoyang.

Solutions fuel my passion.

A passionate software developer. I use modern technologies to solve problems and create value.

I am also a music arrangement enthusiast. Don’t forget to check out my portfolio!

About Me

I am a second-year master’s student at Georgetown University majoring in computer science, with a current GPA of 3.96.

My interests focus on backend development and data engineering. I have implemented service backends, big data workflows, and applications during my past internships and projects. This work involved distributed components, machine learning algorithms, backend frameworks, and cloud services, across multiple languages including Python, Scala, and Java.

Here are a few technologies and concepts I've been working with:
  • Python
  • Java
  • C/C++
  • Amazon Web Services
  • SQL
  • Kubernetes
  • Docker
  • NoSQL
  • Git
  • Spring
  • Unix/Linux
  • Elasticsearch
  • Redis
  • Keras
  • PyTorch
  • Spark
  • Scala
  • Object-Oriented Programming (OOP)
  • Algorithms and Data Structures
  • Design Patterns

Experience

Research Assistant - Georgetown University McDonough School of Business
Jun 2022 - Nov 2022
  • Advised on and budgeted the cloud-based project workflow, trimming development cost by 50% and development time by 70% compared to training and tuning models from scratch.

  • Designed and built a serverless data processing workflow with AWS Lambda and AWS S3. Parallelized serverless function calls ran 1000x faster than traditional cloud server instances at a similar cost, reducing data processing time by over 90%.
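
The fan-out idea behind the parallelized workflow above can be sketched locally. This is only an illustration under stated assumptions: a thread pool stands in for concurrent serverless invocations, and `process_chunk`, `split`, and `fan_out` are hypothetical names, not the project's actual code or any AWS API.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    """Hypothetical per-chunk work; stands in for one serverless invocation."""
    return sum(len(record) for record in chunk)


def split(records, size):
    """Split records into consecutive chunks of at most `size` items."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def fan_out(records, chunk_size=2, max_workers=8):
    """Process all chunks concurrently; results come back in chunk order."""
    chunks = split(records, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_chunk, chunks))


if __name__ == "__main__":
    # Five records become three chunks, each handled by its own worker.
    print(fan_out(["ab", "cde", "f", "ghij", "k"]))
```

Because each chunk is independent, total wall-clock time is bounded by the slowest chunk rather than the sum of all chunks, which is where the speedup of a fan-out design comes from.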

Algorithm Intern - Beijing Computing Center
Jun 2019 - Sep 2019
  • Helped build the backend of a proposal analysis system: implemented text similarity, text classification, and abstract generation algorithms, and exposed them through a RESTful API built with Flask in Python and ORM-based SQL frameworks.

  • Refactored monolithic applications into a microservice- and component-based architecture, breaking components into Pods and containers for Kubernetes clusters.

  • Implemented an image correction algorithm that reduced color RMSE by 60%, which was integral to the product’s performance.
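
As a rough illustration of the text similarity component mentioned in the first bullet above, here is a minimal sketch of cosine similarity over term-frequency vectors. The source does not say which similarity method the system used, so this technique is an assumption; the ASCII-only tokenizer and the function names are likewise hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (ASCII-only assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    # Only terms present in both texts contribute to the dot product.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    # An empty text has a zero vector; define its similarity as 0.
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    print(cosine_similarity("data pipeline", "data model"))
```

Texts sharing no terms score 0, and identical texts score 1 up to floating-point rounding.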

Research & Development Intern - Baidu
Jan 2020 - Aug 2020
  • Completed optimizations and manufacturer customization requirements, with self-testing, for the Baidu Mobile App for Android in Java; 80% of this work was pushed to master. Tasks included adapting different notification push services for Chinese users, modifying default page styles for different phone models, and adding or removing specific entries for preinstalled OEM versions.

  • Expanded the team’s technology stack by evaluating the new UI toolkit Flutter.

Education

2021 - 2023
Master of Science in Computer Science
Georgetown University
GPA: 3.96 out of 4.0
  • Selected courses: Database Management System, Computer Architecture, Grad. Algorithm, Gems of Theoretical Computer Science, Information Assurance, Deep Learning with Neural Networks, Statistical Machine Learning
2016 - 2020
Bachelor of Science in Computer Science and Technology
Beijing Jiaotong University
  • Selected courses: Object Oriented Programming & C++, Operating Systems, Computer Organization, Data Structure, Algorithm Design and Analysis, Artificial Intelligence, Database Systems

Projects

Toy-DB: Implementation of Relational Database Management System
Java, Visitor Design Pattern, Git, SQL, Database Management System
  • Implemented an RDBMS in Java using the Visitor design pattern and composite data structures, supporting nearly the full SQL syntax, including implicit joins, expression updates, and arbitrary expression evaluation in WHERE conditions.

  • Implemented cost-based and rule-based query optimization and integrity constraints, achieving sub-second response times on tables of up to one million records, which is close to commercial grade.
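
To show how the Visitor pattern supports arbitrary expression evaluation in WHERE conditions, here is a minimal sketch. It is written in Python for brevity, whereas Toy-DB itself is in Java, and every class and function name below is hypothetical rather than taken from the project.

```python
import operator
from dataclasses import dataclass


class Expr:
    """Base node of the composite expression tree."""

    def accept(self, visitor):
        raise NotImplementedError


@dataclass
class Column(Expr):
    name: str

    def accept(self, visitor):
        return visitor.visit_column(self)


@dataclass
class Literal(Expr):
    value: object

    def accept(self, visitor):
        return visitor.visit_literal(self)


@dataclass
class Binary(Expr):
    op: str
    left: Expr
    right: Expr

    def accept(self, visitor):
        return visitor.visit_binary(self)


class RowEvaluator:
    """Visitor that evaluates an expression tree against a single row."""

    OPS = {
        "+": operator.add,
        "-": operator.sub,
        "=": operator.eq,
        ">": operator.gt,
        "<": operator.lt,
        "AND": lambda a, b: bool(a) and bool(b),
        "OR": lambda a, b: bool(a) or bool(b),
    }

    def __init__(self, row):
        self.row = row

    def visit_column(self, node):
        return self.row[node.name]

    def visit_literal(self, node):
        return node.value

    def visit_binary(self, node):
        # Evaluate both children first, then combine them with the operator.
        left = node.left.accept(self)
        right = node.right.accept(self)
        return self.OPS[node.op](left, right)


def filter_rows(rows, predicate):
    """Keep the rows for which the WHERE expression evaluates to true."""
    return [row for row in rows if predicate.accept(RowEvaluator(row))]


if __name__ == "__main__":
    # WHERE price + tax > 100 AND region = 'EU'
    where = Binary(
        "AND",
        Binary(">", Binary("+", Column("price"), Column("tax")), Literal(100)),
        Binary("=", Column("region"), Literal("EU")),
    )
    rows = [
        {"price": 95, "tax": 10, "region": "EU"},
        {"price": 50, "tax": 5, "region": "EU"},
    ]
    print(filter_rows(rows, where))
```

Keeping evaluation in a visitor means new passes over the same tree, such as a rule-based rewriter, can be added without changing the node classes.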

Music Data Analysis
Python, Pandas, NumPy, Scikit-learn, SciPy, NetworkX, Plotly, MongoDB
  • Scraped and collected feature data for over 500K songs and artists from sources including Spotify, Wikipedia, and allmusic.com, a collection larger than any public dataset, and stored it in a MongoDB NoSQL database. Handled missing value imputation, outlier identification, and duplicate removal with statistical methods using Pandas, NumPy, and Scikit-learn.

  • Analysed the influence of popular artists and the trends they drove using clustering, regression, ANOVA, classification, and network analysis with Scikit-learn, SciPy, and NetworkX. Used Matplotlib and Plotly for visualization.

Prediction on Taxi Drivers’ Income Based on GPS Data
MongoDB, Spark, Scala, PyTorch, Distributed System, Big Data
  • Built a high-quality dataset describing the behavior of taxi drivers in Qingdao across multiple dimensions from hundreds of gigabytes of raw GPS data, using a distributed MongoDB and Spark workflow; extracted features such as empty rate, work time, and profit from the spatial–temporal data.

  • Designed a new multi-input RNN model in PyTorch that takes human-related features, environment-related features, and income data as simultaneous inputs; on this dataset, the RMSE for predicting drivers’ income improved by 8.3% compared to an LSTM.

Portfolio

Beyond the Twilight
My original composition! Completed in June 2020.


If You Say
Covered with Ella on our anniversary ❤ Piano arrangement by me.


Become to Love You
Piano cover of the original Mandopop song, reharmonized by me.


Rain
A piece of improvisation on a rainy day.


Remember Me
Piano cover of the theme from the movie Coco.



Original Trailer Music for Wandering Earth

Recomposed trailer music for the movie Wandering Earth.