Haoye CAI

Master of Science in Computer Science, Stanford University | hcaiaa@stanford.edu

I am currently a Master's student at Stanford University studying Computer Science. I completed my Bachelor's degree, with a double major in Computer Science and Mathematics, at the Hong Kong University of Science and Technology, with a GPA of 4.012/4.3. I also spent an exchange semester at the Georgia Institute of Technology, with a GPA of 4.0/4.0. My resume can be downloaded here.

My research interests include Computer Vision, Deep Learning, Statistical Machine Learning, and Artificial Intelligence.
My strengths are in the following fields:
  • Optical Flow, Scene Flow
  • Medical Imaging
  • Human Pose Estimation, Human Pose/Motion Generation
  • Generative Models, Video Generation

Research

Deep Video Generation, Prediction and Completion of Human Action Sequences

Haoye Cai*, Chunyan Bai*, Yu-Wing Tai, and Chi-Keung Tang
European Conference on Computer Vision (ECCV), 2018

Computer Vision, Video Generation, Generative Models

We propose a two-stage generative model that addresses human action video generation, prediction, and completion in a uniform way. Our method generates better videos than existing state-of-the-art methods, both qualitatively and quantitatively.

Paper available at: Here
Links: Project Page | Video Result Demo (Highlight!)

June -- November 2017

Cross-modality Training to Learn Cardiac Motion Flow for SSFP MRI Images

Computer Vision, Medical Imaging, Optical Flow

- In the process of submission; first author
We propose a novel framework for cardiac motion flow estimation that uses motion correspondence from another modality, DENSE, as supervision to learn cardiac motion flow in ordinary SSFP MRI images. Our method outperforms existing state-of-the-art optical flow algorithms when applied to this medical imaging domain.

Links: Project Page | Video Demo (Highlight!) | Slides

January -- May 2017

Projects

CodeIT Suisse 2016

Web Development, Backend Development, Fintech

This is our solution for the CodeIT Suisse 2016 hackathon, which we won as a team of five. We built a high-frequency arbitrage trading solution for several stock markets, using a master-slave architecture of our own design to improve concurrency.
Skills used: Node.js cluster, Redis Queue, AMI, Firebase, D3.js, etc.
Links: Project Page | CodeIT Suisse

October 2016

JOS with extended paging system

Operating System, Kernel Design

In this project, we built JOS, a fully functional micro operating system, with an extended paging system. We implemented paging to disk so that virtual memory could exceed RAM. Furthermore, we proposed a novel paging heuristic to improve the performance of the paging system, and explored how the process scheduling policy influences paging.
Skills used: C, x86 Assembly.
Links: Project Page

April 2017

Team-Forming Website

Web Development, Backend/Frontend Development

This is the final project for COMP3111H Honors Software Engineering. We built an easy-to-use, good-looking, fully functional website for team forming. After registering and logging in, a user can view, create, or join a team and invite team members. All information can be viewed and all operations performed within our interface. We also developed a complete user system with a set of access rules.
Skills used: AngularJS, Ionic (for iOS app development), Bootstrap, AMI, Firebase, Karma, etc.
Links: Project Page

November 2016

Internship

Tencent YouTu Lab

Text Detection and Recognition

I built a text recognition pipeline using CRNN and an attention model. I also built an end-to-end text detection-recognition pipeline that combines the two tasks in one model. In this pipeline, I implemented a feature transformation that enables the recognition network to reuse features obtained by the detection network. We achieved state-of-the-art text recognition accuracy.

December 2017 - February 2018

SenseTime Group Limited, Hong Kong

3D Human Pose Estimation for Monocular Images

I completed a summer internship in Algorithm Research with the Depth and Reconstruction Team, where I studied 3D human pose estimation for monocular images. I first reproduced prior work from ICCV 2017 that uses fully-connected neural nets to learn 2D-to-3D pose regression. I then proposed and implemented two potential improvements. First, I built a DenseNet to extract features from raw images and concatenated the features with 2D poses in a multi-stage fashion to compensate for the ambiguity in 2D space. Second, I recast the problem as dimensionality reduction followed by reconstruction, and thus tried PCA space instead of 2D pose space. The results achieved are state-of-the-art. All frameworks were built in multi-GPU mode and deployed on clusters.

June 2017 - August 2017

Awards

First Place in CodeIT Suisse Coding Challenge, Credit Suisse
Hong Kong University of Science and Technology Academic Achievement Medal
Dean's List (for each semester), HKUST
The Hong Kong Electric Co. Ltd. Scholarship, HKUST
The Cheng Foundation Scholarship for Chinese Mainland Undergraduate Students
University's Scholarship Scheme for Continuing Undergraduate Students, HKUST
HKSAR Government Scholarship Fund - Reaching Out Award, HKUST
Second Prize in the National Olympiad in Informatics in Provinces, CCF