Hi there, I'm Yue ZHAO (赵越 in Chinese)! 👋
I am a Ph.D. student at Carnegie Mellon University (CMU) and a former management consultant at PwC Canada. As a seasoned ML software/system architect, I have led or participated in more than 10 ML library initiatives, which have received 10,000 GitHub stars (top 0.002%: ranked 800 out of 40M GitHub users) and more than 3,000,000 total downloads.
My research interests include:
- data mining topics related to outlier detection (anomaly detection)
- machine learning systems (MLSys) that can speed up, scale up, and automate data mining and machine learning algorithms

I am looking for:
- collaboration opportunities (anytime, anywhere, and of any type)
- research internships (open for Summer 2022); I can legally work in Canada, the United States, and China

Contact me via:
- Email (zhaoy [AT] cmu.edu)
- Zhihu (知乎): 「微调」
- Homepage
- WeChat (微信)
- Apr 2021: How to evaluate/select outlier detection models without any external information (e.g., ground truth)? We have a new preprint on using internal strategies for model selection. Do they suffice? Check out our paper!
- Mar 2020: I will join Prof. Jure Leskovec's team at Stanford University during the summer :)
- Feb 2021: Therapeutics Data Commons (TDC), a large collection of more than 60 machine learning-ready datasets across more than 20 therapeutic tasks, has been released. See the paper on arXiv! Great work led by Kexin Huang and Prof. Marinka Zitnik from Harvard!
- Jan 2021: Our new system paper (SUOD: Accelerating Large-scale Unsupervised Heterogeneous Outlier Detection) has been accepted at the Conference on Machine Learning and Systems (MLSys). SUOD, joint work with Xiyang Hu, is an acceleration system for large-scale unsupervised outlier detection. It is included as part of PyOD and has been downloaded more than 900,000 times.
- Jan 2021: We have released a new library, PyHealth, which provides more than 30 state-of-the-art predictive health algorithms (mostly deep learning based). See the corresponding paper as well!

