- Sky Computing Lab, UC Berkeley
- Berkeley, CA
Highlights
- Pro
Pinned
- skypilot-org/skypilot Public
SkyPilot is a framework for easily running machine learning workloads on any cloud through a unified interface.
- ucbrise/graphtrans Public
Representing Long-Range Context for Graph Neural Networks with Global Attention
- mit-han-lab/lite-transformer Public
[ICLR 2020] Lite Transformer with Long-Short Range Attention
- facebookresearch/fairseq Public
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
- tensorflow/tensor2tensor Public
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
2,010 contributions in the last year
Activity overview
Contribution activity
December 2022
Created 28 commits in 3 repositories
Created a pull request in skypilot-org/skypilot that received 4 comments
[sqlite] Mitigate the database locked problem for skylet config
Describe the changes in this PR: This PR is to mitigate the problem in #1507. Enable the WAL mode for the database we are using for autostop to eli…
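As a rough illustration of the WAL idea mentioned in the PR summary above, the sketch below opens a SQLite database in write-ahead-logging mode using Python's standard `sqlite3` module. It is not SkyPilot's code: the helper name `open_with_wal`, the `db_path` argument, and the `timeout_s` default are all hypothetical. WAL lets readers proceed while one writer is active, which tends to reduce "database is locked" errors. The busy timeout makes a blocked connection wait instead of failing at once.

```python
import sqlite3


def open_with_wal(db_path: str, timeout_s: float = 10.0) -> sqlite3.Connection:
    """Open a SQLite database and request WAL journaling (hypothetical helper)."""
    # timeout: how long a connection waits on a lock before raising
    # "database is locked".
    conn = sqlite3.connect(db_path, timeout=timeout_s)
    # The pragma returns the journal mode actually in effect; SQLite may keep
    # another mode if WAL is not possible (for example, in-memory databases).
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        # Assumption: falling back silently is acceptable for this sketch.
        pass
    return conn
```

The journal mode is persistent for a database file, so once one connection has switched it to WAL, later connections to the same file use WAL as well.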
Opened 13 other pull requests in 2 repositories
skypilot-org/skypilot
5 open, 5 merged, 1 closed
- [Mypy] Add type check with mypy and fix storage
- Add user identity to cluster status to avoid leakage when switching account
- [Image] Check image size and existence
- [AWS Catalog] Handle the exceptions when the user does not have access to the region
- [Logs] Fix logs download
- [AWS] Enable requested ap regions
- [AWS] Add regions requested by the user
- [Failover] Fix leakage of existing cluster when failed to start
- [Tests] Fix github CI
- [Spot] Add option for specifying controller instance type
- [AWS SSO] Use service account to access EC2 and S3 on head and worker nodes
skypilot-org/skypilot-catalog
2 merged
Reviewed 23 pull requests in 2 repositories
skypilot-org/skypilot
22 pull requests
- [Catalog] Better handling for catalog fetch failure
- [Mypy] Add type check with mypy and fix storage
- Add user identity to cluster status to avoid leakage when switching account
- Optimizing & Provisioning Retries at the granularity of regions/zones
- Add safe guard for provisioning/terminating TPU VM and fix spot launch TPU resource leak
- Tag all SkyPilot AWS nodes with "skypilot-user: $USER".
- [Logs] Fix logs download
- [AWS] Bring-your-own-VPC that disables public IPs for all SkyPilot nodes.
- [Spot] Let the controller aware of the failed setup and fail early
- [Spot] Add option for specifying controller instance type
- [AWS Catalog] Handle the exceptions when the user does not have access to the region
- [AWS] Add regions requested by the user
- [Failover] Fix leakage of existing cluster when failed to start
- Create pull_request_template.md
- Catalog: sort by (price, region name, zone name).
- Clean up preempted resources for TPU
- [docs] Update exec/status/autostop docs and formatting.
- [AWS] Update catalog from host periodically
- Fix ResourceHandle semantics
- Hints for spot controller in sky status output.
- Fix start spot controller bug.
- Fix bug with retrieving node IPs
skypilot-org/skypilot-catalog
1 pull request
Created an issue in skypilot-org/skypilot that received 3 comments
[Spot/UX] Need a way to only show the RUNNING spot jobs
Request from our user: Is there a way to clear jobs that are finished (either "SUCCEEDED" or "FAILED")?
Opened 6 other issues in 1 repository
skypilot-org/skypilot
2 closed, 4 open
- [TPU VM] Cannot sky down a TPU VM when it does not appear on GCP
- [Autostop] Need to reset the timer for every execution stage
- [Storage] Uploading list of files to bucket fails
- [Failover] The stopped cluster will be removed from the cluster table if failed to restart
- [IAM] Utilize ray-autoscaler-v1 IAM role on the remote VM
- [Spot] Make non-terminal spot job status more clear when the table is from the cache







