- Seattle
- https://www.lei.chat
Pinned
- google/uVkCompute: A micro Vulkan compute pipeline and a collection of benchmarking compute shaders
- openxla/iree: A retargetable MLIR-based machine learning compiler and runtime toolkit.
- llvm/llvm-project: The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept GitHub pull requests at this moment. Please submit your patches at…
- This repo hosts the source for the DirectX Shader Compiler, which is based on LLVM/Clang.
1,483 contributions in the last year
Activity overview
Contribution activity
June 2023
Created 37 commits in 1 repository
Created a pull request in openxla/iree that received 29 comments
[cuda] Implement basics for a CUDA HAL driver rewrite
This commit starts a CUDA HAL driver rewrite under experimental/.
We create a new cuda2/ directory to host the new code to avoid
interrupting the c…
Opened 20 other pull requests in 2 repositories
openxla/iree: 17 merged, 1 closed, 1 open
- [ci] Update Xcode command line tools path on x86_64 macOS runners
- [ci] Update fetch_cuda_toolkit.py to use 12.1.1 for releases
- [cuda] Dump more synchronization related attributes
- [spirv] Dump spirv.module for Metal and WebGPU targets
- [cuda] Port over tracing utilities and use in NCCL channel
- [cuda] Port over native executable and its cache
- [cuda] Port over native executable and its cache
- [cuda] Port over channel implementation via NCCL
- [spirv] Dump spirv.module with dump-executable-intermediates-to=
- [llvmgpu] Target CUDA sm_60 architecture by default
- [cuda] Port over descriptor set and pipeline layout
- [cuda] Wire up basic creating devices, allocators, and buffers
- [cuda] Dump whether the device has integrated memory
- [cuda] Port over allocator and buffer implementation
- [cuda] Dump useful GPU characteristics
- [cuda] NFC: Split files for CUDA and NCCL dynamic symbols
- [ci] Update NVIDIA driver packages to v530 in docker images
- [gpu] Distribute fused producer elementwise ops in SIMT pipeline
- [ci] Update base docker image to use Ubuntu 20.04
Kapeli/Dash-User-Contributions: 1 open
Reviewed 34 pull requests in 1 repository
openxla/iree: 25 pull requests
- [StableHLO] Use stablehlo submodule
- [metal] Implement a Metal HAL driver
- Use cuGetProcAddress to load CUDA entry points
- Fuse iota ops with consumers always.
- Reword comments for IREE_BUILD_DOCS.
- Allow defining IREE_HOST_SIZE_T to other types.
- [spirv] Dump spirv.module with dump-executable-intermediates-to=
- [StableHLO] Make reduce lowering more robust
- Correctly tag Vulkan Ampere tests as requiring sm80
- [GPUCheckResourceUsage] Don't choke on alloc of memref of index
- Cleaning up the tracing.h mechanism to enable alternative implementations.
- [cuda] Port over descriptor set and pipeline layout
- Allowing for sync allocations to be deallocated via queue-ordered deallocas.
- Adding a fallback for when CUDA memory pools are unsupported.
- [ConvertToLLVM] Don't choke on alloc of memref of index
- Refresh deployment-configuration website pages.
- Adding IREE_HAL_EXTERNAL_BUFFER_TYPE_DEVICE_ALLOCATION.
- Remove MHLO support
- Retain the parent channel on a split in iree_hal_nccl_channel_t.
- Moving builtins lower in the pipeline and adding option to force.
- Adding support for async memory pool allocations in the CUDA HAL.
- [cuda] Port over allocator and buffer implementation
- Integrate llvm-project at 223a0f63
- [ci] Update NVIDIA driver packages to v530 in docker images
- [cuda] Implement basics for a CUDA HAL driver rewrite
- Some pull request reviews not shown.
Created an issue in openxla/iree that received 1 comment
Use past average latency for comparison on pull request benchmarks
Right now we are using the last landed commit's latency for comparison when performing benchmarks on pull requests. With just one single data point…
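The issue argues that a single data point is too noisy a baseline. A minimal sketch of the proposed alternative, averaging the latencies of the last few landed commits before flagging a regression (the function name, history length, and threshold here are illustrative assumptions, not the project's actual implementation):

```python
import statistics

def regressed(pr_latency_ms, past_latencies_ms, threshold=0.05):
    """Flag a regression if the PR latency exceeds the mean of the
    past landed latencies by more than `threshold` (fractional).
    Hypothetical helper for illustration only."""
    baseline = statistics.mean(past_latencies_ms)
    return pr_latency_ms > baseline * (1 + threshold)

# Comparing against a single noisy point can produce false positives;
# averaging several runs smooths run-to-run variance.
history = [10.2, 9.8, 10.0, 10.4, 9.9]  # latencies of last 5 landed commits
print(regressed(10.1, history))  # within noise of the 10.06 ms baseline
print(regressed(11.5, history))  # clearly above the 5% threshold
```

The averaging window and threshold would need tuning per benchmark; the point is simply that a multi-sample baseline absorbs the variance a single commit's measurement cannot.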