CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
Here are 4,922 public repositories matching this topic...
A high-throughput and memory-efficient inference and serving engine for LLMs
Updated Jun 3, 2024 - Python
🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated irregularly: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
Updated Jun 3, 2024 - Cuda
Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids with full performance portable GPU support
Updated Jun 3, 2024 - C++
Radar Simulator built with Python and C++
Updated Jun 3, 2024 - Python
✨ Zero-code distributed tracing and profiling, observability via eBPF 🚀
Updated Jun 3, 2024 - Go
A retargetable MLIR-based machine learning compiler and runtime toolkit.
Updated Jun 2, 2024 - C++
A 3D render engine from scratch, using CUDA/C++.
Updated Jun 2, 2024 - Cuda
FractalShark - a fast Mandelbrot Set renderer for Nvidia GPUs
Updated Jun 2, 2024 - C++
High Performance Monodomain program for cardiac electrophysiology simulations.
Updated Jun 2, 2024 - C
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Updated Jun 3, 2024 - Python
Implementations of various simulations for integrate and fire models, as well as conductance based models with synaptic neurotransmission
Updated Jun 2, 2024 - Rust
From zero to hero CUDA for accelerating maths and machine learning on GPU.
Updated Jun 2, 2024 - Cuda
The open-source serverless GPU container runtime.
Updated Jun 2, 2024 - Go
A GPU-based correlator for MeerKAT Extension
Updated Jun 2, 2024 - Python
FlashInfer: Kernel Library for LLM Serving
Updated Jun 2, 2024 - Cuda
Advanced High Performance Computing in C with OpenMP, CUDA, MPI and NCCL. The project folder includes my final project for the special course: a Jacobi solver for the Poisson partial differential equation, implemented with OpenMP on the CPU, with CUDA on the GPU, and with CUDA, MPI and NCCL on multiple GPUs.
Updated Jun 2, 2024 - C++
Sandbox for graphics paper implementation
Updated Jun 2, 2024 - C++
Created by Nvidia
Released June 23, 2007
- Followers: 202
- Website: developer.nvidia.com/cuda-zone
- Wikipedia