delta-io / delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
See what the GitHub community is most excited about today.
Apache Spark - A unified analytics engine for large-scale data processing
An Agile RISC-V SoC Design Framework with in-order cores, out-of-order cores, accelerators, and more
Rocket Chip Generator
The Scala 3 compiler, also known as Dotty.
Source code for Twitter's Recommendation Algorithm
Scala 2 compiler and standard library. Scala 2 bugs at https://github.com/scala/bug; Scala 3 at https://github.com/scala/scala3
A flow-style query language for SQL engines
Open-source code analysis platform for C/C++/Java/Binary/Javascript/Python/Kotlin based on code property graphs. Discord https://discord.gg/vv4MH284Hc
The Community Maintained High Velocity Web Framework For Java and Scala.
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
A Spark plugin for reading and writing Excel files
TheHive: a Scalable, Open Source and Free Security Incident Response Platform
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Mill is a fast JVM build tool that supports Java, Scala and Kotlin. 2-4x faster than Gradle and 4-10x faster than Maven for common workflows, Mill aims to make your project’s build process performant, maintainable, and flexible
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Scala language server with rich IDE features 🚀