The Wayback Machine - https://web.archive.org/web/20211106051935/https://github.com/oap-project
@oap-project

Optimized Analytics Package for Spark Platform (OAP)

Pinned

  1. Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.

    Scala 82 29

  2. Spark* plug-in for accelerating Spark* SQL performance by using caching and indexing at the SQL data source layer.

    Scala 20 17

  3. Spark* shuffle plugin that supports shuffling through remote persistent memory over fabrics, leveraging the RDMA network and remote persistent memory (for read) to provide extremely high perform…

    C++ 5 7

  4. Optimized Spark package to accelerate machine learning algorithms in Apache Spark MLlib.

    Scala 4 8

  5. Spark plug-in package for accelerating Spark runtime spill functions using PMem, such as the RDD cache PMem extension.

    Scala 4 5

  6. Tools for building and packaging OAP, and for OAP public cloud integrations such as AWS EMR, Google Dataproc and K8S.

    Python 5 9

Repositories

  • raydp Public

    RayDP: Distributed data processing library that provides simple APIs for running Spark on Ray and integrating Spark with distributed deep learning and machine learning frameworks.

    Python 123 Apache-2.0 29 28 4 Updated Nov 5, 2021
  • arrow Public

    Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication…

    C++ 0 Apache-2.0 2,088 0 2 Updated Nov 5, 2021
  • gazelle_plugin Public

    Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.

    Scala 82 Apache-2.0 29 108 13 Updated Nov 5, 2021
  • oap-mllib Public

    Optimized Spark package to accelerate machine learning algorithms in Apache Spark MLlib.

    Scala 4 Apache-2.0 8 15 0 Updated Nov 4, 2021
  • remote-shuffle Public

    Spark* shuffle plugin that supports shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local disks.

    Scala 6 Apache-2.0 7 5 0 Updated Nov 2, 2021
  • oap-tools Public

    Tools for building and packaging OAP, and for OAP public cloud integrations such as AWS EMR, Google Dataproc and K8S.

    Python 5 Apache-2.0 9 6 0 Updated Nov 1, 2021
  • sql-ds-cache Public

    Spark* plug-in for accelerating Spark* SQL performance by using caching and indexing at the SQL data source layer.

    Scala 20 Apache-2.0 17 14 0 Updated Oct 29, 2021
  • pmem-shuffle Public

    Spark* shuffle plugin that supports shuffling through remote persistent memory over fabrics, leveraging the RDMA network and remote persistent memory (for read) to provide high-performance, low-latency shuffle solutions for Spark*.

    C++ 5 Apache-2.0 7 15 0 Updated Oct 19, 2021
  • recdp Public
    Python 1 4 1 0 Updated Oct 15, 2021
  • pmem-common Public

    Common library for accessing PMEM native library functions, including memkind and vmemcache.

    Java 2 Apache-2.0 8 3 1 Updated Sep 10, 2021
