The Wayback Machine - https://web.archive.org/web/20220516134621/https://github.com/topics/pandas

pandas

Here are 16,257 public repositories matching this topic...

datasets
dlwh commented Mar 16, 2022

Describe the bug

Streaming Datasets can't be pickled, so any interaction between them and multiprocessing results in a crash.

Steps to reproduce the bug

import transformers
from transformers import Trainer, AutoModelForCausalLM, TrainingArguments
import datasets

ds = datasets.load_dataset('oscar', "unshuffled_deduplicated_en", split='train', streaming=True).with_format("
bug good first issue
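
As a rough illustration of the failure mode described in this issue, the sketch below uses only the standard library. It is not the datasets library's code, and the actual cause there may differ; `lazy_records` and `can_pickle` are hypothetical names. It shows that a live Python generator, a common building block for lazy streams, cannot be pickled, while the same records materialized into a list can.

```python
import pickle


def lazy_records():
    """Hypothetical stand-in for a lazily streamed dataset."""
    for i in range(3):
        yield {"id": i}


def can_pickle(obj) -> bool:
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, AttributeError, pickle.PicklingError):
        return False


stream = lazy_records()               # a live generator, like an open stream
materialized = list(lazy_records())   # a plain list of dicts

print(can_pickle(stream))        # the generator cannot be pickled
print(can_pickle(materialized))  # the list of dicts can
```

Multiprocessing typically pickles the objects it sends to worker processes, which is why an unpicklable object tends to surface as a crash there.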
Data-Science-For-Beginners
soubhikmandal2000 commented Oct 31, 2021
  • Base README.md
  • Quizzes
  • Introduction base README
    • Defining Data Science README
    • Defining Data Science assignment
    • Ethics README
    • Ethics assignment
    • Defining Data README
    • Defining Data assignment
    • Stats and Probability README
    • Stats and Probability assignment
  • Working with Data base README
    • Rel
good first issue help wanted translations
orf commented Jan 25, 2022

We're trying to introduce Parquet into our team, and the largest blocker that we've seen is the dreaded "schemas are inconsistent" error message:

RuntimeError: Schemas are inconsistent, try using to_parquet(..., schema="infer"), or pass an explicit pyarrow schema. Such as to_parquet(..., schema={"column1": pa.string()})

This error message is super unhelpful: surely Dask knows what th

good first issue dataframe parquet
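
One way such a mismatch could be made easier to diagnose is to report which column differs in which partition. The sketch below is plain Python and does not use Dask or pyarrow; `describe_schema_mismatch`, the `{column: type name}` representation, and the sample partitions are all assumptions made for illustration, not anything taken from the issue.

```python
def describe_schema_mismatch(schemas):
    """Compare per-partition schemas ({column: type name}) with the first one.

    Returns a list of human-readable messages; empty if all schemas agree.
    """
    problems = []
    if not schemas:
        return problems
    reference = schemas[0]
    for i, schema in enumerate(schemas[1:], start=1):
        # Look at every column that appears in either schema.
        for column in sorted(set(reference) | set(schema)):
            expected = reference.get(column, "<missing>")
            found = schema.get(column, "<missing>")
            if expected != found:
                problems.append(
                    f"partition {i}: column {column!r} is {found}, "
                    f"but partition 0 has {expected}"
                )
    return problems


# Hypothetical schemas: the third partition has an all-null column.
partitions = [
    {"column1": "string", "value": "int64"},
    {"column1": "string", "value": "int64"},
    {"column1": "null", "value": "int64"},
]
for message in describe_schema_mismatch(partitions):
    print(message)
```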

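AI learning roadmap: a collection of nearly 200 hands-on cases and projects with free companion teaching materials, suitable for complete beginners and geared toward job-ready practice. Covers popular areas including Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch tensorflow machine-learning,deep-learning data-analysis data-mining mathematics data-science artificial-intelligence python tensorflow tensorflow2 caffe keras pytorch algorithm numpy pandas matplotlib seaborn nlp cv, and more.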

  • Updated Feb 6, 2020
VibhuJawa commented May 5, 2022

Describe the bug
Series.unique() returns a cudf.Series in cuDF, whereas in pandas it returns a numpy.ndarray.

Steps/Code to reproduce bug

In [1]: import cudf
In [2]: import pandas as pd

In [3]: type(pd.Series([1,1]).unique())
Out[3]: numpy.ndarray

In [4]: type(cudf.Series([1,1]).unique())
Out[4]: cudf.core.series.Series

Expected behavior
I would exp

bug good first issue cuDF (Python)
danfojs
kylemcdonald commented Mar 2, 2022

I would like to convert a DataFrame to a JSON object the same way that Pandas does with to_dict().

toJSON() treats rows as elements in an array, and ignores the index labels. But to_dict() uses the index as keys.

Here is an example of what I have in mind:

function to_dict(df) {
    const rows = df.toJSON();
    const entries = df.index.map((e, i) => ({ [e]: rows[i] }));
  
enhancement good first issue
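
For reference, pandas `DataFrame.to_dict()` defaults to `orient="dict"`, which produces `{column: {index label: value}}`, while `orient="index"` produces `{index label: {column: value}}`. The truncated JavaScript fragment in the issue appears to aim at the second shape. The pure-Python sketch below builds both from a list of row objects and a list of index labels; the function names and sample data are hypothetical, and `rows` only mimics the array-of-rows output the issue attributes to toJSON().

```python
def to_dict_by_index(index, rows):
    """Key each row (a {column: value} dict) by its index label."""
    if len(index) != len(rows):
        raise ValueError("index and rows must have the same length")
    return dict(zip(index, rows))


def to_dict_by_column(index, rows):
    """Build {column: {index label: value}} from the same inputs."""
    if len(index) != len(rows):
        raise ValueError("index and rows must have the same length")
    result = {}
    for label, row in zip(index, rows):
        for column, value in row.items():
            result.setdefault(column, {})[label] = value
    return result


index = ["a", "b"]
rows = [{"x": 1, "y": 2}, {"x": 3, "y": 4}]

print(to_dict_by_index(index, rows))
print(to_dict_by_column(index, rows))
```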

A comprehensive list of Deep Learning / Artificial Intelligence and Machine Learning tutorials - rapidly expanding into areas of AI/Deep Learning / Machine Vision / NLP and industry specific areas such as Climate / Energy, Automotives, Retail, Pharma, Medicine, Healthcare, Policy, Ethics and more.

  • Updated Apr 6, 2022
  • Python
ta
markdregan commented Jan 7, 2022

Does this already exist? If not, I'm happy to create it if it would be valuable.

I'm looking for a mapping from the output column names to the technical indicators they represent.

Examples:
momentum_ao == "Momentum, Awesome Oscillator"
momentum_kama == "Momentum, Kaufman’s Adaptive Moving Average (KAMA)"

Can help quickly grasp what the features represent without having to refer back to do
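
A minimal sketch of what such a mapping could look like follows. It is not part of the ta library: `INDICATOR_NAMES` and `describe_column` are hypothetical names, and only the two entries quoted in the issue are included.

```python
# Hypothetical lookup table: output column name -> human-readable indicator.
INDICATOR_NAMES = {
    "momentum_ao": "Momentum, Awesome Oscillator",
    "momentum_kama": "Momentum, Kaufman’s Adaptive Moving Average (KAMA)",
}


def describe_column(column: str) -> str:
    """Return the indicator description, or the column name if unknown."""
    return INDICATOR_NAMES.get(column, column)


for name in ["momentum_ao", "momentum_kama", "some_other_column"]:
    print(f"{name}: {describe_column(name)}")
```

Falling back to the raw column name keeps the helper usable on frames that mix indicator columns with others.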

aws-data-wrangler

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

  • Updated May 16, 2022
  • Python
