The Wayback Machine - https://web.archive.org/web/20220328183711/https://github.com/topics/dataframe

dataframe

Here are 575 public repositories matching this topic...

LiterallyUniqueLogin commented Feb 20, 2022

What language are you using?

Python

What version of polars are you using?

0.13.1

What operating system are you using polars on?

CentOS Linux release 8.1.1911 (Core)

What language version are you using?

python 3.7.9

Describe your bug.

When calling scan_csv on an empty file, a confusing message about buffers appears instead of simply saying th

good first issue
bdice commented Feb 3, 2022

Is your feature request related to a problem? Please describe.
While reviewing PR #9817 to introduce DataFrame.diff, I noticed that it is restricted to acting on numeric types.

A time-series diff is probably a very common user need: given a series of timestamps, users want the durations between consecutive observations.

Pandas supports diffs on non-numeric types like timestamps:
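As a minimal sketch of the pandas behavior described above (the timestamps are made up for illustration and are not from the original issue), calling `diff()` on a datetime Series yields the durations between consecutive observations:

```python
import pandas as pd

# Hypothetical observation times, invented for this illustration.
timestamps = pd.Series(pd.to_datetime([
    "2022-02-03 09:00:00",
    "2022-02-03 09:05:00",
    "2022-02-03 09:20:00",
]))

# diff() subtracts each element's predecessor from it.
# The first element has no predecessor, so its result is NaT.
durations = timestamps.diff()

print(durations)
```

The result is a timedelta Series, so the feature request amounts to asking for the same datetime-in, timedelta-out behavior in cuDF.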

feature request good first issue cuDF (Python)
danfojs
kylemcdonald commented Mar 2, 2022

I would like to convert a DataFrame to a JSON object the same way that Pandas does with to_dict().

toJSON() treats rows as elements in an array, and ignores the index labels. But to_dict() uses the index as keys.

Here is an example of what I have in mind:

function to_dict(df) {
    const rows = df.toJSON();
    const entries = df.index.map((e, i) => ({ [e]: rows[i] }));
  
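For comparison, here is a small sketch of the pandas side of this request. The frame and its labels are made up for illustration. Note that plain `to_dict()` nests the index labels under each column name, while `orient="index"` produces the row-keyed shape that the JavaScript snippet above appears to be building:

```python
import pandas as pd

# Hypothetical frame with string index labels, invented for this illustration.
df = pd.DataFrame({"a": [1, 4], "b": [4, 5]}, index=["x", "y"])

# Default orient="dict": {column -> {index label -> value}}
by_column = df.to_dict()

# orient="index": {index label -> {column -> value}}
by_row = df.to_dict(orient="index")

# orient="records": a list of rows with the index labels dropped,
# which is the shape the issue attributes to toJSON().
records = df.to_dict(orient="records")

print(by_column)
print(by_row)
print(records)
```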
enhancement good first issue
CMobley7 commented Mar 27, 2022

When I ran Center of Gravity: cg over 3 months of Bitcoin prices ("20200801" to "20201101"), I got

               Close             cg
count  132481.000000  132472.000000
mean    11378.306788      -5.499988
std       844.355621       0.001991
min      9881.820000      -5.616297
25%     10710.500000      -5.500833
50%     11368.680000      -5.499987
75%     11742.540000      -5.499146
1
bug good first issue
yjshen commented Mar 25, 2022

TPC-DS has many queries with IN predicates where all elements are constants. Implementing an InSet function for this all-constants case would be low-hanging fruit.

While implementing this, we could either use a hashtable or a chain of if-elif-else, depending on the length and the type of the constants array.
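The trade-off above can be sketched in plain Python. This is an illustration only, not the project's implementation: the names `make_in_set` and `SMALL_LIST_THRESHOLD` and the cutoff value are assumptions, the sketch switches on list length only (not on the constants' type), and SQL NULL semantics are ignored.

```python
from typing import Callable, Hashable, Sequence

# Hypothetical cutoff; a real engine would choose it by benchmarking
# per data type rather than hard-coding it.
SMALL_LIST_THRESHOLD = 4


def make_in_set(
    constants: Sequence[Hashable],
    threshold: int = SMALL_LIST_THRESHOLD,
) -> Callable[[Hashable], bool]:
    """Build a membership test for a fixed list of constants."""
    if len(constants) <= threshold:
        # Short list: a chain of equality comparisons, the analogue of
        # if-elif-else, avoids hashing overhead.
        values = tuple(constants)

        def in_small(value: Hashable) -> bool:
            for constant in values:
                if value == constant:
                    return True
            return False

        return in_small

    # Long list: build the hash set once, then do constant-time lookups.
    lookup = frozenset(constants)

    def in_hashed(value: Hashable) -> bool:
        return value in lookup

    return in_hashed


# Made-up zip prefixes, not the values from the TPC-DS query.
matches_zip = make_in_set(["10001", "10002", "10003", "10004", "10005"])
print(matches_zip("10003"), matches_zip("99999"))
```

Because the constants are known at planning time, the strategy is chosen once and the per-row cost is only the lookup itself.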

Q8:

 WHERE substr(ca_zip, 1, 5) IN (
               '2412
enhancement good first issue help wanted
DataFrame
thatlittleboy commented Jan 2, 2022

Background

This thread was born out of the discussion in #968, in an effort to make the documentation more beginner-friendly and easier to understand.
One of the subtasks mentioned in that thread was to go through the function docstrings and add a minimal working example to each of the public functions in pyjanitor.

Criteria reiterated here for the benefit of discussion:

It sh
good first issue
pdpipe
yarkhinephyo commented Nov 28, 2021

For pipeline stages provided by pdpipe.basic_stages, supplying conditions to the prec and post keyword arguments may not produce the correct error messages.

Example Code

import pandas as pd
import pdpipe as pdp
df = pd.DataFrame([[1,4],[4,5],[1,11]], [1,2,3], ['a','b'])
pline = pdp.PdPipeline([
  pdp.FreqDrop(2, 'a', prec=pdp.cond.HasAllColumns(['x']))
])
pline.apply(
