dataframe
Here are 553 public repositories matching this topic...
Based on @karthikeyann's work on rapidsai/cudf#9767, I'm wondering whether it makes sense to remove the defaults for the stream parameters in various detail functions. It is surprising how often these parameters get missed.
The most common case seems to be in factory functions and various ::create functions. Maybe just do it for those?
Describe the bug
Failed to execute Series.drop_duplicates.
In [75]: a = md.DataFrame(np.random.rand(10, 2), columns=['a', 'b'], chunk_size=2)
In [76]: a['a'].drop_duplicates().execute()
Is your feature request related to a problem? Please describe.
My request is a new indicator called Clenow momentum.
Describe the solution you'd like
It measures momentum by fitting an exponential regression to log prices over a rolling window of days and using the coefficient of that regression. It can detect whether a stock is trending and in which direction.
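As a rough illustration of the request, the sketch below computes one common formulation of Clenow momentum: the annualized slope of a least-squares fit to log prices, scaled by the fit's R². The function name `clenow_momentum`, the 90-day window, and the 252-day annualization factor are assumptions made for this example, not part of any existing library API.

```python
import math


def clenow_momentum(prices, window=90, trading_days=252):
    """Hypothetical helper: annualized log-price regression slope times R^2."""
    if window < 2 or len(prices) < window:
        raise ValueError("need at least `window` (>= 2) prices")

    # Regress log(price) on the day index 0..window-1 (ordinary least squares).
    y = [math.log(p) for p in prices[-window:]]
    x_mean = (window - 1) / 2
    y_mean = sum(y) / window
    sxx = sum((i - x_mean) ** 2 for i in range(window))
    sxy = sum((i - x_mean) * (yi - y_mean) for i, yi in enumerate(y))
    syy = sum((yi - y_mean) ** 2 for yi in y)

    slope = sxy / sxx
    # R^2 of a simple linear regression; defined as 0 for a flat series.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0

    # Convert the daily log slope into an annualized growth rate.
    annualized = math.exp(slope * trading_days) - 1
    return annualized * r_squared
```

A positive score suggests a steady uptrend and a negative score a downtrend; a noisy series is pulled toward zero by the R² factor.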
**Addition
Is your feature request related to a problem? Please describe.
The Series.map() function should let the passed lambda use the index, just as the normal Array.map() function does. My example use case is calculating a moving average, which requires referencing values next to the current position in the Series.
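To make the use case concrete, here is a small Python sketch of an index-aware map and a moving average built on top of it. Both `map_with_index` and `moving_average` are hypothetical helpers written for illustration; they do not reflect the API of the library the request targets.

```python
def map_with_index(values, fn):
    """Hypothetical helper: call fn(value, index, values) for each element."""
    return [fn(value, index, values) for index, value in enumerate(values)]


def moving_average(values, window=3):
    """Trailing moving average; positions without a full window yield None."""
    def average_at(_value, index, seq):
        if index + 1 < window:
            return None
        chunk = seq[index + 1 - window:index + 1]
        return sum(chunk) / window

    return map_with_index(values, average_at)
```

Because the callback receives the index and the full sequence, it can look at neighboring values, which a value-only map cannot do.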
Describe the solution you'd like
I would like to be able to writ
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
#1514 points out a secondary goal for the Python library: exposing the datafusion and arrow-rust versions.
I think Spark has a good implementation of this: it binds the Spark version to SparkContext in the JVM code, then exposes it through the pyspark API. pyspark's own version is hard-coded.
Example:
In the image below the word starships should begin on a new line to avoid being split.
Terminal width is provided to determine how many columns to print. The terminal width or the total width of the column headers may be used to wrap the text in the footer.
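The wrapping behavior described above can be sketched in Python with the standard library. The function name `wrap_footer` and its `width` parameter are assumptions for this example; the project in question may structure this differently.

```python
import shutil
import textwrap


def wrap_footer(footer, width=None):
    """Hypothetical helper: wrap footer text on word boundaries.

    If no width is given (for example, the total width of the column
    headers), fall back to the detected terminal width.
    """
    if width is None:
        width = shutil.get_terminal_size(fallback=(80, 24)).columns
    # break_long_words=False keeps a word whole by moving it to the next line.
    return textwrap.fill(footer, width=max(width, 1), break_long_words=False)
```

With a width of 20, a footer such as "a long time ago starships" would place "starships" on its own line instead of splitting it.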
Background
This thread was born out of the discussion in #968, in an effort to make the documentation more beginner-friendly and easier to understand.
One of the subtasks mentioned in that thread was to go through the function docstrings and add a minimal working example to each of the public functions in pyjanitor.
Criteria reiterated here for the benefit of discussion:
It sh
For pipeline stages provided by pdpipe.basic_stages, supplying conditions to the prec and post keyword arguments may not produce the correct error messages.
Example Code
import pandas as pd
import pdpipe as pdp

df = pd.DataFrame([[1,4],[4,5],[1,11]], [1,2,3], ['a','b'])
pline = pdp.PdPipeline([
    pdp.FreqDrop(2, 'a', prec=pdp.cond.HasAllColumns(['x']))
])
pline.apply(

vaex.from_arrays(s=['a,b']).s.str.replace(r'(\w+)',r'--\g<1>==',regex=True)
When a capture group is used in str.replace, the call fails, whereas str_pandas.replace() returns the correct result.
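For reference, the same pattern and replacement can be checked against Python's standard `re` module, which shows the result one would expect from a regex replace with a `\g<1>` backreference. This only illustrates the expected output; it does not reproduce or diagnose the vaex failure.

```python
import re

pattern = r'(\w+)'            # capture each run of word characters
replacement = r'--\g<1>=='    # wrap the captured group in markers

result = re.sub(pattern, replacement, 'a,b')
print(result)  # --a==,--b==
```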

Name: vaex
Version: 4.6.0
Summary: Out-of-Core DataFrames to visualize and explore big tabular datasets
Home-page: