2,800 questions
1 vote, 1 answer, 25 views
polars-u64-idx not available for latest version
While the standard Polars package is available in version 1.34.0, the polars-u64-idx package is missing the latest versions.
Does anyone know if this package is discontinued?
2 votes, 3 answers, 63 views
How do I get polars.Expr.str.json_decode to decode simple map to List(Struct({'key': String, 'value': Int32}))?
json_decode requires that we specify the dtype.
Polars represents maps with arbitrary keys as a List<struct<2>> (see here).
EDIT: Suppose I don't know the keys in my JSON ahead of time, ...
2 votes, 1 answer, 57 views
How to sink LazyFrames with diverging queries to different partitions
I have a very big parquet file which I'm attempting to read from and split into partitioned folders on a column "token".
Currently I'm using pl.scan_parquet on the big parquet file followed ...
1 vote, 3 answers, 70 views
Forward Fill values from subset of values in polars
I have this dataframe:
import polars as pl
df = pl.DataFrame({'value': [1,2,3,4,5,None, None], 'flag': [0,1,1,1,0,0,0]})
┌───────┬──────┐
│ value ┆ flag │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═══════╪═...
2 votes, 1 answer, 49 views
How to select joined columns with structure like namespaces (a.col1, b.col2)?
I am working to migrate from PySpark to Polars. In PySpark I often use aliases on dataframes so I can clearly see which columns come from which side of a join. I'd like to get similarly readable code ...
0 votes, 0 answers, 88 views
Enabling Delta Table checkpointing when using polars write_delta()
I am using polars.DataFrame.write_delta() to initially create, and subsequently append to, Delta Tables in Microsoft Fabric OneLake storage, via a Fabric Python notebook.
Having had a production process up ...
1 vote, 1 answer, 74 views
Converting a Rust `futures::TryStream` to a `polars::LazyFrame`
I have an application where I have a futures::TryStream. Still in a streaming fashion, I want to convert this into a polars::LazyFrame. It is important to note that the TryStream comes from the ...
0 votes, 0 answers, 71 views
PyCharm "view as DataFrame" shows nothing for polars DataFrames
Basically the title. Using PyCharm 2023.3.3 I'm not able to see the data of polars DataFrames.
As an example, I've a simple DataFrame like this:
print(ids_df)
shape: (1, 4)
┌───────────────────────────...
3 votes, 3 answers, 79 views
Dynamically index a column in Polars
I have a simple dataframe that looks like this:
import polars as pl
df = pl.DataFrame({
    'ref': ['a', 'b', 'c', 'd', 'e', 'f'],
    'idx': [4, 3, 1, 6, 2, 5],
})
How can I obtain the result as ...
1 vote, 1 answer, 71 views
Find value closest to subset of values in polars columns
I have this dataframe
import polars as pl
df = pl.from_repr("""
┌────────────┬──────┐
│ date ┆ ME │
│ --- ┆ --- │
│ date ┆ i64 │
╞════════════╪══════╡
│ 2027-11-...
3 votes, 0 answers, 64 views
How to repeat List in Polars [duplicate]
I am trying to repeat the values of a List in polars. The equivalent operation in pure python would be:
[1,2,3,4] * 3 -> [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4].
So the content of the list is repeated ...
0 votes, 1 answer, 82 views
How to extract & coalesce deeply nested values that may not exist?
I'm trying to extract some data from deeply nested JSON - this works:
lf.with_columns(
[
pl.coalesce(
[
pl.col("a"),
pl.col("...
0 votes, 1 answer, 65 views
Show progress bar when reading files with globbing with polars
I have a folder with multiple Excel files.
I'm reading all of them in a single polars DataFrame concatenated vertically using globbing:
import polars as pl
df = pl.read_excel("folder/*.xlsx")...
3 votes, 2 answers, 98 views
How to create a cross table with percentages in Polars?
I would like to create a cross table that shows, in each cell, the percentages of rows over the total number of rows.
Inspired by this post I started with:
df = pl.DataFrame({"a": [2, 0, 1, ...
1 vote, 3 answers, 116 views
Drop column by index in polars
I need to drop the first column in a polars DataFrame.
I tried:
result = df.select([col for idx, col in enumerate(df.columns) if idx != 0])
But it looks long and clumsy for such a simple task.
I also ...