The Wayback Machine - https://web.archive.org/web/20201020072036/https://github.com/google/python-fire/issues/274
Return pandas.DataFrame #274

Open
xinbinhuang opened this issue Jul 21, 2020 · 10 comments
Labels
bug

Comments

@xinbinhuang xinbinhuang commented Jul 21, 2020

Hi team,

It seems that fire currently cannot return a DataFrame from function or method calls. I would like to contribute a PR for this if it would be a good feature for fire.

Let me know what you think.

Cheers
Bin

Contributor

@nfultz nfultz commented Jul 21, 2020

That's odd, this worked in ~2018, I demoed it in a lightning talk at a pydata meetup - https://docs.google.com/presentation/d/1NHAYGN4Fx4YBVcaGAiJ4U3gEkyhhgE643r7Ka2MF2wI/edit?usp=sharing

EDIT:

My script still works, you just need to manually call to_string at the end

$ pd read_csv drive/aws_costs.csv - mean - round - to_string
Author

@xinbinhuang xinbinhuang commented Jul 21, 2020

Hmm, I see. I didn't know about the to_string method for DataFrame.

My Fire component is actually a class that uses pandas under the hood, and most of its methods return a DataFrame.

I don't think it's intuitive for end users (i.e. my clients and colleagues) to know that they need to chain the command with - to_string to render the output properly. Given that DataFrame is such a popular object right now, do you think it's worth making - to_string the default behavior in Fire?
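One component-side option is to convert the result before it is returned, so that end users never need to append - to_string. The following is a minimal sketch under stated assumptions, not a Fire feature: render_frames is a hypothetical decorator name, and FakeFrame is a stand-in for pandas.DataFrame so that the example runs without pandas installed.

```python
import functools


def render_frames(func):
    """Hypothetical decorator: return text for results that offer to_string()."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        # Duck-typed check so this sketch does not need to import pandas.
        to_string = getattr(result, "to_string", None)
        return to_string() if callable(to_string) else result

    return wrapper


class FakeFrame:
    """Stand-in for pandas.DataFrame, used only to keep the sketch runnable."""

    def to_string(self):
        return "   0\n0  1"


@render_frames
def load():
    # A real component would build and return a DataFrame here.
    return FakeFrame()


print(load())
```

Results without a to_string method pass through unchanged, so the decorator can be applied to every method of a component without special-casing.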

Contributor

@nfultz nfultz commented Jul 21, 2020

If you look at the stack trace, it's because DataFrames fail on inspect, which is used by fire to generate the list of subcommands.

You can reproduce it outside of fire using:

>>> import pandas
>>> import inspect
>>> df = pandas.read_csv("drive/aws_costs.csv")
>>> inspect.getmembers(df)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/inspect.py", line 342, in getmembers
    value = getattr(object, key)
  File "/home/nfultz/.local/lib/python3.6/site-packages/pandas/core/frame.py", line 409, in _constructor_expanddim
    raise NotImplementedError("Not supported for DataFrames!")
NotImplementedError: Not supported for DataFrames!

So if you make pandas work with inspect, fire will be compatible accordingly, or at least not crash.

OTOH, I'm not sure generating a list of the data frame API is very helpful since it's quite large.

Member

@dbieber dbieber commented Jul 21, 2020

Python Fire should work even when inspect.getmembers doesn't. I'll mark this as a bug.
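A member listing that tolerates failing attributes could look roughly like the sketch below. This is an illustration only, not Fire's actual implementation: safe_getmembers is a hypothetical helper, and BrokenProperty is a stand-in for an object such as a DataFrame whose property raises on access.

```python
class BrokenProperty:
    """Hypothetical stand-in for a DataFrame: one property raises on access."""

    value = 42

    @property
    def expanddim(self):
        raise NotImplementedError("Not supported for this object!")


def safe_getmembers(obj):
    """Return (name, value) pairs, skipping attributes whose access raises."""
    members = []
    for name in dir(obj):
        try:
            members.append((name, getattr(obj, name)))
        except Exception:  # deliberately broad: any failing attribute is skipped
            continue
    return members


names = [name for name, _ in safe_getmembers(BrokenProperty())]
print("value" in names)      # True
print("expanddim" in names)  # False
```

Skipping the failing attribute loses one entry from the listing, but the remaining members are still available for help text and completion.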

@dbieber dbieber added the bug label Jul 21, 2020
Contributor

@nfultz nfultz commented Jul 21, 2020

Fair enough. See also pandas-dev/pandas#31474; they fixed it but haven't released it yet.

It's also been filed against inspect: https://bugs.python.org/issue35108#msg361208

Author

@xinbinhuang xinbinhuang commented Jul 21, 2020

@nfultz I am not familiar with inspect.getmembers in general, but I'm impressed by how quickly you gathered all this information! It's a good reference for me and probably for other people too.

Thank you so much for the help!

Member

@dbieber dbieber commented Oct 3, 2020

Would you be able to provide a minimal example reproducing the issue?


@genos genos commented Oct 13, 2020

Would you be able to provide a minimal example reproducing the issue?

Please pardon me for barging in, but I use Pandas a lot and have really enjoyed using fire for CLIs.

Script

#!/usr/bin/env python3
import fire
import pandas as pd

def test(n: int) -> pd.DataFrame:
    return pd.DataFrame(data=range(n))

if __name__ == "__main__":
    fire.Fire(test)

Environment

Setting up a Python 3.8 virtual environment, activating it, installing dependencies:

~/tmp ∃ python3 -m venv env
~/tmp ∃ source ./env/bin/activate
(env) ~/tmp ∃ pip install -U pip fire jinja2 pandas -qqq
(env) ~/tmp ∃ python --version
Python 3.8.6
(env) ~/tmp ∃ pip freeze
fire==0.3.1
Jinja2==2.11.2
MarkupSafe==1.1.1
numpy==1.19.2
pandas==1.1.3
python-dateutil==2.8.1
pytz==2020.1
six==1.15.0
termcolor==1.1.0

Running

It looks like our version of pandas fixes the inspect-related NotImplementedError mentioned above; when running the script inside our environment, we get a ValueError instead:

(env) ~/tmp ∃ ./w.py
ERROR: The function received no value for the required argument: n
Usage: w.py N

For detailed information on this command, run:
  w.py --help
(env) ~/tmp ∃ ./w.py 10
/usr/local/Cellar/python@3.8/3.8.6/Frameworks/Python.framework/Versions/3.8/lib/python3.8/inspect.py:350: FutureWarning: _AXIS_NAMES has been deprecated.
  value = getattr(object, key)
/usr/local/Cellar/python@3.8/3.8.6/Frameworks/Python.framework/Versions/3.8/lib/python3.8/inspect.py:350: FutureWarning: _AXIS_NUMBERS has been deprecated.
  value = getattr(object, key)
Traceback (most recent call last):
  File "./w.py", line 9, in <module>
    fire.Fire(test)
  File "/Users/graham/tmp/env/lib/python3.8/site-packages/fire/core.py", line 164, in Fire
    _PrintResult(component_trace, verbose=component_trace.verbose)
  File "/Users/graham/tmp/env/lib/python3.8/site-packages/fire/core.py", line 266, in _PrintResult
    help_text = helptext.HelpText(
  File "/Users/graham/tmp/env/lib/python3.8/site-packages/fire/helptext.py", line 63, in HelpText
    actions_grouped_by_kind = _GetActionsGroupedByKind(component, verbose=verbose)
  File "/Users/graham/tmp/env/lib/python3.8/site-packages/fire/helptext.py", line 332, in _GetActionsGroupedByKind
    members = completion.VisibleMembers(component, verbose=verbose)
  File "/Users/graham/tmp/env/lib/python3.8/site-packages/fire/completion.py", line 365, in VisibleMembers
    return [
  File "/Users/graham/tmp/env/lib/python3.8/site-packages/fire/completion.py", line 367, in <listcomp>
    if MemberVisible(component, member_name, member, class_attrs=class_attrs,
  File "/Users/graham/tmp/env/lib/python3.8/site-packages/fire/completion.py", line 311, in MemberVisible
    if member in (absolute_import, division, print_function):
  File "/Users/graham/tmp/env/lib/python3.8/site-packages/pandas/core/generic.py", line 1329, in __nonzero__
    raise ValueError(
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
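
The last frames of the traceback show why the check fails: tuple membership (member in (...)) falls back to == when the objects are not identical, and the result of == is then converted to a bool. For a DataFrame, == returns another DataFrame, whose truth value is ambiguous. The sketch below reproduces the mechanism with stand-in classes and shows an identity-based check that avoids it. It is an illustration under stated assumptions, not necessarily how Fire resolved the issue: FrameLike, Ambiguous, SENTINELS, and both helper functions are hypothetical names.

```python
class Ambiguous:
    """Hypothetical stand-in for a comparison result with no single truth value."""

    def __bool__(self):
        raise ValueError("The truth value is ambiguous.")


class FrameLike:
    """Hypothetical stand-in whose == returns an element-wise result, not a bool."""

    def __eq__(self, other):
        return Ambiguous()

    __hash__ = object.__hash__


# Stand-ins for absolute_import, division, print_function.
SENTINELS = (object(), object(), object())


def in_by_equality(member):
    # Tuple membership falls back to ==, then takes bool() of the result.
    return member in SENTINELS


def in_by_identity(member):
    # Identity comparison never calls __eq__ or __bool__ on the member.
    return any(member is sentinel for sentinel in SENTINELS)


frame = FrameLike()
try:
    in_by_equality(frame)
except ValueError as err:
    print("equality check failed:", err)

print(in_by_identity(frame))  # False
```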
Member

@dbieber dbieber commented Oct 13, 2020

Thanks!


@genos genos commented Oct 13, 2020

Happy to contribute!

If we uninstall the released version of fire and instead install from the current master commit from GitHub:

(env) ~/tmp ∃ pip uninstall fire
Found existing installation: fire 0.3.1
Uninstalling fire-0.3.1:
  Would remove:
    /Users/graham/tmp/env/lib/python3.8/site-packages/fire-0.3.1-py3.8.egg-info
    /Users/graham/tmp/env/lib/python3.8/site-packages/fire/*
Proceed (y/n)? y
  Successfully uninstalled fire-0.3.1
(env) ~/tmp ∃ pip install git+https://github.com/google/python-fire.git@878b8d86f488ef2606cffdf58297dd2781708316 -qqq

it looks like we get dropped into the help information for pd.DataFrame:

(env) ~/tmp ∃ ./w.py 10
/usr/local/Cellar/python@3.8/3.8.6/Frameworks/Python.framework/Versions/3.8/lib/python3.8/inspect.py:350: FutureWarning: _AXIS_NAMES has been deprecated.
  value = getattr(object, key)
/usr/local/Cellar/python@3.8/3.8.6/Frameworks/Python.framework/Versions/3.8/lib/python3.8/inspect.py:350: FutureWarning: _AXIS_NUMBERS has been deprecated.
  value = getattr(object, key)
NAME
    w.py 10 - Two-dimensional, size-mutable, potentially heterogeneous tabular data.

SYNOPSIS
    w.py 10 GROUP | COMMAND | VALUE

DESCRIPTION
    Data structure also contains labeled axes (rows and columns).
    Arithmetic operations align on both row and column labels. Can be
    thought of as a dict-like container for Series objects. The primary
    pandas data structure.

GROUPS
    GROUP is one of the following:

     T
       Two-dimensional, size-mutable, potentially heterogeneous tabular data.

     at
       Access a single value for a row/column label pair.

     attrs

     axes

     columns
       Immutable Index implementing a monotonic integer range.

     dtypes
       One-dimensional ndarray with axis labels (including time series).

     iat
       Access a single value for a row/column pair by integer position.

     iloc
       Purely integer-location based indexing for selection by position.

     index
       Immutable Index implementing a monotonic integer range.

     loc
       Access a group of rows and columns by label(s) or a boolean array.

     plot
       Make plots of Series or DataFrame.

     shape

     style
       Helps style a DataFrame or Series according to the data with HTML and CSS.

COMMANDS
    COMMAND is one of the following:
     abs
       Return a Series/DataFrame with absolute numeric value of each element.

     add
       Get Addition of dataframe and other, element-wise (binary operator `add`).


# Several lines omitted