
I have the following CSV and need to filter a specific set of rows:

table      entity_name  node_name  src_name    table_col_name  look_up_indicator  type       keys
gw_policy  account      ns0        fullname    insured_name    N                  attribute  NA
gw_policy  polocy       ns1        agent_name  insured_id      N                  attribute  NA
gw_policy  account      ns2        phone_num   agent_phone     N                  attribute  NA
gw_policy  account      ns0        fullname    insured_name    N                  attribute  NA
gw_policy  polocy       ns1        agent_name  agent           N                  attribute  NA
gw_policy  account      ns2        phone_num   a_phone         N                  attribute  NA
gw_policy  account      ns0        fullname    agen_name       N                  attribute  NA
gw_policy  polocy       ns1        agent_name  agent           N                  attribute  N

From this CSV, I need to extract sets of rows based on a range of values in one column.

That is, using the column 'table_col_name', I need the rows from the first 'insured_name' to the second 'insured_name', and the rows from the first 'agent' to the second 'agent', both inclusive.

So the result would look like this:

#Expected

# For the insured_name

insured_name
insured_id
agent_phone
insured_name

# For the agent
agent
a_phone
agen_name
agent

How can I achieve this using pandas?

Any help is appreciated.

Thanks

  • There are 3 rows with agent. How do you decide which 2 agents to pick out of these three? Commented Dec 24, 2020 at 14:55
  • Hi @MayankPorwal, sorry, I have updated the sheet. Commented Dec 24, 2020 at 15:21

1 Answer

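Use df.index to get the labels of the matching rows, then slice between the first two with df.loc (label-based slicing includes both endpoints):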

In [2278]: insured_name_ix = df[df.table_col_name.eq('insured_name')].index

In [2283]: x = df.loc[insured_name_ix[0]: insured_name_ix[1]]

In [2284]: x
Out[2284]: 
       table entity_name node_name    src_name table_col_name look_up_indicator       type keys
0  gw_policy     account       ns0    fullname   insured_name                 N  attribute  NaN
1  gw_policy      polocy       ns1  agent_name     insured_id                 N  attribute  NaN
2  gw_policy     account       ns2   phone_num    agent_phone                 N  attribute  NaN
3  gw_policy     account       ns0    fullname   insured_name                 N  attribute  NaN

In [2317]: agent_ix = df[df.table_col_name.eq('agent')].index
In [2319]: y = df.loc[agent_ix[0]: agent_ix[1]]

In [2320]: y
Out[2320]: 
       table entity_name node_name    src_name table_col_name look_up_indicator       type keys
4  gw_policy      polocy       ns1  agent_name          agent                 N  attribute  NaN
5  gw_policy     account       ns2   phone_num        a_phone                 N  attribute  NaN
6  gw_policy     account       ns0    fullname      agen_name                 N  attribute  NaN
7  gw_policy      polocy       ns1  agent_name          agent                 N  attribute    N
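
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same idea. The helper name `rows_between` and the `RAW` string are made up for illustration; the sketch assumes the data is whitespace-separated as shown in the question, that the frame has the default unique integer index, and that each marker value occurs at least twice.

```python
import io

import pandas as pd

# Sample data copied from the question, treated as whitespace-separated.
RAW = """\
table entity_name node_name src_name table_col_name look_up_indicator type keys
gw_policy account ns0 fullname insured_name N attribute NA
gw_policy polocy ns1 agent_name insured_id N attribute NA
gw_policy account ns2 phone_num agent_phone N attribute NA
gw_policy account ns0 fullname insured_name N attribute NA
gw_policy polocy ns1 agent_name agent N attribute NA
gw_policy account ns2 phone_num a_phone N attribute NA
gw_policy account ns0 fullname agen_name N attribute NA
gw_policy polocy ns1 agent_name agent N attribute N
"""

# By default read_csv parses the string "NA" as a missing value (NaN).
df = pd.read_csv(io.StringIO(RAW), sep=r"\s+")


def rows_between(frame, column, value):
    """Return the rows from the first to the second occurrence of
    `value` in `column`, both endpoints included.

    Hypothetical helper; assumes `frame` has a unique, sorted index.
    """
    hits = frame.index[frame[column].eq(value)]
    if len(hits) < 2:
        raise ValueError(f"need at least two rows where {column} == {value!r}")
    # .loc slices by label and includes the end label.
    return frame.loc[hits[0]:hits[1]]


insured = rows_between(df, "table_col_name", "insured_name")
agent = rows_between(df, "table_col_name", "agent")

print(insured["table_col_name"].tolist())
print(agent["table_col_name"].tolist())
```

If a marker value appears more than twice, this only uses the first two occurrences; picking a different pair would need an explicit rule.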