Let's say that I have a DataFrame, X, of shape m x n, initialized with 0s. I also have a pandas.Series, Y, of length m, whose values are drawn from the n unique values (1, 2, 3, ..., n). How do I set the Y[i] column of the ith row of X (changing 0 to 1) efficiently without using a loop, especially for large m and n?

For example, for Y = [3,2,1]
X
row     1       2      3
0       0       0      0
1       0       0      0
2       0       0      0

to
row     1       2      3
0       0       0      1
1       0       1      0
2       1       0      0
  • What is the issue, exactly? Have you tried anything, done any research? Also, why are you seemingly using 0 and 1 instead of actual boolean values? Commented Feb 19, 2020 at 23:12
  • If your matrix is not square (i.e. I assume m does not necessarily equal n), then it is not helpful to have a square matrix as your example. Commented Feb 19, 2020 at 23:21
  • Sheer curiosity. I wanted to know if there was a built-in function for something like that. Instead of using loop and .iloc. Commented Feb 20, 2020 at 0:32
  • Note that iat is faster for setting scalar values compared to iloc. Commented Feb 20, 2020 at 1:06

1 Answer

I'm not sure why you are against for loops. This should be fairly efficient.

# iat is positional, so col is treated as a 0-based column position
for row, col in enumerate(Y):
    df.iat[row, col] = 1
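
If the loop does become a bottleneck for large m and n, the same assignment can be sketched with NumPy integer-array indexing, which sets all m cells in one call. This is only a sketch: it assumes Y holds 0-based column positions (as iat does), the width n = 5 is an arbitrary illustrative value, and the names arr, rows, and cols are made up for the example.

```python
import numpy as np
import pandas as pd

Y = pd.Series([3, 2, 1])   # column position to set in each row
m, n = len(Y), 5           # n = 5 is an assumed width for illustration

arr = np.zeros((m, n), dtype=int)   # m x n matrix of zeros
rows = np.arange(m)                 # row positions 0, 1, ..., m - 1
cols = Y.to_numpy()                 # column position for each row
arr[rows, cols] = 1                 # pairs rows[i] with cols[i]

df = pd.DataFrame(arr)
```

If the columns are labeled 1 through n as in the question, subtracting 1 from cols would map the labels to positions.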

You could also compute the flattened (row-major) index of each location, row * n + col, set those entries to one, and then reshape the result to the m x n shape of the matrix.

import numpy as np
import pandas as pd

Y = [3, 2, 1]
n = 5
m = len(Y)
# Flattened (row-major) positions of the cells to set to 1
locations = set(row * n + col for row, col in enumerate(Y))
df = pd.DataFrame(
    np.array([1 if idx in locations else 0 for idx in range(m * n)]).reshape((m, n))
)
>>> df
   0  1  2  3  4
0  0  0  0  1  0
1  0  0  1  0  0
2  0  1  0  0  0
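
The flattened-index idea can also be written without the Python-level list comprehension by letting NumPy compute and assign the positions. This is a rough sketch under the same assumptions as the example above (0-based column positions in Y, n = 5); the names flat and df_flat are illustrative.

```python
import numpy as np
import pandas as pd

Y = np.array([3, 2, 1])
n = 5
m = len(Y)

flat = np.zeros(m * n, dtype=int)   # flattened m x n matrix of zeros
flat[np.arange(m) * n + Y] = 1      # row * n + col for every row at once
df_flat = pd.DataFrame(flat.reshape(m, n))
```

This avoids testing every one of the m * n positions against a set, so the work after allocating the zeros is proportional to m.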