I have a data.frame with 2291 rows and 4 columns. I want to pick the rows whose column 3 value matches column 2 of the next row, continuing for as long as the matching holds, and then start again from the next matching row once it stops.
I tried a for loop over 1:nrow(df), but it is not accurate because i (I think) does not actually restart from the matched row.
My current code is like this:
test <- NULL
x <- c()
y <- c()
for (i in 1:nrow(df)) {
  if (df[i, 3] == df[i + 1, 2]) {
    x <- df[i, ]
    y <- df[i + 1, ]
    i = i + 1 # stuck at this
  }
  test <- rbind(test, x, y)
}
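On the line marked "stuck at this": as far as I understand, a for loop in R takes the next element of 1:nrow(df) on every pass, so reassigning i inside the body only lasts until the end of that iteration. A minimal sketch of that behaviour, with a while loop for comparison (the names seen, visited and j are made up for illustration):

```r
# Reassigning the loop variable inside a for loop does not skip iterations:
# the next pass simply takes the next element of the sequence.
seen <- integer(0)
for (i in 1:5) {
  seen <- c(seen, i)
  i <- i + 2  # overwritten at the start of the next iteration
}
print(seen)

# A while loop leaves the index under explicit control.
visited <- integer(0)
j <- 1
while (j <= 5) {
  visited <- c(visited, j)
  j <- j + 2  # this jump does take effect
}
print(visited)
```

So if the index really needs to jump ahead, a while loop is one option; a vectorised comparison of the two columns avoids the index handling entirely.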
Sample data looks like this:
X 2670000 3750000 C
X 3830000 8680000 E3
X 8680000 10120000 E1-A
X 10120000 11130079 D
X 11170079 11810079 E3
X 11810079 12810079 E2-A
X 12810079 13530079 E3
X 13530079 14050079 E3
X 14050079 15330079 A
X 15330079 16810079 E2-A
X 16810079 17690079 E2-A
What I want is:
X 3830000 8680000 E3
X 8680000 10120000 E1-A
X 10120000 11130079 D
X 11170079 11810079 E3
X 11810079 12810079 E2-A
X 12810079 13530079 E3
X 13530079 14050079 E3
X 14050079 15330079 A
X 15330079 16810079 E2-A
X 16810079 17690079 E2-A
I'm actually interested in the column 4 values. Whenever df[i,3] is not equal to df[i+1,2], can the code be updated to store the column 4 values collected up to that point in their own vector?
For example, the result for this sample would be:
vector_1
"E3" "E1-A" "D"
vector_2
"E3" "E2-A" "E3" "E3" "A" "E2-A" "E2-A"
What I get so far is:
X 3830000 8680000 E3
X 8680000 10120000 E1-A
X 8680000 10120000 E1-A
X 10120000 11130079 D
X 8680000 10120000 E1-A
X 10120000 11130079 D
X 11170079 11810079 E3
X 11810079 12810079 E2-A
X 11810079 12810079 E2-A
X 12810079 13530079 E3
If I go from row 1 to the last row of my df, I want to keep adding column 4 values to a vector as long as column 3 of row i matches column 2 of row i+1. Once that condition breaks, I want to start a new vector the next time the condition is met.
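To make the intended result concrete, here is a rough sketch of the grouping described above, written without a loop. It assumes the sample data is loaded with the hypothetical column names V1 to V4 (my real column names may differ), and it is only one possible way to express the rule, not necessarily the best one:

```r
# Sample data from above; V1..V4 are assumed column names.
df <- data.frame(
  V1 = "X",
  V2 = c(2670000, 3830000, 8680000, 10120000, 11170079, 11810079,
         12810079, 13530079, 14050079, 15330079, 16810079),
  V3 = c(3750000, 8680000, 10120000, 11130079, 11810079, 12810079,
         13530079, 14050079, 15330079, 16810079, 17690079),
  V4 = c("C", "E3", "E1-A", "D", "E3", "E2-A", "E3", "E3", "A",
         "E2-A", "E2-A"),
  stringsAsFactors = FALSE
)

n <- nrow(df)

# TRUE where a row's column 2 equals the previous row's column 3,
# i.e. the row continues the run started above it.
continues <- c(FALSE, df[[2]][-1] == df[[3]][-n])

# Every row that does not continue a run starts a new one.
run_id <- cumsum(!continues)

# Column 4 values per run, keeping only runs of two or more rows.
runs <- split(df[[4]], run_id)
runs <- unname(runs[lengths(runs) > 1])

# The corresponding rows of the data.frame.
kept <- df[run_id %in% which(tabulate(run_id) > 1), ]

print(runs)
```

On the sample this gives two vectors, one with "E3" "E1-A" "D" and one with the seven values from the second block, and kept holds the ten rows listed under "What I want".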
Thank you!
i == nrow(df) then there is df[i+1, 3]...
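If the concern here is that i + 1 runs past the last row when i == nrow(df), the sketch below shows what happens in that case and one way to stop the loop one row early. The small data.frame and its column names start and end are invented for illustration:

```r
# Toy data; `start` and `end` are hypothetical column names.
df <- data.frame(start = c(1, 5, 9), end = c(5, 9, 12))
n <- nrow(df)

# Indexing one row past the end does not fail by itself, it returns NA.
beyond <- df[n + 1, 1]
print(beyond)

# Using that NA as an if() condition is what raises the error.
res <- tryCatch(
  if (df[n, 2] == beyond) "match" else "no match",
  error = function(e) "error"
)
print(res)

# Stopping at the second-to-last row keeps i + 1 inside the data.frame.
matches <- logical(0)
for (i in seq_len(max(n - 1, 0))) {
  matches <- c(matches, df[i, 2] == df[i + 1, 1])
}
print(matches)
```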