4

I'm relatively new to R and trying to figure out how to merge multiple data.frames with varying numbers of rows but all with a common column, "Year". I've looked through similar questions, and this question: Merge dataframes, different lengths provided a great answer. However, when I applied it to my own data, I couldn't get it to work with multiple data.frames; I always receive an error message.

Sample data:

> df1 <- data.frame(Year=2006:2011, Site1=c("2.3", "1"  , "3.1", "2.9", "1.4", "3"))  
> df2 <- data.frame(Year=2007:2011, Site2=c("2.7", "4.1", "1.1", "2.6", "3.1"))  
> df3 <- data.frame(Year=2008:2011, Site3=c("1.3", "2"  , "3.6", "1.7"))  

The goal is to produce a single data.frame where column 1 is the year, column 2 is site 1, column 3 is site 2, and so on. I have ~17 data.frames currently (there will be up to 40), corresponding to 17 sites with variable timelines/number of rows.

Any help would be appreciated.

Code I've tried:

> NewDF <- merge(df1, df2, by="Year", all.x=TRUE, all.y=TRUE)  

This worked great for 2 data.frames, but when I tried to add in another data.frame, I received the error message:

> NewDF <- merge(list=c(df1, df2, df3), by="Year", all.x=TRUE, all.y=TRUE)  
 Error in as.data.frame(x) : argument "x" is missing, with no default

1 Answer 1

17

You want to merge the result with df3, i.e.:

merge(df3, merge(df1, df2, by="Year", all.x=TRUE, all.y=TRUE), by = "Year", all.x = TRUE, all.y = TRUE)
#  Year Site3 Site1 Site2
#1 2006  <NA>   2.3  <NA>
#2 2007  <NA>     1   2.7
#3 2008   1.3   3.1   4.1
#4 2009     2   2.9   1.1
#5 2010   3.6   1.4   2.6
#6 2011   1.7     3   3.1

Or if you have your data.frame's in a list, use Reduce to generalize the above:

Reduce(function(x,y) merge(x, y, by = "Year", all.x = TRUE, all.y = TRUE),
       list(df1, df2, df3))
#  Year Site1 Site2 Site3
#1 2006   2.3  <NA>  <NA>
#2 2007     1   2.7  <NA>
#3 2008   3.1   4.1   1.3
#4 2009   2.9   1.1     2
#5 2010   1.4   2.6   3.6
#6 2011     3   3.1   1.7
Sign up to request clarification or add additional context in comments.

3 Comments

Thanks for a quick response! I just tried out both options, and have a couple questions. 1) For option 1, is there a way to easily run the merge with 17 data.frames? It works great for 3, but not sure the most efficient way to apply it to 17. 2) I tried the reduce function, and it worked perfectly for my first 3 data.frames. When I added a 4th, I received an error message: `Reduce(function(x,y) merge(x, y, by = "Year", all.x = TRUE, all.y = TRUE), list(Faye006WWmean, Faye006BWWmean, Faye007WWmean, Grin001WWmean)) Error in match.names(clabs, names(xi)) : names do not match previous names
@user2296922 1) that's what second method does 2) something is weird about your 4th data.frame - try merging just that one with any other one and see what that does
I just tried reordering the data.frames within the list, and there was no issue. Must have been the way I entered the code the first time. Thanks for the help!

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.