1

I have 2 dataframes:

df1 = data.frame(Bird_ID = c(1:6), Sex = c("Male","Female","Male","Male","Male","UNK"), Age.years =c("2","4","8","2","12","1"))

df2 = data.frame(Bird_ID = c(7), Sex = c("Female"), date.fledged= c("19/10/2021"))

df1
# Bird_ID Sex Age.years
# 1 Male 2
# 2 Female 4
# 3 Male 8
# 4 Male 2
# 5 Male 12
# 6 UNK 1
       
df2
# Bird_ID Sex Date.fledged
# 7 Female 19/10/2021
  • My first dataframe (df1) is my database where I have all my birds records with useful informations
  • My second dataframe (df2) is the "updater". I want to merge these informations to the main database (df1) and the output would be like this:
dfmerged = data.frame(Bird_ID = c(1:7), Sex = c("Male","Female","Male","Male","Male","UNK","Female"), Age.years =c("2","4","8","2","12","1",NA))`

dfmerged
# Bird_ID Sex Age.years
# 1 Male 2
# 2 Female 4
# 3 Male 8
# 4 Male 2
# 5 Male 12
# 6 UNK 1
# 7 Female NA

How can I update the bird database df1 using information from df2and keeping only (and all) columns that are in the main database df1? For example here dfmerged keeps only the columns from df1,drops the "Date.fledged" column from df2 and bird 7 has NA as "Age.years" because data is missing (and that is the wanted output).

1
  • 1
    Does "update" mean adding new data or does it include updating existing data? Commented Oct 19, 2021 at 11:15

2 Answers 2

1

You could use

library(dplyr)

df1 %>% 
  bind_rows(df2) %>% 
  select(names(df1))

This returns

  Bird_ID    Sex Age.years
1       1   Male         2
2       2 Female         4
3       3   Male         8
4       4   Male         2
5       5   Male        12
6       6    UNK         1
7       7 Female      <NA>
Sign up to request clarification or add additional context in comments.

1 Comment

thks. that does the job
1

You may do a full join.

merge(df1, df2, by = c('Bird_ID', 'Sex'), all = TRUE)[-4]

#  Bird_ID    Sex Age.years
#1       1   Male         2
#2       2 Female         4
#3       3   Male         8
#4       4   Male         2
#5       5   Male        12
#6       6    UNK         1
#7       7 Female      <NA>

In dplyr -

library(dplyr)

full_join(df1, df2, by = c('Bird_ID', 'Sex')) %>%
  select(-date.fledged)

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.