I have to compare two files, file a and file b. They share a set of columns, but those columns appear in different positions in each file. I want to compare the values of the columns
account, code, date, type, pc, vol and bs. If a row in file a matches a row in file b on all of these columns, the account column (column 1) in file a should be replaced
with account1 from file b. Rows from file a that have no match should be written out unchanged. The output, file c, should look like the example below.
I have no background in Linux. After reading through online forums I gather that this can be done with awk, but I am not familiar with awk. Please help. The comparison needs to be done in a Linux environment.
By "randomly placed columns" I mean that I should be able to specify in the command which columns are used to match file a against file b (the column order will not be the same in both files). There may be 8, 10 or 15 columns to match on.
Similarly, account and account1 will not necessarily be the first and last columns, so the command should also let me choose which column gets updated and which column the new value comes from.
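To illustrate what I mean by choosing columns by name rather than by position, here is a minimal Python sketch (not the awk solution I am asking for). The helper names column_indexes and make_key are made up for this example, and it assumes plain comma-separated fields with no quoting.

```python
def column_indexes(header_line, names, sep=","):
    """Return the 0-based position of each requested column name in a header line."""
    header = header_line.rstrip("\n").split(sep)
    return [header.index(name) for name in names]


def make_key(line, indexes, sep=","):
    """Build a composite match key from the fields at the given positions."""
    fields = line.rstrip("\n").split(sep)
    return tuple(fields[i] for i in indexes)


# Columns to match on, given by name so their position in each file does not matter.
match_cols = ["account", "code", "date", "type", "pc", "vol", "bs"]

header_a = "account,temp1,code,type,date,subtask,pc,toy,vol,bs,sub"
header_b = "account,code,date,type,inst,insttype,pc,str,vol,bs,name,xdate,account1"

idx_a = column_indexes(header_a, match_cols)
idx_b = column_indexes(header_b, match_cols)

# The same logical record yields the same key in both files despite the different layouts.
key_a = make_key("1234,GFHD,ASDF,BS,21122022,STOP,C,CAT,1000,S,MATH", idx_a)
key_b = make_key("1234,ASDF,21122022,BS,GOLDY,RUB,C,123.1,1000,S,RON,90891234,CCCCC", idx_b)
print(key_a == key_b)
```

The same idea should carry over to awk by reading the header line of each file first and remembering the field number of each named column.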
Duplicate handling: if file b has several rows that match a single record in file a, only the first matching row of file b should be used, and that row should then be treated as consumed so it cannot be used for another match. If file a has several duplicate rows and file b has only one matching record, file c should have the account value updated in only one of those rows, not in every matching row of file a.
file a
account,temp1,code,type,date,subtask,pc,toy,vol,bs,sub
6576,WEQR,TYRE,BS,54122022,OBCD,K,BAT,5000,F,SCSC
1234,GFHD,ASDF,BS,21122022,STOP,C,CAT,1000,S,MATH
7654,GHAD,LOPI,CV,9089022,KGAD,G,BSEE,5908,J,IOYU
file b
account,code,date,type,inst,insttype,pc,str,vol,bs,name,xdate,account1
1234,ASDF,21122022,BS,GOLDY,RUB,C,123.1,1000,S,RON,90891234,CCCCC
2761,LOPCS,10122022,BSFD,SLV,STR,C,123.9,1001,B,RON,99999988,DDDDD
0980,RTDF,28822025,JUFG,BRNZ,HIY,C,123.8,2000,S,RON,88881234,EEEEE
file c
account,temp1,code,type,date,subtask,pc,toy,vol,bs,sub
6576,WEQR,TYRE,BS,54122022,OBCD,K,BAT,5000,F,SCSC
CCCCC,GFHD,ASDF,BS,21122022,STOP,C,CAT,1000,S,MATH
7654,GHAD,LOPI,CV,9089022,KGAD,G,BSEE,5908,J,IOYU
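To make the expected behaviour concrete, here is a minimal Python sketch that produces file c from the sample files above. It is only an illustration of the rules, not the awk command I am looking for. The function name merge_accounts and its parameters are made up for this example, and it assumes both files fit in memory and have a header row.

```python
import csv
import io
from collections import defaultdict, deque


def merge_accounts(text_a, text_b, match_cols,
                   target_col="account", source_col="account1"):
    """Return file a as CSV text, with target_col replaced from file b where rows match."""
    # Queue the replacement values from file b under each composite key, in file order.
    pending = defaultdict(deque)
    for row in csv.DictReader(io.StringIO(text_b)):
        pending[tuple(row[c] for c in match_cols)].append(row[source_col])

    reader = csv.DictReader(io.StringIO(text_a))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        queue = pending.get(tuple(row[c] for c in match_cols))
        if queue:
            # Each row of file b is consumed once, so it can update only one row of file a.
            row[target_col] = queue.popleft()
        writer.writerow(row)  # unmatched rows are written unchanged
    return out.getvalue()


FILE_A = """account,temp1,code,type,date,subtask,pc,toy,vol,bs,sub
6576,WEQR,TYRE,BS,54122022,OBCD,K,BAT,5000,F,SCSC
1234,GFHD,ASDF,BS,21122022,STOP,C,CAT,1000,S,MATH
7654,GHAD,LOPI,CV,9089022,KGAD,G,BSEE,5908,J,IOYU
"""

FILE_B = """account,code,date,type,inst,insttype,pc,str,vol,bs,name,xdate,account1
1234,ASDF,21122022,BS,GOLDY,RUB,C,123.1,1000,S,RON,90891234,CCCCC
2761,LOPCS,10122022,BSFD,SLV,STR,C,123.9,1001,B,RON,99999988,DDDDD
0980,RTDF,28822025,JUFG,BRNZ,HIY,C,123.8,2000,S,RON,88881234,EEEEE
"""

MATCH_COLS = ["account", "code", "date", "type", "pc", "vol", "bs"]

result = merge_accounts(FILE_A, FILE_B, MATCH_COLS)
print(result, end="")
```

Because every row of file b is removed from its queue once it has been used, a duplicate row in file a finds the queue empty and is left unchanged, which is the duplicate handling I described.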
Is account always the first column? Is account1 always the last column? Can there be duplicate values?