I want to extract gene names from file_a (the text after "Name=", usually in column 10) for each row of file_b. A row of file_b matches a gene when its first column equals the first column of file_a and its second column lies within the gene interval delineated by columns 4 and 5 of file_a. Because the first columns must match, I only get one gene per row of file_b, but multiple adjacent rows of file_b could in theory match the same gene (e.g. if the second row in file_b was MT 4065).
MT 4050 mt-nd1nd2
groupIII 7332350 si:dkeyp-68b7.10
groupIV 5347350 zgc:153018
groupVI 11230375 bnip4
groupVII 17978350 si:ch211-284e13.4
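A minimal sketch of the matching described above, in Python. It assumes file_a is whitespace-delimited with the chromosome in column 1, the interval in columns 4 and 5, and a "Name=" attribute somewhere on the line, and that file_b holds a chromosome and a position per row. The function names `load_genes` and `annotate` are made up for illustration, and the file_a rows below are invented sample data, not the real annotation.

```python
import re
from collections import defaultdict

# "Name=" is usually in column 10 but not always, so search the whole line.
NAME_RE = re.compile(r"Name=([^;\s]+)")

def load_genes(lines):
    """Index file_a rows by chromosome as (start, end, name) tuples."""
    genes = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        match = NAME_RE.search(line)
        if len(fields) < 5 or match is None:
            continue  # no interval or no gene name on this row
        genes[fields[0]].append((int(fields[3]), int(fields[4]), match.group(1)))
    return genes

def annotate(genes, lines):
    """Return (chrom, pos, name) for each file_b row that falls inside a gene."""
    hits = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        chrom, pos = fields[0], int(fields[1])
        for start, end, name in genes.get(chrom, []):
            if start <= pos <= end:
                hits.append((chrom, pos, name))
                break  # only one gene per file_b row
    return hits

# Invented file_a rows (hypothetical coordinates and names).
file_a = [
    "MT\tsrc\tgene\t3800\t5000\t.\t+\t.\tID=g1;Name=geneA",
    "groupIII\tsrc\tgene\t7332000\t7333000\t.\t-\t.\tID=g2;Name=geneB",
]
file_b = ["MT 4050", "MT 4065", "groupIII 7332350", "groupIII 9000000"]

for chrom, pos, name in annotate(load_genes(file_a), file_b):
    print(chrom, pos, name)
```

With the sample rows, both MT positions report the same gene and the last groupIII position is dropped because it lies in no interval. For real files, the two lists can be replaced with open file handles, since both functions only iterate over lines.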
EXTRA (IF POSSIBLE): Some of the entries (of file_b) will not land within a gene but may be close to one, say within 100 units on either side. It would be nice to have separate code that lets you specify this proximity, as was attempted here: "Extract names from File_B having overlapping intervals with File_A"
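One possible way to express the proximity idea, as a sketch only: treat a position as a hit when its distance to a gene interval is at most a chosen number of units, and pick the closest gene if several qualify. The function name `nearest_gene`, the `flank` parameter, and the tuple layout `(chrom, start, end, name)` are assumptions for illustration, and picking the closest gene is a design choice the question does not specify.

```python
def nearest_gene(genes, chrom, pos, flank=100):
    """Return the name of the closest gene on `chrom` within `flank` units of `pos`, or None.

    `genes` is an iterable of (chrom, start, end, name) tuples (hypothetical layout).
    """
    best_name, best_dist = None, None
    for g_chrom, start, end, name in genes:
        if g_chrom != chrom:
            continue  # first columns must match
        # Distance is 0 inside the interval, otherwise the gap to the nearer edge.
        dist = max(start - pos, pos - end, 0)
        if dist <= flank and (best_dist is None or dist < best_dist):
            best_name, best_dist = name, dist
    return best_name

# Invented intervals for illustration.
genes = [("MT", 3800, 5000, "geneA"), ("MT", 5150, 6000, "geneB")]
print(nearest_gene(genes, "MT", 5080))            # 80 from geneA, 70 from geneB
print(nearest_gene(genes, "MT", 5050, flank=40))  # nothing within 40 units
```

Setting `flank=0` reduces this to strict containment, so the same function covers both the exact and the "close to a gene" case.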
Any help is VERY much appreciated!