Timeline for Compare files and combine rows with matching values based on last column
Current License: CC BY-SA 4.0
19 events
| when | what | | by | license | comment |
|---|---|---|---|---|---|
| Aug 31 at 15:18 | comment | added | Ed Morton | You should have included a case where a given m-pattern value exists in some but not all files so we could test a potential solution using that - right now the only scenarios your sample input covers are the cases where an m-pattern value only exists in 1 file or exists in all files, it's missing the case where an m-pattern exists in 2 or more but not all files. Fewer fields per input line, e.g. just 2-4 instead of up to 21, would also have made it easier to see your requirements. | |
| Aug 31 at 14:57 | answer | added | Ed Morton | timeline score: 0 | |
| S Aug 27 at 7:59 | history | suggested | chicks | CC BY-SA 4.0 | capitalize title |
| Aug 26 at 21:30 | review | Suggested edits | |||
| S Aug 27 at 7:59 | |||||
| Aug 24 at 14:22 | answer | added | Stéphane Chazelas | timeline score: 5 | |
| Aug 24 at 10:58 | answer | added | cas | timeline score: 3 | |
| Aug 23 at 11:54 | history | became hot network question | |||
| Aug 23 at 9:49 | comment | added | ilkkachu | @MarcusMüller, not exactly their fault, since SE doesn't make it too easy to work with tabs. If you paste tabs in the edit window, they are tabs there, but the tab stops are every 4 spaces (which messes up the columns already since some of the fields here are 5 chars while others are 2), and the page gets rendered with spaces in the normal view, so anyone wanting to copy the text intact, would need to go to the edit view to do that... (I just tried that in the sandbox) | |
| Aug 23 at 6:04 | comment | added | Matteo | @MarcusMüller apologies I realized copy-pasting doesn't preserve TABs... I'm still trying to figure out a way to do so efficiently when opening a code block. Maybe I will save part of the file(s) I need, then sed the space with \t, as you also suggested. | |
| Aug 23 at 6:01 | vote | accept | Matteo | ||
| Aug 22 at 21:59 | answer | added | Marcus Müller | timeline score: 5 | |
| Aug 22 at 21:33 | comment | added | Marcus Müller | note that your files are, unlike your code assumes, not actually separating entries with tabs, but simply with multiple spaces. Did you manually make these examples (hint: don't do such things, you're bound to make the problem more confusing that way)? Are you sure it's tabs in the actual files? | |
| Aug 22 at 20:54 | answer | added | markp-fuso | timeline score: 8 | |
| Aug 22 at 20:52 | comment | added | Matteo | @ilkkachu wow that's so interesting! I didn't know about join since I'm still learning and so far mostly worked with basic awk and other commands. But it definitely comes handy, it appears to be doing exactly what I need, even the filtering for only shared M-patterns across all four files! | |
| Aug 22 at 20:42 | comment | added | ilkkachu | or a hideous one-liner for all four files: `join -t $'\t' <(join -t $'\t' -1 9 -2 5 <(sort -k9 file1.txt) <(sort -k5 file2.txt) ) <( join -t $'\t' -1 15 -2 21 <(sort -k15 file3.txt) <(sort -k21 file4.txt) )` | |
| Aug 22 at 20:40 | comment | added | ilkkachu | I think "join" is the tool that's meant for doing exactly this, though you'll need to sort the inputs beforehand, since in a lexical sort (often used as default) M10 comes before M2. Also it moves the key field to the front. Something like `join -t $'\t' -1 9 -2 5 <(sort -k9 file1.txt) <(sort -k5 file2.txt)` might produce something useful for the first two files. | |
| Aug 22 at 20:28 | history | edited | Matteo | CC BY-SA 4.0 | added simple tests done so far |
| Aug 22 at 20:24 | history | edited | ilkkachu | | edited tags |
| Aug 22 at 18:54 | history | asked | Matteo | CC BY-SA 4.0 |
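
The `join` approach suggested in the comments above can be tried on a small scale. The following is only a minimal sketch: the two-column, tab-separated sample data and the temporary file names are hypothetical and are not taken from the question. It assumes the inputs really are tab-separated, sorts each file on its key field alone, and uses temporary files rather than process substitution so that it also runs under a plain POSIX `sh`.

```shell
# Hypothetical two-column, tab-separated sample data (not the question's files).
tmp=$(mktemp -d)
tab=$(printf '\t')
printf 'a1\tM2\nb1\tM10\nc1\tM3\n' > "$tmp/file1.txt"
printf 'M10\tx2\nM2\ty2\nM7\tz2\n' > "$tmp/file2.txt"

# Sort each file on its key field only (-k2,2 and -k1,1), with the same
# collation (LC_ALL=C) that join will use, so both tools agree on the order.
LC_ALL=C sort -t "$tab" -k2,2 "$tmp/file1.txt" > "$tmp/file1.sorted"
LC_ALL=C sort -t "$tab" -k1,1 "$tmp/file2.txt" > "$tmp/file2.sorted"

# Join field 2 of the first file with field 1 of the second file.
# Only keys present in both files are kept; the key is printed first.
result=$(LC_ALL=C join -t "$tab" -1 2 -2 1 "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$result"
# Prints (tab-separated):
# M10	b1	x2
# M2	a1	y2

rm -rf "$tmp"
```

M10 sorts before M2 here because the comparison is lexical, which is the order `join` expects. M3 and M7 each appear in only one file and are dropped, and the key field moves to the front of every output line, as noted in the comment.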