I have an 11GB word-list file which is already sorted, with each word on its own line.
I need to remove duplicates and lines starting with 077.
I guess I need to run sed and sort -u together, but I also want live output in the terminal showing what is happening and, if possible, the time left.
All of this should be in one command, and it must run at full performance under a Live CD or possibly an installed Backtrack 5 rc3.
Time is not very important, but if there is a way for me to calculate the ETA, I might be able to borrow my dad's i7-based machine, which should obviously process it faster. Otherwise I'll have to use an older Core 2 CPU.
The problem I'm facing with the sort command is that, running it live under VMware Player, it doesn't have enough space for its temporary files, so I have to point them at my 32GB USB drive using the -T option. I guess this wouldn't be a problem if I had installed Linux.
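
For reference, here is a minimal sketch of the kind of pipeline described above, run on a tiny sample so it is safe to try. The file names and the `tmp_dir` variable are placeholders, not real paths from my setup; on the real run `tmp_dir` would be the mount point of the USB drive. It has no progress display or ETA yet, which is the part still missing.

```shell
#!/bin/sh
# Demo on a tiny sample file; all paths here are hypothetical placeholders.
work=$(mktemp -d)
printf '%s\n' 0771234 0781111 0781111 0790000 > "$work/wordlist.txt"

# Assumption: on the real run this would be the USB mount point,
# so that sort writes its temporary files there.
tmp_dir="$work"

# grep -v '^077' drops every line that starts with 077.
# sort -u removes duplicate lines; -T sets the temp directory; -o names the output file.
# LC_ALL=C makes both tools compare raw bytes, which is usually faster.
LC_ALL=C grep -v '^077' "$work/wordlist.txt" \
  | LC_ALL=C sort -u -T "$tmp_dir" -o "$work/clean.txt"

cat "$work/clean.txt"
```

On the sample, the 077 line and the repeated line are dropped and the other two lines are kept.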
So please give me the complete command to do this, whether it uses sed, sort or awk (whichever is most efficient).