Timeline for For each line in file A replace all matching lines in file B with a pattern
Current License: CC BY-SA 4.0
12 events

| when | what | | by | license | comment |
|---|---|---|---|---|---|
| Jul 4, 2020 at 12:06 | comment | added | Ed Morton | The max calculation happens once per line of fileA, not once total for all of fileA. If we can't have the dots string populated before we read fileA then, as you can see in the new script, we need to loop through every value we read from fileA again to populate the str2dots array instead of doing it when we read each line the first time. There we also have to access the str2lgth array to get the length for each string, whereas on the first pass we had it in the lgth scalar variable. So it will impact performance, even if not by much. | |
| Jul 4, 2020 at 11:49 | comment | added | dizcza | Thank you. But why are you asking "not even 100,000 chars"? I mean, it shouldn't impact the performance since you read fileA only once to determine the max length and fileA is relatively small. | |
| Jul 4, 2020 at 11:45 | comment | added | Ed Morton | I added a version that does both. | |
| Jul 4, 2020 at 11:44 | history | edited | Ed Morton | CC BY-SA 4.0 | added 1002 characters in body |
| Jul 3, 2020 at 19:58 | comment | added | dizcza | How to dynamically calculate the largest line length of fileA and put it in awk? And also, if I assume that fileA is already lowercased, can I remove `{ lc = tolower($0) }` block and substitute `$0` for `lc`? | |
| Jul 1, 2020 at 18:57 | comment | added | dizcza | Yeap. I should have googled it. Thanks. | |
| Jul 1, 2020 at 16:53 | comment | added | Ed Morton | @dizcza google says... stackoverflow.com/q/40049546/1745001. So apparently you just need to set `LC_ALL=C` (which is almost always good advice anyway unless you have a specific reason not to). | |
| Jul 1, 2020 at 16:50 | comment | added | dizcza | It throws a warning however saying "awk: tst.awk:7: (FILENAME=fileB FNR=606894) warning: Invalid multibyte data detected. There may be a mismatch between your data and your locale." | |
| Jul 1, 2020 at 16:49 | vote | accept | dizcza | ||
| Jul 1, 2020 at 16:49 | comment | added | dizcza | Nice! 3 sec and 02:05 min compared to 10 sec and 9:30 min of Python. | |
| Jul 1, 2020 at 15:37 | history | edited | Ed Morton | CC BY-SA 4.0 | edited body |
| Jul 1, 2020 at 15:29 | history | answered | Ed Morton | CC BY-SA 4.0 |