  • Nice! 3 sec and 2:05 min, compared to 10 sec and 9:30 min for Python. Commented Jul 1, 2020 at 16:49
  • It throws a warning, however: "awk: tst.awk:7: (FILENAME=fileB FNR=606894) warning: Invalid multibyte data detected. There may be a mismatch between your data and your locale." Commented Jul 1, 2020 at 16:50
  • @dizcza google says... stackoverflow.com/q/40049546/1745001. So apparently you just need to set LC_ALL=C (which is almost always good advice anyway unless you have a specific reason not to). Commented Jul 1, 2020 at 16:53
  • Yep. I should have googled it. Thanks. Commented Jul 1, 2020 at 18:57
  • The max calculation happens once per line of fileA, not once total for all of fileA. And if we can't have the dots string populated before we read fileA then, as you can see in the new script, we need to loop through every value we read from fileA a second time to populate the str2dots array, instead of populating it when we first read each line. In that second loop we also have to access the str2lgth array to get the length of each string, whereas on the first pass we already had it in the lgth scalar variable. So it will impact performance, even if not by much. Commented Jul 4, 2020 at 12:06
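
For readers who hit the same "Invalid multibyte data detected" warning, here is a minimal sketch of the `LC_ALL=C` suggestion from the comments above. The sample input is made up for illustration (a byte that is not valid UTF-8), and the commented-out invocation at the end is only an assumption: the names tst.awk, fileA and fileB come from the warning and the comments, but the actual command line is not shown in this thread.

```shell
# Hypothetical input: "caf" followed by the single byte 0xE9 (octal 351),
# which is valid Latin-1 but not valid UTF-8.
# Setting LC_ALL=C for just this one command makes awk treat the input
# as plain bytes, so no multibyte decoding is attempted.
len=$(printf 'caf\351\n' | LC_ALL=C awk '{ print length($0) }')
echo "$len"   # prints 4 (four bytes)

# Assumed shape of the real invocation (argument order is a guess):
# LC_ALL=C awk -f tst.awk fileA fileB
```

Prefixing the variable to the command changes the locale only for that awk process, so the rest of the shell session keeps its normal locale. Note that in the C locale, functions such as `length()` count bytes rather than characters, which matters if the script measures string widths on non-ASCII data.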