Using sed to replace all occurrences at the beginning with a matching number of replacement strings

I want to transform the output of tree --noreport so that the leading box-drawing characters and spaces on each line are replaced with the same number of spaces. The pattern I would use to match these characters is ^\\(\u2500\\|\u2514\\|\u251C\\| \\)*\u2500. It would be wrapped in $'...' so that the shell expands the Unicode escape sequences, which sed itself does not recognize. The pattern matches on every line of the output of tree --noreport except the first, and each character of each match needs to be replaced with a space.

Example input:

.
├── docs
│   ├── jokes
│   │   └── knock_knock.txt
│   └── work
├── images
└── .profile

Example output:

.
    docs
        jokes
            knock_knock.txt
        work
    images
    .profile

I now realize that, for my purposes, I need to remove any ambiguity about where a file or folder name starts (a name may begin with one or more spaces), so the output should look more like this:

.
    /docs
        /jokes
            /knock_knock.txt
        /work
    /images
    /.profile

The trailing \u2500 in my pattern is what separates tree's formatting from the start of the file or folder name.
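To make the goal concrete, here is a rough sketch of the kind of transformation I have in mind, not a tested solution. It assumes a sed that supports -E (GNU and BSD sed do). The function name indent_tree is made up for illustration. Unlike my pattern above, it also treats U+2502 (the vertical bar in the example input) and the no-break space U+00A0 as formatting characters, the latter on the assumption that some versions of tree emit it. It first inserts a / right after tree's prefix, then loops, turning one formatting character at a time into a space until only spaces precede the /.

```shell
#!/bin/sh
# Hypothetical helper: indent_tree is an illustrative name, not an existing tool.
# Assumes sed supports -E (extended regular expressions).

# U+00A0 (no-break space) as UTF-8 bytes; assumed to appear in some tree output.
nbsp=$(printf '\302\240')

# Alternation of the formatting characters that become a single space each.
box="─|└|├|│|$nbsp"

indent_tree() {
  # 1. Insert "/" after the tree prefix (indentation, then a branch, then "── ").
  # 2. Loop: replace the first formatting character before that "/" with a
  #    space, keeping the spaces already in front of it, until none remain.
  sed -E \
    -e "s,^(( |│|$nbsp)*(├|└)── ),\1/," \
    -e ':a' \
    -e "s,^( *)($box)([^/]*/),\1 \3," \
    -e 'ta'
}
```

It would be used as tree --noreport | indent_tree. The first line (.) has no prefix and passes through unchanged. Alternation is used instead of a bracket expression so that each multi-byte character is matched as a whole even in a non-UTF-8 locale. A name that itself starts with spaces followed by something resembling tree's prefix could still confuse it.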

Melab