Revisions to Convert rows to columns

New text processed.

Source Link

edited Jan 29, 2016 at 17:10

user79743

Update for new text.

To process the new text posted in the question, It seems to me that two pass of awk are the best answer. One pass, as short as fields exist, will print the header field titles. The next awk pass will print only field 2. In both cases, I added a way to remove leading and trailing spaces (for better formatting).

#!/bin/bash
{
awk -F: 'BEGIN{ sl="Virtual Machine"}
         $1~sl && head == 1 { head=0; exit 0}
         $1~sl && head == 0 { head=1; }
         head == 1 {
             gsub(/^[ \t]+/,"",$1);   # remove leading  spaces
             gsub(/[ \t]+$/,"",$1);   # remove trailing spaces
             printf( "%s\t", $1)
         }
         ' infile
#echo
awk -F: 'BEGIN { sl="Virtual Machine"}
         $1~sl { printf( "%s\n", "") }
         {
             gsub(/^[ \t]+/,"",$2);   # remove leading  spaces
             gsub(/[ \t]+$/,"",$2);   # remove trailing spaces
             printf( "%s\t", $2)
         }
         ' infile
echo
} | column -t -s "$(printf '%b' '\t')"

The surrounding { ... } | column -t -s "$(printf '%b' '\t')" is to format the whole table in a pretty way.
Please note that the "$(printf '%b' '\t')" could be replaced with $'\t' in ksh, bash, or zsh.

Update for new text.

To process the new text posted in the question, It seems to me that two pass of awk are the best answer. One pass, as short as fields exist, will print the header field titles. The next awk pass will print only field 2. In both cases, I added a way to remove leading and trailing spaces (for better formatting).

#!/bin/bash
{
awk -F: 'BEGIN{ sl="Virtual Machine"}
         $1~sl && head == 1 { head=0; exit 0}
         $1~sl && head == 0 { head=1; }
         head == 1 {
             gsub(/^[ \t]+/,"",$1);   # remove leading  spaces
             gsub(/[ \t]+$/,"",$1);   # remove trailing spaces
             printf( "%s\t", $1)
         }
         ' infile
#echo
awk -F: 'BEGIN { sl="Virtual Machine"}
         $1~sl { printf( "%s\n", "") }
         {
             gsub(/^[ \t]+/,"",$2);   # remove leading  spaces
             gsub(/[ \t]+$/,"",$2);   # remove trailing spaces
             printf( "%s\t", $2)
         }
         ' infile
echo
} | column -t -s "$(printf '%b' '\t')"

The surrounding { ... } | column -t -s "$(printf '%b' '\t')" is to format the whole table in a pretty way.
Please note that the "$(printf '%b' '\t')" could be replaced with $'\t' in ksh, bash, or zsh.

added 1 character in body

Source Link

edited Jan 29, 2016 at 14:22

user79743

If walking the file twice is not a (big) problem (will store only one line in memory):

awk -F : '{printf("%s\t ", $1)}' infile
echo
awk -F : '{printf("%s\t ", $2)}' infile

Which, for a general count of fields would be (which could have many walks of the file):

#!/bin/bash
rowcount=2
for (( i=1; i<=rowcount; i++ )); do
    awk -v i="$i" -F : '{printf("%s\t ", $i)}' infile
    echo
done

But for a really general transpose, this will work:

awk '$0!~/^$/{    i++;
                  split($0,arr,":");
                  for (j in arr) {
                      out[i,j]=arr[j];
                      if (maxr<j){ maxr=j} # max number of output rows.
                  }
            }
    END {
        maxc=i                             # max number of output columns.
        for     (j=1; j<=maxr; j++) {
            for (i=1; i<=maxc; i++) {
                printf( "%s\t", out[i,j])  # out field separator.
            }
            printf( "%s\n","" )
        }
    }' infile

And to make it pretty (change tousing tab :\t as out field separator) :

./script | |column -t -s $'\t'

Virtual_Machine  ID                                Status   Memory  Uptime  Server              Pool     HA     VCPU  Type     OS
OL6U7            0004fb00000600003da8ce6948c441bd  Running  65536   17103   MyOVS1.vmworld.com  HA-POOL  false  16    Xen PVM  Oracle Linux 6

The code above for a general transpose will store the whole matrix in memory.
That could be a problem for really big files.

If walking the file twice is not a (big) problem (will store only one line in memory):

awk -F : '{printf("%s\t ", $1)}' infile
echo
awk -F : '{printf("%s\t ", $2)}' infile

Which, for a general count of fields would be (which could have many walks of the file):

#!/bin/bash
rowcount=2
for (( i=1; i<=rowcount; i++ )); do
    awk -v i="$i" -F : '{printf("%s\t ", $i)}' infile
    echo
done

But for a really general transpose, this will work:

awk '$0!~/^$/{    i++;
                  split($0,arr,":");
                  for (j in arr) {
                      out[i,j]=arr[j];
                      if (maxr<j){ maxr=j} # max number of output rows.
                  }
            }
    END {
        maxc=i                             # max number of output columns.
        for     (j=1; j<=maxr; j++) {
            for (i=1; i<=maxc; i++) {
                printf( "%s\t", out[i,j])  # out field separator.
            }
            printf( "%s\n","" )
        }
    }' infile

And to make it pretty (change to : as out field separator) :

./script | |column -t -s $'\t'

Virtual_Machine  ID                                Status   Memory  Uptime  Server              Pool     HA     VCPU  Type     OS
OL6U7            0004fb00000600003da8ce6948c441bd  Running  65536   17103   MyOVS1.vmworld.com  HA-POOL  false  16    Xen PVM  Oracle Linux 6

The code above for a general transpose will store the whole matrix in memory.
That could be a problem for really big files.

If walking the file twice is not a (big) problem (will store only one line in memory):

awk -F : '{printf("%s\t ", $1)}' infile
echo
awk -F : '{printf("%s\t ", $2)}' infile

Which, for a general count of fields would be (which could have many walks of the file):

#!/bin/bash
rowcount=2
for (( i=1; i<=rowcount; i++ )); do
    awk -v i="$i" -F : '{printf("%s\t ", $i)}' infile
    echo
done

But for a really general transpose, this will work:

awk '$0!~/^$/{    i++;
                  split($0,arr,":");
                  for (j in arr) {
                      out[i,j]=arr[j];
                      if (maxr<j){ maxr=j} # max number of output rows.
                  }
            }
    END {
        maxc=i                             # max number of output columns.
        for     (j=1; j<=maxr; j++) {
            for (i=1; i<=maxc; i++) {
                printf( "%s\t", out[i,j])  # out field separator.
            }
            printf( "%s\n","" )
        }
    }' infile

And to make it pretty (using tab \t as out field separator) :

./script | |column -t -s $'\t'

Virtual_Machine  ID                                Status   Memory  Uptime  Server              Pool     HA     VCPU  Type     OS
OL6U7            0004fb00000600003da8ce6948c441bd  Running  65536   17103   MyOVS1.vmworld.com  HA-POOL  false  16    Xen PVM  Oracle Linux 6

The code above for a general transpose will store the whole matrix in memory.
That could be a problem for really big files.

Correct processing.

Source Link

edited Jan 28, 2016 at 22:21

user79743

awk '$0!~/^$/{    i++;
                  split($0,arr,/":/");
                  for (j in arr) { 
 out[NR                     out[i,j]=arr[j]; }
            }
    {      if (maxr<j){ maxr=j} }    # max number of output rows.
                  }
            }
    END {
        maxc=NRmaxc=i                             # max number of output columns.
        for     (j=1; j<=maxr; j++) {
            for (i=1; i<=maxc; i++) {
                printf( "%s \t""%s\t", out[i,j])  ### The# out field separator.
            }
            printf( "%s\n","" )
        }
    }' infile

And to make it pretty (usechange to :: as out filedfield separator) :

./script | column|column -t -s :$'\t'

Virtual_Machine  ID                                Status   Memory  Uptime  Server              Pool     HA     VCPU  Type     OS
OL6U7            0004fb00000600003da8ce6948c441bd  Running  65536   17103   MyOVS1.vmworld.com  HA-POOL  false  16    Xen PVM  Oracle Linux 6

awk '$0!~/^$/{    split($0,arr,/:/);
                for (j in arr) { out[NR,j]=arr[j]; }
            }
    { if (maxr<j){ maxr=j} }    # max number of output rows.
    END {
        maxc=NR                 # max number of output columns.
        for     (j=1; j<=maxr; j++) {
            for (i=1; i<=maxc; i++) {
                printf( "%s \t", out[i,j])  ### The out field separator.
            }
            printf( "%s\n","" )
        }
    }' infile

And to make it pretty (use : as out filed separator) :

./script | column -t -s :

Virtual_Machine  ID                                Status   Memory  Uptime  Server              Pool     HA     VCPU  Type     OS
OL6U7            0004fb00000600003da8ce6948c441bd  Running  65536   17103   MyOVS1.vmworld.com  HA-POOL  false  16    Xen PVM  Oracle Linux 6

awk '$0!~/^$/{    i++;
                  split($0,arr,":");
                  for (j in arr) { 
                      out[i,j]=arr[j];
                      if (maxr<j){ maxr=j} # max number of output rows.
                  }
            }
    END {
        maxc=i                             # max number of output columns.
        for     (j=1; j<=maxr; j++) {
            for (i=1; i<=maxc; i++) {
                printf( "%s\t", out[i,j])  # out field separator.
            }
            printf( "%s\n","" )
        }
    }' infile

And to make it pretty (change to : as out field separator) :

./script | |column -t -s $'\t'

Virtual_Machine  ID                                Status   Memory  Uptime  Server              Pool     HA     VCPU  Type     OS
OL6U7            0004fb00000600003da8ce6948c441bd  Running  65536   17103   MyOVS1.vmworld.com  HA-POOL  false  16    Xen PVM  Oracle Linux 6

Added column pretty print.

Source Link

edited Jan 28, 2016 at 21:45

user79743

Loading

Source Link

answered Jan 28, 2016 at 21:36

user79743

Loading

Stack Exchange Network

Return to Answer

Update for new text.

Update for new text.