
Due to memory size restrictions, the given file must not be stored in a variable and then traversed.

Example:

var=$(cat FILE)
for i in $var
do
  echo $i
done

How do you traverse all of the strings in a file in the same way as in the example above, but extract each whitespace-separated string directly from the file?

Example:

fileindex=1
totalfilecount=$(cat FILE | wc -w)
while (( ${fileindex} <= ${totalfilecount} ))
do
  onefilename=???   # missing command using fileindex
  ((fileindex+=1))
done

Is there a command that can view a file as an array and allow you to extract words using their index positions?

The idea is to process every word in the file as though the file were an array.

Input file example:

one two
three four
five six
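
For illustration, a minimal sketch (not from the question) of treating the file's whitespace-separated words as an indexed sequence without loading the file into a shell variable; n and FILE are placeholders:

n=3
awk -v n="$n" '{
    # walk the fields of each line, counting words across the whole file
    for (i = 1; i <= NF; i++)
        if (++count == n) { print $i; exit }
}' FILE
# prints "three" for the sample file above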

Here is the scenario that requires the above functionality (a sketch of the intended workflow follows the list):

  • we have server_A and server_B
  • server_A needs to connect to server_B via sftp (sftp only) and 'get' some files
  • both the 'ls' and 'ls -l' commands in sftp can use wildcards to filter for specific files
  • each file needs to be processed individually (for various reasons) on the fly
  • the files cannot be copied as a group to server_A and then processed individually
  • a list of files must first be created on server_A and then each file in that list is copied from server_B and processed one file at a time
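
The following is a hypothetical sketch of that workflow; the user name, host name, remote pattern and the PROCESS step are placeholders rather than details from the question, and key-based authentication is assumed because sftp batch mode is non-interactive:

# 1. build the file list on server_A from an sftp 'ls' against server_B;
#    OpenSSH sftp echoes each batch command with an "sftp> " prefix, so
#    those lines are filtered out of the captured listing
printf 'ls /remote/dir/*.log\n' | sftp -b - user@server_B | grep -v '^sftp>' > filelist.raw

# 2. flatten the (possibly multi-column) listing to one name per line,
#    then fetch and process each file individually
tr -s '[:space:]' '\n' < filelist.raw |
while IFS= read -r onefilename
do
    [ -n "$onefilename" ] || continue
    printf 'get %s\n' "$onefilename" | sftp -b - user@server_B
    PROCESS "$onefilename"    # placeholder for the per-file processing
done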

Where is the problem?

The problem is that when the list is long, the 'ls' command produces a multi-column list of names, which prevents the simple one-name-per-line processing that 'ls -l' allows, since 'ls -l' always produces a single-column list.

This leads to my initial question: does such a solution exist?

  • @h3rrmiller strings won't do what is being asked here. Commented Nov 16, 2012 at 15:46
  • @h3rrmiller How does using strings allow extraction of individual strings as stated in the question? Commented Nov 16, 2012 at 16:25
  • @h3rrmiller You may wish to read the comments. Commented Nov 16, 2012 at 16:30
  • @ChrisDown you deleted your comment about the new lines... Commented Nov 16, 2012 at 16:35

2 Answers


You can do this per word using awk, which should meet your memory requirements:

awk -v RS=\  '{
    # Do something with the word
    print
}' file

You can specify which string you want by using NR.

$ awk -v RS=\  'NR==2{print}' <<< 'foo bar baz'
bar
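
As a sketch of how this might plug into the loop from the question (not part of the answer as posted): setting RS to a run of whitespace keeps multi-line files from leaving newlines inside the extracted words, but it relies on awk implementations such as GNU awk that accept a regular expression for RS:

# word at position $fileindex (regex RS is a GNU awk extension)
onefilename=$(awk -v RS='[[:space:]]+' -v n="$fileindex" 'NR == n { print; exit }' FILE)
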
  • The idea is to process every string in the file as though the file were an array, without processing lines. Commented Nov 16, 2012 at 13:06
  • @AleksandarHadjikan Okay, edited. Commented Nov 16, 2012 at 13:28
  • This is very good. Is there a way to send an index to awk to specify which word to print out? Commented Nov 16, 2012 at 13:57
  • @AleksandarHadjikan Added it to my answer. Commented Nov 16, 2012 at 14:11
  • @AleksandarHadjikan It really sounds like you are going about this the complete wrong way. It would be much better not to do that. Commented Nov 16, 2012 at 16:22

When you say “strings” you mean “words”, right? Strings of characters separated by whitespace. And according to your examples, you want to access them sequentially.

You can do:

$ sed 's/[ \t]\+/\n/g' YOUR_FILE | while read -r word ; do PROCESS "$word" ; done

Example of use:

% echo word1 word2 > YOUR_FILE
% echo word3 word4 >> YOUR_FILE
% echo word5 word6 >> YOUR_FILE
% sed 's/[ \t]\+/\n/g' YOUR_FILE | while read -r word ; do echo _${word}_ ; done
_word1_
_word2_
_word3_
_word4_
_word5_
_word6_
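
A possible extension, not part of the answer: since the split leaves one word per line, indexed access is just a matter of printing the n-th line of the result; n is a placeholder here:

n=4
sed 's/[ \t]\+/\n/g' YOUR_FILE | sed -n "${n}p"    # prints: word4
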
  • Yes, string = word. I tried your example; however, it did not work. I echo $word and it prints the entire file. Commented Nov 16, 2012 at 13:42
  • Works for me. See example. In what way do the contents of your file differ from lines of words separated by spaces? Commented Nov 16, 2012 at 13:47
  • Will break if the file contains backslashes. Commented Nov 16, 2012 at 13:52
  • I create three separate lines with two words in each line. Commented Nov 16, 2012 at 13:54
  • @ChrisDown Yes, apparently read is the culprit. Fixed. Commented Nov 16, 2012 at 13:57
