Extracting columns from a text file is a common task for system administrators and scientific engineers, and other people occasionally need to do it too.

With bash and awk, it is very simple.

Imagine you want to know the size of each file in a directory.

In bash, you can get this information with the following command:

# ls -al 

 

gg@pbx:~/aux $ ls -al
total 32
drwxrwxr-x 2 gg gg 4096 Sep 10 16:29 .
drwxr-xr-x 5 gg gg 4096 Sep 10 16:28 ..
-rw-rw-r-- 1 gg gg    6 Sep 10 16:28 1.txt
-rw-rw-r-- 1 gg gg    6 Sep 10 16:29 2.txt
-rw-rw-r-- 1 gg gg   12 Sep 10 16:29 3.txt
-rw-rw-r-- 1 gg gg   12 Sep 10 16:29 4.txt
-rw-rw-r-- 1 gg gg  244 Sep 10 16:29 5.txt
-rw-rw-r-- 1 gg gg  394 Sep 10 16:29 6.txt

 

You can save the output to a file with the ordinary redirection operator:

# ls -al > list.txt

Now, to extract the file names and their sizes, you can do the following:

 

gg@pbx:~/aux $ awk '{ print $9 $5}' list.txt

.4096
..4096
1.txt6
2.txt6
3.txt12
4.txt12
5.txt244
6.txt394
list.txt0
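
The empty first line of that output comes from the "total 32" line, which has no ninth field. One way to skip it is sketched here, assuming that every real entry has at least nine whitespace-separated fields and that no file name contains spaces. A few lines of the sample listing are recreated inline (in a temporary file) so the snippet stands alone:

```shell
# Recreate part of the sample listing so the snippet is self-contained.
list=$(mktemp)
cat > "$list" <<'EOF'
total 32
drwxrwxr-x 2 gg gg 4096 Sep 10 16:29 .
-rw-rw-r-- 1 gg gg    6 Sep 10 16:28 1.txt
-rw-rw-r-- 1 gg gg  244 Sep 10 16:29 5.txt
EOF

# NF is awk's count of fields on the current line. "total 32" has only two,
# so the NF >= 9 pattern skips it. The comma in print inserts a single space.
result=$(awk 'NF >= 9 { print $9, $5 }' "$list")
printf '%s\n' "$result"

rm -f "$list"
```

The same pattern can be put in front of any of the awk commands in this article.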

 

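This works, but the name and the size run together. Note also that list.txt appears in its own listing with a size of 0, because the shell creates the file before ls runs. You can do better by putting a separator between the two columns, for example a tab: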

 

gg@pbx:~/aux $ awk '{ print $9 "\t" $5}' list.txt

.       4096
..      4096
1.txt   6
2.txt   6
3.txt   12
4.txt   12
5.txt   244
6.txt   394
list.txt        0
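
Tabs only line up while every name is shorter than one tab stop, which is why the list.txt row above is pushed one column further right. A possible alternative is awk's printf with fixed column widths. This is only a sketch: the widths 10 and 6 are arbitrary choices that happen to fit this sample, and the sample rows are recreated inline:

```shell
# Recreate a few rows of the sample listing so the snippet is self-contained.
list=$(mktemp)
cat > "$list" <<'EOF'
-rw-rw-r-- 1 gg gg    6 Sep 10 16:28 1.txt
-rw-rw-r-- 1 gg gg  394 Sep 10 16:29 6.txt
-rw-rw-r-- 1 gg gg    0 Sep 10 16:30 list.txt
EOF

# %-10s left-justifies the name in 10 columns;
# %6d right-justifies the size in 6 columns.
aligned=$(awk '{ printf "%-10s %6d\n", $9, $5 }' "$list")
printf '%s\n' "$aligned"

rm -f "$list"
```

Right-justifying the size column keeps the digits aligned, which makes it easier to compare sizes at a glance.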

 

Or, if you want to use the data in a spreadsheet, you can create a .csv file:

 

 

gg@pbx:~/aux $ awk '{ print $9 "," $5}' list.txt | tee result.csv
,
.,4096
..,4096
1.txt,6
2.txt,6
3.txt,12
4.txt,12
5.txt,244
6.txt,394
list.txt,0
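
The lone "," row (from the "total" line) and the "." and ".." entries are probably not wanted in a spreadsheet. A slightly cleaner CSV might be produced as sketched below. The header names "name" and "size" are illustrative choices, and the sketch assumes that file names contain neither spaces nor commas:

```shell
# Recreate part of the sample listing so the snippet is self-contained.
list=$(mktemp)
cat > "$list" <<'EOF'
total 32
drwxrwxr-x 2 gg gg 4096 Sep 10 16:29 .
drwxr-xr-x 5 gg gg 4096 Sep 10 16:28 ..
-rw-rw-r-- 1 gg gg    6 Sep 10 16:28 1.txt
-rw-rw-r-- 1 gg gg  394 Sep 10 16:29 6.txt
EOF

# OFS is the separator awk puts between the arguments of print.
# BEGIN runs once before any input and writes the header row.
# The pattern drops the "total" line and the two directory entries.
csv=$(awk 'BEGIN { OFS = ","; print "name", "size" }
           NF >= 9 && $9 != "." && $9 != ".." { print $9, $5 }' "$list")
printf '%s\n' "$csv"

rm -f "$list"
```

Setting OFS once avoids repeating the "," literal if more columns are added later.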

 

And now you can import your file into your favourite spreadsheet.

Gg1