Extracting columns from a text file is a common task for system administrators and scientific engineers, and other people need to do it from time to time as well.
With bash and awk, this is simple.
Imagine you want to know the size of each file in a directory.
In bash you can get this information with the following command:
# ls -al
gg@pbx:~/aux $ ls -al
total 32
drwxrwxr-x 2 gg gg 4096 Sep 10 16:29 .
drwxr-xr-x 5 gg gg 4096 Sep 10 16:28 ..
-rw-rw-r-- 1 gg gg 6 Sep 10 16:28 1.txt
-rw-rw-r-- 1 gg gg 6 Sep 10 16:29 2.txt
-rw-rw-r-- 1 gg gg 12 Sep 10 16:29 3.txt
-rw-rw-r-- 1 gg gg 12 Sep 10 16:29 4.txt
-rw-rw-r-- 1 gg gg 244 Sep 10 16:29 5.txt
-rw-rw-r-- 1 gg gg 394 Sep 10 16:29 6.txt
You can save the output to a file with the normal redirection operator:
# ls -al >list.txt
Now you can extract the file names and their sizes. awk splits each line into whitespace-separated fields, so in this listing $9 is the file name and $5 is the size in bytes:
gg@pbx:~/aux $ awk '{ print $9 $5}' list.txt
.4096
..4096
1.txt6
2.txt6
3.txt12
4.txt12
5.txt244
6.txt394
list.txt0
This works, but the name and the size run together. You can do better by adding a separator, for example a tab:
gg@pbx:~/aux $ awk '{ print $9 "\t" $5}' list.txt
. 4096
.. 4096
1.txt 6
2.txt 6
3.txt 12
4.txt 12
5.txt 244
6.txt 394
list.txt 0
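As an alternative to writing the separator between the fields by hand, awk can insert it for you: a comma inside print puts the output field separator (OFS) between the values. The following is a minimal sketch, assuming a small sample file (the name sample_list.txt is hypothetical) with two rows copied from the listing above.

```shell
# Hypothetical sample file standing in for list.txt.
cat > sample_list.txt <<'EOF'
-rw-rw-r-- 1 gg gg 6 Sep 10 16:28 1.txt
-rw-rw-r-- 1 gg gg 244 Sep 10 16:29 5.txt
EOF

# The comma in print inserts OFS between the two fields;
# -v OFS='\t' sets that separator to a tab.
tabbed=$(awk -v OFS='\t' '{ print $9, $5 }' sample_list.txt)
printf '%s\n' "$tabbed"

# Clean up the sample file.
rm sample_list.txt
```

Changing the separator then only requires changing the OFS value, not the print statement.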
Or, if you want to use your data in a spreadsheet, you can create a .csv file. Here tee prints the result and writes it to result.csv at the same time:
gg@pbx:~/aux $ awk '{ print $9 "," $5}' list.txt | tee result.csv
,
.,4096
..,4096
1.txt,6
2.txt,6
3.txt,12
4.txt,12
5.txt,244
6.txt,394
list.txt,0
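The first line of the output, a lone comma, comes from the "total 32" line of the listing, which has no fifth or ninth field. A possible way to drop it is to print only lines that have at least nine fields. This is a sketch, not the only option; the sample file name is hypothetical and its rows are copied from the listing above.

```shell
# Hypothetical sample file standing in for list.txt.
cat > sample_list.txt <<'EOF'
total 32
drwxrwxr-x 2 gg gg 4096 Sep 10 16:29 .
-rw-rw-r-- 1 gg gg 6 Sep 10 16:28 1.txt
-rw-rw-r-- 1 gg gg 394 Sep 10 16:29 6.txt
EOF

# NF is the number of fields on the current line. "total 32" has only 2,
# so the pattern NF >= 9 skips it and no bare "," line is printed.
csv=$(awk 'NF >= 9 { print $9 "," $5 }' sample_list.txt)
printf '%s\n' "$csv"

# Clean up the sample file.
rm sample_list.txt
```

Keep in mind that a file name containing spaces is split across several fields, so $9 would hold only its first word.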
Now you can import the file into your favourite spreadsheet.
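The listing above contains a list.txt entry of size 0 because the shell creates the redirection target before ls runs. If you do not need the intermediate file, you can pipe ls straight into awk. The sketch below assumes a throwaway directory with two made-up files so that the sizes are known; LC_ALL=C is set because the date columns of ls -l can vary with the locale, which would shift the field numbers.

```shell
# Create a temporary directory with two files of known size (illustrative data).
demo_dir=$(mktemp -d)
printf 'hello\n' > "$demo_dir/1.txt"        # 6 bytes
printf 'hello world\n' > "$demo_dir/3.txt"  # 12 bytes

# Pipe the listing directly into awk: no list.txt is created,
# so it cannot show up in its own output.
sizes=$(LC_ALL=C ls -al "$demo_dir" | awk 'NF >= 9 { print $9, $5 }')
printf '%s\n' "$sizes"

# Remove the temporary directory.
rm -r "$demo_dir"
```

The output also contains the . and .. entries, whose sizes depend on the file system.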
Gg1