Recursive processing of directories

How to process a directory and everything in it, recursing into subdirectories
Very often one needs not only to process all files in a directory, but also to enter its subdirectories and process the files there in the same way. To do this, one may use either a recursive function or the standard library function os.walk.
The former approach, and the rationale behind recursive programs, is described in Recursion in directory hierarchy. Here is just a brief example that searches for files with the ".txt" extension.
Environment:
.
|-- 1A
|   |-- 2A
|   |   |-- 3A
|   |   |   `-- file3.txt
|   |   |-- 3B
|   |   |   `-- file3.txt
|   |   `-- file2.txt
|   |-- 2B
|   |   |-- 3A
|   |   |   `-- file3.txt
|   |   |-- 3B
|   |   |   `-- file3.txt
|   |   `-- file2.txt
|   `-- file1.txt
|-- file0.txt
`-- filesystem6-1.py
import os

def get_txt_files(directory):
    """Return the paths of all .txt files in directory and its subdirectories."""
    files = []
    for name in os.listdir(directory):
        full_name = os.path.join(directory, name)
        if os.path.isfile(full_name) and full_name.endswith(".txt"):
            files.append(full_name)
        elif os.path.isdir(full_name):
            # Recurse into the subdirectory and collect its matches too.
            files += get_txt_files(full_name)
    return files

for path in get_txt_files("./"):
    print(path)
stdout:
./1A/2A/3A/file3.txt
./1A/2A/3B/file3.txt
./1A/2A/file2.txt
./1A/2B/3A/file3.txt
./1A/2B/3B/file3.txt
./1A/2B/file2.txt
./1A/file1.txt
./file0.txt
Run time: 34.2 ms
Below is a slightly modified version that shows the inner workings of the algorithm.
Environment:
.
|-- 1A
|   |-- 2A
|   |   |-- 3A
|   |   |   `-- file3.txt
|   |   |-- 3B
|   |   |   `-- file3.txt
|   |   `-- file2.txt
|   |-- 2B
|   |   |-- 3A
|   |   |   `-- file3.txt
|   |   |-- 3B
|   |   |   `-- file3.txt
|   |   `-- file2.txt
|   `-- file1.txt
|-- file0.txt
`-- filesystem6-2.py
import os

def get_txt_files(directory):
    """Same search as before, but trace every step of the recursion."""
    print("- processing", directory)
    files = []
    for name in os.listdir(directory):
        full_name = os.path.join(directory, name)
        if os.path.isfile(full_name) and full_name.endswith(".txt"):
            print("+ adding    ", full_name)
            files.append(full_name)
        elif os.path.isdir(full_name):
            print("> going into", full_name)
            files += get_txt_files(full_name)
    print("< leaving   ", directory)
    return files

files = get_txt_files("./")
stdout:
- processing ./
> going into ./1A
- processing ./1A
> going into ./1A/2A
- processing ./1A/2A
> going into ./1A/2A/3A
- processing ./1A/2A/3A
+ adding     ./1A/2A/3A/file3.txt
< leaving    ./1A/2A/3A
> going into ./1A/2A/3B
- processing ./1A/2A/3B
+ adding     ./1A/2A/3B/file3.txt
< leaving    ./1A/2A/3B
+ adding     ./1A/2A/file2.txt
< leaving    ./1A/2A
> going into ./1A/2B
- processing ./1A/2B
> going into ./1A/2B/3A
- processing ./1A/2B/3A
+ adding     ./1A/2B/3A/file3.txt
< leaving    ./1A/2B/3A
> going into ./1A/2B/3B
- processing ./1A/2B/3B
+ adding     ./1A/2B/3B/file3.txt
< leaving    ./1A/2B/3B
+ adding     ./1A/2B/file2.txt
< leaving    ./1A/2B
+ adding     ./1A/file1.txt
< leaving    ./1A
+ adding     ./file0.txt
< leaving    ./
Run time: 34.6 ms
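Because the recursion is written by hand, extra state can be passed from one call to the next. As an illustration only, the following sketch limits how deep the search goes. The function name get_txt_files_limited and the max_depth parameter are hypothetical additions, not part of the examples above, and the sorted() call is added only to make the order of results predictable.

```python
import os

def get_txt_files_limited(directory, max_depth):
    """Collect .txt files under directory, descending at most max_depth levels.

    Hypothetical variant of get_txt_files; max_depth=0 means
    "look only at directory itself".
    """
    files = []
    for name in sorted(os.listdir(directory)):
        full_name = os.path.join(directory, name)
        if os.path.isfile(full_name) and full_name.endswith(".txt"):
            files.append(full_name)
        elif os.path.isdir(full_name) and max_depth > 0:
            # Each recursive call gets one level less to spend.
            files += get_txt_files_limited(full_name, max_depth - 1)
    return files
```

Called as get_txt_files_limited("./", 1) on the tree shown above, this would look into ./ and ./1A only and skip everything below ./1A/2A and ./1A/2B.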
The latter approach, using os.walk, is shown in the next example. It is often shorter and easier to use in simple cases, but it lacks the generality of the former approach. Still, most common problems can be solved with os.walk.
Environment:
.
|-- 1A
|   |-- 2A
|   |   |-- 3A
|   |   |   `-- file3.txt
|   |   |-- 3B
|   |   |   `-- file3.txt
|   |   `-- file2.txt
|   |-- 2B
|   |   |-- 3A
|   |   |   `-- file3.txt
|   |   |-- 3B
|   |   |   `-- file3.txt
|   |   `-- file2.txt
|   `-- file1.txt
|-- file0.txt
`-- filesystem6-3.py
import os

all_files = []
# os.walk yields one (dirname, directories, files) tuple per visited directory.
for dirname, directories, files in os.walk("./"):
    for f in files:
        if f.endswith(".txt"):
            all_files.append(os.path.join(dirname, f))

for f in all_files:
    print(f)
stdout:
./file0.txt
./1A/file1.txt
./1A/2A/file2.txt
./1A/2A/3A/file3.txt
./1A/2A/3B/file3.txt
./1A/2B/file2.txt
./1A/2B/3A/file3.txt
./1A/2B/3B/file3.txt
Run time: 33.4 ms
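Some tasks that seem to need a hand-written recursion can still be done with os.walk. In its default top-down mode, os.walk descends only into the subdirectory names that remain in the list it yields, so removing names from that list in place skips those subtrees. A minimal sketch, assuming a hypothetical helper find_txt_files with a skip_dirs parameter (neither is part of the examples above):

```python
import os

def find_txt_files(top, skip_dirs=()):
    """Collect .txt files under top with os.walk, not entering skip_dirs.

    Hypothetical helper; skip_dirs holds plain directory names
    such as "2B", not full paths.
    """
    found = []
    for dirname, directories, files in os.walk(top):
        # Slice assignment edits the very list os.walk will consult next,
        # so the removed directories are never visited.
        directories[:] = [d for d in directories if d not in skip_dirs]
        for f in files:
            if f.endswith(".txt"):
                found.append(os.path.join(dirname, f))
    return found
```

Called as find_txt_files("./", skip_dirs=("2B",)) on the tree shown above, this would leave out every file under ./1A/2B. Note that a plain reassignment (directories = [...]) would not work, because it only rebinds the local name and leaves the list used by os.walk unchanged.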