Browse files and subfolders in Python


Python Problem Overview

I'd like to browse through the current folder and all its subfolders and get all the files with .htm|.html extensions. I have found out that it is possible to find out whether an object is a dir or file like this:

import os

dirList = os.listdir("./") # current directory
for dir in dirList:
  if os.path.isdir(dir) == True:
    # I don't know how to get into this dir and do the same thing here
    # I got file and i can regexp if it is .htm|html

and in the end, I would like to have all the files and their paths in an array. Is something like that possible?

Python Solutions

Solution 1 - Python

You can use os.walk() to recursively iterate through a directory and all its subdirectories:

for root, dirs, files in os.walk(path):
    for name in files:
        if name.endswith((".html", ".htm")):
            # whatever

To build a list of these names, you can use a list comprehension:

htmlfiles = [os.path.join(root, name)
             for root, dirs, files in os.walk(path)
             for name in files
             if name.endswith((".html", ".htm"))]

Solution 2 - Python

I had a similar thing to work on, and this is how I did it.

import os

rootdir = os.getcwd()

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        #print os.path.join(subdir, file)
        filepath = subdir + os.sep + file

        if filepath.endswith(".html"):
            print (filepath)

Hope this helps.

Solution 3 - Python

In python 3 you can use os.scandir():

for i in os.scandir(path):
    if i.is_file():
        print('File: ' + i.path)
    elif i.is_dir():
        print('Folder: ' + i.path)

Solution 4 - Python

Use newDirName = os.path.abspath(dir) to create a full directory path name for the subdirectory and then list its contents as you have done with the parent (i.e. newDirList = os.listDir(newDirName))

You can create a separate method of your code snippet and call it recursively through the subdirectory structure. The first parameter is the directory pathname. This will change for each subdirectory.

This answer is based on the 3.1.1 version documentation of the Python Library. There is a good model example of this in action on page 228 of the Python 3.1.1 Library Reference (Chapter 10 - File and Directory Access). Good Luck!

Solution 5 - Python

Slightly altered version of Sven Marnach's solution..

import os

folder_location = 'C:\SomeFolderName'
file_list = create_file_list(folder_location)

def create_file_list(path):
return_list = []

for filenames in os.walk(path):
    for file_list in filenames:
        for file_name in file_list:
        	if file_name.endswith((".txt")):

return return_list

Solution 6 - Python

from tkinter import *
import os

root = Tk()
file = filedialog.askdirectory()
changed_dir = os.listdir(file)


All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionBlackie123View Question on Stackoverflow
Solution 1 - PythonSven MarnachView Answer on Stackoverflow
Solution 2 - PythonPragyaditya DasView Answer on Stackoverflow
Solution 3 - PythonSpasView Answer on Stackoverflow
Solution 4 - PythonNeonJackView Answer on Stackoverflow
Solution 5 - PythoncampervancoderView Answer on Stackoverflow
Solution 6 - PythonAkshat MishraView Answer on Stackoverflow