File Management with Pathlib

pathlib

Python’s pathlib module includes a number of helpers for creating file names and locating files.

The Path class is the main entry point to the pathlib module:

>>> from pathlib import Path

Building file paths

If we make a new Path object by passing a period to it, we’ll have a path that represents our current directory:

>>> my_dir = Path('.')
>>> my_dir
PosixPath('.')

We can ask for the absolute path of this directory with the resolve method, which also resolves any symlinks that might exist.

>>> my_dir.resolve()
PosixPath('/home/trey')

We can also use the / operator to join strings onto this object to make a new path. Here’s a path that represents a hello.txt file in our current directory:

>>> my_dir / 'hello.txt'
PosixPath('hello.txt')
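
As a small sketch of how the / operator composes, the example below joins several segments and then inspects the result. It uses PurePosixPath so the output is the same on any operating system; the reports/2024/summary.txt path is made up for illustration.

```python
from pathlib import PurePosixPath

# PurePosixPath does no filesystem access, so this runs the same everywhere.
# The directory and file names here are hypothetical.
base = PurePosixPath('reports')
summary = base / '2024' / 'summary.txt'

print(summary)          # reports/2024/summary.txt
print(summary.name)     # summary.txt
print(summary.suffix)   # .txt
print(summary.parent)   # reports/2024
print(summary.parts)    # ('reports', '2024', 'summary.txt')
```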

Finding and creating files

Path objects have a glob method that can be used to search for files that match a certain pattern (a string, which uses * as a wildcard).

Here we’re looking for all files that end in .txt:

>>> my_dir.glob('*.txt')
<generator object Path.glob at 0x7ff57b7c7570>
>>> list(my_dir.glob('*.txt'))
[]

The return value from glob is a lazy iterable so we’ll need to loop over it or convert it to a list to get the paths inside.

We don’t have any txt files in the current directory so nothing matches. We can create a txt file like this:

>>> (my_dir / 'hello.txt').touch()

Now if we search for txt files in our current directory again, we’ll see one matching path come back:

>>> list(my_dir.glob('*.txt'))
[PosixPath('hello.txt')]
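
The following sketch shows the same idea in a throwaway directory, and contrasts glob with rglob, which also searches subdirectories. The file and directory names are assumptions chosen for the example.

```python
import tempfile
from pathlib import Path

# Work in a temporary directory so the example doesn't touch real files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'notes.txt').touch()
    (root / 'script.py').touch()
    (root / 'docs').mkdir()
    (root / 'docs' / 'readme.txt').touch()

    # glob('*.txt') only looks at entries directly inside root...
    top_level = sorted(p.name for p in root.glob('*.txt'))
    # ...while rglob('*.txt') also descends into subdirectories.
    recursive = sorted(
        p.relative_to(root).as_posix() for p in root.rglob('*.txt')
    )

print(top_level)   # ['notes.txt']
print(recursive)   # ['docs/readme.txt', 'notes.txt']
```

Both calls return lazy iterables, which is why the results are consumed (here with sorted) before the temporary directory is cleaned up.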

Opening files

We can use pathlib to open files for reading and writing:

>>> hello_path = my_dir / 'hello.txt'
>>> with hello_path.open(mode='wt') as hello_file:
...     hello_file.write("Hello!\n")
...
7
>>> with hello_path.open() as hello_file:
...     print(hello_file.read())
...
Hello!

This is an alternative to using the built-in open function, which does the same thing with filename strings.

If we just want to read from or write to the file once on a single line, we could use the read_text or write_text methods instead:

>>> hello_path.read_text()
'Hello!\n'
>>> hello_path.write_text('Hiya')
4
>>> hello_path.read_text()
'Hiya'
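
One detail worth seeing in action: write_text replaces the file’s contents rather than adding to them. The sketch below, which uses a hypothetical log.txt in a temporary directory, shows that behavior and uses open in append mode when the existing contents should be kept.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / 'log.txt'   # hypothetical file name

    # write_text replaces the whole file each time it is called.
    log_path.write_text('first\n', encoding='utf-8')
    log_path.write_text('second\n', encoding='utf-8')

    # To add to a file instead, open it in append mode.
    with log_path.open(mode='a', encoding='utf-8') as log_file:
        log_file.write('third\n')

    contents = log_path.read_text(encoding='utf-8')

print(repr(contents))   # 'second\nthird\n'
```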

Asking questions of paths

There are all sorts of useful methods on Path objects. For example we can ask whether paths exist, whether they point to files, or whether they point to directories:

>>> hello_path = my_dir / 'hello.txt'
>>> another_path = my_dir / 'hello2.txt'
>>> hello_path.exists()
True
>>> hello_path.is_file()
True
>>> another_path.exists()
False
>>> another_path.is_file()
False
>>> hello_path.is_dir()
False
>>> my_dir.is_dir()
True
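
These checks combine naturally into a small helper. The describe function below is a hypothetical name, not part of pathlib; it is only a sketch of how exists, is_file, and is_dir relate to one another.

```python
import tempfile
from pathlib import Path

def describe(path):
    """Return 'directory', 'file', 'missing', or 'other' for the given path."""
    if path.is_dir():
        return 'directory'
    if path.is_file():
        return 'file'
    if not path.exists():
        return 'missing'
    return 'other'  # e.g. a socket or device file

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'hello.txt').touch()
    results = [
        describe(root),
        describe(root / 'hello.txt'),
        describe(root / 'hello2.txt'),
    ]

print(results)   # ['directory', 'file', 'missing']
```

Note that is_file and is_dir both return False for a path that doesn’t exist, so a separate exists check is only needed to tell "missing" apart from "something else".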

Other tools

Before pathlib was added to Python, we had to reach for tools in a number of different places within the standard library to accomplish the same things.

We used the glob module’s glob function to search for files by name.

We used various tools in the os.path module to work with file paths (build up paths, get absolute paths, split paths, etc.).

We used os.getcwd, os.listdir, os.mkdir, os.remove, and os.rename as well.
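
To make the comparison concrete, here is a sketch that performs the same steps twice in a temporary directory: once with the older string-based functions and once with Path methods. The old, new, draft.txt, and final.txt names are assumptions for illustration.

```python
import glob
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Older style: functions from os, os.path, and glob working on strings.
    old_dir = os.path.join(tmp, 'old')
    os.mkdir(old_dir)
    with open(os.path.join(old_dir, 'draft.txt'), mode='w') as f:
        f.write('hi')
    os.rename(
        os.path.join(old_dir, 'draft.txt'),
        os.path.join(old_dir, 'final.txt'),
    )
    old_names = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(old_dir, '*.txt'))
    )

    # pathlib style: the same steps as methods on Path objects.
    new_dir = Path(tmp) / 'new'
    new_dir.mkdir()
    (new_dir / 'draft.txt').write_text('hi')
    (new_dir / 'draft.txt').rename(new_dir / 'final.txt')
    new_names = sorted(p.name for p in new_dir.glob('*.txt'))

print(old_names)   # ['final.txt']
print(new_names)   # ['final.txt']
```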

Pathlib Exercises

List Files

This is the ls.py exercise in the modules directory. You need to create the ls.py file in the modules sub-directory of the exercises directory.

Write a program ls.py that lists all files and directories in a given directory or in the current directory if no argument is given. Sort the lines by filename.

Example usage:

$ python3 ls.py
my_file.txt
speeches
whereami.py
$ python3 ls.py speeches/
moon_landing.txt
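
One possible starting point (a sketch, not the official solution): Path.iterdir yields the entries directly inside a directory, and each entry’s name attribute is its final path component. The sorted_names helper is a hypothetical name, and reading the directory from the command line is left to you.

```python
import tempfile
from pathlib import Path

def sorted_names(directory):
    """Return the names of the entries directly inside directory, sorted."""
    return sorted(entry.name for entry in Path(directory).iterdir())

# Try it on a throwaway directory that mimics the example above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'whereami.py').touch()
    (root / 'my_file.txt').touch()
    (root / 'speeches').mkdir()
    names = sorted_names(root)

print(names)   # ['my_file.txt', 'speeches', 'whereami.py']
```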

Remove Empty Directories

This is the remove_empty.py exercise in the modules directory. You need to create the remove_empty.py file in the modules sub-directory of the exercises directory.

Write a program remove_empty.py that removes all empty directories inside a given directory.

The empty directories should be removed recursively (if removing empty directories inside a parent directory makes that parent directory empty then it should be removed too).

Given this file hierarchy:

a
├── b
│   └── d
└── c
    └── e.txt

Our program should work like this:

$ python remove_empty.py a
Deleting directory d
Deleting directory b

And our file hierarchy should now look like this:

a
└── c
    └── e.txt
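
As a hint rather than a full solution, the sketch below shows one way to detect an empty directory and demonstrates that Path.rmdir only succeeds on empty directories. The is_empty_dir helper is a hypothetical name; the recursive part of the exercise is left to you.

```python
import tempfile
from pathlib import Path

def is_empty_dir(path):
    """Return True if path is a directory with nothing inside it."""
    return path.is_dir() and not any(path.iterdir())

with tempfile.TemporaryDirectory() as tmp:
    # Recreate the hierarchy from the exercise inside a throwaway directory.
    a = Path(tmp) / 'a'
    (a / 'b' / 'd').mkdir(parents=True)
    (a / 'c').mkdir()
    (a / 'c' / 'e.txt').touch()

    before = (is_empty_dir(a / 'b'), is_empty_dir(a / 'b' / 'd'))
    (a / 'b' / 'd').rmdir()          # rmdir only works on empty directories
    after = is_empty_dir(a / 'b')    # b became empty once d was removed

    try:
        (a / 'c').rmdir()            # c still contains e.txt
        c_removed = True
    except OSError:
        c_removed = False

print(before, after, c_removed)   # (False, True) True False
```

Because removing d is what makes b empty, the order in which directories are visited matters: children need to be handled before their parents.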

Find EditorConfig Files

This is the editorconfig.py exercise in the modules directory. You need to create the editorconfig.py file in the modules sub-directory of the exercises directory.

Write a module editorconfig.py that contains a function find_configs which accepts a filename and returns the contents of all .editorconfig files found in the file’s directory and all its parent directories.

Your editorconfig.find_configs function should return a list of tuples, each containing the file contents and the full file path of one .editorconfig file that was found.

For example, say we’re in the directory /home/trey/work/project/ and the following .editorconfig files exist:

/home/trey/work/project/.editorconfig:

[*.py]
indent_style = space
indent_size = 4

/home/trey/.editorconfig:

[*]
trim_trailing_whitespace = true

Calling our function should find those files, which we can then print like this:

>>> from editorconfig import find_configs
>>> for config, name in find_configs('/home/trey/work/project/my_file.py'):
...     print(f"Configuration for {name}")
...     print(config)
...
Configuration for /home/trey/work/project/.editorconfig
[*.py]
indent_style = space
indent_size = 4

Configuration for /home/trey/.editorconfig
[*]
trim_trailing_whitespace = true
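
A hint for walking up the directory tree, offered as a sketch rather than a solution: the parents attribute of a path yields its containing directory first and then each ancestor up to the root. The example uses PurePosixPath so it never touches the filesystem; checking which candidates exist and reading them is left to you.

```python
from pathlib import PurePosixPath

# PurePosixPath never touches the filesystem, so this runs anywhere.
filename = PurePosixPath('/home/trey/work/project/my_file.py')

# parents yields the containing directory first, then each ancestor up to the root.
candidates = [str(directory / '.editorconfig') for directory in filename.parents]

for candidate in candidates:
    print(candidate)
# /home/trey/work/project/.editorconfig
# /home/trey/work/.editorconfig
# /home/trey/.editorconfig
# /home/.editorconfig
# /.editorconfig
```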

Tree

Note

This exercise does not have automated tests.

This is the tree.py exercise in the modules directory.

Write a program tree.py that prints out a file tree for the given directory (defaulting to the current directory if no argument is specified).

Also print out the total number of files and directories seen.

Example usage:

$ python3 tree.py
declaration-of-independence.txt
here/
    are/
        directories/
speeches/
    moon_landing.txt
whereami.py

4 directories, 3 files

Bonus: Use ASCII art to represent the files visually

Example usage:

$ python3 tree.py
.
|-- declaration-of-independence.txt
|-- here
|   `-- are
|       `-- directories
|-- speeches
|   `-- moon_landing.txt
`-- whereami.py

4 directories, 3 files