Translated by ChatGPT.
Let's explain this using the directory structure provided on the official website:
sound/ Top-level package
init.py Initialize the sound package
formats/ Subpackage for file format conversions
init.py
wavread.py
wavwrite.py
aiffread.py
aiffwrite.py
auread.py
auwrite.py
...
effects/ Subpackage for sound effects
init.py
echo.py
surround.py
reverse.py
...
filters/ Subpackage for filters
init.py
equalizer.py
vocoder.py
karaoke.py
...
Using Existing Python Code
Most programming tutorials start teaching syntax after setting up the environment. But let’s first discuss using code that’s already been written. The simplest way to do this is by installing packages through a package manager:
pip install sound
Then, if you want to use a function in a specific file (let's say wavwrite.py
has a function called write()
), there are multiple ways to import it. Note the different ways to import and call the function:
import sound
sound.formats.wavwrite.write()
from sound import formats
formats.wavwrite.write()
from sound.formats import wavwrite
wavwrite.write()
from sound.formats.wavwrite import write
write()
However, not all Python code can be installed directly. For example, if a research paper is published along with data processing code, often the code is simply shared in a folder. If you download that folder and try import sound
, an error will occur, stating that Python can't find a library named sound.
How Python Locates Code Files
If you think about it, it’s normal that an error occurs. The previous simple command import sound
solved the problem magically—finding the library without needing a file path—even though libraries can be located in different parts of the file system.
This is because Python doesn’t search the entire hard drive. Instead, it uses a variable, typically named PYTHONPATH
, which holds a list of directories where Python library folders are located. When you enter a command in the terminal, Python will:
- Search the current folder, meaning the folder where you launched the Python command.
- Look through each folder in the
PYTHONPATH
variable. - Finally, check the default location for packages installed by the package manager, typically
<path to python>/site-packages
.
If the library is found, Python will import it; otherwise, an error occurs.
In the error mentioned above, if we happen to be in the sound
folder and run Python, the first condition applies, and import sound
won’t produce an error. But this won’t work in other locations.
Terminology: Interactive, Script, Module, Package
Executable Python commands can be found in the following contexts, with the first being interactive and the last three being file-based:
- interactive: The Python interactive interface, also known as calculator mode, which appears after entering
python
in the command line. Commands entered here display results immediately. Once Python exits, these commands are lost. - script: A Python script file, like
somefile.py
when runningpython somefile.py
from the command line.- Python is lightweight, and some commands are worth saving for later reuse, much like Linux commands and bash scripts.
- However, as a fully-featured language, Python can also support complex object-oriented programming. In this sense, the main script file can be considered the main module, serving as the main file and program entry point.
- module: A Python module file, which you can call with
python -m another
(without a.py
extension). Technically, any.py
file is a module, but it’s generally used for files whose variables and functions are meant for other Python files to import. - package: A Python package, which is a higher-level structure of related modules (typically a folder with
__init__.py
inside). But packages and modules aren’t defined solely by folder hierarchy—more on that in the next section.
Script vs. Module
Python offers both scripting flexibility and the robustness of more complex languages. Unlike some languages, you don’t need a main file called main.py
with a Main
class and a main()
function. But Python still needs a starting point in a complex program.
This starting point is the .py
file after python
(without the -m
parameter). Python sets the __name__
attribute of this main file to "__main__"
. This allows Python to treat it as the main script file, even if it's part of a larger library, with no knowledge of sibling or other related modules.
Other modules are used by importing them in the main module (or script). The import command tells Python where to find the module’s code and the relationships between modules in the package hierarchy. Alternatively, the -m
option provides this structure, e.g., python -m sound.formats.wavwrite
, which allows wavwrite.py
to execute and know its relative position in the hierarchy.
Absolute Import vs. Relative Import
The examples at the beginning use absolute imports (no leading .
), whereas relative imports (.
for current directory, ..
for parent directory) are for internal code that's almost never run as the main module.
Returning to the file structure above, if sound/effects/surround.py
needs functions from sound/formats/wavwrite.py
and sound/effects/echo.py
, you could write:
# in sound/effects/surround.py
from ..formats import wavwrite
from . import echo
Organizing Code for Reuse
Finally, after years of coding, I’m drafting a paper, and now I need to organize previous analysis code into a project. Each time I tried to unify everything, import errors between modules were an issue. This post resulted from that experience.
Here's a recommended project structure from this article:
|- notebooks/
|- 01-first-logical-notebook.ipynb
|- 02-second-logical-notebook.ipynb
|- prototype-notebook.ipynb
|- archive/
|- no-longer-useful.ipynb
|- projectname/
|- projectname/
|- __init__.py
|- config.py
|- data.py
|- utils.py
|- setup.py
|- README.md
|- data/
|- raw/
|- processed/
|- cleaned/
|- scripts/
|- script1.py
|- script2.py
|- archive/
|- no-longer-useful.py
|- environment.yml
This structure allows you to convert tools into an installable library. Now, you can use projectname
in notebooks and scripts just like numpy
or pandas
.
To use the code, install it with:
cd projectname
pip install -e .
References
- What’s the difference between a Python module and a Python package? All Python files are modules, while a package is a specific kind of module.
- Official explanation of Python modules: https://docs.python.org/3/tutorial/modules.html
- StackOverflow: "run as module" is different from "run as script". Running as a module sets the
__name__
attribute to the module’s name, while running as a script sets it to__main__
. - A project organization tutorial: https://realpython.com/python-application-layouts/
- Official documentation on packaging: https://packaging.python.org/en/latest/tutorials/packaging-projects/
See also: