<div class="vslide">
  <div class="vslide-title">
    <p style="font-family: Protomolecule; font-size: 2.3em; line-height: 90%; margin: 0px auto; text-align: center; width: 100%;"><span style="letter-spacing: .04rem;">Programming</span><br><span style="letter-spacing: .0rem;">and Databases</span></p>
<p class="author" style="font-family: Protomolecule; margin: 0px auto;  text-align: center; width: 100%; font-size: 1.2em;">Joern Ploennigs</p>
<p class="subtitle" style="font-family: Protomolecule; margin: 1em auto; text-align: center; width: 100%; font-size: 1.2em;">Modularization</p>
    <figcaption>Midjourney: Modular Blocks, ref. Piet Mondrian</figcaption>
  </div>
<script>
  function setSectionBackground(c,v){
    let e=document.currentScript.previousElementSibling;
    while(e&&e.tagName!=='SECTION')e=e.parentElement;
    if(e){
      if(c)e.setAttribute('data-background-color',c);
      if(v){
        e.setAttribute('data-background-video',v);
        e.setAttribute('data-background-video-loop','true');
        e.setAttribute('data-background-video-muted','true');
      }
    }
  }
  setSectionBackground('#000000', 'images/07a_Module/mj_title.mp4');
</script>
<style>
.flex-row{display:flex; gap:2rem; align-items:flex-start; justify-content:space-between;}
.flex-row .col1{flex:1; min-width:10px}
.flex-row .col2{flex:2; min-width:10px}
.flex-row .col3{flex:3; min-width:10px}
.flex-row .col4{flex:4; min-width:10px}
.flex-row .col5{flex:5; min-width:10px}
.flex-row .col6{flex:6; min-width:10px}
.flex-row .col7{flex:7; min-width:10px}
.vcent{display:flex; align-items:center; justify-content:center}
</style>
</div>

# Modularization

<figure class="mj-tile-band">
    <img src='images/07a_Module/mj_title_band.jpg'>
    <figcaption>Midjourney: Modular Blocks, ref. Piet Mondrian</figcaption>
</figure>

> Write programs that do one thing and do it well.
>
> – Doug McIlroy

## <a href="../lec_slides/07a_Module.slides.html">Slides</a>/<a href="../pdf/slides/07a_Module.pdf">PDF</a>
<iframe src="../lec_slides/07a_Module.slides.html" width="750" height="500"></iframe>

## Process

![](images/partB_1.svg)

## Split code into files, modules, and packages

Large programs often contain dozens of classes with hundreds of functions and many thousands of lines of code. When all classes are defined in a single file, it quickly becomes difficult to keep an overview. Moreover, when several programmers work on different parts of the program, version conflicts arise quickly if they all edit the same files.

To keep the code organized and easy to navigate, it is split into multiple files with the extension `.py`. It is common to have one file
- per class, when classes are defined
- per topic, when helper functions are defined (e.g., mathematical functions, …)
- per area of responsibility, for example when data loading is separated from data processing. This way you can later define other processing steps and reuse the loading.

If we save each class of the geometry elements from the Class Definition section of the last lecture in its own file, we get, for example, the following project structure:

- 📁 geometry
    - 📄 [ImmutablePoint.py](points/ImmutablePoint.py)
    - 📄 [Point.py](points/Point.py)
    - 📄 [Line.py](shapes/Line.py)
    - 📄 [Pentagon.py](shapes/Pentagon.py)
    - 📄 [Polygon.py](shapes/Polygon.py)
    - 📄 [Tetragon.py](shapes/Tetragon.py)
    - 📄 [Triangle.py](shapes/Triangle.py)

<!-- <center><img src="images/07a_Module/files.png" style="width: 40ex"></center> -->

Each file contains only the code of the class with the same name, even if that is only a few lines, as in the case of the classes `Triangle`, `Tetragon`, and `Pentagon`. Crucially, when a programmer searches for the code of a class, they should be able to see exactly in which file it can be found and not have to search far.

Larger projects are split into multiple *modules* by creating additional subdirectories. For example, we want to group all generic classes for points in the `points` directory and all geometric shapes in the `shapes` directory. This makes it easy to structure larger projects.

Together, the modules form a *package*, in this case the `geometry` package, which we can reuse in different programs.

- 📁 geometry
    - 📁 points
        - 📄 [ImmutablePoint.py](geometry/points/ImmutablePoint.py)
        - 📄 [Point.py](geometry/points/Point.py)
    - 📁 shapes
        - 📄 [Line.py](geometry/shapes/Line.py)
        - 📄 [Pentagon.py](geometry/shapes/Pentagon.py)
        - 📄 [Polygon.py](geometry/shapes/Polygon.py)
        - 📄 [Tetragon.py](geometry/shapes/Tetragon.py)
        - 📄 [Triangle.py](geometry/shapes/Triangle.py)
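
How such a directory tree behaves as a package can be tried out without touching the real project. The following sketch builds a throwaway copy of part of the layout in a temporary directory and imports from it. The class bodies are hypothetical placeholders, not the classes from the last lecture, and the package is called `geometry_demo` here only to avoid clashing with an existing `geometry` package. It relies on Python (3.3 and later) treating plain directories as packages even without an `__init__.py` file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical placeholder sources; the real classes come from the last lecture.
FILES = {
    "geometry_demo/points/Point.py": (
        "class Point:\n"
        "    def __init__(self, x, y):\n"
        "        self.x = x\n"
        "        self.y = y\n"
    ),
    "geometry_demo/shapes/Line.py": (
        "import math\n"
        "\n"
        "class Line:\n"
        "    def __init__(self, start, end):\n"
        "        self.start = start\n"
        "        self.end = end\n"
        "\n"
        "    def length(self):\n"
        "        return math.hypot(self.end.x - self.start.x,\n"
        "                          self.end.y - self.start.y)\n"
    ),
}


def build_package(root: Path) -> None:
    """Write every file of the demo package below the given root directory."""
    for relative_path, source in FILES.items():
        target = root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source, encoding="utf-8")


project_dir = Path(tempfile.mkdtemp())
build_package(project_dir)

# Make the temporary project directory visible to the import system.
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

from geometry_demo.points.Point import Point
from geometry_demo.shapes.Line import Line

line = Line(Point(0, 0), Point(3, 4))
print(line.length())  # 5.0
```

The dotted import path mirrors the directory path: `geometry_demo/shapes/Line.py` becomes `geometry_demo.shapes.Line`.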

<!-- <center><img src="images/07a_Module/shapes.png" style="width: 40ex"></center> -->

## `main()` - The entry point of a program

When code is spread across multiple files, it must be clear which code is executed first. For this, you define the entry function `main()`.

An entry function of this kind exists in almost all programming languages and designates the starting point of a program. In Python, `main()` is a naming convention rather than a built-in mechanism, so the function has to be called explicitly.

It either takes no arguments or receives them dynamically when the program is invoked by the user or by another program (command-line arguments).

In [None]:
def main():
	print("This is the main function")

However, `main()` should not be called when the Python file is imported as a library, where you're only interested in its functions. Therefore, at the end of a file that defines a `main()` function, you use the following conditional.

In [None]:
if __name__ == "__main__":
	main()

This takes advantage of the fact that the special variable `__name__` has the value `'__main__'` in the file that was started directly, whereas in an imported file it holds the name of that module.
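
The two values of `__name__` can be observed directly. The following sketch writes a tiny hypothetical module `name_demo.py` into a temporary directory, runs it once as a main program via `runpy.run_path` (roughly what `python name_demo.py` does) and then imports it as a module. The file name and the variable `seen_name` are made up for this illustration.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical module that simply remembers the value of __name__ it sees.
SOURCE = "seen_name = __name__\n"

module_dir = Path(tempfile.mkdtemp())
module_path = module_dir / "name_demo.py"
module_path.write_text(SOURCE, encoding="utf-8")

# Case 1: execute the file as the main program.
as_script = runpy.run_path(str(module_path), run_name="__main__")
print(as_script["seen_name"])  # __main__

# Case 2: import the same file as a module.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
import name_demo

print(name_demo.seen_name)  # name_demo
```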

## Importing Modules

To avoid constantly loading unnecessary code, Python does not load this code automatically.

So, if we want to use the code in our files, modules, and packages, we must first tell Python to load it.

This *importing* is done with the `import` command.

The simplest form of import is to write `import` followed by the full dotted name of what should be loaded, starting with the package name.

In [None]:
import geometry.points.ImmutablePoint
import geometry.shapes.Line

def main():
	point_1 = geometry.points.ImmutablePoint.ImmutablePoint(x=54.083336, y=12.108811)
	point_2 = geometry.points.ImmutablePoint.ImmutablePoint(y=12.094167, x=54.075211)
	line_1 = geometry.shapes.Line.Line(start=point_1, end=point_2)
	print(f"The length of the line between point 1 and 2 is: {line_1.length()}")

if __name__ == "__main__":
	main()

One drawback of importing entire packages is that if we want to reference individual classes that reside in submodules, we must specify the fully qualified class name. For example, in the example above: `geometry.points.ImmutablePoint.ImmutablePoint`.

Therefore, one typically imports individual modules by specifying the path to the module, for example `geometry.points.ImmutablePoint`. With `as`, you can also assign new names to the imported modules, such as `point` or `line` in the example below.

In [None]:
import geometry.points.ImmutablePoint as point
import geometry.shapes.Line as line

def main():
	point_1 = point.ImmutablePoint(x=54.083336, y=12.108811)
	point_2 = point.ImmutablePoint(y=12.094167, x=54.075211)
	line_1 = line.Line(start=point_1, end=point_2)
	print(f"The length of the line between point 1 and 2 is: {line_1.length()}")

if __name__ == "__main__":
	main()

Alternatively, individual parts of a module can be imported with the `from` statement. The wildcard `*` imports all elements of the module. Specific elements, such as individual classes, can also be named directly, like `Line` in the following example.

In [None]:
from geometry.points.ImmutablePoint import *
from geometry.shapes.Line import Line

def main():
	point_1 = ImmutablePoint(x=54.083336, y=12.108811)
	point_2 = ImmutablePoint(y=12.094167, x=54.075211)
	line_1 = Line(start=point_1, end=point_2)
	print(f"The length of the line between point 1 and 2 is: {line_1.length()}")

if __name__ == "__main__":
	main()
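
The import variants above can be compared side by side. Since the `geometry` package is not available everywhere, this sketch uses the standard module `math` instead. The alias names `m` and `circle_constant` are arbitrary choices for illustration.

```python
import math                               # whole module, accessed via its name
import math as m                          # same module under a shorter alias
from math import sqrt                     # one element, usable without prefix
from math import pi as circle_constant    # one element under a new name

print(math.sqrt(16))      # 4.0
print(m.sqrt(16))         # 4.0
print(sqrt(16))           # 4.0
print(m is math)          # True: an alias is just a second name for the same module
print(circle_constant == math.pi)  # True
```

The wildcard form is left out here on purpose: because it copies every public name into the current file, it can silently overwrite functions of the same name that were defined earlier.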

## Import standard packages from Python

Python includes many [standard libraries](https://python.readthedocs.io/en/latest/library/index.html) for common tasks. For construction and environmental informatics, the following are the most useful:

| Package | Description |
| ------ | ------------- |
| collections | More complex data types for counting, sorting |
| http   | Functions of the HTTP protocol, such as web servers |
| json   | Functions to store objects as text |
| logging| Functions to write logs |
| math   | Mathematical functions |
| os     | Functions to locate, load, and save files |
| pickle | Functions to store objects in binary form |
| pprint | print functions for pretty-printing objects |
| random | Functions for generating random numbers |
| re     | Functions for regular expressions to search text |
| sys    | Functions to obtain system information |
| time   | Functions for time and date information |
| timeit | Functions to measure the performance of functions |
|traceback| Functions to display the stack trace |
| urllib | Functions to load and process URLs on the Internet |

From the list, we have already become familiar with and used the libraries `math`, `time`, `timeit`, `traceback`, and `logging`. The other packages, however, offer additional useful functions.

For example, if we want to list all files in a directory, we use the `os` package.

In [None]:
import os

folder = "geometry/shapes/"
for filename in os.listdir(folder):
	path = os.path.join(folder, filename)  # build the full path once
	if os.path.isfile(path):               # skip subdirectories
		print(path)
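
Other packages from the table can be combined in the same way. As a small sketch, the following counts file extensions with `re` and `collections`. The list of file names is a made-up example string, not read from disk.

```python
import collections
import re

# Hypothetical directory listing as plain text.
listing = "Point.py Line.py Polygon.py notes.txt Triangle.py"

# re.findall returns the captured group (the extension) for every match.
extensions = re.findall(r"\.(\w+)", listing)

# Counter tallies how often each extension occurs.
counts = collections.Counter(extensions)
print(counts.most_common())  # [('py', 4), ('txt', 1)]
```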

Many websites offer application programming interfaces (APIs), because their own pages use the same APIs to load the data they display. These APIs usually exchange data in the JSON format. This is a text-based format whose objects closely resemble the `dict` data type in Python. In addition, it supports lists, strings, numbers, Booleans, and `null` (Python's `None`).
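
The correspondence between JSON text and Python data types can be seen in a round trip with the `json` package. The record below is a hypothetical example. Its field names are illustrative and are not the schema of any real API.

```python
import json

# Hypothetical record using the data types that JSON supports.
station = {
    "name": "Rostock",
    "id": "12495",
    "temperatures": [3.5, 7.1],
    "active": True,
    "note": None,
}

text = json.dumps(station)     # dict -> JSON text (a str)
restored = json.loads(text)    # JSON text -> dict

print(text)
print(type(text).__name__)     # str
print(restored == station)     # True
```

In the JSON text, `True` appears as `true` and `None` as `null`. `json.loads` converts them back.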

We would like, for example, to analyze the weather data of a weather station in Germany. We obtain these data from the [German Weather Service](https://www.dwd.de). These data can also be downloaded from the [API](https://dwd.api.bund.dev/). For this, you need the ID (identification number) of a weather station, which can be found [here](https://www.dwd.de/DE/leistungen/klimadatendeutschland/statliste/statlex_html.html?view=nasPublication&nn=16102). We take as an example a station in the Hansaviertel in Rostock with the ID `12495`.

With Python, you can then load the weather data from the API using the `urllib` package. The JSON format can be processed with the `json` package. To print the result more readably, we use the `pprint` function from the `pprint` package (pretty-print).

In [None]:
import urllib.request
import json
import pprint

stationID='12495' # Rostock-Hansaviertel
with urllib.request.urlopen(f'https://dwd.api.proxy.bund.dev/v30/stationOverviewExtended?stationIds={stationID}') as f:
    data=f.read() # This returns binary data
    weather=json.loads(data) # We convert the binary data into a dict
    pprint.pprint(weather, indent=2, compact=True)

<!-- We can start our own web server with the `http.server` package.

import http.server as server

server_object = server.HTTPServer(server_address=('', 80), 	RequestHandlerClass=server.CGIHTTPRequestHandler)

server_object.serve_forever()

-->

## Installing and Importing External Packages

Python's strength, however, lies in its enormous selection of available packages. For most purposes, a Python package already exists. The central directory is [PyPI](https://pypi.org/), which lists over 400,000 packages.

Installing new packages for Python is easy: open a terminal (command line) and enter the command `pip install <package-name>`.
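
Before installing, it can be useful to check whether a package is already importable. A minimal sketch using the standard `importlib.util.find_spec` function follows. The helper `is_installed` and the nonsense package name are assumptions made for this illustration, and note that the name used for `import` is not always identical to the name used with `pip`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


# "json" ships with Python; the last name is deliberately nonsense.
for name in ["json", "pandas", "surely_not_installed_demo_pkg"]:
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```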

For example, we want to display the weather data we just loaded. To do this, we use:
- first, the `pandas` package to create a table from the weather data,
- then the `plotly` package to draw a chart,
- finally, the `dash` package to create a web server that displays the chart.

We install all three with `pip`. We can do this in a single call. We use the `--quiet` switch to reduce the output.

In [None]:
# pip install pandas plotly dash  --quiet

Now we import all three packages. By convention, pandas is given the alias `pd`, and the easy-to-use Plotly Express is given the alias `px`.

In [None]:
import pandas as pd
import plotly.express as px
import dash

First, we convert the daily weather forecast (`days`) of the weather station with the ID `stationID` into a table, since Plotly Express can process tables. In pandas, and in data science in general, such tables are called DataFrames. So we create a new object of type `DataFrame` from the weather forecast:

In [None]:
df = pd.DataFrame(weather[stationID]['days'])
df

Now we plot the data `df` as a line chart using Plotly, with the date (`dayDate`) on the x-axis and the minimum temperature `temperatureMin` and the maximum temperature `temperatureMax` on the y-axis.

In [None]:
fig=px.line(df, x="dayDate", y=["temperatureMin", "temperatureMax"])
fig.show(renderer="svg")

Finally, we want to display this diagram on a webpage on a web server. Here we use the package `dash`, which enables creating a webpage with Python commands and displaying interactive Plotly charts in it.

In [None]:
import dash

app = dash.Dash()

In [None]:
# In a Jupyter notebook, replace the Dash object with a JupyterDash object
from jupyter_dash import JupyterDash

app = JupyterDash()

Then we generate a webpage with a heading (`H1`) that contains the plot as a `Graph`.

In [None]:
app.layout = dash.html.Div(children = [
    dash.html.H1(children='Weather in Rostock'),
    dash.dcc.Graph(id="fare_vs_age", figure=fig)
])

And start the web server.

In [None]:
app.run_server()

This starts a web server that we can open in a browser at the address printed on startup (here `http://127.0.0.1:8083/`). It shows a webpage with an interactive chart that displays the weather data.

![](images/07a_Module/wetter.png)

## Organizing projects across multiple files

<div class="flex-row">
  <div class="col1">

Challenges of large programs:

- Dozens of classes with hundreds of functions
- Thousands of lines of code
- Difficult to keep an overview
- Version conflicts when multiple programmers work on it

  </div>
  <div class="col1"> 

Solution - Split code into .py files:

- *Per class* one file (for class definitions)
- *Per topic* one file (e.g., math functions)
- *Per domain* one file (e.g., loading vs. processing)

  </div>
</div>

## The main() function as the standard entry point

How does Python know what to run?

- Declare a specific file as the project's *center*
- This contains a special function named `main()`
- Exists in almost every programming language
- Specifies the *start point of the program*
- Can receive no arguments or dynamic arguments (command-line arguments)
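
How `main()` can receive command-line arguments is sketched below using `sys.argv` from the standard library. The optional parameter `arguments` is an assumption added here so that the function can also be called directly (for example in a test) without a real command line.

```python
import sys


def main(arguments=None):
    """Print the given arguments and return how many there were."""
    # Fall back to the real command line; sys.argv[0] is the script name.
    if arguments is None:
        arguments = sys.argv[1:]

    if not arguments:
        print("No arguments given")
        return 0

    for position, value in enumerate(arguments, start=1):
        print(f"Argument {position}: {value}")
    return len(arguments)


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as `args_demo.py`, the call `python args_demo.py Rostock 12495` would print the two arguments one per line.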

## The main() function in Python

<div class="flex-row">
  <div class="col1">

Different ways to run:
1. As a script on the command line / standalone program
2. Imported into an interactive Python console  
3. Imported into another Python file

Problem: In case 1 we want to run the entire script, in cases 2 and 3 we only want to use parts of it.

The solution: the `__name__` variable:

  </div>
  <div class="col1"> 

```python
def main():
    print("This is the main function")

if __name__ == "__main__":
    main()  # only when run directly
```

  </div>
</div>

## Best Practice for the main() function

What belongs BEFORE the main() definition:

<div class="flex-row">
  <div class="col1">

✅ Only allowed:
- Function definitions
- Class definitions

  </div>
  <div class="col1"> 

❌ Avoid:
- Variable assignments (global variables)
- Function calls (side effects)

  </div>
</div>

Advantages of this structure:
- The program flow becomes clearer
- The program is easy to modify
- The program is more reusable

## Program Flow Across Multiple Files

- We now have a central starting point.

- How do we call code from other files?

- Every programming language has commands to load external code:
    - `import` (Python) or `#include` (C/C++)
    - Special 'header files' that define the relationships between files (C/C++)

## Program flow in Python - import Statement

<div class="flex-row">
  <div class="col1">

Strategy:
- One .py file contains `main()` (Case 1)
- All other files either have no `main()` or guard it with the `__name__` check, so it is skipped on import (Case 3)
- The `import` statement loads external files

- *Warning:* All top-level code of the imported file is executed, including variable assignments and function calls!

  </div>
  <div class="col1"> 

```python
import external_file

def main():
    print("This is the main function")
```
Variables/functions/classes from `external_file.py` are stored in the object `external_file`

  </div>
</div>
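
Both points, the execution of top-level code and the resulting module object, can be made visible with a small experiment. The content of `external_file.py` below is hypothetical and is written to a temporary directory just for this sketch.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical content of external_file.py: definitions plus one side effect.
SOURCE = '''
greeting = "hello"

def external_function():
    return greeting.upper()

print("external_file is being executed")  # top-level call, runs on import
'''

module_dir = Path(tempfile.mkdtemp())
(module_dir / "external_file.py").write_text(SOURCE, encoding="utf-8")
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import external_file  # prints the message: the whole file is executed
import external_file  # prints nothing: the module is cached in sys.modules

print(type(external_file).__name__)        # module
print(external_file.greeting)              # hello
print(external_file.external_function())   # HELLO
```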

## Importing External Libraries (Packages)

*Terminology:*
- *Module:* Imported scripts
- *Namespace:* The resulting object (type: module)
- *Package:* A module with additional submodules

## Fundamentals of Project Structure

<div class="flex-row">
  <div class="col1">

**General Structure**

```
main.py [contains main()]
├── module1.py
├── module2.py
└── helpers.py
```

Code in `main.py`
```python
import module1
import module2
import helpers
```

  </div>
  <div class="col1"> 

**Example - City Project**

```
city.py
├── buildings.py
├── streets.py
└── geometry.py
```

Code in `city.py`
```python
import buildings
import streets
import geometry
```

  </div>
</div>

## The Import Statement: Advanced Use Cases

- Python's import system is complex and multifaceted.
- In day-to-day use, however, you typically encounter only the following additional constructs:
    - `from-import` statement: Imports a submodule or a portion of a module directly, without the parent namespace
    - `from-import-as` statement: Works the same, but renames the imported object

```python
from external_file import external_function as ext_func
```

## Import Statements – The Gateway to the World

- We can import not only our own packages, but also packages published by other people/organizations!
- There are several ways to access packages:
    - The package is included with Python
    - Download directly as .py files / folders containing .py files
    - Use a package manager (pip, uv) (Better!)

## Packages - The Real Power of Python

- Python's position as one of the world's most popular programming languages is largely based on its evolution into an interface and pipeline language.
- Almost every computer-solvable problem can be solved by cleverly chaining together the right Python packages.
- The official hub for packages, the Python Package Index (PyPI), currently hosts over 418,000 freely available packages.
- These can in most cases be installed with a single command or click

<center><a href="https://pypi.org"><img src="images/07a_Module/pip.png" style="width:400px;"></a></center>

## Lesson Learned

<script>setSectionBackground('#66ccffff');</script>
<div class="flex-row">
<div class="col2">

How do you prioritize correctly?

</div>
<div class="col3">
    <figure class="mj-fig">
        <img src="images/07a_Module/eisenhower.jpg" class="mj-fig-img">
        <figcaption class="mj-fig-cap">
            “I have two kinds of problems, the urgent and the important. The urgent are not important, and the important are never urgent.”, Eisenhower
        </figcaption>
    </figure>
  </div>
</div>

## Lesson Learned - Prioritization

<script>setSectionBackground('#66ccffff');</script>

<div class="flex-row">
  <div class="col1">

- With the Eisenhower matrix, tasks are divided into urgent and important
- You first tackle the urgent and important tasks

  </div>
  <div class="col1">
    <figure class="mj-fig">
        <img src="images/07a_Module/priotisierung.svg" class="mj-fig-img">
    </figure>
  </div>
</div>

<div class="vslide">
  <div class="vslide-title">
    <p style="font-family: Protomolecule; font-size: 2.3em; margin: 0px auto; text-align: center; width: 100%;">Questions?</p>
  </div>
  <script>setSectionBackground('#000000', 'images/mj_questions.mp4');</script>
</div>