Objects#

The best way to predict the future is to invent it — and then encapsulate it in objects.
— Alan Kay
Object-oriented programming (OOP) is a programming paradigm that assumes that a program consists exclusively of Objects that interact cooperatively with one another. Each object has attributes (properties) and methods. The attributes define the object’s state by their values, while the methods define the possible state changes (actions) of an object.
Object-oriented programming helps address some problems that arise when dealing with frequently recurring data structures in large programs.
Challenges with Highly Repetitive Data Structures#
The syntactic problem#
Objects are used to specify how highly repetitive data structures are stored. The point here is that the syntax of the data structure is unambiguous.
As the program grows, so does the number of variables and data structures:
for storing data
for controlling the program flow
for saving states
for processing input and output
The underlying elements are usually based on repetitive data structures. For example, when analyzing blueprints or maps, one handles many point coordinates. However, a coordinate can be expressed differently, for example as a tuple or as a list.
point_1 = (54.083336, 12.108811)
point_2 = [12.094167, 54.075211]
If you try to process these points, you may encounter problems because both have different data types. This is the syntactic problem when dealing with variables in large programs. Here, one wants to be able to define the syntax of the data structure.
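The mismatch can be made visible with a minimal sketch. It assumes we want to collect the points in a set; the names `visited` and `error` are hypothetical and only serve the illustration.

```python
point_1 = (54.083336, 12.108811)   # tuple
point_2 = [12.094167, 54.075211]   # list

# Tuples are hashable, so a tuple can be stored in a set ...
visited = {point_1}

# ... but lists are not, so the same operation fails for point_2.
try:
    visited.add(point_2)
    error = None
except TypeError as exc:
    error = exc

print(type(point_1).__name__, type(point_2).__name__)
print(repr(error))
```

The same code path therefore works for one point and fails for the other, although both are meant to represent the same kind of data.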
The semantic problem#
Objects are also used to unambiguously define the semantics of values in a data structure.
For example, if we agree that a point is represented syntactically by a tuple, the meaning of the values remains unknown.
point_1 = (54.083336, 12.108811)
point_2 = (12.094167, 54.075211)
Another programmer may not understand the semantic meaning here. In this example, one might assume that the first value is the x coordinate and the second is the y coordinate. Maybe it’s the other way around. Perhaps it is not a Cartesian coordinate system, but a polar one. To eliminate such ambiguities, one would like to give the data structure a clear semantic definition, where it is unambiguous what the values mean.
The Behavioral Problem#
Objects are also used to bundle the functions for processing the data structure directly with it, so that the data structure only exposes those functions that are meaningful to apply to it.
For example, let’s define a function to calculate the distance between two points:
import math

def distance(a, b):
    return math.sqrt((a[0] - b[0])**2 + (a[1] - b[1])**2)
This function can, due to Python’s dynamic typing, also be applied to other data structures, e.g., to a Line. That leads to semantic or logical errors. To avoid this, one would have to define a different function for each variant, name it in a way that associates it with that variant, and perform many data-type checks. A simpler approach is to ensure that such a function is available only for the corresponding data structure.
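As a rough illustration of such a silent logical error, the following sketch assumes that lines are stored as flat tuples `(x1, y1, x2, y2)`. This representation and the names `line_1`, `line_2`, and `result` are assumptions made only for this example.

```python
import math

def distance(a, b):
    return math.sqrt((a[0] - b[0])**2 + (a[1] - b[1])**2)

# Hypothetical representation of lines as flat tuples (x1, y1, x2, y2)
line_1 = (0.0, 0.0, 3.0, 4.0)
line_2 = (1.0, 1.0, 5.0, 5.0)

# No error is raised: only the first two values of each tuple are used,
# so the result is merely the distance between the two start points.
result = distance(line_1, line_2)
print(result)
```

Python accepts the call without complaint, although a "distance between two lines" computed this way has no sensible meaning.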
Object-Oriented Programming#
Declaring Classes#
Instead of using a tangle of scattered data structures and functions, we group them into objects. Since we want to standardize the data structures, the structure of these objects must be defined before they are used. This is done via Classes, which represent a blueprint for the objects standardized in this way. A class defines:
which attributes (properties) an object of this class has
and which methods (functions) an object of the class provides
The first step toward creating an object is to define a new class for the object’s type. This is done using the class keyword, followed by the class name and a colon. The body of the class is then indented below this line. It defines which attributes and methods the class has.
class ClassName:
    # Class definition
    pass
Constructor#
One of the most important methods of a class is the constructor __init__(). This is a special method that determines how a new instance of the class is initialized. It is used to assign initial values to attributes as well as to perform initialization steps (tests, calculations, configuration, etc.).
A class has at most one __init__() method. If none is defined, the class inherits the constructor of its base class (ultimately object), which does nothing. The effect is the same as with the empty constructor in the following example.
class ClassName:
    # Empty constructor
    def __init__(self):
        pass
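A short sketch can confirm that a class without its own constructor behaves like one with an empty constructor. The class names `WithoutInit` and `WithInit` are hypothetical.

```python
class WithoutInit:
    # No __init__ defined here
    pass

class WithInit:
    # Explicit constructor that does nothing
    def __init__(self):
        pass

a = WithoutInit()
b = WithInit()

# vars() returns the instance attributes; both instances start without any.
print(vars(a), vars(b))
```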
Instance Attributes#
The constructor is defined as a function __init__(self) with the parameter self. In this context, self is a reference to the new instance of the class. It allows instance attributes to be assigned directly when the instance is created. Instance attributes are attributes whose values can differ from instance to instance.
For example, let’s define the class of a point with x and y coordinates. Since every instance must have these two coordinates and they can differ for each instance, we assign them as instance attributes in the constructor. This establishes that every instance of the class has these attributes.
class Point:
    # Constructor
    def __init__(self, x, y):
        self.x = x
        self.y = y
The assignment is performed here via dot notation, in which a dot separates the instance variable (self) from the attribute name x. self.x is thus a reference to the attribute x of the instance self. The assignment self.x = x means that we assign the value of the variable x to the new instance attribute x of the instance self. Although both share the same name, they are not the same variable, because self.x is an attribute of the instance, whereas x is a parameter of the function and is only valid within its scope.
Since __init__() is a function, albeit a special one, you can also define defaults. For example, if x and y should be initialized to 0 when they are not provided, we can declare this with default values.
class Point:
    # Constructor
    def __init__(self, x=0.0, y=0.0):
        self.x = x
        self.y = y
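How the defaults take effect can be sketched as follows. Instances are created by calling the class name, as described in the section on class instances; the variable names `origin`, `on_x_axis`, and `on_y_axis` are hypothetical.

```python
class Point:
    # Constructor with default values
    def __init__(self, x=0.0, y=0.0):
        self.x = x
        self.y = y

origin = Point()           # both defaults are used
on_x_axis = Point(2.5)     # only x is given, y falls back to its default
on_y_axis = Point(y=1.5)   # only y is given, x falls back to its default

print(origin.x, origin.y)
print(on_x_axis.x, on_x_axis.y)
print(on_y_axis.x, on_y_axis.y)
```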
Class attributes#
Besides instance attributes, there are also class attributes. These are attributes that should have the same value for all instances of a class.
class Point:
    # Attribute of all instances
    unit = "m"
Warning
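Class attributes are shared by all instances. In particular, with mutable composite data types such as lists or dictionaries, this means that if one instance modifies the value in place, the change is visible in all other instances as well. Assigning a new value through an instance, by contrast, creates an instance attribute that hides the class attribute for that instance only.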
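The following sketch illustrates this warning. The class attribute `tags` is hypothetical and was added only to have a mutable composite type to experiment with.

```python
class Point:
    unit = "m"   # immutable class attribute
    tags = []    # mutable class attribute (hypothetical, for illustration only)

p1 = Point()
p2 = Point()

# In-place modification changes the single list shared by the class.
p1.tags.append("surveyed")

# Assignment through an instance creates an instance attribute on p1 only.
p1.unit = "km"

print(p2.tags)                       # p2 sees the change made via p1
print(p1.unit, p2.unit, Point.unit)  # only p1 has its own unit
```

For values that should be independent per instance, it is usually safer to create them in the constructor as instance attributes.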
Methods#
Classes often also define their own methods, i.e., functions that are intended to be applied specifically to instances of this class and not to other objects.
Methods are declared as functions within the indentation of the class definition. These methods are then available in all instances. Instance methods always have self as their first parameter. Here too, this is a reference to the current instance. This allows you to access the attributes or other methods.
For example, we can define the distance function introduced at the beginning as a method, so that it now computes the distance between two points self and other.
class Point:
    # Class attribute
    unit = "m"

    # Constructor
    def __init__(self, x, y):
        self.x = x
        self.y = y

    # Instance method
    def distance(self, other):
        return math.sqrt((self.x - other.x)**2 + (self.y - other.y)**2)
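To make the role of self more tangible, the sketch below calls the method in two equivalent ways and also shows what happens when a plain tuple is passed instead of a point. The variable names `a`, `b`, and `error` are hypothetical.

```python
import math

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance(self, other):
        return math.sqrt((self.x - other.x)**2 + (self.y - other.y)**2)

a = Point(0.0, 0.0)
b = Point(3.0, 4.0)

# Calling the method on an instance passes that instance as self ...
print(a.distance(b))
# ... which is equivalent to calling the function on the class explicitly.
print(Point.distance(a, b))

# A tuple has no attributes x and y, so misuse fails loudly instead of silently.
try:
    a.distance((3.0, 4.0))
    error = None
except AttributeError as exc:
    error = exc
print(repr(error))
```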
Class Instances#
Objects themselves are always instances of a class (the class is just a blueprint). A class can have any number of instances, or none at all. All instances have the same structure, but they don’t necessarily have the same values in their attributes.
With the new class Point, we can now define the points from the beginning in a way that is clear syntactically, semantically, and in their behavior. We do not call the constructor __init__() directly; instead, we create instances by calling the class name like a function, passing the parameters of the constructor. The self parameter is omitted (it is supplied by Python).
We can also name the parameters and thus avoid semantic ambiguities.
point_1 = Point(x=54.083336, y=12.108811)
point_2 = Point(y=12.094167, x=54.075211) # With named parameters, we can change the order
The values of the object are now also semantically well-defined. We can access them using dot notation, where the object’s variable name is on the left and the attribute name on the right. To access the attribute x, we write
point_1.x
54.083336
That also works for class attributes. Putting it all together, we can then write, e.g.,
print(f"The point is located at x: {point_1.x} {point_1.unit}; y: {point_1.y} {point_1.unit}")
The point is located at x: 54.083336 m; y: 12.108811 m
Similarly, we can assign new values to the attributes.
point_1.x = 54.08
point_1.y = 12.11
print(f"The point is located at x: {point_1.x} {point_1.unit}; y: {point_1.y} {point_1.unit}")
The point is located at x: 54.08 m; y: 12.11 m
With dot notation we can also call the methods. If we want to calculate the distance between point_1 and point_2, we write
point_1.distance(point_2)
0.016541414993884864
References#
In programs, objects are usually related. For example, a line consists of two points. This relationship between two object classes is called a reference.
To create references in Python, you simply create an attribute that holds an instance of the other class. Thus a line can be defined as a connection between two points, with its length calculated by the previously defined method distance.
class Line:
    def __init__(self, start: Point, end: Point):
        self.start = start
        self.end = end

    def length(self):
        return self.start.distance(self.end)
line_1 = Line(start=point_1, end=point_2)
line_1.length()
0.016541414993884864
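One consequence of such references is that the line does not copy its points but refers to the very same objects. The following sketch, with simplified coordinates and hypothetical variable names, shows that moving a point also changes the length of the line.

```python
import math

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance(self, other):
        return math.sqrt((self.x - other.x)**2 + (self.y - other.y)**2)

class Line:
    def __init__(self, start: Point, end: Point):
        self.start = start
        self.end = end

    def length(self):
        return self.start.distance(self.end)

start = Point(0.0, 0.0)
end = Point(3.0, 4.0)
line = Line(start=start, end=end)

before = line.length()

# Changing the point afterwards affects the line, because line.end
# refers to the same object as the variable end.
end.x = 6.0
end.y = 8.0
after = line.length()

print(line.end is end)
print(before, after)
```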
Encapsulation of Attributes and Methods#
In programming, one often wants to control whether and how attributes are changed and who may call which methods.
For this, Python distinguishes by naming convention between private, protected, and public attributes and methods. Unless otherwise specified, all attributes and methods are public. A public attribute or method is visible, readable, and modifiable outside the class. We used this above to read point_1.x and to assign a new value with point_1.x = 54.08. To signal that attributes or methods should not be used from outside, we declare them as private or protected, which means they are meant to be accessed via self only within the class (private) or within the class and its subclasses (protected). To declare an attribute or method as private, we prefix its name with __ (two underscores); for protected, we use a single _. Python does not strictly enforce either: a single underscore is purely a convention, and a double underscore causes the name to be stored internally as _ClassName__name (name mangling), which hides it from ordinary access but does not make it truly inaccessible.
For example, we want to ensure that the position of a Point object cannot be changed after creation. Such an object is called immutable. One reason for this could be that a change would create inconsistencies and potential errors. If, for instance, we want to construct a rectangle from points, we must check in __init__() whether all edges are perpendicular. If the points were movable, we would either have to re-check after every change or simply prohibit changes.
To prevent the coordinates from being changed from outside, we declare x and y as private by naming them __x and __y. But now no one outside the instance can read the values. To enable that, we define two public getter methods get_x() and get_y() that only return the value. If we wanted the values to be mutable, we would also need to define setter methods that set the values.
class ImmutablePoint:
    unit = "m"

    def __init__(self, x, y):
        self.__x = x
        self.__y = y

    def distance(self, other):
        return math.sqrt((self.__x - other.__x)**2 + (self.__y - other.__y)**2)

    def get_x(self):
        return self.__x

    def get_y(self):
        return self.__y
Let’s define our first point again, this time as an ImmutablePoint.
point_1 = ImmutablePoint(54.083336, 12.108811)
If we now try to read the value __x or __y, we get an error message.
print(f"The point is located at x: {point_1.__x} {point_1.unit}; y: {point_1.__y} {point_1.unit}")
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[21], line 1
----> 1 print(f"The point is located at x: {point_1.__x} {point_1.unit}; y: {point_1.__y} {point_1.unit}")
AttributeError: 'ImmutablePoint' object has no attribute '__x'
To read the values, we must use the getter method instead.
print(f"The point is located at x: {point_1.get_x()} {point_1.unit}; y: {point_1.get_y()} {point_1.unit}")
The point is located at x: 54.083336 m; y: 12.108811 m
We can’t change the values either. We can try it, though:
point_1.__x = 54.08
point_1.__y = 12.11
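But it has no effect on the stored coordinates: outside the class, the name is not mangled, so the assignment merely creates a new, unrelated attribute __x on the instance, while the getter still returns the original private value.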
print(f"The point is located at x: {point_1.get_x()} {point_1.unit}; y: {point_1.get_y()} {point_1.unit}")
The point is located at x: 54.083336 m; y: 12.108811 m
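Why the assignment leaves the coordinates untouched can be inspected with a small sketch. It uses a reduced version of ImmutablePoint and relies on Python's name mangling, which stores a private attribute `__x` of the class under the name `_ImmutablePoint__x`. The variable name `point` is hypothetical, and the code is assumed to run at module level (not inside another class).

```python
class ImmutablePoint:
    def __init__(self, x, y):
        self.__x = x   # stored as _ImmutablePoint__x
        self.__y = y   # stored as _ImmutablePoint__y

    def get_x(self):
        return self.__x

point = ImmutablePoint(54.083336, 12.108811)

# Outside the class no mangling takes place: this creates a new attribute "__x".
point.__x = 54.08

# The instance now carries both the mangled private names and the new attribute.
print(sorted(vars(point)))

# The getter still returns the original value ...
print(point.get_x())

# ... which remains reachable under its mangled name, so the protection
# is a strong convention rather than a hard guarantee.
print(point._ImmutablePoint__x)
```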
Inheritance, Generalization, and Polymorphism#
Inheritance is one of the most important features of object-oriented programming. The goal of inheritance is to define attributes and methods that occur in many similar classes only once.
For this, classes are divided into superclasses (parents) which pass down attributes and methods to subclasses (children). That means every subclass (child) possesses all attributes and methods of the superclass (parents), so we do not have to define them again. New attributes and methods defined for a subclass are, however, not transmitted to the parents or siblings. Therefore, in the context of the subclasses one speaks of a specialization and of the generalization for the superclasses, since they apply to multiple classes.
For example, we note that various geometric objects such as triangles, quadrilaterals, and pentagons all represent polygons, which consist of points connected by lines. Triangles, quadrilaterals, and pentagons are thus specializations of the class Polygon. So we first define the generic class Polygon which, as a constructor argument, accepts a list of more than 2 ImmutablePoint instances and stores it as a private attribute __points. We also define an optional attribute name which is public and mutable.
from typing import List

class Polygon:
    def __init__(self, points: List[ImmutablePoint], name="Polygon"):
        if not isinstance(points, list):
            raise TypeError("points not of type list")
        # A polygon needs at least three corners
        if not len(points) > 2:
            raise ValueError("points need to contain at least 3 points")
        for point in points:
            if not isinstance(point, ImmutablePoint):
                raise TypeError("Point not of type ImmutablePoint")
        self.__points = points
        self.name = name

    def get_points(self):
        return self.__points

    # Area of the polygon according to Gauss's shoelace formula
    def area(self):
        n = len(self.__points)  # number of corners
        area = 0.0
        for i in range(n):
            j = (i + 1) % n  # index of the next corner, wrapping around
            area += self.__points[i].get_x() * self.__points[j].get_y()
            area -= self.__points[j].get_x() * self.__points[i].get_y()
        area = abs(area) / 2.0
        return area

    # Overridden standard method for generating a string
    def __str__(self):
        description = f"{self.name} has an area of {self.area()} and is defined by\n"  # \n is a line break
        for i, point in enumerate(self.__points):
            description += f" Point {i} at x: {point.get_x()} {point.unit}; y: {point.get_y()} {point.unit}\n"
        return description
Here, we also override the standard method __str__(self) of an object, which is called whenever the object is converted to a string. The function is also used by print(), which is why we can use it to make the output more readable. This ability to redefine inherited methods is called polymorphism or method overriding (not to be confused with method overloading). With this, we have covered the core concepts of object-oriented programming.
| Inheritance | Generalization | Polymorphism | Encapsulation |
|---|---|---|---|
| Attributes and methods from parent classes are inherited by child classes. This helps avoid redundancies and errors. | Commonalities are implemented in generalized parent classes. | Child classes can override methods and thus redefine them. | Encapsulation of data and methods in objects is a protective mechanism to restrict improper changes. |
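How inheritance and overriding interact can be sketched with two deliberately small classes. `Shape`, `Circle`, and the method `describe` are hypothetical names that are not part of the geometry classes in this chapter.

```python
class Shape:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is a generic shape"

    # Overrides object.__str__, which print() and str() use
    def __str__(self):
        return self.describe()

class Circle(Shape):
    # Overrides Shape.describe; __init__ and __str__ are inherited unchanged
    def describe(self):
        return f"{self.name} is round"

shapes = [Shape("Blob"), Circle("Disc")]

# The same call yields different behavior depending on the class of the object.
descriptions = [str(shape) for shape in shapes]
for text in descriptions:
    print(text)
```

The inherited `__str__` of `Circle` automatically uses the overridden `describe`, so the subclass only has to state what is different.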
We also define the method area(), which computes the area of a simple polygon (convex or concave, but without self-intersections or holes) using Gauss’s area formula, also known as the shoelace or trapezoid formula.
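The formula itself can be tried out independently of the classes. The sketch below is a standalone variant that works on plain (x, y) tuples; the function name `shoelace_area` and the example shapes are assumptions made for this illustration.

```python
def shoelace_area(vertices):
    """Area of a simple polygon given as a list of (x, y) tuples in boundary order."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x_i, y_i = vertices[i]
        x_j, y_j = vertices[(i + 1) % n]   # next corner, wrapping around
        twice_area += x_i * y_j - x_j * y_i
    # The sign depends on the orientation of the corners, hence abs()
    return abs(twice_area) / 2.0

square = [(0, 0), (0, 1), (1, 1), (1, 0)]                    # convex
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]   # concave

print(shoelace_area(square))
print(shoelace_area(l_shape))
```

The L-shaped polygon is a 2 by 2 square with a 1 by 1 corner removed, so an area of 3 is expected, which shows that the formula also handles concave shapes.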
Now we want to define the subtypes Triangle, Tetragon (quadrilateral), and Pentagon. To do this, when declaring the new class we specify the superclass in parentheses. We can then also define a new constructor (otherwise the constructor of the superclass is used). In doing so we should call the constructor of the superclass, which we reach via the function super(), so that the inherited attributes are initialized. For example, we define constructors that accept exactly the right number of points.
class Triangle(Polygon):
    def __init__(self, p1, p2, p3, name="Triangle"):
        super().__init__(points=[p1, p2, p3], name=name)

class Tetragon(Polygon):
    def __init__(self, p1, p2, p3, p4, name="Tetragon"):
        super().__init__(points=[p1, p2, p3, p4], name=name)

class Pentagon(Polygon):
    def __init__(self, p1, p2, p3, p4, p5, name="Pentagon"):
        super().__init__(points=[p1, p2, p3, p4, p5], name=name)
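The effect of super() and of the fixed number of constructor parameters can be checked with a reduced sketch. To keep it self-contained, it uses a simplified stand-in for Polygon that stores plain tuples in a public attribute `points`; this simplification and the variable names are assumptions, not the classes defined above.

```python
class Polygon:
    # Simplified stand-in: no type checks, public attributes
    def __init__(self, points, name="Polygon"):
        self.points = points
        self.name = name

class Triangle(Polygon):
    def __init__(self, p1, p2, p3, name="Triangle"):
        # Delegates the initialization to the superclass constructor
        super().__init__(points=[p1, p2, p3], name=name)

triangle = Triangle((0, 0), (1, 1), (2, 0))

# The attributes were set by Polygon.__init__, reached via super().
print(triangle.name, len(triangle.points))

# An instance of the subclass is also an instance of the superclass.
print(isinstance(triangle, Polygon))

# The constructor signature enforces exactly three points.
try:
    Triangle((0, 0), (1, 1))
    error = None
except TypeError as exc:
    error = exc
print(repr(error))
```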
The key point about inheritance is that all instances of Triangle, Tetragon, or Pentagon have the attributes __points (although it is not visible from outside) and name, as well as the methods get_points(), area(), and __str__().
For example, if we define a triangle and apply print() to it, the __str__ method inherited from the superclass Polygon is invoked internally.
triangle_1 = Triangle(ImmutablePoint(0,0), ImmutablePoint(1,1), ImmutablePoint(2,0), "Triangle")
print(triangle_1)
triangle_1.area()
Triangle has an area of 1.0 and is defined by
Point 0 at x: 0 m; y: 0 m
Point 1 at x: 1 m; y: 1 m
Point 2 at x: 2 m; y: 0 m
1.0
The same applies to quadrilaterals and pentagons:
quadrilateral_1 = Tetragon(ImmutablePoint(0,0), ImmutablePoint(0,1), ImmutablePoint(1,1), ImmutablePoint(1,0), "Quadrilateral")
print(quadrilateral_1)
quadrilateral_1.area()
Quadrilateral has an area of 1.0 and is defined by
Point 0 at x: 0 m; y: 0 m
Point 1 at x: 0 m; y: 1 m
Point 2 at x: 1 m; y: 1 m
Point 3 at x: 1 m; y: 0 m
1.0
pentagon_1 = Pentagon(ImmutablePoint(0,0), ImmutablePoint(-1,1), ImmutablePoint(1,2), ImmutablePoint(2,1), ImmutablePoint(2,0), "Pentagon")
print(pentagon_1)
pentagon_1.area()
Pentagon has an area of 4.0 and is defined by
Point 0 at x: 0 m; y: 0 m
Point 1 at x: -1 m; y: 1 m
Point 2 at x: 1 m; y: 2 m
Point 3 at x: 2 m; y: 1 m
Point 4 at x: 2 m; y: 0 m
4.0