Algorithms

Algorithms#

Midjourney: Two Towers, ref. M.C. Escher

Everything should be made as simple as possible, but not simpler.

— Albert Einstein

Slides/PDF #

Complex Data Structure - Trees#

Trees are another important complex (composite) data structures.

We define the following tree.

We represent the tree here as dictionaries, in which we semantically label the value as wert and the subtrees as links and rechts. Indentation makes the structure more visible. Omitted fields are automatically None.

tree = {
        "wert": 12,
        "links":{
                "wert": 6,
                "links": {"wert":  3},
                "rechts":{"wert":  9}
        },
        "rechts":{
                "wert": 18,
                "links": {"wert": 15},
                "rechts":{"wert": 21}
        }
}

If we want to traverse the tree, we need a recursive function. Recursive functions cannot be transformed into loops. The recursive functions operate on the specified node and then call themselves with the corresponding subtree of its children.

def traverse(tree):
	value = tree["wert"]                 # the current value
	print(value)                          # print the current value
	if "links" in tree:                   # if the left subtree is defined
		traverse(tree["links"])           # call the function recursively
	if "rechts" in tree:                  # if the right subtree is defined
		traverse(tree["rechts"])          # call the function recursively

traverse(tree)

If we traverse the tree recursively (traversal), we see that we visit the tree from top to bottom, left before right. So we start at the root node 12, then go left to 6, then further left to 3, then right to l, because there is no deeper element, and so on.

Sorting Algorithms#

Bubble Sort#

Sorting elements in a list is a common task in programming. There are a few standard algorithms you should know for this.

Bubble Sort is one of the simplest sorting algorithms. In this approach, each element of the list is traversed and compared with the next element. If the second element is smaller than the first, their positions are swapped. The list is traversed over and over until this condition no longer holds.

def bubbleSort(numbers):
    for i in range(len(numbers)-1):
        for j in range(0, len(numbers)-i-1):
            if numbers[j] > numbers[j + 1] :
                numbers[j], numbers[j + 1] = numbers[j + 1], numbers[j]

number_sequence = [12, 6, 3, 9, 18, 15, 21]

bubble_sort(number_sequence)

number_sequence

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[5], line 3
      1 number_sequence = [12, 6, 3, 9, 18, 15, 21]
----> 3 bubble_sort(number_sequence)
      5 number_sequence

NameError: name 'bubble_sort' is not defined

It should be noted that Bubble Sort directly modifies the list, i.e., the previous order is changed.

QuickSort#

Quicksort is a divide-and-conquer algorithm. It selects an element from the list as the pivot element and then creates two lists: one containing the elements smaller than the pivot and one containing the elements larger. These two sublists are then sorted again with Quicksort until the sublists reach size 1. This way, the sorting problem becomes progressively smaller.

def quickSort(elements):
    less = []
    equal = []
    greater = []

    if len(elements) > 1:
        pivot = elements[len(elements) % 2]
        for x in elements:
            if x < pivot:
                less.append(x)
            elif x == pivot:
                equal.append(x)
            elif x > pivot:
                greater.append(x)
        return quickSort(less) + equal + quickSort(greater)
    else:  # If there is only one or no element in the list
        return elements

The result is a new sorted list that must be assigned.

sequence = [12, 6, 3, 9, 18, 15, 21]

sequence = quickSort(sequence)

sequence

[3, 6, 9, 12, 15, 18, 21]

Insertion Sort#

Another sorting algorithm is insertion sort. It is based on the human strategy of sorting cards by inserting each value into the position where it fits best.

def insertion_sort(arr):
    for i in range(1, len(arr)): # Iterate through all elements
        key = arr[i]
        # Move all elements of arr[0..i-1], that are greater than
        # the current element key one position ahead of their current position
        j = i-1
        while j >= 0 and key < arr[j] :
            arr[j+1] = arr[j]
            j -= 1
        arr[j+1] = key

The result is the correctly sorted list, and it has been directly modified again.

number_sequence = [12, 6, 3, 9, 18, 15, 21]

insertion_sort(number_sequence)

number_sequence

[3, 6, 9, 12, 15, 18, 21]

Python built-in function `sorted`#

The standard solution in everyday programming is Python’s built-in function sorted(). Here the list is not modified and must be reassigned.

sequence = [12, 6, 3, 9, 18, 15, 21]

sequence = sorted(sequence)

sequence

[3, 6, 9, 12, 15, 18, 21]

Performance comparison of the sorting functions#

In a performance comparison of the functions, Bubble sort performs worse than insertion sort. The built-in function sorted() is clearly faster than both. It uses the TimSort algorithm (a variant of insertion sort) and is implemented in C.

%timeit bubbleSort([12, 6, 3, 9, 18, 15, 21])

635 ns ± 2.92 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

%timeit quickSort([12, 6, 3, 9, 18, 15, 21])

1.13 μs ± 6.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

%timeit insertion_sort([12, 6, 3, 9, 18, 15, 21])

297 ns ± 0.778 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

%timeit sorted([12, 6, 3, 9, 18, 15, 21])

112 ns ± 0.418 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

This result holds when sorting a considerably larger sequence. We generate a large sequence of 2,500 numbers with range() and reverse it with reverse() to invert the order.

large_number_sequence = list(range(2500))
large_number_sequence.reverse()

%timeit bubble_sort(numbers_sequence_large)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[16], line 1
----> 1 get_ipython().run_line_magic('timeit', 'bubble_sort(numbers_sequence_large)')

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/interactiveshell.py:2504, in InteractiveShell.run_line_magic(self, magic_name, line, _stack_depth)
   2502     kwargs['local_ns'] = self.get_local_scope(stack_depth)
   2503 with self.builtin_trap:
-> 2504     result = fn(*args, **kwargs)
   2506 # The code below prevents the output from being displayed
   2507 # when using magics with decorator @output_can_be_silenced
   2508 # when the last Python token in the expression is a ';'.
   2509 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:1226, in ExecutionMagics.timeit(self, line, cell, local_ns)
   1224 for index in range(0, 10):
   1225     number = 10 ** index
-> 1226     time_number = timer.timeit(number)
   1227     if time_number >= 0.2:
   1228         break

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:184, in Timer.timeit(self, number)
    182 gc.disable()
    183 try:
--> 184     timing = self.inner(it, self.timer)
    185 finally:
    186     if gcold:

File <magic-timeit>:1, in inner(_it, _timer)

NameError: name 'bubble_sort' is not defined

%timeit quick_sort(large_number_sequence)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[17], line 1
----> 1 get_ipython().run_line_magic('timeit', 'quick_sort(large_number_sequence)')

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/interactiveshell.py:2504, in InteractiveShell.run_line_magic(self, magic_name, line, _stack_depth)
   2502     kwargs['local_ns'] = self.get_local_scope(stack_depth)
   2503 with self.builtin_trap:
-> 2504     result = fn(*args, **kwargs)
   2506 # The code below prevents the output from being displayed
   2507 # when using magics with decorator @output_can_be_silenced
   2508 # when the last Python token in the expression is a ';'.
   2509 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:1226, in ExecutionMagics.timeit(self, line, cell, local_ns)
   1224 for index in range(0, 10):
   1225     number = 10 ** index
-> 1226     time_number = timer.timeit(number)
   1227     if time_number >= 0.2:
   1228         break

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:184, in Timer.timeit(self, number)
    182 gc.disable()
    183 try:
--> 184     timing = self.inner(it, self.timer)
    185 finally:
    186     if gcold:

File <magic-timeit>:1, in inner(_it, _timer)

NameError: name 'quick_sort' is not defined

%timeit insertion_sort(large_number_sequence)

81.5 μs ± 181 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit sorted(large_number_sequence)

5.4 μs ± 162 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Here it becomes evident once again that BubbleSort performs the worst due to the many comparisons. Quick Sort fares better because it divides the problem into smaller subproblems. Insertion Sort, however, is still more efficient. The built-in sorted() function, which is also based on the insertion sort algorithm, solves the problem the fastest.

Finding Elements in Lists#

Full-Text Search#

Searching for and locating elements in lists is also a daily task. There are typical algorithms here as well.

The naive approach is brute-force search: we iterate through every element in a list with a for-each loop to see if the element can be found anywhere.

def contains(list, x):
    for l in list:
        if(l == x):
            return True
    return False

Let’s use 15 as an example to check whether it is in our unsorted sequence of numbers [12, 6, 3, 9, 18, 15, 21]. It is the penultimate element in the list, so we must iterate through all the preceding elements.

number_sequence = [12, 6, 3, 9, 18, 15, 21]

contains(number_sequence, 15)

True

If we search for 1, it correctly returns that the number is not in the list. However, to determine this, we had to compare every element of the list with 1.

contains(number_sequence, 1)

False

Searching with Binary Search#

Binary search is a divide-and-conquer algorithm for searching in a sorted list. It always selects the middle element from the list and decides whether the element being searched for is smaller or larger, and then continues with the sublist to the right or left. This allows it to ignore large parts of the list and minimize the number of comparisons.

def binary_search_sub(array, target, low, high):
    while low <= high:  # Repeat until low and high meet
        mid = low + (high - low)//2  # integer division by two
        # If found, return the index
        if array[mid] == target:
            return mid
        # If the element is smaller than the target,
        # then it can only be in the left subarray
        elif array[mid] > target:
            return binary_search_sub(array, target, low, mid - 1)
        # Otherwise the element is larger than the target
        # and this must be in the right subarray
        else:
            return binary_search_sub(array, target, mid + 1, high)
    # If nothing found, return the position -1
    return -1

def binary_search(array, target):
    # Since binary search requires a sorted sequence of numbers, sort the numbers beforehand.
    array_sorted = sorted(array)
    return binary_search_sub(array_sorted, target, 0, len(array_sorted))

The return value of a binary search is usually the index of the found number.

binarySearch(number_sequence, 15)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[24], line 1
----> 1 binarySearch(number_sequence, 15)

NameError: name 'binarySearch' is not defined

If nothing is found at all, the position -1 is returned.

binarySearch(number_sequence, 1)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[25], line 1
----> 1 binarySearch(number_sequence, 1)

NameError: name 'binarySearch' is not defined

Searching with Binary Search Trees#

The tree data structure introduced at the beginning represents a binary search tree and can be used for searching. We take advantage of the fact that it is already sorted. Then we can define a new search function that conceptually corresponds to binary search.

def searchTree(tree, target_value):
    value = tree["wert"]       # the current value
    if value == target_value:
        return True
    if target_value < value and "links" in tree:
        return searchTree(tree["links"], target_value)
    elif target_value > value and "rechts" in tree:
        return searchTree(tree["rechts"], target_value)
    else:
        return False

searchTree(tree, 15)

True

findTree(tree, 1)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[28], line 1
----> 1 findTree(tree, 1)

NameError: name 'findTree' is not defined

Creating Binary Search Trees#

Searching in a binary search tree is, in itself, not faster than binary search in a sorted list. However, sorting the list is very expensive. Inserting new elements is also very slow, because you first have to perform a binary search to find the insertion position, and then the entire list must be copied (if implemented as an array) since it becomes one element larger. This costs a lot of performance, especially for very large lists with more than 10,000 elements.

With the use of a binary search tree, on the other hand, you can also take advantage of the fact that it is always sorted when inserting. As a result, we also use the binary-search algorithm during insertion. Especially for growing data structures that must be kept sorted, search trees are extremely effective. For insertion, we can rewrite the function suchBaum() so that it always inserts the value as a new leaf node.

def extend_tree(tree, new_value):
    value = tree["wert"]       # the current value
    if value == new_value: 
        return #  we stop here because the value is already contained and does not need to be added
    elif new_value < value:
        if "links" not in tree: # if there is no left subtree yet
            tree["links"] = {"wert": new_value} # add a new node to the left of the current
        else:
            return extend_tree(tree["links"], new_value)
    else: # otherwise continue searching for insertion position
        if "rechts" not in tree: # if there is no right subtree yet
            tree["rechts"] = {"wert": new_value} # add a new node to the right of the current
        else:
            return extend_tree(tree["rechts"], new_value)

tree2 = {"wert": 12} # We initialize the tree with 12
for value in [12, 6, 3, 9, 18, 15, 21]: # we add the values of the search tree at the start
    extendTree(tree2, value)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[30], line 3
      1 tree2 = {"wert": 12} # We initialize the tree with 12
      2 for value in [12, 6, 3, 9, 18, 15, 21]: # we add the values of the search tree at the start
----> 3     extendTree(tree2, value)

NameError: name 'extendTree' is not defined

Please note that the tree is modified in place.

To check whether the new tree looks the same as the tree baum shown at the beginning, we print both with print().

print(tree)

{'wert': 12, 'links': {'wert': 6, 'links': {'wert': 3}, 'rechts': {'wert': 9}}, 'rechts': {'wert': 18, 'links': {'wert': 15}, 'rechts': {'wert': 21}}}

print(tree2)

{'wert': 12}

Searching in Lists with Python#

In everyday use, Python’s built-in membership test can be used with 15 in zahlenfolge.

Please provide the Python code containing German variable names, function/class names, docstrings, or inline comments that you want translated.

  Cell In[33], line 1
    Please provide the Python code containing German variable names, function/class names, docstrings, or inline comments that you want translated.
           ^
SyntaxError: invalid syntax

Searching in Python with Sets#

Especially for large lists (well over 100), converting to a set is worthwhile. These are optimized for checking whether an element is already contained. For this purpose, hashing algorithms and typically binary search trees are used, which allow only a subset of the elements to be searched.

number_set = set(number_sequence)

15 in number_set

True

Performance comparison of the search functions#

In performance comparisons, the built-in search with 15 in zahlenfolge wins again. The performance of the full search with contains is better than the search with binary search or the one in the search tree, although it is also surpassed by the internal search in lists and in sets.

This is due to the small example and to the extra overhead from the data structure and the recursive function call. Especially for very large lists, search trees and binary search are significantly more efficient than the full search.

%timeit contains(number_sequence, 15)

55 ns ± 0.0914 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

%timeit binarySearch(number_sequence, 15)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[37], line 1
----> 1 get_ipython().run_line_magic('timeit', 'binarySearch(number_sequence, 15)')

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/interactiveshell.py:2504, in InteractiveShell.run_line_magic(self, magic_name, line, _stack_depth)
   2502     kwargs['local_ns'] = self.get_local_scope(stack_depth)
   2503 with self.builtin_trap:
-> 2504     result = fn(*args, **kwargs)
   2506 # The code below prevents the output from being displayed
   2507 # when using magics with decorator @output_can_be_silenced
   2508 # when the last Python token in the expression is a ';'.
   2509 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:1226, in ExecutionMagics.timeit(self, line, cell, local_ns)
   1224 for index in range(0, 10):
   1225     number = 10 ** index
-> 1226     time_number = timer.timeit(number)
   1227     if time_number >= 0.2:
   1228         break

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:184, in Timer.timeit(self, number)
    182 gc.disable()
    183 try:
--> 184     timing = self.inner(it, self.timer)
    185 finally:
    186     if gcold:

File <magic-timeit>:1, in inner(_it, _timer)

NameError: name 'binarySearch' is not defined

%timeit searchTree(tree, 15)

90.2 ns ± 0.448 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

%timeit 15 in number_sequence

20.8 ns ± 0.421 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

%timeit 15 in number_set

6.75 ns ± 0.0256 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

Let’s compare once more a very large example with 100,000 elements. We generate a very large sequence of numbers with range() and reverse its order using reverse().

large_number_sequence = list(range(100000))
large_number_sequence.reverse()

We are constructing the set of the sequence.

number_set_large = set(number_sequence_large)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[42], line 1
----> 1 number_set_large = set(number_sequence_large)

NameError: name 'number_sequence_large' is not defined

To create the search tree, it is not very efficient to add the elements in sorted order or in reverse order, as in this case. That would cause the tree to be unbalanced and simply represent a linked list. To ensure that the tree is well-balanced, we add the elements in the optimal order, which is the same order that a binary search algorithm traverses. There are additional trees that balance themselves automatically; these are the so-called balanced search trees. However, they would go too far for the examples discussed here.

# To ensure that the tree is well balanced
# initialize the tree in the middle
# and then insert the subtrees in the order of binary search
tree_large = {"wert": large_numbers_sequence[len(large_numbers_sequence)//2]}
step = len(large_numbers_sequence)//4
while step > 0:
    idx = step
    while idx < len(large_numbers_sequence):
        insert_into_tree(tree_large, large_numbers_sequence[idx])
        idx = idx + step
    step = step // 2

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[43], line 4
      1 # To ensure that the tree is well balanced
      2 # initialize the tree in the middle
      3 # and then insert the subtrees in the order of binary search
----> 4 tree_large = {"wert": large_numbers_sequence[len(large_numbers_sequence)//2]}
      5 step = len(large_numbers_sequence)//4
      6 while step > 0:

NameError: name 'large_numbers_sequence' is not defined

%timeit contains(large_number_sequence, 15)

507 μs ± 3.08 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

%timeit binarySearch(large_number_sequence, 15)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[45], line 1
----> 1 get_ipython().run_line_magic('timeit', 'binarySearch(large_number_sequence, 15)')

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/interactiveshell.py:2504, in InteractiveShell.run_line_magic(self, magic_name, line, _stack_depth)
   2502     kwargs['local_ns'] = self.get_local_scope(stack_depth)
   2503 with self.builtin_trap:
-> 2504     result = fn(*args, **kwargs)
   2506 # The code below prevents the output from being displayed
   2507 # when using magics with decorator @output_can_be_silenced
   2508 # when the last Python token in the expression is a ';'.
   2509 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:1226, in ExecutionMagics.timeit(self, line, cell, local_ns)
   1224 for index in range(0, 10):
   1225     number = 10 ** index
-> 1226     time_number = timer.timeit(number)
   1227     if time_number >= 0.2:
   1228         break

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:184, in Timer.timeit(self, number)
    182 gc.disable()
    183 try:
--> 184     timing = self.inner(it, self.timer)
    185 finally:
    186     if gcold:

File <magic-timeit>:1, in inner(_it, _timer)

NameError: name 'binarySearch' is not defined

%timeit find_tree(large_tree, 15)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[46], line 1
----> 1 get_ipython().run_line_magic('timeit', 'find_tree(large_tree, 15)')

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/interactiveshell.py:2504, in InteractiveShell.run_line_magic(self, magic_name, line, _stack_depth)
   2502     kwargs['local_ns'] = self.get_local_scope(stack_depth)
   2503 with self.builtin_trap:
-> 2504     result = fn(*args, **kwargs)
   2506 # The code below prevents the output from being displayed
   2507 # when using magics with decorator @output_can_be_silenced
   2508 # when the last Python token in the expression is a ';'.
   2509 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:1226, in ExecutionMagics.timeit(self, line, cell, local_ns)
   1224 for index in range(0, 10):
   1225     number = 10 ** index
-> 1226     time_number = timer.timeit(number)
   1227     if time_number >= 0.2:
   1228         break

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:184, in Timer.timeit(self, number)
    182 gc.disable()
    183 try:
--> 184     timing = self.inner(it, self.timer)
    185 finally:
    186     if gcold:

File <magic-timeit>:1, in inner(_it, _timer)

NameError: name 'find_tree' is not defined

%timeit 15 in large_number_sequence

298 μs ± 10.7 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

%timeit 15 in numberset_large

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[48], line 1
----> 1 get_ipython().run_line_magic('timeit', '15 in numberset_large')

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/interactiveshell.py:2504, in InteractiveShell.run_line_magic(self, magic_name, line, _stack_depth)
   2502     kwargs['local_ns'] = self.get_local_scope(stack_depth)
   2503 with self.builtin_trap:
-> 2504     result = fn(*args, **kwargs)
   2506 # The code below prevents the output from being displayed
   2507 # when using magics with decorator @output_can_be_silenced
   2508 # when the last Python token in the expression is a ';'.
   2509 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:1226, in ExecutionMagics.timeit(self, line, cell, local_ns)
   1224 for index in range(0, 10):
   1225     number = 10 ** index
-> 1226     time_number = timer.timeit(number)
   1227     if time_number >= 0.2:
   1228         break

File ~/Documents/code/Lehre/ProgrammierUebung/.venv/lib/python3.12/site-packages/IPython/core/magics/execution.py:184, in Timer.timeit(self, number)
    182 gc.disable()
    183 try:
--> 184     timing = self.inner(it, self.timer)
    185 finally:
    186     if gcold:

File <magic-timeit>:1, in inner(_it, _timer)

NameError: name 'numberset_large' is not defined

The result looks quite different here. Now contains() performs noticeably worse, and Binary Search is now orders of magnitude better. In fact, in some runs it’s even better than Python’s built-in search for lists, since that is also a full search in C. The best results, however, are achieved by the search tree, since it is always sorted and thus can run the Binary Search algorithm without sorting. Only the lookup in a set is another order of magnitude better still.

Quiz#

--- shuffleQuestions: true shuffleAnswers: true --- ### What is an algorithm? - [x] A step-by-step guide to solving a problem - [ ] A random code without structure - [ ] A kind of database - [ ] A graphical design tool ### Which of the following properties does a good algorithm have? - [x] It is efficient and solves the problem in acceptable time - [ ] It is always complicated and hard to understand - [ ] It requires no inputs - [ ] It can only be applied to numbers ### What is a typical application area of trees in computer science? - [x] Efficient searching - [ ] Text formatting - [ ] Data compression - [ ] Network analysis ### What is a tree in computer science? - [x] A hierarchical data structure with nodes - [ ] A loop with several conditions - [ ] A special fixed-length list - [ ] An array of numbers in random order ### Sort the following lines to correctly build a recursive function for traversing a binary tree: ```python # Given: tree = {"wert": 5, "links": {"wert": 3}, "rechts": {"wert": 7}} ``` 1. `def traverse(tree):` 2. ` print(tree["wert"])` 3. ` if "links" in tree:` 4. ` traverse(tree["links"])` 5. ` if "rechts" in tree:` 6. ` traverse(tree["rechts"])` ### What is the bug in this example? ```python def traverse(tree): print(tree["wert"]) traverse(tree["links"]) traverse(tree["rechts"]) ``` - [x] There is no check to see if "left" or "right" exist at all - [ ] The function is not allowed to be called recursively - [ ] The print statement is misindented - [ ] Python does not know dictionaries ### How does Bubble Sort work? - [x] Elements are compared in pairs and swapped if needed - [x] The list is traversed multiple times until no swap is needed - [ ] All elements are sorted immediately - [ ] Only even numbers are considered ### Why is Bubble Sort inefficient for large data sets? - [x] Because many comparisons and swaps are required - [ ] Because Python cannot process lists - [ ] Because it uses no for-loops - [ ] Because the elements are deleted ### How does QuickSort work? - [x] It selects a pivot element, partitions into smaller and larger subsets, and sorts recursively - [x] It uses the divide-and-conquer principle - [ ] It swaps adjacent elements multiple times - [ ] It sorts only by appending to a list ### Sort the following lines to correctly build a simplified QuickSort function: ```python # Assume: list consists only of numbers ``` 1. `def quicksort(arr):` 2. ` if len(arr) <= 1:` 3. ` return arr` 4. ` pivot = arr[0]` 5. ` less = [x for x in arr[1:] if x <= pivot]` 6. ` greater = [x for x in arr[1:] if x > pivot]` 7. ` return quicksort(less) + [pivot] + quicksort(greater)` ### How does Insertion Sort work? - [x] Elements are inserted at the correct place in a already sorted sublist - [ ] Each element is compared with a random element - [ ] All elements are moved simultaneously - [ ] The list is traversed in reverse and sorted ### What is the main difference between Insertion Sort and Bubble Sort? - [x] Insertion Sort inserts elements correctly, Bubble Sort swaps them repeatedly - [ ] Insertion Sort uses no loops - [ ] Bubble Sort uses no comparisons - [ ] Insertion Sort can only be applied to strings ### What is the bug in this example? ```python nums = [4, 2, 1] nums = nums.sort() ``` - [x] `sort()` mutates the list in-place and returns `None` – `nums` becomes `None` - [ ] `sort()` is not allowed on lists - [ ] Lists may not be sorted - [ ] `sorted()` would be faster here ### Why is `sorted()` faster than Bubble Sort or Insertion Sort? - [x] It is implemented in C and uses an optimized variant of `insert_sort` - [ ] It uses random numbers to sort - [ ] It uses no comparisons - [ ] It sorts only half of the list