Before session 14
Before the next class session, watch the videos about tuples and read the text material about sets.
Tuples
Slides in English
Textbook in English
Sets
Python also has a data type for sets. A set is an unordered collection of unique elements, so a set contains no duplicates. Every element of a set must be immutable (more precisely, hashable), but among immutable data types any type is allowed: integers, floats, strings, even tuples (as long as the tuple itself contains only immutable values), and so on.
Although a set cannot contain mutable elements (unlike a list, or the values of a dictionary), the set itself is mutable: we can add elements to it or remove elements from it.
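The two properties above can be checked with a small sketch. The variable names (s, bad, list_allowed) are chosen only for this illustration.

```python
# A set can hold elements of different immutable types.
s = {1, 2.5, "text", (3, 4)}

# The set itself is mutable: elements can be added and removed.
s.add(10)
s.discard(2.5)
print(len(s))          # 4

# A list is mutable, so Python refuses to store it in a set.
try:
    bad = {1, [2, 3]}
    list_allowed = True
except TypeError:
    list_allowed = False
print(list_allowed)    # False
```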
Creating sets
A set is created by listing all its elements inside curly brackets {}, separated by commas, or by passing a sequence to the built-in function set(). Note: to create an empty set you have to use set(), not {}; the latter creates an empty dictionary.
    s1 = {8, 2, 3, 6, 7}
    s2 = set([6, 4, 5])
    s3 = set('Good morning')
    s4 = set()
Go through the following examples as well.
    >>> basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
    >>> print(basket)  # show that duplicates have been removed
    {'orange', 'pear', 'banana', 'apple'}
    >>> X = set('abracadabra')
    >>> X  # unique letters in X
    {'a', 'r', 'b', 'c', 'd'}
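The difference between {} and set() is easy to overlook, so here is a short check. The variable names are made up for this sketch.

```python
empty_dict = {}       # empty curly brackets give a dictionary
empty_set = set()     # an empty set must be created with set()

print(type(empty_dict).__name__)   # dict
print(type(empty_set).__name__)    # set

# set() accepts any sequence; repeated items are stored only once.
letters = set("hello")
print(len(letters))                # 4
print(sorted(letters))             # ['e', 'h', 'l', 'o']
```

Because a set has no fixed order, the last line uses sorted() to get a predictable list.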
Set operations
Sets have a large number of built-in methods. The table below shows only the most basic ones. For more methods, see Python's documentation on sets.
Operation | Description |
---|---|
s.add(el) | add element el to set s |
s.remove(el) | remove element el from set s if the element is in the set. If el is not in the set, a KeyError is raised. |
s.discard(el) | remove element el from set s if the element is in the set. If el is not in the set, nothing is done. |
s.update(s1) | update set s by adding the elements from another set s1 |
s.pop() | remove an arbitrary element from set s and return it. If the set is empty, a KeyError is raised. |
s.clear() | remove all elements from set s |
s.copy() | return a new set with a shallow copy of set s |
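A short walk through the methods in the table might look like this. It is only a sketch; the values and the names removed, backup and popped are chosen for illustration.

```python
s = {1, 2, 3}
s.add(4)            # s is now {1, 2, 3, 4}
s.add(4)            # adding an existing element changes nothing
s.remove(1)         # s is now {2, 3, 4}
s.discard(99)       # 99 is not in s, so nothing happens

try:
    s.remove(99)    # remove() raises KeyError for a missing element
    removed = True
except KeyError:
    removed = False

s.update({4, 5})    # s is now {2, 3, 4, 5}
backup = s.copy()   # a separate set with the same elements
popped = s.pop()    # some element of s; which one is not specified
s.clear()           # s is now empty

print(removed)              # False
print(sorted(backup))       # [2, 3, 4, 5]
print(popped in backup)     # True
print(len(s))               # 0
```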
Sets also support mathematical operations like union, intersection, difference, and symmetric difference.
Operator | Method | Description |
---|---|---|
A & B | A.intersection(B) | a new set with the elements that belong to both A and B |
A \| B | A.union(B) | a new set with all elements that belong to either A or B (or both) |
A - B | A.difference(B) | difference of A and B, a new set with the elements that are in A but not in B |
A ^ B | A.symmetric_difference(B) | symmetric difference of A and B, a new set with the elements that are in either A or B, but not in both |
A <= B | A.issubset(B) | tests whether every element of set A is in set B |
A >= B | A.issuperset(B) | tests whether every element of set B is in set A |
Try out these operations yourself!
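As a starting point for experimenting, here is a small sketch with two example sets A and B (the values are arbitrary). The results are wrapped in sorted() only to make the printed order predictable.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

print(sorted(A & B))   # [3, 4]
print(sorted(A | B))   # [1, 2, 3, 4, 5]
print(sorted(A - B))   # [1, 2]
print(sorted(A ^ B))   # [1, 2, 5]
print({3, 4} <= A)     # True
print(A >= B)          # False, because 5 is not in A

# The method forms also accept other sequences, not only sets.
print(sorted(A.union([9, 1])))   # [1, 2, 3, 4, 9]
```

Note that none of these operations changes A or B; each one returns a new set or a truth value.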
Iterating through a set
Using a for-loop, we can iterate through a set:
    for letter in set("apple"):
        print(letter, end=" ")
In the output, the duplicates are removed and the order of the letters is not preserved (it may even differ from one run of the program to the next):

    p a e l
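If a predictable order is needed, one option is to iterate over sorted(...) instead of the set itself. This is a minimal sketch; the variable name letters is chosen for illustration.

```python
letters = set("apple")

# sorted() returns a list of the set's elements in ascending order.
for letter in sorted(letters):
    print(letter, end=" ")
print()
# prints: a e l p
```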
Quiz
Go to Moodle and solve the test on tuples and sets.
Examples
Example 1. Meetings
The following program is a modified version of last week's Meetings program. The file meetings.txt contains records of meetings. Each number in the file indicates the day of the week on which an appointment takes place.
The program reads the data from the file and creates a dictionary, where the days of the week are the keys and the numbers of appointments are the values. Then the program creates a list of (count, day) tuples and sorts it in descending order. Finally, the program outputs the three days that have the largest numbers of meetings.
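    def day_of_week(n):
        # Convert a day number (1 = Monday, ..., 7 = Sunday) to its name.
        days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
                'Friday', 'Saturday', 'Sunday']
        return days[n-1]

    ffile = open("meetings.txt")
    meetings = dict()
    for line in ffile:
        try:
            number = int(line)
            day = day_of_week(number)
            # Count one more meeting for this day.
            meetings[day] = meetings.get(day, 0) + 1
        except (ValueError, IndexError):
            # The line is not an integer, or the number is outside
            # the index range of the list of days.
            print("Invalid value")
    ffile.close()

    # Build (count, day) tuples so that sorting orders them by count.
    lst = list()
    for key, val in meetings.items():
        lst.append((val, key))
    lst.sort(reverse=True)

    # Print the three days with the most meetings.
    for val, key in lst[:3]:
        print(key, val)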
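The reason the count is placed first in each tuple is that Python compares tuples element by element, starting from the first one. The following sketch shows this with a small made-up dictionary (the data and the names counts and pairs are invented for illustration, not taken from meetings.txt).

```python
counts = {"Monday": 3, "Friday": 5, "Tuesday": 3}

# Put the count first so that it decides the order.
pairs = []
for day, count in counts.items():
    pairs.append((count, day))

pairs.sort(reverse=True)
print(pairs)   # [(5, 'Friday'), (3, 'Tuesday'), (3, 'Monday')]
```

When two counts are equal, the second elements (the day names) are compared as strings, which is why 'Tuesday' comes before 'Monday' in descending order.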
Example 2. Names
This program prompts the user for the first and last names of three people and reports which two people have names with the most letters in common. Pay attention to the function names, which returns several values (actually a single tuple).
    def names(number):
        first = input("Please enter the first name of person "+str(number)+": ")
        last = input("Please enter the last name of person "+str(number)+": ")
        return first.lower(), last.lower()

    def common(firstX, lastX, firstY, lastY):
        firsts = set(firstX) & set(firstY)
        lasts = set(lastX) & set(lastY)
        return len(firsts) + len(lasts)

    first1, last1 = names(1)
    first2, last2 = names(2)
    first3, last3 = names(3)

    common12 = common(first1, last1, first2, last2)
    common23 = common(first2, last2, first3, last3)
    common13 = common(first1, last1, first3, last3)

    if common12 >= common13 and common12 >= common23:
        print("Names of persons 1 and 2 are most similar to each other.")
    if common13 >= common12 and common13 >= common23:
        print("Names of persons 1 and 3 are most similar to each other.")
    if common23 >= common12 and common23 >= common13:
        print("Names of persons 2 and 3 are most similar to each other.")
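To see that a function returning several values really returns one tuple, consider this small sketch. The function min_max is a hypothetical helper written only for this illustration; it is not part of the program above.

```python
def min_max(values):
    # The comma creates a tuple: (smallest, largest).
    return min(values), max(values)

# Without unpacking, the result is a single tuple.
result = min_max([4, 9, 1, 7])
print(result)                    # (1, 9)
print(type(result).__name__)     # tuple

# With unpacking, each element gets its own variable.
lowest, highest = min_max([4, 9, 1, 7])
print(lowest, highest)           # 1 9
```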
Exercises
1. Birthdays
Modify the function create_dictionary in last week's Birthdays program so that it constructs a dictionary where the keys are years and the values are the numbers of people who were born in that year. The function should take a filename as an argument and return a dictionary in which both keys and values are integers.
    >>> create_dictionary('dates.txt')
    {1983: 3, 1987: 2, 1995: 10, 1992: 3, 1993: 3, 1994: 6, 1996: 5, 1984: 1, 1980: 2, 1991: 3, 1997: 3, 1988: 1, 1989: 4, 1970: 1, 1990: 3, 1959: 1}
Write a program that prompts the user for a filename and uses the function create_dictionary to output the top 3 years that have the most birthdays in that file, together with the numbers of birthdays (one year-number pair on each line).
    Enter filename: dates.txt
    Top 3 years that have the most birthdays are:
    1995 - 10 birthdays
    1994 - 6 birthdays
    1996 - 5 birthdays
2. Distances between points
Write a function distance that takes two tuples of length two (coordinates of two points) as its arguments and returns the distance between the two points.
    >>> A = (1, 2)
    >>> B = (5, 5)
    >>> distance(A, B)
    5.0
Hint:
$d = \sqrt{(x_1-x_2)^2 + (y_1-y_2)^2}$
Write a program that first asks the user for the number of points, and then for each point asks for its x and y coordinates in the form (x,y). Then, using the function distance, it finds and prints the numbers of the two points that are closest to each other. If several pairs of points have the same smallest distance, any such pair can be chosen. Point numbers start from 1.
Here is an example of the output:
    Please enter the number of points: 4
    Please enter the coordinates of point 1: (8, 9)
    Please enter the coordinates of point 2: (3, 2)
    Please enter the coordinates of point 3: (16, 20)
    Please enter the coordinates of point 4: (0, 0)
    Points 2 and 4 are the closest to each other.
Submit your solutions
Go to Moodle and upload your solutions under homework for Session 14. The programs should have the names home1.py and home2.py, respectively.