Sets

I want to briefly introduce another data type: sets. Lists and tuples are ordered collections. Sets are a different kind of collection: they are not ordered, and they cannot contain duplicate items. The collection types differ as follows:

Collection   Ordered?   Mutable?   Unique items?
Lists        True       True       False
Tuples       True       False      False
Sets         False      True       True

Set Basics

Sets are unordered collections of unique elements. To create a set, we surround a comma-separated list of elements with curly braces.

>>> colors = {'red', 'blue', 'yellow', 'red'}
>>> colors
{'red', 'yellow', 'blue'}

We can see that 'red' only appears once in our set even though we added it twice. Note that there was no error for including a duplicate item. Duplicates are simply ignored.

Sets are mutable so we can add and remove elements from them:

>>> colors.remove('yellow')
>>> colors
{'red', 'blue'}
>>> colors.add('purple')
>>> colors
{'red', 'purple', 'blue'}
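One detail worth knowing when removing items: remove() raises a KeyError if the item is not in the set, while the discard() method (not shown above) silently does nothing. The following is a small sketch of the difference; the variable name removed is only for illustration.

```python
colors = {'red', 'blue', 'purple'}

# discard() removes the item if present and does nothing otherwise.
colors.discard('green')

# remove() raises KeyError when the item is missing.
try:
    colors.remove('green')
except KeyError:
    removed = False
else:
    removed = True

print(sorted(colors))  # sorted() just gives a predictable display order
print(removed)
```

Which one to use depends on whether a missing item should count as an error in your program.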

We cannot make an empty set with empty curly braces because {} makes an empty dictionary. Instead, we call the set() constructor with no arguments.

>>> s = set()
>>> s
set()
>>> s.add(4)
>>> s
{4}
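To see the difference between the two spellings, we can compare their types. This is a minimal check; the variable names are only for illustration.

```python
empty_braces = {}    # empty curly braces make a dictionary
empty_set = set()    # the set() constructor makes an empty set

print(type(empty_braces))
print(type(empty_set))
```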

We can use type() to verify that we’re working with a set:

>>> type(colors)
<class 'set'>

Help works on sets just like on strings, integers, and lists:

>>> help(colors)
>>> help(set)

Why sets?

Sets are kind of like dictionaries with only keys and no values.

It’s very inexpensive to check whether something is in a set: the cost of a lookup stays roughly the same, on average, regardless of how large the set is.

>>> 'purple' in colors
True
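A rough way to see this is to time the same membership check against a list and a set holding the same items. This is only a sketch: the sizes and names below are arbitrary choices, and the exact timings will vary from machine to machine.

```python
import timeit

numbers_list = list(range(100_000))
numbers_set = set(numbers_list)
target = 99_999  # the last item, so the list has to scan every element

# Run each membership check 100 times and record the total time.
list_time = timeit.timeit(lambda: target in numbers_list, number=100)
set_time = timeit.timeit(lambda: target in numbers_set, number=100)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

On most machines the set lookup should come out much faster, because the list is searched item by item while the set is not.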

Sets are also useful for checking whether a list contains only unique values:

>>> names = ['Trey', 'Theresa', 'Timothy', 'Trina', 'Theresa']
>>> len(names) == len(set(names))
False
>>> set(names)
{'Trina', 'Theresa', 'Timothy', 'Trey'}

If we make a set out of a list, any duplicates are dropped. So the set and the list have the same length only if the list contained only unique values. Here 'Theresa' appears twice, so the lengths differ.

Intersection

Sets allow for set operations, like in math.

There are operators for getting the union and intersection of sets and asking whether sets are subsets of each other.

For example, the & operator gives the intersection of two sets:

>>> {1, 2, 3} & {3, 2, 4}
{2, 3}

You can look up the other set operations if you ever need them.
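As a quick reference, here is a sketch of a few of those other operators: | for union, - for difference, and <= for the subset test. The variable names are only for illustration.

```python
a = {1, 2, 3}
b = {3, 2, 4}

union = a | b            # items that are in either set
difference = a - b       # items in a that are not in b
is_subset = {2, 3} <= a  # True if every item of {2, 3} is also in a

print(union)
print(difference)
print(is_subset)
```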

Dictionaries & Sets

If we ask a dictionary for its keys, we’ll get a set-like object back. This can be useful for seeing which keys are contained in two different dictionaries:

>>> x = {'a': 1, 'b': 3, 'c': 4}
>>> y = {'b': 9, 'd': 8, 'a': 4}
>>> x.keys() & y.keys()
{'a', 'b'}
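The keys view supports the other set operators as well. The values view does not, because values may repeat and so do not behave like a set. A small sketch, with illustrative variable names:

```python
x = {'a': 1, 'b': 3, 'c': 4}
y = {'b': 9, 'd': 8, 'a': 4}

only_in_x = x.keys() - y.keys()  # keys in x that are not in y
all_keys = x.keys() | y.keys()   # keys found in either dictionary

# values() is not set-like, so using & on it raises TypeError.
try:
    x.values() & y.values()
except TypeError:
    values_support_and = False
else:
    values_support_and = True

print(only_in_x)
print(sorted(all_keys))
print(values_support_and)
```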

Set Exercises

Check For Duplicates

This is the has_duplicates exercise in sets.py. Edit the sets.py file in the exercises directory to implement this exercise. To test it, run python test.py has_duplicates in your exercises directory.

Write a function has_duplicates that takes a list as input and returns True if the list contains duplicate elements and False otherwise.

Try running has_duplicates on the following lists:

>>> from sets import has_duplicates
>>> numbers1 = [1, 2, 4, 2]
>>> numbers2 = [1, 2, 3, 4]
>>> has_duplicates(numbers1)
True
>>> has_duplicates(numbers2)
False

Get Shared Keys

This is the get_shared_keys exercise in dictionaries.py. Edit the dictionaries.py file in the exercises directory to implement this exercise. To test it, run python test.py get_shared_keys in your exercises directory.

Edit the function get_shared_keys so that it accepts two dictionaries and returns a set, list, or tuple (it’s up to you) which contains the keys that are included in both dictionaries.

Example usage:

>>> from dictionaries import get_shared_keys
>>> expired = {'c95': '20200315', 'd45': '20200401', 'b38': '20200415'}
>>> used_recently = {'a56': 8, 'b38': 1, 'e77': 4, 'd45': 3}
>>> get_shared_keys(expired, used_recently)
{'d45', 'b38'}

Get Most Common

This is the get_most_common exercise in sets.py. Edit the sets.py file in the exercises directory to implement this exercise. To test it, run python test.py get_most_common in your exercises directory.

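Create a function get_most_common that accepts a list of sets and returns a set of the most common items, that is, the items that appear in more of the given sets than any others. If several items are tied for most common, return all of them.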

For example:

>>> from sets import get_most_common
>>> get_most_common([{1, 2}, {2, 3}, {3, 4}])
{2, 3}