Dictionaries
A dictionary in Python is a collection of key-value pairs. Here's a breakdown of this concept:
Key: A unique identifier used to access a corresponding value.
Value: The data associated with a key.
Key-Value Pair: A combination of a key and its corresponding value.
Consider the key-value pair '4+': 4433. Here:
Key: '4+'
Value: 4433
Together, they form the key-value pair '4+': 4433.
Dictionary values can be of any data type: strings, integers, floats, Booleans, lists, and even dictionaries.
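As a rough illustration of the point above, the sketch below stores one value of each listed type in a single dictionary. The dictionary name app and every value in it are made up for this example and do not come from any dataset.

```python
# Hypothetical record: every key and value here is invented for illustration.
app = {
    'name': 'Example App',            # string
    'n_reviews': 4433,                # integer
    'price': 0.0,                     # float
    'is_free': True,                  # Boolean
    'genres': ['Tools', 'Utility'],   # list
    'ratings': {'4+': 1, '9+': 0},    # nested dictionary
}

# Map each key to the type name of its value.
value_types = {key: type(value).__name__ for key, value in app.items()}
print(value_types)
```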
Create a dictionary
To create a dictionary that maps each content rating to a number, complete the following:
1. Map each content rating to its corresponding number by following a key: value pattern. For instance, to map a rating of '4+' to the number 4,433, we type '4+': 4433 (notice the colon between '4+' and 4433). To map '9+' to 987, we type '9+': 987, and so on.
2. Type the entire sequence of key: value pairs and separate each with a comma: '4+': 4433, '9+': 987, '12+': 1155, '17+': 622.
3. Surround the sequence with curly braces:
{'4+': 4433, '9+': 987, '12+': 1155, '17+': 622}
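Put together, the steps above might look like the following minimal sketch. The variable name content_ratings is an assumption chosen for illustration.

```python
# Assign the finished dictionary to a variable (the name is illustrative).
content_ratings = {'4+': 4433, '9+': 987, '12+': 1155, '17+': 622}

print(content_ratings)
print(type(content_ratings))
```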
Retrieve a Value
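A value is retrieved by writing its key in square brackets after the dictionary name. The sketch below assumes the content-rating dictionary from the previous section, stored under the illustrative name content_ratings. The key '21+' is a made-up key that is deliberately absent, and dict.get() is shown as one optional way to supply a default instead of raising KeyError.

```python
content_ratings = {'4+': 4433, '9+': 987, '12+': 1155, '17+': 622}

# Look up the value stored under the key '4+'.
n_four_plus = content_ratings['4+']
print(n_four_plus)

# Asking for a key that is not in the dictionary raises KeyError.
try:
    content_ratings['21+']
except KeyError as err:
    missing = err.args[0]
    print('No such key:', missing)

# dict.get() returns a default value instead of raising an error.
fallback = content_ratings.get('21+', 0)
print(fallback)
```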
Appending a value to a dictionary
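Assigning to a key that does not exist yet adds a new key-value pair to the dictionary. A minimal sketch, assuming a dictionary named content_ratings (an illustrative name) that starts with only two of the ratings:

```python
content_ratings = {'4+': 4433, '9+': 987}

# Assigning to a new key adds the pair to the dictionary.
content_ratings['12+'] = 1155
content_ratings['17+'] = 622

print(content_ratings)
```

The same pattern works when starting from an empty dictionary created with {}.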
Limitations of Keys
While values can be almost anything, keys have some restrictions. Keys can be of many data types, including integers, strings, floats, and Booleans. However, lists and dictionaries cannot be used as keys; attempting to do so raises a TypeError. This is because keys must be "hashable," meaning that a key's hash value must never change while the key is in the dictionary. Lists and dictionaries are mutable, so Python does not allow them to be hashed. When we populate a dictionary, Python converts each key to an integer (even if the key is a data type other than an integer) by calling the built-in hash() function on it.
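The behavior described above can be observed directly with a short sketch. The sample keys are arbitrary, and the exact hash of a string usually differs from one Python run to the next, so no specific numbers are shown.

```python
# Hashable keys: hash() returns an integer for each of them.
for key in (4, 3.5, True, '4+'):
    print(repr(key), '->', hash(key))

# Mutable objects are not hashable, so using them as keys raises TypeError.
errors = {}
for bad_key in ([1, 2], {'a': 1}):
    try:
        {bad_key: 'value'}
    except TypeError as err:
        errors[type(bad_key).__name__] = str(err)
        print(type(bad_key).__name__, 'raised TypeError:', err)
```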
Check if a key exists in the Dictionary
An expression of the form a_value in a_dictionary always returns a Boolean value:
We get True if a_value exists in a_dictionary as a dictionary key.
We get False if a_value doesn't exist in a_dictionary as a dictionary key.
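A short sketch of the in operator, assuming the content-rating dictionary under the illustrative name content_ratings. The last two lines show that in looks at keys only; checking the values requires going through dict.values().

```python
content_ratings = {'4+': 4433, '9+': 987, '12+': 1155, '17+': 622}

has_four_plus = '4+' in content_ratings       # '4+' is a key
has_twenty_one = '21+' in content_ratings     # '21+' is not a key
print(has_four_plus, has_twenty_one)

# 4433 is stored as a value, not as a key, so the plain test is False.
value_as_key = 4433 in content_ratings
value_found = 4433 in content_ratings.values()
print(value_as_key, value_found)
```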
Updating Dictionary Values
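Assigning to a key that already exists replaces the value stored there, and augmented assignment such as += changes it in place. A minimal sketch, assuming the illustrative content_ratings dictionary; the new numbers are arbitrary.

```python
content_ratings = {'4+': 4433, '9+': 987, '12+': 1155, '17+': 622}

# Overwrite the value stored under an existing key.
content_ratings['4+'] = 0

# Update a value based on its current contents: 987 + 13 = 1000.
content_ratings['9+'] += 13

print(content_ratings)
```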

Counting with Dictionaries
We can update dictionary values to count how many times each unique content rating occurs in our dataset. Let's start by considering the list ['4+', '4+', '4+', '9+', '9+', '12+', '17+'], which stores a few content ratings. To use code to count how many times each rating occurs in this short list, let's complete the following:
1. Create a dictionary where the keys are the unique content ratings and the values are all 0: {'4+': 0, '9+': 0, '12+': 0, '17+': 0}.
2. Loop through the list ['4+', '4+', '4+', '9+', '9+', '12+', '17+'], and for each iteration, do the following:
   a. Check if the iteration variable exists as a key in the previously created dictionary.
   b. If it exists, then increment the dictionary value at that key by 1.
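The two steps above can be sketched as follows. The names ratings, content_ratings, and c_rating are assumptions chosen for illustration.

```python
ratings = ['4+', '4+', '4+', '9+', '9+', '12+', '17+']

# Step 1: one key per unique rating, every count starting at 0.
content_ratings = {'4+': 0, '9+': 0, '12+': 0, '17+': 0}

# Step 2: increment the count each time a known rating appears.
for c_rating in ratings:
    if c_rating in content_ratings:
        content_ratings[c_rating] += 1

print(content_ratings)  # {'4+': 3, '9+': 2, '12+': 1, '17+': 1}
```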
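If we do not know the keys in advance, we can create them at run time: start with an empty dictionary and, while looping, increment the count if the key already exists or set it to 1 if it does not.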
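A minimal sketch of building the keys while looping, using the same short list of ratings. The variable names are again illustrative.

```python
ratings = ['4+', '4+', '4+', '9+', '9+', '12+', '17+']

# Start empty: keys are created the first time each rating is seen.
content_ratings = {}

for c_rating in ratings:
    if c_rating in content_ratings:
        content_ratings[c_rating] += 1
    else:
        content_ratings[c_rating] = 1

print(content_ratings)  # {'4+': 3, '9+': 2, '12+': 1, '17+': 1}
```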
Change Frequencies into Proportions
To change frequencies into proportions or percentages, we can individually modify the values in the dictionary by doing the necessary math. For example, dividing each value in the dictionary by the total number of apps changes the frequencies into proportions, and multiplying each proportion by 100 gives percentages.
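One way to do this conversion is sketched below. It assumes the frequency table is the content-rating dictionary from earlier and that the total number of apps equals the sum of its values; the names proportions and percentages are illustrative. The results are written to new dictionaries so the original counts stay available.

```python
content_ratings = {'4+': 4433, '9+': 987, '12+': 1155, '17+': 622}

# Assumption: the total number of apps is the sum of all the counts.
total_number_of_apps = sum(content_ratings.values())  # 7197

proportions = {}
percentages = {}
for rating, count in content_ratings.items():
    proportions[rating] = count / total_number_of_apps
    percentages[rating] = proportions[rating] * 100

print(proportions)
print(percentages)
```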
Cleansing data (removing duplicates based on a single highest value)
The reviews_max dictionary stores the highest number of reviews found for each app. Now, we will save the rows of the apps with the highest number of reviews in a separate data set.
1. We start by initializing two empty lists, android_clean and already_added.
2. We loop through the android data set, and for every iteration:
   a. We isolate the name of the app and the number of reviews.
   b. We add the current row (app) to the android_clean list, and the app name (name) to the already_added list if:
      - The number of reviews of the current app matches the number of reviews of that app as described in the reviews_max dictionary; and
      - The name of the app is not already in the already_added list. We need to add this supplementary condition to account for those cases where the highest number of reviews of a duplicate app is the same for more than one entry (for example, the Box app has three entries, and the number of reviews is the same). If we just check for reviews_max[name] == n_reviews, we'll still end up with duplicate entries for some apps.
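The procedure above can be sketched as follows. The rows in android are made up for illustration, and the column positions (app name at index 0, number of reviews at index 3) are assumptions about the data layout. The first loop shows one possible way reviews_max could have been built.

```python
# Hypothetical rows: [name, category, rating, n_reviews]; all values made up.
android = [
    ['Box', 'BUSINESS', '4.2', '100'],
    ['Box', 'BUSINESS', '4.2', '100'],
    ['Box', 'BUSINESS', '4.2', '100'],
    ['Example Chat', 'SOCIAL', '4.5', '500'],
    ['Example Chat', 'SOCIAL', '4.5', '750'],
    ['Example Notes', 'PRODUCTIVITY', '4.0', '120'],
]

# Highest number of reviews seen for each app name.
reviews_max = {}
for app in android:
    name = app[0]
    n_reviews = float(app[3])
    if name not in reviews_max or reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews

android_clean = []   # one row per app
already_added = []   # names that already have a row in android_clean

for app in android:
    name = app[0]
    n_reviews = float(app[3])
    # Keep the row only if it holds the maximum and the app is not yet added.
    if reviews_max[name] == n_reviews and name not in already_added:
        android_clean.append(app)
        already_added.append(name)

print(len(android_clean))
```

For a large data set, a set could stand in for the already_added list to make the membership check faster; the list is kept here to match the steps as described.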