List

Lists

We create a list by placing comma-separated data points inside square brackets:

row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
print(row_1)
print(type(row_1))

Output
['Facebook', 0.0, 'USD', 2974676, 3.5]
<class 'list'>

A list can hold elements of a single data type or of several different data types. A list like [4, 5, 6] has identical data types (only integers), while the list ['Facebook', 0.0, 'USD', 2974676, 3.5] has mixed data types (strings, floats, and an integer). We can also have a list of lists.

List length

To find the length of a list, we can use the len() function:

row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
print(len(row_1))

list_1 = [1, 6, 0]
print(len(list_1))

list_2 = []
print(len(list_2))

Output
5
3
0

List Indexing

Each element (data point) in a list has a specific number associated with it — this is an index number. The indexing always starts at 0, so the first element will have the index number 0, the second element will have the index number 1, and so on.

To quickly find the index of a list element, identify its position in the list, and then subtract 1. For example, the string 'USD' is the third element of the list (position number 3), so its index number must be 2 since 3 - 1 = 2.

The index numbers help us retrieve individual elements from a list. Looking back at the list row_1 from the previous code example (shown below), we can retrieve the first element (the string 'Facebook') with the index number 0 by running the code print(row_1[0]).

row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
print(row_1[0])

Output
Facebook

Negative indexing

Lists also support negative indexing, which counts from the end of the list: the last element has the index number -1, the second to last element has the index number -2, and so on.
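
As a brief illustration, the sketch below applies negative indexing to the row_1 list used earlier. The variable names last_element and second_to_last are chosen here for illustration only.

```python
row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]

last_element = row_1[-1]      # same element as row_1[4]
second_to_last = row_1[-2]    # same element as row_1[3]

print(last_element)    # 3.5
print(second_to_last)  # 2974676
```

Negative indexes are handy when we want the end of a list without first computing its length.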

Subset a list

We can select multiple items from a list and create a new list from them.

row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]

fb_rating_data = [row_1[0], row_1[3], row_1[-1]]
print(fb_rating_data)

Output
['Facebook', 2974676, 3.5]

List slicing

We retrieve the list slice we want by using the syntax a_list[m:n], where the following are true:
- m represents the index number of the first element of the slice
- n represents the index number of the last element of the slice plus one (if the last element has the index number 2, then n will be 3; if the last element has the index number 4, then n will be 5; and so on).
When we need to select the first or last x elements (x stands for a number), we can use even simpler syntax shortcuts:
- a_list[:x] when we want to select the first x elements.
- a_list[-x:] when we want to select the last x elements.

row_3 = ['Clash of Clans', 0.0, 'USD', 2130805, 4.5]

cc_pricing_data = row_3[:3] # Syntax shortcut for row_3[0:3]
print(cc_pricing_data)

Output
['Clash of Clans', 0.0, 'USD']
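
To make the three slicing forms concrete, here is a small sketch on the same row_3 list. The variable names middle, first_three, and last_two are illustrative and not part of the original example.

```python
row_3 = ['Clash of Clans', 0.0, 'USD', 2130805, 4.5]

middle = row_3[1:4]       # indexes 1, 2 and 3 (index 4 is excluded)
first_three = row_3[:3]   # shortcut for row_3[0:3]
last_two = row_3[-2:]     # the last two elements

print(middle)       # [0.0, 'USD', 2130805]
print(first_three)  # ['Clash of Clans', 0.0, 'USD']
print(last_two)     # [2130805, 4.5]
```

A slice always returns a new list, so row_3 itself is left unchanged.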

Retrieving from List of Lists

A list of lists, such as the data_set variable defined in the code below, is still a list, which means we can retrieve individual list elements and perform list slicing using the syntax we learned. Below, we'll do the following:

- Retrieve the first list element (row_1) using data_set[0].
- Retrieve the last list element (row_5) using data_set[-1].
- Retrieve the first two list elements (row_1 and row_2) by performing list slicing using data_set[:2].

row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
row_2 = ['Instagram', 0.0, 'USD', 2161558, 4.5]
row_3 = ['Clash of Clans', 0.0, 'USD', 2130805, 4.5]
row_4 = ['Fruit Ninja Classic', 1.99, 'USD', 698516, 4.5]
row_5 = ['Minecraft: Pocket Edition', 6.99, 'USD', 522012, 4.5]

data_set = [row_1, row_2, row_3, row_4, row_5]

print(data_set[0])
print(data_set[-1])
print(data_set[:2])

Output
['Facebook', 0.0, 'USD', 2974676, 3.5]
['Minecraft: Pocket Edition', 6.99, 'USD', 522012, 4.5]
[['Facebook', 0.0, 'USD', 2974676, 3.5], 
 ['Instagram', 0.0, 'USD', 2161558, 4.5]]

We'll often need to retrieve individual elements from a list that's part of a list of lists — for instance, we may want to retrieve the value 3.5 from ['Facebook', 0.0, 'USD', 2974676, 3.5], which is part of the data_set list of lists. Below, we extract 3.5 from data_set using what we've learned:

- We retrieve row_1 using data_set[0], and assign the result to a variable named fb_row.
- We print fb_row, which outputs ['Facebook', 0.0, 'USD', 2974676, 3.5].
- We retrieve the last element from fb_row using fb_row[-1] (since fb_row is a list), and we assign the result to a variable named fb_rating.
- We print fb_rating, which outputs 3.5.

data_set = [row_1, row_2, row_3, row_4, row_5]

fb_row = data_set[0]
print(fb_row)

fb_rating = fb_row[-1]
print(fb_rating)

Output
['Facebook', 0.0, 'USD', 2974676, 3.5]
3.5

Above, we retrieved 3.5 in two steps: we first retrieved data_set[0], and then we retrieved fb_row[-1]. However, there's an easier way to retrieve the same value of 3.5 by chaining the two indices ([0] and [-1]) — the code data_set[0][-1] retrieves 3.5:

data_set = [row_1, row_2, row_3, row_4, row_5]

print(data_set[0][-1]) # data_set[0] is row_1; [-1] then selects its last element

Output
3.5

Above, we've seen two methods of retrieving the value 3.5. Both methods lead to the same output (3.5), but the second method involves less typing because it combines the steps we see in the first case. While you can choose either option, people generally choose the second one.

Append list function

Once we create a list, we can add (or append) values to the end of it using the append() method:

a_list = [1, 2]
a_list.append(3)
print(a_list)

Output
[1, 2, 3]

List functions

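The built-in sum() and len() functions both take a list as input. Here, rating_sum is the list of ratings built in the next section.

sum

rating_total = sum(rating_sum) # adds up all the numbers in the list

length

len(rating_sum) # returns the number of elements in the list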
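
As a minimal sketch of both functions on their own, assume a small hand-written list named ratings. The list and the names total and count are illustrative and not part of the original dataset code.

```python
ratings = [3.5, 4.5, 4.5, 4.5, 4.5]  # sample values for illustration

total = sum(ratings)   # adds up every number in the list
count = len(ratings)   # counts the elements in the list

print(total)  # 21.5
print(count)  # 5
```

Note that sum() expects numbers: calling it on a list of strings raises a TypeError.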

Sum and average with List append function

We initialize an empty list.
We start looping over our dataset and extract the ratings.
We append the ratings to the empty list we created in step one.
Once we have all the ratings, we do the following:
- Use the sum() function to add all the ratings (to be able to use sum(), we'll need to store the ratings as floats or integers).
- Divide the sum by the number of ratings (which we can get using the len() function).

Below, we can see these steps for our dataset containing five rows:

row_1 = ['Facebook', 0.0, 'USD', 2974676, 3.5]
row_2 = ['Instagram', 0.0, 'USD', 2161558, 4.5]
row_3 = ['Clash of Clans', 0.0, 'USD', 2130805, 4.5]
row_4 = ['Fruit Ninja Classic', 1.99, 'USD', 698516, 4.5]
row_5 = ['Minecraft: Pocket Edition', 6.99, 'USD', 522012, 4.5]

app_data_set = [row_1, row_2, row_3, row_4, row_5]

rating_sum = [] # Step 1
for row in app_data_set:
    rating = row[-1] # Step 2
    rating_sum.append(rating) # Step 3

print(rating_sum)

avg_rating = sum(rating_sum) / len(rating_sum) # Step 4
print(avg_rating)

Output
[3.5, 4.5, 4.5, 4.5, 4.5]
4.3

Delete Data

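The del statement removes the element at a given index from a list. Below, android is a list of rows (a list of lists), and we delete the row at index 10472:

del android[10472]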
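
Since the android dataset is not shown here, the sketch below uses a short hypothetical list named apps as a stand-in, to show what del does to a list.

```python
apps = ['Facebook', 'Instagram', 'Clash of Clans']  # hypothetical stand-in for android

del apps[1]  # removes the element at index 1 ('Instagram')

print(apps)       # ['Facebook', 'Clash of Clans']
print(len(apps))  # 2
```

After a deletion, every later element shifts one position to the left, so running the same del statement a second time removes a different element. It is usually safest to run a deletion only once.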

Not in operator

The not in operator is the opposite of the in operator. For instance, 'z' in ['a', 'b', 'c'] returns False because 'z' is not in ['a', 'b', 'c'], but 'z' not in ['a', 'b', 'c'] returns True because it's true that 'z' is not in the list ['a', 'b', 'c'].
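
The comparison above can be checked directly. In this short sketch, the variable name letters is chosen for illustration.

```python
letters = ['a', 'b', 'c']

print('z' in letters)      # False
print('z' not in letters)  # True
print('a' in letters)      # True
```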

Duplicates with list

duplicate_apps = []
unique_apps = []

for app in android:
    name = app[0]
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)
    
print('Number of duplicate apps:', len(duplicate_apps))
print('\n')
print('Examples of duplicate apps:', duplicate_apps[:15])
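
To try the same logic without the full dataset, the sketch below runs it on a tiny hypothetical android list. The rows and their values are made up for illustration, and only the app name in column 0 matters here.

```python
# Hypothetical sample rows: [app name, rating]
android = [
    ['Instagram', 4.5],
    ['Facebook', 4.1],
    ['Instagram', 4.5],
    ['Slack', 4.4],
    ['Instagram', 4.5],
]

duplicate_apps = []
unique_apps = []

for app in android:
    name = app[0]
    if name in unique_apps:
        duplicate_apps.append(name)  # name was already seen once
    else:
        unique_apps.append(name)     # first time we see this name

print(duplicate_apps)  # ['Instagram', 'Instagram']
print(unique_apps)     # ['Instagram', 'Facebook', 'Slack']
```

Each repeated occurrence after the first is counted, so an app that appears three times contributes two entries to duplicate_apps.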

Sort list

swap_avg_by_hour = []

for row in avg_by_hour:
    swap_avg_by_hour.append([row[1], row[0]])
    
print(swap_avg_by_hour)

sorted_swap = sorted(swap_avg_by_hour, reverse=True)

print(sorted_swap)
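
The avg_by_hour list is not defined in this section, so the sketch below assumes it is a list of [hour, average] pairs and fills it with made-up sample values. When sorted() compares lists, it compares them element by element, starting with the first, which is why the columns are swapped to put the average first.

```python
# Hypothetical sample data: [hour, average] pairs
avg_by_hour = [['09', 5.5], ['15', 38.6], ['02', 23.8]]

swap_avg_by_hour = []
for row in avg_by_hour:
    swap_avg_by_hour.append([row[1], row[0]])  # put the average first

# Lists are compared by their first element, so this sorts by average
sorted_swap = sorted(swap_avg_by_hour, reverse=True)

print(sorted_swap)  # [[38.6, '15'], [23.8, '02'], [5.5, '09']]
```

sorted() returns a new list and leaves swap_avg_by_hour in its original order.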
