Useful Functions

If we define our own function with the same name as a built-in function, our version shadows the built-in, and Python runs ours instead.
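
A minimal sketch of this shadowing behavior. The replacement sum() and its return value are hypothetical, chosen only to make the effect visible:

```python
import builtins

def sum(values):
    # Hypothetical replacement that shadows the built-in sum().
    return "custom sum called"

result = sum([1, 2, 3])  # runs our function, not the built-in
print(result)            # custom sum called

# The original is still reachable through the builtins module.
print(builtins.sum([1, 2, 3]))  # 6
```

This is why it is usually safer to avoid reusing built-in names for our own functions and variables.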

Filtering Non-English Names

The characters commonly used in English text are encoded using the ASCII standard. Each ASCII character has a corresponding number between 0 and 127, and we can take advantage of that to build a function that checks an app name and tells us whether it contains non-ASCII characters.

The function below uses the built-in ord() function to find the encoding number of each character.

def is_english(string):
    # Return False as soon as one character falls outside the ASCII range (0-127).
    for character in string:
        if ord(character) > 127:
            return False

    return True

The function seems to work fine, but some English app names use emojis or other symbols (™, — (em dash), – (en dash), etc.) that fall outside of the ASCII range. Because of this, we'll remove useful apps if we use the function in its current form. To minimize the impact of data loss, we'll only remove an app if its name has more than three non-ASCII characters:

def is_english(string):
    non_ascii = 0

    for character in string:
        if ord(character) > 127:
            non_ascii += 1

    # Treat the name as English when it has at most three non-ASCII characters.
    return non_ascii <= 3

Sorted

The built-in sorted() function takes an iterable (such as a list, dictionary, or tuple) and returns a new list of its elements sorted in ascending order by default. Passing reverse=True sorts in descending order instead. For a dictionary, the elements that get sorted are its keys.
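
A brief sketch of sorted() on a list and a dictionary. The variable names and values are made up for illustration:

```python
numbers = [3, 1, 2]

ascending = sorted(numbers)
descending = sorted(numbers, reverse=True)

print(ascending)   # [1, 2, 3]
print(descending)  # [3, 2, 1]
print(numbers)     # [3, 1, 2], the original list is left unchanged

# Sorting a dictionary returns a list of its keys.
ratings = {"b": 2, "a": 3}
sorted_keys = sorted(ratings)
print(sorted_keys)  # ['a', 'b']
```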

Replace Function

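To use str.replace(), we substitute str with the name of the string variable we want to modify. Because strings are immutable, the method returns a new string rather than changing the original. Let's look at an example in code: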
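
A minimal sketch, using a made-up price string as the input:

```python
price = "$1,299.99"

# Each call returns a new string, so calls can be chained.
cleaned = price.replace("$", "").replace(",", "")

print(cleaned)  # 1299.99
print(price)    # $1,299.99, the original string is unchanged
```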

Capitalize 1st word
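
The notes do not say which method was intended here, so the sketch below shows two likely candidates as an assumption: str.capitalize(), which uppercases only the first character of the string, and str.title(), which uppercases the first letter of every word.

```python
text = "hello world"

first_only = text.capitalize()
every_word = text.title()

print(first_only)  # Hello world
print(every_word)  # Hello World
```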

Split strings
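
A short sketch, assuming this heading refers to the str.split() method. The sample strings are made up:

```python
csv_line = "1, 2, 3"
sentence = "a b  c"

# Split on an explicit separator.
parts = csv_line.split(", ")
print(parts)  # ['1', '2', '3']

# With no argument, split() breaks on any run of whitespace.
words = sentence.split()
print(words)  # ['a', 'b', 'c']
```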

Convert a value in each row to int
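
One way this could look, assuming the data is a list of lists in which the second column holds numbers stored as strings. The rows below are hypothetical:

```python
rows = [
    ["Facebook", "2974676"],
    ["Instagram", "2161558"],
]

for row in rows:
    # Overwrite the string in column 1 with its integer value.
    row[1] = int(row[1])

print(rows)  # [['Facebook', 2974676], ['Instagram', 2161558]]
```

Note that int() raises a ValueError if the string contains anything other than an integer, such as commas or a decimal point.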

Convert to string
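
A minimal sketch, assuming this heading refers to the built-in str() function:

```python
count = 42

count_text = str(count)
print(count_text)  # 42

# Converting first lets us concatenate a number with other strings.
message = "Total apps: " + str(count)
print(message)  # Total apps: 42
```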

String Format Function

We call the str.format() method on a string, which acts as a template, using braces ({}) to mark where we want to insert variables. We then pass those variables as arguments to the method. Let's look at a few examples:
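
A basic sketch with one string and one integer. The variable names and values are made up:

```python
name = "Kylie"
num = 8

output = "{}'s favorite number is {}".format(name, num)
print(output)  # Kylie's favorite number is 8
```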

str.format() converts the integer to a string. The variables are inserted into the {} in the order we pass them as arguments.

If we want to specify the ordering or insert the same value more than once, we can put integer positions inside the braces:

example
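
A sketch of positional placeholders, using made-up values. The second part uses keyword arguments, which str.format() also accepts, although the text above only mentions integers:

```python
# {0} refers to the first argument and {1} to the second.
by_position = "{0}'s favorite number is {1}, {1} is {0}'s favorite number".format("Kylie", 8)
print(by_position)
# Kylie's favorite number is 8, 8 is Kylie's favorite number

# Named placeholders are filled from keyword arguments.
by_name = "{name}'s favorite number is {num}".format(name="Kylie", num=8)
print(by_name)  # Kylie's favorite number is 8
```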

Formatting Numbers Inside Strings

Another powerful use of the method is applying formatting to numbers as they are inserted into the string. This can make our data more readable, especially in the case of long decimal numbers. Let's look at a quick example:

In most cases, showing six digits after the decimal point (the precision) is unnecessary. Instead of a precision of 6, we might want to show a precision of only 2:
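
A sketch of the unformatted case, using a made-up value with six digits after the decimal point:

```python
num = 32.554865

raw = "I own {}% of the company".format(num)
print(raw)  # I own 32.554865% of the company
```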

We specify number formatting, including things like precision, by adding a format specification inside the braces ({}) of our string. The format specification section of the Python documentation covers many options, but because it is complex, we'll focus on the most common ones you'll need.

To indicate a precision of two, we specify :.2f after the name or position of our argument:
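
A short sketch of both forms. The keyword name pct and the value are hypothetical:

```python
# After a named argument.
named = "I own {pct:.2f}% of the company".format(pct=32.554865)
print(named)  # I own 32.55% of the company

# After a positional argument.
positional = "I own {0:.2f}% of the company".format(32.554865)
print(positional)  # I own 32.55% of the company
```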

If you are not specifying a named/positional argument, you just leave that part out:
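
For instance, with the same made-up value as an unnamed argument:

```python
unnamed = "I own {:.2f}% of the company".format(32.554865)
print(unnamed)  # I own 32.55% of the company
```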

Another useful format specification is to add a comma as a thousands separator, which prevents large numbers from being hard to read, as in the example below:

To add a comma, you would use the syntax :, inside the braces, after the number or name of the variable you're inserting:

We can also combine the thousands separator and the precision by specifying the comma first and then the precision:
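
A sketch with a made-up population figure. The keyword name pop is hypothetical:

```python
plain = "The population is {}".format(1234567)
print(plain)  # The population is 1234567

with_commas = "The population is {:,}".format(1234567)
print(with_commas)  # The population is 1,234,567

# The same specification works after a named argument.
named = "The population is {pop:,}".format(pop=1234567)
print(named)  # The population is 1,234,567
```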

Example
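
A sketch of the combined specification, with the comma placed before the precision. The balance value is made up:

```python
balance = 1234567.891

combined = "Your balance is {:,.2f}".format(balance)
print(combined)  # Your balance is 1,234,567.89
```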
