
Learn how to code with python: Lesson 2 – Working with text

Working with text

In this lesson, I show you how to work with text. In programming lingo, text is referred to as a string, and a string is a sequence of characters. A character can be a letter, a digit, a space, a comma, a period, a hash (#), and so on. So far, so good …

Video

Explained in the video

In the video I talk about some common, simple operations that can be done on strings. All of them are methods of Python's built-in string type (str), so they are available out of the box without importing anything. I discuss only six of the many methods that are available. You can read up on more of them here: https://docs.python.org/3.6/library/stdtypes.html#str

String indexes

In the video I talk about “string positions” or indexes. Let me spell it out for you once more. Each character has a position in a string. This position is identified by a number, called its index. The index is zero-based, so counting starts at 0 and not at 1.

For example:

string  "ABCDEF"
index    012345

character A has index 0
character B has index 1
character C has index 2
character D has index 3
character E has index 4
character F has index 5
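
To make the table above concrete, here is a minimal sketch that looks up characters by index with square brackets. The variable names (text, first, fourth, last) are only chosen for this illustration.

```python
# Look up single characters by their zero-based index.
text = "ABCDEF"

first = text[0]               # index 0 is the first character
fourth = text[3]              # index 3 is the fourth character
last = text[len(text) - 1]    # the last valid index is len(text) - 1

print(first, fourth, last)    # A D F
```

Asking for an index past the end, such as text[6] here, raises an IndexError.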

Find

I often use the find method on a string to check if it contains another string. For example:

"It's raining cats and dogs".find("cat")
will return:
13

And
"It's raining cats and dogs".find("mouse")

will return:

-1

So if the find method returns -1, then I know that the string does not contain the other string I was looking for.

From the Python documentation:

str.find(sub[, start[, end]])
Return the lowest index in the string where substring sub is found within the slice s[start:end]. Optional arguments start and end are interpreted as in slice notation. Return -1 if sub is not found.

Read the docs here: https://docs.python.org/3.5/library/stdtypes.html#str.find
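
The optional start argument from the documentation is easy to overlook, so here is a small sketch of how it could be used to search past the first match. The variable names are my own. It also shows the in operator, which is another way to get a plain yes or no answer when you do not need the position.

```python
sentence = "It's raining cats and dogs"

# find returns the index of the first match only.
first_a = sentence.find("a")                  # 6, the "a" in "raining"

# Passing a start index continues the search after the first match.
second_a = sentence.find("a", first_a + 1)    # 14, the "a" in "cats"

# "cat" starts at index 13, so a search that starts at 14 misses it.
missing = sentence.find("cat", 14)            # -1

# When only a yes/no answer is needed, the in operator does the same check.
has_cat = "cat" in sentence                   # True

print(first_a, second_a, missing, has_cat)
```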

Replace

String replacement is also a very useful and common operation. Here is an example:

"It's raining cats and dogs".replace("cats", "mice")
will return:
"It's raining mice and dogs"

From the Python documentation:

str.replace(old, new[, count])
Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.

Read the docs here: https://docs.python.org/3.6/library/stdtypes.html#str.replace
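
Two details from the documentation are worth trying out: replace returns a copy, so the original string stays as it was, and the optional count limits how many occurrences are replaced. A minimal sketch, with an example string (chant) that I made up for the purpose:

```python
sentence = "It's raining cats and dogs"

# replace returns a new string; the original is left untouched.
changed = sentence.replace("cats", "mice")
print(changed)     # It's raining mice and dogs
print(sentence)    # It's raining cats and dogs

# With count=2, only the first two occurrences are replaced.
chant = "cats, cats, cats"
partly = chant.replace("cats", "mice", 2)
print(partly)      # mice, mice, cats
```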

Lower and upper

If you want to compare strings in a case-insensitive manner, it helps to first convert both strings to all lower case or all upper case. You will often use lower together with the find method to find matching strings regardless of case.

Example:
"It's raining cats and dogs".find("Cat")
will return:
-1

While
"It's raining cats and dogs".lower().find("Cat".lower())
will return:
13

From the Python documentation:

str.lower()
Return a copy of the string with all the cased characters converted to lowercase.

str.upper()
Return a copy of the string with all the cased characters converted to uppercase. Note that s.upper().isupper() might be False if s contains uncased characters or if the Unicode category of the resulting character(s) is not "Lu" (Letter, uppercase), but e.g. "Lt" (Letter, titlecase).

Read about lower in the docs here: https://docs.python.org/3.6/library/stdtypes.html#str.lower
Read about upper in the docs here: https://docs.python.org/3.6/library/stdtypes.html#str.upper
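
If you need this check in more than one place, you could wrap the lower and find combination in a small helper. This is only a sketch, and the function name contains_ignore_case is something I made up, not part of Python.

```python
def contains_ignore_case(text, sub):
    """Return True if sub occurs in text, ignoring upper/lower case."""
    # Lowercase both sides so that "Cat", "CAT" and "cat" all compare equal.
    return text.lower().find(sub.lower()) != -1


sentence = "It's raining cats and dogs"

print(contains_ignore_case(sentence, "CAT"))     # True
print(contains_ignore_case(sentence, "Mouse"))   # False

# upper works the same way in the other direction.
shouted = sentence.upper()
print(shouted)    # IT'S RAINING CATS AND DOGS
```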

Format

Sometimes you want to fill in part of a string later, for instance when you are constructing a URL to call:

"www.davetromp.net/page{}".format("1")
will return:
"www.davetromp.net/page1"

and

"www.davetromp.net/page{}".format("1234")
will return:
"www.davetromp.net/page1234"

From the Python documentation:

str.format(*args, **kwargs)
Perform a string formatting operation. The string on which this method is called can contain literal text or replacement fields delimited by braces {}. Each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument. Returns a copy of the string where each replacement field is replaced with the string value of the corresponding argument.

Read the docs here: https://docs.python.org/3.6/library/stdtypes.html#str.format
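
The documentation mentions numeric indexes and keyword names inside the braces. Here is a minimal sketch of both, plus a loop that builds a few page URLs from one template. The page numbers and the field names host and number are only examples that I picked for this illustration.

```python
template = "www.davetromp.net/page{}"

# Reuse one template for several page numbers; numbers are converted to text.
page_urls = [template.format(number) for number in range(1, 4)]
print(page_urls)

# Replacement fields can refer to positional arguments by index ...
by_index = "{0}/page{1}".format("www.davetromp.net", 12)
print(by_index)    # www.davetromp.net/page12

# ... or to keyword arguments by name.
by_name = "{host}/page{number}".format(host="www.davetromp.net", number=7)
print(by_name)     # www.davetromp.net/page7
```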

Split

Strings can be split up in many ways, but using the split method is the most obvious one.

Example:

"It's raining cats and dogs".split()
will return:
["It's", 'raining', 'cats', 'and', 'dogs']
This is a list of the words in the sentence. By default, the split method splits the string on whitespace. This way we can process a sentence word by word.
To get the third word:
"It's raining cats and dogs".split()[2]
will return:
'cats'

From the Python documentation:

str.split(sep=None, maxsplit=-1)
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).

Read the docs here: https://docs.python.org/3.5/library/stdtypes.html?highlight=split#str.split
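
The sep and maxsplit arguments from the documentation are not used in my example above, so here is a short sketch of both. The comma-separated string (animals) is made up for the illustration.

```python
sentence = "It's raining cats and dogs"

# Process the sentence word by word.
for word in sentence.split():
    print(word)

# sep: split on a comma instead of on whitespace.
animals = "cats,dogs,mice"
names = animals.split(",")
print(names)        # ['cats', 'dogs', 'mice']

# maxsplit=1: split only once, so the list has at most two elements.
head_and_rest = sentence.split(maxsplit=1)
print(head_and_rest)    # ["It's", 'raining cats and dogs']
```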
