Page 115 - Python for Everybody

8.14. DEBUGGING

    orig = t[:]
    t.sort()
In this example you could also use the built-in function sorted, which returns a new, sorted list and leaves the original alone. But in that case you should avoid using sorted as a variable name!
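The difference between the two approaches can be seen in a minimal sketch. The list contents and the names in_order and result are made up for illustration:

```python
t = [3, 1, 2]              # hypothetical sample data

in_order = sorted(t)       # builds a new sorted list; t is not modified
print(t)                   # [3, 1, 2]
print(in_order)            # [1, 2, 3]

result = t.sort()          # sorts t in place and returns None
print(t)                   # [1, 2, 3]
print(result)              # None
```

Note that the sketch stores the result in in_order rather than in a variable called sorted, which would hide the built-in function for the rest of the program.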
4. Lists, split, and files
When we read and parse files, there are many opportunities to encounter input that can crash our program, so it is a good idea to revisit the guardian pattern when it comes to writing programs that read through a file and look for a “needle in the haystack”.
Let’s revisit our program that looks for the day of the week on the From lines of our file:
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
Since we are breaking this line into words, we could dispense with the use of startswith and simply look at the first word of the line to determine if we are interested in the line at all. We can use continue to skip lines that don’t have “From” as the first word as follows:
    fhand = open('mbox-short.txt')
    for line in fhand:
        words = line.split()
        if words[0] != 'From' : continue
        print(words[2])
This looks much simpler, and we don’t even need to do the rstrip to remove the newline at the end of each line. But is it better?
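The point about rstrip can be checked on a single line. This is a small sketch that uses the sample From line from above with a newline added by hand; calling split with no arguments splits on any run of whitespace, so the trailing newline never ends up in a word:

```python
# The sample line from the text, with the newline a file read would include.
line = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n'

words = line.split()       # no arguments: split on whitespace, newline included
print(words[0])            # From
print(words[2])            # Sat
print(repr(words[-1]))     # '2008' (no trailing newline)
```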
    python search8.py
    Sat
    Traceback (most recent call last):
      File "search8.py", line 5, in <module>
        if words[0] != 'From' : continue
    IndexError: list index out of range
It kind of works and we see the day from the first line (Sat), but then the program fails with a traceback error. What went wrong? What messed-up data caused our elegant, clever, and very Pythonic program to fail?
You could stare at it for a long time and puzzle through it or ask someone for help, but the quicker and smarter approach is to add a print statement. The best place to add the print statement is right before the line where the program failed and print out the data that seems to be causing the failure.
Now this approach may generate a lot of lines of output, but at least you will immediately have some clue as to the problem at hand. So we add a print of the variable words right before line five. We even add a prefix “Debug:” to the line so we can keep our regular output separate from our debug output.
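The debugging step described above might look like the following sketch. To keep it self-contained it does not open mbox-short.txt; instead it reads from a small in-memory stand-in (an assumption for illustration) made of the sample From line followed by a blank line. The names sample, debug_log, and days are hypothetical, and the try/except is there only so the sketch finishes cleanly instead of stopping with a traceback:

```python
import io

# Hypothetical stand-in for the file: one From line, then a blank line.
sample = io.StringIO(
    'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n'
    '\n'
)

debug_log = []   # copy of everything the debug print showed
days = []        # the regular output

try:
    for line in sample:
        words = line.split()
        print('Debug:', words)          # added right before the failing line
        debug_log.append(words)
        if words[0] != 'From' : continue
        days.append(words[2])
        print(words[2])
except IndexError as err:
    # The last Debug line printed shows the data that caused the failure.
    print('Failed with:', err)
```

With the “Debug:” prefix, the extra lines are easy to tell apart from the regular output, and the last debug line printed before the failure shows exactly what words contained at that moment.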














































































