Python Lesson 02 - More on Classes

Lesson 2 builds on the constructs learned in lesson 1. We'll focus on making our own classes, modules, and packages. We'll also start to look at Python's very complete standard library, and learn to play with the APIs that we have access to.

A Review of Lesson 01

See if you know what all the following code would do:

print "variable assignment and printing:"
name = "Jeff Anderson"
print "Hello, my name is", name

print "if/else:"
if name:
    print "Hello, my name is", name
else:
    print "Hello, I have no name"

print "lists and for loop:"
companies = ['Linode', 'Slicehost']

for c in companies:
    print c, "is a company that doesn't provide shared hosting."

print "while loops:"
number = 0
while number < 10:
    print number
    number = number + 2

print "functions:"
def my_spiffy_function(spiff):
    print str(spiff), "is spiffy"

my_spiffy_function('Bluehost')
my_spiffy_function('Hostmonster')
my_spiffy_function('PYTHON!')

Make sure you are comfortable writing these types of constructs. If you don't know what something does, see if it is in the Lesson 1 notes. If you are still stuck, feel free to ask the mailing list, or ask me in person.

Homework Solution

This is the code I ended up writing:

def test_message(sender, recipient, subject, body):
    msg = "From: " + sender + "\n"
    msg += "To: " + recipient + "\n"
    msg += "Subject: " + subject + "\n\n"
    msg += body
    return msg

That was easy, but Python offers an even easier way to do this. We can create a single string that holds the whole test message and includes placeholders. This is basically a templating system built into the language:

def test_message(sender, recipient, subject, body):
    return """From: %s
To: %s
Subject: %s

%s""" % (sender, recipient, subject, body)

I am using a construct here called string formatting. In essence, this makes the test_message function a one-liner. When writing Python code, it almost always pays off to go hunting for an existing language feature or standard library module. Chances are that Python has just the thing to solve your problem using a simple and elegant piece of code.

Exceptions

One nice thing about Python is its support for exceptions and exception handling. You've probably seen at least one exception in the interactive interpreter. When one occurs while your code is running, you can catch it and do something useful.

Here is an example of an exception:

>>> 4 / 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero

No matter how new or fast a computer is, it can't divide by zero. Asking it to do so causes Python to raise an exception telling you that this isn't something it can do. Generally, if an exception goes unhandled, it terminates the program. Planning for these exceptional cases keeps everything running smoothly. Here is some example code that divides two numbers and tells you nicely if an error occurred:

def divide(x,y):
    try:
        print x/y
    except ZeroDivisionError:
        print "you can't divide by zero."
    finally:
        print "division attempt complete"
        print

print "divide 4 by 2:"
divide(4,2)
print "divide 4 by 1:"
divide(4,1)
print "divide 4 by 0:"
divide(4,0)

This is the classic way to handle exceptions. The try block holds the code that will be run. If an exception of the named type (here, ZeroDivisionError) is raised while that code runs, the except block is executed. Whether or not an exception occurred, the finally block is executed last. You don't have to include finally; most of the time you won't need it.

If an exception occurs that isn't handled by a try/except construct, the program will simply crash.

The alternative to exception handling is very ugly. For every function you write, you'd have to write lines upon lines of code that do checking and second guessing. Implementing exception handling lets you write code without all the extra junk. If something bad happens, you can handle it.
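
To make the order of execution concrete, here is a small sketch that records which blocks ran instead of printing. The function name safe_divide and the events list are my own additions for illustration:

```python
def safe_divide(x, y):
    """Return (result, events); result is None when y is zero."""
    events = []
    result = None
    try:
        result = x / y
        events.append("try finished")
    except ZeroDivisionError as err:
        # err is the exception object itself; here we only record its type.
        events.append("except: %s" % err.__class__.__name__)
    finally:
        # Runs whether or not the division worked.
        events.append("finally")
    return result, events


ok_result, ok_events = safe_divide(4, 2)
bad_result, bad_events = safe_divide(4, 0)
```

After this runs, ok_events holds "try finished" then "finally", while bad_events holds "except: ZeroDivisionError" then "finally".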

Importing Code

Python comes with a lot of modules already. Lots of people make their code available for others as well. How do we use other code in our own projects? Pasting it in isn't a good option, especially when updates come out.

import is a statement that loads code stored elsewhere so you can access it and make calls to it. Let's use import on our last .py file, saved as division.py:

>>> import division
divide 4 by 2:
2
division attempt complete
divide 4 by 1:
4
division attempt complete
divide 4 by 0:
you can't divide by zero.
division attempt complete
>>> division.divide(4,0)
you can't divide by zero.
division attempt complete

Python saw that division.py was in the current directory, and parsed the code. Notice that it executed the code that was in division.py, including the three test calls at the bottom. Normally, the top level of a module would hold only the startup code that the module needs. Notice how after importing the module, I can make the function call division.divide(4,0).
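
If you don't want the demo calls to run on import, a common approach is to put them behind a check of __name__, which is "__main__" only when the file is run directly. Here is a sketch of how division.py might be laid out that way. The main function is a name I've introduced, and divide returns its result here so it is easier to reuse. print is called with parentheses and a single argument so the snippet behaves the same on Python 2 and 3:

```python
def divide(x, y):
    """Print and return x / y, or return None if y is zero."""
    try:
        result = x / y
    except ZeroDivisionError:
        print("you can't divide by zero.")
        return None
    print(result)
    return result


def main():
    # Demo calls that should only run when the file is executed directly.
    print("divide 4 by 2:")
    divide(4, 2)
    print("divide 4 by 0:")
    divide(4, 0)


if __name__ == "__main__":
    # True for "python division.py", False for "import division".
    main()
```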

I can even import a function in the division module directly to my current namespace:

>>> from division import divide
>>> divide(4,0)
you can't divide by zero.
division attempt complete

The Python Path

So, how exactly does import know where to look for all that stuff that we might import? It looks through a list of directories, as defined by your Python Path. The Python Path is similar to the PATH of your regular shell.

To see what those directories are, you can use the sys module that provides access to many of Python's internals. Here is what it looks like on my desktop at home:

>>> import sys
>>> sys.path
['', '/usr/lib/python2.6/site-packages/pychm-0.8.4-py2.6-linux-i686.egg',
'/usr/lib/python2.6/site-packages/pexpect-2.3-py2.6.egg',
'/usr/lib/python26.zip', '/usr/lib/python2.6',
'/usr/lib/python2.5/site-packages', '/usr/lib/python2.6/plat-linux2',
'/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old',
'/usr/lib/python2.6/lib-dynload', '/usr/lib/python2.6/site-packages',
'/usr/lib/python2.6/site-packages/Numeric',
'/usr/lib/python2.6/site-packages/PIL',
'/usr/lib/python2.6/site-packages/gtk-2.0']

The '' at the beginning means "look in the current directory."

There are a few things in here that are actually not directories. Some are files that end with the .egg extension. What is this? It's a Python egg. That's right. It's an egg. An egg is basically nothing more than a zip file that contains all the Python code for a package, plus a bit of metadata. Eggs were created to make distribution easier in some situations. There is also an entry here that ends in .zip rather than .egg (python26.zip). Python can import code straight out of a plain zip file too.
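
To see that a zip file really can sit on the path, here is a small sketch that builds one in a temporary directory and imports from it. The archive name zipped_demo.zip and the module name zipped_greeting are invented for this example:

```python
import os
import sys
import tempfile
import zipfile

# Build a throwaway zip archive that contains one tiny module.
archive = os.path.join(tempfile.mkdtemp(), "zipped_demo.zip")
zf = zipfile.ZipFile(archive, "w")
zf.writestr("zipped_greeting.py",
            "def greet():\n    return 'hello from a zip file'\n")
zf.close()

# Putting the archive itself on sys.path makes its contents importable.
sys.path.insert(0, archive)
import zipped_greeting

greeting = zipped_greeting.greet()
```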

You can use an environment variable to control your Python Path:

$ export PYTHONPATH=$HOME/sandbox/python
$ python
>>> import sys
>>> sys.path
['', '/home/jefferya/sandbox/python',
'/usr/lib/python2.6/site-packages/pychm-0.8.4-py2.6-linux-i686.egg',
'/usr/lib/python2.6/site-packages/pexpect-2.3-py2.6.egg',
'/usr/lib/python26.zip', '/usr/lib/python2.6',
'/usr/lib/python2.5/site-packages', '/usr/lib/python2.6/plat-linux2',
'/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old',
'/usr/lib/python2.6/lib-dynload', '/usr/lib/python2.6/site-packages',
'/usr/lib/python2.6/site-packages/Numeric',
'/usr/lib/python2.6/site-packages/PIL',
'/usr/lib/python2.6/site-packages/gtk-2.0']
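
sys.path is an ordinary list, so a running program can also add to it directly. The sketch below writes a tiny module into a temporary directory and then imports it. The module name sandbox_greeting is hypothetical, and the temporary directory stands in for something like $HOME/sandbox/python:

```python
import os
import sys
import tempfile

# Create a scratch directory holding one small module.
sandbox = tempfile.mkdtemp()
module_path = os.path.join(sandbox, "sandbox_greeting.py")
with open(module_path, "w") as handle:
    handle.write("def greet(name):\n    return 'Hello, ' + name\n")

# Directories earlier in the list are searched first.
sys.path.insert(0, sandbox)
import sandbox_greeting

greeting = sandbox_greeting.greet("Jeff")
```

Changes made this way last only for the current process, whereas PYTHONPATH applies to every Python started from that shell.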

Classes

Let's review some terminology. In object-oriented programming, a class is the blueprint of an object. It is like the plans for a car. Before you can have a Corolla, you need to describe it in a format that will allow the factory to churn out something that is a Corolla every time.

An object is analogous to a real world "thing". You'll understand better what an object is as you become more familiar with specific examples.

An instance of a class is like having a real physical Corolla that you can actually get into, start up, drive around, put horrible body kits on, etc. The base Corolla class is unchanged, and once I have my own Corolla, I can do as I please with it. Unlike our real-world Corolla, an instance doesn't exist as an actual tangible thing in the real world. Instead, it exists inside the running program that spawned it. Just like you can have more than one Corolla, you can have more than one instance of the same class. Each has its own memory space, and each can reflect a different state.


OK, enough with my semi-lame analogy. Let's build an actual class. We'll build a semi-useful one for this example:

class customer(object):
    company = "Bluehost"

    def __init__(self, name, domain):
        self.name = name
        self.domain = domain

    def domain_expire(self):
        message = "Hello, %s. Your domain name, %s, has expired." % (
            self.name, self.domain)
        return message

Creating an Instance of a Class

So, we have our customer class inside a file called customer.py in our current working directory. Let's import that class definition into our Python interpreter, and then instantiate it:

>>> import customer
>>> from customer import customer
>>> d = customer("Jeff", "pinguino5.net")

I've imported all the code in customer.py, and then I imported the customer class into my current namespace. (The from line alone would have been enough.) When any class is instantiated, the first thing it does is call the __init__ method with the arguments passed. In this case, I gave the class instantiation call two arguments: "Jeff" and "pinguino5.net". Those will get passed to __init__ as name and domain.

The __init__ method (and every other method in the class) also has an argument called self. What does that do? When a method is called, Python passes the current instance of the class as that first argument.
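
One way to see what self is: calling a method on an instance gives the same result as calling the function on the class and passing the instance in by hand. A minimal sketch, repeating the class so the snippet stands alone (the variable names via_instance and via_class are mine):

```python
class customer(object):
    company = "Bluehost"

    def __init__(self, name, domain):
        self.name = name
        self.domain = domain

    def domain_expire(self):
        return "Hello, %s. Your domain name, %s, has expired." % (
            self.name, self.domain)


d = customer("Jeff", "pinguino5.net")

# Python fills in self with d for us here...
via_instance = d.domain_expire()
# ...which is equivalent to passing d explicitly.
via_class = customer.domain_expire(d)
```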


Let's dissect all the parts of this line of code:

class customer(object):

This is the class definition. The object part means that this class inherits everything from a class called object. This gives the class all the defaults that classes need, such as the default __str__ method. If you want to see what all the defaults are, type dir(object) in a Python interpreter.

The class part is a keyword that tells Python "I'm defining a class."

Every function that is part of a class has at least one positional argument, self. Python passes the current instance of the class when the function is called. You can access instance variables through it by writing self.variable.
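
Because the class inherits from object, it starts out with the default __str__, and it can replace that default by defining its own. The sketch below adds a __str__ method to a cut-down copy of the class. That method and its output format are my additions, not part of the lesson's customer class:

```python
class customer(object):
    def __init__(self, name, domain):
        self.name = name
        self.domain = domain

    def __str__(self):
        # Used by str() and print in place of the version inherited from object.
        return "%s <%s>" % (self.name, self.domain)


d = customer("Jeff", "pinguino5.net")

custom_text = str(d)
# The inherited default is still reachable; it looks like "<... object at 0x...>".
default_text = object.__str__(d)
```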


We also looked at an instance of our customer class in the Python interpretor:

>>> import customer
>>> d = customer.customer("Jeff", "pinguino5.net")

Notice that I'm passing arguments when I create an instance of that class. Those are passed to the __init__ function by Python. In this case, the __init__ function takes those arguments, and puts them into self.name and self.domain. We can see that name and domain are attached to this instance:

>>> dir(d)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
'__getattribute__', '__hash__', '__init__', '__module__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
'__str__', '__subclasshook__', '__weakref__', 'company', 'domain',
'domain_expire', 'name']

All those attributes that start and end with two underscores are Python special attributes. We've defined __init__ in our class. The rest are inherited from object itself. You can also see 'company', 'domain', 'domain_expire', and 'name' all attached to this instance. 'name' and 'domain' were added on the fly by __init__ when the instance was created. Let's use dir on the class itself instead of an instance to see if there are any differences:

>>> dir(customer.customer)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
'__getattribute__', '__hash__', '__init__', '__module__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
'__str__', '__subclasshook__', '__weakref__', 'company', 'domain_expire']

The class itself doesn't have name or domain. Those two variables exist only on instances.
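
Here is a short sketch of the difference between variables on the class and variables on an instance. The second customer ("Pat", "example.com") and the reassignment to "Hostmonster" are made up for the example:

```python
class customer(object):
    company = "Bluehost"  # stored once, on the class

    def __init__(self, name, domain):
        self.name = name      # stored on each instance
        self.domain = domain


first = customer("Jeff", "pinguino5.net")
second = customer("Pat", "example.com")

class_has_name = hasattr(customer, "name")   # the class has no name of its own
instance_vars = sorted(vars(second))         # names stored on this one instance

# Assigning through an instance creates an instance variable that hides
# the class variable for that instance only.
first.company = "Hostmonster"
```

After this runs, second.company and customer.company are both still "Bluehost"; only first sees "Hostmonster".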


Here is another class that is probably more useful than the last one:

class domain(object):
    def __init__(self, domain):
        self.domain = domain

    def digtest(self):
        """Look up the A and MX records for the domain using dig."""
        import subprocess

        records = {}

        # stdout=subprocess.PIPE lets us read dig's output ourselves.
        d = subprocess.Popen(('dig', '+short', '-t', 'A', self.domain),
                             stdout=subprocess.PIPE)
        records['a'] = d.stdout.readlines()

        d = subprocess.Popen(('dig', '+short', '-t', 'MX', self.domain),
                             stdout=subprocess.PIPE)
        records['mx'] = d.stdout.readlines()

        if not records['a']:
            print "The A record: %s doesn't seem to exist" % (self.domain,)
        if not records['mx']:
            print "The MX record: %s doesn't seem to exist" % (self.domain,)

    def webtest(self):
        """Try loading the web page for the main domain."""
        import urllib2

        req = urllib2.urlopen("http://%s/" % (self.domain,))

        # Compare with !=, not "is not": identity checks on integers are unreliable.
        if req.code != 200:
            print "Non-200 response: %s received" % (req.code,)

Things to note about this class:

The digtest function makes use of a module called subprocess. subprocess is a standard library module that allows you to run other programs. In this case, it runs dig +short -t A example.com and reads any result from standard output. subprocess.PIPE tells subprocess to capture the program's output in a pipe so that our code can read it. Without it, the output would just be spewed to whatever the current stdout is, usually your terminal.

If dig doesn't return an A record, or an MX record, the function simply prints out an error message.
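
The same capture pattern works for any command. The sketch below runs the Python interpreter itself as a stand-in for dig, so it works on machines without dig installed. The helper name run_and_capture and the 203.0.113.10 address are assumptions for illustration only:

```python
import subprocess
import sys


def run_and_capture(command):
    """Run command (a list of arguments) and return its stdout as a list of lines."""
    process = subprocess.Popen(command, stdout=subprocess.PIPE)
    output, _ = process.communicate()  # reads all output and waits for exit
    return output.decode("ascii").splitlines()


# Stand-in for: dig +short -t A example.com
lines = run_and_capture([sys.executable, "-c", "print('203.0.113.10')"])

if not lines:
    print("The A record doesn't seem to exist")
```

Using communicate() rather than reading stdout directly also waits for the child process to finish, which avoids leaving it behind.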

The webtest method tries to execute a web request. It is the equivalent of typing the domain name into your browser. urlopen returns a special response object defined in the urllib2 module, and the instance of that object is stored in a variable called req. req.code holds the HTTP response code as an integer. This function simply makes sure that the code is 200 instead of anything else (such as a 404, 500, 403, etc.).
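
One thing to keep in mind: urlopen raises an HTTPError exception for error responses such as 404 or 500 instead of returning them, so a check like the one in webtest usually needs a try/except around it. Here is a sketch of that idea. The function http_status, its opener parameter, and the fake_404 helper are all hypothetical names I've added so the example can run without touching the network:

```python
try:  # Python 2
    from urllib2 import urlopen, HTTPError, URLError
except ImportError:  # Python 3
    from urllib.request import urlopen
    from urllib.error import HTTPError, URLError


def http_status(url, opener=urlopen):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        response = opener(url)
    except HTTPError as err:
        # The server answered, but with an error status (404, 500, ...).
        return err.code
    except URLError:
        # No answer at all: DNS failure, refused connection, and so on.
        return None
    return response.getcode()


def fake_404(url):
    # Pretend to be urlopen hitting a missing page.
    raise HTTPError(url, 404, "Not Found", None, None)


status = http_status("http://example.com/missing", opener=fake_404)
```

HTTPError has to be caught before URLError, because it is a subclass of URLError.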

Packages

Let's combine the customer.py and domaintests.py files into one package. First, let's make and change into a new directory which will hold all the various .py files in the package:

$ mkdir bhpack
$ cd bhpack
$ ls -a
. ..

Now, to make this a Python package, there needs to be a file called __init__.py:

$ touch __init__.py  # run like this, touch creates an empty file
$ ls -a
. .. __init__.py

Now, let's put copies of our .py files into this package directory:

$ ls -a
. .. __init__.py customer.py domaintests.py

We now have our very own Python package.
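
Once the package directory's parent is on the Python Path, modules inside it are imported with a dotted name. The sketch below builds a throwaway package in a temporary directory to show this. The package name bhpack_demo is made up so it can't clash with a real bhpack, and the class is a cut-down copy of customer:

```python
import os
import sys
import tempfile

CUSTOMER_SOURCE = '''
class customer(object):
    company = "Bluehost"

    def __init__(self, name, domain):
        self.name = name
        self.domain = domain
'''

root = tempfile.mkdtemp()
package_dir = os.path.join(root, "bhpack_demo")
os.mkdir(package_dir)

# The (empty) __init__.py marks the directory as a package.
open(os.path.join(package_dir, "__init__.py"), "w").close()
with open(os.path.join(package_dir, "customer.py"), "w") as handle:
    handle.write(CUSTOMER_SOURCE)

# The directory that CONTAINS the package goes on the path, not the package itself.
sys.path.insert(0, root)
from bhpack_demo.customer import customer

d = customer("Jeff", "pinguino5.net")
```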

Python Standard Library

smtplib

I use a simple Python script while taking calls. It simply sends an e-mail message. I usually use the customer's username@box#.bluehost.com e-mail address to send the test messages. I can even use SMTP auth by supplying their username and password.

I've cleaned it up a bit for class, but here it is:

#!/usr/bin/python

import smtplib

def send(from_addr, to_addr, subject="Test",
         mail_body="This is a test message", mail_server="localhost",
         mail_server_port=25, authuser="", authpass=""):
    from_hdr = 'From: %s' % from_addr
    to_hdr = 'To: %s' % to_addr
    subject_hdr = "Subject: %s" % subject
    s = smtplib.SMTP(mail_server, mail_server_port)
    s.set_debuglevel(1)  # print the whole SMTP conversation
    ehlo_res = s.ehlo()
    s.starttls()  # switch the connection to TLS before logging in

    # Only log in if the server advertised AUTH and a username was supplied.
    if "AUTH PLAIN LOGIN" in ehlo_res[1] and authuser:
        s.login(authuser, authpass)
    else:
        print "not using authentication"

    email_message = "%s\n%s\n%s\n\n%s" % (from_hdr, to_hdr, subject_hdr, mail_body)
    s.sendmail(from_addr, to_addr, email_message)
    s.quit()

if __name__ == "__main__":
    send(from_addr="pinguino@box552.bluehost.com",
         to_addr="atest@pinguino5.net",
         mail_server="box552.bluehost.com",
         authuser="pinguino", authpass="mypass")

The send function is what does the work here. I create an SMTP object instance and store it in the variable s. Some logic checks whether I have a username and password (supplied as keyword arguments to the send function) and whether the EHLO response from the server indicates that it can in fact do authentication. If both of those are true, it will go ahead and attempt the login.

I build the e-mail message by hand. The entire message is stored as a string in the email_message variable, and then the s.sendmail function is called.
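
Building the string by hand works fine for a quick test. As an alternative, the standard library's email package can assemble the headers for you. This is a different technique from the one my script uses. The sketch below uses email.mime.text.MIMEText, and the helper name build_message is my own:

```python
from email.mime.text import MIMEText


def build_message(from_addr, to_addr, subject="Test",
                  mail_body="This is a test message"):
    """Return a message object with the usual headers filled in."""
    msg = MIMEText(mail_body)
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg["Subject"] = subject
    return msg


message = build_message("pinguino@box552.bluehost.com", "atest@pinguino5.net")

# as_string() gives the text that would be handed to s.sendmail().
email_message = message.as_string()
```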

Notice that I have s.starttls in there. My test messages use both encryption and authentication. I can also run this test much more quickly from my command line than through webmail. There are some times when this particular test is not useful, but most times it is.

The set_debuglevel(1) causes smtplib to print out everything it is doing, and every response it receives. I can do a quick visual check to make sure that all is working, and that the test message was sent.

urllib2

Let's play with urllib2 a bit more:

>>> import urllib2
>>> u = urllib2.urlopen("http://www.programmerq.net")
>>> u.code
200
>>> for i in u.readlines():
...     print i
...
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

(snip)

So readlines returns a list, with each line of the response from the server as a string. Simple enough.

You can also read the whole response into an external library that can parse HTML and do something useful with it.

Let's take a look at the headers now:

>>> u.headers
<httplib.HTTPMessage instance at 0x96ff14c>

Well, that's not too useful. I guess urllib2 has a special object type for its headers. Let's see if we can find a simple list or string representation of them:

>>> dir(u.headers)
['__contains__', '__delitem__', '__doc__', '__getitem__', '__init__',
'__iter__', '__len__', '__module__', '__setitem__', '__str__',
'addcontinue', 'addheader', 'dict', 'encodingheader', 'fp', 'get',
'getaddr', 'getaddrlist', 'getallmatchingheaders', 'getdate', 'getdate_tz',
'getencoding', 'getfirstmatchingheader', 'getheader', 'getheaders',
'getmaintype', 'getparam', 'getparamnames', 'getplist', 'getrawheader',
'getsubtype', 'gettype', 'has_key', 'headers', 'iscomment', 'isheader',
'islast', 'items', 'keys', 'maintype', 'parseplist', 'parsetype', 'plist',
'plisttext', 'readheaders', 'rewindbody', 'seekable', 'setdefault',
'startofbody', 'startofheaders', 'status', 'subtype', 'type', 'typeheader',
'unixfrom', 'values']

Holy cow! That's quite a bit. A few things stand out. __getitem__ is the method that lets you do dictionary-type access on the headers:

>>> u.headers['server']
'Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8k DAV/2 mod_python/3.3.1
Python/2.6 mod_wsgi/2.3'

How interesting, the headers object has an attribute headers:

>>> u.headers.headers
['Date: Fri, 29 May 2009 15:45:45 GMT\r\n', 'Server: Apache/2.2.11 (Unix)
mod_ssl/2.2.11 OpenSSL/0.9.8k DAV/2 mod_python/3.3.1 Python/2.6
mod_wsgi/2.3\r\n', 'Vary: Cookie\r\n', 'Content-Disposition: inline;
filename=\r\n', 'Connection: close\r\n', 'Transfer-Encoding: chunked\r\n',
'Content-Type: text/html\r\n']

So, the dictionary access makes it wonderful to write neat, clean, and concise higher-level code. The list of raw headers looks a lot like the list of lines returned by the readlines method. Because this is available to me, I can write more specialized code if I need to.
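
As an example of the more specialized kind of code, here is a sketch that turns a list of raw header lines, in the same shape as u.headers.headers, into a plain dictionary. The function name parse_headers is hypothetical, and the sample lines are copied from the output above rather than fetched live:

```python
raw_headers = [
    'Date: Fri, 29 May 2009 15:45:45 GMT\r\n',
    'Vary: Cookie\r\n',
    'Connection: close\r\n',
    'Content-Type: text/html\r\n',
]


def parse_headers(lines):
    """Map lower-cased header names to their values."""
    parsed = {}
    for line in lines:
        # Split on the first colon only; values such as dates contain colons too.
        name, value = line.split(":", 1)
        parsed[name.strip().lower()] = value.strip()
    return parsed


headers = parse_headers(raw_headers)
```

A plain dictionary keeps only the last value when a header repeats, so this approach suits quick checks more than full HTTP handling.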

Contact

If you have any questions, please post them to the mailing list on the Google Group.

I'm also happy to answer questions in person. This document will also be updated to reflect questions and concerns that come up both in person and on the mailing list.

License and Legal

This document is Copyright (c) Jeff Anderson. It is not freely distributable. It is intended to be distributed amongst employees of Bluehost for the purpose of learning the Python language. Any copying or distribution beyond this is not permitted. These terms may be modified at any time by the author.

This notice applies only to the content. The HTML code generated by docutils is in the public domain. The syntax highlighting JavaScript is google-code-prettify.