CS303E Homework 4

Instructor: Dr. Bill Young
Due Date: Monday, September 22, 2025 at 11:59pm

Background

A decision tree provides a systematic approach to answering a question when multiple factors go into the decision. For example, the following decision tree addresses whether or not to take a certain job:

[Figure: decision tree for deciding whether to accept a job]

Note that a decision tree is typically drawn growing downward, unlike a real tree. The top node of the tree is called the "root node." Nodes inside the tree (decision nodes), including the root node, have associated questions; at each decision node you follow one branch depending on the answer to that question. Each node at the bottom of the tree is called a "leaf node" and represents a final decision.

An alternative representation of a decision tree is a decision table showing possible outcomes given possible inputs. Below is the table that might represent the decision tree in the graphic. This table is "complete" in that it contains all eight possible combinations of values of the three input variables. But not all decision tables are complete in this sense; they may leave out certain combinations of possible inputs.

Free Coffee   Salary   Commute   Accept
Yes           <$50K    >1.0hr    No
Yes           <$50K    ≤1.0hr    No
Yes           ≥$50K    >1.0hr    No
Yes           ≥$50K    ≤1.0hr    Yes
No            <$50K    >1.0hr    No
No            <$50K    ≤1.0hr    No
No            ≥$50K    >1.0hr    No
No            ≥$50K    ≤1.0hr    No

It's often possible to build a tree that is more "efficient" than just naively implementing the decision table. For example, notice that the decision tree in the graphic doesn't contain eight leaves, as it might in a naive implementation of the corresponding table. In particular, if the salary is too low, we don't care about the commute or whether the company offers free coffee, so it would be a waste of time to ask those questions if we take the "no" branch from the first node. Notice also that we don't necessarily perform tests in the same order they appear in the table. It might be less efficient to put certain tests at the "root" of the tree.

It's pretty easy to write a computer program to implement a decision tree. At each place the tree "branches," write an if-else or if-elif-else statement that takes one of the choices depending on the value of the condition. Each leaf node gives an answer.

Here is a possible implementation of the decision tree in the graphic:

if salary < 50000:
   accept = False
elif commuteInHours > 1.0:
   accept = False
elif freeCoffee:
   accept = True
else:
   accept = False
where the variables like salary, commuteInHours, and freeCoffee would have been set prior to this code, perhaps with input statements or via some computation.
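
To check that a tree really agrees with its table, one option is to wrap the tree in a function and compare it against every row. The sketch below does that for the job example. The names acceptJob and jobTable are made up for illustration, and the salaries (40000, 60000) and commute times (1.5, 0.5) are just assumed sample values that fall on either side of each threshold.

```python
# Sketch: the job decision tree as a function (acceptJob is a hypothetical name).
def acceptJob(salary, commuteInHours, freeCoffee):
    if salary < 50000:
        return False
    elif commuteInHours > 1.0:
        return False
    elif freeCoffee:
        return True
    else:
        return False

# One tuple per table row: (freeCoffee, salary, commuteInHours, expected).
# The numeric values are assumed samples on either side of each threshold.
jobTable = [
    (True,  40000, 1.5, False),
    (True,  40000, 0.5, False),
    (True,  60000, 1.5, False),
    (True,  60000, 0.5, True),
    (False, 40000, 1.5, False),
    (False, 40000, 0.5, False),
    (False, 60000, 1.5, False),
    (False, 60000, 0.5, False),
]

# Count the rows where the tree and the table disagree.
mismatches = 0
for freeCoffee, salary, commute, expected in jobTable:
    if acceptJob(salary, commute, freeCoffee) != expected:
        mismatches += 1

print("Rows that disagree with the table:", mismatches)
```

If the tree matches the table, the count printed is 0. The tree asks at most three questions, yet it still covers all eight rows.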

Assignment

Your assignment is to implement a decision tree represented by the following decision table. The question to answer is whether a specific individual buys a computer. The inputs to the decision process are the age of the individual, her income level (high, medium, or low), whether she is a student, and her credit rating. For each given combination of inputs, the last column tells us whether the individual buys a computer.

Age      Income   Student   Credit   Buys
<=30     High     No        Poor     No
<=30     High     No        Good     No
31..40   High     No        Poor     Yes
>40      Medium   No        Poor     Yes
>40      Low      Yes       Poor     Yes
>40      Low      Yes       Good     No
31..40   Low      Yes       Good     Yes
<=30     Medium   No        Poor     No
<=30     Low      Yes       Poor     Yes
>40      Medium   Yes       Poor     Yes
<=30     Medium   Yes       Good     Yes
31..40   Medium   No        Good     Yes
31..40   High     Yes       Poor     Yes
>40      Medium   No        Good     No

Notice that this table is not "complete"; there are possible combinations of inputs that are not represented by rows in the table. For example, we can't tell from the table whether a 25-year-old non-student with low income and poor credit will buy a computer; probably not. But that's good for us because we can optimize our algorithm by ignoring such cases (called "don't cares"). Your algorithm only has to work for the cases in the table; for don't cares you are free to return either answer.

For this problem, it is possible to define and implement a fairly simple decision tree, depending on the order in which you ask the questions. (My solution had 15 total lines of code, not counting comments and blank lines, including the four input lines; I could have done it in 12 lines, but wrote straightforward code.) You will be judged on whether your program gives the correct answers (the final column) for the cases in the table, not on the efficiency of your solution. But it will pay you to think about how to order the tests. Be sure to test your code for most, if not all, of the 14 lines of the table.
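
One way to organize that testing, if you are comfortable with lists and functions, is to write the table down as data and compare each row against your decision logic. The sketch below is only a checking aid and not a solution: the names table, failingRows, and alwaysBuys are made up for illustration, the ages 25, 35, and 50 are assumed stand-ins for the three age ranges, and the placeholder alwaysBuys deliberately gives the wrong answer for some rows.

```python
# Each row: (age, income, student, goodCredit, buys).
# Ages 25, 35, and 50 are assumed stand-ins for <=30, 31..40, and >40.
table = [
    (25, "High",   False, False, False),
    (25, "High",   False, True,  False),
    (35, "High",   False, False, True),
    (50, "Medium", False, False, True),
    (50, "Low",    True,  False, True),
    (50, "Low",    True,  True,  False),
    (35, "Low",    True,  True,  True),
    (25, "Medium", False, False, False),
    (25, "Low",    True,  False, True),
    (50, "Medium", True,  False, True),
    (25, "Medium", True,  True,  True),
    (35, "Medium", False, True,  True),
    (35, "High",   True,  False, True),
    (50, "Medium", False, True,  False),
]

def failingRows(decide):
    """Return the table rows for which decide(...) gives the wrong answer."""
    failures = []
    for age, income, student, goodCredit, buys in table:
        if decide(age, income, student, goodCredit) != buys:
            failures.append((age, income, student, goodCredit, buys))
    return failures

# Placeholder that always answers "buys"; it is deliberately NOT a solution.
def alwaysBuys(age, income, student, goodCredit):
    return True

print(len(failingRows(alwaysBuys)), "of", len(table), "rows fail")
```

With the placeholder, the check reports that 5 of 14 rows fail, namely the five rows whose answer is No. A correct decision tree passed in place of alwaysBuys would leave the list of failures empty.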

You can assume that the user enters legal answers in response to the four input statements. Your program does not have to behave correctly if the values provided are not legal.

Below are some sample runs of the program showing the desired output:

> python DecisionTree.py

Please enter person's age: 25
Person's income (High, Medium, Low): High
Is this person a student (Yes or No)? No
Does this person have good credit (Yes or No)? No

This person will not purchase a computer.

> python DecisionTree.py

Please enter person's age: 35
Person's income (High, Medium, Low): Low
Is this person a student (Yes or No)? Yes
Does this person have good credit (Yes or No)? Yes

This person will purchase a computer.

> python DecisionTree.py

Please enter person's age: 50
Person's income (High, Medium, Low): Medium
Is this person a student (Yes or No)? No
Does this person have good credit (Yes or No)? No

This person will purchase a computer.

> python DecisionTree.py

Please enter person's age: 25
Person's income (High, Medium, Low): Low
Is this person a student (Yes or No)? No
Does this person have good credit (Yes or No)? No

This person will not purchase a computer.

>
Notice that the last example tested a combination of inputs not in the table. Either answer would have been acceptable.

Turning in the Assignment:

The program should be in a file named DecisionTree.py. Submit the file via Canvas before the deadline shown at the top of this page. Submit it to the assignment weekly-hw4 under the assignments section by uploading your Python file.

Your file must run without syntax or obvious run-time errors. It must also contain a header with the following format:

# File: DecisionTree.py
# Student: 
# UT EID:
# Course Name: CS303E
# 
# Date:
# Description of Program: 

If you submit multiple times to Canvas, it will rename your file to something like Filename-1.py, Filename-2.py, etc. Don't worry about that; we'll grade the latest version.

Programming Tips:

Errors in Conditions: A common error by new programmers is to try something like:
   if income == "Low", "Medium", "High":
      ...
That doesn't work. Python rejects it as a syntax error, because a comma-separated list is not allowed as the condition of an if statement. Adding parentheses makes it legal:
   if income == ("Low", "Medium", "High"):
      ...
but that compares the income to a triple of values, which is False for any string.

Similarly,

if income == "Low" or "Medium" or "High":
won't work either. This is interpreted by Python as:
if (income == "Low") or "Medium" or "High":
Remember that any non-empty string is treated as True in a Boolean context. So that test is always True, which is not what you wanted. What was probably meant was:
   if income == "Low" or income == "Medium" or income == "High":
      ...
(Think about what happens if this used "and" rather than "or".)
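
A quick way to convince yourself of these points is to evaluate each condition for a value that ought to fail the test. The sketch below uses a made-up income value, "Huge", and made-up variable names (tupleTest, orTest, intended) purely for illustration.

```python
income = "Huge"   # deliberately not one of the legal values

# Comparing a string to a tuple of strings: always False.
tupleTest = (income == ("Low", "Medium", "High"))

# (income == "Low") is False, so `or` moves on and yields the
# non-empty string "Medium", which counts as True in a condition.
orTest = (income == "Low" or "Medium" or "High")

# The intended test: compare income against each value separately.
intended = (income == "Low" or income == "Medium" or income == "High")

print(tupleTest)      # False
print(orTest)         # Medium
print(bool(orTest))   # True
print(intended)       # False
```

Only the last form gives the answer you would expect for an illegal value.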

Another common pattern from new programmers is to do something like this:

if test: 
   var = True
else:
   var = False
Note that test already evaluates to a Boolean value, so you can use it directly. That is, replace the above code with:
var = test
If the value were reversed, just negate the condition. That is, replace:
if negTest:
   var = False
else:
   var = True
with:
var = not negTest
Finally, there's usually no reason to have a test like:
if test == True:
   ...
or
if negTest == False:
   ...
These are just equivalent to:
if test:
   ...
if not negTest:
   ...
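
To see these simplifications on a concrete condition, here is a small sketch. The variable names (age, isAdult, isAdultLong, isMinor) and the threshold 18 are made up for illustration and have nothing to do with the assignment.

```python
age = 25

# Long form: an if-else that only copies a Boolean into a variable.
if age >= 18:
    isAdultLong = True
else:
    isAdultLong = False

# Short form: the comparison already produces True or False.
isAdult = age >= 18

# Reversed case: negate the condition instead of swapping the branches.
isMinor = not (age >= 18)

print(isAdultLong == isAdult)   # True
print(isMinor)                  # False

# Comparing to True is redundant; both tests below behave the same here.
if isAdult == True:
    print("adult (redundant comparison)")
if isAdult:
    print("adult (direct test)")
```
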
You can assume: When an assignment says that "you can assume" something about the input, that just means that you don't have to check it. If the user enters an input that doesn't meet that assumption, the program can crash or behave badly. That's not your problem.

For many of the assignments this semester, we'll say that you can assume certain things about the inputs, usually because you don't yet have the skills necessary to check for certain errors. But if you get a programming job, you should always validate the inputs as much as possible. In general, it's bad programming practice to allow bad inputs to crash your program; it means that your program is not robust. Later we'll insist that you "validate" some inputs, meaning to ensure that they meet the specifications. For example, you could ensure that the value entered for income (remember it's a string) is a legal value by:

    # Check whether the income value entered is legal.
    if income != 'Low' and income != 'Medium' and income != 'High':
       < handle the error here >
    else:
       # income is a legal value so proceed with the computation
       ...
There are more compact ways to do this, but this would work, and would make your code more robust. You don't have to do this for this program. We will only test it with legal inputs.
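
One of the more compact ways is a membership test with the in operator, sketched below. The function name isLegalIncome is made up for illustration, and wrapping the test in a function is just one possible arrangement; you do not need any of this for the assignment.

```python
def isLegalIncome(income):
    # True only if income is exactly one of the three legal strings.
    return income in ("Low", "Medium", "High")

print(isLegalIncome("Medium"))   # True
print(isLegalIncome("medium"))   # False: the comparison is case-sensitive
print(isLegalIncome("Huge"))     # False
```

Note that here the tuple appears on the right of in, not ==, so Python checks whether income equals any one of the three values.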