CS303E Homework 4

Instructor: Dr. Bill Young
Due Date: Monday, September 22, 2025 at 11:59pm

Background

A decision tree provides a systematic approach to answering a question when multiple factors go into the decision. For example, the graphic below shows a decision tree for deciding whether or not to take a certain job:

[Graphic: decision tree for deciding whether to accept the job]

Note that a decision tree is typically drawn growing downward, unlike a real tree. The top node of the tree is called the "root node." The nodes inside the tree are called "decision nodes"; each has an associated question, and at each decision node you follow one branch depending on the answer to that question. Nodes at the bottom of the tree (the "fringe") are called "leaf nodes"; each represents a final decision.

An alternative representation of a decision tree is a decision table showing possible outcomes given possible inputs. Below is the table that might represent the decision tree in the graphic. This table is "complete" in that it contains all eight possible combinations of values of the three input variables. But not all decision tables are complete in this sense; they may leave out certain combinations of possible inputs.

Free Coffee   Salary   Commute   Accept
Yes           <$50K    >1.0hr    No
Yes           <$50K    ≤1.0hr    No
Yes           ≥$50K    >1.0hr    No
Yes           ≥$50K    ≤1.0hr    Yes
No            <$50K    >1.0hr    No
No            <$50K    ≤1.0hr    No
No            ≥$50K    >1.0hr    No
No            ≥$50K    ≤1.0hr    No

It's often possible to build a tree that is more "efficient" than just naively implementing the decision table. For example, notice that the decision tree in the graphic doesn't contain eight leaves, as it might in a naive implementation of the corresponding table. In particular, if the salary is too low, we don't care about the commute or whether the company offers free coffee, so it would be a waste of time to ask those questions if we take the "no" branch from the first node. Notice also that we don't necessarily perform tests in the same order they appear in the table. It might be less efficient to put certain tests at the "root" of the tree.

It's pretty easy to write a computer program to implement a decision tree. At each place the tree "branches," write an if-else or if-elif-else statement that takes one of the choices depending on the value of the condition. Each leaf node gives an answer.

Here is a possible implementation of the decision tree in the graphic:

if salary < 50000:
   accept = False
elif commuteInHours > 1.0:
   accept = False
elif freeCoffee:
   accept = True
else:
   accept = False
where the variables like salary, commuteInHours, and freeCoffee would have been set prior to this code, perhaps with input statements or via some computation.
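
One way to convince yourself that code like this matches the table is to wrap it in a function and run it on every row. The following is only a sketch: the function name acceptJob, the TABLE list, and the sample values standing in for each range (40000 and 60000 for salary, 1.5 and 0.5 for commute) are assumptions made for illustration.

```python
# Illustrative sketch: the job decision tree wrapped in a function
# (acceptJob is a hypothetical name) and checked against the table.

def acceptJob(salary, commuteInHours, freeCoffee):
    """Return True if the decision tree says to accept the job."""
    if salary < 50000:
        accept = False
    elif commuteInHours > 1.0:
        accept = False
    elif freeCoffee:
        accept = True
    else:
        accept = False
    return accept

# Each row: (freeCoffee, salary, commuteInHours, expected answer).
# 40000 / 60000 and 1.5 / 0.5 are arbitrary stand-ins for the ranges
# <$50K / >=$50K and >1.0hr / <=1.0hr.
TABLE = [
    (True,  40000, 1.5, False),
    (True,  40000, 0.5, False),
    (True,  60000, 1.5, False),
    (True,  60000, 0.5, True),
    (False, 40000, 1.5, False),
    (False, 40000, 0.5, False),
    (False, 60000, 1.5, False),
    (False, 60000, 0.5, False),
]

# Collect every row where the tree and the table disagree.
mismatches = [row for row in TABLE
              if acceptJob(row[1], row[2], row[0]) != row[3]]
print("Rows that disagree with the table:", len(mismatches))
```

An empty mismatches list means the tree agrees with the table on all eight rows, even though the tree asks at most three questions and stops early on a low salary.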

Assignment

Your assignment is to implement a decision tree represented by the following decision table. The question to answer is whether a specific individual buys a computer. The inputs to the decision process are the age of the individual, her income level (high, medium or low), whether she is a student, and her credit rating. For each given combination of inputs, the last column tells us whether the individual buys a computer.

Age      Income   Student   Credit   Buys
<=30     High     No        Poor     No
<=30     High     No        Good     No
31..40   High     No        Poor     Yes
>40      Medium   No        Poor     Yes
>40      Low      Yes       Poor     Yes
>40      Low      Yes       Good     No
31..40   Low      Yes       Good     Yes
<=30     Medium   No        Poor     No
<=30     Low      Yes       Poor     Yes
>40      Medium   Yes       Poor     Yes
<=30     Medium   Yes       Good     Yes
31..40   Medium   No        Good     Yes
31..40   High     Yes       Poor     Yes
>40      Medium   No        Good     No

Notice that this table is not "complete"; there are possible combinations of inputs that are not represented by rows in the table. For example, we can't tell from the table whether a 25-year-old non-student with low income and poor credit will buy a computer; probably not. But that's good for us because we can optimize our algorithm by ignoring such cases (called "don't cares"). Your algorithm only has to work for the cases in the table; for don't cares you are free to return either answer, but be sure to return some answer.

For this problem, it is possible to define and implement a fairly simple decision tree, depending on the order in which you ask the questions. (My solution had 15 total lines of code, not counting comments and blank lines, including the four input lines; I could have done it in 12 lines, but wrote straightforward code.) You will be judged on whether your program gives the correct answers (the final column) for the cases in the table, not on the efficiency of your solution. But it will pay you to think about how to order the tests. Be sure to test your code on most, if not all, of the 14 rows of the table.
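
If you'd rather not retype all 14 cases by hand, one option is a small test harness like the sketch below. It is not required and not part of the assignment; the names TABLE, checkAgainstTable, and alwaysNo are made up for illustration, and it assumes your decision logic can be called as a function that takes the four table values as strings and returns "Yes" or "No". The stand-in alwaysNo is deliberately wrong so you can see what a failing run looks like.

```python
# Illustrative test harness. TABLE copies the 14 rows of the decision
# table; checkAgainstTable and alwaysNo are hypothetical names.

TABLE = [
    ("<=30",   "High",   "No",  "Poor", "No"),
    ("<=30",   "High",   "No",  "Good", "No"),
    ("31..40", "High",   "No",  "Poor", "Yes"),
    (">40",    "Medium", "No",  "Poor", "Yes"),
    (">40",    "Low",    "Yes", "Poor", "Yes"),
    (">40",    "Low",    "Yes", "Good", "No"),
    ("31..40", "Low",    "Yes", "Good", "Yes"),
    ("<=30",   "Medium", "No",  "Poor", "No"),
    ("<=30",   "Low",    "Yes", "Poor", "Yes"),
    (">40",    "Medium", "Yes", "Poor", "Yes"),
    ("<=30",   "Medium", "Yes", "Good", "Yes"),
    ("31..40", "Medium", "No",  "Good", "Yes"),
    ("31..40", "High",   "Yes", "Poor", "Yes"),
    (">40",    "Medium", "No",  "Good", "No"),
]

def checkAgainstTable(decide):
    """Return the table rows for which decide gives the wrong answer."""
    failures = []
    for age, income, student, credit, buys in TABLE:
        if decide(age, income, student, credit) != buys:
            failures.append((age, income, student, credit, buys))
    return failures

def alwaysNo(age, income, student, credit):
    """Deliberately wrong stand-in: ignores its inputs and answers "No"."""
    return "No"

failures = checkAgainstTable(alwaysNo)
print(len(failures), "of", len(TABLE), "rows fail")
for row in failures:
    print("  wrong answer for:", row)
```

Swapping alwaysNo for your own function should bring the failure count to zero. Don't-care combinations never appear in TABLE, so the harness says nothing about them.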

You can assume that the user enters legal answers in response to the four input statements. You don't have to validate the answers and your program does not have to behave correctly if the values provided are not legal.
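
Because the answers are guaranteed legal, turning the raw text into values you can test is short. The sketch below shows one possible way to do it; ageGroup and parseAnswers are hypothetical helper names (your program does not need functions at all), and mapping a "Yes" answer to the good-credit question onto the table's "Good" rating is an assumption about how the prompt relates to the table.

```python
# Illustrative sketch: converting the text a user types into values
# that line up with the decision table. No validation is attempted,
# since the assignment guarantees legal input.

def ageGroup(age):
    """Map a numeric age onto the table's three age ranges."""
    if age <= 30:
        return "<=30"
    elif age <= 40:
        return "31..40"
    else:
        return ">40"

def parseAnswers(ageText, studentText, creditText):
    """Convert the raw answers to (age range, isStudent, hasGoodCredit)."""
    age = int(ageText)                     # input() always returns a string
    isStudent = (studentText == "Yes")     # a comparison is already a Boolean
    hasGoodCredit = (creditText == "Yes")  # assumed: "Yes" means Good credit
    return ageGroup(age), isStudent, hasGoodCredit

# The answers from the first sample run (income needs no conversion).
print(parseAnswers("25", "No", "No"))
```

In a real program the three strings would come from input() calls with the prompts shown in the sample runs.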

Below are some sample runs of the program showing the desired output:

> python DecisionTree.py

Please enter person's age: 25
Person's income (High, Medium, Low): High
Is this person a student (Yes or No)? No
Does this person have good credit (Yes or No)? No

This person will not purchase a computer.

> python DecisionTree.py

Please enter person's age: 35
Person's income (High, Medium, Low): Low
Is this person a student (Yes or No)? Yes
Does this person have good credit (Yes or No)? Yes

This person will purchase a computer.

> python DecisionTree.py

Please enter person's age: 50
Person's income (High, Medium, Low): Medium
Is this person a student (Yes or No)? No
Does this person have good credit (Yes or No)? No

This person will purchase a computer.

> python DecisionTree.py

Please enter person's age: 25
Person's income (High, Medium, Low): Low
Is this person a student (Yes or No)? No
Does this person have good credit (Yes or No)? No

This person will not purchase a computer.

>
Notice that the last example tested a combination of inputs not in the table. Either answer would have been acceptable.

Turning in the Assignment:

The program should be in a file named DecisionTree.py. Submit the file via Canvas before the deadline shown at the top of this page. Upload your Python file to the assignment hw4 under the Assignments section.

Your file must run without syntax or obvious run-time errors. It must also contain a header with the following format:

# File: DecisionTree.py
# Student: 
# UT EID:
# Course Name: CS303E
# 
# Date:
# Description of Program: 

If you submit multiple times to Canvas, it will rename your file to something like Filename-1.py, Filename-2.py, etc. Don't worry about that; we'll grade the latest version.

Programming Tips:

if-elif-else versus consecutive if: Some of you have wondered whether to use an if-elif-else statement or just a series of if statements. Sometimes, they do the same thing but often they don't and one may be significantly more efficient. Suppose you have a Cartesian coordinate system and you want to know what quadrant a particular point (x, y) is in. You could do the following:
if x == 0 or y == 0: 
   print("Point is on an axis") 
if x > 0 and y > 0: 
   print("Point is in Quadrant I") 
if x < 0 and y > 0: 
   print("Point is in Quadrant II") 
if x < 0 and y < 0: 
   print("Point is in Quadrant III") 
if x > 0 and y < 0: 
   print("Point is in Quadrant IV")
This works because the conditions are mutually exclusive, i.e., it's impossible for any two conditions to both be True for any point. The problem is that you have to evaluate every one of the five conditions, even if you find, say, after the first test that the point is on an axis. This is quite inefficient. Instead, you might do the following:
if x == 0 or y == 0: 
   print("Point is on an axis") 
elif x > 0 and y > 0: 
   print("Point is in Quadrant I") 
elif y > 0: 
   print("Point is in Quadrant II")
elif x < 0: 
   print("Point is in Quadrant III") 
else: 
   print("Point is in Quadrant IV")
Think about why this works; in particular, think about what must be true as you're evaluating any particular condition. This version is much more efficient because you only have to evaluate conditions until you find one that's true. And also notice that you can't just replace the elifs by ifs in this version because then multiple tests could be true simultaneously and you'd get the wrong answer.
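
To see the last point concretely, here is a small experiment. It is a sketch for illustration only: the function names quadrantElif and quadrantIfs are made up, and the functions collect their messages in a list instead of printing so the two versions are easy to compare.

```python
# Illustrative sketch: the elif version next to the same tests written
# as separate ifs. Both function names are hypothetical.

def quadrantElif(x, y):
    """The if-elif-else version: exactly one branch runs."""
    messages = []
    if x == 0 or y == 0:
        messages.append("on an axis")
    elif x > 0 and y > 0:
        messages.append("Quadrant I")
    elif y > 0:
        messages.append("Quadrant II")
    elif x < 0:
        messages.append("Quadrant III")
    else:
        messages.append("Quadrant IV")
    return messages

def quadrantIfs(x, y):
    """Same tests with elif replaced by if: several branches can run."""
    messages = []
    if x == 0 or y == 0:
        messages.append("on an axis")
    if x > 0 and y > 0:
        messages.append("Quadrant I")
    if y > 0:
        messages.append("Quadrant II")
    if x < 0:
        messages.append("Quadrant III")
    else:  # this else now belongs only to the "if x < 0" test
        messages.append("Quadrant IV")
    return messages

print(quadrantElif(1, 1))
print(quadrantIfs(1, 1))
```

For the point (1, 1) the elif version reports only Quadrant I, while the all-if version also reports Quadrant II and Quadrant IV. The shortened tests such as y > 0 are only correct when the earlier tests are known to have failed.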

Errors in Conditions: A common error by new programmers is to try something like:

   if income == "Low", "Medium", "High":
      ...
That doesn't work. Python rejects the bare commas in an if condition as a syntax error. Adding parentheses makes it legal but still wrong:
   if income == ("Low", "Medium", "High"):
      ...
This compares income to a triple of values, which is False whenever income is a string.

Similarly,

if income == "Low" or "Medium" or "High":
won't work either. This is interpreted by Python as:
if (income == "Low") or "Medium" or "High":
Remember that any non-empty string is treated as True in a Boolean context. So that test is always True, which is almost certainly not what you wanted. What was probably meant was:
   if income == "Low" or income == "Medium" or income == "High":
      ...
(Think about what happens if this used "and" rather than "or".)
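
You can try these tests yourself with a value that is not a legal income. The sketch below is illustrative only; the function name compareTests and the sample value "Bogus" are made up. The last entry uses Python's in operator on a tuple, which is not discussed above but is a common shorter way to write the spelled-out test.

```python
# Illustrative sketch: evaluating each style of test on one value.
# compareTests is a hypothetical name.

def compareTests(income):
    """Return the truth value of each way of writing the test."""
    return {
        # Parsed as (income == "Low") or "Medium" or "High"
        "or-chain": bool(income == "Low" or "Medium" or "High"),
        # Compares a string to a tuple of three strings
        "tuple ==": income == ("Low", "Medium", "High"),
        # The test that was probably meant
        "spelled out": income == "Low" or income == "Medium" or income == "High",
        # Membership test: is income one of the three values?
        "in": income in ("Low", "Medium", "High"),
    }

for value in ["Low", "Bogus"]:
    print(value, compareTests(value))
```

For "Bogus" the or-chain is still true, while the spelled-out and in versions are false. For "Low" the tuple comparison is false even though "Low" is a legal income.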

Another common pattern from new programmers is to do something like this:

if test: 
   var = True
else:
   var = False
Note that test already evaluates to a Boolean value. There's no reason not to use it directly. That is, replace the above code with:
var = test
If the value were reversed, just negate the condition. That is, replace:
if negTest:
   var = False
else:
   var = True
with:
var = not negTest
Finally, there's usually no reason to have a test like:
if test == True:
   ...
or
if negTest == False:
   ...
When test and negTest are Booleans, these are just equivalent to:
if test:
   ...
and
if not negTest:
   ...
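
Here is a quick check of these simplifications, offered as a sketch only; the function names tooLongVerbose and tooLongDirect and the one-hour commute limit are borrowed from the job example purely for illustration.

```python
# Illustrative sketch: the verbose and direct forms give the same result.
# Both function names are hypothetical.

def tooLongVerbose(commuteInHours):
    """Verbose form: == True plus an if-else that copies the Boolean."""
    if (commuteInHours > 1.0) == True:
        tooLong = True
    else:
        tooLong = False
    return tooLong

def tooLongDirect(commuteInHours):
    """Direct form: the comparison is already a Boolean."""
    return commuteInHours > 1.0

for hours in [0.5, 1.0, 1.5]:
    print(hours, tooLongVerbose(hours), tooLongDirect(hours))

# Caveat: == True is not the same as a plain test for non-Boolean values.
answer = "Yes"
print(answer == True)   # a string is never equal to True
print(bool(answer))     # but a non-empty string counts as true in an if
```

The caveat at the end is one more reason to prefer the plain form: comparing a non-Boolean value to True can quietly give a different answer than testing the value itself.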