Note that a decision tree is typically drawn growing downward, unlike a real tree. The top node of the tree is called the "root node." The nodes inside the tree are called "decision nodes"; each has an associated question, and at each decision node you follow one branch depending on the answer to that question. Nodes at the bottom of the tree (the "fringe") are called "leaf nodes"; each represents a final decision.
An alternative representation of a decision tree is a decision table showing possible outcomes given possible inputs. Below is the table that might represent the decision tree in the graphic. This table is "complete" in that it contains all eight possible combinations of values of the three input variables. But not all decision tables are complete in this sense; they may leave out certain combinations of possible inputs.
Free Coffee | Salary | Commute | Accept |
---|---|---|---|
Yes | <$50K | >1.0hr | No |
Yes | <$50K | ≤1.0hr | No |
Yes | ≥$50K | >1.0hr | No |
Yes | ≥$50K | ≤1.0hr | Yes |
No | <$50K | >1.0hr | No |
No | <$50K | ≤1.0hr | No |
No | ≥$50K | >1.0hr | No |
No | ≥$50K | ≤1.0hr | No |
It's often possible to build a tree that is more "efficient" than just naively implementing the decision table. For example, notice that the decision tree in the graphic doesn't contain eight leaves, as it might in a naive implementation of the corresponding table. In particular, if the salary is too low, we don't care about the commute or whether the company offers free coffee, so it would be a waste of time to ask those questions if we take the "no" branch from the first node. Notice also that we don't necessarily perform tests in the same order they appear in the table. It might be less efficient to put certain tests at the "root" of the tree.
It's pretty easy to write a computer program to implement a decision tree. At each place the tree branches, write an if-else or if-elif-else statement that follows one of the branches depending on the value of the condition. Each leaf node gives an answer.
Here is a possible implementation of the decision tree in the graphic:

    if salary < 50000:
        accept = False
    elif commuteInHours > 1.0:
        accept = False
    elif freeCoffee:
        accept = True
    else:
        accept = False

where variables such as salary, commuteInHours, and freeCoffee would have been set prior to this code, perhaps with input statements or via some computation.
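To check that the tree and the table really agree, one can compare them on every input combination. The following is a minimal sketch, not part of the original example: the function name accept_offer, the dictionary ACCEPT_TABLE, and the helper tree_matches_table are hypothetical, and the salary and commute values are arbitrary representatives chosen to fall in each range.

```python
def accept_offer(salary, commuteInHours, freeCoffee):
    """The decision tree from the graphic, wrapped in a function."""
    if salary < 50000:
        return False
    elif commuteInHours > 1.0:
        return False
    elif freeCoffee:
        return True
    else:
        return False

# The decision table, row by row, keyed by
# (free coffee, salary >= $50K, commute <= 1.0 hr).
ACCEPT_TABLE = {
    (True,  False, False): False,
    (True,  False, True):  False,
    (True,  True,  False): False,
    (True,  True,  True):  True,
    (False, False, False): False,
    (False, False, True):  False,
    (False, True,  False): False,
    (False, True,  True):  False,
}

def tree_matches_table():
    """Return True if the tree gives the table's answer for all eight rows."""
    for (coffee, highSalary, shortCommute), expected in ACCEPT_TABLE.items():
        salary = 60000 if highSalary else 40000   # representative values
        commute = 0.5 if shortCommute else 1.5
        if accept_offer(salary, commute, coffee) != expected:
            return False
    return True

print(tree_matches_table())   # True
```

The tree asks at most three questions and often only one, while the table lists all eight rows explicitly.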
Age | Income | Student | Credit | Buys |
---|---|---|---|---|
<=30 | High | No | Poor | No |
<=30 | High | No | Good | No |
31..40 | High | No | Poor | Yes |
>40 | Medium | No | Poor | Yes |
>40 | Low | Yes | Poor | Yes |
>40 | Low | Yes | Good | No |
31..40 | Low | Yes | Good | Yes |
<=30 | Medium | No | Poor | No |
<=30 | Low | Yes | Poor | Yes |
>40 | Medium | Yes | Poor | Yes |
<=30 | Medium | Yes | Good | Yes |
31..40 | Medium | No | Good | Yes |
31..40 | High | Yes | Poor | Yes |
>40 | Medium | No | Good | No |
Notice that this table is not "complete"; there are possible combinations of inputs that are not represented by rows in the table. For example, we can't tell from the table whether a 25-year-old non-student with low income and poor credit will buy a computer; probably not. But that's good for us because we can optimize our algorithm by ignoring such cases (called "don't cares"). Your algorithm only has to work for the cases in the table; for don't cares you are free to return either answer, but be sure to return some answer.
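One way to see how many don't cares there are is to enumerate every combination of the four inputs and remove those that appear in the table. The sketch below is illustrative only: the names ROWS, all_combos, and dont_cares are hypothetical, and the rows are transcribed from the table above.

```python
from itertools import product

# The 14 rows of the table: (age, income, student, credit) -> buys
ROWS = {
    ("<=30",   "High",   "No",  "Poor"): "No",
    ("<=30",   "High",   "No",  "Good"): "No",
    ("31..40", "High",   "No",  "Poor"): "Yes",
    (">40",    "Medium", "No",  "Poor"): "Yes",
    (">40",    "Low",    "Yes", "Poor"): "Yes",
    (">40",    "Low",    "Yes", "Good"): "No",
    ("31..40", "Low",    "Yes", "Good"): "Yes",
    ("<=30",   "Medium", "No",  "Poor"): "No",
    ("<=30",   "Low",    "Yes", "Poor"): "Yes",
    (">40",    "Medium", "Yes", "Poor"): "Yes",
    ("<=30",   "Medium", "Yes", "Good"): "Yes",
    ("31..40", "Medium", "No",  "Good"): "Yes",
    ("31..40", "High",   "Yes", "Poor"): "Yes",
    (">40",    "Medium", "No",  "Good"): "No",
}

AGES = ["<=30", "31..40", ">40"]
INCOMES = ["High", "Medium", "Low"]
STUDENT = ["Yes", "No"]
CREDIT = ["Good", "Poor"]

# Every possible combination of the four inputs: 3 * 3 * 2 * 2 = 36.
all_combos = list(product(AGES, INCOMES, STUDENT, CREDIT))

# Combinations with no row in the table are the don't cares.
dont_cares = [combo for combo in all_combos if combo not in ROWS]

print(len(all_combos), len(ROWS), len(dont_cares))   # 36 14 22
print(("<=30", "Low", "No", "Poor") in dont_cares)   # True
```

So more than half of the possible inputs are don't cares, which is what leaves room for a small tree.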
For this problem, it is possible to define and implement a fairly simple decision tree, depending on the order in which you ask the questions. (My solution had 15 total lines of code, not counting comments and blank lines, including the four input lines; I could have done it in 12 lines, but wrote straightforward code.) You will be judged on whether your program gives the correct answers (the final column) for the cases in the table, not on the efficiency of your solution. But it will pay you to think about how to order the tests. Be sure to test your code for most, if not all, of the 14 rows of the table.
You can assume that the user enters legal answers in response to the four input statements. You don't have to validate the answers and your program does not have to behave correctly if the values provided are not legal.
Below are some sample runs of the program showing the desired output:

    > python decisionTree.py
    Please enter person's age: 25
    Person's income (High, Medium, Low): High
    Is this person a student (Yes or No)? No
    Does this person have good credit (Yes or No)? No
    This person will not purchase a computer.
    > python decisionTree.py
    Please enter person's age: 35
    Person's income (High, Medium, Low): Low
    Is this person a student (Yes or No)? Yes
    Does this person have good credit (Yes or No)? Yes
    This person will purchase a computer.
    > python decisionTree.py
    Please enter person's age: 50
    Person's income (High, Medium, Low): Medium
    Is this person a student (Yes or No)? No
    Does this person have good credit (Yes or No)? No
    This person will purchase a computer.
    > python decisionTree.py
    Please enter person's age: 25
    Person's income (High, Medium, Low): Low
    Is this person a student (Yes or No)? No
    Does this person have good credit (Yes or No)? No
    This person will not purchase a computer.
    >

Notice that the last example tested a combination of inputs not in the table. Either answer would have been acceptable.
Your file must run without syntax or obvious run-time errors. It must also contain a header with the following format:

    # File: DecisionTree.py
    # Student:
    # UT EID:
    # Course Name: CS303E
    #
    # Date:
    # Description of Program:

If you submit multiple times to Canvas, it will rename your file to something like Filename-1.py, Filename-2.py, etc. Don't worry about that; we'll grade the latest version.
Suppose you want to report where a point (x, y) lies. You could write a series of independent if statements:

    if x == 0 or y == 0:
        print("Point is on an axis")
    if x > 0 and y > 0:
        print("Point is in Quadrant I")
    if x < 0 and y > 0:
        print("Point is in Quadrant II")
    if x < 0 and y < 0:
        print("Point is in Quadrant III")
    if x > 0 and y < 0:
        print("Point is in Quadrant IV")

This works because the conditions are mutually exclusive, i.e., it's impossible for any two conditions to both be True for any point. The problem is that you have to evaluate every one of the five conditions, even if you find, say, after the first test that the point is on an axis. This is quite inefficient. Instead, you might do the following:

    if x == 0 or y == 0:
        print("Point is on an axis")
    elif x > 0 and y > 0:
        print("Point is in Quadrant I")
    elif y > 0:
        print("Point is in Quadrant II")
    elif x < 0:
        print("Point is in Quadrant III")
    else:
        print("Point is in Quadrant IV")

Think about why this works; in particular, think about what must be true as you're evaluating any particular condition. This version is more efficient because you only have to evaluate conditions until you find one that's true. Also notice that you can't just replace the elifs by ifs in this version, because then multiple tests could be true simultaneously and you'd get the wrong answer.
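The claim that the elifs cannot simply be replaced by ifs can be checked directly. In this sketch, the function names and the idea of collecting labels in a list instead of printing are assumptions made for illustration. The first function follows the if-elif chain; the second is the same chain with every elif changed to if.

```python
def classify_elif(x, y):
    """The if-elif chain: exactly one branch runs."""
    labels = []
    if x == 0 or y == 0:
        labels.append("axis")
    elif x > 0 and y > 0:
        labels.append("I")
    elif y > 0:
        labels.append("II")
    elif x < 0:
        labels.append("III")
    else:
        labels.append("IV")
    return labels

def classify_ifs(x, y):
    """Same tests with each elif replaced by if: several branches can run."""
    labels = []
    if x == 0 or y == 0:
        labels.append("axis")
    if x > 0 and y > 0:
        labels.append("I")
    if y > 0:
        labels.append("II")
    if x < 0:
        labels.append("III")
    else:
        labels.append("IV")
    return labels

print(classify_elif(1, 1))   # ['I']
print(classify_ifs(1, 1))    # ['I', 'II', 'IV']
```

The shortened conditions y > 0 and x < 0 are correct only because each elif is reached when all earlier tests were False.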
Errors in Conditions: A common error by new programmers is to try something like:

    if income == "Low", "Medium", "High":
        ...

That doesn't work. Python rejects an unparenthesized, comma-separated list in an if condition as a syntax error. Adding parentheses makes it legal:

    if income == ("Low", "Medium", "High"):
        ...

but that compares income to a triple of values, which is almost certainly False.
Similarly,

    if income == "Low" or "Medium" or "High":
        ...

won't work either. This is interpreted by Python as:

    if (income == "Low") or "Medium" or "High":
        ...

Remember that any non-empty string is treated as True in a Boolean context. So that test is always True, which is almost certainly not what you wanted. What was probably meant was:

    if income == "Low" or income == "Medium" or income == "High":
        ...

(Think about what happens if this used "and" rather than "or".)
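These pitfalls are easy to confirm by evaluating the expressions directly. The following is a short sketch using a deliberately illegal value ("Bogus" is made up for illustration). The membership test with "in" at the end is a common Python alternative that the text above does not cover; it is shown here only as an option.

```python
income = "Bogus"  # deliberately not one of the legal values

# Parsed as (income == "Low") or "Medium" or "High": the result is the
# first truthy operand, here the non-empty string "Medium".
wrong = income == "Low" or "Medium" or "High"
print(repr(wrong))    # 'Medium'
print(bool(wrong))    # True, even though income is not a legal value

# Comparing a string against a tuple is legal but never True.
tuple_compare = income == ("Low", "Medium", "High")
print(tuple_compare)  # False

# The spelled-out comparison, and an equivalent membership test.
spelled_out = income == "Low" or income == "Medium" or income == "High"
membership = income in ("Low", "Medium", "High")
print(spelled_out, membership)   # False False
```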
Another common pattern from new programmers is to do something like this:

    if test:
        var = True
    else:
        var = False

Note that test already evaluates to a Boolean value. There's no reason not to use it directly. That is, replace the above code with:

    var = test

If the value were reversed, just negate the condition. That is, replace:

    if negTest:
        var = False
    else:
        var = True

with:

    var = not negTest

Finally, there's usually no reason to have a test like:

    if test == True:
        ...

or

    if negTest == False:
        ...

These are just equivalent to:

    if test:
        ...
    if not negTest:
        ...
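As a small check of these simplifications, the sketch below applies them to the salary test from the job-offer example. The function names low_salary_verbose, low_salary_direct, and salary_ok are hypothetical.

```python
def low_salary_verbose(salary):
    """Verbose pattern: set a Boolean with if-else."""
    if salary < 50000:
        result = True
    else:
        result = False
    return result

def low_salary_direct(salary):
    """Direct pattern: the comparison already produces a Boolean."""
    return salary < 50000

def salary_ok(salary):
    """Reversed case: negate the condition instead of swapping branches."""
    return not (salary < 50000)

for salary in (40000, 50000, 60000):
    print(salary, low_salary_verbose(salary), low_salary_direct(salary),
          salary_ok(salary))
# 40000 True True False
# 50000 False False True
# 60000 False False True
```

One caveat: these rewrites assume the test is a genuine Boolean, such as the result of a comparison. For other values the forms can differ; for example, the string "Yes" is treated as True in an if statement, but "Yes" == True is False.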