Restaurant Example
In this chapter we will look at examples where input and output are discrete and where the output has exactly 2 possible values (boolean classification).
goal predicate / output boolean of WillWait
attributes / input vector Here only discrete values.
Attributes


Training set
- positive examples - in which WillWait is true:
- negative examples - in which WillWait is false:

Decision-Tree-Learning Algorithm
Greedy divide-and-conquer algorithm: queue sorted by the "most important attribute".
/*
- examples: (x, y) pairs
- attributes: attributes of values x_1, x_2, ... x_n in x in example
- classification: y_j = true / false
*/function DTL(examples, attributes, parent_examples) returns a decision tree
if examples is empty then return MAJORITY(parent_examples.classifications)
else if all examples have the same classification then return the classification
else if attributes is empty then return MAJORITY(examples.classification)
else
A = empty
for each Attribute A' do
A ← maxImportance(A, A', examples)tree ← a new decision tree with root test Afor each (value v of A) do //all possible values from domain of A
exs ← {e : e ∈ examples | e.A = v} //all examples where this value occured
subtree ← DTL(exs, attributes\{A}, examples) //recursive call
add a new branch to tree with label "A = v" and subtree subtreereturn tree
The algorithm is recursive, every sub tree is an independent problem with fewer examples and one less attribute.
- no examples left
then it means that there is no example with this combination of values → We return a default value: the majority function on examples in the parents.
- Remaining examples are all positive or negative
we are done
- no attributes left but both positive and negative examples
These examples have the same description but different classifications, we return the plurality-classification (Reasons: noise in data, non determinism)
- some positive or negative examples
choose the most important attribute to split and repeat this process for subtree
Solution
Notice how it ignores the Price and Type attributes.


Importance
We choose the attribute with most importance - Which attribute is the best predictor for the classification?
The one with the highest information gain.
low importance: Type
because four possible values, each of which has the same number of positive as negative examples.
high importance: Patrons
values lead to example sets that are definitely true or false.