Exam on 2021-02-11
Part 1
a) Construct a neural network
Inputs
Outputs
You can also draw a separate network for every output.
Activation function:
b) Explain goal-based agents - in what way are they different from utility-based agents?
3. Goal-based agents
Add on: explicit goals, ability to plan into the future, model outcomes for predictions
- model the world, goals, actions and their effects explicitly
- more flexible, better maintainability
- Search and planning of actions to achieve a goal
4. Utility-based agents
Add on: take happiness into account (current state) → Humans are utility-based agents
- assess goals with a utility function (maximizing happiness, concept from micro-economics)
- resolve conflicting goals - goals are weighted differently, the aim is to optimize a set of values
- use expected utility for decision making
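The expected-utility idea above can be sketched in a few lines of Python (a minimal illustration; the action names and numbers are made up, not from the lecture):

```python
# Sketch of utility-based decision making: each action has a set of possible
# outcomes with probabilities; the agent picks the action maximizing
# expected utility.
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def choose_action(actions):
    """actions: dict mapping action name -> list of (probability, utility)."""
    return max(actions, key=lambda a: expected_utility(actions[a]))

actions = {
    "walk": [(1.0, 5.0)],                  # certain, modest utility
    "bike": [(0.9, 8.0), (0.1, -10.0)],    # usually better, small risk
}
print(choose_action(actions))  # expected utilities: walk 5.0, bike 6.2
```

Conflicting goals are resolved implicitly: whatever weighting the utility function encodes decides the trade-off.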
c) Explain the terms uninformed search, overfitting, activation function and deep learning.
Uninformed search
basic algorithms, no further information about solution costs (heuristics)
Overfitting
When the learned function has fitted itself to the training set too closely and no longer generalizes well to unseen examples
Tradeoff between complexity and simplicity
precision, fitting training data vs. generalizing, finding patterns
complex hypotheses - fit the data well vs. simpler hypotheses- might generalize better
Overfitting
The DTL algorithm will use any pattern it finds in the examples, even if the pattern is not actually correlated with the classification.
The likelihood of overfitting increases with the size of the hypothesis space and the number of inputs.
The likelihood of overfitting decreases with the number of examples.
Activation function
The function that defines when the (mathematical model of a) neuron-unit fires.
It applies a hard threshold, linear, or logistic function to the weighted sum of all inputs.
Which activation function is used where:
perceptron: hard threshold
sigmoid perceptron: logistic function
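The two unit types above can be sketched in Python (a minimal illustration; function names and the example weights are mine):

```python
import math

def hard_threshold(z):
    """Perceptron activation: fire iff the weighted input sum is >= 0."""
    return 1.0 if z >= 0 else 0.0

def logistic(z):
    """Sigmoid perceptron activation: smooth, differentiable threshold."""
    return 1.0 / (1.0 + math.exp(-z))

def unit_output(weights, inputs, g):
    """Output of one neuron-unit: activation g applied to the weighted sum."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return g(z)

print(unit_output([0.5, -0.5], [1.0, 1.0], hard_threshold))  # z = 0 -> fires
```

The logistic function is preferred for learning because, unlike the hard threshold, it has a useful gradient everywhere.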
Deep learning
Hierarchical structure learning: NNs with many layers
Feature extraction and learning (un)supervised
Problem: requires lots of data
Push for hardware (GPUs, designated chips): “Data-Driven” vs. “Model-Driven” computing
d) Multiple choice
1. Local beam search amounts to a locally confined run of random-restart searches. ✅
Idea: keep k states instead of just one.
Not the same as running k searches in parallel: searches that find good states recruit other searches to join them - choose the top k of all their successors.
Problem: often, all k states end up on the same local hill.
Solution: choose successors randomly, biased towards good ones (stochastic beam search).
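As a minimal sketch (illustrative, not from the lecture), the deterministic variant on a toy 1-D landscape looks like this; the stochastic variant would sample from the pool biased towards good states instead of taking the top k:

```python
# Local beam search: keep k states, pool them with all their successors,
# and keep only the k best states of the pool each iteration.
def local_beam_search(value, neighbors, starts, k, iterations):
    states = list(starts)
    for _ in range(iterations):
        pool = set(states)
        for s in states:
            pool.update(neighbors(s))        # successors of every kept state
        states = sorted(pool, key=value, reverse=True)[:k]  # keep top k
    return max(states, key=value)

value = lambda x: -(x - 3) ** 2              # single peak at x = 3
neighbors = lambda x: [x - 1, x + 1]
print(local_beam_search(value, neighbors, [0, 10], k=2, iterations=10))  # -> 3
```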
2. Neural networks are generally bad in object recognition. ❌
No - they are particularly good at it.
3. A rational agent does not necessarily always select the best possible action. ✅
Being rational does not mean being:
- omniscient (knowing everything) - percepts may lack relevant information
- clairvoyant (seeing into the future) - predictions are not always accurate
- 👉 successful (achieving the ideal outcome)
4. A game of poker is a discrete environment ✅
discrete vs. continuous: a property of the values that describe the world (states, time, percepts, actions)
e) Construct a decision Tree
A decision tree uses the following example set:
Construct a decision tree from the given data, using the algorithm from the lecture. The next attribute is always chosen according to the predetermined order: . Which special case occurs? How does the algorithm deal with it?
We have a predefined order, therefore we do not have to calculate the importance of each attribute individually.
The domains of these attributes:
/*
- examples: (x, y) pairs
- attributes: the attributes with values x_1, x_2, ..., x_n in each example x
- classification: y_j = true / false
*/
function DTL(examples, attributes, parent_examples) returns a decision tree
    if examples is empty then return MAJORITY(parent_examples.classifications)
    else if all examples have the same classification then return the classification
    else if attributes is empty then return MAJORITY(examples.classifications)
    else
        A ← empty
        for each attribute A' do
            A ← maxImportance(A, A', examples)
        tree ← a new decision tree with root test A
        for each value v of A do                        // all possible values from domain of A
            exs ← {e : e ∈ examples | e.A = v}          // all examples where this value occurred
            subtree ← DTL(exs, attributes \ {A}, examples)   // recursive call
            add a new branch to tree with label "A = v" and subtree subtree
        return tree
The algorithm is recursive, every sub tree is an independent problem with fewer examples and one less attribute.
The returned tree is not identical to the correct function/tree. It is a hypothesis based on the examples.
in depth explanation
- no examples left
then it means that there is no example with this combination of values → We return a default value: the majority function on examples in the parents.
- Remaining examples are all positive or negative
we are done
- no attributes left but both positive and negative examples
These examples have the same description but different classifications, we return the plurality-classification (Reasons: noise in data, non determinism)
- some positive or negative examples
choose the most important attribute to split and repeat this process for subtree
- no attributes left but both positive and negative examples
In this special case, using the predefined order, we run out of attributes while both positive and negative examples remain.
Reason: these examples have the same description but different classifications; we return the plurality classification (causes: noise in the data, non-determinism).
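The pseudocode above can be made runnable in Python (a sketch; the data structures and names are mine: examples are `(x, y)` pairs with `x` a dict, `attributes` is a list fixing the predefined split order, and `domains` maps each attribute to all its possible values):

```python
from collections import Counter

def majority(examples):
    """Plurality classification among the examples (MAJORITY in the notes)."""
    return Counter(y for _, y in examples).most_common(1)[0][0]

def dtl(examples, attributes, domains, parent_examples):
    if not examples:
        return majority(parent_examples)      # no examples left
    classes = {y for _, y in examples}
    if len(classes) == 1:
        return classes.pop()                  # all examples classified the same
    if not attributes:
        return majority(examples)             # no attributes left
    a = attributes[0]                         # predefined order: take the first
    tree = {}
    for v in domains[a]:                      # all values from the domain of a
        exs = [(x, y) for x, y in examples if x[a] == v]
        tree[(a, v)] = dtl(exs, attributes[1:], domains, examples)
    return tree

examples = [({"A": 0, "B": 0}, False),
            ({"A": 0, "B": 1}, True),
            ({"A": 1, "B": 0}, True)]
tree = dtl(examples, ["A", "B"], {"A": [0, 1], "B": [0, 1]}, examples)
print(tree)
```

Each recursive call is an independent subproblem with fewer examples and one attribute less, exactly as described above.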
f) Search problem - A* Search
Starting node
Goal node
If I marked it grey, that means it's from another step.
We're using a graph search algorithm with an explored set.
This is the explored set (seen in the correct order):
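The concrete graph of this exercise is not reproduced here, but a graph-search A* with an explored set can be sketched in Python (names and the toy graph are mine, not from the exam):

```python
import heapq

def a_star(start, goal, successors, h):
    """A* graph search. successors(n) yields (neighbor, step_cost);
    h(n) is the heuristic estimate. Returns (path, cost) or (None, inf)."""
    frontier = [(h(start), 0, start, [start])]   # entries: (f = g + h, g, node, path)
    explored = set()
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if node in explored:                     # skip stale frontier entries
            continue
        explored.add(node)
        for nbr, cost in successors(node):
            if nbr not in explored:
                heapq.heappush(frontier, (g + cost + h(nbr), g + cost, nbr, path + [nbr]))
    return None, float("inf")

graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 2)], "G": []}
h = {"S": 4, "A": 3, "B": 2, "G": 0}             # admissible heuristic
print(a_star("S", "G", lambda n: graph[n], lambda n: h[n]))
```

With an admissible (and, for graph search, consistent) heuristic, the first time the goal is popped from the frontier the path is optimal.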
g) Describe how hill climbing functions. Which problem can occur? Whats the solution to it?
Also called greedy local search.
Problems:
- getting stuck at a local maximum
- shoulders/plateaus: sideways moves can escape a shoulder, but risk an infinite loop on a flat local maximum
Solution: Random-restart hill climbing - restart after a step limit; overcomes local maxima
Pseudocode
function HILL-CLIMBING(problem) returns a state that is a local maximum
    inputs: problem                      // a problem
    local variables: current, neighbor   // nodes
    current ← MAKE-NODE(INITIAL-STATE[problem])
    loop do
        neighbor ← a highest-valued successor of current
        if VALUE[neighbor] ≤ VALUE[current] then return STATE[current]
        current ← neighbor
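A runnable Python sketch of the pseudocode above, plus the random-restart fix (the toy landscape and names are illustrative):

```python
# States are indices into a list of values; hill climbing moves to the best
# neighbour until no neighbour improves on the current state.
def hill_climbing(initial, neighbors, value):
    current = initial
    while True:
        succ = neighbors(current)
        if not succ:
            return current
        best = max(succ, key=value)
        if value(best) <= value(current):
            return current            # local maximum reached
        current = best

def random_restart(starts, neighbors, value):
    """Restart from several initial states; keep the best local maximum found."""
    return max((hill_climbing(s, neighbors, value) for s in starts), key=value)

landscape = [1, 2, 3, 2, 5, 6, 7, 2]  # local maximum at index 2, global at index 6
value = lambda i: landscape[i]
neighbors = lambda i: [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]
print(hill_climbing(0, neighbors, value))        # gets stuck at index 2
print(random_restart([0, 5], neighbors, value))  # finds index 6
```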
h) DFS vs. IDS - are they optimal with infinite branching factors?
No, neither is optimal. DFS is not optimal even with a finite branching factor, and with an infinite branching factor expanding a single node already puts infinitely many successors onto the stack-based frontier, so the completeness and optimality guarantees of IDS are lost as well.
They explore the tree in its full depth first before continuing with its breadth.
3) Depth-first search DFS
Frontier as a stack (LIFO).
Much faster than BFS if solutions are dense.
Backtracking variant: store one successor node at a time - generate child node with change.
completeness No - only complete with explored set and in finite spaces.
time complexity O(b^m) - terrible results if the maximum depth m is much larger than the depth d of the shallowest solution.
space complexity O(b·m) - the backtracking variant needs only O(m).
solution optimality No
4) Depth-limited search DLS
DFS limited with a depth limit ℓ - returns a subtree/subgraph.
Pseudocode
function DEPTH-LIMITED-SEARCH(problem, limit) returns a solution, or failure/cutoff
    return RECURSIVE-DLS(MAKE-NODE(problem.INITIAL-STATE), problem, limit)

function RECURSIVE-DLS(node, problem, limit) returns a solution, or failure/cutoff
    if problem.GOAL-TEST(node.STATE) then return SOLUTION(node)
    else if limit = 0 then return cutoff
    else
        cutoff_occurred? ← false
        for each action in problem.ACTIONS(node.STATE) do
            child ← CHILD-NODE(problem, node, action)
            result ← RECURSIVE-DLS(child, problem, limit − 1)
            if result = cutoff then cutoff_occurred? ← true
            else if result ≠ failure then return result
        if cutoff_occurred? then return cutoff else return failure
completeness No - incomplete if ℓ < d (the shallowest goal lies beyond the depth limit)
time complexity O(b^ℓ)
space complexity O(b·ℓ)
solution optimality No
5) Iterative deepening search IDS
Iteratively increasing the depth limit ℓ in DLS to gain completeness.
Pseudocode
function ITERATIVE-DEEPENING-SEARCH(problem) returns a solution, or failure
    for depth = 0 to ∞ do
        result ← DEPTH-LIMITED-SEARCH(problem, depth)
        if result ≠ cutoff then return result
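Both DLS and IDS can be sketched together in Python (a minimal, recursive version; the state/successor representation and the toy goal are mine):

```python
def depth_limited_search(state, goal, successors, limit):
    """Recursive DLS: returns a path to a goal state, the string 'cutoff'
    if the depth limit was hit, or None for failure."""
    if goal(state):
        return [state]
    if limit == 0:
        return "cutoff"
    cutoff_occurred = False
    for child in successors(state):
        result = depth_limited_search(child, goal, successors, limit - 1)
        if result == "cutoff":
            cutoff_occurred = True
        elif result is not None:
            return [state] + result
    return "cutoff" if cutoff_occurred else None

def iterative_deepening_search(start, goal, successors, max_depth=50):
    """Run DLS with limits 0, 1, 2, ... until the result is not a cutoff."""
    for depth in range(max_depth + 1):
        result = depth_limited_search(start, goal, successors, depth)
        if result != "cutoff":
            return result

# Toy state space: node n has children 2n+1 and 2n+2 (a binary tree).
succ = lambda n: [2 * n + 1, 2 * n + 2] if n < 30 else []
print(iterative_deepening_search(0, lambda n: n == 5, succ))  # -> [0, 2, 5]
```

Because the limit grows one level at a time, the first solution found is a shallowest one, which is where the optimality of IDS (under the step-cost condition below) comes from.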
completeness Yes
time complexity O(b^d)
in depth
The total number of nodes generated in the worst case is:
N(IDS) = (d)·b + (d−1)·b² + ⋯ + (1)·b^d ∈ O(b^d)
Most of the nodes are in the bottom level, so it does not matter much that the upper levels are generated multiple times. In IDS, the nodes on the bottom level (depth d) are generated once, those on the next-to-bottom level are generated twice, and so on, up to the children of the root, which are generated d times.
This is only slightly more node generations than BFS (about 11 % more for b = 10, d = 5), therefore asymptotically the same: O(b^d).
space complexity O(b·d)
in depth
Like DFS, only the current path plus the unexpanded siblings along it are stored - therefore asymptotically the same as DFS.
solution optimality Yes, if step cost ≥ 1
Part 2
a) Cryptographic puzzle as a CSP
E G G
+ O I L
-------
M A Y O
Defined as a constraint satisfaction problem:
Constraints:
Each letter stands for a different digit: Alldiff(E, G, O, I, L, M, A, Y)
The carry variables C1, C2, C3 ∈ {0, 1} stand for the digit that gets carried over into the next column by the addition. Column by column, from the ones column to the thousands column:
G + L = O + 10·C1
G + I + C1 = Y + 10·C2
E + O + C2 = A + 10·C3
M = C3
Leading digits are non-zero: E ≠ 0, O ≠ 0, M ≠ 0.
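As a sanity check of this formulation (not a CSP solver), the puzzle is small enough to brute-force in Python. Since the sum of two 3-digit numbers is below 2000, M = 1 can be fixed up front; everything else here is illustrative:

```python
from itertools import permutations

def solutions():
    """Yield all digit assignments satisfying EGG + OIL = MAYO."""
    letters = "EGOILAY"                       # the 7 remaining distinct letters
    for digits in permutations(range(10), len(letters)):
        a = dict(zip(letters, digits), M=1)   # M is forced to 1
        if a["E"] == 0 or a["O"] == 0 or 1 in digits:
            continue                          # leading digits non-zero; Alldiff with M
        egg = 100 * a["E"] + 11 * a["G"]      # 100*E + 10*G + G
        oil = 100 * a["O"] + 10 * a["I"] + a["L"]
        mayo = 1000 * a["M"] + 100 * a["A"] + 10 * a["Y"] + a["O"]
        if egg + oil == mayo:
            yield a

sols = list(solutions())
print(len(sols) > 0)   # the puzzle is satisfiable
```

One satisfying assignment, for instance, is E=7, G=8, O=2, I=6, L=4, M=1, A=0, Y=5 (788 + 264 = 1052).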
b) 3-color-problem and backtracking-search heuristics
Least constraining value
Partial assignment:
Question: Which value would be assigned to the variable by the least-constraining-value heuristic?
Answer:
Minimum remaining values
Partial assignment:
Question: Which variable would be assigned next using the minimum-remaining-values heuristic?
Answer:
Degree heuristic
Question: Which variable would be assigned a value from its domain first by the degree heuristic?
Answer:
Explanation
Variable ordering: fail first
- Minimum remaining (legal) values MRV
- degree heuristic (used as a tie-breaker) - choose the variable involved in the largest number of constraints on other unassigned variables
Value ordering: fail last
- least-constraining-value - a value that rules out the fewest choices for the neighbouring variables in the constraint graph.
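The three heuristics can be sketched in Python for a binary CSP (a minimal illustration; the representation - `domains` mapping variables to remaining values, `neighbors` mapping variables to constraint partners - and all names are mine):

```python
def mrv(assignment, domains):
    """Minimum remaining values: unassigned variable with the fewest legal values."""
    unassigned = [v for v in domains if v not in assignment]
    return min(unassigned, key=lambda v: len(domains[v]))

def degree(assignment, domains, neighbors):
    """Degree heuristic: variable constraining the most unassigned neighbours."""
    unassigned = [v for v in domains if v not in assignment]
    return max(unassigned, key=lambda v: sum(1 for n in neighbors[v] if n not in assignment))

def lcv(var, assignment, domains, neighbors):
    """Least constraining value: order var's values by how few neighbour options they rule out."""
    def ruled_out(value):
        return sum(1 for n in neighbors[var]
                   if n not in assignment and value in domains[n])
    return sorted(domains[var], key=ruled_out)

# Toy 3-coloring instance: A is adjacent to B and C.
domains = {"A": {"r", "g", "b"}, "B": {"r"}, "C": {"r", "g"}}
neighbors = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
print(mrv({}, domains), degree({}, domains, neighbors), lcv("A", {}, domains, neighbors))
```

This shows the fail-first/fail-last split: MRV picks B (only one legal value left), degree picks A (most constraints), and LCV orders A's values starting with "b" (rules out nothing for B and C).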
c) Multiple Choice
1. A CSP can only contain binary constraints ❌
No - constraints can also be unary (restricting a single variable) or global, involving an arbitrary number of variables (e.g. Alldiff).
Backtracking search algorithm
Depth-first search with single-variable assignments:
- chooses values for one unassigned variable at a time - does not matter which one because of commutativity.
- tries all values in the domain of that variable
- backtracks when a variable has no legal values left to assign.
2. The backtracking search is an uninformed search algorithm. ✅
3. The axiom of continuity is: ❌
The right answer would be
d) Axiom of monotonicity from utility theory
Monotonicity
If the agent prefers state A to state B, then it must prefer the lottery that assigns the higher probability to A.
e) Multiple Choice
1. Classical planning environments are static ✅
Classical Planning Environments
Used in classical planning:
- fully observable
- deterministic, finite
- static (change happens only when the agent acts)
- discrete in time, actions, objects, and effects
2. is syntactically correct in STRIPS ❌
No, STRIPS does not support quantifiers.
3. Unlike STRIPS, ADL supports equality ✅
4. In partial order planning, the constraint means that must be executed immediately before ❌
It means that must be executed some time before .
f) Formalizing a STRIPS action