Exam on 2018-06-26
Part 1)
a) Construct a neural network
Properties:
- max 1 inner layer
- 2 binary inputs
- 3 outputs
Activation function: threshold (step) function
in depth
A threshold unit computes a Boolean function of its inputs: AND, OR and NOT are single units, while XOR requires the hidden layer.
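The notes do not preserve which three functions the outputs should compute; as an illustration, here is a minimal Python sketch assuming the outputs are AND, OR and XOR of the two binary inputs, with a step activation:

import numpy as np

def step(z):
    # Threshold (Heaviside) activation: 1 if z > 0, else 0.
    return (z > 0).astype(int)

# One inner layer with 2 units: h1 = AND(x1, x2), h2 = OR(x1, x2)
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-1.5, -0.5])   # thresholds for AND and OR

# Output layer with 3 units: AND = h1, OR = h2, XOR = h2 AND NOT h1
W_out = np.array([[ 1.0, 0.0],
                  [ 0.0, 1.0],
                  [-1.0, 1.0]])
b_out = np.array([-0.5, -0.5, -0.5])

def forward(x):
    h = step(W_hidden @ x + b_hidden)
    return step(W_out @ h + b_out)   # [AND, OR, XOR]

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, forward(np.array(x)))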
b) Describe the properties of DFS, DLS and IDS
Depth-first search DFS
Frontier as stack (LIFO).
Much faster than BFS if solutions are dense.
completeness No - only with explored set and in finite spaces.
time complexity O(b^m) - terrible results if the maximum depth m is much larger than the solution depth d.
space complexity O(b·m) - backtracking variant: store one successor node at a time, only O(m).
solution optimality No
Depth-limited search DLS
DFS limited with a depth limit ℓ - effectively searches only a subtree/subgraph.
Pseudocode
function DEPTH-LIMITED-SEARCH(problem, limit) returns a solution, or failure/cutoff
  return RECURSIVE-DLS(MAKE-NODE(problem.INITIAL-STATE), problem, limit)

function RECURSIVE-DLS(node, problem, limit) returns a solution, or failure/cutoff
  if problem.GOAL-TEST(node.STATE) then return SOLUTION(node)
  else if limit = 0 then return cutoff
  else
    cutoff_occurred? ← false
    for each action in problem.ACTIONS(node.STATE) do
      child ← CHILD-NODE(problem, node, action)
      result ← RECURSIVE-DLS(child, problem, limit − 1)
      if result = cutoff then cutoff_occurred? ← true
      else if result ≠ failure then return result
    if cutoff_occurred? then return cutoff else return failure
completeness No - no solution is found if the depth limit ℓ is smaller than the solution depth d.
time complexity O(b^ℓ)
space complexity O(b·ℓ)
solution optimality No
Iterative deepening search IDS
Iteratively increasing the depth limit ℓ in DLS to gain completeness.
Pseudocode
function ITERATIVE-DEEPENING-SEARCH(problem) returns a solution, or failure
  for depth = 0 to ∞ do
    result ← DEPTH-LIMITED-SEARCH(problem, depth)
    if result ≠ cutoff then return result
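A runnable Python sketch of both procedures; it assumes a problem object offering initial_state, goal_test(state), actions(state) and result(state, action):

import itertools

CUTOFF, FAILURE = "cutoff", "failure"

def depth_limited_search(problem, limit):
    # Recursive DLS as in the pseudocode; returns a list of actions,
    # FAILURE, or CUTOFF (the depth limit was hit somewhere).
    def recursive_dls(state, path, limit):
        if problem.goal_test(state):
            return path
        if limit == 0:
            return CUTOFF
        cutoff_occurred = False
        for action in problem.actions(state):
            child = problem.result(state, action)
            result = recursive_dls(child, path + [action], limit - 1)
            if result == CUTOFF:
                cutoff_occurred = True
            elif result != FAILURE:
                return result
        return CUTOFF if cutoff_occurred else FAILURE
    return recursive_dls(problem.initial_state, [], limit)

def iterative_deepening_search(problem):
    # Increase the depth limit until DLS no longer reports a cutoff.
    for depth in itertools.count():
        result = depth_limited_search(problem, depth)
        if result != CUTOFF:
            return result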
completeness Yes
time complexity O(b^d)
in depth
Only about b/(b − 1) times more node generations than BFS, therefore still O(b^d).
space complexity O(b·d)
in depth
The total number of nodes generated in the worst case is:
N(IDS) = d·b + (d − 1)·b^2 + · · · + 1·b^d = O(b^d)
Most of the nodes are in the bottom level, so it does not matter much that the upper levels are generated multiple times. In IDS, the nodes on the bottom level (depth d) are generated once, those on the next-to-bottom level are generated twice, and so on, up to the children of the root, which are generated d times.
Since only the current DLS path is stored, the space requirement stays asymptotically the same as for DFS.
solution optimality Yes, if the path cost is a non-decreasing function of depth (e.g., identical step costs).
c) Name the 4 components of an agents task description.
PEAS: task description of agents, e.g., an autonomous taxi:
Performance measure safety, destination, profits, legality, comfort, . . .
Environment streets/freeways, traffic, pedestrians, weather, . . .
Actuators steering, accelerator, brake, horn, speaker/display, . . .
Sensors video, accelerometers, gauges, engine sensors, keyboard, GPS, . . .
d) What is deep learning, when does it work well, where are its shortcomings?
Deep Learning
Deep learning: Hierarchical structure learning - Artificial neural network with many layers.
Feature extraction and learning, both supervised and unsupervised
Problem: requires lots of data
Applications of Neural Networks
- Prediction of stock developments
- Prediction of air pollution (e.g., ozone concentration)
- (better than physical/chemical, statistical models)
- Prediction of electric power prices
- Desulfurization in steel making
- Pattern recognition (e.g., handwriting)
- Face recognition
- Speech recognition
- Handwritten Digit Recognition (OCR)
Advantages and Disadvantages of Neural Networks
Pros
- Less need for determining relevant input factors
- Inherent parallelism
- Easier to develop than statistical methods
- Capability to learn and improve
- Quite good for complex pattern recognition tasks
- Usable for "unstructured" and "difficult" input (e.g., images)
- Fault tolerance
Cons
- Choice of parameters (layers, units) requires skill
- Requires sufficient training material
- Resulting hypotheses cannot be understood easily
- Knowledge implicit (subsymbolic representation)
- Verification, behavior prediction difficult (if not impossible)
Challenges
- Interpretability
- Explainability
- Trustworthiness
- Robustness (adversarial examples)
- Combining symbolic and subsymbolic approaches
e) Which functions can be represented with single and multi-layer perceptrons?
No hidden layers: all linearly separable functions
One hidden layer: all continuous functions, with arbitrary precision
Two hidden layers: all functions
Single-layer feed-forward neural networks (Perceptrons)
No hidden layers: Always represents a linear separator in the input space - therefore cannot represent the XOR function.
Multilayer feed-forward neural networks
With a single, sufficiently large hidden layer (2 layers in total), it is possible to represent any continuous function of the inputs with arbitrary accuracy.
in depth
The network output is a nested non-linear function of the inputs, e.g. for a small network: a5 = g(w3,5 · g(w1,3·x1 + w2,3·x2) + w4,5 · g(w1,4·x1 + w2,4·x2)).
With the sigmoid function as activation g and a hidden layer, each output unit computes a soft-thresholded linear combination of several sigmoid functions.
For example, by adding two opposite-facing soft threshold functions and thresholding the result, we can obtain a "ridge" function.
Combining two such ridges at right angles to each other (i.e., combining the outputs from four hidden units), we obtain a "bump".
With more hidden units, we can produce more bumps of different sizes in more places.
With two hidden layers, even discontinuous functions can be represented.
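A short numeric sketch of this ridge-and-bump construction (the offsets ±3 and the scaling factor 5 are arbitrary illustrative choices):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ridge(t):
    # Two opposite-facing soft thresholds added together give a ridge:
    # close to 1 for -3 < t < 3, close to 0 elsewhere.
    return sigmoid(t + 3) - sigmoid(t - 3)

def bump(x, y):
    # Two perpendicular ridges, soft-thresholded, give a localized bump.
    return sigmoid(5 * (ridge(x) + ridge(y) - 1.5))

print(bump(0, 0))    # ~0.8, inside the bump
print(bump(10, 0))   # ~0.05, outside the bump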
f) Prove or contradict the following
h(n) = α · h1(n) + (1 − α) · h2(n) is consistent for all consistent heuristics h1 and h2 if: 0 ≤ α ≤ 1
Consistency
For all neighbours n′ of n reached via an action a: h(n) ≤ c(n, a, n′) + h(n′)
This also implies admissibility and that f = g + h is non-decreasing along every path.
Proof
We know that h1 and h2 are consistent and therefore:
h1(n) ≤ c(n, a, n′) + h1(n′) and h2(n) ≤ c(n, a, n′) + h2(n′)
This means that:
h(n) = α · h1(n) + (1 − α) · h2(n) ≤ α · (c(n, a, n′) + h1(n′)) + (1 − α) · (c(n, a, n′) + h2(n′)) = c(n, a, n′) + h(n′)
Which proves the consistency of h.
g) Describe the working principle of hill climbing. Which solutions are generated, and when can no solutions be found?
We keep two local variables, current and neighbour; current is initialised with a randomly chosen start state (in random-restart hill climbing this random initialisation is repeated several times).
while(true){
neighbour := highest valued neighbour of current
if(neighbour.value ≤ current.value) return current.state
else current := neighbour
}
We pick the neighbour with the highest value until there are no higher values left - then we can't climb anymore. Then we return the highest found state.
We can get stuck in local maxima and on plateaus; shoulders can be escaped if sideways moves are allowed.
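A runnable sketch of this loop, assuming a problem object with random_state(), neighbours(state) and value(state):

def hill_climbing(problem):
    current = problem.random_state()
    while True:
        # Steepest ascent: look at the highest-valued neighbour.
        neighbour = max(problem.neighbours(current), key=problem.value)
        if problem.value(neighbour) <= problem.value(current):
            return current          # no uphill move left: local maximum
        current = neighbour

def random_restart_hill_climbing(problem, restarts=10):
    # Restarting from random states escapes some local maxima.
    return max((hill_climbing(problem) for _ in range(restarts)),
               key=problem.value)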
Part 2)
a) Three-colorability problem CSP → Constraint Graph
Variables: one variable per state/region of the map
Domains: Di = {red, green, blue} for every variable
Constraints: Adjacent states shall have different colors.
Because we only have binary constraints, we can construct the constraint graph directly: nodes are the variables, and every binary constraint becomes an edge, as in the sketch below.
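The concrete map is not preserved in the notes; as an illustration, the textbook's Australia instance can be encoded in Python like this (each dict entry lists the edges, i.e., the binary ≠ constraints):

NEIGHBOURS = {
    "WA":  ["NT", "SA"],
    "NT":  ["WA", "SA", "Q"],
    "SA":  ["WA", "NT", "Q", "NSW", "V"],
    "Q":   ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V":   ["SA", "NSW"],
    "T":   [],
}
DOMAINS = {v: {"red", "green", "blue"} for v in NEIGHBOURS}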
b) Three-colorability problem
Same variables, domains, constraints as before.
Explanation
Different general-purpose methods help improve the performance of the backtracking search.
Variable ordering: fail first
Choose the variable with the minimum remaining (legal) values (MRV).
Alternatively (as a tie breaker), choose the variable that occurs most often in constraints with unassigned variables - degree heuristic.
Value ordering: fail last
least-constraining-value - choose the value that rules out the fewest choices for the neighbouring variables.
Least-constraining-value heuristic
Partial assignment: two neighbours of the queried variable are assigned green and blue — which value would it get?
The value that rules out the fewest remaining options for all neighbours would be the least constraining one.
Minimum-remaining-values heuristic
Partial assignment: three variables are already assigned blue, red and green.
The variable with the minimum remaining values in its domain (only one legal value left) is selected next.
Degree heuristic
Which variable would be assigned first?
The variable with the most neighbours in the constraint graph (highest degree).
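A sketch of the three heuristics on top of the NEIGHBOURS/DOMAINS encoding from part a); the helper names are my own:

def legal_values(var, assignment):
    # Values of var that do not clash with already-assigned neighbours.
    return {val for val in DOMAINS[var]
            if all(assignment.get(n) != val for n in NEIGHBOURS[var])}

def select_variable(assignment):
    # MRV: fewest remaining legal values; degree heuristic as tie breaker
    # (more neighbours = higher degree = preferred).
    unassigned = [v for v in NEIGHBOURS if v not in assignment]
    return min(unassigned, key=lambda v: (len(legal_values(v, assignment)),
                                          -len(NEIGHBOURS[v])))

def order_values(var, assignment):
    # LCV: try first the value that rules out the fewest choices
    # for the unassigned neighbours.
    def ruled_out(val):
        return sum(val in legal_values(n, assignment)
                   for n in NEIGHBOURS[var] if n not in assignment)
    return sorted(legal_values(var, assignment), key=ruled_out)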
c) Explain Backtracking Search for CSPs - why isn't it efficient?
CSPs as standard search problems
If we do not consider commutativity we have n! · d^n leaves, otherwise we have d^n leaves (n variables with domain size d).
The order of the assignments has no effect on the outcome - so we choose a single variable in each tree level.
Backtracking search algorithm
Uninformed search algorithm (not effective for large trees).
Depth-first search:
- chooses values for one unassigned variable at a time - does not matter which one because of commutativity.
- tries all values in the domain of that variable
- backtracks when a variable has no legal values left to assign.
Because plain backtracking is inefficient, we use the heuristics / general-purpose methods for backtracking search described above; a sketch follows below.
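A minimal backtracking search using the encoding and heuristics sketched above:

def backtracking_search(assignment=None):
    # Depth-first: assign one variable per level (commutativity) and
    # backtrack as soon as a variable has no legal value left.
    assignment = assignment or {}
    if len(assignment) == len(NEIGHBOURS):
        return assignment
    var = select_variable(assignment)
    for value in order_values(var, assignment):
        result = backtracking_search({**assignment, var: value})
        if result is not None:
            return result
    return None   # triggers backtracking in the caller

print(backtracking_search())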
d) Planning: What 3 problems occur in reasoning about actions?
Reasoning about the results of actions
frame problem ... which things stay unchanged after an action?
ramification problem ... implicit effects of actions (if a person is in a car, both move when the car moves)
qualification problem ... what are the preconditions for actions? requires a correct conceptualisation
e) Formalizing an action in STRIPS syntax
The action gets a requested program from a trusted source server and then installs it on the target server.
Predicates:
Program(p) ... p is a program
Server(s) ... s is a server
Trusted(s) ... s is trusted
Requested(p) ... p is requested
Available(p, s) ... p is only available on s
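A possible action schema over these predicates (the names Install and Installed are my assumptions, not from the notes):

Action: Install(p, s, t)
  Precondition: Program(p) ∧ Server(s) ∧ Server(t) ∧ Trusted(s) ∧ Requested(p) ∧ Available(p, s)
  Effect: Installed(p, t)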
f) Describe the node types in decision networks
Decision networks (= influence diagrams) are a general framework for rational decisions: return the action with the highest expected utility.
They are an extension of Bayesian networks and represent:
- current state
- possible actions
- resulting state from actions
- utility of each state
Node types:
Chance nodes (ovals) Random variables (uncertainty), as in a Bayesian network
Decision nodes (rectangles) Decision maker has choice of action
Utility nodes (diamonds) Utility function