Exam on 2017-06-27

1. Explain the hill climbing algorithm

Also called "hill-climbing":

++ escaping shoulders

- getting stuck in loop at local maximum

Solution: Random-restart hill climbing: restart after step limit overcomes local maxima
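As a sketch (the objective function, neighbourhood, step limit, and restart count below are illustrative assumptions, not from the exam):

```python
import random

random.seed(0)  # for reproducibility of this sketch

def hill_climb(start, neighbors, value, step_limit=100):
    """Steepest-ascent hill climbing: move to the best neighbour
    until no neighbour improves on the current state."""
    current = start
    for _ in range(step_limit):
        best = max(neighbors(current), key=value, default=current)
        if value(best) <= value(current):
            return current  # local maximum (or flat shoulder)
        current = best
    return current

def random_restart(candidates, neighbors, value, restarts=10):
    """Random-restart hill climbing: best result over several
    runs from random starting states."""
    starts = [random.choice(candidates) for _ in range(restarts)]
    return max((hill_climb(s, neighbors, value) for s in starts), key=value)

# toy objective: local maximum at x = 2, global maximum at x = 8
def value(x):
    return -(x - 8) ** 2 if x >= 5 else -(x - 2) ** 2 - 10

def neighbors(x):
    return [x - 1, x + 1]

best = random_restart(list(range(0, 11)), neighbors, value, restarts=20)
```

Starting near 0 gets stuck at the local maximum 2; with enough restarts some run starts in the basin of the global maximum 8.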

2. What is an order constraint and when does it lead to problems?

$A \prec B$ : action $A$ must be executed before action $B$.

The set of ordering constraints must be free of contradictions/cycles, e.g. $(A \prec B) \wedge (B \prec A)$ is inconsistent.

Otherwise we have a problem: no linear order of the actions satisfies the constraints.
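A cycle check over a set of ordering constraints can be sketched with Kahn's topological sort (the constraint sets in the usage example are made up):

```python
from collections import defaultdict

def consistent(constraints):
    """Return True if the ordering constraints (pairs meaning A precedes B)
    are free of cycles, via Kahn's topological sort."""
    succ = defaultdict(set)
    indeg = defaultdict(int)
    nodes = set()
    for a, b in constraints:
        nodes |= {a, b}
        if b not in succ[a]:
            succ[a].add(b)
            indeg[b] += 1
    queue = [n for n in nodes if indeg[n] == 0]
    ordered = 0
    while queue:
        n = queue.pop()
        ordered += 1
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    return ordered == len(nodes)  # every node ordered <=> no cycle
```

For example, `consistent([("A", "B"), ("B", "C")])` holds, while `consistent([("A", "B"), ("B", "A")])` does not.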

3. What does it mean for a heuristic to be consistent?

A heuristic $h$ is consistent if for every node $n$ and every successor $n'$ of $n$ reached via an action $a$:

$$h(n) \leq c(n,a,n') + h(n')$$

What properties do the constants $a, b \geq 0$ need to have if $h(n) = a \cdot h_1(n) + b \cdot h_2(n)$ and $h_1, h_2$ are consistent?

Applying the consistency of $h_1$ and $h_2$ to $h(n) = a \cdot h_1(n) + b \cdot h_2(n)$:

$$h(n) \leq a \cdot \big(c(n,a,n') + h_1(n')\big) + b \cdot \big(c(n,a,n') + h_2(n')\big)$$

$$h(n) \leq a \cdot c(n,a,n') + b \cdot c(n,a,n') + \underbrace{a \cdot h_1(n') + b \cdot h_2(n')}_{h(n')}$$

$$h(n) \leq (a+b) \cdot c(n,a,n') + h(n')$$

This matches the consistency condition $h(n) \leq c(n,a,n') + h(n')$ whenever $(a+b) \cdot c(n,a,n') \leq c(n,a,n')$, i.e. $h(n)$ is consistent for all $a, b \geq 0$ with $a + b \leq 1$.
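A quick numeric check on a made-up two-node graph shows that the weighted sum can break consistency once the weights grow too large (everything here is an illustrative assumption):

```python
# Toy graph: n --(cost 1)--> goal. h1 = h2 = exact remaining cost,
# so both are individually consistent.
cost = 1
h1 = {"n": 1, "goal": 0}
h2 = {"n": 1, "goal": 0}

def is_consistent(h):
    # consistency on the single edge: h(n) <= c(n, a, n') + h(n')
    return h["n"] <= cost + h["goal"]

def combine(a, b):
    """Weighted sum h = a*h1 + b*h2."""
    return {s: a * h1[s] + b * h2[s] for s in h1}

assert is_consistent(h1) and is_consistent(h2)
ok = is_consistent(combine(0.5, 0.5))      # a + b = 1: stays consistent
broken = is_consistent(combine(1.0, 1.0))  # a + b = 2: h(n) = 2 > 1 + 0
```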

4. Describe 4 different types of environments that agents can be situated in.

fully observable vs. partially observable

whether the sensors can detect all properties relevant to the choice of action

single-agent vs. multi-agent

single agent, or multiple with cooperation, competition

deterministic vs. nondeterministic

whether the next state is completely determined by the current state and the performed action, or cannot be foreseen (multiple possible successor states).

stochastic

a nondeterministic environment in which we know the probabilities of the possible outcomes

episodic vs. sequential

Episodic: the choice of action only depends on the current episode - the percept history is divided into independent episodes.

Sequential: the current decision can affect all future decisions, so the entire history must be stored (accessing these memories lowers performance).

static vs. dynamic vs. semi-dynamic

Static: the world does not change during the reasoning/processing time of the agent (it waits for the agent's response).

Semi-dynamic: the environment is static, but the performance score decreases with processing time.

discrete vs. continuous

property values of the world

known vs. unknown

state of knowledge about the "laws of physics" of the environment

5. Perceptron learning rule


The perceptron learning rule:

$$w_i \leftarrow w_i + \alpha \cdot (y - h_{\mathbf{w}}(\mathbf{x})) \cdot g'(in_i) \cdot x_i$$

We need the derivative of the identity function $f(x) = x$:

$$f'(x) = 1$$

Therefore

$$w_i \leftarrow w_i + \alpha \cdot (y - h_{\mathbf{w}}(\mathbf{x})) \cdot 1 \cdot x_i$$

$$w_i \leftarrow w_i + \alpha \cdot (y - h_{\mathbf{w}}(\mathbf{x})) \cdot x_i$$
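The rule can be sketched as follows, training a threshold unit on logical AND (the data set, learning rate, and epoch count are my choices, not from the exam):

```python
def step(z):
    """Threshold activation."""
    return 1 if z >= 0 else 0

def train(data, alpha=0.1, epochs=25):
    """Perceptron learning rule: w_i <- w_i + alpha * (y - h_w(x)) * x_i,
    i.e. the update above with g'(in) = 1."""
    w = [0.0, 0.0, 0.0]  # bias weight + two input weights
    for _ in range(epochs):
        for x, y in data:
            h = step(sum(wi * xi for wi, xi in zip(w, x)))
            w = [wi + alpha * (y - h) * xi for wi, xi in zip(w, x)]
    return w

# logical AND, with a constant 1 as bias input
data = [((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 0), ((1, 1, 1), 1)]
w = train(data)
preds = [step(sum(wi * xi for wi, xi in zip(w, x))) for x, _ in data]
```

Because AND is linearly separable, the weights converge and `preds` reproduces the labels.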

6. CSP Problem

4 variables with different domains

$A$ ...

$B$ with domain $D_B = \{1,2,5\}$

$C$ with domain $D_C = \{2,3,4,5\}$

$D$ ....

5 constraints

$C_1(A,B) = \{(1,5),(3,4),(2,1),\dots\}$

$C_2(A,C) = \{\dots\}$

Draw a constraint graph.

How would you pick the next variable / value with <heuristic name> if we assign the value $5$ to $A$?
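The exact constraints aren't recorded here, but if the heuristic was MRV (minimum remaining values), the variable selection can be sketched like this; the pruned domains after assigning $A = 5$ are hypothetical:

```python
def mrv(domains, assigned):
    """Minimum-remaining-values heuristic: among the unassigned
    variables, pick the one with the fewest legal values left."""
    unassigned = {v: d for v, d in domains.items() if v not in assigned}
    return min(unassigned, key=lambda v: len(unassigned[v]))

# hypothetical domains after assigning A = 5 and forward checking
domains = {"A": {5}, "B": {1, 2}, "C": {2, 3, 4}, "D": {1, 2, 5}}
next_var = mrv(domains, assigned={"A"})
```

Here `next_var` is `"B"`, since its pruned domain is the smallest.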

7. Construct a STRIPS action

We want to transfer a file between 2 harddrives.

$\text{On}(x, y)$ : $x$ is on $y$

$\text{File}(x)$ : $x$ is a file

$\mathrm{HD}(x)$ : $x$ is a harddrive

$\text{Empty}(x)$ : $x$ is empty

Preconditions:

  • $A$ contains the file
  • $B$ is empty
  • both $A, B$ are harddrives

Effect:

  • $B$ is not empty anymore
  • both $A, B$ contain the data

$\text{Action}(\text{Transfer}(x, y, file),$

$\text{PRECOND: } \mathrm{HD}(x) \wedge \mathrm{HD}(y) \wedge \text{File}(file) \wedge \text{On}(file, x) \wedge \text{Empty}(y) \wedge \neg\text{On}(file, y)$

$\text{EFFECT: } \text{On}(file, y) \wedge \neg\text{Empty}(y))$
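STRIPS add/delete-list semantics for this action can be sketched as follows; the set-of-ground-atoms encoding and the copy reading of the effect (both drives end up containing the data, as in the informal effect bullets above) are my assumptions:

```python
def applicable(state, action):
    """A STRIPS action is applicable iff its preconditions hold in the state."""
    return action["precond"] <= state

def apply_action(state, action):
    """Effect = remove the delete list, then add the add list."""
    return (state - action["delete"]) | action["add"]

# Ground Transfer(A, B, f): copy file f from harddrive A to harddrive B.
transfer = {
    "precond": {"HD(A)", "HD(B)", "File(f)", "On(f,A)", "Empty(B)"},
    "add": {"On(f,B)"},
    "delete": {"Empty(B)"},
}

state = {"HD(A)", "HD(B)", "File(f)", "On(f,A)", "Empty(B)"}
new_state = apply_action(state, transfer) if applicable(state, transfer) else state
```

After the action, `On(f,B)` holds, `Empty(B)` does not, and `On(f,A)` is untouched.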

8. Formal description of a search problem

Real World → Search problem → Search Algorithm

We define states, actions, solutions (= selecting a state space) through abstraction.

Then we turn states into nodes for the search algorithm.

Solution

A solution is a sequence of actions leading from the initial state to a goal state.

  1. Initial state
  1. Successor function
  1. Goal test
  1. Path cost (additive)
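The four components can be sketched as a minimal problem definition plus a breadth-first search over it (the toy state space is a made-up example):

```python
from collections import deque

class SearchProblem:
    """The four components of a formal search problem."""
    def __init__(self, initial, successors, goal_test, step_cost):
        self.initial = initial        # 1. initial state
        self.successors = successors  # 2. successor function: state -> [(action, next_state)]
        self.goal_test = goal_test    # 3. goal test
        self.step_cost = step_cost    # 4. path cost is the (additive) sum of step costs

def bfs(problem):
    """Turn states into nodes (state, actions, cost) and search breadth-first."""
    frontier = deque([(problem.initial, [], 0)])
    visited = {problem.initial}
    while frontier:
        state, actions, cost = frontier.popleft()
        if problem.goal_test(state):
            return actions, cost
        for action, nxt in problem.successors(state):
            if nxt not in visited:
                visited.add(nxt)
                new_cost = cost + problem.step_cost(state, action, nxt)
                frontier.append((nxt, actions + [action], new_cost))
    return None

# toy problem: reach 6 from 1 using +1 and *2 steps, each of cost 1
problem = SearchProblem(
    initial=1,
    successors=lambda s: [("+1", s + 1), ("*2", s * 2)],
    goal_test=lambda s: s == 6,
    step_cost=lambda s, a, t: 1,
)
solution = bfs(problem)
```

The returned `solution` is a sequence of actions from the initial state to a goal state, together with its path cost.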

9. Ramification problem, qualification problem - describe and give an example

Reasoning about the results of actions

frame problem ... which things stay unchanged after an action?

describing them explicitly would require a lot of (frame) axioms.

ramification problem ... what are the implicit effects of an action?

e.g.: the passengers of a car move with it

qualification problem ... what are all the preconditions of an action?

deals with a correct conceptualisation of things (there is no complete answer).

e.g.: finding the right formalization of a human conversation

10. Decision Network, define node types

<Example containing something about the weather>

= influence diagrams.

A general framework for rational decisions: return the action with the highest (expected) utility.

Decision networks are an extension of Bayesian networks.

A decision network represents:

  1. current state
  1. possible actions
  1. resulting state from actions
  1. utility of each state

Node types:

  1. Chance nodes (ovals)

    Random variables, uncertainty - Bayes Network

  1. Decision nodes (rectangles)

    Decision maker has choice of action

  1. Utility nodes (diamonds)

    Utility function
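A minimal expected-utility computation for a tiny weather/umbrella network, with one chance node (Weather), one decision node (take the umbrella?), and one utility node; all probabilities and utilities are invented for illustration:

```python
# Chance node: Weather in {rain, sun}, assumed prior
p_rain = 0.3

# Utility node: assumed table U(decision, weather)
utility = {
    ("umbrella", "rain"): 70, ("umbrella", "sun"): 60,
    ("no_umbrella", "rain"): 0, ("no_umbrella", "sun"): 100,
}

def expected_utility(decision):
    """EU(d) = sum over weather outcomes of P(outcome) * U(d, outcome)."""
    return (p_rain * utility[(decision, "rain")]
            + (1 - p_rain) * utility[(decision, "sun")])

# Decision node: pick the action with the highest expected utility
best = max(["umbrella", "no_umbrella"], key=expected_utility)
```

With these numbers, leaving the umbrella at home wins (EU 70 vs. 63).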

11. Why is entropy relevant for decision trees? How is $H(p_1,p_2)$ defined?

Information gain is measured with entropy.

Entropy measures the uncertainty of a random variable, in shannons or bits.

The entropy of a random variable $A$ with values $v_k$, each with probability $P(v_k)$, is defined as:

$$H(A) = \sum_k P(v_k)\log_2\!\left(\frac{1}{P(v_k)}\right) = -\sum_k P(v_k)\log_2 P(v_k)$$

Entropy is at its maximum when all outcomes are equally likely.

💡
With lower entropy, we can ask fewer questions to get to a decision.

Entropy of a boolean random variable

Boolean random variable $V$ : the variable is true with probability $q$ and false with $(1-q)$.

Same formula as above:

$$B(q) = -\big(q \cdot \log_2(q) + (1-q) \cdot \log_2(1-q)\big)$$
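The definitions translate directly into a short sketch:

```python
from math import log2

def entropy(probs):
    """H = -sum_k p_k * log2(p_k), skipping zero-probability values
    (by convention 0 * log2(0) = 0)."""
    return -sum(p * log2(p) for p in probs if p > 0)

def B(q):
    """Entropy of a boolean variable that is true with probability q."""
    return entropy([q, 1 - q])
```

`B(0.5)` gives the maximum of 1 bit (all outcomes equally likely), and the value shrinks as `q` moves toward 0 or 1, so fewer questions are needed.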

12. Multiple Choice part

Does BFS expand at least as often as it has nodes?

This depends on whether we test our goals on generation or on expansion.

Goal test at generation time: $1 + b + b^2 + b^3 + \dots + b^d = O(b^d)$

Goal test at expansion time: $1 + b + b^2 + b^3 + \dots + b^d + b^{d+1} = O(b^{d+1})$
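The two counts can be checked numerically (the values of $b$ and $d$ are chosen arbitrarily):

```python
def bfs_generated(b, d, goal_test_at_expansion=False):
    """Worst-case number of nodes BFS generates with branching factor b
    and the shallowest goal at depth d."""
    total = sum(b ** i for i in range(d + 1))  # 1 + b + ... + b^d
    if goal_test_at_expansion:
        total += b ** (d + 1)  # the level below the goal gets generated too
    return total
```

For `b=10, d=5` this gives 111,111 nodes with the goal test at generation time versus 1,111,111 at expansion time, one order of magnitude more.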

Can BFS be seen as a special case of A*?

Yes: A* with $h(n) = 0$ and constant step costs expands nodes in order of increasing depth, i.e. it behaves like BFS (uniform-cost search with unit step costs).

Is local beam search a modification of the genetic algorithm with cross-over?

No - it is the other way around: a genetic algorithm (GA) is a modification of local beam search.

A GA uses stochastic local beam search plus successors generated from pairs of states (cross-over).

Does STRIPS support the notation for equality?

No, it doesn't - ADL does.

Does the POP algorithm only produce consistent plans?

Yes - because only consistent plans are ever generated, the goal test simply checks whether any preconditions are still open. If the goal is not reached, successor plans are added.