Exam on 2016-06-28

1. Construct a neural network

2 inputs

2 outputs: NAND, XOR

Activation function:

$$g(x)=\begin{cases}1 & \text{if } x \geq 1, \\ 0 & \text{otherwise}\end{cases}$$
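A minimal Python sketch of one possible solution (these particular weights are an assumption, not the only valid choice): NAND can be computed directly as g(2 - x1 - x2), while XOR needs a hidden layer, e.g. XOR = g(OR + NAND - 1) with OR = g(x1 + x2).

```python
def g(x):
    """Threshold activation from the exam: fires iff the weighted sum reaches 1."""
    return 1 if x >= 1 else 0

def network(x1, x2):
    # Hidden layer (weights are one possible, assumed choice):
    h_or = g(1 * x1 + 1 * x2)            # OR(x1, x2)
    h_nand = g(-1 * x1 - 1 * x2 + 2)     # NAND(x1, x2), constant bias input weighted by 2
    # Output layer: NAND is passed through, XOR = AND(OR, NAND)
    out_nand = h_nand
    out_xor = g(1 * h_or + 1 * h_nand - 1)
    return out_nand, out_xor

for x1 in (0, 1):
    for x2 in (0, 1):
        print((x1, x2), "->", network(x1, x2))
# (0, 0) -> (1, 0), (0, 1) -> (1, 1), (1, 0) -> (1, 1), (1, 1) -> (0, 0)
```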

2. Explain the local beam search and mention its advantages / disadvantages

Idea: keep $k$ states instead of just 1.

Not the same as running $k$ searches in parallel: searches that find good states recruit other searches to join them - choose the top $k$ of all their successors.

Problem: often, all $k$ states end up on the same local hill.

Solution: choose $k$ successors randomly, biased towards good ones (stochastic beam search).
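A minimal Python sketch of the idea (the `successors()` and `value()` functions are hypothetical problem-specific placeholders, not part of the original notes):

```python
import random

def local_beam_search(start_states, successors, value, k, steps=100, stochastic=False):
    """Keep k states instead of 1. Each step, pool ALL successors of the
    current k states and keep k of them, so searches that find good states
    effectively recruit the others."""
    beam = list(start_states)[:k]
    for _ in range(steps):
        pool = [s2 for s in beam for s2 in successors(s)]
        if not pool:
            break
        if stochastic:
            # Stochastic variant: choose k successors randomly, biased towards
            # good ones (assumes non-negative values); mitigates the collapse
            # of all k states onto the same local hill.
            beam = random.choices(pool, weights=[value(s) for s in pool], k=k)
        else:
            # Plain local beam search: top k of the joint successor pool.
            beam = sorted(pool, key=value, reverse=True)[:k]
    return max(beam, key=value)
```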

3. Describe the components of the goal-based agent

Add-on (to the model-based agent): explicit goals, the ability to plan into the future, and a model of action outcomes for prediction.

4. Use the A* search

on a small graph

A* Search

Solves the non-optimality and possible non-termination of greedy search.

Just like UCS, but uses $g+h$ as the priority for the queue.

$f(n) = g(n) + h(n)$, where the heuristic must be admissible.

completeness: Yes - with finitely many nodes and finite $b$, costs $\leq C^*$, step costs $\geq \varepsilon$

time complexity: $O(b^{\varepsilon d})$ for solution depth $d$, where $b^\varepsilon$ is the effective branching factor and $\varepsilon = \frac{f^* - h}{f^*}$ is the relative error of the heuristic ($f^*$ being the true remaining cost).

space complexity: exponential - keeps all nodes in memory

optimality of solution: Yes

  • non-admissible $h(n)$ for tree search is not optimal (proof)

    $S\dots$ start node

    $G, H\dots$ goal nodes

    We immediately expand the non-optimal goal $G$ because $h(G) = 0$, although $f(G) = 5 + 0$.

    Then we check $f(T) = 6 + 1$ and terminate the search.

  • admissible $h(n)$ for tree search is optimal (proof)

    $S\dots$ start node

    $G\dots$ optimal goal node

    $G_2\dots$ suboptimal goal node

    $n\dots$ unexpanded node on the optimal path to $G$

    Suppose $G_2$ appears on the frontier. Since $G_2$ is a goal node, $h(G_2) = 0$:

    $f(G_2) = g(G_2) + h(G_2)$

    $f(G_2) = g(G_2) + 0$

    Because $G_2$ is suboptimal:

    $\textcolor{grey}{f(G_2) =}\ g(G_2) > g(G) \ \textcolor{grey}{= g(n) + f^*(n)}$ (cost to $n$, plus cost from $n$ to the goal)

    $f(G_2) > g(n) + f^*(n)$

    Because the heuristic is admissible:

    $f(G_2) > g(n) + f^*(n) \geq g(n) + h(n) \ \textcolor{grey}{= f(n)}$

    $f(G_2) > f(n)$

    Therefore A* will never select $G_2$ for expansion.

  • consistent $h(n)$ for graph search is optimal

    With an inconsistent $h$, $f$ can decrease along the optimal path to the goal, so graph search may reach a state via a suboptimal path first and then discard the optimal path to it.

    (In tree search this is not a problem, since each node corresponds to a single path and no paths are discarded.)

    Solutions to this problem:

    • consistency of $h$ (non-decreasing $f$ along every path)
    • additional book-keeping

optimal efficiency: Yes

A* expands the fewest nodes possible - no other optimal algorithm can expand fewer nodes than A*, because not expanding nodes with $f(n) < C^*$ runs the risk of missing the optimal solution.
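A minimal Python sketch of A* on a small, made-up graph (the graph and heuristic values below are assumptions for illustration, not the graph from the exam):

```python
import heapq

def a_star(graph, h, start, goal):
    """graph: {node: [(neighbor, step_cost), ...]}, h: admissible heuristic.
    Priority queue ordered by f = g + h, exactly like UCS but with g + h."""
    frontier = [(h[start], 0, start, [start])]    # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:                          # goal test at expansion time
            return path, g
        for nbr, cost in graph.get(node, []):
            g2 = g + cost
            if g2 < best_g.get(nbr, float("inf")):
                best_g[nbr] = g2
                heapq.heappush(frontier, (g2 + h[nbr], g2, nbr, path + [nbr]))
    return None, float("inf")

# Hypothetical small graph with an admissible (and consistent) heuristic:
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 3)]}
h = {"S": 5, "A": 4, "B": 2, "G": 0}
print(a_star(graph, h, "S", "G"))   # -> (['S', 'A', 'B', 'G'], 6)
```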

5. Multiple choice

What are the decisions of a rational agent based on?

Decisions based on evidence:

  • percept history
  • built-in knowledge of the environment - i.e. fundamental laws like physics

BFS expands as many nodes as DFS

No - BFS uses a (FIFO) queue and DFS a (LIFO) stack, so they expand different numbers of nodes (compare the complexities below).

BFS time complexity: $O(b^d)$

Goal test at generation time: $1 + b + b^2 + b^3 + \dots + b^d = O(b^d)$

Goal test at expansion time: $1 + b + b^2 + b^3 + \dots + b^d + \textcolor{pink}{b^{d+1}} = O(b^{d+1})$

DFS time complexity: $O(b^m)$, where the maximum depth $m$ may be $\infty$

IDS is optimal for limited tree depth

Yes - but it is also optimal for unknown tree depths.

It finds optimal solutions if all step costs are identical (e.g., every step costs 1).

In general IDS is very attractive and preferred when the search space is large and the depth of the solution is unknown.

It is usually only a little more expensive than BFS.
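A minimal sketch of iterative deepening (`goal_test` and `successors` are hypothetical problem-specific callables):

```python
def depth_limited_search(node, goal_test, successors, limit):
    """Recursive DFS with a depth limit; returns a path or None."""
    if goal_test(node):
        return [node]
    if limit == 0:
        return None
    for child in successors(node):
        result = depth_limited_search(child, goal_test, successors, limit - 1)
        if result is not None:
            return [node] + result
    return None

def iterative_deepening_search(start, goal_test, successors, max_depth=50):
    """Run depth-limited DFS with increasing limits 0, 1, 2, ...
    Works even when the solution depth is unknown; returns the shallowest solution."""
    for limit in range(max_depth + 1):
        result = depth_limited_search(start, goal_test, successors, limit)
        if result is not None:
            return result
    return None
```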

STRIPS only allows disjunctions in goals

No, it doesn't - it only allows conjunctions.

ADL supports notation for equality / inequality

Yes

6. Constraint graphs

3-coloring problem on a geographical map - draw a constraint graph: one node per region (variable), one edge per pair of neighbouring regions (inequality constraint).
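One possible way to represent such a constraint graph in code; the regions and borders below are made up for illustration:

```python
# Hypothetical map: variables are regions, edges are "shares a border with".
# Constraint on every edge: the two regions must get different colors.
regions = ["A", "B", "C", "D"]
borders = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]   # constraint graph edges
colors = ["red", "green", "blue"]

def consistent(assignment):
    """Check the binary inequality constraint on every edge of the graph."""
    return all(assignment[x] != assignment[y]
               for x, y in borders
               if x in assignment and y in assignment)

# Example: a complete, consistent 3-coloring of this graph.
print(consistent({"A": "red", "B": "green", "C": "blue", "D": "red"}))  # True
```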

7. Heuristics for backtracking search

Variable ordering: fail first

  • Minimum remaining (legal) values MRV
  • degree heuristic (used as a tie-breaker) - the variable that occurs in the most constraints

Value ordering: fail last

  • least-constraining-value - a value that rules out the fewest choices for the neighbouring variables in the constraint graph (see the sketch after this list)
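A minimal backtracking sketch that wires these heuristics together for a binary inequality CSP such as map coloring (the `variables` / `domains` / `neighbours` representation is an assumption):

```python
def legal_values(var, domains, neighbours, assignment):
    """Values of var not already used by an assigned neighbour."""
    used = {assignment[n] for n in neighbours[var] if n in assignment}
    return [v for v in domains[var] if v not in used]

def backtracking_search(variables, domains, neighbours, assignment=None):
    """MRV + degree heuristic for variable ordering, LCV for value ordering."""
    assignment = assignment or {}
    if len(assignment) == len(variables):
        return assignment
    unassigned = [v for v in variables if v not in assignment]
    # Fail first: minimum remaining values, degree heuristic as tie-breaker.
    var = min(unassigned,
              key=lambda v: (len(legal_values(v, domains, neighbours, assignment)),
                             -len(neighbours[v])))
    # Fail last: try the value that rules out the fewest choices for the
    # (still unassigned) neighbouring variables first.
    def ruled_out(value):
        return sum(value in legal_values(n, domains, neighbours, assignment)
                   for n in neighbours[var] if n not in assignment)
    for value in sorted(legal_values(var, domains, neighbours, assignment), key=ruled_out):
        assignment[var] = value
        result = backtracking_search(variables, domains, neighbours, assignment)
        if result is not None:
            return result
        del assignment[var]
    return None
```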

8. Explain the value of information - Can it be calculated without additional information?

Value of Information

$$\text{value} = (\text{avg. value of the best action after obtaining the information}) - (\text{avg. value of the best action before obtaining it})$$

Value of perfect information VPI (= expected value of information)

It can be calculated without actually having the additional information - we just average over all possible values of the evidence.

Let's say the exact evidence $e_j$ (= perfect information) of the random variable $E_j$ is currently unknown.

We define:

Best action $\alpha$ before learning $E_j = e_j$, maximized over all actions $\textcolor{pink}{a}$:

$$EU(\alpha \mid \mathbf{e}) = \max_{\textcolor{pink}{a}} \sum_{s'} P(\operatorname{RESULT}(\textcolor{pink}{a}) = s' \mid \textcolor{pink}{a}, \mathbf{e}) \cdot U(s')$$

Best action $\alpha_{e_j}$ after learning $E_j = e_j$, maximized over all actions $\textcolor{pink}{a}$:

$$EU(\alpha_{e_j} \mid \mathbf{e}, e_j) = \max_{\textcolor{pink}{a}} \sum_{s'} P(\operatorname{RESULT}(\textcolor{pink}{a}) = s' \mid \textcolor{pink}{a}, \mathbf{e}, e_j) \cdot U(s')$$

The value of learning the exact evidence is what discovering it for ourselves would be worth under $\mathbf{e}$, obtained by averaging over all possible values $e_{jk}$ of $E_j$:

$$VPI_{\mathbf{e}}(E_j) = \left(\sum_k P(E_j = e_{jk} \mid \mathbf{e}) \cdot EU(\alpha_{e_{jk}} \mid \mathbf{e}, E_j = e_{jk})\right) - EU(\alpha \mid \mathbf{e})$$
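A tiny worked sketch of this formula; the numbers (two actions, one binary evidence variable) are made up for illustration:

```python
# Hypothetical numbers: two actions, one binary evidence variable E_j.
p_e = {"good": 0.6, "bad": 0.4}            # P(E_j = e_jk | e)
eu = {                                     # EU(a | e, E_j = e_jk)
    "act": {"good": 100, "bad": -40},
    "skip": {"good": 0, "bad": 0},
}

# EU(alpha | e): best action before learning E_j (averaging over its values).
eu_before = max(sum(p_e[k] * eu[a][k] for k in p_e) for a in eu)

# Sum_k P(E_j = e_jk | e) * EU(alpha_{e_jk} | e, E_j = e_jk): best action per outcome.
eu_after = sum(p_e[k] * max(eu[a][k] for a in eu) for k in p_e)

vpi = eu_after - eu_before
print(eu_before, eu_after, vpi)   # 44.0 60.0 16.0
```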

9. Explain the expected utility and maximum expected utility

Utility function

$U(s)$ - desirability of a state

Expected Utility (average utility value)

It is implicit that $s'$ can follow from the current state $s$.

$$EU(a \mid \mathbf{e}) = \sum_{s'} P(\operatorname{RESULT}(a) = s' \mid a, \mathbf{e}) \cdot U(s')$$

Sum over states $s'$ of: the probability of the state occurring after the action, times its utility.

Principle of maximum expected utility MEU

A rational agent should choose the action that maximizes the agent's expected utility:

$$\text{action} = \underset{a}{\operatorname{argmax}} \; EU(a \mid \mathbf{e})$$
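A minimal sketch of these two formulas; the transition probabilities and utilities are made-up placeholders:

```python
# Hypothetical model: P(RESULT(a) = s' | a, e) and U(s') for two actions.
p_result = {
    "go":   {"arrived": 0.7, "stuck": 0.3},
    "wait": {"arrived": 0.1, "stuck": 0.9},
}
utility = {"arrived": 10, "stuck": -2}

def expected_utility(action):
    """EU(a | e) = sum over s' of P(RESULT(a) = s' | a, e) * U(s')."""
    return sum(p * utility[s] for s, p in p_result[action].items())

# MEU principle: pick the action with maximum expected utility.
best = max(p_result, key=expected_utility)
print({a: expected_utility(a) for a in p_result}, "->", best)
# {'go': 6.4, 'wait': -0.8} -> go
```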

10. Construct a STRIPS action

Install a piece of software
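One possible way to write such an action; the predicate names (Available, Compatible, FreeDiskSpace, Installed) are assumptions, and STRIPS restricts the precondition and effects to conjunctions of literals:

```python
# Hypothetical STRIPS action for installing software "sw" on machine "m".
install_action = {
    "name":    "Install(sw, m)",
    "precond": ["Available(sw)", "Compatible(sw, m)", "FreeDiskSpace(m)"],  # conjunction only
    "add":     ["Installed(sw, m)"],                                        # add list
    "delete":  ["FreeDiskSpace(m)"],                                        # delete list
}

def applicable(action, state):
    """A STRIPS action is applicable if all precondition literals hold in the state."""
    return all(p in state for p in action["precond"])

def apply_action(action, state):
    """Successor state: (state - delete list) + add list."""
    return (state - set(action["delete"])) | set(action["add"])

state = {"Available(sw)", "Compatible(sw, m)", "FreeDiskSpace(m)"}
if applicable(install_action, state):
    # FreeDiskSpace(m) is removed, Installed(sw, m) is added.
    print(apply_action(install_action, state))
```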

11. Explain the learning curve and mention reasons for its non-optimality

How do we know that $h \approx f$? (Hume's Problem of Induction)

Learning curve = % correct on the test set when using $h$, plotted as a function of training-set size.

Learning curve depends on realizability of the target function.

Non-realizability can be due to

  • missing attributes
  • too restrictive hypothesis space (e.g., linear function)
  • redundant expressiveness (e.g., loads of irrelevant attributes)
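A minimal sketch of how such a curve can be computed (the majority-class learner and the example interface are placeholder assumptions; any learner that returns a hypothesis h works):

```python
import random

def learning_curve(examples, learn, test_fraction=0.3, steps=10, trials=20):
    """Average % correct on a held-out test set as a function of training-set size."""
    curve = []
    for step in range(1, steps + 1):
        accuracies = []
        for _ in range(trials):
            random.shuffle(examples)
            split = int(len(examples) * (1 - test_fraction))
            train, test = examples[:split], examples[split:]
            train = train[: max(1, int(len(train) * step / steps))]
            h = learn(train)                                   # hypothesis h
            accuracies.append(sum(h(x) == y for x, y in test) / len(test))
        curve.append((step / steps, sum(accuracies) / trials))
    return curve   # [(fraction of training data used, avg. % correct), ...]

# Placeholder learner: always predict the majority class of the training data.
def majority_learner(train):
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority
```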