Making Decisions, VPI

Human Irrationality

Evidence shows that humans are predictably irrational.

Normative theory How a rational agent should act (= decision theory)

Descriptive theory How actual agents really do act (e.g., humans)

Certainty Effect

  • Allais Paradox

    People are given choices among 4 lotteries (first A vs. B, then C vs. D):

    A: 80% chance of $4000

    B: 100% chance of $3000

    C: 20% chance of $4000

    D: 25% chance of $3000

    Most people (descriptive):

    $B \succ A$ (taking the sure thing)

    $C \succ D$ (taking the higher EMV)

    Trying to find the human's utility function is not possible: there is no utility function consistent with these choices.

    Let's set $U(\$0) = 0$.

    B \succ A \Rightarrow U(\$3000) > 0.8 \cdot U(\$4000)

    C \succ D \Rightarrow 0.25 \cdot U(\$3000) < 0.2 \cdot U(\$4000) \Leftrightarrow U(\$3000) < 0.8 \cdot U(\$4000)

People are strongly attracted to gains that are certain.

Reasons: not wanting to calculate probabilities, not trusting the given probabilities, being risk averse, not wanting to regret a decision, …
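The contradiction can be checked numerically; a minimal sketch (the `emv` helper and variable names are mine, the numbers come from the lotteries above):

```python
# Expected monetary values (EMV) of the four Allais lotteries.
def emv(prob, prize):
    return prob * prize

A = emv(0.80, 4000)  # 3200.0
B = emv(1.00, 3000)  # 3000.0
C = emv(0.20, 4000)  # 800.0
D = emv(0.25, 3000)  # 750.0

# A and C have the higher EMV, yet most people pick B over A.
# With U($0) = 0, the two typical choices are contradictory:
#   B > A  requires  U(3000) > 0.8 * U(4000)
#   C > D  requires  0.2 * U(4000) > 0.25 * U(3000),
#          i.e.      U(3000) < 0.8 * U(4000)
print(A > B, C > D)  # True True
```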

Ambiguity aversion

  • Ellsberg Paradox

    Prizes fixed, probabilities are not fully known.

    Payoff will depend on the color of a ball chosen from an urn (1/3 red balls, 2/3 black and yellow balls, but you don’t know how many black and how many yellow).

    A: $100 for a red ball

    B: $100 for a black ball

    C: $100 for a red or yellow ball

    D: $100 for a black or yellow ball

    If you think there are more red balls than black balls, you should choose A over B and C over D.

    If you think there are fewer red balls than black balls, you should choose B over A and D over C.

    A: 1/3 chance

    B: [0, 2/3] chance

    C: [1/3, 1] chance

    D: 2/3 chance

    Most people prefer A over B and D over C, which is not rational: no single probability assignment is consistent with both preferences.

People prefer known probabilities over unknown ones.
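The inconsistency can also be seen by computing the payoffs for every possible urn composition; a small sketch (function and variable names are mine, the setup follows the urn above):

```python
# Expected payoffs of the four Ellsberg bets as a function of the
# unknown fraction p_black of black balls (red is fixed at 1/3,
# so p_black + p_yellow = 2/3).
def payoffs(p_black):
    p_red = 1 / 3
    p_yellow = 2 / 3 - p_black
    a = 100 * p_red
    b = 100 * p_black
    c = 100 * (p_red + p_yellow)
    d = 100 * (p_black + p_yellow)  # always 100 * 2/3
    return a, b, c, d

# Whatever p_black is, a - b equals c - d, so preferring
# A over B while also preferring D over C is inconsistent.
for p in (0.0, 0.2, 1 / 3, 0.5, 2 / 3):
    a, b, c, d = payoffs(p)
    assert abs((a - b) - (c - d)) < 1e-9
```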

Decision Networks

= influence diagrams.

General framework for rational decisions - return the action with the highest expected utility.

Decision networks are an extension of Bayesian networks.

Represents:

  1. current state
  1. possible actions
  1. resulting state from actions
  1. utility of each state

Node types:

  1. Chance nodes (ovals)

    Random variables (uncertainty), as in a Bayesian network

  1. Decision nodes (rectangles)

    Decision maker has choice of action

  1. Utility nodes (diamonds)

    Utility function

Evaluating decision networks

Actions are selected by evaluating the decision network for each possible setting of the decision node. → The action with the highest expected utility gets chosen.

Decision network algorithm

set evidence variables for the current state
for each possible value of the decision node
1. set the decision node to that value
2. calculate the posterior probabilities for the parent nodes of the utility node,
using a standard probabilistic inference algorithm (on the underlying Bayesian network)
3. calculate the resulting expected utility for the action
return the action with the highest expected utility
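The algorithm can be sketched on a toy network with one chance node (Weather), one decision node (Umbrella), and one utility node depending on both. All names and numbers here are invented for illustration:

```python
# Toy decision network evaluation.
P_weather = {"rain": 0.3, "sun": 0.7}   # posterior given current evidence
utility = {
    ("rain", "take"): 70, ("rain", "leave"): 0,
    ("sun", "take"): 20, ("sun", "leave"): 100,
}

def expected_utility(action):
    # Steps 1-3: fix the decision, then average the utility over the
    # posterior of the utility node's chance parent.
    return sum(P_weather[w] * utility[(w, action)] for w in P_weather)

best = max(("take", "leave"), key=expected_utility)
print(best, expected_utility(best))  # leave 70.0
```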

Information Value Theory

Information must first be acquired before it can be analyzed. We want to choose which information to acquire.

Value of Information

 \text{value} = (\text{avg. value of best action after obtaining}) - (\text{avg. value of best action before obtaining}) 

Value of perfect information (VPI) (= expected value of information)

Let's say the exact evidence $e_j$ (= perfect information) of the random variable $E_j$ is currently unknown.

We define:

Best action $\alpha$ before learning $E_j = e_j$, maximized over all actions $a$:

 EU(\alpha \mid \mathbf{e}) = \max_{a} \sum_{s'} P(\operatorname{RESULT}(a) = s' \mid a, \mathbf{e}) \cdot U(s') 

Best action $\alpha_{e_j}$ after learning $E_j = e_j$, maximized over all actions $a$:

 EU(\alpha_{e_j} \mid \mathbf{e}, e_j) = \max_{a} \sum_{s'} P(\operatorname{RESULT}(a) = s' \mid a, \mathbf{e}, e_j) \cdot U(s') 

The value of learning $E_j$ given current evidence $\mathbf{e}$ is obtained by averaging over all possible values $e_{jk}$ of $E_j$ and subtracting the expected utility of the current best action:

 VPI_{\mathbf{e}}(E_j) = \left(\sum_{k} P(E_j = e_{jk} \mid \mathbf{e}) \cdot EU(\alpha_{e_{jk}} \mid \mathbf{e}, E_j = e_{jk})\right) - EU(\alpha \mid \mathbf{e}) 
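The formula can be tried on a tiny self-contained example: an agent that may learn a single evidence variable (Weather) before choosing between two actions. All probabilities and utilities are invented:

```python
# VPI of learning one evidence variable before acting.
P = {"rain": 0.3, "sun": 0.7}            # P(E_j = e_jk | e)
U = {("rain", "take"): 70, ("rain", "leave"): 0,
     ("sun", "take"): 20, ("sun", "leave"): 100}
actions = ("take", "leave")

def best_eu(belief):
    # EU of the best action under a belief over Weather.
    return max(sum(belief[w] * U[(w, a)] for w in belief) for a in actions)

eu_before = best_eu(P)                   # EU(alpha | e)
# After learning Weather = w, the belief collapses onto w;
# average over all values we might learn:
eu_after = sum(P[w] * best_eu({v: float(v == w) for v in P}) for w in P)
vpi = eu_after - eu_before               # VPI_e(Weather)
print(eu_before, eu_after, vpi)          # 70.0 91.0 21.0
```

Note that the VPI comes out non-negative, as the next property requires.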

VPI is non-negative

In the worst case, one can just ignore the received information.

 \forall \mathbf{e}, E_j: \quad VPI_{\mathbf{e}}(E_j) \geq 0 

Important: this is about the expected value, not the actual value.

Additional information can lead to plans that turn out to be worse than the original plan.

Example: a medical test that gives a false positive result may lead to unnecessary surgery; but that does not mean that the test shouldn’t be done.

VPI is non-additive

The VPI can get higher or lower as new information is acquired, since combined pieces of information can have different effects.

 VPI_{\mathbf{e}}(E_j, E_k) \neq VPI_{\mathbf{e}}(E_j) + VPI_{\mathbf{e}}(E_k) 

VPI is order independent

 VPI_{\mathbf{e}}(E_j, E_k) = VPI_{\mathbf{e}}(E_j) + VPI_{\mathbf{e}, e_j}(E_k) = VPI_{\mathbf{e}}(E_k) + VPI_{\mathbf{e}, e_k}(E_j) 

Decision-theoretic Expert Systems

Decision analysis

Decision theory applied to actual decision problems.

The decision maker states preferences; the decision analyst then uses them to find the optimal action, or to check whether an automated system behaves correctly.

Expert systems

Early expert system research concentrated on answering questions rather than on making decisions.

Decision networks allow us to recommend optimal decisions, reflecting preferences as well as the available evidence.

The process of creating a decision-theoretic expert system

e.g., for selecting a medical treatment for congenital heart disease (aortic coarctation) in children

  1. create a causal model

    (e.g., determine symptoms, treatments, disorders, outcomes, etc.)

  1. simplify to a qualitative decision model
  1. assign probabilities

    (e.g., from patient databases, literature studies, experts' subjective assessments, etc.)

  1. assign utilities

    (e.g., create a scale from best to worst outcome and give each a numeric value)

  1. verify and refine the model: evaluate the system against correct input-output pairs, a so-called gold standard
  1. perform sensitivity analysis

    (i.e., check whether the best decision is sensitive to small changes in the assigned probabilities and utilities.)
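The sensitivity-analysis step can be illustrated with a toy check: perturb a probability and see whether the recommended decision flips. All names and numbers below are invented:

```python
# Toy sensitivity analysis: vary P(rain) and check whether the
# best action changes.
U = {("rain", "take"): 70, ("rain", "leave"): 0,
     ("sun", "take"): 20, ("sun", "leave"): 100}

def best_action(p_rain):
    def eu(a):
        return p_rain * U[("rain", a)] + (1 - p_rain) * U[("sun", a)]
    return max(("take", "leave"), key=eu)

base = best_action(0.30)
# The decision is robust to +/- 0.05 changes in P(rain):
for delta in (-0.05, 0.05):
    assert best_action(0.30 + delta) == base
print(base)  # leave
```

If small perturbations did flip the decision, the perturbed probabilities would need to be estimated more carefully before trusting the system's recommendation.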