Making Decisions, VPI
Human Irrationality
Evidence shows that humans are predictably irrational.
Normative theory: how a rational agent should act (= decision theory).
Descriptive theory: how actual agents (i.e., humans) really do act.
Certainty Effect
Allais Paradox
People are given the choice between 4 lotteries:
- A: 80% chance of $4000
- B: 100% chance of $3000
- C: 20% chance of $4000
- D: 25% chance of $3000
Most people (descriptive):
- choose B over A: taking the sure thing, even though A has the higher EMV ($3200 vs. $3000)
- choose C over D: taking the higher EMV ($800 vs. $750)
Trying to find the human's utility function is not possible: there is no utility function consistent with these choices.
Let's set U($0) = 0. Preferring B over A requires U($3000) > 0.8 U($4000), while preferring C over D requires 0.2 U($4000) > 0.25 U($3000), i.e. 0.8 U($4000) > U($3000) - a contradiction.
People are strongly attracted to gains that are certain.
Reasons: not wanting to calculate probabilities, not trusting the given probabilities, being risk averse, not wanting to regret a decision.
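The contradiction can be checked numerically. This is a small sketch (the grid of candidate utility values is illustrative): no pair of utilities with U($0) = 0 satisfies both typical Allais choices.

```python
# Typical Allais choices: B over A (sure thing) and C over D (higher EMV).
# With U($0) = 0 these impose contradictory constraints on the utilities.
def consistent(u3000, u4000):
    prefers_b_over_a = u3000 > 0.8 * u4000          # certainty effect
    prefers_c_over_d = 0.2 * u4000 > 0.25 * u3000   # i.e. 0.8*u4000 > u3000
    return prefers_b_over_a and prefers_c_over_d

# Search a grid of candidate utility values: none satisfies both choices.
violations = [(u3, u4)
              for u3 in range(1, 101)
              for u4 in range(1, 101)
              if consistent(u3, u4)]
```

The two constraints demand both U($3000) > 0.8 U($4000) and U($3000) < 0.8 U($4000), so the list stays empty for any grid.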
Ambiguity aversion
Ellsberg Paradox
Prizes fixed, probabilities are not fully known.
Payoff depends on the color of a ball drawn from an urn containing 1/3 red balls and 2/3 black and yellow balls (but you don't know how many are black and how many are yellow).
- A: $100 for a red ball (chance: 1/3)
- B: $100 for a black ball (chance: somewhere in [0, 2/3])
- C: $100 for a red or yellow ball (chance: somewhere in [1/3, 1])
- D: $100 for a black or yellow ball (chance: 2/3)
If you think there are more red balls than black balls, you should choose A over B and C over D.
If you think there are fewer red balls than black balls, you should choose B over A and D over C.
Most people prefer A over B and D over C, which is not rational under either belief.
People prefer gambles whose probabilities are known.
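The inconsistency can be checked directly: for any assumed fraction p of black balls, preferring A over B forces preferring C over D. A sketch (the grid of p values is illustrative):

```python
# p = assumed fraction of black balls, p in [0, 2/3];
# the yellow fraction is then 2/3 - p.
def win_probs(p):
    return {"A": 1 / 3,    # red
            "B": p,        # black
            "C": 1 - p,    # red or yellow = 1/3 + (2/3 - p)
            "D": 2 / 3}    # black or yellow

# The popular pattern (A over B and D over C) fits no single value of p:
matching_ps = [i / 100 for i in range(0, 67)
               if win_probs(i / 100)["A"] > win_probs(i / 100)["B"]
               and win_probs(i / 100)["D"] > win_probs(i / 100)["C"]]
```

A over B requires p < 1/3, while D over C requires p > 1/3, so `matching_ps` is empty.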
Decision Networks
= influence diagrams.
General framework for rational decisions - return the action with the highest expected utility.
Decision networks are an extension of Bayesian networks.
Presents:
- current state
- possible actions
- resulting state from actions
- utility of each state
Node types:
- Chance nodes (ovals): random variables representing uncertainty, as in a Bayesian network
- Decision nodes (rectangles): points where the decision maker has a choice of action
- Utility nodes (diamonds): represent the utility function
Evaluating decision networks
Actions are selected by evaluating the decision network for each possible setting of the decision node. → Action with highest utility gets chosen.
Decision network algorithm
1. Set the evidence variables for the current state.
2. For each possible value of the decision node:
   a. Set the decision node to that value.
   b. Calculate the posterior probabilities for the parent nodes of the utility node, using a standard probabilistic inference algorithm (on the underlying Bayesian network).
   c. Calculate the resulting expected utility for the action.
3. Return the action with the highest expected utility.
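The evaluation loop can be sketched on a toy decision network. All names and numbers below (Weather, Forecast, the umbrella decision, the utilities) are invented for illustration:

```python
# Toy decision network: Forecast is observed evidence, Weather is a chance
# node, Umbrella is the decision; utility depends on (action, weather).
P_weather = {"rain": 0.3, "sun": 0.7}
P_forecast = {("bad", "rain"): 0.8, ("bad", "sun"): 0.2,
              ("good", "rain"): 0.2, ("good", "sun"): 0.8}
utility = {("take", "rain"): 70, ("take", "sun"): 80,
           ("leave", "rain"): 0, ("leave", "sun"): 100}

def posterior_weather(forecast):
    # posterior over the utility node's parent, via Bayes' rule
    joint = {w: P_forecast[(forecast, w)] * P_weather[w] for w in P_weather}
    z = sum(joint.values())
    return {w: p / z for w, p in joint.items()}

def best_action(forecast):
    post = posterior_weather(forecast)
    # try each value of the decision node, compute its expected utility
    eu = {a: sum(post[w] * utility[(a, w)] for w in post)
          for a in ("take", "leave")}
    return max(eu, key=eu.get)   # action with highest expected utility
```

With these numbers, a bad forecast shifts the posterior toward rain enough to make taking the umbrella optimal, while a good forecast makes leaving it optimal.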
Example: Airport siting problem
Notice that, because the Noise, Deaths, and Cost chance nodes refer to future states, they can never have their values set as evidence variables.
Information Value Theory
Data must first be acquired before analysis; we therefore want to choose what information to acquire.
Example: Doctor
Doctor is not immediately provided with all possible diagnostic tests and questions.
Tests are expensive and sometimes hazardous (directly and because of associated delays until treatment).
The importance of a test depends on two factors:
- whether the test results would lead to a significantly better treatment plan
- how likely the various test results are.
Value of Information
Example: Oil Company
There are n indistinguishable blocks; exactly one block has oil worth C dollars.
The others are worthless.
Each block costs C/n dollars.
How much is the information whether block 3 has oil or not worth to the company?
- With probability 1/n, block 3 has oil: the company will buy it and profit C - C/n = (n-1)C/n dollars.
- With probability (n-1)/n, block 3 has no oil: the company will buy a different block, because the probability of finding oil in each of the other blocks changes from 1/n to 1/(n-1). Average profit: C/(n-1) - C/n = C/(n(n-1)) dollars.
We then calculate the expected / average profit with the information:
(1/n) * (n-1)C/n + ((n-1)/n) * C/(n(n-1)) = (n-1)C/n^2 + C/n^2 = C/n
This is equal to the price we would pay for a block if we did not have this information, so the information is worth exactly one block price, C/n.
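The arithmetic can be checked with concrete numbers. This sketch assumes n indistinguishable blocks, one of which holds oil worth C, at a block price of C/n; n = 10 and C = 1,000,000 are made up:

```python
n, C = 10, 1_000_000     # made-up numbers: n blocks, oil worth C dollars
price = C / n            # assumed price of one block

# Block 3 has oil (probability 1/n): buy it, profit C - price.
profit_if_oil = C - price
# Block 3 is dry (probability (n-1)/n): buy another block; its chance of
# holding the oil rises from 1/n to 1/(n-1).
profit_if_dry = C / (n - 1) - price

expected_profit_with_info = ((1 / n) * profit_if_oil
                             + ((n - 1) / n) * profit_if_dry)
# This equals the block price C/n, which is the value of the information.
```

Without the information, buying a random block yields C/n - C/n = 0 in expectation, so the information alone accounts for the whole expected profit.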
Example: Oil Company (simplified)
There are n boxes.
Opening a box costs C/n dollars.
One box contains C dollars; the others are worthless.
How much is the information whether box 3 contains the prize or not worth to the company?
- With probability 1/n it does: we pay the price C/n to open it and take the money inside, a profit of C - C/n dollars.
- With probability (n-1)/n it does not: we open another box - the probability of finding the prize changes from 1/n to 1/(n-1), and on average we make C/(n-1) - C/n = C/(n(n-1)) dollars.
We then calculate the average profit with the information:
(1/n)(C - C/n) + ((n-1)/n) * C/(n(n-1)) = C/n
Therefore this information has a value of C/n.
This is what it would cost us to find it out ourselves (by opening box 3), and it is the most we should be willing to pay someone to figure it out for us.
Value of perfect information, VPI (= expected value of information)
Let's say the exact evidence E_j (= perfect information) of a random variable is currently unknown, while evidence e has already been observed.
We define:
- Best action α before learning E_j: EU(α | e) = max over all actions a of Σ_s' P(Result(a) = s' | a, e) U(s')
- Best action α_{e_j} after learning E_j = e_j: EU(α_{e_j} | e, e_j) = max over all actions a of Σ_s' P(Result(a) = s' | a, e, e_j) U(s')
The value of learning the exact evidence E_j is the cost of discovering it for ourselves, obtained by averaging over all possible values e_j:
VPI_e(E_j) = ( Σ_{e_j} P(E_j = e_j | e) EU(α_{e_j} | e, e_j) ) - EU(α | e)
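The definition can be exercised on a tiny example. The hidden variable, actions, probabilities, and utilities below are all invented for illustration:

```python
# Hidden variable E with two values; two actions with known utilities.
P_E = {"e1": 0.4, "e2": 0.6}
U = {"a1": {"e1": 100, "e2": 0},
     "a2": {"e1": 20, "e2": 50}}

def expected_utility(action, dist):
    return sum(dist[e] * U[action][e] for e in dist)

# Best action before learning E: maximize expected utility under the prior.
eu_before = max(expected_utility(a, P_E) for a in U)

# After learning E = e_j the distribution collapses onto e_j; average the
# best achievable utility over the possible observations e_j.
eu_after = sum(P_E[e] * max(U[a][e] for a in U) for e in P_E)

vpi = eu_after - eu_before   # non-negative by construction
```

Because a maximum of averages never exceeds the average of maxima, `vpi` can never be negative, matching the non-negativity property below.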
VPI is non-negative
In the worst case, one can just ignore the received information.
Important: this is about the expected value, not the actual value.
Additional information can lead to plans that turn out to be worse than the original plan.
Example: a medical test that gives a false positive result may lead to unnecessary surgery; but that does not mean that the test shouldn’t be done.
VPI is non-additive
The VPI can get higher or lower as new information is acquired, since combined information can have different effects: in general, VPI_e(E_j, E_k) ≠ VPI_e(E_j) + VPI_e(E_k).
VPI is order independent
VPI_e(E_j, E_k) = VPI_e(E_j) + VPI_{e,e_j}(E_k) = VPI_e(E_k) + VPI_{e,e_k}(E_j)
Decision-theoretic Expert Systems
Decision analysis
Decision theory applied to actual decision problems.
The decision maker states preferences, which the decision analyst then uses to find the optimal action or to check whether an automated system behaves correctly.
Expert systems
Early expert system research concentrated on answering questions rather than on making decisions.
Decision networks allow such systems to recommend optimal decisions, reflecting preferences as well as the available evidence. Systems built this way:
- are able to make decisions and use the value of information to decide whether to acquire it
- can calculate their sensitivity to small changes in probability and utility assessments.
The process of creating a decision-theoretic expert system
(e.g., for selecting a medical treatment for congenital heart disease (aortic coarctation) in children)
1. Create a causal model (e.g., determine symptoms, treatments, disorders, outcomes, etc.).
2. Simplify to a qualitative decision model.
3. Assign probabilities (e.g., from patient databases, literature studies, experts' subjective assessments, etc.).
4. Assign utilities (e.g., create a scale from best to worst outcome and give each a numeric value).
5. Verify and refine the model: evaluate the system against correct input-output pairs, a so-called gold standard.
6. Perform a sensitivity analysis (i.e., check whether the best decision is sensitive to small changes in the assigned probabilities and utilities).
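A one-way sensitivity check of this kind can be sketched as follows; the disorder probability, treatment options, and utilities are invented placeholders:

```python
# Utilities of each treatment decision given the patient's true state.
U = {"treat": {"sick": 80, "healthy": 60},
     "wait":  {"sick": 0,  "healthy": 100}}

def best_decision(p_sick):
    eu = {a: p_sick * U[a]["sick"] + (1 - p_sick) * U[a]["healthy"]
          for a in U}
    return max(eu, key=eu.get)

baseline = best_decision(0.40)   # decision at the assessed probability
# Perturb the assessed probability; if the recommended decision never
# changes, it is insensitive to estimation errors of this size.
perturbed = {best_decision(0.40 + d) for d in (-0.05, 0.0, 0.05)}
```

Here the decision flips only at p = 1/3, so perturbations of ±0.05 around 0.40 leave the recommendation unchanged; a flip inside the perturbation range would signal that the probability estimate needs more care.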