Agents


Agent = Architecture (Hardware) + Program (Software)

Environment: the difference between agent and environment is having a goal (the agent has one).

Sensors (input) to perceive the environment

Actuators (output) for actions

Agent function: program running on the architecture (hardware) to produce $f$.

$f$: percept histories / sequences $\mapsto$ actions

$f: P^* \mapsto A$
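A minimal sketch of such an agent function in Python, assuming a toy two-location vacuum world (the locations, percept format, and actions are illustrative, not from the notes):

```python
# Minimal sketch of an agent function f: P* -> A in a toy two-location vacuum world.
# Locations, percept format, and actions are illustrative assumptions.

def agent_function(percept_history):
    """Map a sequence of percepts (location, status) to an action."""
    location, status = percept_history[-1]  # this simple f only looks at the latest percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"

print(agent_function([("A", "Dirty")]))                   # -> Suck
print(agent_function([("A", "Dirty"), ("A", "Clean")]))   # -> Right
```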

Important aspects

exploration get new data about the world, experiment

learning from experience, successful / failed actions

autonomy extend initial knowledge by experience

Rationality

Being rational does not mean being:

omniscient (knowing everything) percepts may lack relevant information

clairvoyant (seeing into the future) predictions are not always accurate

successful (ideal outcome)

Rational agent

Chooses the action that maximizes expected utility (measured with a performance / success measure) for any given percept sequence.

Decisions based on evidence:

  • percept history
  • built-in knowledge of the environment, e.g. fundamental laws such as physics
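A hedged sketch of this decision rule: among the available actions, pick the one with the highest expected value of the performance measure. The actions, outcomes, and probabilities below are made-up illustrations:

```python
# Sketch: rational choice = argmax over actions of the expected performance score.
# Action names, outcomes, and probabilities are illustrative assumptions.

def expected_utility(outcomes):
    """outcomes: list of (probability, performance_score) pairs for one action."""
    return sum(p * score for p, score in outcomes)

def rational_choice(action_models):
    """action_models: dict mapping action -> list of (probability, score) outcomes."""
    return max(action_models, key=lambda a: expected_utility(action_models[a]))

beliefs = {
    "brake":      [(0.95, +10), (0.05, -5)],   # expected value:  9.25
    "accelerate": [(0.60, +20), (0.40, -50)],  # expected value: -8.0
}
print(rational_choice(beliefs))  # -> brake
```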

PEAS

Task description of agents, e.g. an autonomous car:

Performance measure safety, destination, profits, legality, comfort, . . .

Environment streets/freeways, traffic, pedestrians, weather, . . .

Actuators steering, accelerator, brake, horn, speaker/display, . . .

Sensors video, accelerometers, gauges, engine sensors, keyboard, GPS, . . .
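The PEAS description is just a structured record; a sketch using a Python dataclass, filled with the autonomous-car entries above:

```python
# Sketch: PEAS task description as a simple record (field names are my own choice).
from dataclasses import dataclass

@dataclass
class PEAS:
    performance_measure: list
    environment: list
    actuators: list
    sensors: list

autonomous_car = PEAS(
    performance_measure=["safety", "destination", "profits", "legality", "comfort"],
    environment=["streets/freeways", "traffic", "pedestrians", "weather"],
    actuators=["steering", "accelerator", "brake", "horn", "speaker/display"],
    sensors=["video", "accelerometers", "gauges", "engine sensors", "keyboard", "GPS"],
)
```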

Environment Types

fully observable vs. partially observable

whether sensors can detect all relevant properties

single-agent vs. multi-agent

single agent, or multiple with cooperation, competition

deterministic vs. stochastic

whether the next state is completely determined by the current state and the performed action (deterministic), or whether multiple successor states are possible and cannot be foreseen exactly, only their probabilities being known (stochastic)

episodic vs. sequential

Episodic: the choice of action depends only on the current episode; the percept history is divided into independent episodes.

Sequential: the current decision can affect all future ones, so the entire history must be stored (accessing these memories lowers performance).

static vs. dynamic vs. semi-dynamic

Static: the world does not change during the reasoning / processing time of the agent (it waits for the agent's response).

Semi-dynamic: the environment is static, but the performance score decreases with processing time.

discrete vs. continuous

whether the properties of the world take discrete or continuous values

known vs. unknown

state of knowledge about the "laws of physics" of the environment
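As an example of applying these dimensions, a sketch classifying the autonomous-car environment from the PEAS example (the labels are the usual textbook assessment, stated here as an assumption):

```python
# Sketch: tagging the autonomous-car environment along the dimensions above.
# The concrete labels are my own assessment, not taken from the notes.
autonomous_car_env = {
    "observability": "partially observable",  # sensors miss e.g. other drivers' intentions
    "agents":        "multi-agent",           # other cars, pedestrians
    "determinism":   "stochastic",            # action outcomes are not fully predictable
    "episodes":      "sequential",            # current decisions affect later ones
    "dynamics":      "dynamic",               # traffic changes while the agent deliberates
    "values":        "continuous",            # positions, speeds, steering angles
    "knowledge":     "known",                 # the "laws of physics" are known in advance
}
```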

Agent Types

Sorted by capabilities / generality. All these can be turned into learning agents later.

1. Simple reflex agents (see the code sketch after this list)

2. Model-based reflex agents

Add-on: internal state, model of the environment and changes in it, ability to deal with uncertainty

3. Goal-based agents

Add-on: explicit goals, ability to plan into the future, model outcomes for predictions

4. Utility-based agents

Add-on: take happiness (the utility of the current state) into account → humans are utility-based agents

5. Learning agents

All of the previous agent types can be turned into learning agents.

Able to learn the underlying functions of a utility-based agent.

Agents are able to improve themselves by studying their own experiences.
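A compact sketch contrasting the first two agent types, again in the toy vacuum world; the rules, locations, and internal model are illustrative placeholders:

```python
# Sketch: simple reflex agent vs. model-based reflex agent (illustrative rules/state).

def simple_reflex_agent(percept):
    """Acts on the current percept only, via condition-action rules."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"

class ModelBasedReflexAgent:
    """Add-on: internal state, i.e. a model of the world built from the percept history."""

    def __init__(self):
        self.world = {}  # internal model: location -> last known status

    def act(self, percept):
        location, status = percept
        self.world[location] = status  # update the model with the new evidence
        if status == "Dirty":
            return "Suck"
        # use the model: move toward a location not yet known to be clean
        unknown = [loc for loc in ("A", "B") if self.world.get(loc) != "Clean"]
        if not unknown:
            return "NoOp"
        return "Left" if unknown[0] == "A" else "Right"
```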