Adaptive Constraint-Based Agents in Artificial Environments

[RESOURCES]   [The Orc Quest Example]   [The Nearby Wizard]   [Decision-Theoretic Planning]

[ Please note: The project has been discontinued as of May 31, 2005 and is superseded by the projects of the ii Labs. There won't be further updates to these pages. ]

Decision-Theoretic Planning

(Related publications: [PUBLink] [PUBLink])

The wizard invites you to dinner, and you gladly accept. A little while later, you are sitting in a comfortable armchair with a jar of the wizard's ethanol potion in your hands. The time has come to ask where he got his planning system from. The wizard's potion seems to have a paralyzing effect on your tongue, but finally you manage to put your question in reasonably comprehensible terms.

The wizard takes a while to answer and gives you an earnest look. Then he raises his eyebrows meaningfully: «Have you heard of EXCALIBUR?» You are seized with a shivering fit because EXCALIBUR is a popular name for swords used by computer players' avatars that generally enjoy tyrannizing harmless Orcs.

«No, no,» the wizard says appeasingly, «not the sword!» His voice takes on a vibrant tone: «Did you ever consider the possibility that you are part of a computer program and that your behavior is largely guided by a planning system?» He pauses briefly to let you reflect on all this question's implications. «Well,» he continues, casting a thunderbolt to conjure up a more threatening atmosphere, «what I mean is the EXCALIBUR Agent's Planning System!» He lowers his voice: «We are the agents. You understand?»

This sounds like one of those popular conspiracy theories. You are, however, not particularly well versed in philosophy and already have difficulty explaining why there are now two wizards and why all of the equipment in the room is swaying curiously. «Please put off describing the system for now,» you say, putting an end to this rather dubious discussion. «If resources are such a vital component, why aren't there other planning systems like the EXCALIBUR agent's planning system?»

«Well,» the wizards say, «I must admit that there have, of course, also been other approaches that focus on optimizing things other than plan length. These are mostly subsumed under the term decision-theoretic planning. But most of the research in this area aims to optimize probabilistic issues for reasoning under conditions of uncertainty (see Section [A Single-Plan Approach] for a description of probabilistic planning approaches). Although a goal like maximizing the probability of attaining a goal can also be interpreted as resource-related optimization (the probability being the resource to be maximized), probability is usually considered to behave monotonically, i.e., a plan cannot have a higher probability of attaining the goal than a subplan that also attains the goal. Your problem involving an action like using a magic potion has no counterpart here.
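The monotonicity the wizards mention is easy to illustrate: if a plan's success probability is the product of its actions' (assumed independent) success probabilities, appending further actions can never raise it above that of a subplan that already attains the goal. A tiny sketch with made-up numbers:

```python
def success_probability(action_probs):
    """Probability that every action in the plan succeeds,
    assuming independent action outcomes."""
    p = 1.0
    for q in action_probs:
        p *= q
    return p

# Hypothetical numbers: a subplan that already attains the goal...
subplan = [0.9, 0.8]
# ...and the same subplan extended by one more action.
plan = subplan + [0.95]

# Extending the plan can only keep or lower the success probability.
assert success_probability(plan) <= success_probability(subplan)
```

An action like drinking a magic potion, by contrast, can make the rest of the plan better or worse, so no such monotonic structure is available.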

But there are also resource-oriented planning systems that are able to handle more complex problems, e.g., Williamson and Hanks' Pyrrhus planning system [PUBLink] and Ephrati, Pollack and Milshtein's A*-based approach [PUBLink]. The difference between these and the EXCALIBUR agent's planning system is the conceptual approach.»

«Sorry, did you say A*? It's funny that you should mention this term because it sometimes comes mysteriously to mind - usually when I think about how to get to a certain place. Do you know what it means?»

«Well, all it means is that the EXCALIBUR agent's planning system has specialized heuristics for specific subtasks. Don't worry about it! But - A* brings me directly to the point. The resource-oriented planning systems mentioned above construct plans in a stepwise manner, e.g., action by action, and if they fail, they go back to an earlier partial plan and try to elaborate this. In order to choose a partial plan for elaboration - called refinement - an upper bound on the quality is computed for each partial plan. The plan that yields the highest upper bound is chosen for elaboration. Sounds like a great concept, doesn't it?»
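The refinement scheme the wizards describe - always elaborate the partial plan with the highest upper bound on quality - can be sketched as a best-first search. The following is a minimal generic illustration with hypothetical names (`refine_search` and its parameters), not the actual code of any of the systems mentioned:

```python
import heapq
import itertools

def refine_search(initial_partial, refine, is_complete, quality, upper_bound):
    """Best-first refinement: repeatedly elaborate the partial plan with the
    highest optimistic upper bound on quality, until no remaining bound can
    beat the best complete plan found so far."""
    counter = itertools.count()  # tie-breaker so plans themselves are never compared
    # heapq is a min-heap, so bounds are negated to pop the highest bound first.
    frontier = [(-upper_bound(initial_partial), next(counter), initial_partial)]
    best = None
    while frontier:
        neg_bound, _, partial = heapq.heappop(frontier)
        if best is not None and -neg_bound <= quality(best):
            break  # no remaining partial plan can improve on the best complete one
        if is_complete(partial):
            if best is None or quality(partial) > quality(best):
                best = partial
        else:
            for child in refine(partial):
                heapq.heappush(frontier, (-upper_bound(child), next(counter), child))
    return best
```

For example, with plans as tuples of action values, `quality = sum`, and an upper bound that adds the best possible value for every missing action, the search finds the maximum-quality complete plan. The scheme is only sound if `upper_bound` is truly optimistic - which is exactly the assumption the next paragraph calls into question for complex planning problems.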

«Probably not - since you ask!» you reply and grunt loudly, impressed by your own cleverness. The wizards give you a rather disconcerted look, but quickly collect themselves and continue: «Er, you're right, the concept is great for things like path finding, for which it is easy to determine bounds because the actions can only add values. But for complex planning problems it is not normally possible to calculate bounds on the quality value for a partial plan. We spoke about the problems with bounds before. Does a single action like building a trap for someone put bounds on the total fun you will have?»

«Definitely not! Things can actually get worse!» you answer bad-temperedly. «If I don't take precautions, the person trapped may be a player's avatar, and then the world is usually reset to a former state only to build a trap for the person who will build the trap.»

«Exactly,» the wizards continue, «The EXCALIBUR agent's planning system takes a different approach. It iteratively repairs complete grounded plans for which a concrete quality can be determined. The repair methods can exploit the quality information to improve the plan, e.g., by replacing a set of actions that cause a loss in quality. This technique, based on local-search methods, is not so - let's say unfocused - with respect to the plan quality, and there is practically no restriction on the kind of objective function to be optimized (see also ILOG's white paper [PUBLink]). The ASPEN system [PUBLink] is very similar to EXCALIBUR in this respect. To sum up, decision-theoretic planning is a very broad term, which, of course, also encompasses the EXCALIBUR agent's planning system. But the system is not what one would normally expect of a decision-theoretic planning system. So... well... you look a bit too drunk to go home now, don't you?»

This sounds like a good chance to have another jar of the ethanol potion with the wizards, and you gleefully agree. However, your impressive cleverness seems to have failed this time because you are not even quick enough to reply to the wizards' «Great, I'm always glad of opportunities to apply teleportation spells!» After a short puff, you find yourself in a trap you recently built yourself - together with a large number of angry and well-armed humans that were trapped while looking for an Orc who had just pestered some other humans.


For questions, comments or suggestions, please contact us.

Last update:
May 19, 2001 by Alexander Nareyek