International Planning Competition 2018
Probabilistic Tracks

The International Planning Competition is organized in the context of the International Conference on Planning and Scheduling (ICAPS). It empirically evaluates state-of-the-art planning systems on a number of benchmark problems. The goals of the IPC are to promote planning research, highlight challenges in the planning community and provide new and interesting problems as benchmarks for future research.

Probabilistic tracks have been part of the IPC since 2004 under different names (as the International Probabilistic Planning Competition or as part of the uncertainty tracks). Following the editions of 2004, 2006, 2008, 2011 and 2014, the 6th edition of the IPC probabilistic tracks will be held in 2018 and conclude together with ICAPS in June 2018 in Delft (Netherlands). This time, it is organized by Thomas Keller, Scott Sanner and Buser Say.

The deterministic part of the IPC is organized by Florian Pommerening and Alvaro Torralba. You can find information about it at ipc2018-classical.bitbucket.io.

The temporal part of the IPC is organized by Andrew Coles, Amanda Coles and Moises Martinez. You can find information about it at ipc2018-temporal.bitbucket.io.

To stay up-to-date with the probabilistic tracks, register for the mailing list at https://groups.google.com/forum/#!forum/ipc2018-probabilistic

Preliminary Schedule

Event                                      Date
Call for domains / expression of interest  July 14, 2017
Domain submission deadline                 November 30, 2017
Demo problems provided                     January 17, 2018
Expression of interest                     February 4, 2018
Initial planner submission                 March 25, 2018
Final planner submission                   April 15, 2018
Planner abstract submission deadline       May 20, 2018
Contest run                                May-June, 2018
Results announced                          June, 2018
Result analysis deadline                   July, 2018

Tracks

The competition is run in three tracks: discrete MDP, continuous MDP and discrete SSP. These tracks focus on maximizing the expected reward in a discrete or continuous environment (discrete and continuous MDP tracks) or on minimizing the expected cost to reach a goal (discrete SSP track).

Additionally, there are the novel discrete data-based MDP and continuous data-based MDP tracks, which are versions of the discrete and continuous MDP tracks where a set of sample traces is provided as input rather than a declarative model of the MDP.

Discrete MDP Track

  • Organizer: Thomas Keller (University of Basel)
  • Model:
    • Markov decision process (MDP)
    • fixed initial state
    • finite horizon
    • no discounting of rewards (the discount factor is 1)
    • binary variables
    • discrete transitions
    • conditional (state-dependent) rewards
    • action preconditions
    • no dead-ends (there is an applicable action in every reachable state)
  • Resources:
    • memory limit of 4 GB
    • time limit depends on the instance
      As a rule of thumb, we provide 50*H seconds for each instance (exceptions are possible), where H is the finite horizon; for H = 40, this amounts to 2000 seconds. This is slightly more time per instance than at IPC 2014.
  • Evaluation:
    • 100 runs per instance
      If fewer than 100 runs are completed in time, the reward of an estimated worst-case policy is used for each uncompleted run. This policy is very poor and will likely ruin your average results, so please make sure that your planner finishes in time!
    • We provide a minimal reference policy for each instance: we will implement a suboptimal solver for each domain to ensure that policies that score on an instance (see below) are of reasonable quality.
    • The instance score of a planner for a given instance is 0 if the average accumulated reward R of the planner's 100 runs is less than or equal to the average accumulated reward R_0 of the minimal reference policy, and

      (R - R_0) / (R* - R_0)

      otherwise, where R* is the highest average accumulated reward of all participants. For example, with R_0 = 10, R* = 110 and R = 60, the instance score is (60 - 10) / (110 - 10) = 0.5.
    • The total score of a planner is the sum of its instance scores (this is different from IPC 2011 and 2014) and hence ranges between 0 and the number of instances at IPC 2018. To make sure that all domains are equally important, there is an equal number of instances for each domain.
  • Minimal input language requirements:
    • For RDDL, we use the same fragment as in IPC 2011 and 2014 with the changes listed below.
      Please have a look at the demo domains for examples of these changes (an illustrative sketch also follows after this list)!
      • action-preconditions and state-invariants sections
        The most important addition to the fragment of RDDL that was used in the previous competitions is the action-preconditions section. At IPC 2011 and 2014, there was a state-action-constraints section that could be used to describe invariants, static preconditions and action preconditions, even though the latter were not used in any domain. As it turns out, combining state invariants and action preconditions in one section is semantically problematic: the latter must be checked by the planner at runtime, whereas the former can be assumed to be true (and it is impossible to check all successor states if a state invariant is violated). We therefore decided to remove the state-action-constraints section and replace it with an action-preconditions section and a state-invariants section. The action-preconditions section contains a set of Boolean formulas, each of which must be true for an action to be applicable in some state, and the state-invariants section contains formulas that can be assumed to be true in all reachable states.
      • no max-nondef-actions entry in the instance
        Related to the previous bullet point, the max-nondef-actions entry of the instance block was an additional way of restricting applicable actions. Since this can be expressed equivalently in the action-preconditions section, the max-nondef-actions entry is omitted.
      • type hierarchy
        This is not a new concept in RDDL, but it was not used at IPC 2011 or 2014.
      • KronDelta is now optional
        In an attempt to make modeling domains with RDDL easier, KronDelta is now optional. This means that, whenever a value is used in place of a probability distribution, it is assumed that, in place of the value, there is a KronDelta distribution that places all weight on that value.
      • & for logical conjunctions (instead of ^)
        For consistency with the sign for disjunctions (|), we use & rather than ^ for logical conjunctions.
      • no equals sign following the requirements keyword
        Again, this is for consistency reasons (all other RDDL keywords that specify a list do not use the equals sign either).
      • no non-fluents section
        We believe that the separation of a planning task into domain, non-fluents and instance does not have a significant advantage over the separation into domain and instance. Since the latter is well-known from all other planning tracks of the IPC, we move the objects and non-fluents blocks from the non-fluents section to the instance section.
    • Since we promised to provide a proper PPDDL model of all instances in the SSP track, we decided to do so also for the discrete MDP track! The PPDDL fragment that is required is described below.
      Please have a look at the demo domains for examples of these changes (an illustrative sketch also follows after this list)!
      • full ADL
        As in IPC 2008, we require full ADL support from participants. However, we are not interested in domains where a proper handling of the ADL features (e.g., conditional effects or disjunctive preconditions) is crucial. Even though we cannot (yet) guarantee this, we believe that a straightforward compilation of ADL to STRIPS should be possible in all instances.
      • probabilistic effects
        This wouldn't be the probabilistic tracks of IPC 2018 if the probabilistic keyword of PPDDL weren't required.
      • reward function
        We use the reward keyword that was introduced with PPDDL, and the metric in every instance is to maximize the reward.
      • constant numeric functions
        To encode probabilities and rewards in an instance-dependent way, we use numeric constants (numeric functions that are not part of any effect). These numeric functions are not part of the state; they are only used for grounding, and it is not necessary that participating planners can deal with numeric state variables.
      • the finite horizon is modeled with a predefined integer-valued variable "steps-to-go"
        To our knowledge, there is no predefined way in PPDDL to encode the finite horizon of an MDP. There are many simple solutions to this problem, and we chose one that does not require changes to the syntax of PPDDL: we use an integer-valued variable "steps-to-go" to model the number of remaining steps. We guarantee that the value of steps-to-go is initially a positive integer and that its value only decreases in effects. Furthermore, for the discrete MDP track, the goal is always that the steps-to-go variable is equal to 0, and nothing else.
      • conditional rewards
        Effects on the reward variable can be conditional (state-dependent).
      • add-after-delete semantics
        If a variable is both added and deleted in the same effect, the add effect "wins".
    • We currently do not plan to support other input languages. However, if you'd like to participate but require a different input language than RDDL or PPDDL, please let us know and we'll try to find a solution!
  • Optional input language requirements:
    More on this is coming soon...
  • Demo domains and instances:
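
To make the RDDL changes above more concrete, here is a minimal hand-written sketch of a domain and instance in the IPC 2018 fragment. It is not one of the official demo domains, and all names (demo-domain, at, move, ...) are made up for illustration; please consult the demo domains for authoritative syntax.

    domain demo-domain {
        // no equals sign after the requirements keyword
        requirements { reward-deterministic };

        types { pos : object; };

        pvariables {
            at(pos)   : { state-fluent,  bool, default = false };
            move(pos) : { action-fluent, bool, default = false };
        };

        cpfs {
            // KronDelta is optional: a plain value may be used wherever
            // a probability distribution is expected
            at'(?p) = if (move(?p)) then Bernoulli(0.9) else at(?p);
        };

        // conditional (state-dependent) reward
        reward = sum_{?p : pos} [at(?p)];

        // each formula must hold for an action to be applicable;
        // & (rather than ^) denotes conjunction
        action-preconditions {
            forall_{?p : pos} [move(?p) => ~at(?p)];
        };

        // formulas the planner may assume to hold in all reachable states
        state-invariants {
            (sum_{?p : pos} [at(?p)]) >= 0;   // placeholder invariant
        };
    }

    instance demo-instance {
        domain = demo-domain;
        // the objects (and non-fluents, if any) now live in the instance;
        // there is no max-nondef-actions entry
        objects { pos : {p1, p2, p3}; };
        init-state { at(p1); };
        horizon = 40;
        discount = 1.0;
    }

Along the same lines, here is a minimal PPDDL sketch of the fragment described above, again with made-up names. Note the probabilistic effect, the conditional (state-dependent) reward effect, the constant numeric function move-reward and the steps-to-go variable:

    (define (domain demo-domain)
      (:requirements :adl :probabilistic-effects :rewards :fluents)
      (:types pos)
      (:predicates (at ?p - pos) (goal-pos ?p - pos))
      (:functions (steps-to-go) (move-reward))
      (:action move
        :parameters (?from ?to - pos)
        :precondition (at ?from)
        :effect (and (decrease (steps-to-go) 1)
                     (probabilistic 0.9 (and (not (at ?from)) (at ?to)))
                     ;; conditional (state-dependent) reward effect
                     (when (goal-pos ?from) (increase (reward) (move-reward))))))

    (define (problem demo-instance)
      (:domain demo-domain)
      (:objects p1 p2 p3 - pos)
      (:init (at p1) (goal-pos p3) (= (steps-to-go) 40) (= (move-reward) 5))
      ;; the goal is always that steps-to-go equals 0, and nothing else
      (:goal (= (steps-to-go) 0))
      (:metric maximize (reward)))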

Discrete SSP Track

  • Organizer: Thomas Keller (University of Basel)
  • Model:
    • stochastic shortest path problem (SSP)
    • fixed initial state
    • set of goal states
    • binary variables
    • discrete transitions
    • unconditional (state-independent) action costs
    • action preconditions
    • no dead-ends (there is an applicable action in every reachable state)
  • Resources:
    • memory limit of 4 GB
    • time limit of 50 minutes (this corresponds to 30 seconds per run)
  • Evaluation:
    • 100 runs per instance
      If fewer than 100 runs are completed in time, the cost of an estimated worst-case policy is used for each uncompleted run. This policy is very poor and will likely ruin your average results, so please make sure that your planner finishes in time!
    • We provide a minimal reference policy for each instance: we will implement a suboptimal solver for each domain to ensure that policies that score on an instance (see below) are of reasonable quality.
    • The instance score of a planner for a given instance is 0 if the average total cost C of the planner's 100 runs is higher than or equal to the average total cost C_0 of the minimal reference policy, and

      C*/C

      otherwise, where C* is the lowest average total cost of all participants. For example, with C* = 20 and C = 40, the instance score is 20/40 = 0.5.
    • The total score of a planner is the sum of its instance scores and hence ranges between 0 and the number of instances at IPC 2018. To make sure that all domains are equally important, there is an equal number of instances for each domain.
  • Minimal input language requirements:
    • For RDDL, we use the same fragment as for the discrete MDP track described above, with the exception of the differences listed below.
      Please have a look at the demo domains for examples of these changes (an illustrative sketch also follows after this list)!
      • new keyword "terminal" to specify terminal states
        To model an SSP, there must be some way to specify the set of terminal (or goal) states. Since RDDL does not have a way to do this, we introduce the new keyword "terminal", which allows specifying a formula that describes all terminal states. Additionally, terminal replaces the horizon keyword in SSPs.
      • reward function is state-independent and non-positive
        We keep the established RDDL way to encode rewards (and costs), but guarantee for the discrete SSP track that the reward function is independent of the current state and does not evaluate to a positive value for any action that is applicable in at least one reachable state.
    • For PPDDL, we use the same fragment as for the discrete MDP track described above, with the exception of the differences listed below.
      Please have a look at the demo domains for examples of these changes (an illustrative sketch also follows after this list)!
      • total-cost instead of reward
        In the SSP setting, the objective is to minimize the total-cost rather than maximize the reward. Furthermore, effects on total-cost are unconditional.
      • no steps-to-go
        Since an SSP ends when a goal state is reached rather than after a fixed number of steps, the steps-to-go variable from the discrete MDP track is not used.
    • We currently do not plan to support other input languages. However, if you'd like to participate but require a different input language than RDDL or PPDDL, please let us know and we'll try to find a solution!
  • Optional input language requirements:
    More on this is coming soon...
  • Demo domains and instances:
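
To make the SSP-specific changes more concrete, here is a hand-written sketch of an RDDL instance with the new terminal keyword and a state-independent, non-positive reward. This is an illustration with made-up names, not an official demo instance; in particular, the exact placement and syntax of terminal should be checked against the demo domains.

    instance demo-ssp-instance {
        domain = demo-ssp-domain;
        objects { pos : {p1, p2, p3}; };
        init-state { at(p1); };
        // terminal replaces horizon: this formula describes all goal states
        terminal = forall_{?p : pos} [at(?p)];
        discount = 1.0;
    }

    // in the domain, the reward may only depend on the action, e.g.:
    // reward = - (sum_{?p : pos} [move(?p)]);

And a matching PPDDL action fragment, using total-cost with an unconditional effect and a minimization metric (again, all names are illustrative):

    (:action move
      :parameters (?p - pos)
      :precondition (not (at ?p))
      :effect (and (increase (total-cost) 1)
                   (probabilistic 0.9 (at ?p))))

    ;; in the problem: (:metric minimize (total-cost)); there is no
    ;; steps-to-go variable, and the goal describes the set of goal states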

Discrete Data-based MDP Track

  • Organizer: Thomas Keller (University of Basel)
  • 4 GB memory limit
  • 50*H sec time limit, where H is the problem's finite horizon (please note that, like pretty much everything else, this is not fixed yet; if you believe that there should be some preprocessing time to deal with the data set, please contact us via the mailing list or via email)
  • Planning task is a Markov Decision Process with the same properties as in the discrete MDP track.
  • A set of sample trajectories is provided, each consisting of a number of state-action-successor state triples equal to the problem's finite horizon (a conceptual sketch follows after this list).
  • The MDP's reward formula is provided in a declarative form.
  • Scores are based on the average accumulated undiscounted reward over 100 runs
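
The input format of the sample trajectories has not been announced yet, so the following is only a conceptual sketch of how a planner might use such data: assuming the trajectories are available as lists of (state, action, successor) triples with hashable states and actions, empirical transition probabilities can be estimated by counting. All names here are hypothetical.

    from collections import Counter, defaultdict

    def estimate_transition_model(trajectories):
        """Count successor frequencies for each (state, action) pair and
        normalize them into empirical transition probabilities."""
        counts = defaultdict(Counter)
        for trajectory in trajectories:
            for state, action, successor in trajectory:
                counts[(state, action)][successor] += 1
        model = {}
        for sa, successor_counts in counts.items():
            total = sum(successor_counts.values())
            model[sa] = {s: n / total for s, n in successor_counts.items()}
        return model

    # trajectories = [[(s0, a0, s1), (s1, a1, s2), ...], ...]
    # model = estimate_transition_model(trajectories)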

Continuous MDP Track

  • Organizers: Scott Sanner and Buser Say (University of Toronto)
  • 4 GB memory limit
  • 50*H sec time limit, where H is the problem's finite horizon
  • Planning task is a Markov Decision Process with the following properties:
    • Coming soon!
  • A declarative model of the MDP in RDDL is provided. We use the same subset of RDDL that was used at the probabilistic tracks of IPC 2011 and of IPC 2014, with the addition of the following features:
    • interm fluents (we can provide a compilation to a domain without interm fluents if this is requested; a brief sketch follows after this list)
  • Scores are based on the average accumulated undiscounted reward over 100 runs
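
As a brief reminder of what interm fluents look like, here is a hand-written sketch based on our reading of the RDDL language description (all names are made up; the level attribute determines the order in which intermediate fluents are evaluated within a step):

    pvariables {
        inflow   : { state-fluent,  real, default = 5.0 };
        release  : { action-fluent, real, default = 0.0 };
        net-flow : { interm-fluent, real, level = 1 };
        water    : { state-fluent,  real, default = 50.0 };
    };

    cpfs {
        // interm fluents are defined unprimed and computed within the step
        net-flow = inflow - release;
        // state fluents may then use them in their transition
        inflow' = Normal(5.0, 1.0);
        water'  = water + net-flow;
    };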

Continuous Data-based MDP Track

  • Organizers: Scott Sanner and Buser Say (University of Toronto)
  • 4 GB memory limit
  • 50*H sec time limit, where H is the problem's finite horizon. We are also discussing adding preprocessing time to deal with the data set. Please contact us via the mailing list or via email if you have an opinion on this topic.
  • Planning task is a Markov Decision Process with the same properties as in the continuous MDP track.
  • A set of sample trajectories is provided, each consisting of a number of state-action-successor state triples equal to the problem's finite horizon.
  • The MDP's reward formula is provided in a declarative form.
  • Scores are based on the average accumulated undiscounted reward over 100 runs

Registration

Discrete tracks

Unlike at IPC 2011 and 2014, competitors do not run their planners themselves and connect to a provided server. This time, competitors must submit the source code of their planners, which will be run by the organizers on the actual competition domains and problems, unknown to the competitors until that time. This way, no fine-tuning of the planners will be possible.

All competitors must submit an abstract (max. 300 words) and a 4-page paper describing their planners. After the competition we encourage the participants to analyze the results of their planner and submit an extended version of their abstract. An important requirement for IPC 2018 competitors is to give the organizers the right to post their paper and the source code of their planners on the official IPC 2018 web site.

Registration Process

There are three important dates for the registration of planners in the discrete tracks. The first is an expression of interest in participation in one or more tracks, which is due February 4, 2018. Please let us know which tracks you are interested in and which input language you are planning to use by sending a mail to ipc2018-probabilistic-organizers@googlegroups.com.

We will use the container technology "Singularity" this year to promote reproducibility and help with compilation issues that have caused problems in the past. More details on Singularity can be found below.

The second step is to register your planner by March 25, 2018. To do so, create a repository (Mercurial and Git repositories are accepted) on Bitbucket and give read access to ipc2018-probabilistic-bot. Then create one branch per track you want to participate in and name it according to the following list.

  • ipc2018-disc-mdp (discrete MDP track)
  • ipc2018-disc-ssp (discrete SSP track)
  • ipc2018-disc-data (discrete data-based track)
Up to three versions of the same planner are allowed to participate. To submit three (or two) different versions of the same planner, simply create three (or two) different repositories. In each branch, add a file called Singularity to the root directory of your repository. This file is used to bootstrap a Singularity container and to run the planner. We'll add more information on Singularity and a demo planner here in the next few days; in the meantime, a rough sketch follows below.
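
Until the demo planner is available, here is a rough sketch of what such a Singularity file could look like, assuming a Docker bootstrap and a planner that is built with make; the base image, paths and commands are placeholders, not official requirements:

    Bootstrap: docker
    From: ubuntu:xenial

    %setup
        # copy the repository into the container (placeholder path)
        cp -r . "$SINGULARITY_ROOTFS/planner"

    %post
        # install build dependencies and compile the planner
        apt-get update
        apt-get -y install g++ make
        cd /planner && make

    %runscript
        # entry point used to start the planner
        /planner/run-planner "$@"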

The third important date is the final submission deadline on April 15, 2018. On that day, we will fork your repository. If you find any bugs in your code afterwards, you can create a pull request to our fork with a patch fixing the bug. If we detect any bugs while running your code, we'll let you know and you are also allowed to provide a bug fix. However, only bug fixes will be accepted after the deadline (in particular, we will not accept patches modifying behavior or tuning parameters).

Continuous tracks

More information on the registration procedure for the continuous tracks is coming soon...

Competition Setup

Discrete tracks

To estimate the quality of the computed policies, we execute each policy 100 times and use the average as our quality metric. We do so by having your planner interact as a client with the rddlsim server. Detailed information on the protocol that is used to exchange messages between client and server can be found in the file PROTOCOL.txt in the root directory of rddlsim.

For IPC 2018, we made some small changes compared to previous competitions:

  • In the session-request message that is sent initially from the client to the server, the client has to provide the input language it uses (rddl or ppddl).
  • The answer of the server is the session-init message. While domain and instance have been provided as two files in previous competitions, the server provides them for IPC 2018 with the session-init message in the task tag (this allows us to determine exactly when each participant starts to plan).
  • As rddlsim communicates with its clients via XML messages, sending "<" and ">" directly would invalidate the XML; rddlsim therefore replaces them with the strings "&lt;" and "&gt;", respectively. The domain and instance descriptions are sent in base64 encoding, which you have to decode in your planner (a short decoding sketch follows after this list). You should be able to find code that does this for most programming languages. However, if you encounter problems with this, please let us know and we'll try to find a solution.
  • Finally, round-request messages must contain an additional tag execute-policy with content "yes" or "no". If set to "yes", the next round is "executed" and therefore counts as one of the 100 runs that are used to compute instance scores. Otherwise, the round is a simulation round that does not count.
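
As an illustration of the decoding step, here is a short Python sketch that extracts the base64-encoded task from a session-init message. The XML layout assumed here is simplified; PROTOCOL.txt in rddlsim remains the authoritative description of the message format.

    import base64
    import xml.etree.ElementTree as ET

    def extract_task(session_init_xml):
        """Decode the base64-encoded domain/instance description that the
        server sends in the task tag of the session-init message."""
        root = ET.fromstring(session_init_xml)
        task_b64 = root.find(".//task").text.strip()
        return base64.b64decode(task_b64).decode("utf-8")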

If you want to test your planner locally, please update to the latest version of rddlsim, recompile and start the server with the command

./run rddl.competition.Server PROBLEM_DIR PORT 100 1 1 1 ./ 1

where PROBLEM_DIR is a directory that contains RDDL domains and instances and PORT is the port where the client can connect (by default, this is 2323 for rddlsim). If you use PPDDL as input language, make sure to use the same directory structure as in the tarball that contains all demo domains and instances (and invoke rddlsim on the RDDL files, not the PPDDL ones).

We will add new information on the competition setup soon...

Continuous tracks

More information on the competition setup for the continuous tracks is coming soon...

Calls for Participation and Domains

Please forward the following calls to all interested parties.

Organizers

Contact us: ipc2018-probabilistic-organizers@googlegroups.com