International Planning Competition 2018
Probabilistic Tracks

The International Planning Competition is organized in the context of the International Conference on Planning and Scheduling (ICAPS). It empirically evaluates state-of-the-art planning systems on a number of benchmark problems. The goals of the IPC are to promote planning research, highlight challenges in the planning community and provide new and interesting problems as benchmarks for future research.

Since 2004, probabilistic tracks have been part of the IPC under different names (as the International Probabilistic Planning competition or as part of the uncertainty tracks). After 2004, 2006, 2008, 2011 and 2014, the 6th edition of the IPC probabilistic tracks will be held in 2018 and conclude together with ICAPS, in June 2018, in Delft (Netherlands). This time it is organized by Thomas Keller, Scott Sanner and Buser Say.

The deterministic part of IPC is organized by Florian Pommerening and Alvaro Torralba. You can find information about it on ipc2018.bitbucket.io.

Mailing List: https://groups.google.com/forum/#!forum/ipc2018-probabilistic

Preliminary Schedule

Event Date
Call for domains / expression of interest July, 2017
Domain submission deadline November, 2017
Demo problems provided November, 2017
Initial planner submission January, 2018
Planner submission deadline February, 2018
Planner abstract submission deadline May, 2018
Contest run May, 2018
Results announced June, 2018
Result analysis deadline July, 2018

Tracks

The competition is run in the tracks discrete MDP, continuous MDP and discrete SSP. These tracks focus on the maximization of the expected reward in a discrete or continuous environmen (discrete and continuous MDP tracks) or on the minimization of the expected cost to reach a goal (discrete SSP track).

Additionally, there are the novel discrete data-based MDP and continuous data-based MDP tracks, which are versions of the discrete and continuous MDP tracks where a set of sample traces is provided as input rather than a declarative model of the MDP.

Important: Most of the rules are not set in stone (yet), and we are open to discuss almost everything. If there is a rule that prevents you from competing in any track, and there is a modification that would allow you to compete, please let us know by writing to the mailing list or via email.

Discrete MDP Track

  • 4Gb memory limit
  • 50*H sec time limit, where H is the problem's finite horizon
  • Planning task is a Markov Decision Process with the following properties:
    • binary variables
    • discrete transitions
    • state-dependent (positive and/or negative) rewards
    • action preconditions
    • no dead-ends (i.e., at least one action is applicable in every reachable state)
  • A declarative model of the MDP in RDDL is provided. We use the same subset of RDDL that was used at the probabilistic tracks of IPC 2011 and of IPC 2014, with the addition of the following features:
    • interm fluents (we can provide a compilation to a domain without interm fluents if this is requested)
  • If there is interest for a PPDDL version of the MDP (not just a fully grounded version as in the probabilistic tracks of IPC 2011 and of IPC 2014), we are going to provide one. In that case, please make sure to express your interest as soon as possible by writing to the mailing list or via email. However, please make sure that your planner is able to handle the following PPDDL features:
    • conditional effects
    • arithmetic expressions (it is not required to support numerical state variables, only the evaluation of a state-dependent arithmetic expression to compute transition probabilities)
    • state-dependent rewards
  • Scores are based on the average accumulated undiscounted reward over 100 runs

Continuous MDP Track

  • 4Gb memory limit
  • 50*H sec time limit, where H is the problem's finite horizon
  • Planning task is a Markov Decision Process with the following properties:
    • Coming soon!
  • A declarative model of the MDP in RDDL is provided. We use the same subset of RDDL that was used at the probabilistic tracks of IPC 2011 and of IPC 2014, with the addition of the following features:
    • interm fluents (we can provide a compilation to a domain without interm fluents if this is requested)
  • Scores are based on the average accumulated undiscounted reward over 100 runs

Discrete SSP Track

  • 4Gb memory limit
  • 50*H sec time limit, where H is the problem's finite horizon
  • Planning task is a Stochastic Shortest Path Problem with the following properties:
    • binary variables
    • discrete transitions
    • state-independent action costs
    • goals
    • action preconditions
    • no dead-ends(we add a very expensive "give-up" action that directly leads to a goal state to every domain that comes with dead-ends)
  • A declarative model of the MDP in RDDL is provided. We use the subset of RDDL that was used at the probabilistic tracks of IPC 2011 and of IPC 2014, with the following changes:
    • the reward function is guaranteed to be state-independent
    • there is no finite-horizon, but a description of goal states; we'll provide a document that describes the syntax of the goal description as soon as possible
    • interm fluents are used (we can provide a compilation to a domain without interm fluents if this is requested)
  • We plan to also provide a declarative model of the MDP in PPDDL. We would still like to receive your expression of interest via the mailing list or via email. We plan to use the following PPDDL features in the task description:
    • conditional effects; however, unlike in the discrete MDP track, we'll make sure that a compilation without conditional effects remains compact
    • state-dependent arithmetic expressions to compute transition probabilities; we would like to keep these for a more expressive language. However, if this is problematic for you, please let us know by writing to the mailing list or by email.
  • Scores are based on the average cost to reach a goal state over 100 runs

Discrete Data-based MDP Track

  • 4Gb memory limit
  • 50*H sec time limit, where H is the problem's finite horizon (Please note that, like pretty much everything else, this is not fixed yet. If you believe that there should be some preprocessing time to deal with the data set, let contact us via the mailing list or via email if you have an opinion on this topic.
  • Planning task is a Markov Decision Process with the same properties as in the discrete MDP track.
  • A set of sample trajectories is provided, each consisting of a number of state-action-successor state triples equal to the problem's finite horizon.
  • The MDP's reward formula is provided in a declarative form.
  • Scores are based on the average accumulated undiscounted reward over 100 runs

Continuous Data-based MDP Track

  • 4Gb memory limit
  • 50*H sec time limit, where H is the problem's finite horizon. We are also discussing to add a preprocessing time to deal with the data set. Please contact us via the mailing list or via email if you have an opinion on this topic.
  • Planning task is a Markov Decision Process with the same properties as in the continuous MDP track.
  • A set of sample trajectories is provided, each consisting of a number of state-action-successor state triples equal to the problem's finite horizon.
  • The MDP's reward formula is provided in a declarative form.
  • Scores are based on the average accumulated undiscounted reward over 100 runs

Score Computation

All five tracks have in common that the quality of the policy of each competitor is estimated as the average reward (for the MDP tracks) or cost (for the SSP track) over 100 runs (see the server documentation below for more information on the technical details on the evaluation procedure). Even though the time limit to complete the runs is selected such that there is a deliberation time of 0.5 seconds for each decision, it is also possible to compute a policy in a preprocessing step that takes almost all of the total available time and execute that policy shortly before the time limit is reached. As the total time limit is significantly higher than the time limit of the last two competitions (for a problem with the same horizon), we expect that both strategies are competetive options with different advantages and disadvantages.

Scores for each participant are computed with the same procedure that has been used at the probabilistic tracks of IPC 2011 and of IPC 2014: along with the competitors, we will run a simple planner that serves as the baseline. All competitors that

  • perform equal to or worse than the baseline planner
  • fail to finish 100 runs in time
  • execute an inapplicable action in any of the 100 runs
are assigned a score of 0 for that instance. The best competitor receives a score of 100 for that instance, and all remaining ones are linearly interpolated in between based on the achieved average reward / cost.

From the per-instance results, we compute per-domain results as the average over all instances, and from the per-domain results we compute the total score as the average over all domains. The total scores are finally used to rank the planners (the planner with the highest total score wins the competition).

Server Documentation

Your planner interacts as a client with the rddlsim server that is used for evaluation. For now, we refer to the file PROTOCOL.txt which can be found in the root directory of rddlsim for information on the used protocol. Please note, though, that there will be an additional message the client has to send to indicate that it is about to start to execute the 100 runs that are actually used to compute the planner's score (you can find the reason why this change is necessary in this paper). We'll make sure that a list of all TCP/IP messages that are required to compete in any track of the IPC 2018 probabilistic tracks is available here as soon as possible.

Registration

Potential participants are requested to subscribe to the mailing list and send us an email expressing interest. Information on submitting the planner will be provided on the mailing list and the homepage later on.

The competitors must submit the source code of their planners that will be run by the organizers on the actual competition domains/problems, unknown to the competitors until this time. This way no fine-tuning of the planners will be possible.

All competitors must submit an abstract (max. 300 words) and a 4-page paper describing their planners. After the competition, we encourage the participants to analyze the results of their planner and submit an extended version of their abstract. An important requirement for IPC 2018 competitors is to give the organizers the right to post their paper and the source code of their planners on the official IPC 2018 web site.

Organizers

Contact us: ipc2018-probabilistic-organizers@googlegroups.com