Skip to content

[Rule] PrizeCollectingSteinerForest to SteinerTree #1027

@zhangjieqingithub

Description

@zhangjieqingithub

Source

PrizeCollectingSteinerForest

Target

SteinerTree

Motivation

This is the natural companion rule for the biology-paper PCSF model.

The key modeling point is:

  • the artificial root belongs in the reduction rule,
  • not in the base PCSF model.

This rule records the well-known auxiliary-root construction: turn a forest into a single tree in an augmented graph by adding an artificial root r, while compiling the omitted-prize tradeoff into the target tree formulation through an explicit per-vertex terminal-pair gadget.

Reference

The cited papers reduce PCSF to Prize-Collecting Steiner Tree (PCST) (a target that still carries node prizes), not to plain Steiner Tree. We use the same artificial-root idea, plus a standard auxiliary-terminal gadget to translate the remaining prize term into ordinary Steiner-tree edge costs. This combined construction is treated as a well-known auxiliary-root gadget in the prize-collecting Steiner literature (see Bienstock-Goemans-Simchi-Levi-Williamson 1993 for the original prize/penalty framework predating Tuncbag).

  • Daniel Bienstock, Michel X. Goemans, David Simchi-Levi, David Williamson, "A note on the prize collecting traveling salesman problem," Mathematical Programming 59 (1993), 413-420. https://doi.org/10.1007/BF01581256
  • Nurcan Tuncbag et al., "Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem," Journal of Computational Biology 20(2):124-136, 2013. https://doi.org/10.1089/cmb.2012.0092
  • Nurcan Tuncbag et al., "Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem," RECOMB 2012, LNBI 7262, pp. 287-301, 2012. https://doi.org/10.1007/978-3-642-29627-7_31

The Tuncbag paper gives the exact PCSF objective
f'(F) = beta * sum_{v notin V_F} p(v) + sum_{e in E_F} c(e) + omega * kappa(F)
and explains the artificial-root transformation by attaching a new root to every original vertex with cost omega. The remaining step — translating the prize term beta * p(v) for v notin V_F into plain Steiner-tree edge costs via a constant-size auxiliary-terminal gadget — is the routine extension recorded by this rule.

Reduction Algorithm

This proposal is for the first concrete undirected repo variant.

Let the source instance be:

  • a graph G = (V, E)
  • vertex prizes p(v) >= 0
  • edge costs c(e) >= 0
  • parameters beta >= 0, omega >= 0

Let V_p = { v in V : p(v) > 0 } denote the vertices that actually carry a prize. Vertices with p(v) = 0 contribute nothing to the omitted-prize term and need no gadget.

Construction

Build the target graph H = (V_H, E_H) with edge costs c_H and terminal set T_H as follows.

  1. Add a new artificial root r. Set V_H = V cup {r}.

  2. Keep every original edge e = (u, v) in E with cost c_H(e) = c(e).

  3. Add a root-attachment edge (r, v) of cost omega for every original vertex v in V.

  4. Per-vertex prize gadget. For each v in V_p (vertex with positive prize):

    • add an auxiliary vertex t_v (so V_H gains one node per prized vertex),
    • add edge (v, t_v) with cost 0,
    • add edge (r, t_v) with cost beta * p(v).

    Intuition:

    • if v is included in the projected forest (so v is connected to r through original vertices), the gadget is paid by taking the 0-cost edge (v, t_v),
    • if v is omitted, the only way to connect t_v to the tree is the (r, t_v) edge of cost beta * p(v).
  5. Target terminal set. T_H = {r} cup { t_v : v in V_p }. Original vertices are non-terminals (Steiner vertices).

Scaling parameter beta

If both p(v) and c(e) are integral and beta is integral, take beta = 1 (the construction above uses beta directly; no scaling needed because all charges are commensurate).

If any of p(v), c(e), or beta is rational, multiply all edge costs in H by a common denominator before solving the target instance (Steiner tree optima are preserved under positive scaling). If a future variant requires integer costs but the source has real-valued data, pick a scale factor S >= max(c(e), beta*p(v)) + 1 to disambiguate ties; for the integral case used here, beta = 1 with the original costs suffices.

Solving and witness extraction

Solve the resulting SteinerTree instance on H to obtain an optimal Steiner tree T* spanning T_H.

To recover the source PCSF witness (V_F, E_F):

  • E_F = T* cap E(G) (intersect the target tree edges with the original edge set E),
  • V_F = { v in V : t_v in V(T*) and the edge (v, t_v) belongs to T* } cup { v in V : v is an endpoint of some edge in E_F } cup { v in V : (r, v) in T* and v is the unique attachment of its component }.

Equivalently, take the connected components of T* after removing r and all gadget vertices t_v; the union of all original vertices appearing in these components is V_F, and E_F is the set of original edges among them.

Correctness intuition

  • Every tree component of a source forest F becomes attached exactly once to the artificial root r, so T* pays exactly omega * kappa(F).
  • Original selected edges retain their original costs in the target objective.
  • For v in V_p: if v in V_F, the gadget edge (v, t_v) costs 0; if v notin V_F, the only way to reach the terminal t_v is the (r, t_v) edge of cost beta * p(v). Summing over V_p reproduces beta * sum_{v notin V_F} p(v).
  • Vertices with p(v) = 0 have no gadget and contribute nothing to the omitted-prize sum, matching the source objective.
  • Conversely, any source forest F can be lifted to a feasible Steiner tree in H by attaching each component once to r and resolving each gadget locally (included v: take (v, t_v); omitted v: take (r, t_v)).

The two objectives agree exactly: cost_H(T*) = f'(F*).

Size Overhead

Let:

  • n = num_vertices
  • m = num_edges
  • k = | V_p | = number of original vertices with p(v) > 0 (at most n)

Exact target sizes:

Target metric Formula
num_vertices n + k + 1
num_edges m + n + 2 * k (original edges + root-attachment edges + two gadget edges per prized vertex)
num_terminals k + 1

Conservative upper bounds (using k <= n):

Target metric Upper bound
num_vertices <= 2 * n + 1
num_edges <= m + 3 * n
num_terminals <= n + 1

The overhead is linear in the source graph size.

For the #[reduction(overhead = {...})] macro (which uses source-instance getters), the overhead expressions are:

num_vertices = "num_vertices + num_vertices_with_prize + 1"
num_edges    = "num_edges + num_vertices + 2 * num_vertices_with_prize"
num_terminals = "num_vertices_with_prize + 1"

(num_vertices_with_prize is the getter on the PCSF source type that returns |V_p|. If the source variant exposes only num_vertices, use the bound versions 2 * num_vertices + 1, num_edges + 3 * num_vertices, num_vertices + 1.)

Validation Method

  • Compare the target Steiner-tree optimum against brute-force PCSF evaluation on small graphs (n <= 6).
  • Closed-loop: lift the optimal source forest to a target tree, solve the target instance, and check cost_H(T*) = f'(F*).
  • Witness round-trip: extract (V_F, E_F) from T* and verify it is a forest on G whose PCSF objective matches.
  • Include cases:
    • empty graph (n = 0),
    • empty forest optimum (large beta * p(v) is not paid — vertex omitted),
    • one connected tree optimum,
    • multiple-tree optimum from a large omega / edge-cost tradeoff,
    • at least one vertex omitted at the optimum so the (r, t_v) half of the gadget is exercised,
    • zero-prize vertices alongside positive-prize vertices.

Example

We pick an instance where a prize vertex is omitted at the optimum (so the gadget is genuinely exercised), per the check-issue recommendation.

Source instance

  • path 0 - 1 - 2 with edges (0,1) and (1,2)
  • edge costs c(0,1) = 10, c(1,2) = 10
  • vertex prizes p(0) = 5, p(1) = 1, p(2) = 5
  • beta = 1, omega = 1

Feasible source forests and their objectives

For each candidate forest F = (V_F, E_F):

V_F E_F edge cost omitted prize components kappa(F) omega * kappa(F) total
{} {} 0 5+1+5=11 0 0 11
{0} {} 0 1+5=6 1 1 7
{2} {} 0 5+1=6 1 1 7
{0, 2} {} 0 1 2 2 3
{0, 1} {(0,1)} 10 5 1 1 16
{1, 2} {(1,2)} 10 5 1 1 16
{0, 1, 2} {(0,1),(1,2)} 20 0 1 1 21

The optimum is V_F* = {0, 2}, E_F* = {}, objective 3. Vertex 1 is omitted even though p(1) = 1 > 0, because paying beta * p(1) = 1 is cheaper than paying the edge cost to connect it (c(0,1) = 10 or c(1,2) = 10).

Augmented target instance

Vertices: V_H = {0, 1, 2, r, t_0, t_1, t_2} (so |V_H| = n + k + 1 = 3 + 3 + 1 = 7).

Edges and costs:

edge cost origin
(0,1) 10 original
(1,2) 10 original
(r,0) 1 root-attach (omega)
(r,1) 1 root-attach (omega)
(r,2) 1 root-attach (omega)
(0, t_0) 0 gadget include-edge
(r, t_0) 5 gadget omit-edge (beta*p(0))
(1, t_1) 0 gadget include-edge
(r, t_1) 1 gadget omit-edge (beta*p(1))
(2, t_2) 0 gadget include-edge
(r, t_2) 5 gadget omit-edge (beta*p(2))

Total target edges: m + n + 2k = 2 + 3 + 6 = 11. Terminal set: T_H = {r, t_0, t_1, t_2} (|T_H| = k+1 = 4).

Lifted optimal target tree

The lift of the source optimum V_F* = {0, 2}, E_F* = {} is:

  • root attach 0: edge (r, 0) cost 1
  • root attach 2: edge (r, 2) cost 1
  • include gadget for 0: edge (0, t_0) cost 0
  • include gadget for 2: edge (2, t_2) cost 0
  • omit gadget for 1: edge (r, t_1) cost 1

This is a tree spanning T_H = {r, t_0, t_1, t_2} (vertices 0, 2 are Steiner relays). Vertex 1 and its incident original edges are not in T*.

Total target objective: 1 + 1 + 0 + 0 + 1 = 3. Matches the source optimum of 3.

Witness extraction

Removing r and all t_v from T* leaves the original vertices {0, 2} with no edges among them. So V_F = {0, 2}, E_F = T* cap E(G) = {}. This is the source optimum, confirming the round-trip.

Why this example exercises the novel gadget

Vertex 1 is omitted at the optimum, so the gadget edge (r, t_1) (cost beta * p(1) = 1) is selected in T*. A bug that charged beta * p(v) when v is included (instead of omitted), or that omitted the (r, t_v) edge entirely, would change this example's optimum and be caught by the closed-loop test.

BibTeX

@article{BienstockGoemansSimchiLeviWilliamson1993,
  author  = {Daniel Bienstock and Michel X. Goemans and David Simchi-Levi and David Williamson},
  title   = {A note on the prize collecting traveling salesman problem},
  journal = {Mathematical Programming},
  volume  = {59},
  number  = {1},
  pages   = {413--420},
  year    = {1993},
  doi     = {10.1007/BF01581256},
  url     = {https://doi.org/10.1007/BF01581256}
}

@article{TuncbagEtAl2013PCSF,
  author  = {Nurcan Tuncbag and Alfredo Braunstein and Andrea Pagnani and Shao-Shan Carol Huang and Jennifer Chayes and Christian Borgs and Riccardo Zecchina and Ernest Fraenkel},
  title   = {Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem},
  journal = {Journal of Computational Biology},
  volume  = {20},
  number  = {2},
  pages   = {124--136},
  year    = {2013},
  doi     = {10.1089/cmb.2012.0092},
  url     = {https://doi.org/10.1089/cmb.2012.0092}
}

@inproceedings{TuncbagEtAl2012RECOMB,
  author    = {Nurcan Tuncbag and Alfredo Braunstein and Andrea Pagnani and Shao-Shan Carol Huang and Jennifer Chayes and Christian Borgs and Riccardo Zecchina and Ernest Fraenkel},
  title     = {Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem},
  booktitle = {Research in Computational Molecular Biology (RECOMB 2012)},
  series    = {Lecture Notes in Bioinformatics},
  volume    = {7262},
  pages     = {287--301},
  year      = {2012},
  publisher = {Springer},
  doi       = {10.1007/978-3-642-29627-7_31},
  url       = {https://doi.org/10.1007/978-3-642-29627-7_31}
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    GoodAn issue passed all checks.ruleA new reduction rule to be added.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status
    Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions