Source
PrizeCollectingSteinerForest
Target
SteinerTree
Motivation
This is the natural companion rule for the biology-paper PCSF model.
The key modeling point is:
- the artificial root belongs in the reduction rule,
- not in the base PCSF model.
This rule records the well-known auxiliary-root construction: turn a forest into a single tree in an augmented graph by adding an artificial root r, while compiling the omitted-prize tradeoff into the target tree formulation through an explicit per-vertex terminal-pair gadget.
Reference
The cited papers reduce PCSF to Prize-Collecting Steiner Tree (PCST) (a target that still carries node prizes), not to plain Steiner Tree. We use the same artificial-root idea, plus a standard auxiliary-terminal gadget to translate the remaining prize term into ordinary Steiner-tree edge costs. This combined construction is treated as a well-known auxiliary-root gadget in the prize-collecting Steiner literature (see Bienstock-Goemans-Simchi-Levi-Williamson 1993 for the original prize/penalty framework predating Tuncbag).
- Daniel Bienstock, Michel X. Goemans, David Simchi-Levi, David Williamson, "A note on the prize collecting traveling salesman problem," Mathematical Programming 59 (1993), 413-420. https://doi.org/10.1007/BF01581256
- Nurcan Tuncbag et al., "Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem," Journal of Computational Biology 20(2):124-136, 2013. https://doi.org/10.1089/cmb.2012.0092
- Nurcan Tuncbag et al., "Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem," RECOMB 2012, LNBI 7262, pp. 287-301, 2012. https://doi.org/10.1007/978-3-642-29627-7_31
The Tuncbag paper gives the exact PCSF objective
f'(F) = beta * sum_{v notin V_F} p(v) + sum_{e in E_F} c(e) + omega * kappa(F)
and explains the artificial-root transformation by attaching a new root to every original vertex with cost omega. The remaining step — translating the prize term beta * p(v) for v notin V_F into plain Steiner-tree edge costs via a constant-size auxiliary-terminal gadget — is the routine extension recorded by this rule.
Reduction Algorithm
This proposal is for the first concrete undirected repo variant.
Let the source instance be:
- a graph
G = (V, E)
- vertex prizes
p(v) >= 0
- edge costs
c(e) >= 0
- parameters
beta >= 0, omega >= 0
Let V_p = { v in V : p(v) > 0 } denote the vertices that actually carry a prize. Vertices with p(v) = 0 contribute nothing to the omitted-prize term and need no gadget.
Construction
Build the target graph H = (V_H, E_H) with edge costs c_H and terminal set T_H as follows.
-
Add a new artificial root r. Set V_H = V cup {r}.
-
Keep every original edge e = (u, v) in E with cost c_H(e) = c(e).
-
Add a root-attachment edge (r, v) of cost omega for every original vertex v in V.
-
Per-vertex prize gadget. For each v in V_p (vertex with positive prize):
- add an auxiliary vertex
t_v (so V_H gains one node per prized vertex),
- add edge
(v, t_v) with cost 0,
- add edge
(r, t_v) with cost beta * p(v).
Intuition:
- if
v is included in the projected forest (so v is connected to r through original vertices), the gadget is paid by taking the 0-cost edge (v, t_v),
- if
v is omitted, the only way to connect t_v to the tree is the (r, t_v) edge of cost beta * p(v).
-
Target terminal set. T_H = {r} cup { t_v : v in V_p }. Original vertices are non-terminals (Steiner vertices).
Scaling parameter beta
If both p(v) and c(e) are integral and beta is integral, take beta = 1 (the construction above uses beta directly; no scaling needed because all charges are commensurate).
If any of p(v), c(e), or beta is rational, multiply all edge costs in H by a common denominator before solving the target instance (Steiner tree optima are preserved under positive scaling). If a future variant requires integer costs but the source has real-valued data, pick a scale factor S >= max(c(e), beta*p(v)) + 1 to disambiguate ties; for the integral case used here, beta = 1 with the original costs suffices.
Solving and witness extraction
Solve the resulting SteinerTree instance on H to obtain an optimal Steiner tree T* spanning T_H.
To recover the source PCSF witness (V_F, E_F):
E_F = T* cap E(G) (intersect the target tree edges with the original edge set E),
V_F = { v in V : t_v in V(T*) and the edge (v, t_v) belongs to T* } cup { v in V : v is an endpoint of some edge in E_F } cup { v in V : (r, v) in T* and v is the unique attachment of its component }.
Equivalently, take the connected components of T* after removing r and all gadget vertices t_v; the union of all original vertices appearing in these components is V_F, and E_F is the set of original edges among them.
Correctness intuition
- Every tree component of a source forest
F becomes attached exactly once to the artificial root r, so T* pays exactly omega * kappa(F).
- Original selected edges retain their original costs in the target objective.
- For
v in V_p: if v in V_F, the gadget edge (v, t_v) costs 0; if v notin V_F, the only way to reach the terminal t_v is the (r, t_v) edge of cost beta * p(v). Summing over V_p reproduces beta * sum_{v notin V_F} p(v).
- Vertices with
p(v) = 0 have no gadget and contribute nothing to the omitted-prize sum, matching the source objective.
- Conversely, any source forest
F can be lifted to a feasible Steiner tree in H by attaching each component once to r and resolving each gadget locally (included v: take (v, t_v); omitted v: take (r, t_v)).
The two objectives agree exactly: cost_H(T*) = f'(F*).
Size Overhead
Let:
n = num_vertices
m = num_edges
k = | V_p | = number of original vertices with p(v) > 0 (at most n)
Exact target sizes:
| Target metric |
Formula |
num_vertices |
n + k + 1 |
num_edges |
m + n + 2 * k (original edges + root-attachment edges + two gadget edges per prized vertex) |
num_terminals |
k + 1 |
Conservative upper bounds (using k <= n):
| Target metric |
Upper bound |
num_vertices |
<= 2 * n + 1 |
num_edges |
<= m + 3 * n |
num_terminals |
<= n + 1 |
The overhead is linear in the source graph size.
For the #[reduction(overhead = {...})] macro (which uses source-instance getters), the overhead expressions are:
num_vertices = "num_vertices + num_vertices_with_prize + 1"
num_edges = "num_edges + num_vertices + 2 * num_vertices_with_prize"
num_terminals = "num_vertices_with_prize + 1"
(num_vertices_with_prize is the getter on the PCSF source type that returns |V_p|. If the source variant exposes only num_vertices, use the bound versions 2 * num_vertices + 1, num_edges + 3 * num_vertices, num_vertices + 1.)
Validation Method
- Compare the target Steiner-tree optimum against brute-force PCSF evaluation on small graphs (
n <= 6).
- Closed-loop: lift the optimal source forest to a target tree, solve the target instance, and check
cost_H(T*) = f'(F*).
- Witness round-trip: extract
(V_F, E_F) from T* and verify it is a forest on G whose PCSF objective matches.
- Include cases:
- empty graph (
n = 0),
- empty forest optimum (large
beta * p(v) is not paid — vertex omitted),
- one connected tree optimum,
- multiple-tree optimum from a large
omega / edge-cost tradeoff,
- at least one vertex omitted at the optimum so the
(r, t_v) half of the gadget is exercised,
- zero-prize vertices alongside positive-prize vertices.
Example
We pick an instance where a prize vertex is omitted at the optimum (so the gadget is genuinely exercised), per the check-issue recommendation.
Source instance
- path
0 - 1 - 2 with edges (0,1) and (1,2)
- edge costs
c(0,1) = 10, c(1,2) = 10
- vertex prizes
p(0) = 5, p(1) = 1, p(2) = 5
beta = 1, omega = 1
Feasible source forests and their objectives
For each candidate forest F = (V_F, E_F):
V_F |
E_F |
edge cost |
omitted prize |
components kappa(F) |
omega * kappa(F) |
total |
{} |
{} |
0 |
5+1+5=11 |
0 |
0 |
11 |
{0} |
{} |
0 |
1+5=6 |
1 |
1 |
7 |
{2} |
{} |
0 |
5+1=6 |
1 |
1 |
7 |
{0, 2} |
{} |
0 |
1 |
2 |
2 |
3 |
{0, 1} |
{(0,1)} |
10 |
5 |
1 |
1 |
16 |
{1, 2} |
{(1,2)} |
10 |
5 |
1 |
1 |
16 |
{0, 1, 2} |
{(0,1),(1,2)} |
20 |
0 |
1 |
1 |
21 |
The optimum is V_F* = {0, 2}, E_F* = {}, objective 3. Vertex 1 is omitted even though p(1) = 1 > 0, because paying beta * p(1) = 1 is cheaper than paying the edge cost to connect it (c(0,1) = 10 or c(1,2) = 10).
Augmented target instance
Vertices: V_H = {0, 1, 2, r, t_0, t_1, t_2} (so |V_H| = n + k + 1 = 3 + 3 + 1 = 7).
Edges and costs:
| edge |
cost |
origin |
(0,1) |
10 |
original |
(1,2) |
10 |
original |
(r,0) |
1 |
root-attach (omega) |
(r,1) |
1 |
root-attach (omega) |
(r,2) |
1 |
root-attach (omega) |
(0, t_0) |
0 |
gadget include-edge |
(r, t_0) |
5 |
gadget omit-edge (beta*p(0)) |
(1, t_1) |
0 |
gadget include-edge |
(r, t_1) |
1 |
gadget omit-edge (beta*p(1)) |
(2, t_2) |
0 |
gadget include-edge |
(r, t_2) |
5 |
gadget omit-edge (beta*p(2)) |
Total target edges: m + n + 2k = 2 + 3 + 6 = 11. Terminal set: T_H = {r, t_0, t_1, t_2} (|T_H| = k+1 = 4).
Lifted optimal target tree
The lift of the source optimum V_F* = {0, 2}, E_F* = {} is:
- root attach
0: edge (r, 0) cost 1
- root attach
2: edge (r, 2) cost 1
- include gadget for
0: edge (0, t_0) cost 0
- include gadget for
2: edge (2, t_2) cost 0
- omit gadget for
1: edge (r, t_1) cost 1
This is a tree spanning T_H = {r, t_0, t_1, t_2} (vertices 0, 2 are Steiner relays). Vertex 1 and its incident original edges are not in T*.
Total target objective: 1 + 1 + 0 + 0 + 1 = 3. Matches the source optimum of 3.
Witness extraction
Removing r and all t_v from T* leaves the original vertices {0, 2} with no edges among them. So V_F = {0, 2}, E_F = T* cap E(G) = {}. This is the source optimum, confirming the round-trip.
Why this example exercises the novel gadget
Vertex 1 is omitted at the optimum, so the gadget edge (r, t_1) (cost beta * p(1) = 1) is selected in T*. A bug that charged beta * p(v) when v is included (instead of omitted), or that omitted the (r, t_v) edge entirely, would change this example's optimum and be caught by the closed-loop test.
BibTeX
@article{BienstockGoemansSimchiLeviWilliamson1993,
author = {Daniel Bienstock and Michel X. Goemans and David Simchi-Levi and David Williamson},
title = {A note on the prize collecting traveling salesman problem},
journal = {Mathematical Programming},
volume = {59},
number = {1},
pages = {413--420},
year = {1993},
doi = {10.1007/BF01581256},
url = {https://doi.org/10.1007/BF01581256}
}
@article{TuncbagEtAl2013PCSF,
author = {Nurcan Tuncbag and Alfredo Braunstein and Andrea Pagnani and Shao-Shan Carol Huang and Jennifer Chayes and Christian Borgs and Riccardo Zecchina and Ernest Fraenkel},
title = {Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem},
journal = {Journal of Computational Biology},
volume = {20},
number = {2},
pages = {124--136},
year = {2013},
doi = {10.1089/cmb.2012.0092},
url = {https://doi.org/10.1089/cmb.2012.0092}
}
@inproceedings{TuncbagEtAl2012RECOMB,
author = {Nurcan Tuncbag and Alfredo Braunstein and Andrea Pagnani and Shao-Shan Carol Huang and Jennifer Chayes and Christian Borgs and Riccardo Zecchina and Ernest Fraenkel},
title = {Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem},
booktitle = {Research in Computational Molecular Biology (RECOMB 2012)},
series = {Lecture Notes in Bioinformatics},
volume = {7262},
pages = {287--301},
year = {2012},
publisher = {Springer},
doi = {10.1007/978-3-642-29627-7_31},
url = {https://doi.org/10.1007/978-3-642-29627-7_31}
}
Source
PrizeCollectingSteinerForest
Target
SteinerTree
Motivation
This is the natural companion rule for the biology-paper PCSF model.
The key modeling point is:
This rule records the well-known auxiliary-root construction: turn a forest into a single tree in an augmented graph by adding an artificial root
r, while compiling the omitted-prize tradeoff into the target tree formulation through an explicit per-vertex terminal-pair gadget.Reference
The cited papers reduce PCSF to Prize-Collecting Steiner Tree (PCST) (a target that still carries node prizes), not to plain Steiner Tree. We use the same artificial-root idea, plus a standard auxiliary-terminal gadget to translate the remaining prize term into ordinary Steiner-tree edge costs. This combined construction is treated as a well-known auxiliary-root gadget in the prize-collecting Steiner literature (see Bienstock-Goemans-Simchi-Levi-Williamson 1993 for the original prize/penalty framework predating Tuncbag).
The Tuncbag paper gives the exact PCSF objective
f'(F) = beta * sum_{v notin V_F} p(v) + sum_{e in E_F} c(e) + omega * kappa(F)and explains the artificial-root transformation by attaching a new root to every original vertex with cost
omega. The remaining step — translating the prize termbeta * p(v)forv notin V_Finto plain Steiner-tree edge costs via a constant-size auxiliary-terminal gadget — is the routine extension recorded by this rule.Reduction Algorithm
This proposal is for the first concrete undirected repo variant.
Let the source instance be:
G = (V, E)p(v) >= 0c(e) >= 0beta >= 0,omega >= 0Let
V_p = { v in V : p(v) > 0 }denote the vertices that actually carry a prize. Vertices withp(v) = 0contribute nothing to the omitted-prize term and need no gadget.Construction
Build the target graph
H = (V_H, E_H)with edge costsc_Hand terminal setT_Has follows.Add a new artificial root
r. SetV_H = V cup {r}.Keep every original edge
e = (u, v) in Ewith costc_H(e) = c(e).Add a root-attachment edge
(r, v)of costomegafor every original vertexv in V.Per-vertex prize gadget. For each
v in V_p(vertex with positive prize):t_v(soV_Hgains one node per prized vertex),(v, t_v)with cost0,(r, t_v)with costbeta * p(v).Intuition:
vis included in the projected forest (sovis connected torthrough original vertices), the gadget is paid by taking the0-cost edge(v, t_v),vis omitted, the only way to connectt_vto the tree is the(r, t_v)edge of costbeta * p(v).Target terminal set.
T_H = {r} cup { t_v : v in V_p }. Original vertices are non-terminals (Steiner vertices).Scaling parameter beta
If both
p(v)andc(e)are integral andbetais integral, takebeta = 1(the construction above usesbetadirectly; no scaling needed because all charges are commensurate).If any of
p(v),c(e), orbetais rational, multiply all edge costs inHby a common denominator before solving the target instance (Steiner tree optima are preserved under positive scaling). If a future variant requires integer costs but the source has real-valued data, pick a scale factorS >= max(c(e), beta*p(v)) + 1to disambiguate ties; for the integral case used here,beta = 1with the original costs suffices.Solving and witness extraction
Solve the resulting
SteinerTreeinstance onHto obtain an optimal Steiner treeT*spanningT_H.To recover the source PCSF witness
(V_F, E_F):E_F = T* cap E(G)(intersect the target tree edges with the original edge setE),V_F = { v in V : t_v in V(T*) and the edge (v, t_v) belongs to T* } cup { v in V : v is an endpoint of some edge in E_F } cup { v in V : (r, v) in T* and v is the unique attachment of its component }.Equivalently, take the connected components of
T*after removingrand all gadget verticest_v; the union of all original vertices appearing in these components isV_F, andE_Fis the set of original edges among them.Correctness intuition
Fbecomes attached exactly once to the artificial rootr, soT*pays exactlyomega * kappa(F).v in V_p: ifv in V_F, the gadget edge(v, t_v)costs0; ifv notin V_F, the only way to reach the terminalt_vis the(r, t_v)edge of costbeta * p(v). Summing overV_preproducesbeta * sum_{v notin V_F} p(v).p(v) = 0have no gadget and contribute nothing to the omitted-prize sum, matching the source objective.Fcan be lifted to a feasible Steiner tree inHby attaching each component once torand resolving each gadget locally (includedv: take(v, t_v); omittedv: take(r, t_v)).The two objectives agree exactly:
cost_H(T*) = f'(F*).Size Overhead
Let:
n = num_verticesm = num_edgesk = | V_p |= number of original vertices withp(v) > 0(at mostn)Exact target sizes:
num_verticesn + k + 1num_edgesm + n + 2 * k(original edges + root-attachment edges + two gadget edges per prized vertex)num_terminalsk + 1Conservative upper bounds (using
k <= n):num_vertices<= 2 * n + 1num_edges<= m + 3 * nnum_terminals<= n + 1The overhead is linear in the source graph size.
For the
#[reduction(overhead = {...})]macro (which uses source-instance getters), the overhead expressions are:(
num_vertices_with_prizeis the getter on the PCSF source type that returns|V_p|. If the source variant exposes onlynum_vertices, use the bound versions2 * num_vertices + 1,num_edges + 3 * num_vertices,num_vertices + 1.)Validation Method
n <= 6).cost_H(T*) = f'(F*).(V_F, E_F)fromT*and verify it is a forest onGwhose PCSF objective matches.n = 0),beta * p(v)is not paid — vertex omitted),omega/ edge-cost tradeoff,(r, t_v)half of the gadget is exercised,Example
We pick an instance where a prize vertex is omitted at the optimum (so the gadget is genuinely exercised), per the check-issue recommendation.
Source instance
0 - 1 - 2with edges(0,1)and(1,2)c(0,1) = 10,c(1,2) = 10p(0) = 5,p(1) = 1,p(2) = 5beta = 1,omega = 1Feasible source forests and their objectives
For each candidate forest
F = (V_F, E_F):V_FE_Fkappa(F)omega * kappa(F){}{}5+1+5=1111{0}{}1+5=67{2}{}5+1=67{0, 2}{}13{0, 1}{(0,1)}516{1, 2}{(1,2)}516{0, 1, 2}{(0,1),(1,2)}021The optimum is
V_F* = {0, 2},E_F* = {}, objective3. Vertex1is omitted even thoughp(1) = 1 > 0, because payingbeta * p(1) = 1is cheaper than paying the edge cost to connect it (c(0,1) = 10orc(1,2) = 10).Augmented target instance
Vertices:
V_H = {0, 1, 2, r, t_0, t_1, t_2}(so|V_H| = n + k + 1 = 3 + 3 + 1 = 7).Edges and costs:
(0,1)(1,2)(r,0)(r,1)(r,2)(0, t_0)(r, t_0)beta*p(0))(1, t_1)(r, t_1)beta*p(1))(2, t_2)(r, t_2)beta*p(2))Total target edges:
m + n + 2k = 2 + 3 + 6 = 11. Terminal set:T_H = {r, t_0, t_1, t_2}(|T_H| = k+1 = 4).Lifted optimal target tree
The lift of the source optimum
V_F* = {0, 2}, E_F* = {}is:0: edge(r, 0)cost 12: edge(r, 2)cost 10: edge(0, t_0)cost 02: edge(2, t_2)cost 01: edge(r, t_1)cost 1This is a tree spanning
T_H = {r, t_0, t_1, t_2}(vertices0,2are Steiner relays). Vertex1and its incident original edges are not inT*.Total target objective:
1 + 1 + 0 + 0 + 1 = 3. Matches the source optimum of3.Witness extraction
Removing
rand allt_vfromT*leaves the original vertices{0, 2}with no edges among them. SoV_F = {0, 2},E_F = T* cap E(G) = {}. This is the source optimum, confirming the round-trip.Why this example exercises the novel gadget
Vertex
1is omitted at the optimum, so the gadget edge(r, t_1)(costbeta * p(1) = 1) is selected inT*. A bug that chargedbeta * p(v)whenvis included (instead of omitted), or that omitted the(r, t_v)edge entirely, would change this example's optimum and be caught by the closed-loop test.BibTeX