[Rule] Vertex Cover to Multiple Copy File Allocation #425

@isPANN

Source: VERTEX COVER (implemented in this repo as Decision<MinimumVertexCover<SimpleGraph, i32>> with all vertex weights set to 1)
Target: MULTIPLE COPY FILE ALLOCATION (implemented as a new decision wrapper Decision<MultipleCopyFileAllocation>, i.e. DecisionMultipleCopyFileAllocation)
Motivation: Use the uniform-cost strong-NP-completeness result for MULTIPLE COPY FILE ALLOCATION recorded by Van Sickle and Chandy (1977, IFIP Congress 77) and catalogued by Garey and Johnson (1979, Appendix A4.1 [SR6]). The right gadget is not “use the same graph and make access expensive”; that makes copies too cheap and collapses the optimum to “copy everywhere.” Instead, attach a hub that cheaply keeps every original vertex within distance 1, add two private leaves that force the hub into every optimum, and represent each source edge by two demand vertices whose distance to the nearest copy jumps from 1 to 2 exactly when the edge is left uncovered. With uniform storage = 2 and uniform usage = 1, a placement consisting of the hub plus the original vertices of a set C ⊆ V costs n + 2m + 4 + |C| + 2·(# edges uncovered by C), so the decision threshold K = n + 2m + 4 + k encodes a size-k vertex cover exactly.
Reference: Van Sickle and Chandy (1977, Information Processing 77 / IFIP Congress 77); Garey and Johnson (1979, Computers and Intractability, Appendix A4.1 [SR6], p. 227)

GJ Source Entry

[SR6] MULTIPLE COPY FILE ALLOCATION
INSTANCE: Graph G = (V, E), for each v ∈ V a usage u(v) ∈ Z⁺ and a storage cost s(v) ∈ Z⁺, and a positive integer K.
QUESTION: Is there a subset V' ⊆ V such that, if for each v ∈ V we let d(v) denote the number of edges in the shortest path in G from v to a member of V', we have

∑_{v ∈ V'} s(v) + ∑_{v ∈ V} d(v)·u(v) ≤ K ?

Reference: [Van Sickle and Chandy, 1977]. Transformation from VERTEX COVER.
Comment: NP-complete in the strong sense, even if all v ∈ V have the same value of u(v) and the same value of s(v).

Reduction Algorithm

Summary:
Let the source be a unit-weight VERTEX COVER instance (G = (V, E), k) with n = |V| and m = |E|. In repo terms, this is Decision<MinimumVertexCover<SimpleGraph, i32>> with all weights equal to 1 and decision bound k.

Because the codebase model MultipleCopyFileAllocation is an optimization problem over fields graph, usage, and storage, the actual target of the reduction should be a decision wrapper Decision<MultipleCopyFileAllocation> registered in the usual way:

  • implement DecisionProblemMeta for MultipleCopyFileAllocation with DECISION_NAME = "DecisionMultipleCopyFileAllocation"
  • add inherent getters on Decision<MultipleCopyFileAllocation> delegating to the inner problem: num_vertices(), num_edges(), and k() (the bound, converted to usize)
  • invoke register_decision_variant! with size getters num_vertices and num_edges

The reduction itself is:

  1. Graph gadget.
    Construct a graph H with four kinds of vertices:

    • a hub vertex r
    • two leaves p and q, each adjacent only to r
    • one original vertex x_v for each v ∈ V
    • for each source edge e = {u, v} ∈ E, two edge-demand vertices a_e and b_e

    Add edges:

    • {r, p} and {r, q}
    • {r, x_v} for every v ∈ V
    • {a_e, x_u}, {a_e, x_v}, {b_e, x_u}, {b_e, x_v} for every e = {u, v} ∈ E

    No other edges are added.

  2. Uniform costs.
    Set

    • storage(z) = 2 for every vertex z ∈ V(H)
    • usage(z) = 1 for every vertex z ∈ V(H)

    This stays inside the uniform-cost regime explicitly noted by Garey and Johnson (1979, Appendix A4.1 [SR6]) for the Van Sickle-Chandy hardness result.

  3. Decision bound.
    Set

    • K = n + 2m + 4 + k
  4. No isolated-vertex precondition is needed.
    Unlike the broken same-graph draft, this gadget handles isolated source vertices directly. If v is isolated in G, then x_v is simply another spoke of the hub r: left unselected it contributes access cost 1, selecting it only trades that 1 for storage 2, and either way it does not interfere with the cover accounting. In particular, the empty-edge case m = 0 is handled correctly: the source answer is YES for every k ≥ 0, and the target answer is YES via the single-copy placement {r}, whose cost n + 4 is at most K = n + 4 + k.

  5. Normalization lemma.
    Any nonempty copy set S ⊆ V(H) can be transformed in polynomial time, without increasing cost, into a set of the form

    • S' = {r} ∪ {x_v : v ∈ C}
      for some C ⊆ V.

    The repairs are:

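    • If a leaf p or q is selected, replace it by r (or simply drop it if r is already selected). Storage does not increase. The leaf's own distance rises from 0 to 1, but this is offset either by r dropping from distance 1 to 0 or by the saved storage of 2; every other distance weakly improves because any path from a leaf passes through r.
    • After removing selected leaves, if r ∉ S, add r. This increases storage by 2, but decreases the access term by at least 3: r itself drops by at least 1, and each of p and q drops from distance at least 2 to distance 1.
    • With r selected, any selected edge-demand vertex a_e or b_e for e = {u, v} can be replaced by one of its endpoint vertices x_u or x_v without increasing cost. If an endpoint is already selected, drop the demand vertex: storage falls by 2 and only its own distance rises, from 0 to 1. Otherwise swap it for x_u: the demand vertex goes from distance 0 to 1, x_u goes from distance 1 to 0, and no other distance increases, since any other vertex that reached the demand vertex through x_v can reach r through x_v in the same number of steps.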
  6. Cost formula in normal form.
    Fix C ⊆ V and consider S(C) = {r} ∪ {x_v : v ∈ C}.
    Let U(C) be the set of source edges uncovered by C.

    Then:

    • storage cost is 2(|C| + 1)
    • each of the two leaves contributes access cost 1
    • each unselected original vertex contributes access cost 1
    • for each covered source edge, both demand vertices are at distance 1, contributing 2
    • for each uncovered source edge, both demand vertices are at distance 2, contributing 4

    Therefore

    • cost(S(C)) = 2(|C| + 1) + 2 + (n - |C|) + 2(m - |U(C)|) + 4|U(C)|
    • cost(S(C)) = n + 2m + 4 + |C| + 2|U(C)|
  7. Forward direction.
    If G has a vertex cover C with |C| ≤ k, then |U(C)| = 0, so

    • cost(S(C)) = n + 2m + 4 + |C| ≤ n + 2m + 4 + k = K
      Hence the constructed Decision<MultipleCopyFileAllocation> instance is YES.
  8. Reverse direction.
    Suppose the MCFA instance has a solution of cost at most K.
    Normalize it to S(C) as above. Then

    • n + 2m + 4 + |C| + 2|U(C)| ≤ K = n + 2m + 4 + k
      so
    • |C| + 2|U(C)| ≤ k

    Now repair uncovered edges: for each source edge in U(C) that is still uncovered, add one of its endpoints to C. At most |U(C)| vertices are added, so this produces a genuine vertex cover C* with

    • |C*| ≤ |C| + |U(C)| ≤ |C| + 2|U(C)| ≤ k
      Hence the source VERTEX COVER instance is YES.
  9. Solution extraction.
    From any target witness, first normalize it to S(C) = {r} ∪ {x_v : v ∈ C}. Then repair uncovered edges as in Step 8 to obtain a source vertex cover C* of size at most k. This is polynomial-time witness extraction.

  10. Time complexity.
    Building H, the uniform usage vector, the uniform storage vector, and the decision bound K takes O(n + m) time.
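
Steps 1 to 3 can be sketched as follows. This is a minimal illustration, not the repo's implementation: `McfaInstance`, `build_mcfa`, and `mcfa_cost` are hypothetical names, the struct is a plain-data stand-in for whatever `MultipleCopyFileAllocation` actually stores, and the vertex numbering of H (0 = r, 1 = p, 2 = q, 3 + v = x_v, then a_e, b_e in source-edge order) is an assumption made only for this sketch.

```rust
use std::collections::VecDeque;

/// Hypothetical plain-data stand-in for the target instance; the repo's real
/// `MultipleCopyFileAllocation` type may be shaped differently.
pub struct McfaInstance {
    pub num_vertices: usize,
    pub edges: Vec<(usize, usize)>,
    pub usage: Vec<i64>,
    pub storage: Vec<i64>,
    pub bound: i64,
}

/// Assumed layout of H: 0 = hub r, 1 = leaf p, 2 = leaf q, 3 + v = x_v,
/// and 3 + n + 2j, 3 + n + 2j + 1 = a_e, b_e for the j-th source edge.
pub fn build_mcfa(n: usize, source_edges: &[(usize, usize)], k: usize) -> McfaInstance {
    let m = source_edges.len();
    let num_vertices = n + 2 * m + 3;
    let mut edges = vec![(0, 1), (0, 2)]; // {r, p} and {r, q}
    for v in 0..n {
        edges.push((0, 3 + v)); // {r, x_v}
    }
    for (j, &(u, v)) in source_edges.iter().enumerate() {
        let a = 3 + n + 2 * j;
        let b = a + 1;
        edges.extend([(a, 3 + u), (a, 3 + v), (b, 3 + u), (b, 3 + v)]);
    }
    McfaInstance {
        num_vertices,
        edges,
        usage: vec![1; num_vertices],   // uniform usage = 1
        storage: vec![2; num_vertices], // uniform storage = 2
        bound: (n + 2 * m + 4 + k) as i64,
    }
}

/// MCFA objective of a copy set: storage of the copies plus usage-weighted
/// hop distance to the nearest copy, computed by multi-source BFS.
/// Returns None if some vertex cannot reach a copy (e.g. empty copy set).
pub fn mcfa_cost(inst: &McfaInstance, copies: &[usize]) -> Option<i64> {
    let mut adj: Vec<Vec<usize>> = vec![Vec::new(); inst.num_vertices];
    for &(a, b) in &inst.edges {
        adj[a].push(b);
        adj[b].push(a);
    }
    let mut dist: Vec<Option<i64>> = vec![None; inst.num_vertices];
    let mut queue = VecDeque::new();
    let mut cost = 0;
    for &c in copies {
        if dist[c].is_none() {
            dist[c] = Some(0);
            cost += inst.storage[c];
            queue.push_back(c);
        }
    }
    while let Some(v) = queue.pop_front() {
        let d = dist[v].unwrap();
        cost += d * inst.usage[v];
        for &w in &adj[v] {
            if dist[w].is_none() {
                dist[w] = Some(d + 1);
                queue.push_back(w);
            }
        }
    }
    if dist.iter().all(|d| d.is_some()) { Some(cost) } else { None }
}
```

Under this layout, the C_6 instance with k = 3 should come out with 21 vertices, 32 edges, and bound 25, and the placement {r, x_1, x_3, x_5}, i.e. `&[0, 4, 6, 8]`, should evaluate to `Some(25)`. The hub-only placement `&[0]` evaluates to `Some(34)`, matching n + 2m + 4 + 0 + 2·6.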

Size Overhead

Symbols:

  • n = num_vertices of source graph G
  • m = num_edges of source graph G
  • k = source decision bound

| Target metric (code name) | Polynomial |
|---|---|
| num_vertices | num_vertices + 2*num_edges + 3 |
| num_edges | num_vertices + 4*num_edges + 2 |
| k | num_vertices + 2*num_edges + 4 + k |

Derivation:

  • Vertices in H: n original vertices, 2m edge-demand vertices, one hub, and two leaves, for a total of n + 2m + 3.
  • Edges in H: n hub-to-original edges, 2 hub-to-leaf edges, and 4m endpoint-incidence edges, for a total of n + 4m + 2.
  • The target decision bound is K = n + 2m + 4 + k.
  • Both cost vectors have length n + 2m + 3:
    • usage = [1; n + 2m + 3]
    • storage = [2; n + 2m + 3]

Validation Method

  • Closed-loop test: start from a unit-weight Decision<MinimumVertexCover<SimpleGraph, i32>> instance, build the target Decision<MultipleCopyFileAllocation> instance, brute-force all copy placements, and verify:
    • VC(G) ≤ k iff MCFA(H) ≤ n + 2m + 4 + k, where VC(G) is the minimum vertex cover size of G and MCFA(H) is the minimum MCFA cost on H
    • normalized target witnesses extract back to source covers of size at most k
  • Corner case with isolated vertices: source graph on {0,1,2,3} with one edge {0,1} and isolated vertices 2,3, bound k = 1.
    • Here n = 4, m = 1, so K = 4 + 2 + 4 + 1 = 11.
    • The target YES witness {r, x_0} has cost exactly 11, so isolated vertices do not break the reduction.
  • Empty-edge case: if E = ∅, then the source is YES for every k ≥ 0.
    • The target witness {r} has cost n + 4 ≤ K, with equality when k = 0, so the reduction still matches.
  • Worked cycle case: for C_6 and k = 3, the target optimum is exactly 25 (not 6, and not 114), attained by the hub plus the three alternating cover vertices.
  • Adversarial non-cover check: for the same C_6, any normalized choice of only two source vertices leaves at least two source edges uncovered, so its cost is at least n + 2m + 4 + 2 + 2·2 = 28 > 25.
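
A brute-force version of the closed-loop check might look like the sketch below. It is deliberately independent of the repo's types: all function names are hypothetical, copy sets and covers are encoded as `u32` bitmasks, the vertex numbering of H (0 = r, 1 = p, 2 = q, 3 + v = x_v, then a_e, b_e) is an assumption, and the uniform costs storage = 2, usage = 1 are hard-coded. It enumerates all 2^(n + 2m + 3) − 1 nonempty placements, so it is only meant for very small source graphs.

```rust
use std::collections::VecDeque;

/// Adjacency lists of the gadget graph H under the assumed numbering.
fn gadget_adjacency(n: usize, edges: &[(usize, usize)]) -> Vec<Vec<usize>> {
    let mut adj: Vec<Vec<usize>> = vec![Vec::new(); n + 2 * edges.len() + 3];
    let mut link = |a: usize, b: usize| {
        adj[a].push(b);
        adj[b].push(a);
    };
    link(0, 1); // {r, p}
    link(0, 2); // {r, q}
    for v in 0..n {
        link(0, 3 + v); // {r, x_v}
    }
    for (j, &(u, v)) in edges.iter().enumerate() {
        for d in [3 + n + 2 * j, 3 + n + 2 * j + 1] {
            link(d, 3 + u);
            link(d, 3 + v);
        }
    }
    adj
}

/// Uniform-cost MCFA objective of the nonempty copy set encoded by `mask`.
fn cost(adj: &[Vec<usize>], mask: u32) -> usize {
    let mut dist = vec![usize::MAX; adj.len()];
    let mut queue = VecDeque::new();
    for v in 0..adj.len() {
        if ((mask >> v) & 1) == 1 {
            dist[v] = 0;
            queue.push_back(v);
        }
    }
    let mut total = 2 * queue.len(); // storage = 2 per copy
    while let Some(v) = queue.pop_front() {
        total += dist[v]; // usage = 1 per vertex
        for &w in &adj[v] {
            if dist[w] == usize::MAX {
                dist[w] = dist[v] + 1;
                queue.push_back(w);
            }
        }
    }
    total
}

/// Minimum MCFA cost over all nonempty copy sets of H.
fn min_mcfa(n: usize, edges: &[(usize, usize)]) -> usize {
    let adj = gadget_adjacency(n, edges);
    (1u32..1 << adj.len()).map(|mask| cost(&adj, mask)).min().unwrap()
}

/// Minimum vertex cover size of the source graph by exhaustive search.
fn min_vertex_cover(n: usize, edges: &[(usize, usize)]) -> usize {
    (0u32..1 << n)
        .filter(|c| edges.iter().all(|&(u, v)| (((c >> u) | (c >> v)) & 1) == 1))
        .map(|c| c.count_ones() as usize)
        .min()
        .unwrap()
}

/// Closed-loop check: for every bound k in 0..=n the YES/NO answers agree.
fn closed_loop_ok(n: usize, edges: &[(usize, usize)]) -> bool {
    let (vc, opt) = (min_vertex_cover(n, edges), min_mcfa(n, edges));
    (0..=n).all(|k| (vc <= k) == (opt <= n + 2 * edges.len() + 4 + k))
}
```

On the isolated-vertex corner case above, `min_mcfa(4, &[(0, 1)])` should return 11, and `closed_loop_ok` should hold for it as well as for a path on three vertices, a triangle, and an edgeless graph. C_6 has 21 target vertices, so exhaustive enumeration there is possible but noticeably slower.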

Example

Source instance (VERTEX COVER):
Let G = C_6 with vertices {0,1,2,3,4,5} and edges

  • {0,1}, {1,2}, {2,3}, {3,4}, {4,5}, {5,0}

Take bound k = 3. This is a YES instance: for example,

  • C = {1,3,5}
    is a vertex cover of size 3.

Constructed target instance (Decision<MultipleCopyFileAllocation>):
Build H with:

  • hub r
  • leaves p, q
  • original vertices x_0, x_1, x_2, x_3, x_4, x_5
  • for each cycle edge e_i = {i, (i+1) mod 6}, two demand vertices a_i, b_i

Counts:

  • num_vertices = 6 + 2·6 + 3 = 21
  • num_edges = 6 + 4·6 + 2 = 32

Uniform costs:

  • storage(z) = 2 for all 21 vertices
  • usage(z) = 1 for all 21 vertices

Decision bound:

  • K = n + 2m + 4 + k = 6 + 12 + 4 + 3 = 25

Solution mapping:
Take the copy set

  • S = {r, x_1, x_3, x_5}

Its cost is:

  • storage: 4 selected vertices, so 4·2 = 8
  • access from unselected original vertices x_0, x_2, x_4: 3
  • access from leaves p, q: 2
  • access from the 12 demand vertices: every source edge is covered by {1,3,5}, so each demand vertex is at distance 1, contributing 12

Total:

  • 8 + 3 + 2 + 12 = 25 = K

Verification:

  • Forward: the source cover {1,3,5} of size 3 maps to an MCFA placement of cost exactly 25.
  • Reverse: any normalized target solution has cost
    • 22 + |C| + 2|U(C)|
      because n + 2m + 4 = 22 here.
      So cost at most 25 implies
    • |C| + 2|U(C)| ≤ 3
      and the repair argument yields a vertex cover of size at most 3.
  • For instance, C = {1,3} is not a cover of C_6; it leaves two edges uncovered, so its normalized MCFA cost is
    • 22 + 2 + 2·2 = 28 > 25
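
The arithmetic in this verification can be reproduced with a short sketch of the normal-form cost formula (Step 6) and the endpoint repair (Step 8). The function names are hypothetical, and a normalized placement is represented only by its set C of selected source vertices.

```rust
use std::collections::BTreeSet;

/// Source edges with neither endpoint in `cover`, i.e. the set U(C).
fn uncovered(edges: &[(usize, usize)], cover: &BTreeSet<usize>) -> Vec<(usize, usize)> {
    edges
        .iter()
        .copied()
        .filter(|(u, v)| !cover.contains(u) && !cover.contains(v))
        .collect()
}

/// Closed-form cost n + 2m + 4 + |C| + 2|U(C)| of S(C) = {r} ∪ {x_v : v ∈ C}.
fn normal_form_cost(n: usize, edges: &[(usize, usize)], cover: &BTreeSet<usize>) -> usize {
    n + 2 * edges.len() + 4 + cover.len() + 2 * uncovered(edges, cover).len()
}

/// Step 8 repair: add one endpoint of every edge that is still uncovered.
fn repair(edges: &[(usize, usize)], cover: &BTreeSet<usize>) -> BTreeSet<usize> {
    let mut fixed = cover.clone();
    for &(u, v) in edges {
        if !fixed.contains(&u) && !fixed.contains(&v) {
            fixed.insert(u);
        }
    }
    fixed
}
```

For C_6, C = {1,3,5} gives cost 25 and is left unchanged by `repair`, while C = {1,3} gives cost 28 with uncovered edges {4,5} and {5,0}; repairing it adds vertices 4 and 5, which yields a cover of size 4 = |C| + |U(C)|.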

References

  • [VanSickleChandy1977] Lawrence Van Sickle and K. Mani Chandy (1977). Computational Complexity of Network Design Algorithms. In Information Processing 77: Proceedings of IFIP Congress 77, pp. 235-239.
  • [GareyJohnson1979] Michael R. Garey and David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman. Appendix A4.1, entry [SR6].

Metadata

    Labels

    • Blocked: Rule prerequisite not met: source or target not implemented
    • Incomplete: Reduction doesn't cover all source instances (Rule Check 5)
    • rule: A new reduction rule to be added.

    Status
    OnHold
