[Rule] Vertex Cover to Multiple Copy File Allocation

**Source:** VERTEX COVER (implemented in this repo as `Decision<MinimumVertexCover<SimpleGraph, i32>>` with all vertex weights set to `1`)
**Target:** MULTIPLE COPY FILE ALLOCATION (implemented as a new decision wrapper `Decision<MultipleCopyFileAllocation>`, i.e. `DecisionMultipleCopyFileAllocation`)
**Motivation:** Use the uniform-cost strong-NP-completeness result for MULTIPLE COPY FILE ALLOCATION recorded by Van Sickle and Chandy (1977, IFIP Congress 77) and catalogued by Garey and Johnson (1979, Appendix A4.1 [SR6]). The right gadget is not “use the same graph and make access expensive”; that makes copies too cheap and collapses the optimum to “copy everywhere.” Instead, attach a hub that cheaply keeps every original vertex within distance 1, add two private leaves that force the hub into every optimum, and represent each source edge by two demand vertices whose distance jumps from 1 to 2 exactly when the edge is left uncovered. With uniform `storage = 2` and uniform `usage = 1`, the normalized MCFA cost becomes `n + 2m + 4 + |C| + 2·(# uncovered edges)`, so the decision threshold encodes a size-`k` vertex cover exactly.
**Reference:** Van Sickle and Chandy (1977, *Information Processing 77 / IFIP Congress 77*); Garey and Johnson (1979, *Computers and Intractability*, Appendix A4.1 [SR6], p. 227)

## GJ Source Entry

> [SR6] MULTIPLE COPY FILE ALLOCATION
> INSTANCE: Graph G = (V, E), for each v ∈ V a usage u(v) ∈ Z⁺ and a storage cost s(v) ∈ Z⁺, and a positive integer K.
> QUESTION: Is there a subset V' ⊆ V such that, if for each v ∈ V we let d(v) denote the number of edges in the shortest path in G from v to a member of V', we have
>
> ∑_{v ∈ V'} s(v) + ∑_{v ∈ V} d(v)·u(v) ≤ K ?
>
> Reference: [Van Sickle and Chandy, 1977]. Transformation from VERTEX COVER.
> Comment: NP-complete in the strong sense, even if all v ∈ V have the same value of u(v) and the same value of s(v).

## Reduction Algorithm

**Summary:**
Let the source be a unit-weight VERTEX COVER instance `(G = (V, E), k)` with `n = |V|` and `m = |E|`. In repo terms, this is `Decision<MinimumVertexCover<SimpleGraph, i32>>` with all weights equal to `1` and decision bound `k`.

Because the codebase model [`MultipleCopyFileAllocation`](src/models/graph/multiple_copy_file_allocation.rs) is an optimization problem over fields `graph`, `usage`, and `storage`, the actual target of the reduction should be a decision wrapper `Decision<MultipleCopyFileAllocation>` registered in the usual way:
- implement `DecisionProblemMeta for MultipleCopyFileAllocation` with `DECISION_NAME = "DecisionMultipleCopyFileAllocation"`
- add inherent getters on `Decision<MultipleCopyFileAllocation>` delegating to the inner problem: `num_vertices()`, `num_edges()`, and `k()` (the bound, converted to `usize`)
- invoke `register_decision_variant!` with size getters `num_vertices` and `num_edges`

The reduction itself is:

1. **Graph gadget.**
   Construct a graph `H` with four kinds of vertices:
   - a hub vertex `r`
   - two leaves `p` and `q`, each adjacent only to `r`
   - one original vertex `x_v` for each `v ∈ V`
   - for each source edge `e = {u, v} ∈ E`, two edge-demand vertices `a_e` and `b_e`

   Add edges:
   - `{r, p}` and `{r, q}`
   - `{r, x_v}` for every `v ∈ V`
   - `{a_e, x_u}`, `{a_e, x_v}`, `{b_e, x_u}`, `{b_e, x_v}` for every `e = {u, v} ∈ E`

   No other edges are added.

2. **Uniform costs.**
   Set
   - `storage(z) = 2` for every vertex `z ∈ V(H)`
   - `usage(z) = 1` for every vertex `z ∈ V(H)`

   This stays inside the uniform-cost regime explicitly noted by Garey and Johnson (1979, Appendix A4.1 [SR6]) for the Van Sickle-Chandy hardness result.

3. **Decision bound.**
   Set
   - `K = n + 2m + 4 + k`

4. **No isolated-vertex precondition is needed.**
   Unlike the broken same-graph draft, this gadget handles isolated source vertices directly. If `v` is isolated in `G`, then `x_v` is simply another spoke of the hub `r`; it contributes the same baseline access cost `1` whether or not it is selected, and it does not interfere with the cover accounting. In particular, the empty-edge case `m = 0` is handled correctly: the source answer is YES iff `k ≥ 0`, and the target answer is YES via the single-copy placement `{r}`.

5. **Normalization lemma.**
   Any feasible (that is, nonempty) MCFA solution `S ⊆ V(H)` can be transformed in polynomial time, without increasing cost, into a set of the form
   - `S' = {r} ∪ {x_v : v ∈ C}`
   for some `C ⊆ V`.

   The repairs are:
   - If `p` or `q` is selected, replace that leaf by `r`, or simply drop the leaf if `r` is already selected. In the first case storage is unchanged: the leaf's own distance rises from `0` to `1`, the distance of `r` falls from `1` to `0`, and every other vertex weakly improves because any path to a leaf passes through `r`. In the second case storage drops by `2` while only the leaf's distance rises, by `1`.
   - After removing selected leaves, if `r ∉ S`, add `r`. This increases storage by `2`, but decreases the access term by at least `3`: `r` itself drops by at least `1`, and each of `p` and `q` drops from distance at least `2` to distance `1`.
   - With `r` selected, any selected edge-demand vertex `a_e` or `b_e` can be replaced by one of its endpoint vertices `x_u` or `x_v` without increasing cost. If an endpoint is already selected, simply drop the demand vertex: storage falls by `2` and access rises by `1`. Otherwise the replaced demand vertex goes from distance `0` to `1`, the newly selected endpoint goes from distance `1` to `0`, and other distances weakly improve.

6. **Cost formula in normal form.**
   Fix `C ⊆ V` and consider `S(C) = {r} ∪ {x_v : v ∈ C}`.
   Let `U(C)` be the set of source edges uncovered by `C`.

   Then:
   - storage cost is `2(|C| + 1)`
   - each of the two leaves contributes access cost `1`
   - each unselected original vertex contributes access cost `1`
   - for each covered source edge, both demand vertices are at distance `1`, contributing `2`
   - for each uncovered source edge, both demand vertices are at distance `2`, contributing `4`

   Therefore
   - `cost(S(C)) = 2(|C| + 1) + 2 + (n - |C|) + 2(m - |U(C)|) + 4|U(C)|`
   - `cost(S(C)) = n + 2m + 4 + |C| + 2|U(C)|`

7. **Forward direction.**
   If `G` has a vertex cover `C` with `|C| ≤ k`, then `|U(C)| = 0`, so
   - `cost(S(C)) = n + 2m + 4 + |C| ≤ n + 2m + 4 + k = K`
   Hence the constructed `Decision<MultipleCopyFileAllocation>` instance is YES.

8. **Reverse direction.**
   Suppose the MCFA instance has a solution of cost at most `K`.
   Normalize it to `S(C)` as above. Then
   - `n + 2m + 4 + |C| + 2|U(C)| ≤ K = n + 2m + 4 + k`
   so
   - `|C| + 2|U(C)| ≤ k`

   Now repair uncovered edges in the obvious way: for each uncovered source edge in `U(C)`, add one of its endpoints to `C`. This produces a genuine vertex cover `C*` with
   - `|C*| ≤ |C| + |U(C)| ≤ |C| + 2|U(C)| ≤ k`
   Hence the source VERTEX COVER instance is YES.

9. **Solution extraction.**
   From any target witness, first normalize it to `S(C) = {r} ∪ {x_v : v ∈ C}`. Then repair uncovered edges as in Step 8 to obtain a source vertex cover `C*` of size at most `k`. This is polynomial-time witness extraction.

10. **Time complexity.**
    Building `H`, the uniform `usage` vector, the uniform `storage` vector, and the decision bound `K` takes `O(n + m)` time.
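
Steps 5, 8, and 9 can be sketched in Rust as follows. This is a minimal illustration rather than the repo's implementation: the vertex layout (`0 = r`, `1 = p`, `2 = q`, `3 + v = x_v`, then `a_e`, `b_e` in edge order) and the function names `extract_cover` and `is_vertex_cover` are assumptions made for this sketch.

```rust
use std::collections::BTreeSet;

/// Extracts a source vertex cover from a target copy placement.
/// Hypothetical layout of H (an assumption of this sketch):
/// 0 = hub r, 1 = leaf p, 2 = leaf q, 3 + v = x_v,
/// 3 + n + 2*i = a_e and 3 + n + 2*i + 1 = b_e for the i-th source edge.
fn extract_cover(n: usize, edges: &[(usize, usize)], placement: &[usize]) -> BTreeSet<usize> {
    let mut cover = BTreeSet::new();
    // Normalization: keep the selected original vertices.
    // The hub and the leaves carry no cover information.
    for &z in placement {
        if z >= 3 && z < 3 + n {
            cover.insert(z - 3);
        }
    }
    // Normalization: swap a selected demand vertex for an endpoint,
    // unless one of its endpoints is already selected.
    for &z in placement {
        if z >= 3 + n {
            let (u, v) = edges[(z - 3 - n) / 2];
            if !cover.contains(&u) && !cover.contains(&v) {
                cover.insert(u);
            }
        }
    }
    // Repair (Step 8): add one endpoint of every edge still uncovered.
    for &(u, v) in edges {
        if !cover.contains(&u) && !cover.contains(&v) {
            cover.insert(u);
        }
    }
    cover
}

/// Checks that every source edge has at least one endpoint in `cover`.
fn is_vertex_cover(edges: &[(usize, usize)], cover: &BTreeSet<usize>) -> bool {
    edges.iter().all(|(u, v)| cover.contains(u) || cover.contains(v))
}
```

With the `C_6` edges listed in the order `{0,1}, {1,2}, {2,3}, {3,4}, {4,5}, {5,0}`, the placement `{r, x_1, x_3}` (indices `[0, 4, 6]` under the assumed layout) projects to `{1, 3}`, and the repair pass adds `4` and `5`, giving the cover `{1, 3, 4, 5}`.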

## Size Overhead

**Symbols:**
- `n = num_vertices` of source graph `G`
- `m = num_edges` of source graph `G`
- `k =` source decision bound

| Target metric (code name) | Polynomial |
|----------------------------|------------|
| `num_vertices`             | `num_vertices + 2*num_edges + 3` |
| `num_edges`                | `num_vertices + 4*num_edges + 2` |
| `k`                        | `num_vertices + 2*num_edges + 4 + k` |

**Derivation:**
- Vertices in `H`: `n` original vertices, `2m` edge-demand vertices, one hub, and two leaves, for a total of `n + 2m + 3`.
- Edges in `H`: `n` hub-to-original edges, `2` hub-to-leaf edges, and `4m` endpoint-incidence edges, for a total of `n + 4m + 2`.
- The target decision bound is `K = n + 2m + 4 + k`.
- Both cost vectors have length `n + 2m + 3`:
  - `usage = [1; n + 2m + 3]`
  - `storage = [2; n + 2m + 3]`
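
The counts above can be checked by building `H` explicitly. The sketch below is illustrative only: the struct `McfaDecisionInstance`, the function `reduce`, and the vertex numbering (`0 = r`, `1 = p`, `2 = q`, `3 + v = x_v`, then the demand vertices) are hypothetical names chosen here, not the repo's actual types.

```rust
/// Plain-data stand-in for the target instance. The field names echo the
/// `graph`, `usage`, and `storage` fields mentioned in this issue, but the
/// struct itself is a hypothetical helper for this sketch.
struct McfaDecisionInstance {
    num_vertices: usize,
    edges: Vec<(usize, usize)>,
    usage: Vec<i64>,
    storage: Vec<i64>,
    bound: i64,
}

/// Builds H, the uniform cost vectors, and K = n + 2m + 4 + k.
fn reduce(n: usize, source_edges: &[(usize, usize)], k: usize) -> McfaDecisionInstance {
    let m = source_edges.len();
    let num_vertices = n + 2 * m + 3;
    // Hub r = 0 with its two private leaves p = 1 and q = 2.
    let mut edges: Vec<(usize, usize)> = vec![(0, 1), (0, 2)];
    // One spoke from the hub to every original vertex x_v = 3 + v.
    for v in 0..n {
        edges.push((0, 3 + v));
    }
    // Two demand vertices per source edge, each adjacent to both endpoints.
    for (i, &(u, v)) in source_edges.iter().enumerate() {
        let a = 3 + n + 2 * i;
        let b = a + 1;
        edges.extend_from_slice(&[(a, 3 + u), (a, 3 + v), (b, 3 + u), (b, 3 + v)]);
    }
    McfaDecisionInstance {
        num_vertices,
        edges,
        usage: vec![1; num_vertices],
        storage: vec![2; num_vertices],
        bound: (n + 2 * m + 4 + k) as i64,
    }
}
```

For `C_6` with `k = 3`, this yields `21` vertices, `32` edges, and bound `25`, matching the table.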

## Validation Method

- Closed-loop test: start from a unit-weight `Decision<MinimumVertexCover<SimpleGraph, i32>>` instance, build the target `Decision<MultipleCopyFileAllocation>` instance, brute-force all copy placements, and verify:
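  - `VC(G) ≤ k` iff `MCFA(H) ≤ n + 2m + 4 + k`, where `VC(G)` is the minimum vertex cover size of `G` and `MCFA(H)` is the minimum allocation cost on `H`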
  - normalized target witnesses extract back to source covers of size at most `k`
- Corner case with isolated vertices: source graph on `{0,1,2,3}` with one edge `{0,1}` and isolated vertices `2,3`, bound `k = 1`.
  - Here `n = 4`, `m = 1`, so `K = 4 + 2 + 4 + 1 = 11`.
  - The target YES witness `{r, x_0}` has cost exactly `11`, so isolated vertices do not break the reduction.
- Empty-edge case: if `E = ∅`, then the source is always YES for `k ≥ 0`.
  - The target witness `{r}` has cost `n + 4 ≤ n + 4 + k = K` for every `k ≥ 0`, with equality when `k = 0`, so the reduction still matches.
- Worked cycle case: for `C_6` and `k = 3`, the target optimum is exactly `25` (not `6`, and not `114`), attained by the hub plus the three alternating cover vertices.
- Adversarial non-cover check: for the same `C_6`, any normalized choice of only two source vertices leaves at least two source edges uncovered, so its cost is at least `n + 2m + 4 + 2 + 2·2 = 28 > 25`.
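
A closed-loop check along these lines might look like the sketch below. It is a standalone brute force over bitmasks, not the repo's solver: the helper names (`gadget_adjacency`, `mcfa_cost`, `mcfa_optimum`, `min_vertex_cover`) and the vertex numbering (`0 = r`, `1 = p`, `2 = q`, `3 + v = x_v`, then the demand vertices) are assumptions for illustration.

```rust
use std::collections::VecDeque;

/// Adjacency lists of H under the assumed layout:
/// 0 = r, 1 = p, 2 = q, 3 + v = x_v, then a_e, b_e pairs in edge order.
fn gadget_adjacency(n: usize, source_edges: &[(usize, usize)]) -> Vec<Vec<usize>> {
    let mut adj: Vec<Vec<usize>> = vec![Vec::new(); n + 2 * source_edges.len() + 3];
    let mut link = |a: usize, b: usize| {
        adj[a].push(b);
        adj[b].push(a);
    };
    link(0, 1);
    link(0, 2);
    for v in 0..n {
        link(0, 3 + v);
    }
    for (i, &(u, v)) in source_edges.iter().enumerate() {
        for d in [3 + n + 2 * i, 3 + n + 2 * i + 1] {
            link(d, 3 + u);
            link(d, 3 + v);
        }
    }
    adj
}

/// MCFA cost of the placement encoded by `mask` (must be nonzero),
/// with storage 2 and usage 1 everywhere, via multi-source BFS.
fn mcfa_cost(adj: &[Vec<usize>], mask: u32) -> usize {
    let mut dist = vec![usize::MAX; adj.len()];
    let mut queue = VecDeque::new();
    for z in 0..adj.len() {
        if (mask >> z) & 1 == 1 {
            dist[z] = 0;
            queue.push_back(z);
        }
    }
    while let Some(z) = queue.pop_front() {
        for &w in &adj[z] {
            if dist[w] == usize::MAX {
                dist[w] = dist[z] + 1;
                queue.push_back(w);
            }
        }
    }
    2 * mask.count_ones() as usize + dist.iter().sum::<usize>()
}

/// Minimum cost over all nonempty placements (needs fewer than 32 vertices).
fn mcfa_optimum(adj: &[Vec<usize>]) -> usize {
    (1u32..1u32 << adj.len()).map(|mask| mcfa_cost(adj, mask)).min().unwrap()
}

/// Minimum vertex cover size of the source graph by exhaustive search.
fn min_vertex_cover(n: usize, edges: &[(usize, usize)]) -> usize {
    (0u32..1u32 << n)
        .filter(|&c| edges.iter().all(|&(u, v)| (c >> u) & 1 == 1 || (c >> v) & 1 == 1))
        .map(|c| c.count_ones() as usize)
        .min()
        .unwrap()
}
```

The test then asserts `mcfa_optimum(H) == n + 2m + 4 + min_vertex_cover(G)`. Because the search enumerates `2^|V(H)|` placements, it is only practical for very small sources such as the isolated-vertex corner case (9 target vertices); `C_6` already needs `2^21` placements.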

## Example

**Source instance (VERTEX COVER):**
Let `G = C_6` with vertices `{0,1,2,3,4,5}` and edges
- `{0,1}`, `{1,2}`, `{2,3}`, `{3,4}`, `{4,5}`, `{5,0}`

Take bound `k = 3`. This is a YES instance: for example,
- `C = {1,3,5}`
is a vertex cover of size `3`.

**Constructed target instance (`Decision<MultipleCopyFileAllocation>`):**
Build `H` with:
- hub `r`
- leaves `p, q`
- original vertices `x_0, x_1, x_2, x_3, x_4, x_5`
- for each cycle edge `e_i = {i, (i+1) mod 6}` with `i = 0, 1, 2, 3, 4, 5`, two demand vertices `a_i` and `b_i`

Counts:
- `num_vertices = 6 + 2·6 + 3 = 21`
- `num_edges = 6 + 4·6 + 2 = 32`

Uniform costs:
- `storage(z) = 2` for all 21 vertices
- `usage(z) = 1` for all 21 vertices

Decision bound:
- `K = n + 2m + 4 + k = 6 + 12 + 4 + 3 = 25`

**Solution mapping:**
Take the copy set
- `S = {r, x_1, x_3, x_5}`

Its cost is:
- storage: `4` selected vertices, so `4·2 = 8`
- access from unselected original vertices `x_0, x_2, x_4`: `3`
- access from leaves `p, q`: `2`
- access from the 12 demand vertices: every source edge is covered by `{1,3,5}`, so each demand vertex is at distance `1`, contributing `12`

Total:
- `8 + 3 + 2 + 12 = 25 = K`

**Verification:**
- Forward: the source cover `{1,3,5}` of size `3` maps to an MCFA placement of cost exactly `25`.
- Reverse: any normalized target solution has cost
  - `22 + |C| + 2|U(C)|`
  because `n + 2m + 4 = 22` here.
  So cost at most `25` implies
  - `|C| + 2|U(C)| ≤ 3`
  and the repair argument yields a vertex cover of size at most `3`.
- For instance, `C = {1,3}` is not a cover of `C_6`; it leaves two edges uncovered, so its normalized MCFA cost is
  - `22 + 2 + 2·2 = 28 > 25`
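
The arithmetic in this example can be reproduced with a small helper that evaluates the normal-form cost `n + 2m + 4 + |C| + 2|U(C)|` directly. The function name `normal_form_cost` is hypothetical, and the sketch assumes `cover` lists distinct source vertices.

```rust
/// Cost of the normalized placement S(C) = {r} ∪ {x_v : v ∈ C},
/// using the closed form n + 2m + 4 + |C| + 2|U(C)|.
/// Assumes `cover` contains no duplicate vertices.
fn normal_form_cost(n: usize, edges: &[(usize, usize)], cover: &[usize]) -> usize {
    // U(C): source edges with neither endpoint in C.
    let uncovered = edges
        .iter()
        .filter(|(u, v)| !cover.contains(u) && !cover.contains(v))
        .count();
    n + 2 * edges.len() + 4 + cover.len() + 2 * uncovered
}
```

On `C_6`, calling it with `[1, 3, 5]` gives `25` and with `[1, 3]` gives `28`, as computed above.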

## References

- **[VanSickleChandy1977]** Lawrence Van Sickle and K. Mani Chandy (1977). *Computational Complexity of Network Design Algorithms*. In *Information Processing 77: Proceedings of IFIP Congress 77*, pp. 235-239.
- **[GareyJohnson1979]** Michael R. Garey and David S. Johnson (1979). *Computers and Intractability: A Guide to the Theory of NP-Completeness*. W. H. Freeman. Appendix A4.1, entry [SR6].
