Skip to content

[Model] MinimumCardinalityKey #444

@isPANN

Description

@isPANN

Motivation

MINIMUM CARDINALITY KEY (P174) from Garey & Johnson, A4 SR26. A classical NP-complete problem from relational database theory. Given a set of attribute names and functional dependencies, the problem asks whether there exists a candidate key of cardinality at most M. This is fundamental to database normalization: identifying the smallest key determines the most efficient way to uniquely identify rows in a relation. The problem connects graph-theoretic vertex cover to database schema design.

Associated rules:

  • R120: Vertex Cover -> Minimum Cardinality Key (this model is the target)
  • R122: Minimum Cardinality Key -> Prime Attribute Name (this model is the source)

Definition

Name: MinimumCardinalityKey
Canonical name: Minimum Cardinality Key (also: Minimum Key, Least Cardinality Candidate Key)
Reference: Garey & Johnson, Computers and Intractability, A4 SR26, p.232

Mathematical definition:

INSTANCE: A set A of "attribute names," a collection F of ordered pairs of subsets of A (called "functional dependencies" on A), and a positive integer M.
QUESTION: Is there a key of cardinality M or less for the relational system <A,F>, i.e., a minimal subset K ⊆ A with |K| <= M such that the ordered pair (K,A) belongs to the "closure" F* of F defined by (1) F ⊆ F*, (2) B ⊆ C ⊆ A implies (C,B) in F*, (3) (B,C),(C,D) in F* implies (B,D) in F*, and (4) (B,C),(B,D) in F* implies (B,C ∪ D) in F*?

Variables

  • Count: num_attributes binary variables, one per attribute.
  • Per-variable domain: binary {0, 1} — whether the attribute is included in the candidate key K.
  • dims(): vec![2; num_attributes]
  • evaluate(): Let K = {i : config[i] = 1}. Compute the closure of K under F* (repeatedly apply functional dependencies until no new attributes are added). Return true iff closure(K) = A, |K| <= bound_k, and K is minimal (no proper subset of K also has closure = A).

Schema (data type)

Type name: MinimumCardinalityKey
Variants: none (no graph or weight parameters; the problem is purely set-theoretic)

Field Type Description
num_attributes usize Number of attributes
dependencies Vec<(Vec<usize>, Vec<usize>)> Functional dependencies F; each pair (lhs, rhs) means lhs -> rhs
bound_k usize The positive integer M: key must have cardinality <= bound_k

Size fields (getter methods for overhead expressions and declare_variants!):

  • num_attributes() — returns num_attributes
  • num_dependencies() — returns dependencies.len()
  • bound_k() — returns bound_k

Notes:

  • This is a satisfaction (decision) problem: Metric = bool, implementing SatisfactionProblem.
  • No weight type is needed.
  • The closure F* is computed using Armstrong's axioms: reflexivity (B ⊆ C implies C -> B), transitivity ((B,C) and (C,D) imply (B,D)), and augmentation/union ((B,C) and (B,D) imply (B, C ∪ D)).
  • A key K must satisfy: (1) K determines all of A (i.e., (K,A) in F*), and (2) K is minimal (no proper subset of K also determines A).
  • The number of candidate keys can be exponential in |A|.

Complexity

  • Best known exact algorithm: Brute-force enumeration of all subsets of A of size at most M, checking each for the key property via attribute closure computation. Closure computation is O(|F| * |A|) per subset (linear pass through dependencies). Total: O(binom(|A|, M) * |F| * |A|). For M = |A|/2, this is O(2^|A| * |F| * |A|). Lucchesi and Osborn (1978) give an algorithm that finds all candidate keys in time polynomial in |A|, |F|, and the number of keys |K|, but since |K| can be exponential, the worst case remains exponential.
  • NP-completeness: NP-complete [Lucchesi and Osborn, 1978], via transformation from VERTEX COVER.
  • References:
    • C. L. Lucchesi and S. L. Osborn (1978). "Candidate keys for relations." J. Computer and System Sciences, 17(2), pp. 270-279.
    • W. Lipsky, Jr. (1977). "Two NP-complete problems related to information retrieval." Fundamentals of Computation Theory, Springer.

Extra Remark

Full book text:

INSTANCE: A set A of "attribute names," a collection F of ordered pairs of subsets of A (called "functional dependencies" on A), and a positive integer M.
QUESTION: Is there a key of cardinality M or less for the relational system <A,F>, i.e., a minimal subset K ⊆ A with |K| <= M such that the ordered pair (K,A) belongs to the "closure" F* of F defined by (1) F ⊆ F*, (2) B ⊆ C ⊆ A implies (C,B) ∈ F*, (3) (B,C),(C,D) ∈ F* implies (B,D) ∈ F*, and (4) (B,C),(B,D) ∈ F* implies (B,C ∪ D) ∈ F*?
Reference: [Lucchesi and Osborn, 1978] (cited as 1977 preprint in GJ), [Lipsky, 1977a]. Transformation from VERTEX COVER. See [Date, 1975] for general background on relational data bases.

Connection to Vertex Cover: The Minimum Cardinality Key problem generalizes Vertex Cover. In the reduction, vertices become attributes, edges become functional dependencies (each endpoint determines the edge attribute), and a vertex cover corresponds to a set of attributes that determines all edge attributes. The key cardinality bound M corresponds directly to the vertex cover budget k.

How to solve

  • It can be solved by (existing) bruteforce -- enumerate all subsets of A of size <= M, compute attribute closure for each, check if closure equals A.
  • It can be solved by reducing to integer programming -- binary variable x_i per attribute; constraint that the closure of selected attributes covers all attributes; sum constraint sum(x_i) <= M. The closure constraint can be linearized using auxiliary variables.
  • Other: Lucchesi-Osborn algorithm enumerates all candidate keys in output-polynomial time. Can also reduce to Set Cover / Hitting Set and use known solvers.

Example Instance

Instance 1 (YES — has small key):
num_attributes = 6 (attributes: {0, 1, 2, 3, 4, 5})
Functional dependencies F:

  • {0, 1} -> {2}
  • {0, 2} -> {3}
  • {1, 3} -> {4}
  • {2, 4} -> {5}

bound_k = 2

Analysis of candidate key {0, 1}:

  • Start: {0, 1}
  • Apply {0, 1} -> {2}: closure = {0, 1, 2}
  • Apply {0, 2} -> {3}: closure = {0, 1, 2, 3}
  • Apply {1, 3} -> {4}: closure = {0, 1, 2, 3, 4}
  • Apply {2, 4} -> {5}: closure = {0, 1, 2, 3, 4, 5} = A
  • {0, 1} is a key of cardinality 2 <= bound_k = 2.
  • Check minimality: {0} alone: closure = {0} (no applicable FD). {1} alone: closure = {1}. Neither determines A.
  • So {0, 1} is a minimal key (candidate key) of size 2.

Answer: YES.

Instance 2 (NO — no small key):
num_attributes = 6 (attributes: {0, 1, 2, 3, 4, 5})
Functional dependencies F:

  • {0, 1, 2} -> {3}
  • {3, 4} -> {5}

bound_k = 2

No subset of size <= 2 can determine all of A:

  • {0, 1, 2} -> {3} requires all three attributes. Any 2-element subset misses at least one, so the FD cannot fire.
  • Even {0, 1}: closure = {0, 1} (cannot fire {0,1,2}->{3} without 2). Not a key.
  • No 2-element subset determines A.

Answer: NO.

Metadata

Metadata

Assignees

No one assigned

    Labels

    GoodAn issue passed all checks.modelA model problem to be implemented.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status
    Done

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions