Skip to content

[Model] HighlyConnectedDeletion #1022

@zhangjieqingithub

Description

@zhangjieqingithub

Motivation

Highly Connected Deletion is a graph-clustering optimization problem from the biological-network literature. It asks for deleting as few edges as possible so that the remaining graph decomposes into dense clusters.

This is not the same as:

  • ClusterDeletion, which requires every cluster to be a clique
  • generic cut / partition problems, which do not encode the "highly connected" cluster condition

A dedicated model is useful because the objective and feasibility semantics are specific, and the paper's notion explicitly allows isolated leftover vertices.

Associated rule:

Definition

Name: HighlyConnectedDeletion
Reference:

  • Falk H�cffner, Christian Komusiewicz, Adrian Liebtrau, Rolf Niedermeier, "Partitioning Biological Networks into Highly Connected Clusters with Maximum Edge Coverage," IEEE/ACM Transactions on Computational Biology and Bioinformatics 11(3):455-467, 2014. https://doi.org/10.1109/TCBB.2013.177
  • Erez Hartuv, Ron Shamir, "A clustering algorithm based on graph connectivity," Information Processing Letters 76(4-6):175-181, 2000. https://doi.org/10.1016/S0020-0190(00)00142-3

Given a simple undirected graph G = (V, E), find a minimum-cardinality edge set F subseteq E such that every connected component of G - F is either:

  • an isolated vertex, or
  • a highly connected graph on at least 3 vertices.

A graph H is highly connected if
lambda(H) > |V(H)| / 2,
equivalently,
delta(H) > |V(H)| / 2.

Paper-specific nuance adopted here:

  • isolated vertices may remain as unclustered leftovers
  • isolated edges / 2-vertex components are not valid clusters

Variables

  • Count: |E| binary variables, one per edge
  • Per-variable domain: {0,1}
  • Meaning: x_e = 1 iff edge e is deleted

A configuration is feasible iff, after deleting exactly those edges with x_e = 1, every connected component is either an isolated vertex or a highly connected graph of size at least 3.

Schema (data type)

Type name: HighlyConnectedDeletion
Variants: none initially; simple undirected graph only

Field Type Description
graph SimpleGraph The input graph G = (V, E)

Complexity

A conservative exact baseline is brute-force over all subsets of edges to delete.

This gives the worst-case complexity
O(2^num_edges * poly(num_vertices, num_edges))

so the registry complexity string should be
2^num_edges.

Literature-backed remarks:

  • the problem is NP-hard
  • the paper also gives parameterized algorithms and kernelization with respect to the number of deleted edges
  • since this optimization model has no deletion budget in the input, the clean registry complexity is the conservative brute-force bound above

References:

Extra Remark

This model is weaker than clique-based clustering:

  • every clique of size at least 3 is highly connected
  • not every highly connected graph is a clique

So HighlyConnectedDeletion should not be folded into a clique-deletion model.

Reduction Rule Crossref

How to solve

Example Instance

Let

  • V = {0,1,2,3}
  • E = {{0,1}, {0,2}, {1,2}, {2,3}}

This is a triangle on {0,1,2} with one leaf vertex 3 attached to 2.

Expected Outcome

An optimal solution is to delete the single edge {2,3}.

Then the remaining graph has:

  • one component G[{0,1,2}] = K3, which is highly connected
  • one isolated vertex {3}, which is allowed as an unclustered leftover

Thus the optimal value is 1, with deletion configuration
x_(2,3) = 1
and all other deletion variables equal to 0.

Zero deletions are infeasible because the whole graph is connected on 4 vertices and has minimum degree 1, which is not greater than 4/2 = 2.

BibTeX

@article{HueffnerKomusiewiczLiebtrauNiedermeier2014,
  author  = {Falk H{"u}ffner and Christian Komusiewicz and Adrian Liebtrau and Rolf Niedermeier},
  title   = {Partitioning Biological Networks into Highly Connected Clusters with Maximum Edge Coverage},
  journal = {IEEE/ACM Transactions on Computational Biology and Bioinformatics},
  volume  = {11},
  number  = {3},
  pages   = {455--467},
  year    = {2014},
  doi     = {10.1109/TCBB.2013.177},
  url     = {https://doi.org/10.1109/TCBB.2013.177}
}

@article{HartuvShamir2000,
  author  = {Erez Hartuv and Ron Shamir},
  title   = {A clustering algorithm based on graph connectivity},
  journal = {Information Processing Letters},
  volume  = {76},
  number  = {4--6},
  pages   = {175--181},
  year    = {2000},
  doi     = {10.1016/S0020-0190(00)00142-3},
  url     = {https://doi.org/10.1016/S0020-0190(00)00142-3}
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    GoodAn issue passed all checks.modelA model problem to be implemented.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status
    Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions