Kotlin Multiplatform (KMP) library for string unicode normalization (UAX #15)

Doist/doistx-normalize

doistx-normalize

Platforms: Android, JVM, JS, iOS, macOS, Windows, Linux.

Kotlin Multiplatform (KMP) library that adds support for normalization as described by Unicode Standard Annex #15 - Unicode Normalization Forms, by extending the String class with a normalize(Form) method.

All normalization forms are supported:

  • Form.NFC: Normalization Form C, canonical decomposition followed by canonical composition.
  • Form.NFD: Normalization Form D, canonical decomposition.
  • Form.NFKC: Normalization Form KC, compatibility decomposition followed by canonical composition.
  • Form.NFKD: Normalization Form KD, compatibility decomposition.
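To make the difference between the four forms concrete, the following is a minimal sketch that prints the code points each form produces. It runs on the JVM only and uses the JDK's java.text.Normalizer as a stand-in, not this library's API; the codePoints helper is a hypothetical name introduced for illustration.

```kotlin
import java.text.Normalizer

// Hypothetical helper: renders a string as code points, e.g. "U+0041 U+0308".
fun codePoints(s: String): String =
    s.codePoints().toArray().joinToString(" ") { "U+%04X".format(it) }

fun main() {
    // U+00C4 is the precomposed "Ä"; U+FB03 is the "ffi" ligature,
    // which only has a compatibility decomposition.
    val input = "\u00C4\uFB03n"
    for (form in Normalizer.Form.values()) {
        println("$form: ${codePoints(Normalizer.normalize(input, form))}")
    }
    // NFD:  U+0041 U+0308 U+FB03 U+006E
    // NFC:  U+00C4 U+FB03 U+006E
    // NFKD: U+0041 U+0308 U+0066 U+0066 U+0069 U+006E
    // NFKC: U+00C4 U+0066 U+0066 U+0069 U+006E
}
```

The canonical forms (NFC, NFD) leave the ligature intact, while the compatibility forms (NFKC, NFKD) replace it with the plain letters "ffi".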

Usage

"Äffin".normalize(Form.NFC) // => "Äffin"
"Äffin".normalize(Form.NFD) // => "A\u0308ffin"
"Äffin".normalize(Form.NFKC) // => "Äffin"
"Äffin".normalize(Form.NFKD) // => "A\u0308ffin"

"Henry \u2163".normalize(Form.NFC) // => "Henry \u2163"
"Henry \u2163".normalize(Form.NFD) // => "Henry \u2163"
"Henry \u2163".normalize(Form.NFKC) // => "Henry IV"
"Henry \u2163".normalize(Form.NFKD) // => "Henry IV"
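A common reason to normalize is to compare strings that look identical but are encoded differently. The sketch below assumes the library is on the classpath and that Form and normalize live in the doist.x.normalize package (an assumption; check the library's sources for the actual imports). The equalsNormalized helper is hypothetical and not part of the library.

```kotlin
// Assumed package name; verify against the library before copying.
import doist.x.normalize.Form
import doist.x.normalize.normalize

// Hypothetical helper: compares two strings after bringing both to the same form.
fun String.equalsNormalized(other: String, form: Form = Form.NFC): Boolean =
    this.normalize(form) == other.normalize(form)

fun main() {
    val precomposed = "\u00C4ffin" // "Ä" as a single code point
    val decomposed = "A\u0308ffin" // "A" followed by a combining diaeresis
    println(precomposed == decomposed)                // false: different code points
    println(precomposed.equalsNormalized(decomposed)) // true: equal once both are in NFC
}
```

Normalizing both sides to the same form matters more than which form is chosen; NFC is used as the default here only because it tends to keep strings short.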

Setup

repositories {
   mavenCentral()
}

kotlin {
   sourceSets {
      val commonMain by getting {
         dependencies {
            implementation("com.doist.x:normalize:1.2.0")
         }
      }
   }
}
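For a module that targets a single platform, the same coordinates can usually be declared directly. This is a sketch that assumes a build.gradle.kts with the Kotlin JVM (or Android) plugin already applied, and that Gradle resolves the matching platform variant from the published module metadata, as it generally does for Kotlin Multiplatform libraries.

```kotlin
// build.gradle.kts of a hypothetical single-platform module.
// Assumes the Kotlin JVM or Android plugin is applied elsewhere in this file.
repositories {
    mavenCentral()
}

dependencies {
    // Same coordinates as in the multiplatform setup above.
    implementation("com.doist.x:normalize:1.2.0")
}
```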

Development

Building this project can be tricky. Kotlin/Native supports cross-compilation (so .klib artifacts can be produced on any host), but keep in mind:

  • macOS is required to run tests on Apple platforms.
  • Linux targets must be built on Linux because they depend on libunistring.
  • JVM/Android and JS targets can be cross-compiled.

The defaults can be adjusted using two project properties:

  • targets is a string that selects which targets to build, test, or publish, depending on the task being run.
    • all (default): All targets that can be built on the current host.
    • native: Native targets only (e.g., on macOS, that's macOS, iOS, watchOS and tvOS).
    • common: Common targets only (e.g., JVM, JS, Wasm).
    • host: Host OS only.
  • publishRootTarget is a boolean that controls whether the kotlinMultiplatform root publication is included when publishing the enabled targets. The root publication can only be published once.
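As an illustration of how such properties are typically consumed, here is a hypothetical build.gradle.kts fragment. It is not the project's actual build logic: the variable names, the validation, and the false default for publishRootTarget are assumptions. Only the property names and the allowed values of targets come from the list above.

```kotlin
// build.gradle.kts (hypothetical sketch, not this project's real build script)
// Project properties can be passed with Gradle's standard -P flag, e.g.:
//   ./gradlew build -Ptargets=native -PpublishRootTarget=true

// Falls back to "all", the documented default, when the property is absent.
val targetsProperty: String = (findProperty("targets") as String?) ?: "all"

// The default of false is an assumption made for this sketch.
val publishRootTarget: Boolean =
    (findProperty("publishRootTarget") as String?)?.toBoolean() ?: false

// Fail early on values outside the documented set.
require(targetsProperty in setOf("all", "native", "common", "host")) {
    "Unknown value for 'targets': $targetsProperty"
}
```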

In CI/CD, tests run on Linux, macOS, and Windows, while publishing happens from Linux.

Release

To release a new version, ensure CHANGELOG.md is up to date, then push the corresponding tag (e.g., v1.2.3). GitHub Actions handles the rest.

License

Released under the MIT License.

Unicode's normalization test suite is subject to Unicode's own license terms.
