DLP username matcher masks ordinary English words in free-form log messages #6

@DarkiT

Summary

When DLP is enabled, ordinary English words in free-form log messages are masked as usernames, which makes operational logs hard to read.

For example:

  • cleanup sessions finished count=0 becomes cl***up se****ns fi****ed co*nt=0
  • mounted embedded static site becomes mo***ed em****ed st**ic site

Reproduction

package main

import (
    "fmt"
    dlp "github.com/darkit/slog/dlp"
)

func main() {
    e := dlp.NewDlpEngine()
    e.Enable()

    fmt.Println(e.DesensitizeText("cleanup sessions finished count=0"))
    fmt.Println(e.DesensitizeText("mounted embedded static site"))
}

Actual behavior

cl***up se****ns fi****ed co*nt=0
mo***ed em****ed st**ic site

Expected behavior

These ordinary words should not be masked by default in free-form log messages.

Root cause

In v0.2.0, the default username matcher is too broad:

UsernamePattern = `[a-zA-Z0-9_]{3,16}`

That pattern matches many normal English words such as:

  • cleanup
  • sessions
  • finished
  • count
  • mounted
  • embedded
  • static
  • site

Because DesensitizeText() applies auto-detection and replacement to the whole free-form message, these words are treated as usernames and masked.
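
The over-matching can be checked without the library. The sketch below uses only Go's standard regexp package and the pattern quoted above. The helper name matchedTokens is hypothetical, and the code only lists what the pattern matches; it does not reproduce the library's masking logic.

```go
package main

import (
	"fmt"
	"regexp"
)

// usernamePattern is the default pattern quoted in this issue.
// It is unanchored, so it matches any run of 3 to 16 word characters.
var usernamePattern = regexp.MustCompile(`[a-zA-Z0-9_]{3,16}`)

// matchedTokens (hypothetical helper) returns every substring of msg
// that the pattern matches.
func matchedTokens(msg string) []string {
	return usernamePattern.FindAllString(msg, -1)
}

func main() {
	for _, msg := range []string{
		"cleanup sessions finished count=0",
		"mounted embedded static site",
	} {
		fmt.Printf("%q -> %q\n", msg, matchedTokens(msg))
	}
}
```

Each message yields four matches: every word of three or more characters is a candidate username, and only the single-character value 0 is skipped.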

Impact

  • Breaks log readability
  • Reduces observability during troubleshooting
  • Makes DLP unsafe to enable globally for application logs

Suggested fixes

At least one of the following would help:

  1. Do not enable the username matcher by default for free-form message text
  2. Support disabling specific matchers (for example username) via config
  3. Support field-scoped DLP so only sensitive key/value fields are masked, instead of scanning the whole message body
  4. Tighten the default username matcher so it does not match arbitrary common words
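
To illustrate option 3, here is a minimal sketch of field-scoped masking, assuming log messages carry sensitive data as key=value tokens. Everything in it is hypothetical (sensitiveKeys, maskValue, maskFields) and none of it is part of the darkit/slog API. The masking rule is illustrative and is not the one the library uses.

```go
package main

import (
	"fmt"
	"strings"
)

// sensitiveKeys is a hypothetical list of keys whose values get masked.
var sensitiveKeys = map[string]bool{"username": true, "password": true}

// maskValue keeps the first and last rune and stars the rest.
// Illustrative rule only; values of one or two runes are fully starred.
func maskValue(v string) string {
	r := []rune(v)
	if len(r) <= 2 {
		return strings.Repeat("*", len(r))
	}
	return string(r[0]) + strings.Repeat("*", len(r)-2) + string(r[len(r)-1])
}

// maskFields masks only key=value tokens whose key is in sensitiveKeys.
// All other text is left untouched. Note that strings.Fields collapses
// runs of whitespace into single spaces.
func maskFields(msg string) string {
	tokens := strings.Fields(msg)
	for i, tok := range tokens {
		key, val, ok := strings.Cut(tok, "=")
		if ok && sensitiveKeys[strings.ToLower(key)] {
			tokens[i] = key + "=" + maskValue(val)
		}
	}
	return strings.Join(tokens, " ")
}

func main() {
	fmt.Println(maskFields("cleanup sessions finished count=0"))
	fmt.Println(maskFields("login ok username=alice_01 count=3"))
}
```

With this approach the first message passes through unchanged, and in the second only the username value is masked (username=a******1). The trade-off is that sensitive values appearing outside key=value fields are no longer caught.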

Workaround used downstream

We had to disable DLP entirely for runtime logs to avoid damaging normal operational output.
