feat: plugins: add new filter llm_tag plugin #123
Merged
Add new LLM-based log tagging filter plugin with OpenAI-compatible API support
Add build configuration for LLM tagging filter
Extend YAML config parser to support complex nested structures in filter plugins
Add support for complex array properties containing objects in filter plugins
CodeAnt AI is reviewing your PR.
CodeAnt AI finished reviewing your PR.
7 issues found across 9 files
Prompt for AI agents (all 7 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="source/plugins/filter_llm_tag/llm_tag.h">
<violation number="1" location="source/plugins/filter_llm_tag/llm_tag.h:20">
Rename the include guard macro to FLB_FILTER_LLM_TAG_H so it matches this header and cannot collide with a future llm_classify guard.</violation>
</file>
<file name="source/src/flb_config.c">
<violation number="1" location="source/src/flb_config.c:899">
Storing a cfl_variant pointer inside flb_kv.val (which is freed with flb_sds_destroy) will corrupt memory when the filter instance is destroyed. Use a container that does not assume string ownership for variant values instead of writing the pointer into kv->val.</violation>
</file>
<file name="source/plugins/filter_llm_tag/llm_tag.c">
<violation number="1" location="source/plugins/filter_llm_tag/llm_tag.c:246">
LLM request failures drop the original log record, causing data loss.</violation>
<violation number="2" location="source/plugins/filter_llm_tag/llm_tag.c:520">
Debug logging prints every filter property value, leaking secrets such as model_api_key into logs. Remove this property-dumping block or at least mask sensitive values before logging to avoid exposing credentials.</violation>
</file>
<file name="source/src/config_format/flb_cf_yaml.c">
<violation number="1" location="source/src/config_format/flb_cf_yaml.c:851">
`kvlist_deep_copy` reuses the original variant pointer for non-string/array/kvlist entries, causing both lists to own the same object and leading to use-after-free/double-free when one is destroyed. Clone the variant before inserting it into the copy.</violation>
</file>
<file name="source/src/flb_openai_client.c">
<violation number="1" location="source/src/flb_openai_client.c:238">
Internally created TLS contexts leak when upstream creation fails because the cleanup path never destroys `client->tls`.</violation>
<violation number="2" location="source/src/flb_openai_client.c:589">
Destroying the caller-owned TLS context causes double free/invalid reuse when the same TLS is shared outside this client.</violation>
</file>
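The two TLS findings above come down to one question: who owns the context? The following is a minimal sketch of one way to track that, assuming an `owns_tls` flag on the client. Every name here (`demo_tls`, `demo_client`, `owns_tls`, `fail_upstream`) is hypothetical and stands in for the real Fluent Bit types, which are not shown in this thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a TLS context (not the Fluent Bit type). */
struct demo_tls {
    int *destroy_count;              /* bumped once per destroy call */
};

static struct demo_tls *demo_tls_create(int *destroy_count)
{
    struct demo_tls *tls = malloc(sizeof(*tls));

    if (tls != NULL) {
        tls->destroy_count = destroy_count;
    }
    return tls;
}

static void demo_tls_destroy(struct demo_tls *tls)
{
    assert(tls != NULL);
    (*tls->destroy_count)++;
    free(tls);
}

struct demo_client {
    struct demo_tls *tls;
    int owns_tls;                    /* 1 only if the client created tls */
};

/* Single cleanup routine shared by the error path and normal teardown. */
static void demo_client_destroy(struct demo_client *client)
{
    if (client == NULL) {
        return;
    }
    if (client->tls != NULL && client->owns_tls) {
        demo_tls_destroy(client->tls);   /* never touch a borrowed context */
    }
    free(client);
}

/*
 * caller_tls == NULL: the client creates and owns a context.
 * fail_upstream simulates upstream creation failing after TLS setup.
 */
static struct demo_client *demo_client_create(struct demo_tls *caller_tls,
                                              int *destroy_count,
                                              int fail_upstream)
{
    struct demo_client *client = calloc(1, sizeof(*client));

    if (client == NULL) {
        return NULL;
    }
    if (caller_tls != NULL) {
        client->tls = caller_tls;        /* borrowed: owns_tls stays 0 */
    }
    else {
        client->tls = demo_tls_create(destroy_count);
        if (client->tls == NULL) {
            free(client);
            return NULL;
        }
        client->owns_tls = 1;
    }
    if (fail_upstream) {
        demo_client_destroy(client);     /* frees an owned context, no leak */
        return NULL;
    }
    return client;
}
```

Routing both the failure path and normal teardown through the same destroy function keeps the two findings from being fixed in opposite directions: an internally created context is always released, and a caller-supplied one is never released by the client.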
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Ask questions if you need clarification on any suggestion
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
5 issues found across 9 files
Prompt for AI agents (all 5 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="source/src/flb_config.c">
<violation number="1" location="source/src/flb_config.c:899">
Storing a `struct cfl_variant *` in `kv_prop->val` means the property cleanup path calls `flb_sds_destroy` on a non-SDS pointer, leading to invalid free/memory corruption when the filter instance is destroyed. Use a structure that owns the variant or avoid the generic kv list for non-string data.</violation>
</file>
<file name="source/src/flb_openai_client.c">
<violation number="1" location="source/src/flb_openai_client.c:307">
`json_get_key` assumes each key/value only consumes two tokens, so it fails to find keys that appear after nested values (e.g., `choices` after `usage`), causing valid OpenAI responses to be rejected.</violation>
<violation number="2" location="source/src/flb_openai_client.c:589">
Destroying `client->tls` here frees TLS contexts that were passed in by the caller, causing double frees/use-after-free for shared TLS objects.</violation>
</file>
<file name="source/plugins/filter_llm_tag/llm_tag.c">
<violation number="1" location="source/plugins/filter_llm_tag/llm_tag.c:397">
Records whose LLM classification fails are dropped because the code continues without re-emitting or keeping the original record, leading to data loss on transient API failures.</violation>
</file>
<file name="source/src/config_format/flb_cf_yaml.c">
<violation number="1" location="source/src/config_format/flb_cf_yaml.c:851">
Primitive variants are not really copied in kvlist_deep_copy: the default branch inserts the original variant pointer, so source and copy share ownership and freeing either list frees the same variant. Allocate a fresh variant per primitive instead of reusing the pointer.</violation>
</file>
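The `kvlist_deep_copy` finding above is about shared ownership of primitive variants. As a rough illustration only, the sketch below clones a variant into fresh storage so that source and copy can be destroyed independently. The `demo_variant` type and its helpers are hypothetical and far simpler than `struct cfl_variant`; they are not the CFL API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum demo_type { DEMO_INT, DEMO_BOOL, DEMO_STRING };

/* Hypothetical, simplified variant; not struct cfl_variant. */
struct demo_variant {
    enum demo_type type;
    union {
        long i;
        int b;
        char *s;                     /* heap-allocated when type is DEMO_STRING */
    } data;
};

static void demo_variant_destroy(struct demo_variant *v)
{
    if (v == NULL) {
        return;
    }
    if (v->type == DEMO_STRING) {
        free(v->data.s);
    }
    free(v);
}

/* Returns a new variant that shares no storage with src, or NULL on OOM. */
static struct demo_variant *demo_variant_clone(const struct demo_variant *src)
{
    struct demo_variant *dst;
    size_t len;

    assert(src != NULL);
    dst = malloc(sizeof(*dst));
    if (dst == NULL) {
        return NULL;
    }
    *dst = *src;                     /* copies the type and primitive payloads */
    if (src->type == DEMO_STRING) {
        len = strlen(src->data.s) + 1;
        dst->data.s = malloc(len);   /* strings need their own buffer */
        if (dst->data.s == NULL) {
            free(dst);
            return NULL;
        }
        memcpy(dst->data.s, src->data.s, len);
    }
    return dst;
}
```

In a deep copy, the default branch would insert the result of such a clone rather than the original pointer, so destroying either list leaves the other intact.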
- filter_llm_tag: preserve original records on LLM API failures
- filter_llm_tag: remove debug logging that exposed API keys
- flb_cf_yaml: properly clone primitive variants in kvlist_deep_copy
- flb_openai_client: fix TLS context leak on upstream creation failure
- flb_openai_client: fix JSON parsing for nested objects in responses
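The nested-object parsing fix listed above addresses a key lookup that advanced two tokens per key/value pair. One possible shape for a corrected lookup is sketched here, assuming a flat, pre-order token array in which each token's `size` counts its direct children (the layout used by jsmn-style tokenizers). The `demo_tok` type and both functions are hypothetical and are not the code from `flb_openai_client.c`.

```c
#include <assert.h>
#include <string.h>

enum demo_tok_type { DEMO_OBJECT, DEMO_ARRAY, DEMO_STRING, DEMO_PRIMITIVE };

/* Hypothetical token: assumed to mirror a jsmn-style flat token array. */
struct demo_tok {
    enum demo_tok_type type;
    int start;                       /* offset of the first byte */
    int end;                         /* offset one past the last byte */
    int size;                        /* number of direct children */
};

/* Returns the index just past token i and its entire subtree. */
static int demo_tok_skip(const struct demo_tok *toks, int count, int i)
{
    int remaining = 1;               /* tokens still to be consumed */

    while (remaining > 0 && i < count) {
        remaining += toks[i].size - 1;
        i++;
    }
    return i;
}

/* Returns the index of the value for key in object obj, or -1. */
static int demo_tok_find_key(const char *json, const struct demo_tok *toks,
                             int count, int obj, const char *key)
{
    size_t klen;
    size_t len;
    int i;
    int n;

    assert(json != NULL && toks != NULL && key != NULL);
    if (obj < 0 || obj >= count || toks[obj].type != DEMO_OBJECT) {
        return -1;
    }
    klen = strlen(key);
    i = obj + 1;
    for (n = 0; n < toks[obj].size && i < count; n++) {
        len = (size_t) (toks[i].end - toks[i].start);
        if (toks[i].type == DEMO_STRING && len == klen &&
            strncmp(json + toks[i].start, key, klen) == 0) {
            return (i + 1 < count) ? i + 1 : -1;
        }
        /* A key token has one child (its value); skip both as a unit. */
        i = demo_tok_skip(toks, count, i);
    }
    return -1;
}
```

Because the skip consumes a key together with its whole value subtree, a key such as `choices` is still found when it follows a nested `usage` object, and keys nested inside `usage` are not mistaken for top-level keys.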
CodeAnt AI is running Incremental review
CodeAnt AI Incremental review completed.
patrick-stephens
previously approved these changes
Nov 28, 2025
Contributor
@niedbalski looks like Windows builds are failing:
- Replace POSIX clock_gettime with flb_time_get for cross-platform timing
- Add flb_strcasestr helper for case-insensitive search (strcasestr is GNU extension)
- Add flb_strtok_r helper to use strtok_s on Windows, strtok_r elsewhere
- Include flb_compat.h for strncasecmp macro on Windows
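For readers unfamiliar with why a helper is needed: `strcasestr` is not part of standard C, so a portable build has to supply its own. The sketch below shows one plain ISO C way to write such a search. The name `demo_strcasestr` is hypothetical, and this is not the `flb_strcasestr` implementation from the PR, which is not shown in this thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Case-insensitive substring search using only ISO C.
 * Returns a pointer to the first match in haystack, or NULL.
 */
static char *demo_strcasestr(const char *haystack, const char *needle)
{
    size_t i;

    assert(haystack != NULL && needle != NULL);
    if (*needle == '\0') {
        return (char *) haystack;    /* empty needle matches at the start */
    }
    for (; *haystack != '\0'; haystack++) {
        for (i = 0; needle[i] != '\0'; i++) {
            if (haystack[i] == '\0') {
                return NULL;         /* rest of haystack is shorter than needle */
            }
            /* Cast to unsigned char: tolower is undefined for negative values. */
            if (tolower((unsigned char) haystack[i]) !=
                tolower((unsigned char) needle[i])) {
                break;
            }
        }
        if (needle[i] == '\0') {
            return (char *) haystack;
        }
    }
    return NULL;
}
```

A typical use would be matching a header value regardless of case, for example finding `application/json` inside `Content-Type: Application/JSON`.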
This was referenced May 8, 2026
User description
Add new LLM-based log tagging filter plugin with OpenAI-compatible API support.
Example config with OpenAI:
Example config with local Ollama:
Summary by cubic
Adds a new llm_tag filter to classify logs with LLMs and retag records, and updates config parsing to support arrays of objects in filter properties. Adds safeguards to preserve records on LLM failures.
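The record-preservation safeguard can be pictured as a fallback rule: if classification fails, the record keeps its original tag instead of being dropped. The sketch below illustrates that rule only. The callback type, the `demo_*` names, and the `"security"` tag are all hypothetical, and the real filter works on Fluent Bit's record and tag structures rather than plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classifier: returns 0 and sets *tag on success, -1 on failure. */
typedef int (*demo_classify_fn)(const char *record, const char **tag);

/* Simulates a transient API failure. */
static int demo_classify_fail(const char *record, const char **tag)
{
    (void) record;
    (void) tag;
    return -1;
}

/* Simulates a successful classification with a made-up tag. */
static int demo_classify_ok(const char *record, const char **tag)
{
    (void) record;
    *tag = "security";
    return 0;
}

/*
 * Returns the tag the record should be emitted under.
 * The record is never dropped: on any failure it keeps original_tag.
 */
static const char *demo_route_record(const char *record,
                                     const char *original_tag,
                                     demo_classify_fn classify)
{
    const char *new_tag = NULL;

    assert(record != NULL && original_tag != NULL && classify != NULL);
    if (classify(record, &new_tag) != 0 || new_tag == NULL) {
        return original_tag;         /* fallback: pass through unchanged */
    }
    return new_tag;
}
```

Treating a missing tag the same as a failed call keeps a malformed but technically successful response from silently discarding data.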
New Features
Bug Fixes
Risk: 3/5
Written for commit 3e71878. Summary will update automatically on new commits.
CodeAnt-AI Description
LLM-based log tagging filter classifies logs via OpenAI-compatible endpoints and re-emits them with configured tags
What Changed
Impact
✅ LLM-based tagging for security logs
✅ Original logs preserved when LLM API fails
✅ Filter configs accept nested rule arrays