"Lepszy rydz niż nic" — Polish proverb meaning "better something than nothing"
Build and deploy LLM-based classifiers in minutes, not days.
Rydz uses logprobs to extract classification probabilities from LLMs in a single API call — no fine-tuning, no training data, no ML pipeline. Just a prompt and a model.
The key advantage: Rydz does not just return a label. It tells you how uncertain the model is.
Docs: DeepWiki
Practical agent-oriented examples: USAGE_PATTERNS.md
A perfect classifier you don't have time to build is worth less than a good-enough one you can deploy right now. Rydz gives you the latter — and, crucially, it tells you when the answer is shaky.
That matters because one of the standard complaints about AI systems is that they sound equally confident when they are right and when they are wrong. Rydz is useful precisely because it exposes that uncertainty instead of hiding it.
Rydz is an ad hoc classifier creation tool: define labels, write a prompt, and get usable probabilities immediately.
It also fits standard ML workflows. In practice, "complex ML model vs no classifier" is rarely a useful comparison. A better baseline is traditional ML vs ad hoc LLM classifier, then compare quality lift against total cost (data collection, training, inference, maintenance) to make a real cost/benefit decision.
Fun fact: Contrary to popular belief, "rydz" in the Polish proverb doesn't refer to a mushroom. It's Camelina sativa — a humble oil plant that thrives where nothing else will grow. Seemed like a fitting name for a library that gets the job done when fancier solutions aren't an option.
Classify a single input:

    from rydz import get_logprobs_response, get_probability

    prompt = """Classify the sentiment of this review as POSITIVE or NEGATIVE.
    Review: "The battery lasts forever and the screen is gorgeous!"
    Sentiment:"""

    resp = get_logprobs_response("openai:gpt-4o-mini", prompt)
    print(f"positive: {get_probability(resp, 'POSITIVE'):.1%}")
    print(f"negative: {get_probability(resp, 'NEGATIVE'):.1%}")
    # positive: 99.2%
    # negative: 0.8%

To classify many inputs in parallel, wrap the call in a function and pass it to tmap:

    from rydz import get_logprobs_response, get_probability, tmap

    def classify(text):
        # Returns the probability of the YES label, i.e. that `text` is spam
        prompt = f"Is this spam? Answer YES or NO.\n\n{text}\n\nAnswer:"
        resp = get_logprobs_response("xai:grok-4-1-fast-non-reasoning", prompt)
        return get_probability(resp, "YES")

    # `texts` is any iterable of strings to classify
    results = list(tmap(classify, texts, workers=16))

- You craft a prompt that frames the classification task
- Rydz sends it to the LLM and requests logprobs for the first output token
- Probabilities of your target labels are extracted directly — no sampling, no repeated calls
- One API call → one classification with confidence scores
This means: high throughput, low cost, low latency. Thousands of input tokens, 1 output token.
It also means you can route low-confidence cases to humans, apply thresholds, or trigger fallback logic instead of blindly trusting every answer.
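As a rough illustration of the thresholding idea above, the sketch below maps a probability to one of three actions. The function name `route` and the 0.2 / 0.8 thresholds are assumptions chosen for the example, not part of Rydz; in practice `p_yes` would come from `get_probability`.

```python
def route(p_yes: float, low: float = 0.2, high: float = 0.8) -> str:
    """Map a label probability to an action (illustrative thresholds)."""
    if not 0.0 <= p_yes <= 1.0:
        raise ValueError("p_yes must be a probability in [0, 1]")
    if p_yes >= high:
        return "accept"        # confident YES
    if p_yes <= low:
        return "reject"        # confident NO
    return "human_review"      # uncertain: escalate instead of guessing


if __name__ == "__main__":
    for p in (0.97, 0.55, 0.03):
        print(p, route(p))
```

Sensible thresholds depend on the task and on the relative cost of false positives and false negatives, so they are best tuned on a small labeled sample.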
| approach | calls per classification | confidence score | training data | setup time |
|---|---|---|---|---|
| Rydz (LLM + logprobs) | 1 | yes, native | none | minutes |
| LLM + text parsing OR structured output | 1 | no | none | minutes |
| LLM + repeated sampling | 5–20 | approximate | none | minutes |
| traditional ML | 1 | yes | thousands+ | days–weeks |
Logprobs give you a native confidence score in a single call: no repeated sampling, no parsing "yes"/"no" from free text, and no training data collection. You get a probability distribution over your labels directly from the model's internals.
This is the opposite of the usual "black box AI" problem: instead of a bare answer, you get an explicit signal of uncertainty.
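To make the mechanism concrete: a log probability converts to a probability with `exp`, and the mass on the target labels can optionally be renormalized so that the labels sum to 1. The sketch below works on a plain dict of first-token logprobs. The function name and the renormalization step are assumptions for illustration and may differ from what Rydz does internally.

```python
import math


def label_probabilities(top_logprobs: dict[str, float], labels: list[str]) -> dict[str, float]:
    """Convert first-token logprobs into a distribution over `labels`.

    Labels missing from `top_logprobs` get probability 0 before renormalizing.
    """
    raw = {lab: math.exp(top_logprobs[lab]) if lab in top_logprobs else 0.0
           for lab in labels}
    total = sum(raw.values())
    if total == 0.0:
        raise ValueError("none of the labels appear in top_logprobs")
    return {lab: p / total for lab, p in raw.items()}


if __name__ == "__main__":
    # Hypothetical first-token logprobs, shaped like a provider's top_logprobs
    example = {"YES": math.log(0.90), "NO": math.log(0.06), "Maybe": math.log(0.04)}
    print(label_probabilities(example, ["YES", "NO"]))
```

Renormalizing discards the mass that fell on off-label tokens. If that mass is large, it is itself a sign that the prompt is not steering the model toward the expected labels.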
Real-world benchmark: 280 book fragments × 8 classification criteria (~1.5M input tokens, 2200+ data points) — processed in 36 seconds for $0.25 using two cloud providers in parallel, or 4 minutes using a local model (Bielik) on a single RTX 3090.
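A quick back-of-the-envelope check of the cloud figures, using only the numbers quoted above (280 fragments, 8 criteria, roughly 1.5M input tokens, 36 seconds, $0.25). The derived per-item values are approximations, since the token count is itself rounded.

```python
fragments = 280
criteria = 8
input_tokens = 1_500_000   # approximate, as quoted
seconds = 36
cost_usd = 0.25

classifications = fragments * criteria              # 2240 data points
per_second = classifications / seconds              # about 62
usd_per_1k = cost_usd / classifications * 1000      # about $0.11 per 1,000
tokens_each = input_tokens / classifications        # about 670

print(f"{classifications} classifications")
print(f"{per_second:.0f} classifications per second")
print(f"${usd_per_1k:.2f} per 1,000 classifications")
print(f"~{tokens_each:.0f} input tokens per classification")
```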
Rydz works best with instruction-tuned (chat) models — they follow the prompt and put the answer label as the first token, which is exactly what logprobs extraction needs.
Reasoning models (e.g. GLM-5, Kimi K2.5, DeepSeek V3.2) are supported too, but they emit chain-of-thought tokens before the answer, so they require a higher max_tokens value, cost more, and take longer.
To use a reasoning model, pass reasoning=True to get_logprobs_response or append the :reasoning suffix to the model name:

    get_logprobs_response("together:moonshotai/Kimi-K2.5", prompt, reasoning=True)
    get_logprobs_response("together:moonshotai/Kimi-K2.5:reasoning", prompt)

A single model with a hand-crafted prompt is just the starting point. Rydz's low cost and high throughput make it practical to build more advanced systems:
- Ensemble / majority voting — score the same input with multiple models and aggregate results for higher accuracy and resilience
- Cache-friendly repeated queries — keep one long document fixed, vary short entity-relation questions at the end, and exploit prompt cache / KV cache for much cheaper repeated scoring
- Automatic prompt optimization — integrate with frameworks like DSPy to optimize prompts systematically instead of relying on intuition alone
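A minimal sketch of the ensemble idea from the list above, assuming each model has already produced a probability for the same label (for example via `get_probability`). The helper names and the 0.5 decision threshold are illustrative choices, not Rydz API.

```python
from statistics import mean


def average_probability(probs: list[float]) -> float:
    """Soft voting: average the per-model probabilities."""
    if not probs:
        raise ValueError("need at least one probability")
    return mean(probs)


def majority_vote(probs: list[float], threshold: float = 0.5) -> bool:
    """Hard voting: True if more than half of the models exceed `threshold`."""
    if not probs:
        raise ValueError("need at least one probability")
    votes = sum(p > threshold for p in probs)
    return votes * 2 > len(probs)


if __name__ == "__main__":
    # Hypothetical P(YES) from three different models for one input
    p_yes = [0.91, 0.62, 0.35]
    print(average_probability(p_yes))
    print(majority_vote(p_yes))
```

Soft voting keeps the uncertainty information in the result, while hard voting is less sensitive to a single overconfident model. Which one works better is an empirical question for each task.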
Rydz is also a practical baseline for production evaluation: compare your traditional ML pipeline against a strong ad hoc LLM classifier, then decide if the extra quality justifies the extra cost and complexity.
You don't have to pick one path upfront — start simple, then scale only where the cost/benefit is clear.
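The cache-friendly pattern mentioned above comes down to prompt layout: keep the long, unchanging document at the start and put the short, varying question last, so that every prompt shares the same prefix. The sketch below only builds the prompt strings. `build_prompts` and the prompt wording are hypothetical, and whether a shared prefix is actually cached depends on the provider or local runtime.

```python
import os


def build_prompts(document: str, questions: list[str]) -> list[str]:
    """Build one prompt per question, all sharing the same document prefix."""
    prefix = f"Document:\n{document}\n\n"
    return [f"{prefix}Question: {q} Answer YES or NO.\nAnswer:" for q in questions]


if __name__ == "__main__":
    doc = "Alice founded Acme in 2010. Bob joined as CTO in 2012."
    prompts = build_prompts(doc, [
        "Did Alice found Acme?",
        "Is Bob the CEO of Acme?",
    ])
    # commonprefix compares the strings character by character
    shared = os.path.commonprefix(prompts)
    print(len(shared), "characters shared by every prompt")
```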
- One thing, done well — logprobs-based classification
- Uncertainty is visible — get probability distributions, not just labels
- Multiple providers — OpenAI, xAI, Together, Fireworks, Hyperbolic, Cerebras, OpenRouter, LM Studio
- Provider quirks handled — different APIs, limits, endpoints — all behind one interface
- Parallel processing — classify thousands of items across providers in seconds
- Minimal dependencies — just openai
- Reasoning models — extract logprobs from the first token after chain-of-thought (v0.2)
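One way to picture spreading work across providers is a round-robin assignment of model strings to inputs, executed on a thread pool. This is a sketch with a stub in place of the real API call; `classify_stub` and `classify_across` are hypothetical names, and Rydz's own `tmap` may behave differently.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


def classify_stub(model: str, text: str) -> float:
    """Stand-in for a real call such as get_logprobs_response + get_probability."""
    return 0.9 if "free money" in text.lower() else 0.1


def classify_across(models: list[str], texts: list[str], workers: int = 8) -> list[tuple[str, float]]:
    """Assign models to texts round-robin and score them concurrently.

    Results come back in input order as (model, probability) pairs.
    """
    assigned = list(zip(cycle(models), texts))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of its input
        scores = list(pool.map(lambda pair: classify_stub(*pair), assigned))
    return [(model, score) for (model, _), score in zip(assigned, scores)]


if __name__ == "__main__":
    models = ["openai:gpt-4.1-nano", "xai:grok-4-1-fast-non-reasoning"]
    texts = ["FREE MONEY, click now", "Lunch at noon?", "Meeting notes attached"]
    for model, score in classify_across(models, texts):
        print(model, score)
```

Threads suit this workload because each call spends most of its time waiting on the network rather than computing.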
Install from GitHub:

    pip install git+https://github.com/mobarski/rydz

or

    uv pip install git+https://github.com/mobarski/rydz

Rydz will be added to PyPI in the future.
Models use the provider:model_name convention (with an optional :reasoning suffix for reasoning models):

    "openai:gpt-4.1-nano"
    "xai:grok-4-1-fast-non-reasoning"
    "together:moonshotai/Kimi-K2-Instruct-0905"
    "hyperbolic:Qwen/Qwen3-Next-80B-A3B-Instruct"
    "lmstudio:bielik-11b-v3.0-instruct"
    "together:moonshotai/Kimi-K2.5:reasoning"

| provider | env variable | notes |
|---|---|---|
| lmstudio | LMSTUDIO_API_KEY | |

| provider | env variable | notes |
|---|---|---|
| openai | OPENAI_API_KEY | |
| xai | XAI_API_KEY | |
| together | TOGETHER_API_KEY | |
| hyperbolic | HYPERBOLIC_API_KEY | |
| fireworks | FIREWORKS_API_KEY | |
| cerebras | CEREBRAS_API_KEY | |
| openrouter | OPENROUTER_API_KEY | most inference providers = no logprobs |
| | GOOGLE_API_KEY | no logprobs |
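Judging from the table and the provider:model_name strings above, the environment variable name appears to be the provider prefix in upper case followed by `_API_KEY`. The sketch below captures that reading. `split_model` and `env_var_for` are hypothetical helpers written for illustration, not functions exported by Rydz.

```python
def split_model(model: str) -> tuple[str, str, bool]:
    """Split 'provider:model_name[:reasoning]' into (provider, name, reasoning)."""
    provider, sep, rest = model.partition(":")
    reasoning = rest.endswith(":reasoning")
    name = rest[: -len(":reasoning")] if reasoning else rest
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider:model_name', got {model!r}")
    return provider, name, reasoning


def env_var_for(model: str) -> str:
    """Guess the API-key variable name, e.g. 'openai:...' -> 'OPENAI_API_KEY'."""
    provider, _, _ = split_model(model)
    return f"{provider.upper()}_API_KEY"


if __name__ == "__main__":
    print(split_model("together:moonshotai/Kimi-K2.5:reasoning"))
    print(env_var_for("openai:gpt-4.1-nano"))
```

Splitting on the first colon only keeps model names that contain slashes or further colons intact.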
Register a custom provider with register_provider:

    from rydz import register_provider

    register_provider("myprovider", "https://api.example.com/v1", quirks={"max_tokens": 4})
    # uses MYPROVIDER_API_KEY env variable, model string: "myprovider:model-name"

You can also create aliases for existing providers, which is useful for multiple API keys (higher rate limits, separate billing):

    from rydz import register_alias

    register_alias("openai2", "openai")
    # uses OPENAI2_API_KEY env variable, model string: "openai2:gpt-4.1-nano"

    register_alias("openai3", "openai", quirks={"top_logprobs": 10})
    # same as above but with custom quirks

By default, Rydz reads API keys from environment variables. In corporate environments you may need to fetch keys from a secret manager (AWS Secrets Manager, Azure Key Vault, HashiCorp Vault, etc.). Use the get_api_key quirk:

    from rydz import set_quirk

    # `vault` stands for your own secret-manager client
    set_quirk("openai", "get_api_key", lambda model: vault.get_secret("openai-api-key"))

The function receives the full model string (e.g. "openai:gpt-4.1-nano"), so you can route different models to different secrets. It works with custom providers and aliases too:

    set_quirk("myprovider", "get_api_key", my_secret_manager.get_key)

E-commerce & Marketing — sentiment analysis in product reviews, matching ad profiles to article content
Customer Support — automatic ticket categorization and priority routing, intent detection in customer messages
Media & Publishing — content moderation, scene and emotion tagging in narratives, content complexity matching to target audience
Legal & Compliance — contract risk clause detection, document confidentiality classification
Software & AI — safety analysis of AI-generated code, detecting policy violations in LLM outputs
These are just examples. For best results, point your favorite AI assistant to this repo and ask how Rydz can help in your business.
- Multimodal input — classify images alongside text using vision-capable models
- More local inference providers — vllm, sglang, llama-server, ollama
- More online inference providers — huggingface, featherless, nscale, zai
- Advanced usage examples — evaluation, majority voting, automatic prompt optimization
MIT
- 0.2.1 - make reasoning the canonical name for thinking/reasoning in the API and in the docs
- 0.2 - thinking/reasoning model support (logprobs extraction after chain-of-thought)