4 changes: 3 additions & 1 deletion .travis.yml
@@ -1,7 +1,9 @@
language: ruby
rvm:
- 2.7
- 3.0
- 3.1
- 3.2
- 3.3
- ruby-head
branches:
except:
70 changes: 65 additions & 5 deletions README.md
@@ -318,12 +318,9 @@ Examples:

If you need to validate if a specific text is fulfilling the pattern you can use the validate method.

If a string pattern supplied and no other parameters supplied the output will be an array with the errors detected.
When you supply a single pattern and do **not** supply `expected_errors` or `not_expected_errors`, the method returns an **array of error symbols**: an empty array `[]` when the text is valid, or one or more of `:min_length`, `:max_length`, `:length`, `:value`, `:string_set_not_allowed`, `:required_data`, `:excluded_data` when invalid.


Possible output values, empty array (validation without errors detected) or one or more of: :min_length, :max_length, :length, :value, :string_set_not_allowed, :required_data, :excluded_data

In case an array of patterns supplied it will return only true or false
When an array of patterns is supplied, the method returns only `true` or `false`.

Examples:

@@ -443,6 +440,69 @@ StringPattern.block_list_enabled = true
"2-20:Tn".gen #>AAñ34Ef99éNOP
```

#### StringPattern.analyze

To inspect how a pattern is parsed without generating or validating:

```ruby
p = StringPattern.analyze("10-20:LN/x/")
# => #<Struct min_length=10, max_length=20, symbol_type="LN/x/", required_data=..., string_set=..., unique=false>
p.min_length # => 10
p.max_length # => 20
p.symbol_type # => "LN/x/"
```

Useful for debugging or building tools on top of the pattern DSL. For an invalid pattern, `analyze` returns the pattern string unchanged; pass `silent: true` to suppress the log message.
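
As a rough illustration of the length/type split that `analyze` performs, here is a minimal standalone sketch. `ParsedPattern` and `parse_length_and_type` are hypothetical names, and the `min-max:type` regex is an assumption modeled on the single-length regex in `analyze.rb`; the real method also fills in fields such as `required_data` and `string_set`.

```ruby
# Hypothetical helper, not part of the string_pattern gem.
ParsedPattern = Struct.new(:min_length, :max_length, :symbol_type)

def parse_length_and_type(pattern)
  text = pattern.to_s
  # "min-max:type" form. This regex is an assumption, not copied from the gem.
  min, max, type = text.scan(/^!?(\d+)-(\d+):(.+)/)[0]
  if min.nil?
    # "length:type" form, using the regex visible in analyze.rb.
    min, type = text.scan(/^!?(\d+):(.+)/)[0]
    max = min
  end
  # Like analyze, hand the pattern string back when nothing matches.
  return text if min.nil?

  ParsedPattern.new(min.to_i, max.to_i, type)
end

parsed = parse_length_and_type("10-20:LN/x/")
puts parsed.min_length  # 10
puts parsed.symbol_type # LN/x/
```

Returning the raw string for invalid input keeps the sketch aligned with the documented behavior, so callers can tell the two outcomes apart with `is_a?(String)`.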

#### Error handling and logging

By default, when generation is impossible (e.g. invalid pattern or `dont_repeat` exhausted), `generate` returns an empty string `""` and a message is printed. You can:

- Set `StringPattern.logger = Logger.new($stderr)` to send messages to a logger instead of `puts`.
- Set `StringPattern.raise_on_error = true` to raise `StringPattern::GenerationImpossibleError` or `StringPattern::InvalidPatternError` instead of returning `""`.
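
The two switches above could interact roughly as in this standalone sketch. `ErrorReporting` and `fail_generation` are hypothetical names used only for illustration; the gem's own `log_message` is not shown in this diff, so its exact behavior (including the log level) is an assumption here.

```ruby
require "logger"

# Hypothetical stand-in for the gem's error reporting; names are illustrative.
module ErrorReporting
  class GenerationImpossibleError < StandardError; end

  class << self
    attr_accessor :logger, :raise_on_error

    # Send the message to the logger when one is set, otherwise print it.
    def log_message(msg)
      logger ? logger.warn(msg) : puts(msg)
    end

    # Raise when raise_on_error is set; otherwise log and return "".
    def fail_generation(msg)
      raise GenerationImpossibleError, msg if raise_on_error

      log_message(msg)
      ""
    end
  end
end
```

With `raise_on_error` left unset, `ErrorReporting.fail_generation("...")` logs and returns `""`; once it is set to `true`, the same call raises instead, which tends to be easier to notice in a test suite.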

#### Reproducible generation (seed)

Pass `seed:` to get the same string for the same pattern in tests:

```ruby
"10:N".gen(seed: 42) # => same result every time
```
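
The seeding approach visible in `generate.rb` (call `srand` with the given seed, then restore the previous seed in an `ensure`) can be sketched on its own. `with_seed` and `random_digits` are hypothetical helpers, not gem methods.

```ruby
# Hypothetical helpers illustrating seeded generation; not the gem's code.
def with_seed(seed)
  previous = srand(seed) # Kernel#srand returns the previously used seed
  yield
ensure
  # Re-seed with the earlier value so later rand calls do not follow `seed`.
  srand(previous) if previous
end

def random_digits(length)
  Array.new(length) { rand(10) }.join
end

first = with_seed(42) { random_digits(10) }
second = with_seed(42) { random_digits(10) }
puts first == second # true
```

Note that `srand(previous)` re-seeds the global generator rather than resuming its earlier position, so code that runs afterwards restarts the sequence belonging to the old seed.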

#### Batch generation (sample)

Generate up to `n` distinct strings without mutating the global `dont_repeat` cache:

```ruby
StringPattern.sample("4:N", 10) # => array of 10 distinct 4-digit strings
```
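
One way to collect distinct values without touching shared state is to keep the "already seen" set local to the call. The sketch below assumes that design; `sample_distinct` and its `max_tries` cap are hypothetical and not taken from the gem.

```ruby
require "set"

# Hypothetical sketch of batch sampling; not the gem's implementation.
def sample_distinct(count, max_tries: 10_000)
  seen = Set.new # local to this call, so no global cache is modified
  tries = 0
  while seen.size < count && tries < max_tries
    seen << yield
    tries += 1
  end
  seen.to_a # may hold fewer than count items if the generator runs dry
end

codes = sample_distinct(10) { format("%04d", rand(10_000)) }
puts codes.size
```

The try cap is what makes this "up to `n`": a generator with too few possible outputs returns a shorter array instead of looping forever.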

#### Boolean validation (valid?)

Check if text matches a pattern without building the full error list:

```ruby
StringPattern.valid?(text: "user@domain.com", pattern: "14-40:@") # => true
```

#### UUID

Generate a random UUID v4 or validate one:

```ruby
StringPattern.uuid # => "550e8400-e29b-41d4-a716-446655440000"
StringPattern.valid_uuid?(some_str) # => true or false
```
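
For readers curious what such a check involves, here is a minimal sketch built on the standard library. `uuid_v4?` and `UUID_V4` are hypothetical names, and the regex encodes an assumption about what `valid_uuid?` accepts (version nibble `4`, variant nibble `8`, `9`, `a` or `b`).

```ruby
require "securerandom"

# Assumed shape of a v4 UUID: version nibble 4, variant nibble 8, 9, a or b.
UUID_V4 = /\A\h{8}-\h{4}-4\h{3}-[89ab]\h{3}-\h{12}\z/i

# Hypothetical helper; returns false for nil and other non-string input.
def uuid_v4?(value)
  value.is_a?(String) && UUID_V4.match?(value)
end

puts uuid_v4?(SecureRandom.uuid) # true, SecureRandom.uuid produces v4 UUIDs
puts uuid_v4?("not-a-uuid")      # false
```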

#### block_list as Proc

You can set `block_list` to a Proc for custom blocking:

```ruby
StringPattern.block_list = ->(s) { s.include?("forbidden") }
StringPattern.block_list_enabled = true
```
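
The dispatch between a callable and an Array, as it appears in the `generate.rb` hunk of this change, can be exercised in isolation with a small sketch. `blocked?` is a hypothetical helper and does not exist in the gem.

```ruby
# Hypothetical helper mirroring the block_list dispatch; not the gem's code.
def blocked?(string, block_list)
  if block_list.respond_to?(:call)
    # Proc or lambda: the caller decides what counts as blocked.
    block_list.call(string) ? true : false
  elsif block_list.is_a?(Array)
    # Array: each entry is treated as a case-insensitive pattern.
    block_list.any? { |entry| string.match?(/#{entry}/i) }
  else
    false
  end
end

puts blocked?("has forbidden word", ->(s) { s.include?("forbidden") }) # true
puts blocked?("FOObar", ["foo"])                                       # true
puts blocked?("clean", ["foo"])                                        # false
```

Checking `respond_to?(:call)` first means any callable object works, not only a `Proc`.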


## Contributing

11 changes: 5 additions & 6 deletions lib/string/pattern/add_to_ruby.rb
@@ -106,7 +106,7 @@ def to_sp
elsif token == :literal and text.size == 2
text = text[1]
else
puts "Report token not controlled: type: #{type}, token: #{token}, text: '#{text}' [#{ts}..#{te}]"
StringPattern.log_message("Report token not controlled: type: #{type}, token: #{token}, text: '#{text}' [#{ts}..#{te}]")
end
end

@@ -165,7 +165,7 @@ def to_sp
set_negate = false
else
pats += "]"
end
end

end
elsif type == :group
@@ -190,7 +190,6 @@ def to_sp
patg << pats
pats = ""
elsif patg.empty?
# for the case the first element was not added to patg and was on pata fex: (a+|b|c)
patg << pata.pop
end
end
@@ -299,11 +298,11 @@ def to_sp
end
if pats != ""
if pata.empty?
if pats[0] == "[" and pats[-1] == "]" #fex: /[12ab]/
if pats[0] == "[" and pats[-1] == "]"
pata = ["1:#{pats}"]
end
else
pata[-1] += pats[1] #fex: /allo/
pata[-1] += pats[1]
end
end
if pata.size == 1 and pata[0].kind_of?(String)
@@ -325,7 +324,7 @@ def generate(pattern, expected_errors: [], **synonyms)
if pattern.is_a?(String) || pattern.is_a?(Array) || pattern.is_a?(Symbol) || pattern.is_a?(Regexp)
StringPattern.generate(pattern, expected_errors: expected_errors, **synonyms)
else
puts " Kernel generate method: class not recognized:#{pattern.class}"
StringPattern.log_message(" Kernel generate method: class not recognized:#{pattern.class}")
end
end

10 changes: 5 additions & 5 deletions lib/string/pattern/analyze.rb
@@ -1,8 +1,8 @@
class StringPattern
###############################################
# Analyze the pattern supplied and returns an object of Pattern structure including:
# min_length, max_length, symbol_type, required_data, excluded_data, data_provided, string_set, all_characters_set
###############################################
# Analyzes a pattern string and returns a Pattern struct.
# @param pattern [String, Symbol] Pattern in format "length:type" or "min-max:type" (e.g. "10:N", "5-15:L")
# @param silent [Boolean] If true, invalid patterns do not log a message.
# @return [Struct, String] Pattern struct with min_length, max_length, symbol_type, required_data, excluded_data, data_provided, string_set, all_characters_set, unique; or the pattern string if invalid.
def StringPattern.analyze(pattern, silent: false)
#unless @cache[pattern.to_s].nil?
# return Pattern.new(@cache[pattern.to_s].min_length.clone, @cache[pattern.to_s].max_length.clone,
@@ -16,7 +16,7 @@ def StringPattern.analyze(pattern, silent: false)
min_length, symbol_type = pattern.to_s.scan(/^!?(\d+):(.+)/)[0]
max_length = min_length
if min_length.nil?
puts "pattern argument not valid on StringPattern.generate: #{pattern.inspect}" unless silent
StringPattern.log_message("pattern argument not valid on StringPattern.generate: #{pattern.inspect}") unless silent
return pattern.to_s
end
end
21 changes: 21 additions & 0 deletions lib/string/pattern/email.rb
@@ -0,0 +1,21 @@
# frozen_string_literal: true

class StringPattern
# Validates email format using the same rules as pattern type @:
# - Forbids consecutive/adjacent invalid sequences (.. __ -- etc.)
# - Local part: [a-z0-9]+([\+\._\-][a-z0-9])*
# - Domain part: [0-9a-z]+([\.-][a-z0-9])*
def self.valid_email?(string)
return false if string.nil? || !string.is_a?(String)
return false if string.index("@").to_i <= 0

wrong = %w(.. __ -- ._ _. .- -. _- -_ @. @_ @- .@ _@ -@ @@)
return false if Regexp.union(*wrong) === string

local = string[0..(string.index("@") - 1)]
domain = string[(string.index("@") + 1)..-1]
local_ok = local.scan(/([a-z0-9]+([\+\._\-][a-z0-9]|)*)/i).join == local
domain_ok = domain.scan(/([0-9a-z]+([\.-][a-z0-9]|)*)/i).join == domain
local_ok && domain_ok
end
end
51 changes: 24 additions & 27 deletions lib/string/pattern/generate.rb
@@ -64,6 +64,8 @@ class StringPattern
# the generated string
###############################################
def StringPattern.generate(pattern, expected_errors: [], **synonyms)
seed_given = synonyms.key?(:seed)
saved_rng = seed_given ? srand(synonyms[:seed]) : nil
tries = 0
begin
good_result = true
@@ -95,7 +97,7 @@ def StringPattern.generate(pattern, expected_errors: [], **synonyms)
string << pat
end
else
puts "StringPattern.generate: it seems you supplied wrong array of patterns: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
StringPattern.log_message("StringPattern.generate: it seems you supplied wrong array of patterns: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}")
return ""
end
}
@@ -120,7 +122,7 @@ def StringPattern.generate(pattern, expected_errors: [], **synonyms)
}
unless excluded_data.size == 0
if (required_chars.flatten & excluded_data.flatten).size > 0
puts "pattern argument not valid on StringPattern.generate, a character cannot be required and excluded at the same time: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
StringPattern.log_message("pattern argument not valid on StringPattern.generate, a character cannot be required and excluded at the same time: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}")
return ""
end
end
@@ -130,7 +132,7 @@ def StringPattern.generate(pattern, expected_errors: [], **synonyms)
elsif pattern.kind_of?(Regexp)
return generate(pattern.to_sp, expected_errors: expected_errors)
else
puts "pattern argument not valid on StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
StringPattern.log_message("pattern argument not valid on StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}")
return pattern.to_s
end

@@ -169,20 +171,20 @@ def StringPattern.generate(pattern, expected_errors: [], **synonyms)

unless deny_pattern
if required_data.size == 0 and expected_errors_left.include?(:required_data)
puts "required data not supplied on pattern so it won't be possible to generate a wrong string. StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
StringPattern.log_message("required data not supplied on pattern so it won't be possible to generate a wrong string. StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}")
return ""
end

if excluded_data.size == 0 and expected_errors_left.include?(:excluded_data)
puts "excluded data not supplied on pattern so it won't be possible to generate a wrong string. StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
StringPattern.log_message("excluded data not supplied on pattern so it won't be possible to generate a wrong string. StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}")
return ""
end

if expected_errors_left.include?(:string_set_not_allowed)
string_set_not_allowed = all_characters_set - string_set

if string_set_not_allowed.size == 0
puts "all characters are allowed so it won't be possible to generate a wrong string. StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
StringPattern.log_message("all characters are allowed so it won't be possible to generate a wrong string. StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}")
return ""
end
end
@@ -205,7 +207,7 @@ def StringPattern.generate(pattern, expected_errors: [], **synonyms)
expected_errors_left.delete(:length)
expected_errors_left.delete(:min_length)
else
puts "min_length is 0 so it won't be possible to generate a wrong string smaller than 0 characters. StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
StringPattern.log_message("min_length is 0 so it won't be possible to generate a wrong string smaller than 0 characters. StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}")
return ""
end
elsif expected_errors_left.include?(:max_length) or expected_errors_left.include?(:length)
@@ -264,7 +266,7 @@ def StringPattern.generate(pattern, expected_errors: [], **synonyms)
end
if ((0...string.length).find_all { |i| string[i, 1] == rd_to_set }).size == 0
if positions_to_set.size == 0
puts "pattern not valid on StringPattern.generate, not possible to generate a valid string: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
StringPattern.log_message("pattern not valid on StringPattern.generate, not possible to generate a valid string: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}")
return ""
else
k = positions_to_set.sample
@@ -289,7 +291,7 @@ def StringPattern.generate(pattern, expected_errors: [], **synonyms)
string_set_not_allowed = all_characters_set - string_set if string_set_not_allowed.size == 0

if string_set_not_allowed.size == 0
puts "Not possible to generate a non valid string on StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
StringPattern.log_message("Not possible to generate a non valid string on StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}")
return ""
end
(rand(string.size) + 1).times {
@@ -502,24 +504,11 @@ def StringPattern.generate(pattern, expected_errors: [], **synonyms)
expected_errors_left.delete(:string_set_not_allowed)
end

error_regular_expression = false
error_regular_expression = !StringPattern.valid_email?(string)

if deny_pattern and expected_errors.include?(:length)
good_result = true #it is already with wrong length
else
# I'm doing this because many times the regular expression checking hangs with these characters
wrong = %w(.. __ -- ._ _. .- -. _- -_ @. @_ @- .@ _@ -@ @@)
if !(Regexp.union(*wrong) === string) #don't include any or the wrong strings
if string.index("@").to_i > 0 and
string[0..(string.index("@") - 1)].scan(/([a-z0-9]+([\+\._\-][a-z0-9]|)*)/i).join == string[0..(string.index("@") - 1)] and
string[(string.index("@") + 1)..-1].scan(/([0-9a-z]+([\.-][a-z0-9]|)*)/i).join == string[string[(string.index("@") + 1)..-1]]
error_regular_expression = false
else
error_regular_expression = true
end
else
error_regular_expression = true
end

if expected_errors.size == 0
if error_regular_expression
@@ -540,7 +529,9 @@ def StringPattern.generate(pattern, expected_errors: [], **synonyms)
end
end until good_result or tries > 100
unless good_result
puts "Not possible to generate an email on StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
msg = "Not possible to generate an email on StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
raise StringPattern::GenerationImpossibleError, msg if @raise_on_error
StringPattern.log_message(msg)
return ""
end
end
@@ -569,7 +560,9 @@ def StringPattern.generate(pattern, expected_errors: [], **synonyms)
end
end
if @block_list_enabled
if @block_list.is_a?(Array)
if @block_list.respond_to?(:call)
good_result = false if @block_list.call(string)
elsif @block_list.is_a?(Array)
@block_list.each do |bl|
if string.match?(/#{bl}/i)
good_result = false
@@ -580,10 +573,14 @@ def StringPattern.generate(pattern, expected_errors: [], **synonyms)
end
end until good_result or tries > 10000
unless good_result
puts "Not possible to generate the string on StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
puts "Take in consideration if you are using StringPattern.dont_repeat=true that you don't try to generate more strings that are possible to be generated"
msg = "Not possible to generate the string on StringPattern.generate: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
msg += "\nTake in consideration if you are using StringPattern.dont_repeat=true that you don't try to generate more strings that are possible to be generated"
raise StringPattern::GenerationImpossibleError, msg if @raise_on_error
StringPattern.log_message(msg)
return ""
end
return string
ensure
srand(saved_rng) if saved_rng
end
end
19 changes: 3 additions & 16 deletions lib/string/pattern/validate.rb
@@ -54,7 +54,7 @@ def StringPattern.validate(text: "", pattern: "", expected_errors: [], not_expec
max_length = patt.max_length.clone
symbol_type = patt.symbol_type.clone
else
puts "String pattern class not supported (#{pat.class} for #{pat})"
StringPattern.log_message("String pattern class not supported (#{pat.class} for #{pat})")
return false
end

@@ -133,7 +133,7 @@ def StringPattern.validate(text: "", pattern: "", expected_errors: [], not_expec
required_chars << rd if rd.size == 1
}
if (required_chars.flatten & excluded_data.flatten).size > 0
puts "pattern argument not valid on StringPattern.validate, a character cannot be required and excluded at the same time: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}"
StringPattern.log_message("pattern argument not valid on StringPattern.validate, a character cannot be required and excluded at the same time: #{pattern.inspect}, expected_errors: #{expected_errors.inspect}")
return ""
end
end
@@ -183,20 +183,7 @@ def StringPattern.validate(text: "", pattern: "", expected_errors: [], not_expec
end
}
else #symbol_type=="@"
string = text_to_validate
wrong = %w(.. __ -- ._ _. .- -. _- -_ @. @_ @- .@ _@ -@ @@)
if !(Regexp.union(*wrong) === string) #don't include any or the wrong strings
if string.index("@").to_i > 0 and
string[0..(string.index("@") - 1)].scan(/([a-z0-9]+([\+\._\-][a-z0-9]|)*)/i).join == string[0..(string.index("@") - 1)] and
string[(string.index("@") + 1)..-1].scan(/([0-9a-z]+([\.-][a-z0-9]|)*)/i).join == string[string[(string.index("@") + 1)..-1]]
error_regular_expression = false
else
error_regular_expression = true
end
else
error_regular_expression = true
end

error_regular_expression = !StringPattern.valid_email?(text_to_validate)
if error_regular_expression
detected_errors.push(:value)
end