Regularity is a friendly regular expression builder for R, inspired by Ruby's Regularity library and R's magrittr package.
What's simpler to create / decipher?
/^[0-9]{3}-[A-Za-z]{2}#?[a|b]a{2,4}\$$/
Or
Regularity() %>%
StartWith(3, 'digits') %>%
Then('-') %>%
Then(2, 'letters') %>%
Maybe('#') %>%
OneOf(c('a','b')) %>%
Between(c(2,4), 'a') %>%
EndWith('$')
I know which I'd choose!
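For comparison, the raw expression above can be checked directly in base R. This is a minimal sketch that uses only `grepl`; the sample strings are made up for illustration, and the pattern is written as an R string, so the backslash is doubled.

```r
# The raw regex from above, written as an R string (backslash doubled).
pattern <- "^[0-9]{3}-[A-Za-z]{2}#?[a|b]a{2,4}\\$$"

# Made-up sample inputs for illustration.
grepl(pattern, "123-xy#baaa$")  # three digits, dash, two letters, '#', 'b', 'aaa', '$'
grepl(pattern, "123-xybaa$")    # the '#' is optional
grepl(pattern, "12-xy#baa$")    # only two leading digits, so no match

# Inside a bracket expression '|' is a literal character,
# so [a|b] also accepts a pipe at that position.
grepl(pattern, "123-xy|aa$")
```

The last call is one reason the hand-written form is easy to get subtly wrong: `[a|b]` reads like an alternation but is a three-character set.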
Regularity is not currently on CRAN as it is still in early development. In the meantime, it can be installed in R using devtools:
devtools::install_github("martineastwood/Regularity")
All you need to do is create a Regularity object and then chain the regex functions together. The functions take either a single pattern, e.g. Then("xyz"), or a numbered constraint, e.g. Then(2, 'digits').
The following special identifiers are supported:
digit <- '[0-9]'
lowercase <- '[a-z]'
uppercase <- '[A-Z]'
letter <- '[A-Za-z]'
alphanumeric <- '[A-Za-z0-9]'
whitespace <- '\s'
space <- ' '
tab <- '\t'
Also, it doesn't matter if these identifiers are pluralized, i.e. Then(2, 'letters') works, as does Then(1, 'letter').
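To make the identifier lookup concrete, here is a minimal sketch of how plural and singular names could resolve to the same character class. It is not the package's actual code: the `identifiers` table and the `resolve_identifier` helper are hypothetical names chosen for this example, and anything that is not a known identifier is simply passed through unchanged.

```r
# Hypothetical lookup table mirroring the identifiers listed above.
identifiers <- c(
  digit        = "[0-9]",
  lowercase    = "[a-z]",
  uppercase    = "[A-Z]",
  letter       = "[A-Za-z]",
  alphanumeric = "[A-Za-z0-9]",
  whitespace   = "\\s",  # the regex escape \s, written as an R string
  space        = " ",
  tab          = "\t"    # a literal tab character
)

# Hypothetical helper: drop one trailing "s", then look the name up.
# Names that are not identifiers are returned as-is (treated as patterns).
resolve_identifier <- function(name) {
  singular <- sub("s$", "", name)
  if (singular %in% names(identifiers)) {
    identifiers[[singular]]
  } else {
    name
  }
}

resolve_identifier("letters")  # "[A-Za-z]"
resolve_identifier("letter")   # "[A-Za-z]"
resolve_identifier("xyz")      # "xyz"
```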
The following functions are currently supported:
StartWith(pattern): The line must start with the specified pattern. This must be called before any of the other functions. (Also aliased to StartsWith).
Append(pattern): Append a pattern to the end (Also aliased to Then), e.g. Append('abc')
EndWith(pattern): The line must end with the specified pattern. This must be the final function called, e.g. EndWith('X')
Maybe(pattern): Zero or one of the specified pattern, e.g. Maybe(4, 'digits')
OneOf(values): Specify a choice, e.g. OneOf(c('a', 'b', 'c'))
Between(range, pattern): Specify a bounded repetition, e.g. Between(c(2,4), 'digits')
ZeroOrMore(pattern): Specify that the pattern or identifier should appear zero or many times, e.g. ZeroOrMore('letters')
OneOrMore(pattern): Specify that the pattern or identifier should appear one or many times, e.g. OneOrMore('letters')
AtLeast(n, pattern): Specify that the pattern or identifier should appear n or more times, e.g. AtLeast(5, 'letters')
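The chaining idea behind these functions can be sketched in a few lines of base R. This is an illustration only, not how Regularity is implemented: every `sketch_*` function below is a hypothetical stand-in that appends a fragment to a plain regex string, the character classes are written out by hand rather than looked up by identifier, and the native pipe `|>` (R 4.1 or later) is used in place of magrittr's `%>%` so the block has no dependencies.

```r
# Hypothetical helper: repeat a pattern n times, grouping it when n > 1.
sketch_repeat <- function(pattern, n = 1) {
  if (n == 1) pattern else paste0("(", pattern, "){", n, "}")
}

# Hypothetical builder steps: each takes the regex so far and appends to it.
sketch_start_with <- function(re, pattern, n = 1) paste0(re, "^", sketch_repeat(pattern, n))
sketch_then       <- function(re, pattern, n = 1) paste0(re, sketch_repeat(pattern, n))
sketch_maybe      <- function(re, pattern) paste0(re, "(", pattern, ")?")
sketch_one_of     <- function(re, values) paste0(re, "(", paste(values, collapse = "|"), ")")
sketch_between    <- function(re, range, pattern) {
  paste0(re, "(", pattern, "){", range[1], ",", range[2], "}")
}
sketch_at_least   <- function(re, n, pattern) paste0(re, "(", pattern, "){", n, ",}")
sketch_end_with   <- function(re, pattern) paste0(re, pattern, "$")

# Rebuild the opening example; the literal dollar sign is escaped by hand.
re <- "" |>
  sketch_start_with("[0-9]", 3) |>
  sketch_then("-") |>
  sketch_then("[A-Za-z]", 2) |>
  sketch_maybe("#") |>
  sketch_one_of(c("a", "b")) |>
  sketch_between(c(2, 4), "a") |>
  sketch_end_with("\\$")

grepl(re, "123-xy#baaa$")  # matches
grepl(re, "123-xy#ba$")    # only one trailing 'a', so no match
```

Because each step receives the expression built so far as its first argument and returns the extended expression, the steps compose naturally with a pipe, which is the property the function list above relies on.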