Plurimath Parslet

This is a fork of the original Parslet gem, which is a parser library for Ruby.

Unlike the original Parslet gem, this fork is compatible with Opal (a JavaScript-based Ruby implementation).

Introduction

Parslet makes developing complex parsers easy. It does so by:

  • providing the best error reporting possible

  • not generating reams of code for you to debug

Parslet takes the long way around to make your job easier. It allows for incremental language construction. Often, you start out small, implementing the atoms of your language first; parslet takes pride in making this possible.

Eager to try this out? Please see the associated web site: https://kschiess.github.io/parslet/

Synopsis

require 'parslet'
include Parslet

# parslet parses strings
str('foo').
  parse('foo') # => "foo"@0

# it matches character sets
match['abc'].parse('a') # => "a"@0
match['abc'].parse('b') # => "b"@0
match['abc'].parse('c') # => "c"@0

# and it annotates its output
str('foo').as(:important_bit).
  parse('foo') # => {:important_bit=>"foo"@0}

# you can construct parsers with just a few lines
quote = str('"')
simple_string = quote >> (quote.absent? >> any).repeat >> quote

simple_string.
  parse('"Simple Simple Simple"') # => "\"Simple Simple Simple\""@0

# or by making a fuss about it
class Smalltalk < Parslet::Parser
  root :smalltalk

  rule(:smalltalk) { statements }
  rule(:statements) {
    # insert smalltalk parser here (outside of the scope of this readme)
  }
end

# and then
Smalltalk.new.parse('smalltalk')

Features

  • Tools for every part of the parser chain

  • Transformers generate Abstract Syntax Trees

  • Accelerators transform parsers, making them quite a bit faster

  • Pluggable error reporters

  • Graphviz export for your parser

  • Rspec testing support rig

  • Simply Ruby, composable and hackable

Compatibility

This library is intended to work with Ruby variants >= 2.7. It has been tested on:

  • MRI Ruby 2.7+

  • JRuby

  • Opal (behaves as Ruby 3.2; StringScanner#charpos, which some examples use, is not available)

If you encounter issues, please report them as bugs.
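Code that needs a character offset can guard against the missing StringScanner#charpos. The following is only a sketch: char_position is a hypothetical helper, and the fallback assumes that StringScanner#pos is a byte offset, as it is on MRI. How pos behaves under Opal is not stated here, so verify the fallback on the target runtime before relying on it.

```ruby
require 'strscan'

# Hypothetical helper: character offset of the scanner, with a fallback for
# runtimes where StringScanner#charpos is missing.
def char_position(scanner)
  return scanner.charpos if scanner.respond_to?(:charpos)

  # Assumption: pos is a byte offset (true on MRI), so count the characters
  # contained in the bytes consumed so far.
  scanner.string.byteslice(0, scanner.pos).length
end

scanner = StringScanner.new('héllo wörld')
scanner.scan(/héllo /)
position = char_position(scanner)
puts position
```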

Status

Production worthy.

Installation

Add this line to your application’s Gemfile:

gem 'plurimath-parslet'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install plurimath-parslet

Usage

require 'parslet'

The gem maintains the original parslet namespace for compatibility.

Examples

The examples/ directory contains 27 interactive examples that demonstrate various parsing scenarios and techniques. These examples are designed to help you learn parslet by showing real-world parsing problems and their solutions.

How to run examples

Each example can be run directly from the command line:

$ ruby examples/run_boolean_algebra.rb
$ ruby examples/run_json.rb
$ ruby examples/run_calc.rb

Available examples

Example | Description | Key Features
run_boolean_algebra.rb | Boolean expression parser with operator precedence | AND/OR operators, parentheses, DNF transformation
run_calc.rb | Basic calculator with arithmetic operations | Operator precedence, expression evaluation
run_capture.rb | Named capture groups and result extraction | Capture syntax, result processing
run_comments.rb | Comment parsing (single-line and multi-line) | Comment syntax, nested structures
run_deepest_errors.rb | Advanced error reporting and debugging | Error handling, parse failure analysis
run_documentation.rb | Markdown-style documentation parser | Headers, lists, formatting, code blocks
run_email_parser.rb | Email address validation and parsing | Email format validation, domain parsing
run_empty.rb | Empty rule behavior and edge cases | Empty matches, optional content
run_erb.rb | ERB template parsing | Template syntax, embedded Ruby code
run_ip_address.rb | IPv4 address parsing and validation | IP format validation, octet parsing
run_json.rb | Complete JSON parser with all data types | Objects, arrays, strings, numbers, booleans, null
run_local.rb | Local variable scoping demonstration | Variable declarations, scope management
run_mathn.rb | Mathematical expression parsing | Math operations, Ruby mathn compatibility
run_minilisp.rb | Minimal Lisp interpreter | S-expressions, nested structures, symbols
run_modularity.rb | Modular parser design patterns | Parser composition, reusable components
run_nested_errors.rb | Nested error handling strategies | Error propagation, context preservation
run_optimized_erb.rb | Performance-optimized ERB parsing | Greedy parsing, performance comparison
run_parens.rb | Parentheses matching and balancing | Balanced expressions, nesting validation
run_prec_calc.rb | Calculator with full operator precedence | Complex precedence rules, associativity
run_readme.rb | README-style documentation parsing | Document structure, sections, formatting
run_scopes.rb | Variable scope handling in parsers | Block scoping, variable shadowing
run_seasons.rb | Transform chains and data processing | Multi-stage transformations, data flow
run_sentence.rb | Natural language sentence parsing | Grammar rules, sentence structure
run_simple_xml.rb | Basic XML parsing | Tags, attributes, nested elements
run_string_parser.rb | String literal parsing with escaping | Quote handling, escape sequences
run_html5.rb | HTML5 document parsing with modern features | DOCTYPE, void elements, attributes, nested tags, comments
run_markdown.rb | Markdown document parsing and transformation | Headers, paragraphs, bold/italic text, links, lists, code blocks

Example structure

Each example follows a consistent structure:

  • Educational comments explaining the parsing problem

  • Sample input data demonstrating various test cases

  • Parser demonstration showing both successful parsing and error handling

  • Output explanation describing what the parser supports and how it works

Learning path

For beginners, we recommend starting with these examples in order:

  1. run_simple_xml.rb - Basic parsing concepts

  2. run_calc.rb - Operator precedence and evaluation

  3. run_json.rb - Complex data structures

  4. run_boolean_algebra.rb - Transformations and logic

  5. run_minilisp.rb - Advanced parsing techniques

Development

After checking out the repo, run:

$ bundle install

Available rake tasks

Testing

  • rake spec - Run all tests (438 examples covering all functionality)

  • rake spec:unit - Run unit tests only

  • rake spec:opal - Run Opal (JavaScript) tests (437 examples)

Running Opal tests

The Opal test suite runs the same specs as the Ruby test suite but in a JavaScript environment via Node.js. This ensures parslet works correctly when compiled to JavaScript with Opal.

To run all Opal tests:

$ bundle exec rake spec:opal

The Opal specs are located in the spec-opal/ directory and mirror the structure of the main spec/ directory.

Note: Some Opal tests may fail due to environment differences between Ruby and JavaScript execution, but the core parsing functionality is fully supported.

Benchmarking

  • rake benchmark - Run quick benchmarks (alias for benchmark:quick)

  • rake benchmark:quick - Run example-focused benchmarks only

  • rake benchmark:examples - Run example-focused benchmarks

  • rake benchmark:all - Run comprehensive benchmark suite (all categories)

  • rake benchmark:export - Run benchmarks and export results to JSON/YAML files

What gets benchmarked

The benchmark suite measures parsing performance across different scenarios:

Basic Parsing Operations

  • str('hello') - Simple string matching performance

  • match('[a-z]').repeat(1) - Character class matching with repetition

  • Email-like pattern matching - Complex regex-style parsing of an email-address-shaped input

Calculator Parser (from examples/calc.rb)

  • Simple expressions: 1+2

  • Medium complexity: 1+2*3-4/2

  • Complex expressions: 123*456+789-321/3*2+1

  • Full pipeline (parse + transform + evaluate)

JSON Parser (from examples/json.rb)

  • Simple objects: {"key": "value"}

  • Arrays: [1, 2, 3, 4, 5]

  • Complex nested structures with multiple data types

  • Parse vs. transform performance comparison

String Parsing

  • Simple quoted strings: "hello world"

  • Long strings (1000+ characters)

  • Escaped strings with backslash sequences: "hello \"world\" with escapes"

Repetition Patterns

  • repeat(1) with varying input lengths (short/medium/long)

  • Bounded repetition repeat(3,6)

  • Optional repetition repeat (zero or more)

  • Performance scaling with input size

Transform Operations

  • Simple AST transformations (number/string conversion)

  • Medium complexity (multiple rules, arrays)

  • Complex nested transformations with multiple rule types

Sample benchmark output

Plurimath Parslet Performance Benchmarks
==================================================

Basic Parsing Operations
------------------------------
ruby 3.3.2 (2024-05-30 revision e5a195edf6) [arm64-darwin23]
Warming up --------------------------------------
        str('hello')    17.235k i/100ms
match('[a-z]').repeat(1)
                         3.502k i/100ms
  email-like pattern     2.780k i/100ms
Calculating -------------------------------------
        str('hello')    174.636k (± 2.1%) i/s    (5.73 μs/i)
match('[a-z]').repeat(1)
                         35.182k (± 2.6%) i/s   (28.42 μs/i)
  email-like pattern     27.874k (± 8.5%) i/s   (35.88 μs/i)

Comparison:
        str('hello'):   174636.1 i/s
match('[a-z]').repeat(1):    35182.1 i/s - 4.96x  slower
  email-like pattern:    27873.8 i/s - 6.27x  slower

Calculator Parser Benchmarks
------------------------------
 parse simple: '1+2'     18.791k (± 3.2%) i/s   (53.22 μs/i)
parse medium: '1+2*3-4/2'
                          8.871k (± 6.4%) i/s  (112.73 μs/i)
parse complex: '123*456+789-321/3*2+1'
                          5.872k (± 4.3%) i/s  (170.30 μs/i)
    full calc simple      7.516k (± 8.5%) i/s  (133.06 μs/i)
   full calc complex      3.018k (± 1.9%) i/s  (331.34 μs/i)
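
The columns in the sample output are related by simple arithmetic: microseconds per iteration is one million divided by iterations per second, and the "x slower" figure is the fastest rate divided by the rate in question. The sketch below recomputes both from the three rates in the Comparison block; RATES and report are names invented for this illustration.

```ruby
# Iterations per second, copied from the Comparison block of the sample output.
RATES = {
  "str('hello')"             => 174_636.1,
  "match('[a-z]').repeat(1)" => 35_182.1,
  'email-like pattern'       => 27_873.8
}.freeze

fastest = RATES.values.max

report = RATES.map do |label, rate|
  {
    label: label,
    micros_per_iteration: (1_000_000 / rate).round(2), # μs/i = 10^6 / (i/s)
    times_slower: (fastest / rate).round(2)            # ratio to the fastest entry
  }
end

report.each do |row|
  puts format('%-26s %8.2f μs/i  %5.2fx',
              row[:label], row[:micros_per_iteration], row[:times_slower])
end
```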

Benchmark results export

Results are exported to multiple formats for analysis:

  • benchmark/results.json - Detailed benchmark data with iterations/second, standard deviation, and microseconds per iteration

  • benchmark/results.yaml - YAML format results for easy reading

  • benchmark/summary.json - Performance summary with fastest/slowest operations and insights

  • benchmark/summary.yaml - YAML format summary

The exported data includes:

  • Ruby version and platform information

  • Parslet version and benchmark tool versions

  • Detailed performance metrics for each test case

  • Statistical analysis (standard deviation, error percentages)

  • Performance comparisons and insights

  • Identification of performance bottlenecks and optimization opportunities

Building and distribution

  • rake build - Build plurimath-parslet-3.0.0.gem into the pkg directory

  • rake build:checksum - Generate SHA512 checksum of the gem

  • rake install - Build and install gem into system gems

  • rake install:local - Build and install gem without network access

  • rake release[remote] - Create tag and push gem to rubygems.org

Documentation

  • rake rdoc - Build RDoc HTML files

  • rake rdoc:coverage - Print RDoc coverage report

  • rake rerdoc - Rebuild RDoc HTML files

Maintenance

  • rake clean - Remove temporary products

  • rake clobber - Remove generated files

  • rake clobber_rdoc - Remove RDoc HTML files

  • rake stat - Print lines of code statistics

Example coverage

All 27 examples in the examples/ directory are covered by specs and tested automatically:

  • boolean_algebra.rb, calc.rb, capture.rb, comments.rb, deepest_errors.rb

  • documentation.rb, email_parser.rb, empty.rb, erb.rb, html5.rb

  • ip_address.rb, json.rb, local.rb, markdown.rb, mathn.rb

  • minilisp.rb, modularity.rb, nested_errors.rb, optimized_erb.rb, parens.rb

  • prec_calc.rb, readme.rb, scopes.rb, seasons.rb, sentence.rb

  • simple_xml.rb, string_parser.rb

Contributing

  1. Fork it

  2. Create your feature branch (git checkout -b my-new-feature)

  3. Commit your changes (git commit -am 'Add some feature')

  4. Push to the branch (git push origin my-new-feature)

  5. Create a new Pull Request

License

The gem is available as open source under the terms of the MIT License.

(c) 2010-2018 Kaspar Schiess.

2025 Augmented by Ribose Inc.
