Reference
=========

At the core of DHParser lies a parser generator for parsing expression
grammars. As a parser generator it offers functionality similar to that of
pyparsing_ or lark_. But it goes far beyond a mere parser generator by
offering rich support for testing and debugging grammars,
tree-processing (always needed in the XML-prone Digital Humanities ;-),
fail-tolerant grammars and some (as of now, experimental) support for
editing via the `language server protocol`_.


:py:mod:`ebnf`
   Although DHParser also offers a
   Python-interface for specifying grammars (similar to pyparsing_), the
   recommended way of using DHParser is by specifying the grammar in
   EBNF_. The documentation of this module describes how grammars are
   specified in EBNF_, how parsers can be auto-generated from these
   grammars and how they are used to parse text.

:py:mod:`nodetree`
   Syntax-trees are the central
   data-structure of any parsing system. The description of this module
   explains how syntax-trees are represented within DHParser and how they
   can be manipulated, queried and serialized or deserialized as XML,
   S-expressions or JSON.

:py:mod:`transform`
   In digital humanities applications, document trees are often
   transformed again and again to produce different representations of
   research data or various output forms. DHParser supplies the
   scaffolding for two different types of tree transformations, both of
   which are variations of the `visitor pattern`_. The scaffolding
   supplied by the transform-module allows specifying
   tree-transformations in a declarative style by filling in a
   dictionary of tag-names with lists of transformation functions that
   are called in sequence on a node. A number of pre-defined
   transformations cover the most common cases, in particular those that
   occur when transforming concrete syntax trees to more abstract syntax
   trees. (An example of this kind of declaratively specified
   transformation is the ``EBNF_AST_transformation_table`` within
   DHParser's ebnf-module.)

:py:mod:`compile`
   offers an
   object-oriented scaffolding for the `visitor pattern`_ that is more
   suitable for complex transformations that make heavy use of
   algorithms as well as transformations from trees to non-tree objects
   like program code. (An example of the latter kind of transformation
   is the :py:class:`~ebnf.EBNFCompiler`-class of DHParser's
   ebnf-module.)

:py:mod:`pipeline`
   offers support for "processing-pipelines" composed of "junctions".
   A processing-pipeline consists of a series of tree-transformations
   that are applied in sequence. "Junctions" declare which
   source-tree-stage is transformed by which transformation-routine into
   which destination tree-stage. Processing-pipelines can contain
   bifurcations, which are needed if different kinds of output-data
   are to be derived from one source-document.

:py:mod:`testing`
   provides a rich framework for
   unit-testing of grammars, parsers and any kind of tree-transformation.
   Usually, developers will not need to interact with this module directly,
   but can rely on the unit-testing script generated by the "dhparser.py"
   command-line tool. The tests themselves are specified declaratively
   in test-input-files (in the very simple ".ini"-format) that reside by
   default in the "test_grammar"-directory of a DHParser-project.

:py:mod:`preprocess`
   provides support for DSL-pre-processors as well as source
   mapping of (error-)locations from the preprocessed document to the original
   document(s). Pre-processors are a practical means for adding features to
   a DSL which are difficult or impossible to define with context-free-grammars
   in EBNF-notation, like for example scoping based on indentation (as used
   by Python) or chaining of source-texts via an "include"-directive.

:py:mod:`parse`
   contains the parsing algorithms and the
   Python-Interface for defining parsers. DHParser features a packrat-parser
   for parsing-expression-grammars with full left-recursion support as well
   as configurable error catching and continuation after errors. The
   Python-Interface allows defining grammars directly as Python-code
   without the need to compile an EBNF-grammar first. This is an alternative
   approach to defining grammars, similar to that of pyparsing_.

:py:mod:`dsl`
   contains high-level functions for compiling
   ebnf-grammars and domain specific languages "on the fly".

:py:mod:`error`
   defines the ``Error``-class, the objects of which describe
   errors in the source document. Errors are defined by - at least - an
   error code (indicating at the same time the level of severity), a human
   readable error message and a position in the source text.

:py:mod:`trace`
   Apart from unit-testing, DHParser offers "post-mortem"
   debugging of the parsing process itself, as described in the
   :doc:`StepByStepGuide`. This is helpful for figuring out why a parser went
   wrong. Again, there is little need to interact with this module directly,
   as its functionality is turned on by setting the configuration variables
   ``history_tracking`` and, for tracing continuation after errors,
   ``resume_notices``, which in turn can be triggered by calling the
   auto-generated "-Parser.py"-scripts with the parameter ``--debug``.

:py:mod:`log`
   logging facilities for DHParser as well as tracking of the
   parsing-history in connection with module :py:mod:`trace`.

:py:mod:`configuration`
    the central place for all configuration settings of
    DHParser. Be sure to use the ``access``, ``set`` and ``get`` functions
    to change presets and configuration values in order to make sure that
    changes to the configuration work when used in combination with
    multithreading or multiprocessing.

:py:mod:`server`
    In order to avoid startup times or to provide a language
    server for a domain-specific language (DSL), DSL-parsers generated by
    DHParser can be run as a server. Module :py:mod:`server` provides
    the scaffolding for an asynchronous language server. The
    "-Server.py"-script generated by DHParser provides a minimal language
    server that is sufficient for compiling a DSL. Especially if used with
    the just-in-time compiler `pypy`_, the "-Server.py"-script allows for
    a significant speed-up.

lsp
    (as of now, this is just a stub!) provides data classes that
    resemble the TypeScript-interfaces of the `language server protocol specification`_.

:py:mod:`stringview`
    defines a low-level class that provides views on slices
    of strings. It is used by the :py:mod:`parse`-module to avoid
    excessive copying of data when slicing strings. (By design, Python
    always creates a copy of the data when slicing strings.) This
    module, more than any other, can be sped up significantly by
    compiling it with cython_. (Use the ``cythonize_stringview``-script
    in DHParser's main directory or, even better, compile (almost) all
    modules with the ``build_cython-modules``-script. This yields a 2-3x
    speed increase.)

:py:mod:`toolkit`
    various little helper functions for DHParser. Usually,
    there is no need to call any of these directly.



Module ``ebnf``
---------------

.. automodule:: ebnf
   :members:

Module ``nodetree``
-------------------

.. automodule:: nodetree
   :members:

Module ``transform``
--------------------

.. automodule:: transform
   :members:
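
The declarative style of the transform-module, a dictionary of tag-names
with lists of transformation functions that are called in sequence on a
node, can be illustrated with a minimal, self-contained sketch. It does not
use DHParser's API: ``ToyNode``, ``apply_table`` and the three transformation
functions are hypothetical stand-ins, and the children-first traversal order
is an assumption made for this example only.

.. code-block:: python

   from dataclasses import dataclass, field
   from typing import Callable, Dict, List


   @dataclass
   class ToyNode:
       """Hypothetical stand-in for a tree node (not DHParser's node class)."""
       name: str
       children: List["ToyNode"] = field(default_factory=list)
       text: str = ""


   def strip_text(node: ToyNode) -> None:
       """Remove surrounding whitespace from the node's text."""
       node.text = node.text.strip()


   def drop_whitespace_children(node: ToyNode) -> None:
       """Remove all children that are named "WS"."""
       node.children = [c for c in node.children if c.name != "WS"]


   def collapse_single_child(node: ToyNode) -> None:
       """Let a node with exactly one child take over that child's content."""
       if len(node.children) == 1:
           child = node.children[0]
           node.name, node.text, node.children = (
               child.name, child.text, child.children)


   # The table maps tag-names to functions that are called in sequence.
   TransformationTable = Dict[str, List[Callable[[ToyNode], None]]]


   def apply_table(node: ToyNode, table: TransformationTable) -> None:
       """Visit the children first, then run the functions for the node's name."""
       for child in node.children:
           apply_table(child, table)
       for transformation in table.get(node.name, []):
           transformation(node)


   table: TransformationTable = {
       "number": [strip_text],
       "term": [drop_whitespace_children, collapse_single_child],
   }

   tree = ToyNode("expr", [
       ToyNode("term", [ToyNode("WS", text=" "),
                        ToyNode("number", text=" 42 ")]),
   ])
   apply_table(tree, table)
   print(tree)

In this toy tree, the ``term`` node first loses its whitespace child and is
then replaced by the content of its single remaining ``number`` child, so
that ``expr`` ends up with one child named ``number`` with the text ``42``.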

Module ``compile``
------------------

.. automodule:: compile
   :members:

Module ``pipeline``
-------------------

.. automodule:: pipeline
   :members:
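
The idea of junctions, which declare the source stage, the transformation
routine and the destination stage, can be sketched in a few lines of plain
Python. The names ``ToyJunction`` and ``run_pipeline`` as well as the stage
names are hypothetical and chosen for illustration only; the sketch does not
reproduce the interface of the pipeline-module.

.. code-block:: python

   from typing import Any, Callable, Dict, List, NamedTuple


   class ToyJunction(NamedTuple):
       """Hypothetical junction: source stage, transformation, target stage."""
       source: str
       transformation: Callable[[Any], Any]
       target: str


   def run_pipeline(start_stage: str, data: Any,
                    junctions: List[ToyJunction]) -> Dict[str, Any]:
       """Apply each junction once its source stage exists; return all stages."""
       results = {start_stage: data}
       pending = list(junctions)
       progress = True
       while pending and progress:
           progress = False
           for junction in list(pending):
               if junction.source in results:
                   source_data = results[junction.source]
                   results[junction.target] = junction.transformation(source_data)
                   pending.remove(junction)
                   progress = True
       return results


   def to_ast(cst: List[str]) -> List[str]:
       """Toy 'concrete to abstract' step: drop whitespace-only tokens."""
       return [token for token in cst if token.strip()]


   def to_html(ast: List[str]) -> str:
       return "<p>" + " ".join(ast) + "</p>"


   def to_text(ast: List[str]) -> str:
       return " ".join(ast)


   junctions = [
       ToyJunction("ast", to_html, "html"),  # runs only after "ast" exists
       ToyJunction("cst", to_ast, "ast"),
       ToyJunction("ast", to_text, "text"),  # bifurcation: second "ast" output
   ]
   results = run_pipeline("cst", ["Hello", " ", "World"], junctions)
   print(results["html"], results["text"])

Because two junctions share the source stage ``ast``, the pipeline bifurcates
and yields both an HTML and a plain-text result from the same source.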

Module ``parse``
----------------

.. automodule:: parse
   :members:

Module ``dsl``
--------------

.. automodule:: dsl
   :members:

Module ``preprocess``
---------------------

.. automodule:: preprocess
   :members:
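
Source mapping, that is, mapping locations in the preprocessed document back
to the original document, can be sketched as follows. The example is a
simplified illustration under stated assumptions: the toy preprocessor
``strip_comments`` and the function ``map_to_source`` are hypothetical names,
and representing the mapping as two parallel lists of positions and offsets
is a design choice made for this sketch.

.. code-block:: python

   import bisect
   import re
   from typing import List, Tuple


   def strip_comments(source: str) -> Tuple[str, List[int], List[int]]:
       """Remove '#'-comments and record how positions have shifted.

       Returns the preprocessed text together with two parallel lists:
       positions in the preprocessed text and the offset that must be added
       from that position onward to get the original position.
       """
       chunks: List[str] = []
       positions, offsets = [0], [0]
       last_end = preprocessed_length = removed = 0
       for match in re.finditer(r"#[^\n]*", source):
           chunk = source[last_end:match.start()]
           chunks.append(chunk)
           preprocessed_length += len(chunk)
           removed += match.end() - match.start()
           positions.append(preprocessed_length)
           offsets.append(removed)
           last_end = match.end()
       chunks.append(source[last_end:])
       return "".join(chunks), positions, offsets


   def map_to_source(pos: int, positions: List[int], offsets: List[int]) -> int:
       """Map a position in the preprocessed text to the original text."""
       index = bisect.bisect_right(positions, pos) - 1
       return pos + offsets[index]


   source = "a = 1 # one\nb = 2\n"
   preprocessed, positions, offsets = strip_comments(source)
   error_pos = preprocessed.index("b")
   original_pos = map_to_source(error_pos, positions, offsets)
   print(error_pos, original_pos, source[original_pos])

An error located at the ``b`` in the preprocessed text is thus reported at
the position of the ``b`` in the original text, although the comment in
front of it has been removed.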

Module ``error``
----------------

.. automodule:: error
   :members:
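
The three ingredients of an error object named above, an error code that also
indicates the level of severity, a human-readable message and a position in
the source text, can be sketched like this. ``ToyError``, ``line_col`` and the
numeric thresholds are assumptions made for this illustration; they are not
taken from the error-module.

.. code-block:: python

   from dataclasses import dataclass
   from typing import Tuple

   # Assumed thresholds: the range of the code determines the severity.
   WARNING, ERROR, FATAL = 100, 1000, 10000


   @dataclass(frozen=True)
   class ToyError:
       """Hypothetical error record: message, source position and code."""
       message: str
       pos: int
       code: int = ERROR

       @property
       def severity(self) -> str:
           """Derive the level of severity from the error code."""
           if self.code >= FATAL:
               return "fatal"
           if self.code >= ERROR:
               return "error"
           if self.code >= WARNING:
               return "warning"
           return "notice"


   def line_col(text: str, pos: int) -> Tuple[int, int]:
       """Convert a position in the text to a 1-based (line, column) pair."""
       line = text.count("\n", 0, pos) + 1
       column = pos - (text.rfind("\n", 0, pos) + 1) + 1
       return line, column


   document = "alpha\nbeta gamma\n"
   error = ToyError("unexpected token", document.index("gamma"), 1010)
   line, column = line_col(document, error.pos)
   print(f"{line}:{column}: {error.severity} ({error.code}): {error.message}")

Storing only the position keeps the error object small; line and column can
be computed from the source text when the error is reported.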

Module ``testing``
------------------

.. automodule:: testing
   :members:

Module ``trace``
----------------

.. automodule:: trace
   :members:

Module ``log``
--------------

.. automodule:: log
   :members:

Module ``configuration``
------------------------

.. automodule:: configuration
   :members:

Module ``server``
-----------------

.. automodule:: server
   :members:

Module ``stringview``
---------------------

.. automodule:: stringview
   :members:
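
What a view on a slice of a string means can be shown with a minimal sketch.
The class ``ToyStringView`` below is a hypothetical, heavily reduced
illustration and not the class defined in the stringview-module: it merely
stores a reference to the full string together with a begin and an end index,
so that slicing a view creates a new view instead of copying characters.

.. code-block:: python

   class ToyStringView:
       """Hypothetical view on a slice of a string that avoids copying."""
       __slots__ = ("_text", "_begin", "_end")

       def __init__(self, text: str, begin: int = 0, end: int = None):
           self._text = text
           self._begin = begin
           self._end = len(text) if end is None else end

       def __len__(self) -> int:
           return self._end - self._begin

       def __getitem__(self, index):
           if isinstance(index, slice):
               start, stop, step = index.indices(len(self))
               if step != 1:
                   raise ValueError("only slices with step 1 are supported")
               stop = max(start, stop)
               # A new view shares the same underlying string object.
               return ToyStringView(self._text,
                                    self._begin + start, self._begin + stop)
           if index < 0:
               index += len(self)
           if not 0 <= index < len(self):
               raise IndexError("index out of range")
           return self._text[self._begin + index]

       def __str__(self) -> str:
           # Only here are the characters of the slice actually copied.
           return self._text[self._begin:self._end]

       def startswith(self, prefix: str) -> bool:
           return self._text.startswith(prefix, self._begin, self._end)


   view = ToyStringView("parsing expression grammar")
   rest = view[8:]    # a view, not a copy
   word = rest[:10]
   print(str(rest), "|", str(word), "|", len(word), "|", rest.startswith("expr"))

A parser that repeatedly advances through the remaining text can pass such
views along and only convert them to real strings where the matched content
is actually needed.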

Module ``toolkit``
------------------

.. automodule:: toolkit
   :members:

Module ``versionnumber``
------------------------

.. automodule:: versionnumber
   :members:


.. _pyparsing: https://github.com/pyparsing/pyparsing/
.. _lark: https://github.com/lark-parser/lark
.. _cython: https://cython.org/
.. _`language server`: https://langserver.org/
.. _`language server protocol`: https://microsoft.github.io/language-server-protocol/
.. _`language server protocol specification`: https://microsoft.github.io/language-server-protocol/specifications/specification-current/
.. _EBNF: https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form
.. _`visitor pattern`: https://en.wikipedia.org/wiki/Visitor_pattern
.. _pypy: https://www.pypy.org/
