keywords
python.module
fault.syntax

Programming language profile and parser class for Keyword Oriented Syntax.

Provides a data structure for describing keyword based languages that allows a naive processor to assign classifications to the tokens contained by a string.

Profile instances have limited information regarding languages, and Parser instances should have naive processing algorithms. This intentional deficiency is a product of the goal to keep Keywords based interpretations trivial to implement, and their parameter set small, so that the profile may be quickly and easily defined by users without requiring volumes of documentation to be consumed.

Parser Process Methods

Parser instances have two high-level methods for producing tokens: Parser.process_lines and Parser.process_document. The former resets the context provided to Parser.delimit for each line, and the latter maintains it throughout the lifetime of the generator.

While Parser.process_document appears desirable in many cases, the limitations of Profile instances may make it reasonable to choose Parser.process_lines in order to avoid the effects of an inaccurate profile or a language that maintains ambiguities with respect to the parser's capabilities.
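The trade-off can be seen with a toy classifier; this is a hedged illustration, not the fault.syntax.keywords API, using an invented language with paired comments to show where per-line resets and document-wide context disagree:

```python
def classify_line(line, in_comment=False):
    # Classify a whole line as 'excluded' or 'included' based on comment
    # state; handles at most one delimiter per line to keep the sketch small.
    if in_comment:
        return ('excluded', '*/' not in line)
    if '/*' in line and '*/' not in line:
        return ('included', True)
    return ('included', False)

def per_line(lines):
    # process_lines-style: state resets at every line boundary.
    return [classify_line(l, False)[0] for l in lines]

def whole_document(lines):
    # process_document-style: state carries across lines.
    out, state = [], False
    for l in lines:
        c, state = classify_line(l, state)
        out.append(c)
    return out

src = ["x = 1 /* start", "still comment", "*/ y = 2"]
assert per_line(src) == ['included', 'included', 'included']
assert whole_document(src) == ['included', 'excluded', 'excluded']
```

With an accurate profile the document-wide pass is correct; with an inaccurate one, the per-line pass limits the damage of a misread delimiter to a single line.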

Engineering

Essentially, this is a lexer whose tokens are defined by Profile instances. The languages that are a good match for applications are usually keyword based and leverage whitespace for the isolation of fields.

typing
import

itertools
import

functools
import

string
import

Tokens
data

Tokens = typing.Iterable[typing.Tuple[str,str,str]]

Profile
class

Data structure describing the elements of a Keyword Oriented Syntax.

Empty strings present in any of these sets will usually refer to the End of Line. This notation is primarily intended for area exclusions for supporting line comments, but literals and enclosures may also use them to represent the beginning or end of a line.
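As a hedged sketch of that notation, with invented data rather than a real Profile: an exclusion pair whose stop is the empty string marks a comment that runs to the End of Line.

```python
# Hypothetical exclusion pairs; the empty stop string means "until End of Line".
exclusions = {
    ("#", ""),       # line comment: '#' to end of line
    ("/*", "*/"),    # conventional paired comment
}

def comment_span(line, start, stop):
    # Locate the excluded region; an empty stop extends to the line's end.
    i = line.find(start)
    if i < 0:
        return None
    if stop == "":
        return (i, len(line))
    j = line.find(stop, i + len(start))
    return (i, len(line) if j < 0 else j + len(stop))

assert comment_span("x = 1  # note", "#", "") == (7, 13)
```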

While Profile is a tuple subclass, indexes should not be used to access members.

Profile.__slots__
data

__slots__ = ()

Profile.from_keywords_v1
classmethod

from_keywords_v1(Class, **wordtypes)
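A minimal analogue of this constructor, under the assumption that each keyword argument names a classification and provides its identifiers; MiniProfile and its single field are invented for illustration, not the actual fault.syntax class:

```python
import typing

class MiniProfile(tuple):
    # Tuple subclass, like Profile; access via properties, never by index.
    __slots__ = ()

    @property
    def words(self) -> typing.Mapping[str, typing.Set[str]]:
        return self[0]

    @classmethod
    def from_keywords_v1(Class, **wordtypes):
        # Each keyword argument becomes a classification mapped to a word set.
        return Class(({k: set(v) for k, v in wordtypes.items()},))

p = MiniProfile.from_keywords_v1(
    keyword=["if", "else", "while"],
    coreword=["print", "len"],
)
assert p.words["keyword"] == {"if", "else", "while"}
```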

Profile.words
property
typing.Mapping[str, typing.Set[str]]

Dictionary associating sets of identifier strings with a classification identifier.

Profile.exclusions
property
typing.Set[typing.Tuple[str,str]]

Comment start and stop pairs delimiting an excluded area from the source.

Exclusions are given the second highest priority by Parser.

Profile.literals
property
typing.Set[typing.Tuple[str,str]]

The start and stop pairs delimiting a literal area within the syntax document. Primarily used for string quotations, but supports distinct stops for handling other cases as well.

Literals are given the highest priority by Parser.

Profile.enclosures
property
typing.Set[typing.Tuple[str,str]]

The start and stop pairs delimiting an expression.

Enclosures have the highest precedence during expression processing.

Profile.routers
property
typing.Set[str]

Operators used to designate a resolution path for selecting an object to be used.

Profile.operations
property
typing.Set[str]

Set of operators that can perform some manipulation to the objects associated with the adjacent identifiers.

Profile.terminators
property
typing.Set[str]

Operators used to designate the end of a statement, expression, or field.

Profile.operators
property
typing.Iterable[str]

Emit all unit operators employed by the language, each associated with a rank and context effect. Operators may appear multiple times. Empty strings represent the end of a line.

Operators are emitted by their classification in the following order:

  1. Operations

  2. Routers

  3. Terminators

  4. Enclosures

  5. Literals

  6. Exclusions

The order is deliberate so that mappings may be built directly: later classes overwrite earlier entries in cases of ambiguity.
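The mapping construction that ordering enables can be sketched as follows; the class names mirror the list above, while the operator strings are invented:

```python
# Classes listed in emission order; later classes win on ambiguous operators.
classes = [
    ("operation", ["+", "-"]),
    ("router", ["."]),
    ("terminator", [";", "-"]),   # '-' is deliberately ambiguous here
    ("enclosure", ["(", ")"]),
    ("literal", ['"']),
    ("exclusion", ["#"]),
]

opmap = {}
for classification, ops in classes:
    for op in ops:
        opmap[op] = classification  # later entries overwrite earlier ones

assert opmap["-"] == "terminator"   # terminator overwrote operation
assert opmap["+"] == "operation"
```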

Parser
class

Keyword Oriented Syntax parser providing tokenization and region delimiting.

Instances do not hold state and methods of the same instance may be used by multiple threads.

Parser Engineering

This is essentially a tightly coupled partial application of tokenize and delimit. from_profile builds the necessary parameters using a Profile instance, and the internal constructor, __init__, makes them available to the methods.

Applications should create and cache an instance for a given language.

Parser.integrate_switches
staticmethod

integrate_switches(tokens, context)

Qualify the syntax fields in tokens by interpreting switches.

Parser.integrate_switches Parameters

tokens

Iterable of triples produced by process_document or process_lines.

context

The presumed switch state, 'inclusion', 'exclusion', or 'literal'. Defaults to 'inclusion'.

Parser.from_profile
classmethod

from_profile(Class, profile)

Primary constructor for Parser.

Instances should usually be cached when repeat use is expected as some amount of preparation is performed by from_profile.

Parser.__init__
method

__init__(self, profile, opset, opmap, delimiter, optable, exits, classify_id, classify_op)

WARNING

The initializer's parameters are subject to change. from_profile should be used to build instances.

Parser.process_line
method
typing.Iterable[Tokens]

process_line(self, line)

Process a single line of syntax into tokens.

Parser.process_lines
method
typing.Iterable[typing.Iterable[Tokens]]

process_lines(self, lines)

Process lines using context resets; tokenize and delimit multiple lines resetting the context at the end of each line.

This is the recommended method for extracting tokens from a file for syntax documents that are expected to restate line context, have inaccurate profiles, or are incomplete.

The produced iterators may be run out of order as no parsing state is shared across lines.

Essentially, map(Parser.process_line, line_iter).

Parser.process_document
method
typing.Iterable[typing.Iterable[Tokens]]

process_document(self, lines)

Process lines of a complete source code file using continuous context; tokenize and delimit multiple lines maintaining the context across all lines.

This is the recommended method for extracting tokens from a file for syntax documents that are expected to not restate line context and have accurate profiles.

The produced iterators must be run in the produced order as the context is shared across instances.

Parser.allocstack
method

allocstack(self)

Allocate context stack for use with delimit.

Parser.delimit
method
Tokens

delimit(self, context, tokens)

Insert switch tokens into an iteration of tokens marking the boundaries of expressions, comments and quotations.

context is manipulated during the iteration and maintains the nested state of comments. allocstack may be used to allocate an initial state.

This is a relatively low-level method; process_lines or process_document should normally be used.
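A hedged sketch of what a delimit-style pass might look like: walk raw tokens, maintain a stack of open regions, and insert switch tokens at the boundaries. The token shapes and kind names here are invented for illustration and do not reflect the actual token triples:

```python
def delimit_sketch(context, tokens):
    # context is a mutable stack of open regions, as allocstack would provide.
    for kind, text in tokens:
        if kind == 'start-comment':
            context.append('exclusion')
            yield ('switch', 'exclusion')   # boundary before the excluded area
            yield (kind, text)
        elif kind == 'stop-comment' and context and context[-1] == 'exclusion':
            context.pop()
            yield (kind, text)
            yield ('switch', 'inclusion')   # boundary after the excluded area
        else:
            yield (kind, text)

ctx = []  # allocstack-style initial state
out = list(delimit_sketch(ctx, [
    ('word', 'x'), ('start-comment', '/*'),
    ('word', 'hidden'), ('stop-comment', '*/'),
]))
assert ('switch', 'exclusion') in out and ('switch', 'inclusion') in out
```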

Parser.tokenize
method
Tokens

tokenize(self, line)

Tokenize a string of syntax according to the profile.

Direct use of this is not recommended as boundaries are not signalled. process_line, process_lines, or process_document should be used. The raw tokens, however, are usable in contexts where boundary information is not desired or is not accurate enough for an application's use.
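In the spirit of tokenize, a minimal keyword-oriented tokenizer can be sketched as below; the word sets are invented, and the real method also handles operators, enclosures, and literals:

```python
# Invented word sets standing in for a Profile's words mapping.
keywords = {"if", "return"}
corewords = {"print"}

def tokenize_sketch(line):
    # Split on whitespace and classify each field against the word sets;
    # unmatched fields fall through to a generic identifier classification.
    for field in line.split():
        if field in keywords:
            yield ('keyword', field)
        elif field in corewords:
            yield ('coreword', field)
        else:
            yield ('identifier', field)

assert list(tokenize_sketch("if print x")) == [
    ('keyword', 'if'), ('coreword', 'print'), ('identifier', 'x'),
]
```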