javadoc with linked source code.
2003-12-02 -- Source code will be available shortly.
Upon this basic layer, features such as removing comments and allowing name=value notation can be added by activating (or writing) a set of operators that traverse the structure and do their bit on each node.
Schema validation based on regular expressions, XPath evaluation, and resolution of local references within files are among the higher-level features of the toolkit.
The project currently has functioning but non-optimized code for all of these features, and there is still a great deal of work to be done.
A few example applications of the toolkit will hopefully help to clarify the direction of the project.
log4j loggers.
# Level settings for log4j.
# level must be one of fatal, error, warn, info, debug or all.
# root logger
-error
com
  saelist
    stx
      -info
      Pair=-error
      parser
        LstxParser=-error
      command
        AbstractCommand=-error
        ReplaceCommand=-info
        SmtpCommand=-info
        TransformCommand=-info
      util
        Strings=-info
        Net
Since Logger names usually don't start with a "-", it is used to distinguish log levels from Loggers.
The Java code to interpret this, using the toolkit, is given below, mainly to demonstrate its brevity:
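protected static void setLogLevels(String filename) throws IOException {
    Pair logConfig = LstxParser.parse(Strings.loadFile(filename));
    // Select every node whose text starts with '-', i.e. every level marker.
    for (Iterator levels = logConfig.select("//*[starts-with(., '-')]"); levels.hasNext(); ) {
        Pair level = (Pair) levels.next();
        // Strip the leading '-' to get the level name.
        Level logLevel = toLevel(level.getText().substring(1));
        // The logger name is the dotted path of the node that carries the level.
        getLogger(pathOf(level.getParent(), ".")).setLevel(logLevel);
    }
}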
The getLogger method returns the rootLogger for the empty string and the relevant logger for other strings. The pathOf method returns the dotted path from the root to the given node; for the parent of the '-error' attached to Pair, it returns com.saelist.stx.Pair.
The toLevel method converts the string (error, debug etc) into a Level object.
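The path computation described above can be sketched as follows. This is only an illustration: the Node class is a hypothetical stand-in for the toolkit's Pair, and the assumption that the root contributes nothing to the path (so that the root maps to the empty string) is inferred from the getLogger description, not taken from the toolkit's source.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the toolkit's Pair: a named node with a parent link.
final class Node {
    final String name;
    final Node parent;

    Node(String name, Node parent) {
        this.name = name;
        this.parent = parent;
    }
}

final class PathSketch {
    // Joins the names from just below the root down to the given node.
    // The root itself is skipped, so pathOf(root, ".") is the empty string.
    static String pathOf(Node node, String separator) {
        Deque<String> names = new ArrayDeque<>();
        for (Node n = node; n != null && n.parent != null; n = n.parent) {
            names.addFirst(n.name);
        }
        return String.join(separator, names);
    }

    public static void main(String[] args) {
        Node root = new Node("", null);
        Node com = new Node("com", root);
        Node saelist = new Node("saelist", com);
        Node stx = new Node("stx", saelist);
        Node pair = new Node("Pair", stx);
        System.out.println(pathOf(pair, ".")); // prints com.saelist.stx.Pair
    }
}
```

Walking parent links and prepending each name keeps the method independent of how deep the configuration file is nested.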
def
  email-address=[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9._-]+
message *
  to +={def/email-address}
  cc *={def/email-address}
  bcc *={def/email-address}
  from={def/email-address}
  subject=.+
  attachment *
    mime-type ?=
    file=.+ # Paths to files to be attached.
  is-html ?=true|false
  text
    .+ *
This fragment uses references to avoid cluttering the schema with repeated instances of the email-address regular expression. Dereferencing is accomplished by running a dereferencing operator over the nodes. Notice that x=y is equivalent to
x
  y
There is no qualitative difference between names and values. The schema is based on regular expressions on two levels. Each element has a qualifier expression that constrains how it may be composed and a quantifier that constrains how often it may occur (?, + and * denote optional, one-or-more, and zero-or-more; the default is exactly once).
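The equivalence of x=y and an x node with a single child y can be illustrated with a small indentation parser. This is a minimal sketch and not the toolkit's LstxParser: IndentNode and IndentSketch are hypothetical names, indentation is assumed to consist of spaces only, and the name=value expansion is done inline here, whereas the toolkit applies it as a separate operator.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for the toolkit's node type: text plus ordered children.
final class IndentNode {
    final String text;
    final List<IndentNode> children = new ArrayList<>();

    IndentNode(String text) {
        this.text = text;
    }

    IndentNode add(String childText) {
        IndentNode child = new IndentNode(childText);
        children.add(child);
        return child;
    }
}

final class IndentSketch {
    // Builds a tree from space-indented lines; "name=value" becomes name -> value.
    static IndentNode parse(String source) {
        IndentNode root = new IndentNode("");
        Deque<IndentNode> open = new ArrayDeque<>();   // chain of currently open ancestors
        Deque<Integer> indents = new ArrayDeque<>();   // indentation of each open ancestor
        open.push(root);
        indents.push(-1);                              // the root is shallower than any line
        for (String line : source.split("\n")) {
            String text = line.trim();
            if (text.isEmpty()) continue;
            int indent = 0;
            while (line.charAt(indent) == ' ') indent++;
            // Close every open node that is not shallower than this line.
            while (indents.peek() >= indent) {
                indents.pop();
                open.pop();
            }
            int eq = text.indexOf('=');
            IndentNode node;
            if (eq > 0) {
                node = open.peek().add(text.substring(0, eq));
                node.add(text.substring(eq + 1));      // the value is just another child
            } else {
                node = open.peek().add(text);
            }
            open.push(node);
            indents.push(indent);
        }
        return root;
    }

    // Renders a tree as text(child,child,...) so two trees can be compared.
    static String render(IndentNode node) {
        StringBuilder sb = new StringBuilder(node.text);
        if (!node.children.isEmpty()) {
            sb.append('(');
            for (int i = 0; i < node.children.size(); i++) {
                if (i > 0) sb.append(',');
                sb.append(render(node.children.get(i)));
            }
            sb.append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(parse("x=y")));     // prints (x(y))
        System.out.println(render(parse("x\n  y")));  // prints (x(y))
    }
}
```

Both inputs produce the same tree, which is the sense in which names and values are not qualitatively different.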
A valid email message could be
message
  to:someone@somewhere.org
  from:me@here.org
  subject:something
  attachment
    mime-type:image/jpeg
    file:a/b/c.jpg
  attachment
    file:d/e/f.gif
  text
    Hi there
    Some message,...
    that is it!
    enjoy.
Applying the schema to the data is up to the code. There is no automatic binding as in XML DTDs.
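Since applying the schema is left to the code, a check of one schema element against the values found in the data might look like the sketch below. It is an illustration under assumptions: RuleSketch is a hypothetical class, not part of the toolkit, the quantifier is passed as a single character with '1' standing for the default of exactly once, and the qualifier is assumed to have to match the whole value.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical rule for one schema element: a qualifier regex plus a quantifier.
final class RuleSketch {
    final Pattern qualifier;
    final char quantifier; // '?', '+', '*', or '1' for exactly once

    RuleSketch(String regex, char quantifier) {
        this.qualifier = Pattern.compile(regex);
        this.quantifier = quantifier;
    }

    // True if the number of values satisfies the quantifier and
    // every value matches the qualifier in full.
    boolean accepts(List<String> values) {
        int n = values.size();
        boolean countOk;
        switch (quantifier) {
            case '?': countOk = n <= 1; break;
            case '+': countOk = n >= 1; break;
            case '*': countOk = true; break;
            default:  countOk = n == 1;
        }
        if (!countOk) {
            return false;
        }
        for (String value : values) {
            if (!qualifier.matcher(value).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        RuleSketch isHtml = new RuleSketch("true|false", '?');
        System.out.println(isHtml.accepts(Collections.<String>emptyList())); // prints true
        System.out.println(isHtml.accepts(Arrays.asList("maybe")));          // prints false
    }
}
```

Checking the count before the content keeps the two levels of the schema (how often, and how composed) separate in the code as well.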
template
  message
    to={job/email}
    cc=sent-cvs@peersoft.de
    from=hannes@textsoft.de
    subject=Contract role - {job/title} (reference: {job/reference})
    attachment
      mime-type=application/msword
      file=/home/hannes/cv-hannes.doc
    text
      Hello {job/first-name}
      I noted with interest [...]
This template is applied to the following structure to yield a valid SMTP message according to example 2.
job
  title=OO Developer
  job-type=Contract
  text=An OO Developer is required [...]
  location=City, London
  start=ASAP
  agency=JM Contracts
  contact=Sarah Brown
  first-name=Sarah
  last-name=Brown
  email=sarahb@jmms.co.uk
  reference=JSJMC-FESB14
  posted=26/11/2003 10:57:31
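Filling the {job/...} references in the template from this structure can be sketched with plain java.util.regex. The sketch is an assumption about the mechanism, not the toolkit's TransformCommand: TemplateSketch is a hypothetical class, the job structure is flattened into a map keyed by path for brevity, and unresolved references are left in place.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateSketch {
    // A reference is any run of non-brace characters between '{' and '}'.
    private static final Pattern REFERENCE = Pattern.compile("\\{([^{}]+)\\}");

    // Replaces each {path} with the value stored under that path, if any.
    static String expand(String template, Map<String, String> values) {
        Matcher m = REFERENCE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = values.get(m.group(1));
            if (replacement == null) {
                replacement = m.group(); // unknown reference: keep it verbatim
            }
            // quoteReplacement stops '$' and '\' in the value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> job = new HashMap<>();
        job.put("job/title", "OO Developer");
        job.put("job/reference", "JSJMC-FESB14");
        System.out.println(expand(
            "Contract role - {job/title} (reference: {job/reference})", job));
        // prints Contract role - OO Developer (reference: JSJMC-FESB14)
    }
}
```

In the toolkit itself the path would presumably be resolved against the node tree rather than a flat map; the substitution loop stays the same either way.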
- Using schemas to offload and standardize much of the input validation can also reduce the volume of code and let it focus on the normal use case rather than on exception handling.
- In many cases, using indentation rather than free-form tagged text results in clearer, less cluttered text.
- The approach taken by the toolkit is to treat the structure at the lowest level as a tree of homogeneous nodes and to leave it up to higher levels and application code to create distinctions such as tags, text, attributes, comments, processing instructions, etc. At the lowest level, Pair and Pairlist are simply abstract data types akin to Map and List.