**XCES is now an EAGLES standard**
XML version of the CES DTDs: BETA RELEASE
This is a Beta release of XCES, which instantiates the EAGLES Corpus Encoding Standard (CES) DTDs for linguistic corpora, developed by the Department of Computer Science, Vassar College, and Equipe Langue et Dialogue, LORIA/CNRS. XCES is under development and subject to change.
We are developing documentation to support XCES. In the meantime, the existing CES documentation, which covers general encoding practices for linguistic corpora and tag usage, is largely relevant to the XCES instantiation and should be consulted.
XCES is under development. Because the XML framework provides us with means to go well beyond the capabilities of SGML, this development is taking several forms: (1) XML support for additional types of annotation and resources, including discourse/dialogue, lexicons, and speech; (2) creation of additional XSLT scripts to perform common operations and transduce among formats (including different annotation formats); (3) development of a set of XML schemas instantiating an abstract data model for linguistic annotations, together with a hierarchy of derived types for a broad range of annotation types; and (4) creation of a repository of annotation formats for "off the shelf" use or easy modification via the XCES schemas.
cesDoc resources :
<?xml version="1.0"?>
<!DOCTYPE cesDoc PUBLIC "-//CES//DTD XML cesDoc//EN" "dtd/xcesDoc.dtd" [ ]>
...

cesAna resources :
<?xml version="1.0"?>
<!DOCTYPE cesDoc PUBLIC "-//CES//DTD XML cesAna//EN" "dtd/xcesAna.dtd" [ ]>
...

cesAlign resources :
<?xml version="1.0"?>
<!DOCTYPE cesDoc PUBLIC "-//CES//DTD XML cesAlign//EN" "dtd/xcesAlign.dtd" [ ]>
...
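The declarations above pair each public identifier with a relative system identifier under dtd/, so a document stored elsewhere may fail to locate its DTD. One possible workaround, not part of the XCES distribution as far as this page states, is an OASIS XML Catalog that maps the three public identifiers to local files. The sketch below assumes the catalog file (hypothetically named catalog.xml) sits next to the dtd/ directory; adjust the uri values to your own layout.

```xml
<?xml version="1.0"?>
<!-- Sketch of an OASIS XML Catalog. The file name and the uri values are
     assumptions: they presume this catalog sits next to the dtd/ directory. -->
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Public identifiers copied from the DOCTYPE declarations above -->
  <public publicId="-//CES//DTD XML cesDoc//EN"   uri="dtd/xcesDoc.dtd"/>
  <public publicId="-//CES//DTD XML cesAna//EN"   uri="dtd/xcesAna.dtd"/>
  <public publicId="-//CES//DTD XML cesAlign//EN" uri="dtd/xcesAlign.dtd"/>
</catalog>
```

A catalog-aware parser can then resolve the DTDs by public identifier regardless of where the document itself is stored. How the catalog is registered depends on the parser in use.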
Support for the XLink specification, through inclusion of the sub-DTD xlink.ent (covering simple, extended, locator, and arc elements), is under development.
We are currently implementing the use of XPointers and XPaths for locator element types.
Usage:
java -mx64m \
  -Dcom.jclark.xsl.sax.parser=com.jclark.xml.sax.Driver \
  com.jclark.xsl.sax.Driver your-xces-doc.xml xsl/html/cesDoc.xsl

Only HTML output is supported by the stylesheets. Output produced with the stylesheets can be customized by setting or overriding variables within the xsl/html/config.xsl file. If you do not want to modify the XSL source files, you can use a driver; see xsl/html/driver.xsl
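To illustrate the driver approach, here is a minimal sketch of what such a stylesheet could look like. It relies on the standard XSLT 1.0 rule that definitions in an importing stylesheet take precedence over imported ones. The file name my-driver.xsl and the variable name example.title.prefix are hypothetical; consult xsl/html/config.xsl and xsl/html/driver.xsl for the variables the distribution actually defines.

```xml
<?xml version="1.0"?>
<!-- Hypothetical driver, assumed to be saved as xsl/html/my-driver.xsl -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Import the distributed stylesheet unchanged. xsl:import must come
       before any other top-level element. -->
  <xsl:import href="cesDoc.xsl"/>

  <!-- Override a configuration variable. The name below is a placeholder,
       not a variable known to exist in config.xsl. -->
  <xsl:variable name="example.title.prefix">My corpus: </xsl:variable>

</xsl:stylesheet>
```

The driver would then be passed to the processor in place of xsl/html/cesDoc.xsl in the command shown under Usage, leaving the distributed XSL files untouched.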
We are currently working on a set of stylesheets to support the cesAna and cesAlign DTDs.