From IntereditionWiki

Revision as of 12:15, 24 November 2012 by Paolo (Talk | contribs)

21/11/2012, Morning

For names, refer to the list of participants.

Discussion on text modelling

Two main alternatives to XML:

  • range based model (for annotations)
  • variant graph (for collation)
    • Tara is using (with Collatex) a graph-based model more complex sthan Schmidt's one.

Paolo is interested in a text model that can represent textual layers (graphical, alphabetical, linguistic; see Orlandi, Informatica testuale)

  • Tara: we can stretch Schmidt's graph model to represent the different Orlandi's layers. They're still graphs
  • Gregor: variant graphs are not suitable for this; they're not variants. LMNL (range-based model), instead, seems to be fit for Orlandi's "textual layers".

Gregor's presentation

Gregor describes his implementation of a range-based (LMNL-based) textual model, used so far for the back-end of the Faust Project.

  • the implementation is in Java.
  • annotations that have a name and are namespaced (this comes from XML)
  • text is a sequence of events
  • layers have names like "TEIW" or "Europeana" and contain texts
  • layers have anchors, so one text can point to another or to ranges of another text
  • one pointer can point to multiple anchors. E. g. layer 'alignment' points to two different anchors (and aligns them)
  • layers can include whatever data (Json, an XML file etc.)
  • a TextRepository is just a collection of those layers. It's something I can query
  • you can create a graph of the layers existing in a text repository
  • TextStream
    • Gregor's model is differnt that XML
      • the SAX API works with XML trees
      • you can walk through the tree there
      • a range-based model, isntead does not have such easy stacks
      • How do you transform XML into range-model? Any element (with opening and closing tags) becomes a range

Setting the agenda: a round-up on the interests of each of the participants

  • practicality: what can we build on top of e.g. a range-based model (from the datastore to the presentation layer)
  • query/search functions on top of a text model
  • variant graph vs. range-based models
  • processing (equivalent to XSLT?), querying (equivalent to XPath/XQuery)
  • variant graph: traversal patterns?
  • interfaces, APIs, JS libraries
  • problem of variation and how it is handled on different (conceptual) layers of a text
  • common model? can we find a generalized model incorporating features from the variant graph and a range-based model
  • bridge the gap!
  • integration scenarios; import of existing data, multiple use cases on top of those (what is the smallest thing that could possibly work? -- how do we get there)

Division of labour

See TheHague112012DivisionOfLabour