Darmstadt Bootcamp

From IntereditionWiki

Revision as of 21:35, 20 September 2011 by KathrynP (Talk | contribs)


Recently we put out a Call for Participation for Interedition’s next development bootcamp. The 6th Interedition bootcamp, on interoperability and lightweight infrastructures for digital scholarship, will run from 28 February to 4 March 2011.

Hacking

The Bootcamp

The 6th Interedition Development Bootcamp will take place from 28 February to 4 March 2011 at the Technische Universität Darmstadt. The bootcamp is being organized by a joint group from Technische Universität Darmstadt and the Universität Trier:

  • Andrea Rapp (Darmstadt)
  • Thomas Burch (Trier)
  • Vera Hildenbrandt (Trier)
  • Julianne Nyhan (Trier)
  • Oliver Schmid (Trier)

The primary objective of the bootcamp is the development of prototypes for interoperable ‘microservice’ tools for text scholarship and digital editions. Much of the work will focus on the first Interedition prototype tool, CollateX, described in more detail below; development on other related tools is also welcome. Another important objective is to give developers and early-stage researchers an opportunity to meet and share their own projects and experiences with tool interoperability in textual scholarship.

Some examples of projects to be undertaken at the bootcamp include:

For CollateX:

  • Further development work on the core
  • Creation of a test harness and robust test suite

For related work:

  • Use of the collation service to analyze topic maps and compare the semantics of text
  • Development of shared back-end protocols that would allow for interoperability of text transcription tools
  • Proof-of-concept creation of a full microservice-based toolchain for the production and initial publication of digitally edited text

Preliminary Program

Please see the Darmstadt Bootcamp/Updated Program.

Monday 28 February

  • Introduction of participants and projects
  • Introduction to CollateX and the principles of microservices
  • Division of Tasks/Labor

Tuesday 1 March

  • Development

Wednesday 2 March

  • Development

Thursday 3 March

  • Unconference on participants’ projects
  • Development

Friday 4 March

  • Development
  • Documentation
  • Ungathering on…
    • Interoperability of tools and data for digital humanities
    • Supporting the Open Source Development Community in Digital Humanities
  • Round up, reporting

About the CollateX Prototype

The Interedition team has recently (July 2010) released version 0.9 of CollateX, a webservice-enabled text collation engine that will be of use to a wide range of digital humanities projects. This release is the result of the work done during the fourth Interedition Bootcamp in April 2010 in Firenze, Italy, and is the first official release of CollateX. It features baseless multiple witness alignment, parallel segmentation and handling of transpositions. It can export an alignment table as a critical apparatus in TEI format. CollateX is available as a Java application and has Python bindings available, but it is primarily designed to be run as a REST webservice using the Tomcat or Jetty webserver. More information can be found at http://collatex.sourceforge.net.
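To illustrate what an alignment table is, here is a minimal sketch in Python that aligns two witnesses token by token using the standard library's difflib. It is not CollateX's algorithm or API: it handles only two witnesses, tokenizes naively on whitespace, and does not detect transpositions. The function name `align` is an assumption made for this example.

```python
from difflib import SequenceMatcher


def align(witness_a, witness_b):
    """Align two witnesses token by token.

    Returns a list of (token_a, token_b) rows; None marks a gap,
    i.e. a token present in one witness but not in the other.
    """
    a, b = witness_a.split(), witness_b.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    rows = []
    for _tag, i1, i2, j1, j2 in matcher.get_opcodes():
        left, right = a[i1:i2], b[j1:j2]
        # Pad the shorter side with gaps so both sides have equal length.
        width = max(len(left), len(right))
        left += [None] * (width - len(left))
        right += [None] * (width - len(right))
        rows.extend(zip(left, right))
    return rows


if __name__ == "__main__":
    for row in align("the cat sat", "the black cat sat"):
        print(row)
```

Each row of the result corresponds to one column of an alignment table: matching tokens sit side by side, variants are paired, and additions or omissions show up as gaps.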

Participants

Some core info on the Darmstadt Bootcamp Participants.

Daily updates

An attempt to summarize progress: Darmstadt Bootcamp/Daily Progress

Tue, Mar 1

Annotation/Linking Group

The group discussed and adopted a common model for the exchange of image-text linkage data. The model defines a minimal and an extended input set, and an output set. A JSON linearization of the model has been agreed on. Grant has implemented a service that converts the model to the TILE input format. Doug has worked on a stand-alone service that automatically detects lines in an image. Moritz has worked on a service that transforms between XML files and the JSON linearization. Cesar and Joaquin have worked on integrating Google Books with TILE.
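This page does not record the field names the group agreed on, so the following Python sketch only illustrates what a JSON linearization of image-text linkage data might look like. Every key (`image`, `lines`, `text`, `region`) and the function names are hypothetical, not the group's actual schema.

```python
import json


def make_linkage(image_url, lines):
    """Build a linkage record for one image.

    `lines` is a list of (text, (x, y, width, height)) pairs.
    All field names are hypothetical; the agreed schema is not given here.
    """
    return {
        "image": image_url,
        "lines": [
            {"text": text, "region": {"x": x, "y": y, "w": w, "h": h}}
            for text, (x, y, w, h) in lines
        ],
    }


def to_json(record):
    """Serialize a record to a JSON string with a stable key order."""
    return json.dumps(record, ensure_ascii=False, sort_keys=True)


record = make_linkage(
    "http://example.org/page1.jpg",  # placeholder URL
    [("First line of the page", (10, 20, 300, 18))],
)

if __name__ == "__main__":
    print(to_json(record))
```

A record like this links each transcribed line to a rectangular region of the image; converters to and from a tool's own input format would then only need to agree on this one shape.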

Wed, Mar 2

Annotation/Linking Group

Having a basic idea of the data exchange model, the group discussed more specific applications of the model. This resulted in dividing the group into two: (A) developing a REST application, using WebDAV or CGI, that saves image and text linkages into a database, and (B) developing different services that use this model to exchange data. Joaquin and Cesar worked on A and are continuing to develop the service. Grant, Gunter, and Moritz worked on B. Moritz has developed a service that uses Python to convert Faust XML data into the JSON format created by the group. Gunter did the same for the Montfort Edition, which is stored in a Fedora-based repository. Grant has developed plugins for TILE to import and export the same JSON data model. Gunter is researching further ways to integrate his data with the group's. Right now, both Grant and Moritz are attempting to connect their services to REST.
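The page does not describe how the group A service stores its data, so the following is only a sketch of the persistence step in Python, using an in-memory SQLite database from the standard library. The table layout and the function names `open_store`, `save_linkage` and `load_linkage` are assumptions; the REST layer (WebDAV or CGI) is left out entirely.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a store; the table layout is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS linkage ("
        "image_url TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def save_linkage(conn, image_url, record):
    """Store the linkage record for an image, replacing any earlier one."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO linkage (image_url, body) VALUES (?, ?)",
            (image_url, json.dumps(record)),
        )


def load_linkage(conn, image_url):
    """Return the stored record for an image, or None if there is none."""
    row = conn.execute(
        "SELECT body FROM linkage WHERE image_url = ?", (image_url,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

In a REST setting, a PUT or POST handler would call something like `save_linkage` and a GET handler something like `load_linkage`, keyed by the image URL.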

Thu, Mar 3

Annotation/Linking Group

Asaf has implemented a text server that can harvest annotations compliant with the Open Annotation Collaboration format. It can then serve the text enriched with the harvested annotations. Grant has extended TILE so that a URL holding the annotation data in the specified JSON format can be passed in at startup. TILE then loads the annotations from that URL and is meant to write changes or additions to the annotations back to that URL via HTTP. Cesar and Joaquim have continued working on the annotation data store. Moritz has implemented a service that accepts as input the URL of an XML file and an XPath expression. It tokenizes the XML file using the XPath expression and returns a template of the annotation data in JSON format. Gunter has connected his data to TILE. Doug has implemented a service that automatically recognizes lines in an image.
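The tokenizing step of the XPath-based service described above might look roughly like the following Python sketch. It is not Moritz's implementation: it takes the XML as a string rather than fetching it from a URL, relies on the limited XPath subset supported by the standard library's ElementTree, and uses hypothetical JSON field names (`lines`, `id`, `text`, `region`).

```python
import json
import xml.etree.ElementTree as ET


def annotation_template(xml_text, xpath):
    """Tokenize an XML document with an XPath expression.

    Every element matched by `xpath` becomes one token. The result is a
    JSON string in which each token has an empty ("null") image region,
    to be filled in later by an annotation tool.
    """
    root = ET.fromstring(xml_text)
    tokens = []
    for index, element in enumerate(root.findall(xpath)):
        # Join all nested text so inline markup does not split a token.
        text = "".join(element.itertext()).strip()
        tokens.append({"id": index, "text": text, "region": None})
    return json.dumps({"lines": tokens}, ensure_ascii=False)


if __name__ == "__main__":
    sample = "<text><l>Habe nun, ach!</l><l>Philosophie,</l></text>"
    print(annotation_template(sample, ".//l"))
```

A service wrapping this function would first download the XML file from the given URL (for example with `urllib.request.urlopen`) and then return the JSON template in its HTTP response.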