1 Opening/welcome at 10.10 am
3 Peter explains why the forerunner group of the Interedition Action started out with a collation tool as prototype 1 (need some history)
* Collation is a primary task in digital scholarly editing
* Discussion on a follow-up for Collate
* Decided it should be a collaborative development (TextGrid, ITSEE, Huygens Institute)
* Interedition followed as a COST Action; development of a collation tool was suggested as prototype 1 for the Action
Ronald demos the state of the first prototype, Peter explains the different use cases that the tool supports.
Stefan and Ronald ask for more input, because the algorithm can be tweaked in many mutually exclusive ways. These tweaks are really interpretative moves by the scholar (which possible edit steps are more valuable or likely). It's suggested that any scholar using the collation tool should be able to choose the tweaks to match his or her ideas on text variant typology.
Ronald shows that the algorithm can give multiple solutions to a collation. It reports the different ways in which one witness could have changed into another.
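A minimal sketch of what "multiple solutions" can mean, assuming witnesses are compared as word tokens and that a solution is any alignment matching the largest possible number of tokens in order (a longest-common-subsequence alignment). This is an illustration only, not the prototype's algorithm; the function name `all_alignments` and the sample witnesses are hypothetical.

```python
from functools import lru_cache


def all_alignments(a, b):
    """Return every maximal in-order matching of tokens between witnesses a and b.

    Each alignment is a tuple of (index_in_a, index_in_b) pairs.
    Assumption: "best" means matching as many identical tokens as possible.
    """

    @lru_cache(maxsize=None)
    def lcs_len(i, j):
        # Length of the longest common subsequence of a[i:] and b[j:].
        if i == len(a) or j == len(b):
            return 0
        if a[i] == b[j]:
            return 1 + lcs_len(i + 1, j + 1)
        return max(lcs_len(i + 1, j), lcs_len(i, j + 1))

    @lru_cache(maxsize=None)
    def walk(i, j):
        # All optimal alignments of a[i:] against b[j:].
        if i == len(a) or j == len(b):
            return frozenset({()})
        best = lcs_len(i, j)
        found = set()
        if a[i] == b[j]:
            # Matching here is always optimal when the tokens are equal.
            for rest in walk(i + 1, j + 1):
                found.add(((i, j),) + rest)
        if lcs_len(i + 1, j) == best:
            found |= walk(i + 1, j)  # skip a token of a (an omission)
        if lcs_len(i, j + 1) == best:
            found |= walk(i, j + 1)  # skip a token of b (an addition)
        return frozenset(found)

    return sorted(walk(0, 0))


witness_a = "the cat cat sat".split()
witness_b = "the cat sat".split()
for alignment in all_alignments(witness_a, witness_b):
    print(alignment)
```

For these two sample witnesses there are two equally good alignments: either the first or the second "cat" of witness A is the one missing from witness B. Choosing between them is the kind of interpretative tweak discussed above.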
It's open, on SourceForge, and interoperable in so far as it doesn't care what your input is, as long as one provides an input filter that represents the data in its data structure.
What about next steps?
Stefan: integrate it into TextGrid (as GUI).
James: how does it handle fragments that are moved considerable distances through a text?
Stefan: a higher-level block approach.
Peter: make it work with a wider range of texts and see what happens.
Joris: can we give it a REST interface?
Stefan: it's going to be SOAP.
Frederico: it can do both; if TextGrid does SOAP, somebody else can implement REST.
Zeth: wider discussion, give me a frequency list etc.
Dirk: shouldn't we talk to people like McGann, Maikonen etc.?
Peter: as far as they told me, they're finished.
Dirk: Hans is connecting them (in Action A32).
Frederico: there's a German student working on opening up Juxta.
Susan: we're talking to Juxta to see whether it's a basis for making the Versioning Machine more interoperable.
Zeth: it's very limited.
Susan: it's just visualization.
Dirk: it's just that more people are working on collation tools.
Peter: going down the web service/SOAP road, we should be able to have the tools talking to each other.
Zeth: just state what you want to exchange; REST or SOAP, leave that up to the implementers.
Peter: timetable: by January TextGrid wants to have something working. Zeth and TextGrid are going to take a look. Who else wants in?
Andrea: I'll send information on what we're doing, and we can elaborate the protocols for exchange from there.
Peter: we'll have a manual available on how to submit your text to the CollateX web service, describing what you'll receive back. The task force is made up of Peter/Zeth, Andrea, Dirk, Stefan/Michael.
James: I would like to test it.
Joris: do Ronald and Bram want to be in?
Ronald/Peter: yes.
Joris: must there be a format/structure proposal?
Peter: rather have raw text in and out.
Stefan is okay with raw text.
Peter: we need to define themes for the next prototypes, bearing in mind what was discussed yesterday. One of the major points for interoperability is the ability to discover and exchange texts.
Joris: we're talking about just identifying text.
Peter: also about how we are going to state things about the text. And then a stage for requesting the text.
Malte: does this need a prototype? Isn't it just an agreement on a protocol?
Peter: we need that documentation. But we also need to actually do it (so a prototype). CTS is very near, but it has some weaknesses.
Peter: a system for distributed redundancy with management of rights. I have this text, You may use it / You may copy it on your server to another / You may even redistribute it.
Zeth/Ronald: use existing distributed technology.
Joris: especially also from a sustainability viewpoint.
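A minimal sketch of how the three rights statements Peter lists might be expressed alongside a text identifier. It assumes the rights are cumulative (permission to redistribute implies permission to copy and to use), which the discussion suggests but does not settle. All names here (`Rights`, `TextOffer`, `allows`, the sample identifiers) are hypothetical and not part of any agreed protocol.

```python
from dataclasses import dataclass
from enum import IntEnum


class Rights(IntEnum):
    # Assumption: each level includes the ones below it.
    USE = 1           # "You may use it"
    COPY = 2          # "You may copy it"
    REDISTRIBUTE = 3  # "You may even redistribute it"


@dataclass(frozen=True)
class TextOffer:
    """What a server states about a text it holds: "I have this text"."""
    text_id: str    # hypothetical identifier scheme
    holder: str     # the server making the statement
    rights: Rights  # the highest right granted to others

    def allows(self, wanted: Rights) -> bool:
        # A request is allowed if it does not exceed the granted level.
        return self.rights >= wanted


offer = TextOffer(text_id="example-text", holder="server-a", rights=Rights.COPY)
print(offer.allows(Rights.USE))           # use is included in COPY
print(offer.allows(Rights.REDISTRIBUTE))  # redistribution is not granted
```

A flat ordered scale keeps the check to one comparison; if the rights turn out to be independent of each other, a set of flags would fit better.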
Dino: would this mean one can add a server that talks to the others?
Joris/Peter: yes, the aim is precisely to have such an open, extensible system.
James: problem is conflict resolution
Peter: we need policy, guidelines and example data. AND prototype. Documentation on the Wiki. What are we going to do / implement when Prototype 1 is done?
Joris: do we copy the prototype 1 organisational model?
Andrea: we need a well-argued and structured document that we can use to attract others.
Peter will start off with a discussion piece on the wiki. Joris will open up the wiki next Tuesday.
Peter: so PROTOTYPE 2 is about discovering and exchanging text. PROTOTYPE 3 could be on images.
Susan: it's multimedia, streams.
James: it's really all 'give me this bit of that thing'; we should be able to treat it in the same way. (NOTE: Abstracted API?)
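A minimal sketch of the "give me this bit of that thing" idea from the note above, assuming every resource, whatever its medium, can return a contiguous slice of itself by position. The names `Resource`, `TextResource`, `FrameResource` and `give_me` are hypothetical, and real media would need units other than plain integer positions.

```python
from typing import Protocol, Sequence


class Resource(Protocol):
    """Anything that can return a contiguous slice of itself."""

    def fragment(self, start: int, end: int) -> Sequence: ...


class TextResource:
    """A text addressed by word position."""

    def __init__(self, text: str):
        self.words = text.split()

    def fragment(self, start: int, end: int) -> Sequence:
        return self.words[start:end]


class FrameResource:
    """Stand-in for time-based media: a list of frame labels."""

    def __init__(self, frames):
        self.frames = list(frames)

    def fragment(self, start: int, end: int) -> Sequence:
        return self.frames[start:end]


def give_me(resource: Resource, start: int, end: int) -> Sequence:
    # The caller does not need to know what kind of thing it is asking.
    return resource.fragment(start, end)


text = TextResource("in the beginning was the word")
film = FrameResource(["f0", "f1", "f2", "f3"])
print(give_me(text, 1, 3))
print(give_me(film, 2, 4))
```

The same request works on both resources, which is the point of treating text, images and streams "in the same way".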
Peter: it's about time-based media.
Susan: we may cope with that in one prototype or in two, depending on the applicability of the earlier prototypes to other media.
Also: mashups, interacting information.
Andreas: we're working on an archaeological application that handles the same problems.
Peter: so Prototype 3 could be about media. Prototype 4 could be about the different media services interacting.
Susan: but can we handle such a large system?
Peter: it's not about implementing the complete system, but about addressing what a system should do, and supporting that with a number of prototypes proving the viability of the proposed solutions for things like identifying, mashing up, exchanging etc.
Dino: are we talking about a modular system?
Peter: yes, the web services will be built as independent services that exchange (meta)data. A service will say "I have this document in RDF".
Zeth: RDF is a local implementation that will pass through the Interedition filter (preferably based on an existing open standard, of course), and you could recreate the RDF on your side (if you want).
Frederico: can we use FRBR for exchange?
Peter: that's on the object level; we need to go way below the object level (to the level of the word, e.g.).
Peter: refers to the article Fotis sent around. It describes exactly what we want (in 1996 by the way!)
Susan: do we need to integrate the people from the library community? There's been a lot of theoretical and implementation work by that community.
Peter: strong point. We need to invite such people to the next meeting. As we work on the draft, we need to pass it to the library people for comments and suggestions.
(ACTION POINT: invite at least three people on this subject: KB, Birmingham UL, DANS, OTA, Library Europeana (Steven Grathman, Hamburg; Jan Christoph Maisten; Bettina Wagner))
Frederico: can we produce a map of who's working on what (like a 1920s gangsta map)?
7) Do we need to talk about that? We document on the wiki and have the developers manage themselves. The Action just requests a monthly report.
Might tie in with STSM reports: 500 words (Frederico). Dirk: a very useful instrument.
Peter: can we use STSMs to have over-the-ocean contacts?
Dirk/Joris: if the chair can justify it and the rule of mutual benefit applies, it's possible.
Susan: 1 connection rule?
ACTION: Joris follows up (mail by Hans)
ACTION: Joris will check out the procedure for STSMs and put that on the wiki.
Dirk: it's the most flexible instrument, keep it like that. Apply with 250 words via the STSM coordinator. Approved by the MC. Write a short 500-word report afterwards.
Frederico: 90 euros a day. Travel: 200 euros. Minimum of 5 days, maximum
Peter wants 1 boot camp a year.
Dirk: formally that's a Workshop.
Peter: we might organize 10 STSMs to the same place.
Dirk: you might use it for a 10-day sprint to invite experts (like the Juxta folks) to implement it at a particular institute. A bootcamp should return a tangible result (like CollateX).
We decide to have an STSM bootcamp in February in Pisa. We dovetail that with an MC and a WG2/WG3 meeting.
Probably 5 developers.
Probably WG3/WG2 meeting on Thursday 12.
MC on Friday 13.
Closing by Peter.
High level main requirements
- Sustainable / Stable (instead of the term stable, which confuses. Self management? Versioning? Distributed redundancy?)
- Producable (?)
Suggested specific requirements
- mechanism to retrieve underlying XML from HTML (meta tag?)
- common format for text retrieval, service registry, licence retrieval
- 'processing' material (the lack of possibilities for which is seen as an inhibitor to the adoption of digital services at large)
- rights management / authorization
- flexibility of output:
- separation of model, presentation
- visualization / presentation tools
- distributed sources
- end user aspects / the teaching room
- tools to enhance trust (community level stamp of quality)
- digital born edition (on line editing tools)
- TEI/Word conversion
- collaborative editing / workbench / environment
- Named entities
- Dublin Core
- Text structure (same as modeling?)
- Example data (which edition to use for the prototypes)
- Development process