JISCCOSTToolsWorkshop

Information on the draft FP7 proposal

To solicit interested partners for the FP7 application that the Interedition group is preparing, a 'mission statement' was drafted as a result of this workshop. It describes what Interedition wants to accomplish as an FP7 project.

The statement is here. The (very) draft proposal is here. The overview diagram of work packages currently defined in the draft proposal is here.

A list of currently interested partners is found at Interested partners.

Live Minutes of the Meeting

The remainder of this page is dedicated to the live minutes of the meeting. 'Live minutes' are minutes that are put online as the meeting is taking place. We provide this service so that our interested partners may return feedback even while the meeting is on.

Everyone is setting up his/her computer. Meeting will start soon.

Who's here

Peter Robinson, ITSEE, Univ. of Birmingham, has tools and content to contribute

Tara Andrews, Univ. of Oxford, Armenian history, tools for research

Ronald Dekker, Huygens Instituut KNAW head of development team, working on CollateX etc.

Tomasz Parkola, Poznan Supercomputing and Networking Center, contribution from the digital libraries viewpoint

Paul Spence, King's College London, digital text research, interest in textual scholarship

Barbara Bordalejo, ITSEE

Brit Hopman, National Library Netherlands, working closely with Huygens Instituut on 17th-century scientific correspondence, developing tools

Sally Chambers, European Library, The Hague, working towards becoming an aggregator for Europeana, the consortium of research libraries etc., keen to make data available for researchers

Ulrich Tiedau, Univ. College London, CCH

Sindre Soerensen, Norway, Univ. of Bergen, Axis, software developer and engineer

Harold Short, King's College London, CCH, content development and content analysis, special interest in infrastructure

Federico Meschini, Univ. of Leicester, repositories, research libraries, semantic interoperability

Andrea Scotti, Florence, director of a project aiming to produce generic tools in the ontological and semantic world to analyse and publish data

Joris van Zundert, Huygens Instituut, chair of Interedition, project leader of Alfalab (Royal Netherlands Academy of Arts and Sciences)

Karina van Dalen-Oskam, Huygens Instituut, head of department ICT Texts, leader of Interedition WP1, European Dimension, scholarly interest in tools for text analysis

Joris (chair) explains the background of this meeting, which is the COST Action Interedition, now in its second year. One of the aims is applying for FP7 grants. It is a bit early now, but there is an appropriate call for us and we thought we had to apply. The ESF is anxiously looking for Humanities applications, so now is the time to act.

The focus of the call is highly technological, which is why our first draft is also very technologically geared.

After a short introduction from everyone present, Joris points out all those who are very much interested in what we are doing and whom he has put on a separate list on this wiki page, 'Interested partners'.

Peter Robinson and Barbara Bordalejo will be in The Hague at the Huygens Instituut in about a month's time and together with Joris c.s. will further develop the proposal. Today we need as much feedback from you as possible to bring to that meeting. Apart from what we gather today, you can also send us emails with remarks and suggestions.

Joris gives a very short overview of what the proposal is about (please see the diagram and the document you received yesterday afternoon; if you did not receive them, let us know and we will send them through the mail as soon as possible).

The key point is that we are looking for a way to professionalize the building of tools and have them function in a good infrastructure. There are three layers in our perception: developers, researchers, and intermediates. Cf. the illustration sent separately for the relation between the proposed work packages.

The key technological idea for the proposal is 'microservices' (we are looking for another word, but this explains best what we want to do). These are 'implementation agnostic' applications. We do not want to prescribe to developers how to do their work. The protocol we are devising will enable this. We think we can primarily build this on technologies that already exist, which is one of the key strengths of our proposal, but we can benefit from new IT methodologies.
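For readers new to the idea, the 'implementation agnostic' principle can be sketched roughly as follows. This is only an illustration and not part of the draft proposal: the message format (a plain dictionary) and the service names `tokenize`, `count_tokens` and `run_pipeline` are hypothetical. The point is that each service only has to agree on the shape of the message it receives and returns, not on how it works internally.

```python
from typing import Callable, Dict, List

# Hypothetical shared message format: a plain dictionary.
# Any service that accepts and returns such a message can join a pipeline,
# regardless of how it is implemented internally.
Message = Dict[str, object]
Service = Callable[[Message], Message]


def tokenize(msg: Message) -> Message:
    """Hypothetical service: split the text on whitespace."""
    tokens = str(msg["text"]).split()
    return {**msg, "tokens": tokens}


def count_tokens(msg: Message) -> Message:
    """Hypothetical service: count tokens produced by an earlier service."""
    tokens = msg["tokens"]
    assert isinstance(tokens, list)
    return {**msg, "token_count": len(tokens)}


def run_pipeline(services: List[Service], msg: Message) -> Message:
    """Pass the message through each service in turn."""
    for service in services:
        msg = service(msg)
    return msg


result = run_pipeline([tokenize, count_tokens], {"text": "In principio erat verbum"})
print(result["tokens"], result["token_count"])
```

In a real setting each service would presumably sit behind a network interface rather than a function call, but the contract stays the same: a service can be swapped for another implementation as long as it honours the agreed message format.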

We are very happy that several people from or related to the European Library are here today. Digital content is very important for our proposal: we need content in order to use the tools. This seems to be the first time we have people from the whole chain together in a group. We have tried to balance this in the proposed work packages as they are now. They will work together but will not depend too much on each other, so we can guarantee results from the separate work packages.

One of the things we want to do today is find out which of the WPs you are interested in. We would also like to hear where we have missed things.

Andrea: long-term sustainability seems not to be addressed. Life after the funding period, business model, etc. This would need to be part of a WP or a separate WP. Depends on the way you want to organize it. Peter: or in different WPs? Tara: or in each of the WPs. Andrea: a business model is very important. Federico: let's look at what project BAMBOO is doing. Sally: each of the stakeholders should perhaps explain what they want from the infrastructure and build the business model from that. Peter: it could be built into the European Library for instance. In the end we want to create something that will be essential for everyone working on or interested in text. Ronald: but this is not a business model yet. Peter: It can be a start. Tara: We will have to do something more specific for the proposal.

Peter: we will have to make a statement of what we will put in ourselves. Harold: commitment of the partners is essential, but we also have to think more broadly about that, for the long term. Strength of the proposal is the generic aspect.

Sally: multicultural and multilingual aspects would be an important topic. Paul: that may not be too difficult. Harold: the linguistic aspect is very important.

Harold: (1) on p. 9 there is a very partial description of what a digital humanist is. This description is not really necessary and is better left out. (2) differentiate between the tools needed for already existing digital resources and for not yet existing digital resources. Different kinds of modeling are needed, esp. with cross-repository usage. (3) cf. WP3: we may not want to turn our back on standards. It would be an idea to involve some of the South African universities for their open source development ideas. Same for people from Chapel Hill, San Diego.

Peter: overall positive rhetoric is very strong, but the draft is not very concrete yet on what exactly is going to happen. We need some clear use cases in there. From a collection of these we can give a clearer description of what will happen. Things are expanding way beyond textual scholarship and becoming very ambitious.

Tara adds that we need use cases from as wide an area of expertise as possible. Federico and Andrea: we have to present a general model based on the best (and most diverse) use cases possible, and show that they can be supported through that model.

Barbara: we have to focus on what we are best at.

Tomasz: the concept of microservices which can be combined could be our basic model. Then we can test this model by applying use cases.

Ronald: to sell the microservices idea you need a set of use cases that represent the model, and are somehow coherent as well to show how everything collaborates.

Andrea: an example of shared topics is the use of text mining both by New Testament researchers and by people in health care using textual data.

Peter is reminded of Sindre's idea of 'smart data'. We know how to make text smart. The key is, e.g., metadata and ontologies. We have to build the specifications of what smart data is.

Joris quotes a remark of Tomasz that the proposal may be short on innovative IT technology. Joris suggests that if a number of technology partners would concentrate on actually implementing smart data, on the basis of encapsulating data with metadata that makes it discoverable, explorable, authorized etc., that would be a technologically very innovative contribution to the engineering part of the proposal.

Andrea refers to a European project called Smart Museum which we may want to link up with (mobile phones 'recognize' art objects and download information on them on demand). He will provide us with the necessary information. They're linked to Europeana.

In the meantime a Skype session with Susan and Julianne led to them professing their interest in WP5. Susan says that there should be a one-page summary of all work packages so that it can be sent round to further interested people, and in particular that the content WP should be shown to library partners so they can see what we would like to do. Julianne says that there is a relevant ESF call we should look at.

Brit and Sally give a report on the European Library and ongoing work on proposals in progress. One is e-Scholar, 'mediation in research': how to bridge the gap between content and users, and make the content more accessible to users. Different groups of users have been identified and contacted. Apart from users, these include VRE suppliers, content providers and aggregators. They are working in the humanities. A workshop in July 2009 led to the decision to apply for two different European calls, the e-Infrastructures call and the IP call that will open in November. Work is being done on the infrastructures call. Key is how to bring existing infrastructures together, comparable to what e.g. CLARIN is doing. Typology, archeology, early print book and musicology were identified as the main topics for the application. There is a clear overlap with what the Interedition-and-partners-group is trying to accomplish.

Sally adds that they would be happy to join in the proposal. They are focusing on European research libraries. They try to collect content and gather it into Europeana. It will have to be available in a way that can be integrated in the work flow of the actual researchers. They will have to be able to actually work with it. Full text is a topic as well, going further than looking at metadata only. OCR techniques are an important means to get more content, along with all the technical aspects that go with that. Multilingual data are important as well: how to integrate data in different languages, looking at mapping of subject headings etc., and how to make these available. A position paper will be ready at the end of October. Polish, Hungarian, Italian, English, French and German are the six languages involved at the moment.

Andrea points out a Danish project on automatic translation. They could perhaps be involved. Andrea has connections and could make contact.

Sally refers to the University of Bolzano as being a good possible partner for this as well.

Peter concludes that two proposals are being prepared, ours and e-scholar, and that they should be merged somehow.

The general opinion is that it is not wise to work on two different proposals.

Brit explains a lot of work still has to be done for e-scholar. Tara asks if their work could have a lot of use cases relevant for our proposal, which Brit confirms.

In the week of 19th October a lot of work will be done on our proposal; we will have to find a way to have people from e-scholar there as well.

We'll have to make a schedule for the work, probably in a small separate group.

Sally proposes to get a large legal partner involved representing several other partners. Peter: perhaps we need co-chairs from both groups. It is a joining of two projects.

Peter suggests that before 19th October three people from both groups should further work on the draft proposal, to be elaborated on from 19th October onwards.

Joris: this joint group will write the summary and send it out to all interested partners, soliciting feedback and declarations of interest. This will take the next two weeks. Details will follow through the mail. Tara will send the document with the guide for proposals.

Andrea: we will still have to decide how much to ask for, but at least we have to ask for more than we think we need.

Joris mentions that the call has 115 million euro to spend and the highest possible request is 23 mln. Brit adds that in the last call proposals asked for 1 to 4 mln euro. It will depend on how many partners there are. The project e-scholar may be looked at as a short-term project, so less than 4 years. Peter thinks that our wish to make new tools is adding to this, making it a longer-term project. Sally remarks that a maximum of three years is most convenient for the European Library. Ronald and Tara remark that it is not clear yet if we can do what we want to do in three years.

Peter: the danger of this project is that it is trying to do too much. We'll have to take care of that. So let's try to aim for three years.

Joris thanks everyone for being here. We also thank Susan and Julianne for their input through Skype.

A draft one-page description to solicit interest was produced here at first. It has been redirected to the mission statement in order to better structure this site.

(End of Minutes)