I work for Data Archiving and Networked Services (DANS) in the Netherlands. Driven by data, DANS ensures that access to digital research data keeps improving, through its services and by taking part in national & international projects and networks. DANS is an institute of the Royal Netherlands Academy of Arts and Sciences (KNAW) and the Netherlands Organisation for Scientific Research (NWO).
Dirk Roorda Phone: +31 6 13665023 Email: email@example.com
Introduction for Bootcamp Leuven 2012
- what you are working on
- Archiving data in general, processes for ingest, standards for metadata, ways of disseminating data. In particular: textual data.
- what you hope to achieve in the bootcamp
- We have a dataset, a linguistic representation of the Hebrew Old Testament, queryable with a special front-end. Our archived version is at http://wivu.dans.knaw.nl. I am looking for ways to make this data available for the long term. Can we export the data to RDF and do meaningful things with it?
- the sort of data you use in your work
- Recently we had a primitive digitisation of the correspondence of Descartes, which had to be converted to TEI. There are mathematical formulas in it. I did it using Perl and TeX.
- the sort of data you produce in your work
- We transform and curate data.
- how you might envision your tool(s) hooking together with other tools
- My main question is: if you have a specialised tool to do text analysis, how do you preserve the results of your analysis in such a way that it can be redone, even if you cannot preserve the original analysis tool?
- what sort of text analysis you are interested in
- All kinds, and therefore nothing in particular. It is the long-term archiving and use that I am interested in.
- how you envision tools working together to achieve this
- Analysis tools are optimised to bring lots of data together and perform intensive and intricate computation on them.
- A long-term solution should not try to do the computation. It should store the results of the computations that have been done, and interlink them with the source material.
- So I am looking for efficient ways to store and link large volumes of computation results.
- Search indexes are a good example.
- And RDF might be a good vehicle. Possibly in the form of OpenAnnotation.
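
To make the idea of storing results and interlinking them with the source material a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a description of how DANS or the WIVU front-end works: the function name `annotation_triples` and all `example.org` URIs are hypothetical, and the only outside vocabulary used is the Open Annotation namespace (`oa:Annotation`, `oa:hasBody`, `oa:hasTarget`). Each stored analysis result becomes the body of an annotation whose target is the passage it was computed from, written out as plain N-Triples so that no special tool is needed to read it later.

```python
# Namespaces: the Open Annotation vocabulary and the standard rdf:type property.
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotation_triples(annotation_uri, target_uri, body_uri):
    """Return N-Triples lines linking a stored result (body) to a source passage (target).

    All three arguments are plain URI strings; the function name and the
    URIs used below are hypothetical and chosen for illustration only.
    """
    return [
        f"<{annotation_uri}> <{RDF_TYPE}> <{OA}Annotation> .",
        f"<{annotation_uri}> <{OA}hasTarget> <{target_uri}> .",
        f"<{annotation_uri}> <{OA}hasBody> <{body_uri}> .",
    ]


if __name__ == "__main__":
    # Hypothetical URIs: one annotation, one source passage, one archived result.
    for line in annotation_triples(
        "http://example.org/anno/1",
        "http://example.org/text/genesis-1-1",
        "http://example.org/result/42",
    ):
        print(line)
```

The point of the design is that the archive holds only the links and the results, never the computation itself, so the triples stay readable even when the original analysis tool is gone.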