publication . Article . 2018

EAD ODD: a solution for project-specific EAD schemes

Romary, Laurent; Riondet, Charles
Open Access
  • Published: 30 Apr 2018
  • Publisher: Springer Science and Business Media LLC
  • Country: France
Abstract
This article tackles the issue of integrating heterogeneous archival sources into a single data repository, namely the European Holocaust Research Infrastructure (EHRI) portal, whose aim is to support Holocaust research by providing online access to information about dispersed sources relating to the Holocaust (http://portal.ehri-project.eu). The problem at hand is to combine data coming from a network of archives in order to create an interoperable data space which can be used to search for, retrieve and disseminate content in the context of archival-based research. The scholarly purpose has specific consequences for our task: it assumes that the information made available to the researcher is as close as possible to the originating source, in order to guarantee that the ensuing analysis can be deemed reliable.

In the EHRI network of archives, as already observed in the case of the EU CENDARI project, one cannot but face heterogeneity. The EHRI portal brings together descriptions from more than 1900 institutions. Each archive comes with a whole range of idiosyncrasies corresponding to the way it has been set up and has evolved over time. Cataloguing practices may also differ. Even the degree of digitization may range from the absence of a digital catalogue to the provision of a full-fledged online catalogue with all the necessary APIs for anyone to query and extract content. This contrasts with the global endeavour at the international level to develop and promote standards for the description of archival content as a whole. Nonetheless, in a project like EHRI, standards should play a central role. They are necessary for many tasks related to the integration and exploitation of the aggregated content, namely:
● Comparing the content of the various sources, and thus developing quality-checking processes;
● Defining an integrated repository infrastructure where the content of the various archival sources can be reliably hosted;
● Querying and re-using content in a seamless way;
● Deploying tools that have been developed independently of the specificities of the information sources, for instance in order to visualise or mine the resulting pool of information.

The central aspect of the work described in this paper is the assessment of the role of the EAD (Encoded Archival Description) standard as the basis for achieving the tasks described above. We have worked out a strategy for defining specific customizations of EAD that can be used at various stages of the process of integrating heterogeneous sources. While doing so, we have developed a methodology based on a specification and customization method inspired by the extensive experience of the Text Encoding Initiative (TEI) community. In the TEI framework, as we show in section 1, one can model specific subsets or extensions of the TEI guidelines while maintaining both the technical (XML schemas) and editorial (documentation) content within a single framework. This work leads us to anticipate that the method we have developed may be of wider interest within similar environments and also, we believe, for the future maintenance of the EAD standard.

Finally, this work, successfully tested and implemented in the framework of EHRI [Riondet 2017], can be seen as part of the wider endeavour of European research infrastructures in the humanities, such as CLARIN and DARIAH, to support researchers in integrating the use of standards into their scholarly practices. This is why the general workflow studied here has been introduced as a use case in the umbrella infrastructure project PARTHENOS, which aims, among other things, at disseminating information and resources about methodological and technical standards in the humanities.
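As a rough illustration of what a project-specific quality check on EAD content might look like, the following Python sketch tests a small EAD fragment against a list of required elements. It is not the authors' method: the article relies on an ODD specification from which schemas and documentation are generated, whereas this sketch swaps in a plain path lookup with the standard library. The profile (`REQUIRED_PATHS`), the function name `check_profile` and the sample record are all hypothetical; only the element names (`archdesc`, `did`, `unitid`, `unittitle`, `unitdate`) come from EAD itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical project profile: elements every description is expected to carry.
# These rules are illustrative assumptions, not the actual EHRI customization.
REQUIRED_PATHS = {
    "unit identifier": "archdesc/did/unitid",
    "unit title": "archdesc/did/unittitle",
    "unit date": "archdesc/did/unitdate",
}


def check_profile(ead_xml):
    """Return the labels of required elements that are missing or empty."""
    root = ET.fromstring(ead_xml)  # root is the <ead> element
    problems = []
    for label, path in REQUIRED_PATHS.items():
        node = root.find(path)
        if node is None or not (node.text or "").strip():
            problems.append(label)
    return problems


# Invented sample record (no namespace), lacking a <unitdate>.
SAMPLE = """<ead>
  <eadheader><eadid>sample-001</eadid></eadheader>
  <archdesc level="fonds">
    <did>
      <unitid>COLL-1</unitid>
      <unittitle>Sample collection</unittitle>
    </did>
  </archdesc>
</ead>"""

if __name__ == "__main__":
    print(check_profile(SAMPLE))
```

Because the same rule list can be run over records from any contributing archive, a check of this kind gives a comparable quality report across sources, which is the role the abstract assigns to a shared, customized EAD profile.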
Fields of Science and Technology classification (FOS)
02 engineering and technology, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, 05 social sciences, 0509 other social sciences, 050904 information & library sciences
Sustainable Development Goals (SDG)
9. Industry and infrastructure, 16. Peace & justice
Subjects
free text keywords: Library and Information Sciences, History, Building and Construction, Software, Encoded Archival Description, EAD, Text Encoding Initiative, TEI, Standard specification, Research infrastructures, Customization, [SHS.INFO]Humanities and Social Sciences/Library and information sciences, [SHS.HIST]Humanities and Social Sciences/History, Data science, Information repository, Digitization, Documentation, Interoperability, Personalization, Workflow, Technical standard, Computer science
Communities
Other Communities
  • Rural Digital Europe
  • Social Science and Humanities
Funded by
EHRI: European Holocaust Research Infrastructure
  • Funder: European Commission (EC)
  • Project Code: 654164
  • Funding stream: H2020 | RIA
  • Validated by funder
EHRI: European Holocaust Research Infrastructure
  • Funder: European Commission (EC)
  • Project Code: 261873
  • Funding stream: FP7 | SP4 | INFRA
PARTHENOS: Pooling Activities, Resources and Tools for Heritage E-research Networking, Optimization and Synergies
  • Funder: European Commission (EC)
  • Project Code: 654119
  • Funding stream: H2020 | RIA

● Archives Portal Europe network of excellence, D6.1 First Analysis report: Applying Web 2.0 solutions in archival applications, http://apexproject.eu/images/docs/D61_Web20_In_Archival_Applications.pdf, accessed on January 9th 2018

● Archives Portal Europe network of excellence D6.6 Second analysis report: Applying Web 2.0 solutions in archival applications, 2014, http://apexproject.eu/images/docs/D66_Web20_In_Archival_Applications_final.pdf, accessed on January 9th 2018

● Bunn, 2013. “Developing Descriptive Standards: A Renewed Call to Action.” Archives and Records 34 (2): 235-47. doi:10.1080/23257962.2013.830066.

● EAD and EAC-CPF working groups, AFNOR, Proposals for evolution of EAD, https://www2.archivists.org/sites/all/files/France_Proposals%20for%20evolution%20of%20EAD_0.rtf, accessed on January 9th 2018

● Experts group on archival description (ICA). “Records in Contexts, a Conceptual Model for Archival Description. Consultation Draft v0.1.” Conseil international des Archives, September 2016. http://www.ica.org/sites/default/files/RiC-CM-0.1.pdf, accessed on January 9th 2018

● Gartner, Richard. “An XML Schema for Enhancing the Semantic Interoperability of Archival Description.” Archival Science 15, no. 3 (September 1, 2015): 295-313. doi:10.1007/s10502-014-9225-1.

● Library of Congress, Development of the Encoded Archival Description DTD, 2013, http://www.loc.gov/ead/eaddev.html, accessed on January 9th 2018

● Lieske, Christian, Rahtz, Sebastian and Sasaki, Felix, Internationalization and Localization of XML: Introducing "ITS", XTech 2006, Amsterdam, The Netherlands, May 2006, http://xtech06.usefulinc.com/schedule/paper/55, accessed on January 9th 2018

● Medves, Maud, Romary, Laurent. EAG(CENDARI): customising EAG for research purposes. Building infrastructures for archives in a digital world, Jun 2013, Dublin, Ireland. 2014. 〈hal-00959841v2〉

● Rahtz, Sebastian, and Lou Burnard. 2014. “Advanced Topics in ODD.” In ODD: One Document Does it All. Workshop at the Text Encoding Initiative Conference and Members Meeting, Oct 22-24, Evanston, IL. http://tei.it.ox.ac.uk/Talks/2014-10-odds/talk-05-advanced.xml

● Riondet, Charles, Romary, Laurent, Van Nispen, Annelies, Rodriguez, Kepa Joseba, Bryant, Mike. Report on Standards. [Contract] D.11.4, Inria Paris. 2017. 〈hal-01503235〉

● Romary, Laurent, Riondet, Charles. Ongoing maintenance and customization of archival standards using ODD (EAC-CPF revision proposal). EAC-CPF revision proposal. 2017. 〈hal-01677185〉

● Romary, Laurent, Banski, Piotr, Bowers, Jack, Degl'innocenti, Emiliano, Ďurčo, Matej, et al. Report on Standardization (draft). [Technical Report] Deliverable 4.2 Inria. 2017. 〈hal-01560563〉

● Shaw, Elizabeth J. 2001. “Rethinking Balancing Flexibility and Interoperability.” New Review of Information Networking 7 (1): 117-31. doi:10.1080/13614570109516972

● Walsh, Norman, Literate Programming in XML, 2002, http://nwalsh.com/docs/articles/xml2002/lp/paper.html, accessed on January 9th 2018
