Open Access and Long-term Archiving / By Ute Schwens & Reinhard Altenhöner, German National
Library
开放获取与长期存档 / 犹特·史万斯,莱因哈德·阿腾赫纳 德国国家图书馆
In the context of the discussion on Open Access, the entire publication chain, from the writing of the
text to making available the published article, is increasingly taken into account. This chain also
includes guaranteeing the article’s long-term accessibility and ‘citability’. Ensuring this long-term
availability, in other words the long-term archiving of digital objects, includes all those measures that
serve to permanently preserve these objects for posterity. These include the preservation of the
substance of the material content on the one hand, and the guaranteed usability of digital resources on
the other(38).
在讨论开放获取时,越来越多地考虑从撰写文本到发表论文的整个出版链。这个出版链里,还
包括保证论文的长期可得及“可引”。确保长期可得,换句话说,就是数字对象的长期存档,包
括为后世而建立的永久保存措施。这些措施包括,保存实质的内容,以及保证数字资源的可用
性(注 38)。
注 38: Liegmann, Hans & Schwens, Ute, 'Langzeitarchivierung digitaler Ressourcen' = [数字资源长期
存档], in: Kuhlen, R., Seeger, T. & Strauch, D. (eds), Grundlagen der praktischen Information und
Dokumentation. Vol. 1: Handbuch zur Einführung in die Informationswissenschaft und -praxis, 5th
ed., Munich, 2004.
Measures to preserve the substance of the contents of data are successful when data deriving from a
whole variety of sources and stored on a whole variety of storage media (including existing networks)
are successfully transferred to a homogeneous storage system and preserved there in a stable fashion.
Important components of this system are therefore automated control mechanisms which monitor the
continuous system-internal data-transfer. However, the fact that technical platforms have short half-
lives affects this system too, and forces a constant change of data-storage medium generations and the
migration of data collections that this may involve.
当来源不同、存储媒介各异(包括现有网络)的数据,能成功地转换到单一存储系统、并以稳
定方式保存时,保存数据实质内容的措施才是成功的。因此,持续监测内部数据转换的自动控
制机制,是系统的重要组成部分。然而,事实上技术平台的半衰期较短,也影响着系统,以至
被迫经常更换数据存储媒介,并迁移其中的数据。
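The automated control mechanisms mentioned above can be pictured as a checksum comparison around every transfer. The following is a minimal sketch under stated assumptions, not a description of how any particular archive system works: the function names `sha256_of` and `transfer_with_fixity_check` are hypothetical, and SHA-256 is simply one common choice of checksum.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_with_fixity_check(source: Path, target: Path) -> str:
    """Copy source to target and verify that the copy is bit-identical.

    Returns the checksum so that it can be recorded for later audits.
    (Hypothetical helper, for illustration only.)
    """
    expected = sha256_of(source)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    actual = sha256_of(target)
    if actual != expected:
        raise IOError(f"fixity check failed for {target}")
    return expected
```

Recording the returned checksum alongside the object would allow the same comparison to be repeated each time the data move to a new generation of storage media.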
Preserving the usability of digital resources is far more complex. The user of the future may well not be
in a position to interpret the originally archived material (the data flow), since the necessary technical
environment (operating systems, applications) will have long since ceased to be available. For this
reason, experiments are being conducted with processes that aim to emulate obsolete systems.
保存数字资源的可用性是很复杂的过程。未来的用户很可能无法解读当初存档的资料(数据
流),因为从前必需的技术环境(操作系统及应用软件)不再可得,因此,必须通过实验,对
已淘汰的系统进行仿真。
These two briefly described approaches only apply when the digital object with its specific
characteristics has already been generated. In addition, however, a number of important initiatives
worldwide are working towards promoting the use of data formats that are stable in the long term, and
of open standards already at the publishing stage of the digital resources. Taken together, all the
selected measures also contribute to the preservation of older states of the art in order to be able to
integrate them into current and future academic processes. That is the primary goal of the long-term
archiving of digital resources.
上面简述的两种方法,只适用于已经生成的数字对象。此外,世界各地有若干重要的倡议,正
在努力推广采用长期稳定的数据格式,以及已经处于数字资源出版阶段的开放标准。综观来
看,所有措施不但可保存旧有状态,还可以把它们融入当前及未来的学术过程中。这是长期存
档数字资源的主要目标。
The question of the context and business model in which digital publications are generated is irrelevant
for (technical) long-term archiving processes, as Open Access journals in principle undergo the same
technical processes as commercial e-journals of specialist academic publishers. The German National
Library Law (Gesetz über die Deutsche Nationalbibliothek) provides for this equal treatment where
long-term archiving is concerned(39). Since 29 June 2006, this law has obliged the German National
Library to collect all works published after 1913 in Germany, in German or about Germany. This legal
obligation to collect materials is linked to the obligation to permanently preserve and make archived
materials available.
数字出版的背景和商业模式,与(技术上)长期存档的过程无关,因为开放获取期刊原则上与
专业学术出版社的商业电子期刊,在技术层面并无不同。德国国家图书馆法(Gesetz über die
Deutsche Nationalbibliothek)在长期存档方面予以同等对待(注 39)。自 2006 年 6 月 29 日以
来,该法要求德国国家图书馆收集所有 1913 年以来在德国出版的、以德文或有关德国的出版
物。这一收集资料的法律义务,与永久保存并使存档资料可供使用的义务挂钩。
注 39: http://www.d-nb.de/wir/pdf/dnbg.pdf
In 2004, in response to the challenge which this duty involves, the German National Library started the
project ‘Co-operative Development of a Long-Term Digital Information Archive’, known by the
German acronym kopal(40), with funds from the German Federal Ministry of Education and Research.
This project, carried out by the German National Library and the Göttingen State and University
Library, the Society for Academic Data-processing (Gesellschaft für wissenschaftliche
Datenverarbeitung, GWDG) and IBM Germany, pursues the goal of implementing and testing a
cooperatively created and operated long-term archiving system for digital documents and data as a
sustainable solution both for long-term preservation and guaranteed long-term availability of digital
resources.
2004 年,为了面对此义务带来的挑战,在德国联邦教育与研究部的资助下,德国国家图书馆启
动“长期数字信息档案合作发展”计划,德文简称 kopal(注 40),由德国国家图书馆、哥廷根州
及大学图书馆、学术数据处理学会(Gesellschaft für wissenschaftliche Datenverarbeitung,
GWDG)以及德国 IBM 公司实施,目的在于共同建立与运作数字文件与数据的长期存档系统,
经由应用及测试之后,希望找到长期保存并保证可长期使用数字资源的可持续解决方案。
注 40: http://kopal.langzeitarchivierung.de
The starting point of the archive system is the Digital Information Archiving System (DIAS) developed
by IBM in collaboration with the Dutch National Library (Koninklijke Bibliotheek). In its architecture
and implementation, DIAS consistently follows the Open Archival Information System (OAIS) reference
model, which has been an ISO standard since 2003 (ISO 14721:2003) and provides a kind of conceptual
framework and orientation point for corresponding systems.
这个存档系统以 IBM 公司与荷兰国家图书馆(Koninklijke Bibliotheek)协作开发的“数字信息存
档系统”(DIAS)为起点。DIAS 的架构与应用适合开放档案信息系统标准(OAIS),OAIS 自
2003 年已成为一个国际标准(ISO 14721:2003),为相应系统提供一种概念框架和定位点。
For the development of the kopal project, a number of important components were added to DIAS, and
its architecture was adjusted. The system was thus made capable of serving multiple clients and users, and, in
particular, the grouped storage and administration of objects was replaced by a technical approach
geared to individual objects. The comprehensive object-related metadata necessary for this
purpose was formulated as the Universal Object Format (UOF) and anchored in the system. Finally, tools
were created that homogenise the metadata of submitted objects and that address and operate the open,
standardised interfaces of the system. The corresponding modular software library koLibRI is available
for other institutions to use under an Open Source licence. This architecture and orientation means that
kopal is in a position to store publications permanently and securely, to migrate them if necessary on
the basis of extended metadata using automated processes, or to make them available in
appropriately generated emulation environments. From a technical point of view, the kopal solution
does not involve any demands on or tying-down of publications, nor, in particular, of the production
processes behind them.
为了发展 kopal 计划,多个重要组件被加入 DIAS,其架构也做了调整。该系统因之与主机或多
用户架构兼容,尤其是,原有的存储与对象管理群组,被适合个别对象的技术方法所取代。为
此目的所必备的、与对象相关的综合性元数据信息,以“通用对象格式”(UOF)表达,附着于
系统内。最后,创建工具把元数据赋予公布的对象,表达和操作系统中开放、标准化的接口。
在开放源码许可下,相应的模块软件库 koLibRI 提供给其它机构使用。kopal 的架构及方向,意
味着它可以永久与安全地存储出版物,必要时,根据扩展元信息、以自动处理方式迁移它们,
或者使它们在适当生成的模拟环境中可用。从技术观点来看,kopal 的解决方案对出版物、尤其
是其生产过程没有任何要求或约束。
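How object-level metadata can drive automated migration may be easier to see in a small example. This is an illustrative sketch only: the `ArchivedObject` fields, the `MIGRATION_PLAN` table and the format names in it are assumptions made for the example and do not reproduce the UOF schema or the koLibRI interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchivedObject:
    identifier: str      # hypothetical archive-internal identifier
    file_format: str     # format recorded in the technical metadata
    format_version: str  # version recorded in the technical metadata


# Hypothetical migration plan: (format, version) judged at risk -> target format.
MIGRATION_PLAN = {
    ("WordPerfect", "5.1"): "PDF/A-1b",
    ("TIFF", "5.0"): "TIFF 6.0",
}


def plan_migrations(objects):
    """Return (object, target_format) pairs for objects whose recorded format is in the plan."""
    planned = []
    for obj in objects:
        target = MIGRATION_PLAN.get((obj.file_format, obj.format_version))
        if target is not None:
            planned.append((obj, target))
    return planned
```

Because the decision depends only on the metadata recorded for each object, such a selection step places no requirements on how the publication was originally produced.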
What, then, are the differences between long-term archiving of Open Access publications and the
publications of commercial publishers? Differences and open questions can be found primarily in two
areas:
那么,开放获取出版物的长期存档与商业出版社的出版物有何不同?它们的差异和开放议题,
主要集中在两个领域:
* A standardisation of publication processes across different media would seem simpler in the case
of Open Access models, since editors as a rule belong to a more homogeneous community (university,
research institutes, learned societies, etc.). Competition plays less of a role here than in the commercial
world; the use of the same standards and interfaces is preferred to the unique position of a single
producer as is required by the market. On the other hand, experience suggests that a commercial
publisher can impose on its authors much more rigid demands relating to the semantic and syntactic-
technical quality of submitted articles, and thus require that authors actively cooperate in the specific
publishing chain at an early stage.
* 在开放获取模式中,跨媒体的出版过程标准化似乎更简单,因为编辑通常属于同质社区
(大学、研究机构、学会等),这儿很少商业世界中的竞争,愿意采用相同的标准及接口,形
成市场所需求的单一生产者独特的地位。另一方面,过去的经验显示,商业出版社可以就投稿
论文的语义与语法质量,向作者提出更严格的要求,并要求作者在初期就积极配合特定的出版
程序。
* Access to Open Access publications in the archive of the German National Library, with the long-term
availability it ensures for archived items, can be granted on the same basis as access to the
documents on the server of origin. Of course, the rights owner must give his or her consent according to
copyright regulations, but most licences involved in the context of Open Access recommend the receipt
of this consent so as not to fall back into access restrictions or discussions about cost.
* 德国国家图书馆档案中的存档件具有长期可获得性,获取其中的开放获取出版物,与获取
原始服务器上的文件,没有条件上的差异。当然,版权所有人必须根据版权规定授权,而大部
分开放获取许可建议签收授权,以免重蹈获取限制或被费用局限。
Both points could also be negotiated with those commercial publishers who operate
corresponding business models for electronic publications.
与经营电子出版物的商业出版社谈判时,也不免遇到前述两点。
For the publishing author, what we have said so far means that when submitting the article to whoever
will publish it, he or she should insist that the question of long-term availability of the publication be
explicitly clarified. In this context, it is ultimately irrelevant whether this responsibility is exercised
directly by the institution to which the article is submitted, or by some other institution, for example
acting under a legal obligation, as in the case of the German National Library. As a rule, the latter form
of long-term archiving will be the most appropriate for the majority of Open Access repositories. The
German National Library is currently setting up submission interfaces for this very purpose.
Appropriate agreements should be implemented, including a catalogue of rules for the long-term
handling of the digital object.
身为作者,我们的意思很清楚,交付论文时,对方需明确表示出版物长期可用。在此前提下,
最终由谁履行该责任并不重要,可以是接受该论文的机构或是其他机构,如像德国国家图书馆
这类具有法律义务的机构。后者的长期存档形式,通常更适合多数开放获取典藏库的期望。德
国国家图书馆正为此设立提交论文的界面。应当有适当的协议,包括处理长期存档数字对象的
编目规则。
For the Open Access movement, the theme of guaranteeing the long-term availability of digital objects
certainly has potential: the use of existing technical and operational options and the design of
corresponding workflows guaranteeing the availability of publications at a high technical level could
play an increasingly important role in the competition for the optimal form of publication, especially in
an institutional context. An important sub-component here is the system of ‘persistent identifiers’
whose use ensures that sources and articles are quotable, and which guarantees that citations will
permanently be understood in an open world, and that they will not just exist in a closed and often only
partially accessible service.
对开放获取运动,保证数字对象长期可得的主题,当然是有潜力的:使用现有技术及运作办
法,设计相应的工作流程,在最高技术层面保障出版物的可得性,可以在最佳出版形式竞争、
尤其是机构层面,日益发挥重要作用。另一个重要成份是“永久标识”系统,确保来源与论文都
是可以被引用的,保障引用的内容永远在开放世界被了解,而不是只存在于封闭且仅部分可获
取的服务之中。
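The idea behind persistent identifiers can be sketched as a lookup that separates the cited name from the current storage location. The `Resolver` class, the identifier `urn:example:article-001` and the `example.org` addresses below are all hypothetical and stand in for whatever identifier scheme and resolving service an institution actually uses.

```python
class Resolver:
    """Toy resolver: maps a persistent identifier to its current location(s)."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier: str, url: str) -> None:
        """Record one location at which the identified object can be retrieved."""
        self._locations.setdefault(identifier, []).append(url)

    def relocate(self, identifier: str, old_url: str, new_url: str) -> None:
        """Replace a location after the object has moved; the identifier stays the same."""
        urls = self._locations[identifier]
        urls[urls.index(old_url)] = new_url

    def resolve(self, identifier: str) -> list:
        """Return a copy of the current locations, or an empty list if unknown."""
        return list(self._locations.get(identifier, []))


resolver = Resolver()
resolver.register("urn:example:article-001", "https://repository.example.org/a/1")
resolver.relocate(
    "urn:example:article-001",
    "https://repository.example.org/a/1",
    "https://archive.example.org/objects/1",
)
```

A citation that quotes the identifier rather than the address remains valid after the move, since only the resolver's table has to change.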
p. 58-61
Open Access: Opportunities and challenges. A handbook [开放获取 : 机会及挑战] / European
Commission/German Commission for UNESCO. -- Luxembourg: Office for Official Publications of
the European Communities, 2008. -- 144 pp., 14.8 x 21.0 cm. -- ISBN 978-92-79-06665-8. -- EUR
23459, http://tinyurl.com/3q8wo5