Session 4-1

Digitization of a Catalogue of Oracle Bones

  • Tomohiko Morioka (Kyoto University)


This report describes the digitization of the “Catalogue of the Oracle Bones in the Kyoto University Research Institute for Humanistic Studies” (京都大學人文科學硏究所藏甲骨文字; ZOB). Oracle Bone script is an ancient Chinese script whose characters were engraved on animal bones or turtle shells (in this report, “Oracle Bone” is used as a general term for these materials). ZOB is a catalogue of the Oracle Bone collection of the Institute for Research in Humanities, Kyoto University. It consists of “PLATES PART 1” (圖版册 上; published in 1959), “PLATES PART 2” (圖版册 下; published in 1959), “TEXT” (本文篇; published in 1960) and “INDEX” (索引; published in 1968). ZOB is an early work in Oracle Bone studies, so some of its interpretations may be obsolete and it does not reflect recent research. On the other hand, the collection itself is a rare heritage, so the PLATES and INDEX parts remain useful resources. In addition, the information in such an early work may have academic value for the history of Oracle Bone studies.

As the first step of the digitization, we scanned every page of ZOB and built a Web service for viewing these pages. We then extracted the images of rubbings and photos of Oracle Bone pieces from the PLATES parts. Each extracted rubbing image is named “rubbings/<ID>(.<ext>)” (e.g. rubbings/B1234.png). <ID> consists of <MATERIAL-TYPE> and <NUMBER>, where <MATERIAL-TYPE> is a symbol indicating the material of the Oracle Bone: “B” = animal bone; “S” = turtle shell. Likewise, each extracted photo image is named “photos/<ID>(.<ext>)” (e.g. photos/S0123.tif).
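The naming scheme above can be checked mechanically. The following is a minimal Python sketch, not part of the project's own tooling: the function name parse_image_path and the ImageRef type are hypothetical, and the pattern assumes that <NUMBER> is a run of decimal digits, as in the two examples.

```python
import re
from typing import NamedTuple, Optional

# Material symbols as defined in the text.
MATERIALS = {"B": "animal bone", "S": "turtle shell"}

# <KIND>/<ID>(.<ext>) where <ID> = <MATERIAL-TYPE><NUMBER>.
# Assumption: <NUMBER> consists of decimal digits only.
_PATH_RE = re.compile(r"(rubbings|photos)/([BS])(\d+)(?:\.(\w+))?")


class ImageRef(NamedTuple):
    kind: str            # "rubbings" or "photos"
    material: str        # "animal bone" or "turtle shell"
    number: str          # kept as a string to preserve leading zeros
    ext: Optional[str]   # file extension, or None when omitted


def parse_image_path(path: str) -> ImageRef:
    """Split an extracted-image file name into its parts."""
    match = _PATH_RE.fullmatch(path)
    if match is None:
        raise ValueError(f"not a rubbing/photo file name: {path!r}")
    kind, material, number, ext = match.groups()
    return ImageRef(kind, MATERIALS[material], number, ext)


if __name__ == "__main__":
    print(parse_image_path("rubbings/B1234.png"))
    print(parse_image_path("photos/S0123.tif"))
```

The number is kept as a string so that an ID such as S0123 can be rebuilt exactly from its parts.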

We also digitized the INDEX part as an Oracle Bone character database. Oracle Bone characters are currently not included in Unicode, so we use CHISE technology [1] to represent them. We defined the following character features to represent the information in the INDEX part:

  • zinbun-oracle-page: page number in the INDEX part
  • =zinbun-oracle: glyph ID of the Oracle Bone character
  • <-denotational: link to the abstract Oracle Bone character
  • shuowen-radical: number of the Shuowen radical (説文部首)
  • <-Oracle-Bones: corresponding modern Chinese characters
  • sources: identifiers of the Oracle Bones
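To make the feature list concrete, one entry can be pictured as a mapping from feature names to values. The Python sketch below is only an illustration: the feature names come from the list above, but every value is a made-up placeholder, and the helper index_by_modern_character is hypothetical. It is not the representation used by CHISE.

```python
from collections import defaultdict

# Placeholder records: feature names follow the text, all values are invented.
records = [
    {
        "=zinbun-oracle": 1,
        "zinbun-oracle-page": 1,
        "<-denotational": "abstract-character-placeholder",
        "shuowen-radical": 1,
        "<-Oracle-Bones": ["吉"],
        "sources": ["B0001"],
    },
    {
        "=zinbun-oracle": 2,
        "zinbun-oracle-page": 1,
        "shuowen-radical": 1,
        "<-Oracle-Bones": ["吉"],
        "sources": ["S0002", "B0003"],
    },
    {
        # An entry without a corresponding modern character.
        "=zinbun-oracle": 3,
        "zinbun-oracle-page": 2,
        "shuowen-radical": 2,
        "sources": ["B0004"],
    },
]


def index_by_modern_character(recs):
    """Map each modern character to the glyph IDs that point to it."""
    index = defaultdict(list)
    for rec in recs:
        for modern in rec.get("<-Oracle-Bones", []):
            index[modern].append(rec["=zinbun-oracle"])
    return dict(index)


if __name__ == "__main__":
    print(index_by_modern_character(records))
```

A reverse index of this kind is one simple way to get from a modern Chinese character to the Oracle Bone glyphs that name it in ‘<-Oracle-Bones’.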

We also added the character feature ‘ideographic-structure’ to represent the visible structure of a character. It uses the same format that CHISE uses to represent an IDS (Ideographic Description Sequence) [2]. In the original IDS, each component must be a modern Chinese character included in UCS; our extended IDS, however, also accepts Oracle Bone characters (or modern Chinese characters not included in UCS) as components. This feature is largely independent of how an Oracle Bone character is interpreted, and it can be used to search for Oracle Bone characters.
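Searching by visible structure amounts to collecting the components of each structure and testing for the wanted ones. The Python sketch below illustrates that idea under stated assumptions: structures are modelled as nested tuples of the form (operator, child, child, ...), components outside UCS are stood in for by made-up identifier strings such as "oracle:0003", and the two table entries are invented examples rather than data from the database. It is not the algorithm used by CHISE.

```python
from typing import Dict, Iterator, List, Set, Union

# A structure is a single component (a string) or a tuple
# (operator, child, child, ...), mirroring an IDS such as ⿱士口.
Structure = Union[str, tuple]


def components(structure: Structure) -> Iterator[str]:
    """Yield every leaf component of a nested structure, depth first."""
    if isinstance(structure, str):
        yield structure
        return
    _operator, *children = structure
    for child in children:
        yield from components(child)


# Hypothetical table: keys and the non-UCS component "oracle:0003" are
# placeholders, not real glyph identifiers.
STRUCTURES: Dict[str, Structure] = {
    "oracle:0001": ("⿱", "士", "口"),
    "oracle:0002": ("⿰", "oracle:0003", ("⿱", "士", "口")),
}


def find_by_components(wanted: Set[str], table: Dict[str, Structure]) -> List[str]:
    """Return the keys whose structure contains all wanted components."""
    return sorted(
        key for key, structure in table.items()
        if wanted <= set(components(structure))
    )


if __name__ == "__main__":
    print(find_by_components({"口"}, STRUCTURES))
    print(find_by_components({"oracle:0003"}, STRUCTURES))
```

Because a component is just an opaque string here, a character missing from UCS can take part in a search in the same way as an encoded one.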

This Oracle Bone character database is integrated with the CHISE character ontology. Its source code is available at [3] as part of XEmacs CHISE. As with modern Chinese characters (and various other characters), information about an Oracle Bone character can be viewed in CHISE-wiki, for example: http://www.chise.org/chisewiki/view.cgi?character=rep.zinbun-oracle:339

In CHISE-wiki, the character feature ‘sources’ provides links to the Oracle Bone images (rubbings) that include the displayed Oracle Bone character. Likewise, the character feature ‘zinbun-oracle-page’ provides a link to the corresponding page of the INDEX. “CHISE IDS Find” [4] can also be used to search for Oracle Bone characters by their components. In addition, each Oracle Bone character that has the character feature ‘<-Oracle-Bones’ is linked from the corresponding modern Chinese character (e.g. http://www.chise.org/chisewiki/view.cgi?character=吉).
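The two CHISE-wiki addresses quoted in this report share one URL shape, which can be built programmatically. The Python sketch below is an illustration only: the function names are hypothetical, and it assumes that the “rep.zinbun-oracle:<number>” form seen in the single example above generalises to other glyph IDs.

```python
from urllib.parse import quote

# Base address taken from the URLs quoted in the text.
BASE_URL = "http://www.chise.org/chisewiki/view.cgi"


def character_url(character: str) -> str:
    """Build a CHISE-wiki view URL for a character or character name.

    ":" is left unescaped so that the result matches the URL form quoted
    in the text; other non-ASCII text is percent-encoded as UTF-8.
    """
    return f"{BASE_URL}?character={quote(character, safe=':')}"


def oracle_glyph_url(glyph_id: int) -> str:
    """Build the view URL for an Oracle Bone glyph ID.

    Assumption: the "rep.zinbun-oracle:" prefix applies to every glyph ID,
    as it does for the one example given in the text.
    """
    return character_url(f"rep.zinbun-oracle:{glyph_id}")


if __name__ == "__main__":
    print(oracle_glyph_url(339))
    print(character_url("吉"))
```

Percent-encoding the modern character is a choice made for safe transport; browsers normally display the encoded form as the character itself.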


Keywords
Oracle Bones, ancient Chinese characters, character database, rubbings, Linked Open Data

References

  1. Morioka, Tomohiko: “CHISE: Character Processing based on Character Ontology”, Large-scale Knowledge Resources (LKR 2008), pp. 148–162, LNAI 4938.
  2. Defined in ISO/IEC 10646 (UCS)
  3. http://git.chise.org/gitweb/?p=chise/xemacs-chise.git;a=blob;f=lisp/utf-2000/Oracle-Bones.el
  4. http://www.chise.org/ids-find