Session 5-2

SMART-GS Web: A HTML5-Powered, Collaborative Manuscript Transcription Platform

  • Yuta Hashimoto (Kyoto University)


Some historically important manuscripts, especially those written in the modern age, are hard to read because of their authors' unclear handwriting. Transcribing these manuscripts tends to be time-consuming, which reduces historians' productivity. When manuscripts are written in East Asian languages such as Japanese, which have a vast number of characters, transcription is even harder.

SMART-GS, a desktop application for image-based historical studies, has been developed by Japanese historians and developers since 2006 to help historians work on such manuscripts. The system offers a variety of features for working with hard-to-read manuscripts, such as image-and-text markup, search over handwritten text, and support for other research tasks. SMART-GS has been successfully applied to several historical research projects, including the transcription of Baron Yuzaburo Kuratomi's diary. However, because it lacks a web interface, its actual use has been limited to fairly small circles. SMART-GS Web, a web version of SMART-GS, has been developed to offer SMART-GS's various features to a wider community of users (Fig. 1).

With the development and spread of HTML5 technologies and modern web browsers such as Firefox and Google Chrome, we can implement rich features such as image editing that were previously possible only in desktop applications. In particular, newly introduced technologies such as WebSocket and WebRTC enable users to communicate with each other in real time. SMART-GS Web makes use of these technologies and offers the following new features that the original SMART-GS lacks:

Real-time Collaboration

Transcribing a large number of manuscripts is often a team effort, and SMART-GS Web offers groupware features for this purpose. Every change made by a user is immediately reflected in the other users' workspaces without a page reload. In addition, each member's mouse cursor and text cursor are shared with the other users in real time so that they can see what the others are currently focusing on. Resources are also easy to share among users: manuscript images imported into a project are stored in Amazon S3 and made available to every project member. Metadata added to images, as well as transcription text, is similarly stored in cloud storage and made searchable by an indexing server.
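As an illustration only, the relaying of edits and cursor positions between project members might be sketched as follows. The message shapes, the `ProjectRoom` class, and `applyEdit` are hypothetical names chosen for this example and are not taken from SMART-GS Web; the delivery callback stands in for whatever transport is used (for instance, a wrapper around a WebSocket connection), and conflict resolution between simultaneous edits is deliberately left out.

```typescript
// Hypothetical message shapes; the real wire format is not described in the abstract.
type CollabMessage =
  | { kind: "edit"; pageId: string; offset: number; removed: number; inserted: string }
  | { kind: "textCursor"; pageId: string; offset: number }
  | { kind: "mouseCursor"; pageId: string; x: number; y: number };

// Delivery callback for one connected member (e.g. a wrapper around a WebSocket send).
type Deliver = (senderId: string, message: CollabMessage) => void;

class ProjectRoom {
  private members: Record<string, Deliver> = {};

  join(userId: string, deliver: Deliver): void {
    this.members[userId] = deliver;
  }

  leave(userId: string): void {
    delete this.members[userId];
  }

  // Relay a message to every member except its sender; returns the number of recipients.
  publish(senderId: string, message: CollabMessage): number {
    let delivered = 0;
    for (const userId of Object.keys(this.members)) {
      if (userId === senderId) continue;
      this.members[userId](senderId, message);
      delivered++;
    }
    return delivered;
  }
}

// Apply a received edit to a local copy of the transcription text.
function applyEdit(
  text: string,
  edit: { offset: number; removed: number; inserted: string }
): string {
  return text.slice(0, edit.offset) + edit.inserted + text.slice(edit.offset + edit.removed);
}
```

In this sketch the sender applies its own change locally and every other member applies the relayed message, which is one simple way to keep workspaces in step without reloading the page.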

TEI Support

SMART-GS Web has an embedded HTML editor for transcription text, and these transcriptions can also be exported as TEI documents. Markup on images will be represented with TEI's <sourceDoc> element. SMART-GS Web's TEI support is not yet sophisticated, so I am planning further development.
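To make the idea concrete, here is a minimal sketch of how marked-up image regions could be serialized into a TEI <sourceDoc> fragment. It uses the standard TEI elements <surface>, <graphic>, <zone>, and <line> with the coordinate attributes ulx, uly, lrx, and lry. The `MarkedRegion` and `PageImage` types and the `toSourceDoc` function are assumptions made for this example and do not describe SMART-GS Web's actual export code.

```typescript
// Hypothetical data model for one rectangular markup region on a manuscript image.
interface MarkedRegion {
  ulx: number; // upper-left x, in image pixels
  uly: number; // upper-left y
  lrx: number; // lower-right x
  lry: number; // lower-right y
  text: string; // transcription of the region
}

interface PageImage {
  url: string;
  width: number;
  height: number;
  regions: MarkedRegion[];
}

// Escape characters that are not allowed verbatim in XML text or attribute values.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a <sourceDoc> fragment: one <surface> per image, one <zone> per region.
function toSourceDoc(pages: PageImage[]): string {
  const out: string[] = ["<sourceDoc>"];
  for (const page of pages) {
    out.push(`  <surface ulx="0" uly="0" lrx="${page.width}" lry="${page.height}">`);
    out.push(`    <graphic url="${escapeXml(page.url)}"/>`);
    for (const r of page.regions) {
      out.push(`    <zone ulx="${r.ulx}" uly="${r.uly}" lrx="${r.lrx}" lry="${r.lry}">`);
      out.push(`      <line>${escapeXml(r.text)}</line>`);
      out.push("    </zone>");
    }
    out.push("  </surface>");
  }
  out.push("</sourceDoc>");
  return out.join("\n");
}
```

A complete TEI document would also need a <teiHeader> and the TEI namespace on the root element; the fragment above covers only the image-markup part.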

Vertical Text Editing

The editor embedded in SMART-GS Web supports vertical text editing, which is enabled by the writing-mode property introduced in CSS3. This feature will be especially useful when transcribing manuscripts written in East-Asian languages such as Japanese.
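As a rough sketch, switching an editor between horizontal and vertical layout comes down to a small set of CSS declarations. The function names and the `.transcription-editor` selector below are hypothetical; `writing-mode: vertical-rl` lays lines out top to bottom with columns running right to left, as is conventional for Japanese, and `text-orientation: mixed` keeps embedded Latin text rotated sideways.

```typescript
type EditorLayout = "horizontal" | "vertical";

// CSS declarations for the chosen layout (property name mapped to value).
function editorStyle(layout: EditorLayout): Record<string, string> {
  if (layout === "vertical") {
    return {
      "writing-mode": "vertical-rl",
      "text-orientation": "mixed",
    };
  }
  return { "writing-mode": "horizontal-tb" };
}

// Serialize the declarations into a single CSS rule for the given selector.
function toCssRule(selector: string, declarations: Record<string, string>): string {
  const body = Object.keys(declarations)
    .map((property) => property + ": " + declarations[property] + ";")
    .join(" ");
  return selector + " { " + body + " }";
}

// Example: a rule for a hypothetical editor element.
const verticalRule = toCssRule(".transcription-editor", editorStyle("vertical"));
```

Depending on the browser versions targeted, a vendor-prefixed form of the property may also be required.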

SMART-GS Web is not yet a mature project and needs much more improvement. However, it can make a significant contribution to the work of historians who study historical manuscripts.



Fig. 1: SMART-GS Web running on Google Chrome



Keywords
historical research, transcription, TEI, collaboration