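The Origins and Prospects of Chinese Electronic Buddhist Texts
(Digitized Documents / The Multivalent Document Model)
Venerable Huimin (惠敏法師)
June 27, 1999
Summary compiled by 呂美智, 林佩琪, and 李志強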
References:
"Applications of A New Document Model for Digitalization of East Asian Classical Documents", Howie Lan (藍效農), Instructional Technology Program, University of California, Berkeley
Digital Library Project: University of California at Berkeley http://elib.cs.berkeley.edu/
"Multivalent Documents: Anytime, Anywhere, Any Type, Every Way User-Improvable Digital Documents and Systems", Thomas A. Phelps, Ph.D. dissertation, University of California, Berkeley (1998)
"Multivalent Documents: A New Model for Digital Documents", Robert Wilensky & Thomas A. Phelps, Technical Report, CSD-98-999 (1998)
"Multivalent Documents: Inducing Structure and Behaviors in Online Digital Documents", Thomas A. Phelps & Robert Wilensky, Proceedings of Hawaii International Conference on System Sciences '96 (Best Paper Award, Digital Documents Track)
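This talk draws on material about the new document model, organizes it, and extends it to the digital library project, introducing related work abroad in technology, collection building, and administration.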
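Multivalent Documents (a document model with multiple information layers)
Distinguishing characteristics of old and new documents:
1. Printed documents: Monolithic
2. Digital documents: Multivalent
The Multivalent View of Documents is offered as a reference for:
1. Decision makers
2. Professional managers
3. Technical implementers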
Traditional View of Documents
Physical Media
1. Clay tablets
2. Oracle bones
3. Bamboo strips
4. Paper
Drawbacks of the Traditional View of Documents:
1. Documents are treated as equivalent to books
2. Basic units are stand-alone and disconnected
3. Searching, cross-referencing, and comparing information across basic units is time-consuming
4. Documents are meant only for humans to "read"
The Multivalent Document (a digitized document is a "multivalent document")
It offers multiple kinds of value and function:
- information entity
- content & functional factor
- layers & behaviors
Layer: a layer of information that shares a single form
Behaviors: functional processing components applied to layers; they can be added incrementally and openly
Layer Examples:
- Scanned text image Layer (scanned images of the original edition)
- Encoded text data Layer (character-encoded edition)
- Font reconstructed Layer (edition re-rendered from a font library)
- Character geometries Layer
- Meta data Layer
- Future content Layer
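The relationship between layers and behaviors described above can be sketched in a few lines of Python. This is only an illustrative model, not the design of the Berkeley Multivalent system: the class names `Layer` and `MultivalentDocument`, the method names, and the sample content are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Layer:
    """One layer: information of a single form (image, encoded text, ...)."""
    name: str
    content: Any


@dataclass
class MultivalentDocument:
    """A document as an open collection of layers plus behaviors (hypothetical model)."""
    layers: Dict[str, Layer] = field(default_factory=dict)
    behaviors: Dict[str, Callable[..., Any]] = field(default_factory=dict)

    def add_layer(self, layer: Layer) -> None:
        # Layers can be added at any time without touching existing ones.
        self.layers[layer.name] = layer

    def add_behavior(self, name: str, func: Callable[..., Any]) -> None:
        # Behaviors are also added incrementally and openly.
        self.behaviors[name] = func

    def run(self, behavior: str, layer: str, *args: Any) -> Any:
        # Apply a named behavior to the content of a named layer.
        return self.behaviors[behavior](self.layers[layer].content, *args)


doc = MultivalentDocument()
doc.add_layer(Layer("encoded_text", "如是我聞"))          # sample content only
doc.add_layer(Layer("metadata", {"collection": "sample"}))
doc.add_behavior("count_characters", len)
n_chars = doc.run("count_characters", "encoded_text")
```

Because layers and behaviors live in separate dictionaries, a new layer (for example, a future content layer) or a new behavior can be registered later without changing the existing ones.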
Examples of Within-layer Behaviors:
In the scanned text image layer:
- Local enlargement
- Copy image part
- Paste image part
In the encoded text layer:
- Character finding
- String sorting
- Character counting
In the reconstructed display layer:
- Show Character with font
- Set Typeface
- Horizontal text display
- Vertical text display
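The three within-layer behaviors listed for the encoded text layer need nothing beyond the encoded text itself, which a minimal Python sketch can show. The function names and the sample string are assumptions for illustration. Sorting here uses plain Unicode code point order, which stands in for whatever collation (radical, stroke count, reading) a real system would choose.

```python
from collections import Counter
from typing import Dict, List


def find_character(text: str, ch: str) -> List[int]:
    """Character finding: return every index at which `ch` occurs."""
    return [i for i, c in enumerate(text) if c == ch]


def count_characters(text: str) -> Dict[str, int]:
    """Character counting: return how often each character occurs."""
    return dict(Counter(text))


def sort_strings(strings: List[str]) -> List[str]:
    """String sorting by Unicode code point (an assumed, simplistic order)."""
    return sorted(strings)


sample = "色不異空空不異色"   # sample text for illustration only
positions = find_character(sample, "空")
counts = count_characters(sample)
ordered = sort_strings(["空", "色", "不"])
```

None of these operations is possible on a scanned page image alone, which is why the encoded text layer carries its own set of behaviors.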
Examples of Cross-layer Behaviors:
- Synonym lens
- Translation
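One way to picture a cross-layer behavior such as the synonym lens is as a function that reads several layers at once. The sketch below is a guess at the idea, not the Multivalent system's actual implementation. It assumes that the character geometries layer is a list of bounding boxes aligned one-to-one with the encoded text, that the lens is a rectangle in image coordinates, and that a lookup table (here filled with placeholder variant forms) supplies the alternatives.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in image pixels


def synonym_lens(
    encoded_text: str,
    geometries: List[Box],
    lens: Box,
    synonyms: Dict[str, List[str]],
) -> List[Tuple[str, Box, List[str]]]:
    """Return (character, box, alternatives) for characters lying fully under the lens."""
    left, top, right, bottom = lens
    hits = []
    for ch, box in zip(encoded_text, geometries):
        l, t, r, b = box
        inside = l >= left and t >= top and r <= right and b <= bottom
        if inside and ch in synonyms:
            hits.append((ch, box, synonyms[ch]))
    return hits


# Hypothetical data: three characters in one vertical column of a page image.
text = "無為法"
boxes = [(100, 0, 140, 40), (100, 40, 140, 80), (100, 80, 140, 120)]
table = {"無": ["无"], "為": ["爲"]}   # placeholder entries, not a real lexicon
result = synonym_lens(text, boxes, lens=(90, 0, 150, 85), synonyms=table)
```

The returned boxes tell a viewer where to draw the alternatives on top of the scanned image, so the behavior ties together the image, geometry, and encoded text layers.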
Multivalent View of Documents (advantages of multivalent documents)
Basic units: characters, strokes, pixels
- Rearranging and connecting them is simple and fast
- Processor + Human Mind (the document can be "read" by both machines and people)
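Being confined by the traditional concept of documents has the following consequences:
- Different data layers are not linked
  e.g. "dead" page images (document images are regarded as nothing more than images)
- Some data layers are linked, but only within a single piece of software
  e.g. OCR software's post-processing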
Multivalent Document Characteristics:
Three characteristics of multivalent documents
1. Incrementally Extendable
A multivalent document can be extended incrementally in the following respects:
- Data
- Layers
- Behaviors
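2. Structurally Distributed
Management can be structurally distributed.
Layers can reside at different locations on the network.
For example:
- Taishō Tripiṭaka (大正藏) vols. 1 to 55 (encoded text files): CBETA
- Taishō Tripiṭaka (大正藏) vols. 56 to 100 (encoded text files): SAT
- Tripiṭaka Koreana (高麗藏) (image files): Haeinsa (海印寺), Korea
- Hand-copied old manuscripts (image files): Kyoto University, University of Tokyo
- Fangshan Stone Sutras (房山石經) (image files): Beijing Library (北京圖書館)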
3. Internally Complementary
The components complement one another.
- Layers or behaviors may not be perfect.
- Some layers may be implemented less accurately than others.
- The multivalent document framework lets the different components of the content complement one another, so that together they make the document whole.
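The second and third characteristics can be illustrated together: layers are held by different institutions, and when one layer is missing or imperfect another can stand in for it. The following Python sketch is illustrative only. The `LayerSource` class, the `available` flag, and the `resolve` function are assumptions of this sketch, and the catalog entries borrow holder names from the examples above without implying how those institutions actually serve their data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LayerSource:
    layer: str        # kind of layer, e.g. "encoded_text" or "image"
    holder: str       # institution hosting the layer
    available: bool   # hypothetical flag: can the holder be reached right now?


def resolve(sources: List[LayerSource], preferred: List[str]) -> Optional[LayerSource]:
    """Return the first available source, trying layer kinds in preference order."""
    for kind in preferred:
        for src in sources:
            if src.layer == kind and src.available:
                return src
    return None


# Hypothetical catalog: layers of one document held at different places.
catalog = [
    LayerSource("encoded_text", "CBETA", available=False),
    LayerSource("image", "Haeinsa", available=True),
]
chosen = resolve(catalog, preferred=["encoded_text", "image"])
```

Here the encoded text is preferred, but because it cannot be reached the image layer is returned instead, which is one simple reading of layers complementing each other.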