Resources and Tools for Corpus Compilation of Translated Literary Texts in Late Qing and Republican Period

Nason Anran Cao

Journal of Translation Studies, 2018, Vol. 2, Issue 1: 153-168.


Abstract

Corpus-based translation studies have attracted increasing attention in recent years, and the translated literature of the late Qing and Republican era in China is a popular research topic in the field. Building a corpus of translated literature from this period may provide a stronger empirical basis for investigation. This article introduces online resources for that task, including a variety of databases and tools, with a focus on the Chinese OCR software used in corpus compilation. It compares the features of databases that contain translated literary texts from the late Qing and Republican period, identifies problems in applying OCR to these texts, and tests four OCR software packages on sample texts. The article also offers insights for large-scale digitization projects on Chinese texts.

Key words

Chinese databases / corpus-based translation studies / Chinese OCR / digital humanities / literary translation

Cite this article

Nason Anran Cao. Resources and Tools for Corpus Compilation of Translated Literary Texts in Late Qing and Republican Period[J]. Journal of Translation Studies. 2018, 2(1): 153-168

