CANNA Version 3.2, a Japanese input system

June 1, 1994
Yuko Misao
misao@d1.bs2.mt.nec.co.jp
Akira Kon
kon@d1.bs2.mt.nec.co.jp
NEC Corporation

0. Preface
1. Overview
2. Features of Canna
   2.1 Kana-to-kanji conversion based on a client-server model
   2.2 Supporting automatic kana-to-kanji conversion
   2.3 Providing a unified user interface to input Japanese
   2.4 Supporting customization by users
   2.5 Customization based on the Lisp language
   2.6 A mechanism to add a suitable part of speech to each word
       registered by users
   2.7 Providing a library which supports a unified user interface
   2.8 Maintenance tools for conversion dictionaries
   2.9 A customizing tool which supports easy customization
   2.10 Supporting Nemacs (Mule), kinput2 and uum
3. Contents of Canna
4. Documents
5. Bugs
6. Acknowledgements

0. Preface

       This is an introduction to Canna.  For installation
  instructions, see the file INSTALL or INSTALL.jp; for changes from
  previous versions, see CHANGES.jp.  (Sorry, but INSTALL and README
  may not be as up to date as their Japanese versions, and CHANGES is
  not available in English.)

1. Overview

       This directory contains the source files for a Japanese input
  system named Canna.  Canna provides a unified user interface for
  Japanese input.

  *Note:  Canna was previously called Iroha among its developers.

       There is already a generally available Japanese input system
  called Wnn.  Canna offers an alternative Japanese input system.

       Canna is based on a client-server model for kana-to-kanji
  conversion; that is, an application program which uses the Japanese
  input system communicates with a kana-to-kanji conversion server,
  which is a separate process from the application program.  Canna has
  the following features:

  1) Kana-to-kanji conversion based on a client-server model
  2) Supporting automatic kana-to-kanji conversion
  3) Providing a unified user interface to input Japanese
  4) Supporting customization by users
  5) Customization based on the Lisp language
  6) A mechanism to add a suitable part of speech to each word
     registered by users
  7) Providing a library which supports a unified user interface
  8) Maintenance tools for conversion dictionaries
  9) A customizing tool which supports easy customization
 10) Supporting Nemacs (Mule), kinput2 and uum

       Below, I describe Canna's features in detail.

2. Features of Canna

2.1 Kana-to-kanji conversion based on a client-server model

       Canna converts kana to kanji based on a client-server model.
  That is, an application program communicates with a kana-to-kanji
  conversion server to perform Japanese input.
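
       The division of labor between application and server can be
  pictured with a small sketch.  It is only an illustration under
  stated assumptions: the line-based protocol, the toy dictionary and
  the function names are hypothetical and are not Canna's actual
  protocol or API.  A thread stands in for the server, which in Canna
  is a separate process.

```python
import socket
import threading

# Hypothetical toy dictionary: kana reading -> kanji candidates.
TOY_DICTIONARY = {
    "かんじ": ["漢字", "感じ", "幹事"],
    "にほんご": ["日本語"],
}


def serve(conn):
    """Server side: read one reading per line, reply with candidates."""
    with conn, conn.makefile("r", encoding="utf-8") as requests:
        for line in requests:
            reading = line.rstrip("\n")
            # Unknown readings are echoed back unconverted.
            candidates = TOY_DICTIONARY.get(reading, [reading])
            conn.sendall(("\t".join(candidates) + "\n").encode("utf-8"))


def convert(conn, replies, reading):
    """Client side: send a reading and wait for the candidate list."""
    conn.sendall((reading + "\n").encode("utf-8"))
    return replies.readline().rstrip("\n").split("\t")


def demo():
    client_sock, server_sock = socket.socketpair()
    server = threading.Thread(target=serve, args=(server_sock,))
    server.start()
    with client_sock, client_sock.makefile("r", encoding="utf-8") as replies:
        known = convert(client_sock, replies, "かんじ")
        unknown = convert(client_sock, replies, "ねこ")
    server.join()  # the server loop ends when the client disconnects
    return known, unknown
```

  The client holds no dictionary of its own; it only sends readings
  and receives candidates, which is the point of the model.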

2.2 Supporting automatic kana-to-kanji conversion

       Canna is the first Japanese input system which supports
  automatic kana-to-kanji conversion; it is also free software for
  Japanese input based on a client-server model.  In general,
  automatic kana-to-kanji conversion is not easy to use for Japanese
  input in PC environments.  Canna makes it easier by providing a
  better user interface, one that extends ordinary (non-automatic)
  conversion.

2.3 Providing a unified user interface to input Japanese

       The developers of Canna, including myself, used to use Egg on
  Nemacs when we had to input Japanese.  Egg fit in very well with
  Nemacs, and its user interface was convenient.

  *Note: Nemacs is a Japanese version of Emacs.  Egg is an interface
         between Nemacs and Wnn.  Egg provides a user interface by
         itself.

       On the other hand, when we had to input Japanese outside
  Nemacs, for example on a shell command line, we could not use Egg
  because Egg was usable only in the Emacs environment.  Although
  there is also a Japanese input system for TTYs called Uum, Uum has a
  different user interface from Egg, and we never got used to it.

       We were especially eager to use Egg's user interface with X
  clients.  We found a lot of good tools on the X window system, such
  as xmh and xcalendar.  Our desire was to localize those clients
  into our national language and to operate them with Egg's user
  interface.

<<Let's create another Egg by ourselves>>

  Thus, we decided to create an Egg-like system and to provide it as a
  library.  This is Canna.  Canna now provides more features than
  Egg, and it can be used in Emacs, in X environments, and on TTYs.

2.4 Supporting customization by users

       In addition to key bindings, it is possible to customize
  romaji-to-kana conversion rules, status display strings,
  dictionaries, etc.  Customizations are described in a customization
  file, which can be shared among applications using Canna.
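
       To show what a customizable romaji-to-kana rule set means in
  practice, here is a minimal sketch.  The rule table and the greedy
  longest-match strategy are assumptions made for illustration; they
  are not Canna's actual rule table, file syntax or algorithm.

```python
# Hypothetical default rules: romaji spelling -> kana.
DEFAULT_RULES = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "na": "な", "ni": "に", "nn": "ん",
    "ji": "じ",
}


def romaji_to_kana(text, rules):
    """Convert text by always applying the longest matching rule first."""
    longest = max(len(key) for key in rules)
    output = []
    position = 0
    while position < len(text):
        for length in range(min(longest, len(text) - position), 0, -1):
            chunk = text[position:position + length]
            if chunk in rules:
                output.append(rules[chunk])
                position += length
                break
        else:
            # No rule matches here: pass the character through unchanged.
            output.append(text[position])
            position += 1
    return "".join(output)


# A user who prefers to type "zi" for じ adds one rule of their own.
USER_RULES = {**DEFAULT_RULES, "zi": "じ"}
```

  With the default table, "kannzi" leaves the unknown "z" untouched;
  with USER_RULES the same keystrokes produce かんじ.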

2.5 Customization based on the Lisp language

       Because of the new syntax, old application programs will have
  to be rewritten to incorporate the new rules and conditions of
  Canna.

       Since the new customization syntax is based on the Lisp
  language, descriptions that an older version of Canna cannot
  understand can be ignored.  With the new syntax, you can obtain the
  version of Canna or of the connected server, and make customizations
  conditional on these versions.
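
       The two ideas in this paragraph, skipping descriptions that are
  not understood and customizing according to version, can be sketched
  as follows.  The form names "set" and "when-version-at-least" are
  invented for this sketch and are not taken from Canna's
  customization language; the reader also assumes balanced
  parentheses.

```python
def parse(source):
    """Parse S-expressions into nested lists of string atoms."""
    tokens = source.replace("(", " ( ").replace(")", " ) ").split()
    forms, stack = [], []
    for token in tokens:
        if token == "(":
            stack.append([])
        elif token == ")":
            finished = stack.pop()
            (stack[-1] if stack else forms).append(finished)
        else:
            (stack[-1] if stack else forms).append(token)
    return forms


def load_customization(source, version):
    """Apply the forms that are understood and collect the ones that are not."""
    settings, ignored = {}, []

    def run(form):
        if not isinstance(form, list) or not form:
            ignored.append(form)
            return
        head = form[0]
        if head == "set" and len(form) == 3:
            settings[form[1]] = form[2]
        elif head == "when-version-at-least" and len(form) >= 2:
            required = tuple(int(part) for part in form[1].split("."))
            if version >= required:
                for inner in form[2:]:
                    run(inner)
        else:
            ignored.append(head)  # unknown description: skip it, do not fail

    for form in parse(source):
        run(form)
    return settings, ignored


EXAMPLE = """
(set mode hiragana)
(future-feature x y)
(when-version-at-least 3.2 (set auto-conversion on))
(when-version-at-least 4.0 (set newer-option on))
"""
```

  Loading EXAMPLE as version (3, 2) applies the first and third forms,
  skips the unknown "future-feature" form, and leaves the 4.0-only
  setting out.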

2.6 A mechanism to add a suitable part of speech to each word
    registered by users

       When we designed our own user interface, we added one new idea
  to the word registration part of Canna.

       Kana-to-kanji conversion systems use more detailed parts of
  speech than those taught in school grammar.  Canna uses almost 400
  parts of speech.  It would be nearly impossible, and unkind, to ask
  users which part of speech should be used for a word they register.

       In Canna, we therefore made a new mechanism to assign a
  suitable part of speech to registered words.  Canna shows several
  sentences that use the newly registered word and asks the user
  whether each usage is correct.  The answers to these questions help
  Canna decide which part of speech is suitable for the newly
  registered word.
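
       The question-and-answer idea can be sketched as a simple
  elimination over candidate parts of speech.  The three categories
  and two test phrases below are a deliberately tiny, hypothetical
  stand-in; Canna's real set of almost 400 parts of speech and its
  actual questions are not described in this file.

```python
# Hypothetical candidates: which test phrases each category accepts.
CANDIDATES = {
    "plain noun": {"suru": False, "na": False},
    "suru noun": {"suru": True, "na": False},
    "na adjective": {"suru": False, "na": True},
}

# Hypothetical test phrases built around the registered word.
QUESTIONS = {
    "suru": "{word}する",
    "na": "{word}な",
}


def guess_part_of_speech(word, answer):
    """Narrow the candidates by asking whether each test phrase is correct.

    `answer` is a callback that receives a phrase and returns True or
    False; in an interactive program it would ask the user.
    """
    remaining = dict(CANDIDATES)
    for feature, template in QUESTIONS.items():
        if len(remaining) == 1:
            break  # already decided, no need to ask more
        accepted = answer(template.format(word=word))
        remaining = {
            name: features
            for name, features in remaining.items()
            if features[feature] == accepted
        }
    return sorted(remaining)
```

  The user only judges whether a phrase sounds right and never has to
  name a grammatical category.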

2.7 Providing a library which supports a unified user interface

       A user-interface library is provided.  With this library,
  programmers can easily add a Japanese input system with a unified
  user interface to application programs.

       This library provides higher-level functions than Wnn's
  high-level library does, and the interpretation of each input key is
  also handled inside the library.

       The interface to this library is simple.  Passing an input key
  to the library returns the pieces of information needed to display
  the pre-edit status.  The interpretation of key functions is hidden
  inside the library, and application programs do not need to be
  aware of it.

       The library can also handle input from several windows.
  Passing a context identifier corresponding to each window along
  with the key input makes it possible to process several
  kana-to-kanji conversions concurrently in one process.
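
       The shape of such an interface, one call per key with a
  context identifier, can be sketched as below.  The class, method
  and field names are hypothetical and do not reproduce the real
  library's functions; conversion itself is left out, so the sketch
  only shows how per-window state stays separate.

```python
from dataclasses import dataclass


@dataclass
class KeyResult:
    preedit: str    # text the application should show as pre-edit
    committed: str  # text that has just been finalized, if any


class InputLibrary:
    """Keeps one pre-edit buffer per context identifier."""

    def __init__(self):
        self._buffers = {}

    def feed(self, context_id, key):
        """Interpret one key for the given context and report what to display."""
        buffer = self._buffers.get(context_id, "")
        committed = ""
        if key == "\r":        # Return: finalize the pre-edit text
            committed, buffer = buffer, ""
        elif key == "\b":      # Backspace: delete the last character
            buffer = buffer[:-1]
        else:                  # any other key: extend the pre-edit text
            buffer += key
        self._buffers[context_id] = buffer
        return KeyResult(preedit=buffer, committed=committed)
```

  An application with two windows would call feed(1, key) and
  feed(2, key) in any order; keys typed in one window never disturb
  the pre-edit text of the other.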

2.8 Maintenance tools for conversion dictionaries

       Canna provides more than ten tools to maintain kana-to-kanji
  conversion dictionaries.  Using these tools, you can do the
  following things.

   - Conversion between text format dictionaries and binary format
     ones.
   - Creating, deleting, listing, renaming dictionaries by remote
     operation
   - Uploading and downloading dictionaries
   - Adding/deleting items to/from a dictionary by a batch procedure

2.9 A customizing tool which supports easy customization

       Instead of editing a customization file directly, it is
  possible to set up customization by using a customizing tool.

2.10 Supporting Nemacs (Mule), kinput2 and uum

       Canna provides a patch for Nemacs (Mule), which makes it
  possible to input Japanese in Nemacs (Mule) with Canna's unified
  user interface.  In addition, Mule distributions after Mule 0.9.5
  include an interface to Canna.

       If kinput2 uses Canna as its kana-to-kanji conversion engine,
  application programs which use kinput2 can use Canna's unified user
  interface to input Japanese.

       Additionally, Canna provides a patch for uum, which makes it
  possible to input Japanese with Canna on TTYs.  For uum, see
  canuum/README.jp.  (Sorry, but canuum/README.jp is in Japanese.)

3.  Contents of Canna

       Canna contains the following things.

   - Kana-to-kanji conversion server (cannaserver)
   - User interface library (libcanna.a, libcanna.so.1)
   - Remote version of dictionary accessing library (libRKC.a)
   - Local version of dictionary accessing library (libRK.a)
   - Wnn version of dictionary accessing library (RKwnn.o)
   - Maintenance tools for conversion dictionaries
   - Patch files for the Canna version of uum
   - A sample program

4. Documents

       Under this directory, there are several documents.  Most of
  them are in Japanese.

  1) Document of Canna (in Japanese)

       The document is in jlatex format and is in doc/man/guide.
  It gives a general description of how to use Canna.

  2) Documents for commands

       The documents are in roff man macro format and are in the
  respective directories.  The source code is also available in the
  same directories.  Japanese documents have the suffix .jmn, and
  English documents have .man.

  3) Documents for application interface library

       The documents are in man macro format of roff and are in
  lib/canna and lib/RK.

  4) Document for kana-to-kanji conversion protocol (in Japanese)

       The document is in jlatex format and is in doc/intern.  Canna
  has protocol versions 1.0, 1.1, 2.0, 2.1, 3.0 and 3.2.  Protocol
  versions 1.0 and 1.1 are documented in the file proto12.tex.  The
  other versions are documented in protocol.tex.

  5) Document for lisp language of customization file (in Japanese)

       The document is in jlatex format and is in doc/lisp as
  canlisp.tex.

  6) Document for the Canna version of uum

       The notes are in plain text format and are in canuum as
  README.jp.

  7) Other Documents

       The documents are in jlatex format and are in doc/misc.

5. Bugs

       Canna has the following known bugs:

  (1) Canuum doesn't run on Solaris 2.1.

  (2) Canuum doesn't run on SunOS 4.2 when compiled with gcc.

  (3) On SONY NEWS, the terminal mode of the pseudo tty created by
      canuum is different.

  (4) The following document is not available in English:
        canuum/canuum.man

  (5) When automatic kana-to-kanji conversion is used, the input may
      not be converted correctly.

  (6) On Solaris 2.3, some core dumps will occur.

  (7) When you compile, you will get the warning message "Undefined
      row vectors: BM".  You can safely ignore it.

6. Acknowledgements

  Canna was released as an Alpha Release in January 1994, and as a
  Beta Release in March.  Since the Alpha Release, many people on the
  canna mailing list have cooperated and worked to make this release
  possible.  They reported compilation results for the respective
  versions, problems with compilation, and bugs that were not easy to
  find, and they suggested fixes in the form of patches.  Our thanks
  go out to the following people.

    Shin Yoshimura (IIJ), Tomotake Furuhata (IBM), Nobuhiro Yasutomi
    (ISAC), Takanori Nishijima (Canon), Yoshinari Inoue (Kyoto
    University), Yuuji Hirose (Keio University), Tetsuaki Kiriyama
    (SONY), Fumitaka Murayama (Chubu University), Tetsuji Takada (The
    University of Electro-Communications), Takashi Taniguchi (The
    University of Electro-Communications), Akihisa Tsuboi (The
    University of Electro-Communications), Fumihiro Ueno (The
    University of Tokyo), Yasuharu Sasaki (The University of Tokyo),
    Yoshinari Kanaya (Tohoku University), Takeshi Hagiwara (Tokyo
    Institute of Technology), Yukinobu Moriya (Tokyo Institute of
    Technology), Takuya Harakawa (Tokyo Institute of Technology),
    Yasuhiro Matsuoka (Nagoya Institute of Technology), Yoshinori
    Sakamoto (Nara Institute of Science and Technology), Kenjirou
    Tsuji (Nihon SunSoft), Masayuki Koba (Hitachi Software Engineering
    Co., Ltd.), Toshihiko Kodama (Fujitsu), Noriaki Seki (Fuji Xerox
    Co.,Ltd.), Kenji Rikitake (TDI Kyoto Network Technology
    Laboratory), Atsushi Nakamura (Matsushita Research Institute
    Tokyo, Inc.), Kenichi Tanino (Musashi Institute of Technology),
    Masakazu Kano (Musashi Institute of Technology), Noriyoshi
    Kumazawa (Meiji University), Yoshikazu Tabata (Yokogawa Digital
    Computer Corporation), Hiroki Kanou (Waseda University), Masaki
    Takayama (NEC Software, Ltd.), Ryuichi Matsumoto (NEC Scientific
    Information System Development, Ltd.), Masao Negishi (DNP),
    Junji Sakai (NEC).

  Hajime Oiwa (Keio Univ.) agreed to let Canna support TUT code as
  one of its functions.  Thank you very much.  TUT code input has
  been supported since Canna Version 2.2 patch level 2.

  Tomotake Furuhata (The Univ. of Tokyo) pointed out bugs in the
  Imakefiles of Canna Version 1.2 and sent a patch to correct them
  (Canna has a large number of Imakefiles).  The Imakefiles after
  Canna Version 2.2 are based on his patch.  Thank you very much.

  Yasuhiro Masutani (Osaka Univ.) and Masaki Takayama (NEC Software,
  Ltd.) maintained the documents.  There was a problem that the
  installed directories did not agree with the descriptions in the
  documents, since the directories of Canna can be changed according
  to the installation environment.  They resolved the problem.  Thank
  you very much.

  Takashi Matsuzawa wrote the engine switcher part for HP-UX and gave
  us advice.  Thank you very much.

  Isao Seki (PRUG), Yuuji Hirose (Keio Univ.) and Yasuhiro Matsuoka
  (Nagoya Institute of Technology) pointed out bugs in NEmacs and Mule
  when used with Canna, and sent patches to correct them.  Thank you
  very much.  NEmacs and Mule aren't included in this release, but we
  will improve them.

  Yoshinori Sakamoto (Nara Institute of Science and Technology),
  Makoto Ishisone (SRA), Hisashi Minamino (SRA), Yoshinari Kanaya
  (Tohoku Univ.), Junn Ohta (RICOH), Yoshitaka Tokugawa (DIT) and
  Takahiro Kanbe (FXIS) gave advice about detaching cannaserver from
  the TTY and about handling variable arguments portably.  Thank you
  very much.

  We debugged problems on the DEC Alpha, a 64-bit workstation from
  Digital Equipment Corporation, at the Alpha resource center of
  Digital Equipment Corporation Japan.  We thank Hiroaki Obata,
  Tsutomu Okihara and Satoshi Watanabe for their efforts.

  Masato Minda (ASTEC) set up and manages the "Canna" mailing list.
  Thanks to it, we receive a great deal of useful information such
  as the above.  We thank him for his efforts.

  Kazuhiro Fujieda (Japan Advanced Institute of Science and
  Technology, Hokuriku), Hiroki Kanou (Waseda Univ.), Masaki Takayama
  (NEC Software, Ltd.) and Hiroshi Yamada (NEC Corporation) gave
  suggestions about certain parts of speech and the conjunction of
  words.  In particular, Mr. Fujieda added the symbols that mark the
  end of a sentence to the kana-to-kanji conversion logic of Canna.
  Thank you very much.

/* Copyright 1994 NEC Corporation, Tokyo, Japan.
 *
 * Permission to use, copy, modify, distribute and sell this software
 * and its documentation for any purpose is hereby granted without
 * fee, provided that the above copyright notice appear in all copies
 * and that both that copyright notice and this permission notice
 * appear in supporting documentation, and that the name of NEC
 * Corporation not be used in advertising or publicity pertaining to
 * distribution of the software without specific, written prior
 * permission.  NEC Corporation makes no representations about the
 * suitability of this software for any purpose.  It is provided "as
 * is" without express or implied warranty.
 *
 * NEC CORPORATION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
 * INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN 
 * NO EVENT SHALL NEC CORPORATION BE LIABLE FOR ANY SPECIAL, INDIRECT OR
 * CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF 
 * USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR 
 * OTHER TORTUOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR 
 * PERFORMANCE OF THIS SOFTWARE. 
 */

($Id: README,v 2.15 1994/07/26 00:31:08 misao Exp $)