
<BODY BGCOLOR="white">
<center>
<h2> The Home Page of the</h2>
<H1>KANJIDIC/KANJD212</H1> 
<h2>Project</h2>
</center>
<em>The following is summary information about the <b>KANJIDIC</b> and
<b>KANJD212</b> kanji database files. For complete documentation, see the <A
HREF="">kanjidic</A> and <A HREF="">kanjd212</A> documentation files. 
</em>
<p><b>Introduction</b>
<p> The <b>KANJIDIC</b> file contains comprehensive information 
about Japanese kanji. It is a text file 
with one line for each of the 6,355 kanji specified in the 
JIS X 0208-1990 set, plus a header line. (For information 
about this set, see Appendix A of the documentation file.) <p>
The <b>KANJD212</b> file is also a text file containing comprehensive 
information about the
5,801 kanji in the JIS X 0212-1990 supplementary character set.
<p>
The files have been compiled and maintained by
<A HREF="">Jim Breen</A>,
and were cited as references used in the compilation of the new
edition of the Nelson character dictionary.
<P>
<B>KANJIDIC2</B>
<p>
The KANJIDIC2 file, which is an XML-format version of
the data in KANJIDIC and KANJD212, is now available. The KANJIDIC2 page
is <a href="">here</a>.
<p>
<B>Is it Public Domain?</B>
</P>
<P>
KANJIDIC, KANJD212 and KANJIDIC2 can be freely used provided satisfactory 
acknowledgement is made
in any software product, server, etc. that uses them. There are a few other 
conditions relating to distributing copies of the files with or without
modification. Copyright is vested in the
<a HREF="">EDRG</a>
(Electronic Dictionary Research Group). You can see the specific
<a HREF="">licence statement</a>
at the Group's site.
</P>
The files are available from the Monash University ftp site 
<A HREF="">here</a> and
<A HREF="">here</a>.
<p>
The information about each kanji consists of: 
<UL> 
<LI> coding information about the kanji, including its identification 
in various code-sets (JIS, Unicode), and identifying codes derived 
from its structure (radical, stroke-count, Four-Corner, SKIP); 
<LI> lexicographic information about the kanji: pronunciations and 
meanings; 
<LI> indexing information providing identification of the kanji in 
several major dictionaries and reference books. 
</UL>
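<p>
As a rough illustration of how the JIS and Unicode identifications relate, the following Python sketch turns a hexadecimal JIS X 0208 code and a hexadecimal Unicode code into the same character. The function names are hypothetical. The sketch relies on the usual EUC-JP convention of setting the high bit of each JIS byte, and it does not cover the JIS X 0212 codes used in KANJD212.

```python
def kanji_from_jis(jis_hex):
    """Return the character for a 4-digit hexadecimal JIS X 0208 code."""
    code = int(jis_hex, 16)
    # EUC-JP represents a JIS X 0208 character as its two JIS bytes,
    # each with the high bit set.
    euc = bytes([(code >> 8) | 0x80, (code & 0xFF) | 0x80])
    return euc.decode("euc_jp")


def kanji_from_unicode(u_hex):
    """Return the character for a hexadecimal Unicode code point."""
    return chr(int(u_hex, 16))


# The kanji meaning "east": JIS 456C and Unicode 6771 name the same character.
print(kanji_from_jis("456C"), kanji_from_unicode("6771"))
```
<p>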
<b>KANJIDIC</b> is now used to build the kinfo.dat file, which is 
used by JDIC, JREADER, WinJDic and JWPce. kinfo.dat contains 
identical information, but in a compressed form and in a 
structure suitable for fast indexed access. <b>KANJIDIC</b> is 
also used in its native format by the xjdic and MacJDic dictionary 
programs, the WWWJDIC server, and a growing number of other programs such as KDRILL and 
KDIC. <p>
The files contain a mixture of ASCII characters and kana/kanji 
encoded using the EUC (Extended Unix Code) coding. The following is 
an example (folded) of one of the entry lines in <b>KANJIDIC</b>: 
the entry for the kanji with the meaning "east". <p>
<img src="kdsamp1.gif">
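<p>
A minimal Python sketch of reading such a file is given here. It assumes that the header line is the first line of the file and that Python's "euc_jp" codec is adequate for the file's EUC coding. The function name and the file name in the usage comment are illustrative only.

```python
def read_kanjidic(path):
    """Yield the entry lines of a KANJIDIC-style file as decoded text."""
    with open(path, encoding="euc_jp") as f:
        next(f, None)  # assumption: the header line is the first line
        for line in f:
            line = line.rstrip("\n")
            if line:  # ignore any blank lines
                yield line


# Usage (the file name is an assumption):
# for entry_line in read_kanjidic("kanjidic"):
#     print(entry_line)
```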
<p>
Apart from the first two fields, the readings and the meanings, the information fields 
in each entry begin with an identifying capital letter (two letters for the Morohashi fields). The fields in 
the sample entry are:
<UL> 
<LI> the kanji itself; 
<LI> the JIS code of the kanji in hexadecimal; 
<LI> [U] the Unicode/ISO 10646 code of the kanji in hexadecimal; 
<LI> [N] the index in Nelson (Modern Reader's Japanese-English 
Character Dictionary); 
<LI> [B] the classification radical number of the kanji (as in 
Nelson); 
<LI> [C] the "classical" radical number (where this differs from 
the one used in Nelson); 
<LI> [S] the total stroke-count of the kanji; 
<LI> [G] the "grade" of the kanji. In this case, G2 means it is 
a Jouyou (general use) kanji taught in the second year of elementary 
schooling in Japan; 
<LI> [H] the index in Halpern (New Japanese-English Character 
Dictionary); 
<LI> [F] the rank-order frequency of occurrence of the kanji in 
Japanese; 
<LI> [P] the "SKIP" coding of the kanji, as used in Halpern; 
<LI> [K] the index in the Gakken Kanji Dictionary (A New 
Dictionary of Kanji Usage); 
<LI> [L] the index in Heisig (Remembering The Kanji); 
<LI> [I] the index in the Spahn &amp; Hadamitzky dictionary; 
<LI> [Q] the Four-Corner code; 
<LI> [MN,MP] the index and page number in the 13-volume Morohashi 
"DaiKanWaJiten"; 
<LI> [E] the index in Henshall (A Guide To Remembering Japanese 
Characters); 
<LI> [Y] the PinYin (Chinese) pronunciation(s) of the kanji; 
<LI> the Japanese pronunciations or "readings" of the kanji. These 
are in the katakana script for "ON" (i.e. of Chinese origin) 
readings, and hiragana for "KUN" (Japanese origin) readings. Thus <p>
<img src="tou.gif"> <p>
is the ON reading, and <p>
<img src="higashi.gif"> <p>
is the KUN reading. These are optionally followed by a "T" plus one or more 
"nanori", which are the special readings only used with proper names; 
<LI> finally, the meanings associated with the kanji, each 
enclosed in {...}. 
</UL> 
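<p>
The field layout described above can be sketched as a small Python parser. This is an illustration under stated assumptions, not the code used by any of the programs mentioned: the function name and the dictionary layout are hypothetical, any all-ASCII token beginning with "T" is taken to introduce nanori, and the sample line was written by hand in the described layout rather than copied from the file.

```python
import re

TWO_LETTER_KEYS = ("MN", "MP")  # the Morohashi fields use two-letter codes


def parse_entry(line):
    """Split one KANJIDIC-style entry line into its parts."""
    # Meanings may contain spaces, so take the {...} groups out first.
    meanings = re.findall(r"\{([^}]*)\}", line)
    tokens = re.sub(r"\{[^}]*\}", " ", line).split()
    entry = {"kanji": tokens[0], "jis": tokens[1], "fields": {},
             "on": [], "kun": [], "nanori": [], "meanings": meanings}
    in_nanori = False
    for tok in tokens[2:]:
        if tok.isascii():
            if tok.startswith("T"):  # assumption: marks the start of nanori
                in_nanori = True
                continue
            key = tok[:2] if tok[:2] in TWO_LETTER_KEYS else tok[0]
            # A field letter can occur more than once, so keep a list.
            entry["fields"].setdefault(key, []).append(tok[len(key):])
        elif in_nanori:
            entry["nanori"].append(tok)
        else:
            # Classify the reading by its first kana: katakana means ON.
            first = next(ch for ch in tok if not ch.isascii())
            kind = "on" if 0x30A0 <= ord(first) <= 0x30FF else "kun"
            entry[kind].append(tok)
    return entry


# Hand-written sample line in the layout described above (illustrative only).
sample = "東 456C U6771 B4 C75 G2 S8 Ydong1 トウ ひがし T1 あずま {east}"
entry = parse_entry(sample)
print(entry["kanji"], entry["fields"]["U"], entry["on"], entry["kun"])
```

Keeping each field as a list of strings avoids guessing at the value format, since some codes (such as the SKIP and Four-Corner codes) are not plain integers.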
<p> <b>KANJIDIC</b> is available from a number of ftp archive sites around the
world. The master site is Monash University's <A
HREF="">Nihongo FTP
Archive</a>,
where it is available in .zip and .gz compressed form. The
"kinfo.dat" file is also in the archive.
<p>
If you want to explore <b>KANJIDIC</b> using the Web, try my <A
HREF="">WWWJDIC</a> server, which supports both KANJIDIC and
KANJD212. If you don't have a browser with Japanese support, you can use
a portal which provides graphics of the Japanese characters.
<p>
<b>Other Languages</b>
<p>
There is a Spanish version of KANJIDIC, with the meanings translated into Spanish by
Francisco Gutierrez. 
It can be downloaded <a
href="">here</a> or <a
href="">here</a>. (It
is also in UTF-8 coding.)
<p>
There is also a partial French version, prepared by Alain Thierion, available
<a href="">here</a>.
<p>
<em>
<A HREF="">Return to Jim's Japanese Page</a>
</em>
</BODY>