<HEAD><TITLE>The KANJIDIC Home Page</TITLE></HEAD>
<BODY BGCOLOR="white">
<center>
<h2> The Home Page of the</h2>
<H1>KANJIDIC/KANJD212</H1>
<h2>Project</h2>
</center>

<em>The following is summary information about the <b>KANJIDIC</b> and
<b>KANJD212</b> kanji database files. For complete documentation, see the <A
HREF="http://www.csse.monash.edu.au/~jwb/kanjidic_doc.html">kanjidic</A> and <A HREF="http://www.csse.monash.edu.au/~jwb/kanjd212_doc.html">kanjd212</A> documentation files.
</em>

<p><b>Introduction</b>

<p> The <b>KANJIDIC</b> file contains comprehensive information
about Japanese kanji. It is a text file
with one line for each of the 6,355 kanji specified in the
JIS X 0208-1990 set, plus a header line. (For information
about this set, see Appendix A of the documentation file.) <p>

The <b>KANJD212</b> file is also a text file containing comprehensive
information about the
5,801 kanji in the JIS X 0212-1990 supplementary character set.
<p>
The files have been compiled and maintained by
<A HREF="http://www.csse.monash.edu.au/~jwb/index.html">Jim Breen</A>,
and were cited as references used in the compilation of the new
edition of the Nelson character dictionary.<p>
<P>
<B>KANJIDIC2</B>
<p>
The KANJIDIC2 file, which is an XML-format version of
the data in KANJIDIC and KANJD212, is now available. The KANJIDIC2 page
is <a href="http://www.csse.monash.edu.au/~jwb/kanjidic2">here</a>.
<p>
<B>Is it Public Domain?</B>
</P>
<P>
KANJIDIC, KANJD212 and KANJIDIC2 can be freely used provided satisfactory
acknowledgement is made
in any software product, server, etc. that uses them. There are a few other
conditions relating to distributing copies of the files with or without
modification.
Copyright is vested in the
<a HREF="http://www.edrdg.org/">EDRDG</a>
(Electronic Dictionary Research and Development Group). You can see the specific
<a HREF="http://www.edrdg.org/edrdg/licence.html">licence statement</a>
at the Group's site.
</P>
The files are available from the Monash University FTP site
<A HREF="http://ftp.monash.edu.au/pub/nihongo/kanjidic.gz">here</a> and
<A HREF="http://ftp.monash.edu.au/pub/nihongo/kanjd212.gz">here</a>.
<p>
The information about each kanji consists of:
<UL>
<LI> coding information about the kanji, including its identification
in various code-sets (JIS, Unicode), and identifying codes derived
from its structure (radical, stroke-count, Four-Corner, SKIP);

<LI> lexicographic information about the kanji: pronunciations and
meanings;

<LI> indexing information providing identification of the kanji in
several major dictionaries and reference books.
</UL>

<b>KANJIDIC</b> is now used to build the kinfo.dat file, which is
used by JDIC, JREADER, WinJDic and JWPce. kinfo.dat contains the
identical information, but in a compressed form and in a
structure suitable for fast indexed access. <b>KANJIDIC</b> is
also used in its native format by the xjdic and MacJDic dictionary
programs, the WWWJDIC server, and a growing number of other programs such as KDRILL and
KDIC. <p>

The files contain a mixture of ASCII characters and kana/kanji
encoded using the EUC (Extended Unix Code) coding. The following is
an example (folded) of one of the entry lines in <b>KANJIDIC</b>:
the entry for the kanji meaning "east". <p>

<img src="kdsamp1.gif">

<p>

Apart from the first two fields, the information fields
in each entry have an identifying capital letter.
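<p>
The rule above (two unlabelled leading fields, then fields identified by a capital letter) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: <code>parse_entry</code> is a hypothetical helper, not part of any KANJIDIC tool, and the sample string is a shortened, hand-built line for the kanji meaning "east", not a verbatim entry from the file. It also assumes that fields are separated by spaces, that the kana readings appear as unlabelled fields, and that each meaning is wrapped in {...}, as described further down this page.

```python
import re


def parse_entry(line):
    """Split one KANJIDIC-style entry line into its parts (illustrative only)."""
    # Meanings are wrapped in {...} and may contain spaces, so extract them first.
    meanings = re.findall(r"\{([^}]*)\}", line)
    rest = re.sub(r"\{[^}]*\}", " ", line)

    tokens = rest.split()
    # The first two fields carry no letter: the kanji itself and its JIS code.
    kanji, jis = tokens[0], tokens[1]

    coded = []     # (letter code, value) pairs; a code may occur more than once
    readings = []  # fields that do not start with an ASCII capital letter
    for tok in tokens[2:]:
        m = re.match(r"([A-Z]+)(.*)", tok)
        if m:
            coded.append((m.group(1), m.group(2)))
        else:
            readings.append(tok)

    return {
        "kanji": kanji,
        "jis": jis,
        "coded": coded,
        "readings": readings,
        "meanings": meanings,
    }


# Shortened, hand-built sample line (an assumption, not copied from KANJIDIC).
sample = "東 456C U6771 S8 G2 トウ ひがし {east}"
entry = parse_entry(sample)
print(entry["coded"])
```

When reading the real file rather than a string, the lines would first need decoding from EUC, for example by opening the file with <code>encoding="euc_jp"</code> in Python. Note that this sketch would also class the "T" marker that introduces name readings as a coded field.
<p>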
The fields in
the sample entry are:

<UL>

<LI> the kanji itself;

<LI> the JIS code of the kanji in hexadecimal;

<LI> [U] the Unicode/ISO 10646 code of the kanji in hexadecimal;

<LI> [N] the index in Nelson (Modern Reader's Japanese-English
Character Dictionary);

<LI> [B] the classification radical number of the kanji (as in
Nelson);

<LI> [C] the "classical" radical number (where this differs from
the one used in Nelson);

<LI> [S] the total stroke-count of the kanji;

<LI> [G] the "grade" of the kanji. In this case, G2 means it is
a Jouyou (general use) kanji taught in the second year of elementary
schooling in Japan;

<LI> [H] the index in Halpern (New Japanese-English Character
Dictionary);

<LI> [F] the rank-order frequency of occurrence of the kanji in
Japanese;

<LI> [P] the "SKIP" coding of the kanji, as used in Halpern;

<LI> [K] the index in the Gakken Kanji Dictionary (A New
Dictionary of Kanji Usage);

<LI> [L] the index in Heisig (Remembering The Kanji);

<LI> [I] the index in the Spahn &amp; Hadamitzky dictionary;

<LI> [Q] the Four-Corner code;

<LI> [MN,MP] the index and page number in the 13-volume Morohashi
"DaiKanWaJiten";

<LI> [E] the index in Henshall (A Guide To Remembering Japanese
Characters);

<LI> [Y] the PinYin (Chinese) pronunciation(s) of the kanji;

<LI> the Japanese pronunciations or "readings" of the kanji. These
are in the katakana script for "ON" (i.e. of Chinese origin)
readings, and hiragana for "KUN" (Japanese origin) readings. Thus <p>
<img src="tou.gif"> <p>
is the ON reading, and <p>
<img src="higashi.gif"> <p>
is the KUN reading. These are optionally followed by a "T" plus one or more
"nanori", which are special readings used only in proper names.

<LI> finally, the meanings associated with the kanji, each
enclosed in {...}.

</UL>

<p> <b>KANJIDIC</b> is available from a number of FTP archive sites around the
world. The master site is Monash University's <A
HREF="http://ftp.monash.edu.au/pub/nihongo/00INDEX.html">Nihongo FTP
Archive</a>,
where it is available in .zip and .gz compression. The
"kinfo.dat" file is in the kinfo26.zip archive.
<p>
If you want to explore <b>KANJIDIC</b> using the Web, try my <A
HREF="http://www.csse.monash.edu.au/~jwb/wwwjdic.html">WWWJDIC</a> server, which supports both KANJIDIC and
KANJD212. If you don't have a browser with Japanese support, you can use
a portal which provides graphics of the Japanese characters.
<p>
<b>Other Languages</b>
<p>
A Spanish version of KANJIDIC is available, with Spanish meanings translated by
Francisco Gutierrez.
It can be downloaded <a
href="http://www.csse.monash.edu.au/~jwb/kanjidic_es.gz">here</a> or <a
href="http://ftp.monash.edu.au/pub/nihongo/kanjidic_es.gz">here</a>. (It
is also in UTF-8 coding.)
<p>
There is also a partial French version prepared by Alain Thierion, available
<a href="http://www.csse.monash.edu.au/~jwb/kanjidic_fr.gz">here</a>.
<p>
<em>
<A HREF="http://www.csse.monash.edu.au/~jwb/japanese.html">Return to Jim's Japanese Page</a>
</em>
</BODY>