<HEAD><TITLE>The KANJIDIC Home Page</TITLE></HEAD>
<BODY BGCOLOR="white">
<center>
<h2> The Home Page of the</h2>
<H1>KANJIDIC/KANJD212</H1>
<h2>Project</h2>
</center>

<em>The following is summary information about the <b>KANJIDIC</b> and
<b>KANJD212</b> kanji database files. For complete documentation, see the <A
HREF="http://www.csse.monash.edu.au/~jwb/kanjidic_doc.html">kanjidic</A> and <A HREF="http://www.csse.monash.edu.au/~jwb/kanjd212_doc.html">kanjd212</A> documentation files.
</em>

<p><b>Introduction</b>

<p> The <b>KANJIDIC</b> file contains comprehensive information
about Japanese kanji. It is a text file
with one line for each of the 6,355 kanji specified in the
JIS X 0208-1990 set, plus a header line. (For information
about this set, see Appendix A of the documentation file.) <p>

The <b>KANJD212</b> file is also a text file containing comprehensive
information about the
5,801 kanji in the JIS X 0212-1990 supplementary character set.
<p>
The files have been compiled and maintained by
<A HREF="http://www.csse.monash.edu.au/~jwb/index.html">Jim Breen</A>,
and were cited as references used in the compilation of the new
edition of the Nelson character dictionary.<p>
<P>
<B>KANJIDIC2</B>
<p>
The KANJIDIC2 file, an XML-format version of
the data in KANJIDIC and KANJD212, is now available. The KANJIDIC2 page
is <a href="http://www.csse.monash.edu.au/~jwb/kanjidic2">here</a>.
<p>
<B>Is it Public Domain?</B>
</P>
<P>
KANJIDIC, KANJD212 and KANJIDIC2 can be freely used provided satisfactory
acknowledgement is made
in any software product, server, etc. that uses them. There are a few other
conditions relating to distributing copies of the files with or without
modification. Copyright is vested in the
 <a HREF="http://www.edrdg.org/">EDRDG</a>
(Electronic Dictionary Research and Development Group). You can see the specific
 <a HREF="http://www.edrdg.org/edrdg/licence.html">licence statement</a>
at the Group's site.
</P>
The files are available from the Monash University ftp site
<A HREF="http://ftp.monash.edu.au/pub/nihongo/kanjidic.gz">here</a> and
<A HREF="http://ftp.monash.edu.au/pub/nihongo/kanjd212.gz">here</a>.
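<p>
As a rough illustration of working with a downloaded copy, the sketch below reads one of the gzip-compressed files line by line. It is written in Python, which is an arbitrary choice. The function name iter_entries is hypothetical, and the sketch assumes that the header line begins with "#" and that the file uses the EUC coding described further down this page.
</p>

```python
import gzip
from typing import Iterator


def iter_entries(path: str, encoding: str = "euc_jp") -> Iterator[str]:
    """Yield the entry lines (one per kanji) of a gzip-compressed file."""
    # gzip.open in text mode decompresses and decodes in a single step.
    with gzip.open(path, mode="rt", encoding=encoding) as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            # Assumption: the header line starts with "#".
            # Blank lines, if any, are skipped as well.
            if not line or line.startswith("#"):
                continue
            yield line
```

<p>
If those assumptions hold, counting the lines yielded for a local copy of kanjidic.gz should give the 6,355 entries mentioned in the introduction.
</p>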
<p>
The information about each kanji consists of:
<UL>
<LI> coding information about the kanji, including its identification
in various code-sets (JIS, Unicode), and identifying codes derived
from its structure (radical, stroke-count, Four-Corner, SKIP);

<LI> lexicographic information about the kanji: pronunciations and
meanings;

<LI> indexing information providing identification of the kanji in
several major dictionaries and reference books.
</UL>

<b>KANJIDIC</b> is now used to build the kinfo.dat file, which is
used by JDIC, JREADER, WinJDic and JWPce. kinfo.dat contains the
same information, but in a compressed form and in a
structure suitable for fast indexed access. <b>KANJIDIC</b> is
also used in its native format by the xjdic and MacJDic dictionary
programs, the WWWJDIC server, and a growing number of other programs such as KDRILL and
KDIC. <p>

The files contain a mixture of ASCII characters and kana/kanji
encoded using the EUC (Extended Unix Code) coding. The following is
an example (folded) of one of the entry lines in <b>KANJIDIC</b>:
the kanji with the meaning "east". <p>

<img src="kdsamp1.gif">
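<p>
The JIS code given in each entry and the EUC bytes of the kanji are closely related: in EUC coding, each of the two JIS X 0208 bytes has its high bit set. The Python sketch below converts between the two. The function names are hypothetical and not part of any KANJIDIC tool; the sketch relies on Python's built-in "euc_jp" codec and only handles the two-byte JIS X 0208 characters found in KANJIDIC, not the JIS X 0212 characters in KANJD212.
</p>

```python
def jis_hex_to_char(jis_hex: str) -> str:
    """Convert a four-digit JIS X 0208 hex code to its character."""
    code = int(jis_hex, 16)
    high, low = code >> 8, code & 0xFF
    # Setting the high bit of each JIS byte gives the EUC byte pair.
    return bytes([high | 0x80, low | 0x80]).decode("euc_jp")


def char_to_jis_hex(char: str) -> str:
    """Convert a JIS X 0208 character to its four-digit JIS hex code."""
    euc = char.encode("euc_jp")
    # Two bytes not starting with 0x8E means a JIS X 0208 character;
    # 0x8E introduces half-width katakana, and three-byte sequences
    # (prefix 0x8F) are JIS X 0212 characters.
    if len(euc) != 2 or euc[0] == 0x8E:
        raise ValueError("not a JIS X 0208 character: %r" % char)
    # Clearing the high bit of each EUC byte recovers the JIS bytes.
    return "%02x%02x" % (euc[0] & 0x7F, euc[1] & 0x7F)
```

<p>
For instance, jis_hex_to_char("3021") should return the kanji whose Unicode code is U+4E9C, the first kanji in the JIS X 0208 set.
</p>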

<p>

Apart from the first two fields, the information fields
in each entry are prefixed with an identifying capital letter. The fields in
the sample entry are:

<UL>

<LI> the kanji itself;

<LI> the JIS code of the kanji in hexadecimal;

<LI> [U] the Unicode/ISO 10646 code of the kanji in hexadecimal;

<LI> [N] the index in Nelson (Modern Reader's Japanese-English
Character Dictionary);

<LI> [B] the classification radical number of the kanji (as in
Nelson);

<LI> [C] the "classical" radical number (where this differs from
the one used in Nelson);

<LI> [S] the total stroke-count of the kanji;

<LI> [G] the "grade" of the kanji. In this case, G2 means it is
a Jouyou (general use) kanji taught in the second year of elementary
schooling in Japan;

<LI> [H] the index in Halpern (New Japanese-English Character
Dictionary);

<LI> [F] the rank-order frequency of occurrence of the kanji in
Japanese;

<LI> [P] the "SKIP" coding of the kanji, as used in Halpern;

<LI> [K] the index in the Gakken Kanji Dictionary (A New
Dictionary of Kanji Usage);

<LI> [L] the index in Heisig (Remembering The Kanji);

<LI> [I] the index in the Spahn &amp; Hadamitzky dictionary;

<LI> [Q] the Four-Corner code;

<LI> [MN,MP] the index and page number in the 13-volume Morohashi
"DaiKanWaJiten";

<LI> [E] the index in Henshall (A Guide To Remembering Japanese
Characters);

<LI> [Y] the PinYin (Chinese) pronunciation(s) of the kanji;

<LI> the Japanese pronunciations or "readings" of the kanji. These
are in the katakana script for "ON" (i.e. of Chinese origin)
readings, and hiragana for "KUN" (Japanese origin) readings. Thus <p>
<img src="tou.gif"> <p>
is the ON reading, and <p>
<img src="higashi.gif"> <p>
is the KUN reading. These are optionally followed by a "T" plus one or more
"nanori", which are the special readings only used with proper names;

<LI> finally, the meanings associated with the kanji, each
enclosed in {...}.

</UL>
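<p>
The field layout above can be sketched as a small parser. This is only an illustration in Python, not the code used by any of the programs named on this page. The name parse_entry and the shape of the returned dictionary are hypothetical, and the sketch assumes that fields are separated by spaces, that the nanori marker is a "T" optionally followed by digits, and that any reading containing katakana is an ON reading. Lettered fields are kept as lists because a code may occur more than once in an entry.
</p>

```python
import re
from collections import defaultdict

MEANING_RE = re.compile(r"\{([^}]*)\}")   # meanings: {...}
FIELD_RE = re.compile(r"([A-Z]+)(.+)")    # lettered fields, e.g. U6771, MN123
NANORI_MARK_RE = re.compile(r"T\d*")      # assumed marker before nanori


def has_katakana(text: str) -> bool:
    """True if any character lies in the Unicode katakana block."""
    return any("\u30a0" <= ch <= "\u30ff" for ch in text)


def parse_entry(line: str) -> dict:
    """Split one decoded entry line into its component fields."""
    # Meanings may contain spaces, so extract them before tokenising.
    meanings = MEANING_RE.findall(line)
    tokens = MEANING_RE.sub(" ", line).split()
    fields = defaultdict(list)
    on, kun, nanori = [], [], []
    in_nanori = False
    # The first two tokens are the kanji and its JIS code (no letter prefix).
    for token in tokens[2:]:
        match = FIELD_RE.fullmatch(token)
        if NANORI_MARK_RE.fullmatch(token):
            in_nanori = True              # later readings are nanori
        elif match:
            fields[match.group(1)].append(match.group(2))
        elif in_nanori:
            nanori.append(token)
        elif has_katakana(token):
            on.append(token)              # katakana: ON reading
        else:
            kun.append(token)             # otherwise treated as KUN
    return {
        "kanji": tokens[0],
        "jis": tokens[1],
        "fields": dict(fields),
        "on": on,
        "kun": kun,
        "nanori": nanori,
        "meanings": meanings,
    }
```

<p>
Applied to the sample entry, such a parser would be expected to report "2" under the G field and to include "east" in the list of meanings.
</p>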

<p> <b>KANJIDIC</b> is available from a number of ftp archive sites around the
world. The master site is Monash University's <A
HREF="http://ftp.monash.edu.au/pub/nihongo/00INDEX.html">Nihongo FTP
Archive</a>,
where it is available in .zip and .gz compression. The
"kinfo.dat" file is in the kinfo26.zip archive.
<p>
If you want to explore <b>KANJIDIC</b> using the Web, try my <A
HREF="http://www.csse.monash.edu.au/~jwb/wwwjdic.html">WWWJDIC</a> server, which supports both KANJIDIC and
KANJD212. If you don't have a browser with Japanese support, you can use
a portal which provides graphics of the Japanese characters.
<p>
<b>Other Languages</b>
<p>
A Spanish version of KANJIDIC is available, with meanings translated into Spanish by
Francisco Gutierrez.
It can be downloaded <a
href="http://www.csse.monash.edu.au/~jwb/kanjidic_es.gz">here</a> or <a
href="http://ftp.monash.edu.au/pub/nihongo/kanjidic_es.gz">here</a>. (It
is also in UTF-8 coding.)
<p>
There is also a partial French version prepared by Alain Thierion, available
<a href="http://www.csse.monash.edu.au/~jwb/kanjidic_fr.gz">here</a>.
<p>
<em>
<A HREF="http://www.csse.monash.edu.au/~jwb/japanese.html">Return to Jim's Japanese Page</a>
</em>
</BODY>