<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=euc-jp">
<META NAME="Generator" CONTENT="Jim's Markup Program - V0.99">
<TITLE> JMdict/EDICT Project</TITLE>
</HEAD>
<BODY BGCOLOR="white">
<!-- DO NOT EDIT!!
This HTML document was generated by the "markup" program.
Edit the original file instead. -->
<H1 ALIGN=CENTER> JMdict/EDICT </H1>
<P>
</P>
<H2 ALIGN=CENTER> JAPANESE/ENGLISH DICTIONARY PROJECT</H2>
<BASEFONT SIZE="3">
<P>
<I>Copyright (C) 2010 The Electronic Dictionary Research and Development Group.</I>
</P>
<P>
<h2>Contents</h2>
<a href="#IREF00">INTRODUCTION</a>
<a href="#IREF01">CURRENT VERSION &amp; DOWNLOAD</a>
<a href="#IREF01a">PROJECT FORUM </a>
<a href="#IREF02">FORMAT</a>
<a href="#IREF03">PROJECT HISTORY</a>
<a href="#IREF04">COPYRIGHT</a>
<a href="#IREF05">LEXICOGRAPHICAL DETAILS</a>
<a href="#IREF06">OTHER LANGUAGES</a>
<a href="#IREF08">CONTRIBUTIONS</a>
<a href="#IREF08a">RELATED PROJECTS</a>
<a href="#IREF09">ACKNOWLEDGEMENTS</a>
<a href="#IREF10">PUBLICATIONS</a>
</P>
<P>
<a name="IREF00"><h2>INTRODUCTION</h2></a>
</P>
<P>
The JMdict/EDICT project has as its goal the production of a freely
available Japanese/English Dictionary in machine-readable form.
</P>
<P>
The project began in 1991 with the expansion of the "EDICT" simple
Japanese-English dictionary file. (See below under History.)
</P>
<P>
At present the project has the following dictionary files available:
</P>
<UL>
<P>
</P>
<LI>the full JMdict file in XML format.
The JMdict file is aimed at
being a multilingual lexical database with Japanese as the pivot language
and also includes
translations of words and phrases in a number of languages other
than English. More information is available from the
<a HREF="http://www.csse.monash.edu.au/~jwb/j_jmdict.html">JMdict overview page. </a>
<P>
</P>
</LI>
<LI>the EDICT file, which contains a reduced amount of information, and
is provided to maintain support for software which uses the original
EDICT file format. A short
<a HREF="http://www.csse.monash.edu.au/~jwb/edict.html">EDICT overview page </a>
is available which lists some of the software which uses this file;
<P>
</P>
</LI>
<LI>the EDICT2 file, which is in an expanded format and contains almost
all the information in the JMdict file;
<P>
</P>
</LI>
<LI>the EDICT_SUB file, which contains the most common entries in the
EDICT file (about 20% of the total).
</LI>
</UL>
<P>
An internal database is used to hold all the data associated with the project,
and the files are generated from it using conversion utility software.
</P>
<P>
The files are copyright, and distributed in accordance with the
Licence Statement, which can be found at the WWW site of the
<a HREF="http://www.edrdg.org/">Electronic Dictionary Research and Development Group </a>
who are the owners of the copyright.
</P>
<P>
<a name="IREF01"><h2>CURRENT VERSION &amp; DOWNLOAD</h2> </a>
</P>
<P>
The project's master database is continuously being updated and new
versions of the files are generated daily. The date of generation is
included in the header of each file.
</P>
<P>
The files are currently distributed via the Monash University
<a HREF="http://ftp.monash.edu.au/pub/nihongo/00INDEX.html">ftp server, </a>
which also provides an rsync service.
The main files available are:
</P>
<UL>
<LI>
<a HREF="http://ftp.monash.edu.au/pub/nihongo/JMdict.gz">JMdict.gz </a>
- the full JMdict file, including English, German, French, Russian and Dutch glosses;
</LI>
<LI>
<a HREF="http://ftp.monash.edu.au/pub/nihongo/JMdict_e.gz">JMdict_e.gz </a>
- the JMdict file with only English glosses;
</LI>
<LI>
<a HREF="http://ftp.monash.edu.au/pub/nihongo/edict.gz">edict.gz </a>
- the "traditional" EDICT file;
</LI>
<LI>
<a HREF="http://ftp.monash.edu.au/pub/nihongo/edict2.gz">edict2.gz </a>
- the extended EDICT2 file.
</LI>
</UL>
<P>
<a name="IREF01a"><h2>PROJECT FORUM</h2></a>
</P>
<P>
There are several forums where this project is actively discussed.
</P>
<P>
The original forum was the
<TT> sci.lang.japan</TT>
<a HREF="http://groups.google.com/group/sci.lang.japan">Usenet newsgroup. </a>
More recently a
<a HREF="http://groups.yahoo.com/group/edict-jmdict/">mailing list </a>
specifically for project discussion has begun. (Mail to
<TT> edict-jmdict-subscribe@yahoogroups.com</TT>
to initiate subscription.)
</P>
<P>
<a name="IREF02"><h2>FORMAT</h2></a>
</P>
<P>
The basic format of the entries in the dictionary files can be seen in
detail by examining the
<a HREF="http://www.csse.monash.edu.au/~jwb/jmdict_dtd_h.html">DTD </a>
(Document Type Declaration) of the XML-format JMdict file. The DTD is
heavily annotated with content and structural information.
</P>
<P>
In summary, each dictionary entry is independent, although there may
be cross-reference fields pointing to other entries. Each entry consists of
</P>
<OL type="a">
<P>
</P>
<LI>kanji elements, i.e.
headwords containing at least one kanji character,
plus associated tags indicating some status or characteristic of the
headword. Where there are multiple headwords, they have been ordered
according to frequency of usage, as far as this can be determined;
<P>
</P>
</LI>
<LI>reading elements, containing either the reading in kana of the headword,
or the headword itself in the case of headwords only in kana. The elements
also include tags indicating some status or characteristics. As with the
kanji headwords, where there are multiple readings they have been ordered
according to frequency of usage, as far as this can be determined;
<P>
</P>
</LI>
<LI>general coded information relating to the entry as a whole, such as
original language, date-of-creation, etc.
<P>
</P>
</LI>
<LI>sense elements, containing the translational equivalents or glosses of
the headword(s). As Japanese is not highly polysemous, there is often only
one sense. Associated with the sense elements is other coded data indicating
the part-of-speech, field of application, miscellaneous information, etc.
As with headwords and readings, the glosses are ordered with the most common
appearing first.
</LI>
</OL>
<P>
The format and coding of the distributed files is as follows:
</P>
<OL type="a">
<P>
</P>
<LI>the JMdict file contains the complete dictionary information
in XML format as per the
<a HREF="http://www.csse.monash.edu.au/~jwb/jmdict_dtd_h.html">DTD. </a>
This file is in Unicode/ISO-10646 coding using UTF-8 encapsulation.
<P>
</P>
</LI>
<LI>the EDICT file is in a relatively simple format based on the text data
file of the SKK input-method.
Each entry is in the form:
<P>
</P>
<DL><DD>
KANJI [KANA] /(general information) gloss/gloss/.../
</DL>
<P>
or
</P>
<P>
</P>
<DL><DD>
KANA /(general information) gloss/gloss/.../
</DL>
<P>
Where there are multiple senses, these are indicated by (1), (2), etc.
before the first gloss in each sense. As this format only allows a single
kanji headword and reading, entries are generated for each possible
headword/reading combination. As the format restricts Japanese characters
to the kanji and kana fields, any cross-reference data and other
informational fields are omitted.
</P>
<P>
The EDICT file is distributed in JIS X 0208 coding in EUC-JP encapsulation;
</P>
<P>
</P>
</LI>
<LI>the EDICT2 file is in an expanded form of the original EDICT format.
The main differences are the inclusion of multiple kanji headwords and
readings, and the inclusion of cross-reference and other information
fields, e.g.:
<P>
</P>
<DL><DD>
KANJI-1;KANJI-2 [KANA-1;KANA-2] /(general information) (see xxxx) gloss/gloss/.../
</DL>
<P>
In addition, the EDICT2 file has as its last field the sequence number of the
entry. This matches the "ent_seq" element value in the XML edition. The
field has the format: EntLnnnnnnnnX. The "EntL" is a unique string to help
identify the field. The "X", if present, indicates that an audio clip
of the entry reading is available from the JapanesePod101.com site.
</P>
<P>
The EDICT2 file is distributed in JIS X 0208 and JIS X 0212 codings in EUC-JP
encapsulation;
</P>
<P>
</P>
</LI>
<LI>the EDICT_SUB file is in the same format as the EDICT file.
<P>
</P>
</LI>
</OL>
<P>
None of the files have the entries in any particular order.
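</P>
<P>
The EDICT and EDICT2 line layouts described above can be split into fields
with a short script. The following is only a minimal sketch in Python, not
the project's own tooling: the names <TT>parse_line</TT>, <TT>Entry</TT>,
<TT>ENTRY_RE</TT> and <TT>ENTL_RE</TT> are hypothetical, the sample lines and
the sequence number are invented for illustration, and it assumes that "/"
never occurs inside a gloss and that headwords contain no spaces.
</P>

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Headword field, optional "[reading]" field, then the "/"-delimited body.
ENTRY_RE = re.compile(
    r"^(?P<head>\S+)\s+(?:\[(?P<kana>[^\]]*)\]\s+)?/(?P<body>.*)/\s*$"
)
# EDICT2 sequence field: "EntL", digits, optional "X" (audio clip available).
ENTL_RE = re.compile(r"^EntL(?P<seq>\d+)(?P<audio>X?)$")


@dataclass
class Entry:
    kanji: List[str]        # kanji headwords (empty for kana-only entries)
    kana: List[str]         # readings, or the kana headwords themselves
    glosses: List[str]      # "/"-separated fields, markers left in place
    ent_seq: Optional[int]  # EDICT2 only; None when no EntL field is present
    has_audio: bool         # True when the EntL field ends in "X"


def parse_line(line: str) -> Entry:
    """Split one EDICT or EDICT2 entry line into its fields."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not an EDICT-style entry: {line!r}")
    heads = m.group("head").split(";")
    if m.group("kana") is None:
        kanji, kana = [], heads  # "KANA /gloss/.../" form
    else:
        kanji, kana = heads, m.group("kana").split(";")
    fields = [f for f in m.group("body").split("/") if f]
    ent_seq, has_audio = None, False
    if fields:
        tail = ENTL_RE.match(fields[-1])
        if tail:  # last field is the EDICT2 sequence number
            ent_seq = int(tail.group("seq"))
            has_audio = tail.group("audio") == "X"
            fields = fields[:-1]
    return Entry(kanji, kana, fields, ent_seq, has_audio)


# Invented sample lines; the sequence number is made up.
print(parse_line("犬 [いぬ] /(n) dog/EntL1234567X/"))
print(parse_line("いぬ /(n) dog/"))
```

<P>
Sense numbers such as (1) and (2), part-of-speech codes and "(see xxxx)"
cross-references are deliberately left inside the gloss strings, since
separating them needs the tag lists given under Lexicographical Details.
When reading the distributed EDICT or EDICT2 files, opening them with
Python's <TT>euc_jp</TT> codec should match the EUC-JP encapsulation
mentioned above.
</P>
<P>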
</P>
<P>
<a name="IREF03"><h2>PROJECT HISTORY</h2></a>
</P>
<P>
The project was begun in 1991 by the current editor
<a HREF="http://www.csse.monash.edu.au/~jwb/">(Jim Breen) </a>
when an early DOS-based Japanese word-processor
(MOKE - Mark's Own Kanji Editor) was released, containing an initial
small version of the EDICT file. This was progressively expanded and edited over
the following years. In 1999 the EDICT, which by this time contained
about 60,000 entries, was converted into an expanded format and the first
XML-format JMdict file released. The EDICT2 format was created in 2003,
primarily for use with the
<a HREF="http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwjdic.cgi?1C">WWWJDIC </a>
dictionary server.
</P>
<P>
The growth in entries in the file is largely due to the efforts of Jim and the
many people who contributed entries to it over the years. The increase in entry
numbers has slowed as the file has achieved coverage of a large proportion
of the Japanese lexicon. Much of the editorial work in recent years has
concentrated on amendments and expansion to existing entries.
</P>
<P>
A more expanded explanation of the early developments in the EDICT file
can be found in the
<a HREF="http://www.csse.monash.edu.au/~jwb/edict_doc_old.html">original documentation. </a>
</P>
<P>
<a name="IREF04"><h2>COPYRIGHT</h2></a>
</P>
<P>
Dictionary copyright is a difficult point, because clearly the first
lexicographer who published "inu means dog" could not claim a copyright
violation over all subsequent Japanese dictionaries. While it is usual to
consult other dictionaries for "accurate lexicographic information", as
Nelson put it, wholesale copying is, of course, not permissible, and
contributors have been advised to avoid direct copying from other sources.
What makes
each dictionary unique (and copyright-able) is the particular selection of
words, the phrasing of the meanings, the presentation of the contents (a very
important point in the case of this project), and the means of publication.
</P>
<P>
The files of the project are copyright, and distributed in accordance with the
Licence Statement, which can be found at the WWW site of the
<a HREF="http://www.edrdg.org/">Electronic Dictionary Research and Development Group </a>
who are the current owners of the copyright. As explained in the licence, the
files are available for use for most purposes provided acknowledgement
and distribution of the documentation is made.
</P>
<P>
<a name="IREF05"><h2>LEXICOGRAPHICAL DETAILS</h2></a>
</P>
<P>
</P>
<OL type="A">
<LI>Inflections, etc.
<P>
In general no inflections of verbs or adjectives have been included,
except in idiomatic expressions. Adverbs
formed from adjectives (e.g., -ku or -ni) are generally not included.
Verbs are, of course, in the plain or "dictionary" form.
</P>
<P>
Composed forms, such as adverbs taking the "to" particle, keiyoudoushi
adjectives, etc. are only included in their root form; however, the
part-of-speech (POS) marker is used to indicate their status.
</P>
<P>
Nouns which can form a verb with the auxiliary verb "suru" only appear
in their noun form, but have a POS marker, "vs", to indicate the existence
of a verbal form. In general the gloss only relates to the noun itself, but
entries are being progressively expanded to include the verbal glosses as well.
</P>
<P>
</P>
</LI>
<LI>Part of Speech Marking
<P>
The following POS markings are currently used:
</P>
<PRE>
adj-i    adjective (keiyoushi)
adj-na   adjectival nouns or quasi-adjectives (keiyoudoushi)
adj-no   nouns which may take the genitive case particle `no'
adj-pn   pre-noun adjectival (rentaishi)
adj-t    `taru' adjective
adj-f    noun or verb acting prenominally (other than the above)
adj      former adjective classification (being removed)
adv      adverb (fukushi)
adv-n    adverbial noun
adv-to   adverb taking the `to' particle
aux      auxiliary
aux-v    auxiliary verb
aux-adj  auxiliary adjective
conj     conjunction
ctr      counter
exp      Expressions (phrases, clauses, etc.)
id       idiomatic expression
int      interjection (kandoushi)
iv       irregular verb
n        noun (common) (futsuumeishi)
n-adv    adverbial noun (fukushitekimeishi)
n-pref   noun, used as a prefix
n-suf    noun, used as a suffix
n-t      noun (temporal) (jisoumeishi)
num      numeric
pn       pronoun
pref     prefix
prt      particle
suf      suffix
v1       Ichidan verb
v5       Godan verb (not completely classified)
v5aru    Godan verb - -aru special class
v5b      Godan verb with `bu' ending
v5g      Godan verb with `gu' ending
v5k      Godan verb with `ku' ending
v5k-s    Godan verb - iku/yuku special class
v5m      Godan verb with `mu' ending
v5n      Godan verb with `nu' ending
v5r      Godan verb with `ru' ending
v5r-i    Godan verb with `ru' ending (irregular verb)
v5s      Godan verb with `su' ending
v5t      Godan verb with `tsu' ending
v5u      Godan verb with `u' ending
v5u-s    Godan verb with `u' ending (special class)
v5uru    Godan verb - uru old class verb (old form of Eru)
v5z      Godan verb with `zu' ending
vz       Ichidan verb - zuru verb - (alternative form of -jiru verbs)
vi       intransitive verb
vk       kuru verb - special class
vn       irregular nu verb
vs       noun or participle which takes the aux. verb suru
vs-i     suru verb - irregular
vs-s     suru verb - special class
vt       transitive verb
</PRE>
<P>
</P>
</LI>
<LI>Field of Application
<P>
A number of entries are marked with a specific field of application.
Current fields and tags are:
</P>
<P>
</P>
<PRE>
Buddh    Buddhist term
MA       martial arts term
comp     computer terminology
food     food term
geom     geometry term
gram     grammatical term
ling     linguistics terminology
math     mathematics
mil      military
physics  physics terminology
</PRE>
<P>
</P>
</LI>
<LI>Miscellaneous Markings
<P>
</P>
<PRE>
X        rude or X-rated term
abbr     abbreviation
arch     archaism
ateji    ateji (phonetic) reading
chn      children's language
col      colloquialism
derog    derogatory term
eK       exclusively kanji
ek       exclusively kana
fam      familiar language
fem      female term or language
gikun    gikun (meaning) reading
hon      honorific or respectful (sonkeigo) language
hum      humble (kenjougo) language
iK       word containing irregular kanji usage
id       idiomatic expression
io       irregular okurigana usage
m-sl     manga slang
male     male term or language
male-sl  male slang
ng       neuter gender
oK       word containing out-dated kanji
obs      obsolete term
obsc     obscure term
ok       out-dated or obsolete kana usage
on-mim   onomatopoeic or mimetic word
poet     poetical term
pol      polite (teineigo) language
rare     rare (now replaced by "obsc")
sens     sensitive word
sl       slang
uK       word usually written using kanji alone
uk       word usually written using kana alone
vulg     vulgar expression or word
</PRE>
<P>
</P>
</LI>
<LI>Word Priority Marking
<P>
The ke_pri and equivalent re_pri fields in the JMdict file
are provided to record
information about the
relative commonness or priority of the entry, and consist
of codes indicating that the word appears in various references, which
can be taken as an indication of the frequency with which the word
is used. This field is intended for use either by applications which
want to concentrate on entries of a particular priority, or to
generate subset files.
The current values in this field are:
</P>
<OL type="a">
<LI>news1/2: appears in the "wordfreq" file compiled by Alexandre Girardi
from the Mainichi Shimbun. (See the Monash ftp archive for a copy.)
Words in the first 12,000 in that file are marked "news1" and words
in the second 12,000 are marked "news2".
</LI>
<LI>ichi1/2: appears in the "Ichimango goi bunruishuu", Senmon Kyouiku
Publishing, Tokyo, 1998. (The entries marked "ichi2" were
demoted from ichi1 because they were observed to have low
frequencies in the WWW and newspapers.)
</LI>
<LI>spec1 and spec2: a small number of words use this marker when they
are detected as being common, but are not included in other lists.
</LI>
<LI>gai1/2: common loanwords, also based on the wordfreq file.
</LI>
<LI>nfxx: this is an indicator of frequency-of-use ranking in the
wordfreq file. "xx" is the number of the set of 500 words in which
the entry can be found, with "01" assigned to the first 500, "02"
to the second, and so on.
</LI>
</OL>
<P>
Entries with news1, ichi1, spec1 and gai1 values are marked with
a "(P)" in the EDICT and EDICT2 files.
</P>
<P>
While the priority markings accurately reflect the status of entries with
regard to the various sources, they must be seen as
only providing a crude indication of how common a word or expression actually
is in Japanese.
The "(P)" markings in the EDICT and EDICT2 files appear to
identify a useful subset of "common" words, but there are clearly some
marked entries which are not very common, and there are clearly unmarked
entries which are in common use, particularly in the spoken language.
</P>
<P>
</P>
</LI>
<LI>Okurigana Variants
<P>
Okurigana variants in headwords are handled by including each variant form
as a headword. This is to enable software to match with variant forms.
</P>
<P>
</P>
</LI>
<LI>Spellings
<P>
As far as possible variants of English translation and spelling are included.
Where appropriate different translations are included for
national variants (e.g. autumn/fall, tap/faucet, etc.). Common spelling
variations such as -our/-or and -ize/-ise are handled either by repeating
the gloss in both spellings or appending spelling variants in parentheses.
No attempt is made to tag English spellings according to country of usage.
</P>
<P>
</P>
</LI>
<LI>Gairaigo and Regional Words
<P>
For gairaigo which have not been derived from English words,
the source language and the word in that language are included. Languages have
been coded in the three-letter codes from the ISO 639-2:1998 "Codes for the
representation of names of languages" standard, e.g. "(fre: avec)" in the
EDICT/EDICT2 files and
&lt;lsource xml:lang="fre"&gt;avec&lt;/lsource&gt; in the JMdict
file.
In the case of gairaigo which have a meaning which is not apparent from the
original (usually English) words, the words in the source language are
included as: (trans: original words).
</P>
<P>
A number of tags
are used to indicate that a word or phrase is associated with a particular
regional language variant within Japan.
The tags are:
</P>
<P>
</P>
<PRE>
kyb      Kyoto-ben
osb      Osaka-ben
ksb      Kansai-ben
ktb      Kantou-ben
tsb      Tosa-ben
thb      Touhoku-ben
tsug     Tsugaru-ben
kyu      Kyuushuu-ben
rkb      Ryuukyuu-ben
</PRE>
</LI>
</OL>
<P>
<a name="IREF06"><h2>OTHER LANGUAGES</h2></a>
</P>
<P>
The JMdict file has the capacity to record glosses for Japanese headwords in
many languages. Although not maintained as part of the current project, the
full JMdict file includes glosses for a large number of entries in French
and German, and a smaller number of entries in Russian and Dutch.
</P>
<P>
The sources for the main non-English glosses are:
</P>
<UL>
<LI>the French material (58,000 entries) comes from two sources:
<UL>
<LI>a 17,000 entry Japanese-French dictionary file from the
<a HREF="http://dico.fj.free.fr/">Dictionnaire français-japonais </a>
project being undertaken by Jean-Marc Desperrier. As Jean-Marc says on
that page, his project's aim "est de traduire en français une partie du
dictionnaire japonais-anglais Edict de Jim Breen" (that is, to translate
part of Jim Breen's Japanese-English EDICT dictionary into French). His project is
continuing and is being supported by a number of French speakers;
</LI>
<LI>about 41,000 entries from a dictionary compiled by
<a HREF="http://francais.sourceforge.jp/">le projet francais pour francophone</a>.
This file, which also appears to be based around
translating the EDICT file, has been made by selecting the entries not
already in Jean-Marc's file, and converting them to EDICT format.
These entries have "JF2" at the end of them.
<I>(There is some evidence that this file may</I>
<I>use translations generated by an online resource such as Babelfish.)</I>
</LI>
</UL>
</LI>
<LI>the German material (79,000 entries) is from the
<a HREF="http://wadoku.de/">WaDokuJT </a>
Japanese-German dictionary file compiled by Ulrich Apel;
</LI>
<LI>a small Japanese-Russian
dictionary file being compiled by Oleg Volkov.
<a HREF="http://ftp.monash.edu.au/pub/nihongo/jr-edict.doc.rus.win1251.txt">(documentation </a>
- in Russian)
</LI>
</UL>
<P>
<a name="IREF08"><h2>CONTRIBUTIONS</h2></a>
</P>
<P>
Contribution of new entries and amendments to existing entries is most
welcome. A special
<a HREF="http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwnewword.cgi">WWW page </a>
is available for this purpose.
</P>
<P>
<a name="IREF08a"><h2>RELATED PROJECTS</h2></a>
</P>
<P>
A number of other Japanese dictionary projects are closely related to this
one. Among them are:
</P>
<OL type="a">
<P>
</P>
<LI>the
<a HREF="http://www.csse.monash.edu.au/~jwb/enamdict_doc.html">ENAMDICT/JMnedict </a>
Japanese Proper Names Dictionary project, which currently has nearly
600,000 named entities. The files are available in EDICT or XML formats.
<P>
</P>
</LI>
<LI>the
<a HREF="http://www.csse.monash.edu.au/~jwb/kanjidic.html">KANJIDIC </a>
and
<a HREF="http://www.csse.monash.edu.au/~jwb/kanjidic2/index.html">KANJIDIC2 </a>
project, which maintains and distributes databases of information about
kanji.
<P>
</P>
</LI>
<LI>the
<a HREF="http://www.csse.monash.edu.au/~jwb/compdic_doc.html">COMPDIC </a>
file in EDICT format of computing and telecomms terminology.
In 2008 the
COMPDIC material was included in the main EDICT/JMdict database with tagging
indicating that the entries relate to ICT. A separate "COMPDIC" file is extracted
for distribution.
<P>
</P>
</LI>
<LI>the
<a HREF="http://www.csse.monash.edu.au/~jwb/kradinf.html">RADKFILE/KRADFILE </a>
file of visual elements in kanji, which can be used for finding kanji
in dictionaries.
<P>
</P>
</LI>
</OL>
<P>
<a name="IREF09"><h2>ACKNOWLEDGEMENTS</h2></a>
</P>
<P>
Since 1991 a large number of people have contributed to this project; far too
many to list here. All their contributions have been most welcome; indeed,
without the assistance of speakers and students of Japanese this
project would not have achieved as much.
</P>
<P>
The EDICT/JMdict has been granted approval to use material from the
<a HREF="http://nlpwww.nict.go.jp/wn-ja/index.en.html">Japanese WordNet. </a>
This approval is most welcome. (See the Japanese WordNet
<a HREF="http://nlpwww.nict.go.jp/wn-ja/license.txt">licence.) </a>
</P>
<P>
<a name="IREF10"><h2>PUBLICATIONS</h2></a>
</P>
<P>
Some publications by Jim Breen about the EDICT/JMdict project:
</P>
<UL>
<LI>an early technical report from 1993;
<a HREF="ftp://ftp.monash.edu.au/pub/nihongo/ejdic_report1.ps.gz">(postscript) </a>
</LI>
<LI>an overview paper from 1995;
<a HREF="jsaa_paper/hpaper.html">(html) </a>
<a HREF="ftp://ftp.monash.edu.au/pub/nihongo/elec_dic.ps.gz">(postscript) </a>
</LI>
<LI>a 1999 conference paper about WWWJDIC;
<a HREF="ftp://ftp.monash.edu.au/pub/nihongo/www-jdict.ps.gz">(postscript) </a>
<a HREF="ftp://ftp.monash.edu.au/pub/nihongo/www-jdict.pdf">(pdf) </a>
<a HREF="wwwjdic_article/wwwjdic_article.html">(html).
</a>
</LI>
<LI>a
<a HREF="http://www.csse.monash.edu.au/~jwb/jmdictart.html">paper </a>
about JMdict presented at the COLING Multilingual
Linguistic Resources Workshop in Geneva in August 2004.
</LI>
<LI>an earlier
<a HREF="http://www.csse.monash.edu.au/~jwb/ws2002_paper.html">JMdict paper </a>
about some of the practical issues, presented at the Papillon
Project workshop in Tokyo in July 2002.
</LI>
</UL>
</BODY></HTML>