# Parser

## Intro

This is a bison-generated pure LALR parser, based on the MRI parser. It has
been designed to be small, simple and yet powerful. Its code resides in four
files: parser.y, node.h, node.c and main.c.

The main.c file exists only for testing purposes (see the section
"Testing and Debugging the Parser"). More important are node.h and node.c
which, in short, define what a node is and what we (and the parser) can do
with nodes. Finally, parser.y is the grammar of the parser. This is the
parser in its most basic shape.

To see what this parser really looks like, we have to generate three files:
parser\_gen.h, parser\_gen.c and hash.c. The parser\_gen.{h, c} files are
generated by bison, taking the grammar file as its input; cmake runs this step
as part of the build. The last (but not least) file to generate is hash.c,
which is generated by gperf, taking the file tools/gperf.txt as its input. It
contains a hash table that the parser uses to match keywords quickly. To
generate hash.c, use the tools/gperf.rb script.

## Testing and Debugging the Parser

All the info on testing the parser can be found [here](http://techbase.kde.org/Projects/KDevelop4/Ruby#Testing).

What do all those integers in the output represent? In short, they are the
representation of an AST printed in pre-order. As you will see, the parser
tries to beautify this output by telling you whether an expression is a
condition inside of, for example, a for statement, and it outputs "Root" and
"Next" if there is a list of inner statements. Moreover, the parser sometimes
outputs names in parentheses. Those names are variables, the name of a
function, a class, etc. Sadly, the output is sometimes scary and a complete mess.
In those cases, experience and patience will be our friends ;)

## Character encodings

As stated before, this parser is meant to be simple and small. This means
that, for now, we only support the UTF-8 encoding. This doesn't mean that
other encodings will never be supported by this parser; it's just that
the developers haven't had enough time to write the code.
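
Since only UTF-8 input is supported, a caller may want to check a buffer
before handing it to the parser. The following is a minimal sketch of such a
check, not code taken from this parser: the function name `is_valid_utf8` is
hypothetical, and nothing here implies that the parser performs this
validation itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the parser's API): returns true if the
 * `len` bytes starting at `buf` are well-formed UTF-8.
 */
bool is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        unsigned char c = buf[i];
        size_t extra;     /* continuation bytes expected after the lead byte */
        unsigned long cp; /* code point decoded so far */

        if (c < 0x80) {                  /* 0xxxxxxx: plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) { /* 110xxxxx: 2-byte sequence */
            extra = 1;
            cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) { /* 1110xxxx: 3-byte sequence */
            extra = 2;
            cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) { /* 11110xxx: 4-byte sequence */
            extra = 3;
            cp = c & 0x07;
        } else {
            return false; /* stray continuation byte or invalid lead byte */
        }

        if (len - i <= extra)
            return false; /* sequence is cut off by the end of the buffer */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return false; /* not a 10xxxxxx continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        /* Reject overlong forms, UTF-16 surrogates and values past U+10FFFF. */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000) ||
            (cp >= 0xD800 && cp <= 0xDFFF) ||
            cp > 0x10FFFF)
            return false;

        i += extra + 1;
    }
    return true;
}
```

With a check like this, a file saved as Latin-1 (where, for example, "é" is the
single byte 0xE9) would be rejected up front instead of producing confusing
parser output.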