Warning, /kdevelop/kdevelop/kdevplatform/language/duchain/Mainpage.dox is written in an unsupported language. File is not indexed.
0001 /*! 0002 * @mainpage Definition-Use Chain and Type System 0003 * 0004 * Overview | \ref duchain-design "Design" | \ref Implementing "Implementing" | \ref Using "Using" 0005 * 0006 * The definition-use chain and type system provide a language-neutral 0007 * representation of source code structure, used to provide language-based 0008 * features to all implemented languages in a generic manner. 0009 * 0010 * An introduction to the duchain can be found in the \ref duchain-design document. 0011 * 0012 * Details about how to provide a duchain and type system for your favourite 0013 * language can be found here: \ref Implementing. 0014 * 0015 * @licenses 0016 * @lgpl 0017 * 0018 * For questions and discussions about editor either contact the author 0019 * or the <a href="mailto:kdevelop-devel@kde.org">kdevelop-devel@kde.org</a> 0020 * mailing list. 0021 */ 0022 0023 /** \page duchain-design Definition-Use Chain Design 0024 0025 * @ref index "Overview" | Design | \ref Implementing "Implementing" | \ref Using "Using" 0026 0027 \section overview Overview 0028 0029 The duchain is a sequence of contexts in a code file, and the associated definitions which occur in those contexts. A simplified way of thinking about it is that for each set of brackets (curly {} or not ()), there is a separate context. Each context is represented by a \ref KDevelop::DUContext. Each context will have one parent context (except in the case of the top level context which has none), and any number of child contexts (including none). Additionally, each context can import any number of other contexts. The reason for this will become clear later. Thus, the \ref KDevelop::DUContext structure resembles a directed acyclic graph, for those familiar with the concept. 0030 0031 0032 \section parsing Parsing 0033 0034 These \ref KDevelop::DUContext "DUContexts" are created on the first pass after parsing the code to an AST (abstract syntax tree). Also, in this stage the data types are parsed, and any declarations which are encountered are recorded against the context in which they are encountered in. Each declaration is represented by a Declaration. 0035 0036 Parsing code is arranged into builder classes, which subclass the AST visitor pattern. They are designed to be able to subclass each other, thus achieving multiple goals with each pass (as described in the above paragraph). 0037 0038 For most languages, the first pass is accomplished by the \ref KDevelop::AbstractContextBuilder "AbstractContextBuilder", \ref KDevelop::AbstractTypeBuilder "AbstractTypeBuilder", and \ref KDevelop::AbstractDeclarationBuilder "AbstractDeclarationBuilder". The customised builder class is a subclass of each of these classes. Thus, in the first pass, the \ref KDevelop::AbstractContextBuilder "AbstractContextBuilder" creates the \ref KDevelop::DUContext "DUContext" tree, the \ref KDevelop::AbstractTypeBuilder "AbstractTypeBuilder" records which \ref KDevelop::AbstractType "types" are encountered, and the \ref KDevelop::AbstractDeclarationBuilder "AbstractDeclarationBuilder" creates \ref KDevelop::Declaration "Declaration" instances which are associated with the current type and context. 0039 0040 The second pass is the creation of uses, accomplished a subclass of both the \ref KDevelop::AbstractContextBuilder and the \ref KDevelop::AbstractUseBuilder. On the second pass, we only iterate previously parsed contexts (as they are already created). Then, as variable uses are encountered, a \ref KDevelop::Use is created for each. A \ref KDevelop::Declaration is searched for in the current context, and if one is found, they are associated with each other. 0041 0042 0043 \section classes Classes and their purposes 0044 0045 \li \ref KDevelop::DUChain - a global object which keeps track of all loaded source files and the top level context of their definition-use chains. 0046 0047 \li \ref KDevelop::DUContext - an object which represents a single context in a source file, and stores information about parent and child \ref KDevelop::DUContext "DUContexts", and \ref KDevelop::Declaration "Declarations", \ref KDevelop::Definition "Definitions" and \ref KDevelop::Use "Uses" which occur in them. Also provides convenience methods for searching the chain. 0048 0049 \li \ref KDevelop::Declaration - an object which represents a single declaration. Has several subclasses which store more information specific to the type of declaration which is being represented. 0050 0051 \li \ref KDevelop::Use - an object which represents a use of a particular declaration. 0052 0053 \li \ref KDevelop::PersistentSymbolTable - a hash which stores identifiers available in the top level context of a source file and their respective \ref KDevelop::Declaration "Declarations". 0054 0055 \li KDevelop::*Builder - objects whose purpose is to iterate the parsed AST and produce instances of the duchain objects. 0056 0057 \li \ref KDevelop::AbstractType - the base class for types. 0058 0059 0060 \section searching-duchain Definition-use chain searching 0061 0062 Because iterating a complete definition-use chain can become expensive when they are large, when a search is being performed (eg. for a declaration corresponding to a certain identifier) it is first performed up to the top level context, then the symbol table is consulted. The symbol table is a hash of all identifiers which are known to the entire duchain. All potential matches are evaluated to see if they are visible from the location of the use. 0063 0064 0065 \section locking Locking 0066 0067 The duchain is designed to operate in a multithreaded environment. This means that multiple parse jobs may be operating simultaneously, reading from and writing to the duchain. Thus, locking is required. 0068 0069 A single read-write lock is used to serialise writes to the chain and allow concurrent reads. Thus, to call non-const methods, you must hold a write lock, and for const methods, a read lock. Customised read/write lockers have been created, called DUChainWriteLocker and DUChainReadLocker. You must not request a write lock while holding a read lock, or you could cause a deadlock. 0070 0071 Also, when manipulating text editor ranges, the \ref KTextEditor::SmartInterface must be locked. \warning You must <em>never</em> attempt to acquire the duchain read or write lock when holding the smart lock, else you may cause a deadlock. See code in \ref KDevelop::AbstractContextBuilder::openContextInternal and \ref KDevelop::DUChainBase. 0072 0073 0074 \section plugin-interface Interface for plugins 0075 0076 As plugins will be accessing the \ref KDevelop::DUChain from the main thread, they will need to hold a read lock. 0077 0078 0079 \section text-editor-integration Text editor integration 0080 0081 The main classes are subclasses of a base class, \ref KDevelop::DUChainBase. This object holds a reference to the text range. When the source file is opened in an editor, the \ref KDevelop::EditorIntegrator will create smart text ranges, which are bound to the editor's copy of the document. From there, highlighting can be applied to these ranges, as well as other advanced functions (see the \ref KTextEditor documentation for possibilities). The language support will convert these ranges to smart ranges when the corresponding document is loaded into an editor. 0082 0083 0084 \section future Future features - ideas 0085 0086 The completed duchain should allow for code refactoring, intelligent navigation, improved automatic code generation (eg. "create switch statement"), context-sensitive code completion, integration of documentation, debugger integration, a code structure view, call graph, static code analysis etc. 0087 0088 */ 0089 0090 /** 0091 * \page Implementing Implementing Definition-Use Chains for a specific language 0092 * 0093 * \ref index "Overview" | \ref duchain-design "Design" | Implementing | \ref Using "Using" 0094 * 0095 * \section create Creating the Definition-Use Chain 0096 * 0097 * To create a definition-use chain for a programming language, you need the following: 0098 * \li a parser for the language, 0099 * \li a context builder, 0100 * \li a type builder, 0101 * \li a declaration builder, 0102 * \li and a use builder. 0103 * 0104 * Once you have everything up to the declaration builder, your language's classes, functions etc. 0105 * will automatically appear in the class browser, and be able to perform limited refactoring. 0106 * 0107 * Once you have the use builder, you will automatically have full support for context browsing. 0108 * 0109 * Code completion support requires further work specific to your language, see \ref cc 0110 * 0111 * \subsection parser Parser 0112 * Parsers in %KDevelop can be created in any way as long as they produce an AST (abstract 0113 * syntax tree). Most supported languages have parsers generated by kdevelop-pg-qt. 0114 * This is a LL parser generator, and allows you to specify the grammar from which the 0115 * parser and AST are generated. The parser will also need a lexer, common solutions are to 0116 * use flex to create one for you, or to create one by hand. 0117 * 0118 * \subsection builders DUChain Builders 0119 * 0120 * The abstract builder classes (detailed below) provide convenience functions for creating a 0121 * definition-use chain. They are template classes which require 2 or 3 class 0122 * types: 0123 * - T: your base AST node type 0124 * - NameT: your identifier AST node type, if you have only one, or your base AST node type 0125 * if more than one exist 0126 * - Base class: your base class, eg. for your use builder, you will usually supply your custom 0127 * context builder here. 0128 * 0129 * \subsection context Context Builder 0130 * By subclassing \ref KDevelop::AbstractContextBuilder "AbstractContextBuilder", you will have everything you need to 0131 * keep track of contexts as you iterate the AST. When a new context is encountered, such 0132 * as a new block (eg. between {} brackets), create a new context with KDevelop::AbstractContextBuilder::openContext(), 0133 * and close it with KDevelop::AbstractContextBuilder::closeContext(). 0134 * 0135 * Some languages do not need a context to be created for each block, for example languages 0136 * where declarations are visible after the block in which they were defined (eg. php). 0137 * 0138 * \subsection type Type Builder 0139 * By subclassing \ref KDevelop::AbstractTypeBuilder "AbstractTypeBuilder", you can create types 0140 * when one is encountered in your AST by calling openType(). Again, you need to closeType() 0141 * when the type is exited. Complex types are built up this way by creating the type at each node, 0142 * ie. with int[], first an array type is opened, then an integral type representing an integer 0143 * is opened and closed, then when the array type is closed, you can retrieve the lastType() 0144 * and set that as the type which is being made into an array. 0145 * 0146 * \subsection declaration Declaration Builder 0147 * By subclassing \ref KDevelop::AbstractDeclarationBuilder "AbstractDeclarationBuilder", you can create 0148 * declarations when they are encountered in your AST. Usually you will assign the lastType() 0149 * or currentType() to them within closeDeclaration(). 0150 * 0151 * \subsection use Use Builder 0152 * By subclassing \ref KDevelop::AbstractUseBuilder "AbstractUseBuilder", you can create uses when they are encountered 0153 * in your AST, and they will be automatically registered with the current context. 0154 * 0155 * \subsection expr Expression Visitor 0156 * By subclassing \ref KDevelop::AbstractExpressionVisitor "AbstractExpressionVisitor", you can provide a convenient 0157 * way to request the type and/or instance information for a given expression. This is useful in both 0158 * figuring out what code completion to offer when it is invoked after or inside an expression, and to 0159 * allow complete use building, ie. for uses within expressions. In c++, this is a very complex matter, 0160 * involving type conversion and overload resolution, while still tracking template types. 0161 * 0162 * \section cc Implementing Code Completion 0163 * 0164 * To provide code completion for your language, you will need to implement the following: 0165 * \todo complete this section 0166 */ 0167 0168 /** 0169 * \page Using Using already created Definition-Use Chains in plugins 0170 * 0171 * \ref index "Overview" | \ref duchain-design "Design" | \ref Implementing "Implementing" | Using 0172 * 0173 * \section intro Introduction 0174 * This section is designed for developers who want to use definition-use chains, for example to provide 0175 * code generation, refactoring, or other advanced language-specific functionality. First some important 0176 * fundamentals of using the duchain classes will be covered. 0177 * 0178 * \subsection pointers Definition-use chain pointers and references 0179 * As the definition-chain is a dynamic entity, safe pointers (DU*Pointer) and indirect references (Indexed*) 0180 * are required to reference objects in a thread-safe way, and in a way that allows minimisation of memory use by saving 0181 * non-referenced chains to disk. While you do not hold the KDevelop::DUChain::lock(), 0182 * these pointers and references should not be accessed, because the objects they will return may be 0183 * modified by other threads. 0184 * 0185 * The KDevelop::DUChain::lock() is a read-write lock, which means that if you don't intend to 0186 * change the chain (which you won't, unless you are a language plugin developer), you only need a read-lock. 0187 * This has the advantage of allowing multiple threads to safely read from the chain simultaneously. 0188 * The easiest way to acquire this lock is to use KDevelop::DUChainReadLocker: 0189 * \code 0190 * KDevelop::TopDUContextPointer topContext; 0191 * 0192 * // Retrieve the top context for myUrl (see explanation below) 0193 * topContext = KDevelop::DUChainUtils::standardContextForUrl( myUrl ); 0194 * 0195 * // Lock the duchain for reading 0196 * KDevelop::DUChainReadLocker readLock( KDevelop::DUChain::lock() ); 0197 * 0198 * // Check if the top context pointer is valid 0199 * if ( topContext ) { 0200 * ... 0201 * } 0202 * \endcode 0203 * Before accessing the top context, this code will block until a read-only lock has been acquired. 0204 * It is then safe to access const functions of all duchain objects. The lock will continue to be held 0205 * until readLock goes out of scope. 0206 * 0207 * \note It is safe to recursively acquire a read-lock (or a write-lock), but not safe to request a write lock 0208 * once a read lock is held (this may result in a deadlock). 0209 * \note You must not attempt to acquire the duchain lock when you already hold the smart lock (this may result in a deadlock). 0210 * 0211 * In debug builds, if you attempt to access something in the duchain which you do not hold the proper lock 0212 * for, you will encounter an assert (usually triggered by the ENSURE_CHAIN_READ_LOCKED or ENSURE_CHAIN_WRITE_LOCKED macros). 0213 * 0214 * For more information about duchain pointers, see KDevelop::DUChainPointer. 0215 * 0216 * \section searching-declarations Searching for declarations 0217 * If you have a qualified identifier and want to retrieve declaration(s) which share the 0218 * identifier, you can do this by using the symbol table (KDevelop::PersistentSymbolTable): 0219 * 0220 * \code 0221 * KDevelop::DUChainReadLocker lock( KDevelop::DUChain::lock() ); 0222 * KDevelop::PersistentSymbolTable::Declarations declarations = KDevelop::PersistentSymbolTable::self().getDeclarations( qualifiedIdentifier ); 0223 * 0224 * for (PersistentSymbolTable::Declarations::Iterator it = declarations.iterator(); it; ++it) { 0225 * KDevelop::Declaration* decl = it->declaration(); 0226 * // Use the declaration... 0227 * } 0228 * \endcode 0229 * 0230 * \section accessing Accessing a definition-use chain 0231 * The first step in using a duchain is to retrieve the chain that you are interested in. 0232 * Presumably you will know the URL of the file for which you want to retrieve the chain. 0233 * Some languages (notably C and C++) can have several different chains for one file depending on 0234 * what the definitions of macros were when the files were parsed. Because of this, the recommended 0235 * way to access the duchain for a document is via KDevelop::DUChainUtils. 0236 * 0237 * \subsection accessing-topcontext Accessing top level contexts 0238 * \todo include a note on how to request loading of contexts from disk, and requesting parsing of files which 0239 * are not currently in the duchain. 0240 * 0241 * Top level contexts (TopDUContext) can be retrieved through KDevelop::DUChainUtils::standardContextForUrl(). 0242 * This is the context which is presented to the user when the file is opened (for highlighting, completion etc.). 0243 * In case it is not the context which you are after, all contexts for a file can be retrieved via 0244 * KDevelop::DUChain::chainsForDocument(). 0245 * 0246 * \subsection accessing-declaration Accessing declarations at a specific location 0247 * If you have a url and a cursor location, you can attempt to retrieve the declaration located at that position 0248 * with KDevelop::DUChainUtils::itemUnderCursor(). 0249 * 0250 * \section navigating Navigating a definition-use chain 0251 * \subsection navigating-duobject All duchain objects 0252 * All duchain objects inherit from KDevelop::DUChainBase. This is in turn a subclass of KDevelop::DocumentRangeObject. 0253 * Thus, you can retrieve the text range of every object via KDevelop::DocumentRangeObject::range(). If the document 0254 * is currently loaded in a text editor, it will likely have a smart range (KTextEditor::SmartRange), which tracks the position 0255 * of the range when the document is changed. This can be accessed via KDevelop::DocumentRangeObject::smartRange(). 0256 * 0257 * \subsection navigating-contexts Contexts 0258 * Now that you have a chain, you'll probably want to be able to navigate around it. You can iterate contexts 0259 * by using KDevelop::DUContext::childContexts(). You can then retrieve from each 0260 * context a list of local declarations with KDevelop::DUContext::localDeclarations(), and a list of all 0261 * uses in the context with KDevelop::DUContext::uses(). 0262 * 0263 * Imported contexts are usually contexts which have declarations which are visible in the current context. 0264 * For example: 0265 * \code 0266 * for (int i = 0; i < count(); ++i) { 0267 * qDebug() << i; 0268 * } 0269 * \endcode 0270 * The code which contains the debug statement will import the for conditions context, which contains the declaration 0271 * of i. Thus, i's declaration is visible to the debug statement. 0272 * 0273 * Usually, you will not have to worry about these details, as the search functions already take them into account. 0274 * If you want to find a declaration for a given identifier in a given context, you can use one of the 0275 * KDevelop::DUContext::findDeclarations() or KDevelop::DUContext::findLocalDeclarations() functions. 0276 * 0277 * \subsection navigating-declarations Declarations 0278 * Declarations always occur within a context, which can be accessed through KDevelop::Declaration::parentContext(). 0279 * Some declarations (eg. namespaces, classes) create a new context which can then contain child declarations, eg. 0280 * variables and functions within a class. For these declarations, the associated context which contains these 0281 * child declarations can be accessed through KDevelop::Declaration::internalContext(), if one exists. 0282 * 0283 * \subsection navigating-uses Uses 0284 * Uses are instances where a declaration is referenced in the code. All uses for a declaration can be calculated 0285 * from the duchain, although this can potentially be a time-consuming task. KDevelop::Declaration::uses() will return 0286 * all uses for a declaration, and KDevelop::Declaration::smartUses() will return smart ranges which represent all uses 0287 * in the currently opened documents. 0288 * 0289 * \subsection navigating-types Types 0290 * Declarations may have a type, which can be retrieved through KDevelop::Declaration::abstractType(). Types can then 0291 * be visited using KDevelop::TypeVisitor, or manually with the corresponding calls in the type subclasses. Types can be 0292 * compared for equality using KDevelop::AbstractType::equals(). 0293 * 0294 * \section efficiency DUChain efficiency issues 0295 * It was confirmed during the implementation of the DUChain that there is too much information to store the duchain for 0296 * an entire project in memory at the same time (KDevPlatform itself was >1Gb). Subsequently, saving the chains to disk 0297 * has been implemented. Following are some of the ramifications of this design. 0298 * 0299 * \subsection referenced-topcontexts Top Context Referencing 0300 * In order to determine which chains can be unloaded from memory, a referenced pointer was introduced called 0301 * KDevelop::ReferencedTopDUContext. If you are using duchain objects outside of a duchain lock, and you need them to 0302 * remain in memory, you should create a KDevelop::ReferencedTopDUContext for the top context of each of the chains you need. 0303 * This will ensure it is not unloaded. However, do not use this excessively or %KDevelop will have the same problem 0304 * of using large amounts of memory. 0305 * 0306 * \code 0307 * KDevelop::TopDUContextPointer topContext; 0308 * KDevelop::ReferencedTopDUContext topReferenced 0309 * 0310 * topContext = KDevelop::DUChainUtils::standardContextForUrl( myUrl ); 0311 * topReferenced = KDevelop::DUChainUtils::standardContextForUrl( myOtherUrl ); 0312 * 0313 * // Both of these pointers may be valid here 0314 * 0315 * // Sleep 0316 * sleep(10); 0317 * 0318 * // Lock the duchain for reading 0319 * KDevelop::DUChainReadLocker readLock( KDevelop::DUChain::lock() ); 0320 * 0321 * // topContext may not be valid any more, because it may have been saved to disk and unloaded from memory. 0322 * // topReferenced will still be valid if it was valid when it was retrieved (above). 0323 * \endcode 0324 * 0325 * \subsection code-model Code Model 0326 * In order to facilitate easy access to top level declarations, a list of top level declarations is available from 0327 * KDevelop::CodeModel. For each parsed file, you can call KDevelop::CodeModel::items() to retrieve a list of declarations 0328 * and some basics about their type. If you need any further information, the chain must be loaded from disk. 0329 * \code 0330 * uint count; 0331 * const CodeModelItem* items; 0332 * IndexedString file = \<yourFile\>; 0333 * 0334 * // Retrieve the items for the given file 0335 * KDevelop::CodeModel::self().items(file, count, items); 0336 * 0337 * for (int i = 0; i < count; ++i) { 0338 * CodeModelItem* thisItem = items++; 0339 * 0340 * // Use the item here. 0341 * ... 0342 * } 0343 * \endcode 0344 * 0345 * To access the declaration for each item, use KDevelop::PersistentSymbolTable::declarations(). 0346 * 0347 * \section inadequate When the duchain doesn't contain all the information 0348 * If you need more information than is available in the duchain, you're most likely looking at using the AST generated by the 0349 * language support. Note that this is obviously not language-independent, so it should be a last resort in cases where 0350 * the functionality being supplied is not language-specific. If the duchain is missing some information that would make 0351 * sense to add, please raise it with the %KDevelop developers. 0352 * 0353 * \todo add mechanism to get at the AST 0354 * \todo keep the AST in memory for loaded files 0355 */ 0356 0357 0358 // DOXYGEN_REFERENCES = language/editor 0359 // DOXYGEN_SET_WARN_LOGFILE=language/duchain/doxygen.log 0360 // DOXYGEN_SET_RECURSIVE=yes