Warning, /education/kstars/kstars/skycomponents/indexing.dox is written in an unsupported language. File is not indexed.
0001 /*! \page HTMIndexing HTM Indexing 0002 \tableofcontents 0003 Design Document for HTM Indexing in KStars 0004 \section Theory Theory 0005 A spatial index was introduced to drastically speed up both the 0006 drawing of objects on the screen and searching for objects based upon 0007 a position such as finding the object nearest to the cursor or finding 0008 out which constellation a point is in. Drawing to the screen is sped 0009 up because the calculation to determine if an object is on screen 0010 is very expensive. Without the index we have to perform this 0011 calculation for every object which is time consuming. The index 0012 speeds things up because it can give us a list of all the objects 0013 on screen (plus a few extras that are off screen). 0014 0015 We are currently using a Hierarchical Triangular Mesh (HTM) spatial 0016 indexing library written by A. Szalay, John Doug Reynolds, Jim Gray, 0017 and Peter Z. Kunszt. An HTM divides the surface of a sphere into a 0018 bunch of little triangles just like a geodesic dome. Each triangle is 0019 called a trixel (triangular pixel) and each trixel has a unique 0020 integer ID. 0021 0022 We index point objects (i.e. stars) by creating a data structure 0023 that can quickly give us a list of all the stars that are located 0024 inside any trixel. To draw the stars, we get a list of all trixels 0025 that are possibly visible on the current screen. The worst case is 0026 when the screen is zoomed out so all trixel on the visible half the 0027 celestial sphere need to be used. When the screen is zoomed in, the 0028 number of visible trixels is reduced by roughly the area of the 0029 celestial sphere that is visible. To draw the stars, we use the HTM 0030 to find all of the trixels that are possibly visible and for each such 0031 trixel we draw the list of stars that are in that trixel. 0032 0033 Extended objects, such as constellation lines are a bit more tricky. 0034 If we indexed them with a single pointer in a single trixel then parts 0035 of the object could be on screen and not drawn. So for extended 0036 objects we create a data structure similar to that used for the stars 0037 which allows us to quickly get the list of all objects in any trixel 0038 but since more than one trixel may be needed to cover an object, there 0039 can be more than one pointer to an object in the index. If we use the 0040 point object algorithm and simply draw every object in the list of 0041 every visible trixel than the same object can be drawn repeatedly if 0042 it is covered by more than one trixel and if more than one of its 0043 covering trixels is visible. This is not a rare event. It is 0044 a very common occurence and if left unchecked, it would destroy much of 0045 the efficiency we gained by implementing the index in the first place. 0046 0047 Fortunately, there is an easy solution to prevent any given object from 0048 being drawn more than once in any given draw cycle (a draw cycle is 0049 when we draw all the objects on the screen. If the zoom scale or the 0050 center point of the screen change, or if the celestial sphere appears 0051 to rotate due to the rotation of the earth then we need to initiate a 0052 new draw cycle to update the screen). We use a global integer value 0053 that gets incremented at the start of each draw cycle which we call 0054 the global drawID. Every indexed extended object has its own internal 0055 integer drawID. When we go through the lists of pointers to visible 0056 object we compare the drawID of each object with the global drawID. 0057 If they are the same, we don't draw the object. If they are different 0058 we draw the object and set the object's drawID to the value of the 0059 global drawID which ensures that the object won't get drawn again in 0060 the same draw cycle. Since the global drawID is incremented at the 0061 beginning of every draw cycle, setting an object's drawID to the 0062 current global drawID will not prevent the object from being drawn in 0063 future draw cycles. 0064 0065 \section Implementation Implementation 0066 HTM Library Interface 0067 0068 + HTMesh 0069 - SkyMesh 0070 0071 The HTMesh class is an addition to the HTM library that provides a 0072 simple interface to that library by exposing only those parts that we 0073 need and hiding the rest. In addition, it contains some simple error 0074 checking for indexing polygons and it also contains a routine for 0075 indexing line segments (by creating a very narrow triangle that covers 0076 the line segment). 0077 0078 The SkyMesh class is a subclass of HTMesh. It mingles the HTM with 0079 QT objects and KStars objects. It has two purposes. The first is 0080 to provide a more convenient, higher level interface to the HTM 0081 library. For example to find the trixel containing a SkyPoint *p we 0082 can call: 0083 0084 skyMesh->index( p ) 0085 0086 instead of: 0087 0088 HTMesh->index( p->ra()->Degrees(), p->dec()->Degrees() ); 0089 0090 SkyMesh also has some real code in it that is used to find all of the 0091 trixels that cover a polygon with an arbitrary number of vertices and 0092 to find all of the trixels that cover a connected series of line 0093 segments (such as those that make up the outline of the Milky Way or 0094 the lines that make up a constellation). These routines work by 0095 calling the more primitive indexing routines and then or'ing together 0096 the resulting lists of indices. The or'ing is needed so that a 0097 pointer to a given object will show up at most once in the list of 0098 objects for a given pixel. The drawID would prevent these extra 0099 entries from causing the same object from being drawn more than once 0100 but the extra entries would still be a waste of both time and space. 0101 0102 The or'ing is done with a QHash that uses the trixel ID's as keys. 0103 When the keys of the QHash are read out, each unique trixel ID will 0104 only be read out once even if that ID was inserted multiple times. 0105 Since that QHash was needed to create a unique list of keys, it is 0106 also used as the output container of the indexing routines from which 0107 the caller reads the ID's. In the current implementation there is 0108 only one QHash and the line/poly indexing routines are always 0109 returning the address of the same QHash over and over. It might have 0110 looked better if we had copied the keys from the QHash into a QList or 0111 a QVector but that extra copying would have been wasteful. 0112 0113 \subsection TopLevel Top Level 0114 Most/all of the instances that use the HTM index are created from 0115 within the SkyMapComposite class. The SkyMesh parameter is currently 0116 explicit so you can quickly see which classes/instances use the index 0117 by looking that the list of constructors inside the SkyMapComposite 0118 constructor. SkyMapComposite is also where (for our purposes) the 0119 draw() calls get initiated. At the top of SkyMapComposite::draw() 0120 there is the code: 0121 0122 float radius = ks->map()->fov(); 0123 if ( radius > 90.0 ) radius = 90.0; 0124 0125 SkyPoint* focus = ks->map()->focus(); 0126 skyMesh->aperture(focus, radius + 1.0); 0127 0128 This code determines what objects will end up in the draw loops of the 0129 classes that use the index. This information is stored in the skyMesh 0130 object that was created at the top of the SkyMapComposite constructor. 0131 All classes that use the index have a pointer to this object which the 0132 use to construct an iterator which will iterate over all of the 0133 trixels that cover the circular area in the aperture call above. 0134 0135 For debugging, it's useful to change "radius + 1.0" to "radius / 2.0" 0136 which means that only trixels needed to cover a circle surrounding the 0137 central quarter of the screen will be returned by the iterators. 0138 Therefore objects that have been indexed properly will disappear as 0139 you pan the screen and objects that have not been indexed will always 0140 stay visible (as long as they are on screen). 0141 0142 You can also change the amount of debugging information that is 0143 printed when indexing near the top of the SkyMapComposite constructor: 0144 0145 skyMesh->debug( 1 ); 0146 0147 0 (zero) means no printout, 1 (one) prints out one line per index and 0148 10 (ten) causes the number of trixels of every indexed object to be 0149 printed. This can give you a sense of the speed of the index and is 0150 also useful for spotting bugs that cause an abnormal number of trixels 0151 to be added to the index. If you want to debug just one section you 0152 can pass similar debug numbers to the indexLines() and indexPolygons() 0153 calls in the LineListIndex end use subclasses. 0154 0155 \subsection DataDrawing Data and Drawing 0156 In order to make the most efficient use of the HTM index, we need to 0157 break the composite/component model and in same cases even break the 0158 strict OOP model that insists that code and data go together. The 0159 approach used here is separate the code and the data. In much of the 0160 new code the data is held in very small lightweight classes that are 0161 little more than structs. The drawing code is in classes that hold 0162 lists of the lightweight data elements. This is not a new idea, in 0163 fact it is a traditional approach to boosting OOP performance. 0164 0165 \subsection DrawingStars Drawing Stars 0166 For the sake of efficiency, the stars were already all being drawn from 0167 the single draw() routine in the StarComponent class which made adding 0168 the HTM index almost trivial. Two new data members were added: 0169 0170 + SkyMesh* skyMesh; 0171 + QVector< StarList* > starIndex; 0172 0173 where StarList is a typedef for QVector<StarObject*>. The starIndex 0174 is populated when we read in the star data from the *.hip files: 0175 0176 + int starID = skyMesh->index( (SkyPoint*) o ); 0177 + starIndex[starID]->append( o ); 0178 0179 For drawing, the biggest change is in the loop over stars inside the 0180 draw() routine. A simple loop over _all_ of the star objects is 0181 replaced with a double loop, first iterating over all the visible 0182 trixels and then for each visible trixel we iterate over all of the 0183 stars that are in that trixel: 0184 0185 - foreach ( SkyObject *o, objectList() ) { 0186 - StarObject *curStar = (StarObject*)o; 0187 0188 + region = skyMesh->iterator(); 0189 + while ( region->hasNext()) { 0190 + StarList* starList = starIndex[ region->next() ]; 0191 + for (int i=0; i < starList->size(); ++i) { 0192 + StarObject *curStar = (StarObject*) starList->at( i ); 0193 0194 The double loop looks more messy but this is where we get the 0195 performance boost because the list of stars we finally iterate over is 0196 at worst case slightly larger than half the original list and at best 0197 case (when the screen is greatly zoomed in) we can gain a factor of 10 0198 or 100 in speed because the number of stars we iterate over is reduced 0199 by that factor. 0200 0201 The index was also used to drastically speed up the objectNearest() 0202 routine in StarComponent but we leave the explanation of how that 0203 works as an exercise for the reader. Hint: it works just like the 0204 change to the draw loop. 0205 0206 \subsection DrawingDSO DRAWING DSO's 0207 Adding the index to the drawing of DSO's was almost as trivial as the 0208 changes in StarComponent. There were two differences. First, the 0209 DSO's are being stored (and drawn) separately by catalog so we needed 0210 a separate index for each catalog. The other change that these 0211 indexes were implemented as: 0212 0213 typedef QVector< DeepSkyObject*> DeepSkyList; 0214 QHash<int, DeepSkyList> dsIndex; 0215 0216 This was done in order to save some space. The stars use a QVector 0217 because it is smaller and faster than a QHash _if_ all of (or most of) 0218 the indices are being used as is the case with the stars. For indexes 0219 of sparse objects (where most of the trixel are empty), we save some 0220 space by using a QHash since the empty trixels take up no room at all. 0221 Adding point objects to a hash based index is only slightly more 0222 complicated than adding objects to a vector based index: 0223 0224 if ( ! dsIndex->contains( trixel ) ) { 0225 dsIndex->insert(trixel, new DeepSkyList() ); 0226 } 0227 dsIndex->value( trixel )->append( o ); 0228 0229 Adding extended objects to a hash based index is the most complicated 0230 case. You can see two examples of the code for that in the routines 0231 indexLines() and indexPolygons() both found in the LineListIndex 0232 class. 0233 0234 \subsection DrawingLinesPolygons Drawing Lines and Polygons 0235 Using the HTM index to help draw lines and polygons is a little bit 0236 more complicated than using it to index and draw points as was done 0237 with the stars and the DSO. The indexing is a little more complicated 0238 because if any part of an object may be visible then we want to draw 0239 the entire object. Also, we want to be able to clip objects that 0240 are on the celestial horizon. Since there are several different kinds 0241 of objects we want to draw which all require slightly different data 0242 structures, we have several different classes to handle them. 0243 0244 Class Structure: 0245 0246 Index Classes 0247 ------------- 0248 0249 + LineListIndex 0250 - ConstellationLines 0251 - Ecliptic 0252 + SkipListIndex 0253 - MilkWay 0254 + NoPrecessIndex 0255 - CoordinateGrid 0256 - ConsellationBoundary 0257 - Equator 0258 0259 + PolyListIndex 0260 - ConstellationBoundaryPoly 0261 0262 Data Container Classes 0263 ---------------------- 0264 0265 + LineList 0266 - SkipHashList 0267 0268 + PolyList 0269 0270 0271 The three abstract index classes and their associated data containers: 0272 0273 LineListIndex SkipListIndex PolyListIndex 0274 || || || 0275 LineList SkipHashList PolyList 0276 0277 NoPrecessIndex and LabeledListIndex uses the plain LineList data 0278 container. NoPrecessIndex does not precess the data in the containers 0279 and LabeledListIndex additionally allows a single label to be added to a 0280 curved line (equator, ecliptic). The LabeledListIndex code could easily 0281 be combined into NoPrecessIndex but I think it is more clear as a 0282 separate subclass at the expense of almost repeated code. 0283 0284 The LineList/LineListIndex (which we will call LineList[Index] for 0285 brevity) provide the base for all the other classes. They can be used 0286 for indexing and drawing either a series of line segments or a series 0287 of filled polygons but not both. If you want to draw both lines and 0288 filled polygons you need to use the SkipHashList[Index] combo that adds 0289 the ability to skip some of the lines that are drawn. If you don't 0290 want to skip any lines and want to draw filled polygons and the 0291 complete outline of the filled polygons then you can use the standard 0292 LineList[Index] combo. 0293 0294 Both the LineList[Index] and the SkipHashList[Index] store their data in 0295 QVector<SkyPoint*>'s. There is one such data structure in each 0296 LineList and each SkipHashList. The PolyList stores the data in a 0297 QPolygonF instead. This was needed to implement the code that uses 0298 the QPolygonF to find out which constellation a point is in. At 0299 present you can not directly draw the data stored in a PolyList's 0300 combo although it would not be difficult to add code to do the drawing 0301 if/when the need arises. 0302 0303 0304 The LineListIndex class 0305 0306 The bulk of the code is in LineListLindex as can be seem by these 0307 relative sizes: 0308 0309 $ wc *index.cpp 0310 0311 572 2088 19073 linelistindex.cpp 0312 60 200 2138 noprecessindex.cpp 0313 143 489 4594 polylistindex.cpp 0314 43 137 1623 skiplistindex.cpp 0315 0316 LineListIndex contains code to index both filled polygons and lines. 0317 There are two versions of each draw routine, one for integer drawing 0318 and the other for anti-aliased (float) drawing for a total of four in 0319 all. There are two indexing routines, one for filled polygons and 0320 another for lines. 0321 0322 In order to avoid code repetition there are three virtual "callback" 0323 methods that subclasses can override in order to customize the 0324 behavior. Most end use classes (such as ConstellationLines or 0325 CoordinateGrid) will want to override the preDraw() method in order to 0326 set the QPen (and QBrush or whatever) before their data is drawn. 0327 This routine is called right before the lines/polygons are drawn. The 0328 default is to set the QPen to a solid thin white line. The end use 0329 classes can also override the draw() method and set the QPen in their 0330 own draw method but in most cases this is slightly more complicated 0331 than overriding preDraw() because they will then have to call 0332 LineListIndex::draw() from inside their own draw() routine which you 0333 don't have to do if you use preDraw(). 0334 0335 The other two virtual callback methods are overridden by the 0336 "abstract" SkipListIndex subclass. The method getIndexHash() is 0337 used to modify the line indexing routine. getIndexHash() is 0338 overridden by SkipListIndex so line segments that should be skipped 0339 from drawing are also skipped from the indexing. 0340 0341 Finally, skipAt() is overridden by SkipListIndex in order to not draw 0342 lines segments that should be skipped. It returns a boolean value and 0343 the default behavior is to always return false which means that by 0344 default all the line segments are drawn. There might be a small 0345 performance hit for having this callback since it is called for every 0346 single _point_, not just every LineList but the speed hit may not even 0347 be noticeable/measurable and it allows us to get by with just 4 almost 0348 identical draw routines. Without it, we would need 8 draw routines 0349 which would make the code matainance even more of a headache. 0350 0351 The iterator in the four draw*() routines is similar to the iterator 0352 in DeepSkyComponent but has the addition of the drawID to prevent 0353 objects from being drawn more than once in any given draw cycle: 0354 0355 DrawID drawID = skyMesh()->drawID(); 0356 MeshIterator region( skyMesh() ); 0357 while ( region.hasNext() ) { 0358 0359 LineListList *lineListList = lineIndex()->value( region.next() ); 0360 if ( lineListList == 0 ) continue; 0361 0362 for (int i = 0; i < lineListList->size(); i++) { 0363 LineList* lineList = lineListList->at( i ); 0364 0365 if ( lineList->drawID == drawID ) continue; 0366 lineList->drawID = drawID; 0367 0368 \subsubsection DataContainer Data Container Classes 0369 The data container classes: LineList, SkipHashList, and PolyList are 0370 simple data containers with no non-trivial code. All of them just 0371 have header files and no .cpp files. 0372 0373 \subsubsection EndUseClasses End Use Classes 0374 There are currently seven different end use classes: 0375 0376 - CoordinateGrid 0377 - ConstellationLines 0378 - ConsellationBoundary 0379 - MilkWay 0380 - ConstellationBoundaryPoly 0381 - Ecliptic 0382 - Equator 0383 0384 These classes are typically small and simple, with just an init() 0385 routine to populate and append a bunch of LineList's (or their 0386 descendents) and a preDraw() routine to set the QPen. It is the 0387 responsibility of the end use classes to make sure the data containers 0388 they use match the *ListIndex file they subclass. 0389 0390 \subsubsection ConstellationBoundaryPoly ConstellationBoundaryPoly class 0391 Originally this was a subclass of LineListIndex but everthing is 0392 different about it so now it (and PolyListIndex) is in a class by 0393 themselves. Same with PolyList which used to be a subclass of 0394 LineList. Here are the differences: 0395 0396 o The data is stored on QPolygonF's instead of lists of SkyPoints. 0397 0398 o It is instantiated and initialized inside of 0399 ConstellationBoundary and not SkyMapComposite. 0400 0401 o The index is stored in a QVector instead of a QHash because 0402 the constellations are covered by every trixel on the sphere. 0403 0404 o There is no drawing done but there are routines for returning 0405 which constellation a SkyPoint is in. 0406 0407 o PolyLists have a name which LineLists don't have. 0408 0409 The only real code duplication is the indexPolygons() routine that was 0410 lifted directly from LineListIndex. But since both the input and 0411 output data structures were differt (QPolygonF vs. List of SkyPoints 0412 and QVector instead of QHash) it made the most sense to give 0413 ConstellationBoundaryPoly its own indexing routine. There is now 0414 duplication in the summary() debugging routine too. 0415 0416 \section JustInTimeUpdate Just-in-Time Updating 0417 Just-in-time updating was recently added to all indexed objects. 0418 Instead of updating all the objects in a separate update() routine, 0419 the updates now take place inside the draw routine. Before 0420 SkyMapComposite::update() (which is still needed for objects that are 0421 not yet indexed) is called, an updateID is incremented in KStarsData. 0422 If the update call sends a non-zero KSNumbers then updateNumID is 0423 incremented also. 0424 0425 Every indexed object has its own updateID and updateNumID. Inside 0426 the draw routine if an object's updateID does not match the global 0427 updateID then the object is updated. In addition if the objects 0428 updateNumID does not match the global updateNumID then a more 0429 expensive update is performed. This should greatly speed up updates 0430 especially if the screen is zoomed in on a small section of the sky. 0431 This simple mechanism ensures that only objects that are (nearly) on 0432 the screen get updated as needed. 0433 0434 \section ErrorDetection Error Detection and Prevention 0435 The HTM library we use does not have any built-in error detection or 0436 prevention. We have to provide all of that ourselves. There are two 0437 types of failure modes in the library. The first failure mode has to 0438 do with passing it coordinates out of range. For example, if we pass 0439 it NaN as one of the coordinates, it will return bad results. The 0440 second type of failure is triggered when we send it two consecutive 0441 identical points to one of the polygon intersection routines. Since 0442 the library automatically wraps the last point back to the first point 0443 when intersecting polygons, if the last point is identical to the 0444 first we will also get a failure. 0445 0446 The identical point failures don't return incorrect results, instead, 0447 they cause library to hang, crash, or consume RAM relentlessly. 0448 Therefore these failures _must_ all be detected and prevented before 0449 unleashing our code on the public even if it causes the program to run 0450 more slowly. Fortunately the polygon intersection routines are only 0451 used in indexing, not in drawing so error prevention has no effect on 0452 the drawing time. Even more fortunately, it has a minimal impact on 0453 the indexing time. 0454 0455 Repeated point failures in all the polygon intersection routines are 0456 detected and prevented in the HTMesh class which we use as an 0457 interface to the HTM library. It contains a somewhat arbitrary 0458 parameter called eps which is currently set to 1.0E-6. We consider 0459 two point identical if their Manhattan distance (in degrees) is less 0460 than eps: 0461 0462 fabs(ra1 - ra2) + fabs(dec1 - dec2) < eps 0463 0464 Since we typically use a level 5 or less mesh and average edge of a 0465 trixel at level 5 is about 3 degrees, an eps as large as 0.1 or so 0466 would probably suffice since the two points are very likely to be in 0467 the same trixel. The eps we use is 5 orders of magnitude smaller and 0468 yet still seems to prevent duplicate point errors in the library. 0469 0470 For the four-point polygon we check for four different consecutive 0471 points: p1:p2, p2:p3, p3:p4, and p4:p1. As soon as any one of these 0472 tests detects a duplicate, we drop one of the dupes and send the 0473 remaining three points to the three-point polygon intersection 0474 routine. If there are any dupes in the remaining three points, they 0475 will be resolved in the three-point intersection. We play the same 0476 trick with the three-point intersection, we do three tests and send 0477 two points to the two-point (line segment) intersection routine if we 0478 found a dupe. Finally if a dupe is found in the two-point 0479 intersection routine, we do a circular intersection around one of the 0480 points, using the distance between the two points as the radius. 0481 0482 Unlike the other polygon routines which use eps as the length scale, 0483 the two-point (line segment) routine uses the average edge size 0484 divided by 10 to trigger dropping down to the circular intersection. 0485 We might want to fine tune this later. The downside of the current 0486 method is that very rarely a few unneeded trixels may be added to the 0487 set of that covers a very short line segment. The performance impact 0488 (size and speed) of this slight imperfection may be too small to 0489 measure. 0490 0491 */