Warning, /education/kstars/kstars/skycomponents/indexing.dox is written in an unsupported language. File is not indexed.

0001 /*! \page HTMIndexing HTM Indexing
0002   \tableofcontents
0003   Design Document for HTM Indexing in KStars
0004   \section Theory Theory
0005   A spatial index was introduced to drastically speed up both the
0006   drawing of objects on the screen and searching for objects based upon
0007   a position such as finding the object nearest to the cursor or finding
0008   out which constellation a point is in.  Drawing to the screen is sped
0009   up because the calculation to determine if an object is on screen
0010   is very expensive.  Without the index we have to perform this
0011   calculation for every object which is time consuming.  The index
0012   speeds things up because it can give us a list of all the objects
0013   on screen (plus a few extras that are off screen).
0014 
0015   We are currently using a Hierarchical Triangular Mesh (HTM) spatial
0016   indexing library written by A. Szalay, John Doug Reynolds, Jim Gray,
0017   and Peter Z. Kunszt.  An HTM divides the surface of a sphere into a
0018   bunch of little triangles just like a geodesic dome.  Each triangle is
0019   called a trixel (triangular pixel) and each trixel has a unique
0020   integer ID.
0021 
0022   We index point objects (i.e. stars) by creating a data structure
0023   that can quickly give us a list of all the stars that are located
0024   inside any trixel.  To draw the stars, we get a list of all trixels
0025   that are possibly visible on the current screen.  The worst case is
0026   when the screen is zoomed out so all trixel on the visible half the
0027   celestial sphere need to be used.  When the screen is zoomed in, the
0028   number of visible trixels is reduced by roughly the area of the
0029   celestial sphere that is visible.  To draw the stars, we use the HTM
0030   to find all of the trixels that are possibly visible and for each such
0031   trixel we draw the list of stars that are in that trixel.
0032 
0033   Extended objects, such as  constellation lines are a bit more tricky.
0034   If we indexed them with a single pointer in a single trixel then parts
0035   of the object could be on screen and not drawn.  So for extended
0036   objects we create a data structure similar to that used for the stars
0037   which allows us to quickly get the list of all objects in any trixel
0038   but since more than one trixel may be needed to cover an object, there
0039   can be more than one pointer to an object in the index.  If we use the
0040   point object algorithm and simply draw every object in the list of
0041   every visible trixel than the same object can be drawn repeatedly if
0042   it is covered by more than one trixel and if more than one of its
0043   covering trixels is visible.  This is not a rare event.  It is
0044   a very common occurence and if left unchecked, it would destroy much of
0045   the efficiency we gained by implementing the index in the first place.
0046 
0047   Fortunately, there is an easy solution to prevent any given object from
0048   being drawn more than once in any given draw cycle (a draw cycle is
0049   when we draw all the objects on the screen.  If the zoom scale or the
0050   center point of the screen change, or if the celestial sphere appears
0051   to rotate due to the rotation of the earth then we need to initiate a
0052   new draw cycle to update the screen).  We use a global integer value
0053   that gets incremented at the start of each draw cycle which we call
0054   the global drawID.  Every indexed extended object has its own internal
0055   integer drawID.  When we go through the lists of pointers to visible
0056   object we compare the drawID of each object with the global drawID.
0057   If they are the same, we don't draw the object.  If they are different
0058   we draw the object and set the object's drawID to the value of the
0059   global drawID which ensures that the object won't get drawn again in
0060   the same draw cycle.  Since the global drawID is incremented at the
0061   beginning of every draw cycle, setting an object's drawID to the
0062   current global drawID will not prevent the object from being drawn in
0063   future draw cycles.
0064 
0065   \section Implementation Implementation
0066   HTM Library Interface
0067 
0068     + HTMesh
0069      - SkyMesh
0070 
0071   The HTMesh class is an addition to the HTM library that provides a
0072   simple interface to that library by exposing only those parts that we
0073   need and hiding the rest.  In addition, it contains some simple error
0074   checking for indexing polygons and it also contains a routine for
0075   indexing line segments (by creating a very narrow triangle that covers
0076   the line segment).
0077 
0078   The SkyMesh class is a subclass of HTMesh.  It mingles the HTM with
0079   QT objects and KStars objects.  It has two purposes.  The first is
0080   to provide a more convenient, higher level interface to the HTM
0081   library.  For example to find the trixel containing a SkyPoint *p we
0082   can call:
0083 
0084      skyMesh->index( p )
0085 
0086   instead of:
0087 
0088      HTMesh->index( p->ra()->Degrees(), p->dec()->Degrees() );
0089 
0090   SkyMesh also has some real code in it that is used to find all of the
0091   trixels that cover a polygon with an arbitrary number of vertices and
0092   to find all of the trixels that cover a connected series of line
0093   segments (such as those that make up the outline of the Milky Way or
0094   the lines that make up a constellation).  These routines work by
0095   calling the more primitive indexing routines and then or'ing together
0096   the resulting lists of indices.  The or'ing is needed so that a
0097   pointer to a given object will show up at most once in the list of
0098   objects for a given pixel.  The drawID would prevent these extra
0099   entries from causing the same object from being drawn more than once
0100   but the extra entries would still be a waste of both time and space.
0101 
0102   The or'ing is done with a QHash that uses the trixel ID's as keys.
0103   When the keys of the QHash are read out, each unique trixel ID will
0104   only be read out once even if that ID was inserted multiple times.
0105   Since that QHash was needed to create a unique list of keys, it is
0106   also used as the output container of the indexing routines from which
0107   the caller reads the ID's.  In the current implementation there is
0108   only one QHash and the line/poly indexing routines are always
0109   returning the address of the same QHash over and over.  It might have
0110   looked better if we had copied the keys from the QHash into a QList or
0111   a QVector but that extra copying would have been wasteful.
0112 
0113   \subsection TopLevel Top Level
0114   Most/all of the instances that use the HTM index are created from
0115   within the SkyMapComposite class.   The SkyMesh parameter is currently
0116   explicit so you can quickly see which classes/instances use the index
0117   by looking that the list of constructors inside the SkyMapComposite
0118   constructor.  SkyMapComposite is also where (for our purposes) the
0119   draw() calls get initiated.  At the top of SkyMapComposite::draw()
0120   there is the code:
0121 
0122       float radius = ks->map()->fov();
0123       if ( radius > 90.0 ) radius = 90.0;
0124 
0125       SkyPoint* focus = ks->map()->focus();
0126       skyMesh->aperture(focus, radius + 1.0);
0127 
0128   This code determines what objects will end up in the draw loops of the
0129   classes that use the index.  This information is stored in the skyMesh
0130   object that was created at the top of the SkyMapComposite constructor.
0131   All classes that use the index have a pointer to this object which the
0132   use to construct an iterator which will iterate over all of the
0133   trixels that cover the circular area in the aperture call above.
0134 
0135   For debugging, it's useful to change "radius + 1.0" to "radius / 2.0"
0136   which means that only trixels needed to cover a circle surrounding the
0137   central quarter of the screen will be returned by the iterators.
0138   Therefore objects that have been indexed properly will disappear as
0139   you pan the screen and objects that have not been indexed will always
0140   stay visible (as long as they are on screen).
0141 
0142   You can also change the amount of debugging information that is
0143   printed when indexing near the top of the SkyMapComposite constructor:
0144 
0145     skyMesh->debug( 1 );
0146 
0147   0 (zero) means no printout, 1 (one) prints out one line per index and
0148   10 (ten) causes the number of trixels of every indexed object to be
0149   printed.  This can give you a sense of the speed of the index and is
0150   also useful for spotting bugs that cause an abnormal number of trixels
0151   to be added to the index.  If you want to debug just one section you
0152   can pass similar debug numbers to the indexLines() and indexPolygons()
0153   calls in the LineListIndex end use subclasses.
0154 
0155   \subsection DataDrawing Data and Drawing
0156   In order to make the most efficient use of the HTM index, we need to
0157   break the composite/component model and in same cases even break the
0158   strict OOP model that insists that code and data go together.  The
0159   approach used here is separate the code and the data.  In much of the
0160   new code the data is held in very small lightweight classes that are
0161   little more than structs.  The drawing code is in classes that hold
0162   lists of the lightweight data elements.  This is not a new idea, in
0163   fact it is a traditional approach to boosting OOP performance.
0164 
0165   \subsection DrawingStars Drawing Stars
0166   For the sake of efficiency, the stars were already all being drawn from
0167   the single draw() routine in the StarComponent class which made adding
0168   the HTM index almost trivial.  Two new data members were added:
0169 
0170   +        SkyMesh* skyMesh;
0171   +        QVector< StarList* > starIndex;
0172 
0173   where StarList is a typedef for QVector<StarObject*>.  The starIndex
0174   is populated when we read in the star data from the *.hip files:
0175 
0176   +    int starID = skyMesh->index( (SkyPoint*) o );
0177   +    starIndex[starID]->append( o );
0178 
0179   For drawing, the biggest change is in the loop over stars inside the
0180   draw() routine.  A simple loop over _all_ of the star objects is
0181   replaced with a double loop, first iterating over all the visible
0182   trixels and then for each visible trixel we iterate over all of the
0183   stars that are in that trixel:
0184 
0185   -   foreach ( SkyObject *o, objectList() ) {
0186   -       StarObject *curStar = (StarObject*)o;
0187 
0188   +    region = skyMesh->iterator();
0189   +    while ( region->hasNext()) {
0190   +        StarList* starList = starIndex[ region->next() ];
0191   +        for (int i=0; i < starList->size(); ++i) {
0192   +            StarObject *curStar = (StarObject*) starList->at( i );
0193 
0194   The double loop looks more messy but this is where we get the
0195   performance boost because the list of stars we finally iterate over is
0196   at worst case slightly larger than half the original list and at best
0197   case (when the screen is greatly zoomed in) we can gain a factor of 10
0198   or 100 in speed because the number of stars we iterate over is reduced
0199   by that factor.
0200 
0201   The index was also used to drastically speed up the objectNearest()
0202   routine in StarComponent but we leave the explanation of how that
0203   works as an exercise for the reader.  Hint: it works just like the
0204   change to the draw loop.
0205 
0206   \subsection DrawingDSO DRAWING DSO's
0207   Adding the index to the drawing of DSO's was almost as trivial as the
0208   changes in StarComponent.  There were two differences.  First, the
0209   DSO's are being stored (and drawn) separately by catalog so we needed
0210   a separate index for each catalog.  The other change that these
0211   indexes were implemented as:
0212 
0213     typedef QVector< DeepSkyObject*> DeepSkyList;
0214     QHash<int, DeepSkyList>          dsIndex;
0215 
0216   This was done in order to save some space. The stars use a QVector
0217   because it is smaller and faster than a QHash _if_ all of (or most of)
0218   the indices are being used as is the case with the stars.  For indexes
0219   of sparse objects (where most of the trixel are empty), we save some
0220   space by using a QHash since the empty trixels take up no room at all.
0221   Adding point objects to a hash based index is only slightly more
0222   complicated than adding objects to a vector based index:
0223 
0224       if ( ! dsIndex->contains( trixel ) ) {
0225           dsIndex->insert(trixel, new DeepSkyList() );
0226       }
0227       dsIndex->value( trixel )->append( o );
0228 
0229   Adding extended objects to a hash based index is the most complicated
0230   case.  You can see two examples of the code for that in the routines
0231   indexLines() and indexPolygons() both found in the LineListIndex
0232   class.
0233 
0234   \subsection DrawingLinesPolygons Drawing Lines and Polygons
0235   Using the HTM index to help draw lines and polygons is a little bit
0236   more complicated than using it to index and draw points as was done
0237   with the stars and the DSO.  The indexing is a little more complicated
0238   because if any part of an object may be visible then we want to draw
0239   the entire object.  Also, we want to be able to clip objects that
0240   are on the celestial horizon.  Since there are several different kinds
0241   of objects we want to draw which all require slightly different data
0242   structures, we have several different classes to handle them.
0243 
0244   Class Structure:
0245 
0246   Index Classes
0247   -------------
0248 
0249     + LineListIndex
0250       - ConstellationLines
0251       - Ecliptic
0252       + SkipListIndex
0253         - MilkWay
0254       + NoPrecessIndex
0255         - CoordinateGrid
0256         - ConsellationBoundary
0257         - Equator
0258 
0259     + PolyListIndex
0260       - ConstellationBoundaryPoly
0261 
0262   Data Container Classes
0263   ----------------------
0264 
0265     + LineList
0266       - SkipHashList
0267 
0268     + PolyList
0269 
0270 
0271   The three abstract index classes and their associated data containers:
0272 
0273      LineListIndex    SkipListIndex       PolyListIndex
0274           ||               ||                  ||
0275        LineList         SkipHashList            PolyList
0276 
0277   NoPrecessIndex and LabeledListIndex uses the plain LineList data
0278   container. NoPrecessIndex does not precess the data in the containers
0279   and LabeledListIndex additionally allows a single label to be added to a
0280   curved line (equator, ecliptic).  The LabeledListIndex code could easily
0281   be combined into NoPrecessIndex but I think it is more clear as a
0282   separate subclass at the expense of almost repeated code.
0283 
0284   The LineList/LineListIndex (which we will call LineList[Index] for
0285   brevity) provide the base for all the other classes.  They can be used
0286   for indexing and drawing either a series of line segments or a series
0287   of filled polygons but not both.  If you want to draw both lines and
0288   filled polygons you need to use the SkipHashList[Index] combo that adds
0289   the ability to skip some of the lines that are drawn.  If you don't
0290   want to skip any lines and want to draw filled polygons and the
0291   complete outline of the filled polygons then you can use the standard
0292   LineList[Index] combo.
0293 
0294   Both the LineList[Index] and the SkipHashList[Index] store their data in
0295   QVector<SkyPoint*>'s.  There is one such data structure in each
0296   LineList and each SkipHashList.  The PolyList stores the data in a
0297   QPolygonF instead.  This was needed to implement the code that uses
0298   the QPolygonF to find out which constellation a point is in.  At
0299   present you can not directly draw the data stored in a PolyList's
0300   combo although it would not be difficult to add code to do the drawing
0301   if/when the need arises.
0302 
0303 
0304   The LineListIndex class
0305 
0306   The bulk of the code is in LineListLindex as can be seem by these
0307   relative sizes:
0308 
0309     $ wc *index.cpp
0310 
0311       572  2088 19073 linelistindex.cpp
0312        60   200  2138 noprecessindex.cpp
0313       143   489  4594 polylistindex.cpp
0314        43   137  1623 skiplistindex.cpp
0315 
0316   LineListIndex contains code to index both filled polygons and lines.
0317   There are two versions of each draw routine, one for integer drawing
0318   and the other for anti-aliased (float) drawing for a total of four in
0319   all.  There are two indexing routines, one for filled polygons and
0320   another for lines.
0321 
0322   In order to avoid code repetition there are three virtual "callback"
0323   methods that subclasses can override in order to customize the
0324   behavior.  Most end use classes (such as ConstellationLines or
0325   CoordinateGrid) will want to override the preDraw() method in order to
0326   set the QPen (and QBrush or whatever) before their data is drawn.
0327   This routine is called right before the lines/polygons are drawn.  The
0328   default is to set the QPen to a solid thin white line.  The end use
0329   classes can also override the draw() method and set the QPen in their
0330   own draw method but in most cases this is slightly more complicated
0331   than overriding preDraw() because they will then have to call
0332   LineListIndex::draw() from inside their own draw() routine which you
0333   don't have to do if you use preDraw().
0334 
0335   The other two virtual callback methods are overridden by the
0336   "abstract" SkipListIndex subclass.  The method getIndexHash() is
0337   used to modify the line indexing routine.  getIndexHash() is
0338   overridden by SkipListIndex so line segments that should be skipped
0339   from drawing are also skipped from the indexing.
0340 
0341   Finally, skipAt() is overridden by SkipListIndex in order to not draw
0342   lines segments that should be skipped.  It returns a boolean value and
0343   the default behavior is to always return false which means that by
0344   default all the line segments are drawn.  There might be a small
0345   performance hit for having this callback since it is called for every
0346   single _point_, not just every LineList but the speed hit may not even
0347   be noticeable/measurable and it allows us to get by with just 4 almost
0348   identical draw routines.  Without it, we would need 8 draw routines
0349   which would make the code matainance even more of a headache.
0350 
0351   The iterator in the four draw*() routines is similar to the iterator
0352   in DeepSkyComponent but has the addition of the drawID to prevent
0353   objects from being drawn more than once in any given draw cycle:
0354 
0355       DrawID drawID = skyMesh()->drawID();
0356       MeshIterator region( skyMesh() );
0357       while ( region.hasNext() ) {
0358 
0359           LineListList *lineListList = lineIndex()->value( region.next() );
0360           if ( lineListList == 0 ) continue;
0361 
0362           for (int i = 0; i < lineListList->size(); i++) {
0363               LineList* lineList = lineListList->at( i );
0364 
0365               if ( lineList->drawID == drawID ) continue;
0366               lineList->drawID = drawID;
0367 
0368     \subsubsection DataContainer  Data Container Classes
0369     The data container classes: LineList, SkipHashList, and PolyList are
0370     simple data containers with no non-trivial code.  All of them just
0371     have header files and no .cpp files.
0372 
0373     \subsubsection EndUseClasses End Use Classes
0374     There are currently seven different end use classes:
0375 
0376         - CoordinateGrid
0377         - ConstellationLines
0378         - ConsellationBoundary
0379         - MilkWay
0380         - ConstellationBoundaryPoly
0381         - Ecliptic
0382         - Equator
0383 
0384     These classes are typically small and simple, with just an init()
0385     routine to populate and append a bunch of LineList's (or their
0386     descendents) and a preDraw() routine to set the QPen.  It is the
0387     responsibility of the end use classes to make sure the data containers
0388     they use match the *ListIndex file they subclass.
0389 
0390     \subsubsection ConstellationBoundaryPoly ConstellationBoundaryPoly class
0391     Originally this was a subclass of LineListIndex but everthing is
0392     different about it so now it (and PolyListIndex) is in a class by
0393     themselves.  Same with PolyList which used to be a subclass of
0394     LineList.  Here are the differences:
0395 
0396       o The data is stored on QPolygonF's instead of lists of SkyPoints.
0397 
0398       o It is instantiated and initialized inside of
0399         ConstellationBoundary and not SkyMapComposite.
0400 
0401       o The index is stored in a QVector instead of a QHash because
0402         the constellations are covered by every trixel on the sphere.
0403 
0404       o There is no drawing done but there are routines for returning
0405         which constellation a SkyPoint is in.
0406 
0407       o PolyLists have a name which LineLists don't have.
0408 
0409     The only real code duplication is the indexPolygons() routine that was
0410     lifted directly from LineListIndex.  But since both the input and
0411     output data structures were differt (QPolygonF vs. List of SkyPoints
0412     and QVector instead of QHash) it made the most sense to give
0413     ConstellationBoundaryPoly its own indexing routine.  There is now
0414     duplication in the summary() debugging routine too.
0415 
0416     \section JustInTimeUpdate Just-in-Time Updating
0417     Just-in-time updating was recently added to all indexed objects.
0418     Instead of updating all the objects in a separate update() routine,
0419     the updates now take place inside the draw routine.  Before
0420     SkyMapComposite::update() (which is still needed for objects that are
0421     not yet indexed) is called, an updateID is incremented in KStarsData.
0422     If the update call sends a non-zero KSNumbers then updateNumID is
0423     incremented also.
0424 
0425     Every indexed object has its own updateID and updateNumID.  Inside
0426     the draw routine if an object's updateID does not match the global
0427     updateID then the object is updated.  In addition if the objects
0428     updateNumID does not match the global updateNumID then a more
0429     expensive update is performed.  This should greatly speed up updates
0430     especially if the screen is zoomed in on a small section of the sky.
0431     This simple mechanism ensures that only objects that are (nearly) on
0432     the screen get updated as needed.
0433 
0434     \section ErrorDetection  Error Detection and Prevention
0435     The HTM library we use does not have any built-in error detection or
0436     prevention.  We have to provide all of that ourselves.  There are two
0437     types of failure modes in the library.  The first failure mode has to
0438     do with passing it coordinates out of range.  For example, if we pass
0439     it NaN as one of the coordinates, it will return bad results.  The
0440     second type of failure is triggered when we send it two consecutive
0441     identical points to one of the polygon intersection routines.  Since
0442     the library automatically wraps the last point back to the first point
0443     when intersecting polygons, if the last point is identical to the
0444     first we will also get a failure.
0445 
0446     The identical point failures don't return incorrect results, instead,
0447     they cause library to hang, crash, or consume RAM relentlessly.
0448     Therefore these failures _must_ all be detected and prevented before
0449     unleashing our code on the public even if it causes the program to run
0450     more slowly.  Fortunately the polygon intersection routines are only
0451     used in indexing, not in drawing so error prevention has no effect on
0452     the drawing time.  Even more fortunately, it has a minimal impact on
0453     the indexing time.
0454 
0455     Repeated point failures in all the polygon intersection routines are
0456     detected and prevented in the HTMesh class which we use as an
0457     interface to the HTM library.  It contains a somewhat arbitrary
0458     parameter called eps which is currently set to 1.0E-6.  We consider
0459     two point identical if their Manhattan distance (in degrees) is less
0460     than eps:
0461 
0462        fabs(ra1 - ra2) + fabs(dec1 - dec2) < eps
0463 
0464     Since we typically use a level 5 or less mesh and average edge of a
0465     trixel at level 5 is about 3 degrees, an eps as large as 0.1 or so
0466     would probably suffice since the two points are very likely to be in
0467     the same trixel.  The eps we use is 5 orders of magnitude smaller and
0468     yet still seems to prevent duplicate point errors in the library.
0469 
0470     For the four-point polygon we check for four different consecutive
0471     points: p1:p2, p2:p3, p3:p4, and p4:p1.  As soon as any one of these
0472     tests detects a duplicate, we drop one of the dupes and send the
0473     remaining three points to the three-point polygon intersection
0474     routine.  If there are any dupes in the remaining three points, they
0475     will be resolved in the three-point intersection.  We play the same
0476     trick with the three-point intersection, we do three tests and send
0477     two points to the two-point (line segment) intersection routine if we
0478     found a dupe.  Finally if a dupe is found in the two-point
0479     intersection routine, we do a circular intersection around one of the
0480     points, using the distance between the two points as the radius.
0481 
0482     Unlike the other polygon routines which use eps as the length scale,
0483     the two-point (line segment) routine uses the average edge size
0484     divided by 10 to trigger dropping down to the circular intersection.
0485     We might want to fine tune this later.  The downside of the current
0486     method is that very rarely a few unneeded trixels may be added to the
0487     set of that covers a very short line segment.  The performance impact
0488     (size and speed) of this slight imperfection may be too small to
0489     measure.
0490 
0491 */