I just want to store some notes here in CVS as a compilation of
what I understood from some previous discussion on redesigning the
formatting code.  I wanted to do this before it is forgotten.  It is
very unlikely that I am going to have the time/motivation to do this
myself, but hopefully this can help give a start to whoever does the
work.  I don't think anything here is definitive, and certainly if
there are better ideas then we should scrap these.  For
Norbert/Phillip/Ariya this is just to help keep track of our ideas,
and hopefully the future will see many more Calligra Sheets hackers
who can use this as a head start in thinking about Calligra Sheets
design.


NOTE: when I say 'pointer' in this description, I'm thinking of some
kind of shared-object class with a reference count, not a literal
pointer that we would have to remember to delete.



The problem that needs solving is that the class KSpreadCell (which has
about a billion instantiations during a run of the program) is several
hundred bytes in size.  This is a tremendous waste of space, and most of
it is spent holding information such as font type/size/color, which
borders to draw and their line thickness, background color, and so on.
Since this information is going to be identical for the vast majority of
cells, we should find an efficient way to share data among cells.

The first idea is to break the format information into small classes,
such as one for font size/type, one for border information, and so on.
The way to save memory is to use a 'flyweight' system in which each cell
holds a pointer to the data, so cells with the same formatting hold
the same pointer and the information itself is instantiated only once.

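As a minimal sketch of the flyweight idea, assuming std::shared_ptr as
the reference-counted 'pointer' from the NOTE above: FontFormat, Cell
and copyFormat are hypothetical names made up for illustration, not
existing KSpread classes.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical format class: one small bundle of font settings.
struct FontFormat {
    std::string family;
    int pointSize;
};

// Hypothetical cell: holds only a reference-counted handle to its format,
// never the format data itself.
struct Cell {
    std::shared_ptr<const FontFormat> font;
};

// Give `to` the same format as `from`.  Copying the handle shares the
// single FontFormat instance and bumps its reference count; no new
// FontFormat is allocated.
inline void copyFormat(const Cell& from, Cell& to) {
    assert(from.font && "source cell must have a format");
    to.font = from.font;
}
```

After copyFormat(a, b) both cells refer to one FontFormat, and it is
freed automatically once the last cell using it goes away.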
At first, we can simply use the copy constructor of these classes to
implement the sharing.  If it seems profitable in the long run, these
classes can keep some kind of static mapping, so that the constructor
can check whether, for instance, helvetica font size 12 has already
been allocated in the past and use that pointer rather than allocating
a second instance.




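The static mapping could look roughly like the sketch below.  It is
only one possible shape: the factory function get(), the use of
std::map keyed on (family, size), and the weak_ptr entries are all
assumptions of this example, and the pool is not thread-safe.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interned format class: instances are only handed out
// through get(), which consults a static pool first.
class FontFormat {
public:
    // Return the shared instance for (family, pointSize), allocating it
    // only if no live instance with those settings exists.
    static std::shared_ptr<const FontFormat> get(const std::string& family,
                                                 int pointSize) {
        assert(pointSize > 0);
        // weak_ptr entries let a format be freed once no cell uses it.
        static std::map<std::pair<std::string, int>,
                        std::weak_ptr<const FontFormat>> pool;
        const auto key = std::make_pair(family, pointSize);
        if (auto existing = pool[key].lock())
            return existing;
        std::shared_ptr<const FontFormat> created(
            new FontFormat(family, pointSize));
        pool[key] = created;
        return created;
    }

    const std::string& family() const { return family_; }
    int pointSize() const { return pointSize_; }

private:
    FontFormat(std::string family, int pointSize)
        : family_(std::move(family)), pointSize_(pointSize) {}

    std::string family_;
    int pointSize_;
};
```

Two calls to FontFormat::get("helvetica", 12) return the same pointer,
so comparing two formats becomes a pointer comparison.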
Next, these format objects would be collected into objects I was calling
'styles'.  A style would basically be one of every type of format object
and thus would completely define the format of the cell.  A style can be
shared the same way as a format object: if two cells have all identical
format objects, then they can share the same style object.



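A style built this way might be sketched as follows.  The two format
classes shown, the field names, and the helpers sameStyle and
shareOrCreate are hypothetical; the comparison also assumes that equal
formats are already shared, so comparing handles is enough.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct FontFormat   { std::string family; int pointSize; };
struct BorderFormat { bool top, bottom, left, right; int thickness; };

// Hypothetical Style: one shared handle per kind of format object.
struct Style {
    std::shared_ptr<const FontFormat> font;
    std::shared_ptr<const BorderFormat> border;
};

// Two styles are interchangeable when every format handle is identical.
// This compares handles, not contents.
inline bool sameStyle(const Style& a, const Style& b) {
    return a.font == b.font && a.border == b.border;
}

// Reuse `existing` if it already describes the wanted formatting,
// otherwise allocate a new style object.
inline std::shared_ptr<const Style> shareOrCreate(
        const std::shared_ptr<const Style>& existing, const Style& wanted) {
    assert(existing);
    if (sameStyle(*existing, wanted))
        return existing;
    return std::make_shared<Style>(wanted);
}
```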
We had discussed two different ways of actually mapping these formats
and styles to particular cells.

One way is to simply have each cell contain a pointer to its style.
Rather than each cell using 200ish bytes to store the formatting, it
holds a single 4-byte pointer, and the 200ish bytes are shared among
all cells with that same formatting information.

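To put rough numbers on that, here is a small sketch.  The 200-byte
Style is a stand-in for the '200ish bytes' above, and both cell structs
are hypothetical.  A raw pointer is 4 bytes only on 32-bit builds and 8
on 64-bit ones, and a reference-counted handle such as std::shared_ptr
is typically larger than a raw pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for roughly 200 bytes of formatting data.
struct Style { std::uint8_t formatting[200]; };

struct CellWithInlineFormat { Style style; };         // full copy per cell
struct CellWithStylePointer { const Style* style; };  // one pointer per cell

// Bytes of formatting storage when every cell carries its own copy.
inline std::size_t bytesInline(std::size_t cells) {
    return cells * sizeof(CellWithInlineFormat);
}

// Bytes when cells point at shared styles: one pointer per cell plus
// one Style per distinct formatting actually in use.
inline std::size_t bytesShared(std::size_t cells, std::size_t distinctStyles) {
    assert(distinctStyles <= cells);
    return cells * sizeof(CellWithStylePointer)
         + distinctStyles * sizeof(Style);
}
```

For one million cells using ten distinct styles, bytesInline gives
200,000,000 bytes, while bytesShared gives 4,002,000 bytes with 4-byte
pointers or 8,002,000 bytes with 8-byte pointers.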
The other possibility is to map it by region.  This involves storing
a map of some sort in KSpreadSheet that says: cells A2:E30 have this
style, column H has this style, etc.  Here, the cell itself would store
no formatting.

If I remember correctly, we were leaning towards the second method,
both because of the memory consumption and because it is a simpler
way of handling formatting that is set on a full column or row.
However, this method will be much more complex to implement in a way
that allows efficient lookup of the current style for a particular cell.



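The simplest possible version of the region map is sketched below, to
make the lookup problem concrete.  Everything in it is an assumption
of this example: the Region and RegionStyleMap names, 1-based
column/row numbers, and the rule that the most recently set region
wins where regions overlap.  The lookup is a linear scan over all
regions, which is exactly the part that would need a smarter data
structure to be efficient.

```cpp
#include <cassert>
#include <climits>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Style { std::string name; };

// Inclusive rectangle of cells; columns and rows are 1-based (A = 1).
struct Region {
    int firstCol, firstRow, lastCol, lastRow;
    bool contains(int col, int row) const {
        return col >= firstCol && col <= lastCol
            && row >= firstRow && row <= lastRow;
    }
};

// A whole column is just a region spanning every row.
inline Region wholeColumn(int col) { return Region{col, 1, col, INT_MAX}; }

class RegionStyleMap {
public:
    void setStyle(const Region& region, std::shared_ptr<const Style> style) {
        assert(region.firstCol <= region.lastCol
            && region.firstRow <= region.lastRow);
        entries_.emplace_back(region, std::move(style));
    }

    // Scan from newest to oldest so the most recently set region wins.
    // Returns nullptr when no region covers the cell.
    std::shared_ptr<const Style> styleAt(int col, int row) const {
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it)
            if (it->first.contains(col, row))
                return it->second;
        return nullptr;
    }

private:
    std::vector<std::pair<Region, std::shared_ptr<const Style>>> entries_;
};
```

With this, "cells A2:E30 have this style" is setStyle(Region{1, 2, 5, 30},
style) and "column H has this style" is setStyle(wholeColumn(8), style).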
Some things to decide:

How fine-grained to make the format objects?
- How much information to store in each format object.  If there are a few
  large format objects, then each style is very small, requiring only a
  single pointer for each of these few format objects.  However, the data
  sharing is not very efficient if the font color changes between 2 cells
  and there are 10 other pieces of data in the same format object that
  are exactly the same.

  If there is too little in each format object, then we don't gain any
  savings in memory, because each additional type of format object results
  in an extra pointer in each style object.





There's probably much more that can be put in here.


BTW, I hope to stay involved at least a little with Calligra Sheets.  It
is unlikely, however, that I will try to take on any large chunks of code
unless I just get into a random programming fit on a weekend  :-)

-John