Overview of converting PPT documents to ODP documents.

This document describes various aspects of ODP documents and of PPT files. Knowledge of both is needed in order to convert from one format to the other.


Parsing of PPT files.

PPT files are OLE containers with a number of streams. Five streams are common:
 - "PowerPoint Document"
 - "Pictures"
 - "Current User"
 - "DocumentSummaryInformation"
 - "SummaryInformation"

This discussion focuses on "PowerPoint Document". This stream contains all the slide information except for most of the embedded picture files. These pictures are stored in the "Pictures" stream.

A newly generated ODP file should contain content.xml and styles.xml, all embedded pictures, a list of all embedded files in manifest.xml, and a mimetype file.
The file content.xml contains all the content specific to the file, and styles.xml contains all information specific to the style of the presentation. That includes definitions of all the masters and the styles used in the master slides.
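
The package layout described above can be sketched in Python with the standard zipfile module. This is only an illustration, not the converter's implementation: the function name build_odp and the shape of the pictures argument are assumptions made for this example. The sketch follows the ODF packaging convention that the mimetype entry comes first and is stored uncompressed, and that the manifest lives at META-INF/manifest.xml.

```python
import io
import zipfile

ODP_MIMETYPE = "application/vnd.oasis.opendocument.presentation"
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def build_odp(content_xml, styles_xml, pictures):
    """Return the bytes of a minimal ODP package.

    pictures is a hypothetical mapping from a path inside the package
    (for example "Pictures/abc.png") to a (media_type, data) tuple.
    """
    entry = ' <manifest:file-entry manifest:media-type="%s" manifest:full-path="%s"/>'
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<manifest:manifest xmlns:manifest="%s">' % MANIFEST_NS,
        entry % (ODP_MIMETYPE, "/"),
        entry % ("text/xml", "content.xml"),
        entry % ("text/xml", "styles.xml"),
    ]
    # Every embedded picture must be listed in the manifest as well.
    for path, (media_type, _data) in sorted(pictures.items()):
        lines.append(entry % (media_type, path))
    lines.append("</manifest:manifest>")

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        # The mimetype entry goes first and is stored without compression.
        package.writestr("mimetype", ODP_MIMETYPE, compress_type=zipfile.ZIP_STORED)
        package.writestr("content.xml", content_xml)
        package.writestr("styles.xml", styles_xml)
        for path, (_media_type, data) in sorted(pictures.items()):
            package.writestr(path, data)
        package.writestr("META-INF/manifest.xml", "\n".join(lines))
    return buffer.getvalue()
```

Building the manifest before writing the archive keeps the list of embedded files and the actual archive entries derived from the same mapping, so they cannot drift apart.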

content.xml also contains style information. This is styling information that is contained in 'automatic' styles, i.e. styles that originate from incidental style changes in the document.

According to this distinction, a PowerPoint template would give a nearly empty content.xml file but the same styles.xml file that would be produced for a document created from the template.

== style inheritance in PPT files ==

Slides, handouts and notes can have a master. The master has an OfficeArtDggContainer that specifies the default styles for all drawing objects. For text objects the default styles are defined in textCFDefaultsAtom, textPFDefaultsAtom, and textSIDefaultsAtom.
For placeholders, slightly different rules apply. A placeholder is first defined in the master. The slide instances can re-use the placeholder and also inherit all the styles from the placeholder. Every shape (OfficeArtSpContainer) can be a placeholder. It is a placeholder if the optional field clientData.placeholderAtom.position exists and is not 0xFFFFFFFF.

If a shape is not a placeholder, only the global defaults apply to it. If a shape is a placeholder, its styles derive from the styles of the placeholder defined in the master.
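
The placeholder test and the resulting choice of style source can be sketched as follows. The function names and the use of plain dictionaries for styles are assumptions made for this illustration. Layering the placeholder styles on top of the global defaults is also an assumption; the text above only states where the styles derive from.

```python
from typing import Dict, Optional

# Value of clientData.placeholderAtom.position that marks "not a placeholder".
NOT_A_PLACEHOLDER = 0xFFFFFFFF


def is_placeholder(position: Optional[int]) -> bool:
    """position is None when the optional placeholderAtom is absent."""
    return position is not None and position != NOT_A_PLACEHOLDER


def effective_style(global_defaults: Dict[str, str],
                    master_placeholder_style: Dict[str, str],
                    position: Optional[int]) -> Dict[str, str]:
    """Pick the styles that apply to a shape (hypothetical helper)."""
    style = dict(global_defaults)
    if is_placeholder(position):
        # Assumption: placeholder styles override the global defaults.
        style.update(master_placeholder_style)
    return style
```

For example, effective_style(defaults, title_style, None) returns a copy of the defaults, while a position of 0 brings in the styles of the master placeholder.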


=== styles.xml ===

== styles ==

We go through each part of the styles.xml file, looking at what information it contains and where that information can be found in a PPT document. The file OpenDocument-schema-v1.0-os.rng, from here on called the RNG, is taken as the authority on which elements are possible in a styles.xml file.

This is a minimal styles document according to the RNG, with zero subelements:

<?xml version="1.0" encoding="UTF-8"?>
<office:document-styles xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0">
</office:document-styles>

The root element of styles.xml is office:document-styles. The first allowed element there is office:styles. In office:styles, according to the RNG, elements from the group 'styles' (style:style, text:list-style, number:number-style, number:currency-style, number:percentage-style, number:date-style, number:time-style, number:boolean-style, number:text-style) are allowed as well as style:default-style, text:outline-style, text:notes-configuration, text:bibliography-configuration, text:linenumbering-configuration, draw:gradient, svg:linearGradient, svg:radialGradient, draw:hatch, draw:fill-image, draw:marker, draw:stroke-dash, draw:opacity and style:presentation-page-layout.

The elements in the first group all have a style:name attribute. That means they are entities that can be referenced from other parts of content.xml and styles.xml. These entities could also be placed in automatic-styles. There are also global settings that should ideally be defined in every styles.xml file. These are the elements:
  <style:default-style style:family="...">
where style:family can be any of 12 families. Each of these families can contain different elements such as style:text-properties and style:paragraph-properties. Note that if you define style:text-properties for the "paragraph" family, that does not automatically mean that it is also defined for the "text" family. To be on the safe side, we should define all of these for all families.

The style:style and style:default-style elements are important elements. The style:style element must have a name and both must have a family. The combination of name and family should be unique across both styles.xml and content.xml. A minimal style element looks like this:
  <style:style style:name="someName" style:family="graphic"/>
The value for style:family may be either text, paragraph, section, ruby, table, table-column, table-row, table-cell, graphic, presentation, drawing-page, or chart.
For each of these families, there should be a default style in /office:document-styles/office:styles.
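
As an illustration of the last point, the following sketch emits a styles.xml skeleton with one empty style:default-style per family, using the standard xml.etree.ElementTree module. The function name build_styles_document is made up for this example, and a real converter would fill each default style with properties taken from the PPT defaults.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"

# The 12 style families listed above.
FAMILIES = [
    "text", "paragraph", "section", "ruby", "table", "table-column",
    "table-row", "table-cell", "graphic", "presentation", "drawing-page",
    "chart",
]


def build_styles_document() -> str:
    """Return a styles.xml body with an empty default style per family."""
    ET.register_namespace("office", OFFICE_NS)
    ET.register_namespace("style", STYLE_NS)
    root = ET.Element("{%s}document-styles" % OFFICE_NS)
    styles = ET.SubElement(root, "{%s}styles" % OFFICE_NS)
    for family in FAMILIES:
        ET.SubElement(
            styles,
            "{%s}default-style" % STYLE_NS,
            {"{%s}family" % STYLE_NS: family},
        )
    return ET.tostring(root, encoding="unicode")
```

The XML declaration is left out of the returned string; it can be prepended when the file is written into the package.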


== global style elements ==

Besides style definitions there are other global objects related to style. These are line markers, gradients, hatches, background images, stroke dashes and placeholder layouts. They can be referenced from any style. In PPT files these objects are anonymous, which means the converter has to generate a name for each of them.


== procedure for converting ppt to odp ==

= pictures =

Pictures are easy to convert. Each image has a uid value. This binary array of 16 bytes is converted to a hexadecimal string and used as the base part of the file name. The extension reflects the type of image. So all image entries in the Pictures stream are converted to files in the zip container with a name that is derived from the uid key. This same key is used in the PowerPoint Document stream and thus it is easy to create the xlink:href attribute.
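
A minimal sketch of this naming rule is shown below. The "Pictures/" directory prefix, the type keys and the extension table are assumptions for the example; only the use of the 16-byte uid as the base name comes from the description above.

```python
# Hypothetical mapping from an image type label to a file extension.
EXTENSIONS = {
    "png": ".png",
    "jpeg": ".jpg",
    "wmf": ".wmf",
    "emf": ".emf",
}


def picture_file_name(uid: bytes, image_type: str) -> str:
    """Derive the name of a picture inside the zip container from its uid."""
    if len(uid) != 16:
        raise ValueError("uid must be exactly 16 bytes")
    # bytes.hex() gives two lowercase hex digits per byte.
    return "Pictures/" + uid.hex() + EXTENSIONS[image_type]
```

Because the name depends only on the uid and the image type, the same function can be called when writing the picture file and again when writing the xlink:href attribute, with no lookup table in between.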

= styles =

== global elements ==

The global elements such as line markers, gradients, hatches, background images, stroke dashes and placeholder layouts should be added to the front of styles.xml. The full list of these items is however only known after going through all master slides, slides, notes and handouts. Since this information is in a document tree, this is not so hard to do. To simplify the programming, a functor can be used to reduce the amount of code needed. A distinction should be made between the graphic styles and the placeholders, since the former are defined in FOPTE arrays and the latter are defined in OfficeArtSpContainers.
The names of the global objects that are later referenced should be easily accessible with an appropriate key, without the need to regenerate them. This could be achieved by a logical naming scheme and the guarantee that the objects are traversed in the same order, were it not that equal objects can be referenced more than once.
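
The traversal with a functor could look roughly like the sketch below. The document tree is modelled here as nested dictionaries with "type" and "children" keys, which is purely an assumption for illustration; the real parsed PPT structures look different. The point is that one generic walk function serves every collector.

```python
from collections import defaultdict


def walk(node, visit):
    """Call visit on node and on all of its descendants, depth first."""
    visit(node)
    for child in node.get("children", []):
        walk(child, visit)


class CollectByType:
    """Functor that gathers all nodes whose type is in the wanted set."""

    def __init__(self, wanted):
        self.wanted = set(wanted)
        self.found = defaultdict(list)

    def __call__(self, node):
        if node["type"] in self.wanted:
            self.found[node["type"]].append(node)
```

A single pass with CollectByType(["OfficeArtFOPTE", "OfficeArtSpContainer"]) keeps the graphic style sources and the placeholder candidates in separate lists, in traversal order.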

For fill-image, the rule can be simple; that element has no important additional information, so the number of the image in the blip store can be used.

For things like stroke-dash, a scheme to generate the name from the contents would be convenient or, lacking that, a map from pointers to the objects that use the stroke-dash to the stroke-dash name. This scheme would imply a functor that loops over all FOPTE elements in the entire PPT, collects the objects and stores them in a map, together with a pointer to the FOPTE container.
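
One way to generate a name from the contents is to hash the properties of the object, as in the sketch below. The class name, the name format and the choice of SHA-1 are assumptions for this example. Equal objects get the same name regardless of traversal order or how often they are referenced, which sidesteps the ordering problem mentioned above.

```python
import hashlib


class GlobalObjectNames:
    """Assign content-derived names to anonymous global style objects."""

    def __init__(self, prefix):
        self.prefix = prefix
        # Maps generated name -> properties, ready to be written to styles.xml.
        self.definitions = {}

    def name_for(self, properties):
        # Sort the items so that dictionary ordering does not affect the name.
        key = repr(sorted(properties.items())).encode("utf-8")
        name = "%s_%s" % (self.prefix, hashlib.sha1(key).hexdigest()[:8])
        self.definitions.setdefault(name, dict(properties))
        return name
```

After the traversal, the definitions mapping holds each distinct object once, so it can be emitted at the front of styles.xml, while every referencing style simply calls name_for again with the same properties.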


== TextCFException to text-properties ==

Roughly one can say that CFExceptions from the PPT format should be mapped to style:text-properties in ODF. CFExceptions occur in many places in PPT and style:text-properties occur in many places in ODF. Here we list the occurrences for both.

Where does style:text-properties occur?
In these style families: text, paragraph, table-cell, graphic, presentation, and chart. In addition, it occurs in text:list-level-style-bullet and text:list-level-style-number, which are part of text:list-style. So wherever a style:style from one of these families or a text:list-style is referenced, a style:text-properties can occur.

Where does TextCFException occur?

 DocumentContainer
  DocumentTextInfoContainer
   TextMasterStyleAtom
    TextMasterStyleLevel
     TextCFException
   TextCFExceptionAtom
    TextCFException
  SlideListWithTextContainer
   SlideListWithTextSubContainerOrAtom
    StyleTextPropAtom
     TextCFRun
      TextCFException

 MasterOrSlideContainer
  MainMasterContainer
   TextMasterStyleAtom
    TextMasterStyleLevel
     TextCFException

 OfficeArtClientTextBox
  TextClientDataSubContainerOrAtom
   StyleTextPropAtom
    TextCFRun
     TextCFException
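
Since every occurrence listed above ends in the same TextCFException structure, a single mapping function can serve all of them. The sketch below shows the idea for a few character properties. The CharFormat class and its field names are hypothetical stand-ins for the parsed TextCFException, not the real structure; the attribute names on the ODF side (fo:font-weight, fo:font-style, style:text-underline-style, fo:font-size) are ODF text properties. Fields that the exception does not set are left out, on the assumption that inherited values should then stay in effect.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CharFormat:
    """Hypothetical, simplified view of a parsed TextCFException.

    None means the exception does not specify the property.
    """
    bold: Optional[bool] = None
    italic: Optional[bool] = None
    underline: Optional[bool] = None
    size_pt: Optional[int] = None


def to_text_properties(cf: CharFormat) -> Dict[str, str]:
    """Return the attributes for a style:text-properties element."""
    attrs = {}
    if cf.bold is not None:
        attrs["fo:font-weight"] = "bold" if cf.bold else "normal"
    if cf.italic is not None:
        attrs["fo:font-style"] = "italic" if cf.italic else "normal"
    if cf.underline is not None:
        attrs["style:text-underline-style"] = "solid" if cf.underline else "none"
    if cf.size_pt is not None:
        attrs["fo:font-size"] = "%dpt" % cf.size_pt
    return attrs
```

The returned dictionary can be attached to a style:text-properties element in any of the style families or list level styles named earlier.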