/**\page data_set Data set

\section goals Main goals

We want to support easy import of data from a variety of sources, including files, pipes, SQL databases, measuring devices (real-time measurements?)
and function generators. Furthermore, the GUI should provide functions for easily editing imported data sets without opening them in a specific view
(spreadsheet, worksheet or editor).

To fix the terminology used in our project, we define some objects. The basic object for storing data is 'DataSet'.
Furthermore, we distinguish between the following kinds of data sets:

- Vector data set: data is organized in columns (vectors). Data can be numeric, text or date/time values,
where the data type always applies to a column as a whole (in contrast to spreadsheet applications à la KSpread).
- Matrix data set: data is organized in matrices and represents values of functions that depend on two variables.
- Image data set: represents imported or generated image data.

The actual origin of the data is encapsulated in the object 'DataSource'. Each 'DataSet' object has a 'DataSource' object.

\section features Planned features (user's view)

- There is a container for all imported data sets - 'DataSetContainer'.
- There is a 'DataSetManager' - part of the GUI, which provides all the functions for managing the available data sets.
It is possible to generate new sets from existing ones, to create plots from vectors of different sets etc. KST is worth looking at.
- It is possible to convert the content of a spreadsheet to vectors and/or matrices and to add them to the 'DataSetContainer'.
Maybe this should be done by default, so that, say, all the columns created by hand in different spreadsheets are accessible in the 'DataSetManager'.
- 'ProjectManager' contains a folder 'DataSets' which represents the data set container introduced above.
- The 'DataSets' folder provides functions for adding new data sets, updating all available sets, removing all sets etc. in its context menu.
- Each child node in the 'DataSets' folder corresponds to an imported/generated data set.
The different origins of the data sets (file, database or generated from a function)
are indicated in the 'ProjectManager' by different icons and different strings in the column 'Comments'.
- The context menu of each 'DataSet' node provides functions for updating, deleting, showing the data in a new spreadsheet or text editor,
attaching the data to an existing spreadsheet or worksheet etc. The 'Edit' function opens the widget associated with the corresponding 'DataSource'
(currently ImportWidget), where the origin of the data (e.g. the file name) can be changed etc.
Update functions ('watch the file', 'reload the file', 'stop/continue reading from the pipe') trigger the update in the corresponding views, if available.
Data can be displayed in a spreadsheet or in a text editor, either in read-only mode or in a mode which also allows changing the original data.
Saving the changes modifies the original data in the corresponding data source, except for data sets which are generated with the help of a function.
- There are 'Add data plot' and 'Add function plot' functions in the GUI which help to create a plot from an imported/generated data set in one step.

\section arch Architecture (developer's view)

(The contents of this section are still subject to discussion. Everyone is encouraged to provide suggestions and ideas to improve the architecture.)

We need an object for storing imported/generated data. The general interface for that is provided by 'AbstractDataSet'.

AbstractAspect
¦
`- AbstractDataSet
        ¦
        +- VectorDataSet
        ¦
        +- MatrixDataSet
        ¦
        `- ImageDataSet
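
As a rough illustration of this hierarchy, the three data set types could share an interface along the following lines. This is a minimal sketch and not the actual implementation: the members 'kind()', 'valueCount()', 'addColumn()' and 'at()' are assumptions made for this example, only numeric columns are modelled, and the 'AbstractAspect' base class is left out to keep the snippet self-contained.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Sketch only: everything except the four class names is an assumption.
class AbstractDataSet {
public:
    enum class Kind { Vector, Matrix, Image };

    explicit AbstractDataSet(std::string name) : m_name(std::move(name)) {}
    virtual ~AbstractDataSet() = default;

    const std::string& name() const { return m_name; }
    virtual Kind kind() const = 0;
    // Total number of stored values, independent of the layout.
    virtual std::size_t valueCount() const = 0;

private:
    std::string m_name;
};

// Data organized in columns; only numeric columns are modelled here.
class VectorDataSet : public AbstractDataSet {
public:
    explicit VectorDataSet(std::string name) : AbstractDataSet(std::move(name)) {}

    void addColumn(std::vector<double> column) { m_columns.push_back(std::move(column)); }
    std::size_t columnCount() const { return m_columns.size(); }

    Kind kind() const override { return Kind::Vector; }
    std::size_t valueCount() const override {
        std::size_t count = 0;
        for (const auto& column : m_columns)
            count += column.size();
        return count;
    }

private:
    std::vector<std::vector<double>> m_columns;
};

// Values z = f(x, y) on a rows x cols grid, stored row by row.
class MatrixDataSet : public AbstractDataSet {
public:
    MatrixDataSet(std::string name, std::size_t rows, std::size_t cols)
        : AbstractDataSet(std::move(name)), m_cols(cols), m_values(rows * cols, 0.0) {}

    // No bounds checking in this sketch.
    double& at(std::size_t row, std::size_t col) { return m_values[row * m_cols + col]; }

    Kind kind() const override { return Kind::Matrix; }
    std::size_t valueCount() const override { return m_values.size(); }

private:
    std::size_t m_cols;
    std::vector<double> m_values;
};

// Grayscale image with one byte per pixel (an assumption of this sketch).
class ImageDataSet : public AbstractDataSet {
public:
    ImageDataSet(std::string name, std::size_t width, std::size_t height)
        : AbstractDataSet(std::move(name)), m_pixels(width * height, 0) {}

    Kind kind() const override { return Kind::Image; }
    std::size_t valueCount() const override { return m_pixels.size(); }

private:
    std::vector<unsigned char> m_pixels;
};
```

Code that only needs the common properties (name, kind, size) can then work with 'AbstractDataSet' references, while views that need the actual values use the concrete type.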

'DataSetContainer' is one of the central parts of the application and contains all the data sets available at run-time.

AbstractPart
¦
`- DataSetContainer
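
The role of the container can be sketched as follows. The member functions 'add()', 'find()', 'remove()' and 'count()' as well as the stand-in 'DataSet' struct are hypothetical and only meant to make the idea concrete; the 'AbstractPart' base class is omitted so that the snippet compiles on its own.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the data set classes; only the name matters here.
struct DataSet {
    std::string name;
};

// Sketch only: the member functions are assumptions, and the AbstractPart
// base class is omitted to keep the example self-contained.
class DataSetContainer {
public:
    // Takes ownership of the data set and returns a non-owning pointer to it.
    DataSet* add(std::unique_ptr<DataSet> set) {
        m_sets.push_back(std::move(set));
        return m_sets.back().get();
    }

    // Returns the first data set with the given name, or nullptr.
    DataSet* find(const std::string& name) const {
        auto it = position(name);
        return it == m_sets.end() ? nullptr : it->get();
    }

    // Removes the first data set with the given name; false if there is none.
    bool remove(const std::string& name) {
        auto it = position(name);
        if (it == m_sets.end())
            return false;
        m_sets.erase(it);
        return true;
    }

    std::size_t count() const { return m_sets.size(); }

private:
    using Sets = std::vector<std::unique_ptr<DataSet>>;

    Sets::const_iterator position(const std::string& name) const {
        return std::find_if(m_sets.begin(), m_sets.end(),
            [&name](const std::unique_ptr<DataSet>& set) { return set->name == name; });
    }

    Sets m_sets;
};
```

Letting the container own the data sets keeps their lifetime in one place; views and plots would only hold non-owning pointers obtained from it.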

The view of the 'DataSetContainer' is the 'DataSetManagerWidget'.

Data can be imported from a directory, a file on disk or a database. Maybe it is better to concentrate on these sources first.
On Unix systems, importing data from measuring devices reduces to reading a device file in /dev/ or a pipe created and filled by the measuring device.
Furthermore, we need an abstract interface that allows users to add support for data sources using plug-ins or scripts.
'AbstractDataSource' provides the common interface for all possible data sources.

AbstractDataSource
        ¦
        +- DirDataSource
        ¦
        +- FileDataSource
        ¦
        +- SqlDataSource
        ¦
        `- FunctionDataSource
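
To make the idea of a common interface more tangible, here is a minimal sketch with two of the four sources. The members 'description()', 'isWritable()', 'setFileName()' and 'generate()' are assumptions for illustration and not a settled API. 'isWritable()' reflects the rule from the feature list that changes are written back to the origin except for data generated from a function.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Sketch only: all member functions are assumptions.
class AbstractDataSource {
public:
    virtual ~AbstractDataSource() = default;

    // Short text describing the origin, e.g. for the 'Comments' column.
    virtual std::string description() const = 0;
    // Whether edited data can be saved back to the origin.
    virtual bool isWritable() const = 0;
};

class FileDataSource : public AbstractDataSource {
public:
    explicit FileDataSource(std::string fileName) : m_fileName(std::move(fileName)) {}

    // Changing the origin, as done via the 'Edit' function in the GUI.
    void setFileName(std::string fileName) { m_fileName = std::move(fileName); }

    std::string description() const override { return "file: " + m_fileName; }
    bool isWritable() const override { return true; }

private:
    std::string m_fileName;
};

class FunctionDataSource : public AbstractDataSource {
public:
    FunctionDataSource(std::function<double(double)> function,
                       double start, double step, std::size_t count)
        : m_function(std::move(function)), m_start(start), m_step(step), m_count(count) {}

    // Evaluates the function at start, start + step, ... (count points).
    std::vector<double> generate() const {
        std::vector<double> values;
        values.reserve(m_count);
        for (std::size_t i = 0; i < m_count; ++i)
            values.push_back(m_function(m_start + static_cast<double>(i) * m_step));
        return values;
    }

    std::string description() const override { return "function"; }
    bool isWritable() const override { return false; }

private:
    std::function<double(double)> m_function;
    double m_start;
    double m_step;
    std::size_t m_count;
};
```

A plug-in or script would add a new source by deriving from 'AbstractDataSource' in the same way.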

The different formats of data stemming from file and SQL sources (ASCII or binary, separator, number of lines to skip, numeric or date/time or strings,
data organized in columns or rows etc.) are handled by pre-defined or user-defined I/O filters,
which convert the content of the data source to the internal representation in terms of vectors or matrices.
Custom settings are made in the ImportWidget. It is possible to save these settings as a new filter.
Data in the source can be read once, checked for updates automatically ('watch the file' function in the GUI)
or at defined time intervals, or updated only when the user triggers it. In the last case, the GUI signals if the source has changed
(e.g. 'changed' icon in the project manager).
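
What such a filter does can be sketched for the simplest case, an ASCII source with numeric values organized in columns. The names 'AsciiFilterSettings' and 'readColumns()' are hypothetical, and only two of the settings mentioned above (separator and number of lines to skip) are covered; non-numeric fields are not handled and make 'std::stod' throw.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical filter settings; the field names are assumptions.
struct AsciiFilterSettings {
    char separator = ',';
    std::size_t linesToSkip = 0;
};

// Converts separator-delimited text into numeric columns.
// Empty lines are ignored. Rows with fewer fields than others simply
// leave the remaining columns shorter; this sketch does not pad or validate.
std::vector<std::vector<double>> readColumns(const std::string& text,
                                             const AsciiFilterSettings& settings)
{
    std::vector<std::vector<double>> columns;
    std::istringstream lines(text);
    std::string line;
    std::size_t lineNumber = 0;

    while (std::getline(lines, line)) {
        // Skip the configured number of header lines.
        if (lineNumber++ < settings.linesToSkip)
            continue;
        if (line.empty())
            continue;

        std::istringstream fields(line);
        std::string field;
        std::size_t column = 0;
        while (std::getline(fields, field, settings.separator)) {
            if (column >= columns.size())
                columns.resize(column + 1);
            columns[column].push_back(std::stod(field));
            ++column;
        }
    }
    return columns;
}
```

Saving the settings made in the ImportWidget as a new filter would then amount to storing such a settings object under a name.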

\section status Current status in the back-end

The back-end code from SciDAVis already contains some concept of a data source: 'AbstractColumn'.
A better name would probably be 'AbstractVectorDataSource'
('vector' in the sense of a mathematical vector in R^N, where N is the number of rows and R can be the real numbers (or rather the subset that 'double' supports),
strings or dates; the corresponding data abstraction for 3D surface plots would be a matrix), at least from the developer's perspective.
Currently, however, there is only one real implementation of this interface: 'Column'.

LabPlot 1.x already contains some pre-defined settings in the 'ImportWidget' and has filters for reading CDF, HDF5 and Origin files.
These should be ported to the new I/O filter plug-in framework.


*/