# Datasets
This directory contains descriptions of collections of datasets that are publicly available on the internet. At the moment, only text (ASCII) files can be read.

To add a new collection of datasets, two steps are required: register the collection and document its datasets.

----
## Register new collection:
To register a new collection, add the required information to the file *DatasetCollections.json*, which has the following simple structure:

```javascript
[
        {
                "name" : "unique name of the collection",
                "description" : "description of the collection, can contain basic html-tags like <b>, etc. to format the text.",
                "url" : "URL of the collection to document where the data is taken from"
        },
        ...
]
```
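
As a rough illustration of the registration step, the following Python sketch appends an entry to a list with the structure shown above and rejects duplicate names. The helper `register_collection`, the entry `ExampleCollection` and the description text are hypothetical and not part of LabPlot; the sketch also assumes that all three keys shown above are mandatory. In practice the file can simply be edited by hand.

```python
import json

# Keys shown in the structure above; treating all three as mandatory is an assumption.
REQUIRED_KEYS = ("name", "description", "url")


def register_collection(collections, entry):
    """Return a new list with `entry` appended (hypothetical helper).

    Raises ValueError if a key is missing or the name is already in use.
    """
    missing = [key for key in REQUIRED_KEYS if key not in entry]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    if any(existing["name"] == entry["name"] for existing in collections):
        raise ValueError(f"collection name already registered: {entry['name']}")
    return collections + [entry]


# Illustrative content standing in for DatasetCollections.json.
collections = json.loads("""
[
    {
        "name": "JSEDataArchive",
        "description": "placeholder description",
        "url": "https://example.org/jse"
    }
]
""")

updated = register_collection(
    collections,
    {
        "name": "ExampleCollection",  # made-up name for illustration
        "description": "Example collection with <b>basic</b> formatting",
        "url": "https://example.org/data",
    },
)
print(json.dumps(updated, indent=4))
```

The helper returns a new list instead of modifying its argument, so the result can be inspected before it is written back to the file.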

----
## Document datasets:
The datasets of each collection are documented in a separate file *collection_name.json*. This file should follow the structure of the following example, which documents the dataset 'babyboom' from the JSE collection (*JSEDataArchive.json*):

```javascript
{
    "name": "JSEDataArchive",
    "categories": [
        {
            "name": "Category",
            "subcategories": [
                {
                    "name": "Sub-Category1",
                    "datasets": [
                        {
                            "description": "Time of Birth, Sex, and Birth Weight of 44 Babies",
                            "description_url": "http://jse.amstat.org/datasets/babyboom.txt",
                            "url": "http://jse.amstat.org/datasets/babyboom.dat.txt",
                            "filename": "babyboom",
                            "name": "Time of Birth, Sex, and Birth Weight of 44 Babies",
                            "separator": "TAB",
                            "columns": ["Time of birth recorded on the 24-hour clock", "Sex of the child (1 = girl, 2 = boy)", "Birth weight in grams", "Number of minutes after midnight of each birth"]
                        },
                        ...
                    ]
                },
                ...
            ]
        },
        ...
    ]
}
```

Note that the name of the collection specified here should be identical to the name of the JSON file without its *.json* extension.
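
The naming rule and the nesting of categories, subcategories and datasets can be checked with a few lines of Python. This is a minimal sketch, not LabPlot's own loader: the function names `collection_name_matches` and `iter_datasets` are hypothetical, and the embedded document is a trimmed copy of the example above without the `...` placeholders.

```python
import json
from pathlib import Path


def collection_name_matches(path, document):
    """Check that the "name" field equals the file name without ".json"."""
    return Path(path).stem == document["name"]


def iter_datasets(document):
    """Yield (category, subcategory, dataset) for every documented dataset."""
    for category in document.get("categories", []):
        for subcategory in category.get("subcategories", []):
            for dataset in subcategory.get("datasets", []):
                yield category["name"], subcategory["name"], dataset


# Trimmed copy of the JSEDataArchive example.
document = json.loads("""
{
    "name": "JSEDataArchive",
    "categories": [
        {
            "name": "Category",
            "subcategories": [
                {
                    "name": "Sub-Category1",
                    "datasets": [
                        {
                            "description": "Time of Birth, Sex, and Birth Weight of 44 Babies",
                            "url": "http://jse.amstat.org/datasets/babyboom.dat.txt",
                            "filename": "babyboom",
                            "name": "Time of Birth, Sex, and Birth Weight of 44 Babies"
                        }
                    ]
                }
            ]
        }
    ]
}
""")

print(collection_name_matches("JSEDataArchive.json", document))
for category, subcategory, dataset in iter_datasets(document):
    print(category, subcategory, dataset["filename"])
```

Running such a check before committing a new collection file catches a mismatch between the file name and the `name` field early.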

Several options are available to control how the text file is parsed. Below is the full list of parameters that can be specified to document the format of a data set:

```javascript
{
        "description": "description of the data set",
        "description_url": "URL to fetch the additional description",
        "url": "URL for the actual data",
        "filename": "name of the data file without the file extension",
        "name": "user-friendly name of the data set",
        "separator": "TAB",
        "comment_character": "#",
        "create_index_column": false,
        "skip_empty_parts": true,
        "simplify_whitespaces": false,
        "remove_quotes": true,
        "use_first_row_for_vectorname": true,
        "columns": [],
        "number_format": 1,
        "DateTime_format": "yyyy-MM-dd"
}
```
The only required parameters are *'description'*, *'filename'*, *'name'* and *'url'*. For the other parameters, the following default values apply:

```javascript
"description_url": "",
"separator": "auto",
"comment_character": "#",
"create_index_column": false,
"skip_empty_parts": true,
"simplify_whitespaces": false,
"remove_quotes": true,
"use_first_row_for_vectorname": false,
"columns": [],
"number_format": 1,
"DateTime_format": ""
```

Omitting these parameters from the specification of a data set therefore still results in a complete description of the data set. The meaning of these parameters and their possible values are described in LabPlot's documentation.
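
How the required parameters and the defaults combine can be sketched in Python as follows. The function `complete_dataset` is a hypothetical helper, not LabPlot code; the required keys and default values are copied from the lists above, and how LabPlot applies them internally is not described here.

```python
import copy

# Required parameters and default values as listed above.
REQUIRED = ("description", "filename", "name", "url")

DEFAULTS = {
    "description_url": "",
    "separator": "auto",
    "comment_character": "#",
    "create_index_column": False,
    "skip_empty_parts": True,
    "simplify_whitespaces": False,
    "remove_quotes": True,
    "use_first_row_for_vectorname": False,
    "columns": [],
    "number_format": 1,
    "DateTime_format": "",
}


def complete_dataset(dataset):
    """Return `dataset` with every omitted optional parameter filled in."""
    missing = [key for key in REQUIRED if key not in dataset]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    # Deep copy so the shared "columns" default list is never modified.
    completed = copy.deepcopy(DEFAULTS)
    completed.update(dataset)
    return completed


# The 'babyboom' entry reduced to the required parameters plus a separator.
babyboom = {
    "description": "Time of Birth, Sex, and Birth Weight of 44 Babies",
    "url": "http://jse.amstat.org/datasets/babyboom.dat.txt",
    "filename": "babyboom",
    "name": "Time of Birth, Sex, and Birth Weight of 44 Babies",
    "separator": "TAB",
}

completed = complete_dataset(babyboom)
for key in sorted(completed):
    print(f"{key}: {completed[key]!r}")
```

Values given in the entry take precedence over the defaults, so only parameters that differ from the defaults need to be written down.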