0001 {
0002     "name": "StatLib",
0003     "categories": [
0004         {
0005             "name": "Medicine",
0006             "subcategories": [
0007                 {
0008                     "datasets": [
0009                         {
0010                             "description": "The file ch14.dat contains the following 19 variables:\n\nPatient ID \nDate on study (MMDDYY)\nTreatment arm (D= daunorubicin, I= idarubicin)\nSex (M= male, F= female)\nAge (years)\nFAB classification  (1 - 6)\nKarnofsky score (0 - 100) \nBaseline white blood cells (in thousands per cubic millimeter)\nBaseline platelets (in thousands per cubic millimeter)\nBaseline hemoglobin (g/dl)\nEvaluable (Y= yes, N= no)\nComplete remission (CR) (Y= yes, N= no)\nCourses of chemotherapy to CR\nDate of CR (MMDDYY)\nDate of last follow-up (MMDDYY)\nStatus at last follow-up (D= dead, A= alive)\nBone marrow transplant (Y= yes, N= no)\nDate of bone marrow transplant (MMDDYY)\nInclusion in June 30, 1988 analysis (Y= yes, N= no)",
0011                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch14.dat",
0012                             "filename": "Leukemia-Trial",
0013                             "name": "Interpretation of a Leukemia Trial Stopped Early",
0014                             "separator": "SPACE",
0015                             "use_first_row_for_vectorname": false
0016                         },
0017                         {
0018                             "description": "The file ch15.dat contains the following variables:\n\n Patient ID   : Integer\n \n Institution  : 0  - Memorial Sloan-Kettering,\n                1  - Mayo Clinic,\n                2  - Johns Hopkins.\n Group        : 1 - Study,\n                0 - Control.\n\n Means of Detection : 0  - Routine Cytology,\n                      1  - Routine X-ray,\n                      2  - Both X-ray and Cytology,\n                      3  - Interval.\n\n Cell Type    : 0 - Epidermoid,\n                1 - Adenocarcinoma,\n                2 - Large Cell,\n                3 - Oat Cell,\n                4 - Other.\n Stage        : 4 digits, 1st digit (1,2,3) - overall stage,\n                          2nd digit (1,2,3) - tumor,\n                          3rd digit (0,1,2) - lymph nodes\n                          4th digit (0,1) - distant metastases\n Operated     : 1 - yes,\n                0 - no.\n Survival     : Integer - Days from detection to last date known alive.\n Survival Category : 0   - Alive,\n                     1   - Dead of lung cancer,\n                     2   - Dead of other causes.\n\n Missing values - '-'.",
0019                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch15.dat",
0020                             "filename": "Lung-Cancer",
0021                             "name": "Early Lung Cancer Detection Studies",
0022                             "separator": "SPACE",
0023                             "use_first_row_for_vectorname": false
0024                         },
0025                         {
0026                             "description": "The file ch16a.dat contains extent of scleral extension\n(extent to which the tumor has invaded the sclera or \"white of the eye\")\nas coded by two raters for each of 885 eyes.  There is one record for each\neye; the first field of each record contains a patient identifier, the\nsecond field contains the code for scleral extension assigned by rater A,\nand the third field contains the code for scleral extension assigned by\nrater B.  The coding scheme is:\n\n1=None or innermost layers\n2=Within sclera, but does not extend to scleral surface\n3=Extends to scleral surface\n4=Extrascleral extension without transection\n5=Extrascleral extension with presumed residual tumor in the orbit\n\nThe Collaborative Ocular Melanoma Study (COMS) owns the\ncopyright to this dataset;  these data are considered preliminary due\nto the ongoing nature of the COMS clinical trials.",
0027                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch16a.dat",
0028                             "filename": "Choroidal-Melanoma",
0029                             "name": "Modeling Interrater Agreement for Pathological Features of Choroidal Melanoma",
0030                             "separator": "SPACE",
0031                             "use_first_row_for_vectorname": false
0032                         },
0033                         {
0034                             "description": "The file ch16b.dat contains the degree of necrosis (tissue\ndeath) data for 3 raters.  The first field contains a patient identifier,\nand the second, third, and fourth fields contain the code for degree of\nnecrosis as assigned by raters A, B, and C, respectively.  The coding\nscheme is:\n\n1=None\n2=Less than 10% of cells\n3=Greater than or equal to 10% of cells\n\n\nThe Collaborative Ocular Melanoma Study (COMS) owns the\ncopyright to this dataset;  these data are considered preliminary due\nto the ongoing nature of the COMS clinical trials.",
0035                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch16b.dat",
0036                             "filename": "Choroidal-Melanoma-2",
0037                             "name": "Modeling Interrater Agreement for Pathological Features of Choroidal Melanoma",
0038                             "separator": "SPACE",
0039                             "use_first_row_for_vectorname": false
0040                         },
0041                         {
0042                             "description": "These data remain the copyright of the Harris Birthright Research Unit\nof the University of Aberdeen, UK. They may be used freely for\nnon-commercial purposes and can be freely distributed provided their\nsource is acknowledged.\n\nThe file ch18a.dat contains the following individual-specific variables:\n\nVariable                Coding\nControl/patient code    0=control, 1=patient\nStudy number            1-500 for each group\nNumber of smears        1-15\nBiopsy result           0=negative, 1=positive  \n                        9=missing (no biopsy)\nNumber of days from     0-840 if biopsy done,   \nlast smear to biopsy    -1 if no biopsy",
0043                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch18a.dat",
0044                             "filename": "Cervical-Cancer",
0045                             "name": "Modeling the Precursors of Cervical Cancer",
0046                             "separator": "SPACE",
0047                             "use_first_row_for_vectorname": false
0048                         },
0049                         {
0050                             "description": "These data remain the copyright of the Harris Birthright Research Unit\nof the University of Aberdeen, UK. They may be used freely for\nnon-commercial purposes and can be freely distributed provided their\nsource is acknowledged.\n\nThe file ch18b.dat contains the following smear-specific variables:\n\nVariable                Coding                  \nControl/patient code    0=control, 1=patient    \nStudy number            1-500 for each group    \nSmear number            1-15                    \nSmear grade             0=negative, 1=positive  \nInterval in days        0-3733, 0 if 1st smear  \nsince last smear",
0051                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch18b.dat",
0052                             "filename": "Cervical-Cancer",
0053                             "name": "Modeling the Precursors of Cervical Cancer",
0054                             "separator": "SPACE",
0055                             "use_first_row_for_vectorname": false
0056                         }
0057                     ],
0058                     "name": "Oncology"
0059                 },
0060                 {
0061                     "datasets": [
0062                         {
0063                             "description": "The file ch1b.dat is the waste site file, and contains the \nfollowing variables.  There are NO missing  values.\n\nx: Real, x-coordinate of location of an inactive hazardous waste\nsite containing trichloroethylene (TCE).\n\ny: Real, y-coordinate of location of an inactive hazardous waste\nsite containing trichloroethylene (TCE).\n\nsite: Integer, numerical label of waste site.\n       Key:  Site  1:  Monarch Chemicals\n             Site  2:  IBM Endicott\n             Site  3:  Singer\n             Site  4:  Nesco\n             Site  5:  GE Auburn\n             Site  6:  Solvent Savers\n             Site  7:  Smith Corona\n             Site  8:  Victory Plaza\n             Site  9:  Hadco\n             Site 10:  Morse Chain\n             Site 11:  Groton",
0064                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch1b.dat",
0065                             "filename": "Disease-Clusters",
0066                             "name": "Spatial Pattern Analysis to Detect Rare Disease Clusters",
0067                             "separator": "SPACE",
0068                             "use_first_row_for_vectorname": false
0069                         },
0070                         {
0071                             "description": "The file ch17.dat contains the following 15 variables:\n\nVariable   Description\n\nOBS        Observation number\nCN         Center obtaining and reading the scan\nID         Scan ID\nBA1        Bone area (sq cm) from centralized Reader 1\nBA2        Bone area (sq cm) from centralized Reader 2\nBA3        Bone area (sq cm) from centralized Reader 3\nBC1        Bone mineral content (gm) from centralized Reader 1\nBC2        Bone mineral content (gm) from centralized Reader 2\nBC3        Bone mineral content (gm) from centralized Reader 3\nBMD1       Bone mineral density (gm/sq cm) from centralized Reader 1\nBMD2       Bone mineral density (gm/sq cm) from centralized Reader 2\nBMD3       Bone mineral density (gm/sq cm) from centralized Reader 3\nBA         Bone area (sq cm) from participating center\nBC         Bone mineral content (gm) from participating center\nBMD        Bone mineral density (gm/sq cm) from participating center\n",
0072                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch17.dat",
0073                             "filename": "Bone-Mineral",
0074                             "name": "Quality Control for Bone Mineral Density Scans",
0075                             "separator": "SPACE",
0076                             "use_first_row_for_vectorname": false
0077                         },
0078                         {
0079                             "description": "The file ch21a.dat contains the spontaneous activity and rectal\ntemperature data (416 observations of 6 variables). There are no missing values.\n\nVariable List:\n\nOBS:\t\tObservation identification number.\n\nMORPHINE:\tDose of morphine sulfate (mg/kg) injected into study mice. The \n\t\trange is 0 to 8.0.\n\nDEL9_THC:\tDose of Delta9-THC (mg/kg) injected into study mice.  The \n\t\trange is from 0 to 15.0.\n\nREP:\t\tIdentification of study replication.  The entire 5x7 factorial \n\t\tdesign was replicated.\n\nSPON_ACT:\tSpontaneous Activity as defined by the number of interruptions \n\t\tof a photocell beam in a clear plastic cage over a 10 minute \n\t\tperiod of time.\n\nTEMP_B:\t\tRectal Temperature at baseline (just prior to treatment).\n\nTEMP_60:\tRectal Temperature at 60 minutes post treatment injection.\n\n\n",
0080                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch21a.dat",
0081                             "filename": "Drug-Interactions",
0082                             "name": "Drug Interactions Between Morphine and Marijuana",
0083                             "separator": "SPACE",
0084                             "use_first_row_for_vectorname": false
0085                         },
0086                         {
0087                             "description": "The file ch21b.dat contains the tail-flick data (510 observations of \n6 variables). Missing data are encoded with a period. \n\nVariable List:\n\nOBS:\t\tObservation identification number.\n\nREP:\t\tIdentification of study. Two 5x7 factorial experiments and one \n\t\t5x5 factorial experiment are included.\n\nMORPHINE:\tDose of morphine sulfate (mg/kg) injected into study mice. The \n\t\trange is 0 to 8.0.\n\nDEL9_THC:\tDose of Delta9-THC (mg/kg) injected into study mice.  The \n\t\trange is from 0 to 15.0.\n\nFLICK_C:\tControl Flick Time. The number of seconds required for the \n\t\tmouse to flick its tail from beneath a heat stimulus prior to \n\t\ttreatment.\n\nFLICK_T:\tTest Flick Time. The number of seconds required for the \n\t\tmouse to flick its tail from beneath a heat stimulus post \n\t\ttreatment. A 10 sec maximum latency was imposed.",
0088                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch21b.dat",
0089                             "filename": "Drug-Interactions-2",
0090                             "name": "Drug Interactions Between Morphine and Marijuana",
0091                             "separator": "SPACE",
0092                             "use_first_row_for_vectorname": false
0093                         }
0094                     ],
0095                     "name": "Other"
0096                 },
0097                 {
0098                     "datasets": [
0099                         {
0100                             "description": "The file ch20.dat contains the following variables:\n\nid           subject identifier\nclinical     indicator for selection into clinical sample:\n                   1=in clinical sample; 0=not in clinical sample\nstratum      stratum membership:\n                   1=high screen; 2=low screen blacks;\n                   3=low screen whites\nrace         subject's self-reported race:\n                   1=white; 2=black\ngender       subject's gender:\n                   1=male; 2=female\nrparents     subject's guardian status:\n                   1=does not live with both natural parents;\n                   0=lives with both natural parents\ncesdtot      subject's total center for epidemiologic studies depression\n                   scale score (range 0-60)\ncohtot       subject's total cohesion score, based on faces-ii\n                   (range 16-80)\nmdd          clinical diagnosis of major depression:\n                   1=positive diagnosis; 0=negative diagnosis\n                   9=missing for subjects not in clinical sample\nweight       sampling weights used in logistic regression; defined as\n                   number of subjects in screening sample in each stratum",
0101                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch20.dat",
0102                             "filename": "Adolescent-Depression",
0103                             "name": "Two-Stage Sampling Designs for Adolescent Depression Studies",
0104                             "separator": "SPACE",
0105                             "use_first_row_for_vectorname": false
0106                         }
0107                     ],
0108                     "name": "Psychology"
0109                 },
0110                 {
0111                     "datasets": [
0112                         {
0113                             "description": "The ten variables listed in each file are:  \n\nAge:           Age on January 1, 1982. \nGender:         0=Male, 1=Female \nEducation:      0=no college, 1=some college  \nSmoker:         1=never, 2=former, 3=current  \nCigarettes/day: values rounded UP to the nearest 5.\nYears smoked:   Number of years smoked as of January 1, 1982. \nYears quit:     Number of years since smoking cessation, as of \n                January 1, 1982 (zero indicates less than one year)\nFollowup Time:  Years from January 1, 1982 until death or last\n                interview.\nDeath codes:    0=alive, 1=death from other causes, 2=lung cancer death.\nFreq:           the frequency at which each combination of variables occurred.",
0114                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch19a.dat",
0115                             "filename": "never-smokers",
0116                             "name": "never-smokers",
0117                             "separator": "SPACE",
0118                             "use_first_row_for_vectorname": false
0119                         },
0120                         {
0121                             "description": "The ten variables listed in each file are:  \n\nAge:           Age on January 1, 1982. \nGender:         0=Male, 1=Female \nEducation:      0=no college, 1=some college  \nSmoker:         1=never, 2=former, 3=current  \nCigarettes/day: values rounded UP to the nearest 5.\nYears smoked:   Number of years smoked as of January 1, 1982. \nYears quit:     Number of years since smoking cessation, as of \n                January 1, 1982 (zero indicates less than one year)\nFollowup Time:  Years from January 1, 1982 until death or last\n                interview.\nDeath codes:    0=alive, 1=death from other causes, 2=lung cancer death.\nFreq:           the frequency at which each combination of variables occurred.",
0122                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch19b.dat",
0123                             "filename": "current-smokers-m",
0124                             "name": "current smokers: male",
0125                             "separator": "SPACE",
0126                             "use_first_row_for_vectorname": false
0127                         },
0128                         {
0129                             "description": "The ten variables listed in each file are:  \n\nAge:           Age on January 1, 1982. \nGender:         0=Male, 1=Female \nEducation:      0=no college, 1=some college  \nSmoker:         1=never, 2=former, 3=current  \nCigarettes/day: values rounded UP to the nearest 5.\nYears smoked:   Number of years smoked as of January 1, 1982. \nYears quit:     Number of years since smoking cessation, as of \n                January 1, 1982 (zero indicates less than one year)\nFollowup Time:  Years from January 1, 1982 until death or last\n                interview.\nDeath codes:    0=alive, 1=death from other causes, 2=lung cancer death.\nFreq:           the frequency at which each combination of variables occurred.",
0130                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch19c.dat",
0131                             "filename": "current-smokers-f",
0132                             "name": "current smokers: female",
0133                             "separator": "SPACE",
0134                             "use_first_row_for_vectorname": false
0135                         },
0136                         {
0137                             "description": "The ten variables listed in each file are:  \n\nAge:           Age on January 1, 1982. \nGender:         0=Male, 1=Female \nEducation:      0=no college, 1=some college  \nSmoker:         1=never, 2=former, 3=current  \nCigarettes/day: values rounded UP to the nearest 5.\nYears smoked:   Number of years smoked as of January 1, 1982. \nYears quit:     Number of years since smoking cessation, as of \n                January 1, 1982 (zero indicates less than one year)\nFollowup Time:  Years from January 1, 1982 until death or last\n                interview.\nDeath codes:    0=alive, 1=death from other causes, 2=lung cancer death.\nFreq:           the frequency at which each combination of variables occurred.",
0138                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch19d.dat",
0139                             "filename": "former-smokers-mnc",
0140                             "name": "former smokers:  male, no college",
0141                             "separator": "SPACE",
0142                             "use_first_row_for_vectorname": false
0143                         },
0144                         {
0145                             "description": "The ten variables listed in each file are:  \n\nAge:           Age on January 1, 1982. \nGender:         0=Male, 1=Female \nEducation:      0=no college, 1=some college  \nSmoker:         1=never, 2=former, 3=current  \nCigarettes/day: values rounded UP to the nearest 5.\nYears smoked:   Number of years smoked as of January 1, 1982. \nYears quit:     Number of years since smoking cessation, as of \n                January 1, 1982 (zero indicates less than one year)\nFollowup Time:  Years from January 1, 1982 until death or last\n                interview.\nDeath codes:    0=alive, 1=death from other causes, 2=lung cancer death.\nFreq:           the frequency at which each combination of variables occurred.",
0146                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch19e.dat",
0147                             "filename": "former-smokers-msc",
0148                             "name": "former smokers:  male, some college",
0149                             "separator": "SPACE",
0150                             "use_first_row_for_vectorname": false
0151                         },
0152                         {
0153                             "description": "The ten variables listed in each file are:  \n\nAge:           Age on January 1, 1982. \nGender:         0=Male, 1=Female \nEducation:      0=no college, 1=some college  \nSmoker:         1=never, 2=former, 3=current  \nCigarettes/day: values rounded UP to the nearest 5.\nYears smoked:   Number of years smoked as of January 1, 1982. \nYears quit:     Number of years since smoking cessation, as of \n                January 1, 1982 (zero indicates less than one year)\nFollowup Time:  Years from January 1, 1982 until death or last\n                interview.\nDeath codes:    0=alive, 1=death from other causes, 2=lung cancer death.\nFreq:           the frequency at which each combination of variables occurred.",
0154                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch19f.dat",
0155                             "filename": "former-smokers-fnc",
0156                             "name": "former smokers:  female, no college",
0157                             "separator": "SPACE",
0158                             "use_first_row_for_vectorname": false
0159                         },
0160                         {
0161                             "description": "The ten variables listed in each file are:  \n\nAge:           Age on January 1, 1982. \nGender:         0=Male, 1=Female \nEducation:      0=no college, 1=some college  \nSmoker:         1=never, 2=former, 3=current  \nCigarettes/day: values rounded UP to the nearest 5.\nYears smoked:   Number of years smoked as of January 1, 1982. \nYears quit:     Number of years since smoking cessation, as of \n                January 1, 1982 (zero indicates less than one year)\nFollowup Time:  Years from January 1, 1982 until death or last\n                interview.\nDeath codes:    0=alive, 1=death from other causes, 2=lung cancer death.\nFreq:           the frequency at which each combination of variables occurred.",
0162                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch19g.dat",
0163                             "filename": "former-smokers-fsc",
0164                             "name": "former smokers:  female, some college",
0165                             "separator": "SPACE",
0166                             "use_first_row_for_vectorname": false
0167                         }
0168                     ],
0169                     "name": "Smoking"
0170                 }
0171             ]
0172         },
0173         {
0174             "name": "Nature",
0175             "subcategories": [
0176                 {
0177                     "datasets": [
0178                         {
0179                             "description": "The file ch3a.dat includes the validation data collected at the stationary\nambient monitoring site.  The variables are:\n\n     1.  Date, in MM/DD/YY format,\n\n     2.  12-hour average daytime continuous ozone concentration, X_1^DC,\n\n     3.  12-hour average daytime passive ozone concentration, X_1^DP,\n\n     4.  12-hour average nighttime continuous ozone concentration, X_1^NC, and\n\n     5.  12-hour average nighttime passive ozone concentration, X_1^NP.",
0180                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch3a.dat",
0181                             "filename": "Ozone_",
0182                             "name": "Prediction Models for Personal Ozone Exposure Assessment",
0183                             "separator": "SPACE",
0184                             "use_first_row_for_vectorname": false
0185                         },
0186                         {
0187                             "description": "The file ch3b.dat includes the personal ozone exposure data.  The \nvariables are:\n\n     1.  Subject identification number, ranging from 1 to 23,\n\n     2.  Date, in MM/DD/YY format,\n\n     3.  Home region, ranging from 1 to 6,\n\n     4.  12-hour average daytime personal ozone concentration, Y,\n\n     5.  12-hour average daytime continuous ozone concentration at the\n         stationary site, X_1^DC,\n\n     6.  12-hour average nighttime continuous ozone concentration at the\n         stationary site, X_1^NC,\n\n     7.  24-hour average home outdoor passive ozone concentration, X_1^O,\n\n     8.  12-hour average home indoor daytime passive ozone concentration, X_1^DI,\n\n     9.  12-hour average home indoor nighttime passive ozone concentration, X_1^NI,\n\n    10.  Prediction values for a 12-hour microenvironmental model based\n         on hourly ozone concentrations, X_2^H,\n\n    11.  Fraction of time spent anywhere outdoors, X_3^O,\n\n    12.  Fraction of time spent at home indoors, X_3^I, and\n\n    13.  Indicator variable for whether the child stayed near the\n         home for the whole day, X_3^S, where 1 = yes, 0 = no.",
0188                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch3b.dat",
0189                             "filename": "Ozone2",
0190                             "name": "Prediction Models for Personal Ozone Exposure Assessment",
0191                             "separator": "SPACE",
0192                             "use_first_row_for_vectorname": false
0193                         }
0194                     ],
0195                     "name": "Weather"
0196                 },
0197                 {
0198                     "datasets": [
0199                         {
0200                             "description": "The file ch2.dat contains the following variables:\n\n   animal - a unique identifier associated with each C. dubia tested\n     conc - concentration (micro grams/L)\n   brood1 - number of young produced in the first brood\n   brood2 - number of young produced in the second brood\n   brood3 - number of young produced in the third brood\n    total - sum of young produced in the 3 broods (=brood1 + brood2 + brood3)",
0201                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch2.dat",
0202                             "filename": "Pollutants",
0203                             "name": "Assessing Toxicity of Pollutants in Aquatic Systems",
0204                             "use_first_row_for_vectorname": false
0205                         },
0206                         {
0207                             "description": "For a selection of months in the period 1970 to 1983, a measurement of the\nocean salinity at a depth of 100 meters off the Alaskan coast, given in parts\nper thousand.  Columns are:\n\n      1. year\n      2. month\n      3. salinity",
0208                             "url": "http://lib.stat.cmu.edu/crab/salinity",
0209                             "filename": "salinity-2",
0210                             "name": "ocean salinity",
0211                             "separator": "SPACE",
0212                             "use_first_row_for_vectorname": false
0213                         },
0214                         {
0215                             "description": "For a selection of months in the period 1970 to 1983, a measurement of the\nocean temperature at a depth of 100 meters off the Alaskan coast, given in\ndegrees Celsius.  Columns are:\n\n      1. year\n      2. month\n      3. temperature",
0216                             "url": "http://lib.stat.cmu.edu/crab/celsius",
0217                             "filename": "celsius",
0218                             "name": "ocean temperature",
0219                             "separator": "SPACE",
0220                             "use_first_row_for_vectorname": false
0221                         }
0222                     ],
0223                     "name": "Waters"
0224                 },
0225                 {
0226                     "datasets": [
0227                         {
0228                             "description": "The file ch4a.dat contains the  burlap data, with the following variables:\n\n1. mburlap = mean burlap count value obtained over 12 subplot values.\n\n2. megg = mean egg mass density per acre obtained over 21 subplot values.\n\n3. seegg = estimated standard error of mean egg mass density obtained\nover 21 subplot values.",
0229                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch4a.dat",
0230                             "filename": "Gypsy-Moth",
0231                             "name": "Measurement Error Models for Gypsy Moth Studies",
0232                             "separator": "SPACE",
0233                             "use_first_row_for_vectorname": false
0234                         },
0235                         {
0236                             "description": "The file ch4b.dat contains the defoliation data, with the following variables:\n\n1. mdef = mean defoliation value obtained from 20 subplot values.\n\n2. sedef = estimated standard error of mean defoliation\nobtained from 20 subplot values.\n\n3. megg = mean estimated egg mass density obtained over 20 subplots\n\n4. seegg = estimated standard error of mean egg \nmass density obtained from 20 subplot values.\n\n5. cdefegg = estimated covariance of mean defoliation and mean egg mass\ndensity obtained from 20 subplot values.\n",
0237                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch4b.dat",
0238                             "filename": "Gypsy-Moth2",
0239                             "name": "Measurement Error Models for Gypsy Moth Studies",
0240                             "separator": "SPACE",
0241                             "use_first_row_for_vectorname": false
0242                         },
0243                         {
0244                             "description": "The file ch7.dat contains the following variables:\n\nNo - observation number (1,...,294).\nTIME - survival time of halibut (time until death) in hours.\n       (NOTE the Table 1 in the book claims survival time is in minutes,\n       but HOURS is the correct unit)\nCENSOR - censoring indicator.  1=uncensored observation;\n               0=censored observation.\nTOWD - duration (in minutes) of time trawl net was towed on the bottom.\nDELDEPTH - difference between maximum and minimum depth observed during tow\n           (depth measured in meters).\nLENGTH - fork length of halibut in centimeters.\nHANDTIME - handling time (in minutes)  between net coming on board vessel \n             and fish being placed  in holding tanks.\nLOGCAT - natural logarithm of total catch of fish in tow.",
0245                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch7.dat",
0246                             "filename": "Atlantic-Halibut",
0247                             "name": "Survival Analysis for Size Regulation of Atlantic Halibut",
0248                             "separator": "SPACE",
0249                             "use_first_row_for_vectorname": false
0250                         },
0251                         {
0252                             "description": "The file ch9.dat contains the following variables:\n\nBIRD   : Bird id. \nRX     : 1=NT, 2=PT, 3=FT, standing for \"No Tape\" (NT), in which no visible\n         guides connected light cues\n         with the feeders below them; \"Partial Tape\" (PT), in which fluorescent\n         orange Dymo tape provided a discontinuous (i.e., broken in two places) \n         connection between each light cue and its feeder; and \"Full Tape\"\n         (FT), in which the visible guide between each light cue and\n         its feeder (fluorescent orange Dymo tape) was continuous.\n         Feeding continued for 180 trials.\nGENDER : 0=male, 1=female. \nOUTCOME: 0=failure, 1=success.",
0253                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch9.dat",
0254                             "filename": "Hummingbirds",
0255                             "name": "Spatial Association Learning in Hummingbirds",
0256                             "separator": "SPACE",
0257                             "use_first_row_for_vectorname": false
0258                         },
0259                         {
0260                             "description": "The file ch10.dat contains eight variables, with 30 cases for each.\nEach case refers to a site in the forest. The first variable,\n'random', is a character variable indicating whether the site is a\nspotted owl nest site (=N) or a site selected at random\ncoordinates (=R). Variables 2-8 contain the percents of mature forest\n(>80 years of age). The variable names indicate the outer radii of the\nrings in which the percents were calculated. They are: 0.91km,\n1.18km, 1.40km, 1.60km, 1.77km, 2.41km, and 3.38km. So, for example,\nthe variable '1.18km' contains the percents of mature forest in\nrings with outer radius 1.18km and inner radius .91km centered at \nthe different sites.",
0261                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch10.dat",
0262                             "filename": "Habitat-Association",
0263                             "name": "Habitat Association Studies of the Northern Spotted Owl, Field Grouse, and Flammulated Owl",
0264                             "separator": "SPACE",
0265                             "use_first_row_for_vectorname": false
0266                         },
0267                         {
0268                             "description": "The file ch11a.dat contains a body temperature time series for an\nadult female beaver (Castor canadensis) obtained December 12-13, 1990 \nat Sandhill Wildlife Area, Wisconsin. Observations were made at 10\nminute intervals. These observations follow a random pattern of\nfluctuations, typically observed during freeze-up for all beaver in\nthis study. \n\nVariable List:\n\nObservation No.\nJulian day\nTime\nBody temperature (degrees C) \nActivity (0 = animal inside retreat; 1 = animal outside retreat) \n",
0269                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch11a.dat",
0270                             "filename": "Beaver-Body-Temperatures",
0271                             "name": "Time-Series Analyses of Beaver Body Temperatures",
0272                             "separator": "SPACE",
0273                             "use_first_row_for_vectorname": false
0274                         },
0275                         {
0276                             "description": "The file ch11b.dat contains a body temperature time series for\na subadult female beaver (Castor\ncanadensis). Observations were made at Sandhill Wildlife Area,\nWisconsin, November 3-4, 1990 (before freeze-up). Temperature\nobservations follow a plateau pattern, typically observed during\nthe entire ice-free period (late spring to late autumn). Only the\nfirst 100 observations are included in this data set.\n\nVariable list:\n\nObservation number\nJulian day\nTime\nBody temperature (degrees C)\nActivity (0 = animal inside retreat; 1 = animal outside retreat)",
0277                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch11b.dat",
0278                             "filename": "Beaver-Body-Temperatures2",
0279                             "name": "Time-Series Analyses of Beaver Body Temperatures",
0280                             "separator": "SPACE",
0281                             "use_first_row_for_vectorname": false
0282                         },
0283                         {
0284                             "description": "The main data set consists of king crab pot survey data for the years 1973\nthrough 1986.  The surveys were conducted in the waters around Kodiak Island,\nAlaska, using pots similar to the pots used by the commercial fishing fleet.\n(A crab pot is a trap that resembles a wooden crate.)  A fixed sampling grid\nwas used to place strings of pots (stations) consisting usually of 10 pots in\nopen ocean, or of 2-5 pots in bays.  The pots were left in the water for\nperiods of 16-24 hours, removed, and the crab counts recorded.  The survey was\nconducted each summer, 2-4 weeks prior to start of the commercial fishing\nseason.  The crab counts are classified by size (roughly representing age) and\nsex into six categories.\n\nThe basic survey data is a file \"survey\", containing a 3,450 by 14 matrix\nwith these columns:\n\n      1. Year (last two digits)\n      2. Fishing district (one of four)\n      3. Station identifier (alphabetic)\n      4. The number of pots fished\n    5-6. Latitude and longitude of the location halfway between\n             the first and last pot of the station\n      7. Number of pre-recruit-4 crab\n      8. Number of pre-recruit-3 crab\n      9. Number of pre-recruit-2 crab\n     10. Number of pre-recruit-1 crab\n     11. Number of recruit males\n     12. Number of post-recruit males\n     13. Number of juvenile females\n     14. Number of adult females",
0285                             "url": "http://lib.stat.cmu.edu/crab/survey",
0286                             "filename": "survey_",
0287                             "name": "Survey",
0288                             "separator": "SPACE",
0289                             "use_first_row_for_vectorname": false
0290                         },
0291                         {
0292                             "description": "====================  Contents of file \"dstns\" ============================\n \nFor each of the years in the survey (1973 to 1986), a frequency distribution\nof the crab by size (in 1 mm increments) that were surveyed.  Separate\ndistributions are given for juvenile females, adult females, and all males.\nThe five columns are:\n\n      1. year\n      2. length in mm\n      3. count of juvenile females\n      4. count of adult females\n      5. count of all males",
0293                             "url": "http://lib.stat.cmu.edu/crab/dstns",
0294                             "filename": "dstns",
0295                             "name": "dstns",
0296                             "separator": "SPACE",
0297                             "use_first_row_for_vectorname": false
0298                         },
0299                         {
0300                             "description": "For each of the 14 years in the survey (1973-86), an estimate of the number of\neggs per female.  Columns are:\n\n      1. year\n      2. estimated eggs per adult female",
0301                             "url": "http://lib.stat.cmu.edu/crab/eggs",
0302                             "filename": "eggs",
0303                             "name": "eggs per female",
0304                             "separator": "SPACE",
0305                             "use_first_row_for_vectorname": false
0306                         },
0307                         {
0308                             "description": "For each year in the survey, a frequency distribution of all females\ncross-classified by size (in 1 mm increments) and percent clutch fullness (5\ncategories).  Clutch fullness is, roughly, the realized egg-bearing potential\nof a female crab.  The seven columns are:\n\n      1. year\n      2. size, in mm\n      3. count of females with 0% fullness\n      4. count of females with 1-29% fullness\n      5. count of females with 30-59% fullness\n      6. count of females with 60-89% fullness\n      7. count of females with 90-100% fullness",
0309                             "url": "http://lib.stat.cmu.edu/crab/fullness",
0310                             "filename": "fullness",
0311                             "name": "Clutch fullness",
0312                             "separator": "SPACE",
0313                             "use_first_row_for_vectorname": false
0314                         }
0315                     ],
0316                     "name": "Animals"
0317                 },
0318                 {
0319                     "datasets": [
0320                         {
0321                             "description": "John O. Rawlings and Susan E. Spruill\n\nThe data set ch5.dat contains the following variables:\n\n1.  site: coded 1-6 corresponding to the location code used in Table 1.\n2.  block: block within site coded 1, 2, ... within sites for the RCB designs;\n    block=1 for all observations for the CRD designs, sites 5 and 6.\n3.  rep: replication within site coded as missing in sites 1-4;\n    coded as 1, 2, ... for replicates in the CRD design.\n4.  ozone: target ozone treatment, coded 0.0=charcoal filtered air, \n    1.0=nonfiltered air, \"x.x\"=target level of ozone as multiple of \n    ambient ozone level.\n5.  rain:  acidic rain treatment, coded as pH of rain solution.\n6.  fam:  genetic family, coded as 1, 2, ... within sites.\n7.  ppmhrs:  cumulative ozone exposure (ppm-h) during the two years of\n    the trials.\n8.  vwpH:  cumulative exposure to acidic rain computed as vwpH \n    = -log(sum(volume*hydrogen ion concentration)).\n9.  biomass:  total above ground biomass (g) after two growing seasons.\n10. diam:  increment of diameter growth (mm) during the two growing seasons.\n11. DMA:  whole-plot component of the covariate initial diameter (mm)\n    expressed as the deviation of the whole-plot mean from the overall\n    site mean.\n12. DMB:  sub-plot component of the covariate initial diameter (mm)\n    expressed as the deviation of the subplot mean from the whole-plot mean.\n13. D2HA: whole-plot component of the covariate initial volume, \n    approximated as diameter squared times height, and expressed as\n    the deviation of the whole-plot mean from the overall site mean.\n14. D2HB:  sub-plot component of the covariate initial volume and\n    expressed as the deviation of the subplot mean from the whole-plot mean.\n15. DMOT:  depth to mottling (cm) of the clay soil; one measurement\n    per whole-plot. \n\nMissing data are coded with '.'",
0322                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch5.dat",
0323                             "filename": "Pine-Seedling",
0324                             "name": "Estimating Pine Seedling Response to Ozone and Acid Rain",
0325                             "separator": "SPACE",
0326                             "use_first_row_for_vectorname": false
0327                         },
0328                         {
0329                             "description": "The file ch8.dat contains the following variables:\n\nPop  - population code, 1034 or 1040\nADH - 1 (cepa),  2 (het) or 3 (fist)\nIDH - 1 (cepa),  2 (het) or 3 (fist)\nPGI - 1 (cepa),  2 (het) or 3 (fist)\nfreq - frequency",
0330                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch8.dat",
0331                             "filename": "Hybrid-Onions",
0332                             "name": "Mixture Fraction and Linkage Analyses for Hybrid Onions",
0333                             "separator": "SPACE",
0334                             "use_first_row_for_vectorname": false
0335                         }
0336                     ],
0337                     "name": "Plants"
0338                 },
0339                 {
0340                     "datasets": [
0341                         {
0342                             "description": "The file ch6.dat contains the following variables. \n\n\nSTRATA : National Marine Fisheries Service (NMFS) 4 digit strata\n         designator in which the sample was taken              \n                                                                            \nSAMPLE : Sample number per year ranging from 1 to approximately 450\n\nLAT  : Location in terms of latitude  of each sample in the Atlantic Ocean      \n\nLONG  : Location in terms of longitude of each sample in the Atlantic Ocean\n      \nTCATCH : Total number of scallops caught at the ith sample location\n\nPREREC : Number of scallops whose shell length is smaller than 70 millimeters\n                         \nRECRUITS : Number of scallops whose shell length is 70 millimeters or larger",
0343                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch6.dat",
0344                             "filename": "Scallop-Abundance",
0345                             "name": "Geostatistical Estimates of Scallop Abundance",
0346                             "separator": "SPACE",
0347                             "use_first_row_for_vectorname": false
0348                         }
0349                     ],
0350                     "name": "Geology"
0351                 },
0352                 {
0353                     "datasets": [
0354                         {
0355                             "description": "Geographical coordinates of the shoreline of the 17 islands that form the\nKodiak Island group.  The two columns are\n\n      1.  latitude\n      2.  longitude\n\nmeasured in degrees and fractions of a degree.  Each of the 17 groups of\ncoordinates is terminated by a pair of \"NA\"s, and the end of each group loops\nback to the beginning.  For drawing maps, bear in mind that longitude is\nmeasured East to West, which is right to left.  This suggests plotting\nnegative longitude instead of longitude.  Also, to draw maps that \"look right\"\nto an Alaskan, you must take into account that in this part of the world the\naspect ratio of one degree latitude (y-axis) to one degree longitude (x-axis)\nis 1:1.8 (in terms of actual ground distance).",
0356                             "url": "http://lib.stat.cmu.edu/crab/kodiak",
0357                             "filename": "kodiak",
0358                             "name": "Geographical coordinates of the shoreline of Kodiak Island group",
0359                             "separator": "SPACE",
0360                             "use_first_row_for_vectorname": false
0361                         }
0362                     ],
0363                     "name": "Other"
0364                 }
0365             ]
0366         },
0367         {
0368             "name": "Statistics",
0369             "subcategories": [
0370                 {
0371                     "datasets": [
0372                         {
0373                             "description": "Some statistics on the fishing fleet and commercial catch, for each year\nbetween 1960 and 1982.  The six columns are:\n\n      1. year\n      2. number of vessels registered for fishing\n      3. number of crab caught\n      4. total weight in kilograms of crab caught\n      5. total number of pot-lifts.\n      6. wholesale price of king crab in dollars per pound",
0374                             "url": "http://lib.stat.cmu.edu/crab/fleet",
0375                             "filename": "fleet",
0376                             "name": "fishing fleet and commercial catch",
0377                             "separator": "SPACE",
0378                             "use_first_row_for_vectorname": false
0379                         },
0380                         {
0381                             "description": "Commercial catch data for 1960-1982, broken out by district.  The four columns\nare:\n\n      1. year\n      2. district number (1, 2, 3 or 4)\n      3. total catch as a count\n      4. total catch in kilograms",
0382                             "url": "http://lib.stat.cmu.edu/crab/catch",
0383                             "filename": "catch",
0384                             "name": "Commercial catch data",
0385                             "separator": "SPACE",
0386                             "use_first_row_for_vectorname": false
0387                         }
0388                     ],
0389                     "name": "Economics"
0390                 },
0391                 {
0392                     "datasets": [
0393                         {
0394                             "description": "The file ch12.dat contains the following variables:\n\nlstay: Length of stay of a resident\nage: Age of a resident\ntrt: Nursing home assignment (1: receive treatment, 0: control)\ngender: Gender (1: male, 0: female)\nmarstat: Marital status (1: married, 0: not married)\nhlstat: Health status (2: second best, 5: worst)\ncens: Censoring indicator (1: censored, 0: discharged)",
0395                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch12.dat",
0396                             "filename": "Nursing-Home-Usage",
0397                             "name": "Parametric Duration Analysis of Nursing Home Usage",
0398                             "separator": "SPACE",
0399                             "use_first_row_for_vectorname": false
0400                         },
0401                         {
0402                             "description": "This data set was derived from sample survey data collected in 1988\nin two surveys designed to evaluate the City of Toronto Workplace\nSmoking By-law (National Health Research and Development Program,\nCanada, Project Grant 6606-3346-46).  The principal investigator\nwas Dr. L.L. Pederson, University of Western Ontario, Ontario,\nCanada.  The surveys were conducted by the Institute for Social\nResearch at York University, Ontario, Canada in January-February\n1988 and in November-December 1988.  By agreement with the\nInstitute for Social Research, York University, the survey data are\nin the public domain.  This data set can be used freely for\nnoncommercial purposes and can be freely distributed.\n\nThere are 15 variables in the data set, with values separated by\nblanks.   There are no missing values.  The CSB variable names are as\nfollows:  \n\nidno  y  w  x1   x2   x3   z1   z2   z3   z4   z5   z6   z7   z8   z9\n\n\nSHORT DESCRIPTION     NAME       DEFINITION AND CODING\n\nUnique identifier     idno       (5 digits, beginning with 1 or 2)\n\nOutcome               y          Attitude toward smoking in the\n                                 workplace.  
Smoking should be: \n                                 (1 = prohibited, 2 = restricted,\n                                 0 = unrestricted)\n\nWeight                w          Sampling/post-stratification weight\n                                 (ranges from 0.305 to 4.494)\n\nTime                  x1         Time of survey relative to\n                                 implementation of the by-law \n                                 on March 1, 1988\n                                 (1 = post, 0 = pre)\n\nWork                  x2         Place of work indicator 1\n                                 with City of Toronto as baseline\n                                 (1 = outside City of Toronto,\n                                 0 = otherwise)\n\n                      x3         Place of work indicator 2\n                                 with City of Toronto as baseline\n                                 (1 = not outside the home, \n                                 0 = otherwise)\n\nResidence             z1         Place of residence\n                                 (1 = City of Toronto, \n                                 0 = other Metro Toronto)\n\nSmoking               z2         Smoking status indicator 1\n                                 with those who have never smoked \n                                 as the baseline\n                                 (1 = current smoker, \n                                 0 = otherwise)\n\n                      z3         Smoking status indicator 2\n                                 with never as the baseline\n                                 (1 = quit <=6 months ago, \n                                 0 = otherwise)\n\n                      z4         Smoking status indicator 3\n                                 with never as the baseline\n                                 (1 = quit >6 months ago, \n                                 0 = otherwise)\n\n                      z5         Smoking status indicator 4\n                        
         with quit >12 months as the baseline\n                                 (1 = quit 6-12 months, \n                                 0 = otherwise)\n\nKnowledge             z6         Knowledge of health effects of\n                                 environmental tobacco smoke\n                                 (score, ranges from 0 to 12)\n\nSex                   z7         Sex of respondent\n                                 (1 = male, 0 = female)\nAge                   z8         Age of respondent\n                                 ( (age in years - 50)/10 )\n\nEducation             z9         Level of education\n                                 (-2 = elementary, \n                                 -1 = some high school, \n                                 0 = high or trade school, \n                                 1 = college or some university,\n                                 2 = university degree)\n  ",
0403                             "url": "http://lib.stat.cmu.edu/datasets/csb/ch13.dat",
0404                             "filename": "Smoking-Restrictions",
0405                             "name": "Analysis of Attitudes Towards Workplace Smoking Restrictions",
0406                             "separator": "SPACE",
0407                             "use_first_row_for_vectorname": false
0408                         }
0409                     ],
0410                     "name": "Population"
0411                 }
0412             ]
0413         }
0414     ]
0415 }