
About the Database

Two sets of data are available: SAS files (labeled ‘analysis files’) and comma-delimited files (labeled ‘raw files’). Both follow a relational structure, in which the key variables ID (DEIDNUM) and Visit (VISIT) link the information in each file to the individuals.

Downloading the analysis dataset from the website creates 52 SAS dataset files. Together these files contain all the data from the CALERIE study, but they must be linked to create analysis files. The documentation of the analysis datasets lists the available datasets (pages 1 & 2) and the variables contained in each (pages 3-180). An example of a SAS program that joins the files to test a simple hypothesis can be found here.

The raw database is organized by content area, also in a relational structure: 52 small databases, each containing the SubjectID, the Visit#, and the variables collected, which can be connected by ID and VISIT. These differ slightly from the SAS files. To see when the variables within each dataset were measured, look at pages 1-3 of the rawdata contents file.
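As a rough illustration of connecting two of the comma-delimited files by ID and visit, here is a minimal Python sketch using only the standard library. It assumes the key columns are named DEIDNUM and VISIT, as described above; the sample rows, the visit labels, and the WEIGHT_KG and SBP columns are hypothetical stand-ins, not actual CALERIE variables or data. Check the rawdata contents file for the real column names before adapting it.

```python
import csv
import io

# Hypothetical stand-ins for two raw comma-delimited files. The rows, visit
# labels, and the WEIGHT_KG / SBP columns are invented for illustration only.
WEIGHT_CSV = """DEIDNUM,VISIT,WEIGHT_KG
1001,Baseline,82.4
1001,Month 12,74.9
1002,Baseline,77.0
"""

BP_CSV = """DEIDNUM,VISIT,SBP
1001,Baseline,118
1001,Month 12,111
1003,Baseline,125
"""

# Assumed key columns shared by every file.
KEYS = ("DEIDNUM", "VISIT")


def read_keyed(text, keys=KEYS):
    """Index the rows of a CSV text by the tuple of key-column values."""
    rows = {}
    for row in csv.DictReader(io.StringIO(text)):
        rows[tuple(row[k] for k in keys)] = row
    return rows


def inner_join(left, right):
    """Keep only the ID/visit combinations present in both tables."""
    merged = []
    for key in sorted(left.keys() & right.keys()):
        # Key columns hold the same values in both rows, so merging is safe.
        merged.append({**left[key], **right[key]})
    return merged


joined = inner_join(read_keyed(WEIGHT_CSV), read_keyed(BP_CSV))
for row in joined:
    print(row)
```

This is an inner join, so individuals or visits missing from either file are dropped. To read real files, replace the in-memory strings with `open(path, newline="")` handles passed to `csv.DictReader`.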

The technical documents section contains extensive information: documents describing the dataset, rules for deriving the analysis variables, data collection forms, and a table of available measures that documents each dataset in detail, along with the variables and visits pertaining to it.

Click on the image below to view an enlarged graphical explanation of the approval process for the CALERIE Research Network.