Guide for creating README.txt
README.txt files, sometimes referred to as codebooks, provide the information, or metadata, that gives added value to Digital Research Objects (DROs) such as numerical data, photographs, spreadsheets, and movies. This documentation makes DROs easier to work with and more accessible to users and researchers. The guidelines in this document describe what a well-formed README.txt should include.
Additional help can be requested by contacting the UL Research Data Specialist.
Best (better) Practices
- Create one readme file for each data file, whenever possible.
- Name the readme so that it is easily associated with the data file(s) it describes.
- Write your readme document as a plain text file.
- Format multiple readme files identically.
- Use standardized date formats.
- Follow the domain conventions of your discipline for taxonomic, geospatial and geologic names and keywords.
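As an illustration of the naming and date practices above, here is a minimal Python sketch. It assumes a "_README.txt" suffix as the naming pattern and ISO 8601 (YYYY-MM-DD) as the date standard; neither is prescribed by this guide, and the function names are hypothetical. Any consistent scheme that ties the readme to its data file would serve the same purpose.

```python
from datetime import date
from pathlib import Path


def readme_name_for(data_file: str) -> str:
    """Return a readme filename that shares the data file's base name.

    The "_README.txt" suffix is an assumed convention, not a requirement.
    """
    return f"{Path(data_file).stem}_README.txt"


def iso_date(d: date) -> str:
    """Format a single date as YYYY-MM-DD (ISO 8601)."""
    return d.isoformat()


def iso_date_range(start: date, end: date) -> str:
    """Format a collection period as an ISO 8601 interval (start/end)."""
    return f"{iso_date(start)}/{iso_date(end)}"


if __name__ == "__main__":
    print(readme_name_for("survey_2021.csv"))
    print(iso_date_range(date(2021, 3, 5), date(2021, 4, 1)))
```

Because the readme name is derived from the data file name, the two files sort next to each other in a directory listing, and the same date format can be reused in every readme of a project.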
Recommended Content
Source material for this page: Cornell University's Research Data Management Service Group.
You may also wish to use the following template as a guide (right-click and save as).
Recommended minimum content for data re-use is in bold.
- General information
  - Provide a title for the dataset
  - Name/institution/address/email information for
    - Principal investigator (or person responsible for collecting the data)
    - Associate or co-investigators
    - Contact person for questions
  - Date of data collection (can be a single date, or a range)
  - Information about geographic location of data collection
  - Keywords used to describe the data topic
  - Language information
  - Information about funding sources that supported the collection of the data
- Data and file overview
  - For each filename, a short description of what data it contains
  - Format of the file if not obvious from the file name
  - If the data set includes multiple files that relate to one another, the relationship between the files or a description of the file structure that holds them (possible terminology might include "dataset" or "study" or "data package")
  - Date that the file was created
  - Date(s) on which the file(s) were updated (versioned) and the nature of the update(s), if applicable
  - Information about related data that were collected but are not in the described dataset
- Sharing and access information
  - Licenses or restrictions placed on the data
  - Links to publications that cite or use the data
  - Links to other publicly accessible locations of the data (see best practices for sharing data for more information about identifying repositories)
  - Recommended citation for the data (see best practices for data citation)
- Methodological information
  - Description of methods for data collection or generation (include links or references to publications or other documentation containing experimental design or protocols used)
  - Description of methods used for data processing (describe how the data were generated from the raw or collected data)
  - Any software or instrument-specific information needed to understand or interpret the data, including software and hardware version numbers
  - Standards and calibration information, if appropriate
  - Description of any quality-assurance procedures performed on the data
  - Definitions of codes or symbols used to flag low-quality or questionable values or outliers that people should be aware of
  - People involved with sample collection, processing, analysis and/or submission
- Data-specific information
  *Repeat this section as needed for each dataset (or file, as appropriate)*
  - Number of variables, and number of cases or rows
  - Variable list, including full names and definitions (spell out abbreviated words) of column headings for tabular data
  - Units of measurement
  - Definitions for codes or symbols used to record missing data
  - Specialized formats or other abbreviations used
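For tabular data, part of the data-specific information can be drafted automatically. The following Python sketch is one possible approach, not a required tool: it counts variables and rows in CSV text and lists each column with a definition and unit supplied by the author. The function name describe_table, the output labels, the sample columns, and the default missing-data code "NA" are all assumptions made for illustration.

```python
import csv
import io


def describe_table(csv_text, definitions, units, missing_code="NA"):
    """Draft the data-specific lines of a readme for one tabular file.

    csv_text     -- contents of the data file, with a header row
    definitions  -- dict mapping column name to its full definition
    units        -- dict mapping column name to its unit of measurement
    missing_code -- symbol used in the file for missing values (assumed "NA")
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, cases = rows[0], rows[1:]
    lines = [
        f"Number of variables: {len(header)}",
        f"Number of cases/rows: {len(cases)}",
        "Variable list:",
    ]
    for name in header:
        # Flag undefined columns loudly so they are not overlooked.
        definition = definitions.get(name, "DEFINITION NEEDED")
        unit = units.get(name, "n/a")
        lines.append(f"  {name}: {definition} (units: {unit})")
    lines.append(f"Missing data code: {missing_code}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "site_id,temp_c\nA1,12.5\nA2,NA\n"
    print(describe_table(
        sample,
        {"site_id": "Sampling site identifier", "temp_c": "Water temperature"},
        {"temp_c": "degrees Celsius"},
    ))
```

The generated text is only a starting point. Definitions, units, and any specialized abbreviations still need to be written and checked by someone who knows the data.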
Resources for Qualitative Datasets and Digital Humanities Projects
Best practices for readme files may need to be adjusted for qualitative datasets and digital humanities projects. For further guidance, please refer to the following resources.
- Geneva Graduate Institute includes tips specific to quantitative and qualitative data within the Documenting Data section.
- Specific to the Digital Humanities: Digital Project Preservation Plan: A Guide for Preserving Digital Humanities / Scholarship Projects (A. Miller, 2019)