This is the "Best Practices for Managing Your Data" page of the "Data Management Planning" guide.

The NSF and other granting agencies have established requirements for data sharing and data management planning. This guide presents tips and tools for researchers on how to approach data management.
Last Updated: Jun 19, 2013 URL: http://libguides.ucmercedlibrary.info/data-management

Best Practices for Managing Your Data
 

Local Help for Data Management

Contact:

Susan Borda - Digital Curation Librarian
209-631-8961

 

File Formats

Best Practices:

  • Accessible in the future, non-proprietary, commonly used by research community
  • Unencrypted and uncompressed
  • Prefer non-proprietary formats: PDF rather than Word, XML or RDF rather than RDBMS, CSV rather than XLS


 

Organizing Files/Data

Best Practices:

Folder Structure

  • Data and documentation files are in separate folders
  • Data files organized according to data type and then according to research activity
  • Documentation files are also organized the same way.
  • Limit folder nesting to 3 or 4 levels deep, with no more than 10 items in each folder.
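
The depth and item-count limits above can be checked automatically. The following is a minimal sketch, not part of the original guide; the function name `check_folder_structure` and the default limits (4 levels, 10 items) are assumptions chosen to match the bullet above.

```python
import os
from pathlib import Path

MAX_DEPTH = 4   # assumed limit: the guide suggests 3 or 4 levels
MAX_ITEMS = 10  # assumed limit: the guide suggests at most 10 items


def check_folder_structure(root, max_depth=MAX_DEPTH, max_items=MAX_ITEMS):
    """Return (folder path, problem) tuples for folders that break the limits."""
    root = Path(root)
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        current = Path(dirpath)
        # The root folder counts as depth 0, its children as depth 1, and so on.
        depth = len(current.relative_to(root).parts)
        if depth > max_depth:
            problems.append((str(current), f"nested {depth} levels deep"))
        item_count = len(dirnames) + len(filenames)
        if item_count > max_items:
            problems.append((str(current), f"contains {item_count} items"))
    return problems
```

Calling `check_folder_structure("my_project")` on a hypothetical project folder returns an empty list when every folder is within the limits.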

File Naming

  • Use brief and meaningful file names
  • Avoid spaces and special characters
  • Include file versioning in the naming scheme

File name with versioning


 

Create a Data Register

Create a text document or table that includes:

  • what data you're collecting
  • format(s)
  • naming convention
  • location where you're storing the data
  • owner (who's collecting, creating, or responsible for the data)
  • access (who is allowed access)
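
One way to keep such a register is as a small CSV table with one column per item listed above. This is only a sketch: the column names, the function `write_register`, and every value in `example_entries` are hypothetical illustrations, not taken from the guide.

```python
import csv

# Assumed column names, one per item in the list above.
REGISTER_FIELDS = [
    "data", "formats", "naming_convention", "location", "owner", "access",
]


def write_register(path, entries):
    """Write the data register to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=REGISTER_FIELDS)
        writer.writeheader()
        writer.writerows(entries)


# Hypothetical entry, for illustration only.
example_entries = [
    {
        "data": "Hourly soil temperature readings",
        "formats": "CSV",
        "naming_convention": "project_parameter_site_yyyymmdd_vNN.csv",
        "location": "shared-drive/soilproject/data/temperature",
        "owner": "Field technician (collects); PI (responsible)",
        "access": "Project team only",
    },
]
```

Running `write_register("data_register.csv", example_entries)` produces a plain-text table that can be opened in any spreadsheet or text editor.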
 

About this guide

Acknowledgements: Sara Rutter, University of Hawaii at Manoa, for sharing her guide; UC3 (University of California Curation Center).

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.


 

Define Your Data Dictionary

Example Data Dictionary

Example from Hook, Les A., et al. 2010. Best Practices for Preparing Environmental Data Sets to Share and Archive. Available online (http://daac.ornl.gov/PI/BestPractices-2010.pdf) from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. doi:10.3334/ORNLDAAC/BestPractices-2010

 

Establish a Descriptive File and Dataset Naming Convention

A consistent convention will help you easily identify your files and what they contain. Use abbreviated descriptive information such as

  • project
  • content or parameter
  • location, date and/or time (yyyymmdd for easy sorting; hhmmssTZD for time)
  • version number (establish numbering system for versions)

Use only numbers, letters, dashes, and underscores. Do not use spaces or special characters. Keep names concise so they stay practical.
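
A naming convention like this can be enforced with a small helper. The sketch below assumes one possible scheme (project, content, location, date as yyyymmdd, and a two-digit version, joined by underscores); the function `build_name` and the example values are hypothetical.

```python
import re
from datetime import date

# Only letters, numbers, dashes, and underscores are allowed in each part.
ALLOWED = re.compile(r"[A-Za-z0-9_-]+")


def build_name(project, content, location, when, version, extension):
    """Build a file name such as project_content_location_yyyymmdd_vNN.ext."""
    parts = [
        project,
        content,
        location,
        when.strftime("%Y%m%d"),  # yyyymmdd sorts chronologically
        f"v{version:02d}",        # assumed scheme: two-digit version number
    ]
    for part in parts:
        if not ALLOWED.fullmatch(part):
            raise ValueError(f"invalid characters in name part: {part!r}")
    return "_".join(parts) + "." + extension


print(build_name("soilproj", "temp", "merced", date(2013, 6, 19), 2, "csv"))
# prints: soilproj_temp_merced_20130619_v02.csv
```

A part containing a space, such as `"soil proj"`, raises `ValueError` instead of producing a file name that breaks the convention.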

 

Using Excel

Best Practices:

  • Use in conjunction with a "Data Dictionary" (similar to that listed above) containing information about:
    • Variable name
    • Variable types
    • Codes and Ranges
    • Missing values
  • Place variable names in row 1
  • Always have a unique identifier per entity
  • Keep track of changes made to the worksheet
  • Format columns to match the variable type (date, numeric, text, etc.)
  • Data entry guidelines:
    • Freeze column headings so they will not scroll off the screen
    • Enter string variables in a consistent case
    • Do not leave any blank rows in the spreadsheet
    • Do not include unessential text or fancy formatting in the spreadsheet
    • Remove formulas: copy the entire spreadsheet into a new sheet using the "Values" paste option
    • Sort data with caution (always SAVE first) 
  • Verify data using double data entry
  • Save as .csv for forward compatibility and interoperability
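
Double data entry means two people (or one person twice) enter the same records independently, and the two versions are then compared. Once both versions are saved as .csv, the comparison can be scripted. This is a minimal sketch under stated assumptions: the function name `compare_entries` is hypothetical, and both files are assumed to share the same column headings and a unique identifier column named `id`.

```python
import csv


def load_rows(path, key):
    """Read a CSV file into a dict keyed by the unique identifier column."""
    with open(path, newline="", encoding="utf-8") as handle:
        return {row[key]: row for row in csv.DictReader(handle)}


def compare_entries(path_a, path_b, key="id"):
    """Return (identifier, column, value in A, value in B) for each mismatch."""
    first = load_rows(path_a, key)
    second = load_rows(path_b, key)
    mismatches = []
    for record_id in sorted(set(first) | set(second)):
        row_a = first.get(record_id)
        row_b = second.get(record_id)
        if row_a is None or row_b is None:
            # The record was entered in only one of the two files.
            mismatches.append((
                record_id,
                "<row>",
                "present" if row_a is not None else "missing",
                "present" if row_b is not None else "missing",
            ))
            continue
        for column in row_a:
            if row_a[column] != row_b.get(column):
                mismatches.append(
                    (record_id, column, row_a[column], row_b.get(column))
                )
    return mismatches
```

An empty result means the two entries agree; each tuple otherwise points to a cell to check against the original source.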

Resources:

  • DataUp - An Excel add-in that helps individuals document and prepare Excel data for archiving and sharing
  • Elliott, A. C. (2006). Preparing data for analysis using Microsoft Excel. Journal of Investigative Medicine, 54(6), 334-341.
 

Data Documentation and Metadata

Best Practices:

  • Make good use of "readme.txt" files for documenting details
  • Document:
    • Data collection methods
    • Context of data collection
    • Variable names and description
    • Algorithms used
    • Transformations of data from the raw data through analysis
    • Software and systems used for analysis
  • Use discipline-specific metadata standards
  • Use a script rather than a GUI during data analysis; a script documents each step and makes results easier to reproduce
  • Incorporate a workflow tool such as Kepler, Taverna or VisTrails


 

Effective Data Practices: References

  • Data Management 101 - California Digital Library's DataUp project
  • Best Practices for Preparing Environmental Data Sets to Share and Archive (pdf) by Hook et al., 2010.
  • DataONE Best Practices database.
  • UK Data Archive: how-to, resources on data management.
  • Some Simple Guidelines for Effective Data Management by Elizabeth T. Borer et al., Bulletin of the Ecological Society of America 90(2): 205-214, including:
    • store a read-only copy of your original raw data, and make copies to use in analysis
    • provide descriptive filenames and designate the first row of tables as a header
    • organize records in rows, using column headings that will allow analysis within columns rather than across columns, example: SITE YEAR RAIN TEMP SPEC_NAME POP
    • set up your tables so that you do not have to add columns when adding data
    • use ASCII characters to minimize translation problems with software programs
    • your data tables should contain only data; put comments in a readme text file that accompanies the table
  • DataCite on why and how to cite data
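
The guideline above about keeping the original data read-only and analysing a copy can be scripted. The following is a rough sketch, not a method described in the cited paper; the function name `protect_raw_and_copy` and the folder layout are assumptions.

```python
import shutil
import stat
from pathlib import Path


def protect_raw_and_copy(raw_path, working_dir):
    """Copy the raw file into working_dir, then mark the original read-only."""
    raw_path = Path(raw_path)
    working_dir = Path(working_dir)
    working_dir.mkdir(parents=True, exist_ok=True)

    # copyfile copies the contents only, so the copy stays writable.
    working_copy = working_dir / raw_path.name
    shutil.copyfile(raw_path, working_copy)

    # Clear the write bits for owner, group, and others on the original.
    mode = raw_path.stat().st_mode
    raw_path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
    return working_copy
```

File permissions are a safeguard against accidental edits, not a backup; a separate copy on other storage is still advisable.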