Developing Data Management Policy and Guidance Documents for your NARSTO Program or Project

A different approach to developing a data management plan in the NARSTO context.

Following is a compilation of data management policy and guidance documents for program and project use in developing data management plans. Documents can be downloaded and implemented individually or as a set, depending upon your data management needs. Please be advised that this guidance and the referenced resources will be periodically updated and that users should visit the QSSC web site (link below) for the latest versions.

Getting started
- Select the data management guidance documents needed in your Program or Project from the table of model documents that follows.
- Adopt, adapt, or refine these model documents as appropriate for your needs with input from managers, investigators, modelers, data coordinators, etc.
- Consult with the NARSTO QSSC for more information and assistance.
- Distribute the approved documents to participants to inform them of their data collection and reporting responsibilities.
- Ensure that adequate data coordination support is provided to all participants to facilitate implementing the plans.

Prepared by the NARSTO Quality Systems Science Center (QSSC)
A. Hook and Sigurd W. Christensen, NARSTO Quality Systems Science Center
Environmental Sciences Division
Oak Ridge National Laboratory
Contact: Les Hook, [email protected] , 865-241-4846

ORNL research was sponsored by the U.S. Department of Energy and performed at Oak Ridge National Laboratory (ORNL). ORNL is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725.

QSSC Version 200504207 DM-0, Page 1

Overview of Data Policy and Management Plan Development

Rationale: Providing this information to Project participants will inform them of their data reporting responsibilities, promote consistency and standardization in data and metadata collection and reporting processes, and greatly facilitate data sharing, integration, synthesis, and analysis. Guidance should be consistent with the needs of the Project.

Target Audience: The audience for these guidance documents is the investigators, experimentalists, modelers, and data coordinators responsible for generating and submitting data to a Project database, creating other data products, and archiving these data.

Guidance Documents: Each document should be 1-2 pages in length (plus attachments) and contain information that has been reviewed in light of your Project data management needs. Guidance in the model DM documents incorporates existing NARSTO data management protocols and will often be suitable for use as is. Final guidance should be consistent with the needs of the Project within the NARSTO context. Add project-specific guidance as needed.

Document Development Process: Ideally, the Project data coordinator will take the lead on selecting the needed DM documents, coordinating the project review, and modifying the guidance documents. The provided model DM documents are in MS Word format and may be copied and edited as needed. Please contact the QSSC if you have any problems with the DM documents or have questions about the NARSTO DM guidance.

Authority: Each guidance document should be approved by Project management to ensure acceptance and implementation.

Distribution: Ideally these will be web documents and would include links to on-line Project documents (e.g., DM-4, Site ID table) and NARSTO QSSC resources (e.g., variable name reference tables and DES format template). Hardcopies could be provided as needed.

Proposed Project Data Management Policy / Guidance Documents

Columns: Data Management Policy / Guidance Document | Status / Contact | Approved by / Date (yyyy/mm/dd)

Organization
DM-1   Data Flow Overview
DM-2   Data Policy Considerations
DM-3   Project Name Information
DM-4   Identifying Measurement and Sampling Sites

Data and Metadata Reporting
DM-5   Reporting Sampling and Measurement Dates and Times
DM-6   Identifying Chemical and Physical Variables and Descriptive Field Information
DM-7   Reporting Units for Chemical Variables, Particles, and Physical and Descriptive Variables
DM-8   Assigning Project-Specific and NARSTO Data Quality Flags
DM-9   Reporting and Flagging Values below Detection Limits
DM-10  Reporting Missing Data
DM-11  Reporting Uncertainty Estimates
DM-12  Reporting Conventions for Mass Measurements, Meteorological Data, and Temperature and Pressure Conditions

Data Documentation and Archiving
DM-13  Planning to Archive Data
DM-14  Creating Archive Documentation for Your Data Sets
DM-15  Creating a Searchable Index of Your Data Sets with Links to the Data Files
DM-16  Capturing Sampling and Analysis Information – Pre- and Post-Measurement
DM-17  Defining the Quality Level of Data

Data Systems Management
DM-18  Day-to-Day Operation of Data Management Systems
DM-19  Managing Electronic and Hardcopy Format Project Records
DM-20  Data Management System and Software Configuration Control Guidelines

DM-1: Data Flow Overview

SCOPE: Project (MCMA 2003 example)

PURPOSE: To inform investigators and potential data users of the general flow of data and information before, during, and after the current field campaign. Data collected by investigators will be provided to the MCMA database to meet project data analysis needs. Certain data and metadata reporting standards are necessary (e.g., DM-6, Variable naming) to facilitate efficient data reporting, processing, and analysis. Data will ultimately be sent to the NARSTO Permanent Data Archive (PDA). Our reporting standards are consistent with those for the NARSTO PDA.

Discussion:
The information is a general guide for carrying out this process. Some larger projects have onsite Data Managers who work with both the Principal Investigators (PIs) and the NARSTO QSSC. Other, smaller projects do not have Data Managers, and the PIs interact directly with the QSSC. While projects may assign data management roles and responsibilities differently, the QSSC is the source for information and assistance with data, metadata, and archiving activities.

DM-2: Data Policy Considerations

SCOPE: Project

PURPOSE: To involve all project managers and participants, as well as potential data users, in the formulation of a data policy. A clear statement of the importance of the data collection effort and of the flow of the data and information before, during, and after the current activities in the broadest possible context is needed. It is a shared responsibility of all participants to implement the data policy.

Vision:
- Is it safe to assume that data and metadata will be shared among Project investigators, and ultimately made available to the public in a timely manner through an archive facility?
- Who do you consider to be the audience for data beyond the Project team?
- Will there be a Project data integration or synthesis effort in the future?
- Do you see the value of the data as being short-term (3-5 years), mid-term (10 years), or longer (20 years)?
- Are these considerations the same for field measurement data, laboratory data, and modeling products (input data, model code, and output results)?

Compliance with (as may be applicable):
- U.S. Government OMB CIRCULAR A-110, (REVISED 11/19/93, As Further Amended 110/a110.html#72 ]
- U.S. Government Agency implementations of “Guidelines for Ensuring and Maximizing the Quality, Objectivity, Utility, and Integrity of Information Disseminated by Federal Agencies,” OMB, 2002. (67 FR 8452) 2.pdf ]

*** Example vision statement: The atmospheric sciences community is experiencing an unprecedented increase in the types and amount of data being collected, modeled, and assessed. As projects evolve to more focused, multi-investigator, interdisciplinary efforts in a period of limited resources, the timely availability and sharing of data and documentation among participants becomes increasingly important. The need for the use of this information beyond the project for climate assessments and air quality management decisions has never been greater, thus placing the additional responsibility on the project of providing for the timely submission of quality controlled data to national data centers for wider public use. ***

Timeliness of Data Availability:
- Considerations for timing of field measurement, laboratory, and modeling activities?

- Considerations for timing of laboratory results feeding modeling projects?
- Rapid turnaround of draft data within the Project? Justification?
- Will data that are the subject of student theses or dissertations need special consideration?
- Will investigators be expected to maintain or archive raw data for specified periods of time?
- History tells us enforcement of data policies requires direct involvement by the Program Manager (i.e., threat of no funding for non-compliance).

Quality Assurance:
- Will each investigation develop a QA project plan? Will the Program have an overarching QA plan? A final investigation QA summary report?
- What level of QA is desirable for data to be shared within the project? With the public?
- Flagging data?
- Encourage reporting of uncertainty measures with data values?
- Detection limits?
- Reporting of instrument calibrations and intercomparisons?
- Will common data-processing protocols be used (e.g., gap-filling, block averaging, standard software packages to convert voltages to concentrations)?

Data and Metadata Reporting:
- Investigators have an obligation to make their data easy to use by others?
- The Project will develop or adapt (e.g., from the QSSC) a formal description of preferred conventions?
- Consider extending use of uniform metadata reporting conventions beyond date and time to include site names, parameter names, CAS RNs, units, methods, missing value codes, quality flagging, etc.
- Consider that searchable, standardized metadata improves synthesis and integration efforts.

Data Archive:
- Considerations for archiving: long-term system stability and longevity?
- Consider types and amount of documentation for long-term data archiving: the “twenty year test”.
  Scientists are encouraged to document their data at a level sufficient to satisfy the well-known “20-year test”.
  That is, someone 20 years from now, not familiar with the data or how they were obtained, should be able to find data of interest and then fully understand and use the data solely with the aid of the documentation archived with the data. (National Research Council, Committee on Geophysical Data, Solving the Global Change Puzzle, A U.S. Strategy for Managing Data and Information, National Academy Press, Washington, D.C., 1991.)
- Consider project maintenance and retention of raw/minimally processed instrument data, software codes used for data processing, model code with input data and output products, and hardcopy records.

Data Ownership/Control:
- The issue of data "ownership" is a difficult one.
  o On the one hand, a system must allow an instrument operator to reap the rewards of their efforts.
  o On the other hand, the common good is served by sharing.
- The metadata should clearly state the source of the data, whether the data are preliminary and for use only within the project or suitable for widespread dissemination, and any citation requirements.
- At some point there is a legal obligation for data collected with government funds to be freely available.
- A decision is needed as to when the data sets are freely available to the outside community.
- Conflict resolution?

Protection of Intellectual Property Rights:
- How will the Project help to ensure that intellectual property rights are protected and co-authorship, acknowledgement, or credit is given to data originators and principal investigators?
- Consider the use of data in synthesis and integration studies that result in derived and value-added products.

Example statement:
  When data are required for modeling or integrating studies, the originator of the data should be consulted before data or derived products are incorporated or published in a review or integrated study. The scientist collecting such data shall be credited appropriately by either co-authorship or citation. (SAFARI 2000 DATA POLICY, February 5, f ])

Example statement: AmeriFlux Data Fair-Use Policy
  The AmeriFlux data provided on this site are freely available and were furnished by individual AmeriFlux scientists who encourage their use. Please kindly inform the appropriate AmeriFlux scientist(s) of how you are using the data and of any publication plans.
  Please acknowledge the data source as a citation or in the acknowledgments if the data are not yet published. If the AmeriFlux Principal Investigators (PIs) feel that they should be acknowledged or offered participation as authors, they will let you know and we assume that an agreement on such matters will be reached before publishing and/or use of the data for publication. If your work directly competes with the PI's analysis they may ask that they have the opportunity to submit a manuscript before you submit one that uses unpublished data. In addition, when publishing, please acknowledge the agency that supported the research. Lastly, we kindly request that those publishing papers using AmeriFlux data provide preprints to the PIs providing the data and to the data archive at the Carbon Dioxide Information Analysis Center r-use.shtml ]

DM-3: Project Name Information

SCOPE: Project (MCMA 2003 example)

PURPOSE: Provide standard names to identify the project, sampling sites, data files, data sets, and FTP site area. Resources, examples, and use in the NARSTO Data Exchange Standard (DES) template are shown.

MCMA Names

Study or Network Short Acronym: MCM3
  (Starts with a letter. Use in site names, columns 1 - 4.)
  Resource: DM-4: Identifying fixed measurement sites and mobile measurement platforms

*STUDY OR NETWORK ACRONYM: MCMA 2003
  (Use in data file and data set names, chars 1-15.)
*STUDY OR NETWORK NAME: Mexico City Metropolitan Area 2003 Field Campaign
  Resource: Data Exchange Standard Template

*ORGANIZATION ACRONYM: MIT IPURGAP
*ORGANIZATION NAME: Massachusetts Institute of Technology Integrated Program on Urban, Regional, and Global Air Pollution
  Others?
  Resource: Data Exchange Standard Template

Shared-Access FTP Site Information

Item                       Project Info
UID                        mcma (lower case)
Password                   xxxxxxxx (case sensitive)
Internal/ directory name   mcma2003 (lower case)

Resource: s.htm]

Data File and Data Set Naming

Data File:
  [*STUDY OR NETWORK ACRONYM] [unique data file descriptors] V1.csv
  Limits: 57 chars max, uppercase (except .csv)
  Example: MCMA 2003 SMPS WHERE WHEN V1.csv
  Projects should define a standard syntax for the [unique data file descriptors] portion of the data file name.

Data Set Title:
  NARSTO [*STUDY OR NETWORK ACRONYM] [Data Description]
  Limits: 80 chars max, title case
  Example: NARSTO MCMA 2003 Scanning Mobility Particle Size Data

Data Set Name:
  NARSTO [STUDY OR NETWORK ACRONYM] [Abbreviated Data Description]
  Limits: 40 chars max, uppercase
  Example: NARSTO MCMA 2003 SMPS DATA

Resource: iving.pdf
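The naming limits above lend themselves to a small automated check. The following is a minimal sketch in Python, not a QSSC tool: the function names are hypothetical, the separator between name parts is left as a parameter because the convention a project adopts is an assumption here, and counting the ".csv" extension toward the 57-character limit is likewise an assumption.

```python
def build_data_file_name(acronym, descriptors, sep, version=1):
    """Assemble '<ACRONYM><sep><DESCRIPTORS><sep>V<n>.csv' and check the limits.

    Hypothetical helper: 'sep' is whatever separator the project has agreed on.
    Assumption: the 57-character limit includes the '.csv' extension.
    """
    parts = [acronym.upper()] + [d.upper() for d in descriptors] + [f"V{version}"]
    name = sep.join(parts) + ".csv"  # everything uppercase except '.csv'
    if len(name) > 57:
        raise ValueError(f"data file name is {len(name)} chars; limit is 57")
    return name


def build_data_set_name(acronym, abbreviated_description):
    """Assemble 'NARSTO <ACRONYM> <ABBREVIATED DESCRIPTION>' (40 chars max, uppercase)."""
    name = f"NARSTO {acronym} {abbreviated_description}".upper()
    if len(name) > 40:
        raise ValueError(f"data set name is {len(name)} chars; limit is 40")
    return name


if __name__ == "__main__":
    print(build_data_file_name("MCMA 2003", ["smps", "where", "when"], sep=" "))
    print(build_data_set_name("MCMA 2003", "SMPS Data"))
```

Running such a check when files are created, rather than at submission time, keeps naming problems from accumulating across a field campaign.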


DM-4: Identifying Measurement and Sampling Sites

SCOPE: Project

PURPOSE: Provides a standard for identifying and characterizing fixed measurement locations and mobile measurement platforms used by the project as measurement and sampling sites. Specifications, resources, examples, and use in the DES template are shown.

NARSTO Standard Site Identifier

The NARSTO Standard Site Identifier is constructed as follows for both fixed and mobile sites.

Columns   Contents
1-4       Study or network acronym (see DM-3), beginning with a letter
5-6       Country code (following the ISO 3166 Standard)
7-8       State or Province
9-12      Site abbreviation (site mnemonic, 1 – 4 chars), beginning with a letter

Limits: The full 12 columns must be used, and no blanks are permitted. The last character of the site identifier can be repeated to avoid blanks, or underscore (_) character(s) can be used instead of a blank.

Resource: Site Identifier Consensus Metadata /metadatastandards/consensus site id standard.txt]
Data Exchange Standard Template for country and state codes

Examples:

Fixed Site:
,SS99USTNBNA_,BNA ,BNA - NASHVILLE INTL AIRPORT,US (UNITED STATES),TN,36.1244767,-86.6781822,

Mobile Site: (mobile platform is based at fixed site)
,SS99USTNG1PN,G1PN,Grumman G-1,US (UNITED STATES),TN,-999.99999,-999.99999,

Fixed Site:
,ES2HUSTXEFD_,EFD ,EFD - ELLINGTON FIELD AIRPORT,US (UNITED STATES),TX,29.607333,-95.158750,

Mobile Site: (same mobile platform is based at different fixed site)
,ES2HUSTXG1PN,G1PN,Grumman G-1,US (UNITED STATES),TX,-999.99999,-999.99999,
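The column layout and limits for the Standard Site Identifier can be sketched as a short Python routine. This is an illustrative sketch only; the function name build_site_id and its repeat_last option are hypothetical, and the checks cover just the rules stated above (12 columns, no blanks, leading letters, padding by underscore or by repeating the last character).

```python
def build_site_id(study, country, state, abbrev, repeat_last=False):
    """Build a 12-column site identifier: study(4) + country(2) + state(2) + site(4).

    Hypothetical helper. Short site abbreviations are padded to 4 columns with
    underscores, or by repeating the last character when repeat_last is True.
    """
    if len(study) != 4 or not study[0].isalpha():
        raise ValueError("study acronym must be 4 chars and begin with a letter")
    if len(country) != 2 or len(state) != 2:
        raise ValueError("country and state/province codes must be 2 chars each")
    if not 1 <= len(abbrev) <= 4 or not abbrev[0].isalpha():
        raise ValueError("site abbreviation must be 1-4 chars and begin with a letter")

    fill = abbrev[-1] if repeat_last else "_"
    site_id = study + country + state + abbrev.ljust(4, fill)

    if len(site_id) != 12 or " " in site_id:
        raise ValueError("site identifier must fill 12 columns with no blanks")
    return site_id


if __name__ == "__main__":
    print(build_site_id("SS99", "US", "TN", "BNA"))   # padded with an underscore
    print(build_site_id("SS99", "US", "TN", "G1PN"))  # already 4 chars
```

A mobile platform based at different fixed sites simply gets a different state or study prefix with the same 4-character mnemonic, as in the Grumman G-1 examples.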

Project Master List of Site Information

The project should maintain a master list of site identifiers, characteristics, and other available information. Some items have picklists in the NARSTO Data Exchange Standard template, which also serve to explain the meaning of the item.

Key Information Needed to Adequately Characterize Measurement and Sampling Sites

*TABLE COLUMN NAME | Required / Optional | *TABLE COLUMN UNITS | *TABLE COLUMN FORMAT TYPE | *TABLE COLUMN FORMAT FOR DISPLAY | *TABLE COLUMN MISSING CODE | PICKLIST AVAILABLE IN THE DES TEMPLATE
Site ID: standard | REQUIRED | None | Char | 12 | None |
Site abbreviation: | | | Char | 50 | None |
Country code | REQUIRED | None | Char | 50 | None | Yes
State or province code | REQUIRED | | Char | 20 | None | Yes
Latitude: decimal degrees | REQUIRED | | Decimal | 10.5 | -999.99999 |
Longitude: decimal degrees | REQUIRED | | Decimal | 10.5 | -999.99999 |
Lat/lon reference | | | Char | 120 | None |
Sampling height above ground | | | | | |
Ground elevation: above mean sea level | OPTIONAL | m (meter) | Decimal | 6.1 | -99999.9 |
Pressure: site ground level | OPTIONAL | hPa (hectopascal) | | | |
Site land use | REQUIRED | None | | | |
Site location setting | REQUIRED | None | Char | 40 | None |
Measurement start date at site | REQUIRED | yyyy/mm/dd | Date | 10 | 9999/12/31 |
Measurement end date at site | | | | | |
Co-incident measurements | | | | | None |
Site ID: study | OPTIONAL | None | Char | 12 | None |
Lat/lon accuracy | OPTIONAL | m (meter) | Decimal | 7.1 | -999.9 |
Lat/lon | OPTIONAL | | | | |
AIRS | OPTIONAL | None | Char | 25 | None |
WMO region | OPTIONAL | None | Char | 1 | None |
WDCA/GAW station | | | | | |
Site nature (fixed or mobile site) | (conditional) | | | | |
Site location type | OPTIONAL | None | Char | 20 | None |
Site: start date | OPTIONAL | yyyy/mm/dd | Date | 10 | 9999/12/31 |
Site: end date | OPTIONAL | yyyy/mm/dd | Date | 10 | 9999/12/31 |
Site: study start | | yyyy/mm/dd | Date | 10 | 9999/12/31 |
Site population class | OPTIONAL | None | Char | 30 | None | Yes
Site type | OPTIONAL | None | Char | 20 | None | Yes
Site monitoring support | OPTIONAL | None | Char | 30 | None | Yes
Site: study end date | | | | | |

Site topography | | | | | | Yes
Site monitoring | | | | | |

Project Site Information Template

There is a Project Site Information Template available to facilitate gathering the required and optional site information for the Master List. *Table Column Name row cells have embedded comments that describe the sought-after information. The Excel template has several picklists to help ensure consistency of entered values. Additional values can be added to the ARSTO template atmospheric measurements.xls]

The information collected in this site information template will be used when submitting measurement data with the DES Template to the NARSTO archive.

Use of Project Site Identification Information in DES Template

*TABLE NAME: Site information
*TABLE FOCUS: Metadata
*TABLE COLUMN NAME: Site ID: standard | Site abbreviation: standard | Description | Country code | State or province code | Latitude: decimal degrees | Longitude: decimal degrees | Additional site info in adjacent columns (see Key Info Table above)
*TABLE COLUMN UNITS:
*TABLE COLUMN FORMAT TYPE:
*TABLE COLUMN FORMAT FOR DISPLAY: (Country code) 50 | (State or province code) 20 | (Latitude) 10.5 | (Longitude) 10.5
*TABLE COLUMN MISSING CODE: (Country code) None | (State or province code) None | (Latitude) -999.99999 | (Longitude) -999.99999
*TABLE BEGINS
SS99USTNBNA_ | BNA | BNA - NASHVILLE INTL AIRPORT | US (UNITED STATES) | TN | 36.1244767 | -86.6781822
SS99USTNG1PN | G1PN | Grumman G-1 | US (UNITED STATES) | TN | -999.99999 | -999.99999
More
*TABLE ENDS

The DES template provides guidance on identifying mobile measurement platforms (e.g., airplanes, vans, and ships). The Site information table documents the site information for sites with data appearing in this file. The table name must be as shown (“Site information”). Variables may be presented in a different order than shown, but we urge that the Site ID: standard variable appear first. Other variables besides those shown may be added to this table. We suggest you consult with your local data manager before adding other variables; standard names and picklists exist for some other variables.

DM-5: Reporting Sampling and Measurement Dates and Times

SCOPE: (Example Mexico City Metropolitan Area 2003 Field Campaign)

PURPOSE: Provides a standard for reporting sampling and measurement dates and times. Resources, examples, and use in the DES template are shown.

Because reporting dates and times is so important to the success of a project, we have designed redundancy into the reporting fields for date and time to prevent many of the reporting problems encountered by similar intensive monitoring projects.

Time Basis:
Investigators will report data on a Central Standard Time (CST) basis.

Dates and Times to Report:
- Start date and time must be reported as the time at the beginning of the sampling/measurement/averaging period.
- End date and time must be reported as the time at the end of the sampling/measurement/averaging period.
- For continuous processes, the end date and time of the preceding period may be the start date and time of the next period.
- There is no 24:00 time; 23:59 is followed by 00:00 on the next day.
- Contact the QSSC for guidance if fractions of seconds are needed.

Reporting Dates and Times:
Local Time Zone is specified on every data record. Specify CST.
Sample dates and times must be reported in both CST and Coordinated Universal Time (UTC).
(1) CST. Formats: 2003/02/28 and 07:00. (Note leading zero. See footnote.)
(2) UTC. Formats: 2003/02/28 and 13:00.
CST lags UTC by 6 hours. If the Universal Time is 14:30 UTC, Central Standard Time would be 08:30 CST.
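The CST-to-UTC relationship above can be illustrated with a few lines of Python. This is a minimal sketch, not the formula supplied in the DES template; the function name cst_to_utc is hypothetical, and it assumes a fixed 6-hour offset with no daylight saving adjustment, as the CST time basis implies.

```python
from datetime import datetime, timedelta

# CST lags UTC by 6 hours (fixed offset; no daylight saving adjustment assumed).
CST_LAG = timedelta(hours=6)


def cst_to_utc(date_str, time_str):
    """Convert a CST date ('yyyy/mm/dd') and time ('hh:mm') to the UTC pair.

    Hypothetical helper. strptime rejects '24:00', which matches the rule that
    23:59 is followed by 00:00 on the next day.
    """
    local = datetime.strptime(f"{date_str} {time_str}", "%Y/%m/%d %H:%M")
    utc = local + CST_LAG
    # strftime keeps the leading zeros required by the reporting format.
    return utc.strftime("%Y/%m/%d"), utc.strftime("%H:%M")


if __name__ == "__main__":
    print(cst_to_utc("2003/02/28", "07:00"))
    print(cst_to_utc("2003/02/28", "23:59"))
```

Note that a late-evening CST time rolls over to the next UTC date, which is one reason both the date and the time are reported in each time basis.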

Time Resources:
Discussion of Coordinated Universal Time (UTC) [ ].
U.S. Naval Observatory [ ]
To set your PC to the correct U.S. time [ ]

Important note: A formula is provided in the main data table of the DES template for converting local dates and times to UTC.

Footnote: (Exact steps may vary slightly depending upon your operating system.)
For MS-Windows users, the default date and time format should be changed to the ISO format on every computer used to create Data Exchange Standard files, as follows:
a) On the Windows desktop, click on Start, Settings, Control Panel
b) Click on Regional Settings
c) Click on Date
d) In the "Short date style" field, enter yyyy/mm/dd
e) Click OK, and under Regional Settings, click on Time
f) In the Time style field, enter hh:mm:ss tt (this causes the hour to display leading zeros, e.g., 08:00)
g) Click OK

Reporting Sampling and Measurement Dates and Times in DES Template

*TABLE NAME
*TABLE FOCUS
*TABLE USER NOTE
*TABLE KEY FIELD NAMES
*TABLE COLUMN NAME
*TABLE COLUMN NAME TYPE
*TABLE COLUMN CAS IDENTIFIER
*TABLE COLUMN USER NOTE
*TABLE COLUMN UNITS
*TABLE COLUMN FORMAT TYPE
*TABLE COLUMN FORMAT FOR DISPLAY
*TABLE COLUMN MISSING CODE
*TABLE COLUMN LOOKUP TABLE NAME
*TABLE COLUMN OBSERVATION TYPE
*TABLE COLUMN FIELD SAMPLING OR MEASUREMENT PRINCIPLE
*TABLE BEGINS

MainDataTable Surface--fixed Instrument colocation ID Date start: local time Time start: local time Site altimeTime Datezone: start:local me DateTime Char DateTime DateTime Site Instrument None Supplementary data Supplementary data None Supplementary data Supplementary Supplementary data None Supplementary data None Supplementary data Supplementary data None Supplementary data Not applicable Not 0114:00 2000/01/02 13:00

*TABLE ENDS

DM-6: Identifying Chemical and Physical Variables and Descriptive Field Information

SCOPE: Project

PURPOSE: Provides the approach for identifying (i.e., naming) chemical and non-chemical/physical measured variables and various descriptive metadata elements. Resources are identified and examples are shown.

This document points to reference tables of CAS Registry Numbers and of names for chemical and non-chemical/physical variables and various metadata elements to use in the Data Exchange Standard files. Data providers are expected to use these tables to determine the appropriate identifiers for the chemical substances, physical properties, and metadata elements (e.g., date, time, locations) they are reporting.

Identifying Chemical Substances with a CAS(1) Registry Number:

Chemical Substances with a CAS RN
  CAS Identifier: Valid CAS number. (CAS Registry Number with "C" prefix.)
    Limits: The "C" prefix prevents spreadsheet programs from converting some CAS numbers to dates.
  Chemical Name: (Preferred is CAS-9CI nomenclature. IUPAC for polycyclics. Other common name might be acceptable.)
    Limits: Please request CAS RNs and 9CI names from the QSSC as needed.
  Examples: (source 9CI)
    2-Undecanone
    2-Dodecanone
    Heptane, 3-ethyl
  Resource: Chem Ref Tables.xls

(1) The CAS Registry Number and the CAS-9CI name (Chemical Abstracts Service, 9th Collective Index Nomenclature) are the copyrighted property of the American Chemical Society. The NARSTO QSSC has the permission of CAS to use this information in NARSTO archive data sets. By extension, EPA Supersites Projects and NARSTO affiliated projects may incorporate CAS numbers and CAS-9CI names into data being processed for NARSTO archiving. Furthermore, the use of CAS numbers and CAS-9CI names is permitted as required in supporting regulatory requirements and/or for reports to Government Agencies and in copyrighted scientific publications when the CAS information is incidental to the publication. Any use or redistribution other than that described here is not permitted without the prior, written permission of the American Chemical Society. Please contact the QSSC if you have any questions about the use of CAS information.

Identifying Chemical Substances/Measurements/Calculated Quantities that do not have a designated CAS Registry Number:

Chemical Substances without a CAS RN
  Chemical Substances Identifier: Formal syntax with key phrase and detailed modifier if needed, separated by a ":"
  Examples:
    Carbon: elemental (EC)
    Hydrocarbons: non-methane (NMHC)
    NOx (nitric oxide + nitrogen dioxide)
  Limits: Please request new names from the QSSC as needed.
  Resource: without CAS.xls

Identifying Physical/Non-chemical Measurements:

Physical/Non-chemical Measurements
  Physical/Non-chemical Identifier: Formal syntax with key phrase and detailed modifier if needed, separated by a ":"
  Examples:
    PM2.5: mass
    Aircraft: heading
    Humidity: relative
  Limits: Please request new names from the QSSC as needed.
  Resource:
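For the CAS identifiers described earlier in DM-6, the "valid CAS number" requirement and the "C" prefix can be sketched in Python. This is an illustration only: the function names are hypothetical, the check-digit rule is the standard CAS checksum (not something specific to NARSTO), and writing the identifier as "C" followed directly by the hyphenated number is an assumption about the exact layout.

```python
import re


def cas_check_digit_ok(cas_rn):
    """Return True if a hyphenated CAS RN (e.g. '7732-18-5') has a valid check digit.

    Standard CAS checksum: number the digits before the check digit from right
    to left starting at 1, sum position * digit, and take the result modulo 10.
    """
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas_rn)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    total = sum(pos * int(d) for pos, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def to_cas_identifier(cas_rn):
    """Prefix a valid CAS RN with 'C' so spreadsheets do not read it as a date.

    Hypothetical helper; the exact identifier layout is an assumption.
    """
    if not cas_check_digit_ok(cas_rn):
        raise ValueError(f"not a valid CAS Registry Number: {cas_rn!r}")
    return "C" + cas_rn


if __name__ == "__main__":
    print(to_cas_identifier("7732-18-5"))  # water
```

A check of this kind catches transcription errors only; it does not confirm that the number belongs to the intended substance, so the QSSC reference tables remain the source for the identifiers themselves.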
