Request for Information (RFI): Input on National Cancer Institute Metadata Repository and Services

Notice Number: NOT-CA-15-019

Key Dates
Release Date: April 3, 2015
Response Date: May 15, 2015

Related Announcements
None

Issued by
National Cancer Institute (NCI)

Purpose

The National Cancer Institute (NCI) Center for Bioinformatics and Information Technology (CBIIT) is seeking broad input and feedback from sources of expertise and interest in semantic metadata management services.

The cornerstone of these services is the Cancer Data Standards Registry (caDSR), a repository with data element descriptions and form designs, with semantics linked to NCI’s controlled terminology. The caDSR offers services to the cancer research community to create, access, maintain, and use these descriptions across application systems, files, and databases. CBIIT wants to understand how to make the NCI common data elements more useful, more accessible, and easier to integrate into research and care processes and systems, and how to better support community comment on and community curation of these elements, linked semantics, and metadata. CBIIT plans to rebuild and modernize these services and make it easier to discover the consensus standards and integrate these elements and linked data into cancer research and care workflows. The intent is to align NCI’s infrastructure with emerging NIH, national, and international metadata initiatives.

Background

CBIIT’s mission is to provide and advocate for the appropriate use of data science, informatics, and information technology (IT) to support and accelerate the NCI Mission to prevent cancer, treat cancer, and improve cancer outcomes. An important role of the NCI Semantic Infrastructure (SI) is to support the NCI research mission through community definition and collection of metadata. Data that have well-defined, linked metadata can improve the use, interpretation, and reuse of data and the extraction of information and knowledge from these data. Supporting both human-readable and machine-readable definitions and metadata has been an important driver for the NCI Semantic Infrastructure. These general metadata characteristics are also among the key principles for data citation, and are noted to enable data access, verifiability, and discoverability.

The primary goals for updating the metadata services are to:

  • Simplify and streamline community creation, curation, maintenance, and discovery;
  • Support content harmonization, leveraging automated means to identify overlapping content;
  • Support interoperability and integration of data elements, modules of elements, and semantics into existing and novel workflows; and
  • Support knowledge extraction.

Information Requested

All stakeholders with an interest in improving cancer research through the use of well-described, discoverable, and open descriptions of data are invited to provide information. Your response may mention your membership or affiliation within industry, government, or academia.

If you choose, you may identify your area of expertise, which may include but is not limited to any of the following:

  • Metadata management services or software;
  • Formal or community metadata standards, e.g., ISO/IEC 11179, Federal Government Open Data Metadata Schema, ISA-TAB, etc.;
  • Semantics management, e.g., W3C semantic web technologies; and
  • Data Science bridging biomedical research and health care.

The NCI is seeking information that includes but is not limited to the following areas:

  • Effective approaches, processes, and capabilities to augment/replace services currently available in the caDSR;
  • Requirement gaps and related use cases (e.g., identification of specific emerging fields and technologies with multiple existing metadata standards that could benefit from harmonization and integration);
  • Lessons learned from existing metadata repository efforts, particularly examples with field-tested processes and infrastructure, or examples of failures in metadata repository efforts;
  • Common challenges in metadata repository development and interoperability (e.g., methods for community engagement, preventing redundant or duplicate content, building interoperability with related standards, and supporting transformations between similar data);
  • Simplifying the use of metadata in computational and human-based approaches to support processes including data discovery, data analysis, and data reuse; and
  • Effective approaches for linking metadata to data and data catalogues (e.g., the use of XML/JSON attributes to link to descriptive metadata associated with the data).

Submitting a Response

All responses must be submitted to SI_MDR_RFI@mail.nih.gov by May 15, 2015. Please include the Notice number in the subject line. Response to this RFI is voluntary. Responders are free to address any or all of the categories listed above. The submitted information will be reviewed by NIH staff. Submitted information will not be considered confidential.

Please do not include any proprietary, classified, confidential, or sensitive information in your response. The NIH will use the information submitted in response to this RFI at its discretion and will not provide comments to any responder's submission. The collected information will be reviewed by NIH staff, may appear in reports, and may be shared publicly on an NIH website.

The Government reserves the right to use any non-proprietary technical information in summaries of the state of the science, and any resultant solicitation(s). The NIH may use the information gathered by this RFI to inform the development of future funding opportunity announcements.

This RFI is for information and planning purposes only and should not be construed as a solicitation or as an obligation on the part of the Federal Government, the National Institutes of Health (NIH), or individual NIH Institutes and Centers. No basis for claims against the U.S. Government shall arise as a result of a response to this request for information or from the Government’s use of such information.

Inquiries

Please direct all inquiries to:

Denise Warzel
National Cancer Institute (NCI)
Telephone: 301-480-6199
Email: warzeld@mail.nih.gov