This website contains archived materials provided for historical reference purposes only.
The content and links are no longer maintained and may be outdated.
ARCHIVED - Overview of Report on Rating of Grant Applications (RGA)

Note: The information below is an overview. The full text of the report is available for download. We hope that you will carefully read and consider both of these documents; we welcome your comments, which can be sent to DDER@NIH.GOV. An Update including frequently made comments and frequently asked questions is also available for your perusal.

Overview

As part of reinvention activities and the ongoing effort to maintain high standards for peer review at the NIH, a subcommittee of the NIH Committee on Improving Peer Review was formed in the fall. This Rating of Grant Applications (RGA) subcommittee was tasked with examining the process by which scientific review groups rate grant applications and with recommending improvements to that process in light of current knowledge of measurement and decision making.

Next Steps

Changes to so critical an element of peer review as the system of rating grant applications should not be implemented without the participation and contributions of the scientific community that they will affect. Therefore, we are broadcasting this overview of the Report On Rating of Grant Applications, in the hope that we may benefit from the close scrutiny that we anticipate it will be given. In your reading of it, you may want to consider separately the various aspects of the recommendations that might be implemented; as the report itself points out, recommendations can be implemented independently of each other.

The report has been scrutinized by staff of the NIH, and is the product of a careful and conscientious working group. The full report has been sent to the directors of the Institutes and Centers, and has been an item of discussion for all of the relevant NIH-wide standing committees: the Extramural Program Management Committee (EPMC), the Review Policy Committee (RPC), and the Program Officers and Project Officers Forum (POPOF).

Current Rating System Works Reasonably Well

The current system for rating grant applications works reasonably well. No one appears to believe that poor quality science is consistently being given good scores or that exceptionally good science is being given poor scores, and this is the gold standard for a reasonable system of rating science. Thus, we recognize that there is considerable commitment to the present method of "doing business" and a disinclination to change it.

Why Change a System That Works?

In today's funding environment, it becomes increasingly important to ensure that scores are as reliable as they can be, and that NIH staff have the maximal amount of useful information on which to base funding decisions. So, while worthy science is being given good scores, there is still a range within "good" that is distinguished by reviewers but that fails to be conveyed via the scores assigned under the current system. This loss of information is due in part to a tendency of initial review groups to cluster priority scores in the "outstanding" region, which could be attributed to qualitative differences in the science being reviewed by different groups or to differences in the scoring behavior of different groups. Percentiling was an attempt to account for and counter by statistical means the differences among review groups in their scoring behavior in this regard, but the subcommittee felt that the very arithmetic of priority scores and percentiles gives an impression of a greater precision of discernment than is really the case. Besides the compression of scores within a particular range, other information that tends to be lost under the present rating system is the initial review group's assessment of the scientific significance of a grant application as distinguished from its assessment of an application's methods and feasibility.

Committee Task and Method

The work on rating of grant applications grew out of the larger context of the reinvention of NIH extramural activities. The Rating of Grant Applications (RGA) subcommittee was established by the Extramural Reinvention Committee to examine the grants review rating process with an eye toward fine-tuning the current system. The subcommittee was composed of staff representing different ICDs. Several outside experts in the behavioral aspects of decision making and in psychometrics served as consultants.

In defining the scope of its activities, the subcommittee viewed the initial review of applications as serving two functions: to assess the scientific and technical merit of an application through a narrative and one or more quantitative indices, and to comment on other aspects of the application that should be clearly separated from the assessment of scientific merit, e.g., biosafety, animal and human subject welfare. During the course of its discussions, the subcommittee developed a set of defining characteristics of initial scientific peer review, which served as points of departure in its subsequent discussions and in the development of its recommendations.

Defining Characteristics of Peer Review:

a. The rating assigned to an application should be a quantitative representation of scientific merit alone and not represent any other property of the application.

b. The criteria used to review applications should include all aspects of scientific merit, should be as salient as possible to reviewers, and should form the only basis for both the quantitative ratings and narrative critique of each application.

c. The ratings of all the reviewers should be equally able to influence the final score of scientific merit.

d. The potential for "gaming" the system (i.e., consciously or unconsciously introducing inequities based on factors other than scientific merit or otherwise distorting the rating of scientific merit) should be minimized.

e. The manner in which review results are reported should summarize the totality of information contained in the review group's ratings.

f. The form in which review results are reported should be useful to those making funding decisions and informative to advisory councils and applicants.

g. The rating system should encourage reviewers to make discriminations as fine as they can reliably make.

h. Procedures should minimize the burden for reviewers both before and at the review meeting.

i. Federal policy issues (e.g., gender/minority representation, protection of animal and human subjects) must be addressed appropriately.

Recommendations

The Committee agreed that it should not be constrained by current practice but should be prepared to propose any workable system, even if radically different, should such a system be superior. Psychometric experts and the literature on decision making and evaluative processes were consulted, available data were analyzed, and simulations were made. On the basis of all of this information, the recommendations below have been developed. These recommendations, which will be the basis for discussion about possible changes in the scoring system, are now made available for your scrutiny and comment. The recommendations need not be considered as a package; we are interested in opinions regarding the merit and feasibility of piloting each of them individually.

Review Criteria

Recommendation 1: The three proposed review criteria listed below should be adopted for unsolicited research project grant applications.

Significance: The extent to which the project, if successfully carried out, will make an original and important contribution to biomedical and/or behavioral science.

Approach: The extent to which the conceptual framework, design (including, as applicable, the selection of appropriate subject populations or animal models), methods, and analyses are properly developed, well-integrated, and appropriate to the aims of the project.

Feasibility: The likelihood that the proposed work can be accomplished by the investigators, given their documented experience and expertise, past progress, preliminary data, requested and available resources, institutional commitment, and (if appropriate) documented access to special reagents or technologies and adequacy of plans for the recruitment and retention of subjects.

Recommendation 2: Reviews should be conducted criterion by criterion, and the reviewers' written critiques should address each criterion separately.

Recommendation 3: Applications should receive a separate numerical rating on each criterion.

Recommendation 4: Reviewers should not make global ratings of scientific merit.

Rating Scale

Recommendation 5: The rating scale should be defined so that larger numerical values represent greater degrees of the characteristic being rated and the smaller values represent smaller degrees.

Recommendation 6: The number of scale positions should be commensurate with the number of discriminations that reviewers can reliably make in the characteristic being rated. An eight-step scale (0-7) is recommended on the basis of the psychometric literature; however, a maximum of 11 steps (0-10) is acceptable.

Recommendation 7: The rating scale should be anchored only at the ends. The performance of end-anchors should be evaluated and other approaches to anchoring should be investigated as needed.

Calculation, Standardization, and Reporting of Scores

Recommendation 8: Scores should be standardized on each criterion within reviewer and then averaged across reviewers. The exact parameters for this standardization should be defined by an appropriately constituted group.
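
As an illustration only, the sketch below shows one way the two steps in Recommendation 8 might be carried out for a single criterion. The report leaves the exact parameters of standardization to an appropriately constituted group, so the use of z-scores (subtracting each reviewer's own mean and dividing by that reviewer's own standard deviation) is an assumption made here, as are all function names, reviewer labels, and ratings.

```python
from statistics import mean, pstdev


def standardize_within_reviewer(ratings):
    """Convert each reviewer's raw ratings on one criterion to z-scores.

    ratings maps reviewer -> {application: raw rating}.
    Z-scoring is an assumed choice; the report does not fix the method.
    """
    standardized = {}
    for reviewer, by_app in ratings.items():
        mu = mean(by_app.values())
        sigma = pstdev(by_app.values())
        # A reviewer who gives every application the same rating carries
        # no ranking information, so all of that reviewer's z-scores are 0.
        standardized[reviewer] = {
            app: 0.0 if sigma == 0 else (raw - mu) / sigma
            for app, raw in by_app.items()
        }
    return standardized


def average_across_reviewers(standardized):
    """Average the standardized ratings each application received."""
    by_app = {}
    for scores in standardized.values():
        for app, z in scores.items():
            by_app.setdefault(app, []).append(z)
    return {app: mean(zs) for app, zs in by_app.items()}


# Hypothetical ratings on the recommended 0-7 scale (larger is better).
# reviewer_1 clusters near the top of the scale; reviewer_2 spreads out.
ratings = {
    "reviewer_1": {"app_A": 7, "app_B": 6, "app_C": 5},
    "reviewer_2": {"app_A": 4, "app_B": 2, "app_C": 0},
}

result = average_across_reviewers(standardize_within_reviewer(ratings))
for app, score in sorted(result.items()):
    print(app, round(score, 2))
```

In this made-up example the two reviewers rank the applications identically but use different parts of the scale. After standardization within reviewer, each contributes equally to the averaged result, which is in keeping with defining characteristic (c) above.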

Recommendation 9: Scores should be reported on the scale used by reviewers in making the original ratings. Scores should be reported with an implied precision commensurate with the information contained in the scores. Two significant digits are recommended.

Recommendation 10: If a single score is required that represents overall merit, it should be computed from the three criterion scores using an algorithm that is common to all applications. The Committee favors the arithmetic average of the three scores; however, an appropriately constituted group should test and choose the algorithm to be used.
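
A minimal sketch of the calculation favored in Recommendation 10, combined with the two-significant-digit reporting suggested in Recommendation 9, might look like the following. The arithmetic average is the Committee's stated preference, but the final algorithm was left for an appropriately constituted group to test and choose; the function names and the example scores are hypothetical.

```python
from math import floor, log10


def round_to_significant(value, digits=2):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    magnitude = floor(log10(abs(value)))
    return round(value, digits - 1 - magnitude)


def overall_merit(significance, approach, feasibility):
    """Arithmetic average of the three criterion scores, reported with
    two significant digits (an illustration, not an adopted algorithm)."""
    average = (significance + approach + feasibility) / 3
    return round_to_significant(average, digits=2)


# Hypothetical criterion scores for one application on a 0-7 scale.
print(overall_merit(6.2, 5.1, 4.4))
```

Because the same function is applied to every application, the single score satisfies the requirement that the algorithm be common to all applications; a weighted average or another rule could be substituted in `overall_merit` without changing how the criterion scores themselves are collected.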

Any or all of these recommendations could conceivably be implemented as part of the peer review process. We are currently considering the pros and cons of each recommendation, and the positive and negative impacts that each could have on the peer review system and on other aspects of the awarding of research grants at NIH. You are invited to read the Report of the Committee on Rating of Grant Applications and to offer your comments. Decisions on implementation of any of these recommendations would need to be made by January of 1997 if they were to be in place for the review of grant applications to be funded in fiscal year 1998.

Downloading This Document
The full text of the report is available for download in Text and PDF formats:
  • TXT - 89 KB
  • PDF - 406 KB


