
The Steward's Schema: Responsible AI in Biomedical Research

A Practical Policy Guide for navigating the transition from Prompter to Steward.

Disclaimer

Note: This is a personal website by Steven Grambow, PhD. The AI policy landscape is constantly changing, and this site represents an ongoing attempt to consolidate current movements in the space. While AI tools were used to assist in the creation of this content, all information has been reviewed by a human for accuracy. Please use the provided links to search for and verify specific policies, as they may have changed since this publication.

Executive Summary

The regulatory landscape for AI in biomedicine has shifted from informal experimentation to structured accountability. The consensus across medical publishers (ICMJE, Elsevier, Springer Nature), editorial ethics bodies (COPE, WAME), and federal research agencies (NIH, NSF) is convergent: AI is a tool, not an author or a reviewer. Humans retain full liability for every output.

This guide addresses four activities central to the clinician-scientist workflow: writing manuscripts, reviewing manuscripts, reviewing grant applications, and writing grant applications.


1. Writing Manuscripts

Core Principle: Transparency and Full Liability

AI tools (large language models, image generators, coding assistants) cannot be listed as authors. They cannot meet authorship criteria because they cannot take responsibility for the accuracy, integrity, and originality of the work (ICMJE Section V.A). The Committee on Publication Ethics (COPE) and the World Association of Medical Editors (WAME) maintain aligned positions: AI systems lack the accountability required for authorship, and human authors bear full responsibility for AI-assisted content.

What You Must Do

Disclose substantive AI use. If you used generative AI for drafting text, data analysis, or image generation, you must describe which tool you used and how you used it. Insert a "Declaration of Generative AI and AI-Assisted Technologies" section before your References. ICMJE Section V.A states explicitly that nondisclosure of AI use "may require corrective action and may be construed as misconduct in some circumstances." Major publishers, including Elsevier, Springer Nature, Wiley, and SAGE, require comparable declarations for generative use while typically exempting narrow copy-editing assistance (basic grammar and spell-check) from formal disclosure.

Verify everything. You are personally liable for every citation, claim, and figure. If an AI fabricates a reference or introduces plagiarized text, that constitutes research misconduct (fabrication, falsification, or plagiarism) regardless of the tool's role. All major frameworks place this responsibility squarely on the human author.

Know the disclosure threshold. The trigger is generative use: drafting prose, producing images, generating analytical output, or substantively rewriting text. Traditional spell-check and basic grammar correction generally do not require disclosure. However, tools like Grammarly now include AI-powered features (sentence rewriting, tone adjustment, paragraph generation) that cross into generative territory. The disclosure obligation attaches to what the tool did, not what the tool is called. If a tool substantively rewrote your sentences or generated new text, disclose it.
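The rule that disclosure attaches to what the tool did, not what it is called, can be sketched as a small check. This is a minimal illustration, not any publisher's official test: the `ToolAction` categories and the `needs_disclosure` helper are hypothetical names, and the exempt set assumes that only spell-check and basic grammar correction fall below the threshold.

```python
from enum import Enum


class ToolAction(Enum):
    """What an AI-enabled tool actually did (hypothetical categories)."""
    SPELL_CHECK = "spell_check"
    BASIC_GRAMMAR = "basic_grammar"
    SENTENCE_REWRITE = "sentence_rewrite"
    TONE_ADJUSTMENT = "tone_adjustment"
    TEXT_GENERATION = "text_generation"
    IMAGE_GENERATION = "image_generation"
    ANALYTICAL_OUTPUT = "analytical_output"


# Assumption: only narrow copy-editing is exempt; anything else counts as generative use.
EXEMPT_ACTIONS = {ToolAction.SPELL_CHECK, ToolAction.BASIC_GRAMMAR}


def needs_disclosure(actions):
    """Return True if any recorded action goes beyond narrow copy-editing."""
    return any(action not in EXEMPT_ACTIONS for action in actions)


# The same product can land on either side, depending on the feature used.
print(needs_disclosure([ToolAction.SPELL_CHECK]))
print(needs_disclosure([ToolAction.BASIC_GRAMMAR, ToolAction.SENTENCE_REWRITE]))
```

Keying the check on actions rather than product names mirrors the Grammarly example above: one tool can be exempt for one feature and disclosable for another. When in doubt about a borderline action, the conservative choice is to disclose.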

Key policy sources: ICMJE Recommendations, Section V (Jan 2026) · COPE Position on AI · Elsevier AI Policy · Springer Nature Editorial Policies


2. Reviewing Manuscripts

Core Principle: Confidentiality and Human Judgment

ICMJE Section V.B states that reviewers must maintain manuscript confidentiality, which "may prohibit the uploading of the manuscript to software or other AI technologies where confidentiality cannot be assured" unless the journal explicitly permits such use. Publishers including Elsevier and SAGE go further, stating that AI technologies are not permitted to generate review reports and that reviewers must not share submitted manuscripts with generative AI tools.

The primary policy concern is confidentiality assurance. Public and commercial AI services, even those that do not train on user inputs by default, cannot contractually guarantee the same protections as a journal's peer review system. Data may be logged, retained, accessed by provider employees, or subject to legal process. A note on training specifically: many frontier model providers (OpenAI, Anthropic, Google) do not train on inputs submitted through their API or paid tiers by default, but some services do, and free-tier accounts may have different defaults. These distinctions, while real, do not resolve the confidentiality concern because the policies address whether confidentiality can be assured, not merely whether training occurs. Researchers should verify the data-handling terms of any AI tool they use, but for peer review purposes, the practical guidance remains clear.

Practical rule: Do not upload manuscripts under review into generative AI tools. If a journal has a specific, disclosed policy permitting limited AI assistance in review, follow that journal's terms exactly. Otherwise, treat all manuscript content as confidential and keep it out of AI systems.

A note on local models. Running a model locally on your own hardware (e.g., via Ollama or LM Studio) eliminates the third-party confidentiality risk, since no data leaves your machine. However, to date, few journal policies explicitly address locally hosted models. This guidance reflects a conservative interpretation of existing confidentiality and reviewer-responsibility expectations: until journals clarify this boundary, the cautious professional stance is to seek journal permission before using any generative AI, local or cloud, in the review process.

Key policy sources: ICMJE Section V.B (Jan 2026) · Elsevier Reviewer Policy · SAGE Editorial Policies


3. Reviewing Grants

Core Principle: Absolute Prohibition

Federal grant review policies impose stricter requirements than manuscript review. The rationale is twofold: protecting the confidentiality of unpublished research ideas and ensuring that scientific evaluation reflects human expert judgment.

NIH (NOT-OD-23-149, June 2023). Peer reviewers are prohibited from using generative AI tools to analyze or formulate critiques of grant applications. NIH is revising its Security, Confidentiality, and Nondisclosure Agreements so that all peer reviewers and advisory council members must certify that they understand the confidential nature of the review process and will not use AI tools in analyzing and critiquing applications.

NSF (Important Notice No. 154, Dec 2023). Reviewers are prohibited from uploading any proposal content, review information, or related records to non-approved generative AI tools. Sharing proposal information with generative AI via the open internet is treated as a breach of NSF's confidentiality pledge and applicable policies. NSF reviewers participate as Special Government Employees, and the confidentiality obligations that govern NSF staff apply to them.

Local models do not create an exception here. Unlike the manuscript review context, the federal prohibition is not limited to confidentiality concerns. NIH's language prohibits using AI tools to "analyze or formulate critiques," full stop. A locally hosted model is still a generative AI tool. The prohibition applies regardless of where the model runs.

Consequence: Violation may result in loss of reviewer privileges, breach-of-agreement findings, and potential federal enforcement action.

Key policy sources: NIH NOT-OD-23-149 (June 2023) · NSF Important Notice No. 154 (Dec 2023)


4. Writing Grant Applications

Core Principle: Originality and Authenticity

NIH: NOT-OD-25-132 (Effective September 25, 2025)

In "Supporting Fairness and Originality in NIH Research Applications," NIH states that it will not consider applications that are "substantially developed by AI," or that contain sections substantially developed by AI, to be the original ideas of the applicants. This policy has three operational components:

Originality threshold. "Substantially developed by AI" is the test. If NIH determines that an application or key sections were generated by AI rather than reflecting the investigator's own intellectual work, the application may be administratively withdrawn. Post-award detection may trigger referral to the Office of Research Integrity (ORI).

Submission cap. Each PD/PI or Multiple PI is limited to six new, renewal, resubmission, or revision applications per calendar year. T-series training grants and R13 conference grants are excluded from this cap. NIH noted that a small number of PIs had been submitting more than 40 applications in a single round, a pattern consistent with AI-driven volume that strains the review system.

Detection and enforcement. NIH states it will employ technology to detect AI-generated content, though the accuracy and reliability of current detection tools remain areas of active development and debate, and false positives are a documented concern. Consequences of a finding include disallowed costs, withheld awards, suspension, termination, and referral to ORI for plagiarism or fabrication. This is also why documentation matters: if you are falsely flagged and have no process trail, you have no defense.

What this means in practice. The distinction is between AI as an editorial assistant and AI as a ghostwriter of scientific ideas. A PI who develops the central hypothesis, designs the study, and writes the Specific Aims, and then uses AI only to refine prose clarity or check grammar, is operating within the policy. A PI who prompts an LLM to generate aims, hypotheses, or study designs from a topic area is not. The core intellectual contribution must demonstrably originate from the investigator.
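For planning purposes, the submission cap described above reduces to simple arithmetic: count the year's applications, skip the excluded types, and compare against six. The sketch below is a rough planning aid, not NIH's counting method. The `Application` record and helper names are hypothetical, the prefix test for T-series codes is a simplifying assumption, and a Multiple PI application is assumed to be entered once per named PD/PI.

```python
from dataclasses import dataclass

ANNUAL_CAP = 6  # applications per PD/PI per calendar year, as stated above


@dataclass(frozen=True)
class Application:
    """One planned or submitted application (hypothetical record layout)."""
    pi: str
    activity_code: str  # e.g. "R01", "T32", "R13"
    year: int


def is_excluded(activity_code):
    """Assumption: T-series training grants and R13 conference grants do not count."""
    code = activity_code.upper()
    return code.startswith("T") or code == "R13"


def counted_applications(applications, pi, year):
    """Count one PI's applications in one calendar year, skipping excluded types."""
    return sum(
        1
        for app in applications
        if app.pi == pi and app.year == year and not is_excluded(app.activity_code)
    )


def remaining_slots(applications, pi, year):
    """Number of further applications this PI could submit under the cap."""
    return max(0, ANNUAL_CAP - counted_applications(applications, pi, year))


planned = [
    Application("pi_a", "R01", 2026),
    Application("pi_a", "R21", 2026),
    Application("pi_a", "T32", 2026),  # excluded: training grant
    Application("pi_a", "R13", 2026),  # excluded: conference grant
]
print(remaining_slots(planned, "pi_a", 2026))  # 2 counted, so 4 remain
```

Always confirm which activity codes are excluded against the notice itself before relying on a tally like this.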

NSF: Disclosure-Based Approach

NSF takes a different approach, relying on transparency and its existing research misconduct framework rather than submission caps or explicit prohibitions on AI-assisted drafting:

Disclosure encouraged. Proposers are encouraged to describe in the Project Description "the extent to which, if any, generative AI technology was used and how it was used to develop their proposal" (NSF Important Notice No. 154).

Misconduct definition updated. The PAPPG 24-1, Supplement 1 (issued December 8, 2025) revises Chapter XII.C to specify that research misconduct (fabrication, falsification, or plagiarism) includes misconduct "committed by an individual directly or through the use or assistance of other persons, entities, or tools, including artificial intelligence (AI)-based tools."

Human accountability. Proposers are responsible for the accuracy and authenticity of their entire submission, including any content developed with AI assistance.

The key distinction between NIH and NSF: NIH sets a regulatory ceiling (volume limits, the "substantially developed" test, and administrative withdrawal), while NSF maintains a behavioral floor (transparency expectations and strict misconduct liability for AI-assisted errors). Both hold the human investigator fully accountable.

Key policy sources: NIH NOT-OD-25-132 (July 2025) · NSF Important Notice No. 154 (Dec 2023) · NSF PAPPG 24-1, Supplement 1 (Dec 2025)


Best Practice: Document Your Process

This is a practical recommendation, not a policy requirement.

If you can't document it, you can't defend it. NIH has stated it will use detection technology. ICMJE says nondisclosure may be treated as misconduct. Both create scenarios in which you may need to demonstrate that your intellectual contribution is genuine and that AI played only an auxiliary role.

Maintain a documentation trail for any project where AI tools were used:

  • Save chat logs and prompts. Export or screenshot your interactions with AI tools. Record what you asked, what the tool produced, and what you changed.
  • Use version control. Track drafts in Git, Word's track-changes, or your institution's document management system. Timestamped revisions show the evolution of your ideas.
  • Log tool usage. Keep a brief record of which tools were used, for which sections, and for what purpose. This becomes the basis for your disclosure statement.

This practice also protects you in the inverse case: if you did not use AI and are falsely flagged by detection software, a clean version history demonstrating your writing process is your strongest defense. Think of it as the AI-era equivalent of a lab notebook.
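One lightweight way to keep the tool-usage log described above is an append-only text file with one timestamped record per use. The following is a minimal sketch under stated assumptions: the file name, field names, and the `log_ai_use`, `read_log`, and `draft_disclosure` helpers are all hypothetical, and the generated sentences are only a starting point for a disclosure statement that you still need to review and edit.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_ai_use(log_path, tool, section, purpose):
    """Append one timestamped record of AI tool use to a JSON Lines file."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "section": section,
        "purpose": purpose,
    }
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


def read_log(log_path):
    """Load every record from the log, oldest first."""
    with Path(log_path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def draft_disclosure(records):
    """Turn log records into first-draft disclosure sentences."""
    return [
        f"{r['tool']} was used for {r['purpose']} in the {r['section']} section."
        for r in records
    ]


if __name__ == "__main__":
    # "ExampleLLM" and the file name are placeholders, not recommendations.
    log_ai_use("ai_use_log.jsonl", "ExampleLLM", "Discussion", "refining prose clarity")
    for sentence in draft_disclosure(read_log("ai_use_log.jsonl")):
        print(sentence)
```

Because the file is append-only plain text, it can be committed alongside your drafts, so the log and the version history corroborate each other.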


Quick-Reference Decision Guide

| Activity | Permitted? | Key Condition |
| --- | --- | --- |
| Drafting manuscript text with AI | Yes | Disclose tool and purpose; verify all content |
| Basic grammar/spell-check | Yes | Generally no disclosure required |
| AI-powered rewriting tools (e.g., Grammarly AI features) | Yes | Disclose; these are generative, not simple copy-editing |
| Generating figures or images with AI | Yes | Disclose; confirm no copyright violation |
| Reviewing a manuscript with AI | No* | Do not upload confidential material; follow journal policy |
| Reviewing a grant application with AI | No | Federal prohibition applies regardless of model location |
| Polishing grant prose with AI (your original ideas) | Yes, with caution | Ideas must be yours; AI refines language only; document the process |
| Generating hypotheses or aims via AI | No | Violates NIH originality standard; undermines NSF authenticity expectation |
| NSF proposal preparation with AI | Yes, with caution | Disclose in Project Description; you own all errors |

* ICMJE defers to journal-specific policy; most journals effectively prohibit this. Federal grant review is an absolute prohibition.


Reviewer Guidance: What Is and Is Not Prohibited

The prohibition on AI use in peer review applies to uploading confidential material and using AI to analyze or evaluate the work under review. It does not prohibit using AI tools to look up publicly available information that informs your own understanding.

Permitted examples:

  • Verifying that a cited reference exists and accurately represents the source (e.g., "Does this 2022 paper by Smith et al. in JAMA exist, and what were its findings?")
  • Learning about a published conceptual framework referenced in the work (e.g., "What is the RE-AIM framework and how is it typically applied?")
  • Checking statistical methods or terminology you are unfamiliar with

Not permitted:

  • Uploading any portion of the manuscript or application into an AI tool
  • Asking AI to evaluate the applicant's or author's use of a framework, method, or citation
  • Asking AI to draft, outline, or refine your review or critique
  • Describing proposal-specific details to an AI tool, even without uploading the document

The distinction is between using AI as a reference tool for publicly available knowledge and using AI as an analytical tool for confidential material. NSF explicitly permits the former: "NSF reviewers may share publicly available information with current generation generative AI tools" (Important Notice No. 154). NIH's prohibition is scoped to using AI to "analyze or formulate critiques" of applications (NOT-OD-23-149), which does not encompass independent background research on published work.

The practical test: If the query you're typing into an AI tool contains no information that could only have come from the document you're reviewing, and you're not asking for evaluative judgment about the work, you are likely within bounds.


Key Takeaways

Treat generative AI as a transparent auxiliary, not a surrogate for scientific judgment.

Can I use it to polish my writing?

Yes, with disclosure for generative use.

Can I use it to generate my core hypotheses?

No. Those must be your original intellectual work.

Can I use it to review a manuscript?

Not unless the journal explicitly permits it, and even then, only for limited non-substantive support.

Can I use it to review a grant?

Never. This is a federal prohibition, regardless of the model or platform.

What happens if I don't disclose?

Nondisclosure may constitute misconduct under ICMJE, NIH, and NSF frameworks.

How do I protect myself?

Document your process. Save prompts, track drafts, log tool usage.


Source Documents


The Steward's Schema was prepared for the benefit of faculty, staff, and trainees in biomedical research. AI tools, including Google NotebookLM, Google Deep Research, Perplexity Pro, and Claude Opus 4.6, were used to assist in the research, fact-checking, and preparation of this document. All content was reviewed, verified, and edited by the human author. This document reflects policies active as of February 2026 and should be verified against current sources before reliance.