testing-nccl-test-dymanic-branch#5322

Draft
aslam-quad wants to merge 2 commits into GoogleCloudPlatform:develop from aslam-quad:test-dynamic-branch

@aslam-quad
Contributor

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

github-actions bot added the "external" label (PR from external contributor) on Mar 9, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the flexibility of the NCCL daily test by allowing it to fetch the cluster-toolkit repository from a configurable owner and branch. This change moves away from hardcoded values, enabling easier testing against different forks or feature branches without manual intervention, thereby streamlining the testing workflow for various development scenarios.

Highlights

  • Dynamic Repository Cloning: The test-nccl.yml Ansible playbook has been updated to dynamically clone the cluster-toolkit repository. It now uses environment variables REPO_OWNER and BRANCH_NAME to determine the source repository and branch, defaulting to GoogleCloudPlatform and develop respectively.
  • Ansible Git Module Integration: The manual git clone command has been replaced with the ansible.builtin.git module, improving idempotency and leveraging Ansible's capabilities for repository management.
  • Improved Debugging: A new debug task was added to explicitly display the repository owner and branch name being used for the NCCL test, aiding in troubleshooting and verification.
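
A debug task like the one described in the last highlight might look roughly like the sketch below. The task name and message wording are assumptions made for illustration. Only the REPO_OWNER and BRANCH_NAME environment variables and their defaults come from the summary above.

```yaml
# Sketch only: the task name and message text are hypothetical.
# lookup('env', ...) reads the environment of the Ansible controller,
# and default(..., true) also replaces an empty string with the fallback.
- name: Show which cluster-toolkit owner and branch the NCCL test will use
  ansible.builtin.debug:
    msg: >-
      Cloning cluster-toolkit from owner
      '{{ lookup('env', 'REPO_OWNER') | default('GoogleCloudPlatform', true) }}'
      at branch
      '{{ lookup('env', 'BRANCH_NAME') | default('develop', true) }}'
```

Printing the resolved values before the clone makes it easier to confirm from the build log which source a given test run actually used.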

Changelog

  • tools/cloud-build/daily-tests/ansible_playbooks/test-validation/test-nccl.yml
    • Added a debug task to display the repository owner and branch name.
    • Replaced manual git clone with ansible.builtin.git module for cloning cluster-toolkit.
    • Configured ansible.builtin.git to use REPO_OWNER and BRANCH_NAME environment variables with default fallbacks.
    • Adjusted the cd command in the Run prerequisite NCCL scripts task to reflect the new cloning method.

Activity

  • No specific activity (comments, reviews, etc.) was found in the provided context for this pull request.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the NCCL test playbook to use the ansible.builtin.git module for cloning, improving idempotency and clarity, and introduces environment variables for dynamic repository owner and branch selection. While this enhances flexibility, it introduces a critical security risk of CI/CD poisoning, allowing arbitrary code execution on the CI runner from a malicious fork. Additionally, several shell tasks are vulnerable to command injection due to improper quoting of Jinja2 variables. It is strongly recommended to implement strict validation for the repository source and use the quote filter for all variables in shell commands. Please ensure the pull request description is fully completed according to contribution guidelines.
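
As an illustration of the quoting recommendation, the sketch below passes every templated value through Ansible's `quote` filter before it reaches the shell. The variables `nccl_scripts_dir` and `nccl_script` are hypothetical and do not appear in the playbook under review. The real tasks and script paths may differ.

```yaml
# Sketch only: nccl_scripts_dir and nccl_script are hypothetical variables.
# The quote filter shell-escapes each value, so characters such as
# ';' or '$(...)' in a variable are treated as data, not as commands.
- name: Run an NCCL helper script with shell-quoted arguments
  ansible.builtin.shell: |
    set -e
    cd {{ nccl_scripts_dir | quote }}
    bash {{ nccl_script | quote }}
  args:
    executable: /bin/bash
```

Quoting addresses the command injection finding only. It does not stop the playbook from running scripts that come from an untrusted clone.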

Comment on lines +23 to +27
ansible.builtin.git:
  repo: "https://github.com/{{ lookup('env','REPO_OWNER') | default('GoogleCloudPlatform', true) }}/cluster-toolkit.git"
  dest: "{{ ansible_user_dir }}/cluster-toolkit"
  version: "{{ lookup('env','BRANCH_NAME') | default('develop', true) }}"
  update: yes

security-critical critical

The playbook uses environment variables REPO_OWNER and BRANCH_NAME (retrieved via lookup('env', ...) on lines 19-20 and 24-26) to determine the repository and branch to clone. In a CI/CD environment that triggers on Pull Requests (especially from forks), these variables are often populated with the fork's owner and branch name. Since the playbook subsequently executes scripts from the cloned repository (e.g., import_pytorch_container.sh on line 33, build-nccl-tests.sh on line 34, and run-nccl-tests.sh on line 62), an attacker can achieve arbitrary code execution on the CI runner by submitting a PR from a malicious fork. This is a classic CI/CD poisoning vulnerability.
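
One possible mitigation, sketched here under stated assumptions, is to validate the clone source against an allowlist before anything is cloned or executed. The `allowed_repo_owners` list, the `repo_owner` and `branch_name` fact names, and the branch-name pattern are all hypothetical. This is not necessarily the fix the reviewers have in mind.

```yaml
# Sketch only: allowed_repo_owners, repo_owner, branch_name and the
# branch-name pattern are hypothetical and not part of the reviewed playbook.
- name: Resolve the clone source from the environment
  ansible.builtin.set_fact:
    repo_owner: "{{ lookup('env', 'REPO_OWNER') | default('GoogleCloudPlatform', true) }}"
    branch_name: "{{ lookup('env', 'BRANCH_NAME') | default('develop', true) }}"

- name: Refuse an untrusted owner or a malformed branch name
  vars:
    allowed_repo_owners:
      - GoogleCloudPlatform
  ansible.builtin.assert:
    that:
      - repo_owner in allowed_repo_owners
      - "branch_name is match('^[A-Za-z0-9][A-Za-z0-9._/-]*$')"
    fail_msg: "Refusing to clone cluster-toolkit from an unapproved owner or branch."
```

With a check like this placed ahead of the ansible.builtin.git task, a run whose REPO_OWNER is not on the allowlist fails before any script from the cloned repository is executed.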

Collaborator

This needs to be addressed.

aslam-quad force-pushed the test-dynamic-branch branch 2 times, most recently from 94a092d to d033e6a, on March 9, 2026 10:39
aslam-quad force-pushed the test-dynamic-branch branch 4 times, most recently from 71dafbe to 6e54def, on March 13, 2026 05:41
aslam-quad force-pushed the test-dynamic-branch branch from 6e54def to 8a91929 on March 13, 2026 06:02
Labels

external PR from external contributor

2 participants