Skip to content

Add coverm/contig to nf-core/modules#997

Open
vinisalazar wants to merge 22 commits intonf-core:devfrom
vinisalazar:dev
Open

Add coverm/contig to nf-core/modules#997
vinisalazar wants to merge 22 commits intonf-core:devfrom
vinisalazar:dev

Conversation

@vinisalazar
Copy link
Contributor

@vinisalazar vinisalazar commented Mar 12, 2026

Add CoverM to calculate read alignment metrics for contig/bin alignments (#587)

  • Run nf-core modules install coverm/contig
  • Create new module directory and modify modules.json

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/mag branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@vinisalazar
Copy link
Contributor Author

Converting this to draft until I add the module logic to the subworkflows, will ping when done.

@vinisalazar
Copy link
Contributor Author

Update to changes:

  • in bin/get_mag_depths.py, add option to open gzip and non-gzip files
  • contig depth calculation by default is now done with COVERM_CONTIG (METABAT2 is still an option)
  • on mag_depths_convert/main.nf, define function to remove gz extension
  • Add new params to nextflow.config with coverm options
  • Update nextflow_schema.json with relevant new params
  • On subworkflows/local/binning/main.nf:
    • Add checks to make sure CoverM mapper is bowtie2 (otherwise doesn't generate BAM files required for binning)
    • Enable both METABAT2 and COVERM as contig depth calculators
      • Same is done on subworkflows/local/binning_preparation for SR and LR
    • Rename output channel from metabat2depths to contig_depths

@vinisalazar vinisalazar marked this pull request as ready for review March 12, 2026 05:08
@vinisalazar
Copy link
Contributor Author

This one is good to go but will require merging nf-core/modules#10588 for the lint checks to pass.

@vinisalazar
Copy link
Contributor Author

PS all code generated by @claude was reviewed

@vinisalazar vinisalazar marked this pull request as draft March 12, 2026 11:50
@vinisalazar
Copy link
Contributor Author

Making this into draft again until the upstream update to the coverm module is merged.

@vinisalazar vinisalazar force-pushed the dev branch 2 times, most recently from c0d42d6 to 55f610f Compare March 12, 2026 12:56
vinisalazar and others added 14 commits March 17, 2026 10:50
Add CoverM to calculate read alignment metrics for contig/bin alignments (nf-core#587)

  - Run `nf-core modules install coverm/contig`
  - Create new module directory and modify modules.json
Introduces --coverm_mapper (default: 'bowtie2') to control which aligner
is used when estimating contig depths. CoverM-native mappers skip bowtie2
alignment and let CoverM handle reads→depths directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- COVERM_CONTIG_SHORTREAD/LONGREAD replace both METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS variants
- bowtie2 path: BAMs → CoverM (bam_input=true)
- coverm-native path: reads+assembly → CoverM (bam_input=false)
- Add runtime error when BAM-requiring binners are used with coverm-native
- Rename emit metabat2depths → contig_depths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add COVERM_CONTIG_SHORTREAD and COVERM_CONTIG_LONGREAD blocks with
--methods metabat to output MetaBAT2-compatible depth format.
Shortread also passes --mapper for coverm-native aligner selection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CoverM outputs uncompressed depth files; update get_mag_depths.py and
CONVERT_DEPTHS to accept both .gz and plain-text inputs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS references with COVERM_CONTIG
and update depth file extension from .txt.gz to .txt. Snapshots will need
regeneration after the first test run to capture updated version strings.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add --depth_calculator (coverm|metabat2, default: coverm) to switch
  between COVERM_CONTIG and METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS for
  contig depth estimation
- Add --coverm_methods with full CoverM methods enum (default: metabat)
  to configure the coverage calculation method passed to CoverM
- Restore METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS_SHORTREAD/LONGREAD config
  blocks in modules.config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Update coverm/contig to use eval/topic-based versioning and handle
  methods as List or string; remove versions.yml generation
- Remove ch_versions.mix for COVERM_CONTIG (topic versioning is automatic)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Sync coverm/contig main.nf and meta.yml to reference (eval/topic
  versioning, List-aware methods string handling)
- Remove ch_versions.mix for COVERM_CONTIG calls (topic versioning
  is automatic; no versions.yml emitted)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  - Add coverm to dependencies
  - Add nf-core#997 to 'Added' section
  - Reflect addition of COVERM_CONTIG as contig depth calculator
@vinisalazar vinisalazar marked this pull request as ready for review March 16, 2026 23:54
@vinisalazar
Copy link
Contributor Author

This is now ready for review, but changes are extensive so some advice on testing would be welcome.

Thanks

@jfy133
Copy link
Member

jfy133 commented Mar 17, 2026

Thanks @vinisalazar !

I need to focus on taxprofiler for the next few weeks... @dialvarezs do you think you can go through this?

Copy link
Contributor

@prototaxites prototaxites left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @vinisalazar, this looks really good! I have a few thoughts on the implementation that I think it's worth having a discussion about?

Comment on lines +35 to +36
// Validate: BAM-requiring binners cannot be used with CoverM native mappers
// because no BAM files are produced in that path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CoverM will output BAM files if --bam-file-cache-directory is set! We should force that option so we can get them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me

[meta, bams, bais]
// Validate: BAM-requiring binners cannot be used with CoverM native mappers
// because no BAM files are produced in that path
if (params.coverm_mapper != 'bowtie2') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would suggest we call this param coverage_mapper?

METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS_LONGREAD(ch_summarizedepth_input.longread)
ch_versions = ch_versions.mix(METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS_LONGREAD.out.versions)
// Long reads always come as BAMs (from minimap2)
if (params.depth_calculator == 'coverm') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, though others may disagree, we should just drop JGISUMMARIZEBAMCONTIGDEPTHS entirely and replace it with CoverM. The outputs in "metabat" mode should be value-for-value identical, and supporting both requires a lot of complicated control flow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this. If CoverM produces a equivalent output, then keeping it it's just unnecessary complexity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be happy to implement this. Would you need me to rebase this branch and edit the commits I've already made, or is it okay to just add commits on top?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commits on top, I'm personally not fussy about super clean git history (unless @dialvarezs @prototaxites have a stronger opinion)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, fine with me too. Can always do a squash commit if things are really bad!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note I'm the person who is often reluctant to deprecate existing/accepted modules, and often rather prefer keep adding new altenrative options unless there is a very very clear case of something being superior.

But like @dialvarezs says - if the output is the exact same then I'm fine with the older module being deprecated.

* Binning preparation for short reads
*/

include { BOWTIE2_ASSEMBLY_BUILD } from '../../../modules/local/bowtie2_assembly_build/main'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question, would it make sense to move CoverM into this workflow (and drop the long reads one) so that we have a unified place where all coverage estimation happens?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me too. l'll see how this could be done when working on the "drop the JGIBAMSUMMARIZE..." part.

@vinisalazar
Copy link
Contributor Author

Thanks @prototaxites @dialvarezs @jfy133 for your comments. I have limited time to work on this, but I will try to put in some effort in the next 1-2 weeks.

vinisalazar and others added 6 commits March 24, 2026 11:51
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CoverM in --methods metabat mode produces value-for-value identical
output, so JGISUMMARIZEBAMCONTIGDEPTHS and the depth_calculator
param are now removed. CoverM is always used for contig depth
estimation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…identity

shortread_percentidentity and longread_percentidentity previously set
--percentIdentity on JGISUMMARIZEBAMCONTIGDEPTHS; now they pass
--min-read-percent-identity to COVERM_CONTIG_SHORTREAD and
COVERM_CONTIG_LONGREAD respectively.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When using a CoverM-native mapper (non-bowtie2), pass
--bam-file-cache-directory so that CoverM caches the BAM files it
generates internally during alignment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove references to the dropped depth_calculator param and
jgi_summarize_bam_contig_depths. Rename coverm_mapper → coverage_mapper
throughout. Update percentidentity help text to reference CoverM's
--min-read-percent-identity flag.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vinisalazar
Copy link
Contributor Author

@prototaxites @dialvarezs implemented the changes requested here, starting from e5b6fa2. However, I'm having a hard time interpreting the tests. The ones that passed in the CI still show errors when inspecting the logs, and others just failed.

I ran the nf-test command and manually inspected the output files and they look right to me, here are the CoverM output file and the MaxBin2 summary file:
Screenshot 2026-03-24 at 14 10 58

Screenshot 2026-03-24 at 14 11 50

I will try to debug to see how to get these tests to pass, but open to suggestions if you have any.

@dialvarezs
Copy link
Member

Hey @vinisalazar, I'm going to give a look at this tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants