PublishOp fails with empty value channel from collect() on empty process output #6745

@pinin4fjords

Bug report

Related to #6458 but a different trigger. That issue covers errorStrategy 'ignore' creating empty value channels. This issue is about collect() on empty process output leading to the same error.

Expected behavior

When a process never runs because its input is empty, and its output is passed through collect(), the workflow should still complete without error when the output block sets enabled true.

Actual behavior

ERROR ~ Cannot access first() element from an empty List

Stack trace points to PublishOp.onComplete(PublishOp.groovy:199).

Minimal reproduction

process UPSTREAM {
    input: tuple val(meta), path(reads)
    output: tuple val(meta), path("out.txt"), emit: results
    script: "echo done > out.txt"
}

process DOWNSTREAM {
    input: tuple val(meta), path(data)
    output: tuple val(meta), path("final.txt"), emit: results
    script: "echo final > final.txt"
}

workflow {
    // Empty input - UPSTREAM never runs
    UPSTREAM(Channel.empty())

    // collect() on empty process output never emits
    ch_collected = UPSTREAM.out.results
        .collect { it[1] }
        .map { files -> [['id': 'all'], files] }

    // DOWNSTREAM never starts (no input from collect)
    DOWNSTREAM(ch_collected)

    publish:
    results = DOWNSTREAM.out.results
}

output {
    results {
        enabled true
        path "results"
    }
}

Run with: nextflow run repro.nf -output-dir results

Real-world example: nf-core/rnaseq

This bug was discovered while implementing workflow outputs for nf-core/rnaseq. The pattern appears in:

  1. QUANTIFY_PSEUDO_ALIGNMENT subworkflow uses collect() to aggregate results before passing to TXIMETA_TXIMPORT:

    TXIMETA_TXIMPORT (
        ch_pseudo_results.collect{ meta_results -> meta_results[1] }.map { results -> [ ['id': 'all_samples'], results ] },
        ...
    )
  2. Output block (on workflow-outputs branch) publishes TXIMPORT outputs:

    pseudo_counts_gene {
        enabled !params.skip_pseudo_alignment && params.pseudo_aligner
        path { params.pseudo_aligner }
    }
  3. Trigger condition: In BAM input mode (--skip_alignment true), SALMON_QUANT never runs because there are no FASTQs to process. The collect() on its empty output never emits, TXIMETA_TXIMPORT never starts, and publishing its output triggers the bug.

Observations

  • collect() on an empty channel does not emit (verified with .view() and .ifEmpty())
  • toList() on an empty channel emits []
  • Direct pass-through of an empty process output to the output block works fine
  • The error occurs in PublishOp.onComplete when it encounters a channel from a process that never started
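
The first two observations can be checked in isolation with a small script. This is a minimal sketch, assuming a file named check_empty.nf (a hypothetical name, not part of the original report) and no processes involved. It only contrasts the two operators on an empty channel and does not exercise the output block or PublishOp.

```groovy
workflow {
    // collect() on an empty channel emits nothing, so this view never prints
    Channel.empty()
        .collect()
        .view { v -> "collect: ${v}" }

    // toList() on an empty channel emits a single empty list
    Channel.empty()
        .toList()
        .view { v -> "toList: ${v}" }   // prints "toList: []"
}
```

Running it with nextflow run check_empty.nf should print only the toList line, which matches the behavior described in the observations above.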

Environment

  • Nextflow version: 25.10.2
  • Java version: OpenJDK 21
  • OS: macOS

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions