Conversation
…profiler), added citations for gprofiler, added gprofiler section to report
… added report text if no enrichment found, added gprofiler2 params to report and nextflow schema
… contrasts are investigated
…ing error in schema, added rds to GOST output
pinin4fjords
left a comment
I'm going to have another run-through of this in the morning. But for now: do you think it's sensible to have BOTH GSEA and gprofiler run? We could have a variety of gene set methods available, and I'm inclined to have one active at a time or the report will get very busy and complex.
Good point, but I would prefer not to outright prevent multiple methods from running simultaneously (somebody might want to compare the results, for example). I feel like if all of these methods are disabled by default, that should be fine, no? If a user then decides to run multiple ones in parallel, that's sort of on them :'D
pinin4fjords
left a comment
Some more comments.
Mostly I think gprofiler should be better integrated into existing structures for working with gene sets (gene set file channels, report sections etc).
Note: This can only be merged after nf-core/modules#4538
…sea docu; replaced gsea with gprofiler in test config (gsea is still tested in test_affy); moved round_digits and gene_sets to new mods_ param category
pinin4fjords
left a comment
Really close, just a few last comments and I tidied up the bash filtering script
Co-authored-by: Jonathan Manning <jonathan.manning@seqera.io>
…annel for gprofiler
pinin4fjords
left a comment
I added a commit - hope you don't mind, just check you agree.
At a minimum we should document the new option in usage.
Alternatively, and just a suggestion, we could reorder the priorities in the gprofiler2 module so that the new mode option isn't necessary (the gprofiler2 module would simply ignore the gene sets in the presence of one of the other options; we'd just have to document that).
What do you think?
…ndance into add_gpro, update gprofiler2
pinin4fjords
left a comment
Last batch of corrections. With these addressed we're ready to go!
def run_gene_set_analysis = params.gsea_run || params.gprofiler2_run
if (run_gene_set_analysis) {
There was a problem hiding this comment.
If you really don't think you can handle running the gprofiler module multiple times for different gene set files, I'd suggest modifying the validation logic like this:
if (run_gene_set_analysis) {
    if (params.gene_sets_files) {
        gene_sets_files = params.gene_sets_files.split(",")
        ch_gene_sets = Channel.of(gene_sets_files).map { file(it, checkIfExists: true) }
        if (params.gprofiler2_run && (!params.gprofiler2_token && !params.gprofiler2_organism) && gene_sets_files.size() > 1) {
            error("gprofiler2 can only work with a single gene set file")
        }
    } else if (params.gsea_run) {
        error("GSEA activated but gene set file not specified!")
    } else if (params.gprofiler2_run) {
        if (!params.gprofiler2_token && !params.gprofiler2_organism) {
            error("To run gprofiler2, please provide a run token, GMT file or organism!")
        }
    } else {
        ch_gene_sets = [] // For methods that can run without gene sets
    }
}
(sorry for not making this a proper suggestion; GitHub wouldn't let me because of the deleted lines)
There was a problem hiding this comment.
Ah yes, that is simpler of course.
No, I will certainly enable multiple GMTs in the future when I get around to it, just not right now (I'll mention that in the error).
Co-authored-by: Jonathan Manning <jonathan.manning@seqera.io>
pinin4fjords
left a comment
Good to go. Thanks for working with me on this; I think things are much tighter now.
PR checklist
…nf-core lint).
…nextflow run . -profile test,docker --outdir <OUTDIR>).
docs/usage.md is updated.
docs/output.md is updated.
CHANGELOG.md is updated.
README.md is updated (including new tool citations and authors/contributors).