🧪 Skills

Volcano Plot Script

Generate R/Python code for volcano plots from DEG (Differentially Expressed Genes) analysis results. Triggered when user needs visualization of gene expressi...

v0.1.0
❤️ 0
⬇️ 16
👁 1
Share

Description


name: volcano-plot-script description: Generate R/Python code for volcano plots from DEG (Differentially Expressed Genes) analysis results. Triggered when user needs visualization of gene expression data, p-value vs fold-change scatter plots, publication-ready figures for bioinformatics analysis. version: 1.0.0 category: Bioinfo tags: [volcano-plot, bioinformatics, deg-analysis, r, python, visualization] author: AIPOCH license: MIT status: Draft risk_level: Medium skill_type: Tool/Script owner: AIPOCH reviewer: last_updated: 2026-02-06

Volcano Plot Script Generator

A skill for generating publication-ready volcano plots from differential gene expression analysis results.

Overview

Volcano plots visualize the relationship between statistical significance (p-values) and magnitude of change (fold changes) in gene expression data. This skill generates customizable R or Python scripts for creating high-quality figures suitable for publications.

Use Cases

  • Visualize RNA-seq DEG analysis results
  • Identify significantly upregulated and downregulated genes
  • Highlight genes of interest (markers, pathways)
  • Generate publication-quality figures for manuscripts
  • Compare multiple experimental conditions

Input Requirements

Required input data format:

  • Gene identifier (gene symbol or ENSEMBL ID)
  • Log2 fold change values
  • Adjusted or raw p-values
  • Optional: gene annotations, pathways

Output

  • Publication-ready volcano plot (PNG/PDF/SVG)
  • Customizable R or Python script
  • Optional: labeled significant gene lists

Usage

# Example: Run the volcano plot generator
python scripts/main.py --input deg_results.csv --output volcano_plot.png

Parameters

Parameter Description Default
--input Path to DEG results CSV/TSV required
--output Output plot file path volcano_plot.png
--log2fc-col Column name for log2 fold change log2FoldChange
--pvalue-col Column name for p-value padj
--gene-col Column name for gene IDs gene
--log2fc-thresh Log2 FC threshold for significance 1.0
--pvalue-thresh P-value threshold 0.05
--label-genes File with genes to label None
--top-n Label top N significant genes 10
--color-up Color for upregulated genes #E74C3C
--color-down Color for downregulated genes #3498DB
--color-ns Color for non-significant genes #95A5A6

Technical Difficulty

Medium - Requires understanding of:

  • DEG analysis concepts (fold change, p-values, FDR)
  • Data visualization principles
  • Matplotlib/ggplot2 plotting libraries

Dependencies

Python

  • pandas
  • matplotlib
  • seaborn
  • numpy

R

  • ggplot2
  • dplyr
  • ggrepel (for label positioning)

References

Author

Auto-generated skill for bioinformatics visualization.

Risk Assessment

Risk Indicator Assessment Level
Code Execution Python/R scripts executed locally Medium
Network Access No external API calls Low
File System Access Read input files, write output plots Medium
Instruction Tampering Standard prompt guidelines Low
Data Exposure Output files saved to workspace Low

Security Checklist

  • No hardcoded credentials or API keys
  • Input file paths validated (no ../ traversal)
  • Output directory restricted to workspace
  • Script execution in sandboxed environment
  • Error messages sanitized (no stack traces exposed)
  • Dependencies audited (pandas, matplotlib, seaborn, numpy)

Prerequisites

# Python dependencies
pip install -r requirements.txt

# R dependencies (if using R)
install.packages(c("ggplot2", "dplyr", "ggrepel"))

Evaluation Criteria

Success Metrics

  • Successfully generates executable Python/R script
  • Output plot is publication-ready quality
  • Correctly identifies significant genes based on thresholds
  • Handles missing or malformed data gracefully
  • Color scheme is accessible (colorblind-friendly)

Test Cases

  1. Basic DEG Visualization: Input standard DESeq2 results → Valid volcano plot
  2. Custom Thresholds: Adjust log2FC and p-value thresholds → Correct gene classification
  3. Gene Labeling: Specify genes to label → Labels appear correctly
  4. Large Dataset: Input 20,000+ genes → Performance remains acceptable
  5. Malformed Data: Input with missing values → Graceful error handling

Lifecycle Status

  • Current Stage: Draft
  • Next Review Date: 2026-03-06
  • Known Issues: None
  • Planned Improvements:
    • Add interactive plot option (Plotly)
    • Support for multiple comparison groups
    • Integration with pathway enrichment tools

Reviews (0)

Sign in to write a review.

No reviews yet. Be the first to review!

Comments (0)

Sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Compatible Platforms

Pricing

Free

Related Configs