Minimax Xlsx
MiniMax spreadsheet production system. Engage for any task that involves tabular data, numeric analysis, or spreadsheet generation. Supports XLSX/XLSM/CSV th...
Description
name: minimax-xlsx description: "MiniMax spreadsheet production system. Engage for any task that involves tabular data, numeric analysis, or spreadsheet generation. Supports XLSX/XLSM/CSV through Python 3 (openpyxl + pandas) for workbook construction, formula recalculation via recalc.py (LibreOffice headless), and the MiniMaxXlsx CLI (C#/.NET) for structural validation, formula auditing, and pivot table synthesis."
<toolkit_inventory>
Workbook construction — Python 3 via the ipython tool: openpyxl (creation, styling, formulas) + pandas (data wrangling).
Formula recalculation — recalc.py via the shell tool: invokes LibreOffice in headless mode to compute all formula values, then scans for error tokens and returns a JSON report. openpyxl writes formula text (e.g., =SUM(A1:A10)) but does NOT compute results — this script fills that gap.
python ./scripts/recalc.py output.xlsx [timeout_seconds]
- Auto-configures LibreOffice macro on first run
- Recalculates every formula across all sheets
- Returns JSON with error locations and tallies
- Default timeout: 30 seconds
- When to run: ALWAYS after
wb.save()and BEFORErecalc, whenever the file has formulas - When to skip: Only if the file has zero formulas (pure static data)
Clean output:
{"status": "success", "total_errors": 0, "total_formulas": 42, "error_summary": {}}
Error output:
{"status": "errors_found", "total_errors": 2, "total_formulas": 42, "error_summary": {"#REF!": {"count": 2, "locations": ["Sheet1!B5", "Sheet1!C10"]}}}
CLI diagnostics — MiniMaxXlsx binary via the shell tool, located at ./scripts/MiniMaxXlsx:
| Command | What it does | Typical invocation |
|---|---|---|
recalc |
Detects formula error tokens (#VALUE!, #REF!, etc.), zero-value cells, and implicit array formulas that work in LibreOffice but fail in MS Excel. Run after recalc.py. | ./scripts/MiniMaxXlsx recalc output.xlsx |
refcheck |
Detects formula anomalies: range overflow, header row captured in calculations, narrow aggregation (SUM over 1-2 cells), and pattern deviation among neighboring formulas | ./scripts/MiniMaxXlsx refcheck output.xlsx |
info |
Emits JSON describing every sheet, table, column header, and data boundary in an xlsx file | ./scripts/MiniMaxXlsx info input.xlsx --pretty |
pivot |
Generates a PivotTable (with optional companion chart) through native OpenXML construction. Read ./pivot.md before use. Required flags: --source, --location, --values. Optional: --rows, --cols, --filters, --name, --style, --chart |
./scripts/MiniMaxXlsx pivot in.xlsx out.xlsx --source "Sheet!A1:F100" --rows "Col" --values "Val:sum" --location "Dest!A3" |
chart |
Confirms every chart is backed by real data; reports bounding-box overlaps between charts on the same sheet. Exit 0 = OK; exit 1 = broken/empty charts that must be fixed. Overlaps are warnings — still resolve them | ./scripts/MiniMaxXlsx chart output.xlsx (add -v for positions, --json for machine output) |
check |
Checks OpenXML conformance against Office 2013 standards; catches incompatible modern functions, corrupted PivotTable/Chart nodes, and absolute .rels paths. Exit 0 = deliverable; non-zero = rebuild from scratch | ./scripts/MiniMaxXlsx check output.xlsx |
Implicit array formula handling (detected by recalc):
- Patterns like
MATCH(TRUE(), range>0, 0)require CSE (Ctrl+Shift+Enter) in MS Excel - LibreOffice handles these transparently, so they pass recalculation but fail in Excel
- When detected, restructure:
- Wrong:
=MATCH(TRUE(), A1:A10>0, 0)→ shows #N/A in Excel - Right:
=SUMPRODUCT((A1:A10>0)*ROW(A1:A10))-ROW(A1)+1→ works everywhere - Right: Or use a helper column with explicit TRUE/FALSE values
- Wrong:
Supplementary guides (loaded on demand — not preloaded):
./pivot.md— mandatory before any PivotTable work./charts.md— mandatory before creating chart objects./styling.md— mandatory before writing openpyxl styling code
</toolkit_inventory>
Every spreadsheet task moves through five phases in strict order. Do not skip or reorder phases.
<phase_intake>
Phase 1 — Understand the Task
Before writing any code:
- Restate the problem, surrounding context, and desired outcome in your own words
- Identify all data sources — plan acquisition strategy, log each attempt, fall back to alternatives when a primary source is unavailable
- For data that requires exploration: clean first, then profile distributions, correlations, missing values, and outliers through descriptive statistics
- Derive evidence-backed findings from the processed data; apply methodologies, document significant effects, review assumptions, handle outliers, confirm robustness, ensure reproducibility
- Audit all calculations systematically; validate using alternative data, methods, or segments; assess domain plausibility against external benchmarks; clarify gaps, validation procedures, and significance
- Numeric data must be stored in numeric format — never as text strings
- Financial or monetary datasets require currency formatting with the appropriate symbol
External data provenance — if the deliverable incorporates data fetched via datasource, web_search, API calls, or any retrieval tool:
- Append two traceability columns next to the data:
Provider|Reference Link - Embed URLs as plain strings — HYPERLINK() causes formula-evaluation overhead and occasional corruption
- Sample:
| Data Content | Provider | Reference Link |
|---|---|---|
| Apple Revenue | Yahoo Finance | https://finance.yahoo.com/... |
| China GDP | World Bank API | world_bank_open_data |
- When row-level attribution is impractical, add a footnote section at the bottom of the relevant sheet (separated by a blank row and a "References" label), or create a standalone "References" worksheet
- Delivering a workbook that contains retrieved data without provenance metadata is forbidden
</phase_intake>
<phase_design>
Phase 2 — Design the Workbook
Create a sheet-level blueprint before writing any code. For each sheet, document:
- Cell layout (headers, data region, summary rows, computed columns)
- Every formula and which cells it references
- Cross-sheet dependencies and lookup relationships
Dynamic computation rule (non-negotiable):
Any value derivable from a formula must be expressed as a formula. Static values are only acceptable for external-fetch data, true constants, or circular-dependency avoidance.
# Live formulas — correct
ws['D3'] = '=B3*C3'
ws['E3'] = '=D3/SUM($D$3:$D$50)'
ws['F3'] = '=AVERAGE(B3:B50)'
# Frozen snapshots — wrong
result = price * qty
ws['D3'] = result # loses traceability
Cross-table lookups — step by step:
When two tables share a common key (signals: "based on", "from another table", "match against", or columns like ProductID / EmployeeID appear in both):
- Identify the shared key column in both the source and the target table
- Confirm the key occupies the first column of the lookup range — if not, use
INDEX()+MATCH()instead - Build the formula with absolute anchoring and an error wrapper:
ws['D3'] = '=IFERROR(VLOOKUP(B3,$E$2:$H$120,2,FALSE),"")' - For cross-sheet references, prefix the range with the sheet name:
Summary!$A$2:$D$80 - Multi-file scenarios: consolidate all sources into a single workbook before writing any lookup formulas — substituting pandas
merge()for VLOOKUP is not allowed
Common pitfalls: #N/A usually means the key does not exist in the target range; #REF! means the column index exceeds the width of the lookup range.
Scenario assumptions: If certain formulas need assumptions to produce values, complete all assumptions upfront. Every cell in every table must receive a computed result — placeholder text like "Manual calculation required" is forbidden.
</phase_design>
<phase_fabrication>
Phase 3 — Build, Audit, Repeat
Construct the workbook one sheet at a time. Audit immediately after each sheet — never defer checks to the end.
FOR EACH sheet:
1. BUILD — populate cells with data, formulas, and visual formatting
2. SAVE — wb.save('output.xlsx')
3. RECALC — python ./scripts/recalc.py output.xlsx (if sheet has formulas)
4. AUDIT — ./scripts/MiniMaxXlsx recalc output.xlsx
./scripts/MiniMaxXlsx refcheck output.xlsx
(if the sheet has charts) ./scripts/MiniMaxXlsx chart output.xlsx -v
5. FIX — resolve every finding; loop back to step 1 until zero issues
6. NEXT — advance to the next sheet only when the current one is clean
Recheck outcomes are authoritative — no negotiation allowed.
The recalc subcommand identifies formula errors (#VALUE!, #DIV/0!, #REF!, #NAME?, #N/A, etc.) and zero-result cells. Follow these rules without exception:
- Zero tolerance: If
recalcflags ANY issue, resolve it before delivery. Period. - Do NOT assume issues will self-correct:
- Wrong: "These errors will disappear when the user opens the file in Excel"
- Wrong: "Excel will recalculate and fix these automatically"
- Right: Fix ALL flagged issues until error_count = 0
- Every finding is an action item:
error_count: 5means 5 problems to solvezero_value_count: 3means 3 suspicious cells to examine- Only
error_count: 0allows advancing to the next step
- Common rationalizations to avoid:
- Wrong: "The #REF! happens because openpyxl doesn't evaluate formulas" — fix it!
- Wrong: "The #VALUE! will resolve when opened in Excel" — fix it!
- Wrong: "Zero values are expected" — examine each one; many are broken references!
- Delivery gate: Files with ANY recalc findings cannot be shipped.
Workbook scaffold:
from openpyxl import Workbook
from openpyxl.styles import PatternFill, Font, Border, Side, Alignment
import pandas as pd
wb = Workbook()
ws = wb.active
ws.title = "Data"
ws.sheet_view.showGridLines = False # mandatory on every sheet
ws['B2'] = "Title"
ws['B2'].font = Font(size=16, bold=True)
ws.row_dimensions[2].height = 30 # prevent title clipping
wb.save('output.xlsx')
Visual design — before writing any styling code, read ./styling.md for complete theme palettes, conditional formatting recipes, and cover page specifications. Key rules:
- Gridlines off on every sheet; content starts at B2, not A1
- Four themes are available: grayscale (default), financial (monetary/fiscal work), verdant (ecology, education, humanities), dusk (technology, creative, scientific). Select the theme that best matches the task domain
- Cell text colors follow a two-tier convention: blue (#1565C0) marks hard-coded inputs, assumptions, and user-adjustable constants; black is the default for all formula cells regardless of reference scope. Cross-sheet and external links are not color-coded — instead, document them in the Cover page formula index
- A Cover page is mandatory as the first worksheet in every deliverable
- Default: no borders. Use thin borders within models only when they clarify structure.
Merged cells: Use ws.merge_cells() for titles, multi-column headers, or grouped labels. Apply formatting to the top-left cell only. Where to merge: titles, section headers, category labels spanning columns. Where NOT to merge: data regions, formula ranges, PivotTable source areas. Always set alignment on merged cells.
Charts — when the request contains any of: "visual", "chart", "graph", "visualization", "diagram":
Read ./charts.md in full before creating any chart object. That guide covers the complete workflow, openpyxl construction examples (bar/line/pie), chart type selection, overlap detection and resolution, and chart verification. Do not attempt chart creation without it.
PivotTables — activate when you detect any of these signals:
- Explicit: "pivot table", "data pivot", "数据透视表"
- Implicit: roll up, grouped summary, category totals, segment analysis, distribution view, frequency split, total per category
- The dataset exceeds 50 rows with natural grouping dimensions
- Multi-dimensional cross-tabulation is needed
When a PivotTable is warranted:
- Read
./pivot.mdcover-to-cover before doing anything - Follow the execution sequence documented there
- Use the
pivotCLI command exclusively — hand-coding pivot structures in openpyxl is forbidden - The pivot output is read-only from this point forward — any subsequent openpyxl
load_workbook()call will silently break internal XML references, producing a file Excel refuses to open
Execution order is strict: Complete all openpyxl-authored sheets (Cover, Summary, data tabs) first, then run pivot as the final write step. After pivot emits the file, do not modify that file again.
</phase_fabrication>
<phase_verification>
Phase 4 — Certify the File
After every sheet has passed its individual audit, run the structural gate:
./scripts/MiniMaxXlsx check output.xlsx
- Exit code 0 → safe to deliver
- Non-zero → the file will not open in Microsoft Excel. Do NOT attempt incremental patches — regenerate the workbook from corrected code.
</phase_verification>
<phase_release>
Phase 5 — Delivery Checklist
Before handing the file to the user, confirm every item:
- At least one .xlsx file in the delivery
- Every sheet with headers also contains data rows — no empty tables
- No formula cell evaluates to null (if any do, verify the referenced cells hold values)
- Row and column dimensions are proportional — no extremely narrow columns paired with tall rows
- All computations use real data unless the user explicitly requested synthetic data
- Measurement units appear in column headers, not inline with cell values
- Theme matches the task domain: financial for fiscal work, verdant for ecology/education/humanities, dusk for technology/creative/scientific, grayscale for everything else
- External data includes provenance metadata (Provider + Reference Link) in the workbook
- Charts are real embedded objects, not "chart data" sheets with manual instructions
- PivotTables were built via the
pivotCLI, not hand-coded in openpyxl - Cross-table lookups use VLOOKUP/INDEX-MATCH formulas, not pandas
merge() -
checkreturned exit code 0 - Chart overlaps have been resolved (if charts exist) — no overlapping bounding boxes
</phase_release>
Hard Constraints
Zero-tolerance error tokens — none of these may exist in the delivered file:
#VALUE!, #DIV/0!, #REF!, #NAME?, #NULL!, #NUM!, #N/A
Additional banned outcomes:
- Off-by-one cell references (wrong row, wrong column, or both)
- Text starting with
=misinterpreted as a formula - Hardcoded numbers where a formula should exist
- Filler strings — "TODO", "Not computed", "Needs manual input", "Awaiting data" or any similar stub text in a delivered cell
- Column headers missing units; mixed units within a calculation chain
- Monetary figures without currency symbols (¥/$)
- Any cell computing to 0 must be investigated — often a broken reference
Off-by-one prevention: Before each save, trace every formula's references back to the intended cells. Then run refcheck. Common errors: referencing header rows, wrong row/column offset. If a result is 0 or unexpected, verify references first.
Monetary values: Store at full precision (15000000, not 1.5M). Format for display via "¥#,##0". Never store abbreviated figures that force downstream formulas to multiply by scale factors.
Compatibility blocklist — the check command rejects these automatically:
The following functions require Excel 365/2021+ or are Google Sheets exclusives. Files that use them will fail to open in Excel 2019/2016. Grouped by migration effort:
Drop-in replacements available (swap the function, keep the same cell structure):
| Blocked | Substitute |
|---|---|
XLOOKUP() |
INDEX() + MATCH() |
XMATCH() |
MATCH() |
SORT(), SORTBY() |
Sort via Data ribbon or VBA |
SEQUENCE() |
ROW() arithmetic or manual fill |
RANDARRAY() |
RAND() with fill-down |
LET() |
Break into helper cells |
LAMBDA() |
Named ranges or VBA |
Structural redesign required (no drop-in replacement — rethink the approach):
| Blocked | Migration strategy |
|---|---|
FILTER() |
AutoFilter, or SUMIF/COUNTIF criteria ranges |
UNIQUE() |
Remove Duplicates, or COUNTIF-based dedup helper column |
TEXTSPLIT() |
MID() + FIND() chain |
VSTACK(), HSTACK() |
Manual range layout or helper columns |
TAKE(), DROP() |
INDEX() + ROW() offset slicing |
ARRAYFORMULA() (Google only) |
CSE arrays via Ctrl+Shift+Enter |
QUERY() (Google only) |
PivotTables or SUMIF/COUNTIF |
IMPORTRANGE() (Google only) |
Copy data into the workbook manually |
Banned workflow patterns:
- Building all sheets first, then running checks once at the end
- Ignoring
recalc/refcheckfindings and moving to the next sheet - Delivering any file that failed
check - Creating "chart data" sheets with manual-insert instructions instead of real embedded charts
- Delivering files with overlapping charts without resolving the overlaps
Reviews (0)
No reviews yet. Be the first to review!
Comments (0)
No comments yet. Be the first to share your thoughts!