What Building a CSV Module Taught Us About Our Own Language


February 12, 2026 · Luma Core Team

We recently set out to build a CSV module for Luma — a pure .luma file that can parse, query, and convert comma-separated data. The kind of practical, everyday library that any language should be able to support. What we didn’t expect was how quickly it would expose gaps in the compiler itself.

This is a story about building something real and letting it tell you what’s missing.

The Plan

The idea was straightforward: represent a CSV table as [[str]] — a list of rows, where each row is a list of strings. The first row is the header. Simple functions for reading, filtering, converting. No special types, no framework — just lists and strings.

We wrote the code, hit compile, and the compiler said no. Six times.

What We Found

Logical OR didn’t work. We needed || to check whether a CSV field needs quoting — does it contain a comma, or a quote, or a newline? The operator existed in the lexer and parser, but it had a precedence bug that caused it to be rejected every time. A one-line fix: its precedence was set to -1 when it should have been 0.
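In Go terms, the check we needed is just a chain of ORs. A minimal sketch of the quoting test the module performs (the function name is ours, not part of the module's API):

```go
package main

import (
	"fmt"
	"strings"
)

// needsQuoting sketches the check that required ||: a CSV field must be
// quoted if it contains a comma, a double quote, or a newline.
func needsQuoting(field string) bool {
	return strings.Contains(field, ",") ||
		strings.Contains(field, "\"") ||
		strings.Contains(field, "\n")
}

func main() {
	fmt.Println(needsQuoting("hello, world")) // true
	fmt.Println(needsQuoting("plain"))        // false
}
```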

Carriage returns were invisible. CSV files from Windows use \r\n line endings. When we tried to normalize them with "\r", the compiler treated \r as a literal letter r — it simply wasn’t in the escape table. Our replacement function was stripping every r from the data. Adding one case to the lexer’s escape switch fixed it.
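The fix amounts to one extra case in the escape switch. A sketch of the idea in Go (the function and its shape are ours, not the actual lexer):

```go
package main

import (
	"fmt"
	"strings"
)

// unescape maps the character after a backslash to the byte it denotes.
// Without the 'r' case, "\r" in source lexes as a plain letter r.
func unescape(c byte) (byte, bool) {
	switch c {
	case 'n':
		return '\n', true
	case 't':
		return '\t', true
	case 'r': // the missing case
		return '\r', true
	case '\\':
		return '\\', true
	case '"':
		return '"', true
	}
	return c, false
}

func main() {
	// What the bug did to our data: replacing "r" instead of "\r"
	// strips every letter r from the input.
	fmt.Println(strings.ReplaceAll("error,report", "r", "")) // eo,epot
}
```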

Negative numbers panicked. We tried result: int = -1 as a “not found” sentinel. The parser crashed — it had no idea what to do with a minus sign at the start of an expression. We added unary minus support to the parser, and taught the code generator to handle it properly.
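The shape of the fix, sketched as a toy prefix parser in Go (not Luma's actual parser): when the parser sees a leading minus, it parses the operand and wraps it in a negation node.

```go
package main

import (
	"fmt"
	"strconv"
)

// A toy expression tree: just enough to show unary minus.
type Expr interface{ Eval() int }

type Lit struct{ N int }
type Neg struct{ E Expr }

func (l Lit) Eval() int { return l.N }
func (n Neg) Eval() int { return -n.E.Eval() }

// parsePrefix handles an optional leading minus before a number literal —
// exactly the case that crashed the original parser.
func parsePrefix(tokens []string) Expr {
	if tokens[0] == "-" {
		return Neg{parsePrefix(tokens[1:])}
	}
	n, _ := strconv.Atoi(tokens[0])
	return Lit{n}
}

func main() {
	fmt.Println(parsePrefix([]string{"-", "1"}).Eval()) // -1
}
```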

Nested lists couldn’t grow. The whole design was built on [[str]] — but calling .add() on a list of lists failed. The compiler generates typed helper functions for list operations (_luma_list_add_str, _luma_list_add_int, etc.), but it had no helpers for list-of-list types. We added them.

You couldn’t name a function len. When a module exports a function called len, the compiled Go code shadows Go’s built-in len — which breaks every other use of the built-in in that package. We added name mangling so user functions with reserved names get prefixed automatically, while built-in methods like .len() on lists and strings continue to work.
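The mangling itself is simple. A sketch in Go — the reserved-name table and the `_user_` prefix are our assumptions, not the compiler's actual choices:

```go
package main

import "fmt"

// Names that would collide with Go predeclared identifiers if emitted
// as-is. (An illustrative subset, not the compiler's actual table.)
var goReserved = map[string]bool{
	"len": true, "cap": true, "new": true, "make": true,
	"append": true, "copy": true, "print": true,
}

// mangle prefixes user function names that would shadow a Go built-in,
// and leaves everything else untouched.
func mangle(name string) string {
	if goReserved[name] {
		return "_user_" + name
	}
	return name
}

func main() {
	fmt.Println(mangle("len"))   // _user_len
	fmt.Println(mangle("parse")) // parse
}
```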

Slicing was all-or-nothing. We wanted table[1..] to skip the header row. The parser required both a start and end index — open-ended slices like list[1..] or list[..3] simply didn’t parse. We updated the parser to make both bounds optional, and added the code generation to produce the right Go slice expressions.
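The codegen side is straightforward because Go slice expressions already support omitted bounds. A sketch (function names are ours) of what `table[1..]` and `list[..3]` translate to:

```go
package main

import "fmt"

// skipHeader shows what Luma's table[1..] compiles down to: a Go slice
// expression with the end bound omitted, defaulting to len(table).
func skipHeader(table [][]string) [][]string {
	return table[1:]
}

// takePrefix shows list[..3]: the start bound omitted, defaulting to 0.
func takePrefix(list []string, end int) []string {
	return list[:end]
}

func main() {
	table := [][]string{{"name", "role"}, {"ada", "engineer"}, {"grace", "admiral"}}
	fmt.Println(len(skipHeader(table)))                      // 2
	fmt.Println(takePrefix([]string{"a", "b", "c", "d"}, 3)) // [a b c]
}
```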

What We Did

We fixed all six. Then we rewrote the CSV module using the design we originally wanted.

The result is clean and natural:

csv = import "csv"

table: [[str]] = csv.read("data.csv")
names: [str] = csv.col(table, "name")
engineers: [[str]] = csv.where(table, "role", "engineer")
print(csv.to_json(engineers))

Functions like rows() now return [[str]] instead of [any]. Column access uses direct indexing instead of splitting strings on every call. Filtering uses table[1..] to skip the header — no more manual index tracking.

The Lesson

We could have found these bugs with carefully written test suites. But we found them faster by building something real. A CSV parser exercises string escaping, nested data structures, operator expressions, function naming, and list slicing — all in one small module.

Every language has gaps between what it claims to support and what actually works when you try to use it. The best way to close those gaps is to write real code in the language, and to be honest about what breaks.

Six bugs. Six fixes. One better language.
