# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands

```bash
# Build
cargo build

# Run (compress a file)
cargo run -- [-m c]

# Run (extract a file)
cargo run -- -m x

# Run tests
cargo test

# Run a single test
cargo test

# Run tests in a specific module
cargo test --lib node::test
cargo test --lib hufftree::base::test
cargo test --lib hufftree::canonical_tests
cargo test --lib storage::test
```

## Architecture

This is a CLI tool for compressing and decompressing UTF-8 text files using canonical Huffman encoding. The compressed format uses `.z` as the file extension.

**Compression pipeline** (`-m c`):

1. `hufftree::base::get_char_frequencies` — counts character frequencies in input text
2. `hufftree::base::Hufftree::new` — builds a Huffman tree using a min-heap (`BinaryHeap`)
3. `hufftree::canonical::CanonicalHufftree::from_tree` — converts the base tree into canonical form (codes reassigned by length, then frequency order)
4. `storage::store_tree_and_text` — writes the compressed file

**Decompression pipeline** (`-m x`):

1. `storage::read_tree_and_text` — reads the file, reconstructs `CanonicalHufftree::from_vec`, decodes the text

**Binary file format** (defined in `src/storage.rs`):

```
4 bytes     — total bit length of the remaining data
n×8 bytes   — tree entries: (4 bytes code_length BE) + (4 bytes UTF-8 char)
4 bytes     — delimiter (0xFFFFFFFF)
m bytes     — Huffman-encoded text (padded to byte boundary)
```

**Key types:**

- `node::Node` — binary tree node with optional `char` and frequency; ordered by frequency
- `hufftree::base::Hufftree` — wraps the root `Node` and the character list; used only during compression
- `hufftree::canonical::CanonicalHufftree` — bidirectional map (`BiMap`) for encode/decode; also stores `storage_char_codes: Vec<(char, u32)>` for serialization
- `cli::Args` / `cli::Mode` — Clap-derived CLI args; mode defaults to `C` (compress)

**Dependencies:** `bit-vec` for `BitVec`, `bimap` for bidirectional char↔code lookup, `clap` for CLI parsing.
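
**Canonical code assignment (illustrative sketch):** Because the file stores only `(char, code_length)` pairs, the decoder has to be able to rebuild every code from the lengths alone. The sketch below shows the standard way to do that. It is not copied from `hufftree::canonical`: the function name `assign_canonical_codes` is hypothetical, and ties between equal-length codes are broken by the caller's input order, which is an assumption standing in for the repository's frequency ordering.

```rust
/// Assigns canonical Huffman codes from (symbol, code length) pairs.
///
/// Hypothetical helper for illustration only.
/// Assumptions: code lengths are at most 31 bits and satisfy the Kraft
/// inequality, and the input order is the desired tie-break order for
/// symbols that share a code length.
/// Returns (symbol, code, length); the code occupies the low `length` bits.
fn assign_canonical_codes(lengths: &[(char, u32)]) -> Vec<(char, u32, u32)> {
    // A stable sort by length keeps the caller's order within each length.
    let mut sorted: Vec<(char, u32)> = lengths.to_vec();
    sorted.sort_by_key(|&(_, len)| len);

    let mut out = Vec::with_capacity(sorted.len());
    let mut code: u32 = 0;
    let mut prev_len: u32 = 0;
    for (i, &(ch, len)) in sorted.iter().enumerate() {
        if i > 0 {
            // Each code is the previous code plus one...
            code += 1;
        }
        // ...shifted left whenever the code length grows.
        code <<= len - prev_len;
        out.push((ch, code, len));
        prev_len = len;
    }
    out
}
```

With lengths `a:1, b:2, c:3, d:3` this yields the codes `0`, `10`, `110`, `111`, so the encoder and the decoder arrive at the same table without the codes themselves being stored.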
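
**Tree table layout (illustrative sketch):** The binary format above describes each tree entry as 8 bytes followed by a 4-byte delimiter. The sketch below shows one way such a table could be serialized. It is not the code in `src/storage.rs`: the function name `encode_tree_table` is hypothetical, and placing the UTF-8 bytes at the start of the 4-byte char field with zero padding after them is an assumption. The leading 4-byte bit length and the encoded text are left out.

```rust
/// Serializes (char, code_length) entries followed by the delimiter.
///
/// Hypothetical helper for illustration only.
/// Assumption: the char's UTF-8 bytes are left-aligned in its 4-byte
/// field and the unused bytes are zero; the real layout may differ.
fn encode_tree_table(entries: &[(char, u32)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(entries.len() * 8 + 4);
    for &(ch, code_length) in entries {
        // 4 bytes: code length, big-endian.
        out.extend_from_slice(&code_length.to_be_bytes());
        // 4 bytes: the char as UTF-8 (1 to 4 bytes), zero-padded.
        let mut buf = [0u8; 4];
        ch.encode_utf8(&mut buf);
        out.extend_from_slice(&buf);
    }
    // 4 bytes: delimiter marking the end of the tree table.
    out.extend_from_slice(&0xFFFF_FFFFu32.to_be_bytes());
    out
}
```

Under these assumptions, two entries produce `2 × 8 + 4 = 20` bytes. A reader can scan 8-byte records until it hits `0xFFFFFFFF` in the code-length position, which cannot collide with a real code length.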