This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
# Build
cargo build
# Run (compress a file)
cargo run -- <input_file> [-m c]
# Run (extract a file)
cargo run -- <input_file> -m x
# Run tests
cargo test
# Run a single test
cargo test <test_name>
# Run tests in a specific module
cargo test --lib node::test
cargo test --lib hufftree::base::test
cargo test --lib hufftree::canonical_tests
cargo test --lib storage::test
This is a CLI tool for compressing and decompressing UTF-8 text files using canonical Huffman encoding. The compressed format uses .z as the file extension.
Compression pipeline (-m c):
hufftree::base::get_char_frequencies - counts character frequencies in the input text
hufftree::base::Hufftree::new - builds a Huffman tree using a min-heap (BinaryHeap<Reverse<Node>>)
hufftree::canonical::CanonicalHufftree::from_tree - converts the base tree into canonical form (codes reassigned by length, then frequency order)
storage::store_tree_and_text - writes the compressed file
Decompression pipeline (-m x):
storage::read_tree_and_text - reads the file, reconstructs the tree with CanonicalHufftree::from_vec, and decodes the text
Binary file format (defined in src/storage.rs):
4 bytes — total bit length of the remaining data
n×8 bytes — tree entries: (4 bytes code_length BE) + (4 bytes UTF-8 char)
4 bytes — delimiter (0xFFFFFFFF)
m bytes — Huffman-encoded text (padded to byte boundary)
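The layout above can be illustrated with a minimal sketch. It is not the code in src/storage.rs, and the function names write_header and read_header are hypothetical. Two details are assumptions, because the description does not state them: the leading length field is written big-endian like code_length, and each char's UTF-8 bytes are left-aligned in their 4-byte slot with zero padding.

```rust
/// Marks the end of the tree entries, per the format description.
const DELIMITER: u32 = 0xFFFF_FFFF;

/// Serialize the length field, the tree entries, and the delimiter.
/// ASSUMPTION: total_bits is big-endian (the description only says so for code_length).
/// ASSUMPTION: UTF-8 bytes are left-aligned in a zero-padded 4-byte slot.
fn write_header(total_bits: u32, entries: &[(char, u32)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + entries.len() * 8);
    out.extend_from_slice(&total_bits.to_be_bytes());
    for &(ch, code_length) in entries {
        out.extend_from_slice(&code_length.to_be_bytes());
        let mut slot = [0u8; 4];
        ch.encode_utf8(&mut slot); // writes 1 to 4 bytes; the rest stay zero
        out.extend_from_slice(&slot);
    }
    out.extend_from_slice(&DELIMITER.to_be_bytes());
    out
}

/// Read a big-endian u32 at `pos`, or None if the input is too short.
fn be_u32(bytes: &[u8], pos: usize) -> Option<u32> {
    let b = bytes.get(pos..pos + 4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Parse the header. Returns (total_bits, entries, offset of the encoded text).
fn read_header(bytes: &[u8]) -> Option<(u32, Vec<(char, u32)>, usize)> {
    let total_bits = be_u32(bytes, 0)?;
    let mut entries = Vec::new();
    let mut pos = 4;
    loop {
        let code_length = be_u32(bytes, pos)?;
        pos += 4;
        if code_length == DELIMITER {
            return Some((total_bits, entries, pos));
        }
        let slot = bytes.get(pos..pos + 4)?;
        pos += 4;
        // The first byte of a UTF-8 sequence tells how many bytes it spans.
        let len = match slot[0] {
            0x00..=0x7F => 1,
            0xC0..=0xDF => 2,
            0xE0..=0xEF => 3,
            0xF0..=0xF7 => 4,
            _ => return None,
        };
        let ch = std::str::from_utf8(&slot[..len]).ok()?.chars().next()?;
        entries.push((ch, code_length));
    }
}
```

Because every entry starts with its code_length, a reader can recognize the delimiter by checking the first 4 bytes at each entry boundary. The offset returned by read_header is where the Huffman-encoded text begins.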
Key types:
node::Node - binary tree node with optional char and frequency; ordered by frequency
hufftree::base::Hufftree - wraps the root Node and the character list; used only during compression
hufftree::canonical::CanonicalHufftree - bidirectional map (BiMap<char, BitVec>) for encode/decode; also stores storage_char_codes: Vec<(char, u32)> for serialization
cli::Args / cli::Mode - Clap-derived CLI args; mode defaults to C (compress)
Dependencies: bit-vec for BitVec, bimap for bidirectional char↔code lookup, clap for CLI parsing.
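The first three stages of the compression pipeline (frequency count, min-heap construction, canonical code assignment) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the repository's implementation. The function names are hypothetical, the heap holds flat groups of chars instead of Node values, codes are shown as strings of '0' and '1' instead of BitVec, and ties within one code length are broken by char value rather than by frequency order.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};

/// Count how often each char occurs in the text.
fn char_frequencies(text: &str) -> HashMap<char, u64> {
    let mut freqs = HashMap::new();
    for ch in text.chars() {
        *freqs.entry(ch).or_insert(0) += 1;
    }
    freqs
}

/// Derive a Huffman code length per char with a min-heap.
/// Each heap entry is (weight, smallest char in the group, chars in the group).
/// Merging two groups pushes every char in them one level deeper.
fn code_lengths(freqs: &HashMap<char, u64>) -> HashMap<char, u32> {
    let mut lengths: HashMap<char, u32> = freqs.keys().map(|&c| (c, 0)).collect();
    let mut heap: BinaryHeap<Reverse<(u64, char, Vec<char>)>> = freqs
        .iter()
        .map(|(&c, &w)| Reverse((w, c, vec![c])))
        .collect();
    if heap.len() == 1 {
        // A lone symbol still needs one bit.
        for len in lengths.values_mut() {
            *len = 1;
        }
    }
    while heap.len() > 1 {
        let Reverse((w1, c1, mut group)) = heap.pop().unwrap();
        let Reverse((w2, c2, other)) = heap.pop().unwrap();
        group.extend(other);
        for c in &group {
            *lengths.get_mut(c).unwrap() += 1;
        }
        heap.push(Reverse((w1 + w2, c1.min(c2), group)));
    }
    lengths
}

/// Assign canonical codes: sort by (length, char), then count upward,
/// shifting left whenever the length grows.
/// ASSUMPTION: the tie-break is char value; codes fit in 64 bits.
fn canonical_codes(lengths: &HashMap<char, u32>) -> Vec<(char, String)> {
    let mut sorted: Vec<(char, u32)> = lengths.iter().map(|(&c, &l)| (c, l)).collect();
    sorted.sort_by_key(|&(c, l)| (l, c));
    let mut codes = Vec::with_capacity(sorted.len());
    let mut code: u64 = 0;
    let mut prev_len = 0;
    for (i, &(ch, len)) in sorted.iter().enumerate() {
        if i > 0 {
            code = (code + 1) << (len - prev_len);
        }
        codes.push((ch, format!("{:0width$b}", code, width = len as usize)));
        prev_len = len;
    }
    codes
}
```

For the input "aaaabbc" the lengths come out as a: 1, b: 2, c: 2, and the canonical codes as a: 0, b: 10, c: 11. Only the (char, code_length) pairs are needed to rebuild these codes, which is why the file format stores lengths rather than the codes themselves.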