@@ -1,57 +0,0 @@
-# CLAUDE.md
-
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## Commands
-
-```bash
-# Build
-cargo build
-
-# Run (compress a file)
-cargo run -- <input_file> [-m c]
-
-# Run (extract a file)
-cargo run -- <input_file> -m x
-
-# Run tests
-cargo test
-
-# Run a single test
-cargo test <test_name>
-
-# Run tests in a specific module
-cargo test --lib node::test
-cargo test --lib hufftree::base::test
-cargo test --lib hufftree::canonical_tests
-cargo test --lib storage::test
-```
-
-## Architecture
-
-This is a CLI tool for compressing and decompressing UTF-8 text files using canonical Huffman encoding. The compressed format uses `.z` as the file extension.
-
-**Compression pipeline** (`-m c`):
-1. `hufftree::base::get_char_frequencies` — counts character frequencies in input text
-2. `hufftree::base::Hufftree::new` — builds a Huffman tree using a min-heap (`BinaryHeap<Reverse<Node>>`)
-3. `hufftree::canonical::CanonicalHufftree::from_tree` — converts the base tree into canonical form (codes reassigned by length, then frequency order)
-4. `storage::store_tree_and_text` — writes the compressed file
-
-**Decompression pipeline** (`-m x`):
-1. `storage::read_tree_and_text` — reads the file, reconstructs `CanonicalHufftree::from_vec`, decodes the text
-
-**Binary file format** (defined in `src/storage.rs`):
-```
-4 bytes — total bit length of the remaining data
-n×8 bytes — tree entries: (4 bytes code_length BE) + (4 bytes UTF-8 char)
-4 bytes — delimiter (0xFFFFFFFF)
-m bytes — Huffman-encoded text (padded to byte boundary)
-```
-
-**Key types:**
-- `node::Node` — binary tree node with optional `char` and frequency; ordered by frequency
-- `hufftree::base::Hufftree` — wraps the root `Node` and the character list; used only during compression
-- `hufftree::canonical::CanonicalHufftree` — bidirectional map (`BiMap<char, BitVec>`) for encode/decode; also stores `storage_char_codes: Vec<(char, u32)>` for serialization
-- `cli::Args` / `cli::Mode` — Clap-derived CLI args; mode defaults to `C` (compress)
-
-**Dependencies:** `bit-vec` for `BitVec`, `bimap` for bidirectional char↔code lookup, `clap` for CLI parsing.
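
The canonical form described in step 3 of the compression pipeline can be illustrated with a minimal Rust sketch. It is not taken from the repository: the function name `canonical_codes`, the string representation of codes, and the assumption that entries arrive sorted by non-decreasing code length (with ties already in their stored order) are all hypothetical. It shows only the standard canonical assignment rule, where each code is the previous code plus one, shifted left whenever the length grows.

```rust
/// Assign canonical Huffman codes from (symbol, code_length) pairs.
///
/// Hypothetical helper, not the repository's API. Assumes:
/// - `entries` is sorted by non-decreasing code length,
/// - every length is between 1 and 32,
/// - the lengths come from a valid Huffman tree.
fn canonical_codes(entries: &[(char, u32)]) -> Vec<(char, String)> {
    let mut out = Vec::with_capacity(entries.len());
    let mut code: u32 = 0;
    let mut prev_len: u32 = 0;

    for (i, &(ch, len)) in entries.iter().enumerate() {
        if i > 0 {
            // Next code = previous code + 1, shifted left by the growth in length.
            code = (code + 1) << (len - prev_len);
        }
        // Render the code as a zero-padded binary string of exactly `len` bits.
        out.push((ch, format!("{:0width$b}", code, width = len as usize)));
        prev_len = len;
    }
    out
}
```

Under those assumptions, lengths of 1, 2, 3, 3 for `a`, `b`, `c`, `d` yield `0`, `10`, `110`, `111`, which is why storing only `(code_length, char)` pairs is enough to rebuild the codes on extraction.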