Development: ARC File Format
ARC
Part of Developers_Notebook-ARC-File Formats.
A file format by SEA (System Enhancement Associates) that was very popular before Phil Katz came up with the ZIP format.
Holds multiple files but, unlike Developers_Notebook-ARC-ZIP, no directories.
The format was at the center of quite a bit of controversy in the 1980s, notably SEA's 1988 lawsuit against Phil Katz's PKWARE over PKARC (probably the most important event in the open source debate).
Header
| Offset | Count | Type | Description
| 0 | 1 | wxByte | ID=1Ah
| 1 | 1 | wxByte | Compression method
| 2 | 13 | wxByte | File name (12 bytes + \0)
| 15 | 1 | wxUint32 | Compressed file size
| 19 | 1 | wxUint32 | File date and time in MS-DOS format (16-bit date followed by 16-bit time, read here as one 32-bit value; bits 0-15 all zero == no date)
| | | | Bits 0-4 = day of month
| | | | Bits 5-8 = month of year
| | | | Bits 9-15 = year - 1980 (add 1980 to get the year)
| | | | Bits 16-20 = second/2 (not displayed by UNARC, value is always even)
| | | | Bits 21-26 = minute
| | | | Bits 27-31 = hour (0-23)
| 23 | 1 | wxUint16 | 16-bit CRC (polynomial 0x8005, initial value 0, reflected)
| 25 | 1 | wxUint32 | Original file size
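The header layout above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `ArcHeader`, `decode_dos_stamp`, `parse_header` and `crc16_arc` are hypothetical, the header is assumed to be the full 29-byte form, and an all-zero date half (low 16 bits) is assumed to mean "no date". The table does not say which bytes the CRC covers, so `crc16_arc` only shows the arithmetic for the stated parameters (0x8005, 0, reflected).

```python
import struct
from datetime import datetime
from typing import NamedTuple, Optional


class ArcHeader(NamedTuple):  # hypothetical container for the table's fields
    method: int
    name: str
    compressed_size: int
    timestamp: Optional[datetime]
    crc16: int
    original_size: int


def decode_dos_stamp(stamp: int) -> Optional[datetime]:
    """Decode the 32-bit value at offset 19 using the bit layout in the table."""
    if stamp & 0xFFFF == 0:                  # assumption: zero date half means no date
        return None
    day = stamp & 0x1F                       # bits 0-4
    month = (stamp >> 5) & 0x0F              # bits 5-8
    year = ((stamp >> 9) & 0x7F) + 1980      # bits 9-15
    second = ((stamp >> 16) & 0x1F) * 2      # bits 16-20 hold second/2
    minute = (stamp >> 21) & 0x3F            # bits 21-26
    hour = (stamp >> 27) & 0x1F              # bits 27-31
    # datetime() raises ValueError if the stored fields are out of range.
    return datetime(year, month, day, hour, minute, second)


def parse_header(buf: bytes) -> ArcHeader:
    """Parse one 29-byte little-endian header starting at buf[0]."""
    marker, method, raw_name, csize, stamp, crc, osize = struct.unpack_from(
        "<BB13sIIHI", buf
    )
    if marker != 0x1A:
        raise ValueError("missing 1Ah header ID")
    # The name is NUL-terminated inside its 13-byte field.
    name = raw_name.split(b"\0", 1)[0].decode("ascii", "replace")
    return ArcHeader(method, name, csize, decode_dos_stamp(stamp), crc, osize)


def crc16_arc(data: bytes, crc: int = 0) -> int:
    """CRC-16 with polynomial 0x8005 (0xA001 when bit-reflected), initial value 0."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc
```

With these parameters `crc16_arc(b"123456789")` returns `0xBB3D`, the usual check value for this CRC variant, which is a quick way to confirm the implementation.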
Compression Methods
# is the compression method number from offset 1.
| # | Compression Name | Method | Notes
| 0 | End of archive marker | |
| 1 | Unpacked | Uncompressed | No Length Field, Offset 25 is File Data, ARC 1.0-Only?
| 2 | Unpacked | Uncompressed | ARC 3.1
| 3 | Packed | RLE |
| 4 | Packed & Squeezed | RLE + Huffman | Static Canonical Huffman, ARC Pre-5.20
| 5 | Crunched | 12-bit LZW | ARC Pre-4.0
| 6 | Packed & Crunched | RLE + 12-bit LZW | ARC Pre-4.1 (PAK documentation says 4.5)
| 7 | Packed & Crunched | RLE + 12-bit LZW | (Faster Hash Algorithm - Internal to SEA) - ARC 4.6
| 8 | Packed & Crunched | RLE + 9-X*-bit LZC | *X = Offset 28, only 12 has ever been accepted, ARC 5.0
| 9 | Squashed | 9-13-bit LZC | c/o Phil Katz, PKARC
| 10 | Packed & Crushed | RLE + 2-13-bit LZC | PAK only
| 11 | Distilled | 8k Dynamic Huffman | 8k == Size of Sliding Window, Similar to ZIP's Implode, PAK 2.51 only
| 12-19 | Unknown | Unknown | ARC 6.0 ?
| 20-29 | Informational items | | Original File Size (And File Name) may be ignored, ARC 6.0
| 30-39 | Control items(?) | Unknown | ARC 6.0
| 40+ | Reserved | |
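As a rough illustration of how a reader might act on the method byte, the sketch below maps a method number to the name in the table and picks a header length. The function names `describe_method` and `header_length` are hypothetical. The lengths are assumptions drawn from the notes above: 2 bytes for the end-of-archive marker, 25 bytes for method 1 (no original-size field), and 29 bytes otherwise.

```python
# Names taken from the "Compression Name" column of the table.
METHOD_NAMES = {
    0: "End of archive marker",
    1: "Unpacked",
    2: "Unpacked",
    3: "Packed",
    4: "Packed & Squeezed",
    5: "Crunched",
    6: "Packed & Crunched",
    7: "Packed & Crunched",
    8: "Packed & Crunched",
    9: "Squashed",
    10: "Packed & Crushed",
    11: "Distilled",
}


def describe_method(method: int) -> str:
    """Return the table's name for a compression method byte (0-255)."""
    if not 0 <= method <= 255:
        raise ValueError("method must fit in one byte")
    if method in METHOD_NAMES:
        return METHOD_NAMES[method]
    if method <= 19:
        return "Unknown"             # 12-19
    if method <= 29:
        return "Informational item"  # 20-29
    if method <= 39:
        return "Control item"        # 30-39
    return "Reserved"                # 40+


def header_length(method: int) -> int:
    """Assumed header size in bytes for a given method (see lead-in)."""
    if method == 0:
        return 2    # just the 1Ah ID and the method byte
    if method == 1:
        return 25   # original-size field is absent, file data starts at offset 25
    return 29
```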
Information Items
(Sub)Header
| Offset | Count | Type | Description
| 0 | 2 | byte | Length of header (includes "length" and "type"?)
| 2 | 1 | byte | (sub)type
| 3 | ? | byte | data
Information Item Type
| Block type | Subtype | Description
| 20 | | archive info
| | 0 | archive description (ASCIIZ)
| | 1 | name of creator program (ASCIIZ)
| | 2 | name of modifier program (ASCIIZ)
| 21 | | file info
| | 0 | file description (ASCIIZ)
| | 1 | long name (if not MS-DOS "8.3" filename)
| | 2 | extended date-time info (reserved)
| | 3 | icon (reserved)
| | 4 | file attributes (ASCIIZ)
| | | Attributes use an uppercase letter to signify the following:
| | | R read access
| | | W write access
| | | H hidden file
| | | S system file
| | | N network shareable
| 22 | | operating system info (reserved)
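The (sub)header and attribute letters above can be walked with a short Python sketch. It rests on assumptions the tables leave open: the 2-byte length is taken to be little-endian, and the flag `length_includes_header` (a hypothetical parameter) controls whether that length counts the 3 bytes of "length" and "type", since the source marks this as uncertain. The names `iter_info_items` and `decode_attributes` are also hypothetical.

```python
import struct
from typing import Iterator, List, Tuple

# Attribute letters from the file info subtype 4 description.
ATTRIBUTE_LETTERS = {
    "R": "read access",
    "W": "write access",
    "H": "hidden file",
    "S": "system file",
    "N": "network shareable",
}


def iter_info_items(
    block: bytes, length_includes_header: bool = True
) -> Iterator[Tuple[int, bytes]]:
    """Yield (subtype, data) for each sub-item in an informational block."""
    pos = 0
    while pos + 3 <= len(block):
        (length,) = struct.unpack_from("<H", block, pos)  # offset 0, 2 bytes
        subtype = block[pos + 2]                          # offset 2, 1 byte
        # Assumption: see lead-in; the source is unsure what the length covers.
        data_len = length - 3 if length_includes_header else length
        if data_len < 0 or pos + 3 + data_len > len(block):
            raise ValueError("truncated or malformed information item")
        yield subtype, block[pos + 3 : pos + 3 + data_len]
        pos += 3 + data_len


def decode_attributes(data: bytes) -> List[str]:
    """Translate an ASCIIZ attribute string such as b'RW\\0' into descriptions."""
    text = data.split(b"\0", 1)[0].decode("ascii", "replace")
    return [ATTRIBUTE_LETTERS.get(ch, "unknown: " + ch) for ch in text]
```

Keeping the length interpretation behind a flag makes it easy to try both readings against a real archive before settling on one.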
Remarks
- All multi-byte fields are little-endian (wxLITTLE_ENDIAN).
- If the 1Ah header ID is not found where it is expected, UNARC and other ARC-compatible archivers search the next 64 bytes for it.
- The exception is the beginning of the file, where they search an extra 3 bytes to support self-extracting (Developers_Notebook-ARC-SFX) files.
- The LZC version used in ARC files is the block compress version of Developers_Notebook-ARC-Z files.
- The ARC file ends with an end-of-archive mark: two wxBytes with the values 0x1A and 0x00.
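Taken together, the remarks above suggest a simple member-listing loop, sketched here in Python. It is an illustration under stated assumptions, not how UNARC itself is written: the function name `iter_members` is hypothetical, "an extra 3 bytes" is read as a 67-byte search window at the start of the file, method 1 is assumed to use a 25-byte header, and the first 0x1A byte found in the window is trusted to be a real header.

```python
import struct
from typing import Iterator, Tuple


def iter_members(data: bytes) -> Iterator[Tuple[str, int, int, int]]:
    """Yield (name, method, data_offset, compressed_size) for each member."""
    pos = 0
    while True:
        # Assumption: 64-byte resync window, plus 3 bytes at the start of the file.
        window = 64 + 3 if pos == 0 else 64
        pos = data.find(b"\x1a", pos, pos + window)
        if pos < 0 or pos + 2 > len(data):
            raise ValueError("no 1Ah header found")
        method = data[pos + 1]
        if method == 0:
            return                      # end-of-archive mark: 0x1A 0x00
        hdr_len = 25 if method == 1 else 29
        if pos + hdr_len > len(data):
            raise ValueError("truncated header")
        # File name (13 bytes) and compressed size (little-endian) follow the method.
        raw_name, csize = struct.unpack_from("<13sI", data, pos + 2)
        name = raw_name.split(b"\0", 1)[0].decode("ascii", "replace")
        yield name, method, pos + hdr_len, csize
        pos += hdr_len + csize          # skip the member's data to the next header
```

A more careful reader would validate each candidate header (for example a plausible method number and file name) before accepting it, because a stray 0x1A byte inside the search window would otherwise be misread.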
Resources
- NARC.DOC from NARC
- Some good compression tutorials related to ARC
- APPNOTE.TXT from PKPAK
- Some misc stuff
- http://www.corion.net/cgi-bin/wiki.cgi/display/Format%3AARC
- General file format info
- arc_file.inf (contains info from UNARC.INF and SQSHINFO.DOC, may be called arc_file.ini)
- Covers SFX files and the CRC. Quite useful.
- ARCHIVE FORMATS AND DATA by Raymond Clay
- Generally inaccurate! However, it is the only document that describes the actual compression methods behind Distilled and Crushed as used by PAK
- Floating around the internet as ARCHIVES.TXT
- SQUASH.ZIP, which contains the algorithm for ARC's dynamic LZW variations
- Available at Simtel's MS-DOS archive