An FTXT reader program can read the text and ignore all formatting and structural information in a document form that uses forms ftxt for the leaf nodes. this amounts to stripping a document down to a stream of plain text. It would do this by skipping over all chunks except FTXT.chrs (CHRS chunks found inside a form ftxt) and within the ftxt.chrs chunks skipping all control characters and control sequences. (appendix c diagrams this text scanner.) It may also read FTXT.fons chunks to find a description for font 1. Here's a Pascal-ish program for an ftxt stripper. given a form (a document of some kind), it scans for all FTXT.chrs chunks. this would likely be applied to the first FORM in an IFF file. PROCEDURE ReadFORM4CHRS(); {Read an IFF FORM for FTXT.CHRS chunks.} BEGIN IF the FORM's subtype = "FTXT" THEN ReadFTXT4CHRS() ELSE WHILE something left to read in the FORM DO BEGIN read the next chunk header; CASE the chunk's ID OF "LIST", "CAT ": ReadCAT4CHRS(); "FORM": ReadFORM4CHRS(); OTHERWISE skip the chunk's body; END END END; {Read a LIST or CAT for all FTXT.CHRS chunks.} PROCEDURE ReadCAT4CHRS(); BEGIN WHILE something left to read in the LIST or CAT DO BEGIN read the next chunk header; CASE the chunk's ID OF "LIST", "CAT ": ReadCAT4CHRS(); "FORM": ReadFORM4CHRS(); "PROP": IF we're reading a LIST AND the PROP's subtype = "FTXT" THEN read the PROP for "FONS" chunks; OTHERWISE error--malformed IFF file; END END END; PROCEDURE ReadFTXT4CHRS(); {Read a FORM FTXT for CHRS chunks.} BEGIN WHILE something left to read in the FORM FTXT DO BEGIN read the next chunk header; CASE the chunk's ID OF "CHRS": ReadCHRS(); "FONS": BEGIN read the chunk's contents into a FontSpecifier variable; IF the font specifier's id = 1 THEN use this font; END; OTHERWISE skip the chunk's body; END END END; {Read an FTXT.CHRS. Skip all control sequences and unused control chars.} PROCEDURE ReadCHRS(); BEGIN WHILE something left to read in the CHRS chunk DO CASE read the next character OF LF: start a new output line; ESC: SkipControl([' '..'/'], ['0'..'~']); IN [' '..'~'], IN [NBSP..'ÿ']: output the character; SS2, SS3: ; {Just handle the following G0 character directly, ignoring the shift to G2 or G3.} CSI: SkipControl([' '..'?'], ['@'..'~']); DCS, OSC, PM, APC: SkipControl([' '..'~'] + [NBSP..'ÿ'], [ST]); END END; {Skip a control sequence of the format (rSet)* (tSet), i.e., any number of characters in the set rSet followed by a character in the set tSet.} PROCEDURE SkipControl(rSet, tSet); VAR c: CHAR; BEGIN REPEAT c := read the next character UNTIL c NOT IN rSet; IF c NOT IN tSet THEN put character c back into the input stream; END The following program is an optimized version of the above routines ReadFORM4CHRS and ReadCAT4CHRS for the case where you're ignoring fonts as well as formatting. It takes advantage of certain facts of the IFF format to read a document form and its nested forms, lists, and cat s without a stack. In other words, it's a hack that ignores all fonts and faces to cheaply get to the plain text of the document. {Cheap scan of an IFF FORM for FTXT.CHRS chunks.} PROCEDURE ScanFORM4CHRS(); BEGIN IF the document FORM's subtype = "FTXT" THEN ReadFTXT4CHRS() ELSE WHILE something left to read in the FORM DO BEGIN read the next chunk header; IF it's a group chunk (LIST, FORM, PROP, or CAT) THEN read its subtype ID; CASE the chunk's ID OF "LIST", "CAT ":; {NOTE: See explanation below.*} "FORM": IF this FORM's subtype = "FTXT" THEN ReadFTXT4CHRS() ELSE; {NOTE: See explanation below.*} OTHERWISE skip the chunk's body; END END END; *Note: This implementation is subtle. After reading a group header other than form ftxt it just continues reading. this amounts to reading all the chunks inside that group as if they weren't nested in a group.