sINI1 Format Specification (sINI0)

This work by Tai Kedzierski is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License. Please visit http://creativecommons.org/licenses/by-sa/3.0/ for more information.

You can copy, distribute and adapt this work as you wish, provided you clearly acknowledge the original author through an appropriate means.

=== Abstract ===

This specification seeks to define an easy-edit / easy-parse format, hereafter named "sINI". It builds upon the sINI0 base specification and adds some features to that previous version.

The purpose of sINI is to provide a fixed, standardized and extended parametric configuration file format. A sINI file is intended to be easy to write, easy to write parsing code for, and easy for a human to read and edit.

This specification defines the first extended version, "sINI1", whose features any future version of sINI should include.

=== 0 - Definitions ===

Character - any single byte from a character encoding
Alphabetic - any printable character from a-z and A-Z strictly, as defined for 7-bit ASCII
Numeric - any character in 0-9
Alphanumeric - any character that is Alphabetic or Numeric
IWS - in-line whitespace, namely space characters and tabulation characters
Printable - any printable character in the 7-bit ASCII range, and IWS
EOL - a line ending, determined by a CRLF sequence. A sequence ending at EOL does not include the CRLF sequence unless otherwise specified.
SOL - start of a line, occurring just before the first character of the file, or just after a line terminator (after EOL)

=== 1 - Fundamentals ===

sINI files MUST start with a semi-colon, followed by the characters "sINI", followed immediately by a sequence of Numerics, for example

    ;sINI1

This is the sINI identifier.

If there is IWS after the sINI identifier, the character sequence that follows until the EOL determines the file encoding, e.g.
    ;sINI1 utf-8

The default encoding is UTF-8.

The numeric part x of sINI[x] indicates the sINI version. All features in a given version MUST be catered for in later versions. This means that all features of sINI0 must be supported by sINI1 parsers, and so forth.

=== 2 - Structure ===

Every sINI file is organized into Sections. Every section has zero-to-many key-value pairs, also known as parameters.

Section names and key names MUST contain an Alphabetic character, and MAY contain any number of other Alphanumeric characters, as well as the additional characters "-", "_", ".".

A section declaration must be:

    SOL "[" section-name "]" EOL

A parameter consists of

    SOL KEY "=" VALUE EOL

Any IWS appearing between the "=" sign and EOL is considered part of the data. VALUE may consist of any Printable. VALUE may not contain CR, LF or CR-LF characters, as these are reserved for EOL.

Any parameter must be declared as part of a section. A parameter is part of the last section to have been declared. If no section has been declared, the parameter is declared in the section "Main". The following sINI samples are thus equivalent:

    ;sINI0
    [Main]
    item1=hi
    item2=bye

and

    ;sINI0
    item2=bye
    [Main]
    item1=hi

and

    ;sINI0
    item1=hi
    item2=bye

If a section is explicitly declared, a parser MUST reveal this, even if the section has no data. "Main" exists only if explicitly declared, or if some data has been assigned under it by default. The following are thus NOT equivalent:

    ;sINI0
    [Main]
    [Sec1]

and

    ;sINI0
    [Sec1]

If a line starts with a semi-colon, then the line is discarded, along with any continuation of that line. See section 3, Data over several lines, below.

=== 3 - Data over several lines ===

If the last character before EOL is a backslash ("\"), then the next line is deemed the continuation of the current line and the backslash is discarded. This rule is also valid for comments.
The following items are equivalent:

    ;sINI0
    multiline=this data is \
    on several lines

and

    ;sINI0
    multiline=this data is on several lines

If two backslashes are present consecutively, they stand for a single backslash character, as part of the data. This is valid:

    ;sINI0
    multiline=this data is \\
    newkey=on several lines

This is not valid, as the second line is no longer a continuation:

    ;sINI0
    multiline=this data is \\
    on several lines

Data can also contain line breaks under the sINI1 specification. See section 5: sINI1 Extended Features.

=== 4 - Duplicity ===

If a section is declared twice, then its contents are merged as if it were one section. If a key is duplicated within a section, then the last declaration takes precedence.

The following are equivalent:

    ;sINI0
    [Sec1]
    keyA=data
    keyB=info
    [Sec2]
    key1=name
    key1=rename
    [Sec1]
    keyA=new data

and

    ;sINI0
    [Sec1]
    keyA=new data
    keyB=info
    [Sec2]
    key1=rename

=== 5 - sINI1 Extended Features ===

If a key's value is ">>" (with no trailing whitespace), the lines following the parameter's line form the actual data for the key. The lines are read until a line containing the single character "." (dot) is encountered:

    ;sINI1
    mykey=>>
    This data is truly multiline
    ..Starts with a dot
    It can span across lines
    .
    newkey=This is data in a new key

If a line of the data starts with ".", then that line is written to the file with an additional "." prepended, which the parser removes again. The actual data in the above mykey parameter is:

    This data is truly multiline
    .Starts with a dot
    It can span across lines

Parsers are not required to guarantee that the type of line terminator - CR, LF or CRLF - will be preserved upon parsing. If such information is essential, it is advised to encode the data.
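
The rules in sections 1 to 5 can be illustrated with a minimal parser sketch in Python. It is not a reference implementation: the names parse_sini and join_continuation are hypothetical, and several behaviours the specification leaves open are assumptions of the sketch. It accepts CR, LF or CRLF as EOL, ignores blank lines, treats doubled backslashes specially only at the end of a line, honours ">>" only when the identifier declares version 1 or later, takes the lines of a ">>" block as-is (no comment or continuation handling), and joins them with LF.

```python
import re

# A name must contain at least one letter; digits, "-", "_" and "." are also allowed.
NAME = re.compile(r"^[A-Za-z0-9._-]*[A-Za-z][A-Za-z0-9._-]*$")
HEADER = re.compile(r"^;sINI(\d+)(?:[ \t]+(.*))?$")


def join_continuation(lines, i):
    """Return (logical_line, next_index), joining backslash-continued lines."""
    parts = []
    while True:
        raw = lines[i]
        i += 1
        body = raw.rstrip("\\")
        run = len(raw) - len(body)               # count of trailing backslashes
        parts.append(body + "\\" * (run // 2))   # each pair is one literal backslash
        if run % 2 == 0 or i >= len(lines):      # no lone trailing backslash: stop
            return "".join(parts), i


def parse_sini(text):
    """Parse sINI text into (version, encoding, {section: {key: value}})."""
    lines = re.split(r"\r\n|\r|\n", text)        # assumption: lenient about EOL type
    header = HEADER.match(lines[0])
    if header is None:
        raise ValueError("missing sINI identifier on the first line")
    version = int(header.group(1))
    encoding = header.group(2) or "utf-8"        # UTF-8 is the stated default

    sections = {}
    current = "Main"                             # implicit section before any "[...]"
    i = 1
    while i < len(lines):
        line, i = join_continuation(lines, i)
        if line.startswith(";") or line.strip(" \t") == "":
            continue                             # comment, or blank line (assumption)
        if line.startswith("[") and line.endswith("]"):
            if not NAME.match(line[1:-1]):
                raise ValueError(f"bad section name: {line!r}")
            current = line[1:-1]
            sections.setdefault(current, {})     # declared sections exist even if empty
            continue
        key, sep, value = line.partition("=")
        if not sep or not NAME.match(key):
            raise ValueError(f"not a section or parameter: {line!r}")
        if value == ">>" and version >= 1:       # sINI1 block data
            block = []
            while i < len(lines) and lines[i] != ".":
                raw = lines[i]
                block.append(raw[1:] if raw.startswith(".") else raw)
                i += 1
            i += 1                               # skip the terminating "." line
            value = "\n".join(block)
        sections.setdefault(current, {})[key] = value  # last declaration wins
    return version, encoding, sections


sample = ";sINI1\r\nmykey=>>\r\n..dotted\r\nplain\r\n.\r\n[Sec1]\r\n"
print(parse_sini(sample))
```

Repeated sections merge and empty sections remain visible because every section is stored in one dictionary keyed by its name. The value is never stripped, since any IWS after the "=" sign is part of the data.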