sINI Base Format Specification (sINI0)

This work by Tai Kedzierski is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License. Please visit http://creativecommons.org/licenses/by-sa/3.0/ for more information.

You can copy, distribute and adapt this work as you wish, provided you clearly acknowledge the original author through an appropriate means.

=== Abstract ===

This specification defines an easy-to-edit, easy-to-parse format, hereafter named "sINI", that is generally compatible with the existing INI format(s).

As there is no definitive specification for the general INI format and its implementations, sINI is not intended to be fully compatible with any of them, although in most cases a basic INI parser will understand sINI0, and vice versa.

The purpose of sINI is to provide a fixed, standardized and extended parametric configuration file format. A sINI file is intended to be easy to write, easy to write parsing code for, and easy for a human to read and edit.

This specification defines the base version, "sINI0", upon which any other sINI format is built.

=== 0 - Definitions ===

Character - any single byte from a character encoding
Alphabetic - any printable character from a-z and A-Z strictly, as defined for 7-bit ASCII
Numeric - any character in 0-9
Alphanumeric - any character that is Alphabetic or Numeric
IWS - in-line whitespace, namely space characters and tabulation characters
Printable - any printable character in the 7-bit ASCII range, and IWS
EOL - a line ending, determined by a CRLF sequence. A sequence ending at EOL does not include the CRLF sequence unless otherwise specified.
SOL - start of a line, occurring just before the first character of the file, or just after a line terminator (after EOL)

=== 1 - Fundamentals ===

sINI files MUST start with a semi-colon, followed by the characters "sINI", followed immediately by a sequence of Numerics, for example

    ;sINI0

This is the sINI identifier.
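
As an illustration only, the identifier rule above could be checked as in the following Python sketch. The helper name sini_version and its behaviour on error are assumptions made for this example and are not part of the specification. The sketch reads the digits that follow ";sINI" and ignores anything after them.

```python
import re

# Hypothetical helper, not defined by the specification.
# A semi-colon, the characters "sINI", then one or more ASCII digits (Numerics).
_IDENTIFIER = re.compile(r";sINI([0-9]+)")


def sini_version(first_line: str) -> int:
    """Return the version number found in the sINI identifier.

    Raises ValueError if the line does not begin with an identifier.
    Anything after the digits is ignored by this sketch.
    """
    match = _IDENTIFIER.match(first_line)  # match() anchors at the start
    if match is None:
        raise ValueError("not a sINI file: missing ';sINI<digits>' identifier")
    return int(match.group(1))
```

For instance, sini_version(";sINI0") returns 0, while a first line of "[Main]" raises ValueError.
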
If there is IWS after the sINI identifier, the character sequence that follows, up to the EOL, determines the file encoding, e.g.

    ;sINI0 utf-8

The default encoding is UTF-8.

The numeric part x of sINI[x] indicates the sINI version. All features of a given version MUST be supported by every later version. This means that all features of sINI0 must be supported by sINI1 parsers, and so forth.

=== 2 - Structure ===

Every sINI file is organized into Sections. Every section has zero-to-many key-value pairs, also known as parameters.

Section names and key names MUST contain at least one Alphabetic character, and MAY contain any number of other Alphanumeric characters, as well as the additional characters "-", "_" and ".".

A section declaration must be:

    SOL "[" section-name "]" EOL

A parameter consists of:

    SOL KEY "=" VALUE EOL

Any IWS appearing between the "=" sign and EOL is considered part of the data. VALUE may consist of any Printable. VALUE may not contain CR, LF or CR-LF characters, as these are reserved for EOL.

Every parameter must be declared as part of a section. A parameter belongs to the most recently declared section. If no section has been declared, the parameter belongs to the section "Main".

The following sINI samples are thus equivalent:

    ;sINI0
    [Main]
    item1=hi
    item2=bye

and

    ;sINI0
    item2=bye
    [Main]
    item1=hi

and

    ;sINI0
    item1=hi
    item2=bye

If a section is explicitly declared, a parser MUST report it, even if the section has no data. "Main" exists only if explicitly declared, or if some data has been assigned to it by default.

The following are thus NOT equivalent:

    ;sINI0
    [Main]
    [Sec1]

and

    ;sINI0
    [Sec1]

If a line starts with a semi-colon, then the line is discarded, along with any continuation of that line. See "Data over several lines" below.

=== 3 - Data over several lines ===

If the last character before EOL is a backslash ("\") then the next line is deemed the continuation of the current line and the backslash is discarded. This rule is also valid for comments.
The following items are equivalent:

    ;sINI0
    multiline=this data is \
    on several lines

and

    ;sINI0
    multiline=this data is on several lines

If two backslashes are present consecutively, they stand for a single backslash character, as part of the data.

This is valid:

    ;sINI0
    multiline=this data is \\
    newkey=on several lines

This is not valid:

    ;sINI0
    multiline=this data is \\
    on several lines

=== 4 - Duplication ===

If a section is declared twice, then its contents are merged as if it were one section. If a key is duplicated within a section, then the last declaration takes precedence.

The following are equivalent:

    ;sINI0
    [Sec1]
    keyA=data
    keyB=info
    [Sec2]
    key1=name
    key1=rename
    [Sec1]
    keyA=new data

and

    ;sINI0
    [Sec1]
    keyA=new data
    keyB=info
    [Sec2]
    key1=rename
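
The rules of this specification can be sketched as a small Python parser. This is only one possible reading, not a reference implementation. The names parse_sini0 and _logical_lines are hypothetical, the result type (a dict of dicts) is a choice made for the example, and skipping empty lines is an assumption, since the specification does not say how they should be treated. The encoding suffix of the identifier is not interpreted here.

```python
import re

# Names need at least one Alphabetic; Numerics, "-", "_" and "." are also allowed.
_NAME = r"[A-Za-z0-9._-]*[A-Za-z][A-Za-z0-9._-]*"
_SECTION = re.compile(r"\[(" + _NAME + r")\]")
_PARAM = re.compile(r"(" + _NAME + r")=(.*)")


def _logical_lines(physical_lines):
    """Join continued lines and collapse doubled backslashes."""
    parts = []
    for line in physical_lines:
        trailing = len(line) - len(line.rstrip("\\"))
        continued = trailing % 2 == 1  # an unpaired final backslash continues the line
        if continued:
            line = line[:-1]  # the continuation backslash is discarded
        parts.append(line.replace("\\\\", "\\"))  # two backslashes stand for one
        if not continued:
            yield "".join(parts)
            parts = []
    if parts:  # the file ended in the middle of a continuation
        yield "".join(parts)


def parse_sini0(text):
    """Parse sINI0 text into {section name: {key: value}}."""
    lines = text.split("\r\n")  # EOL is a CRLF sequence
    if re.match(r";sINI[0-9]+", lines[0]) is None:
        raise ValueError("missing sINI identifier")
    sections = {}
    current = "Main"  # default section, created only once it receives data
    for line in _logical_lines(lines[1:]):
        if line == "" or line.startswith(";"):
            continue  # assumption: empty lines are skipped; comments are discarded
        section = _SECTION.fullmatch(line)
        if section:
            current = section.group(1)
            sections.setdefault(current, {})  # declared sections exist even if empty
            continue
        param = _PARAM.fullmatch(line)
        if param is None:
            raise ValueError(f"invalid line: {line!r}")
        # Re-declared sections merge; the last declaration of a key wins.
        sections.setdefault(current, {})[param.group(1)] = param.group(2)
    return sections
```

Under this reading, each group of samples described as equivalent in sections 2 to 4 parses to the same dictionary, and the sample marked as not valid in section 3 raises ValueError, because "on several lines" is neither a section declaration nor a parameter.
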