Version History:

Version 4.1.9 [17/9/2006]

  - Updated code to compile under GNU GCC 4.0.2:
    - adding forward declarations and typename keywords to compensate
      for changes in resolution time of templated classes and functions
    - changed references to deprecated *.h libraries to current versions
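
    Illustration of the kind of change described above. This is a minimal
    sketch and not taken from the EMILE sources; the names registry and
    count_items are hypothetical. It shows a forward declaration of a
    class template and a 'typename' keyword on a dependent type name,
    both of which stricter template name resolution requires.

```cpp
#include <cstddef>
#include <vector>

// Forward declaration: the class template must be declared before its
// name is used in the function template declaration below.
template <typename T> class registry;

// Hypothetical helper that refers to registry<T> before its definition.
template <typename T>
std::size_t count_items(const registry<T>& r);

template <typename T>
class registry {
public:
    typedef std::vector<T> storage;
    void add(const T& value) { items_.push_back(value); }
    const storage& items() const { return items_; }
private:
    storage items_;
};

template <typename T>
std::size_t count_items(const registry<T>& r) {
    // 'typename' is required here: const_iterator depends on T, so the
    // compiler cannot assume it names a type when parsing the template.
    std::size_t n = 0;
    for (typename registry<T>::storage::const_iterator it = r.items().begin();
         it != r.items().end(); ++it) {
        ++n;
    }
    return n;
}
```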

Version 4.1.8 [18/3/2004]

  - put minimum of 2 on matching expressions/contexts required for 
    secondary contexts/expressions or rules involving types lacking
    characteristic expressions/contexts. This removes the need for
      sesp_for_no_characteristics * minimum_contexts_per_type > 100%
      scsp_for_no_characteristics * minimum_expressions_per_type > 100%
      rsp_for_no_characteristics * minimum_expressions_per_type > 100%
    and values set by the 'set support' command have been adjusted
    accordingly

(unnumbered) [14/8/2003]

  - debugged error in save_load file_var routine, where the binary
    storage of integers was architecture-dependent. Now integers are
    always stored least-significant-byte first.

  - removed extraneous spaces in rule-output in morpheme mode
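
    Illustration of the byte-order fix above (integers always stored
    least-significant-byte first). This is a sketch under assumptions,
    not the actual save_load code: the function names and the choice of
    a 32-bit unsigned integer are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append a 32-bit unsigned integer to 'out', least-significant byte
// first, regardless of the byte order of the host architecture.
void put_u32_lsb_first(std::string& out, std::uint32_t value) {
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<char>((value >> (8 * i)) & 0xFFu));
    }
}

// Read back a 32-bit unsigned integer stored least-significant byte
// first, starting at offset 'pos' of 'in'.
std::uint32_t get_u32_lsb_first(const std::string& in, std::size_t pos) {
    assert(pos + 4 <= in.size());  // caller must supply four bytes
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        const std::uint32_t byte = static_cast<unsigned char>(in[pos + i]);
        value |= byte << (8 * i);
    }
    return value;
}
```

    Because the bytes are produced by shifting rather than by copying the
    in-memory representation, a file written on one architecture reads
    back identically on another.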
	
4.1.7.1 [19/6/2003]

  - debugged error in 'clear database' routine, where the whole-sentence
    type was not properly initialized

  - debugged error in reloading large savefiles, where the array
    expression::esize was assigned values without memory having been
    allocated for it.

  - debugged minor configuration problems, moving emconfig.h to config.h,
    and adding #include <iomanip.h> to show.cc

  - debugged compilation errors caused by respecifying default parameter 
    values in definitions of declared-but-not-yet-defined functions
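
    Illustration of the default-parameter error above. This is a sketch;
    the function indent is hypothetical and not part of EMILE. A default
    value may be given in the declaration or in the definition, but not
    repeated in both.

```cpp
#include <cassert>
#include <string>

// Declaration: the default value is specified here, and only here.
std::string indent(const std::string& text, int width = 2);

// Definition: writing "int width = 2" again at this point would
// respecify the default argument, which the compiler rejects.
std::string indent(const std::string& text, int width) {
    assert(width >= 0);
    return std::string(static_cast<std::string::size_type>(width), ' ') + text;
}
```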


4.1.7 [2/8/2002]

  - Used automake and autoconf to create proper installation and
    configuration files (optimization level is now a configuration
    option, as it should be)

  - Reworked sources and configuration checks to increase portability.
    The configuration script now compensates for different versions
    of the readline library, and the code compiles under g++-3.0.

  - Solved bug where word::table was erroneously locked (after a parse
    or show-expression command, the program would not learn new words in 
    the first line of a sample)

  - Reworked generate command to avoid infinite loops

  - Added 'exit nosave' command to exit without saving or prompt.

  - Added debug command 'delete', and multiplicities to insert/delete

  - When optimizing the rule_grammar, calculate size in words rather than 
    number of rules

  - Replaced shortcircuit_rules setting by check whether this would fit
      (given current values of ruleset_increase_disallowed)

  - Changed compression rate calculation so that it uses
    the actual replaced context/expression pairs instead of the
    context/expression pairs of the entire induced submatrix

  - Discarded intention to catch the break signal. The aim was for user
    interrupts to return you to the user interface rather than abort the
    entire program, but apparently there is no way to treat signals as
    exceptions, and recreating the entire exception handling mechanism to
    handle signals is too much work. Note: if someone knows how to do this,
    please tell me.

  - Discarded intention to use editline instead of readline; editline does
    not appear to offer the option of configuring tab completion to handle 
    commands as well as files.


4.1.6.1 [11/3/2002]

  - Fixed bug where specifying the '--database' option additionally
    implied the '--quiet' option

  - Fixed bug where save files with settings and empty data caused corrupt 
    data structures

  - Fixed bug with save files in morpheme mode

  - Fixed bug where words read from files in morpheme mode were prefixed
    by whitespace

  - Added rightside map to expression data representation in morpheme mode
    to optimize calculations

  - Fixed bug where the xx_for_no_characteristics settings were not saved

  - Enhanced file save format recognition module to successfully
    implement downward-upward file compatibility

4.1.6 [27/6/2001]

  - manual written

  - Add support for negative samples (i.e. sentences which
    should not be constructible), and an 'unlearn' command.
    Also add a facility for determining the most uncertain derived
    sentence and querying an oracle about its correctness.

  - Added settings for optimizing for structure in parsings, as
    opposed to optimizing for compactness of the ruleset.

  - Add quotes, escapes, multi-line commands and multi-command
    lines to command line interpreter. As a side-effect, it is no
    longer necessary that redirection symbols are preceded by space
    and not followed by space: now, it is sufficient if they are not
    escaped or enclosed in quotes.

  - Add advanced command line editing, history buffer and tab-completion 
    to command line interpreter, using the GNU Readline/History libraries

  - Reimplemented command recognition using regular expression pattern 
    matching.

  - Use GNU getopt to get startup command line options and arguments

  - Used template functions in a number of places to combine similar
    blocks of code into one block, for improved maintainability.

  - Redesigned the save-load module for (hopefully) the last time,
    making it easily extendible, and using template functions so
    that the same block of code does both reading and writing.

  - Made source code modules separately compilable (for later linking)
    in order to speed development. Reorganized modules.
    Wrote configure and make scripts to make emile and morpheme,
    providing for versions with different levels of debugging
    and optimization (see INSTALL.TXT)

  - Removed minor bug in the algorithm eliminating rules that do not
    contribute to the covering of secondary expressions of a type.
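
    Illustration of the save-load redesign noted in this release, where
    template functions let the same block of code do both reading and
    writing. This is a sketch of the general technique only: the types
    writer, reader and settings are hypothetical, and a simple text
    format (one value per line, no newlines inside strings) is assumed
    rather than EMILE's actual savefile format.

```cpp
#include <cstdint>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical archive that writes each value on its own line.
struct writer {
    std::ostream& out;
    explicit writer(std::ostream& o) : out(o) {}
    void transfer(std::uint32_t& v) { out << v << '\n'; }
    void transfer(std::string& s) { out << s << '\n'; }
};

// Hypothetical archive that reads values back in the same order.
struct reader {
    std::istream& in;
    explicit reader(std::istream& i) : in(i) {}
    void transfer(std::uint32_t& v) { in >> v; in.ignore(); }  // skip '\n'
    void transfer(std::string& s) { std::getline(in, s); }
};

// Hypothetical record to be stored.
struct settings {
    std::uint32_t support;
    std::string name;
};

// One function describes the layout of the record; the Archive type
// decides whether the fields are written or read.
template <typename Archive>
void save_load(Archive& ar, settings& s) {
    ar.transfer(s.support);
    ar.transfer(s.name);
}
```

    Adding a field then means adding one line to save_load, and the read
    and write paths cannot drift apart.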

4.1.5 [19/6/2000]

  - Changed the representation of grammatical derivation rules to allow
      rules using arbitrary combinations of subexpressions and type
      references, such as [1] => the [2] jumped over [3] dog.

  - Developed a completely new algorithm to create the rules grammar, based
    on the idea of adding types to the set of used types as long as this
    results in a decrease of the resulting rule set. The algorithm is 
    incremental in the sense that recompilations will start with the types 
    used before.

  - Added a 'clear rules' command to reset the markings of used types
    (i.e. to reset the incrementality of the rules-finding algorithm).

  - Added separate settings for support percentages for secondary contexts,
    secondary expressions and rules, for the case where a type has no
    characteristic contexts/expressions and uses primaries instead.

4.1.4 [27/4/2000]

  - Changed the representation of collections that are not used for 
    membership queries. A vector representation is sufficient for 
    traversals. Also performed several other memory usage optimizations, 
    for a global decrease in memory usage of 50-60%.

  - Change the definition of 'characteristic' expression/context used
    in the program to the correct one.

  - Added a compression rate measure to the 'show type' output, defined as
      compression(g) = ( (total length of all primary expressions)
                         + (total length of all primary contexts) )
                       / (total length of all combinations of primary
                          contexts with primary expressions)
    and added a 'show types sorted-by-compression' command

  - added "parse_type _n_|* _phrase_" command to attempt parsing a phrase
    as an expression of a specified or unknown type

  - Changed the default value for 'minimum_number_of_expressions' from 4 to 2.

  - Changed name of 'Chomsky-type rule' to 'binary rule'.

  - Added a 'show memory details' command to show more detailed memory 
    usage statistics (including memory usage statistics for individual
    program structures, for debugging purposes).

  - Change database format version checking to check for
    a valid _range_ rather than a valid _number_. This should improve
    downward compatibility of savefiles.

  - Eliminated the need for a space after the command when using the
    commands '!' and '?' (synonyms for 'shell' and 'help').

  - Added the shortcut '.' as a synonym for 'script', with no need
    for a space after the command (as above).

  - For the sake of completeness, added 'show version' command with 'version'
    and 'ver' shortcuts.

  - Change version numbers from '3.' to '4.1', as the early Prolog
    version turned out to have been the real EMILE 3.0.
    Added EMILE 1.0-3.0 to CHANGES.TXT

  - Corrected bug where Carriage Returns were treated as separate symbols
    instead of as whitespace.

  - Set EMILE to ignore 'broken pipe' errors when piping the output of
    a show command. As it was, piping the output to 'less' and quitting
    less by pressing 'q' caused Emile to exit.
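
    Illustration of the 'broken pipe' handling above. This is a sketch
    for POSIX systems, not the EMILE implementation; the function names
    are hypothetical. With SIGPIPE ignored, a write to a pipe whose
    reader has quit fails with EPIPE instead of terminating the program,
    so the caller can simply stop producing output.

```cpp
#include <cerrno>
#include <csignal>
#include <cstddef>
#include <string>
#include <unistd.h>

// Ask the system not to terminate the process on SIGPIPE; writes to a
// closed pipe then fail with errno set to EPIPE.
void ignore_broken_pipe() {
    std::signal(SIGPIPE, SIG_IGN);
}

// Write 'text' to file descriptor 'fd'. Returns false if the reader has
// gone away (EPIPE) or another write error occurs.
bool write_all(int fd, const std::string& text) {
    std::size_t done = 0;
    while (done < text.size()) {
        const ssize_t n = ::write(fd, text.data() + done, text.size() - done);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted: retry the write
            return false;                  // EPIPE or another error
        }
        done += static_cast<std::size_t>(n);
    }
    return true;
}
```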


4.1.3 [7/3/2000]

  - Added a 'generate new' command which generates only sentences
    that are not already in the database (and not already generated).

  - Added a 'set support' command to set _all_ support-related settings
    to the given value or a value derived from it.

  - Added 'set random' and 'show random' commands to control the randomizer

  - Changed the default support values from 70 & 91 to 50 & 75, as the 
    latter values tend to yield better results.

  - Fixed bug where the generate command in Word Analysis mode generated
    extraneous spaces inside the words.


4.1.2 [3/3/2000]

  - Added 'generate' command to generate sentences based on the
    derived rules.

  - Emile ignores periods as end-of-sentence markers if they follow an
    initial. This behaviour is now optional, and controlled by the new
    settings variable 'ignore_abbreviation_periods'.

  - Added option to use regular expressions for end-of-sentence markers,
    controlled by the new settings variable 'regular_expression_as_marker'.
    Also reworked the normal end-of-sentence-marking system to use
    the regular-expression engine, which should increase speed.

  - The parse command now displays grammatical structure

  - When parsing, Emile may now arbitrarily (re)assign types to single
    words in order to get a satisfactory parsing. This behaviour is
    controlled by the new settings variable 'parser_tolerance'.

  - Added separate help-screens for shortcut commands and show command.

  - Added 'show mem' command to monitor memory usage.

  - Added 'show help' option to show command as a synonym for 'help'.

  - Added redirection of the show command's output to pipes or files.

  - Added requirement that for single rules [a]=>[b], the type [a] should 
    have more secondary expressions than the type [b]. This should
    prevent loops that throw the parser into an infinite regress.

  - Fixed bug where assigning '0' to a settings variable caused it
    to revert to its default setting.

  - Added version history in 'CHANGES.TXT'.

  - Added brief install instructions in 'INSTALL.TXT'.


4.1.1 [22/1/2000]:

  - Added '!' to the list of default end-of-sentence markers.

  - Fixed bug where a line containing only whitespace was not considered
    to be an empty line (for purposes of ending a multi-line sentence).


4.1 [19/1/2000]:

  - Implemented new algorithm using characteristic, essential and secondary
    contexts and expressions, and added corresponding setting variables.

  - Various bugfixes

  - Added optional arguments to show command for displaying single types, 
    contexts or expressions.

  - Updated and expanded documentation of algorithm.

  - Allowing multi-line sentences is now optional (on by default).

  - Using multiplicities of sentences is now optional, and off by default.

  - Added verbosity levels > 1, so debug level information can be enabled
    at runtime rather than at compilation time.

  - Added settings 'type_usefulness_required' and 'rule_usefulness_required'
    to control the eliminating of types/rules that do not contribute enough.

  - Expanded the help screen to include one-line descriptions of commands.


4.0.4 [10/12/1999]:

  - Added compilation-time option to analyze morphemes and words instead
    of phrases and sentences, and optimized data structures for same.

  - Wrote documentation of algorithm


4.0.1 - 4.0.3 [??]

  - Various undocumented bug fixes and enhancements


4.0 [9/9/1999]:

  - First non-alpha version

3.0 [1998]
    Prolog program described in Pieter Adriaans' article `Learning
    Shallow Context-Free Languages under Simple Distributions'.
    Based on a 1-dimensional clustering algorithm.

2.0 [1993?]
1.0 [1993?]
    Mainly theoretical versions: see `Language Learning from a
    Categorial Perspective' by Pieter Adriaans.