# HG changeset patch
# User Mario de Sousa
# Date 1329397142 0
# Node ID 92d40d2a7adcf82f9fe60505bee54094e749e34a
# Parent ff4d26b7e51dbb7e6910960747edb1ae62f44b07
Update comments on general architecture.

diff -r ff4d26b7e51d -r 92d40d2a7adc readme
--- a/readme Thu Feb 16 10:27:52 2012 +0000
+++ b/readme Thu Feb 16 12:59:02 2012 +0000
@@ -8,7 +8,64 @@
   FINAL DRAFT - IEC 61131-3, 2nd Ed. (2001-12-10)
-  Copyright (C) 2003-2011 Mario de Sousa (msousa@fe.up.pt)
+  Copyright (C) 2003-2012 Mario de Sousa (msousa@fe.up.pt)
+
+
+****************************************************************
+****************************************************************
+****************************************************************
+*********                                              *********
+*********                                              *********
+*********          O V E R A L L    G O A L S          *********
+*********                                              *********
+*********                                              *********
+****************************************************************
+****************************************************************
+****************************************************************
+
+
+
+  This project has the goal of producing an open source compiler for the programming languages defined
+in the IEC 61131-3 standard. These programming languages are mostly used in the industrial automation
+domain, to program PLCs (Programmable Logic Controllers).
+
+  This standard defines 5 programming languages:
+      - IL : Instruction List
+             A textual programming language, somewhat similar to assembly.
+      - ST : Structured Text
+             A textual programming language, somewhat similar to Pascal.
+      - FBD: Function Block Diagram
+             A graphical programming language, somewhat similar to an electrical circuit diagram based on small
+             scale integration ICs (Integrated Circuits) (counters, AND/OR/XOR/... logic gates, timers, ...).
+      - LD : Ladder Diagram
+             A graphical programming language, somewhat similar to an electrical circuit diagram based on
+             relays (used for basic cabled logic controllers).
+      - SFC: Sequential Function Chart
+             A graphical programming language that defines a state machine, based largely on Grafcet.
+             (may also be expressed in textual format).
+
+  Of the above 5 languages, the standard defines textual representations for IL, ST and SFC.
+It is these 3 languages that we target, and we currently support all three, as long as they are
+expressed in the textual format as defined in the standard.
+
+  Currently the matiec project generates two compilers (more correctly, code translators, but we like
+to call them compilers :-O ): iec2c, and iec2iec
+
+  Both compilers accept the same input: a text file with ST, IL and/or SFC code.
+
+  The iec2c compiler generates ANSI C code which is equivalent to the IEC 61131-3 code expressed in the input file.
+
+  The iec2iec compiler generates IEC 61131-3 code which is equivalent to the IEC 61131-3 code expressed in the input file.
+This last compiler should generate an output file which should be almost identical to the input file (some formatting
+may change, as well as the case of letters, etc.). This 'compiler' is mostly used by the matiec project contributors
+to help debug the lexical and syntax portions of the compilers.
+
+
+
+  To compile/build these compilers, just
+$./configure; make
+
+
@@ -25,26 +82,109 @@
 ****************************************************************
   The compiler works in 4(+1) stages:
-  Stage 1   - Lexical analyser      - implemented with flex (iec.flex)
-  Stage 2   - Syntax parser         - implemented with bison (iec.y)
-  Stage 3   - Semantics analyser    - currently in its early stages
-  Stage 4   - Code generator        - implemented in C++
-  Stage 4+1 - Binary code generator - gcc, javac, etc...
+  ==================================
+  Stage 1    - Lexical analyser       - implemented with flex (stage1_2/iec_flex.ll)
+  Stage 2    - Syntax parser          - implemented with bison (stage1_2/iec_bison.yy)
+  Stage pre3 - Populate symbol tables - Symbol tables that will ease searching for symbols in the abstract syntax tree.
+  Stage 3    - Semantics analyser     - currently does type checking only
+  Stage 4    - Code generator         - generates ANSI C code
+
+  Stage 5    - Binary code generator  - gcc, javac, etc... (Not integrated into matiec compiler. Must be called explicitly by the user.)
+
   Data structures passed between stages, in global variables:
-  1->2   : tokens (int), and token values (char *)
-  2->1   : symbol tables (defined in symtable.hh)
-  2->3   : abstract syntax tree (tree of C++ classes, in absyntax.hh file)
-  3->4   : Same as 2->3
-  4->4+1 : file with program in c, java, etc...
+  ==========================================================
+   1->2   : tokens (int), and token values (char *) (defined in stage1_2/stage1_2_priv.hh)
+   2->1   : symbol tables (implemented in util/symtable.[hh|cc], and defined in stage1_2/stage1_2_priv.hh)
+   2->3   : abstract syntax tree (tree of C++ objects, whose classes are defined in absyntax/absyntax.hh)
+pre3->3,4 : global symbol tables (defined in util/[d]symtable.[hh|cc] and declared in absyntax_utils/absyntax_utils.hh)
+   3->4   : abstract syntax tree (same as 2->3), but now annotated (i.e. some extra data inserted into the absyntax tree)
+
+   4->5   : file with program in c, java, etc...
+
+
   The compiler works in several passes:
-  Pass 1: executes stages 1 and 2 simultaneously
-  Pass 2: executes stage 3
-  Pass 3: executes stage 4
-  Pass 4: executes stage 4+1
+  ====================================
+
+Stage 1 and Stage 2
+-------------------
+  Executed in one single pass. This pass will:
+    - Do lexical analysis
+    - Do syntax analysis
+    - Execute the absyntax_utils/add_en_eno_param_decl_c visitor class
+      This class will add the EN and ENO parameter declarations to all
+      functions that do not have them already explicitly declared by the user.
+      This will let us handle these parameters in the remaining compiler just as if
+      they were standard input/output parameters.
+
+
+Stage Pre3
+----------
+  Executed in one single pass.
+  This pass will populate the following symbol tables:
+    - function_symtable;            /* A symbol table with all globally declared function POUs. */
+    - function_block_type_symtable; /* A symbol table with all globally declared function block POUs. */
+    - program_type_symtable;        /* A symbol table with all globally declared program POUs. */
+    - type_symtable;                /* A symbol table with all user declared (non elementary) data type definitions. */
+    - enumerated_value_symtable;    /* A symbol table with all identifiers (values) declared for enumerated types. */
+
+
+Stage 3
+-------
+  Executes two algorithms (flow control analysis, and data type analysis) in several passes.
+
+  Flow control:
+    Pass 1: Does flow control analysis (for now only of IL code)
+            Implemented in -> stage3/flow_control_analysis_c
+            This will annotate the abstract syntax tree
+            (Every object of the class il_instruction_c that is in the abstract syntax tree will have the variable 'prev_il_instruction' correctly filled in.)
+
+  Data Type Analysis
+    Pass 1: Analyses the possible data types each expression/literal/IL instruction/etc. may take
+            Implemented in -> stage3/fill_candidate_datatypes_c
+            This will annotate the abstract syntax tree
+            (Every object in the abstract syntax tree that may have a data type will have the variable 'candidate_datatypes' correctly filled in.)
+    Pass 2: Narrows all the possible data types each expression/literal/IL instruction/etc. may take down to a single data type
+            Implemented in -> stage3/narrow_candidate_datatypes_c
+            This will annotate the abstract syntax tree
+            (Every object in the abstract syntax tree that may have a data type will have the variable 'datatype' correctly filled in.
+             Additionally, objects in the abstract syntax tree that represent function invocations will have the variables
+             'called_function_declaration', 'extensible_param_count' and 'candidate_functions' correctly filled in.
+             Additionally, objects in the abstract syntax tree that represent function block (FB) invocations will have the variable
+             'called_fb_declaration' correctly filled in.)
+    Pass 3: Prints error messages if the IEC 61131-3 source code being analysed contains semantic data type incompatibility errors.
+            Implemented in -> stage3/print_datatypes_error_c
+
+
+Stage 4
+-------
+  Has 2 possible implementations.
+
+  iec2c  : Generates C source code in a single pass (stage4/generate_c).
+  iec2iec: Generates IEC 61131-3 source code in a single pass (stage4/generate_iec).
+
+
+
+
+
+
+****************************************************************
+****************************************************************
+****************************************************************
+*********                                              *********
+*********                                              *********
+*********                  N O T E S                   *********
+*********                                              *********
+*********                                              *********
+****************************************************************
+****************************************************************
+****************************************************************
+
+
+
+
   NOTE 1
@@ -388,4 +528,4 @@
 **************************************************************************
-  Copyright (C) 2003-2011 Mario de Sousa (msousa@fe.up.pt)
+  Copyright (C) 2003-2012 Mario de Sousa (msousa@fe.up.pt)
diff -r ff4d26b7e51d -r 92d40d2a7adc stage3/narrow_candidate_datatypes.cc
--- a/stage3/narrow_candidate_datatypes.cc Thu Feb 16 10:27:52 2012 +0000
+++ b/stage3/narrow_candidate_datatypes.cc Thu Feb 16 12:59:02 2012 +0000
@@ -415,6 +415,36 @@
 }
+
+
+/*************************************************************************************************/
+/* Important NOTE: */
+/* */
+/* The visit() methods for all the IL instructions must be idempotent, as they may */
+/* potentially be called twice to narrow the same object. In other words, they may be called */
+/* to narrow an object that has already been previously narrowed.
+*/
+/* This occurs when that IL instruction immediately precedes an IL non-formal function */
+/* invocation: */
+/*     LD 45.5 */
+/*     SIN */
+/* */
+/* In the above case, 'LD 45.5' will be narrowed once when the code that handles the */
+/* SIN function call */
+/* */
+/*     narrow_nonformal_call(...), which is called by narrow_function_invocation(...), which is */
+/*     in turn called by visit(il_function_call_c *) */
+/* */
+/* calls the call_param_value->accept(*this), where call_param_value will be a pointer */
+/* to the preceding IL instruction (in the above case, 'LD 45.5'). */
+/* */
+/* That same IL instruction will be narrowed again when called by the for() loop in */
+/* the visit(instruction_list_c *) visitor method. */
+/*************************************************************************************************/
+
+
+
+
+
 // void *visit(instruction_list_c *symbol);
 void *narrow_candidate_datatypes_c::visit(il_simple_operation_c *symbol) {
   /* Tell the il_simple_operator the datatype that it must generate - this was chosen by the next il_instruction (we iterate backwards!) */
diff -r ff4d26b7e51d -r 92d40d2a7adc stage3/stage3.cc
--- a/stage3/stage3.cc Thu Feb 16 10:27:52 2012 +0000
+++ b/stage3/stage3.cc Thu Feb 16 12:59:02 2012 +0000
@@ -54,7 +54,7 @@
   tree_root->accept(fill_candidate_datatypes);
   narrow_candidate_datatypes_c narrow_candidate_datatypes(tree_root);
   tree_root->accept(narrow_candidate_datatypes);
-  print_datatypes_error_c print_datatypes_error(tree_root);
+  print_datatypes_error_c print_datatypes_error(tree_root);
   tree_root->accept(print_datatypes_error);
   if (print_datatypes_error.get_error_found())
     return -1;
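
Reviewer sketch (not part of the changeset): the "Important NOTE" added to
stage3/narrow_candidate_datatypes.cc requires that narrowing the same IL
instruction twice is harmless. The C++ fragment below is a minimal illustration
of that property only. The names node_t, narrow() and
double_narrowing_is_harmless() are hypothetical and do not exist in matiec, and
the candidate list {REAL, LREAL} for the literal 45.5 is assumed for the sake
of the example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an annotated syntax tree node (NOT matiec's classes).
struct node_t {
  std::vector<std::string> candidate_datatypes;  // as left by a "fill" pass
  std::string datatype;                          // as left by a "narrow" pass; empty = none chosen
};

// Narrow 'node' to 'wanted'. The result depends only on the candidate list and
// on 'wanted', never on the previous value of node.datatype, so calling it a
// second time with the same argument leaves the node unchanged (idempotent).
void narrow(node_t &node, const std::string &wanted) {
  assert(!wanted.empty());  // an empty string is reserved for "no datatype chosen"
  const std::vector<std::string> &c = node.candidate_datatypes;
  const bool is_candidate = std::find(c.begin(), c.end(), wanted) != c.end();
  node.datatype = is_candidate ? wanted : std::string();
}

// Mimics the 'LD 45.5' / 'SIN' case: the LD node is narrowed once on behalf of
// the function call, and once more by the loop over the instruction list.
bool double_narrowing_is_harmless() {
  node_t ld;
  ld.candidate_datatypes = {"REAL", "LREAL"};  // assumed candidates for 45.5
  narrow(ld, "REAL");                          // first visit: via the call handler
  const std::string after_first = ld.datatype;
  narrow(ld, "REAL");                          // second visit: via the list loop
  return after_first == "REAL" && ld.datatype == after_first;
}
```

Calling double_narrowing_is_harmless() from a test or from main() should return
true. A narrow() that appended to a list or toggled a flag on each call would
break this property, which is presumably the kind of bug the note warns about.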