You are currently browsing the category archive for the ‘Policy Writing’ category.

It is highly recommended to go through the following links first

In this grammar file we do our semantic analysis, by walking and evaluating the nodes, of the AST tree, generated in our grammar file.


Figure 1. A Snapshot of the tree of a statement.

tree grammar StatementWalker1;

options {

language = Java;

tokenVocab = XL1;

ASTLabelType = CommonTree;

}

@header{

package com.fawad.policywritingtool;

}

policy

: (mode ‘(””‘ appname ‘”‘ ‘as’ app ‘,’ ‘”‘ permname'”‘ ‘as’ perm ‘)’ ‘:’

statement+ ‘->’ policyeffect ‘(‘app ‘,’ perm ‘)’ ‘;’ )+

;

mode

: RES

;

appname

: IDENT(‘.’ IDENT | ‘_’ IDENT)+

;

app

: IDENT+

;

method

: IDENT+'()’

;

permname

: IDENT(‘.’ IDENT | ‘_’ IDENT)+

;

perm

: IDENT+

;


An integer ‘result’ is returned when the statement is evaluated

statement returns [int result] :    e=expression  { result = e; }

;


Following are the operators and their respective implementation in the walker

expression returns [int result]

: ^(‘AND’ op1=expression op2=expression ) { if(op1==1 && op2 == 1) result =1; else result =0;}

|^(‘OR’ op1=expression op2=expression ) { if(op1==1 || op2 == 1) result =1; else result =0;}

|^(‘>’ op1=expression op2=expression) { result= op1 > op2?1:0; }

| ^(‘<‘ op1=expression op2=expression) { result= op1 < op2?1:0;}

| ^(‘>=’ op1=expression op2=expression) { result= op1 >= op2?1:0;}

| ^(‘<=’ op1=expression op2=expression) { result= op1 <= op2?1:0;}

| ^(‘=’ op1=expression op2=expression) { result= op1 != op2?1:0;}

| ^(‘!=’ op1=expression op2=expression) { result= op1 != op2?1:0;}

| ^(‘+’ op1=expression op2=expression) { result = op1 + op2; }

| ^(‘-‘ op1=expression op2=expression) { result = op1 – op2; }

| ^(‘*’ op1=expression op2=expression) { result = op1 * op2; }

| ^(‘/’ op1=expression op2=expression) { result = op1 / op2; }

| ^(‘%’ op1=expression op2=expression) { result = op1 \% op2; }

| ^(NEGATION e=expression) { result = -e;}

| TRUE{ result = 1;}

| FALSE{ result = 0;}

| INTEGER { result= Integer.parseInt($INTEGER.text); }

;

Policy effect returns an integer ‘res’ with the value ‘1’ for allow and  ‘0’ for deny

policyeffect returns [int res ]

: ALLOW {res= 1 ;}

| DENY {res= 0 ;}

;

Advertisements

It is highly recommended that one should have at least a basic knowledge of Antlr, Lexer and Parsers, Tree Walker etc. For a light introduction of Antlr, Policy writing, Grammer or Lexer and Parsers etc and some operational basics please read our previous post at

The policy input expected by our grammar is as follows

restrict (“edu.ringlet.Ringlet” as Ringlet, “android.permission.SMS_SEND” as SMS) :

Ringlet.sentSms() < 5; -> allow(Ringlet, SMS) ;

1.    This is our Grammar file which specifies the rules(syntax and Semantics) of our high-level policy language. In this file the Lexer scans our language into tokens then the parser generates a tree out of the tokens to get some meaning out of it

grammar XL1;

options

{


2.    The Target language specified is Java

language = Java;


3.    This will output our grammar into an Abstract Syntax Tree

output=AST;

ASTLabelType=CommonTree;

}


4.    Explicit Token used in negation of a term

tokens

{

NEGATION;

}

@header

{

package com.serg.policywritingtool;

}

@lexer::header

{

package com.serg.policywritingtool;

}

5.    This is our main ‘policy rule’, which determines what input we should be expecting and in which order

policy

: (mode ‘(””‘ appname ‘”‘ ‘as’ app ‘,’ ‘”‘ permname'”‘ ‘as’ perm ‘)’ ‘:’

statement+ ‘->’ policyeffect ‘(‘app ‘,’ perm ‘)’ ‘;’ )+ ;

6.    The Following Subrules used in the policy rule determines which input is a possible one and which is not. The first input rule ‘mode’ specifies that it can take lexer rule RES(explained below) only as input

Mode

: RES

;

7.    The appname rule takes IDENT followed by .IDENT or _IDENT.(the + sign states that this should occur at least one time)

Appname


8.    IDENT is discussed below

: IDENT(‘.’ IDENT | ‘_’ IDENT)+

;

app

: IDENT+

;


9.    The appname rule takes IDENT followed by .IDENT or _IDENT (the + sign states that this should occur at least one time)

permname

10. IDENT is discussed below

: IDENT(‘.’ IDENT | ‘_’ IDENT)+

;

perm

: IDENT+

;


11. Statement rule specifies that it takes expression rule followed by a semi-colon ‘;’ as input

Statement

:   expression ‘;’

;


12. Expression rule then specifies it takes two realtaions with an AND or OR in between

Expression

: relation ((‘AND’^ | ‘OR’^)relation)*

;

13. Realtion rule then specifies it takes two relations with an AND or OR in between

relation : add((‘=’^ | ‘!=’^ | ‘<‘^ | ‘<=’^ | ‘>’^ | ‘>=’^) add)*

;

14. Expression add then specifies it takes two mult terms  with a + or – in between

add

: mult((‘+’^ | ‘-‘^) mult)*

;

15. Expression mult then specifies it takes two unary terms  with a * or / in between

mult

: unary((‘*’^ | ‘/’^ | ‘mod’^) unary)*

;

unary

: (‘+’! | negation^)* not

;

16. Any ‘-‘ sign of a negation entered should be converted to our explicit Token defined

Negation

: ‘-‘ -> NEGATION

;

17. To neagte a logical operator the string literal ‘not’ is used

not

: ‘not’? term

;

18. A term is specified to be either an integer. app.method() name or can take the string literals (true or false)

Term

:  app’.’method  | ‘(‘! expression ‘)’! |INTEGER | TRUE | FALSE

;

method

: IDENT+'()’

;

19. Policyeffect can only take lexer rules ALLOW or DENY as input

Policyeffect

: ALLOW | DENY

;

Policy identification

20. RES(a lexer rule) can only contain the string literals ‘restrict’ or ‘unrestrict’ anything else will give an error.

RES

: ‘restrict’ | ‘unrestrict’

;

21. Lexer rule DENY can only take string literal deny as input

DENY

: ‘deny’

;

22. Lexer rule ALLOW can only take string literal allow as input

ALLOW

: ‘allow’

;

23. Lexer rule TRUE can only take string literal true as input

TRUE

: ‘true’

;

24. Lexer rule FASLE can only take string literal false as input

FALSE

: ‘false’

;

25. INTEGER rule specifies the input to be one or more integers from 0 to 9

INTEGER

: ‘0’..’9’+

;

26. WS is a whitespace character rule specifying space or end of line etc

WS

: (‘ ‘|’\n’|’\t’|’\r’|’\f’)+ {$channel=HIDDEN;}

;

27. The lexer rule IDENT specifies that the input should be a sequence of one or more letters(small or upper case)

IDENT

: (‘a’..’z’|’A’..’Z’)+

;

28. COMMENT rule allows us to add line comments in our high level language

COMMENT

: ‘//’ .* (‘\n’|’\r’){$channel=HIDDEN;}

;

29. Similar as COMMENT but for multiple lines

MULTICOMMENT

: ‘/*’ .* ‘*/’ {$channel=HIDDEN;}

;

1.       Get eclipse from http://www.eclipse.org (version eclipse classic 3.5.1)

2.       Download ANTLR at http://www.antlr.org/download/html and click on “Complete ANTLR 3.2jar, all tools,runtime,etc” link

3.       After downloading antlr package create a folder named “ANTLR-3.2” and inside it create another folder “lib” and put antlr 3.2jar downloaded in step 2 in the lib directory.

4.       Move the antlr-3.2 directory into C drive(or any work folder for ubuntu), eclipse should be in the same directory

5.       For antlr plugin inclusion in eclipse goto the wiki link on the antlr homepage(www.antlr.org) and inside it click on “Integration with Development environments” and we get two chices underneath eclipse click on “Eclipse 3.3+ for antlr for ANTLR 3.x”. Init we have AntlrDT( a standard Eclipse plugin implementing an Antlr 3.1+ specific grammar editor, outline, and builder) and ANTLR IDE(an eclipse plugin for ANTLRv3 grammars which is required) optiions. click on http://antlrv3ide.sourceforge.net/ and under download/install we get to know wht we need to do to get the setup complete for installing actual ANTLR plugin:


6.       eclipse 3.5 (we already downloaded it).

7.       GEF 3.2 (graphic editor framework for eclipse).

8.       zest 1.0.0 ( a simplification for graphs in eclipse inside the GEF.

9.       Dynamic Language Toolkit(DLTK) Core.

10.   Open eclipse save your select a destination to save your workspace e.g C:\Users\owais\workspace (/home/advo/workspace for ubuntu). to install the plugins goto help menu click on Install New Software, change the work with drop down meny to “Galileo- http://download.eclipse.org/releases/galileo. on the main plugin distribution site for the plugins to be downloaded from eclipse, uncheck the Group items by category to get GEF, zest and DLTK. To get DLTK type Dynamic languages in the search box for main plugin distribution site. you will get a list of available dynamic languages plugins. Check the “Dynamic Languages Toolkit-Core Frameworks”.

11.   To get GEF type GEF in search and check “Graphical Editing Framework GEF”.

12.   Type zest in the search to get available zest plugins and check “Graphical Editing Framework Zest Visualization                 Toolkit”.

13.   In order to make sure all three plugins got checked blank/remove anything entered in the search box to get all the                 available plugins list. Confirm all the three checkmarks are there by visual inspection.

14.   Click on next we will see all three plugins, and then click next and accept the license and click finish and restart to take effect.

15.   Now to download actual ANTLR IDE plugin on the download/install page on Antlr wiki copy the link http:antlrv3ide.surceforge.net/updates and in eclipse goto help menu and open install new software option as we didi for previous plugins. On work with menu click on Add option and paste the copied link in the Location field and in name field enter ANTLR IDE. It will list the ANTLR IDE and we have to check everything available. Click next and further next check the accept license and click finish to install the plugin. during instalation we will get a warning, avoid it by clicking ok we will setup the IDE ourselves.Restart eclipse to take effect

16.   To setup ANTLR goto window menu and click on preferences. click on ANTLR and check the “mark generated resources as derived“. UNderneath ANTLR click on builder and define an antlr package. click on add and include the directory that we put ANTLR-3.2 into and click ok. underneath Code Generator check “Project relative folder” to keep generated files seperate from source files and give name to output folder name e.g antlr-generated and check “Append java package to output folder” and click Apply. Under editor  change Tab policy menu “Tabs only” and change identation size and displayed tab size to 4. Inside editor click on Content Assist and uncheck “Enable auto activation” and in folding option uncheck “enable folding“……. and click OK to complete to setup everything to create a project.

ANTLR, Another Tool for Language Recognition, is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a variety of target languages.

Programmers usually use parser generators to build translators and interpreters for domain-specific languages such as proprietary data formats, common network protocols, text processing languages, and domain-specific programming languages.

Domain-specific languages are important to software development because they represent a more natural, high fidelity, robust, and maintainable means of encoding a problem than simply writing software in a general-purpose language. For example, NASA uses domain-specific command languages for space missions to improve reliability, reduce risk, reduce cost, and increase the speed of development. Even the first Apollo guidance control computer from the 1960s used a domain-specific language that supported vector computations.

This article will explain the main ANTLR components and explains how they all fit together.

A translator maps each input sentence of a language to an output sentence. To perform the mapping, the translator executes some code you provide that operates on the input symbols and emits some output. A translator must perform different actions for different sentences, which means it must be able to recognize the various sentences.

Recognition is much easier if you break it into two similar but distinct tasks or phases. The separate phases mirror how your brain reads English text. You don’t read a sentence character by character. Instead, you perceive a sentence as a stream of words. The human brain subconsciously groups character sequences into words and looks them up in a dictionary before recognizing grammatical structure. The first translation phase is called lexical analysis and operates on the incoming character stream. The second phase is called parsing and operates on a stream of vocabulary symbols, called tokens, emanating from the lexical analyzer. ANTLR automatically generates the lexical analyzer and parser for you by analyzing the grammar you provide.

Performing a translation often means just embedding actions (code) within the grammar. ANTLR executes an action according to its position within the grammar. In this way, you can execute different code for different phrases (sentence fragments). For example, an action within, say, an expression rule is executed only when the parser is recognizing an expression. Some translations should be broken down into even more phases. Often the translation requires multiple passes, and in other cases, the translation is just a heck of a lot easier to code in multiple phases. Rather than reparse the input characters for each phase, it is more convenient to construct an intermediate form to pass between phases.

Figure 1: Overall translation data flow; edges represent data structure flow, and squares represent translation phases

This intermediate form is usually a tree data structure, called an abstract syntax tree (AST), and is a highly processed, condensed version of the input. Each phase collects more information or performs more computations. A final phase, called the emitter, ultimately emits output using all the data structures and computations from previous phases.

Figure 1 illustrates the basic data flow of a translator that accepts characters and emits output. The lexical analyzer, or lexer, breaks up the input stream into tokens. The parser feeds off this token stream and tries to recognize the sentence structure. The simplest translators execute actions that immediately emit output, bypassing any further phases.

Another kind of simple translator just constructs an internal data structure, it doesn’t actually emit output. A configuration file reader is the best example of this kind of translator. More complicated translators use the parser only to construct ASTs. Multiple tree parsers (depthfirst tree walkers) then scramble over the ASTs, computing other data structures and information needed by future phases. Although it is not shown in this figure, the final emitter phase can use templates to generate structured text output.

A template is just a text document with holes in it that an emitter can fill with values. These holes can also be expressions that operate on the incoming data values. ANTLR formally integrates the StringTemplate engine to make it easier for you to build emitters. StringTemplate is a domain-specific language for generating structured text from internal data structures that has the flavor of an output grammar. Features include template group inheritance, template polymorphism, lazy evaluation, recursion, output autoindentation, and the new notions of group interfaces and template regions.2 StringTemplate’s feature set is driven by solving real problems encountered in complicated systems. Indeed, ANTLR makes heavy use of StringTemplate to translate grammars to executable recognizers. Each ANTLR language target is purely a set of templates and fed by ANTLR’s internal retargetable code generator.

For a more details on antlr visit the links in the references.

References:

The Definitive ANTLR Reference: Building Domain-Specific Languages by Terence Parr

http://www.techpresentations.org/An_Introduction_to_ANTLR

http://www.antlr.org/

http://www.stringtemplate.org

Blog Stats

  • 331,044 hits

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 233 other followers

%d bloggers like this: