diff mbox series

[RISU,6/7] risugen: support @GroupName in risu files

Message ID 20171031145444.13766-7-alex.bennee@linaro.org
State Superseded
Headers show
Series Add @Group support and some aarch64.risu cleanups | expand

Commit Message

Alex Bennée Oct. 31, 2017, 2:54 p.m. UTC
The existing pattern support is useful but it does get a little
tedious when faced with large groups of instructions. This introduces
the concept of a @GroupName which can be sprinkled in the risu
definition and is attached to all instructions following its
definition until the next group or an empty group "@" is specified.

It can be combined with the existing pattern support to do things
like:

  ./risugen --group AdvSIMDAcrossVector --not-pattern ".*_RES" aarch64.risu foo.bin

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
 README  | 10 ++++++++++
 risugen | 20 ++++++++++++++++++++
 2 files changed, 30 insertions(+)

-- 
2.14.2

Comments

Peter Maydell Nov. 21, 2017, 12:06 p.m. UTC | #1
On 31 October 2017 at 14:54, Alex Bennée <alex.bennee@linaro.org> wrote:
> The existing pattern support is useful but it does get a little

> tedious when faced with large groups of instructions. This introduces

> the concept of a @GroupName which can be sprinkled in the risu

> definition and is attached to all instructions following its

> definition until the next group or an empty group "@" is specified.

>

> It can be combined with the existing pattern support to do things

> like:

>

>   ./risugen --group AdvSIMDAcrossVector --not-pattern ".*_RES" aarch64.risu foo.bin

>

> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>



I think the general idea here is fine but some extra checking
of syntax in the implementation would be good.


> ---

>  README  | 10 ++++++++++

>  risugen | 20 ++++++++++++++++++++

>  2 files changed, 30 insertions(+)

>

> diff --git a/README b/README

> index 312e9cd..9946e6e 100644

> --- a/README

> +++ b/README

> @@ -75,6 +75,10 @@ reads the configuration file arm.risu, and generates 10000 instructions

>  based on the instruction patterns matching the regular expression

>  "VQSHL.*imm.*". The resulting binary is written to vqshlimm.out.

>

> +An alternative to using regular expression patterns is to use the

> +--group specifier. This relies on the configuration file having been

> +annotated with suitable @ markers.


If I say "--group foo --group bar" am I asking for insns which
are in both group foo and bar, or insns which are in either
foo or bar or both ?

> +

>  This binary can then be passed to the risu program, which is

>  written in C. You need to run risu on both an ARM native target

>  and on the program under test. The ARM native system is the 'master'

> @@ -140,6 +144,12 @@ Lines starting with a '.' are directives to risu/risugen:

>   * ".mode [thumb|arm]" specifies whether the file contains ARM

>     or Thumb instructions; it must precede all instruction patterns.

>

> +Lines starting with a '@' are a grouping directive. Instructions

> +following will be assigned to a comma separated list of groups. The

> +list of groups is reset at the next '@' directive which may be empty.

> +This provides an alternative method to selecting instructions than RE

> +patterns.

> +

>  Other lines are instruction patterns:

>   insnname encodingname bitfield ... [ [ !blockname ] { blocktext } ]

>  where each bitfield is either:

> diff --git a/risugen b/risugen

> index 8bfb0e9..aba4bb7 100755


> --- a/risugen

> +++ b/risugen

> @@ -34,7 +34,10 @@ my @insn_keys;

>

>  # The arch will be selected based on .mode directive defined in risu file.

>  my $arch = "";

> +# Current group, updated by @GroupName

> +my $insn_group = "";

>

> +my @group = ();                 # include groups

>  my @pattern_re = ();            # include pattern

>  my @not_pattern_re = ();        # exclude pattern

>

> @@ -122,6 +125,11 @@ sub parse_config_file($)

>              exit(1);

>          }

>

> +        if ($tokens[0] =~ /^@(.*)/ ) {

> +            $insn_group = $1;

> +            next;

> +        }


Here, the syntax we're parsing is supposed to be @ followed
by a comma-separated list of groupnames, right? But we treat
everything after the '@ as a single opaque string.
I think we should properly parse this, allowing whitespace
before/after commas, maybe restrict the set of characters
we allow in groupnames, and store them as an array of group
names rather than a raw string.

> +

>          if ($tokens[0] =~ /^\./) {

>              parse_risu_directive($file, $seen_pattern, @tokens);

>              next;

> @@ -239,6 +247,9 @@ sub parse_config_file($)

>          $insnrec->{fixedbits} = $fixedbits;

>          $insnrec->{fixedbitmask} = $fixedbitmask;

>          $insnrec->{fields} = [ @fields ];

> +        if (length $insn_group) {

> +            $insnrec->{group} = $insn_group;

> +        }

>          $insn_details{$insnname} = $insnrec;

>      }

>      close(CFILE) or die "can't close $file: $!";

> @@ -249,6 +260,12 @@ sub select_insn_keys ()

>  {

>      # Get a list of the insn keys which are permitted by the re patterns

>      @insn_keys = sort keys %insn_details;

> +    if (@group) {

> +        my $re = join("|",@group);

> +        @insn_keys = grep {

> +            defined($insn_details{$_}->{group}) &&

> +                grep /$re/, $insn_details{$_}->{group}} @insn_keys


This implementation is implicitly permitting regex characters in
the user-specified --group argument. Is that intentional?
If we store the group names for the insn as an array of names
then we can just check for an explicit match in them against
the user's provided set of names.

> +    }

>      if (@pattern_re) {

>          my $re = '\b((' . join(')|(',@pattern_re) . '))\b';

>          @insn_keys = grep /$re/, @insn_keys;

> @@ -277,6 +294,7 @@ Valid options:

>      --fpscr n    : set initial FPSCR (arm) or FPCR (aarch64) value (default is 0)

>      --condprob p : [ARM only] make instructions conditional with probability p

>                     (default is 0, ie all instructions are always executed)

> +    --group name[,name..]: only use instructions in defined groups

>      --pattern re[,re...] : only use instructions matching regular expression

>                     Each re must match a full word (that is, we match on

>                     the perl regex '\\b((re)|(re))\\b'). This means that

> @@ -305,6 +323,7 @@ sub main()

>      GetOptions( "help" => sub { usage(); exit(0); },

>                  "numinsns=i" => \$numinsns,

>                  "fpscr=o" => \$fpscr,

> +                "group=s" => \@group,

>                  "pattern=s" => \@pattern_re,

>                  "not-pattern=s" => \@not_pattern_re,

>                  "condprob=f" => sub {

> @@ -319,6 +338,7 @@ sub main()

>      # allow "--pattern re,re" and "--pattern re --pattern re"

>      @pattern_re = split(/,/,join(',',@pattern_re));

>      @not_pattern_re = split(/,/,join(',',@not_pattern_re));

> +    @group = split(/,/,join(',',@group));

>

>      if ($#ARGV != 1) {

>          usage();

> --

> 2.14.2


thanks
-- PMM
diff mbox series

Patch

diff --git a/README b/README
index 312e9cd..9946e6e 100644
--- a/README
+++ b/README
@@ -75,6 +75,10 @@  reads the configuration file arm.risu, and generates 10000 instructions
 based on the instruction patterns matching the regular expression
 "VQSHL.*imm.*". The resulting binary is written to vqshlimm.out.
 
+An alternative to using regular expression patterns is to use the
+--group specifier. This relies on the configuration file having been
+annotated with suitable @ markers.
+
 This binary can then be passed to the risu program, which is
 written in C. You need to run risu on both an ARM native target
 and on the program under test. The ARM native system is the 'master'
@@ -140,6 +144,12 @@  Lines starting with a '.' are directives to risu/risugen:
  * ".mode [thumb|arm]" specifies whether the file contains ARM
    or Thumb instructions; it must precede all instruction patterns.
 
+Lines starting with a '@' are a grouping directive. Instructions
+following will be assigned to a comma separated list of groups. The
+list of groups is reset at the next '@' directive which may be empty.
+This provides an alternative method to selecting instructions than RE
+patterns.
+
 Other lines are instruction patterns:
  insnname encodingname bitfield ... [ [ !blockname ] { blocktext } ]
 where each bitfield is either:
diff --git a/risugen b/risugen
index 8bfb0e9..aba4bb7 100755
--- a/risugen
+++ b/risugen
@@ -34,7 +34,10 @@  my @insn_keys;
 
 # The arch will be selected based on .mode directive defined in risu file.
 my $arch = "";
+# Current group, updated by @GroupName
+my $insn_group = "";
 
+my @group = ();                 # include groups
 my @pattern_re = ();            # include pattern
 my @not_pattern_re = ();        # exclude pattern
 
@@ -122,6 +125,11 @@  sub parse_config_file($)
             exit(1);
         }
 
+        if ($tokens[0] =~ /^@(.*)/ ) {
+            $insn_group = $1;
+            next;
+        }
+
         if ($tokens[0] =~ /^\./) {
             parse_risu_directive($file, $seen_pattern, @tokens);
             next;
@@ -239,6 +247,9 @@  sub parse_config_file($)
         $insnrec->{fixedbits} = $fixedbits;
         $insnrec->{fixedbitmask} = $fixedbitmask;
         $insnrec->{fields} = [ @fields ];
+        if (length $insn_group) {
+            $insnrec->{group} = $insn_group;
+        }
         $insn_details{$insnname} = $insnrec;
     }
     close(CFILE) or die "can't close $file: $!";
@@ -249,6 +260,12 @@  sub select_insn_keys ()
 {
     # Get a list of the insn keys which are permitted by the re patterns
     @insn_keys = sort keys %insn_details;
+    if (@group) {
+        my $re = join("|",@group);
+        @insn_keys = grep {
+            defined($insn_details{$_}->{group}) &&
+                grep /$re/, $insn_details{$_}->{group}} @insn_keys
+    }
     if (@pattern_re) {
         my $re = '\b((' . join(')|(',@pattern_re) . '))\b';
         @insn_keys = grep /$re/, @insn_keys;
@@ -277,6 +294,7 @@  Valid options:
     --fpscr n    : set initial FPSCR (arm) or FPCR (aarch64) value (default is 0)
     --condprob p : [ARM only] make instructions conditional with probability p
                    (default is 0, ie all instructions are always executed)
+    --group name[,name..]: only use instructions in defined groups
     --pattern re[,re...] : only use instructions matching regular expression
                    Each re must match a full word (that is, we match on
                    the perl regex '\\b((re)|(re))\\b'). This means that
@@ -305,6 +323,7 @@  sub main()
     GetOptions( "help" => sub { usage(); exit(0); },
                 "numinsns=i" => \$numinsns,
                 "fpscr=o" => \$fpscr,
+                "group=s" => \@group,
                 "pattern=s" => \@pattern_re,
                 "not-pattern=s" => \@not_pattern_re,
                 "condprob=f" => sub {
@@ -319,6 +338,7 @@  sub main()
     # allow "--pattern re,re" and "--pattern re --pattern re"
     @pattern_re = split(/,/,join(',',@pattern_re));
     @not_pattern_re = split(/,/,join(',',@not_pattern_re));
+    @group = split(/,/,join(',',@group));
 
     if ($#ARGV != 1) {
         usage();