From patchwork Tue Feb 13 07:21:50 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128203
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Mon, 12 Feb 2018 23:21:50 -0800
Message-Id: <20180213072150.32252-1-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
Subject: [Qemu-devel] [PATCH v3] scripts: Add decodetree.py
Cc: peter.maydell@linaro.org

To be used to decode ARM SVE, but could be used for any fixed-width ISA.

Signed-off-by: Richard Henderson
---
Changes since v2:
  * Fix tests/decode/err_init3.def.
  * Mark main decoder static by default.
  * Properly diagnose unspecified bits.
  * Remove output file on error.
    - I had been doing this in the Makefile, but all told
      it's cleaner to do it in the script.

Changes since v1:
  * Pass pycodestyle-{2,3}.
  * Support 16-bit and 32-bit insns (I have a def file for thumb1).
  * Testsuite (only negative tests so far).
  * Called translate functions default to static.
  * Notice duplicate assignments and missing assignments to fields.
  * Use '-' to indicate a non-decoded bit, as opposed to '.' which
    must be filled in elsewhere by a format or a field.
---
 scripts/decodetree.py         | 1044 +++++++++++++++++++++++++++++++++++++++++
 tests/Makefile.include        |    9 +-
 tests/decode/check.sh         |   16 +
 tests/decode/err_argset1.def  |    1 +
 tests/decode/err_argset2.def  |    1 +
 tests/decode/err_field1.def   |    1 +
 tests/decode/err_field2.def   |    1 +
 tests/decode/err_field3.def   |    1 +
 tests/decode/err_field4.def   |    2 +
 tests/decode/err_field5.def   |    1 +
 tests/decode/err_init1.def    |    2 +
 tests/decode/err_init2.def    |    2 +
 tests/decode/err_init3.def    |    3 +
 tests/decode/err_init4.def    |    3 +
 tests/decode/err_overlap1.def |    2 +
 tests/decode/err_overlap2.def |    2 +
 tests/decode/err_overlap3.def |    2 +
 tests/decode/err_overlap4.def |    2 +
 tests/decode/err_overlap5.def |    1 +
 tests/decode/err_overlap6.def |    2 +
 tests/decode/err_overlap7.def |    2 +
 tests/decode/err_overlap8.def |    1 +
 tests/decode/err_overlap9.def |    2 +
 23 files changed, 1102 insertions(+), 1 deletion(-)
 create mode 100755 scripts/decodetree.py
 create mode 100755 tests/decode/check.sh
 create mode 100644 tests/decode/err_argset1.def
 create mode 100644 tests/decode/err_argset2.def
 create mode 100644 tests/decode/err_field1.def
 create mode 100644 tests/decode/err_field2.def
 create mode 100644 tests/decode/err_field3.def
 create mode 100644 tests/decode/err_field4.def
 create mode 100644 tests/decode/err_field5.def
 create mode 100644 tests/decode/err_init1.def
 create mode 100644 tests/decode/err_init2.def
 create mode 100644 tests/decode/err_init3.def
 create mode 100644 tests/decode/err_init4.def
 create mode 100644 tests/decode/err_overlap1.def
 create mode 100644 tests/decode/err_overlap2.def
 create mode 100644 tests/decode/err_overlap3.def
 create mode 100644 tests/decode/err_overlap4.def
 create mode 100644 tests/decode/err_overlap5.def
 create mode 100644 tests/decode/err_overlap6.def
 create mode 100644 tests/decode/err_overlap7.def
 create mode 100644 tests/decode/err_overlap8.def
 create mode 100644 tests/decode/err_overlap9.def

-- 
2.14.3

diff --git a/scripts/decodetree.py b/scripts/decodetree.py
new file mode 100755
index 0000000000..f0b33a937e
--- /dev/null
+++ b/scripts/decodetree.py
@@ -0,0 +1,1044 @@
+#!/usr/bin/env python
+#
+# Generate a decoding tree from a specification file.
+#
+# The tree is built from instruction "patterns". A pattern may represent
+# a single architectural instruction or a group of same, depending on what
+# is convenient for further processing.
+#
+# Each pattern has "fixedbits" & "fixedmask", the combination of which
+# describes the condition under which the pattern is matched:
+#
+#   (insn & fixedmask) == fixedbits
+#
+# Each pattern may have "fields", which are extracted from the insn and
+# passed along to the translator. Examples of such are registers,
+# immediates, and sub-opcodes.
+#
+# In support of patterns, one may declare fields, argument sets, and
+# formats, each of which may be re-used to simplify further definitions.
+#
+# *** Field syntax:
+#
+# field_def     := '%' identifier ( unnamed_field )+ ( !function=identifier )?
+# unnamed_field := number ':' ( 's' )? number
+#
+# For unnamed_field, the first number is the least-significant bit position of
+# the field and the second number is the length of the field. If the 's' is
+# present, the field is considered signed. If multiple unnamed_fields are
+# present, they are concatenated. In this way one can define disjoint fields.
+#
+# If !function is specified, the concatenated result is passed through the
+# named function, taking and returning an integral value.
+#
+# FIXME: the fields of the structure into which this result will be stored
+# are restricted to "int", which means that we cannot expand 64-bit items.
+#
+# Field examples:
+#
+#   %disp   0:s16          -- sextract(i, 0, 16)
+#   %imm9   16:6 10:3      -- extract(i, 16, 6) << 3 | extract(i, 10, 3)
+#   %disp12 0:s1 1:1 2:10  -- sextract(i, 0, 1) << 11
+#                             | extract(i, 1, 1) << 10
+#                             | extract(i, 2, 10)
+#   %shimm8 5:s8 13:1 !function=expand_shimm8
+#                          -- expand_shimm8(sextract(i, 5, 8) << 1
+#                                           | extract(i, 13, 1))
+#
+# *** Argument set syntax:
+#
+# args_def    := '&' identifier ( args_elt )+
+# args_elt    := identifier
+#
+# Each args_elt defines an argument within the argument set.
+# Each argument set will be rendered as a C structure "arg_$name"
+# with each of the fields being one of the member arguments.
+#
+# Argument set examples:
+#
+#   &reg3       ra rb rc
+#   &loadstore  reg base offset
+#
+# *** Format syntax:
+#
+# fmt_def      := '@' identifier ( fmt_elt )+
+# fmt_elt      := fixedbit_elt | field_elt | field_ref | args_ref
+# fixedbit_elt := [01.-]+
+# field_elt    := identifier ':' 's'? number
+# field_ref    := '%' identifier | identifier '=' '%' identifier
+# args_ref     := '&' identifier
+#
+# Defining a format is a handy way to avoid replicating groups of fields
+# across many instruction patterns.
+#
+# A fixedbit_elt describes a contiguous sequence of bits that must
+# be 1, 0, or [.-] for don't care. The difference between '.' and '-'
+# is that '.' means that the bit will be covered with a field and
+# '-' means that the bit is really ignored by the cpu and will not
+# be covered by a field.
+#
+# A field_elt describes a simple field only given a width; the position of
+# the field is implied by its position with respect to other fixedbit_elt
+# and field_elt.
+#
+# If any fixedbit_elt or field_elt appear then all bits must be defined.
+# Padding with a fixedbit_elt of all '.' is an easy way to accomplish that.
+#
+# A field_ref incorporates a field by reference. This is the only way to
+# add a complex field to a format. A field may be renamed in the process
+# via assignment to another identifier. This is intended to allow the
+# same argument set to be used with disjoint named fields.
+#
+# A single args_ref may specify an argument set to use for the format.
+# The set of fields in the format must be a subset of the arguments in
+# the argument set. If an argument set is not specified, one will be
+# inferred from the set of fields.
+#
+# It is recommended, but not required, that all field_ref and args_ref
+# appear at the end of the line, not interleaving with fixedbit_elt or
+# field_elt.
+#
+# Format examples:
+#
+#   @opr    ...... ra:5 rb:5 ... 0 ....... rc:5
+#   @opi    ...... ra:5 lit:8    1 ....... rc:5
+#
+# *** Pattern syntax:
+#
+# pat_def      := identifier ( pat_elt )+
+# pat_elt      := fixedbit_elt | field_elt | field_ref
+#               | args_ref | fmt_ref | const_elt
+# fmt_ref      := '@' identifier
+# const_elt    := identifier '=' number
+#
+# The fixedbit_elt and field_elt specifiers are unchanged from formats.
+# A pattern that does not specify a named format will have one inferred
+# from a referenced argument set (if present) and the set of fields.
+#
+# A const_elt allows an argument to be set to a constant value. This may
+# come in handy when fields overlap between patterns and one has to
+# include the values in the fixedbit_elt instead.
+#
+# The decoder will call a translator function for each pattern matched.
+#
+# Pattern examples:
+#
+#   addl_r   010000 ..... ..... .... 0000000 ..... @opr
+#   addl_i   010000 ..... ..... .... 0000000 ..... @opi
+#
+# which will, in part, invoke
+#
+#   trans_addl_r(ctx, &arg_opr, insn)
+# and
+#   trans_addl_i(ctx, &arg_opi, insn)
+#
+
+import io
+import os
+import re
+import sys
+import getopt
+import pdb
+
+insnwidth = 32
+insnmask = 0xffffffff
+fields = {}
+arguments = {}
+formats = {}
+patterns = []
+
+translate_prefix = 'trans'
+translate_scope = 'static '
+input_file = ''
+output_file = None
+output_fd = None
+insntype = 'uint32_t'
+
+re_ident = '[a-zA-Z][a-zA-Z0-9_]*'
+
+
+def error(lineno, *args):
+    """Print an error message from file:line and args and exit."""
+    global output_file
+    global output_fd
+
+    if lineno:
+        r = '{0}:{1}: error:'.format(input_file, lineno)
+    elif input_file:
+        r = '{0}: error:'.format(input_file)
+    else:
+        r = 'error:'
+    for a in args:
+        r += ' ' + str(a)
+    r += '\n'
+    sys.stderr.write(r)
+    if output_file and output_fd:
+        output_fd.close()
+        os.remove(output_file)
+    sys.exit(1)
+
+
+def output(*args):
+    global output_fd
+    for a in args:
+        output_fd.write(a)
+
+
+if sys.version_info >= (3, 0):
+    re_fullmatch = re.fullmatch
+else:
+    def re_fullmatch(pat, str):
+        return re.match('^' + pat + '$', str)
+
+
+def output_autogen():
+    output('/* This file is autogenerated by scripts/decodetree.py. */\n\n')
+
+
+def str_indent(c):
+    """Return a string with C spaces"""
+    return ' ' * c
+
+
+def str_fields(fields):
+    """Return a string uniquely identifying FIELDS"""
+    r = ''
+    for n in sorted(fields.keys()):
+        r += '_' + n
+    return r[1:]
+
+
+def str_match_bits(bits, mask):
+    """Return a string pretty-printing BITS/MASK"""
+    global insnwidth
+
+    i = 1 << (insnwidth - 1)
+    space = 0x01010100
+    r = ''
+    while i != 0:
+        if i & mask:
+            if i & bits:
+                r += '1'
+            else:
+                r += '0'
+        else:
+            r += '.'
+        if i & space:
+            r += ' '
+        i >>= 1
+    return r
+
+
+def is_pow2(x):
+    """Return true iff X is equal to a power of 2."""
+    return (x & (x - 1)) == 0
+
+
+def ctz(x):
+    """Return the number of times 2 factors into X."""
+    r = 0
+    while ((x >> r) & 1) == 0:
+        r += 1
+    return r
+
+
+def is_contiguous(bits):
+    shift = ctz(bits)
+    if is_pow2((bits >> shift) + 1):
+        return shift
+    else:
+        return -1
+
+
+def eq_fields_for_args(flds_a, flds_b):
+    if len(flds_a) != len(flds_b):
+        return False
+    for k, a in flds_a.items():
+        if k not in flds_b:
+            return False
+    return True
+
+
+def eq_fields_for_fmts(flds_a, flds_b):
+    if len(flds_a) != len(flds_b):
+        return False
+    for k, a in flds_a.items():
+        if k not in flds_b:
+            return False
+        b = flds_b[k]
+        if a.__class__ != b.__class__ or a != b:
+            return False
+    return True
+
+
+class Field:
+    """Class representing a simple instruction field"""
+    def __init__(self, sign, pos, len):
+        self.sign = sign
+        self.pos = pos
+        self.len = len
+        self.mask = ((1 << len) - 1) << pos
+
+    def __str__(self):
+        if self.sign:
+            s = 's'
+        else:
+            s = ''
+        return str(self.pos) + ':' + s + str(self.len)
+
+    def str_extract(self):
+        if self.sign:
+            extr = 'sextract32'
+        else:
+            extr = 'extract32'
+        return '{0}(insn, {1}, {2})'.format(extr, self.pos, self.len)
+
+    def __eq__(self, other):
+        return self.sign == other.sign and self.mask == other.mask
+
+    def __ne__(self, other):
+        return not self.__eq__(other)
+# end Field
+
+
+class MultiField:
+    """Class representing a compound instruction field"""
+    def __init__(self, subs, mask):
+        self.subs = subs
+        self.sign = subs[0].sign
+        self.mask = mask
+
+    def __str__(self):
+        return str(self.subs)
+
+    def str_extract(self):
+        ret = '0'
+        pos = 0
+        for f in reversed(self.subs):
+            if pos == 0:
+                ret = f.str_extract()
+            else:
+                ret = 'deposit32({0}, {1}, {2}, {3})' \
+                      .format(ret, pos, 32 - pos, f.str_extract())
+            pos += f.len
+        return ret
+
+    def __ne__(self, other):
+        if len(self.subs) != len(other.subs):
+            return True
+        for a, b in zip(self.subs, other.subs):
+            if a.__class__ != b.__class__ or a != b:
+                return True
+        return False
+
+    def __eq__(self, other):
+        return not self.__ne__(other)
+# end MultiField
+
+
+class ConstField:
+    """Class representing an argument field with constant value"""
+    def __init__(self, value):
+        self.value = value
+        self.mask = 0
+        self.sign = value < 0
+
+    def __str__(self):
+        return str(self.value)
+
+    def str_extract(self):
+        return str(self.value)
+
+    def __cmp__(self, other):
+        return self.value - other.value
+# end ConstField
+
+
+class FunctionField:
+    """Class representing a field passed through an expander"""
+    def __init__(self, func, base):
+        self.mask = base.mask
+        self.sign = base.sign
+        self.base = base
+        self.func = func
+
+    def __str__(self):
+        return self.func + '(' + str(self.base) + ')'
+
+    def str_extract(self):
+        return self.func + '(' + self.base.str_extract() + ')'
+
+    def __eq__(self, other):
+        return self.func == other.func and self.base == other.base
+
+    def __ne__(self, other):
+        return not self.__eq__(other)
+# end FunctionField
+
+
+class Arguments:
+    """Class representing the extracted fields of a format"""
+    def __init__(self, nm, flds):
+        self.name = nm
+        self.fields = sorted(flds)
+
+    def __str__(self):
+        return self.name + ' ' + str(self.fields)
+
+    def struct_name(self):
+        return 'arg_' + self.name
+
+    def output_def(self):
+        output('typedef struct {\n')
+        for n in self.fields:
+            output('    int ', n, ';\n')
+        output('} ', self.struct_name(), ';\n\n')
+# end Arguments
+
+
+class General:
+    """Common code between instruction formats and instruction patterns"""
+    def __init__(self, name, lineno, base, fixb, fixm, udfm, fldm, flds):
+        self.name = name
+        self.lineno = lineno
+        self.base = base
+        self.fixedbits = fixb
+        self.fixedmask = fixm
+        self.undefmask = udfm
+        self.fieldmask = fldm
+        self.fields = flds
+
+    def __str__(self):
+        r = self.name
+        if self.base:
+            r = r + ' ' + self.base.name
+        else:
+            r = r + ' ' + str(self.fields)
+        r = r + ' ' + str_match_bits(self.fixedbits, self.fixedmask)
+        return r
+
+    def str1(self, i):
+        return str_indent(i) + self.__str__()
+# end General
+
+
+class Format(General):
+    """Class representing an instruction format"""
+
+    def extract_name(self):
+        return 'extract_' + self.name
+
+    def output_extract(self):
+        output('static void ', self.extract_name(), '(',
+               self.base.struct_name(), ' *a, ', insntype, ' insn)\n{\n')
+        for n, f in self.fields.items():
+            output('    a->', n, ' = ', f.str_extract(), ';\n')
+        output('}\n\n')
+# end Format
+
+
+class Pattern(General):
+    """Class representing an instruction pattern"""
+
+    def output_decl(self):
+        global translate_scope
+        global translate_prefix
+        output('typedef ', self.base.base.struct_name(),
+               ' arg_', self.name, ';\n')
+        output(translate_scope, 'void ', translate_prefix, '_', self.name,
+               '(DisasContext *ctx, arg_', self.name,
+               ' *a, ', insntype, ' insn);\n')
+
+    def output_code(self, i, extracted, outerbits, outermask):
+        global translate_prefix
+        ind = str_indent(i)
+        arg = self.base.base.name
+        output(ind, '/* line ', str(self.lineno), ' */\n')
+        if not extracted:
+            output(ind, self.base.extract_name(), '(&u.f_', arg, ', insn);\n')
+        for n, f in self.fields.items():
+            output(ind, 'u.f_', arg, '.', n, ' = ', f.str_extract(), ';\n')
+        output(ind, translate_prefix, '_', self.name,
+               '(ctx, &u.f_', arg, ', insn);\n')
+        output(ind, 'return true;\n')
+# end Pattern
+
+
+def parse_field(lineno, name, toks):
+    """Parse one instruction field from TOKS at LINENO"""
+    global fields
+    global re_ident
+    global insnwidth
+
+    # A "simple" field will have only one entry;
+    # a "multifield" will have several.
+    subs = []
+    width = 0
+    func = None
+    for t in toks:
+        if re_fullmatch('!function=' + re_ident, t):
+            if func:
+                error(lineno, 'duplicate function')
+            func = t.split('=')
+            func = func[1]
+            continue
+
+        if re_fullmatch('[0-9]+:s[0-9]+', t):
+            # Signed field extract
+            subtoks = t.split(':s')
+            sign = True
+        elif re_fullmatch('[0-9]+:[0-9]+', t):
+            # Unsigned field extract
+            subtoks = t.split(':')
+            sign = False
+        else:
+            error(lineno, 'invalid field token "{0}"'.format(t))
+        po = int(subtoks[0])
+        le = int(subtoks[1])
+        if po + le > insnwidth:
+            error(lineno, 'field {0} too large'.format(t))
+        f = Field(sign, po, le)
+        subs.append(f)
+        width += le
+
+    if width > insnwidth:
+        error(lineno, 'field too large')
+    if len(subs) == 1:
+        f = subs[0]
+    else:
+        mask = 0
+        for s in subs:
+            if mask & s.mask:
+                error(lineno, 'field components overlap')
+            mask |= s.mask
+        f = MultiField(subs, mask)
+    if func:
+        f = FunctionField(func, f)
+
+    if name in fields:
+        error(lineno, 'duplicate field', name)
+    fields[name] = f
+# end parse_field
+
+
+def parse_arguments(lineno, name, toks):
+    """Parse one argument set from TOKS at LINENO"""
+    global arguments
+    global re_ident
+
+    flds = []
+    for t in toks:
+        if not re_fullmatch(re_ident, t):
+            error(lineno, 'invalid argument set token "{0}"'.format(t))
+        if t in flds:
+            error(lineno, 'duplicate argument "{0}"'.format(t))
+        flds.append(t)
+
+    if name in arguments:
+        error(lineno, 'duplicate argument set', name)
+    arguments[name] = Arguments(name, flds)
+# end parse_arguments
+
+
+def lookup_field(lineno, name):
+    global fields
+    if name in fields:
+        return fields[name]
+    error(lineno, 'undefined field', name)
+
+
+def add_field(lineno, flds, new_name, f):
+    if new_name in flds:
+        error(lineno, 'duplicate field', new_name)
+    flds[new_name] = f
+    return flds
+
+
+def add_field_byname(lineno, flds, new_name, old_name):
+    return add_field(lineno, flds, new_name, lookup_field(lineno, old_name))
+
+
+def infer_argument_set(flds):
+    global arguments
+
+    for arg in arguments.values():
+        if eq_fields_for_args(flds, arg.fields):
+            return arg
+
+    name = str(len(arguments))
+    arg = Arguments(name, flds.keys())
+    arguments[name] = arg
+    return arg
+
+
+def infer_format(arg, fieldmask, flds):
+    global arguments
+    global formats
+
+    const_flds = {}
+    var_flds = {}
+    for n, c in flds.items():
+        if isinstance(c, ConstField):
+            const_flds[n] = c
+        else:
+            var_flds[n] = c
+
+    # Look for an existing format with the same argument set and fields
+    for fmt in formats.values():
+        if arg and fmt.base != arg:
+            continue
+        if fieldmask != fmt.fieldmask:
+            continue
+        if not eq_fields_for_fmts(flds, fmt.fields):
+            continue
+        return (fmt, const_flds)
+
+    name = 'Fmt_' + str(len(formats))
+    if not arg:
+        arg = infer_argument_set(flds)
+
+    fmt = Format(name, 0, arg, 0, 0, 0, fieldmask, var_flds)
+    formats[name] = fmt
+
+    return (fmt, const_flds)
+# end infer_format
+
+
+def parse_generic(lineno, is_format, name, toks):
+    """Parse one instruction format or pattern from TOKS at LINENO"""
+    global fields
+    global arguments
+    global formats
+    global patterns
+    global re_ident
+    global insnwidth
+    global insnmask
+
+    fixedmask = 0
+    fixedbits = 0
+    undefmask = 0
+    width = 0
+    flds = {}
+    arg = None
+    fmt = None
+    for t in toks:
+        # '&Foo' gives a format an explicit argument set.
+        if t[0] == '&':
+            tt = t[1:]
+            if arg:
+                error(lineno, 'multiple argument sets')
+            if tt in arguments:
+                arg = arguments[tt]
+            else:
+                error(lineno, 'undefined argument set', t)
+            continue
+
+        # '@Foo' gives a pattern an explicit format.
+        if t[0] == '@':
+            tt = t[1:]
+            if fmt:
+                error(lineno, 'multiple formats')
+            if tt in formats:
+                fmt = formats[tt]
+            else:
+                error(lineno, 'undefined format', t)
+            continue
+
+        # '%Foo' imports a field.
+        if t[0] == '%':
+            tt = t[1:]
+            flds = add_field_byname(lineno, flds, tt, tt)
+            continue
+
+        # 'Foo=%Bar' imports a field with a different name.
+        if re_fullmatch(re_ident + '=%' + re_ident, t):
+            (fname, iname) = t.split('=%')
+            flds = add_field_byname(lineno, flds, fname, iname)
+            continue
+
+        # 'Foo=number' sets an argument field to a constant value
+        if re_fullmatch(re_ident + '=[0-9]+', t):
+            (fname, value) = t.split('=')
+            value = int(value)
+            flds = add_field(lineno, flds, fname, ConstField(value))
+            continue
+
+        # A pattern of 0s, 1s, dots and dashes indicates required zeros,
+        # required ones, or don't-cares.
+        if re_fullmatch('[01.-]+', t):
+            shift = len(t)
+            fms = t.replace('0', '1')
+            fms = fms.replace('.', '0')
+            fms = fms.replace('-', '0')
+            fbs = t.replace('.', '0')
+            fbs = fbs.replace('-', '0')
+            ubm = t.replace('1', '0')
+            ubm = ubm.replace('.', '0')
+            ubm = ubm.replace('-', '1')
+            fms = int(fms, 2)
+            fbs = int(fbs, 2)
+            ubm = int(ubm, 2)
+            fixedbits = (fixedbits << shift) | fbs
+            fixedmask = (fixedmask << shift) | fms
+            undefmask = (undefmask << shift) | ubm
+        # Otherwise, fieldname:fieldwidth
+        elif re_fullmatch(re_ident + ':s?[0-9]+', t):
+            (fname, flen) = t.split(':')
+            sign = False
+            if flen[0] == 's':
+                sign = True
+                flen = flen[1:]
+            shift = int(flen, 10)
+            f = Field(sign, insnwidth - width - shift, shift)
+            flds = add_field(lineno, flds, fname, f)
+            fixedbits <<= shift
+            fixedmask <<= shift
+            undefmask <<= shift
+        else:
+            error(lineno, 'invalid token "{0}"'.format(t))
+        width += shift
+
+    # We should have filled in all of the bits of the instruction.
+    if not (is_format and width == 0) and width != insnwidth:
+        error(lineno, 'definition has {0} bits'.format(width))
+
+    # Do not check for fields overlapping fields; one valid usage
+    # is to be able to duplicate fields via import.
+    fieldmask = 0
+    for f in flds.values():
+        fieldmask |= f.mask
+
+    # Fix up what we've parsed to match either a format or a pattern.
+    if is_format:
+        # Formats cannot reference formats.
+        if fmt:
+            error(lineno, 'format referencing format')
+        # If an argument set is given, then there should be no fields
+        # without a place to store them.
+        if arg:
+            for f in flds.keys():
+                if f not in arg.fields:
+                    error(lineno, 'field {0} not in argument set {1}'
+                                  .format(f, arg.name))
+        else:
+            arg = infer_argument_set(flds)
+        if name in formats:
+            error(lineno, 'duplicate format name', name)
+        fmt = Format(name, lineno, arg, fixedbits, fixedmask,
+                     undefmask, fieldmask, flds)
+        formats[name] = fmt
+    else:
+        # Patterns can reference a format ...
+        if fmt:
+            # ... but not an argument set simultaneously
+            if arg:
+                error(lineno, 'pattern specifies both format and argument set')
+            if fixedmask & fmt.fixedmask:
+                error(lineno, 'pattern fixed bits overlap format fixed bits')
+            fieldmask |= fmt.fieldmask
+            fixedbits |= fmt.fixedbits
+            fixedmask |= fmt.fixedmask
+            undefmask |= fmt.undefmask
+        else:
+            (fmt, flds) = infer_format(arg, fieldmask, flds)
+        arg = fmt.base
+        for f in flds.keys():
+            if f not in arg.fields:
+                error(lineno, 'field {0} not in argument set {1}'
+                              .format(f, arg.name))
+            if f in fmt.fields.keys():
+                error(lineno, 'field {0} set by format and pattern'.format(f))
+        for f in arg.fields:
+            if f not in flds.keys() and f not in fmt.fields.keys():
+                error(lineno, 'field {0} not initialized'.format(f))
+        pat = Pattern(name, lineno, fmt, fixedbits, fixedmask,
+                      undefmask, fieldmask, flds)
+        patterns.append(pat)
+
+    # Validate the masks that we have assembled.
+    if fieldmask & fixedmask:
+        error(lineno, 'fieldmask overlaps fixedmask (0x{0:08x} & 0x{1:08x})'
+                      .format(fieldmask, fixedmask))
+    if fieldmask & undefmask:
+        error(lineno, 'fieldmask overlaps undefmask (0x{0:08x} & 0x{1:08x})'
+                      .format(fieldmask, undefmask))
+    if fixedmask & undefmask:
+        error(lineno, 'fixedmask overlaps undefmask (0x{0:08x} & 0x{1:08x})'
+                      .format(fixedmask, undefmask))
+    if not is_format:
+        allbits = fieldmask | fixedmask | undefmask
+        if allbits != insnmask:
+            error(lineno, 'bits left unspecified (0x{0:08x})'
+                          .format(allbits ^ insnmask))
+# end parse_generic
+
+
+def parse_file(f):
+    """Parse all of the patterns within a file"""
+
+    # Read all of the lines of the file. Concatenate lines
+    # ending in backslash; discard empty lines and comments.
+    toks = []
+    lineno = 0
+    for line in f:
+        lineno += 1
+
+        # Discard comments
+        end = line.find('#')
+        if end >= 0:
+            line = line[:end]
+
+        t = line.split()
+        if len(toks) != 0:
+            # Next line after continuation
+            toks.extend(t)
+        elif len(t) == 0:
+            # Empty line
+            continue
+        else:
+            toks = t
+
+        # Continuation?
+        if toks[-1] == '\\':
+            toks.pop()
+            continue
+
+        if len(toks) < 2:
+            error(lineno, 'short line')
+
+        name = toks[0]
+        del toks[0]
+
+        # Determine the type of object needing to be parsed.
+        if name[0] == '%':
+            parse_field(lineno, name[1:], toks)
+        elif name[0] == '&':
+            parse_arguments(lineno, name[1:], toks)
+        elif name[0] == '@':
+            parse_generic(lineno, True, name[1:], toks)
+        else:
+            parse_generic(lineno, False, name, toks)
+        toks = []
+# end parse_file
+
+
+class Tree:
+    """Class representing a node in a decode tree"""
+
+    def __init__(self, fm, tm):
+        self.fixedmask = fm
+        self.thismask = tm
+        self.subs = []
+        self.base = None
+
+    def str1(self, i):
+        ind = str_indent(i)
+        r = '{0}{1:08x}'.format(ind, self.fixedmask)
+        if self.base:
+            r += ' ' + self.base.name
+        r += ' [\n'
+        for (b, s) in self.subs:
+            r += '{0}  {1:08x}:\n'.format(ind, b)
+            r += s.str1(i + 4) + '\n'
+        r += ind + ']'
+        return r
+
+    def __str__(self):
+        return self.str1(0)
+
+    def output_code(self, i, extracted, outerbits, outermask):
+        ind = str_indent(i)
+
+        # If we identified that all nodes below have the same format,
+        # extract the fields now.
+        if not extracted and self.base:
+            output(ind, self.base.extract_name(),
+                   '(&u.f_', self.base.base.name, ', insn);\n')
+            extracted = True
+
+        # Attempt to aid the compiler in producing compact switch statements.
+        # If the bits in the mask are contiguous, extract them.
+        sh = is_contiguous(self.thismask)
+        if sh > 0:
+            # Propagate SH down into the local functions.
+            def str_switch(b, sh=sh):
+                return '(insn >> {0}) & 0x{1:x}'.format(sh, b >> sh)
+
+            def str_case(b, sh=sh):
+                return '0x{0:x}'.format(b >> sh)
+        else:
+            def str_switch(b):
+                return 'insn & 0x{0:08x}'.format(b)
+
+            def str_case(b):
+                return '0x{0:08x}'.format(b)
+
+        output(ind, 'switch (', str_switch(self.thismask), ') {\n')
+        for b, s in sorted(self.subs):
+            assert (self.thismask & ~s.fixedmask) == 0
+            innermask = outermask | self.thismask
+            innerbits = outerbits | b
+            output(ind, 'case ', str_case(b), ':\n')
+            output(ind, '    /* ',
+                   str_match_bits(innerbits, innermask), ' */\n')
+            s.output_code(i + 4, extracted, innerbits, innermask)
+        output(ind, '}\n')
+        output(ind, 'return false;\n')
+# end Tree
+
+
+def build_tree(pats, outerbits, outermask):
+    # Find the intersection of all remaining fixedmask.
+    innermask = ~outermask
+    for i in pats:
+        innermask &= i.fixedmask
+
+    if innermask == 0:
+        pnames = []
+        for p in pats:
+            pnames.append(p.name + ':' + str(p.lineno))
+        error(pats[0].lineno, 'overlapping patterns:', pnames)
+
+    fullmask = outermask | innermask
+
+    # Sort each element of pats into the bin selected by the mask.
+    bins = {}
+    for i in pats:
+        fb = i.fixedbits & innermask
+        if fb in bins:
+            bins[fb].append(i)
+        else:
+            bins[fb] = [i]
+
+    # We must recurse if any bin has more than one element or if
+    # the single element in the bin has not been fully matched.
+    t = Tree(fullmask, innermask)
+
+    for b, l in bins.items():
+        s = l[0]
+        if len(l) > 1 or s.fixedmask & ~fullmask != 0:
+            s = build_tree(l, b | outerbits, fullmask)
+        t.subs.append((b, s))
+
+    return t
+# end build_tree
+
+
+def prop_format(tree):
+    """Propagate Format objects into the decode tree"""
+
+    # Depth first search.
+    for (b, s) in tree.subs:
+        if isinstance(s, Tree):
+            prop_format(s)
+
+    # If all entries in SUBS have the same format, then
+    # propagate that into the tree.
+    f = None
+    for (b, s) in tree.subs:
+        if f is None:
+            f = s.base
+            if f is None:
+                return
+        if f is not s.base:
+            return
+    tree.base = f
+# end prop_format
+
+
+def main():
+    global arguments
+    global formats
+    global patterns
+    global translate_scope
+    global translate_prefix
+    global output_fd
+    global output_file
+    global input_file
+    global insnwidth
+    global insntype, insnmask
+
+    decode_function = 'decode'
+
+    long_opts = ['decode=', 'translate=', 'output=', 'insnwidth=']
+    try:
+        (opts, args) = getopt.getopt(sys.argv[1:], 'o:w:', long_opts)
+    except getopt.GetoptError as err:
+        error(0, err)
+    for o, a in opts:
+        if o in ('-o', '--output'):
+            output_file = a
+        elif o == '--decode':
+            decode_function = a
+        elif o == '--translate':
+            translate_prefix = a
+            translate_scope = ''
+        elif o in ('-w', '--insnwidth'):
+            insnwidth = int(a)
+            if insnwidth == 16:
+                insntype = 'uint16_t'
+                insnmask = 0xffff
+            elif insnwidth != 32:
+                error(0, 'cannot handle insns of width', insnwidth)
+        else:
+            assert False, 'unhandled option'
+
+    if len(args) < 1:
+        error(0, 'missing input file')
+    input_file = args[0]
+    f = open(input_file, 'r')
+    parse_file(f)
+    f.close()
+
+    t = build_tree(patterns, 0, 0)
+    prop_format(t)
+
+    if output_file:
+        output_fd = open(output_file, 'w')
+    else:
+        output_fd = sys.stdout
+
+    output_autogen()
+    for n in sorted(arguments.keys()):
+        f = arguments[n]
+        f.output_def()
+
+    # A single translate function can be invoked for different patterns.
+    # Make sure that the argument sets are the same, and declare the
+    # function only once.
+    out_pats = {}
+    for i in patterns:
+        if i.name in out_pats:
+            p = out_pats[i.name]
+            if i.base.base != p.base.base:
+                error(0, i.name, ' has conflicting argument sets')
+        else:
+            i.output_decl()
+            out_pats[i.name] = i
+    output('\n')
+
+    for n in sorted(formats.keys()):
+        f = formats[n]
+        f.output_extract()
+
+    output(translate_scope, 'bool ', decode_function,
+           '(DisasContext *ctx, ', insntype, ' insn)\n{\n')
+
+    i4 = str_indent(4)
+    output(i4, 'union {\n')
+    for n in sorted(arguments.keys()):
+        f = arguments[n]
+        output(i4, i4, f.struct_name(), ' f_', f.name, ';\n')
+    output(i4, '} u;\n\n')
+
+    t.output_code(4, False, 0, 0)
+
+    output('}\n')
+
+    if output_file:
+        output_fd.close()
+# end main
+
+
+if __name__ == '__main__':
+    main()
diff --git a/tests/Makefile.include b/tests/Makefile.include
index f41da235ae..8cc7d56e97 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -928,6 +928,13 @@ $(patsubst %, check-%, $(check-qapi-schema-y)): check-%.json: $(SRC_PATH)/%.json
 check-tests/qapi-schema/doc-good.texi: tests/qapi-schema/doc-good.test.texi
 	@diff -q $(SRC_PATH)/tests/qapi-schema/doc-good.texi $<
 
+.PHONY: check-decodetree
+check-decodetree:
+	$(call quiet-command, \
+	  cd $(SRC_PATH)/tests/decode && \
+	  ./check.sh "$(PYTHON)" "$(SRC_PATH)/scripts/decodetree.py", \
+	  TEST, decodetree.py)
+
 # Consolidated targets
 
 .PHONY: check-qapi-schema check-qtest check-unit check check-clean
@@ -936,7 +943,7 @@ check-qtest: $(patsubst %,check-qtest-%, $(QTEST_TARGETS))
 check-unit: $(patsubst %,check-%, $(check-unit-y))
 check-speed: $(patsubst %,check-%, $(check-speed-y))
 check-block: $(patsubst %,check-%, $(check-block-y))
-check: check-qapi-schema check-unit check-qtest
+check: check-qapi-schema check-unit check-qtest check-decodetree
 check-clean:
 	$(MAKE) -C tests/tcg clean
 	rm -rf $(check-unit-y) tests/*.o $(QEMU_IOTESTS_HELPERS-y)
diff --git a/tests/decode/check.sh b/tests/decode/check.sh
new file mode 100755
index 0000000000..6eb1392593
--- /dev/null
+++ b/tests/decode/check.sh
@@ -0,0 +1,16 @@
+#!/bin/sh
+
+PYTHON=$1
+DECODETREE=$2
+E=0
+
+# All of these tests should produce errors
+for i in err_*.def; do
+    if $PYTHON $DECODETREE $i > /dev/null 2> /dev/null; then
+        # Pass, aka failed to fail.
+        echo FAIL: $i 1>&2
+        E=1
+    fi
+done
+
+exit $E
diff --git a/tests/decode/err_argset1.def b/tests/decode/err_argset1.def
new file mode 100644
index 0000000000..65d089d582
--- /dev/null
+++ b/tests/decode/err_argset1.def
@@ -0,0 +1 @@
+&args a a
diff --git a/tests/decode/err_argset2.def b/tests/decode/err_argset2.def
new file mode 100644
index 0000000000..16a812cf0d
--- /dev/null
+++ b/tests/decode/err_argset2.def
@@ -0,0 +1 @@
+&args a b c d0 0e
diff --git a/tests/decode/err_field1.def b/tests/decode/err_field1.def
new file mode 100644
index 0000000000..075404ced1
--- /dev/null
+++ b/tests/decode/err_field1.def
@@ -0,0 +1 @@
+%field asdf
diff --git a/tests/decode/err_field2.def b/tests/decode/err_field2.def
new file mode 100644
index 0000000000..08933bf8c9
--- /dev/null
+++ b/tests/decode/err_field2.def
@@ -0,0 +1 @@
+%field 0:33
diff --git a/tests/decode/err_field3.def b/tests/decode/err_field3.def
new file mode 100644
index 0000000000..ecb6427a40
--- /dev/null
+++ b/tests/decode/err_field3.def
@@ -0,0 +1 @@
+%field 31:2
diff --git a/tests/decode/err_field4.def b/tests/decode/err_field4.def
new file mode 100644
index 0000000000..2844afc24a
--- /dev/null
+++ b/tests/decode/err_field4.def
@@ -0,0 +1,2 @@
+%field 0:1
+%field 0:1
diff --git a/tests/decode/err_field5.def b/tests/decode/err_field5.def
new file mode 100644
index 0000000000..cc3ea844ae
--- /dev/null
+++ b/tests/decode/err_field5.def
@@ -0,0 +1 @@
+%field 0:1 !function=a !function=a
diff --git a/tests/decode/err_init1.def b/tests/decode/err_init1.def
new file mode 100644
index 0000000000..2c986cf627
--- /dev/null
+++ b/tests/decode/err_init1.def
@@ -0,0 +1,2 @@
+&args a b
+insn 00000000 00000000 00000000 b:8 &args
diff --git a/tests/decode/err_init2.def b/tests/decode/err_init2.def
new file mode 100644
index 0000000000..7c80854ea5
--- /dev/null
+++ b/tests/decode/err_init2.def
@@ -0,0 +1,2 @@
+&args a b
+insn 00000000 00000000 a:8 b:8 &args a=1
diff --git a/tests/decode/err_init3.def b/tests/decode/err_init3.def
new file mode 100644
index 0000000000..15a3060c61
--- /dev/null
+++ b/tests/decode/err_init3.def
@@ -0,0 +1,3 @@
+&args a
+@format ........ ........ a:16 &args
+insn 00000000 00000000 a:16 @format
diff --git a/tests/decode/err_init4.def b/tests/decode/err_init4.def
new file mode 100644
index 0000000000..b84d968acd
--- /dev/null
+++ b/tests/decode/err_init4.def
@@ -0,0 +1,3 @@
+&args a b
+@format ........ ........ a:16 &args
+insn 00000000 00000000 ........ ........ @format
diff --git a/tests/decode/err_overlap1.def b/tests/decode/err_overlap1.def
new file mode 100644
index 0000000000..5d39c2ddad
--- /dev/null
+++ b/tests/decode/err_overlap1.def
@@ -0,0 +1,2 @@
+%field 0:1
+insn 00000000 00000000 00000000 00000000 %field
diff --git a/tests/decode/err_overlap2.def b/tests/decode/err_overlap2.def
new file mode 100644
index 0000000000..38e4f8ae31
--- /dev/null
+++ b/tests/decode/err_overlap2.def
@@ -0,0 +1,2 @@
+@format ........ ........ ........ ....... fld:1
+insn 00000000 00000000 00000000 00000000 @format
diff --git a/tests/decode/err_overlap3.def b/tests/decode/err_overlap3.def
new file mode 100644
index 0000000000..90c73f4f30
--- /dev/null
+++ b/tests/decode/err_overlap3.def
@@ -0,0 +1,2 @@
+%field 0:1
+insn 00000000 00000000 00000000 -------- %field
diff --git a/tests/decode/err_overlap4.def b/tests/decode/err_overlap4.def
new file mode 100644
index 0000000000..d83f8e3153
--- /dev/null
+++ b/tests/decode/err_overlap4.def
@@ -0,0 +1,2 @@
+@format ........ ........ ........ .......-
+insn 00000000 00000000 00000000 00000000 @format
diff --git a/tests/decode/err_overlap5.def b/tests/decode/err_overlap5.def
new file mode 100644
index 0000000000..2f4359bb9b
--- /dev/null
+++ b/tests/decode/err_overlap5.def
@@ -0,0 +1 @@
+%field 3:5 0:5
diff --git a/tests/decode/err_overlap6.def b/tests/decode/err_overlap6.def
new file mode 100644
index 0000000000..bf446a3092
--- /dev/null
+++ b/tests/decode/err_overlap6.def
@@ -0,0 +1,2 @@
+@format ........ ........ ........ .......1
+insn 00000000 00000000 00000000 00000000 @format
diff --git a/tests/decode/err_overlap7.def b/tests/decode/err_overlap7.def
new file mode 100644
index 0000000000..44d914163b
--- /dev/null
+++ b/tests/decode/err_overlap7.def
@@ -0,0 +1,2 @@
+insn1 00000000 00000000 00000000 00000000
+insn2 00000000 00000000 00000000 00000000
diff --git a/tests/decode/err_overlap8.def b/tests/decode/err_overlap8.def
new file mode 100644
index 0000000000..40ce278fb1
--- /dev/null
+++ b/tests/decode/err_overlap8.def
@@ -0,0 +1 @@
+insn 00000000 00000000 00000000 0000000.
diff --git a/tests/decode/err_overlap9.def b/tests/decode/err_overlap9.def
new file mode 100644
index 0000000000..8490635eea
--- /dev/null
+++ b/tests/decode/err_overlap9.def
@@ -0,0 +1,2 @@
+@format ........ a:8 ........ b:7 .
+insn 00000000 ........ 00000000 ........ @format
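
As a reading aid for build_tree(), here is a minimal standalone sketch of its two core steps: intersecting the fixedmask of all remaining patterns, then binning the patterns by their fixed bits under that mask. It is not part of the patch. Pat, common_mask and bin_patterns are hypothetical names introduced only for illustration, and the sample encodings are made up. The last case is meant to mirror what err_overlap7.def exercises: two identical patterns leave no bits to distinguish them once the outer mask covers everything, which is the condition build_tree() reports as overlapping patterns.

```python
from collections import namedtuple

# Hypothetical stand-in for the script's pattern objects; only the
# attributes that the binning step of build_tree() reads are modelled.
Pat = namedtuple('Pat', ['name', 'fixedbits', 'fixedmask'])


def common_mask(pats, outermask):
    """Return the bits fixed in every pattern and not yet matched."""
    innermask = ~outermask
    for p in pats:
        innermask &= p.fixedmask
    return innermask


def bin_patterns(pats, innermask):
    """Group patterns by the value of their fixed bits under innermask."""
    bins = {}
    for p in pats:
        bins.setdefault(p.fixedbits & innermask, []).append(p)
    return bins


# Made-up encodings: a 4-bit major opcode in the top nibble, and for
# two of the patterns a 4-bit minor opcode in the bottom nibble.
pats = [
    Pat('add', 0x10000000, 0xf0000000),
    Pat('sub', 0x20000000, 0xf000000f),
    Pat('mov', 0x20000001, 0xf000000f),
]

# First level: only the top nibble is fixed in all three patterns.
mask = common_mask(pats, 0)
bins = bin_patterns(pats, mask)

# 'sub' and 'mov' share a bin, so build_tree() would recurse on it;
# the bits not yet matched then select the bottom nibble.
inner = common_mask(bins[0x20000000], mask)

# Two identical patterns: after the first level nothing is left to
# tell them apart, so the next mask is 0 (the overlap condition).
dup = [Pat('insn1', 0, 0xffffffff), Pat('insn2', 0, 0xffffffff)]
dup_inner = common_mask(dup, common_mask(dup, 0))
```

The sketch leaves out error reporting and the Tree construction; it only shows why a zero intersection mask is the overlap signal.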