Discussion:
[GASPP PATCH 0/6] More gas-pp/armasm patches
(too old to reply)
Martin Storsjo
2017-10-16 19:38:13 UTC
Permalink
These are necessary for building x264 with msvc/armasm64. Most of
them can probably be squashed into the main patch before pushing.

Martin Storsjo (6):
Allow extra space before comma in cbz/adr
Allow register names such as xzr instead of the pattern [xw]\d+ in
ccmp/csel
Handle cinc just like ccmp/csel
Convert ldr/str/ldrb/strb etc into ldurb, when the offset is negative
Convert local labels in tbz instructions for armasm
Work around an armasm64 bug in the scale operand to fcvtzs/scvtf

gas-preprocessor.pl | 41 +++++++++++++++++++++++++++++++++++++----
1 file changed, 37 insertions(+), 4 deletions(-)
--
2.7.4
Martin Storsjo
2017-10-16 19:38:14 UTC
Permalink
This should be squashed into "Add support for MS armasm64".
---
gas-preprocessor.pl | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index 6353a07..4c91ee0 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -891,7 +891,7 @@ sub handle_serialized_line {
($arch eq "aarch64" and !is_aarch64_register($target))) {
$call_targets{$target}++;
}
- } elsif ($line =~ /(?:^|\n)\s*(\w+\s*:\s*)?(cbn?z|adr)\s+(\w+),\s*(\w+)/) {
+ } elsif ($line =~ /(?:^|\n)\s*(\w+\s*:\s*)?(cbn?z|adr)\s+(\w+)\s*,\s*(\w+)/) {
my $instr = $2;
my $reg = $3;
my $target = $4;
--
2.7.4
Diego Biurrun
2017-10-16 19:39:13 UTC
Permalink
Post by Martin Storsjo
This should be squashed into "Add support for MS armasm64".
---
gas-preprocessor.pl | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
OK

Diego
Martin Storsjo
2017-10-16 19:38:15 UTC
Permalink
Also update the csel pattern similarly.

This is required for building x264.
---
gas-preprocessor.pl | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index 4c91ee0..2add3dd 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1000,10 +1000,10 @@ sub handle_serialized_line {

# Convert "ccmp w0, #0, #0, ne" into "ccmpne w0, #0, #0",
# and "csel w0, w0, w0, ne" into "cselne w0, w0, w0".
- $line =~ s/(ccmp|csel)\s+([xw]\d+)\s*,\s*([xw#]\d+)\s*,\s*([xw#]\d+)\s*,\s*($arm_cond_codes)/\1\5 \2, \3, \4/;
+ $line =~ s/(ccmp|csel)\s+([xw]\w+)\s*,\s*([xw#]\w+)\s*,\s*([xw#]\w+)\s*,\s*($arm_cond_codes)/\1\5 \2, \3, \4/;

# Convert "cset w0, lo" into "csetlo w0"
- $line =~ s/(cset)\s+([xw]\d+)\s*,\s*($arm_cond_codes)/\1\3 \2/;
+ $line =~ s/(cset)\s+([xw]\w+)\s*,\s*($arm_cond_codes)/\1\3 \2/;

# Strip out prfum; armasm64 fails to assemble any
# variant/combination of prfum tested so far, but it can be
--
2.7.4
Janne Grunau
2017-10-18 06:42:06 UTC
Permalink
Post by Martin Storsjo
Also update the csel pattern similarly.
This is required for building x264.
---
gas-preprocessor.pl | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index 4c91ee0..2add3dd 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1000,10 +1000,10 @@ sub handle_serialized_line {
# Convert "ccmp w0, #0, #0, ne" into "ccmpne w0, #0, #0",
# and "csel w0, w0, w0, ne" into "cselne w0, w0, w0".
- $line =~ s/(ccmp|csel)\s+([xw]\d+)\s*,\s*([xw#]\d+)\s*,\s*([xw#]\d+)\s*,\s*($arm_cond_codes)/\1\5 \2, \3, \4/;
+ $line =~ s/(ccmp|csel)\s+([xw]\w+)\s*,\s*([xw#]\w+)\s*,\s*([xw#]\w+)\s*,\s*($arm_cond_codes)/\1\5 \2, \3, \4/;
# Convert "cset w0, lo" into "csetlo w0"
- $line =~ s/(cset)\s+([xw]\d+)\s*,\s*($arm_cond_codes)/\1\3 \2/;
+ $line =~ s/(cset)\s+([xw]\w+)\s*,\s*($arm_cond_codes)/\1\3 \2/;
# Strip out prfum; armasm64 fails to assemble any
# variant/combination of prfum tested so far, but it can be
ok

Janne
Martin Storsjo
2017-10-16 19:38:16 UTC
Permalink
This can be squashed into "Add support for MS armasm64"; this
was found while trying to build x264.
---
gas-preprocessor.pl | 3 +++
1 file changed, 3 insertions(+)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index 2add3dd..552ed0c 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1002,6 +1002,9 @@ sub handle_serialized_line {
# and "csel w0, w0, w0, ne" into "cselne w0, w0, w0".
$line =~ s/(ccmp|csel)\s+([xw]\w+)\s*,\s*([xw#]\w+)\s*,\s*([xw#]\w+)\s*,\s*($arm_cond_codes)/\1\5 \2, \3, \4/;

+ # Convert "cinc w0, w0, ne" into "cincne w0, w0".
+ $line =~ s/(cinc)\s+([xw]\w+)\s*,\s*([xw]\w+)\s*,\s*($arm_cond_codes)/\1\4 \2, \3/;
+
# Convert "cset w0, lo" into "csetlo w0"
$line =~ s/(cset)\s+([xw]\w+)\s*,\s*($arm_cond_codes)/\1\3 \2/;
--
2.7.4
Janne Grunau
2017-10-18 06:42:38 UTC
Permalink
Post by Martin Storsjo
This can be squashed into "Add support for MS armasm64"; this
was found while trying to build x264.
---
gas-preprocessor.pl | 3 +++
1 file changed, 3 insertions(+)
diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index 2add3dd..552ed0c 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1002,6 +1002,9 @@ sub handle_serialized_line {
# and "csel w0, w0, w0, ne" into "cselne w0, w0, w0".
$line =~ s/(ccmp|csel)\s+([xw]\w+)\s*,\s*([xw#]\w+)\s*,\s*([xw#]\w+)\s*,\s*($arm_cond_codes)/\1\5 \2, \3, \4/;
+ # Convert "cinc w0, w0, ne" into "cincne w0, w0".
+ $line =~ s/(cinc)\s+([xw]\w+)\s*,\s*([xw]\w+)\s*,\s*($arm_cond_codes)/\1\4 \2, \3/;
+
# Convert "cset w0, lo" into "csetlo w0"
$line =~ s/(cset)\s+([xw]\w+)\s*,\s*($arm_cond_codes)/\1\3 \2/;
ok

Janne
Martin Storsjo
2017-10-16 19:38:17 UTC
Permalink
---
gas-preprocessor.pl | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index 552ed0c..b650c39 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1012,6 +1012,19 @@ sub handle_serialized_line {
# variant/combination of prfum tested so far, but it can be
# left out without any
$line =~ s/prfum.*\]//;
+
+ # Convert "ldrb w0, [x0, #-1]" into "ldurb w0, [x0, #-1]".
+ # Don't do this for forms with writeback though.
+ if ($line =~ /(ld|st)(r[bh]?)\s+(\w+)\s*,\s*\[\s*(\w+)\s*,\s*#([^\]]+)\s*\][^!]/) {
+ my $instr = $1;
+ my $suffix = $2;
+ my $target = $3;
+ my $base = $4;
+ my $offset = eval_expr($5);
+ if ($offset < 0) {
+ $line =~ s/$instr$suffix/${instr}u$suffix/;
+ }
+ }
}
# armasm is unable to parse &0x - add spacing
$line =~ s/&0x/& 0x/g;
--
2.7.4
Janne Grunau
2017-10-18 06:52:41 UTC
Permalink
Post by Martin Storsjo
---
gas-preprocessor.pl | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index 552ed0c..b650c39 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1012,6 +1012,19 @@ sub handle_serialized_line {
# variant/combination of prfum tested so far, but it can be
# left out without any
$line =~ s/prfum.*\]//;
+
+ # Convert "ldrb w0, [x0, #-1]" into "ldurb w0, [x0, #-1]".
+ # Don't do this for forms with writeback though.
+ if ($line =~ /(ld|st)(r[bh]?)\s+(\w+)\s*,\s*\[\s*(\w+)\s*,\s*#([^\]]+)\s*\][^!]/) {
+ my $instr = $1;
+ my $suffix = $2;
+ my $target = $3;
+ my $base = $4;
+ my $offset = eval_expr($5);
+ if ($offset < 0) {
+ $line =~ s/$instr$suffix/${instr}u$suffix/;
+ }
+ }
}
# armasm is unable to parse &0x - add spacing
$line =~ s/&0x/& 0x/g;
patch ok-ish but we should fix the offending code too. ldr? without
writeback supports only unsigned offsets and gas seems to fix it for us.

Janne
Martin Storsjö
2017-10-18 07:40:12 UTC
Permalink
Post by Janne Grunau
Post by Martin Storsjo
---
gas-preprocessor.pl | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index 552ed0c..b650c39 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1012,6 +1012,19 @@ sub handle_serialized_line {
# variant/combination of prfum tested so far, but it can be
# left out without any
$line =~ s/prfum.*\]//;
+
+ # Convert "ldrb w0, [x0, #-1]" into "ldurb w0, [x0, #-1]".
+ # Don't do this for forms with writeback though.
+ if ($line =~ /(ld|st)(r[bh]?)\s+(\w+)\s*,\s*\[\s*(\w+)\s*,\s*#([^\]]+)\s*\][^!]/) {
+ my $instr = $1;
+ my $suffix = $2;
+ my $target = $3;
+ my $base = $4;
+ my $offset = eval_expr($5);
+ if ($offset < 0) {
+ $line =~ s/$instr$suffix/${instr}u$suffix/;
+ }
+ }
}
# armasm is unable to parse &0x - add spacing
$line =~ s/&0x/& 0x/g;
patch ok-ish but we should fix the offending code too. ldr? without
writeback supports only unsigned offsets and gas seems to fix it for us.
Ok, patch sent to x264.

// Martin

Martin Storsjo
2017-10-16 19:38:18 UTC
Permalink
Also convert the register from wX into xX, since armasm fails to
assemble it when referring to the register as wX.
---
gas-preprocessor.pl | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index b650c39..d9eaf1d 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -891,16 +891,23 @@ sub handle_serialized_line {
($arch eq "aarch64" and !is_aarch64_register($target))) {
$call_targets{$target}++;
}
- } elsif ($line =~ /(?:^|\n)\s*(\w+\s*:\s*)?(cbn?z|adr)\s+(\w+)\s*,\s*(\w+)/) {
+ } elsif ($line =~ /(?:^|\n)\s*(\w+\s*:\s*)?(cbn?z|adr|tbz)\s+(\w+)\s*,(\s*#\d+\s*,)?\s*(\w+)/) {
my $instr = $2;
my $reg = $3;
- my $target = $4;
+ my $bit = $4;
+ my $target = $5;
if ($target =~ /^(\d+)([bf])$/) {
# The target is a local label
$line = handle_local_label($line, $1, $2);
} else {
$call_targets{$target}++;
}
+ # Convert tbz with a wX register into an xX register.
+ if ($instr eq "tbz" and $reg =~ /w\d+/) {
+ my $xreg = $reg;
+ $xreg =~ s/w/x/;
+ $line =~ s/\b$reg\b/$xreg/;
+ }
} elsif ($line =~ /^\s*.h?word.*\b\d+[bf]\b/) {
while ($line =~ /\b(\d+)([bf])\b/g) {
$line = handle_local_label($line, $1, $2);
--
2.7.4
Janne Grunau
2017-10-18 06:56:26 UTC
Permalink
Post by Martin Storsjo
Also convert the register from wX into xX, since armasm fails to
assemble it when referring to the register as wX.
---
gas-preprocessor.pl | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index b650c39..d9eaf1d 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -891,16 +891,23 @@ sub handle_serialized_line {
($arch eq "aarch64" and !is_aarch64_register($target))) {
$call_targets{$target}++;
}
- } elsif ($line =~ /(?:^|\n)\s*(\w+\s*:\s*)?(cbn?z|adr)\s+(\w+)\s*,\s*(\w+)/) {
+ } elsif ($line =~ /(?:^|\n)\s*(\w+\s*:\s*)?(cbn?z|adr|tbz)\s+(\w+)\s*,(\s*#\d+\s*,)?\s*(\w+)/) {
my $instr = $2;
my $reg = $3;
- my $target = $4;
+ my $bit = $4;
+ my $target = $5;
if ($target =~ /^(\d+)([bf])$/) {
# The target is a local label
$line = handle_local_label($line, $1, $2);
} else {
$call_targets{$target}++;
}
+ # Convert tbz with a wX register into an xX register.
+ if ($instr eq "tbz" and $reg =~ /w\d+/) {
+ my $xreg = $reg;
+ $xreg =~ s/w/x/;
+ $line =~ s/\b$reg\b/$xreg/;
+ }
} elsif ($line =~ /^\s*.h?word.*\b\d+[bf]\b/) {
while ($line =~ /\b(\d+)([bf])\b/g) {
$line = handle_local_label($line, $1, $2);
please mention that this is an armasm bug, the fixup is ok though

Janne
Martin Storsjo
2017-10-16 19:38:19 UTC
Permalink
The operand shouldn't be stored as is, but stored as 64-scale, in
the opcode, but armasm64 misses to do this.

This might be a big enough bug to report and try to get fixed, but
that requires removing this workaround at that point.
---
gas-preprocessor.pl | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index d9eaf1d..182b684 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1032,6 +1032,16 @@ sub handle_serialized_line {
$line =~ s/$instr$suffix/${instr}u$suffix/;
}
}
+
+ # Instructions like fcvtzs and scvtf store the scale value
+ # inverted in the opcode (stored as 64 - scale), but armasm64
+ # stores it as-is. Thus convert from "fcvtzs w0, s0, #8"
+ # into "fcvtzs w0, s0, #56".
+ if ($line =~ /(?:fcvtzs|scvtf)\s+(\w+)\s*,\s*(\w+)\s*,\s*#(\d+)/) {
+ my $scale = $3;
+ my $inverted_scale = 64 - $3;
+ $line =~ s/#$scale/#$inverted_scale/;
+ }
}
# armasm is unable to parse &0x - add spacing
$line =~ s/&0x/& 0x/g;
--
2.7.4
Janne Grunau
2017-10-18 07:02:29 UTC
Permalink
Post by Martin Storsjo
The operand shouldn't be stored as is, but stored as 64-scale, in
the opcode, but armasm64 misses to do this.
This might be a big enough bug to report and try to get fixed, but
that requires removing this workaround at that point.
Please report this as bug. I'd propose a environment variable or version
check for this fixup.
Post by Martin Storsjo
---
gas-preprocessor.pl | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index d9eaf1d..182b684 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1032,6 +1032,16 @@ sub handle_serialized_line {
$line =~ s/$instr$suffix/${instr}u$suffix/;
}
}
+
+ # Instructions like fcvtzs and scvtf store the scale value
+ # inverted in the opcode (stored as 64 - scale), but armasm64
+ # stores it as-is. Thus convert from "fcvtzs w0, s0, #8"
+ # into "fcvtzs w0, s0, #56".
+ if ($line =~ /(?:fcvtzs|scvtf)\s+(\w+)\s*,\s*(\w+)\s*,\s*#(\d+)/) {
+ my $scale = $3;
+ my $inverted_scale = 64 - $3;
+ $line =~ s/#$scale/#$inverted_scale/;
+ }
}
# armasm is unable to parse &0x - add spacing
$line =~ s/&0x/& 0x/g;
The fixup itself is ok

Janne
Martin Storsjö
2017-10-18 07:10:39 UTC
Permalink
Post by Janne Grunau
Post by Martin Storsjo
The operand shouldn't be stored as is, but stored as 64-scale, in
the opcode, but armasm64 misses to do this.
This might be a big enough bug to report and try to get fixed, but
that requires removing this workaround at that point.
Please report this as bug.
I've reported it at
https://connect.microsoft.com/VisualStudio/feedback/details/3142655/armasm64-encodes-incorrect-scale-for-fcvtzs-and-scvtf
(seems to require sign-in to see) but apparently they are switching to
some other system so I have to re-report it elsewhere.
Post by Janne Grunau
I'd propose a environment variable or version check for this fixup.
Yeah, I thought about that. At this point within gas-preprocessor, we
don't really know the version of armasm (and executing it to find out
would be pretty expensive). One possible solution is to enclose the
hook-up of the affected functions in x264 with something like this:

#if !defined(_MSC_FULL_VER) || _MSC_FULL_VER >= FIXED_VERSION

That assumes that the assembler is updated in sync with the compiler,
which I guess is reasonable.

Alternatively we'd use something like the existing GASPP_FIX_XCODE5 env
var, requiring setting GASPP_FIX_ARMASM_FP_SCALE or something such. If the
bug report progress shows that it might not ever be fixed, we could remove
the need for the env variable of course.

// Martin
Loading...