Fixing Floating-point handling in Bionic on RISC-V
It has been a while since I started my internship at the PLCT Lab, and I have learned a lot about the RISC-V ecosystem over the past several weeks.
Background
As I am currently with the AOSP Porting Squad, the first thing to do is to get Bionic (the libc implementation on Android) to work correctly. My mentor Mr. Wang (汪辰) has already done a stellar job porting most of Bionic to the RISC-V architecture. However, he recently needed to free up some time to work on RISC-V QEMU support, so he tasked me with fixing the floating-point handling issues.
Preparations
Obviously, the first thing to do is to get up to speed with my mentor's existing work and continue from there. You can find the existing implementation here: https://github.com/aosp-riscv/platform_bionic.
To build Bionic, together with its dependencies and unit tests, you have to build it in an AOSP development environment. Our porting squad has already provided instructions on how to set up the environment, and you can find them below:
Some of the instructions may appear a little bit vague. For instance, to obtain prebuilt binaries for RISC-V GCC and QEMU, you have to navigate to this repository and this other repository.
Now source the necessary scripts and start the build process:
source ./build/envsetup.sh
export TARGET_ARCH=riscv64
export TARGET_PRODUCT=aosp_riscv64
mmm bionic external/icu
Identify the Cause
After the build process finishes, we can now run the tests:
pushd test/riscv
source ./envsetup
cd bionic/host
./run.sh
And we can see the following errors:
[... snip ...]
bionic/tests/math_test.cpp:1017: Failure
Expected equality of these values:
1235
lrintl(1234.01L)
Which is: 1234
[ FAILED ] math_h.lrint (3 ms)
Hmm... I think the cause might be that the rounding mode is not being respected, especially if we examine this part of the unit-test code:
1011 TEST(MATH_TEST, lrint) {
1012 auto guard = android::base::make_scope_guard([]() { fesetenv(FE_DFL_ENV); });
1013
1014 fesetround(FE_UPWARD); // lrint/lrintf/lrintl obey the rounding mode.
1015 ASSERT_EQ(1235, lrint(1234.01));
1016 ASSERT_EQ(1235, lrintf(1234.01f));
1017 ASSERT_EQ(1235, lrintl(1234.01L));
1018 fesetround(FE_TOWARDZERO); // lrint/lrintf/lrintl obey the rounding mode.
1019 ASSERT_EQ(1234, lrint(1234.01));
1020 ASSERT_EQ(1234, lrintf(1234.01f));
1021 ASSERT_EQ(1234, lrintl(1234.01L));
As you can see, the test clearly sets the rounding mode but our rounding result is not correct.
The next thing to look at is the function that sets the rounding mode, more specifically, the fesetround function.
We will find the implementation like this:
int fesetround(int round)
{
    round &= FE_UPWARD;
    asm volatile ("fsrm %z0" : : "r" (round));
    return 0;
}
Huh, there is a piece of inline assembly in here. Don't worry, let's look it up in the RISC-V instruction set manual and see how it works.
In Chapter 11.2: Floating-Point Control and Status Register, we can see that the fsrm instruction accepts a rounding mode flag:
[...] The FRRM instruction reads the Rounding Mode field frm and copies it into the least-significant three bits of integer register rd, with zero in all other bits. FSRM swaps the value in frm by copying the original value into integer register rd, and then writing a new value obtained from the three least-significant bits of integer register rs1 into frm.
And thus, we can verify that fesetround was indeed implemented correctly.
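As a rough illustration of why the masking step in fesetround is enough, the sketch below models the frm field in plain C. The FRM_* names and both helper functions are hypothetical. The numeric encodings are the ones the F extension defines for frm, and the sketch assumes that the RISC-V fenv.h constants use the same values (so that FE_UPWARD is 3).

```c
/* frm encodings defined by the RISC-V F extension. */
enum frm_mode {
    FRM_RNE = 0, /* round to nearest, ties to even */
    FRM_RTZ = 1, /* round towards zero */
    FRM_RDN = 2, /* round down, towards negative infinity */
    FRM_RUP = 3, /* round up, towards positive infinity */
    FRM_RMM = 4  /* round to nearest, ties to max magnitude */
};

/* Hypothetical model of the masking in fesetround. Assumption: FE_UPWARD
 * equals FRM_RUP (3) on RISC-V, so the mask keeps the two low bits, which
 * cover the four standard C rounding modes. */
unsigned frm_from_fe_round(int round) {
    return (unsigned)round & FRM_RUP;
}

/* Human-readable name for a value of the frm field. */
const char *frm_name(unsigned frm) {
    switch (frm) {
    case FRM_RNE: return "RNE";
    case FRM_RTZ: return "RTZ";
    case FRM_RDN: return "RDN";
    case FRM_RUP: return "RUP";
    case FRM_RMM: return "RMM";
    default:      return "reserved";
    }
}
```

Under that assumption, every value the C library can pass ends up as one of RNE, RTZ, RDN or RUP, which are all valid inputs for fsrm.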
So what went wrong? Hmm... Is the current implementation really using the FPU though? Maybe it's using the software floating-point implementation instead. Let's look at the implementations for other architectures under the libm folder, like arm64: yep, there are some assembly files lying around, which suggests the RISC-V port is still falling back to the generic software routines. The build files further confirm this.
Down the Rabbit Hole
Now that we have identified the issue, let's come up with a solution. For aarch64, Bionic uses compiler intrinsics to round the numbers. Let's try that with a short example program:
#include <stdio.h>

int main() {
    printf("%f\n", __builtin_rint(100.1));
    return 0;
}
Since we are not going to run it (it would need a libc), let's examine the disassembly instead:
0000000000000014 <.LBB0_1>:
14: 00000517 auipc a0,0x0
18: 00050513 mv a0,a0
1c: 2108 fld fa0,0(a0)
1e: 00000097 auipc ra,0x0
22: 000080e7 jalr ra # 1e <.LBB0_1+0xa>
26: e20505d3 fmv.x.d a1,fa0
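Hmm, the compiler used fmv.x.d a1, fa0 to move the value from the FPU register fa0 to the GPR a1. The jalr just before it is a call whose target is not resolved yet in this unlinked object, so the builtin was not expanded into an inline rounding instruction either.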
From the manual, we learn that fmv is special in the sense that it does not respect the rounding mode: it moves the bit pattern unchanged. We can verify that from the encoding diagram in the manual, where the rm field of fmv.x.d is fixed to 000 instead of selecting a rounding mode.
Well, that didn't work. Let's just hand-craft some assembly code instead.
Let's see... There has to be a rounding instruction for RISC-V somewhere in the manual, right?
It turns out there is none. Our current best bet is to use fcvt to convert the floating-point value to an integer in a GPR and then convert it back into a floating-point register.
That... doesn't seem to be the best solution, but it is a solution nevertheless.
So let's implement the rint-family of functions like this:
rint:
    # NOTE: inputs outside the 64-bit integer range, infinities and NaNs are not handled here
    # convert fa0 to an integer in GPR a0 using the dynamic rounding mode
    FCVT.L.D a0, fa0, dyn
    # convert the integer in a0 back to a double in ft0
    FCVT.D.L ft0, a0, dyn
    # copy value from ft0 to fa0 while preserving the sign bit of the input (e.g. handles -0.0)
    FSGNJ.D fa0, ft0, fa0
    RET
rintf:
    # same round trip for float, through a 32-bit integer
    FCVT.W.S a0, fa0, dyn
    FCVT.S.W ft0, a0, dyn
    FSGNJ.S fa0, ft0, fa0
    RET
The Abyss
Okay, there is one more issue: we need to implement rintl, where the function signature looks like this:
long double rintl(long double x);
Can we apply the previous knowledge here? *sigh* I wish we could.
The Story of long double
In order to understand why there is an issue, we first need to understand the type long double.
On x86 machines, long double is usually an extended-precision floating-point type, which is 80 bits long (for alignment reasons, it is padded to 96 or 128 bits). On 32-bit arm, long double is the same as double, which is 64 bits long; on arm64 Linux and Android, it is a 128-bit type.
For RISC-V though, according to the RISC-V C ABI Specification, long double on RISC-V is... 128-bit, which is a quadruple-precision floating-point type!
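A quick way to check which format a given toolchain uses for long double is to look at LDBL_MANT_DIG from float.h. The sketch below is only a convenience; the function name long_double_format is hypothetical and the labels are informal.

```c
#include <float.h>

/* Hypothetical helper: describe the long double format of the current
 * target from the number of mantissa digits the compiler reports. */
const char *long_double_format(void) {
#if LDBL_MANT_DIG == 113
    return "IEEE binary128 (quadruple precision)";
#elif LDBL_MANT_DIG == 64
    return "x87 80-bit extended precision";
#elif LDBL_MANT_DIG == 53
    return "same as double (binary64)";
#else
    return "other";
#endif
}
```

Compiled for riscv64, this should report the quadruple-precision format that the ABI specification requires.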
This poses a huge issue: hardware long double support is only possible with the RISC-V Q extension, which is not part of the riscv64gc specification.
I have looked at the GNU C Library (glibc) implementation of this, and it seems like they are using a software-only implementation (which also does not seem to respect the rounding mode).
To be continued...?
So we are facing an impasse here. We are still thinking of a way to resolve this. I guess this is the end of this "adventure"... for now.