CLA signed integer comparison workaround

CLA has a hardware flaw that affects integer comparisons. The signed integer comparison instruction overflows if the difference is too large, such as when the inputs have opposite sign and are near the extreme values. This sets the status register incorrectly. The compiler implements a workaround for this problem; it will check the upper bits with a float comparison before proceeding to do the integer comparison.

### Background

CLA comparisons set bits in the SR (status register) to indicate the result. These bits can be set by both the integer comparison instruction MCMP32 and ordinary integer arithmetic instructions, such as MSUB32. For integers, only NF (negative flag) and ZF (zero flag) are relevant (while CLA does have underflow/overflow flags, those flags only apply to floating-point instructions). Once these bits are set, conditional instructions can use one of the following conditions:

condition name | meaning               | when true
---------------|-----------------------|---------------------
NEQ            | not equal             | ZF == 0
EQ             | equal                 | ZF == 1
GT             | greater than          | ZF == 0 and NF == 0
GEQ            | greater than or equal | NF == 0
LT             | less than             | NF == 1
LEQ            | less than or equal    | ZF == 1 or NF == 1

The CLA instruction MCMP32 is a 32-bit signed integer comparison. CLA does not have an unsigned integer comparison instruction. The compiler implements unsigned integer comparison with the signed integer comparison instruction by transforming the operands into signed operands such that the result of the comparison of the transformed operands is the same as intended by the original operation.

### Problem

The [TMS320x2806x Piccolo Technical Reference Manual SPRUH18G (Apr 2017)](https://www.ti.com/lit/ug/spruh18g/spruh18g.pdf) says (page 592, MCMP32):

> Set ZF and NF flags on the result of MRa - MRb where MRa and MRb are 32-bit integers.
> If (MRa == MRb) {ZF=1; NF=0;}
> If (MRa > MRb) {ZF=0; NF=0;}
> If (MRa < MRb) {ZF=0; NF=1;}

This reference manual also says (page 665, MSUB32) that the NF and ZF flags are set based on the bits in the result register, not the values of the input. In the case of overflow, the NF flag will be incorrect. Signed subtraction overflow does not matter to the compiler because it is considered an error (undefined behavior) on the part of the user. However, comparison must not overflow, even if subtracting those inputs would overflow.

Consider the inputs

```
MR0 = 0x1
MR1 = 0x80000001 (-2147483647)
```

Note that

```
1 - (-2147483647) ==> 2147483648 (overflow!)
```

The actual (infinite precision) result is a large positive number, but "MSUB32 MRx, MR0, MR1" sets NF=1, indicating a negative result. Clearly MR0 > MR1, but because MCMP32 behaves as if it were MSUB32, "MCMP32 MR0, MR1" sets ZF=0 and NF=1. This means that MCMP32 cannot reliably compare arbitrary signed integer values if subtraction of those values would overflow. (Note: EQ and NEQ are OK.)

### Detection

The compiler detects comparisons that could have the problem and emits a warning for each one. It does not warn about every integer comparison: a comparison that can be shown to be safe produces no warning. For example, all comparisons to zero are safe. Comparisons between two short ints are also safe.

```
% cat foo.cla
int foo(int x, int y) { return x < y; }
% cl2000 foo.cla --cla_support
"foo.cla", line 1: warning #30013-D: Comparison operation uses integer comparison instruction, which does not operate properly for values that would overflow subtraction. Use --cla_signed_compare_workaround=on to have the compiler work around this issue.
```

[[y Implementation
Detection of the problem was implemented in C2000 compiler version 18.12.0.LTS
]]

### Workaround

For each integer comparison, we need either to ensure that MCMP32 operates correctly on the inputs or to compute the comparison in some other way.
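What "operates correctly" means can be checked with a small model. The following is a minimal host-side C sketch, not TI code. It assumes, as the manual's MSUB32 description suggests, that MCMP32 derives ZF and NF from the wrapped 32-bit result of MRa - MRb. The names `cla_flags`, `model_mcmp32`, `model_gt`, and `model_example` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two integer status flags (assumption: not TI code). */
typedef struct {
    int zf; /* zero flag */
    int nf; /* negative flag */
} cla_flags;

/* Assumed MCMP32 behavior: flags come from the wrapped 32-bit value of a - b. */
static cla_flags model_mcmp32(int32_t a, int32_t b)
{
    /* Subtract as unsigned so that wraparound is well defined in C. */
    uint32_t diff = (uint32_t)a - (uint32_t)b;
    cla_flags f;
    f.zf = (diff == 0u);
    f.nf = (int)(diff >> 31); /* sign bit of the wrapped result */
    return f;
}

/* GT condition from the Background table: ZF == 0 and NF == 0. */
static int model_gt(int32_t a, int32_t b)
{
    cla_flags f = model_mcmp32(a, b);
    return f.zf == 0 && f.nf == 0;
}

/* Usage example with the inputs from the Problem section. */
void model_example(void)
{
    int32_t mr0 = 1;
    int32_t mr1 = -2147483647; /* 0x80000001 */

    assert(mr0 > mr1);           /* mathematically, MR0 > MR1 */
    assert(!model_gt(mr0, mr1)); /* the modeled flags say it is not */
    assert(model_gt(5, 3));      /* nearby values compare correctly */
}
```

Under this model, the GT condition fails only when the true difference does not fit in 32 bits, which matches the behavior described in the Problem section.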
If we're going to use the MCMP32 instruction, we need to ensure that the inputs satisfy:

```
-2147483648 <= (int64_t)x - (int64_t)y
(int64_t)x - (int64_t)y <= 2147483647
```

![a graph](images/CGT-CLA-signed-comparison-problem-region.png)

Because unsigned integer comparison is implemented in terms of signed integer comparison, it too has the problem. To be safe, the inputs must satisfy:

```
-2147483648 <= (int64_t)(0x80000000^x) - (int64_t)(0x80000000^y)
(int64_t)(0x80000000^x) - (int64_t)(0x80000000^y) <= 2147483647
```

![a graph](images/CGT-CLA-unsigned-comparison-problem-region.png)

#### Manual Workaround

Suppose the compiler indicates there might be a problem with a particular comparison, perhaps:

`if (x < y)`

There are a few ways to work around the problem in the source code.

If you know that the integer values of the operands are such that the subtraction will not overflow, you can tell the compiler that this comparison is safe:

`if (__mlt(x, y))`

The "safe" comparison intrinsics are `__mlt`, `__mleq`, `__mgt`, and `__mgeq`, and the unsigned equivalents `__mltu`, `__mlequ`, `__mgtu`, and `__mgequ`.

If you know that the inputs will fit in a short, do this:

`if ((short)x < (short)y)`

or

`for (short i = 0; i < (short)y; i++)`

If you know your inputs will fit into float without losing precision, do this:

`if ((float)x < (float)y)`

or

`for (float i = 0; i < (float)y; i++)`

Because comparisons to zero cannot overflow, it might be better to rewrite loops as "down counters":

`for (int i = y-1; i >= 0; i--)`

#### Compiler Workaround

The compiler's strategy for working around this issue is to use the floating-point comparison instruction to detect and avoid situations where the integer comparison instruction will not compute the correct result. For example, given:

`x <= y`

the compiler will turn this into:

`(float)x < (float)y || (float)x == (float)y && (x <= y)`

[[y Implementation
This workaround is implemented in C2000 compiler releases 16.9.9.LTS and 18.1.5.LTS.
]]

But 32-bit floating-point does not have as much precision as 32-bit integer, so how does this work? It is true that casting x and y to float may lose precision and cause rounding, but even so `(float)x < (float)y` implies `x <= y`. To see that this is so, note that `x > y` implies `(float)x >= (float)y`, because conversion to float is monotonic: rounding can make two different integers compare equal, but it never reverses their order. So if `(float)x < (float)y` we have a quick answer (yes, x <= y).

What about the other cases? If `(float)x > (float)y`, the same argument implies `x > y`, so we again have a quick answer (no, x is not <= y). If `(float)x == (float)y`, it could be the case that the conversion to float introduced roundoff error, so we can't be sure whether x is greater than, less than, or equal to y. We can be sure, however, that x and y are close enough in value that the integer comparison instruction will not run afoul of the problem, so we can use it to directly test (x <= y).

[[y Code size impact
While conversion to float is a single instruction on CLA, there are a fair number of delay slots that must be filled, which typically means extra NOPs. If the compiler must use a floating-point comparison instruction to work around a suspect integer comparison, you can end up with a lot of extra instructions. Instead of one MCMP32 instruction, expect to see up to 22 instructions, including NOPs. This impacts both signed and unsigned integer comparison, but does not affect float comparisons. In practice, a lot of integer comparisons are safe (the compiler can be sure the inputs are fairly close together), but if your application uses a lot of 32-bit integer comparisons, you could see a significant increase in code size.
]]

<div id="footer"></div>