Closed Bug 555798 Opened 14 years ago Closed 14 years ago

create global helpers for integer ops that handle overflow

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: vlad, Assigned: bjacob)

References

Details

(Whiteboard: [sg:want P1])

Attachments

(4 files, 15 obsolete files)

8.22 KB, text/plain
vlad
: review+
Details
38.27 KB, patch
vlad
: review+
jrmuizel
: review+
Details | Diff | Splinter Review
5.00 KB, patch
neil
: review+
Details | Diff | Splinter Review
9.40 KB, patch
Details | Diff | Splinter Review
It would be handy if we had some global helpers somewhere for a handful of safe integer ops that checked for overflow.  Something like:

bool safe_ab(const int32& a, const int32& b, int32& out);
bool safe_a_plus_b(const int32& a, const int32& b, int32& out);
bool safe_ab_plus_c(const int32& a, const int32& b, const int32& c, int32& out);

and similar for uint32.
I think it might make more sense to have a C++ safe-int class that represents "integer OR error" so you can do a bunch of calculations and then check for error at the end. There would be a "get" method that gives you the result as an int, and which aborts if there's an error (so you should check for error before calling it). Similar to the way FP NaNs work.
Sure, that would work as well, and would be easier to use.  I see CERT has an IntegerLib at http://www.cert.org/secure-coding/integralsecurity.html
Attached patch A prototype implementation (obsolete) — Splinter Review
Here's a prototype implementation. It is basically a c++ wrapper around IntegerLib and is used as follows:

safe_int<int> a(5);
safe_int<int> b(7);

a *= b;

if (a.overflow) {
   handle_errors;
}
putting the overflow check behind an inline function would be helpful for future evolution I think
Jeff: I want!

roc: what do you mean?
I mean if you access a .overflow field then it always has to be a field, but if you use an inline function you can change it to use a bit mask or a comparison to a magic value or something else
Gerv,

The license text for IntegerLib is the following which I assume should be fine:

 * Copyright (c) 2005 Carnegie Mellon University.
 * All rights reserved.

 * Permission to use this software and its documentation for any purpose is hereby granted, 
 * provided that the above copyright notice appear and that both that copyright notice and 
 * this permission notice appear in supporting documentation, and that the name of CMU not 
 * be used in advertising or publicity pertaining to distribution of the software without 
 * specific, written prior permission.
 * 
 * CMU DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES 
 * OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL CMU BE LIABLE FOR ANY SPECIAL, INDIRECT OR 
 * CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, 
 * WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, RISING OUT OF OR IN 
 * CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 

It would be nice to keep our modifications under the same license. How should this be done? Should we add Copyright (c) 2010 Mozilla Foundation and copies of the rights and warranty texts with CMU replaced with Mozilla Foundation?
Attached patch an implementation (obsolete) — Splinter Review
Here's an implementation based on templates.

it supports all types from 8bit to 64bit.
it supports both signed and unsigned.
it is only ~250 LOC.
it implements pretty much the same API as Jeff's original proposal.
Attachment #453887 - Flags: review?(jmuizelaar)
Attached patch test for safeint implementation (obsolete) — Splinter Review
This is a pretty thorough test.
Attachment #453888 - Flags: review?(jmuizelaar)
some more notes:

 - it doesn't carry significant code from IntegerLib. The only thing it uses from it is the formulas themselves, but those can be found at thousands of places on the Internet. So, credit them informally, but no need to consider copyright and licenses. And it's too generic code to be patent-sensitive.

 - it doesn't allow any implicit conversion to/from plain integers. If we had explicit conversion operators in C++98, we could consider this, but since we don't (they're coming in C++0x), it would be unsafe, so the API requires one to call .value() to get the integer value; and from there it seemed to make sense not to allow any implicit conversion, so the ctor is explicit, and safe_int*int is not allowed.
Huh, I didn't see that this had been reported by Vlad.

Jeff/Vlad, feel free to reassign the reviewing to Vlad :-)
Still don't know whether to ask Jeff or Vlad for review :-)

This new version has less template testosterone, since the goal is to get this accepted ;-) and NSPR defines anyway all integer types for us (PRInt64...)

It also adds unary operator-(), and simplifies a few more things.
Attachment #453887 - Attachment is obsolete: true
Attachment #453965 - Flags: review?(vladimir)
Attachment #453887 - Flags: review?(jmuizelaar)
This updated test is more exhaustive.
Attachment #453888 - Attachment is obsolete: true
Attachment #453966 - Flags: review?(vladimir)
Attachment #453888 - Flags: review?(jmuizelaar)
Attached file test for safeint implementation, v3 (obsolete) —
This new test version checks that invalidity is correctly preserved, so that we don't have the bad surprise to figure some day that x+y*z-t is flagged as valid even though z isn't...!
Assignee: nobody → bjacob
Attachment #453966 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #453968 - Flags: review?(vladimir)
Attachment #453966 - Flags: review?(vladimir)
(In reply to comment #9)
> The license text for IntegerLib is the following which I assume should be fine:

Are we still using this? If so, yes, that licence is fine but needs to be added to about:licence. Please file a separate bug for that in mozilla.org / Licensing.

> It would be nice to keep our modifications under the same license. How should
> this be done?

The normal method is just to make modifications and leave the licence as-is :-)

Gerv
(In reply to comment #17)
> (In reply to comment #9)
> > The license text for IntegerLib is the following which I assume should be fine:
> 
> Are we still using this?

As mentioned in comment 12, my version doesn't use code from IntegerLib except for the formulas themselves, the only nontrivial 2 lines of code deriving from IntegerLib code are:

the signed addition check:
line 73:  bool( (((x+y)^x)&((x+y)^y)) >> (sizeof(T)*8-1) )

the signed substraction check:
line 82:  bool( ((x^y)&((x-y)^x)) >> (sizeof(T)*8-1) )

Since these are only generic formulas, and equivalent formulas can be found at other places on the internet, for example:
https://www.securecoding.cert.org/confluence/display/seccode/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow

I don't really think that these formulas are subject to copyright. However, talking about this really showed a simplification so I'm going to post a new version using simpler formulas, making them really different from IntegerLib code.

For the multiplication checks, we really don't use any significant IntegerLib code. I started from that code and "digested" it, and it completely breaks down to trivial things that I rewrote "in my own idiom".
Attached file implementation v3
New in this version:
 - the last formulas copied from IntegerLib have been "digested" i.e. only the basic idea remains, and a comment explains what that idea is, it's really very simple, e.g. for addition:

// addition is valid if the sign of x+y is equal to either that of x
// or that of y
   has_sign_bit<T>(binary_complement<T>(((x+y)^x) & ((x+y)^y)))

- we are now storing m_is_valid as a T, not as a bool, and are doing bit-wise operations instead of logic operations, this should reduce the number of integer conversions and give better code.
Attachment #453965 - Attachment is obsolete: true
Attachment #454018 - Flags: review?(vladimir)
Attachment #453965 - Flags: review?(vladimir)
OK, then there is no licensing issue :-)

Gerv
Attachment #454018 - Attachment is patch: false
Attachment #453968 - Attachment is patch: false
Comment on attachment 454018 [details]
implementation v3

This looks great to me; would like jeff to look it over as well, if he hasn't looked at the final version.

I suggest that this should live in xpcom/ds/SafeInt.h, inside the mozilla namespace.  Should probably name the main class SafeInt instead of safeint too, just for consistency.  Also needs a MPL/GPL/LGPL tri-license header, grab one from an existing file.
Attachment #454018 - Flags: review?(vladimir)
Attachment #454018 - Flags: review?(jmuizelaar)
Attachment #454018 - Flags: review+
Oh -- doesn't this need to implement division?
No it currently doesn't, but Jeff asked for it too and it's very easy to implement, so let's do it. FYI (looking at IntegerLib) there is only one nontrivial thing to check in division, besides div by zero, it's that min_value / -1 is invalid because |min_value| > max_value.
> No it currently doesn't

I mean, it currently doesn't implement division --- but yes this would be nice to have.
Ok great -- want to do division and do the boilerplate stuff like header, Makefile.in change etc (follow what TimeStamp.h does), and I'll r+ a new patch?
Attached file implementation, v4 (obsolete) —
OK, here's a new version for Jeff to review. Changes since v3:
 - supports division
 - doesn't rely on std::numeric_limits anymore, because that may not be supported on actual PRInt... types (indeed, depending on the platform, they may be defined as compiler built-in types).
 - actually blocks usage on unsupported types. V3 was only checking that a type of same size and same signedness was supported, but it didn't check that it was actually the same type. That was dangerous, because it wouldn't catch usage of safeint<some_broken_pseudo_int_type>.
Attachment #454018 - Attachment is obsolete: true
Attachment #454321 - Flags: superreview?(jmuizelaar)
Attachment #454321 - Flags: review?
Attachment #454018 - Flags: review?(jmuizelaar)
Attachment #454018 - Attachment is obsolete: false
Attachment #454321 - Flags: superreview?(jmuizelaar)
Attachment #454321 - Flags: superreview?
Attachment #454321 - Flags: review?(jmuizelaar)
Attachment #454321 - Flags: review?
Attachment #454321 - Flags: superreview?
Attachment #454321 - Attachment description: test for safeint implementation, v4 → implementation, v4
Attached file test for safeint implementation, v4 (obsolete) —
Here is the corresponding updated test.
Attachment #453968 - Attachment is obsolete: true
Attachment #454322 - Flags: review?(jmuizelaar)
Attachment #453968 - Flags: review?(vladimir)
(In reply to comment #25)
> Ok great -- want to do division and do the boilerplate stuff like header,
> Makefile.in change etc (follow what TimeStamp.h does), and I'll r+ a new patch?

Ah ok, will do that next week.
Attachment #454321 - Attachment is patch: false
Attached patch Add SafeInt class (obsolete) — Splinter Review
Here's the patch!

Changes since v4:
 - move to xpcom/ds and xpcom/tests
 - adapt to xpcom coding style
 - After discussion with Jeff, we decided to make it allow implicit (plain integer) -> SafeInt conversion, as well as mixing plain ints with SafeInt in arithmetic expressions (the result type is the SafeInt type; different SafeInt types cannot be mixed). Jeff: in the end I found a simple way of allowing all cases including e.g. 2+x.
 - unit tests improvements.

Vlad: in the Makefile.in, TimeStamp was inside of a ifndef MOZ_ENABLE_LIBXUL, so here it was not being built. So I added my test outside of that ifndef.
Attachment #444455 - Attachment is obsolete: true
Attachment #444456 - Attachment is obsolete: true
Attachment #454321 - Attachment is obsolete: true
Attachment #454322 - Attachment is obsolete: true
Attachment #454670 - Flags: review?(vladimir)
Attachment #454670 - Flags: review?(jmuizelaar)
Attachment #454321 - Flags: review?(jmuizelaar)
Attachment #454322 - Flags: review?(jmuizelaar)
Ah yes, and also, more changes since v4:
 - refactors our code handling the different integer types, it's saner now, and more extensible.
 - the safeguard against using SafeInt on an unsupported type has been redone, it no longer adds a dummy template parameter to SafeInt and gives much saner compile errors.
Attached patch Add SafeInt class, updated (obsolete) — Splinter Review
Minor update:
 - let the test return g_tests_failed > 0
 - add SafeInt.h to the Makefile.in
Attachment #454670 - Attachment is obsolete: true
Attachment #454729 - Flags: review?(vladimir)
Attachment #454729 - Flags: review?(jmuizelaar)
Attachment #454670 - Flags: review?(vladimir)
Attachment #454670 - Flags: review?(jmuizelaar)
Attached patch Add SafeInt class, updated again (obsolete) — Splinter Review
Here's yet another update, fixing the construction-from-plain-integer problem.

The problem was: should the SafeInt<T> constructor taking a T be explicit or not?

 * if it was explicit, doing SafeInt<PRInt8> x(1); was not allowed, as 1 is not of type PRInt8

 * if it was not explicit, doing SafeInt<PRUint8> x(-1) was allowed and was dangerous, as -1 would be implicitly casted to PRUint8 without generating an error.


The new version fixes this problem by making this constructor non-explicit BUT letting its argument be of a separate type U (separate template param) so that no implicit conversion takes place before the constructor is called, so that the constructor can explicitly do a is_in_range<T>(x) check. This is completely flexible and safe at the same time!
Attachment #454729 - Attachment is obsolete: true
Attachment #454831 - Flags: review?(vladimir)
Attachment #454831 - Flags: review?(jmuizelaar)
Attachment #454729 - Flags: review?(vladimir)
Attachment #454729 - Flags: review?(jmuizelaar)
Oops, the documentation for the constructor was outdated. This update only fixes documentation.
Attachment #454831 - Attachment is obsolete: true
Attachment #454870 - Flags: review?(vladimir)
Attachment #454870 - Flags: review?(jmuizelaar)
Attachment #454831 - Flags: review?(vladimir)
Attachment #454831 - Flags: review?(jmuizelaar)
After discussion with Jeff and Ehsan:
 - rename SafeInt to CheckedInt
     (it's more precise and SafeInt is the name used by Microsoft already)
 - methods name changes:
     Value()  ---> value()
     IsValid() ---> valid()
Attachment #454870 - Attachment is obsolete: true
Attachment #454952 - Flags: review?(vladimir)
Attachment #454952 - Flags: review?(jmuizelaar)
Attachment #454870 - Flags: review?(vladimir)
Attachment #454870 - Flags: review?(jmuizelaar)
Comment on attachment 454952 [details] [diff] [review]
Add CheckedInt class

I only looked quickly and at this point care more about the external api then anything else. It all seems sane enough :)
Attachment #454952 - Flags: review?(jmuizelaar) → review+
Comment on attachment 454952 [details] [diff] [review]
Add CheckedInt class

yup, looks good!
Attachment #454952 - Flags: review?(vladimir) → review+
OK great! This is on the tryserver and I am preparing a patch to make WebGL use it.
http://hg.mozilla.org/mozilla-central/rev/c6b9defb1972
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Comment on attachment 454952 [details] [diff] [review]
Add CheckedInt class

>+template<typename T> struct is_unsupported_type { enum { answer = 0 }; };
>+template<> struct is_unsupported_type<unsupported_type> { enum { answer = 1 }; };
You only use this in one spot, and you then go and use ? 0 : 1 to flip the value; why not change this to struct is_supported_type instead?

>+        is_signed = (T(-1) > T(0)) ? 0 : 1
<= also works.

>+        return (x <= integer_traits<T>::max()) &
>+               (x >= integer_traits<T>::min());
Did you mean && ?
[Having looked at the rest of the patch I see you combining "bool" variables with & but I know there's a difference here. Your code compiles as
(x <= integer_traits<T>::max() ? 1 : 0)&(x >= integer_traits<T>::min() ? 1 : 0)
but using && it would compile as
(x <= integer_traits<T>::max() ? (x >= integer_traits<T>::min() ? 1 : 0) : 0)]

>+    /** Constructs a valid checked integer with uninitialized value */
>+    CheckedInt() : mIsValid(1)
Uninitialized values are valid?

>+        T result = -value();
>+        /* give the compiler a good chance to perform RVO */
>+        return CheckedInt(result,
>+                       mIsValid & CheckedInt_internal::is_sub_valid(T(0), value(), result));
It looks strange that you use value() to access mValue but you access mIsValid directly. (And you are consistently inconsistent, too!)
[Also your indentation happens to be off in this line.]
(In reply to comment #39)
> Comment on attachment 454952 [details] [diff] [review]
> Add CheckedInt class
> 
> >+template<typename T> struct is_unsupported_type { enum { answer = 0 }; };
> >+template<> struct is_unsupported_type<unsupported_type> { enum { answer = 1 }; };
> You only use this in one spot, and you then go and use ? 0 : 1 to flip the
> value; why not change this to struct is_supported_type instead?

I thought about this but on the other hand, this makes it more explicit that this is just comparing for equality with unsupported_type.

> 
> >+        is_signed = (T(-1) > T(0)) ? 0 : 1
> <= also works.

Sure...

> 
> >+        return (x <= integer_traits<T>::max()) &
> >+               (x >= integer_traits<T>::min());
> Did you mean && ?

Tough one. I really meant & because I want my boolean variables here to be of type T, not bool, in order to limit the number of type conversions. But >= and <= return bool, so I guess I should rather write:
  T(x <= integer_traits<T>::max()) &
  T(x >= integer_traits<T>::min());
and count on the compiler to be smart enough to optimize <= and >= to stay within type T. Or perhaps I should optimize that manually by doing substractions and checking the sign bit. The point is that no branching is needed here.

> [Having looked at the rest of the patch I see you combining "bool" variables
> with & but I know there's a difference here. Your code compiles as
> (x <= integer_traits<T>::max() ? 1 : 0)&(x >= integer_traits<T>::min() ? 1 : 0)
> but using && it would compile as
> (x <= integer_traits<T>::max() ? (x >= integer_traits<T>::min() ? 1 : 0) : 0)]

I know, and I really don't want that, I would like to limit the number of branchings, by doing as much as possible by bit operations on type T.

> 
> >+    /** Constructs a valid checked integer with uninitialized value */
> >+    CheckedInt() : mIsValid(1)
> Uninitialized values are valid?

We sure can initialize by zero instead, it doesn't cost much compared to the rest. I would rather not take the route of considering uninitialized as invalid as that would dilute the meaning of invalid.

> 
> >+        T result = -value();
> >+        /* give the compiler a good chance to perform RVO */
> >+        return CheckedInt(result,
> >+                       mIsValid & CheckedInt_internal::is_sub_valid(T(0), value(), result));
> It looks strange that you use value() to access mValue but you access mIsValid
> directly. (And you are consistently inconsistent, too!)
> [Also your indentation happens to be off in this line.]

Yes, good point, moving to using mValue.

(patch coming).
It turns out that GCC 4.4 / x86-64 is much more clever than us. First, it realizes that since the conditions here have no side effect, & and && are logically equivalent. Second, regardless of & vs && and regardless of casting to integer type or not, it just generates this super-optimized assembly (here case where x is int and T is short:)

	addl	$32768, %edi
	xorl	%eax, %eax
	cmpl	$65535, %edi
	setbe	%al
	ret

i.e. it realizes that by shifting the problem (addl) it can turn this into an unsigned int range check, i.e. only one comparison needed; and it does by itself optimize this comparison by just checking the sign bit.
This patch:
 * removes the "optimizations" that were seen to be futile in previous comment
 * makes default ctor init by zero
 * makes the code use mValue member instead of value()

Notice that another patch is coming in bug 601535.
Attachment #481467 - Flags: review?(neil)
(In reply to comment #40)
> (In reply to comment #39)
> > why not change this to struct is_supported_type instead?
> I thought about this but on the other hand, this makes it more explicit that
> this is just comparing for equality with unsupported_type.
"twice_bigger_type_is_supported" confused me here.

> > >+        return (x <= integer_traits<T>::max()) &
> > >+               (x >= integer_traits<T>::min());
> > Did you mean && ?
> The point is that no branching is needed here.
Sorry, I forgot that we now have instructions such as setbe.

> > >+    /** Constructs a valid checked integer with uninitialized value */
> > >+    CheckedInt() : mIsValid(1)
> > Uninitialized values are valid?
> I would rather not take the route of considering uninitialized as invalid
> as that would dilute the meaning of invalid.
I can't see how you can claim that uninitialized is valid.
Comment on attachment 481467 [details] [diff] [review]
remove futile optimizations, use mValue member, and init by zero

>diff --git a/gfx/angle/src/compiler/util.h b/gfx/angle/src/compiler/util.h
Some other patch?

>         T result = -value();
Ironically you didn't change the instance that I quoted ;-) In fact you missed at least one other value(), maybe you searched for .value() ?

>-                         lhs.mIsValid & rhs.mIsValid & is_op_valid);
>+                         lhs.mIsValid && rhs.mIsValid && is_op_valid);
As I mentioned, I know why you wanted to use & here, so I went back and edited my original comment, and now I don't have any objection to &s in this code.
Attachment #481467 - Flags: review?(neil)
(In reply to comment #44)
> Comment on attachment 481467 [details] [diff] [review]
> remove futile optimizations, use mValue member, and init by zero
> 
> >diff --git a/gfx/angle/src/compiler/util.h b/gfx/angle/src/compiler/util.h
> Some other patch?

Yes, sorry!

> 
> >         T result = -value();
> Ironically you didn't change the instance that I quoted ;-) In fact you missed
> at least one other value(), maybe you searched for .value() ?

That's exactly what I did... should be fixed in the new patch.

> 
> >-                         lhs.mIsValid & rhs.mIsValid & is_op_valid);
> >+                         lhs.mIsValid && rhs.mIsValid && is_op_valid);
> As I mentioned, I know why you wanted to use & here, so I went back and edited
> my original comment, and now I don't have any objection to &s in this code.

But actually, _I_ am now convinced that this change (& ---> &&) is useful: it makes the code more explicit (you wouldn't have had to ask a question if it had been like that) while not making any difference in the resulting executable.
Comment on attachment 481567 [details] [diff] [review]
v2: remove futile optimizations, use mValue member, and init by zero

>-        return (x <= integer_traits<T>::max_value()) &
>+        return (x <= integer_traits<T>::max_value()) &&
>                (x >= integer_traits<T>::min_value());
[All the (somewhat old) compilers I tried do better with & than &&...]

> {                                                                     \
>-    T x = lhs.value();                                                \
>-    T y = rhs.value();                                                \
>+    T x = lhs.mValue;                                                \
>+    T y = rhs.mValue;                                                \
>     T result = x OP y;                                                \
>     T is_op_valid                                                     \
>         = CheckedInt_internal::is_##NAME##_valid(x, y, result);       \
>     /* give the compiler a good chance to perform RVO */              \
>     return CheckedInt<T>(result,                                      \
>-                         lhs.mIsValid & rhs.mIsValid & is_op_valid);  \
>+                         lhs.mIsValid && rhs.mIsValid && is_op_valid);  \
[Oops, the \s don't line up any more...]
Attachment #481567 - Flags: review?(neil) → review+
(In reply to comment #47)
> Comment on attachment 481567 [details] [diff] [review]
> v2: remove futile optimizations, use mValue member, and init by zero
> 
> >-        return (x <= integer_traits<T>::max_value()) &
> >+        return (x <= integer_traits<T>::max_value()) &&
> >                (x >= integer_traits<T>::min_value());
> [All the (somewhat old) compilers I tried do better with & than &&...]

Ah! Thanks for the information. & it is, then.
Comment on attachment 481567 [details] [diff] [review]
v2: remove futile optimizations, use mValue member, and init by zero

a=me as long as it passes try
Attachment #481567 - Flags: approval2.0+
Whiteboard: [sg:want P1]
This fixes compilation of CheckedInt<T> where T is a standard integer type like |long| that doesn't match any of the PRInt... types.

Not asking for review for now as I don't want to disturb people in these times of fixing hard-blockers.
updated: keep generating good compile errors when using unsupported type.
Attachment #503866 - Attachment is obsolete: true
please create a new bug for the above fixes -- this one's already been fixed :)  can mark the new one as blocking this one.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: