From add4406e90d4ff702a018c1df0b222eb77217abc Mon Sep 17 00:00:00 2001 From: Jerry Morrison Date: Mon, 10 Nov 2014 00:36:36 -0800 Subject: [PATCH 1/6] First draft of the latest "int rename" RFC. --- active/0000-int-name.md | 86 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 86 insertions(+) create mode 100644 active/0000-int-name.md diff --git a/active/0000-int-name.md b/active/0000-int-name.md new file mode 100644 index 00000000000..ef797fa10dd --- /dev/null +++ b/active/0000-int-name.md @@ -0,0 +1,86 @@ +- Start Date: 2014-11-09 +- RFC PR #: (leave this empty) +- Rust Issue #: (leave this empty) + +# Summary + +Rename the pointer-size integer types from `int` and `uint` to `index` and `uindex` to avoid misconceptions and misuses. They aren't the default types you're looking for. + +This RFC assumes a fixed-size integer type will become the type inference fallback and thus the language's "default integer type." See [RFC: Change integer fallback RFC to suggest `i32` instead of `int` as the fallback](https://github.com/rust-lang/rfcs/pull/452). That fixed-size type will be used heavily in tutorials and libraries. + + +# Motivation + + - To avoid programmer misconceptions and misuses about integer types. Target-dependent integers are for specific uses and should not look like the "default" types to use. + - To avoid a class of bugs when porting code to different size address spaces. + - To avoid excess performance costs when porting Rust code to larger address spaces. + - To resolve many discussions about these issues. + + +# Background + +The Rust language does array indexing using the smallest integer types that span the address space. For this purpose, Rust defines integer types, currently named `int` and `uint`, that are large enough to hold a pointer in the target environment -- `uint` for indexing and `int` for pointer differences. 
+ +(For memory safety, the memory allocator will limit each node to half of address space so any array index will fit in a signed, pointer-sized integer.) + +But contrary to expectations set by other programming languages, these are not the fastest, "native," register, C-sized, nor 32-bit integer types. + +Given the history, `int` and `uint` _look_ like default integer types, but a target-dependent size is not a good default. + +Using pointer-sized integers for computations that are not limited by memory produces code with overflow bugs (checked or unchecked) on different size targets, non-portable binary I/O, and excess performance costs. + +This RFC replaces [RFC: int/uint portability to 16-bit CPUs](https://github.com/rust-lang/rfcs/pull/161). + + +# Detailed design + +Rename these two types. The names `index` and `uindex` are meant to convey their intended use with arrays. Use them more narrowly for array indexing and related purposes. + + +# Drawbacks + + - Renaming `int`/`uint` requires changing a bunch of existing code. (The Rust Guide will change anyway, once the integer fallback type is chosen.) + - The new names are longer. + + +# Alternatives + +Alternative names: + + - `index` and `uindex`, named for their uses and preserving Rust's "i" and "u" integer prefixes. + - `intptr` and `uintptr`, [borrowing from C's](http://en.cppreference.com/w/cpp/types/integer) `intptr_t` and `uintptr_t`. These names are awkward by design. + - `isize` and `usize`, [borrowing from C's](http://en.cppreference.com/w/cpp/types/integer) `ssize_t` and `size_t` with Rust's "i/u" prefixes. But these types are defined as having the same number of bits as a pointer, not as a way of measuring sizes. A `usize` would be larger than needed for the largest memory node. + - `intps` and `uintps`. + - `PointerSizedInt` and `PointerSizedUInt`. + - ... + +The impact of not doing this: Portability bugs, peformance bugs, difficulties explaining the language, and recurring discussions about this. 
A possible impact on language adoption when people read warnings not to use `int`. + +Another alternative considered is a lint warning on every use of `int` or `uint` that's not directly related to array indexing. + + +# Unresolved questions + + - Change this before Rust 1.0? + + +# References + + - [Guide: what to do about int](https://github.com/rust-lang/rust/issues/15526) + - [If `int` has the wrong size …?](http://discuss.rust-lang.org/t/if-int-has-the-wrong-size/454) + - [integer type style guidelines](https://github.com/rust-lang/rust-guidelines/issues/24) + - [Encourage fixed-size integer](https://github.com/rust-lang/rust/issues/16446) + + - [Change integer fallback RFC to suggest `i32` instead of `int` as the fallback](https://github.com/rust-lang/rfcs/pull/452) + - [Restore int fallback](https://github.com/rust-lang/rust/issues/16968) + - [Restore int/f64 fallback for unconstrained literals](https://github.com/rust-lang/rfcs/pull/212) and [consider removing the fallback to int for integer inference](https://github.com/rust-lang/rust/issues/6023) + - [Specify that int and uint are at least 32 bits on every CPU architecture](https://github.com/rust-lang/rust/issues/14758) + - [RFC: rename `int` and `uint` to `intptr`/`uintptr`](https://github.com/rust-lang/rust/issues/9940) + - [Decide whether to keep pointer sized integers as the default](https://github.com/rust-lang/rust/issues/11831) + +Example `int`/`uint` portability bugs [listed by](https://github.com/rust-lang/rust/issues/16446#issuecomment-59621753) Mickaël Salaün: + + - [`std::num::pow`: exponent should not be a `uint`](https://github.com/rust-lang/rust/issues/16755) + - [Bitv uses architecture-sized uints for backing storage](https://github.com/rust-lang/rust/issues/16736) + - [libcollection : Switches from uint to u32 in BitV and BitVSet](https://github.com/rust-lang/rust/pull/18018) + - [uint -> u32](https://github.com/dwrensha/capnproto-rust/commit/87ab4ee0fc03939ef2a186274395c8c69cb6689c), 
[update for uint -> u32](https://github.com/dwrensha/capnp-rpc-rust/commit/b2e0c953f60b389afd884264ea53cdec7f4de7b3) From a2d2fb2052e49a41c2347e525e85edf82f4b3d1a Mon Sep 17 00:00:00 2001 From: Jerry Morrison Date: Mon, 10 Nov 2014 18:11:34 -0800 Subject: [PATCH 2/6] Link to RFC: Scoped attributes for checked arithmetic --- active/0000-int-name.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/active/0000-int-name.md b/active/0000-int-name.md index ef797fa10dd..bb410b8d8b9 100644 --- a/active/0000-int-name.md +++ b/active/0000-int-name.md @@ -27,7 +27,7 @@ But contrary to expectations set by other programming languages, these are not t Given the history, `int` and `uint` _look_ like default integer types, but a target-dependent size is not a good default. -Using pointer-sized integers for computations that are not limited by memory produces code with overflow bugs (checked or unchecked) on different size targets, non-portable binary I/O, and excess performance costs. +Using pointer-sized integers for computations that are not limited by memory produces code with overflow bugs ([checked or unchecked](https://github.com/rust-lang/rfcs/pull/146)) on different size targets, non-portable binary I/O, and excess performance costs. This RFC replaces [RFC: int/uint portability to 16-bit CPUs](https://github.com/rust-lang/rfcs/pull/161). 
From 38b8214b5fb0fb9f5b73e764f9dbfaa1ddeee02a Mon Sep 17 00:00:00 2001 From: Jerry Morrison <1fish2@users.noreply.github.com> Date: Wed, 12 Nov 2014 21:37:20 -0800 Subject: [PATCH 3/6] minor editing refinements --- active/0000-int-name.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/active/0000-int-name.md b/active/0000-int-name.md index bb410b8d8b9..d64a71918d3 100644 --- a/active/0000-int-name.md +++ b/active/0000-int-name.md @@ -19,7 +19,7 @@ This RFC assumes a fixed-size integer type will become the type inference fallba # Background -The Rust language does array indexing using the smallest integer types that span the address space. For this purpose, Rust defines integer types, currently named `int` and `uint`, that are large enough to hold a pointer in the target environment -- `uint` for indexing and `int` for pointer differences. +The Rust language does array indexing using the smallest integer types that span the address space. For this purpose, Rust defines two integer types that are large enough to hold a pointer in the target environment: currently named `uint` for indexing and `int` for pointer differences. (For memory safety, the memory allocator will limit each node to half of address space so any array index will fit in a signed, pointer-sized integer.) @@ -32,14 +32,14 @@ Using pointer-sized integers for computations that are not limited by memory pro This RFC replaces [RFC: int/uint portability to 16-bit CPUs](https://github.com/rust-lang/rfcs/pull/161). -# Detailed design +# Detailed Design Rename these two types. The names `index` and `uindex` are meant to convey their intended use with arrays. Use them more narrowly for array indexing and related purposes. # Drawbacks - - Renaming `int`/`uint` requires changing a bunch of existing code. (The Rust Guide will change anyway, once the integer fallback type is chosen.) + - Renaming `int`/`uint` requires changing a bunch of existing code. 
+ (The Rust Guide will also change once the integer type inference fallback is chosen.) - The new names are longer. @@ -54,17 +54,17 @@ Alternative names: - `PointerSizedInt` and `PointerSizedUInt`. - ... -The impact of not doing this: Portability bugs, peformance bugs, difficulties explaining the language, and recurring discussions about this. A possible impact on language adoption when people read warnings not to use `int`. +The impact of not doing this: Portability bugs, performance bugs, difficulties explaining the language, and recurring discussions about this. Also a possible impact on language adoption when people read warnings to be careful about using `int`. Another alternative considered is a lint warning on every use of `int` or `uint` that's not directly related to array indexing. -# Unresolved questions +# Unresolved Questions - Change this before Rust 1.0? -# References +# Discussion References - [Guide: what to do about int](https://github.com/rust-lang/rust/issues/15526) - [If `int` has the wrong size …?](http://discuss.rust-lang.org/t/if-int-has-the-wrong-size/454) - [integer type style guidelines](https://github.com/rust-lang/rust-guidelines/issues/24) From 92395b367578042774d34156d20e9195a001e4b3 Mon Sep 17 00:00:00 2001 From: Jerry Morrison <1fish2@users.noreply.github.com> Date: Thu, 13 Nov 2014 18:33:25 -0800 Subject: [PATCH 4/6] Feedback. First agree to rename then pick the names. Incorporate most of the feedback so far. The first aim is to decide to rename `int` and `uint`. The second step is to pick the specific names. I see the second step as usability design, not really bikeshedding, but it is second. A quick-and-dirty usability test could be done. (Although I do like the case for `iptr` and `uptr`.) TODO: Add usage examples. 
--- active/0000-int-name.md | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/active/0000-int-name.md b/active/0000-int-name.md index d64a71918d3..713d3f8e124 100644 --- a/active/0000-int-name.md +++ b/active/0000-int-name.md @@ -4,7 +4,7 @@ # Summary -Rename the pointer-size integer types from `int` and `uint` to `index` and `uindex` to avoid misconceptions and misuses. They aren't the default types you're looking for. +Rename the pointer-size integer types `int` and `uint` to avoid misconceptions and misuses. They aren't the default types you're looking for. Several candidate names are listed under **Alternatives**, below, starting with `iptr` and `uptr`. This RFC assumes a fixed-size integer type will become the type inference fallback and thus the language's "default integer type." See [RFC: Change integer fallback RFC to suggest `i32` instead of `int` as the fallback](https://github.com/rust-lang/rfcs/pull/452). That fixed-size type will be used heavily in tutorials and libraries. @@ -34,12 +34,17 @@ This RFC replaces [RFC: int/uint portability to 16-bit CPUs](https://github.com/ # Detailed Design -Rename these two types. The names `index` and `uindex` are meant to convey their intended use with arrays. Use them more narrowly for array indexing and related purposes. +Rename these two pointer-sized integer types. Decide on new names that convey their intended use with arrays rather than general-purpose integers. + +Update code and documentation to use pointer-sized integers more narrowly for array indexing and related purposes. Provide a deprecation period to carry out these updates. + +Rename the integer literal suffixes `i` and `u` to new names that suit the new type names (e.g. `iptr` and `uptr`). # Drawbacks - - Renaming `int`/`uint` requires changing a bunch of existing code. (The Rust Guide will also change once the integer type inference fallback is chosen.) 
+ - Renaming `int`/`uint` requires changing a bunch of existing code. On the other hand, this is an ideal opportunity to fix integer portability bugs. + - The Rust Guide also needs to change, but it'll mostly change for the integer type inference fallback type. - The new names are longer. @@ -47,9 +52,14 @@ Rename these two types. The names `index` and `uindex` are meant to convey their Alternative names: - - `index` and `uindex`, named for their uses and preserving Rust's "i" and "u" integer prefixes. - - `intptr` and `uintptr`, [borrowing from C's](http://en.cppreference.com/w/cpp/types/integer) `intptr_t` and `uintptr_t`. These names are awkward by design. - - `isize` and `usize`, [borrowing from C's](http://en.cppreference.com/w/cpp/types/integer) `ssize_t` and `size_t` with Rust's "i/u" prefixes. But these types are defined as having the same number of bits as a pointer, not as a way of measuring sizes. A `usize` would be larger than needed for the largest memory node. + - `iptr` and `uptr`, which refer directly to the (variable) *pointer* length just like `i32` refers to the length 32 bits. + - `index` and `uindex`, related to array indexing and preserving Rust's "i"/"u" integer prefixes, however `uindex` is the type used for indexing. (Is "index" too good of an identifier to sacrifice to a keyword?) + - `sindex` and `index`, since the unsigned type is the one used for indexing. + - `intptr` and `uintptr`, [borrowing from C's](https://en.wikipedia.org/wiki/C_data_types#Fixed-width_integer_types) `intptr_t` and `uintptr_t`. These names are awkward by design. + - `isize` and `usize`, [borrowing from C's](https://en.wikipedia.org/wiki/C_data_types#Size_and_pointer_difference_types) `ssize_t` and `size_t` with Rust's "i/u" prefixes. But these two types are defined as having the same number of bits as a pointer, that is in terms of the address space size, not for measuring objects. 
An unsigned pointer-sized integer can count at least twice the number of bytes in the maximum memory node (which is limited by the signed pointer-sized integer) and yet it artificially limits the size in elements of a bit vector. + - `index` and `ptrdiff`. + - `offset` and `size`. + - `ioffset` and `ulength` or `ulen` or `uaddr`. - `intps` and `uintps`. - `PointerSizedInt` and `PointerSizedUInt`. - ... From 79c867ad9582cfb3b1139cc8119efa0f36fc3811 Mon Sep 17 00:00:00 2001 From: Jerry Morrison <1fish2@users.noreply.github.com> Date: Wed, 19 Nov 2014 15:47:48 -0800 Subject: [PATCH 5/6] Add imem/umem. Simplify/clarify isize/usize info. --- active/0000-int-name.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/active/0000-int-name.md b/active/0000-int-name.md index 713d3f8e124..b00e0278267 100644 --- a/active/0000-int-name.md +++ b/active/0000-int-name.md @@ -38,7 +38,7 @@ Rename these two pointer-sized integer types. Decide on new names that convey th Update code and documentation to use pointer-sized integers more narrowly for array indexing and related purposes. Provide a deprecation period to carry out these updates. -Rename the integer literal suffixes `i` and `u` to new names that suit the new type names (e.g. `iptr` and `uptr`). +Rename the integer literal suffixes `i` and `u` to new names that suit the new type names. Examples: `32uptr`, `32usize`, or `32umem`, depending on the new names selected. # Drawbacks @@ -56,7 +56,8 @@ Alternative names: - `index` and `uindex`, related to array indexing and preserving Rust's "i"/"u" integer prefixes, however `uindex` is the type used for indexing. (Is "index" too good of an identifier to sacrifice to a keyword?) - `sindex` and `index`, since the unsigned type is the one used for indexing. - `intptr` and `uintptr`, [borrowing from C's](https://en.wikipedia.org/wiki/C_data_types#Fixed-width_integer_types) `intptr_t` and `uintptr_t`. These names are awkward by design. 
- - `isize` and `usize`, [borrowing from C's](https://en.wikipedia.org/wiki/C_data_types#Size_and_pointer_difference_types) `ssize_t` and `size_t` with Rust's "i/u" prefixes. But these two types are defined as having the same number of bits as a pointer, that is in terms of the address space size, not for measuring objects. An unsigned pointer-sized integer can count at least twice the number of bytes in the maximum memory node (which is limited by the signed pointer-sized integer) and yet it artificially limits the size in elements of a bit vector. + - `isize` and `usize`, [borrowing from C's](https://en.wikipedia.org/wiki/C_data_types#Size_and_pointer_difference_types) `ssize_t` and `size_t` with Rust's "i/u" prefixes, indicating integers large enough to hold the *size-in-bytes* of a memory object, and thus ([as in C++](http://en.cppreference.com/w/cpp/types/size_t)) ideal for indexing an in-memory array of elements at least 1 byte each. + - `imem` and `umem`, defined as integers large enough to address any memory the program can address. Suits both indices and sizes (unlike `uptr`, `uindex`, and `usize`). - `index` and `ptrdiff`. - `offset` and `size`. - `ioffset` and `ulength` or `ulen` or `uaddr`. From 180cda6a41ce6320f3f6668651872bac0bd882ac Mon Sep 17 00:00:00 2001 From: Jerry Morrison <1fish2@users.noreply.github.com> Date: Thu, 20 Nov 2014 14:30:56 -0800 Subject: [PATCH 6/6] explain imem/umem as "memory numbers" * Add short suffixes `im`/`um`. * Link to the language ref section on machine-dependent integer types. 
--- active/0000-int-name.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/active/0000-int-name.md b/active/0000-int-name.md index b00e0278267..8a10f59b934 100644 --- a/active/0000-int-name.md +++ b/active/0000-int-name.md @@ -19,11 +19,13 @@ This RFC assumes a fixed-size integer type will become the type inference fallba # Background -The Rust language does array indexing using the smallest integer types that span the address space. For this purpose, Rust defines two integer types that are large enough to hold a pointer in the target environment: currently named `uint` for indexing and `int` for pointer differences. +The Rust language does array indexing using the smallest integer types that span the address space. For this purpose, Rust defines two [machine-dependent integer types](http://doc.rust-lang.org/reference.html#machine-dependent-integer-types) that have the same number of bits as the target platform's pointer type. They're currently named `uint` for indexing and `int` for pointer differences. -(For memory safety, the memory allocator will limit each node to half of address space so any array index will fit in a signed, pointer-sized integer.) +(For memory safety, the language sets the theoretical upper bound on object and array size to the maximum `int` value.) -But contrary to expectations set by other programming languages, these are not the fastest, "native," register, C-sized, nor 32-bit integer types. +These types are useful for "memory numbers": indices, counts, sizes, offsets, etc. The problem is their names. + +Contrary to expectations set by other programming languages, these are not the fastest, "native," register, C-sized, "word" sized, nor 32-bit integer types. Given the history, `int` and `uint` _look_ like default integer types, but a target-dependent size is not a good default. 
@@ -34,11 +36,11 @@ This RFC replaces [RFC: int/uint portability to 16-bit CPUs](https://github.com/ # Detailed Design -Rename these two pointer-sized integer types. Decide on new names that convey their intended use with arrays rather than general-purpose integers. +Rename these two pointer-sized integer types. Decide on new names that convey their intended memory-scale uses rather than general-purpose integers. Update code and documentation to use pointer-sized integers more narrowly for array indexing and related purposes. Provide a deprecation period to carry out these updates. -Rename the integer literal suffixes `i` and `u` to new names that suit the new type names. Examples: `32uptr`, `32usize`, or `32umem`, depending on the new names selected. +Rename the integer literal suffixes `i` and `u` to new names that suit the new type names. The suffix could be the same as the type, e.g. `32umem`, `32uptr`, or `32usize` (depending on the new names selected) or a shorter form, e.g. `32um` and `100im`. # Drawbacks @@ -56,8 +58,8 @@ Alternative names: - `index` and `uindex`, related to array indexing and preserving Rust's "i"/"u" integer prefixes, however `uindex` is the type used for indexing. (Is "index" too good of an identifier to sacrifice to a keyword?) - `sindex` and `index`, since the unsigned type is the one used for indexing. - `intptr` and `uintptr`, [borrowing from C's](https://en.wikipedia.org/wiki/C_data_types#Fixed-width_integer_types) `intptr_t` and `uintptr_t`. These names are awkward by design. - - `isize` and `usize`, [borrowing from C's](https://en.wikipedia.org/wiki/C_data_types#Size_and_pointer_difference_types) `ssize_t` and `size_t` with Rust's "i/u" prefixes, indicating integers large enough to hold the *size-in-bytes* of a memory object, and thus ([as in C++](http://en.cppreference.com/w/cpp/types/size_t)) ideal for indexing an in-memory array of elements at least 1 byte each. 
- - `imem` and `umem`, defined as integers large enough to address any memory the program can address. Suits both indices and sizes (unlike `uptr`, `uindex`, and `usize`). + - `isize` and `usize`, [borrowing from C's](https://en.wikipedia.org/wiki/C_data_types#Size_and_pointer_difference_types) `ssize_t` and `size_t` with Rust's "i/u" prefixes, indicating integers large enough to hold the *size-in-bytes* of a memory object, and thus ([as in C++](http://en.cppreference.com/w/cpp/types/size_t)) the right range to index an in-memory array of elements at least 1 byte each. + - `imem` and `umem`, meaning *"memory numbers."* These type names are suitable for indexes, counts, offsets, and sizes (unlike `uptr`, `uindex`, and `usize`). As memory numbers, it makes sense that they're sized to fit the address space. - `index` and `ptrdiff`. - `offset` and `size`. - `ioffset` and `ulength` or `ulen` or `uaddr`.