Basic Slicing and Indexing
Extracting a sub-matrix from a matrix, or indexing a tensor down to a lower-dimensional sub-tensor for further computation, is a very common operation.

RSTSR provides most of the functionality that NumPy calls "basic indexing", which returns a tensor view instead of an owned tensor. Through this mechanism, most tensor extraction operations can be performed without copying memory. For large tensors, all basic slicing and indexing operations are cheap compared to memory assignment and tensor arithmetic.
Due to language limitations, indexing by brackets `[]` in Rust can only return a reference to the underlying data `&T`, so it is technically not possible to return a tensor view through `[]`. In RSTSR, elementwise indexing by `[]` returns a reference to an element `&T`, and only when the data is stored as a `Vec<T>`. Usage of `[]` indexing is therefore quite limited.

However, it is possible to obtain a sub-tensor view `TensorView` (or `TensorMut`) by using functions to index and slice.
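For example, the following minimal sketch (assuming the same prelude and setup as the other examples on this page) contrasts the two behaviors:

```rust
let a = rt::arange(24).into_shape([4, 3, 2]);

// bracket indexing gives back an element (through `&T`), not a tensor
let elem = a[[0, 1, 1]];
println!("{:}", elem); // output: 3

// function-style indexing gives a tensor view (`TensorView`); no data is copied
let view = a.slice(0);
println!("{:}", view);
// output (a 3x2 view of the first block):
// [[ 0 1]
//  [ 2 3]
//  [ 4 5]]
```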
The most important functions and macros for slicing are:
- `slice` (equivalently `i`): returns a tensor view given slicing parameters;
- `slice_mut` (equivalently `i_mut`): returns a mutable tensor view given slicing parameters;
- `slice!((start, ) stop (, step))`: generates a slice configuration, similar to Python's built-in `slice` function;
- `s![]`: generates slicing parameters (useful when different kinds of slicing and indexing occur at the same time); in most scenarios this macro can be substituted by a tuple of different types.
Note that the macro `slice!` is different from the function `slice`.
If you are not comfortable using both the function `slice` and the macro `slice!` (such as `tensor.slice(slice!(1, 5, 2))`), you can use the equivalent function `i` to perform tensor indexing and slicing (such as `tensor.i(slice!(1, 5, 2))`).
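For instance, the two calls below select the same elements (a minimal sketch, assuming the same prelude and setup as the other examples on this page):

```rust
let a = rt::arange(24);

// `slice` and `i` are equivalent ways to index and slice
let b1 = a.slice(slice!(1, 5, 2));
let b2 = a.i(slice!(1, 5, 2));
println!("{:}", b1); // output: [ 1 3]
println!("{:}", b2); // output: [ 1 3]
```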
The clashing names of these functions may seem unfortunate, but each follows an existing convention:
- function `slice` comes from the Rust crate `ndarray`;
- function `i` comes from the Rust crate `candle`;
- macro `slice!` comes from Python's built-in `slice` function.
Note that we have not implemented advanced indexing. Advanced indexing mainly means indexing by an integer tensor, by a boolean tensor, or by an index list. These are well covered in NumPy, but are difficult for RSTSR. In most cases, advanced indexing requires (or is more efficient with) an explicit memory copy. We will pursue some of the advanced indexing features in the future.
Slicing in RSTSR always generates dynamic dimensions.

Please note that slicing in RSTSR always produces a dynamic-dimension (`IxD`) tensor, instead of a fixed-dimension one (`Ix1` for 1-D, `Ix2` for 2-D, etc.). This is a step back compared to `ndarray`, which has a more sophisticated macro system to handle fixed-dimension slicing.
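For instance (a small sketch, assuming the same prelude and setup as the other examples; the exact printed type name depends on the data type and device):

```rust
let a = rt::arange(24).into_shape([4, 3, 2]);

// `b` has two dimensions in shape, but its dimension type is the dynamic `IxD`
let b = a.slice(0);
println!("{:?}", b.layout());
// the type name printed below should mention a dynamic dimension (IxD), not Ix2
println!("{:}", std::any::type_name_of_val(&b));
```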
Terminology
- Slicing (by range or slice): an $n$-D tensor to $n$-D tensor operation, giving a view of a smaller tensor;
- Indexing (by integer): an $n$-D tensor to $(n-1)$-D tensor operation, marginalizing out one dimension by selection;
- Elementwise indexing (by a list of integers): gives a reference to an element `&T` instead of a tensor view.
In RSTSR, slicing and indexing are implemented in a similar way. Users can usually perform slicing and indexing simultaneously, whenever Rust allows.

RSTSR follows the Rust, C, and Python convention of 0-based indexing, which is different from Fortran.
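The sketch below contrasts the three operations on one tensor (assuming the same prelude and setup as the other examples; the output comments only note the resulting shapes):

```rust
let a = rt::arange(24).into_shape([4, 3, 2]);

// slicing by range: 3-D tensor to 3-D view, the first dimension shrinks to length 1
let s = a.slice(1..2);
println!("{:?}", s.layout()); // shape: [1, 3, 2]

// indexing by integer: 3-D tensor to 2-D view, the first dimension is marginalized out
let v = a.slice(1);
println!("{:?}", v.layout()); // shape: [3, 2]

// elementwise indexing: gives an element, not a tensor view
let e = a[[1, 0, 0]];
println!("{:}", e); // output: 6
```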
1. Indexing by Number
For example, a 3-D tensor $A_{ijk}$ can be indexed into a 2-D tensor $B_{jk} = A_{2jk}$:
```rust
// generate 3-D tensor A_ijk
let a = rt::arange(24).into_shape([4, 3, 2]);
println!("{:}", a);

// B_jk = A_ijk where i = 2
let b = a.slice(2); // equivalently `a.i(2)`
println!("{:}", b);
// output:
// [[ 12 13]
//  [ 14 15]
//  [ 16 17]]
```
Furthermore, if you wish to index both $i$ and $j$, say $C_k = A_{ijk}$ with $i = 2$ and $j = 0$, you can pass `[2, 0]` into the `slice` function:
```rust
// C_k = A_ijk where i = 2, j = 0
// surely, `a.slice(2).slice(0)` works, but we can use `a.slice([2, 0])` instead
let c = a.slice([2, 0]);
println!("{:}", c);
// output: [ 12 13]
```
RSTSR also accepts negative indices for indexing from the end of the array:
```rust
// D_jk = A_ijk where i = -1 = 3 (negative index from the end)
let d = a.slice(-1);
println!("{:}", d);
// output:
// [[ 18 19]
//  [ 20 21]
//  [ 22 23]]
```
2. Basic Slicing
2.1 Slicing by range
For example, we want to extract $B_{ijk} = A_{ijk}$ (where $1 \leq i < 3$) from tensor $A_{ijk}$:
```rust
// generate 3-D tensor A_ijk
let a = rt::arange(24).into_shape([4, 3, 2]);
println!("{:}", a);

// B_ijk = A_ijk where 1 <= i < 3
let b = a.slice(1..3); // equivalently `a.i(1..3)`
println!("{:}", b);
// output:
// [[[ 6  7]
//   [ 8  9]
//   [10 11]]
//
//  [[12 13]
//   [14 15]
//   [16 17]]]
```
Slicing the first two dimensions at once is also available in the following way:
```rust
// C_ijk = A_ijk where 1 <= i < 3, 0 <= j < 2
let c = a.slice([1..3, 0..2]);
println!("{:}", c);
// output:
// [[[ 6  7]
//   [ 8  9]]
//
//  [[12 13]
//   [14 15]]]
```
Negative indices are also applicable in this case:
```rust
let a = rt::arange(24);
// D_i = A_i where i = -5..-2 = 19..22 (negative index from the end given 24 elements)
let d = a.slice(-5..-2);
println!("{:}", d);
// output: [ 19 20 21]
```
2.2 Slicing by ranges
Not only range types (like `1..3`) are accepted in RSTSR, but also range-to (`..3`) and range-from (`1..`).
```rust
let a = rt::arange(24);
// D_i = A_i where i = -5.. or 19..
let d = a.slice(-5..);
println!("{:}", d);
// output: [ 19 20 21 22 23]
```
As a reminder, Rust does not allow two different types to be merged into a Rust array `[T]`:
```rust
// generate 3-D tensor A_ijk
let a = rt::arange(24).into_shape([4, 3, 2]).into_owned();
// different types can't be merged into rust array
// - `..` is RangeFull
// - `1..3` is Range
// - `..2` is RangeTo
let b = a.slice([.., 1..3, ..2]);
```
To resolve this problem, you may use the `s!` macro, or simply pass a tuple `(T1, T2)` instead of a Rust array `[T]`:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
let b = a.slice((.., 1..3, ..2)); // equivalently `a.slice(s![.., 1..3, ..2])`
println!("{:}", b);
// output:
// [[[ 2  3]
//   [ 4  5]]
//
//  [[ 8  9]
//   [10 11]]
//
//  [[14 15]
//   [16 17]]
//
//  [[20 21]
//   [22 23]]]
```
Tuples are only implemented for up to 10 elements; if your tensor has an extremely large number of dimensions, you may wish to use `s!`.
3. Special Indexing
3.1 Slicing with strides
To slice with a stride, you may use the `slice!` macro. The usage of `slice!` is similar to Python's built-in function `slice`:
- `slice!(stop)`: similar to the range-to `..stop`;
- `slice!(start, stop)`: similar to the range `start..stop`;
- `slice!(start, stop, step)`: this is similar to Fortran's or NumPy's slicing `start:stop:step`.
In `ndarray`, this is done by `s![start..stop;step]`. `ndarray`'s solution is more concise; however, we stick to the seemingly verbose `slice!` macro to generate strided slices.
```rust
let a = rt::arange(24);

// first 5 elements
let b = a.slice(slice!(5));
println!("{:}", b);
// output: [ 0 1 2 3 4]

// elements from 5 to -9 (resembles 15 for the given 24 elements)
let b = a.slice(slice!(5, -9));
println!("{:}", b);
// output: [ 5 6 7 ... 12 13 14]

// elements from 5 to -9 with step 2
let b = a.slice(slice!(5, -9, 2));
println!("{:}", b);
// output: [ 5 7 9 11 13]

// reversed step 2
let b = a.slice(slice!(-9, 5, -2));
println!("{:}", b);
// output: [ 15 13 11 9 7]
```
In many cases, `None` is also a valid input for `slice!`. In fact, `slice!` is realized through the mechanics of `Option<T>`, so using `Some(val)` is also valid.
```rust
let b = a.slice(slice!(None, 9, Some(2)));
println!("{:}", b);
// output: [ 0 2 4 6 8]
```
3.2 Inserting axes
You can insert axes by `None` or `NewAxis` (defined as `Indexer::Insert`). This is similar to NumPy's `None` or `np.newaxis`.
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);

// insert new axis at the beginning
let b = a.slice(NewAxis);
println!("{:?}", b.layout());
// output: shape: [1, 4, 3, 2], stride: [6, 6, 2, 1], offset: 0

// using `None` is equivalent to `NewAxis`
let b = a.slice(None);
println!("{:?}", b.layout());
// output: shape: [1, 4, 3, 2], stride: [6, 6, 2, 1], offset: 0

// insert new axis at the second position
let b = a.slice((.., None));
println!("{:?}", b.layout());
// output: shape: [4, 1, 3, 2], stride: [6, 2, 2, 1], offset: 0
```
Using `None` can be elegant; however, we do not accept `Some(val)` for indexing. So although the following code compiles, it simply does not work:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
// try to index with `Some(val)` -- this is not accepted
let b = a.slice(Some(2));
println!("{:?}", b.layout());
// panic: Option<T> should not be used in Indexer.
```
3.3 Ellipsis
In RSTSR, you may use `Ellipsis` (defined as `Indexer::Ellipsis`) to skip some indices:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
// using ellipsis to select index from the last dimension
// equivalent to `a.slice((.., .., 0))` for a 3-D tensor
// same as numpy's `a[..., 0]`
let b = a.slice((Ellipsis, 0));
println!("{:2}", b);
// output:
// [[ 0  2  4]
//  [ 6  8 10]
//  [12 14 16]
//  [18 20 22]]
```
3.4 Mixed indexing and slicing
As mentioned before, the array type `[T]` is not suitable for representing various kinds of indexing and slicing at the same time. However, you may use the macro `s!` or a tuple to perform this task. In most cases, the macro `s!` and a tuple work in the same way; however, they have different definitions in the program, and `s!` should work in more scenarios.
```rust
let a: Tensor<f64, _> = rt::zeros([6, 7, 5, 9, 8]);
// mixed indexing
let b = a.slice((slice!(-2, 1, -1), None, None, Ellipsis, 1, ..-2));
println!("{:?}", b.layout());
// output: shape: [3, 1, 1, 7, 5, 6], stride: [-2520, 360, 360, 360, 72, 1], offset: 10088
```
4. Elementwise Indexing
Elementwise indexing is not efficient.
We also offer elementwise indexing in RSTSR. Please note, however, that in most cases elementwise indexing is not efficient:
- for "unchecked" elementwise indexing, it have more chance to prevent compiler's internal vectorize and SIMD optimization;
- for "safe" elementwise indexing, additional out-of-bound check is performed, further hampering optimizations.
Thus, for computationally intensive tasks, you are encouraged to use RSTSR's internal arithmetic or mapping functions to avoid direct elementwise indexing. Only use elementwise indexing when efficiency is not a concern, or when RSTSR's internal functions cannot fulfill your needs.
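As a rough illustration of the discouraged pattern (a sketch only; it assumes `[]` accepts a one-element `usize` index for a 1-D tensor, analogous to the 3-D examples below, and a real program should prefer built-in arithmetic or mapping functions where they apply):

```rust
let a = rt::arange(24);

// discouraged: accumulating elements one by one through elementwise indexing;
// every access re-translates the layout and performs an out-of-bounds check
let mut sum = 0;
for idx in 0..24_usize {
    sum += a[[idx]];
}
println!("{:}", sum); // output: 276 (= 0 + 1 + ... + 23)
```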
4.1 Safe elementwise indexing
To perform elementwise indexing, you may use Rust's brackets `[]`:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
let val = a[[2, 2, 1]];
println!("{:}", val);
// output: 17
println!("{:}", std::any::type_name_of_val(&val));
// output: i32
```
If you provide an out-of-bounds index, RSTSR will panic:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
let val = a[[2, 2, 3]];
println!("{:}", val);
// panic: Error::ValueOutOfRange : "idx" = 3 not match to pattern 0..(shp as isize) = 0..2
```
Note that in RSTSR, indexing (which returns a tensor view) is different from elementwise indexing (which returns a reference to a value):
```rust
let view = a.slice((2, 2, 1));
println!("{:}", view);
// output: 17

// it seems to be a value, but actually it is a tensor view
println!("{:?}", view);
// output:
// === Debug Tensor Print ===
// 17
// DeviceFaer { base: DeviceCpuRayon { num_threads: 0 } }
// 0-Dim (dyn), contiguous: CcFf
// shape: [], stride: [], offset: 17
// ==========================
```
4.2 Unchecked elementwise indexing
Unchecked elementwise indexing will be slightly faster than safe elementwise indexing.
To perform unchecked indexing, you may use the unsafe function `index_uncheck`:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
let val = unsafe { a.index_uncheck([2, 2, 1]) };
println!("{:}", val);
// output: 17
```
If you provide an out-of-bounds index that still falls within the size of the underlying memory, RSTSR will not panic and will return a wrong value:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
let val = unsafe { a.index_uncheck([2, 2, 3]) };
println!("{:}", val);
// output: 19
// not desired: last dimension index 3 is out of bound
```
This function is marked `unsafe` in order to warn about this kind of out-of-bounds (but not out-of-memory) access. In most cases it is still memory safe, in that accessing `Vec<T>` beyond its memory bounds will gracefully panic.
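Following the behavior described above, a small sketch (assuming the same prelude and setup as the other examples) of an index whose offset also exceeds the 24-element buffer:

```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
// the computed offset 3*6 + 2*2 + 9 = 31 lies beyond the 24-element `Vec<T>`;
// as described above, this is expected to panic gracefully rather than read
// outside the allocation
let val = unsafe { a.index_uncheck([3, 2, 9]) };
println!("{:}", val);
```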