Basic Slicing and Indexing
Extracting a sub-matrix from a matrix, or indexing a tensor down to a lower-dimensional sub-tensor for further computation, is a very common operation.

RSTSR provides most of the functionality that NumPy calls "basic indexing", which returns a tensor view instead of an owned tensor. Through this mechanism, most tensor extraction operations can be performed without copying memory. For large tensors, all basic slicing and indexing operations are cheap compared to memory assignment and tensor arithmetic.
Due to language limitations, indexing by brackets `[]` in Rust can only return a reference to the underlying data `&T`, so it is technically not possible to return a tensor view through `[]`. In RSTSR, elementwise indexing by `[]` returns a reference to an element `&T`, and only when the data is stored as a `Vec<T>`. Usage of `[]` indexing is therefore quite limited.

However, it is possible to obtain a sub-tensor view `TensorView` (or `TensorMut`) by using functions to index and slice.
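For example, the following minimal sketch (assuming the same prelude and setup as the other examples on this page) contrasts the two behaviors:

```rust
let a = rt::arange(24).into_shape([4, 3, 2]);

// bracket indexing gives back an element (through `&T`), not a tensor
let elem = a[[0, 1, 1]];
println!("{:}", elem); // output: 3

// function-style indexing gives a tensor view (`TensorView`); no data is copied
let view = a.slice(0);
println!("{:}", view);
// output (a 3x2 view of the first block):
// [[ 0 1]
//  [ 2 3]
//  [ 4 5]]
```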
The most important functions and macros for slicing are:
- `slice` (equivalently `i`): returns a tensor view given slicing parameters;
- `slice_mut` (equivalently `i_mut`): returns a mutable tensor view given slicing parameters;
- `slice!((start, ) stop (, step))`: generates a slice configuration, similar to Python's built-in `slice` function;
- `s![]`: generates slicing parameters (useful when different kinds of slicing and indexing occur at the same time); in most scenarios this macro can be substituted by a tuple of different types.
Note that the macro `slice!` is different from the function `slice`.
If you are not comfortable using both the function `slice` and the macro `slice!` (such as `tensor.slice(slice!(1, 5, 2))`), you can use the equivalent function `i` to perform tensor indexing and slicing (such as `tensor.i(slice!(1, 5, 2))`).
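For instance, the two calls below select the same elements (a minimal sketch, assuming the same prelude and setup as the other examples on this page):

```rust
let a = rt::arange(24);

// `slice` and `i` are equivalent ways to index and slice
let b1 = a.slice(slice!(1, 5, 2));
let b2 = a.i(slice!(1, 5, 2));
println!("{:}", b1); // output: [ 1 3]
println!("{:}", b2); // output: [ 1 3]
```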
The clashing names of these functions may seem unfortunate, but each follows an existing convention:
- function `slice` comes from the Rust crate `ndarray`;
- function `i` comes from the Rust crate `candle`;
- macro `slice!` comes from Python's built-in `slice` function.
Note that we have not implemented advanced indexing. Advanced indexing mainly means indexing by an integer tensor, by a boolean tensor, or by an index list. These are well covered in NumPy, but are difficult for RSTSR. In most cases, advanced indexing requires (or is more efficient with) an explicit memory copy. We will pursue some of the advanced indexing features in the future.
Slicing in RSTSR always generates dynamic dimensions.

Please note that slicing in RSTSR always produces a dynamic-dimension (`IxD`) tensor, instead of a fixed-dimension one (`Ix1` for 1-D, `Ix2` for 2-D, etc.). This is a step back compared to `ndarray`, which has a more sophisticated macro system to handle fixed-dimension slicing.
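For instance (a small sketch, assuming the same prelude and setup as the other examples; the exact printed type name depends on the data type and device):

```rust
let a = rt::arange(24).into_shape([4, 3, 2]);

// `b` has two dimensions in shape, but its dimension type is the dynamic `IxD`
let b = a.slice(0);
println!("{:?}", b.layout());
// the type name printed below should mention a dynamic dimension (IxD), not Ix2
println!("{:}", std::any::type_name_of_val(&b));
```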
Terminology
- Slicing (by range or slice): an $n$-D tensor to $n$-D tensor operation, giving a view of a smaller tensor;
- Indexing (by integer): an $n$-D tensor to $(n-1)$-D tensor operation, marginalizing out one dimension by selection;
- Elementwise indexing (by a list of integers): gives a reference to an element `&T` instead of a tensor view.
In RSTSR, slicing and indexing are implemented in a similar way. Users can usually perform slicing and indexing simultaneously, whenever Rust allows.

RSTSR follows the Rust, C, and Python convention of 0-based indexing, which is different from Fortran.
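The sketch below contrasts the three operations on one tensor (assuming the same prelude and setup as the other examples; the output comments only note the resulting shapes):

```rust
let a = rt::arange(24).into_shape([4, 3, 2]);

// slicing by range: 3-D tensor to 3-D view, the first dimension shrinks to length 1
let s = a.slice(1..2);
println!("{:?}", s.layout()); // shape: [1, 3, 2]

// indexing by integer: 3-D tensor to 2-D view, the first dimension is marginalized out
let v = a.slice(1);
println!("{:?}", v.layout()); // shape: [3, 2]

// elementwise indexing: gives an element, not a tensor view
let e = a[[1, 0, 0]];
println!("{:}", e); // output: 6
```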
1. Indexing by Number
For example, a 3-D tensor $A_{ijk}$ can be indexed into a 2-D tensor $B_{jk} = A_{2jk}$:
```rust
// generate 3-D tensor A_ijk
let a = rt::arange(24).into_shape([4, 3, 2]);
println!("{:}", a);

// B_jk = A_ijk where i = 2
let b = a.slice(2); // equivalently `a.i(2)`
println!("{:}", b);
// output:
// [[ 12 13]
//  [ 14 15]
//  [ 16 17]]
```
Furthermore, if you wish to index both $i$ and $j$, say $C_k = A_{ijk}$ with $i = 2$ and $j = 0$, you can pass `[2, 0]` into the `slice` function:
```rust
// C_k = A_ijk where i = 2, j = 0
// surely, `a.slice(2).slice(0)` works, but we can use `a.slice([2, 0])` instead
let c = a.slice([2, 0]);
println!("{:}", c);
// output: [ 12 13]
```
RSTSR also accepts negative indices for indexing from the end of the array:
```rust
// D_jk = A_ijk where i = -1 = 3 (negative index from the end)
let d = a.slice(-1);
println!("{:}", d);
// output:
// [[ 18 19]
//  [ 20 21]
//  [ 22 23]]
```
2. Basic Slicing
2.1 Slicing by range
For example, we want to extract $B_{ijk} = A_{ijk}$ (where $1 \leq i < 3$) from tensor $A_{ijk}$:
```rust
// generate 3-D tensor A_ijk
let a = rt::arange(24).into_shape([4, 3, 2]);
println!("{:}", a);

// B_ijk = A_ijk where 1 <= i < 3
let b = a.slice(1..3); // equivalently `a.i(1..3)`
println!("{:}", b);
// output:
// [[[ 6  7]
//   [ 8  9]
//   [10 11]]
//
//  [[12 13]
//   [14 15]
//   [16 17]]]
```
Slicing the first two dimensions at once is also available in the following way:
```rust
// C_ijk = A_ijk where 1 <= i < 3, 0 <= j < 2
let c = a.slice([1..3, 0..2]);
println!("{:}", c);
// output:
// [[[ 6  7]
//   [ 8  9]]
//
//  [[12 13]
//   [14 15]]]
```
Negative indices are also applicable in this case:
```rust
let a = rt::arange(24);
// D_i = A_i where i = -5..-2 = 19..22 (negative index from the end given 24 elements)
let d = a.slice(-5..-2);
println!("{:}", d);
// output: [ 19 20 21]
```
2.2 Slicing by ranges
Not only range types (like `1..3`) are accepted in RSTSR, but also range-to (`..3`) and range-from (`1..`).
```rust
let a = rt::arange(24);
// D_i = A_i where i = -5.. or 19..
let d = a.slice(-5..);
println!("{:}", d);
// output: [ 19 20 21 22 23]
```
As a reminder, Rust does not allow two different types to be merged into a Rust array `[T]`:
```rust
// generate 3-D tensor A_ijk
let a = rt::arange(24).into_shape([4, 3, 2]).into_owned();
// different types can't be merged into rust array
// - `..` is RangeFull
// - `1..3` is Range
// - `..2` is RangeTo
let b = a.slice([.., 1..3, ..2]);
```
To resolve this problem, you may use the `s!` macro, or simply pass a tuple `(T1, T2)` instead of a Rust array `[T]`:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
let b = a.slice((.., 1..3, ..2)); // equivalently `a.slice(s![.., 1..3, ..2])`
println!("{:}", b);
// output:
// [[[ 2  3]
//   [ 4  5]]
//
//  [[ 8  9]
//   [10 11]]
//
//  [[14 15]
//   [16 17]]
//
//  [[20 21]
//   [22 23]]]
```
Tuples are only implemented for up to 10 elements; if your tensor has an extremely large number of dimensions, you may wish to use `s!`.
3. Special Indexing
3.1 Slicing with strides
To slice with a stride, you may use the `slice!` macro. The usage of `slice!` is similar to Python's built-in function `slice`:
- `slice!(stop)`: similar to the range-to `..stop`;
- `slice!(start, stop)`: similar to the range `start..stop`;
- `slice!(start, stop, step)`: this is similar to Fortran's or NumPy's slicing `start:stop:step`.
In `ndarray`, this is done by `s![start..stop;step]`. `ndarray`'s solution is more concise; however, we stick to the seemingly verbose `slice!` macro to generate strided slices.
```rust
let a = rt::arange(24);

// first 5 elements
let b = a.slice(slice!(5));
println!("{:}", b);
// output: [ 0 1 2 3 4]

// elements from 5 to -9 (resembles 15 for the given 24 elements)
let b = a.slice(slice!(5, -9));
println!("{:}", b);
// output: [ 5 6 7 ... 12 13 14]

// elements from 5 to -9 with step 2
let b = a.slice(slice!(5, -9, 2));
println!("{:}", b);
// output: [ 5 7 9 11 13]

// reversed step 2
let b = a.slice(slice!(-9, 5, -2));
println!("{:}", b);
// output: [ 15 13 11 9 7]
```
In many cases, `None` is also a valid input for `slice!`. In fact, `slice!` is realized through the mechanics of `Option<T>`, so using `Some(val)` is also valid.
```rust
let b = a.slice(slice!(None, 9, Some(2)));
println!("{:}", b);
// output: [ 0 2 4 6 8]
```
3.2 Inserting axes
You can insert axes by `None` or `NewAxis` (defined as `Indexer::Insert`). This is similar to NumPy's `None` or `np.newaxis`.
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);

// insert new axis at the beginning
let b = a.slice(NewAxis);
println!("{:?}", b.layout());
// output: shape: [1, 4, 3, 2], stride: [6, 6, 2, 1], offset: 0

// using `None` is equivalent to `NewAxis`
let b = a.slice(None);
println!("{:?}", b.layout());
// output: shape: [1, 4, 3, 2], stride: [6, 6, 2, 1], offset: 0

// insert new axis at the second position
let b = a.slice((.., None));
println!("{:?}", b.layout());
// output: shape: [4, 1, 3, 2], stride: [6, 2, 2, 1], offset: 0
```
Using `None` can be elegant; however, we do not accept `Some(val)` for indexing. So although the following code compiles, it simply does not work:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
// try to index with `Some(val)` -- this is not accepted
let b = a.slice(Some(2));
println!("{:?}", b.layout());
// panic: Option<T> should not be used in Indexer.
```
3.3 Ellipsis
In RSTSR, you may use `Ellipsis` (defined as `Indexer::Ellipsis`) to skip some indices:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
// using ellipsis to select index from the last dimension
// equivalent to `a.slice((.., .., 0))` for a 3-D tensor
// same as numpy's `a[..., 0]`
let b = a.slice((Ellipsis, 0));
println!("{:2}", b);
// output:
// [[ 0  2  4]
//  [ 6  8 10]
//  [12 14 16]
//  [18 20 22]]
```
3.4 Mixed indexing and slicing
As mentioned before, the array type `[T]` is not suitable for representing various kinds of indexing and slicing at the same time. However, you may use the macro `s!` or a tuple to perform this task. In most cases, the macro `s!` and a tuple work in the same way; however, they have different definitions in the program, and `s!` should work in more scenarios.
```rust
let a: Tensor<f64, _> = rt::zeros([6, 7, 5, 9, 8]);
// mixed indexing
let b = a.slice((slice!(-2, 1, -1), None, None, Ellipsis, 1, ..-2));
println!("{:?}", b.layout());
// output: shape: [3, 1, 1, 7, 5, 6], stride: [-2520, 360, 360, 360, 72, 1], offset: 10088
```
4. Elementwise Indexing
Elementwise indexing is not efficient.
We also offer elementwise indexing in RSTSR. Please note, however, that in most cases elementwise indexing is not efficient:
- for "unchecked" elementwise indexing, it have more chance to prevent compiler's internal vectorize and SIMD optimization;
- for "safe" elementwise indexing, additional out-of-bound check is performed, further hampering optimizations.
Thus, for computationally intensive tasks, you are encouraged to use RSTSR's internal arithmetic or mapping functions to avoid direct elementwise indexing. Only use elementwise indexing when efficiency is not a concern, or when RSTSR's internal functions cannot fulfill your needs.
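As a rough illustration of the discouraged pattern (a sketch only; it assumes `[]` accepts a one-element `usize` index for a 1-D tensor, analogous to the 3-D examples below, and a real program should prefer built-in arithmetic or mapping functions where they apply):

```rust
let a = rt::arange(24);

// discouraged: accumulating elements one by one through elementwise indexing;
// every access re-translates the layout and performs an out-of-bounds check
let mut sum = 0;
for idx in 0..24_usize {
    sum += a[[idx]];
}
println!("{:}", sum); // output: 276 (= 0 + 1 + ... + 23)
```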
4.1 Safe elementwise indexing
To perform elementwise indexing, you may use Rust's brackets `[]`:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
let val = a[[2, 2, 1]];
println!("{:}", val);
// output: 17
println!("{:}", std::any::type_name_of_val(&val));
// output: i32
```
If you provide an out-of-bounds index, RSTSR will panic:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
let val = a[[2, 2, 3]];
println!("{:}", val);
// panic: Error::ValueOutOfRange : "idx" = 3 not match to pattern 0..(shp as isize) = 0..2
```
Note that in RSTSR, indexing (which returns a tensor view) is different from elementwise indexing (which returns a reference to a value):
```rust
let view = a.slice((2, 2, 1));
println!("{:}", view);
// output: 17

// it seems to be a value, but actually it is a tensor view
println!("{:?}", view);
// output:
// === Debug Tensor Print ===
// 17
// DeviceFaer { base: DeviceCpuRayon { num_threads: 0 } }
// 0-Dim (dyn), contiguous: CcFf
// shape: [], stride: [], offset: 17
// ==========================
```
4.2 Unchecked elementwise indexing
Unchecked elementwise indexing will be slightly faster than safe elementwise indexing.
To perform unchecked indexing, you may use the unsafe function `index_uncheck`:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
let val = unsafe { a.index_uncheck([2, 2, 1]) };
println!("{:}", val);
// output: 17
```
If you provide an out-of-bounds index that still falls within the size of the underlying memory, RSTSR will not panic and will return a wrong value:
```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
let val = unsafe { a.index_uncheck([2, 2, 3]) };
println!("{:}", val);
// output: 19
// not desired: last dimension index 3 is out of bound
```
This function is marked `unsafe` in order to warn about this kind of out-of-bounds (but not out-of-memory) access. In most cases it is still memory safe, in that accessing `Vec<T>` beyond its memory bounds will gracefully panic.
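Following the behavior described above, a small sketch (assuming the same prelude and setup as the other examples) of an index whose offset also exceeds the 24-element buffer:

```rust
let a = rt::arange(24).into_shape([4, 3, 2]);
// the computed offset 3*6 + 2*2 + 9 = 31 lies beyond the 24-element `Vec<T>`;
// as described above, this is expected to panic gracefully rather than read
// outside the allocation
let val = unsafe { a.index_uncheck([3, 2, 9]) };
println!("{:}", val);
```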