# 11. Matrices and Other Arraysļ

The open-access textbookDeep R Programmingby Marek Gagolewski is, and will remain, freely available for everyoneās enjoyment (also in PDF). It is a non-profit project.This book is still a work in progress. Beta versions of Chapters 1ā12 are already complete, but there will be more. In the meantime, any bug/typos reports/fixes are appreciated.Although available online, it is a whole course; it should be read from the beginning to the end. Refer to the Preface for general introductory remarks.Also, check out my other book,Minimalist Data Wrangling with Python[20].

When we equip an atomic or generic vector with the `dim`

attribute,
it automatically becomes an object of S3 class `array`

.
In particular, two-dimensional arrays (primary S3 class `matrix`

) allow us
to represent *tabular* data where items are aligned into rows and columns:

```
structure(1:6, dim=c(2, 3)) # a matrix with 2 rows and 3 columns
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
```

This (combined with the fact that there are many built-in
functions overloaded for the `matrix`

class) opens up a range of new
possibilities, which we explore in this chapter.
In particular, we discuss how to perform basic algebraic operations
such as matrix multiplication, transpose, finding eigenvalues,
and performing various decompositions. We also cover data wrangling
operations such as array subsetting and column- and rowwise aggregation.

Important

Oftentimes, a numeric matrix with *n* rows and *m*
will be used to represent *n* points (samples)
in an *m*-dimensional (with *m* features or variables)
space, \(\mathbb{R}^m\).

Furthermore, in the next chapter, we will introduce data frames: matrix-like objects whose columns can be of any (not necessarily the same) type.

## 11.1. Creating Arraysļ

### 11.1.1. **matrix** and **array**ļ

A matrix can be conveniently created by means
of the **matrix** function.

```
(A <- matrix(1:6, byrow=TRUE, nrow=2))
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 4 5 6
```

The above converted an atomic vector of length six
into a matrix with two rows. The number of columns
was determined automatically (`ncol=3`

could have been passed
to get the same result).

Important

By default, the elements of the input vector are read columnwisely:

```
matrix(1:6, ncol=3) # byrow=FALSE
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
```

A matrix can be equipped with dimension names, being a list of two character vectors of appropriate sizes, labelling each row and column, in this order:

```
matrix(1:6, byrow=TRUE, nrow=2, dimnames=list(c("x", "y"), c("a", "b", "c")))
## a b c
## x 1 2 3
## y 4 5 6
```

Alternatively, to create a matrix, we can use the **array** function,
which requires the number of rows and columns be specified explicitly.

```
array(1:6, dim=c(2, 3))
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
```

Note that the elements are consumed in a column-major manner.

Arrays of dimensionality other than 2 are also possible.
Here is a one-dimensional array. When printed, it is indistinguishable
from an atomic vector (but still the `class`

attribute is
set to `array`

):

```
array(1:6, dim=6)
## [1] 1 2 3 4 5 6
```

And now for something completely different: a three-dimensional array of size 3-by-4-by-2

```
array(1:24, dim=c(3, 4, 2))
## , , 1
##
## [,1] [,2] [,3] [,4]
## [1,] 1 4 7 10
## [2,] 2 5 8 11
## [3,] 3 6 9 12
##
## , , 2
##
## [,1] [,2] [,3] [,4]
## [1,] 13 16 19 22
## [2,] 14 17 20 23
## [3,] 15 18 21 24
```

which can be thought of as two matrices of size 3-by-4 (because how else can we print out a 3D object on a 2D console?).

The **array** function can be fed with the `dimnames`

argument
too. For instance, the above three-dimensional hypertable
would require a list of three character vectors,
of sizes 3, 4, and 2, respectively.

That 10-dimensional arrays are also possible the reader is encouraged to try out themself.

### 11.1.2. Promoting and Stacking Vectorsļ

We can promote an ordinary vector to a column vector, i.e., a matrix with one column by calling:

```
as.matrix(1:2)
## [,1]
## [1,] 1
## [2,] 2
cbind(1:2)
## [,1]
## [1,] 1
## [2,] 2
```

and to a row vector:

```
t(1:3) # transpose
## [,1] [,2] [,3]
## [1,] 1 2 3
rbind(1:3)
## [,1] [,2] [,3]
## [1,] 1 2 3
```

Actually, **cbind** and **rbind** stand for column- and
row-bind; they allow multiple vectors and matrices be stacked
one after/below another:

```
rbind(1:4, 5:8, 9:10, 11) # row bind
## [,1] [,2] [,3] [,4]
## [1,] 1 2 3 4
## [2,] 5 6 7 8
## [3,] 9 10 9 10
## [4,] 11 11 11 11
cbind(1:4, 5:8, 9:10, 11) # column bind
## [,1] [,2] [,3] [,4]
## [1,] 1 5 9 11
## [2,] 2 6 10 11
## [3,] 3 7 9 11
## [4,] 4 8 10 11
cbind(1:2, 3:4, rbind(11:13, 21:23)) # vector, vector, 2x3 matrix
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 3 11 12 13
## [2,] 2 4 21 22 23
```

and so forth.

Unfortunately, the *generalised*
recycling rule is not implemented in full:

```
cbind(1:4, 5:8, cbind(9:10, 11)) # different from cbind(1:4, 5:8, 9:10, 11)
## Warning in cbind(1:4, 5:8, cbind(9:10, 11)): number of rows of result is
## not a multiple of vector length (arg 1)
## [,1] [,2] [,3] [,4]
## [1,] 1 5 9 11
## [2,] 2 6 10 11
```

Note that the first two arguments are of length four.

### 11.1.3. Simplifying Listsļ

**simplify2array** is an extension of the **unlist** function.
Given a list of atomic vectors, each of length one, it will return a flat
atomic vector.
However, if a list of equisized vectors of greater lengths is given,
these will be converted to a matrix.

```
simplify2array(list(1, 11, 21)) # each of length 1
## [1] 1 11 21
simplify2array(list(1:3, 11:13, 21:23, 31:33)) # each of length 3
## [,1] [,2] [,3] [,4]
## [1,] 1 11 21 31
## [2,] 2 12 22 32
## [3,] 3 13 23 33
simplify2array(list(1, 11:12, 21:23)) # no can do
## [[1]]
## [1] 1
##
## [[2]]
## [1] 11 12
##
## [[3]]
## [1] 21 22 23
```

Note that in the second example, each vector becomes a separate column of the resulting matrix[1].

See SectionĀ 12.3.7 for a few more examples.

There are quite a few functions that call the above automatically
by default (compare the `simplify`

or `SIMPLIFY`

(sic!) argument in
**sapply**, **tapply**, **mapply**,
**replicate**, etc.).

For instance:

```
min_mean_max <- function(x) c(Min=min(x), Mean=mean(x), Max=max(x))
sapply(split(iris[["Sepal.Length"]], iris[["Species"]]), min_mean_max)
## setosa versicolor virginica
## Min 4.300 4.900 4.900
## Mean 5.006 5.936 6.588
## Max 5.800 7.000 7.900
```

Take note of what constitutes the columns of the return matrix.

Note the behaviour of **as.matrix** on list arguments.
Write your own version of **simplify2array**
named **as.matrix.list** that *always*
returns a matrix. If a list of non-equisized vectors is given,
fill the missing cells with `NA`

s.

Important

Sometimes a call to **do.call**`(`

**cbind**`, x))`

might be a better idea than a referral to **simplify2array**.
Provided that `x`

is a list of atomic vectors,
it *always* returns a matrix: shorter vectors are recycled
(which might be welcome, but not necessarily).

```
do.call(cbind, list(a=c(u=1), b=c(v=2, w=3), c=c(i=4, j=5, k=6)))
## Warning in (function (..., deparse.level = 1) : number of rows of result
## is not a multiple of vector length (arg 2)
## a b c
## i 1 2 4
## j 1 3 5
## k 1 2 6
```

Consider a named toy list of numeric vectors:

```
x <- list(a=runif(10), b=rnorm(15))
```

Compare the results generated by **sapply**
(which calls **simplify2array**):

```
sapply(x, function(e) c(Mean=mean(e)))
## a.Mean b.Mean
## 0.57825 0.12431
sapply(x, function(e) c(Min=min(e), Max=max(e)))
## a b
## Min 0.045556 -1.9666
## Max 0.940467 1.7869
```

with its version based on **do.call** and **cbind**:

```
sapply2 <- function(...)
do.call(cbind, lapply(...))
sapply2(x, function(e) c(Mean=mean(e)))
## a b
## Mean 0.57825 0.12431
sapply2(x, function(e) c(Min=min(e), Max=max(e)))
## a b
## Min 0.045556 -1.9666
## Max 0.940467 1.7869
```

Note that **sapply** may return an atomic vector
with somewhat surprising `names`

.

### 11.1.4. Beyond Numeric Arraysļ

Arrays built upon atomic vectors other than numeric ones are possible too. For instance, later we will stress that comparisons featuring matrices are performed elementwisely, which results in logical matrices:

```
A >= 3
## [,1] [,2] [,3]
## [1,] FALSE FALSE TRUE
## [2,] TRUE TRUE TRUE
```

Furthermore, matrices of character strings can be useful too:

```
matrix(strrep(LETTERS[1:6], 1:6), ncol=3)
## [,1] [,2] [,3]
## [1,] "A" "CCC" "EEEEE"
## [2,] "BB" "DDDD" "FFFFFF"
```

And of course complex matrices:

```
A + 1i
## [,1] [,2] [,3]
## [1,] 1+1i 2+1i 3+1i
## [2,] 4+1i 5+1i 6+1i
```

We are not limited to *atomic* vectors: lists can
be a basis for arrays as well:

```
matrix(list(1, 11:21, "A", list(1, 2, 3)), nrow=2)
## [,1] [,2]
## [1,] 1 "A"
## [2,] integer,11 list,3
```

Some elements are not displayed properly, but they are still there.

### 11.1.5. Internal Representationļ

An object of S3 class `array`

is an atomic vector or a list
equipped with the `dims`

attribute, which is a vector of nonnegative integers.
Interestingly, we do not have to set the `class`

attribute explicitly:
the accessor function **class** will return an implicit[2]
class anyway (compare SectionĀ 4.4.3).

```
class(1) # atomic vector
## [1] "numeric"
class(structure(1, dim=rep(1, 1))) # 1D array (vector)
## [1] "array"
class(structure(1, dim=rep(1, 2))) # 2D array (matrix)
## [1] "matrix" "array"
class(structure(1, dim=rep(1, 3))) # 3D array
## [1] "array"
```

Note that a 2-dimensional array is additionally of class `matrix`

.

Optional dimension names are represented by means of the `dimnames`

attribute, which is a list of *d* character vectors,
where *d* is the arrayās dimensionality.

```
(A <- structure(1:6, dim=c(2, 3), dimnames=list(letters[1:2], LETTERS[1:3])))
## A B C
## a 1 3 5
## b 2 4 6
dim(A) # or attr(A, "dim")
## [1] 2 3
dimnames(A) # or attr(A, "dimnames")
## [[1]]
## [1] "a" "b"
##
## [[2]]
## [1] "A" "B" "C"
```

Important

Internally, elements in an array are always stored in the columnwise (column-major, Fortran) order:

```
as.numeric(A) # drop all attributes to reveal the underlying numeric vector
## [1] 1 2 3 4 5 6
```

Setting `byrow=TRUE`

in a call to the **matrix**
only affects the order in which this function reads a given source vector,
not the column/row-majorness.

```
(B <- matrix(1:6, ncol=3, byrow=TRUE))
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 4 5 6
as.numeric(B)
## [1] 1 4 2 5 3 6
```

The two said special attributes can be modified through the replacement
functions `**dim<-**` and `**dimnames<-**`
(and of course `**attr<-**` as well).
In particular, changing `dim`

does not alter the underlying atomic vector;
it only affects how other functions, including the corresponding
**print** method, interpret their placement on a virtual grid:

```
`dim<-`(A, c(3, 2)) # not the same as transpose of A
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
```

What we have obtained is a different *view* on the same *flat* data vector.
Also, `dimnames`

were dropped because its size became
incompatible with the newly requested dimensionality.

Study the source code of the **nrow**,
**NROW**, **ncol**, **NCOL**,
**rownames**, **row.names**, and
**colnames** functions.

Interestingly, for one-dimensional arrays, the
**names** function returns a sensible value
(based on the `dimnames`

attribute which is a list featuring one character
vector), despite the `names`

attributeās not being set.

What is more, `dimnames`

itself can be named:

```
names(dimnames(A)) <- c("ROWS", "COLUMNS")
print(A)
## COLUMNS
## ROWS A B C
## a 1 3 5
## b 2 4 6
```

It is still a numeric matrix, but the presentation thereof is slightly prettified.

**outer** applies a given (vectorised elementwisely) function
on each pair of elements from two vectors, forming a two-dimensional
result grid.
Based on two calls to **rep**, implement your own version thereof.

Some examples:

```
outer(c(x=1, y=10, z=100), c(a=1, b=2, c=3, d=4), "*") # multiplication
## a b c d
## x 1 2 3 4
## y 10 20 30 40
## z 100 200 300 400
outer(c("A", "B"), 1:8, paste, sep="-") # concatenate strings
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] "A-1" "A-2" "A-3" "A-4" "A-5" "A-6" "A-7" "A-8"
## [2,] "B-1" "B-2" "B-3" "B-4" "B-5" "B-6" "B-7" "B-8"
```

Show how **match**`(y, z)`

can be implemented
with **outer**. Is its time and memory complexity
optimal, though?

**table** creates a contingency matrix/array that
counts the number of unique pairs of corresponding elements
from one or more vectors of equal lengths.
Implement its one- and two-argument version based on
**tabulate**.

For example:

```
tips <- read.csv(paste0("https://github.com/gagolews/teaching-data/raw/",
"master/other/tips.csv"), comment.char="#") # a data.frame (list)
table(tips[["day"]])
##
## Fri Sat Sun Thur
## 19 87 76 62
table(tips[["smoker"]], tips[["day"]])
##
## Fri Sat Sun Thur
## No 4 45 57 45
## Yes 15 42 19 17
```

## 11.2. Array Indexingļ

Array subsetting can be performed by means of an overloaded[3]
`**[**` method, which we will usually provide with
many indexers ā two in the matrix case; see **help**`("[")`

.

In this section, we will be referring to the two following example matrices.

```
(A <- matrix(1:12, byrow=TRUE, nrow=3))
## [,1] [,2] [,3] [,4]
## [1,] 1 2 3 4
## [2,] 5 6 7 8
## [3,] 9 10 11 12
B <- A
dimnames(B) <- list(
c("a", "b", "c"), # row labels
c("x", "y", "z", "w") # column labels
)
B
## x y z w
## a 1 2 3 4
## b 5 6 7 8
## c 9 10 11 12
```

Subsetting higher-dimensional arrays will be covered at the end.

### 11.2.1. Arrays Are Built upon Basic Vectorsļ

Firstly, let us note, though, that subsetting based on one indexer (as in Chapter 5) will refer to the underlying flat vector.

For instance:

```
A[6]
## [1] 10
```

This is the element in the third row, second column: recall that values are stored in a column-major order.

### 11.2.2. Selecting Individual Elementsļ

Mathematically, we say that our example 3-by-4 real matrix \(\mathbf{A}\in\mathbb{R}^{3\times 4}\) is like:

Matrix elements are aligned in a two-dimensional grid.
They are organised into rows and columns.
Hence, we can pinpoint a cell using two indexes:
\(a_{i,j}\) refers to the *i*-th row and the *j*-th column.

Similarly in R:

```
A[3, 2] # 3rd row, 2nd column
## [1] 10
B["c", "y"] # using dimnames == B[3, 2]
## [1] 10
```

### 11.2.3. Selecting Rows and Columnsļ

Some textbooks, and we are fond of this notation here as well,
mark with \(\mathbf{a}_{i,\cdot}\)
a vector that consists of all the elements in
the *i*-th row and with \(\mathbf{a}_{\cdot,j}\) all items
in the *j*-th column.

In R, these will correspond to one of the indexers being left out.

```
A[3, ] # 3rd row
## [1] 9 10 11 12
A[, 2] # 2nd column
## [1] 2 6 10
B["c", ] # or B[3, ]
## x y z w
## 9 10 11 12
B[, "y"] # or B[, 2]
## a b c
## 2 6 10
```

Let us stress that `A[1]`

, `A[1, ]`

, and `A[, 1]`

have all different
meanings.
Also, we see that the resultsā `dimnames`

are adjusted accordingly;
see also **unname** which can take care of them once and for all.

Use **duplicated** to remove repeating rows
in a given numeric matrix (see also **unique**).

### 11.2.4. Dropping Dimensionsļ

Extracting an individual element or a single row/column from a matrix
yields an atomic vector.
If the `dim`

attribute consists of 1s only, it will be removed whatsoever.

In order to obtain proper row and column vectors, we can request the
preservation of the
dimensionality of the output object (and, more precisely,
the length of `dim`

) by passing `drop=FALSE`

to `**[**`.

```
A[1, 2, drop=FALSE] # 1st row, 2nd columns
## [,1]
## [1,] 2
A[1, , drop=FALSE] # 1st row
## [,1] [,2] [,3] [,4]
## [1,] 1 2 3 4
A[ , 2, drop=FALSE] # 2nd column
## [,1]
## [1,] 2
## [2,] 6
## [3,] 10
```

Important

The `drop`

argument unfortunately defaults to `TRUE`

. Many bugs could be
avoided more easily otherwise, especially when the indexers
are generated programmatically.

See also the **drop** function which gets rid of the dimensions
that have only one level.

Note

For list-based matrices, we can also use a multi-argument
version of `**[[**` to extract the individual elements.

```
C <- matrix(list(1, 11:12, 21:23, 31:34), nrow=2)
C[1, 2] # for `[`, input type is the same as the output type, hence a list
## [[1]]
## [1] 21 22 23
C[1, 2, drop=FALSE]
## [,1]
## [1,] integer,3
C[[1, 2]] # extract
## [1] 21 22 23
```

### 11.2.5. Selecting Submatricesļ

Indexing based on two vectors, both of length two or more, extracts a sub-block of a given matrix:

```
A[1:2, c(1, 2, 4)] # rows 1,2 columns 1,2,4
## [,1] [,2] [,3]
## [1,] 1 2 4
## [2,] 5 6 8
B[c("a", "b"), -3]
## x y w
## a 1 2 4
## b 5 6 8
```

Note again that `drop=TRUE`

is the default,
which affects the behaviour if one of the indexers is a scalar.

```
A[c(1, 3), 3]
## [1] 3 11
A[c(1, 3), 3, drop=FALSE]
## [,1]
## [1,] 3
## [2,] 11
```

Overload the **split** function for the `matrix`

class
in such a way that, given a matrix with *n* rows and an
object of class `factor`

of length *n* (or a list of such objects),
a list of *n* matrices is returned. For example:

```
split.matrix <- ...to.do...
A <- matrix(1:12, nrow=3) # matrix whose rows are to be split
s <- factor(c("a", "b", "a")) # determines the grouping of rows
split(A, s)
## $a
## [,1] [,2] [,3] [,4]
## [1,] 1 4 7 10
## [2,] 3 6 9 12
##
## $b
## [,1] [,2] [,3] [,4]
## [1,] 2 5 8 11
```

### 11.2.6. Selecting Elements Based on Logical Vectorsļ

Logical vectors can also be used as indexers, with consequences that are not hard to guess:

```
A[c(TRUE, FALSE, TRUE), -1] # select 1st and 3rd row, all but 1st column
## [,1] [,2] [,3]
## [1,] 4 7 10
## [2,] 6 9 12
B[B[, "x"]>1 & B[, "x"]<=9, ] # all rows where x is in (1, 9]
## x y z w
## b 5 6 7 8
## c 9 10 11 12
A[2, colMeans(A)>6, drop=FALSE] # 2nd row of the columns with means > 6
## [,1] [,2]
## [1,] 8 11
```

Note

In SectionĀ 11.3, we note that comparisons involving matrices are performed in an elementwise manner, for example:

```
A>7
## [,1] [,2] [,3] [,4]
## [1,] FALSE FALSE FALSE TRUE
## [2,] FALSE FALSE TRUE TRUE
## [3,] FALSE FALSE TRUE TRUE
```

Such logical matrices can be used to index other matrices of the same size. This always yields a (flat) vector in return.

```
A[A>7]
## [1] 8 9 10 11 12
```

This nothing else than the single-indexer subsetting
involving two flat vectors (a numeric and a logical one);
the `dim`

attributes are not considered here.

Implement your own versions of **max.col**,
**lower.tri**, and **upper.tri**.

### 11.2.7. Selecting Based on Two-Column Numeric Matricesļ

We can also index a matrix `A`

with a two-column matrix of positive integers `I`

, for instance:

```
(I <- cbind(
c(1, 3, 2, 1, 2),
c(2, 3, 2, 1, 4)
))
## [,1] [,2]
## [1,] 1 2
## [2,] 3 3
## [3,] 2 2
## [4,] 1 1
## [5,] 2 4
```

Now `A[I]`

gives an easy access to:

`A[I[1, 1], I[1, 2]]`

,`A[I[2, 1], I[2, 2]]`

,`A[I[3, 1], I[3, 2]]`

,ā¦

and so forth. In other words, each row of `I`

gives
the coordinates of the elements to extract.

```
A[I]
## [1] 4 9 5 1 11
```

This is exactly
`A[1, 2], A[3, 3], A[2, 2], A[1, 1], A[2, 4]`

.
The result is always a flat vector.

Note

**which** can also return a list of index matrices:

```
which(A>7, arr.ind=TRUE)
## row col
## [1,] 2 3
## [2,] 3 3
## [3,] 1 4
## [4,] 2 4
## [5,] 3 4
```

Moreover, **arrayInd** can be used to convert
flat indexes to multidimensional ones.

Implement your own version of **arrayInd**
and a function performing the inverse operation.

Implement your own version of **diag**.

### 11.2.8. Higher-Dimensional Arraysļ

For *d*-dimensional arrays, indexing can involve up to *d* indexes.

This is particularly useful for dim-named arrays that represent
contingency tables over a Cartesian product of multiple factors.
The built-in **datasets**`::Titanic`

object is
an example of this:

```
str(dimnames(Titanic)) # for reference (note that dimnames are named)
## List of 4
## $ Class : chr [1:4] "1st" "2nd" "3rd" "Crew"
## $ Sex : chr [1:2] "Male" "Female"
## $ Age : chr [1:2] "Child" "Adult"
## $ Survived: chr [1:2] "No" "Yes"
Titanic["Crew", "Male", "Adult", "Yes"]
## [1] 192
```

gives the number of adult male members of the crew who survived the accident. Also:

```
Titanic["Crew", , "Adult", ]
## Survived
## Sex No Yes
## Male 670 192
## Female 3 20
```

and so on.

Check if the above four-dimensional array can be indexed by means of matrices with four columns.

### 11.2.9. Replacing Elementsļ

There is of course also a multidimensional version of the
replacement subsetting function, `**[<-**`.

Generally, subsetting drops all attributes except
`names`

, `dim`

, and `dimnames`

(unless it does not make sense otherwise).
The replacement variant of the index operator
modifies vector values but generally preserves all the attributes.

This enables transforming matrix elements like:

```
B[B<10] <- A[B<10]^2
print(B)
## x y z w
## a 1 16 49 100
## b 4 25 64 121
## c 9 10 11 12
B[] <- rep(seq_len(NROW(B)), NCOL(B)) # NOT the same as B <- ...
print(B)
## x y z w
## a 1 1 1 1
## b 2 2 2 2
## c 3 3 3 3
```

Take note of the preservation of `dim`

and `dimnames`

.

Given a character matrix with entities that can be interpreted as numbers like:

```
(X <- rbind(x=c(a="1", b="2"), y=c("3", "4")))
## a b
## x "1" "2"
## y "3" "4"
```

convert it to a numeric matrix with a single line of code.

## 11.3. Common Operationsļ

### 11.3.1. Matrix Transposeļ

The matrix *transpose*, mathematically denoted with \(\mathbf{A}^T\),
is available via a call to **t**:

```
(A <- matrix(1:6, byrow=TRUE, nrow=2))
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 4 5 6
t(A)
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
```

Hence, if \(\mathbf{B}=\mathbf{A}^T\), then it is a matrix such that \(b_{i,j}=a_{j,i}\). In other words, in the transposed matrix, rows become columns and columns become rows.

For higher-dimensional arrays,
a generalised transpose can be achieved with **aperm**
(try permuting the dimensions of `Titanic`

).
Also note that the conjugate transpose of a complex matrix \(\mathbf{A}\)
is done via **Conj**`(`

**t**`(A))`

.

### 11.3.2. Vectorised Mathematical Functionsļ

Vectorised functions such as
**sqrt**,
**abs**,
**round**,
**log**,
**exp**,
**cos**,
**sin**,
etc.,
operate on each element of a given
array[4].

```
A <- matrix(1/(1:6), nrow=2)
round(A, 2) # rounds every element in A
## [,1] [,2] [,3]
## [1,] 1.0 0.33 0.20
## [2,] 0.5 0.25 0.17
```

Using a single call to **matplot**, which accepts
the `y`

argument be a matrix, draw a plot
of \(\sin(x)\), \(\cos(x)\), \(|\sin(x)|\), and \(|\cos(x)|\)
for \(x\in[-2\pi, 6\pi]\).

### 11.3.3. Aggregating Rows and Columnsļ

When we call an aggregation function on an array, it will reduce all elements to a single number:

```
(A <- matrix(1:12, byrow=TRUE, nrow=3))
## [,1] [,2] [,3] [,4]
## [1,] 1 2 3 4
## [2,] 5 6 7 8
## [3,] 9 10 11 12
mean(A)
## [1] 6.5
```

The **apply** function may be used to summarise
individual rows or columns in a matrix:

**apply**`(A, 1, f)`

applies a given function**f**on each*row*of a matrix`A`

;**apply**`(A, 2, f)`

applies**f**on each*column*of`A`

.

For instance:

```
apply(A, 1, mean) # synonym: rowMeans(A)
## [1] 2.5 6.5 10.5
apply(A, 2, mean) # synonym: colMeans(A)
## [1] 5 6 7 8
```

Note that the function being applied does not have to return a single number:

```
apply(A, 2, range) # min and max
## [,1] [,2] [,3] [,4]
## [1,] 1 2 3 4
## [2,] 9 10 11 12
apply(A, 1, function(row) c(Min=min(row), Mean=mean(row), Max=max(row)))
## [,1] [,2] [,3]
## Min 1.0 5.0 9.0
## Mean 2.5 6.5 10.5
## Max 4.0 8.0 12.0
```

Take note of the columnwise order of the output values.

**apply** works on higher-dimensional arrays too:

```
apply(Titanic, 1, mean) # 1st dimension - Class
## 1st 2nd 3rd Crew
## 40.625 35.625 88.250 110.625
apply(Titanic, c(1, 3), mean) # w.r.t. Class (1st) and Age (3rd)
## Age
## Class Child Adult
## 1st 1.50 79.75
## 2nd 6.00 65.25
## 3rd 19.75 156.75
## Crew 0.00 221.25
```

### 11.3.4. Binary Operatorsļ

In SectionĀ 5.5, we have stated that binary elementwise operations, such as addition or multiplication, preserve the attributes of the longer input or both (with the first argument preferred to the second) if they are of equal sizes.

Taking into account that:

an array is simply a flat vector equipped with the

`dim`

attribute, andwe refer to the respective

*default*methods when applying binary operators

allows us to deduce how `**+**`,
`**<=**`, `**&**`, etc. behave in a number
of different contexts.

**Array-Array.**
First, let us note what happens when we operate on two arrays
of identical dimensionalities.

```
(A <- rbind(c(1, 10, 100), c(-1, -10, -100)))
## [,1] [,2] [,3]
## [1,] 1 10 100
## [2,] -1 -10 -100
(B <- matrix(1:6, byrow=TRUE, nrow=2))
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 4 5 6
A + B # elementwise addition
## [,1] [,2] [,3]
## [1,] 2 12 103
## [2,] 3 -5 -94
A * B # elementwise multiplication (not: algebraic matrix multiply)
## [,1] [,2] [,3]
## [1,] 1 20 300
## [2,] -4 -50 -600
```

This is simply the addition and multiplication of the corresponding elements of two given matrices.

**Array-Scalar**.
Second, we can apply scalar-matrix operations:

```
(-1)*B
## [,1] [,2] [,3]
## [1,] -1 -2 -3
## [2,] -4 -5 -6
A^2
## [,1] [,2] [,3]
## [1,] 1 100 10000
## [2,] 1 100 10000
```

These multiplied each element in `B`

by -1
and squared every element in `A`

, respectively.

Also note that the behaviour of comparison operators is similar:

```
A >= 1 & A <= 100
## [,1] [,2] [,3]
## [1,] TRUE TRUE TRUE
## [2,] FALSE FALSE FALSE
```

**Array-Vector**.
Next, based on the recycling rule and the fact that
elements are ordered columnwisely, we get that:

```
B * c(10, 100)
## [,1] [,2] [,3]
## [1,] 10 20 30
## [2,] 400 500 600
```

multiplied every element in the first *row* by 10 and
each element in the second row by 100.

Note that if wish to multiply each element in the first, second, ā¦, etc.
*column* by the first, second, ā¦, etc. value in a vector,
we should *not* call:

```
B * c(1, 100, 1000)
## [,1] [,2] [,3]
## [1,] 1 2000 300
## [2,] 400 5 6000
```

but rather:

```
t(t(B) * c(1, 100, 1000))
## [,1] [,2] [,3]
## [1,] 1 200 3000
## [2,] 4 500 6000
```

or:

```
t(apply(B, 1, `*`, c(1, 100, 1000)))
## [,1] [,2] [,3]
## [1,] 1 200 3000
## [2,] 4 500 6000
```

Write a function which standardises the values in each column of
a given matrix: for each column, from every element,
subtract the mean and then divide it by the standard deviation.
Try to do it in a few different ways,
including via a call to **apply**,
**sweep**, **scale**, or based solely
on arithmetic operators.

Note

Some sanity checks are being done
on the `dim`

attributes, so not every configuration is possible.
Notice the following peculiarities:

```
getOption("error")
## NULL
A + t(B) # dim==c(2, 3) vs dim==c(3, 2)
## Error in A + t(B): non-conformable arrays
A * cbind(1, 10, 100) # this is too good to be true
## Error in A * cbind(1, 10, 100): non-conformable arrays
A * rbind(1, 10) # but A * c(1, 10) works...
## Error in A * rbind(1, 10): non-conformable arrays
A + 1:12
## Error in eval(expr, envir, enclos): dims [product 6] do not match the length of object [12]
A + 1:5 # partial recycling is okay
## Warning in A + 1:5: longer object length is not a multiple of shorter
## object length
## [,1] [,2] [,3]
## [1,] 2 13 105
## [2,] 1 -6 -99
```

## 11.4. Numerical Matrix Algebra (*)ļ

Many data analysis and machine learning algorithms, in their essence, involve quite simple matrix algebra and numerical mathematics. Suffice to say that anyone serious about data science and scientific computing should learn the necessary theory; see, for example, [25] and [26].

R is a convenient interface to the well-tested and stable
algorithms from, amongst others, **LAPACK** and
**BLAS**[5]. Below we mention only a few of them.
Note that there are many third-party packages featuring hundreds of
algorithms tackling differential equations, constrained
and unconstrained optimisation, etc.; exploring the relevant
CRAN Task Views
can give a good overview.

### 11.4.1. Matrix Multiplicationļ

`*****` performs elementwise multiplication.
For what we call (algebraic) matrix multiplication,
we should use the `**%*%**` operator.

Refreshing from a basic linear algebra course,
matrix multiplication can only be performed on two matrices of
*compatible sizes*: the number of columns in the left matrix must match
the number of rows in the right operand.

Given \(\mathbf{A}\in\mathbb{R}^{n\times p}\)
and \(\mathbf{B}\in\mathbb{R}^{p\times m}\), their multiply is a matrix
\(\mathbf{C}=\mathbf{A}\mathbf{B}\in\mathbb{R}^{n\times m}\)
such that \(c_{i,j}\) is the dot product of the *i*-th row in \(\mathbf{A}\)
and the *j*-th column in \(\mathbf{B}\):

for \(i=1,\dots,n\) and \(j=1,\dots,m\).

For instance:

```
(A <- rbind(c(1, 1, 1), c(-1, 1, 0)))
## [,1] [,2] [,3]
## [1,] 1 1 1
## [2,] -1 1 0
(B <- rbind(c(3, -1), c(1, 2), c(6, 1)))
## [,1] [,2]
## [1,] 3 -1
## [2,] 1 2
## [3,] 6 1
A %*% B
## [,1] [,2]
## [1,] 10 2
## [2,] -2 3
```

Note

When applying `**%*%**` on one or more flat vectors,
their dimensionality will be promoted automatically to make
the operation possible. Note that, however,
**c**`(a, b) `

**%*%**` `

**c**`(c, d)`

gives a scalar \(ac+bd\), and not a 2-by-2 matrix.

Further,
**crossprod**`(A, B)`

yields \(\mathbf{A}^T \mathbf{B}\) and
**tcrossprod**`(A, B)`

determines \(\mathbf{A} \mathbf{B}^T\)
more efficiently than relying on `**%*%**`.
Note that we can omit the second argument and get
\(\mathbf{A}^T \mathbf{A}\) and \(\mathbf{A} \mathbf{A}\), respectively

```
crossprod(c(1, 1)) # Euclidean norm squared
## [,1]
## [1,] 2
crossprod(c(1, 1), c(-1, 1)) # dot product of two vectors
## [,1]
## [1,] 0
crossprod(A) # same as t(A) %*% A, i.e., dot products of all column pairs
## [,1] [,2] [,3]
## [1,] 2 0 1
## [2,] 0 2 1
## [3,] 1 1 1
```

Recall that if the dot product of two vectors is equal to 0, we say that they are orthogonal (perpendicular).

(*)
Write your own versions of **cov** and **cor**:
functions to compute the covariance and correlation matrices.
Make use of the fact that the former can be determined
with **crossprod** based on a centred version of an input matrix.

### 11.4.2. Solving Systems of Linear Equationsļ

The **solve** function can be used to solve *m* systems
of *n* linear equations of the form
\(\mathbf{A}\mathbf{X}=\mathbf{B}\),
where \(\mathbf{A}\in\mathbb{R}^{n\times n}\) and \(\mathbf{X},\mathbf{B}\in\mathbb{R}^{n\times m}\)
(via the LU decomposition with partial pivoting and row interchanges).

### 11.4.3. Norms and Metricsļ

Given an *n*-by-*m* matrix \(\mathbf{A}\),
calling **norm**`(A, "1")`

,
**norm**`(A, "2")`

,
and **norm**`(A, "I")`

,
we can compute the operator norms:

where \(\sigma_1\) gives the largest singular value (see below).

Also, passing `"F"`

as the second argument
yields the Frobenius norm,
\(\|\mathbf{A}\|_F = \sqrt{\sum_{i=1}^n \sum_{j=1}^m a_{i,j}^2}\),
and `"M"`

computes the max norm,
\(\|\mathbf{A}\|_M = \max_{{i=1,\dots,n\atop j=1,\dots,m}} |a_{i,j}|\).

Note that if \(\mathbf{A}\) is a column vector, then \(\|\mathbf{A}\|_F\) and \(\|\mathbf{A}\|_2\) are equivalent and are referred to as the Euclidean norm. Moreover, \(\|\mathbf{A}\|_M=\|\mathbf{A}\|_I\) give the supremum norm and outputs \(\|\mathbf{A}\|_1\) the Manhattan (taxicab) one.

Given an *n*-by-*m* matrix \(\mathbf{A}\) representing
*m* vectors in \(\mathbb{R}^n\), normalise each column
so that you obtain *m* unit vectors,
i.e., whose Euclidean norm is 1.

Further, **dist** determines all pairwise distances
between a set of *n* vectors in \(\mathbb{R}^m\),
written as a *n* by *m* matrix.

For example, let us consider three vectors in \(\mathbb{R}^2\):

```
(X <- rbind(c(1, 1), c(1, -2), c(0, 0)))
## [,1] [,2]
## [1,] 1 1
## [2,] 1 -2
## [3,] 0 0
as.matrix(dist(X, "euclidean"))
## 1 2 3
## 1 0.0000 3.0000 1.4142
## 2 3.0000 0.0000 2.2361
## 3 1.4142 2.2361 0.0000
```

From that we see that the distance between the 1st and the 3rd vector is ca. 1.41421. Euclidean, maximum, Manhattan, and Canberra distances/metrics are available, amongst others.

**dist** returns an object of S3 class `dist`

.
Inspect how it is represented.

**adist** implements a couple of string metrics.
For example:

```
x <- c("spam", "bacon", "eggs", "spa", "spams", "legs")
names(x) <- x
(d <- adist(x))
## spam bacon eggs spa spams legs
## spam 0 5 4 1 1 4
## bacon 5 0 5 5 5 5
## eggs 4 5 0 4 4 2
## spa 1 5 4 0 2 4
## spams 1 5 4 2 0 4
## legs 4 5 2 4 4 0
```

gives the Levenshtein distances between each pair of strings.
In particular, we need two edit operations (character
insertions, deletions, or replacements) to turn `"eggs"`

into `"legs"`

(add `l`

and remove `g`

).

Objects of class `dist`

can be used to perform hierarchical
clusterings of datasets. For example:

```
h <- hclust(as.dist(d), method="average") # see also: plot(h, labels=x)
cutree(h, 3)
## spam bacon eggs spa spams legs
## 1 2 3 1 1 3
```

yields a grouping into 3 clusters determined by the average linkage
(`"legs"`

and `"eggs"`

are grouped together, `"spam"`

, `"spa"`

, `"spams"`

form another cluster, and `"bacon"`

is a singleton).

### 11.4.4. Eigenvalues and Eigenvectorsļ

**eigen** returns a sequence of eigenvalues
\((\lambda_1,\dots,\lambda_n)\) (ordered nondecreasingly w.r.t. \(|\lambda_i|\))
and a matrix \(\mathbf{V}\) whose columns define the corresponding
eigenvectors (scaled to unit length) of a given matrix \(\mathbf{X}\).
To recall, by definition it holds that
\(\mathbf{X}\mathbf{v}_{\cdot,i}=\lambda_i\mathbf{v}_{\cdot,i}\).

Here are the eigenvalues and the corresponding eigenvectors of an example matrix (defining rotation in 2D by \(\pi/3\)):

```
(R <- rbind(c(cos(pi/3), -sin(pi/3)), c(sin(pi/3), cos(pi/3))))
## [,1] [,2]
## [1,] 0.50000 -0.86603
## [2,] 0.86603 0.50000
eigen(R)
## eigen() decomposition
## $values
## [1] 0.5+0.86603i 0.5-0.86603i
##
## $vectors
## [,1] [,2]
## [1,] 0.70711+0.00000i 0.70711+0.00000i
## [2,] 0.00000-0.70711i 0.00000+0.70711i
```

Consider a pseudorandom sample from a bivariate[6] normal distribution; see FigureĀ 11.1.

```
Z <- matrix(rnorm(2000), ncol=2) # independent N(0, 1)
Z <- Z %*% rbind(c(1, 0), c(0, sqrt(5))) # scaling
Z <- Z %*% R # rotation
Z <- t(c(10, -5) + t(Z)) # translation
plot(Z, asp=1)
```

It is known that eigenvectors of the covariance matrix correspond to the principal components of the original dataset and the eigenvalues give the variance explained by them:

```
eigen(cov(Z))
## eigen() decomposition
## $values
## [1] 5.18609 0.98386
##
## $vectors
## [,1] [,2]
## [1,] -0.86715 0.49804
## [2,] -0.49804 -0.86715
```

this roughly corresponds to the principal directions \([\sin(\pi/3), \cos(\pi/3)]\) and the thereto-orthogonal \([\cos(\pi/3), -\sin(\pi/3)]\) with variances of 5 and 1, respectively. Still, this method of performing a PCA is not particularly numerically stable; see below for an alternative.

### 11.4.5. QR Decompositionļ

We say that a real *n*-by-*m* matrix \(\mathbf{Q}\), \(n\ge m\),
is orthogonal, whenever \(\mathbf{Q}^T \mathbf{Q} = \mathbf{I}\)
(identity matrix) which is equivalent to its columns being
orthogonal unit vectors
(note that if \(\mathbf{Q}\) is a square matrix, then
\(\mathbf{Q}^T=\mathbf{Q}^{-1}\) if and only if
\(\mathbf{Q}^T \mathbf{Q} = \mathbf{Q} \mathbf{Q}^T = \mathbf{I}\)).

Let \(\mathbf{A}\) be a real[7] *n*-by-*m* matrix with \(n\ge m\).
Then \(\mathbf{A}=\mathbf{Q}\mathbf{R}\) is its QR
decomposition (in the so-called narrow form),
if \(\mathbf{Q}\) is an orthogonal *n*-by-*m* matrix
and \(\mathbf{R}\) is an upper triangular
*m*-by-*m* one.

The **qr** function returns an object of S3 class `qr`

from which we can extract the two components; see the
**qr.Q** and **qr.R** functions.

Let \(\mathbf{X}\) be an *n*-by-*m* data matrix, representing
*n* points in \(\mathbb{R}^m\), and a vector
\(\mathbf{y}\in\mathbb{R}^n\) of the *desired* outputs corresponding
to each input.
For fitting a linear model \(\mathbf{x}^T \boldsymbol\theta\),
where \(\boldsymbol\theta\) is a vector of *m* parameters,
we can use the method of least squares, which minimises

It might be shown that if
\(\mathbf{X}=\mathbf{Q}\mathbf{R}\),
then \(\boldsymbol\theta =
\left(\mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T \mathbf{y}
= \mathbf{R}^{-1}\mathbf{Q}^T\mathbf{y}\), which can conveniently be determined
via a call to **qr.coef**.

In particular, we can fit a simple linear regression model \(y=ax+b\) by considering \(\mathbf{X} = [x, 1]\) and \(\boldsymbol\theta = [a, b]\), for example (see FigureĀ 11.2):

```
x <- cars[["speed"]]
y <- cars[["dist"]]
X <- cbind(x, 1) # the model is theta[1]*x + theta[2]*1
qrX <- qr(X)
(theta <- solve(qr.R(qrX)) %*% t(qr.Q(qrX)) %*% y) # or: qr.coef(qrX, y)
## [,1]
## x 3.9324
## -17.5791
plot(x, y, xlab="speed", ylab="dist") # scatter plot
abline(theta[2], theta[1], lty=2) # add the regression line
```

Note that **solve** with one argument determines the inverse
of a given matrix.
The fitted model is \(y=3.93241x-17.5791\).

The same approach is used by **lm.fit**, which is
the workhorse behind the **lm** method accepting an R formula
(which some readers might be familiar with; compare SectionĀ 16.5).

```
lm.fit(cbind(x, 1), y)[["coefficients"]] # also: lm(dist~speed, data=cars)
## x
## 3.9324 -17.5791
```

### 11.4.6. SVD Decompositionļ

Given a real *n*-by-*m* matrix \(\mathbf{X}\),
its singular value decomposition (SVD) is given by
\(\mathbf{X}=\mathbf{U} \mathbf{D} \mathbf{V}^T\),
where \(\mathbf{D}\) is a *p*-by-*p* diagonal matrix
(featuring the so-called singular values of \(\mathbf{X}\),
\(d_{1,1}\ge\dots\ge d_{p,p}\ge 0\), \(p=\min\{n,m\}\))
and \(\mathbf{U}\), \(\mathbf{V}\) are orthogonal matrices
of size *n*-by-*p* and *m*-by-*p*, respectively.

**svd** may not only be used to determine the solution to
linear regression[8]
but also to perform the principal component analysis[9].
Namely, \(\mathbf{V}\) gives the eigenvectors of \(\mathbf{X}^T \mathbf{X}\).
Assuming that \(\mathbf{X}\) is centred at \(\boldsymbol{0}\),
the latter is precisely its scaled covariance matrix.

Continuing the PCA example above, we can determine the principal directions also by calling:

```
Zc <- apply(Z, 2, function(x) x-mean(x)) # centred version of Z
svd(Zc)[["v"]]
## [,1] [,2]
## [1,] -0.86715 0.49804
## [2,] -0.49804 -0.86715
```

## 11.5. S4 Classes (*)ļ

The concept of the S3-style object oriented programming
is based on a brilliantly simple idea (see Chapter 10):
calling a generic **f**`(x)`

automatically dispatches to a method
**f.class_of_x**`(x)`

or
**f.default**`(x)`

in the case where the former does not exist.
Naturally, it has some inherent limitations:

classes cannot be formally defined; the

`class`

attribute may be assigned arbitrarily onto any object[10],argument dispatch is performed only[11] with regard to one data type[12].

In most cases, and with appropriate level of mindfulness, this is not a problem at all. However, it is a typical condition of programmers who come to our world from more mainstream languages (e.g., C++; yours truly included) until they appreciate the true beauty of Rās being somewhat different. Before they fully develop such an acquired taste, though, they grow restless as āR is not a real object oriented system because it lacks polymorphism, encapsulation, formal inheritance, and so on and so forth, and something must be done about itā. The truth is that it had not have to, but with high probability it would have anyway in one way or another.

And so when the fourth version of the S language was introduced in 1998
(see [4]),
it brought a new object oriented system which we are used to referring
to as S4. Its R version has been implemented in the **methods**
package. Below we discuss it briefly; for more details, see
**help**`("Classes_Details")`

and **help**`("Methods_Details")`

as well as
[5] and [6].

Note

(*)
S4 was loosely inspired by the Common Lisp Object System
(with its `defclass`

, `defmethod`

, etc.; see, e.g., [15]).
In the current authorās opinion, the S4 system is somewhat an afterthought.
Due to appendages like this, R seems like a patchwork language;
suffice it to say that it was not the
last attempt to do a somewhat more real OOP in the overall functional
R: the story will continue in SectionĀ 16.1.5.

The main problem with all the OOP approaches is that each of them is
parallel to S3 which never lost its popularity and is still at the
very core of our language.
We are thus covering them for the sake of completeness,
because thatās what must be done. After all,
with non-zero probability, the reader will sooner or later
come across such objects
(e.g., below we explain the meaning of notation like `x@slot`

).
Also, yours truly rebelliously suggests taking
statements such as
āfor new projects, it is recommended to use the more flexible
and robust S4 scheme provided in the **methods** packageā
(see **help**`("UseMethod")`

) with a pinch of salt.

### 11.5.1. Defining S4 Classesļ

An S4 class can formally be registered by means of a call to
**setClass**.

For instance:

```
library("methods") # in the case where it is not auto-loaded
setClass("categorical", slots=c(data="integer", levels="character"))
```

defines a class named `categorical`

with two slots
`data`

and `levels`

being integer and character vectors, respectively.
Note that this notation is already quite peculiar: there is no assignment
which would suggest that we have introduced something novel.

An object of the above class can be instantiated
by calling **new**:

```
z <- new("categorical", data=c(1L, 2L, 2L, 1L, 1L), levels=c("a", "b"))
print(z)
## An object of class "categorical"
## Slot "data":
## [1] 1 2 2 1 1
##
## Slot "levels":
## [1] "a" "b"
```

That `z`

is of the recently-introduced class can be verified as follows:

```
is(z, "categorical")
## [1] TRUE
class(z) # also: attr(z, "class")
## [1] "categorical"
## attr(,"package")
## [1] ".GlobalEnv"
```

Important

Some R packages will be importing from the **methods**
only for the sake of being able to access the convenient **is**
function ā it does not mean they are defining new S4 classes.

Note

S4 objects are marked as being of the following basic type:

```
typeof(z)
## [1] "S4"
```

For technical details on how they are internally represented, see Section 1.12 in [50]. In particular, in our case, all the slots are simply stored as object attributes:

```
attributes(z)
## $data
## [1] 1 2 2 1 1
##
## $levels
## [1] "a" "b"
##
## $class
## [1] "categorical"
## attr(,"package")
## [1] ".GlobalEnv"
```

### 11.5.2. Accessing Slotsļ

Reading or writing slot contents can be done by means of the
`**@**` operator and
the **slot** function or their replacement versions.

```
z@data # or slot(z, "data")
## [1] 1 2 2 1 1
z@levels <- c("A", "B")
```

Note

The `**@**` operator can only be used on S4 objects
and some sanity checks are automatically performed:

```
z@unknown <- "spam"
## Error in (function (cl, name, valueClass) : 'unknown' is not a slot in class "categorical"
z@data <- "spam"
## Error in (function (cl, name, valueClass) : assignment of an object of class "character" is not valid for @'data' in an object of class "categorical"; is(value, "integer") is not TRUE
```

### 11.5.3. Defining Methodsļ

For the S4 counterparts of the S3 generics
(SectionĀ 10.2), see **help**`("setGeneric")`

.
Luckily, there is a good degree of interoperability between
the S3 and S4 systems.

Let us start by introducing
a new method for the well-known **as.character** generic.
Instead of defining **as.character.categorical**,
we need to register a new routine with **setMethod**.

```
setMethod(
"as.character", # name of the generic
"categorical", # class of 1st arg; or: signature=c(x="categorical")
function(x, ...) # method definition
x@levels[x@data]
)
```

Testing:

```
as.character(z)
## [1] "A" "B" "B" "A" "A"
```

The S4 counterpart of **print** is **show**:

```
setMethod(
"show",
"categorical",
function(object) {
x_character <- as.character(object)
print(x_character) # calls `print.default`
cat(sprintf("Categories: %s\n",
paste(object@levels, collapse=", ")))
}
)
```

Interestingly, it is involved automatically
upon a call to **print**:

```
print(z) # calls `show` for `categorical`
## [1] "A" "B" "B" "A" "A"
## Categories: A, B
```

Methods that dispatch on the type of multiple arguments are possible too, for example:

```
setMethod(
"split",
c(x="ANY", f="categorical"),
function (x, f, drop=FALSE, ...)
split(x, as.character(f), drop=drop, ...)
)
```

allows the first argument to be of any type (like a default method), and:

```
setMethod(
"split",
c(x="matrix", f="categorical"),
function (x, f, drop=FALSE, ...)
lapply(
split(seq_len(NROW(x)), f, drop=drop, ...), # calls the above
function(i) x[i, , drop=FALSE])
)
```

is a version tailored for matrices. Testing:

```
A <- matrix(1:35, nrow=5) # whatever
split(A, z) # matrix,categorical
## $A
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1 6 11 16 21 26 31
## [2,] 4 9 14 19 24 29 34
## [3,] 5 10 15 20 25 30 35
##
## $B
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 2 7 12 17 22 27 32
## [2,] 3 8 13 18 23 28 33
split(1:5, z) # ANY,categorical
## $A
## [1] 1 4 5
##
## $B
## [1] 2 3
```

Overload the `**[**` operator for the `categorical`

class

### 11.5.4. Defining Constructorsļ

We can also overload the **initialize** method
which is automatically called by **new**:

```
setMethod(
"initialize", # class name
"categorical", # method name
function(.Object, x) { # the method itself
x <- as.character(x) # see above
xu <- unique(sort(x)) # drops NAs
.Object@data <- match(x, xu)
.Object@levels <- xu
.Object # return value - a modified object
}
)
```

This allows for constructing new objects of class `categorical`

based on an object like `x`

above, for instance:

```
w <- new("categorical", c("a", "c", "a", "a", "d", "c"))
print(w)
## [1] "a" "c" "a" "a" "d" "c"
## Categories: a, c, d
```

Note that we have not set the two slots directly.
They were automatically taken care of by **initialize**.

Set up a validating method for our class; see **help**`("setValidity")`

.

### 11.5.5. Inheritanceļ

New S4 classes can be derived from existing ones, for instance:

```
setClass("binary", contains="categorical")
```

is a child class inhering all slots from its parent. We can, for example, overload the initialisation method for it:

```
setMethod(
"initialize",
"binary",
function(.Object, x)
{
x <- as.character(as.integer(as.logical(x)))
xu <- c("0", "1")
.Object@data <- match(x, xu)
.Object@levels <- xu
.Object
}
)
```

Testing:

```
new("binary", c(TRUE, FALSE, TRUE, FALSE, NA, TRUE))
## [1] "1" "0" "1" "0" NA "1"
## Categories: 0, 1
```

Note that we are still using the **show** method
of the parent class.

### 11.5.6. A Note on the **Matrix** Packageļ

The **Matrix** package is perhaps the most widely known showcase of
the S4 object-orientation (and that is the reason why
we cover S4 in this very chapter). It defines classes and methods for
dense and sparse matrices, including rectangular, symmetric,
triangular, band, and diagonal ones.

For instance, large graph (e.g., in network sciences)
or preference (e.g., in recommender systems) data can be represented
using sparse matrices (those which feature many 0s; after all,
it is extremely more common for two vertices in a network
to *not* be joined by an edge than to be connected).

For example:

```
library("Matrix")
(A <- Diagonal(x=1:5))
## 5 x 5 diagonal matrix of class "ddiMatrix"
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 . . . .
## [2,] . 2 . . .
## [3,] . . 3 . .
## [4,] . . . 4 .
## [5,] . . . . 5
```

created a real diagonal matrix. Moreover:

```
B <- as(A, "sparseMatrix")
B[1, 2] <- 7
B[4, 1] <- 42
print(B)
## 5 x 5 sparse Matrix of class "dgCMatrix"
##
## [1,] 1 7 . . .
## [2,] . 2 . . .
## [3,] . . 3 . .
## [4,] 42 . . 4 .
## [5,] . . . . 5
```

yields a general sparse real matrix in the CRC (compressed, sparse, column-oriented) format.

For more information on the package, see
**vignette**`(package="Matrix")`

.

## 11.6. Exercisesļ

Let `X`

be a matrix with `dimnames`

set, e.g.:

```
X <- matrix(1:12, byrow=TRUE, nrow=3) # example matrix
dimnames(X)[[2]] <- c("a", "b", "c", "d") # set column names
print(X)
## a b c d
## [1,] 1 2 3 4
## [2,] 5 6 7 8
## [3,] 9 10 11 12
```

Explain (in your own words) the meaning of the following expressions involving matrix subsetting. Note that not each of them is valid.

`X[1, ]`

,`X[, 3]`

,`X[, 3, drop=FALSE]`

,`X[3]`

,`X[, "a"]`

,`X[,`

**c**`("a", "b", "c")]`

,`X[, -2]`

,`X[X[,1] > 5, ]`

,`X[X[,1]>5,`

**c**`("a", "b", "c")]`

,`X[X[,1]>=5 & X[,1]<=10, ]`

,`X[X[,1]>=5 & X[,1]<=10,`

**c**`("a", "b", "c")]`

,`X[,`

**c**`(1, "b", "d")]`

.

Assuming that `X`

is an array,
what are the differences between the following indexing schemes?

`X["1", ]`

vs`X[1, ]`

,`X[, "a", "b", "c"]`

vs`X["a", "b", "c"]`

vs`X[,`

**c**`("a", "b", "c")]`

vs`X[`

**c**`("a", "b", "c")]`

,`X[1]`

vs`X[, 1]`

vs`X[1, ]`

,`X[X>0]`

vs`X[X>0, ]`

vs`X[, X>0]`

,`X[X[, 1]>0]`

vs`X[X[, 1]>0,]`

vs`X[,X[,1]>0]`

,`X[X[, 1]>5, X[1, ]<10]`

vs`X[X[1, ]>5, X[, 1]<10]`

.

For a given real *n*-by-*m* matrix \(\mathbf{X}\),
determine the bounding hyperrectangle of
thusly encoded *n* input points in an *m*-dimensional space.
Return a 2-by-*m* matrix \(\mathbf{B}\)
with \(b_{1,j}=\min_i x_{i,j}\) and \(b_{2,j}=\max_i x_{i,j}\).

Let \(\mathbf{t}\) be vector of *n* integers in \(\{1,\dots,k\}\).
Write a function to one-hot-encode each \(t_i\):
return a 0-1 matrix \(\mathbf{R}\) of size *n*-by-*k* such that \(r_{i,j}=1\)
if and only if \(j = t_i\).
For example, if \(\mathbf{t}=[1, 2, 3, 2, 4]\) and \(k=4\), then:

On a side note, such a representation is useful when solving,
e.g., a multiclass classification problem by means of *k* binary classifiers.

Then, write another function, but this time setting \(r_{i,j}=1\) if and only if \(j\ge t_i\), e.g.:

Important

Kind reminder: as usual, try to solve
all the exercises without the use of explicit
**for** and **while** loops (provided that it is possible).

Given an *n*-by-*k* real matrix, apply the softmax
function on each row, i.e., map
\(x_{i,j}\) to \(\frac{\exp(x_{i,j})}{\sum_{l=1}^k \exp(x_{i,l})}\).
Then, one-hot decode the values in each row, i.e., find the
column number with the greatest value. Return a vector of size *n*
with elements in \(\{1,\dots,k\}\).

Assume that an *n*-by-*d* real matrix \(\mathbf{X}\) represents
*n* points in \(\mathbb{R}^d\). Write a function (but do not refer to
**dist**) that determines the pairwise distances between all
the *n* points and a given \(\mathbf{y}\in\mathbb{R}^d\).
Return a vector \(\mathbf{d}\) of length *n*
with \(d_{i}=\|\mathbf{x}_{i,\cdot}-\mathbf{y}\|_2\).

Let \(\mathbf{X}\) and \(\mathbf{Y}\)
be two real-valued matrices of sizes *n*-by-*d* and
*m*-by-*d*, respectively, representing two sets of points in \(\mathbb{R}^d\).
Return an integer vector \(\mathbf{r}\) of length *m*
such that \(r_i\) indicates the index of the
point in \(\mathbf{X}\) with the least distance to (the closest to)
the *i*-th point in \(\mathbf{Y}\), i.e.,
\(r_i = \mathrm{arg}\min_j \|\mathbf{x}_{j,\cdot}-\mathbf{y}_{i,\cdot}\|_2\).

Write your own version of the built-in **utils**::**combn**.

Time series are vectors or matrices of class `ts`

equipped with the `tsp`

attribute, amongst others.
Refer to **help**`("ts")`

for more information about
how they are represented and what S3 methods have been overloaded for them.

(*) Numeric matrices can be stored in a CSV file, amongst others.
Usually, we will be loading them via **read.csv**,
which returns a data frame (see Chapter 12),
for example:

```
X <- as.matrix(read.csv(
paste0(
"https://github.com/gagolews/teaching-data/",
"raw/master/marek/eurxxx-20200101-20200630.csv"
),
comment.char="#",
sep=","
))
```

Write your own function
**read_numeric_matrix**`(file_name, comment, sep)`

which is instead based on a few calls to **scan**.
Use **file** to establish a file connection to be able
to ignore the comment lines and fetch the column names
before reading the actual numeric values.

(*)
Using **readBin**,
read the `t10k-images-idx3-ubyte.gz`

from the MNIST database homepage.
The output object should be a three-dimensional, 10000-by-28-by-28 array
with real elements between 0 and 255.
Refer to the File Formats section therein for more details.

(**)
Circular convolution of discrete-valued multidimensional signals
can be performed by means
of **fft** and matrix multiplication,
whereas affine transformations require only the latter.
Apply various image transformations such as sharpening,
shearing, and rotating on the MNIST digits
and plot the results using the **image** function.

(*) Using **constrOptim**,
find the minimum of the Constrained Betts Function
\(f(x_1, x_2) = 0.01 x_1^2 + x_2^2 - 100\)
with linear constraints \( 2\le x_1 \le 50\),
\(-50 \le x_2 \le 50\), and \(10 x_1 \ge 10 + x_2\).
(**) Also, use **solve.QP** from the **quadprog**
package of find the minimum.