3. Logical Vectors

The open-access textbook Deep R Programming by Marek Gagolewski is, and will remain, freely available for everyone’s enjoyment (also in PDF). It is a non-profit project. This book is still a work in progress. Beta versions of Chapters 1–12 are already complete, but there will be more. In the meantime, any bug/typos reports/fixes are appreciated. Although available online, it is a whole course; it should be read from the beginning to the end. Refer to the Preface for general introductory remarks. Also, check out my other book, Minimalist Data Wrangling with Python [20].

There are three logical constants in R. Wait… how many?

3.1. Creating Logical Vectors

R defines three logical constants: TRUE, FALSE, and NA – meant to represent “yes”, “no”, and “???”, respectively. Each of them, when instantiated, is an atomic vector of length one.

Some of the functions we introduced in the previous chapter can be used to generate logical vectors as well:

c(TRUE, FALSE, FALSE, NA, TRUE, FALSE)
## [1]  TRUE FALSE FALSE    NA  TRUE FALSE
rep(c(TRUE, FALSE, NA), each=2)
## [1]  TRUE  TRUE FALSE FALSE    NA    NA
sample(c(TRUE, FALSE), 10, replace=TRUE, prob=c(0.8, 0.2))
##  [1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE

Note

“T” is a synonym for TRUE and “F” stands for FALSE. However, these are not reserved keywords and can be reassigned any other values. Therefore, we advise against relying on them, and we will never use them in this course.

Also note that the logical missing value is spelled simply as “NA” and not “NA_logical_”. The fact that the logical “NA” and the numeric “NA_real_” are, for the sake of our mental well-being, both printed as “NA” on the R console does not mean they are identical; see Section 4.1 for discussion.
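The two remarks above can be checked directly. What follows is only a small sketch: the variable name `reassigned` is ours, and `local` is used merely so that the experiment with “T” does not leak into the rest of the session.

```r
typeof(NA)               # "logical"
typeof(NA_real_)         # "double"
identical(NA, NA_real_)  # FALSE: same printout, different types

reassigned <- local({
    T <- 0               # legal: T is an ordinary variable, not a keyword
    isTRUE(T)            # FALSE within this local scope
})
reassigned
```

Outside the local block, “T” still refers to the built-in TRUE, but nothing prevents anyone from overwriting it there too.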

3.2. Comparing Elements

3.2.1. Vectorised Comparison Operators

Logical vectors frequently come into being as results of various testing activities.

In particular, the binary operators:

  • `<` (less than),

  • `<=` (less than or equal),

  • `>` (greater than),

  • `>=` (greater than or equal),

  • `==` (equal),

  • `!=` (not equal),

compare the corresponding elements of two numeric vectors and output a logical vector.

1 < 3
## [1] TRUE
c(1, 2, 3, 4) == c(2, 2, 3, 8)
## [1] FALSE  TRUE  TRUE FALSE
1:10 <= 10:1
##  [1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE

Thus, they operate in an elementwise manner. Moreover, the recycling rule is applied if necessary:

3 < 1:5  # c(3, 3, 3, 3, 3) < c(1, 2, 3, 4, 5)
## [1] FALSE FALSE FALSE  TRUE  TRUE
c(1, 4) == 1:4  # c(1, 4, 1, 4) == c(1, 2, 3, 4)
## [1]  TRUE FALSE FALSE  TRUE

Therefore, we can say that they are vectorised in the same manner as the arithmetic operators `+`, `*`, etc.; compare Section 2.4.1.

3.2.2. Testing for NA, NaN, and Inf

Comparisons against missing values and not-a-numbers yield NAs. Therefore, instead of the incorrect “x == NA_real_” or “x == NaN”, testing for missingness should be performed via a call to the vectorised is.na function.

is.na(c(NA_real_, Inf, -Inf, NaN, -1, 0, 1))
## [1]  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE
is.nan(c(NA_real_, Inf, -Inf, NaN, -1, 0, 1))
## [1] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
is.na(c(TRUE, FALSE, NA, TRUE))  # works for logical vectors too
## [1] FALSE FALSE  TRUE FALSE

Moreover, is.finite is noteworthy, because it returns FALSE on Infs, NA_real_s and NaNs.

is.finite(c(NA_real_, Inf, -Inf, NaN, -1, 0, 1))
## [1] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE

See also the more specific is.nan and is.infinite.
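To see why the direct comparisons are of no use, consider the following minimal sketch on a made-up vector x (the values are ours, chosen only for illustration):

```r
x <- c(1, NA_real_, NaN, Inf)
x == NA_real_    # NA NA NA NA: tells us nothing
x == NaN         # NA NA NA NA: likewise
is.na(x)         # FALSE  TRUE  TRUE FALSE
is.nan(x)        # FALSE FALSE  TRUE FALSE
is.infinite(x)   # FALSE FALSE FALSE  TRUE
```

Every comparison against a missing value or a not-a-number is itself missing, even for the elements that are perfectly well-defined.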

3.2.3. Dealing with Floating Point Round-Off Errors (*)

In mathematics, real numbers are merely an idealisation. In practice, however, it is impossible to store them with infinite precision (think \(\pi=3.1415926535897932384626433...\)): computer memory is limited and our time is precious.

Therefore, a widely agreed upon consensus had to be reached. In R, we rely on the so-called double-precision floating point format. Floating point means that the numbers can be both small (close to zero) and large: \(\pm 2.23\times 10^{-308}\) and \(\pm 1.79\times 10^{308}\) are both acceptable.

Note

2.23e-308 == 0.00000000000000000000000000000000000000000000000000
               00000000000000000000000000000000000000000000000000
               00000000000000000000000000000000000000000000000000
               00000000000000000000000000000000000000000000000000
               00000000000000000000000000000000000000000000000000
               000000000000000000000000000000000000000000000000000000000223

1.79e308 == 17900000000000000000000000000000000000000000000000000000000
                     00000000000000000000000000000000000000000000000000
                     00000000000000000000000000000000000000000000000000
                     00000000000000000000000000000000000000000000000000
                     00000000000000000000000000000000000000000000000000
                     00000000000000000000000000000000000000000000000000

These two are quite distant from each other.

Every numeric value takes 8 bytes (or equivalently 64 bits) of memory. We are, however, able to store only about 15-17 decimal digits:

print(0.12345678901234567890123456789012345678901234, digits=22)  # 22 is max
## [1] 0.1234567890123456773699

which limits the precision of our computations. The about part is – unfortunately – due to the numbers’ being written in the computer-friendly binary, not human-aligned decimal, base. This can lead to some unexpected outcomes.

In particular:

  • 0.1 cannot be represented exactly, because it cannot be written as a finite series of reciprocals of powers of 2 (it holds \(0.1=2^{-4}+2^{-5}+2^{-8}+2^{-9}+\dots\)). This leads to surprising results such as:

    0.1 + 0.1 + 0.1 == 0.3
    ## [1] FALSE
    

    Despite the fact that what follows does not show anything suspicious:

    c(0.1, 0.1 + 0.1 + 0.1, 0.3)
    ## [1] 0.1 0.3 0.3
    

    Printing involves rounding, hence, in the above context, is misleading. Above, we have something more like:

    print(c(0.1, 0.1 + 0.1 + 0.1, 0.3), digits=22)
    ## [1] 0.1000000000000000055511 0.3000000000000000444089
    ## [3] 0.2999999999999999888978
    
  • All integers between \(-2^{53}\) and \(2^{53}\) are stored exactly – this is good news. However, the next integer, \(2^{53}+1\), cannot be represented exactly:

    2^53 + 1 == 2^53
    ## [1] TRUE
    
  • The above suggests that, more generally, the order of operations may matter, in particular, the associativity property may be violated when dealing with numbers of different orders of magnitude:

    2^53 + 2^-53 - 2^53 - 2^-53  # should be == 0.0
    ## [1] -1.1102e-16
    
  • Some numbers may be just too large, too small, or too close to zero to be represented exactly:

    c(sum(2^((1023-52):1023)), sum(2^((1023-53):1023)))
    ## [1] 1.7977e+308         Inf
    c(2^(-1022-52), 2^(-1022-53))
    ## [1] 4.9407e-324  0.0000e+00
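The limits discussed in the above list need not be memorised. Although it is not mentioned in this chapter, base R exposes them in the built-in list named .Machine; the sketch below merely inspects a few of its fields:

```r
.Machine$double.xmin     # smallest positive normalised double, ca. 2.23e-308
.Machine$double.xmax     # largest finite double, ca. 1.79e+308
.Machine$double.digits   # 53 significant bits
.Machine$double.eps      # 2^-52, the gap between 1 and the next double
1 + .Machine$double.eps == 1     # FALSE
1 + .Machine$double.eps/2 == 1   # TRUE: the sum is rounded back to 1
```

Note that 4.9407e-324 from the example above is smaller than double.xmin: it is a so-called subnormal number, stored with reduced precision.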
    

Important

The double-precision floating point format (IEEE 754) is not specific to R: it is used by most other computing environments, including Python and C++.

For discussion, see [27, 30, 34] ([26] can be of particular interest to the general statistical/data analysis audience).

Can we do anything about these issues?

First, when dealing with integers of reasonable order of magnitude (a frequent case where we are dealing with various resource or case IDs in our datasets), rest assured that we are safe: their comparison, addition, subtraction, and multiplication are always precise.

In all other cases (including applying other operations on integers, e.g., division or sqrt), we need to be very careful with comparisons, especially involving testing for equality, `==`.

The sole fact that \(\sin \pi = 0\), mathematically speaking, does not mean that we should expect that:

sin(pi) == 0
## [1] FALSE

Instead, they are so close to each other that we can treat the difference between them as negligible. Thus, in practice, instead of testing if \(x = y\), we will be considering:

  • \(|x-y|\) (absolute error) or

  • \(\frac{|x-y|}{|y|}\) (relative error; which takes the order of magnitude of the numbers into account but obviously cannot be applied if \(y\) is very close to \(0\)),

and determining if these are less than some assumed error margin, \(\varepsilon>0\), say, \(10^{-8}\) or \(2^{-26}\).

For example:

abs(sin(pi) - 0) < 2^-26
## [1] TRUE
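Both error measures can be applied to the troublesome sum from the beginning of this section. This is only a sketch: the names x, y, and eps are ours, and the margin \(2^{-26}\) is one of the two values suggested above. The last line calls base R's all.equal, which is not discussed in this chapter; it performs a similar tolerance-based comparison and is wrapped in isTRUE, because it returns a descriptive message instead of FALSE when the values differ.

```r
x <- 0.1 + 0.1 + 0.1
y <- 0.3
eps <- 2^-26
x == y                     # FALSE
abs(x - y) < eps           # TRUE (absolute error)
abs(x - y)/abs(y) < eps    # TRUE (relative error; y is far from 0)
isTRUE(all.equal(x, y))    # TRUE
```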

Note

Note that rounding can sometimes have a similar effect to testing for almost-equality in terms of the absolute error.

round(sin(pi), 8) == 0
## [1] TRUE

Important

Our recommendations are valid for the most popular applications of R, i.e., statistical and, more generally, scientific computing[1]. The datasets we handle on a daily basis do not represent accurate measurements themselves. Bah, the world itself is far from ideal. Therefore, we do not have to lose sleep over our not being able to precisely pinpoint the exact solution.

3.3. Logical Operations

3.3.1. Vectorised Logical Operators

The comparison operators such as `==` and `>` accept only two arguments. Their chaining is forbidden; a test which we would mathematically write as \(0 \le x \le 1\) (or \(x\in[0,1]\)) cannot be expressed as “0<=x<=1” in R.

Therefore, we need a way to combine two logical conditions so as to be able to state that “\(x\ge 0\) and, at the same time, \(x\le 1\)”.

In such situations, the following logical operators and functions come in handy:

  • `!` (not, negation; unary),

  • `&` (and, conjunction; are both predicates true?),

  • `|` (or, alternation; is at least one true?),

  • xor (exclusive-or, exclusive disjunction, either-or; is one and only one of the predicates true?).

They again act elementwisely and implement the recycling rule if necessary (and applicable).

x <- c(-10, -1, -0.25, 0, 0.5, 1, 5, 100)
(x >= 0) & (x <= 1)
## [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE
(x < 0) | (x > 1)
## [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE
!((x < 0) | (x > 1))
## [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE
xor(x >= -1, x <= 1)
## [1]  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE

Important

The vectorised `&` and `|` operators should not be confused with their scalar, short-circuit counterparts, `&&` and `||`, which we discuss in Section 8.1.4.

3.3.2. Operator Precedence Revisited

The operators introduced in this chapter have lower precedence than the arithmetic ones, including the binary `+` and `-`. Calling help("Syntax") reveals that we can extend our listing from Section 2.4.3 as follows:

  1. `<-` (right-to-left; least binding),

  2. `|`,

  3. `&`,

  4. `!` (unary),

  5. `<`, `>`, `<=`, `>=`, `==`, and `!=`,

  6. `+` and `-`,

  7. `*` and `/`,
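The consequences of this ordering are easiest to see on a small example. The vector x below is made up, and the comments give the fully parenthesised reading of each expression:

```r
x <- c(1, 2, 3)
!x == 2                  # !(x == 2):  TRUE FALSE  TRUE
x + 1 > 2 & x < 3        # ((x + 1) > 2) & (x < 3): FALSE  TRUE FALSE
TRUE | FALSE & FALSE     # TRUE | (FALSE & FALSE): TRUE
(TRUE | FALSE) & FALSE   # FALSE
```

When in doubt, adding round brackets costs nothing and makes the intent explicit.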

3.3.3. Dealing with Missingness

Operations involving missing values follow the principles of Łukasiewicz’s three-valued logic, which is based on common sense. For instance, “NA | TRUE” is TRUE, because or needs at least one argument to be TRUE to generate such a result. On the other hand, “NA | FALSE” is NA, because the result would be different depending on what we substituted NA for.

Let us take a moment to contemplate the operations’ truth tables for all the possible combinations of inputs:

u <- c(TRUE, FALSE, NA,  TRUE,  FALSE, NA,    TRUE, FALSE, NA)
v <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, NA,   NA,    NA)
!u
## [1] FALSE  TRUE    NA FALSE  TRUE    NA FALSE  TRUE    NA
u & v
## [1]  TRUE FALSE    NA FALSE FALSE FALSE    NA FALSE    NA
u | v
## [1]  TRUE  TRUE  TRUE  TRUE FALSE    NA  TRUE    NA    NA
xor(u, v)
## [1] FALSE  TRUE    NA  TRUE FALSE    NA    NA    NA    NA

3.3.4. Aggregating with all, any, and sum

Just like in the case of numeric vectors, we can summarise the contents of logical sequences.

all tests whether every element in a logical vector is equal to TRUE and any determines if there exists an element that is TRUE.

x <- runif(10000)
all(x <= 0.2)  # are all values in x <= 0.2?
## [1] FALSE
any(x <= 0.2)  # is there at least one element in x that is <= 0.2?
## [1] TRUE
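Both functions follow the three-valued logic from Section 3.3.3 when missing values are present. A short sketch on hand-made inputs (na.rm is an actual argument of both all and any):

```r
any(c(FALSE, NA, TRUE))        # TRUE: one TRUE settles the matter
any(c(FALSE, NA))              # NA: the answer depends on the unknown value
all(c(TRUE, NA, FALSE))        # FALSE: one FALSE settles the matter
all(c(TRUE, NA))               # NA
all(c(TRUE, NA), na.rm=TRUE)   # TRUE: missing values are skipped
```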

Note

The all function will frequently be used in conjunction with “==”. This is because the latter, as we have said above, is itself vectorised: it does not test whether a vector as a whole is equal to another one.

z <- c(1, 2, 3)
z == 1:3  # elementwise equal
## [1] TRUE TRUE TRUE
all(z == 1:3)  # elementwise equal summarised
## [1] TRUE

However, let us keep in mind the warning about the testing for exact equality of floating-point numbers stated in Section 3.2.3. Sometimes, considering absolute or relative errors might be more appropriate.

z <- sin((0:10)*pi)  # sin(0), sin(pi), sin(2*pi), ..., sin(10*pi)
all(z == 0.0)  # danger zone! please don't...
## [1] FALSE
all(abs(z - 0.0) < 1e-9)  # are the absolute errors negligible?
## [1] TRUE

We can also call sum on a logical vector. Taking into account that it interprets TRUE as numeric 1 and FALSE as 0 (more on this in Section 4.1), it will give us the number of elements equal to TRUE.

sum(x <= 0.2)  # how many elements in x are <= 0.2?
## [1] 1998

Also, by computing sum(l)/length(l) for a logical vector l, we can obtain the proportion (fraction) of its elements equal to TRUE. Equivalently:

mean(x <= 0.2)  # proportion of elements <= 0.2
## [1] 0.1998

Naturally, we expect “mean(runif(n) <= 0.2)” to be close to 0.2 (20%), but with randomness we can never be sure.

3.3.5. Simplifying Predicates

Each aspiring programmer needs to become fluent with the rules governing the transformations of logical conditions, for example, that the negation of “(x >= 0) & (x < 1)” is equivalent to “(x < 0) | (x >= 1)”.

Each such rule is called a tautology. Here are some of them:

  • !(!p) is equivalent to p (double negation),

  • !(p & q) holds if and only if !p | !q (De Morgan’s law),

  • !(p | q) is !p & !q (another De Morgan’s law),

  • all(p) is equivalent to !any(!p).

Various combinations thereof are of course possible. Some further simplifications are enabled by other properties of the binary operations:

  • commutativity (symmetry), e.g., \(a+b = b+a\), \(a*b=b*a\),

  • associativity, e.g., \((a+b)+c = a+(b+c)\), \(\max(\max(a, b), c)=\max(a, \max(b, c))\),

  • distributivity, e.g., \(a*b+a*c = a*(b+c)\), \(\min(\max(a,b), \max(a,c))=\max(a, \min(b, c))\),

and relations, including:

  • transitivity, e.g., if \(a\le b\) and \(b\le c\) then surely \(a \le c\).
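The logical rules listed above can be verified by brute force. The sketch below assumes nothing beyond base R; p and q enumerate all nine combinations of TRUE, FALSE, and NA, exactly like u and v in Section 3.3.3, and identical compares two vectors as a whole (missing values included).

```r
p <- c(TRUE, FALSE, NA,  TRUE,  FALSE, NA,    TRUE, FALSE, NA)
q <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, NA,   NA,    NA)
identical(!(!p), p)            # double negation
identical(!(p & q), !p | !q)   # De Morgan's law
identical(!(p | q), !p & !q)   # another De Morgan's law
identical(all(p), !any(!p))    # all vs any
```

Each call yields TRUE, so the rules remain valid also when some of the inputs are missing.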

Exercise 3.1

Assuming that a, b, c, and d are numeric vectors, simplify the following expressions:

  • !(b>a & b<c),

  • !(a>=b & b>=c & a>=c),

  • a>b & a<c | a<c & a>d,

  • a>b | a<=b,

  • a<=b & a>c | a>b & a<=c,

  • a<=b & (a>c | a>b) & a<=c,

  • !all(a > b & b < c).

3.4. Choosing Elements with ifelse

The ifelse function is a vectorised version of the scalar if…else conditional statement, which we will do without until Chapter 8.

It allows us to select an element from either one or another vector based on some logical condition.

A call to ifelse(l, t, f), where l is a logical vector, returns a vector y such that:

\[\begin{split} y_i = \left\{ \begin{array}{ll} t_i & \text{if } l_i \text{ is TRUE }, \\ f_i & \text{if } l_i \text{ is FALSE }. \\ \end{array} \right. \end{split}\]

In other words, the \(i\)-th element of the result vector is equal to \(t_i\) if \(l_i\) is TRUE and to \(f_i\) otherwise.

For example:

(z <- rnorm(6))  # example vector
## [1] -0.560476 -0.230177  1.558708  0.070508  0.129288  1.715065
ifelse(z >= 0, z, -z)  # like abs(z)
## [1] 0.560476 0.230177 1.558708 0.070508 0.129288 1.715065

or:

(x <- rnorm(6))  # example vector
## [1]  0.46092 -1.26506 -0.68685 -0.44566  1.22408  0.35981
(y <- rnorm(6))  # example vector
## [1]  0.40077  0.11068 -0.55584  1.78691  0.49785 -1.96662
ifelse(x >= y, x, y)   # like pmax(x, y)
## [1]  0.46092  0.11068 -0.55584  1.78691  1.22408  0.35981

By now, we should not be surprised that the recycling rule is fired up if necessary:

ifelse(x > 0, x^2, 0)  # squares of positive xs and 0 otherwise
## [1] 0.21244 0.00000 0.00000 0.00000 1.49838 0.12947
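The definition of ifelse given above says nothing about the case where \(l_i\) is missing. As the following sketch on hand-made inputs suggests, the corresponding element of the result is then missing as well:

```r
l <- c(TRUE, FALSE, NA)
ifelse(l, 1, 0)               # 1  0 NA
ifelse(l, 1, c(10, 20, 30))   # 1 20 NA; the scalar 1 is recycled
```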

Note

Keep in mind that all arguments are evaluated in their entirety before deciding on which element should be selected. Therefore, the following call will generate a warning:

ifelse(z >= 0, log(z), NA_real_)
## Warning in log(z): NaNs produced
## [1]       NA       NA  0.44386 -2.65202 -2.04571  0.53945

This is because with log(z), we are computing the logarithms of negative values anyway. To fix this, we can write:

log(ifelse(z >= 0, z, NA_real_))
## [1]       NA       NA  0.44386 -2.65202 -2.04571  0.53945

The calls to ifelse can naturally be nested in the case where we yearn for an if…else if…else-type expression.

Example 3.2

A version of pmax(pmax(x, y), z) can be written as:

ifelse(x >= y,
    ifelse(z >= x, z, x),
    ifelse(z >= y, z, y)
)
## [1] 0.46092 0.11068 1.55871 1.78691 1.22408 1.71506

However, determining the three intermediate logical vectors is not necessary; we can save one call to `>=` by introducing an auxiliary variable:

xy <- ifelse(x >= y, x, y)
ifelse(z >= xy, z, xy)
## [1] 0.46092 0.11068 1.55871 1.78691 1.22408 1.71506

Exercise 3.3

Figure 3.1 depicts a realisation of the mixture \(Z=0.2 X + 0.8 Y\) of two normal distributions \(X\sim\mathrm{N}(-2, 0.5)\) and \(Y\sim\mathrm{N}(3, 1)\).

n <- 100000
z <- ifelse(runif(n) <= 0.2, rnorm(n, -2, 0.5), rnorm(n, 3, 1))
hist(z, breaks=101, probability=TRUE, main="", col="white")

Figure 3.1 A mixture of two Gaussians generated with ifelse

In other words, with probability 20%, we generated a variate from the normal distribution with an expected value of -2; otherwise, we drew it from the one with an expectation of 3.

Inspired by the above, generate the following Gaussian mixtures:

  • \(\frac{2}{3} X + \frac{1}{3} Y\), where \(X\sim\mathrm{N}(100, 16)\) and \(Y\sim\mathrm{N}(116, 8)\),

  • \(0.3 X + 0.4 Y + 0.3 Z\), where \(X\sim\mathrm{N}(-10, 2)\), \(Y\sim\mathrm{N}(0, 2)\), and \(Z\sim\mathrm{N}(10, 2)\).

(*) On a side note, knowing that if \(X\) follows \(\mathrm{N}(0, 1)\), then the scaled-shifted \(\sigma X+\mu\) is distributed \(\mathrm{N}(\mu, \sigma)\), the above can be equivalently written as:

w <- (runif(n) <= 0.2)
z <- rnorm(n, 0, 1)*ifelse(w, 0.5, 1) + ifelse(w, -2, 3)
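A rough sanity check of the two formulations of the first mixture (\(0.2 X + 0.8 Y\)) can look as follows. It is merely a sketch: the seed is arbitrary and the names z1 and z2 are ours. We compare the sample means with the theoretical expected value, \(0.2\cdot(-2)+0.8\cdot 3=2\), and the fractions of observations below 0.5, which should be only slightly above 0.2, because almost all draws from the first component and very few from the second one fall there.

```r
set.seed(123)   # arbitrary seed, used only for reproducibility
n <- 100000
z1 <- ifelse(runif(n) <= 0.2, rnorm(n, -2, 0.5), rnorm(n, 3, 1))
w <- (runif(n) <= 0.2)
z2 <- rnorm(n, 0, 1)*ifelse(w, 0.5, 1) + ifelse(w, -2, 3)
c(mean(z1), mean(z2))               # both should be close to 2
c(mean(z1 < 0.5), mean(z2 < 0.5))   # both should be close to 0.2
```

The results will differ slightly between the two variants and between runs with different seeds; only their approximate agreement matters here.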

3.5. Exercises

Exercise 3.4

Answer the following questions:

  • Why is the statement “Earth is flat or the smallpox vaccine is proven effective” obviously true?

  • What is the difference between NA and NA_real_?

  • Why is “FALSE & NA” equal to FALSE, but “TRUE & NA” is NA?

  • Why does “ifelse(x>=0, sqrt(x), NA_real_)” have a tendency to generate warnings and how can we rewrite it so as to prevent that from happening?

  • What is the interpretation of “mean(x >= 0 & x <= 1)”?

  • For some integer \(x\) and \(y\), how to verify whether \(0 < x < 100\), \(0 < y < 100\), and \(x < y\), all at the same time?

  • Mathematically, for all real \(x, y > 0\), it holds \(\log xy = \log x + \log y\). Why then can “all(log(x*y) == log(x)+log(y))” sometimes return FALSE? How to fix this?

  • Is “x/y/z” always equal to “x/(y/z)”? How to fix this?

  • What is the purpose of very specific functions such as log1p and expm1 (see their help page) and many other ones listed in, e.g., the GNU GSL library [23]? Is our referring to them a violation of the beloved “let us be minimalist” approach?

  • If we know that \(x\) may be subject to error, how to test whether \(x>0\) in a robust manner?

  • Is “y<-5” the same as “y <- 5” or rather “y < -5”?

Exercise 3.5

Compute the cross-entropy loss between a numeric vector \(\boldsymbol{p}\) with values in the interval \((0, 1)\) and a logical vector \(\boldsymbol{y}\), both of length \(n\) (you can generate them randomly or manually, it does not matter, it is just an exercise):

\[ \mathcal{L}(\boldsymbol{p}, \boldsymbol{y}) = \frac{1}{n} \sum_{i=1}^n \ell_i, \]

where

\[\begin{split} \ell_i = \left\{ \begin{array}{ll} -\log p_i & \text{if } y_i \text{ is TRUE }, \\ -\log (1-p_i) & \text{if } y_i \text{ is FALSE }. \\ \end{array} \right. \end{split}\]

Interpretation: in classification problems, \(y_i\in\{\text{FALSE}, \text{TRUE}\}\) denotes the true class of the \(i\)-th object (say, whether the \(i\)-th hospital patient is symptomatic) and \(p_i\in(0,1)\) a machine learning algorithm’s confidence that \(i\) belongs to class TRUE (e.g., how sure a decision tree model is that the corresponding person is unwell). Ideally, if \(y_i\) is TRUE, \(p_i\) should be close to 1 and to \(0\) otherwise. The cross-entropy loss quantifies by how much a classifier differs from the omniscient one. The use of the logarithm penalises strong beliefs in the wrong answer.

By the way! If you have solved any of the exercises encountered so far by referring to if statements, for loops, vector indexing like x[...], or any external R package, please go back and re-write your code. Let us keep it simple (effective, readable) by using the base R’s vectorised operations that we have introduced.