= Indexing =

== Indexing with Positive Integers ==
 * There are circumstances where we want to select only some of the elements of a vector, array, data frame or list.
 * This selection is done using subscripts (also known as indices).
 * Subscripts have square brackets [2], while functions have round brackets (2).
 * Subscripts on vectors, matrices, arrays and data frames have one set of square brackets: [6], [3,4] or [2,3,2,1].
 * When a subscript is left blank it is understood to mean ''all of'', thus
  * [,4] means all rows in column 4 of an object
  * [2,] means all columns in row 2 of an object.
 * Subscripts on lists (usually) have double square brackets: [ [2] ] or [ [i,j] ].
 * ''A vector of positive integers as index'': The index vector can be of any length, and the result is of the same length as the index vector. For example,

{{{#!highlight r
> letters[1:3]
[1] "a" "b" "c"
> letters[c(1:3,1:3)]
[1] "a" "b" "c" "a" "b" "c"
}}}

 * ''A logical vector as index'': Values corresponding to TRUE in the index vector are selected and those corresponding to FALSE are omitted; an NA in the index vector gives an NA in the result. For example,

{{{#!highlight r
> x<-c(1,2,3,NA)
> x[!is.na(x)]
[1] 1 2 3
}}}

creates a vector without missing values. Also

{{{#!highlight r
> x[is.na(x)] <- 0
> x
[1] 1 2 3 0
}}}

replaces the missing values with zero.

A common operation is to select the rows or columns of a data frame that meet some criterion. For example, to select those rows of the painters data frame with Colour >= 17:

{{{#!highlight r
> library(MASS)
> painters[painters$Colour >= 17,]
          Composition Drawing Colour Expression School
Bassano             6       8     17          0      D
Giorgione           8       9     18          4      D
Pordenone           8      14     17          5      D
}}}

We may want to select on more than one criterion. We can combine logical indices using the 'and', 'or' and 'not' operators. For example,

{{{#!highlight r
> painters[painters$Colour >= 17 &
          Composition Drawing Colour
Titian             12      15     18
Rembrandt          15       6     17
Rubens             18      13     17
Van Dyck           15      10     17
}}}

== List of Logical Operations ==
||'''Operation''' ||'''Description''' ||
||! ||logical NOT ||
||& ||logical AND ||
||¦ ||logical OR (typed as a single vertical bar) ||
||< ||less than ||
||<= ||less than or equal to ||
||> ||greater than ||
||>= ||greater than or equal to ||
||== ||logical equals (double =) ||
||!= ||not equal ||
||&& ||AND for use in if() conditions ||
||¦¦ ||OR for use in if() conditions (typed as two vertical bars) ||
||xor(x,y) ||exclusive OR ||
||isTRUE(x) ||an abbreviation of identical(TRUE,x) ||

Suppose we want to select a subgroup, for example the painters from schools A, C and D. We can generate a logical vector using the `%in%` operator as follows:

{{{#!highlight r
> painters[painters$School %in% c("A","C","D"),]
           Composition Drawing Colour Expression School
Da Udine            10       8     16          3      A
Da Vinci            15      16      4         14      A
Del Piombo           8      13     16          7      A
}}}

Sometimes we are interested in the indices of the rows that satisfy a certain condition. To extract these indices we use the which() function.

{{{#!highlight r
> which(painters$School %in% c("A","C","D"))
 [1]  1  2  3  4  5  6  7  8  9 10 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
[26] 32
}}}

== Indexing with Character Vectors ==
A character vector of variable names can be used to extract the variables relevant to an analysis. This is very useful when we have a large number of variables and need to work with only a few of them. For example,

{{{#!highlight r
> names(painters)
[1] "Composition" "Drawing"     "Colour"      "Expression"  "School"
> painters[1:3,c("Drawing","Expression")]
           Drawing Expression
Da Udine         8          3
Da Vinci        16         14
Del Piombo      13          7
}}}

 * ''A vector of character strings'' can also be used as an index on a vector when the vector has names:

{{{#!highlight r
> x <- c(1:3,NA)
> names(x)<-letters[1:4]
> x
 a  b  c  d
 1  2  3 NA
> x[c("a","c")]
a c
1 3
}}}

== Trimming Vectors Using Negative Indices ==
 * An extremely useful facility is to use negative indices to drop terms from a vector.
 * Suppose we want a new vector, z, to contain everything but the first element of x:

{{{#!highlight r
> x<- c(5,8,6,7,1,5,3)
> (z <- x[-1])
[1] 8 6 7 1 5 3
}}}
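
The same idea extends to dropping several elements at once. The following is a minimal sketch rather than part of the original example: it reuses the vector x from above, and the name `trimmed` is introduced here purely for illustration.

{{{#!highlight r
x <- c(5, 8, 6, 7, 1, 5, 3)

# Drop the first element, as in the example above.
z <- x[-1]

# A negative vector drops several positions at once (here positions 1 and 3).
x[-c(1, 3)]        # 8 7 1 5 3

# Drop the last element without hard-coding the length of x.
x[-length(x)]      # 5 8 6 7 1 5

# Illustrative use: discard the smallest and the largest value before averaging.
# 'trimmed' is a hypothetical name chosen for this sketch.
trimmed <- sort(x)[-c(1, length(x))]   # 3 5 5 6 7
mean(trimmed)      # 5.2
}}}

Note that positive and negative indices cannot be mixed in a single subscript; an expression such as x[c(-1, 2)] is an error.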