<?xml version="1.0" encoding="utf-8"?><!DOCTYPE article  PUBLIC '-//OASIS//DTD DocBook XML V4.4//EN'  'http://www.docbook.org/xml/4.4/docbookx.dtd'><article><articleinfo><title>RstatisTik/RstatisTikPortal/RcourSe/FinalFunction/indeXing</title><revhistory><revision><revnumber>2</revnumber><date>2015-03-15 09:49:00</date><authorinitials>mandy.vogel@googlemail.com</authorinitials></revision><revision><revnumber>1</revnumber><date>2015-03-15 09:48:13</date><authorinitials>mandy.vogel@googlemail.com</authorinitials></revision></revhistory></articleinfo><section><title>Indexing</title><section><title>Indexing with Positive Integers</title><itemizedlist><listitem><para>there are circumstances where we want to select only some of the elements of a vector/array/dataframe/list </para></listitem><listitem><para>this selection is done using subscripts (also known as indices) </para></listitem><listitem><para>subscripts have square brackets [2] while functions have round brackets (2) </para></listitem><listitem><para>subscripts on vectors, matrices, arrays and dataframes have one set of square brackets [6], [3,4] or [2,3,2,1] </para></listitem><listitem><para>when a subscript is left blank it is understood to mean <emphasis>all of</emphasis>; thus </para><itemizedlist><listitem><para>[,4] means all rows in column 4 of an object </para></listitem><listitem><para>[2,] means all columns in row 2 of an object </para></listitem></itemizedlist></listitem><listitem><para>subscripts on lists (usually) have double square brackets [[2]] or [[i,j]] </para></listitem><listitem><para><emphasis>A vector of positive integers as index</emphasis>: The index vector can be of any length and the result is of the same length as the index vector. 
For example, </para></listitem></itemizedlist><programlisting format="linespecific" language="highlight" linenumbering="numbered" startinglinenumber="1"><![CDATA[> ]]><symbol><![CDATA[letters]]></symbol><methodname><![CDATA[[1]]></methodname><![CDATA[:3]]><methodname><![CDATA[]]]></methodname>
<methodname><![CDATA[[1]]></methodname><methodname><![CDATA[]]]></methodname><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[a"]]></phrase><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[b"]]></phrase><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[c"]]></phrase>
<![CDATA[> ]]><symbol><![CDATA[letters]]></symbol><methodname><![CDATA[[c]]></methodname><![CDATA[(1:3,1:3)]]><methodname><![CDATA[]]]></methodname>
<methodname><![CDATA[[1]]></methodname><methodname><![CDATA[]]]></methodname><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[a"]]></phrase><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[b"]]></phrase><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[c"]]></phrase><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[a"]]></phrase><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[b"]]></phrase><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[c"]]></phrase>
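<![CDATA[# Sketch (not part of the original session): blank subscripts on a matrix
# and single versus double brackets on a list. The objects m and lst are
# hypothetical names introduced only for this illustration.
m <- matrix(1:12, nrow = 3)        # 3 rows, 4 columns, filled column-wise
m[, 4]                             # all rows of column 4: 10 11 12
m[2, ]                             # all columns of row 2: 2 5 8 11
lst <- list(a = 1:3, b = "text")
lst[[2]]                           # the second component itself: "text"
lst[2]                             # a list of length 1 holding that component]]>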
</programlisting><itemizedlist><listitem><para><emphasis>A logical vector as index</emphasis>: Values corresponding to TRUE in the index vector are selected and those corresponding to FALSE are omitted; an NA in the index vector yields NA in the result. For example, </para></listitem></itemizedlist><programlisting format="linespecific" language="highlight" linenumbering="numbered" startinglinenumber="1"><![CDATA[> ]]><methodname><![CDATA[x]]></methodname><![CDATA[<-]]><methodname><![CDATA[c]]></methodname><![CDATA[(1,2,3,]]><symbol><![CDATA[NA]]></symbol><![CDATA[)]]>
<![CDATA[> ]]><methodname><![CDATA[x]]></methodname><methodname><![CDATA[[]]></methodname><![CDATA[!]]><methodname><![CDATA[is.na]]></methodname><![CDATA[(]]><methodname><![CDATA[x]]></methodname><![CDATA[)]]><methodname><![CDATA[]]]></methodname>
<methodname><![CDATA[[1]]></methodname><methodname><![CDATA[]]]></methodname><![CDATA[ 1 2 3]]>
</programlisting><para>creates a vector without missing values. Also </para><programlisting format="linespecific" language="highlight" linenumbering="numbered" startinglinenumber="1"><![CDATA[> ]]><methodname><![CDATA[x]]></methodname><methodname><![CDATA[[is.na]]></methodname><![CDATA[(]]><methodname><![CDATA[x]]></methodname><![CDATA[)]]><methodname><![CDATA[]]]></methodname><![CDATA[ <- 0]]>
<![CDATA[> ]]><methodname><![CDATA[x]]></methodname>
<methodname><![CDATA[[1]]></methodname><methodname><![CDATA[]]]></methodname><![CDATA[ 1 2 3 0]]>
</programlisting><para>replaces the missing value with zero. A common operation is to select the rows or columns of a data frame that meet some criterion. For example, to select those rows of the painters data frame with Colour &gt;= 17: </para><programlisting format="linespecific" language="highlight" linenumbering="numbered" startinglinenumber="1"><![CDATA[> ]]><methodname><![CDATA[library]]></methodname><![CDATA[(]]><methodname><![CDATA[MASS]]></methodname><![CDATA[)]]>
<![CDATA[> ]]><methodname><![CDATA[painters]]></methodname><methodname><![CDATA[[painters]]></methodname><![CDATA[$]]><methodname><![CDATA[Colour]]></methodname><![CDATA[ >= 17,]]><methodname><![CDATA[]]]></methodname>
<methodname><![CDATA[Composition]]></methodname><![CDATA[ ]]><methodname><![CDATA[Drawing]]></methodname><![CDATA[ ]]><methodname><![CDATA[Colour]]></methodname><![CDATA[ ]]><methodname><![CDATA[Expression]]></methodname><![CDATA[ ]]><methodname><![CDATA[School]]></methodname>
<methodname><![CDATA[Bassano]]></methodname><![CDATA[          6       8     17          0      ]]><methodname><![CDATA[D]]></methodname>
<methodname><![CDATA[Giorgione]]></methodname><![CDATA[        8       9     18          4      ]]><methodname><![CDATA[D]]></methodname>
<methodname><![CDATA[Pordenone]]></methodname><![CDATA[        8      14     17          5      ]]><methodname><![CDATA[D]]></methodname>
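<![CDATA[# Sketch (not part of the original session): how NA behaves in a logical
# index, and subset() as an alternative for data frames. The names y and
# by_subset are hypothetical and introduced only for this illustration.
library(MASS)                      # provides the painters data frame
y <- c(1, 2, 3, NA)
y[y > 1]                           # 2 3 NA: the NA in the index stays NA
y[!is.na(y) & y > 1]               # 2 3
by_subset <- subset(painters, Colour >= 17)   # drops rows whose condition is NA
all(by_subset$Colour >= 17)        # TRUE]]>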
</programlisting><para>We may want to select on more than one criterion. Logical indices can be combined using the 'and' (&amp;), 'or' (|) and 'not' (!) operators. For example, </para><programlisting format="linespecific" language="highlight" linenumbering="numbered" startinglinenumber="1"><![CDATA[> ]]><methodname><![CDATA[painters]]></methodname><methodname><![CDATA[[painters]]></methodname><![CDATA[$]]><methodname><![CDATA[Colour]]></methodname><![CDATA[ >= 17 &  ]]><methodname><![CDATA[Composition]]></methodname><![CDATA[ ]]><methodname><![CDATA[Drawing]]></methodname><![CDATA[ ]]><methodname><![CDATA[Colour]]></methodname>
<methodname><![CDATA[Titian]]></methodname><![CDATA[             12      15     18]]>
<methodname><![CDATA[Rembrandt]]></methodname><![CDATA[          15       6     17]]>
<methodname><![CDATA[Rubens]]></methodname><![CDATA[             18      13     17]]>
<methodname><![CDATA[Van]]></methodname><![CDATA[ ]]><methodname><![CDATA[Dyck]]></methodname><![CDATA[           15      10     17]]>
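<![CDATA[# Sketch (not part of the original session): one way to combine two
# criteria with & and keep the first three columns. The threshold
# Composition > 10, the column selection 1:3 and the name sel are
# assumptions made for illustration; they are not taken from the
# command shown above.
library(MASS)                      # provides the painters data frame
sel <- painters[painters$Colour >= 17 & painters$Composition > 10, 1:3]
sel
# 'or' and 'not' follow the same pattern:
painters[painters$Colour >= 17 | painters$Drawing >= 17, 1:3]
painters[!(painters$School %in% c("A", "B")), 1:3]]]>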
</programlisting></section><section><title>List of Logical Operations</title><informaltable><tgroup cols="2"><colspec colname="col_0"/><colspec colname="col_1"/><tbody><row rowsep="1"><entry colsep="1" rowsep="1"><para><emphasis role="strong">Operation</emphasis> </para></entry><entry colsep="1" rowsep="1"><para><emphasis role="strong">Description</emphasis></para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>! </para></entry><entry colsep="1" rowsep="1"><para> logical NOT </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>&amp; </para></entry><entry colsep="1" rowsep="1"><para> logical AND </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>| </para></entry><entry colsep="1" rowsep="1"><para> logical OR </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>&lt; </para></entry><entry colsep="1" rowsep="1"><para> less than </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>&lt;= </para></entry><entry colsep="1" rowsep="1"><para> less than or equal to </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>&gt; </para></entry><entry colsep="1" rowsep="1"><para> greater than </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>&gt;= </para></entry><entry colsep="1" rowsep="1"><para> greater than or equal to </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>== </para></entry><entry colsep="1" rowsep="1"><para> logical equals (double =) </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>!= </para></entry><entry colsep="1" rowsep="1"><para> not equal </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>&amp;&amp; </para></entry><entry colsep="1" rowsep="1"><para> AND with IF </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>|| </para></entry><entry colsep="1" rowsep="1"><para> OR with IF </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>xor(x,y) </para></entry><entry colsep="1" rowsep="1"><para>exclusive OR </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para>isTRUE(x) </para></entry><entry colsep="1" rowsep="1"><para>an abbreviation of identical(TRUE,x)</para></entry></row></tbody></tgroup></informaltable><para>To select a subgroup, for example the painters from schools A, C, and D, we can generate a logical vector using the %in% operator as follows: </para><programlisting format="linespecific" language="highlight" linenumbering="numbered" startinglinenumber="1"><![CDATA[> ]]><methodname><![CDATA[painters]]></methodname><methodname><![CDATA[[painters]]></methodname><![CDATA[$]]><methodname><![CDATA[School]]></methodname><![CDATA[ %in% ]]><methodname><![CDATA[c]]></methodname><![CDATA[(]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[A"]]></phrase><![CDATA[,]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[C"]]></phrase><![CDATA[,]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[D"]]></phrase><![CDATA[),]]><methodname><![CDATA[]]]></methodname>
<methodname><![CDATA[Da]]></methodname><![CDATA[ ]]><methodname><![CDATA[Udine]]></methodname><![CDATA[           10       8     16        3      ]]><methodname><![CDATA[A]]></methodname>
<methodname><![CDATA[Da]]></methodname><![CDATA[ ]]><methodname><![CDATA[Vinci]]></methodname><![CDATA[           15      16      4       14      ]]><methodname><![CDATA[A]]></methodname>
<methodname><![CDATA[Del]]></methodname><![CDATA[ ]]><methodname><![CDATA[Piombo]]></methodname><![CDATA[          8      13     16        7      ]]><methodname><![CDATA[A]]></methodname>
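<![CDATA[# Sketch (not part of the original session): %in% returns one logical
# value per element of its left operand, which is what makes it usable
# as a row index. Here it is compared with the equivalent chain of ==
# and |. The names in_acd and chained are hypothetical.
library(MASS)                      # provides the painters data frame
in_acd  <- painters$School %in% c("A", "C", "D")
chained <- painters$School == "A" | painters$School == "C" |
           painters$School == "D"
length(in_acd) == nrow(painters)   # one value per row
all(in_acd == chained)             # both forms select the same rows]]>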
</programlisting><para>Sometimes we are interested in the indices of rows satisfying a certain condition. To extract these indices we use the which() command. </para><programlisting format="linespecific" language="highlight" linenumbering="numbered" startinglinenumber="1"><![CDATA[> ]]><methodname><![CDATA[which]]></methodname><![CDATA[(]]><methodname><![CDATA[painters]]></methodname><![CDATA[$]]><methodname><![CDATA[School]]></methodname><![CDATA[ %in% ]]><methodname><![CDATA[c]]></methodname><![CDATA[(]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[A"]]></phrase><![CDATA[,]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[C"]]></phrase><![CDATA[,]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[D"]]></phrase><![CDATA[))]]>
<methodname><![CDATA[[1]]></methodname><methodname><![CDATA[]]]></methodname><![CDATA[  1  2  3  4  5  6  7  8  9 10 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31]]>
<methodname><![CDATA[[26]]></methodname><methodname><![CDATA[]]]></methodname><![CDATA[ 32]]>
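<![CDATA[# Sketch (not part of the original session): which() converts a logical
# vector into the integer positions of its TRUE values, so the positions
# can be stored and reused as a positive integer index. The names keep
# and pos are hypothetical.
library(MASS)                      # provides the painters data frame
keep <- painters$School %in% c("A", "C", "D")
pos  <- which(keep)
length(pos) == sum(keep)           # one position per TRUE value
all(painters$School[pos] %in% c("A", "C", "D"))]]>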
</programlisting></section><section><title>Indexing with Character Vectors</title><para>A vector of character strings containing variable names can be used to extract the variables relevant for an analysis. This is very useful when we have a large number of variables but need to work with only a few of them. For example, </para><programlisting format="linespecific" language="highlight" linenumbering="numbered" startinglinenumber="1"><![CDATA[> ]]><methodname><![CDATA[names]]></methodname><![CDATA[(]]><methodname><![CDATA[painters]]></methodname><![CDATA[)]]>
<methodname><![CDATA[[1]]></methodname><methodname><![CDATA[]]]></methodname><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[Composition"]]></phrase><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[Drawing"]]></phrase><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[Colour"]]></phrase><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[Expression"]]></phrase><![CDATA[ ]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[School"]]></phrase>
<![CDATA[> ]]><methodname><![CDATA[painters]]></methodname><methodname><![CDATA[[1]]></methodname><![CDATA[:3,]]><methodname><![CDATA[c]]></methodname><![CDATA[(]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[Drawing"]]></phrase><![CDATA[,]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[Expression"]]></phrase><![CDATA[)]]><methodname><![CDATA[]]]></methodname>
<methodname><![CDATA[Drawing]]></methodname><![CDATA[ ]]><methodname><![CDATA[Expression]]></methodname>
<methodname><![CDATA[Da]]></methodname><![CDATA[ ]]><methodname><![CDATA[Udine]]></methodname><![CDATA[         8          3]]>
<methodname><![CDATA[Da]]></methodname><![CDATA[ ]]><methodname><![CDATA[Vinci]]></methodname><![CDATA[        16         14]]>
<methodname><![CDATA[Del]]></methodname><![CDATA[ ]]><methodname><![CDATA[Piombo]]></methodname><![CDATA[      13          7]]>
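<![CDATA[# Sketch (not part of the original session): the variable names can be
# kept in a character vector and reused, and a single column can be
# pulled out by name with double brackets. The names vars and sub are
# hypothetical.
library(MASS)                      # provides the painters data frame
vars <- c("Drawing", "Expression")
sub  <- painters[, vars]           # all rows, only the named columns
names(sub)                         # "Drawing" "Expression"
head(painters[["Drawing"]], 3)     # first three Drawing scores: 8 16 13]]>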
</programlisting><itemizedlist><listitem><para><emphasis>A vector of character strings</emphasis> can also be used as an index on a vector when the vector has names: </para></listitem></itemizedlist><programlisting format="linespecific" language="highlight" linenumbering="numbered" startinglinenumber="1"><![CDATA[> ]]><methodname><![CDATA[x]]></methodname><![CDATA[ <- ]]><methodname><![CDATA[c]]></methodname><![CDATA[(1:3,]]><symbol><![CDATA[NA]]></symbol><![CDATA[)]]>
<![CDATA[> ]]><methodname><![CDATA[names]]></methodname><![CDATA[(]]><methodname><![CDATA[x]]></methodname><![CDATA[)<-]]><symbol><![CDATA[letters]]></symbol><methodname><![CDATA[[1]]></methodname><![CDATA[:4]]><methodname><![CDATA[]]]></methodname>
<![CDATA[> ]]><methodname><![CDATA[x]]></methodname>
<methodname><![CDATA[a]]></methodname><![CDATA[  ]]><methodname><![CDATA[b]]></methodname><![CDATA[  ]]><methodname><![CDATA[c]]></methodname><![CDATA[  ]]><methodname><![CDATA[d]]></methodname>
<![CDATA[1  2  3 ]]><symbol><![CDATA[NA]]></symbol>
<![CDATA[> ]]><methodname><![CDATA[x]]></methodname><methodname><![CDATA[[c]]></methodname><![CDATA[(]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[a"]]></phrase><![CDATA[,]]><phrase><![CDATA["]]></phrase><phrase><![CDATA[c"]]></phrase><![CDATA[)]]><methodname><![CDATA[]]]></methodname>
<methodname><![CDATA[a]]></methodname><![CDATA[ ]]><methodname><![CDATA[c]]></methodname>
<![CDATA[1 3]]>
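<![CDATA[# Sketch (not part of the original session): names work for replacement
# as well as extraction, and a name that is not present yields NA.
# The vector v is a hypothetical example.
v <- c(a = 1, b = 2, c = 3)
v[c("c", "a")]                     # elements in the order requested: 3 1
v["b"] <- 20                       # replace an element by name
v["z"]                             # no element is named z, so the result is NA]]>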
</programlisting></section><section><title>Trimming Vectors Using Negative Indices</title><itemizedlist><listitem><para>an extremely useful facility is to use negative indices to drop terms from a vector </para></listitem><listitem><para>suppose we wanted a new vector, z, to contain everything but the first element of x </para></listitem></itemizedlist><programlisting format="linespecific" language="highlight" linenumbering="numbered" startinglinenumber="1"><![CDATA[> ]]><methodname><![CDATA[x]]></methodname><![CDATA[<- ]]><methodname><![CDATA[c]]></methodname><![CDATA[(5,8,6,7,1,5,3)]]>
<![CDATA[> (]]><methodname><![CDATA[z]]></methodname><![CDATA[ <- ]]><methodname><![CDATA[x]]></methodname><methodname><![CDATA[[]]></methodname><![CDATA[-1]]><methodname><![CDATA[]]]></methodname><![CDATA[)]]>
<methodname><![CDATA[[1]]></methodname><methodname><![CDATA[]]]></methodname><![CDATA[ 8 6 7 1 5 3]]>
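<![CDATA[# Sketch (not part of the original session): negative indices can drop
# several positions at once. The vector w is a hypothetical copy of the
# data above, used so that x and z stay untouched.
w <- c(5, 8, 6, 7, 1, 5, 3)
w[-c(1, 2)]                        # drops the first two elements: 6 7 1 5 3
w[-length(w)]                      # drops the last element: 5 8 6 7 1 5
w[-which(w == 5)]                  # drops both 5s: 8 6 7 1 3
# Positive and negative indices cannot be mixed in one subscript,
# so w[c(-1, 2)] stops with an error.]]>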
</programlisting></section></section></article>