welcome: please sign in
location: Änderungen von "RstatisTik/RstatisTikPortal/RcourSe/CourseOutline/FunctionsInR/ApplyR"
Unterschiede zwischen den Revisionen 4 und 5
Revision 4 vom 2015-05-01 09:10:11
Größe: 6351
Kommentar:
Revision 5 vom 2015-05-01 10:46:07
Größe: 6377
Kommentar:
Gelöschter Text ist auf diese Art markiert. Hinzugefügter Text ist auf diese Art markiert.
Zeile 8: Zeile 8:
You can see all three parts if you type the name of the function without primitives. Exceptions are brackets. Primitive functions, like sum(), call C code directly with .Primitive() and contain no R code. Therefore their formals(), body(), and environment() are all NULL. You can see all three parts if you type the name of the function without brackets. Exceptions are primitives. Primitive functions, like sum(), call C code directly with .Primitive() and contain no R code. Therefore their formals(), body(), and environment() are all NULL.
Zeile 16: Zeile 16:
expected = E, residuals = (x - E)/sqrt(E), stdres = (x - expected = E, residuals = (x - E)/sqrt(E), stdres = (x - ...
Zeile 44: Zeile 45:
A common application of loops is to apply a function to each element of a set of values and collect the results in a single structure. In R this is done by the functions: A common application of loops is to apply a function to each element of a set of values and collect the results in a single structure. In R this is mainly done by the higher order functions:

Introduction

Every function in R has three important characteristics:

  • a body (the code inside the function) - body()
  • arguments (the list of arguments which controls how you can call the function) - formals()
  • an environment (the “map” of the location of the function’s variables) - environment()

You can see all three parts if you type the name of the function without brackets. Exceptions are primitives. Primitive functions, like sum(), call C code directly with .Primitive() and contain no R code. Therefore their formals(), body(), and environment() are all NULL.

Functions

   1 > chisq.test
   2 function (x, y = NULL, correct = TRUE, p = rep(1/length(x), length(x)),
   3 DNAME <- deparse(substitute(x))
   4 if (is.data.frame(x))
   5 expected = E, residuals = (x - E)/sqrt(E), stdres = (x - ...
   6 
   7 > sum
   8 function (..., na.rm = FALSE)  .Primitive("sum")

Function Arguments

Arguments are matched

  • first by exact name (perfect matching)
  • then by prefix matching
  • and finally by position.

By default, R function arguments are lazy, they are only evaluated if they are actually used:

   1 > f <- function(x) {
   2 f <- function(x) {
   3 +   10
   4 + }
   5 > f(stop("This is an error!"))
   6 [1] 10
   7 >

Implicit Loops

Introduction

A common application of loops is to apply a function to each element of a set of values and collect the results in a single structure. In R this is mainly done by the higher order functions:

  • lapply()
  • sapply()
  • apply()
  • tapply()

lapply()

  • The functions lapply and sapply are similar, their first argument can be a list, data frame, matrix or vector, the second argument the function to "apply". The former return a list (hence "l") and the latter tries to simplify the results (hence the "s"). For example:

   1 > lapply(dat,mean)
   2 [1] 6753.636
   3 [1] 5433.182
   4 > sapply(dat,mean)

apply()

  • apply() this function can be applied to an array. Its argument is the array, the second the dimension/s where we want to apply a function and the third is the function. For example

   1 > x<-1:12
   2 > dim(x)<-c(2,2,3)
   3 > apply(x,3,quantile) ## calculate the quantiles

tapply()

  • The function tapply() allows you to create tables (hence the "t") of the value of a function on subgroups defined by its second argument, which can be a factor or a list of factors.

For example in the quine data frame, we can summarize Days classify by Eth and Lrn as follows:

   1 > tapply(Days,list(Eth,Lrn),mean)
   2 AL       SL
   3 A 18.57500 24.89655
   4 N 13.25581 10.82353
  • the class() function shows the class of an object use it in combination with lapply() to get the classes of the columns of the quine data frame
  • do the same with sapply() what is the difference
  • try to combine this with what you learned about indexing and create a new data frame quine2 only containing the columns which are factors
  • calculate the row and column means of the below defined matrix m using the apply function PS: in real life application use the rowMeans() and colMeans() function

   1 m <- matrix(rnorm(100),nrow=10)
  • use tapply() to summarise the number of missing days at school per Ethnicity and/or per Sex (three lines) * sometimes the aggregate() function is more convenient; note the use of  #!latex $\sim$; it is read as 'is dependent on'and it is extensively used in modelling

   1 > aggregate(Days ~ Sex + Eth, data=quine,mean)
   2 Sex Eth     Days
   3 1   F   A 20.92105
   4 2   M   A 21.61290
   5 3   F   N 10.07143
   6 4   M   N 14.71429
   7 > aggregate(Days ~ Sex + Eth, data=quine,summary)
   8 Sex Eth Days.Min. Days.1st Qu. Days.Median Days.Mean Days.3rd Qu. Days.Max.
   9 1   F   A      0.00         5.25       13.50     20.92        30.25     81.00
  10 2   M   A      2.00         9.50       16.00     21.61        33.00     57.00
  11 3   F   N      0.00         5.00        7.00     10.07        14.00     37.00
  12 4   M   N      0.00         3.50        8.00     14.71        19.50     69.00

Function Exercises (Verzani)

  • Write a function to compute the average distance from the mean for some data vector.
  • Write a function f() which finds the average of the x values after squaring and substracts the square of the average of the numbers. Verify this output will always be non-negative by computing f(1:10)
  • An integer is even if the remainder upon dividing it by 2 is 0. This remainder is given by R with the syntax x \%\% 2. Use this to write a function iseven(). How would you write isodd()?
  • Write a function isprime() that checks if a number x is prime by dividing x by all values from 2,...,x-1 then checking to see if there is a remainder of 0.

Function Exercises (Verzani) Solutions

  • Write a function to compute the average distance from the mean for some data vector.

   1 > avg.dist <- function(x){
   2 +     xbar <- mean(x)
   3 +     mean(abs(x-xbar))
   4 + }

Function Exercises (Verzani) Solutions

  • Write a function f() which finds the average of the x values aufter squaring and substracts the square of the average of the numbers. Verify this output will always be non-negative by computing \texttt{f(1:10)}

   1 > f <- function(x){
   2 +     mean(x**2) - mean(x)**2
   3 + }
   4 > f(1:10)
   5 [1] 8.25

Function Exercises (Verzani) Solutions

  • An integer is even if the remainder upon dividing it by 2 is 0. This remainder is given by R with the syntax \texttt{ x \%\% 2}. Use this to write a function iseven(). How would you write isodd()?

   1 > iseven <- function(x){
   2 +     x %% 2 == 0
   3 + }
   4 > iseven(1:10)
   5 [1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE
   6 > isodd <- function(x){
   7 +     !iseven(x)
   8 + }
   9 > isodd(1:10)
  10 [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE

Function Exercises (Verzani) Solutions

  • Write a function isprime() that checks if a number x is prime by dividing x by all values \texttt{$2,\ldots,x-1}}}} then checking to see if there is a remainder of 0.

   1 > isprime <- function(x){
   2 +     if(x == 2) return(TRUE)
   3 +     !(0 %in% (x %% (2:(x-1))))
   4 + }
   5 > isprime(2)
   6 [1] TRUE
   7 > isprime(5)
   8 [1] TRUE
   9 > isprime(15)
  10 [1] FALSE

RstatisTik/RstatisTikPortal/RcourSe/CourseOutline/FunctionsInR/ApplyR (zuletzt geändert am 2015-05-01 10:48:36 durch mandy.vogel@googlemail.com)