
Final Functions

Reading File() - going through the function line by line

Here is the function as a whole; below we go through it line by line.

   1 read.file <- function(file,skip=3,verbose=T){
   2     if(verbose) print(paste("read", file))
   3     tmp <- read.table(file,skip = skip,sep = "\t",
   4                       header=T,na.strings = c(" +",""),
   5                       fill=T)
   6     
   7     tmp <- tmp[!is.na(tmp$Subject),] 
   8 
   9     if(sum(!str_detect(tmp[,1],"^0[012][0-9]_[1-8]$|^0[012][0-9]_test[12]$")))
  10         print(paste("id",tmp$Subject[1]))
  11     
  12     if(sum(tmp$Stim.Type %in% c("hit","incorrect"))==0) return(NULL)
  13 
  14     tmp <- lapply(tmp,function(x) {
  15         if( class(x) %in% c("character","factor") ){
  16             x <- factor(gsub(" ","",as.character(x)))
  17             return(x)
  18         } else { 
  19             return(x) 
  20         }
  21     })
  22     
  23     tmp <- as.data.frame(tmp)
  24     
  25     pause <- which(tmp$Event.Type=="Picture" & tmp$Code=="Pause")
  26     if(length(pause)>0){
  27         drei <- which(tmp$Code==3 & !is.na(tmp$Code))
  28         drei <- drei[drei > pause][1:2]
  29         if(pause + 1 < drei[1]){
  30             tmp <- tmp[-(pause:drei[2]),]
  31         }}
  32 
  33     
  34     tmp <- tmp[!(tmp$Event.Type %in% c("Pause","Resume")), ]
  35 
  36     first.pic <- min(which(tmp$Event.Type=="Picture" & !is.na(tmp$Event.Type) )) - 1
  37     tmp <- tmp[-(1:first.pic),]
  38 
  39     last.pic <- min(which(tmp$Event.Type=="Picture" & !is.na(tmp$Event.Type) &
  40                               tmp$Code=="Fertig!" & !is.na(tmp$Code)))
  41     tmp <- tmp[-(last.pic:nrow(tmp)),]
  42 
  43     zeilen <- which(tmp$Event.Type %in% c("Response"))
  44     zeilen <- sort(unique(c(zeilen,zeilen-1)))
  45     zeilen <- zeilen[zeilen>0]
  46     tmp <- tmp[zeilen,]
  47     
  48     responses <- which(tmp$Code %in% c(1,2))
  49     events <- responses-1
  50     tmp$Type <- NA
  51     tmp$Type[responses] <- as.character(tmp$Event.Type[events])
  52 
  53     if(length(tmp$Type[responses])!=length(tmp$Event.Type[events])) { print(file)}
  54     tmp$Event.Code <- NA
  55     tmp$Event.Code[responses] <- as.character(tmp$Code[events])
  56     tmp$Time1 <- NA
  57     tmp$Time1[responses] <- tmp$Time[events]
  58     tmp$Stim.Type[responses] <- as.character(tmp$Stim.Type[events])
  59     tmp$Duration[responses] <- as.character(tmp$Duration[events])
  60     tmp$Uncertainty.1[responses] <- as.character(tmp$Uncertainty.1[events])
  61     tmp$ReqTime[responses] <- as.character(tmp$ReqTime[events])
  62     tmp$ReqDur[responses] <- as.character(tmp$ReqDur[events])
  63     tmp$Pair.Index[responses] <- as.character(tmp$Pair.Index[events])
  64     
  65 
  66     tmp$Stim.Type[responses] <- as.character(tmp$Stim.Type[events])
  67     tmp <- tmp[tmp$Event.Type=="Response" & !is.na(tmp$Type),]
  68     tmp <- tmp[tmp$Type=="Picture" & !is.na(tmp$Type),]
  69     return(tmp)
  70 }

Line 1

  • line 1 gives the function its name and defines its arguments and their default values

  • the file argument takes the file name and has no default
  • skip takes a number indicating how many lines are skipped at the beginning of the file
  • verbose indicates whether the file name is printed while reading

    read.file <- function(file,skip=3,verbose=T){
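
To illustrate how the default values behave, here is a minimal sketch. The function show.args and the file name are hypothetical and not part of the course code; the function only mirrors the argument list shown above and returns what it received.

```r
# show.args is a hypothetical helper, used only to illustrate defaults
show.args <- function(file, skip = 3, verbose = TRUE) {
    # return the values the function actually received
    list(file = file, skip = skip, verbose = verbose)
}

# only file is given, so skip and verbose fall back to their defaults
a <- show.args("subject01.txt")

# defaults can be overridden by naming the arguments
b <- show.args("subject01.txt", skip = 5, verbose = FALSE)
```

Calling show.args() without a file argument would fail as soon as file is used, because it has no default.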

Line 2

    if(verbose) print(paste("read", file))
  • this line simply prints the name of the file while it is being read, unless verbose is set to FALSE

Lines 3-5

  • here we have the command that reads in the text file

  • it takes the skip argument from above
  • we set sep, which indicates the field separator, to tab
  • we set header to T because the file contains the column names
  • by setting na.strings to the empty string and " +" we intend empty fields and fields containing only spaces to be coded as missing (note that read.table compares na.strings literally rather than as regular expressions, so " +" matches only that exact string)
  • more on reading files

    tmp <- read.table(file,skip = skip,sep = "\t",
                      header=T,na.strings = c(" +",""),
                      fill=T)
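
The call can be tried without a log file by passing a string through the text argument of read.table. This is only a sketch: the toy content below (one line to skip, a header, two data rows) is made up for illustration and is not the real log format.

```r
# toy input: one line to skip, a header line, two tab-separated data rows
txt <- "Logfile written by the experiment software\nSubject\tCode\tTime\n001_1\tPause\t10\n001_1\t\t20"

# text= lets read.table parse a string instead of a file on disk
tmp <- read.table(text = txt, skip = 1, sep = "\t",
                  header = TRUE, na.strings = c(" +", ""),
                  fill = TRUE)

# the empty Code field in the second data row is read as NA
is.na(tmp$Code)
```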

Line 7

  • here we remove all rows with a missing Subject field

  • for this we need indexing
  • is.na(x) gives back a logical vector containing TRUE for missing values in x and FALSE for any existing value
  • read more on indexing/subscripting

    tmp <- tmp[!is.na(tmp$Subject),]
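
A small sketch of this kind of logical indexing, using a made-up data frame (the values are assumptions for illustration only):

```r
# toy data frame with one missing Subject
tmp <- data.frame(Subject = c("001_1", NA, "001_1"),
                  Time    = c(10, 20, 30))

is.na(tmp$Subject)                  # FALSE TRUE FALSE
tmp <- tmp[!is.na(tmp$Subject), ]   # keep only rows that have a Subject
nrow(tmp)                           # 2
```

The comma after the logical vector matters: the vector selects rows, and the empty slot after the comma keeps all columns.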

Lines 9-10

  • lines 9 and 10 print the first entry of Subject (prefixed with "id") to stdout if any entry of Subject is not in the standard form

  • str_detect() is an R function from the stringr package; it gives back a logical value indicating whether the pattern is contained in the given string
  • the pattern is a regular expression, which is more flexible than using fixed strings
  • so we check every entry of Subject, take the negation and sum the resulting logical vector - this sum is zero if no deviant Subject coding is found, otherwise the print command is executed
  • some basic information about strings and regular expressions can be found here

    if(sum(!str_detect(tmp$Subject,"^0[012][0-9]_[1-8]$|^0[012][0-9]_test[12]$")))
        print(paste("id",tmp$Subject[1]))
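
The pattern can be tested on a few made-up ids. To keep this sketch runnable without extra packages, base R's grepl() is used as a stand-in for str_detect(); note that the argument order differs (pattern first). The ids are assumptions chosen for illustration.

```r
ids <- c("001_1", "012_8", "023_test2", "031_1", "001_9")
pattern <- "^0[012][0-9]_[1-8]$|^0[012][0-9]_test[12]$"

# base R stand-in for str_detect(ids, pattern)
ok <- grepl(pattern, ids)
ok          # TRUE TRUE TRUE FALSE FALSE

# number of ids that deviate from the standard form
sum(!ok)    # 2
```

"031_1" fails because the second digit must be 0, 1 or 2, and "001_9" fails because the part after the underscore must be 1 to 8 or test1/test2.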

Line 12

  • Line 12 checks if there is at least one line containing the code for a correct (hit) or incorrect (incorrect) answer

  • if there is no such line the function gives back a NULL value

    if(sum(tmp$Stim.Type %in% c("hit","incorrect"))==0) return(NULL)
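
The check itself can be sketched in isolation. has.answers is a hypothetical helper that wraps the same %in% test and is not part of the course code; the input vectors are made up.

```r
# has.answers is a hypothetical helper wrapping the check from line 12
has.answers <- function(stim) sum(stim %in% c("hit", "incorrect")) > 0

has.answers(c("other", "hit", NA))   # TRUE
has.answers(c("other", NA))          # FALSE
```

Because %in% returns FALSE rather than NA for missing values, the sum is always a plain number and the comparison is safe to use inside if().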

Line 18:

Line 22:

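Function: Reading All Files from a Directory

    read.files <- function(filedir,skip=3,...){
        # build the full path of every file in the directory
        files <- paste(filedir,dir(filedir),sep="/")
        # read each file and stack the resulting data frames
        Reduce(rbind,lapply(files,read.file,skip=skip,...))}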
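
How Reduce(rbind, ...) stacks the per-file results can be sketched with made-up data frames standing in for the output of read.file. The NULL entry mimics a file for which read.file returned NULL; rbind simply skips it.

```r
# three hypothetical per-file results; the second one is NULL,
# as read.file returns for a file without hit/incorrect rows
parts <- list(data.frame(Subject = "001_1", Time = 10),
              NULL,
              data.frame(Subject = "002_1", Time = 30))

# rbind is applied pairwise from left to right
all.data <- Reduce(rbind, parts)
nrow(all.data)   # 2
```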

RstatisTik/RstatisTikPortal/RcourSe/FinalFunction (last modified on 2015-03-16 17:23:05 by mandy.vogel@googlemail.com)