Combining Data Frames

rbind()

rbind() combines two data frames (or matrices) by appending rows. The column names and types must be the same in both objects.

    > x <- data.frame(id=1:3,score=rnorm(3))
    > y <- data.frame(id=13:15,score=rnorm(3))
    > rbind(x,y)
      id       score
    1  1  0.71121163
    2  2 -0.62973249
    3  3  1.17737595
    4 13 -0.45074940
    5 14 -0.01044197
    6 15 -1.05217176
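The requirement on column names can be checked with a small sketch. It uses fixed scores instead of rnorm() so the result is reproducible, and the column name points is made up for illustration. For data frames, rbind() matches columns by name rather than by position, and it stops with an error when the names differ.

```r
# Fixed values instead of rnorm() so the result is reproducible
x <- data.frame(id = 1:3, score = c(0.5, 1.2, -0.3))
# Same columns as x, but in a different order
y <- data.frame(score = c(2.1, 0.0, 0.7), id = 13:15)

# Columns are matched by name, so the different order in y does not matter
z <- rbind(x, y)
print(z)

# A column name that does not occur in x ("points" is hypothetical)
# makes rbind() fail; tryCatch() turns the error into a marker string
mismatch <- tryCatch(
  rbind(x, data.frame(id = 99L, points = 1.0)),
  error = function(e) "names differ"
)
print(mismatch)
```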

cbind()

cbind() combines two data frames (or matrices) by appending columns. The number of rows must be the same in both objects.

    > # x and y here are not the 3-row data frames from the rbind() example;
    > # their definitions are not shown
    > cbind(x,y)
      id      score1      score2     score3
    1  1  0.11440705  0.14536778 -1.1773241
    2  2 -1.62862651  0.02020604  0.5686415
    3  3  0.05335811  0.25462270  0.8844987
    4  4 -0.19931734  0.15625511  0.9287316
    5  5 -1.15217836 -1.79804503 -0.7550234

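Using cbind() to combine data frames is not recommended: rows are paired purely by position, not by a key column, so differently ordered data frames are combined incorrectly without any warning.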
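A minimal sketch of why cbind() is risky for data frames. The data frames a and b are made up for illustration. They hold the same ids in a different row order, which cbind() ignores and merge() handles.

```r
# Hypothetical data: same ids, different row order
a <- data.frame(id = c("A", "B", "C"), day1 = c(3, 4, 5))
b <- data.frame(id = c("C", "A", "B"), day2 = c(10, 7, 1))

# cbind() glues row 1 to row 1, row 2 to row 2, and so on:
# subject A's day1 ends up next to subject C's day2,
# and the result carries two columns named "id"
glued <- cbind(a, b)
print(glued)

# merge() aligns the rows on the shared id column instead
joined <- merge(a, b)
print(joined)
```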

merge()

merge() is the function of choice for merging or joining data frames.
It is the equivalent of JOIN in SQL.
There are four cases:
- inner join
- left outer join
- right outer join
- full outer join

    > (d1 <- data.frame(id=LETTERS[c(1,2,3)],day1=sample(10,3)))
      id day1
    1  A    3
    2  B    4
    3  C    5
    > (d2 <- data.frame(id=LETTERS[c(1,3,5,6)],day2=sample(10,4)))
      id day2
    1  A    7
    2  C   10
    3  E    3
    4  F    6

inner join

inner join means: keep only the cases present in both of the data frames

    > merge(d1,d2)
      id day1 day2
    1  A    3    7
    2  C    5   10

left outer join

left outer join means: keep all cases of the left data frame, regardless of whether they are present in the right data frame (all.x=TRUE)

    > merge(d1,d2,all.x = TRUE)
      id day1 day2
    1  A    3    7
    2  B    4   NA
    3  C    5   10

right outer join

right outer join means: keep all cases of the right data frame, regardless of whether they are present in the left data frame (all.y=TRUE)

    > merge(d1,d2,all.y = TRUE)
      id day1 day2
    1  A    3    7
    2  C    5   10
    3  E   NA    3
    4  F   NA    6

full outer join

full outer join means: keep all cases of both data frames (all=TRUE)

    > merge(d1,d2,all = TRUE)
      id day1 day2
    1  A    3    7
    2  B    4   NA
    3  C    5   10
    4  E   NA    3
    5  F   NA    6

If not stated otherwise, R joins on the intersection of the column names of both data frames, in our case only id.
You can specify these columns directly with by=c("colname1","colname2") if the columns are named identically, or with by.x=c("colname1.x","colname2.x") together with a matching by.y argument if the names differ between the two data frames.
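The key arguments of merge() can be illustrated with a short sketch. The data frame d3 and its column name subject are made up for illustration, and fixed values replace sample() so the result is reproducible. The first call names the shared key explicitly with by, the second uses by.x and by.y because the key column has a different name in each data frame.

```r
# Fixed values instead of sample() so the result is reproducible
d1 <- data.frame(id = c("A", "B", "C"), day1 = c(3, 4, 5))
d2 <- data.frame(id = c("A", "C", "E", "F"), day2 = c(7, 10, 3, 6))

# Same result as merge(d1, d2), but with the key column stated explicitly
m_by <- merge(d1, d2, by = "id")
print(m_by)

# Hypothetical data frame whose key column is called "subject" instead of "id"
d3 <- data.frame(subject = c("A", "C", "E"), day2 = c(7, 10, 3))

# by.x names the key in the first data frame, by.y the key in the second;
# the key column in the result keeps the name given in by.x
m_xy <- merge(d1, d3, by.x = "id", by.y = "subject")
print(m_xy)
```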

merge() Exercise

Now read in the file personendaten.txt using the appropriate command.
Then join the demographics with our pre1 data frame (even though it does not make sense at this point).

merge() Solution

    > persdat <- read.table("../session1/session1data/personendaten.txt",
    +                       sep="\t",
    +                       header=TRUE)
    > pre1 <- merge(persdat,pre1,all.y = TRUE)
    > head(pre1)
      Subject Sex Age_PRETEST Trial Event.Type Code   Time TTime Uncertainty
    1  PRE001   f        3.11     7   Response    2 178963 10009           1
    2  PRE001   f        3.11    12   Response    1 238680  8342           1
    3  PRE001   f        3.11    17   Response    2 297789  8066           1
    4  PRE001   f        3.11    22   Response    1 351321 10811           1
    5  PRE001   f        3.11    27   Response    2 403607   713           1
    6  PRE001   f        3.11    32   Response    1 467793 23709           1
      Duration Uncertainty.1 ReqTime ReqDur Stim.Type Pair.Index    Type Event.Code
    1    10197             2       0   next incorrect          7 Picture   RO09.jpg
    2     8398             2       0   next incorrect         12 Picture   RO20.jpg
    3     8198             2       0   next       hit         17 Picture   RS28.jpg
    4    10997             2       0   next       hit         22 Picture   AT26.jpg
    5      800             2       0   next       hit         27 Picture   RS23.jpg
    6    23794             2       0   next       hit         32 Picture   OF04.jpg
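After a join like this, it can be worth checking that no trial rows were lost or duplicated and seeing which subjects have no demographics. The sketch below uses two small made-up data frames (persdat_toy and pre1_toy, with only a few of the columns shown above) because the course files are not included here. It assumes that each subject appears only once in the demographics.

```r
# Hypothetical stand-ins for persdat and pre1
persdat_toy <- data.frame(Subject = c("PRE001", "PRE002"),
                          Sex = c("f", "m"))
pre1_toy <- data.frame(Subject = c("PRE001", "PRE001", "PRE003"),
                       Trial = c(7, 12, 7))

# Right outer join: keep every trial row, add demographics where available
joined <- merge(persdat_toy, pre1_toy, all.y = TRUE)
print(joined)

# With unique subjects in the demographics, the row count must equal
# the number of trial rows
same_rows <- nrow(joined) == nrow(pre1_toy)
print(same_rows)

# Trial rows without a matching subject in the demographics have NA there
unmatched <- joined[is.na(joined$Sex), ]
print(unmatched)
```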

RstatisTik/RstatisTikPortal/RcourSe/FinalFunction/CombDataFrames (last modified on 2015-03-15 11:06:03 by mandy.vogel@googlemail.com)
