r - Creating bootstrap samples and storing the sampled data under different names
When I create bootstrap samples from the data frame datta using the following code,
boot1a <- replicate(3,
    do.call("rbind",
        lapply(sample(unique(datta$pid), 2000, replace = TRUE),
               function(x) datta[datta$pid == x, ])),
    simplify = FALSE)
boot1b <- data.frame(boot1a)              # data frame from the list
sample1 <- boot1b[order(boot1b$pid), ]    # sort by pid and store
the variables in the bootstrap sample sample1 have names ending in .1, .2, .3, ... (pid is the person ID, which takes the same value for different observations of the same person). For instance, with the above code the variable xy in datta gets the names xy, xy.1, xy.2, associated with the first, second, and third bootstrap samples.

I would rather have the different bootstrap samples named differently, with the variable names in each remaining the same as in the original data frame. In the above case, I would have the bootstrap samples stored in 3 different data frames, say boot1, boot2, boot3, with the variable names in each data frame the same as in the original data frame. I began doing this manually, one replication at a time, but it is going to take a lot of time to create many bootstrap samples. Does anyone have a suggestion on how to do this in a better way?
Edit: The first few observations for 4 of the many variables in the data frame datta are as follows.
pid  xy  zy    wy
  1  10   2  -5.0
  1  12   3  -4.5
  1  14   4  -4.0
  1  16   5  -3.5
  1  18   6  -3.0
  1  20   7  -2.5
  2  22   8  -2.0
  2  24   9  -1.5
  2  26  10  -1.0
  2  28  11  -0.5
  2  30  12   0.0
  2  32  13   0.5
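One possible way to get separately named samples is to keep each replication as its own element of a named list instead of passing the whole list to data.frame(). The sketch below assumes a toy datta built only from the rows shown above, and the names draw_boot, n_ids, and boots are hypothetical helpers introduced for illustration, not part of the original code.

```r
# Toy version of datta, built from the rows shown above.
datta <- data.frame(
  pid = rep(1:2, each = 6),
  xy  = seq(10, 32, by = 2),
  zy  = 2:13,
  wy  = seq(-5, 0.5, by = 0.5)
)

# Draw one bootstrap sample at the person level: resample person IDs
# with replacement, then keep every row belonging to each drawn ID.
# (draw_boot and n_ids are hypothetical names.)
draw_boot <- function(df, n_ids) {
  ids <- sample(unique(df$pid), n_ids, replace = TRUE)
  out <- do.call("rbind", lapply(ids, function(x) df[df$pid == x, ]))
  out <- out[order(out$pid), ]   # sort by pid, as in the question
  rownames(out) <- NULL
  out
}

set.seed(1)
n_boot <- 3

# One data frame per replication, stored in a list.
boots <- replicate(n_boot, draw_boot(datta, n_ids = 2), simplify = FALSE)
names(boots) <- paste0("boot", seq_len(n_boot))

names(boots$boot1)   # same column names as datta
```

Each sample is then available as boots$boot1, boots$boot2, and so on. If separate objects are really needed, list2env(boots, envir = .GlobalEnv) would create boot1, boot2, boot3 in the workspace, although a list is usually easier to loop over when there are many replications.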
Here is a sample example:
Data
set.seed(123)
data <- rnorm(100, 160, 20)
data1 <- as.data.frame(matrix(data, nrow = 20, ncol = 5, byrow = FALSE))
n <- 5
# stack n copies of data1 and label each copy with a factor level
data2 <- do.call("rbind", replicate(n, data1, simplify = FALSE))
data2$fac <- as.factor(rep(1:n, each = 20))
Sampling
library(plyr)
# draw one random row index within each level of fac
sample1 <- ddply(data2, .(fac), summarize,
                 mysample = sample(1:length(fac), size = 1, replace = TRUE))
sample1
  fac mysample
1   1       18
2   2       14
3   3       13
4   4       20
5   5       14
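The ddply call above returns only a sampled row index per group. As a rough base R sketch of the same idea that keeps the sampled rows themselves, one could split the data by fac and draw from each piece. The names pieces and picked are hypothetical, and the rows drawn here will not necessarily match the indices printed above.

```r
# Rebuild the example data from the answer.
set.seed(123)
data1 <- as.data.frame(matrix(rnorm(100, 160, 20), nrow = 20, ncol = 5))
n <- 5
data2 <- do.call("rbind", replicate(n, data1, simplify = FALSE))
data2$fac <- as.factor(rep(1:n, each = 20))

# Split into one data frame per level of fac; each piece keeps
# the original column names.
pieces <- split(data2, data2$fac)

# Draw one random row from each piece.
picked <- lapply(pieces, function(d) d[sample(nrow(d), 1), ])

# Optionally stack the sampled rows back into a single data frame.
picked_df <- do.call("rbind", picked)
```

Because split() returns a named list, each group's sample stays addressable by its factor level (for example picked[["3"]]) and the column names are unchanged.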