How to improve performance of this linear interpolation in R

- June 15, 2010

For a given column in a data frame, I want to construct a new vector in which each point is the average of the points on either side. The last observation should instead take the value of the second-to-last one, and the first observation the value of the second. I wrote some R code to solve this, but I am calling it repeatedly and it is extremely slow. Can anyone give tips on how to do this more efficiently? Thanks.

x1 <- c(rep('a',100),rep('b',100),rep('c',100))
x2 <- rnorm(300)
x <- data.frame(x1,x2)
names(x) <- c('col1','data1')

a.linear.interpolation <- function(x) {
    require(zoo)
    require(data.table)

    a.dattab <- data.table(x)
    setkey(a.dattab,col1)

    #replace NA values using LOCF / NOCB
    a.dattab[,data1:=na.locf(data1,na.rm=FALSE),by=list(col1)]
    a.dattab[,data1:=na.locf(data1,na.rm=FALSE,fromLast=TRUE),by=list(col1)]

    #add a within-group sequence number and a size-of-group field to
    #facilitate row-by-row processing
    a.dattab[,grpseq:=seq_len(.N),by=list(col1)]
    a.dattab[,grpseq_max:=.N,by=list(col1)]

    #convert to data.frame
    #data.frame seems faster than data.table for row-by-row type processing
    a.df <- data.frame(a.dattab)

    new.col <- vector(length=nrow(a.df))

    for(i in seq(nrow(a.df))){
        if(a.df[i,"grpseq"]==1){
            #first row of a group: take the second value
            new.col[i] <- a.df[i+1,"data1"]
        }
        else if(a.df[i,"grpseq"]==a.df[i,"grpseq_max"]){
            #last row of a group: take the second-to-last value
            new.col[i] <- a.df[i-1,"data1"]
        }
        else {
            #interior row: average of the two neighbours
            new.col[i] <- (a.df[i-1,"data1"]+a.df[i+1,"data1"])/2
        }
    }

    return(new.col)
}

Apart from using rolling means, the base R filter function can do this sort of thing as well, e.g.:

linint <- function(vec) {
  c(vec[2], filter(vec, c(0.5, 0, 0.5))[-c(1, length(vec))], vec[length(vec) - 1])
}

x <- c(1,3,6,10,1)
linint(x)
#[1]  3.0  3.5  6.5  3.5 10.0
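To make it clearer what the filter weights c(0.5, 0, 0.5) compute, the same result can be sketched with plain index shifting. This is only an illustration: the name linint_shift is hypothetical (it does not appear in the original post), and the sketch assumes a numeric vector with at least three elements and no missing values.

```r
# Hypothetical helper, not from the original post: the neighbour average
# computed from two shifted copies of the vector.
# Assumption: length(vec) >= 3 and no NA values.
linint_shift <- function(vec) {
  n <- length(vec)
  stopifnot(n >= 3)
  left  <- vec[1:(n - 2)]  # left neighbours of the interior points 2..(n-1)
  right <- vec[3:n]        # right neighbours of the same interior points
  # first point takes the second value, last point takes the second-to-last
  c(vec[2], (left + right) / 2, vec[n - 1])
}

x <- c(1, 3, 6, 10, 1)
linint_shift(x)
#[1]  3.0  3.5  6.5  3.5 10.0
```

Each interior element is half the sum of its two neighbours, which is exactly what a centred filter with weights 0.5, 0, 0.5 produces.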

And it's pretty quick, chewing through 10 million cases in less than a second:

x <- rnorm(1e7)
system.time(linint(x))
#   user  system elapsed
#   0.57    0.18    0.75
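The question asks for the calculation within each group of col1, whereas linint works on a single vector. One possible way to combine the two, sketched here with base R's ave, is shown below. The column name smoothed and the seed are assumptions made for illustration, and the sketch assumes there are no NA values and at least three rows per group (the question's code fills NAs with na.locf first; that step is omitted here).

```r
# linint as in the answer above; stats::filter is spelled out so that a
# package masking filter() (dplyr, for example) cannot interfere.
linint <- function(vec) {
  n <- length(vec)
  c(vec[2], stats::filter(vec, c(0.5, 0, 0.5))[-c(1, n)], vec[n - 1])
}

set.seed(1)  # arbitrary seed, only to make the illustration repeatable
x <- data.frame(col1  = rep(c("a", "b", "c"), each = 100),
                data1 = rnorm(300))

# ave() splits data1 by col1, applies linint to each piece, and writes the
# results back in the original row order.
# 'smoothed' is a hypothetical column name.
x$smoothed <- ave(x$data1, x$col1, FUN = linint)

head(x)
```

Unlike the original function, which sorts by col1 through setkey, this keeps the rows in their existing order, so the result lines up with the input data frame.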
