I recently put the whole EcoIP application into an R package. I was surprised at how easy it was and I wanted to share my experience :)
Root Path
Everything that is to be packaged needs to be in a staging directory. It will contain all the source files and the meta-files that describe the package. In my case, my root path was EcoIP:
$ ls EcoIP
DESCRIPTION  inst  man  NAMESPACE  R
DESCRIPTION and NAMESPACE are files and the rest are directories. Let's go through these one by one…
Description
This file contains information about the package: version, name, author… Here is my DESCRIPTION file:
Package: EcoIP
Type: Package
Title: Ecological Image Processing
Version: 0.1-20120726
Date: 20120726
Author: Joel Andres Granados <joel.granados@gmail.com>
Maintainer: Joel Andres Granados <joel.granados@gmail.com>
Description: EcoIP detects phenological phases based on image series.
License: GPL-3
Depends: EBImage,digest,fields,RSVGTipsDevice
The DESCRIPTION file supports more fields, but these are arguably the main ones. Or, at least, the ones I used :). All of them are pretty self-explanatory.
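As a quick sanity check, a DESCRIPTION file can be read back into R with `read.dcf`. The sketch below is only an illustration: it writes a copy of the fields shown above to a temporary file (the temporary path and the variable names are assumptions made for the example) and checks that the fields R treats as mandatory are present.

```r
# Write a copy of the DESCRIPTION fields to a temporary file (illustrative only).
desc_file <- file.path(tempdir(), "DESCRIPTION")
writeLines(c(
  "Package: EcoIP",
  "Type: Package",
  "Title: Ecological Image Processing",
  "Version: 0.1-20120726",
  "Date: 20120726",
  "Author: Joel Andres Granados <joel.granados@gmail.com>",
  "Maintainer: Joel Andres Granados <joel.granados@gmail.com>",
  "Description: EcoIP detects phenological phases based on image series.",
  "License: GPL-3",
  "Depends: EBImage,digest,fields,RSVGTipsDevice"
), desc_file)

# read.dcf returns a character matrix: one row per record, one column per field.
desc <- read.dcf(desc_file)

# Fields that R treats as mandatory in a DESCRIPTION file.
mandatory <- c("Package", "Version", "License", "Description",
               "Title", "Author", "Maintainer")
missing_fields <- setdiff(mandatory, colnames(desc))

# Split the comma-separated dependency list into a character vector.
deps <- strsplit(desc[1, "Depends"], ",\\s*")[[1]]

print(missing_fields)
print(deps)
```

Running a check like this before building can catch a misspelled or forgotten field early.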
Namespace
This is the file that contains what is to be exported into the R namespace when the package is ‘imported’ with the `library` command. Here is my NAMESPACE file:
export(eip.genOutput)
export(eip.nbm)
export(eip.histcmp)
export(eip.showModel)
export(eip.plot)
This is the list of functions that EcoIP exports. Each of my R files has lots of functions that could potentially be exported into the R namespace. With this file you explicitly tell R which functions you want exported; the rest stay hidden. Put here everything that you want the user to be able to call.
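If the list of exported functions is kept in a character vector, the NAMESPACE directives can be generated instead of typed by hand. This is just a sketch of that idea; the temporary output path and the variable names are assumptions, and the function names are the ones from the NAMESPACE file above.

```r
# Names of the functions that should be visible to the user.
exported <- c("eip.genOutput", "eip.nbm", "eip.histcmp",
              "eip.showModel", "eip.plot")

# One export() directive per function.
namespace_lines <- sprintf("export(%s)", exported)

# Write to a temporary file here; in a real package this would be ROOT/NAMESPACE.
ns_file <- file.path(tempdir(), "NAMESPACE")
writeLines(namespace_lines, ns_file)

# NAMESPACE directives use R syntax, so parse() is a cheap syntax check.
directives <- parse(ns_file)

cat(readLines(ns_file), sep = "\n")
```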
R Directory
Here is where all the R source files go. I did not have any other type of file, so I did not need to use other directories. But R expects additional directories if you are planning to use other languages like Java or C. Here are the contents of my R directory:
$ ls R
colorTrans.R  common.R  ecoip.R  imageTrans.R  naiveBayes.R
Something worth mentioning is that I had to change my `source` calls. Before creating the package I expected the source files to be in a certain place; after packaging, the source files live under the relative path that starts at R/… So if I wanted to ‘source’ a file I would have to do it like this:
source("R/common.R")
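Other than this, the files did not need any other package-related changes. Note that R reads every file in the R directory when it installs the package, so explicit `source` calls are often unnecessary inside a package.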
The man directory
Here is where all the manual pages are located. Something that I have noticed in the past with R packages is that they are very well documented. One reason for this is that R provides a handy command that practically creates the package for you: `package.skeleton`. In my case I only used it to create the manual pages, but that is just me :). You can use the command to create the whole package skeleton:
package.skeleton("EcoIP", code_files=c("colorTrans.R", "common.R", "ecoip.R", "imageTrans.R", "naiveBayes.R"))
The command will create the root path and all its contents. I only used it to generate the documentation templates, since they were really helpful when writing the documentation. In my case I only had to fill in the explanations for each function argument and give some general information about the function, and that was it. The stuff you don’t need from the template, you can just erase. Since all my exported functions were in just one file, I generated the documentation templates with the following command:
package.skeleton("EcoIP", code_files=c("ecoip.R"))
Then I went and fished for the templates and changed them to my liking. Here are the contents of the man directory:
$ ls man
EcoIP-package.Rd  eip.genOutput.Rd  eip.histcmp.Rd  eip.nbm.Rd  eip.plot.Rd  eip.showModel.Rd
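To see what `package.skeleton` produces without touching a real project, it can be pointed at a throwaway directory. The sketch below is not the EcoIP setup: `demo.fun`, `demoPkg` and the temporary directory are made-up names for illustration. It passes the function through the `list` and `environment` arguments rather than `code_files`.

```r
# A throwaway function that stands in for a real exported function.
demo_env <- new.env()
assign("demo.fun", function(x, scale = 1) x * scale, envir = demo_env)

# Fresh scratch directory so nothing existing gets overwritten.
out_dir <- tempfile("skeleton_")
dir.create(out_dir)

# Generate the skeleton; progress messages are silenced to keep the output short.
suppressMessages(
  package.skeleton(name = "demoPkg", list = "demo.fun",
                   environment = demo_env, path = out_dir)
)

# The documentation templates end up in the man directory.
man_files <- list.files(file.path(out_dir, "demoPkg", "man"))
print(man_files)
```

Each generated .Rd file is a template with placeholders for the title, the argument descriptions and the examples, and these are the parts that need to be filled in by hand.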
The inst directory
In my case I had some data that I wanted to include with my package. It was a bit tricky since it was not in one of the regular formats that R expects (csv file, Rdata, tar.gz…). I had to put it in a place where the user would be able to find it. I found that putting it in inst/extdata did the trick. The data that I included were some images, and the user can access these images with the `system.file` command. So if I wanted to list the contents of the inst/extdata directory I would execute this command in the R environment:
list.files(system.file("extdata", package="EcoIP"))
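One detail worth knowing is that `system.file` returns an empty string, not an error, when the requested file does not exist. The sketch below shows this with the `stats` package, which ships with R. The `find_pkg_file` helper is a hypothetical wrapper written for this example, not part of EcoIP.

```r
# A file that exists in every installed package.
desc_path <- system.file("DESCRIPTION", package = "stats")

# A file that does not exist: system.file() quietly returns "".
missing_path <- system.file("extdata", "no-such-file.png", package = "stats")

# Hypothetical helper that turns the silent "" into an explicit error.
find_pkg_file <- function(pkg, ...) {
  path <- system.file(..., package = pkg)
  if (!nzchar(path)) {
    stop("file not found in package '", pkg, "'")
  }
  path
}

print(nzchar(desc_path))
print(nzchar(missing_path))
```

`system.file` also accepts `mustWork = TRUE`, which makes it raise an error itself when nothing is found.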
Creating the package
Once you have your root path ready, you can run the following command (ROOT stands for the name of your root path):
R CMD build ROOT
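This command creates a tar.gz file. R names it after the package and its version (for example EcoIP_0.1-20120726.tar.gz), so use that file name in place of ROOT.tar.gz below. To check that the package is ‘sane’, you should run the following command: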
R CMD check ROOT.tar.gz
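Since the tarball produced by `R CMD build` is named `<Package>_<Version>.tar.gz`, the exact commands can be assembled from the values in DESCRIPTION. This is only a small sketch: the variable names are assumptions, and it prints the commands without running them.

```r
# Values taken from the DESCRIPTION file shown earlier.
pkg <- "EcoIP"
version <- "0.1-20120726"

# R CMD build names its output <Package>_<Version>.tar.gz.
tarball <- sprintf("%s_%s.tar.gz", pkg, version)

# Commands to run from the directory that contains the root path.
build_cmd <- paste("R CMD build", pkg)
check_cmd <- paste("R CMD check", tarball)

cat(build_cmd, check_cmd, sep = "\n")
```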
Troubleshooting
I had some issues when I first tried to check the package. They were due to my code trying to source files from places other than the R directory. To avoid this, source your R files as if they were all in an R directory.
I also had some issues where the package could only be versioned by two numbers: major.minor. I didn’t quite find a way around this problem.
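When a version string causes trouble, one way to investigate is to ask R how it parses the string using `package_version`. The sketch below does not reproduce the issue described above. It only shows what R makes of the version used in this post, and of a three-part version for comparison (the variable names are assumptions).

```r
# The version string used in the DESCRIPTION file of this post.
v <- package_version("0.1-20120726")

# The integer components R extracted from the string.
components <- unlist(v)
print(components)

# A version with three components, for comparison.
v3 <- package_version("1.2.3")
print(length(unlist(v3)))

# Versions compare component by component, so date-stamped builds sort in order.
is_older <- v < package_version("0.1-20120801")
print(is_older)

# A string that is not a valid version raises an error.
bad <- tryCatch(package_version("abc"), error = function(e) "invalid")
print(bad)
```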
Installing
I found that EBImage’s way of installation is really cool. I created a file in my source to mimic the behavior of EBImage. So now all you have to do to install EcoIP into an R environment is execute two commands:
source("http://www.itu.dk/~jogr/Links/PermLinks/ecoip_install.R")
ecoip_install()
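For readers who already have the tar.gz file on disk, a local install is also possible with `install.packages`. The helper below is a hypothetical sketch and is not the contents of ecoip_install.R. The name `install_local_tarball` is made up for this example. Note that with `repos = NULL` R does not fetch dependencies, so packages such as EBImage would have to be installed beforehand.

```r
# Hypothetical helper (NOT the author's ecoip_install): install a package
# from a source tarball that is already on disk.
install_local_tarball <- function(tarball) {
  if (!file.exists(tarball)) {
    stop("tarball not found: ", tarball)
  }
  # repos = NULL tells R to treat the first argument as a file path.
  install.packages(tarball, repos = NULL, type = "source")
  invisible(tarball)
}

# Example call (commented out because it modifies the R library):
# install_local_tarball("EcoIP_0.1-20120726.tar.gz")
```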