Planet R

November 01, 2014


New package RealVAMS with initial version 0.3-1

Package: RealVAMS
Type: Package
Title: Multivariate VAM Fitting
Version: 0.3-1
Date: 2014-09-25
Author: Andrew Karl, Jennifer Broatch, and Jennifer Green
Maintainer: Andrew Karl
Description: The RealVAMS package fits a multivariate value-added model (VAM) (see Broatch and Lohr 2012) with normally distributed test scores and a binary outcome indicator. This material is based upon work supported by the National Science Foundation under grants DRL-1336027 and DRL-1336265.
License: GPL-2
Depends: R (>= 3.0.0), Matrix
Imports: numDeriv, Rcpp (>= 0.10.6)
LazyData: yes
ByteCompile: yes
NeedsCompilation: yes
LinkingTo: Rcpp, RcppArmadillo
Packaged: 2014-11-01 03:36:55 UTC; Andrew
Repository: CRAN
Date/Publication: 2014-11-01 07:19:08

More information about RealVAMS at CRAN

November 01, 2014 07:13 AM

October 31, 2014


New package hermite with initial version 1.0

Package: hermite
Type: Package
Title: Generalized Hermite distribution
Version: 1.0
Date: 2014-10-30
Encoding: UTF-8
Author: David Moriña (Centre for Research in Environmental Epidemiology, CREAL), Manuel Higueras (Universitat Autònoma de Barcelona and Public Health England) and Pedro Puig (Universitat Autònoma de Barcelona)
Maintainer: David Moriña Soler
Description: Probability functions for the generalized Hermite distribution
Depends: R (>= 2.15.0)
Repository: CRAN
License: GPL (>= 2)
Packaged: 2014-10-31 15:08:23 UTC; dmorinya
NeedsCompilation: no
Date/Publication: 2014-10-31 16:16:01

More information about hermite at CRAN

October 31, 2014 05:13 PM

Removed CRANberries

Package STARSEQ (with last version 1.2.1) was removed from CRAN

Previous versions (as known to CRANberries) which should be available via the Archive link are:

2014-10-21 1.2.1
2012-08-24 1.02

October 31, 2014 05:13 AM

Package highriskzone (with last version 1.1) was removed from CRAN

Previous versions (as known to CRANberries) which should be available via the Archive link are:

2014-02-20 1.1
2012-12-18 1.0

October 31, 2014 05:13 AM

October 30, 2014


New package genpathmox with initial version 0.1

Package: genpathmox
Title: Generalized PATHMOX Algorithm for PLS-PM, LS and LAD Regression
Version: 0.1
Authors@R: "Giuseppe Lamberti [aut, cre]"
Description: genpathmox provides a very interesting solution for handling segmentation variables in complex statistical methodology. It contains an extended version of the PATHMOX algorithm in the context of partial least squares path modeling (Sanchez, 2009) including the F-block test (to detect the latent endogenous equations responsible for the difference), the F-coefficient (to detect the path coefficients responsible for the difference) and the invariance test (to compare the sub-models' latent variables). Furthermore, the package contains a generalized version of the PATHMOX algorithm to approach different methodologies: linear regression and least absolute regression models.
Depends: R (>= 3.1.1),plspm, quantreg,mice,diagram, methods
License: GPL-3
LazyData: true
Packaged: 2014-10-30 14:37:42 UTC; giuseppelamberti
Author: "Giuseppe Lamberti" [aut, cre]
Maintainer: "Giuseppe Lamberti"
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-30 19:35:26

More information about genpathmox at CRAN

October 30, 2014 07:13 PM

New package fPortfolio with initial version 3011.81

Package: fPortfolio
Title: Rmetrics - Portfolio Selection and Optimization
Date: 2014-10-30
Version: 3011.81
Author: Rmetrics Core Team, Diethelm Wuertz [aut], Tobias Setz [cre], Yohan Chalabi [ctb]
Maintainer: Tobias Setz
Description: Environment for teaching "Financial Engineering and Computational Finance".
Depends: R (>= 2.15.1), methods, timeDate, timeSeries, fBasics, fAssets
Imports: fCopulae, robustbase, MASS, Rglpk, slam, Rsymphony, Rsolnp, kernlab, quadprog, rneos
Suggests: Rsocp, Rnlminb2, Rdonlp2, dplR, bcp, fGarch, mvoutlier
LazyData: yes
License: GPL (>= 2)
Packaged: 2014-10-30 14:13:59 UTC; Tobi
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-30 15:54:08

More information about fPortfolio at CRAN

October 30, 2014 03:13 PM

New package PrevMap with initial version 1.0

Package: PrevMap
Type: Package
Title: PrevMap - an R package for prevalence mapping
Version: 1.0
Date: 2014-06-25
Author: Emanuele Giorgi, Peter J. Diggle
Maintainer: Emanuele Giorgi
Depends: geoR, maxLik, raster, pdist
Description: The PrevMap package provides functions for both likelihood-based and Bayesian analysis of spatially referenced prevalence data. 'PrevMap' is also an extension of the 'geoR' package which should be installed first together with the 'maxLik', 'raster' and 'pdist' packages.
LazyData: true
License: GPL (>= 2)
Packaged: 2014-10-30 09:40:57 UTC; giorgi
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-30 11:18:02

More information about PrevMap at CRAN

October 30, 2014 11:13 AM

New package webutils with initial version 0.2

Package: webutils
Type: Package
Title: Utility Functions for Web Applications
Version: 0.2
Date: 2014-10-29
Author: Jeroen Ooms
Maintainer: Jeroen Ooms
Description: Utility functions for developing web applications. Includes parsers for application/x-www-form-urlencoded as well as multipart/form-data and examples of using the parser with either httpuv or rhttpd.
License: MIT + file LICENSE
Imports: tools, utils, jsonlite
Suggests: httpuv
Packaged: 2014-10-30 00:32:01 UTC; jeroen
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-30 07:46:54

More information about webutils at CRAN

October 30, 2014 07:13 AM

October 29, 2014


New package exCon with initial version 0.1-0

Package: exCon
Type: Package
Title: Interactive Exploration of Contour Data
Version: 0.1-0
Date: 2014-10-29
Authors@R: c(person("Bryan", "Hanson", role = c("aut", "cre"), email = ""), person("Kristina", "Mulry", role = "ctb"))
Description: exCon is an interactive tool to explore topographic-like data sets. Such data sets take the form of a matrix in which the rows and columns provide location/frequency information, and the matrix elements contain altitude/response information. Such data is found in cartography, 2D spectroscopy and chemometrics. exCon creates an interactive web page showing the contoured data set along with slices from the original matrix parallel to each dimension. The page is written in d3/javascript.
License: GPL-3
Imports: jsonlite
ByteCompile: TRUE
Depends: R (>= 3.0)
Packaged: 2014-10-29 19:41:42 UTC; bryanhanson
Author: Bryan Hanson [aut, cre], Kristina Mulry [ctb]
Maintainer: Bryan Hanson
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-29 22:05:54

More information about exCon at CRAN

October 29, 2014 09:13 PM

Bioconductor Project Working Papers

sanon : An R Package for Stratified Analysis with Nonparametric Covariable Adjustment

Kawaguchi et al. (2011) provided methodology and applications for a stratified Mann-Whitney estimator that addresses the same comparison between two randomized groups for a strictly ordinal response variable as the van Elteren test statistic for randomized clinical trials with strata. The sanon package provides the implementation of the method within the R programming environment (R Core Team, 2012). The usage of sanon is illustrated with five examples. The first example is a randomized clinical trial with eight strata and a univariate ordinal response variable. The second example is a randomized clinical trial with four strata, two covariables, and four ordinal response variables. The third example is a cross over design randomized clinical trial with two strata, one covariable, and two ordinal response variables. The fourth example is a randomized clinical trial with seven strata (which are managed as a categorical covariable), three ordinal covariables with missing values, and three ordinal response variables with missing values. The fifth example is a randomized clinical trial with six strata, a categorical covariable with three levels, and three ordinal response variables with missing values.

by Atsushi Kawaguchi et al. at October 29, 2014 02:15 PM

Removed CRANberries

Package RSQLite.extfuns (with last version 0.0.1) was removed from CRAN

Previous versions (as known to CRANberries) which should be available via the Archive link are:

2010-05-30 0.0.1

October 29, 2014 11:13 AM


New package DAMOCLES with initial version 1.0

Type: Package
Title: Dynamic Assembly Model Of Colonization, Local Extinction and Speciation
Version: 1.0
Date: 2014-09-09
Depends: R (>= 2.14.2), geiger, caper, ape, deSolve, matrixStats
Author: Rampal S. Etienne & Alex L. Pigot
Maintainer: Rampal S. Etienne
License: GPL-2
Description: Simulates and computes (maximum) likelihood of a dynamical model of community assembly that takes into account the phylogenetic history
Packaged: 2014-09-09 18:19:08 UTC; p223208
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-29 09:29:22

More information about DAMOCLES at CRAN

October 29, 2014 09:13 AM

New package rqPen with initial version 1.0

Package: rqPen
Type: Package
Title: R Package for Penalized Quantile Regression
Version: 1.0
Date: 2014-10-25
Author: Ben Sherwood
Depends: R (>= 3.0.0),quantreg
Maintainer: Ben Sherwood
Description: Performs penalized quantile regression for LASSO, SCAD and MCP functions. Provides a function that automatically generates lambdas and evaluates different models with cross validation or BIC, including a large p version of BIC.
ByteCompile: TRUE
License: MIT + file LICENSE
Packaged: 2014-10-29 01:29:00 UTC; bsherwoo
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-29 07:44:31

More information about rqPen at CRAN

October 29, 2014 07:13 AM

New package addreg with initial version 1.2

Package: addreg
Title: Additive Regression for Discrete Data
Description: Methods for fitting identity-link GLMs and GAMs to discrete data. The package uses EM-type algorithms with more stable convergence properties than standard methods.
Version: 1.2
Date: 2014-10-29
Author: Mark Donoghoe
Maintainer: Mark Donoghoe
Depends: R (>= 3.0.1)
Imports: splines, combinat, glm2
License: GPL (>= 2)
LazyData: true
Packaged: 2014-10-28 22:19:48 UTC; 42674999
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-29 07:36:11

More information about addreg at CRAN

October 29, 2014 07:13 AM

October 28, 2014


New package MfUSampler with initial version 0.9

Package: MfUSampler
Type: Package
Title: Multivariate-from-Univariate (MfU) MCMC Sampler
Version: 0.9
Date: 2014-10-27
Author: Alireza S. Mahani, Mansour T.A. Sharabiani
Maintainer: Alireza S. Mahani
Description: Convenience Functions for Multivariate MCMC Using Univariate Samplers, including Slice Sampler with Stepout and Shrinkage (Neal, 2003), and Adaptive Rejection Sampler (Gilks and Wild, 1992).
License: GPL (>= 2)
Depends: ars
Suggests: sns
Packaged: 2014-10-28 18:54:28 UTC; amahani
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-28 22:00:52

More information about MfUSampler at CRAN

October 28, 2014 09:13 PM

New package CateSelection with initial version 1.0

Package: CateSelection
Type: Package
Title: Categorical Variable Selection Methods
Version: 1.0
Date: 2014-10-28
Author: Yi Xu and Jixiang Wu
Maintainer: Yi Xu
Description: A multi-factor dimensionality reduction based forward selection method for genetic association mapping.
License: GPL (>= 2)
Depends: R(>= 2.10)
Repository: CRAN
Packaged: 2014-10-28 20:38:02 UTC; Xu
NeedsCompilation: no
Date/Publication: 2014-10-28 21:58:29

More information about CateSelection at CRAN

October 28, 2014 09:13 PM

New package safi with initial version 1.0

Package: safi
Type: Package
Title: Sensitivity Analysis for Functional Input
Version: 1.0
Date: 2014-10-28
Author: Jana Fruth, Malte Jastrow
Maintainer: Jana Fruth
Description: Design and sensitivity analysis for computer experiments with scalar-valued output and functional input, e.g. over time or space. The aim is to explore the behavior of the sensitivity over the functional domain.
License: GPL (>= 2)
Packaged: 2014-10-28 15:53:50 UTC; Jana
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-28 17:35:33

More information about safi at CRAN

October 28, 2014 05:13 PM

New package settings with initial version 0.1.1

Package: settings
Type: Package
Title: Software Option Settings Manager for R
Version: 0.1.1
Date: 2014-10-22
Author: Mark van der Loo
Maintainer: Mark van der Loo
Description: Provides option settings management that goes beyond R's default 'options' function. With this package, users can define their own option settings manager holding option names and default values. Settings can then be retrieved, altered and reset to defaults with ease. For R programmers and package developers it offers cloning and merging functionality which allows for conveniently defining global and local options, possibly in a multilevel options hierarchy. See the package vignette for some examples concerning functions, S4 classes, and reference classes.
License: GPL-3
VignetteBuilder: knitr
Suggests: testthat, knitr
Packaged: 2014-10-28 07:43:29 UTC; mark
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-28 11:18:40

More information about settings at CRAN

October 28, 2014 11:13 AM

Removed CRANberries

Package HPO.db (with last version 1.2) was removed from CRAN

Previous versions (as known to CRANberries) which should be available via the Archive link are:

2013-07-04 1.2
2012-10-30 1.0

October 28, 2014 07:13 AM

Package HPOSim (with last version 1.2) was removed from CRAN

Previous versions (as known to CRANberries) which should be available via the Archive link are:

2014-02-21 1.2
2013-07-05 1.1
2013-03-28 1.0

October 28, 2014 07:13 AM

October 27, 2014


New package bio3d with initial version 2.1-3

Package: bio3d
Title: Biological Structure Analysis
Version: 2.1-3
Author: Barry Grant, Xin-Qiu Yao, Lars Skjaerven, Julien Ide
VignetteBuilder: knitr
Imports: parallel, grid
Suggests: XML, RCurl, lattice, ncdf, igraph, bigmemory, knitr, testthat (>= 0.9.1)
Depends: R (>= 3.1.0)
LazyData: yes
Description: Utilities to process, organize and explore protein structure, sequence and dynamics data. Features include the ability to read and write structure, sequence and dynamic trajectory data, perform sequence and structure database searches, data summaries, atom selection, alignment, superposition, rigid core identification, clustering, torsion analysis, distance matrix analysis, structure and sequence conservation analysis, normal mode analysis, principal component analysis of heterogeneous structure data, and correlation network analysis from normal mode and molecular dynamics data. In addition, various utility functions are provided to enable the statistical and graphical power of the R environment to work with biological sequence and structural data. Please refer to the URLs below for more information.
Maintainer: Barry Grant
License: GPL (>= 2)
Packaged: 2014-10-24 21:47:52 UTC; xinqyao
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-27 19:03:11

More information about bio3d at CRAN

October 27, 2014 07:13 PM

New package h2o with initial version

Package: h2o
Type: Package
Title: H2O R Interface
Date: 2014-05-15
Author: Anqi Fu, Spencer Aiello, Ariel Rao, Tom Kraljevic and Petr Maj, with contributions from the 0xdata team
Maintainer: Tom Kraljevic
Description: Run H2O via its REST API from within R.
License: Apache License (== 2.0)
Depends: R (>= 2.13.0), RCurl, rjson, statmod, survival, stats, tools, utils, methods
Collate: Wrapper.R Internal.R Classes.R ParseImport.R models.R Algorithms.R zzz.R
NeedsCompilation: no
SystemRequirements: Java (>= 1.6)
Suggests: plyr
Packaged: 2014-10-23 02:24:40 UTC; jenkins
Repository: CRAN
Date/Publication: 2014-10-27 18:11:24

More information about h2o at CRAN

October 27, 2014 05:13 PM

New package qdm with initial version 0.1-0

Package: qdm
Version: 0.1-0
Date: 2014-10-16
Title: Fitting a Quadrilateral Dissimilarity Model to Same-Different Judgments
Authors@R: c(person("Nora", "Umbach", role = c("aut", "cre"), email = ""), person("Florian", "Wickelmaier", role = "aut"))
Depends: R (>= 3.1.0), stats, graphics
Description: This package provides different specifications of a Quadrilateral Dissimilarity Model which can be used to fit same-different judgments in order to get a predicted matrix that satisfies regular minimality [Colonius & Dzhafarov, 2006, Measurement and representations of sensations, Erlbaum]. From such a matrix, Fechnerian distances can be computed.
License: GPL (>= 2)
Packaged: 2014-10-27 13:33:02 UTC; noraumbach
Author: Nora Umbach [aut, cre], Florian Wickelmaier [aut]
Maintainer: Nora Umbach
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-27 15:30:23

More information about qdm at CRAN

October 27, 2014 03:14 PM


ALUES: Agricultural Land Use Evaluation System, R package

Arnold R. Salvacion
Data Analysis and Visualization using R (blog)

Al-Ahmadgaid B. Asaad (maintainer)

Agricultural Land Use Evaluation System (ALUES) is an R package that evaluates land suitability for the production of different crops. The package is based on the Food and Agriculture Organization (FAO) and the International Rice Research Institute (IRRI) methodology for land evaluation. Development of ALUES is inspired by a similar tool for land evaluation, the Land Use Suitability Evaluation Tool (LUSET). The package uses a fuzzy logic approach to evaluate the land suitability of a particular area based on inputs such as rainfall, temperature, topography, and soil properties. The membership functions used for fuzzy modeling are the following: Triangular, Trapezoidal and Gaussian. Methods for computing the overall suitability of a particular area are also included: Minimum, Maximum, Product, Sum, Average, Exponential and Gamma. Finally, ALUES uses the Rcpp library for efficient computation.


The package is not yet on CRAN, and is currently under development on GitHub. To install it, run the following:

We welcome feedback. If you have any suggestions or issues regarding this package, please submit them here.


The package contains several datasets, which fall into two categories:
  1. Land Units' Attributes - datasets that contain the attributes of the land units of a given location.
  2. Crop Requirements - datasets that contain the required values of factors of a particular crop for the land units.

Land Units' Attributes

The package contains sample datasets of land units' attributes from two countries:
  1. Marinduque, Philippines:
    • MarinduqueLT - a dataset containing the land and terrain characteristics of the land units of Marinduque, Philippines;
    • MarinduqueTemp - a dataset containing the temperature characteristics of the land units of Marinduque, Philippines; and
    • MarinduqueWater - a dataset containing the water characteristics of the land units of Marinduque, Philippines.
  2. Lao Cai, Vietnam:
    • LaoCaiLT - a dataset containing the land and terrain characteristics of the land units of Lao Cai, Vietnam;
    • LaoCaiTemp - a dataset containing the temperature characteristics of the land units of Lao Cai, Vietnam; and
    • LaoCaiWater - a dataset containing the water characteristics of the land units of Lao Cai, Vietnam.
For example, the first six land units in MarinduqueLT are shown below:

The complete list of factors is available in the pdf version.

Crop Requirements

The crops available in the package are listed in Table 1.

Table 1: Crops Dataset Available in ALUES.
COFFEEAR-Arabica Coffee
COFFEERO-Robusta Coffee
RICEBR-Rainfed Bunded Rice
RICEIW-Irrigated Rice
RICENF-Rice Cultivation Under Natural Floods
RICEUR-Rainfed Upland Rice

In the datasets, the codes from the table are suffixed with the land units' characteristics (TerrainCR, SoilCR, WaterCR and TemperatureCR) required for the crop. For example, below are the required values for the terrain characteristics of the land units for cultivating coconut:

For the required soil, water and temperature characteristics for cultivating coconut, the codes are COCONUTSoilCR, COCONUTWaterCR and COCONUTTemperatureCR, respectively.


The package contains the following functions:
  1. suitability - computes the suitability scores and classes of the land units based on the requirements of the crop.
  2. overall_suit - computes the overall suitability of the land units, using the suitability scores obtained from the suitability function.


In this section, we will get into the details of the suitability function.

Usage

x - a data frame containing the properties of the land units;
y - a data frame containing the crop (e.g. coconut, cassava, etc.) requirements for a given characteristic (terrain, soil, water or temperature);
mf - membership function; the default is "triangular". Other fuzzy models are "Trapezoidal" and "Gaussian".
sow.month - sowing month of the crop. Takes integers from 1 to 12 (inclusive), representing the twelve months of the year. If set to 1, the function assumes January as the sowing month.
min - factor's minimum value. If NULL (default), min is set to 0. If numeric of length one, say 0.5, then the minimum is set to 0.5 for all factors. If the factors of the land units (x) have different minimums, these can be concatenated into a vector of mins, whose length should be equal to the number of factors in x. However, if set to "average", then min is computed as follows:

Let X be a factor with suitability classes S3, S2 and S1, and let the scores of these classes for X be $a, b$ and $c$, respectively. Then, $$\mathrm{min} = a - \displaystyle\frac{(b - a) + (c - b)}{2}$$ For factors with suitability classes S3, S2, S1, S1, S2 and S3 with scores $a, b, c, d, e$ and $f$, respectively, min is computed as $$\mathrm{min} = a - \displaystyle\frac{(b - a) + (c - b) + (d - c) + (e - d) + (f - e)}{5}$$
max - factor's maximum value. The default is "average". If numeric of length one, say 50, then the maximum is set to 50 for all factors. If the factors of the land units (x) have different maximums, these can be concatenated into a vector of maxs, whose length should be equal to the number of factors in x. However, if set to "average", then max is computed from the equation below: $$\mathrm{max}=c + \displaystyle\frac{(b-a) + (c-b)}{2}$$ For factors with suitability classes S3, S2, S1, S1, S2 and S3 with scores $a, b, c, d, e$ and $f$, respectively, $$\mathrm{max} = f + \displaystyle\frac{(b - a) + (c - b) + (d - c) + (e - d) + (f - e)}{5}$$
interval - domain for every suitability class (S1, S2, S3, and N). If "fixed", the interval is 0 to 0.25 for N (Not Suitable), 0.25 to 0.50 for S3 (Marginally Suitable), 0.50 to 0.75 for S2 (Moderately Suitable), and 0.75 to 1 for S1 (Highly Suitable). If "unbias", then the interval is set to 0 to $\displaystyle\frac{a}{\mathrm{max}}$ for N, $\displaystyle\frac{a}{\mathrm{max}}$ to $\displaystyle\frac{b}{\mathrm{max}}$ for S3, $\displaystyle\frac{b}{\mathrm{max}}$ to $\displaystyle\frac{c}{\mathrm{max}}$ for S2, and $\displaystyle\frac{c}{\mathrm{max}}$ to $\displaystyle\frac{\mathrm{max}}{\mathrm{max}}$ for S1.
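
The "average" rule for min and max, and the "unbias" breakpoints, can be worked through by hand. The sketch below is only an illustration of the formulas above: average_bounds and unbias_breaks are hypothetical helpers, not ALUES functions, and the scores 10, 20 and 40 are made up.

```r
# Hypothetical helpers for illustration; these are NOT part of the ALUES API.

# scores: class scores in the order used in the text,
# e.g. c(a, b, c) for S3, S2, S1, or c(a, b, c, d, e, f) for the six-class case.
average_bounds <- function(scores) {
  # Mean of consecutive differences: ((b - a) + (c - b)) / 2 for three scores,
  # and the sum of the five differences divided by 5 for six scores.
  step <- mean(diff(scores))
  c(min = scores[1] - step, max = scores[length(scores)] + step)
}

# "unbias" breakpoints for a three-class factor: 0, a/max, b/max, c/max, max/max.
unbias_breaks <- function(scores, max) {
  c(0, scores / max, 1)
}

# Made-up scores a = 10, b = 20, c = 40.
bounds <- average_bounds(c(10, 20, 40))
print(bounds)

breaks <- unbias_breaks(c(10, 20, 40), bounds[["max"]])
print(breaks)
```

With these made-up scores the mean step is 15, so min is 10 - 15 = -5 and max is 40 + 15 = 55; the "unbias" breakpoints are then 0, 10/55, 20/55, 40/55 and 1.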

The function returns the following output:
  1. Actual Factors Evaluated;
  2. Suitability Score;
  3. Suitability Class;
  4. Factors' Minimum Values; and,
  5. Factors' Maximum Values.
Example: To test the suitability of the land units in Marinduque, Philippines, for the terrain requirements of coconut, we have

Before we run the function, let's check the possible output. From the land units (MarinduqueLT), the only factor available to be evaluated is CFragm, for the required soil characteristics of coconut. The first land unit has 11% coarse fragment (CFragm), which falls within the S1 domain of the required soil characteristics, [min - 15%), where min has a default value of 0. The second to sixth land units are also highly suitable, as they fall within the same domain. Let's confirm this using the function,

Extract the first 6 rows of the output,

Indeed, just what we argued earlier.

Options for mf (Membership Function)
The membership function option sets the type of fuzzy model. The available models are the following:
  1. Triangular;
  2. Trapezoidal; and,
  3. Gaussian.
The suitability scores are computed based on these fuzzy models.

Options for sow.month (Sowing Month)
The sow.month argument is the sowing month, which takes integers from 1 to 12, representing the twelve months of the year. If set to 1, the function assumes January as the sowing month. This argument is only used for water and temperature characteristics.

To illustrate this, we will test the land units of Marinduque against the water and temperature requirements for rainfed bunded rice. Thus, we have

We will first test the land units for water. Here are the water requirements for rainfed bunded rice,

The factors to be evaluated here are the following:
  1. WmAv1 - Mean precipitation of first month (mm);
  2. WmAv2 - Mean precipitation of second month (mm);
  3. WmAv3 - Mean precipitation of third month (mm); and
  4. WmAv4 - Mean precipitation of fourth month (mm).
If sowing month is set to November, then we have
  1. WmAv1 - November;
  2. WmAv2 - December;
  3. WmAv3 - January; and
  4. WmAv4 - February.
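
The mapping from a sowing month to the four evaluated months, including the wrap from December to January, can be sketched as follows. This is a minimal illustration under the assumption that the months are simply consecutive; sowing_months is a hypothetical helper, not an ALUES function.

```r
# Hypothetical helper for illustration; NOT part of the ALUES API.
# Returns the names of n_months consecutive months starting at sow.month,
# wrapping around from December (12) to January (1).
sowing_months <- function(sow.month, n_months = 4) {
  stopifnot(length(sow.month) == 1, sow.month %in% 1:12)
  idx <- (sow.month - 1 + 0:(n_months - 1)) %% 12 + 1
  month.name[idx]  # month.name is a constant built into base R
}

# Sowing in November (11): label the result with the factor names from the text.
evaluated <- sowing_months(11)
names(evaluated) <- paste0("WmAv", 1:4)
print(evaluated)
```

For sow.month = 11 this gives November, December, January and February, matching the list above.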
So for November, we see that the first land unit falls within the domain of S1, that is, 277 mm falls within [175 - 500 mm). The same holds for the first land unit in December: highly suitable. Let's fire up the function to confirm that,

You will get this error if there are no factors to be evaluated. What happened here is that the function assumed the data to be neither water nor temperature characteristics, and thus ignored the WmAv1, WmAv2, WmAv3 and WmAv4 factors. But if we set the sowing month (sow.month) to November (11), then we have

The first land unit is indeed confirmed to be S1 for November, but for December it is not; S2 is given instead. This problem is discussed later, in the details of the interval argument.

Options for min (Factors' Minimum Value)
By default, min = 0 for all factors. It can be set to any positive integer; for example, using the cassava soil requirements,

Now let's try different minimums for the factors. We will use the following:

Table 2: Custom min.

So we got an error, which is expected, since the length of the min vector should be equal to the number of factors in x, which is 6. Since we are not interested in the latitude (X) and longitude (Y) factors of the dataset, we can omit the two and rerun the code,

Only CECc and SoilTe are returned since these are the factors evaluated.

Options for max (Factors' Maximum Value)
By default, max = 'average', and just like min, max can be set to any positive integer. For example:

To use a different maximum value for every factor, we will use the following and omit the first two factors in MarinduqueLT, as we did in the previous section.

Table 3: Custom max.

Options for interval (Domain of Suitability Scores)
The domain of the suitability scores defaults to 'fixed'. With this option, the domains of the suitability classes are:

Table 4: Domain for 'fixed'.
Class   N          S3           S2           S1
Domain  [0, 0.25)  [0.25, 0.5)  [0.5, 0.75)  [0.75, 1]
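
The 'fixed' domains in Table 4 amount to binning a score into half-open intervals. The sketch below shows one way to express that with base R's cut(); classify_scores is a hypothetical helper, not the ALUES implementation, and the example scores are made up.

```r
# Hypothetical helper for illustration; NOT the ALUES implementation.
# Bins suitability scores into the classes N, S3, S2 and S1.
classify_scores <- function(scores, breaks = c(0, 0.25, 0.5, 0.75, 1)) {
  as.character(cut(
    scores,
    breaks = breaks,
    labels = c("N", "S3", "S2", "S1"),
    right = FALSE,          # intervals are closed on the left: [0, 0.25), ...
    include.lowest = TRUE   # with right = FALSE, closes the last interval: [0.75, 1]
  ))
}

# Made-up scores, one per class.
classes <- classify_scores(c(0.10, 0.30, 0.60, 1))
print(classes)
```

The breaks argument is exposed so that other breakpoints can be tried in the same way; whether ALUES applies its intervals exactly like this is an assumption.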

An example of interval = 'fixed' is the one illustrated in Options for sow.month (Sowing Month) above. Let us investigate its output. Here are the crop requirements for water (the crop we are interested in is rainfed bunded rice),

Given that the starting sowing month assigned is November, then the following factors are evaluated:
  1. WmAv1 - November;
  2. WmAv2 - December;
  3. WmAv3 - January; and
  4. WmAv4 - February.
So we are going to extract these factors from the dataset, MarinduqueWater,

The suitability scores and classes of these would be,

Focus on the suitability scores of the Feb factor for the first three land units: 0.3714, 0.3714 and 0.3771. Based on Table 4, these fall into classes S3, S3 and S3. But if we refer to the original data, the first three data points of the Feb factor are all 65. WmAv4 holds the corresponding requirements for the Feb factor, with scores:

Table 5: WmAv4’s Suitability Requirements.

Then it is easy to pinpoint which suitability class the scores of the land units fall into: all of the first three land units fall within class S1. See the problem with the 'fixed' interval? The same problem occurs for other factors such as Dec (December), where instead of S1 we got S2. Users can change the domain, though: instead of using the 'fixed' option, users can assign, for example, interval = c(0, 0.33, 0.56, 0.89, 1), which is equivalent to:

Table 6: Custom Domains.
Class   N          S3            S2            S1
Domain  [0, 0.33)  [0.33, 0.56)  [0.56, 0.89)  [0.89, 1]

Assigning new values to the interval won't solve the problem, but this argument has one more option that does: changing interval = 'fixed' to interval = 'unbias'. Let's try it,

And that supports our argument above.

The suitability function also considers the weights of the factors. An example of a crop with no weights is the soil requirement for coconut,

The weights are assigned in the last column, Weight.class. And here are the soil requirements for cassava, with a weight on each factor:

If a given factor has a weight, then the function will compute the corresponding suitability and then use the weighting score to obtain the appropriate suitability score. The weights of the factors for the default interval (interval = 'fixed') are in Table 7:

SuitabilityFactor Weights
Table 7: Weights of the Factors for 'fixed' Interval.

Thus the function simply divides the interval of the suitability class into three, for three weights.

Overall Suitability

x - a data frame containing the suitability scores of a given characteristic (terrain, soil, water or temperature) for a given crop (e.g. coconut, cassava, etc.);
method - the method for computing the overall suitability, which includes the minimum, maximum, sum, product, average, exponential and gamma. If NULL, minimum is used.
interval - if NULL, the intervals used are the following: 0-0.25 (Not Suitable, N), 0.25-0.50 (Marginally Suitable, S3), 0.50-0.75 (Moderately Suitable, S2), and 0.75-1 (Highly Suitable, S1).
output - the output to be returned, either the scores or the class. If NULL, both are returned.
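
To make the method argument concrete, the sketch below aggregates the factor scores of each land unit into a single score. It is an assumption about how the simpler methods behave, not the overall_suit implementation: aggregate_scores is a hypothetical helper, the data are made up, and the exponential and gamma methods are left out because the text does not define them.

```r
# Hypothetical helper for illustration; NOT the ALUES overall_suit implementation.
# scores: data frame with one row per land unit and one column per factor.
aggregate_scores <- function(scores, method = "minimum") {
  f <- switch(method,
    minimum = min,
    maximum = max,
    sum     = sum,
    product = prod,
    average = mean,
    stop("method not covered by this sketch: ", method)
  )
  # Apply the chosen function across the factors of each land unit (row).
  unname(apply(as.matrix(scores), 1, f))
}

# Made-up suitability scores for two land units and three factors.
scores <- data.frame(f1 = c(0.8, 0.4), f2 = c(0.6, 0.9), f3 = c(0.7, 0.2))

overall_min <- aggregate_scores(scores, "minimum")
overall_avg <- aggregate_scores(scores, "average")
print(overall_min)
print(overall_avg)
```

The minimum method is the most conservative choice, since a land unit is only as suitable as its worst factor, which may be why it is the default.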


Let's assume we are interested in the land units of Lao Cai, Vietnam, for cultivating irrigated rice. Here are the first 6 land units of that location,

And here are the required values for the factors of the soil, terrain, temperature and water characteristics for irrigated rice,

Now, we are going to take the suitability scores for every characteristic,

Next, we will take the overall suitability on all factors in each land unit using the "average" method (default is "minimum").

Finally, take the overall suitability from these characteristics using the "maximum" method.

by Al-Ahmadgaid Asaad at October 27, 2014 12:14 PM

October 25, 2014


New package managelocalrepo with initial version 0.1.4

Package: managelocalrepo
Type: Package
Title: Manage a CRAN-style Local Repository
Version: 0.1.4
Date: 2014-10-25
Author: Imanuel Costigan
Maintainer: Imanuel Costigan
Description: This will allow easier management of a CRAN-style repository on local networks (i.e. not on CRAN). This might be necessary where hosted packages contain intellectual property owned by a corporation.
License: GPL-2
Depends: R (>= 3.0)
Imports: stringr (>= 0.6.2), assertthat (>= 0.1), tools (>= 3.0)
Packaged: 2014-10-25 03:44:33 UTC; imanuel
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-25 08:07:55

More information about managelocalrepo at CRAN

October 25, 2014 07:13 AM

New package HiDimDA with initial version 0.2-2

Package: HiDimDA
Type: Package
Title: High Dimensional Discriminant Analysis
Version: 0.2-2
Date: 2014-10-24
Author: Antonio Pedro Duarte Silva
Maintainer: Antonio Pedro Duarte Silva
Depends: R (>= 2.10.0)
Imports: splines
Suggests: MASS
LazyLoad: yes
LazyData: yes
Description: Performs linear discriminant analysis in high dimensional problems based on reliable covariance estimators for problems with (many) more variables than observations. Includes routines for classifier training, prediction, cross-validation and variable selection.
License: GPL (>= 3)
Repository: CRAN
Packaged: 2014-10-24 13:48:10 UTC; apedro
NeedsCompilation: yes
Date/Publication: 2014-10-25 07:51:03

More information about HiDimDA at CRAN

October 25, 2014 07:13 AM

New package choroplethrMaps with initial version 1.0

Package: choroplethrMaps
Type: Package
Title: Contains maps used by the choroplethr package
Version: 1.0
Date: 2014-10-22
Author: Ari Lamstein
Maintainer: Ari Lamstein
Description: Contains 3 maps. 1) US States 2) US Counties 3) Countries of the world.
License: BSD_3_clause + file LICENSE
Suggests: ggplot2
Packaged: 2014-10-25 02:51:32 UTC; arilamstein
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-25 08:07:52

More information about choroplethrMaps at CRAN

October 25, 2014 07:13 AM

New package BayesMixSurv with initial version 0.9

Package: BayesMixSurv
Type: Package
Title: Bayesian Mixture Survival Models using Additive Mixture-of-Weibull Hazards, with Lasso Shrinkage and Stratification
Version: 0.9
Date: 2014-10-24
Author: Alireza S. Mahani, Mansour T.A. Sharabiani
Maintainer: Alireza S. Mahani
Description: Bayesian Mixture Survival Models using Additive Mixture-of-Weibull Hazards, with Lasso Shrinkage and Stratification
License: GPL (>= 2)
Depends: survival
Packaged: 2014-10-25 01:34:35 UTC; amahani
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-25 08:07:48

More information about BayesMixSurv at CRAN

October 25, 2014 07:13 AM

New package highD2pop with initial version 1.0

Package: highD2pop
Type: Package
Title: Two-Sample Tests for Equality of Means in High Dimension
Version: 1.0
Date: 2012-11-02
Author: Karl Gregory
Maintainer: Karl Gregory
Description: Performs the generalized component test from Gregory et al (2015), as well as the tests from Chen and Qin (2010), Srivastava and Kubokawa (2013), and Cai, Liu, and Xia (2014) for equality of two population mean vectors when the length of the vectors exceeds the sample size.
License: GPL (>= 2)
LazyLoad: true
LazyData: true
ZipData: no
Depends: fastclime
Packaged: 2014-10-24 14:25:33 UTC; karlgregory
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-10-25 01:28:48

More information about highD2pop at CRAN

October 25, 2014 01:13 AM

October 24, 2014


New package ruv with initial version 0.9.4

Package: ruv
Title: Detect and Remove Unwanted Variation using Negative Controls
Description: The algorithms in this package attempt to adjust for systematic errors of unknown origin in high-dimensional data. The algorithms were originally developed for use with genomic data, especially microarray data, but may be useful with other types of high-dimensional data as well. The algorithms included in this package are RUV-2, RUV-4, RUV-inv, and RUV-rinv, along with various supporting algorithms. These algorithms were proposed by Gagnon-Bartsch and Speed (2012), and by Gagnon-Bartsch, Jacob and Speed (2013). The algorithms require the user to specify a set of negative control variables, as described in the references.
Version: 0.9.4
Date: 2014-10-24
Author: Johann Gagnon-Bartsch
Maintainer: Johann Gagnon-Bartsch
License: GPL
LazyLoad: yes
Packaged: 2014-10-24 09:52:13 UTC; johann
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-10-24 13:55:18

More information about ruv at CRAN

October 24, 2014 01:13 PM

New package PANICr with initial version

Package: PANICr
Title: PANIC Tests of Nonstationarity
Description: This package contains a methodology that makes use of the factor structure of large dimensional panels to understand the nature of nonstationarity inherent in data. This is referred to as PANIC - Panel Analysis of Nonstationarity in Idiosyncratic and Common Components. PANIC (2004) includes valid pooling methods that allow panel tests to be constructed. PANIC (2004) can detect whether the nonstationarity in a series is pervasive, or variable specific, or both. PANIC (2010) includes two new tests on the idiosyncratic component that estimates the pooled autoregressive coefficient and sample moment, respectively. The PANIC model approximates the number of factors based on Bai and Ng (2002)
Depends: R(>= 2.10.0)
License: GPL-3
LazyData: no
Author: Steve Bronder
Maintainer: Steve Bronder
Suggests: knitr
VignetteBuilder: knitr
Imports: MCMCpack
Packaged: 2014-10-24 06:46:57 UTC; brond_000
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-24 10:55:46

More information about PANICr at CRAN

October 24, 2014 11:13 AM

Removed CRANberries

Package bio3d (with last version 2.1-2) was removed from CRAN

Previous versions (as known to CRANberries) which should be available via the Archive link are:

2014-10-21 2.1-2
2014-10-15 2.1-1

October 24, 2014 09:13 AM


New package PHENIX with initial version 1.0

Package: PHENIX
Type: Package
Title: Phenotypic Integration Index
Version: 1.0
Date: 2014-10-15
Author: R. Torices, A. J. Muñoz-Pajares
Maintainer: A. J. Muñoz-Pajares
Imports: ppcor
Encoding: UTF-8
Description: Provides functions to estimate the size-controlled phenotypic integration index, a novel method by Torices & Méndez (2014) to solve problems due to individual size when estimating integration (namely, larger individuals have larger components, which will drive a correlation between components only due to resource availability that might overestimate the observed measures of integration). In addition, the package also provides the classical estimation by Wagner (1984) and a bootstrapping method to test the significance of both integration indices.
License: GPL (>= 2)
Packaged: 2014-10-23 20:33:24 UTC; ajesusmp
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-24 07:51:38

More information about PHENIX at CRAN

October 24, 2014 07:13 AM

New package P2C2M with initial version 0.5

Package: P2C2M
Type: Package
Title: Posterior Predictive Checks of Coalescent Models
Version: 0.5
Date: 2014-10-23
Author: Michael Gruenstaeudl, Noah Reid
Maintainer: Michael Gruenstaeudl
Depends: R (>= 3.0.0)
Imports: ape (>= 3.1-4), apTreeshape (>= 1.4-5), ggplot2 (>= 1.0.0), rPython (>= 0.0-5), stringr (>= 0.6.2)
Suggests: genealogicalSorting (>= 0.92), phybase (>= 1.3.1), Rmpi (>= 0.6-5), xtermStyle (>= 2.2-4)
Description: P2C2M is an R package to conduct posterior predictive checks of coalescent models using gene and species trees generated by BEAST and *BEAST, respectively. The functionality of P2C2M can be extended via two third-party R packages that are available from the author websites only: genealogicalSorting and phybase. To use these optional packages, installation of the Python libraries NumPy (>= 1.9.0) and DendroPy (= 3.12.0) is necessary.
License: GPL (>= 2)
OS_type: unix
NeedsCompilation: yes (automatic)
SystemRequirements: Python (= 2.7)
Packaged: 2014-10-23 06:55:27 UTC; michael
Repository: CRAN
Date/Publication: 2014-10-24 07:51:37

More information about P2C2M at CRAN

October 24, 2014 07:13 AM

Removed CRANberries

Package ppmlasso (with last version 1.0) was removed from CRAN

Previous versions (as known to CRANberries) which should be available via the Archive link are:

2013-10-15 1.0

October 24, 2014 07:13 AM

Journal of Statistical Software

Implementing Reproducible Research

Vol. 61, Book Review 2, Oct 2014

Implementing Reproducible Research
Victoria Stodden, Friedrich Leisch, Roger D. Peng
CRC Press, 2014
ISBN: 978-1-4665-6159-5

October 24, 2014 07:00 AM

XML and Web Technologies for Data Sciences with R

Vol. 61, Book Review 1, Oct 2014

XML and Web Technologies for Data Sciences with R
Deborah Nolan and Duncan Temple Lang
Springer-Verlag, 2014
ISBN: 978-1-4614-7899-7

October 24, 2014 07:00 AM

Bergm: Bayesian Exponential Random Graphs in R

Vol. 61, Issue 2, Oct 2014


In this paper we describe the main features of the Bergm package for the open-source R software which provides a comprehensive framework for Bayesian analysis of exponential random graph models: tools for parameter estimation, model selection and goodness-of-fit diagnostics. We illustrate the capabilities of this package describing the algorithms through a tutorial analysis of three network datasets.

October 24, 2014 07:00 AM

evtree: Evolutionary Learning of Globally Optimal Classification and Regression Trees in R

Vol. 61, Issue 1, Oct 2014


Commonly used classification and regression tree methods like the CART algorithm are recursive partitioning methods that build the model in a forward stepwise search. Although this approach is known to be an efficient heuristic, the results of recursive tree methods are only locally optimal, as splits are chosen to maximize homogeneity at the next step only. An alternative way to search over the parameter space of trees is to use global optimization methods like evolutionary algorithms. This paper describes the evtree package, which implements an evolutionary algorithm for learning globally optimal classification and regression trees in R. Computationally intensive tasks are fully computed in C++ while the partykit package is leveraged for representing the resulting trees in R, providing unified infrastructure for summaries, visualizations, and predictions. evtree is compared to the open-source CART implementation rpart, conditional inference trees (ctree), and the open-source C4.5 implementation J48. A benchmark study of predictive accuracy and complexity is carried out in which evtree achieved at least similar and most of the time better results compared to rpart, ctree, and J48. Furthermore, the usefulness of evtree in practice is illustrated in a textbook customer classification task.

October 24, 2014 07:00 AM

October 23, 2014


New package rsml with initial version 1.0

Package: rsml
Type: Package
Title: Root System Markup Language (RSML) file processing
Version: 1.0
Date: 2014-07-22
Author: Guillaume Lobet
Maintainer: Guillaume Lobet
Description: Read and analyse Root System Markup Language (RSML) files.
License: GPL-2
Imports: XML,rgl
Packaged: 2014-10-23 14:10:40 UTC; guillaumelobet
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-23 17:25:08

More information about rsml at CRAN

October 23, 2014 05:13 PM

Dirk Eddelbuettel

Introducing Rocker: Docker for R

You only know two things about Docker. First, it uses Linux
containers. Second, the Internet won't shut up about it.

-- attributed to Solomon Hykes, Docker CEO

So what is Docker?

Docker is a relatively new open source application and service, which is seeing interest across a number of areas. It uses recent Linux kernel features (containers, namespaces) to shield processes. While its use (superficially) resembles that of virtual machines, it is much more lightweight as it operates at the level of a single process (rather than an emulation of an entire OS layer). This also allows it to start almost instantly, require very few resources and hence permits an order of magnitude more deployments per host than a virtual machine.

Docker offers a standard interface to creation, distribution and deployment. The shipping container analogy is apt: just as shipping containers (via their standard size and "interface") allow global trade to prosper, Docker is aiming for nothing less for deployment. A Dockerfile provides a concise, extensible, and executable description of the computational environment. Docker software then builds a Docker image from the Dockerfile. Docker images are analogous to virtual machine images, but smaller and built in discrete, extensible and reusable layers. Images can be distributed and run on any machine that has Docker software installed---including Windows, OS X and of course Linux. Running instances are called Docker containers. A single machine can run hundreds of such containers, including multiple containers running the same image.

There are many good tutorials and introductory materials on Docker on the web. The official online tutorial is a good place to start; this post cannot go into more detail in order to remain short and introductory.

So what is Rocker?


At its core, Rocker is a project for running R using Docker containers. We provide a collection of Dockerfiles and pre-built Docker images that can be used and extended for many purposes.

Rocker is the name of our GitHub repository contained within the Rocker-Org GitHub organization.

Rocker is also the name of the account under which the automated builds at Docker provide containers ready for download.

Current Rocker Status

Core Rocker Containers

The Rocker project develops the following containers in the core Rocker repository

  • r-base provides a base R container to build from
  • r-devel provides the basic R container, as well as a complete R-devel build based on current SVN sources of R
  • rstudio provides the base R container as well as an RStudio Server instance

We have settled on these three core images after earlier work in repositories such as docker-debian-r and docker-ubuntu-r.

Rocker Use Case Containers

Within the Rocker-Org organization on GitHub, we are also working on

  • hadleyverse which extends the rstudio container with a number of Hadley packages
  • rOpenSci which extends hadleyverse with a number of rOpenSci packages
  • r-devel-san provides an R-devel build for "Sanitizer" run-time diagnostics via a properly instrumented version of R-devel via a recent compiler build
  • rocker-versioned aims to provide containers with 'versioned' previous R releases and matching packages

Other repositories will probably be added as new needs and opportunities are identified.


The Rocker effort supersedes and replaces earlier work by Dirk (in the docker-debian-r and docker-ubuntu-r GitHub repositories) and Carl. Please use the Rocker GitHub repo and Rocker Containers going forward.

Next Steps

We intend to follow-up with more posts detailing usage of both the source Dockerfiles and binary containers on different platforms.

Rocker containers are fully functional. We invite you to take them for a spin. Bug reports, comments, and suggestions are welcome; we suggest you use the GitHub issue tracker.


We are very appreciative of all comments received from early adopters and testers. We also would like to thank RStudio for allowing us the redistribution of their RStudio Server binary.

Published concurrently at rOpenSci blog and Dirk's blog.


Dirk Eddelbuettel and Carl Boettiger

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

October 23, 2014 04:39 PM


New package matR with initial version 0.9

Package: matR
Type: Package
Title: Metagenomics Analysis Tools for R
Authors@R: c (person ("Daniel", "Braithwaite", role = c ("aut", "cre"), email = ""), person ("Kevin", "Keegan", role = "aut", email = ""), person (c ("University", "of"), "Chicago", role = "cph"))
Version: 0.9
Depends: R (>= 2.10), MGRASTer, BIOM.utils, graphics, stats, utils
Suggests: RJSONIO, qvalue, ecodist, gplots, scatterplot3d
Date: 2014-10-22
Description: An analysis platform for metagenomics combining specialized tools and workflows, easy handling of the BIOM format, and transparent access to MG-RAST resources. matR integrates easily with other R packages and non-R software.
License: BSD_2_clause + file LICENSE
Copyright: University of Chicago
LazyData: yes
Collate: utils.R biom-ext.R client.R graphics.R analysis-support.R analysis-misc.R distx.R rowstats.R transform.R boxplot.R princomp.R image.R init.R
Packaged: 2014-10-23 07:39:09 UTC; dan
Author: Daniel Braithwaite [aut, cre], Kevin Keegan [aut], University of Chicago [cph]
Maintainer: Daniel Braithwaite
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-23 11:50:13

More information about matR at CRAN

October 23, 2014 11:13 AM

Rcpp Gallery

Sampling Importance Resampling (SIR) and social revolution.


The purpose of this gallery post is threefold:

  • to demonstrate the use of the new and improved C++-level implementation of R’s sample() function (see here)
  • to demonstrate the Gallery’s new support for images in contributed posts
  • to demonstrate the usefulness of SIR for updating posterior beliefs given a sample from an arbitrary prior distribution

Application: Foreign Threats and Social Revolution

The application in this post uses an example from Jackman’s Bayesian Analysis for the Social Sciences (page 72) which now has a 30-year history in Political Science (see Jackman for more references). The focus is on the extent to which the probability of revolution varies with facing a foreign threat or not. Facing a foreign threat is measured by “defeated …” or “not defeated …” over a span of 20 years. The countries are in Latin America. During this period of time, there are only three revolutions: Bolivia (1952), Mexico (1910), and Nicaragua (1979).

                                         Revolution   No Revolution
Defeated and invaded or lost territory            1               7
Not defeated for 20 years                         2              74

The goal is to learn about the true, unobservable probabilities of revolution given a recent defeat or the absence of one. That is, we care about

$$\theta_1 = \Pr(\text{revolution} \mid \text{defeated and invaded or lost territory})$$
$$\theta_2 = \Pr(\text{revolution} \mid \text{not defeated for 20 years})$$

And, beyond that, we care about whether $\theta_1$ and $\theta_2$ differ.

These data are assumed to arise from a Binomial process, where the likelihood of the probability parameter value, $\theta$, is

$$L(\theta \mid n, x) = \binom{n}{x} \theta^{x} (1 - \theta)^{n - x}$$

where $n$ is the total number of revolutions and non-revolutions and $x$ is the number of revolutions. The MLE for this model is just the sample proportion, so a Frequentist statistician would be wondering whether $\hat{\theta}_1 = 1/8 = .125$ was sufficiently larger than $\hat{\theta}_2 = 2/76 \approx .026$ to be unlikely to have happened by chance alone (given the null hypothesis that the two proportions were identical).

A Bayesian statistician could approach the question a bit more directly and compute the probability that $\theta_1 - \theta_2 > 0$. To do this, we first need samples from the posterior distribution of $\theta_1$ and $\theta_2$. In this post, we will get these samples via Sampling Importance Resampling.

Sampling Importance Resampling

Sampling Importance Resampling allows us to sample from the posterior distribution, $p(\theta \mid n, x)$, where

$$p(\theta \mid n, x) \propto L(\theta \mid n, x) \, p(\theta)$$

by resampling from a series of draws from the prior, $p(\theta)$. Denote one of those draws from the prior distribution, $p(\theta)$, as $\theta_d$. Then draw $d$ from the prior sample is drawn with replacement into the posterior sample with probability

$$q_d = \frac{L(\theta_d \mid n, x)}{\sum_{d'} L(\theta_{d'} \mid n, x)}$$

Generating Samples from the Prior Distributions

We begin by drawing many samples from a series of prior distributions. Although using a Beta prior distribution on the probability parameter $\theta$ admits a closed-form solution, the point here is to demonstrate a simulation-based approach. On the other hand, a Gamma prior distribution over $\theta$ is very much not conjugate and simulation is the best approach.

In particular, we will consider our posterior beliefs about the difference in probabilities under five different prior distributions.

dfPriorInfo <- data.frame(id = 1:5,
                          dist = c("beta", "beta", "gamma", "beta", "beta"),
                          par1 = c(1, 1, 3, 10, .5),
                          par2 = c(1, 5, 20, 10, .5),
                          stringsAsFactors = FALSE)
  id  dist par1 par2
1  1  beta  1.0  1.0
2  2  beta  1.0  5.0
3  3 gamma  3.0 20.0
4  4  beta 10.0 10.0
5  5  beta  0.5  0.5

Using the data frame dfPriorInfo and the plyr package, we will draw a total of 20,000 values from each of the prior distributions. This can be done in any number of ways and is completely independent of using Rcpp for the SIR magic.

library(plyr)

MC1 <- 20000
# For each prior, call the matching random generator ("rbeta" or "rgamma")
dfPriors <- ddply(dfPriorInfo, "id",
                  .fun = function(X) data.frame(draws = do.call(paste("r", X$dist, sep = ""),
                                                                list(MC1, X$par1, X$par2))))

However, we can confirm that our draws are as we expect and that we have the right number of them (5 * 20k = 100k).

head(dfPriors)
dim(dfPriors)

  id     draws
1  1 0.7124225
2  1 0.5910231
3  1 0.0595327
4  1 0.4718945
5  1 0.4485650
6  1 0.0431667
[1] 100000      2
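For readers without plyr, the same prior draws can be sketched in base R. This is a minimal illustration, not the post's own code: the names `prior_info`, `n_draws`, `draw_list` and `priors` are hypothetical, and a smaller number of draws is used to keep it quick. The key step is that `do.call()` accepts a function name as a string, so `"rbeta"` or `"rgamma"` can be built from the `dist` column.

```r
# Same prior specification as in the post, under an illustrative name
prior_info <- data.frame(id   = 1:5,
                         dist = c("beta", "beta", "gamma", "beta", "beta"),
                         par1 = c(1, 1, 3, 10, .5),
                         par2 = c(1, 5, 20, 10, .5),
                         stringsAsFactors = FALSE)

n_draws <- 1000   # fewer draws than the post's 20,000, for speed
set.seed(123)     # arbitrary seed so the sketch is repeatable

draw_list <- lapply(seq_len(nrow(prior_info)), function(i) {
  rfun <- paste0("r", prior_info$dist[i])   # "rbeta" or "rgamma"
  data.frame(id    = prior_info$id[i],
             draws = do.call(rfun, list(n_draws,
                                        prior_info$par1[i],
                                        prior_info$par2[i])))
})

# Stack the per-prior data frames into one long data frame
priors <- do.call(rbind, draw_list)
dim(priors)
```

Note that the positional arguments mean `rgamma(n, 3, 20)` is read as shape 3 and rate 20.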

Re-Sampling from the Prior

Now, we write a C++ snippet that will create our R-level function to generate a sample of D values from the prior draws (prdraws) given their likelihood after the data (i.e., number of success – nsucc, number of failures – nfail).

The most important feature to mention here is the use of some new and improved extensions which effectively provide an equivalent, performant mirror of R’s sample() function at the C++-level. Important: at the time of writing, these features were not on CRAN, only on GitHub.

The return value of this function is a length D vector of draws from the posterior distribution given the draws from the prior distribution where the likelihood is used as a filtering weight.

#include <RcppArmadilloExtensions/sample.h>
#include <RcppArmadilloExtensions/fixprob.h>

// [[Rcpp::depends(RcppArmadillo)]]

using namespace Rcpp ;

// [[Rcpp::export()]]
NumericVector samplePost (const NumericVector prdraws,
                          const int D,
                          const int nsucc,
                          const int nfail) {
    int N = prdraws.size();
    NumericVector wts(N);
    // Binomial likelihood kernel evaluated at each prior draw
    for (int n = 0 ; n < N ; n++) {
        wts(n) = pow(prdraws(n), nsucc) * pow(1 - prdraws(n), nfail);
    }
    // Normalise the weights into probabilities
    RcppArmadillo::FixProb(wts, N, true);

    // Resample the prior draws with replacement, weighted by likelihood
    NumericVector podraws = RcppArmadillo::sample(prdraws, D, true, wts);
    return podraws;
}

To use the samplePost() function, we create the R representation of the data as follows.

nS <- c(1, 2) # successes
nF <- c(7, 74) # failures

As a simple example, consider drawing a posterior sample of size 30 for the “defeated case” from a discrete prior distribution with equal weight on the values of .125 (the MLE), .127, and .8. We see there is a mixture of .125 and .127 values, but no .8 values. Values of .8 were simply too unlikely (given the likelihood) to be resampled from the prior.

table(samplePost(c(.125, .127, .8), 30, nS[1], nF[1]))

0.125 0.127 
    9    21 
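The resampling step itself does not depend on Rcpp. As a rough sketch, the same logic can be written in base R with `sample()` and its `prob` argument. The function name `sample_post_r` is hypothetical and is meant only as an illustrative counterpart to `samplePost()`, not as the gallery's implementation.

```r
# Hypothetical base-R counterpart of samplePost(); name is illustrative only.
sample_post_r <- function(prdraws, D, nsucc, nfail) {
  # Binomial likelihood kernel evaluated at each prior draw
  wts <- prdraws^nsucc * (1 - prdraws)^nfail
  # Normalise so the weights sum to one
  wts <- wts / sum(wts)
  # Resample the prior draws with replacement, weighted by likelihood
  sample(prdraws, size = D, replace = TRUE, prob = wts)
}

set.seed(1)   # arbitrary seed so the sketch is repeatable
post <- sample_post_r(c(.125, .127, .8), D = 30, nsucc = 1, nfail = 7)
table(post)
```

With one success and seven failures, the weight on .8 is several thousand times smaller than the weight on .125, which is why it essentially never appears in a sample of 30.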

Again making use of the plyr package, we construct samples of size 20,000 for both $\theta_1$ and $\theta_2$ under each of the 5 prior distribution samples. These posterior draws are stored in the data frame dfPost.

MC2 <- 20000
f1 <- function(X) {
    draws <- X$draws
    t1 <- samplePost(draws, MC2, nS[1], nF[1])
    t2 <- samplePost(draws, MC2, nS[2], nF[2])
    return(data.frame(theta1 = t1, theta2 = t2))
}

dfPost <- ddply(dfPriors, "id", f1)
head(dfPost)
dim(dfPost)
  id    theta1    theta2
1  1 0.3067334 0.0130865
2  1 0.1421879 0.0420830
3  1 0.3218130 0.0634511
4  1 0.0739756 0.0363466
5  1 0.1065267 0.0460336
6  1 0.0961749 0.0440790
[1] 100000      3
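For the Beta priors, the closed-form posterior gives a simple cross-check on the SIR output. The sketch below assumes the Beta(1, 1) prior (the first row of dfPriorInfo), for which the posterior is Beta(1 + successes, 1 + failures). It draws directly from that posterior rather than resampling, so it is a comparison point, not the post's method. The names `a`, `b`, `n_post`, `theta1_conj`, `theta2_conj` and `p_threat` are illustrative.

```r
nS <- c(1, 2)    # successes (revolutions), as in the post
nF <- c(7, 74)   # failures (no revolution), as in the post

a <- 1; b <- 1   # assumed Beta(1, 1) prior, i.e. uniform on [0, 1]
n_post <- 20000

set.seed(42)     # arbitrary seed so the sketch is repeatable
# Conjugate update: Beta(a + successes, b + failures)
theta1_conj <- rbeta(n_post, a + nS[1], b + nF[1])
theta2_conj <- rbeta(n_post, a + nS[2], b + nF[2])

# Monte Carlo estimate of Pr(theta1 - theta2 > 0)
p_threat <- mean(theta1_conj > theta2_conj)
p_threat
```

If the SIR draws for the first prior are behaving, `mean(theta1 > theta2)` computed on the corresponding rows of dfPost should land close to `p_threat`.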

Summarizing Posterior Inferences

Here, we are visualizing the posterior draws for the quantity of interest — the difference in probabilities of revolution. These posterior draws are grouped according to the prior distribution used. A test of whether revolution is more likely given a foreign threat is operationalized by the probability that $\theta_1 - \theta_2$ is positive. This probability for each distribution is shown in white. For all choices of the prior here, the probability that “foreign threat matters” exceeds .90.

The full posterior distribution of $\theta_1 - \theta_2$ is shown for each of the five priors in blue. A solid, white vertical band indicates “no effect”. In all cases, the majority of the mass is clearly to the right of this band.

Recall that the priors are, themselves, over the individual revolution probabilities, $\theta_1$ and $\theta_2$. The general shape of each of these prior distributions of the $\theta$ parameter is shown in a grey box by the white line. For example, $\text{Beta}(1, 1)$ is actually a uniform distribution over the parameter space, $[0, 1]$. On the other hand, $\text{Beta}(.5, .5)$ has most of its mass at the two tails.

[Figure: posterior draws of $\theta_1 - \theta_2$, grouped by prior distribution]

At least across these specifications of the prior distributions on $\theta$, the conclusion that “foreign threats matter” finds a good deal of support. What is interesting about this application is that despite these distributions over the difference in probabilities, the p-value associated with Fisher’s Exact Test for 2 x 2 tables is just .262.
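The Fisher's Exact Test comparison can be reproduced in base R from the 2 x 2 table of counts. This is a minimal sketch using `fisher.test()` from the stats package. The object names `revolutions` and `ft` and the dimension labels are illustrative.

```r
# Counts from the table: rows are threat status, columns are outcome
revolutions <- matrix(c(1, 2, 7, 74), nrow = 2,
                      dimnames = list(threat  = c("defeated", "not defeated"),
                                      outcome = c("revolution", "no revolution")))

ft <- fisher.test(revolutions)   # two-sided test by default
round(ft$p.value, 3)
```

The two-sided p-value asks how surprising the table is under equal proportions, which is a different question from the posterior probability that $\theta_1$ exceeds $\theta_2$.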

October 23, 2014 12:00 AM

October 22, 2014


New package sigloc with initial version 0.0.4

Package: sigloc
Version: 0.0.4
Date: 2013-04-23
Title: Signal Location Estimation
Author: Sergey S. Berg
Maintainer: Sergey S. Berg
Depends: R (>= 2.15.3), nleqslv, ellipse
Description: A collection of tools for estimating the location of a transmitter signal from radio telemetry studies using the maximum likelihood estimation (MLE) approach described in Lenth (1981).
License: GPL (>= 2)
Packaged: 2014-10-22 20:22:57 UTC; Serge
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-22 23:14:50

More information about sigloc at CRAN

October 22, 2014 11:13 PM

New package TDboost with initial version 1.0

Package: TDboost
Title: A Boosted Nonparametric Tweedie Model
Version: 1.0
Date: 2013-10-11
Author: Yi Yang, Wei Qian, Hui Zou
Maintainer: Yi Yang
Depends: R (>= 2.12.0), lattice
Description: A fully nonparametric Tweedie model using the gradient boosting. It is capable of fitting a flexible nonlinear model and capturing interactions among predictors.
LazyData: yes
License: GPL-3
Packaged: 2014-10-22 16:33:11 UTC; emeryyi
NeedsCompilation: yes
Date/Publication: 2014-10-22 20:48:02
Repository: CRAN

More information about TDboost at CRAN

October 22, 2014 07:13 PM

New package glmvsd with initial version 1.0

Package: glmvsd
Type: Package
Title: Variable Selection Deviation (VSD) Measures and Instability Tests for High-dimensional Generalized Linear Models
Version: 1.0
Date: 2014-09-19
Author: Ying Nan, Yi Yang, Yuhong Yang
Maintainer: Yi Yang
Depends: stats, glmnet, ncvreg, MASS
Description: Variable selection deviation measures and instability tests for high-dimensional model selection methods such as LASSO, SCAD and MCP, etc., to decide whether the sparse patterns identified by those methods are reliable.
License: GPL-2
Packaged: 2014-10-22 16:08:33 UTC; emeryyi
Date/Publication: 2014-10-22 20:49:00
NeedsCompilation: no
Repository: CRAN

More information about glmvsd at CRAN

October 22, 2014 07:13 PM

Bioconductor Project Working Papers


We present a general method for estimating the effect of a treatment on an ordinal outcome in randomized trials. The method is robust in that it does not rely on the proportional odds assumption. Our estimator leverages information in prognostic baseline variables, and has all of the following properties: (i) it is consistent; (ii) it is locally efficient; (iii) it is guaranteed to match or improve the precision of the standard, unadjusted estimator. To the best of our knowledge, this is the first estimator of the causal relation between a treatment and an ordinal outcome to satisfy these properties. We demonstrate the estimator in simulations based on resampling from a completed randomized clinical trial of a new treatment for stroke; we show potential gains of up to 39\% in relative efficiency compared to the unadjusted estimator. The proposed estimator could be a useful tool for analyzing randomized trials with ordinal outcomes, since existing methods either rely on model assumptions that are untenable in many practical applications, or lack the efficiency properties of the proposed estimator. We provide R code implementing the estimator.

by Iván Díaz et al. at October 22, 2014 05:36 PM


New package traj with initial version 1.0

Package: traj
Title: Trajectory Analysis
Description: Implements the three step procedure proposed by Leffondree et al. (2004) to identify clusters of individual longitudinal trajectories. The procedure involves (1) calculating 24 measures describing the features of the trajectories; (2) using factor analysis to select a subset of the 24 measures and (3) using cluster analysis to identify clusters of trajectories, and classify each individual trajectory in one of the clusters.
Version: 1.0
Date: 2014-07-10
Author: Marie-Pierre Sylvestre, Dan Vatnik
Maintainer: Dan Vatnik
License: GPL-2
LazyData: true
Depends: R (>= 3.0.3)
Imports: cluster, psych, pastecs, NbClust, graphics, grDevices, stats, utils, GPArotation
Packaged: 2014-10-22 15:47:16 UTC; Dan
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-22 18:20:51

More information about traj at CRAN

October 22, 2014 05:13 PM

New package saeSim with initial version 0.6.0

Package: saeSim
Type: Package
Title: Simulation Tools for Small Area Estimation
Date: 2014-10-22
Version: 0.6.0
Author: Sebastian Warnholz
Maintainer: Sebastian Warnholz
Depends: R(>= 3.1), methods
Imports: dplyr(>= 0.2), ggplot2, MASS, spdep, parallel
Suggests: testthat, knitr
Description: Tools for the simulation of data in the context of small area estimation. Combine all steps of your simulation - from data generation over drawing samples to model fitting - in one object. This enables easy modification and combination of different scenarios. You can store your results in a folder or start the simulation in parallel.
License: GPL-3 | file LICENSE
VignetteBuilder: knitr
Packaged: 2014-10-22 12:28:59 UTC; swarnholz
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-22 16:04:46

More information about saeSim at CRAN

October 22, 2014 05:13 PM

Bioconductor Project Working Papers

Nonparametric Adjustment for Measurement Error in Time to Event Data

Measurement error in time to event data used as a predictor will lead to inaccurate predictions. This arises in the context of self-reported family history, a time to event predictor often measured with error, used in Mendelian risk prediction models. Using a validation data set, we propose a method to adjust for this type of measurement error. We estimate the measurement error process using a nonparametric smoothed Kaplan-Meier estimator, and use Monte Carlo integration to implement the adjustment. We apply our method to simulated data in the context of both Mendelian risk prediction models and multivariate survival prediction models, as well as illustrate our method using a data application for Mendelian risk prediction models. Results from simulations are evaluated using measures of mean squared error of prediction (MSEP), area under the receiver operating characteristic curve (ROC-AUC), and the ratio of observed to expected number of events. These results show that our adjusted method mitigates the effects of measurement error mainly by improving calibration and by improving total accuracy. In some scenarios discrimination is also improved.

by Danielle Braun et al. at October 22, 2014 04:37 PM

Extending Mendelian Risk Prediction Models to Handle Misreported Family History

Mendelian risk prediction models calculate the probability of a proband being a mutation carrier based on family history and known mutation prevalence and penetrance. Family history in this setting is self-reported and is often reported with error. Various studies in the literature have evaluated misreporting of family history. Using a validation data set which includes both error-prone self-reported family history and error-free validated family history, we propose a method to adjust for misreporting of family history. We estimate the measurement error process in a validation data set (from University of California at Irvine (UCI)) using nonparametric smoothed Kaplan-Meier estimators, and use Monte Carlo integration to implement the adjustment. In this paper, we extend BRCAPRO, a Mendelian risk prediction model for breast and ovarian cancers, to adjust for misreporting in family history. We apply the extended model to data from the Cancer Genetics Network (CGN).

by Danielle Braun et al. at October 22, 2014 04:37 PM
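
For readers unfamiliar with the Kaplan-Meier estimator mentioned in the abstract above, here is a minimal Python sketch of the ordinary (unsmoothed) product-limit estimator. It is not the smoothed variant the authors describe and is not taken from their work; the function name and the toy data are assumptions for illustration only.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : observed follow-up times
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Events and censored subjects both leave the risk set after t.
        n_at_risk -= removed
    return curve


# Hypothetical toy data: five subjects, two of them censored.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is what distinguishes this estimator from a simple empirical survival fraction.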


New package lmenssp with initial version 1.0

Package: lmenssp
Type: Package
Title: Linear Mixed Effects Models with Non-stationary Stochastic Processes
Version: 1.0
Date: 2014-10-21
Author: Ozgur Asar, Peter J. Diggle
Maintainer: Ozgur Asar
Depends: MASS, nlme
Description: Fit, filter and smooth mixed models with non-stationary processes
License: GPL (>= 2)
Packaged: 2014-10-22 08:38:04 UTC; asar
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-22 13:13:04

More information about lmenssp at CRAN

October 22, 2014 01:13 PM

New package FDGcopulas with initial version 1.0

Package: FDGcopulas
Type: Package
Title: Multivariate Dependence with FDG Copulas
Version: 1.0
Date: 2014-09-19
Author: Gildas Mazo, Stephane Girard
Maintainer: Gildas Mazo
Description: FDG copulas are a class of copulas featuring an interesting balance between flexibility and tractability. This package provides tools to construct, calculate the pairwise dependence coefficients of, simulate from, and fit FDG copulas. The acronym FDG stands for 'one-Factor with Durante Generators', as an FDG copula is a one-factor copula -- that is, the variables are independent given a latent factor -- whose linking copulas belong to the Durante class of bivariate copulas (also referred to as exchangeable Marshall-Olkin or semilinear copulas).
License: GPL (>= 3)
Depends: Rcpp (>= 0.10.6), methods
LinkingTo: Rcpp
Imports: numDeriv, randtoolbox
Packaged: 2014-10-22 07:24:07 UTC; mazo
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-10-22 09:33:34

More information about FDGcopulas at CRAN

October 22, 2014 09:13 AM

New package indicoio with initial version 0.3

Package: indicoio
Title: A simple R Wrapper for the indico set of APIs
Version: 0.3
Author: Alexander Gedranovich
Maintainer: Madison May
Description: R-based client for Machine Learning APIs. Provides wrappers for the following APIs: Positive/Negative Sentiment Analysis, Political Sentiment Analysis, Image Feature Extraction, Facial Emotion Recognition, Facial Feature Extraction, Language Detection
Depends: R (>= 3.0.2), httr, rjson, stringr, png
License: MIT + file LICENSE
LazyData: true
Suggests: testthat, jpeg
Packaged: 2014-10-22 00:37:34 UTC; mmay
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-10-22 08:58:17

More information about indicoio at CRAN

October 22, 2014 07:13 AM

October 21, 2014


New package DetMCD with initial version 0.0.1

Package: DetMCD
Type: Package
Title: DetMCD Algorithm (Robust and Deterministic Estimation of Location and Scatter)
Version: 0.0.1
Date: 2013-01-13
Depends: matrixStats, pcaPP (>= 1.8-1), robustbase, MASS
Suggests: mvtnorm
LinkingTo: Rcpp, RcppEigen
Description: DetMCD is a new algorithm for robust and deterministic estimation of location and scatter. The benefits of robust and deterministic estimation are explained in Hubert, M., Rousseeuw, P.J. and Verdonck, T. (2012),"A deterministic algorithm for robust location and scatter", Journal of Computational and Graphical Statistics, Volume 21, Number 3, Pages 618--637.
License: GPL (>= 2)
LazyLoad: yes
Authors@R: c(person("Vakili", "Kaveh", role = c("aut", "cre"), email = ""), person("Mia", "Hubert", role = "ths", email = ""))
Maintainer: Vakili Kaveh
Author: Vakili Kaveh [aut, cre], Mia Hubert [ths]
Packaged: 2014-10-21 16:15:15 UTC; kaveh
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-10-21 20:01:49

More information about DetMCD at CRAN

October 21, 2014 07:13 PM

New package scrm with initial version 1.3-1

Package: scrm
Type: Package
Title: Simulating the Evolution of Biological Sequences
Version: 1.3-1
Date: 2014-10-21
Authors@R: c( person('Paul', 'Staab', , '', role=c('aut', 'cre', 'cph')), person('Zhu', 'Sha', role=c('aut', 'cph')), person('Dirk', 'Metzler', role='ths'), person('Gerton', 'Lunter', role=c('aut', 'cph', 'ths')) )
Description: A coalescent simulator that allows the rapid simulation of biological sequences under neutral models of evolution.
License: GPL (>= 3)
Suggests: testthat (>= 0.9.0), knitr, ape
Imports: Rcpp (>= 0.11.2)
SystemRequirements: C++11
VignetteBuilder: knitr
LinkingTo: Rcpp
Packaged: 2014-10-21 11:42:11 UTC; paul
Author: Paul Staab [aut, cre, cph], Zhu Sha [aut, cph], Dirk Metzler [ths], Gerton Lunter [aut, cph, ths]
Maintainer: Paul Staab
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-10-21 17:48:23

More information about scrm at CRAN

October 21, 2014 05:13 PM

New package rtkpp with initial version 0.8.2

Package: rtkpp
Type: Package
Title: STK++ integration to R using Rcpp.
Version: 0.8.2
Encoding: UTF-8
Date: 2014-02-08
Authors@R: c(person("Serge", "Iovleff", role=c("aut","cre"), email=""), person("Vincent", "Kubicki", role="ctb", email=""), person("Quentin", "Grimonprez", role="ctb", email=""), person("Parmeet", "Bhatia", role="ctb", email=""))
Copyright: Inria, and specifically inst/COPYRIGHTS for the STK++ library
Maintainer: Serge Iovleff
Description: STK++ is a collection of C++ classes for statistics, clustering, linear algebra, arrays (with an Eigen-like API), regression, dimension reduction, etc. The integration of the library into R uses Rcpp. Some functionalities of the Clustering project provided by the library are available in the R environment as R functions. The rtkpp package includes the header files from the STK++ library (currently version 0.8.2). Thus users do not need to install STK++ itself in order to use it. STK++ is licensed under the GNU LGPL version 2 or later. rtkpp (the stkpp integration into R) is licensed under the GNU GPL version 2 or later.
License: GPL (>= 2) | LGPL (>= 2) | file LICENSE
LazyLoad: yes
Depends: R (>= 3.0.2), Rcpp
LinkingTo: Rcpp
Imports: methods
NeedsCompilation: yes
SystemRequirements: GNU make
Collate: 'ClusterAlgo.R' 'ClusterInit.R' 'ClusterStrategy.R' 'IClusterModel.R' 'ClusterCategorical.R' 'ClusterDiagGaussian.R' 'ClusterGamma.R' 'ClusterModelNames.R' 'ClusterPlot.R' 'global.R' 'rtkpp.R' 'rtkppFlags.R' 'inlineCxxPlugin.R'
Packaged: 2014-10-21 12:37:40 UTC; iovleff
Author: Serge Iovleff [aut, cre], Vincent Kubicki [ctb], Quentin Grimonprez [ctb], Parmeet Bhatia [ctb]
Repository: CRAN
Date/Publication: 2014-10-21 17:48:19

More information about rtkpp at CRAN

October 21, 2014 05:13 PM

New package cdcsis with initial version 1.0

Package: cdcsis
Type: Package
Title: Conditional Distance Correlation and Its Related Feature Screening Method
Version: 1.0
Date: 2014-09-01
Author: Canhong Wen, Wenliang Pan, Mian Huang, and Xueqin Wang
Depends: R(>= 3.0.1), stats
Imports: ks
Suggests: MASS, energy
Maintainer: Canhong Wen
Description: Gives conditional distance correlation and performs the conditional distance correlation sure independence screening procedure for ultrahigh dimensional data. The conditional distance correlation is a novel conditional dependence measurement of two random variables given a third variable. The conditional distance correlation sure independence screening is used for screening variables in ultrahigh dimensional setting.
License: GPL (>= 2)
Packaged: 2014-10-21 14:47:45 UTC; wencanh
Repository: CRAN
Collate: 'cdcov.R' 'cdcor.R' 'cdcor.ada.R' 'bw.R' 'cdcsis.R'
NeedsCompilation: yes
Date/Publication: 2014-10-21 17:48:22

More information about cdcsis at CRAN

October 21, 2014 05:13 PM