Planet R

July 22, 2014

CRANberries

New package RMRAINGEN with initial version 1.0

Package: RMRAINGEN
Maintainer: Emanuele Cordano
License: GPL (>= 2)
Title: RMRAINGEN (R Multi-site RAINfall GENerator): a package to generate daily time series of rainfall from monthly mean values
Type: Package
Author: Emanuele Cordano
Description: This package contains functions and S3 methods for spatial multi-site stochastic generation of daily precipitation. It generates precipitation occurrence in several sites using Wilks' Approach (1998). Bugs/comments/questions/collaboration of any kind are warmly welcomed.
Version: 1.0
Repository: CRAN
Date: 2014-07-22
Depends: R (>= 3.0), copula, RGENERATE, RMAWGEN, blockmatrix, Matrix
URL: https://github.com/ecor/RMRAINGEN
Packaged: 2014-07-22 14:35:22 UTC; ecor
NeedsCompilation: no
Date/Publication: 2014-07-22 21:18:30

More information about RMRAINGEN at CRAN

July 22, 2014 09:13 PM

New package nontargetData with initial version 1.1

Package: nontargetData
Type: Package
Title: Quantized simulation data of isotope pattern centroids
Version: 1.1
Date: 2014-07-22
Author: Martin Loos, Francesco Corona
Maintainer: Martin Loos
Description: Data sets for isotope pattern grouping of LC-HRMS peaks with package nontarget. Based on a vast set of unique PubChem molecular formulas, quantized (a) m/z, (b) m/z differences, (c) intensity ratios and (d) marker centroids of simulated centroid pairs are listed for different instrument resolutions.
License: GPL-3
Packaged: 2014-07-22 13:59:18 UTC; uchemadmin
Depends: R (>= 2.10)
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-22 21:18:38

More information about nontargetData at CRAN

July 22, 2014 09:13 PM

New package MixGHD with initial version 1.0

Package: MixGHD
Type: Package
Title: Model based clustering and classification using the mixture of generalized hyperbolic distributions.
Version: 1.0
Date: 2014-07-15
Author: Cristina Tortora, Ryan P. Browne, Brian C. Franczak and Paul D. McNicholas.
Maintainer: Cristina Tortora
Description: Carries out model-based clustering using three different models. The models are all based on the generalized hyperbolic distribution. The first model, MGHD, is the classical mixture of generalized hyperbolic distributions. The MGHFA is the mixture of generalized hyperbolic factor analyzers for high dimensional data sets. The MCGHD, the mixture of coalesced generalized hyperbolic distributions, is a new, more flexible model.
Depends: Bessel, stats, MASS, mvtnorm, ghyp, numDeriv, R (>= 3.1.1)
NeedsCompilation: no
License: GPL (>= 2)
Packaged: 2014-07-22 13:24:40 UTC; cristina
Repository: CRAN
Date/Publication: 2014-07-22 21:18:29

More information about MixGHD at CRAN

July 22, 2014 09:13 PM

New package FreeSortR with initial version 1.0

Package: FreeSortR
Type: Package
Title: Free Sorting data analysis.
Version: 1.0
Date: 2014-04-29
Author: Philippe Courcoux
Maintainer: Philippe Courcoux
Description: The package FreeSortR provides tools for describing and analysing free sorting data. Main methods are computation of consensus partition and factorial analysis of the dissimilarity matrix between stimuli (using multidimensional scaling approach).
License: GPL-2
Depends: methods, smacof, vegan, ellipse
Packaged: 2014-07-22 14:33:58 UTC; courcoux
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-22 21:18:25

More information about FreeSortR at CRAN

July 22, 2014 09:13 PM

New package rnbn with initial version 1.0.0

Package: rnbn
Type: Package
Title: Access NBN data
Version: 1.0.0
Date: 2014-07-22
Author: Stuart Ball & Tom August
Maintainer: Tom August
Description: Access to data held by the National Biodiversity Network (NBN, www.nbn.org.uk). The NBN acts as a data warehouse for biological records data in the UK and is the UK node of GBIF (Global Biodiversity Information Facility). In this package NBN data is accessed via its web-services.
License: MIT + file LICENSE
Depends: RCurl, RJSONIO, tcltk
BuildVignettes: FALSE
LazyData: TRUE
Packaged: 2014-07-22 12:02:46 UTC; tomaug
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-22 14:29:38

More information about rnbn at CRAN

July 22, 2014 01:13 PM

New package nycflights13 with initial version 0.1

Package: nycflights13
Title: Data about flights departing NYC in 2013.
Version: 0.1
Authors@R: 'Hadley Wickham [aut,cre]'
Description: Airline on-time data for all flights departing NYC in 2013. Also includes useful metadata on airlines, airports, weather, and planes.
Depends: R (>= 3.1.0)
License: CC0
LazyData: true
Suggests: dplyr
URL: http://github.com/hadley/nycflights13
Packaged: 2014-07-21 19:26:17 UTC; hadley
Author: 'Hadley Wickham' [aut, cre]
Maintainer: 'Hadley Wickham'
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-22 11:08:54

More information about nycflights13 at CRAN

July 22, 2014 09:13 AM

New package nasaweather with initial version 0.1

Package: nasaweather
Title: Collection of datasets from the ASA 2006 data expo
Version: 0.1
Authors@R: 'Hadley Wickham [aut,cre]'
Description: This package contains tidied data from the ASA 2006 data expo, as well as a number of useful other related data sets.
Depends: R (>= 3.1.0)
License: GPL-3
LazyData: true
URL: http://github.com/hadley/nasaweather, http://stat-computing.org/dataexpo/2006/
Packaged: 2014-07-21 19:30:24 UTC; hadley
Author: 'Hadley Wickham' [aut, cre]
Maintainer: 'Hadley Wickham'
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-22 11:08:36

More information about nasaweather at CRAN

July 22, 2014 09:13 AM

New package fueleconomy with initial version 0.1

Package: fueleconomy
Title: EPA fuel economy data
Version: 0.1
Authors@R: 'Hadley Wickham [aut,cre]'
Description: Fuel economy data from the EPA, 1985-2015, conveniently packaged for consumption by R users.
Depends: R (>= 3.1.0)
License: CC0
LazyData: true
Suggests: dplyr
URL: http://github.com/hadley/fueleconomy
Packaged: 2014-07-21 19:31:13 UTC; hadley
Author: 'Hadley Wickham' [aut, cre]
Maintainer: 'Hadley Wickham'
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-22 11:08:43

More information about fueleconomy at CRAN

July 22, 2014 09:13 AM

New package babynames with initial version 0.1

Package: babynames
Title: US baby names 1880-2013
Version: 0.1
Authors@R: 'Hadley Wickham [aut,cre]'
Description: US baby names provided by the SSA. This package contains all names used for at least 5 children of either sex.
Depends: R (>= 3.1.0)
License: CC0
LazyData: true
URL: http://github.com/hadley/babynames
Packaged: 2014-07-21 19:46:18 UTC; hadley
Author: 'Hadley Wickham' [aut, cre]
Maintainer: 'Hadley Wickham'
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-22 11:08:48

More information about babynames at CRAN

July 22, 2014 09:13 AM

July 21, 2014

CRANberries

New package tidyr with initial version 0.1

Package: tidyr
Title: Easily tidy data with spread and gather functions.
Version: 0.1
Authors@R: 'Hadley Wickham [aut,cre]'
Description: tidyr is an evolution of reshape2. It's designed specifically for data tidying (not general reshaping or aggregating) and works well with dplyr data pipelines.
Depends: R (>= 3.1.0)
License: MIT + file LICENSE
LazyData: true
Imports: reshape2, dplyr (>= 0.2)
URL: https://github.com/hadley/tidyr
Suggests: knitr, testthat
VignetteBuilder: knitr
Packaged: 2014-07-21 19:17:00 UTC; hadley
Author: 'Hadley Wickham' [aut, cre]
Maintainer: 'Hadley Wickham'
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-21 23:16:05

More information about tidyr at CRAN

July 21, 2014 11:13 PM

New package bootSVD with initial version 0.1

Package: bootSVD
Title: Fast, Exact Bootstrap Principal Component Analysis for High Dimensional Data
Description: Implements fast, exact bootstrap Principal Component Analysis and Singular Value Decompositions for high dimensional data, as described in (arxiv.org/abs/1405.0922).
Version: 0.1
Author: Aaron Fisher
Maintainer: Aaron Fisher
Depends: R (>= 3.0.2)
Imports: parallel
License: GPL-2
LazyData: true
Packaged: 2014-07-21 20:05:34 UTC; aaronfisher
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-21 23:18:12

More information about bootSVD at CRAN

July 21, 2014 11:13 PM

New package TableMonster with initial version 1.1

Package: TableMonster
Version: 1.1
Depends: xtable
Date: 2014-07-18
Title: Table Monster
Author: Grant Izmirlian Jr
Maintainer: Grant Izmirlian Jr
Description: Provides a user-friendly interface for generating booktabs-style tables using xtable.
License: GPL (>= 2)
LazyLoad: yes
Packaged: 2014-07-21 17:14:28 UTC; izmirlig
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-21 20:07:51

More information about TableMonster at CRAN

July 21, 2014 07:13 PM

New package HSSVD with initial version 1.1

Package: HSSVD
Type: Package
Title: Biclustering with heterogeneous variance
Version: 1.1
Date: 2014-07-21
Authors@R: c(person("Guanhua", "Chen", role = c("aut","cre"), email = "guanhuac@live.unc.edu"), person("Michael", "Kosorok", role = c("aut"), email = "kosorok@unc.edu"), person("Shannon","Holloway", role="ctb", email="sthollow@ncsu.edu"))
Description: HSSVD is a recently developed data mining tool for discovering subgroups of patients and genes which simultaneously display unusual levels of variability compared to other genes and patients. Previous biclustering methods were restricted to mean level detection, while the new method can detect both mean and variance biclusters.
Depends: R (>= 2.10), bcv
License: GPL-2
Packaged: 2014-07-21 17:38:29 UTC; sthollow
Author: Guanhua Chen [aut, cre], Michael Kosorok [aut], Shannon Holloway [ctb]
Maintainer: Guanhua Chen
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-21 20:17:46

More information about HSSVD at CRAN

July 21, 2014 07:13 PM

CRANberries

New package densityClust with initial version 0.1-1

Package: densityClust
Type: Package
Title: Clustering by fast search and find of density peaks
Version: 0.1-1
Date: 2014-06-30
Author: Thomas Lin Pedersen
Maintainer: Thomas Lin Pedersen
Description: An implementation of the clustering algorithm described by Alex Rodriguez and Alessandro Laio (Science, 2014 vol. 344), along with tools to inspect and visualize the results.
License: GPL (>= 2)
Packaged: 2014-07-21 08:01:51 UTC; Thomas
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-21 13:29:49

More information about densityClust at CRAN

July 21, 2014 01:13 PM

New package logitchoice with initial version 0.9.0

Package: logitchoice
Type: Package
Title: Fitting l2-regularized logit choice models via generalized gradient descent
Version: 0.9.0
Date: 2014-7-20
Author: Michael Lim
Maintainer: Michael Lim
Depends:
Suggests:
Description: Fits linear discrete logit choice models with l2 regularization. To handle reasonably sized datasets, we employ an accelerated version of generalized gradient descent.
License: GPL-2
URL:
Packaged: 2014-07-21 05:13:14 UTC; mlim
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-07-21 07:40:03

More information about logitchoice at CRAN

July 21, 2014 07:13 AM

July 20, 2014

CRANberries

New package SBRect with initial version 0.26

Package: SBRect
Version: 0.26
Date: 2014-02-07
Title: Detecting structural breaks using rectangle covering (non-parametric method).
Authors@R: c(person("Paul", "Fischer", role = c("aut", "cre","cph"), email = "pafi@dtu.dk"),person("Astrid", "Hilbert", role = c("ctb","cph"), email = "astrid.hilbert@lnu.se"))
Author: Paul Fischer [aut, cre, cph], Astrid Hilbert [ctb, cph]
Maintainer: Paul Fischer
Depends: rJava
SystemRequirements: java
Suggests: MASS
Description: The package fits axes-aligned rectangles to a time series in order to find structural breaks. The algorithm encloses the time series in a number of axes-aligned rectangles and tries to minimize their area and number. As these are conflicting aims, the user has to specify a parameter alpha in [0.0,1.0]. Values close to 0 result in more breakpoints, values close to 1 in fewer. The left edges of the rectangles are the breakpoints. The package supplies two methods: computeBreakPoints(series,alpha), which returns the indices of the break points, and computeRectangles(series,alpha), which returns the rectangles. The algorithm is randomised; it uses a genetic algorithm. Therefore, the break point sequence found can be different in different executions of the method on the same data, especially when used on longer series of some thousand observations. The algorithm uses a range-tree as background data structure, which makes it very fast and suited to analysing series with millions of observations. A detailed description can be found in Paul Fischer, Astrid Hilbert, Fast detection of structural breaks, Proceedings of Compstat 2014.
License: GPL-2
URL: http://www2.imm.dtu.dk/~pafi/StructBreak/index.html
BugReports: pafi@dtu.dk
Packaged: 2014-07-20 18:56:52 UTC; Paul
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-20 21:17:32

More information about SBRect at CRAN

July 20, 2014 09:13 PM

New package coxinterval with initial version 1.0

Package: coxinterval
Title: Cox-type models for interval-censored data
Version: 1.0
Date: 2014-07-19
Depends: Matrix, parallel, survival, timereg, R (>= 2.13.0)
LazyData: Yes
LazyLoad: Yes
Author: Audrey Boruvka and Richard J. Cook
Maintainer: Audrey Boruvka
Description: Fits Cox-type models based on interval-censored data from a survival or illness-death process
SystemRequirements: GNU make
NeedsCompilation: Yes
License: GPL (>= 2)
URL: https://github.com/aboruvka/coxinterval
BugReports: https://github.com/aboruvka/coxinterval/issues
Packaged: 2014-07-20 17:02:19 UTC; aboruvka
Repository: CRAN
Date/Publication: 2014-07-20 20:52:59

More information about coxinterval at CRAN

July 20, 2014 07:13 PM

New package ecp with initial version 1.6.0

Package: ecp
Type: Package
Title: Nonparametric Multiple Change Point Analysis of Multivariate Data
Version: 1.6.0
Date: 2014-07-19
Author: Nicholas A. James and David S. Matteson
Maintainer: Nicholas A. James
Description: This package performs hierarchical change point analysis through the use of U-statistics. Both agglomerative and divisive procedures return the set of change point estimates and other summary information.
License: GPL (>= 2)
Depends: R (>= 2.10), Rcpp
Suggests: mvtnorm, MASS, combinat
LinkingTo: Rcpp
Packaged: 2014-07-19 20:24:32 UTC; nick
NeedsCompilation: yes
Repository: CRAN
X-CRAN-Comment: Archived on 2014-04-04 as long-term memory access errors which caused crashes remained unresolved.
Date/Publication: 2014-07-20 10:23:54

More information about ecp at CRAN

July 20, 2014 09:13 AM

New package lm.beta with initial version 1.0

Package: lm.beta
Type: Package
Title: Add standardized regression coefficients to lm objects
Version: 1.0
Date: 2014-07-20
Author: Stefan Behrendt
Maintainer: Stefan Behrendt
Description: This package adds standardized regression coefficients to objects created by lm. It also overwrites the S3 methods print, summary and coef with additional argument standardized.
License: GPL (>= 2)
Packaged: 2014-07-20 00:35:30 UTC; Stefan
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-20 07:44:12

More information about lm.beta at CRAN

July 20, 2014 07:13 AM

New package astrochron with initial version 0.3.1

Package: astrochron
Type: Package
Title: An R Package for Astrochronology
Version: 0.3.1
Date: 2014-07-19
Author: Stephen Meyers
Maintainer: Stephen Meyers
Description: Astrochronologic testing, astronomical time scale development, time series analysis
Imports: baseline, multitaper, fields
License: GPL-3
Packaged: 2014-07-19 20:14:42 UTC; smeyers
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-07-20 07:32:44

More information about astrochron at CRAN

July 20, 2014 07:13 AM

July 19, 2014

Removed CRANberries

Package astrochron (with last version 0.3) was removed from CRAN

Previous versions (as known to CRANberries) which should be available via the Archive link are:

2014-06-30 0.3

July 19, 2014 05:13 PM

CRANberries

New package slp with initial version 1.0-3

Package: slp
Version: 1.0-3
Author: Wesley Burr, with contributions from Karim Rahim
Copyright: file COPYRIGHTS
Maintainer: Wesley Burr
Title: Discrete Prolate Spheroidal (Slepian) Sequence Regression Smoothers
Description: Interface for creation of 'slp' class smoother objects for use in Generalized Additive Models (as implemented by packages 'gam' and 'mgcv').
Depends: R (>= 2.15.1)
License: GPL (>= 2)
Imports: mgcv (>= 1.7.18)
Suggests: gam (>= 1.09)
Packaged: 2014-07-18 19:16:59 UTC; wburr
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-07-19 08:42:29

More information about slp at CRAN

July 19, 2014 07:13 AM

July 18, 2014

CRANberries

New package fuzzyMM with initial version 1.0.1

Package: fuzzyMM
Type: Package
Title: Map Matching Using Fuzzy Logic
Version: 1.0.1
Date: 2014-07-15
Author: Nikolai Gorte
Description: Implements a fuzzy-logic-based map-matching algorithm used to match GPS trajectories to the OpenStreetMap digital road network.
Maintainer: Nikolai Gorte
Depends: R (>= 2.15.0), osmar, frbs
Imports: methods, igraph, rgeos, rgdal
Suggests: rjson, maptools, stringr, RCurl
License: GPL (>= 2)
Collate: var_bounds.R fuzzy_model.R FIS1.R FIS2.R FIS3.R IMP.R SMP1.R SMP2.R DRN.R fuzzyMM.R Data-track.R fuzzyMM-package.R
Packaged: 2014-07-18 15:01:00 UTC; Niko
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-18 20:20:05

More information about fuzzyMM at CRAN

July 18, 2014 07:13 PM

New package colorscience with initial version 1.0.0

Package: colorscience
Type: Package
Title: Color Science methods and data
Version: 1.0.0
Encoding: UTF-8
Maintainer: Jose Gama
Authors@R: c(person(given = "Jose", family = "Gama", role = c("aut","cre"),email = "jgama@abo.fi"))
Description: Methods and data for color science - color conversions by observer, illuminant and gamma. Color matching functions and chromaticity diagrams. Color indices, color differences and spectral data conversion/analysis.
License: GPL (>= 3)
Depends: R (>= 2.10), Hmisc, munsellinterpol, pracma
Enhances: png
LazyData: yes
Packaged: 2014-07-17 05:01:34.477 UTC; poky
Author: Jose Gama [aut, cre]
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-18 09:25:05

More information about colorscience at CRAN

July 18, 2014 09:13 AM

New package bayou with initial version 1.0

Package: bayou
Type: Package
Title: Bayesian fitting of Ornstein-Uhlenbeck models to phylogenies
Version: 1.0
Date: 2013-09-12
Author: Josef C. Uyeda, Jon Eastman and Luke Harmon
Maintainer: Josef C. Uyeda
Description: Tools for fitting and simulating multi-optima Ornstein-Uhlenbeck models to phylogenetic comparative data using Bayesian reversible-jump methods.
License: GPL (>= 2)
Depends: ape (>= 3.0-6), geiger(>= 2.0), R (>= 2.15.0), phytools, coda
Imports: Rcpp (>= 0.10.3), MASS, mnormt, fitdistrplus, denstrip
Suggests: doMC, foreach
LinkingTo: Rcpp, RcppArmadillo
Collate: 'RcppExports.R' 'bayou-utilities.R' 'probability.R' 'bayou-weight_matrix.R' 'bayou-moves.R' 'bayou-likelihood.R' 'bayou-prior.R' 'conversion-utilities.R' 'bayou-mcmc-utilities.R' 'bayou-mcmc.R' 'bayou-plotting.R' 'bayou-simulation.r' 'bayou-steppingstone.R' 'bayou-package.R'
Packaged: 2014-07-18 02:36:06 UTC; josef
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-07-18 07:09:20

More information about bayou at CRAN

July 18, 2014 05:13 AM

July 17, 2014

CRANberries

New package RNeXML with initial version 1.1-0

Package: RNeXML
Type: Package
Title: Implement semantically rich I/O for NeXML format
Version: 1.1-0
Date: 2013-06-30
Authors@R: c(person("Carl", "Boettiger", role = c("cre", "aut"), email="cboettig@gmail.com"), person("Scott", "Chamberlain", role = "aut"), person("Hilmar", "Lapp", role = "aut"), person("Kseniia", "Shumelchyk", role = "aut"), person("Rutger", "Vos", role = "aut"))
Description: R package that allows access to phyloinformatic data in NeXML format. The package should add new functionality to R, such as the possibility to manipulate NeXML objects in more varied and refined ways and compatibility with 'ape' objects. Note that the suggested Sxslt package can be installed from the Omegahat repository, though it can also be obtained from https://github.com/cboettig/Sxslt
License: BSD_3_clause + file LICENSE
Additional_repositories: http://www.omegahat.org/R
VignetteBuilder: knitr
Suggests: rrdf (>= 2.0.2), geiger (>= 2.0), phytools (>= 0.3.93), knitr (>= 1.5), testthat (>= 0.8.1), rfigshare (>= 0.3.0), knitcitations (>= 1.0.1), Sxslt
Depends: R (>= 3.0.0), ape (>= 3.1), methods (>= 3.0.0)
Imports: XML (>= 3.95), plyr (>= 1.8), taxize (>= 0.2.2), reshape2 (>= 1.2.2), httr (>= 0.3), uuid (>= 0.1-1)
Collate: 'classes.R' 'add_basic_meta.R' 'add_characters.R' 'add_meta.R' 'add_namespaces.R' 'add_trees.R' 'character_classes.R' 'concatenate_nexml.R' 'get_basic_metadata.R' 'get_characters.R' 'get_metadata.R' 'get_namespaces.R' 'get_rdf.R' 'get_taxa.R' 'get_taxa_meta.R' 'get_trees.R' 'internal_get_node_maps.R' 'internal_isEmpty.R' 'internal_name_by_id.R' 'internal_nexml_id.R' 'meta.R' 'nexmlTree.R' 'nexml_add.R' 'nexml_get.R' 'nexml_methods.R' 'nexml_publish.R' 'nexml_read.R' 'nexml_validate.R' 'nexml_write.R' 'simmap.R' 'taxize_nexml.R'
Packaged: 2014-07-17 16:28:19 UTC; cboettig
Author: Carl Boettiger [cre, aut], Scott Chamberlain [aut], Hilmar Lapp [aut], Kseniia Shumelchyk [aut], Rutger Vos [aut]
Maintainer: Carl Boettiger
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-18 00:22:54

More information about RNeXML at CRAN

July 17, 2014 11:13 PM

New package rnaseqWrapper with initial version 1.0

Package: rnaseqWrapper
Type: Package
Title: Wrapper for several R packages and scripts to automate RNA-seq analysis
Version: 1.0
Date: 2013-12-06
Author: Mark Peterson
Maintainer: Mark Peterson
Description: This package is designed to streamline several of the common steps for RNA-seq analysis, including differential expression and variant discovery. For the development build, or to contribute changes to this package, please see our repository at https://bitbucket.org/petersmp/rnaseqwrapper/
License: GPL
Depends: ecodist, gplots, gtools
Suggests: topGO, seqinr, DESeq
Packaged: 2014-07-17 13:55:00 UTC; markp
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-18 00:18:31

More information about rnaseqWrapper at CRAN

July 17, 2014 11:13 PM

New package nscancor with initial version 0.6

Package: nscancor
Authors@R: c( person("Christian", "Sigg", email = "christian@sigg-iten.ch", role = c("aut", "cre")), person("R Core team", role = "aut"))
Version: 0.6
Date: 2014-02-03
Title: Non-Negative and Sparse CCA
Description: This package implements two algorithms for canonical correlation analysis (CCA) that are based on iterated regression steps. By choosing the appropriate regression algorithm for each data modality, it is possible to enforce sparsity, non-negativity or other kinds of constraints on the projection vectors. Multiple canonical variables are computed sequentially using a generalized deflation scheme, where the additional correlation not explained by previous variables is maximized. 'nscancor' is used to analyze paired data from two domains, and has the same interface as the 'cancor' function from the 'stats' package (plus some extra parameters). 'mcancor' is appropriate for analyzing data from three or more domains.
URL: http://sigg-iten.ch/research/
License: GPL (>= 2)
Suggests: CCA, glmnet, MASS, PMA, testthat (>= 0.8), roxygen2
Packaged: 2014-07-17 18:44:15 UTC; chrsigg
Author: Christian Sigg [aut, cre], R Core team [aut]
Maintainer: Christian Sigg
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-17 23:46:07

More information about nscancor at CRAN

July 17, 2014 11:13 PM

New package itertools2 with initial version 0.1

Package: itertools2
Title: itertools2: Functions creating iterators for efficient looping
Version: 0.1
Date: 2014-07-17
Author: John A. Ramey, Kayla Schaefer
Maintainer: John A. Ramey
Description: A port of Python's excellent itertools module to R for efficient looping.
Depends: R (>= 3.0.2)
Imports: iterators (>= 1.0.7)
Suggests: testthat (>= 0.8.1)
License: MIT + file LICENSE
URL: https://github.com/ramhiser/itertools2, http://ramhiser.com
Packaged: 2014-07-17 21:55:01 UTC; ramhiser
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-18 00:15:26

More information about itertools2 at CRAN

July 17, 2014 11:13 PM

Bioconductor Project Working Papers

Entering the Era of Data Science: Targeted Learning and the Integration of Statistics and Computational Data Analysis

This outlook article will appear in Advances in Statistics and it reviews the research of Dr. van der Laan's group on Targeted Learning, a subfield of statistics that is concerned with the construction of data adaptive estimators of user-supplied target parameters of the probability distribution of the data and corresponding confidence intervals, aiming to only rely on realistic statistical assumptions. Targeted Learning fully utilizes the state of the art in machine learning tools, while still preserving the important identity of statistics as a field that is concerned with both accurate estimation of the true target parameter value and assessment of uncertainty in order to make sound statistical conclusions. We also provide a philosophical historical perspective on Targeted Learning, also relating it to the new developments in Big Data. We conclude with some remarks explaining the immediate relevance of Targeted Learning to the current big data movement.

by Mark J. van der Laan et al. at July 17, 2014 11:05 PM

CRANberries

New package RTDE with initial version 0.1-0

Package: RTDE
Type: Package
Title: Robust Tail Dependence Estimation
Version: 0.1-0
Date: 2014-07-31
Author: Christophe Dutang [aut, cre], Armelle Guillou [ctb], Yuri Goegebeur [ctb]
Maintainer: Christophe Dutang
Description: Robust tail dependence estimation for bivariate models. This package is based on two papers by the authors: 'Robust and bias-corrected estimation of the coefficient of tail dependence' and 'Robust and bias-corrected estimation of extreme failure sets'. This work was supported by a research grant (VKR023480) from VILLUM FONDEN and an international project for scientific cooperation (PICS-6416).
Depends: R (>= 3.0.0), parallel
Suggests: tseries
License: GPL (>= 2)
URL: https://www.rmetrics.org/files/Paris2014/
Packaged: 2014-07-17 14:07:04 UTC; dutangc
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-17 18:41:53

More information about RTDE at CRAN

July 17, 2014 05:13 PM

New package R6 with initial version 1.0.1

Package: R6
Title: Classes with reference semantics
Version: 1.0.1
Authors@R: "Winston Chang [aut, cre]"
Description: The R6 package allows the creation of classes with reference semantics, similar to R's built-in reference classes. Compared to reference classes, R6 classes are simpler and lighter-weight, and they are not built on S4 classes so they do not require the methods package. These classes allow public and private members, and they support inheritance.
Depends: R (>= 3.0)
Suggests: knitr, microbenchmark, testthat
License: MIT + file LICENSE
LazyData: true
VignetteBuilder: knitr
Packaged: 2014-07-17 05:32:08 UTC; winston
Author: "Winston Chang" [aut, cre]
Maintainer: "Winston Chang"
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-17 07:40:51

More information about R6 at CRAN

July 17, 2014 07:13 AM

New package HDPenReg with initial version 0.89.7

Package: HDPenReg
Version: 0.89.7
Date: 2014-07-11
Title: High-Dimensional Penalized Regression.
Authors@R: c(person("Quentin", "Grimonprez", role = c("aut","cre"), email = "quentin.grimonprez@inria.fr"), person("Serge", "Iovleff", role = "ctb"))
Copyright: inria 2013-2014 for HDPenReg. Serge Iovleff is the copyright holder of the c++ library STKpp.
Depends: methods
Imports: Rcpp
Description: This package contains algorithms for lasso and fused-lasso problems. It contains an implementation of the lars algorithm [1], for the lasso and fusion penalization and EM-based algorithms for (logistic) lasso and fused-lasso.
License: GPL (>= 2)
LinkingTo: Rcpp
SystemRequirements: GNU make
Packaged: 2014-07-16 12:04:43 UTC; grimonprez
Author: Quentin Grimonprez [aut, cre], Serge Iovleff [ctb]
Maintainer: Quentin Grimonprez
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-07-17 08:21:35

More information about HDPenReg at CRAN

July 17, 2014 07:13 AM

Bioconductor Project Working Papers

Bayesian Model Averaging:- An Application in Cancer Clinical Trial

A data-driven conclusion is the most widely accepted approach in any medical research problem. When there is limited prior knowledge about the data supporting the problem, automatic selection of variables plays an important role in gaining insight into the study. Bayesian model averaging (BMA) can be considered for automatic variable selection and can be used as an alternative to stepwise regression. The aim of this paper is to show the application of Bayesian model averaging in medical research, particularly in cancer trials. The method is illustrated on bone marrow transplant data. BMA can be recommended for frequent use in data selection and as a tool for exploratory data analysis. It is a very handy method of choice for data analysis.

by Atanu Bhattacharjee at July 17, 2014 05:37 AM

July 16, 2014

Dirk Eddelbuettel

Introducing RcppParallel: Getting R and C++ to work (some more) in parallel

A common theme over the last few decades was that we could afford to simply sit back and let computer (hardware) engineers take care of increases in computing speed thanks to Moore's law. That same line of thought now frequently points out that we are getting closer and closer to the physical limits of what Moore's law can do for us.

So the new best hope is (and has been) parallel processing. Even our smartphones have multiple cores, and most if not all retail PCs now possess two, four or more cores. Real computers, aka somewhat decent servers, can be had with 24, 32 or more cores as well, and all that is before we even consider GPU coprocessors or other upcoming changes.

And sometimes our tasks are embarrassingly simple, as is the case with many data-parallel jobs: we can use higher-level operations such as those offered by the base R package parallel to spawn multiple processing tasks and gather the results. I covered all this in some detail in previous talks on High Performance Computing with R (and you can also consult the Task View on High Performance Computing with R which I edit).
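The spawn-and-gather pattern can be sketched outside of R as well. The following is a minimal illustration in standard C++ and not the API of the parallel package: the function names `sum_squares` and `parallel_sum_squares` are hypothetical, and `std::async` is used only to keep the sketch self-contained. The input is split into contiguous chunks, each chunk is handled by its own task, and the partial results are gathered at the end.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper: sum of squares over the half-open range [begin, end).
double sum_squares(const std::vector<double>& x, std::size_t begin, std::size_t end) {
    double total = 0.0;
    for (std::size_t i = begin; i < end; ++i) total += x[i] * x[i];
    return total;
}

// Split the input into at most n_tasks contiguous chunks, process each chunk
// in its own asynchronous task, then gather the partial results.
double parallel_sum_squares(const std::vector<double>& x, std::size_t n_tasks) {
    assert(n_tasks > 0);
    const std::size_t n = x.size();
    const std::size_t chunk = (n + n_tasks - 1) / n_tasks;  // ceiling division
    std::vector<std::future<double>> parts;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = std::min(begin + chunk, n);
        // Spawn: each task reads its own chunk of x and nothing else.
        parts.push_back(std::async(std::launch::async,
                                   [&x, begin, end] { return sum_squares(x, begin, end); }));
    }
    // Gather: block on each task in turn and combine the partial sums.
    double total = 0.0;
    for (auto& part : parts) total += part.get();
    return total;
}
```

Because the chunks do not overlap and each task only reads its own slice, no locking is needed; that independence is what makes a job "embarrassingly" parallel.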

But sometimes we can't use data-parallel approaches. Hence we have to redo our algorithms. Which is really hard. R itself has been relying on the (fairly mature) OpenMP standard for some of its operations. Luke Tierney's (awesome) keynote in May at our (sixth) R/Finance conference mentioned some of the issues related to OpenMP. One which matters is that OpenMP works really well on Linux, and either not so well (Windows) or not at all (OS X, due to the usual issue with the gcc/clang switch enforced by Apple, but the good news is that the OpenMP toolchain is expected to make it to OS X in some more performant form "soon"). R is still expected to make wider use of OpenMP in future versions.

Another tool which has been around for a few years, and which can be considered to be equally mature, is the Intel Threading Building Blocks library, or TBB. JJ recently started to wrap this up for use by R. The first approach resulted in a (now superseded, see below) package TBB. But hardware and OS issues bite once again, as Intel TBB does not really build that well with the Windows toolchain used by R (which is based on MinGW).

(And yes, there are two more options. But Boost Threads requires linking which precludes easy use as e.g. via our BH package. And C++11 with its threads library (based on Boost Threads) is not yet as widely available as R and Rcpp which means that it is not a real deployment option yet.)

Now, JJ, being as awesome as he is, went back to the drawing board and integrated a second threading toolkit: TinyThread++, a small header-only library without further dependencies. Not as feature-rich as Intel Threading Building Blocks, but at least available everywhere. So a new package RcppParallel, so far only on GitHub, wraps around both TinyThread++ and Intel Threading Building Blocks and offers a consistent interface available on all platforms used by R.

Better still, JJ also authored several pieces demonstrating this new package for the Rcpp Gallery:

All four are interesting and demonstrate different aspects of parallel computing via RcppParallel. But the last article is key. Based on a question by Jim Bullard, and then written with Jim, it shows how a particular matrix distance metric (which is missing from R) can be implemented in a serial manner both in R and via Rcpp. The key implementation, however, uses both Rcpp and RcppParallel and thereby achieves a truly impressive speed gain, as the gains from using compiled code (via Rcpp) and from using a parallel algorithm (via RcppParallel) are multiplicative! Between JJ's and my four-core machines the gain was between 200- and 300-fold, which is rather considerable. For kicks, I also used a much bigger machine at work which came in at an even larger speed gain (but gains become clearly sublinear as the number of cores increases; there are however some tuning parameters).

So these are exciting times. I am sure there will be lots more to come. For now, head over to the RcppParallel package and start playing. Further contributions to the Rcpp Gallery are not only welcome but strongly encouraged.

July 16, 2014 02:56 PM

Bioconductor Project Working Papers

CRANberries

New package hpcwld with initial version 0.4

Package: hpcwld
Version: 0.4
Date: 2014-07-01
Title: High Performance Cluster Models Based on Kiefer-Wolfowitz Recursion
Authors@R: c(person("Alexander", "Rumyantsev", role = c("aut", "cre"), email = "ar0@sampo.ru"))
Depends: R (>= 1.8.0)
Description: This package contains several models describing the behavior of workload and queue on a High Performance Cluster and computing GRID under FIFO service discipline basing on modified Kiefer-Wolfowitz recursion. Also sample data for inter-arrival times, service times, number of cores per task and waiting times of HPC of Karelian Research Centre are included, measurements took place from 06/03/2009 to 02/30/2011.
License: GPL (>= 2)
URL: http://www.r-project.org, http://cluster.krc.karelia.ru
Packaged: 2014-07-16 06:14:31 UTC; ar0
Author: Alexander Rumyantsev [aut, cre]
Maintainer: Alexander Rumyantsev
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-16 09:52:58

More information about hpcwld at CRAN

July 16, 2014 09:13 AM

New package trimTrees with initial version 1.0

Package: trimTrees
Type: Package
Title: Trimmed opinion pools of trees in a random forest
Version: 1.0
Date: 2014-07-15
Depends: R (>= 2.5.0),stats,randomForest,mlbench
Author: Yael Grushka-Cockayne, Victor Richmond R. Jose, Kenneth C. Lichtendahl Jr. and Huanghui Zeng, based on the source code from the randomForest package by Andy Liaw and Matthew Wiener and on the original Fortran code by Leo Breiman and Adele Cutler.
Maintainer: Yael Grushka-Cockayne
Description: Creates point and probability forecasts from the trees in a random forest using a trimmed opinion pool.
Suggests: MASS
License: GPL (>= 2)
Packaged: 2014-07-16 01:26:19 UTC; Huanghui
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-07-16 07:46:05

More information about trimTrees at CRAN

July 16, 2014 07:13 AM

Alstatr

LaTeX: Using gnuplot for Plotting Functions

$\mathrm{\LaTeX}$ can draw beautiful graphics with the TikZ package. Here is the plot of $f(x) = x$:


In $\mathrm{\LaTeX}$, everything has to be coded, from axes, to labels, to points on the $xy$-plane; that explains why it takes four lines of code for a single, very simple plot.

To start sketching, enclose the drawing inside the tikzpicture environment with options; in this case domain, the domain of $x$. The second and third lines draw the $x$ and $y$ axes, with usage

\draw[options] (x1,y1) -- (x2,y2);

Here options is set to ->, the line type with a pointed end, and (x1,y1) and (x2,y2) are the Cartesian coordinates of the endpoints of the line segment. Finally, the fourth line connects the $xy$-points: $\{(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)\}$. As this shows, $\mathrm{\LaTeX}$ on its own cannot plot a function directly, since one has to manually enter the points computed from the function into the drawing. However, $\mathrm{\LaTeX}$ can call an external program, gnuplot, to do the plotting. This program has to be installed first, and shell escape must then be enabled as an option of the $\mathrm{\LaTeX}$ compiler. For example, with pdflatex I add --shell-escape. Texmaker users can go to Options > Configure Texmaker and insert --shell-escape in the PdfLaTeX command. See the photo below.

 
With this option, $\mathrm{\LaTeX}$ can finally call gnuplot and have it evaluate the functions to obtain the $xy$-points that TikZ needs for plotting. For example, here are the plots of the functions $f(x) = x$, $g(x) = x^2$, and $h(x) = \sin(x)$.


 

The first four lines of the script are almost the same as in the previous one, except for the scale option, which scales the $x$ and $y$ axes, and the second line, which draws the grid on the graph. The next four lines are loops that generate the ticks of the two axes. Line 9 is a label for tick 0. The subsequent lines plot the three functions. Unlike before, the $xy$-points are computed by gnuplot, from the statement plot[id] function{...}.

Here are other examples taken from the TikZ and PGF manual:





by Al-Ahmadgaid Asaad (noreply@blogger.com) at July 16, 2014 04:34 AM

CRANberries

New package hillmakeR with initial version 0.2

Package: hillmakeR
Type: Package
Title: Perform occupancy analysis
Version: 0.2
Depends: R (>= 2.10)
URL: https://github.com/gilinson/hillmakeR
Date: 2014-07-14
Author: David Gilinson
Maintainer: David Gilinson
Description: Generate occupancy patterns based on arrival and departure timestamps
License: MIT + file LICENSE
LazyLoad: true
LazyData: true
Suggests: plyr
Packaged: 2014-07-15 17:17:39 UTC; dgilinson
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-07-16 01:13:38

More information about hillmakeR at CRAN

July 16, 2014 01:13 AM

July 15, 2014

CRANberries

New package GCAI.bias with initial version 1.0

Package: GCAI.bias
Type: Package
Title: Guided Correction Approach for Inherited bias (GCAI.bias)
Version: 1.0
Date: 2014-07-14
Author: Guoshuai Cai
Maintainer: Guoshuai Cai
Description: Many inherited biases and effects exists in RNA-seq due to both biological and technical effects. We observed the biological variance of testing target transcripts can influence the yield of sequencing reads, which might indicate a resource competition existing in RNA-seq. We developed this package to capture the bias depending on local sequence and perform the correction of this type of bias by borrowing information from spike-in measurement.
License: GPL (>= 2)
LazyLoad: yes
Packaged: 2014-07-15 16:50:22 UTC; gcai
Depends: R (>= 2.10)
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-16 01:07:26

More information about GCAI.bias at CRAN

July 15, 2014 11:13 PM

New package quad with initial version 1.0

Package: quad
Type: Package
Title: Exact permutation moments of quadratic form statistics
Version: 1.0
Date: 2014-07-05
Imports: PearsonDS
Author: Yi-Hui Zhou
Maintainer: Yi-Hui Zhou
Description: This package gives you the exact first four permutation moments for the most commonly used quadratic form statistics, which need not be positive definite. The extension of this work to quadratic forms greatly expands the utility of density approximations for these problems, including for high-dimensional applications, where the statistics must be extreme in order to exceed stringent testing thresholds. Approximate p-values are obtained by matching the exact moments to the Pearson family of distributions using the PearsonDS package.
License: GPL (>= 2)
LazyLoad: yes
Packaged: 2014-07-15 14:54:18 UTC; yzhou19
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-15 17:09:05

More information about quad at CRAN

July 15, 2014 05:13 PM

New package btf with initial version 1.0

Package: btf
Type: Package
Title: Estimates univariate function via Bayesian trend filtering
Version: 1.0
Date: 2014-07-14
Author: Edward A. Roualdes
Maintainer: Edward A. Roualdes
Description: Trend filtering uses the generalized lasso framework to fit an adaptive polynomial of degree k to estimate the function f_0 at each input x_i in the model: y_i = f_0(x_i) + epsilon_i, for i = 1, ..., n, and epsilon_i is sub-Gaussian with E(epsilon_i) = 0. Bayesian trend filtering adapts the genlasso framework to a fully Bayesian hierarchical model, estimating the penalty parameter lambda within a tractable Gibbs sampler.
License: GPL (>= 2.0)
Depends: R (>= 3.0.2)
Imports: Matrix, coda,
LinkingTo: Rcpp (>= 0.11.0), RcppEigen (>= 0.3.2.1.1)
NeedsCompilation: yes
Packaged: 2014-07-15 14:01:16 UTC; easy-e
Repository: CRAN
Date/Publication: 2014-07-15 16:42:44

More information about btf at CRAN

July 15, 2014 03:13 PM

New package archdata with initial version 0.1

Package: archdata
Type: Package
Title: Example Datasets from Archaeological Research
Version: 0.1
Date: 2014-06-24
Author: David L. Carlson
Maintainer: David L. Carlson
Description: The archdata package provides several types of data that are typically used in archaeological research.
Suggests: circular, plotrix
License: GPL
Packaged: 2014-06-25 14:40:20 UTC; dcarlson
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-15 16:42:42

More information about archdata at CRAN

July 15, 2014 03:13 PM

Rcpp Gallery

Parallel Distance Matrix Calculation with RcppParallel

The RcppParallel package includes high level functions for doing parallel programming with Rcpp. For example, the parallelFor function can be used to convert the work of a standard serial “for” loop into a parallel one.

This article describes using RcppParallel to compute pairwise distances for each row in an input data matrix and return an n x n lower-triangular distance matrix which can be used with clustering tools from within R, e.g., hclust.

Jensen-Shannon Distance

In this example, we compute the Jensen-Shannon distance (JSD), a metric that is not part of base R. Calculating distance matrices is a common practice in clustering applications (unsupervised learning). Certain clustering methods, such as partitioning around medoids (PAM) and hierarchical clustering, operate directly on this matrix.
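For reference, the quantity computed by the implementations in this article can be written out as follows. This is read off the code below (natural logarithm, with terms where either probability is zero skipped), so treat it as a summary of that code rather than a general definition; the symbols P, Q, M and the index k are notation introduced here.

```latex
\[
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{k} p_k \log\frac{p_k}{q_k},
\qquad
M = \tfrac{1}{2}\,(P + Q),
\]
\[
\mathrm{JSD}(P, Q) = \sqrt{\tfrac{1}{2}\, D_{\mathrm{KL}}(P \,\|\, M) + \tfrac{1}{2}\, D_{\mathrm{KL}}(Q \,\|\, M)}
\]
```

Here P and Q are two rows of the input matrix, each assumed to sum to one.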

A distance matrix stores the n*(n-1)/2 pairwise distances/similarities between observations in an n x p matrix, where n corresponds to the independent observational units and p represents the covariates measured on each individual. As a result, we are typically limited by the size of n, as the algorithm scales quadratically in both time and space in n.
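The pair count above can be checked with a short C++ sketch. It also shows one way a strict lower triangle could be packed into a flat array. The names pair_count and lower_index are hypothetical and introduced only for illustration; the code in this article stores a full n x n matrix instead.

```cpp
#include <cassert>
#include <cstddef>

// Number of unordered pairs among n observations: n*(n-1)/2.
inline std::size_t pair_count(std::size_t n) {
    return n * (n - 1) / 2;
}

// Position of the pair (i, j), with i > j, when the strict lower triangle
// is stored row by row: (1,0), (2,0), (2,1), (3,0), (3,1), (3,2), ...
// Hypothetical packing, not the layout used by the article's code.
inline std::size_t lower_index(std::size_t i, std::size_t j) {
    assert(i > j);  // only entries below the diagonal are stored
    return i * (i - 1) / 2 + j;
}
```

With four observations this gives six pairs, and the last pair (3, 2) lands at position 5, the final slot of the packed array.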

Implementation in R

As a baseline we’ll start with the implementation of Jensen-Shannon distance in plain R:

js_distance <- function(mat) {
  kld = function(p,q) sum(ifelse(p == 0 | q == 0, 0, log(p/q)*p))
  res = matrix(0, nrow(mat), nrow(mat))
  for (i in 1:(nrow(mat) - 1)) {
    for (j in (i+1):nrow(mat)) {
      m = (mat[i,] + mat[j,])/2
      d1 = kld(mat[i,], m)
      d2 = kld(mat[j,], m)
      res[j,i] = sqrt(.5*(d1 + d2))
    }
  }
  res
}

Implementation using Rcpp

Here is a re-implementation of js_distance using Rcpp. Note that this doesn’t yet take advantage of parallel processing, but still yields an approximately 50x speedup over the original R version on a 2.6GHz Haswell MacBook Pro.

Abstractly, a Distance function will take two vectors in R^J and return a value in R+. In this implementation, we don’t support arbitrary distance metrics, i.e., the JSD code computes the values from within the parallel kernel.

Our distance function kl_divergence is defined below and takes three parameters: iterators to the beginning and end of vector 1 and an iterator to the beginning of vector 2 (the end position of vector2 is implied by the end position of vector1).

#include <Rcpp.h>
using namespace Rcpp;

#include <cmath>
#include <algorithm>

// generic function for kl_divergence
template <typename InputIterator1, typename InputIterator2>
inline double kl_divergence(InputIterator1 begin1, InputIterator1 end1, 
                            InputIterator2 begin2) {
  
   // value to return
   double rval = 0;
   
   // set iterators to beginning of ranges
   InputIterator1 it1 = begin1;
   InputIterator2 it2 = begin2;
   
   // for each input item
   while (it1 != end1) {
      
      // take the value and increment the iterator
      double d1 = *it1++;
      double d2 = *it2++;
      
      // accumulate if appropriate
      if (d1 > 0 && d2 > 0)
         rval += std::log(d1 / d2) * d1;
   }
   return rval;  
}

With the kl_divergence function defined we can now iteratively apply it to the rows of the input matrix to generate the distance matrix:

// helper function for taking the average of two numbers
inline double average(double val1, double val2) {
   return (val1 + val2) / 2;
}

// [[Rcpp::export]]
NumericMatrix rcpp_js_distance(NumericMatrix mat) {
  
   // allocate the matrix we will return
   NumericMatrix rmat(mat.nrow(), mat.nrow());
   
   for (int i = 0; i < rmat.nrow(); i++) {
      for (int j = 0; j < i; j++) {
      
         // rows we will operate on
         NumericMatrix::Row row1 = mat.row(i);
         NumericMatrix::Row row2 = mat.row(j);
         
         // compute the average using std::transform from the STL
         std::vector<double> avg(row1.size());
         std::transform(row1.begin(), row1.end(), // input range 1
                        row2.begin(),             // input range 2
                        avg.begin(),              // output range 
                        average);                 // function to apply
      
         // calculate divergences
         double d1 = kl_divergence(row1.begin(), row1.end(), avg.begin());
         double d2 = kl_divergence(row2.begin(), row2.end(), avg.begin());
        
         // write to output matrix
         rmat(i,j) = std::sqrt(.5 * (d1 + d2));
      }
   }
   
   return rmat;
}
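To sanity-check the arithmetic outside of R, the same computation can be sketched for a single pair of rows in standard C++ with no Rcpp dependency. The function names kl and js_distance_pair are hypothetical and exist only for this illustration; the logic mirrors kl_divergence and the inner loop body above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Kullback-Leibler divergence with natural log; terms where either
// entry is zero are skipped, as in the article's kl_divergence.
inline double kl(const std::vector<double>& p, const std::vector<double>& q) {
    assert(p.size() == q.size());
    double total = 0.0;
    for (std::size_t k = 0; k < p.size(); ++k) {
        if (p[k] > 0 && q[k] > 0)
            total += std::log(p[k] / q[k]) * p[k];
    }
    return total;
}

// Jensen-Shannon distance between two probability vectors
// (hypothetical helper name, one pair of rows at a time).
inline double js_distance_pair(const std::vector<double>& p,
                               const std::vector<double>& q) {
    assert(p.size() == q.size());
    std::vector<double> m(p.size());
    for (std::size_t k = 0; k < p.size(); ++k)
        m[k] = (p[k] + q[k]) / 2;  // element-wise average of the two rows
    return std::sqrt(0.5 * (kl(p, m) + kl(q, m)));
}
```

Two identical rows give a distance of zero, and two rows with no overlap, such as (1, 0) and (0, 1), give sqrt(log 2), the largest value this distance takes with the natural logarithm.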

Parallel Version using RcppParallel

Adapting the serial version to run in parallel is straightforward. A few notes about the implementation:

  • To implement a parallel version we need to create a function object that can process discrete chunks of work (i.e. ranges of input).

  • Since the parallel version will be called from background threads, we can’t use R and Rcpp APIs directly. Rather, we use the threadsafe RMatrix accessor class provided by RcppParallel to read and write directly to the underlying matrix memory.

  • Other than organizing the code as a function object and using RMatrix, the parallel code is almost identical to the serial code. The main difference is that the outer loop starts with the begin index passed to the worker function rather than 0.

Parallelizing in this case has a big payoff: we observe performance of about 5.5x the serial version on a 2.6GHz Haswell MacBook Pro with 4 cores (8 with hyperthreading). Here is the definition of the JsDistance function object:

// [[Rcpp::depends(RcppParallel)]]
#include <RcppParallel.h>
using namespace RcppParallel;

struct JsDistance : public Worker {
   
   // input matrix to read from
   const RMatrix<double> mat;
   
   // output matrix to write to
   RMatrix<double> rmat;
   
   // initialize from Rcpp input and output matrixes (the RMatrix class
   // can be automatically converted to from the Rcpp matrix type)
   JsDistance(const NumericMatrix mat, NumericMatrix rmat)
      : mat(mat), rmat(rmat) {}
   
   // function call operator that does the work for the specified range (begin/end)
   void operator()(std::size_t begin, std::size_t end) {
      for (std::size_t i = begin; i < end; i++) {
         for (std::size_t j = 0; j < i; j++) {
            
            // rows we will operate on
            RMatrix<double>::Row row1 = mat.row(i);
            RMatrix<double>::Row row2 = mat.row(j);
            
            // compute the average using std::transform from the STL
            std::vector<double> avg(row1.length());
            std::transform(row1.begin(), row1.end(), // input range 1
                           row2.begin(),             // input range 2
                           avg.begin(),              // output range 
                           average);                 // function to apply
              
            // calculate divergences
            double d1 = kl_divergence(row1.begin(), row1.end(), avg.begin());
            double d2 = kl_divergence(row2.begin(), row2.end(), avg.begin());
               
            // write to output matrix
            rmat(i,j) = std::sqrt(.5 * (d1 + d2));
         }
      }
   }
};

Now that we have the JsDistance function object we can pass it to parallelFor, specifying an iteration range based on the number of rows in the input matrix:

// [[Rcpp::export]]
NumericMatrix rcpp_parallel_js_distance(NumericMatrix mat) {
  
   // allocate the matrix we will return
   NumericMatrix rmat(mat.nrow(), mat.nrow());

   // create the worker
   JsDistance jsDistance(mat, rmat);
     
   // call it with parallelFor
   parallelFor(0, mat.nrow(), jsDistance);

   return rmat;
}
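To make the idea of a worker that processes ranges more concrete, here is a minimal sketch of a chunked parallel loop in standard C++. It is an assumption-laden illustration, not RcppParallel's implementation: chunked_for and SquareWorker are hypothetical names, the range is split into equal contiguous chunks, and each chunk runs on its own std::thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a chunked parallel loop: splits [begin, end)
// into contiguous chunks and calls worker(chunk_begin, chunk_end) for each
// chunk on a separate thread.
template <typename WorkerT>
void chunked_for(std::size_t begin, std::size_t end, WorkerT& worker,
                 std::size_t n_threads = 4) {
    if (begin >= end) return;
    if (n_threads == 0) n_threads = 1;
    const std::size_t total = end - begin;
    const std::size_t chunk = (total + n_threads - 1) / n_threads;  // ceiling

    std::vector<std::thread> threads;
    for (std::size_t lo = begin; lo < end; lo += chunk) {
        const std::size_t hi = std::min(lo + chunk, end);
        threads.emplace_back([&worker, lo, hi] { worker(lo, hi); });
    }
    for (std::thread& t : threads) t.join();  // wait for every chunk
}

// Example worker: each index writes only its own output slot, so chunks
// never touch the same memory and no locking is needed.
struct SquareWorker {
    const std::vector<double>& input;
    std::vector<double>& output;

    SquareWorker(const std::vector<double>& in, std::vector<double>& out)
        : input(in), output(out) {}

    void operator()(std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i)
            output[i] = input[i] * input[i];
    }
};
```

A call such as chunked_for(0, in.size(), worker, 3) fills the output vector with the squares of the input. For the distance matrix, row i needs i inner iterations, so an equal split like this one would leave the last chunk with most of the work; a scheduler that hands out smaller pieces balances such triangular loops better.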

Benchmarks

We now compare the performance of the three different implementations: pure R, serial Rcpp, and parallel Rcpp:

# create a matrix
n  = 1000
m = matrix(runif(n*10), ncol = 10)
m = m/rowSums(m)

# ensure that serial and parallel versions give the same result
r_res <- js_distance(m)
rcpp_res <- rcpp_js_distance(m)
rcpp_parallel_res <- rcpp_parallel_js_distance(m)
stopifnot(all(rcpp_res == rcpp_parallel_res))
stopifnot(all(abs(rcpp_parallel_res - r_res) < 1e-10)) ## precision differences

# compare performance
library(rbenchmark)
res <- benchmark(js_distance(m),
                 rcpp_js_distance(m),
                 rcpp_parallel_js_distance(m),
                 replications = 3,
                 order="relative")
res[,1:4]
                          test replications elapsed relative
3 rcpp_parallel_js_distance(m)            3   0.110    1.000
2          rcpp_js_distance(m)            3   0.618    5.618
1               js_distance(m)            3  35.560  323.273

The serial Rcpp version yields a more than 50x speedup over straight R code. The parallel Rcpp version provides another 5.5x speedup, amounting to a total gain of over 300x compared to the original R version.
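The arithmetic behind these figures can be spelled out from the elapsed times in the benchmark table above. The two speedups multiply because the serial Rcpp timing cancels between the two ratios:

```latex
\[
\frac{35.560}{0.618} \approx 57.5,
\qquad
\frac{0.618}{0.110} \approx 5.6,
\qquad
\frac{35.560}{0.110}
  = \frac{35.560}{0.618} \times \frac{0.618}{0.110}
  \approx 323.3
\]
```

The last value matches the relative column reported by rbenchmark for js_distance(m).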

Note that performance gains will typically be 30-50% less on Windows systems as a result of less sophisticated thread scheduling (RcppParallel does not currently use TBB on Windows whereas it does on the Mac and Linux).

You can learn more about using RcppParallel at https://github.com/RcppCore/RcppParallel.

July 15, 2014 07:00 AM

Computing an Inner Product with RcppParallel

The RcppParallel package includes high level functions for doing parallel programming with Rcpp. For example, the parallelReduce function can be used to aggregate values from a set of inputs in parallel. This article describes using RcppParallel to parallelize the inner-product example previously posted to the Rcpp Gallery.

Serial Version

First the serial version of computing the inner product. For this we use a simple call to the STL std::inner_product function:

#include <Rcpp.h>
using namespace Rcpp;

#include <numeric>   // std::inner_product

// [[Rcpp::export]]
double innerProduct(NumericVector x, NumericVector y) {
   return std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
}

Parallel Version

Now we adapt our code to run in parallel. We’ll use the parallelReduce function to do this. This function requires a “worker” function object (defined below as InnerProduct). For details on worker objects see the parallel-vector-sum article on the Rcpp Gallery.

// [[Rcpp::depends(RcppParallel)]]
#include <RcppParallel.h>
using namespace RcppParallel;

struct InnerProduct : public Worker
{   
   // source vectors
   const RVector<double> x;
   const RVector<double> y;
   
   // product that I have accumulated
   double product;
   
   // constructors
   InnerProduct(const NumericVector x, const NumericVector y) 
      : x(x), y(y), product(0) {}
   InnerProduct(const InnerProduct& innerProduct, Split) 
      : x(innerProduct.x), y(innerProduct.y), product(0) {}
   
   // process just the elements of the range I've been asked to
   void operator()(std::size_t begin, std::size_t end) {
      product += std::inner_product(x.begin() + begin, 
                                    x.begin() + end, 
                                    y.begin() + begin, 
                                    0.0);
   }
   
   // join my value with that of another InnerProduct
   void join(const InnerProduct& rhs) { 
     product += rhs.product; 
   }
};

Note that InnerProduct derives from the RcppParallel::Worker class. This is required for function objects passed to parallelReduce.

Note also that we use the RVector<double> type for accessing the vector. This is because this code will execute on a background thread where it’s not safe to call R or Rcpp APIs. The RVector class is included in the RcppParallel package and provides a lightweight, thread-safe wrapper around R vectors.

Now that we’ve defined the function object, implementing the parallel inner product function is straightforward. Just initialize an instance of InnerProduct with the input vectors and call parallelReduce:

// [[Rcpp::export]]
double parallelInnerProduct(NumericVector x, NumericVector y) {
   
   // declare the InnerProduct instance that takes a pointer to the vector data
   InnerProduct innerProduct(x, y);
   
   // call parallelReduce to start the work
   parallelReduce(0, x.length(), innerProduct);
   
   // return the computed product
   return innerProduct.product;
}
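The split and join steps used by the worker above can be illustrated with a small reduction written in standard C++. This is only a sketch under stated assumptions: chunked_reduce and DotWorker are hypothetical names, the split is a plain member function rather than a special constructor, and the scheduling (equal chunks, one std::thread each) is not how RcppParallel itself works internally.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Worker with a private accumulator, in the spirit of InnerProduct above.
struct DotWorker {
    const std::vector<double>& x;
    const std::vector<double>& y;
    double product;

    DotWorker(const std::vector<double>& x_, const std::vector<double>& y_)
        : x(x_), y(y_), product(0.0) {}

    // "split": same inputs, fresh accumulator
    DotWorker split() const { return DotWorker(x, y); }

    // accumulate the partial inner product over [begin, end)
    void operator()(std::size_t begin, std::size_t end) {
        product += std::inner_product(x.begin() + begin, x.begin() + end,
                                      y.begin() + begin, 0.0);
    }

    // "join": fold another worker's partial result into this one
    void join(const DotWorker& rhs) { product += rhs.product; }
};

// Hypothetical chunked reduction: one split-off worker per chunk, each run
// on its own thread, then joined back into the original worker.
template <typename WorkerT>
void chunked_reduce(std::size_t begin, std::size_t end, WorkerT& worker,
                    std::size_t n_threads = 4) {
    if (begin >= end) return;
    if (n_threads == 0) n_threads = 1;
    const std::size_t chunk = (end - begin + n_threads - 1) / n_threads;

    // Create all private workers first so their addresses stay stable.
    std::vector<WorkerT> parts;
    std::vector<std::size_t> starts;
    for (std::size_t lo = begin; lo < end; lo += chunk) {
        parts.push_back(worker.split());
        starts.push_back(lo);
    }

    std::vector<std::thread> threads;
    for (std::size_t k = 0; k < parts.size(); ++k) {
        const std::size_t lo = starts[k];
        const std::size_t hi = std::min(lo + chunk, end);
        WorkerT* part = &parts[k];
        threads.emplace_back([part, lo, hi] { (*part)(lo, hi); });
    }
    for (std::thread& t : threads) t.join();

    // Combine the partial results only after all threads have finished.
    for (const WorkerT& part : parts) worker.join(part);
}
```

Because every thread accumulates into its own copy, no locking is needed; the only shared step is the final join. Summing in chunks can change the order of floating-point additions, so results may differ from a serial sum in the last few bits.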

Benchmarks

A comparison of the performance of the two functions shows the parallel version performing about 2.5 times as fast on a machine with 4 cores:

x <- runif(1000000)
y <- runif(1000000)

library(rbenchmark)
res <- benchmark(sum(x*y),
                 innerProduct(x, y),
                 parallelInnerProduct(x, y),
                 order="relative")
res[,1:4]
                        test replications elapsed relative
3 parallelInnerProduct(x, y)          100   0.035    1.000
2         innerProduct(x, y)          100   0.088    2.514
1                 sum(x * y)          100   0.283    8.086

Note that performance gains will typically be 30-50% less on Windows systems as a result of less sophisticated thread scheduling (RcppParallel does not currently use TBB on Windows whereas it does on the Mac and Linux).

You can learn more about using RcppParallel at https://github.com/RcppCore/RcppParallel.

July 15, 2014 07:00 AM

July 14, 2014

CRANberries

New package tuple with initial version 0.3-06

Package: tuple
Type: Package
Title: Find every match, or orphan, duplicate or triplicate values
Author: Emmanuel Lazaridis [aut, cre]
Maintainer: Emmanuel Lazaridis
Depends: R (>= 2.10.0)
Description: Functions to find all matches, or the orphan, duplicate or triplicate values in a vector.
License: LGPL-3
Encoding: UTF-8
LazyLoad: no
URL: http://statistics.lazaridis.eu
Authors@R: c(person(given = "Emmanuel", family = "Lazaridis", email="emmanuel@lazaridis.eu", role = c("aut", "cre")))
Version: 0.3-06
Date: 2014-07-14
Packaged: 2014-07-14 14:41:27 UTC; james
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-14 23:32:30

More information about tuple at CRAN

July 14, 2014 11:13 PM

New package inTrees with initial version 1.0

Package: inTrees
Title: Interpret Tree Ensembles
Version: 1.0
Date: 2014-07-04
Imports: RRF, arules, gbm
Suggests: xtable
Author: Houtao Deng
Maintainer: Houtao Deng
Description: From a tree ensemble, extract, measure and prune rules; select a compact rule set; summarize rules into a learner; calculate frequent variable interactions.
URL: https://sites.google.com/site/houtaodeng/intrees
License: GPL (>= 3)
Packaged: 2014-07-14 17:51:04 UTC; hdeng1
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-14 23:25:56

More information about inTrees at CRAN

July 14, 2014 11:13 PM

New package FatTailsR with initial version 1.0-3

Package: FatTailsR
Title: Power Hyperbolic Functions and Kiener Distributions
Description: A set of functions to manipulate power hyperbolas, power hyperbolic functions and Kiener distributions of type I, II, III and IV which exhibit left and right fat tails like those that occur in financial markets. These distributions can be used to estimate with a high accuracy market risks and value-at-risk.
URL: http://www.inmodelia.com/fattailsr-en.html
Version: 1.0-3
Date: 2014-07-14
Author: Patrice Kiener
Maintainer: Patrice Kiener
Depends: R (>= 3.1.0)
Imports: minpack.lm
Suggests: timeSeries, timeDate
License: GPL-2
LazyData: true
NeedsCompilation: no
Collate: 'FatTailsR-package.r' 'trigohp.R' 'logishp.R' 'conversion.R' 'kiener1.R' 'kiener2.R' 'kiener3.R' 'kiener4.R' 'regression.R'
Packaged: 2014-07-14 19:50:23 UTC; Patrice
Repository: CRAN
Date/Publication: 2014-07-14 23:24:37

More information about FatTailsR at CRAN

July 14, 2014 11:13 PM

New package IDPSurvival with initial version 1.0

Package: IDPSurvival
Version: 1.0
Date: 2014-07-14
Title: Imprecise Dirichlet Process for Survival Analysis
Author: Francesca Mangili , Alessio Benavoli , Cassio P. de Campos , Marco Zaffalon
Maintainer: Francesca Mangili
Depends: R (>= 3.0.2), Rsolnp, gtools, survival
Description: This package contains functions to perform robust nonparametric survival analysis with right censored data using a prior near-ignorant Dirichlet Process.
License: GPL (>= 3) | file LICENSE
URL: http://ipg.idsia.ch/software/
Packaged: 2014-07-14 12:07:03 UTC; graycassio
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-14 14:50:08

More information about IDPSurvival at CRAN

July 14, 2014 01:13 PM

New package enviPick with initial version 1.0

Package: enviPick
Type: Package
Title: Peak picking for high resolution mass spectrometry data
Version: 1.0
Date: 2014-07-14
Author: Martin Loos
Maintainer: Martin Loos
Description: Sequential partitioning, clustering and peak detection of centroided LC-MS mass spectrometry data (.mzXML). Interactive result and raw data plot.
License: GPL (>= 2)
Depends: R (>= 3.0.1), shiny(>= 0.7.0), readMzXmlData(>= 2.7)
Packaged: 2014-07-14 11:49:42 UTC; uchemadmin
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-07-14 14:50:11

More information about enviPick at CRAN

July 14, 2014 01:13 PM

July 13, 2014

CRANberries

New package QoLR with initial version 1.0

Package: QoLR
Version: 1.0
Date: 2014-07-13
Title: Analysis of Health-Related Quality of Life in oncology
Author: Amelie Anota
Maintainer: Amelie Anota
Depends: R (>= 2.10.0), survival, zoo
Description: To generate the scores of the EORTC QLQ-C30 questionnaire and supplementary modules and to determine the time to quality of life score deterioration in longitudinal analysis.
License: GPL (>= 2.0)
Packaged: 2014-07-13 07:55:28 UTC; Amlie
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-13 13:02:36

More information about QoLR at CRAN

July 13, 2014 11:13 AM

New package pgmm with initial version 1.1

Package: pgmm
Type: Package
Title: Parsimonious Gaussian mixture models
Version: 1.1
Date: 2014-07-12
Author: Paul McNicholas, Raju Jampani, Aaron McDaid, Brendan Murphy, Larry Banks
Maintainer: Paul McNicholas
Description: Carries out model-based clustering or classification using parsimonious Gaussian mixture models.
License: GPL (>= 2)
LazyLoad: yes
Packaged: 2014-07-12 17:28:12 UTC; paul
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-07-13 08:19:14

More information about pgmm at CRAN

July 13, 2014 07:13 AM

New package dbarts with initial version 0.8-2

Package: dbarts
Version: 0.8-2
Date: 2014-07-12
Title: Discrete Bayesian Additive Regression Trees Sampler
Author: Hugh Chipman , Robert McCulloch , Vincent Dorie
Maintainer: Vincent Dorie
Depends: R (>= 3.0-0)
Imports: methods, stats
Suggests: testthat
Description: Fits Bayesian additive regression trees (BART) while allowing the updating of predictors or response so that BART can be incorporated as a conditional model in a Gibbs/MH sampler. Also serves as a drop-in replacement for package BayesTree.
NeedsCompilation: yes
License: GPL (>= 2)
URL: https://github.com/vdorie/dbarts
BugReports: https://github.com/vdorie/dbarts/issues
Packaged: 2014-07-13 00:38:09 UTC; vdorie
Repository: CRAN
Date/Publication: 2014-07-13 08:15:40

More information about dbarts at CRAN

July 13, 2014 07:13 AM

July 12, 2014

Dirk Eddelbuettel

RcppArmadillo 0.4.320.0

While I was out at the (immensely impressive and equally enjoyable) useR! 2014 conference at UCLA, Conrad provided a bug-fix release 4.320 of Armadillo, the nifty templated C++ library for linear algebra. I quickly rolled that into RcppArmadillo release 0.4.320.0 which has been on CRAN and in Debian for a good week now.

This release fixes some minor things with sparse and dense eigenvalue solvers (as well as one RNG issue probably of lesser interest to R users deploying the RNGs from R) as shown in the NEWS entry below.

Changes in RcppArmadillo version 0.4.320.0 (2014-07-03)

  • Upgraded to Armadillo release Version 4.320 (Daintree Tea Raider)

    • expanded eigs_sym() and eigs_gen() to use an optional tolerance parameter

    • expanded eig_sym() to automatically fall back to standard decomposition method if divide-and-conquer fails

    • automatic installer enables use of C++11 random number generator when using gcc 4.8.3+ in C++11 mode

Courtesy of CRANberries, there is also a diffstat report for the most recent release. As always, more detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

July 12, 2014 11:45 PM

CRANberries

New package imputeLCMD with initial version 1.0

Package: imputeLCMD
Type: Package
Title: A collection of methods for left-censored missing data imputation
Version: 1.0
Date: 2014-07-04
Author: Cosmin Lazar
Maintainer: Cosmin Lazar
Description: The package contains a collection of functions for left-censored missing data imputation. Left-censoring is a special case of missing not at random (MNAR) mechanism that generates non-responses in proteomics experiments. The package also contains functions to artificially generate peptide/protein expression data (log-transformed) as random draws from a multivariate Gaussian distribution as well as a function to generate missing data (both randomly and non-randomly). For comparison reasons, the package also contains several wrapper functions for the imputation of non-responses that are missing at random.
License: GPL (>= 2)
Depends: R (>= 2.10), tmvtnorm, norm, pcaMethods, impute
Packaged: 2014-07-12 18:51:48 UTC; cosminlazar
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-13 00:40:33

More information about imputeLCMD at CRAN

July 12, 2014 11:13 PM

New package BayesNetDiscovery with initial version 0.1

Package: BayesNetDiscovery
Type: Package
Title: Bayesian Network Discovery
Version: 0.1
Description: This package provides efficient Bayesian nonparametric models for network discovery
Depends: R (>= 3.0.3)
License: GPL-2
Imports: DPpackage,igraph,mclust,pscl,tmvtnorm
Packaged: 2014-07-12 17:34:24 UTC; Zhou Lan
Author: Zhou Lan, Jian Kang, Tianwei Yu, Yize Zhao
Maintainer: Zhou Lan
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-07-13 00:40:10

More information about BayesNetDiscovery at CRAN

July 12, 2014 11:13 PM

July 11, 2014

CRANberries

New package refund.wave with initial version 0.1

Package: refund.wave
Type: Package
Title: Wavelet-Domain Regression with Functional Data
Version: 0.1
Date: 2014-07-03
Author: Lan Huo, Philip Reiss and Yihong Zhao
Maintainer: Adam Ciarleglio
Depends: R (>= 2.14.0), glmnet, wavethresh
Description: Methods for regressing scalar responses on functional or image predictors, via transformation to the wavelet domain and back.
License: GPL (>= 2)
LazyLoad: yes
Repository: CRAN
Collate: 'decomp.R' 'decomp2d.R' 'decomp3d.R' 'reconstr.R' 'reconstr2d.R' 'reconstr3d.R' 'wcr.R' 'wcr.perm.R' 'wnet.R' 'wnet.perm.R' 'wUtil.R' 'plot.wnet-and-plot.wcr.R' 'Data2wd.R' 'wd2fhat.R'
Packaged: 2014-07-11 18:19:29 UTC; adamciarleglio
NeedsCompilation: no
Date/Publication: 2014-07-11 23:41:47

More information about refund.wave at CRAN

July 11, 2014 11:13 PM