Planet R

April 19, 2014


New package RApiSerialize with initial version 0.1.0

Package: RApiSerialize
Type: Package
Title: R API Serialization
Version: 0.1.0
Date: 2014-04-19
Author: Dirk Eddelbuettel, Junji Nakano, Ei-ji Nakama, and R Core (original code)
Maintainer: Dirk Eddelbuettel
Description: This package provides other packages with access to the internal R serialization code. Access to this code is provided at the C function level by using the registration of native function mechanism. Client packages simply include a single header file RApiSerializeAPI.h provided by this package. This package builds on the Rhpc package by Junji Nakano and Ei-ji Nakama which also includes a (partial) copy of the file src/main/serialize.c from R itself. The R Core group is the original author of the serialization code made available by this package.
License: GPL (>= 2)
Packaged: 2014-04-19 13:45:33.622462 UTC; edd
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-04-19 18:49:29

More information about RApiSerialize at CRAN

April 19, 2014 05:12 PM

April 18, 2014


New package rWBclimate with initial version 0.1.3

Package: rWBclimate
Authors@R: c(person("Edmund", "Hart", role = c("aut", "cre"), email = ""))
Version: 0.1.3
License: MIT + file LICENSE
Type: Package
Title: A package for accessing World Bank climate data
Description: This package will download model predictions from 15 different global circulation models in 20 year intervals from the world bank. Users can also access historical data, and create maps at 2 different spatial scales.
LazyData: True
VignetteBuilder: knitr
Suggests: knitr
Imports: ggplot2, httr, plyr, rgdal, jsonlite, reshape2, sp
Packaged: 2014-04-18 21:09:16 UTC; tedhart
Author: Edmund Hart [aut, cre]
Maintainer: Edmund Hart
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-19 00:53:12

More information about rWBclimate at CRAN

April 18, 2014 11:12 PM


Because it's Friday: This is why dogs hate wizards

What happens when you offer a dog a treat, but then make it vanish via sleight of hand? This:


Like Sullivan, I'm surprised these dogs are fooled at all, and can't tell where the treat is by scent.

That's all for this week. See you on Monday!

by David Smith at April 18, 2014 08:48 PM


New package QuantifQuantile with initial version 0.1

Package: QuantifQuantile
Type: Package
Title: Estimation of conditional quantiles using optimal quantization
Version: 0.1
Date: 2014-04-18
Author: Isabelle Charlier and Davy Paindaveine and Jerome Saracco
Maintainer: Isabelle Charlier
Description: Estimation of conditional quantiles using optimal quantization. Construction of an optimal grid of N quantizers, estimation of conditional quantiles and data driven selection of the size N of the grid. Graphical illustrations for the selection of N and of resulting estimated curves or surfaces when the dimension of the covariable is one or two.
License: GPL (>= 2.0)
Depends: rgl
Packaged: 2014-04-18 14:24:25 UTC; isabellecharlier
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-18 19:41:42

More information about QuantifQuantile at CRAN

April 18, 2014 07:12 PM


R and the weather in the local news

The Mountain View Voice is a weekly newspaper serving the Silicon Valley area, and is a familiar sight to anyone wandering the streets of Palo Alto or Menlo Park. Angela Hey writes for 'Hey Tech!', an online blog of the Voice, and has just published a feature on R and the local Bay Area User Group (BARUG). It includes a nice history of R, and an in-depth recap of Ram Narasimhan's lightning talk on the weatherData package and his weatherCompare app at the last BARUG meeting. (You can read about other talks at that BARUG meetup in Joe Rickert's recap.)


Read Angela Hey's feature from the Mountain View Voice blog at the link below.

Mountain View Voice: Analyze data yourself with R - a fast growing language for statistics, forecasting and graphs

by David Smith at April 18, 2014 05:13 PM

DM Radio on Data Science

A couple of weeks ago, I participated in a panel discussion for DM Radio: "Still Sexy? How's that Data Scientist Gig Working Out?". The title was provocative, but the discussion mostly revolved around the rise of data science and how advanced analytics (often implemented with R) is changing the way many companies do business today. Also on the panel, hosted by Eric Kavanagh, were Geoffrey Malafsky of Phasic Systems, John Whittaker of Dell, and Chandran Saravana of SAP. The podcast is now available online, and you can listen to it at the link below. (And the answer is: Yes, still sexy!)

Information Management / DM Radio: Still Sexy? How's that Data Scientist Gig Working Out? (reg. req.)

by David Smith at April 18, 2014 04:31 PM


New package Density.T.HoldOut with initial version 1.02

Encoding: UTF-8
Package: Density.T.HoldOut
Type: Package
Title: Density.T.HoldOut: Non-combinatorial T-estimation Hold-Out for density estimation
Version: 1.02
Date: 2014-01-08
Author: Nelo Magalhães (Univ. Paris Sud 11 - INRIA team Select) and Yves Rozenholc (Univ. Paris Descartes - INRIA team Select)
Maintainer: Nelo Magalhães
Description: Implementation in the density framework of the non-combinatorial algorithm and its greedy version, introduced by Magalhães and Rozenholc (2014), for T-estimation Hold-Out proposed in Birgé (2006, Section 9). The package provides an implementation which uses several families of estimators (regular and irregular histograms, kernel estimators) which may be used alone or combined. As a complement, it also provides a comparison with other Held-Out derived from least-squares and maximum-likelihood.
License: GPL (>= 2)
Imports: histogram
Packaged: 2014-04-17 18:33:01 UTC; rozen
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-18 16:17:26

More information about Density.T.HoldOut at CRAN

April 18, 2014 03:12 PM

Removed CRANberries

Package rbundler (with last version 0.3.6) was removed from CRAN

Previous versions (as known to CRANberries) which should be available via the Archive link are:

2014-02-25 0.3.6

April 18, 2014 09:12 AM

Package WMTregions (with last version 3.2.6) was removed from CRAN

Previous versions (as known to CRANberries) which should be available via the Archive link are:

2013-02-28 3.2.6
2012-05-16 3.2.5
2012-02-02 3.2.4
2010-11-30 2.5.4
2010-11-03 2.3.10
2010-10-21 2.3.9

April 18, 2014 07:12 AM

Package StochaTR (with last version 1.0.4) was removed from CRAN

Previous versions (as known to CRANberries) which should be available via the Archive link are:

2011-11-03 1.0.4
2011-11-02 1.0.3

April 18, 2014 07:12 AM

April 17, 2014


New package WikipediR with initial version 0.5

Package: WikipediR
Type: Package
Title: A MediaWiki API wrapper
Version: 0.5
Date: 2014-04-13
Author: Oliver Keyes
Maintainer: Oliver Keyes
Description: WikipediR is a wrapper for the MediaWiki API, aimed particularly at the Wikimedia 'production' wikis, such as Wikipedia
License: MIT + file LICENSE
Depends: httr, jsonlite, methods
Classification/ACM: J.1
Packaged: 2014-04-17 17:55:16 UTC; ironholds
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-18 00:38:14

More information about WikipediR at CRAN

April 17, 2014 11:13 PM

New package userfriendlyscience with initial version 0.1.1

Package: userfriendlyscience
Type: Package
Title: Userfriendlyscience: quantitative analysis made accessible
Version: 0.1.1
Date: 2014-02-25
Author: Gjalt-Jorn Peters
Maintainer: Gjalt-Jorn Peters
License: GPL (>= 2)
Description: This package contains a number of functions that serve two goals: first, make R more accessible to people migrating from SPSS by adding a number of functions that behave roughly like their SPSS equivalents; and second, make a number of slightly more advanced functions more userfriendly to relatively novice users.
Imports: lavaan, psych, GGally, pwr, plyr, grid, fBasics, e1071, ltm, MBESS, knitr, xtable, foreign
Depends: ggplot2
Packaged: 2014-04-17 16:18:32 UTC; micro_000
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-17 23:17:14

More information about userfriendlyscience at CRAN

April 17, 2014 11:13 PM

New package ShrinkCovMat with initial version 1.0.0

Package: ShrinkCovMat
Type: Package
Title: Shrinkage Covariance Matrix Estimators
Version: 1.0.0
Date: 2014-04-17
Author: Anestis Touloumis
Maintainer: Anestis Touloumis
Description: This package provides nonparametric Stein-type shrinkage estimators of the covariance matrix that are suitable in high-dimensional settings, that is when the number of variables is larger than the sample size.
License: GPL-2 | GPL-3
Depends: R (>= 2.10)
LazyLoad: yes
Packaged: 2014-04-17 14:20:12 UTC; toulou01
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-17 23:12:07

More information about ShrinkCovMat at CRAN

April 17, 2014 11:12 PM

New package HSAUR3 with initial version 1.0-0

Package: HSAUR3
Title: A Handbook of Statistical Analyses Using R (3rd Edition)
Date: 2014-04-17
Version: 1.0-0
Author: Brian S. Everitt and Torsten Hothorn
Maintainer: Torsten Hothorn
Description: Functions, data sets, analyses and examples from the third edition of the book `A Handbook of Statistical Analyses Using R' (Torsten Hothorn and Brian S. Everitt, Chapman & Hall/CRC, 2014). The first chapter of the book, which is entitled `An Introduction to R', is completely included in this package, for all other chapters, a vignette containing all data analyses is available. In addition, Sweave source code for slides of selected chapters is included in this package (see HSAUR3/inst/slides).
URL: The publishers web page is
Depends: R (>= 3.0.0), tools
Suggests: boot (>= 1.3-9), lattice (>= 0.20-23), MASS (>= 7.3-29), mgcv (>= 1.7-27), rpart (>= 4.1-4), survival (>= 2.37-4), alr3 (>= 2.0.5), ape (>= 3.0-11), coin (>= 1.0-23), flexmix (>= 2.3-11), Formula (>= 1.1-1), gamair (>= 0.0.8), (>= 4.2.6), gee (>= 4.13-18), KernSmooth (>= 2.23-10), lme4 (>= 1.0-5), maps (>= 2.3-6), maptools (>= 0.8-27), mboost (>= 2.2-3), mclust (>= 4.2), mlbench (>= 2.1-1), mice (>= 2.18), multcomp (>= 1.3-1), mvtnorm (>= 0.9-9996), partykit (>= 0.8-0), quantreg (>= 5.05), randomForest (>= 4.6-7), rmeta (>= 2.16), sandwich (>= 2.3-0), scatterplot3d (>= 0.3-34), sp (>= 1.0-14), (>= 1.0-2), tm (>= 0.5-9.1), vcd (>= 1.3-1), wordcloud (>= 2.4)
LazyData: yes
License: GPL-2
Encoding: UTF-8
Packaged: 2014-04-17 10:31:57 UTC; hothorn
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-18 00:21:17

More information about HSAUR3 at CRAN

April 17, 2014 11:12 PM

New package MSQC with initial version 1.0.1

Package: MSQC
Type: Package
Title: Multivariate Statistical Quality Control
Version: 1.0.1
Date: 2014-04-17
Author: Edgar Santos-Fernandez
Maintainer: Edgar Santos-Fernandez
Depends: rgl
Description: This package is a toolkit for multivariate process monitoring. It contains the main alternatives of multivariate control chart such as: Hotelling, Chi squared, MEWMA, MCUSUM and Generalized Variance control chart. Also, it includes some tools for assessing multivariate normality like: Mardia, Royston and Henze Zirkler test and the univariate D'Agostino test. Moreover, it possesses ten didactic datasets.
License: GPL (>= 2)
LazyLoad: yes
Packaged: 2014-04-16 23:55:48 UTC; esantosf
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-17 18:28:39

More information about MSQC at CRAN

April 17, 2014 05:12 PM

New package ATmet with initial version 1.2

Package: ATmet
Type: Package
Title: Advanced Tools for Metrology
Version: 1.2
Date: 2014-01-06
Author: S.Demeyer, A.Allard
Maintainer: Alexandre Allard
Depends: R (>= 2.7.0), DiceDesign, lhs, metRology, msm, sensitivity
Description: This package provides functions for smart sampling and sensitivity analysis for metrology applications, including computationally expensive problems.
License: GPL-3
Packaged: 2014-04-17 09:21:13 UTC; ALLARD
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-17 19:04:39

More information about ATmet at CRAN

April 17, 2014 05:12 PM

April 16, 2014


Why writing vectorized code in R is a good idea

As a language for statistical computing, R has always had a bias towards linear algebra, and is optimized for operations dealing in complete vectors and matrices. This can be surprising to programmers coming to R from lower-level languages, where iterative programming (looping over the elements of a vector or matrix) is more natural and often more efficient. That's not the case with R, though: Noam Ross explains why vectorized programming in R is a good idea:

If you can express what you want to do in R in a line or two, with just a few function calls that are actually calling compiled code, it’ll be more efficient than if you write long program, with the added overhead of many function calls. This is not the case in all other languages. Often, in compiled languages, you want to stick with lots of very simple statements, because it allows the compiler to figure out the most efficient translation of the code.
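To make the point concrete, here is a minimal sketch (not taken from Noam's article) that computes a sum of squares twice: once with an explicit loop and once with a vectorized expression. The function names loop_sum_sq and vec_sum_sq are made up for this illustration, and timings will differ from machine to machine.

```r
# Sum of squares two ways: an explicit loop and a vectorized expression.
x <- as.numeric(1:100000)

# Loop version: one interpreted iteration per element.
loop_sum_sq <- function(v) {
  total <- 0
  for (i in seq_along(v)) {
    total <- total + v[i]^2
  }
  total
}

# Vectorized version: `^` and sum() are each called once and do their
# element-wise work in compiled code.
vec_sum_sq <- function(v) sum(v^2)

# Both approaches give the same answer.
stopifnot(isTRUE(all.equal(loop_sum_sq(x), vec_sum_sq(x))))

# Compare timings; results vary by machine, so none are shown here.
system.time(loop_sum_sq(x))
system.time(vec_sum_sq(x))
```

The vectorized version is also shorter and easier to read, which is part of the argument for writing R this way.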

Read Noam's complete article at the link below for a bunch of useful tips and tricks for writing efficient and clear code in the R language using vectorized programming.

Noam Ross: Vectorization in R: Why?


by David Smith at April 16, 2014 09:20 PM

Diving into H2O

by Joseph Rickert

One of the remarkable features of the R language is its adaptability. Motivated by R’s popularity and helped by R’s expressive power and transparency, developers working on other platforms display what looks like inexhaustible creativity in providing seamless interfaces to software that complements R’s strengths. The H2O R package that connects to 0xdata’s H2O software (Apache 2.0 License) is an example of this kind of creativity.

According to the 0xdata website, H2O is “The Open Source In-Memory, Prediction Engine for Big Data Science”. Indeed, H2O offers an impressive array of machine learning algorithms. The H2O R package provides functions for building GLM, GBM, Kmeans, Naive Bayes, Principal Components Analysis, Principal Components Regression, Random Forests and Deep Learning (multi-layer neural net models). Examples with timing information of running all of these models on fairly large data sets are available on the 0xdata website. Execution speeds are very impressive. In this post, I thought I would start a little slower and look at H2O from an R point of view.

H2O is a Java Virtual Machine that is optimized for doing “in memory” processing of distributed, parallel machine learning algorithms on clusters. A “cluster” is a software construct that can be fired up on your laptop, on a server, or across the multiple nodes of a cluster of real machines, including computers that form a Hadoop cluster. According to the documentation, a cluster’s “memory capacity is the sum across all H2O nodes in the cluster”. So, as I understand it, if you were to build a 16-node cluster of machines each having 64GB of DRAM, and you installed H2O on every node, then you could run the H2O machine learning algorithms using a terabyte of memory.

Underneath the covers, the H2O JVM sits on an in-memory, non-persistent key-value (KV) store that uses a distributed Java memory model. The KV store holds state information, all results and the big data itself. H2O keeps the data in a heap. When the heap gets full, i.e. when you are working with more data than physical DRAM, H2O swaps to disk. (See Cliff Click’s blog for the details.) The main point here is that the data is not in R. R only has a pointer to the data, an S4 object containing the IP address, port and key name for the data sitting in H2O.
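The "pointer" idea can be sketched in plain R. The class below is hypothetical and is not the class the h2o package defines; it only illustrates an S4 object that records where data lives (IP address, port, key) without holding the data itself. The IP address is a placeholder; the port and key name echo values that appear in this post.

```r
library(methods)

# Hypothetical handle class: it stores the location of the data, not the data.
setClass("RemoteHandle",
         representation(ip = "character", port = "numeric", key = "character"))

# Placeholder connection details, chosen for illustration only.
handle <- new("RemoteHandle", ip = "127.0.0.1", port = 54321, key = "iris.hex")

# The object carries three small slots, however large the remote data set is.
slotNames(handle)
handle@key
```

Any function that needs the actual values has to ask the remote store for them using the information in these slots.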

The R H2O package communicates with the H2O JVM over a REST API. R sends RCurl commands and H2O sends back JSON responses. Data ingestion, however, does not happen via the REST API. Rather, an R user calls a function that causes the data to be directly parsed into the H2O KV store. The H2O R package provides several functions for doing this, including: h2o.importFile() which imports and parses files from a local directory, h2o.importURL() which imports and parses files from a website, and h2o.importHDFS() which imports and parses HDFS files sitting on a Hadoop cluster.

So much for the background: let’s get started with H2O. The first thing you need to do is to get Java running on your machine. If you don’t already have Java, the default download ought to be just fine. Then fetch and install the H2O R package. Note that the h2o.jar executable is currently shipped with the h2o R package. The following code from the 0xdata website ran just fine from RStudio on my PC:

# The following two commands remove any previously installed H2O packages for R.
if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }
if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }
# Next, we download, install and initialize the H2O package for R.
install.packages("h2o", repos=(c("", getOption("repos"))))
localH2O = h2o.init()
# Finally, let's run a demo to see H2O at work.
demo(h2o.glm)


Note that the function h2o.init() uses the defaults to start up H2O on your local machine. Users can also provide parameters to specify an IP address and port number in order to connect to a remote instance of H2O running on a cluster. h2o.init(Xmx="10g") will start up the H2O KV store with 10GB of RAM. demo(h2o.glm) runs the glm demo to let you know that everything is working just fine. I will save examining the model for another time. Instead let's look at some other H2O functionality.

The first thing to get straight with H2O is to be clear about when you are working in R and when you are working in the H2O JVM. The H2O R package implements several R functions that are wrappers to H2O native functions. "H2O supports an R-like language" (See a note on R) but sometimes things behave differently than an R programmer might expect.

For example, the R code:

y <- apply(iris[,1:4],2,sum)

produces the following result:

Sepal.Length Sepal.Width Petal.Length Petal.Width 
876.5        458.6       563.7        179.9
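As a side note for readers newer to R, the same column sums can be obtained with colSums(), which is the vectorized counterpart of apply(m, 2, sum). This is a small base R sketch using the built-in iris data, added here only as a reference point for the comparison that follows.

```r
# Column sums two ways on the built-in iris data set.
y_apply   <- apply(iris[, 1:4], 2, sum)
y_colsums <- colSums(iris[, 1:4])

# Both return the same named numeric vector.
stopifnot(isTRUE(all.equal(y_apply, y_colsums)))
y_colsums
```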

Now, let's see how things work in H2O. The following code loads the H2O package, starts a local instance of H2O, uploads the iris data set into the H2O instance from the H2O R package and produces a very R-like summary.

library(h2o)                # Load H2O library  
localH2O = h2o.init()       # Initialize a local H2O instance
# Upload iris file from the H2O package into the H2O local instance
iris.hex <-  h2o.uploadFile(localH2O, path = system.file("extdata", "iris.csv", package="h2o"), key = "iris.hex")

However, the apply() function from the H2O R package behaves a bit differently:

x <- apply(iris.hex[,1:4],2,sum)
IP Address:
Port : 54321
Parsed Data Key: Last.value.17

Instead of returning the results, it returns the attributes of the file in which the results are stored. You can see this by looking at the structure of x.

Formal class 'H2OParsedData' [package "h2o"] with 3 slots
..@ h2o :Formal class 'H2OClient' [package "h2o"] with 2 slots
.. .. ..@ ip : chr ""
.. .. ..@ port: num 54321
..@ key : chr "Last.value.17"
..@ logic: logi FALSE

H2O dataset 'Last.value.17': 4 obs. of 1 variable:
$ C1: num 876.5 458.1 563.8 179.8

You can get the data out, by coercing x into being a data frame.

df <- as.data.frame(x)
1 876.5
2 458.1
3 563.8
4 179.8

So, as one might expect, there are some differences that take a little getting used to. However, the focus ought not to be on the differences from R but on the potential of having some capabilities for manipulating huge data sets from within R. In combination, the H2O R package functions h2o.ddply() and h2o.addFunction(), which permits users to push a new function into the H2O JVM, do a fine job of providing some ddply() features to H2O data sets.

The following code loads one year of the airlines data set from my hard drive into the H2O instance, gives me the dimensions of the data, and lets me know what variables I have.

path <- "C:/DATA/Airlines_87_08/2008.csv"
air2008.hex <- h2o.uploadFile(localH2O, path = path,key="air2008")
dim(air2008.hex)
[1] 7009728 29


Then, using h2o.addFunction(), define a function to compute the average departure delay, and create a new H2O data set without DepDelay missing values that would otherwise blow up the added function.

# Define function to compute an average for column 16
fun = function(df) { sum(df[,16])/nrow(df) }
h2o.addFunction(localH2O, fun)  # Push the function to H2O
# Filter out missing values
air2008.filt = air2008.hex[!is.na(air2008.hex$DepDelay),]

Finally, run h2o.ddply() to get average departure delay by day of the week and pull down the results from H2O.

airlines.ddply = h2o.ddply(air2008.filt, "DayOfWeek", fun)

  DayOfWeek C1
1 2         8.976897
2 6         8.645681
3 7        11.568973
4 4         9.772897
5 1        10.269990
6 5        12.158036
7 3         8.289761

Exactly what you would expect!
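For comparison, here is the same filter-then-group-average pattern in base R on a small in-memory data frame. This is only a sketch: the flights data frame and its values are made up for illustration, and aggregate() stands in for what h2o.ddply() does on the H2O side.

```r
# Toy stand-in for the airline data; the values are invented.
flights <- data.frame(
  DayOfWeek = c(1, 1, 2, 2, 3, 3),
  DepDelay  = c(10, NA, 4, 6, 8, 12)
)

# Drop rows with a missing delay, as in the filtering step described above.
flights_filt <- flights[!is.na(flights$DepDelay), ]

# Group by DayOfWeek and average DepDelay within each group.
avg_delay <- aggregate(DepDelay ~ DayOfWeek, data = flights_filt, FUN = mean)
avg_delay
```

The difference with H2O is where the work happens: here everything is in R's memory, whereas h2o.ddply() runs the pushed function inside the H2O JVM and R only retrieves the small result.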

Having h2o.ddply() restricted to functions that can be pushed to H2O may seem limiting to some. However, in the context of working with huge data sets I don't see this as a problem. Presumably the real data cleaning and preparation will be accomplished by other tools that are appropriate for the environment (e.g. Hadoop) where the data resides. In a future post, I hope to more closely examine H2O's machine learning algorithms. As it stands, from an R perspective H2O appears to be an impressive accomplishment and a welcome addition to the open source world.

by Joseph Rickert at April 16, 2014 09:13 PM


New package BEQI2 with initial version 1.0-0

Package: BEQI2
Type: Package
Title: Benthic Ecosystem Quality Index 2
Description: Benthic Ecosystem Quality Index 2. This package facilitates the analysis of benthos data. It estimates several quality indices like the total abundance of species, species richness, Shannon index, AZTI Marine Biotic Index (AMBI), and the BEQI-2 index. The package includes two additional optional features that enhance data preprocessing: (1) genus to species conversion, i.e.,taxa counts at the taxonomic genus level can optionally be converted to the species level and (2) pooling: small samples are combined to bigger samples with a standardized size to (a) meet the data requirements of the AMBI, (b) generate comparable species richness values and (c) give a higher benthos signal to noise ratio.
Version: 1.0-0
Date: 2014-04-16
Authors@R: c(person(given = "Willem", family = "van Loon", email = "", role = c("aut", "cph")), person(given = "Dennis", family = "Walvoort", email = "", role = c("aut", "cre")))
Depends: R (>= 3.0.2), methods, tcltk
Imports: knitr, markdown, RJSONIO, xtable, plyr, reshape2
Suggests: testthat
VignetteBuilder: knitr
License: GPL (>= 3)
Packaged: 2014-04-16 14:38:31 UTC; dennis
Author: Willem van Loon [aut, cph], Dennis Walvoort [aut, cre]
Maintainer: Dennis Walvoort
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-16 22:50:45

More information about BEQI2 at CRAN

April 16, 2014 09:12 PM

New package RHive with initial version 2.0-0.0

Package: RHive
Type: Package
Title: R and Hive
Version: 2.0-0.0
Description: RHive is an R extension facilitating distributed computing via HIVE query. It provides an easy to use HQL like SQL and R objects and functions in HQL.
Author: NexR
Maintainer: Johan Ahn
License: Apache License (== 2.0)
Depends: R (>= 2.13.0), rJava (>= 0.9-0)
Suggests: RUnit
SystemRequirements: Hadoop core >= 0.20.3 (, Hive >= 0.8 (
OS_type: unix
Repository: CRAN
Packaged: 2014-04-15 01:26:27 UTC; root
NeedsCompilation: no
X-CRAN-Comment: Archived 2013-08-07 as the former maintainer withdrew maintainership and the nominated new maintainer did never respond to requests to update the package.
Date/Publication: 2014-04-16 09:21:53

More information about RHive at CRAN

April 16, 2014 09:12 AM

New package bigalgebra with initial version 0.8.4

Package: bigalgebra
Version: 0.8.4
Date: 2014-04-15
Title: BLAS routines for native R matrices and big.matrix objects.
Author: Michael J. Kane, Bryan Lewis, and John W. Emerson
Maintainer: Michael J. Kane
Imports: methods
Depends: bigmemory (>= 4.0.0)
LinkingTo: bigmemory, BH
Description: This package provides arithmetic functions for R matrix and big.matrix objects.
License: LGPL-3 | Apache License 2.0
Copyright: (C) 2014 Michael J. Kane, Bryan Lewis, and John W. Emerson
LazyLoad: yes
Packaged: 2012-12-26 22:17:06 UTC; blewis
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-04-16 09:21:50

More information about bigalgebra at CRAN

April 16, 2014 09:12 AM

April 15, 2014


New package HiLMM with initial version 1.0

Package: HiLMM
Type: Package
Title: Estimation of heritability in high dimensional Linear Mixed Models
Version: 1.0
Date: 2014-04-10
Author: Anna Bonnet
Maintainer: Anna Bonnet
Description: HiLMM provides estimation of heritability with confidence intervals in linear mixed models.
License: GPL-2
Packaged: 2014-04-15 13:09:37 UTC; levyleduc
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-15 18:21:12

More information about HiLMM at CRAN

April 15, 2014 05:12 PM

New package StereoMorph with initial version 1.0

Package: StereoMorph
Title: Stereo Camera Calibration and Reconstruction
Description: StereoMorph provides functions for the collection of 3D points and curves using a stereo camera setup.
Version: 1.0
Depends: grid
Suggests: rgl
Author: Aaron Olsen, Annat Haber
Maintainer: Aaron Olsen
License: GPL-2
Packaged: 2014-04-15 03:34:20 UTC; aaron
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-15 15:47:06

More information about StereoMorph at CRAN

April 15, 2014 03:13 PM

New package MPCI with initial version 1.0.6

Package: MPCI
Type: Package
Title: Multivariate Process Capability Indices (MPCI)
Version: 1.0.6
Date: 2014-04-13
Author: Edgar Santos-Fernandez, Michele Scagliarini.
Maintainer: Edgar Santos-Fernandez
Description: MPCI package performs the following Multivariate Process Capability Indices: Shahriari et al. (1995) Multivariate Capability Vector, Taam et al. (1993) Multivariate Capability Index (MCpm), Pan and Lee (2010) proposal (NMCpm) and the following based on Principal Component Analysis (PCA): Wang and Chen (1998), Xekalaki and Perakis (2002) and Wang (2005). Moreover, it includes two datasets.
License: GPL (>= 2)
LazyLoad: yes
Packaged: 2014-04-14 23:18:37 UTC; esantosf
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-15 15:31:19

More information about MPCI at CRAN

April 15, 2014 03:13 PM

New package Johnson with initial version 1.4

Package: Johnson
Type: Package
Title: Johnson Transformation
Version: 1.4
Date: 2014-04-15
Author: Edgar Santos Fernandez
Maintainer: Edgar Santos Fernandez
Description: RE.Johnson performs the Johnson Transformation to increase the normality.
License: GPL (>= 2)
LazyLoad: yes
Packaged: 2014-04-14 23:37:38 UTC; esantosf
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-15 15:55:00

More information about Johnson at CRAN

April 15, 2014 03:13 PM

Dirk Eddelbuettel

BH release 1.54.0-2

Yesterday's release of RcppBDT 0.2.3 led to an odd build error. If one used at the same time a 32-bit OS, a compiler as recent as g++ 4.7 and the Boost 1.54.0 headers (directly or via the BH package) then the file lexical_cast.hpp barked and failed to compile for lack of a 128-bit integer (which is not a surprise on a 32-bit OS).

After looking at this for a bit, and looking at a related bug report, I came up with a simple fix (which I mentioned in an update to the RcppBDT 0.2.3 release post). Sleeping over it, and comparing to the Boost 1.55 file, showed that the hunch was right, and I have since made a new release 1.54.0-2 of the BH package which contains the fix.

Changes in version 1.54.0-2 (2014-04-14)

  • Bug fix to lexical_cast.hpp which now uses the test for INT128 which the rest of Boost uses, consistent with Boost 1.55 too.

Courtesy of CRANberries, there is also a diffstat report for the most recent release.

Comments and suggestions are welcome via the mailing list or issue tracker at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

April 15, 2014 01:47 AM

April 14, 2014


New package VAR.etp with initial version 0.1

Package: VAR.etp
Type: Package
Title: VAR modelling: estimation, testing, and prediction
Version: 0.1
Date: 2014-04-14
Author: Jae. H. Kim
Maintainer: Jae H. Kim
Description: Estimation, Hypothesis Testing, Prediction for Stationary Vector Autoregressive Models
License: GPL-2
Packaged: 2014-04-14 20:21:51 UTC; jkim
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-15 00:32:09

More information about VAR.etp at CRAN

April 14, 2014 11:13 PM


Quantitative Finance Applications in R - 5: an Introduction to Monte Carlo Simulation

by Daniel Hanson

Last time, we looked at the four-parameter Generalized Lambda Distribution, as a method of incorporating skew and kurtosis into an estimated distribution of market returns, and capturing the typical fat tails that the normal distribution cannot.  Having said that, however, the Normal distribution can be useful in constructing Monte Carlo simulations, and it is still commonly found in applications such as calculating the Value at Risk (VaR) of a portfolio, pricing options, and estimating the liabilities in variable annuity contracts. 

We will start here with a simple example using R, focusing on a single security.  Although perhaps seemingly trivial, this lays the foundation for more complex cases such as multiple correlated securities and stochastic interest rates.  Discussion of these topics is planned for articles to come, as well as topics in option pricing.

Single Security Example

Under the oft-used assumption of Brownian Motion dynamics, the return of a single security (e.g., an equity) over a period of time Δt is approximately [See Pelsser for example.]

μΔt + σZ・sqrt(Δt)                                                                (*)

where μ is the mean annual return of the equity (also called the drift), and σ is its annualized volatility (i.e., standard deviation).  Z is a standard Normal random variable, which makes the second term in the expression stochastic.  The time t is measured in units of years, so for quarterly returns, for example, Δt = 0.25.

As μ, σ, and Δt are all known values, generating a simulated distribution of returns is a simple task.  As an example, suppose we are interested in constructing a distribution of quarterly returns, where μ = 10% and σ= 15%.  In order to get a reasonable approximation of the distribution, we will generate n = 10,000 returns. 

n <- 10000
# Fixing the seed gives us a consistent set of simulated returns

z <- rnorm(n)        # mean = 0 and sd = 1 are defaults
mu <- 0.10
sd <- 0.15
delta_t <- 0.25
# apply to expression (*) above
qtr_returns <- mu*delta_t + sd*z*sqrt(delta_t)  

Note that R is “smart enough” here to add the scalar mu*delta_t to each element of the vector in the second term, thus giving us a set of 10,000 simulated returns. Finally, let’s check our results.  First, we plot a histogram:

hist(qtr_returns, breaks = 100, col = "green")

This gives us the following:


The symmetric bell shape of the histogram is consistent with the Normal assumption.  Checking the annualized mean and volatility of the simulated returns,

stats <- c(mean(qtr_returns) * 4, sd(qtr_returns) * 2)   # sqrt(4)
names(stats) <- c("mean", "volatility")

We get:

      mean    volatility
0.09901252    0.14975805

which is very close to our original parameter settings of μ = 10% and σ= 15%.
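How close is "very close"? A quick sketch, under the same parameter settings, shows why the mean is scaled by 4 and the volatility by sqrt(4) = 2, and puts a standard error on the annualized mean. The seed value and the variable names here are arbitrary choices for illustration, so the simulated numbers will not match the ones printed above.

```r
set.seed(1)   # arbitrary seed, used only so the sketch is reproducible
n <- 10000
mu <- 0.10
sigma <- 0.15
delta_t <- 0.25

# Simulated quarterly returns from expression (*)
qtr <- mu * delta_t + sigma * sqrt(delta_t) * rnorm(n)

periods <- 1 / delta_t                 # 4 quarters per year
ann_mean <- mean(qtr) * periods        # means scale linearly with time
ann_vol  <- sd(qtr) * sqrt(periods)    # volatility scales with sqrt(time)

# Standard error of the annualized mean:
# periods * sigma * sqrt(delta_t) / sqrt(n) = sigma * sqrt(periods / n)
se_ann_mean <- sigma * sqrt(periods / n)

c(mean = ann_mean, volatility = ann_vol, se_mean = se_ann_mean)
```

With these settings the standard error of the annualized mean is 0.003, so the simulated mean of 0.09901 reported above sits well within one standard error of the 10% target.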

Again, this is a rather simple example, but in future discussions, we will see how it extends to using Monte Carlo simulation for option pricing and risk management models.

by Joseph Rickert at April 14, 2014 08:50 PM


New package OceanView with initial version 1.0

Package: OceanView
Version: 1.0
Title: Visualisation of Oceanographic Data and Model Output.
Author: Karline Soetaert
Maintainer: Karline Soetaert
Depends: plot3D, plot3Drgl, R (>= 2.10)
Imports: shape
Description: Functions for transforming and viewing 2-D and 3-D (oceanographic) data and model output.
License: GPL (>= 3.0)
LazyData: yes
Repository: CRAN
Repository/R-Forge/Project: plot3d
Repository/R-Forge/Revision: 80
Repository/R-Forge/DateTimeStamp: 2014-04-11 08:12:26
Date/Publication: 2014-04-14 19:18:54
Packaged: 2014-04-11 10:18:52 UTC; rforge
NeedsCompilation: yes

More information about OceanView at CRAN

April 14, 2014 07:12 PM

New package LumiWCluster with initial version 1.0.2

Package: LumiWCluster
Version: 1.0.2
Date: 2010-09-02
Title: Weighted model based clustering
Author: Pei Fen Kuan , Xin Zhou
Maintainer: Xin Zhou
Depends: R (>= 2.10.0)
Description: The LumiWCluster package implements the proposed weighted model based clustering for Illumina Methylation BeadArray (Kuan et al., 2010). There are two parts to this package. The first part provides function to normalize the methylation data from the Illumina GoldenGate platform. We will include the normalization for Illumina Infinium platform in future. The second part provides function to identify subgroups with distinct methylation profiles in Illumina BeadArray. It works for both Infinium and GoldenGate platform, as well as any other platforms including gene expression arrays. The core of the algorithm is based on the paper by Wang and Zhu (2008), which automatically selects important CpGs/probes. LumiWCluster incorporates the detection p-values (a quality measure of probe performance reported by Illumina BeadStudio or GenomeStudio) systematically in the clustering algorithm and avoids arbitrary cutoff for excluding unreliable probes. Further details and motivation can be found in Kuan et al. (2010).
License: GPL (>= 2)
Packaged: 2014-04-13 23:11:57 UTC; peifenkuan
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-04-14 18:40:03

More information about LumiWCluster at CRAN

April 14, 2014 05:12 PM

New package DivE with initial version 1.0

Package: DivE
Type: Package
Title: Diversity Estimator
Version: 1.0
Date: 2014-04-01
Author: Daniel Laydon, Aaron Sim, Charles Bangham, Becca Asquith
Maintainer: Daniel Laydon
Depends: deSolve, FME, rgeos, sp, R (>= 2.15.3)
Description: R-package DivE contains functions for the DivE estimator (Laydon, D. et al., Quantification of HTLV-1 clonality and TCR diversity, PLOS Comput. Biol. 2014). The DivE estimator is a heuristic approach to estimate the number of classes or the number of species (species richness) in a population.
License: GPL (>= 2)
LazyData: TRUE
Packaged: 2014-04-14 12:24:21 UTC; aaronsim
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-14 19:07:04

More information about DivE at CRAN

April 14, 2014 05:12 PM

New package compositions with initial version 1.40-0

Package: compositions
Version: 1.40-0
Date: 2013-04-08
Title: Compositional Data Analysis
Author: K. Gerald van den Boogaart , Raimon Tolosana, Matevz Bren
Maintainer: K. Gerald van den Boogaart
Depends: R (>= 2.2.0), tensorA, robustbase, energy, bayesm
Suggests: rgl
Description: The package provides functions for the consistent analysis of compositional data (e.g. portions of substances) and positive numbers (e.g. concentrations) in the way proposed by Aitchison and Pawlowsky-Glahn.
License: GPL (>= 2)
Packaged: 2014-04-12 09:04:49 UTC; boogaart
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-04-14 18:36:22

More information about compositions at CRAN

April 14, 2014 05:12 PM


Interfacing R with Web technologies

A new Task View on CRAN will be of interest to anyone who needs to connect R with Web-based applications. The Web Technologies and Services Task View lists R functions and packages for reading data from websites (via public APIs or by scraping data from HTML pages); for interfacing with Cloud-based platforms (including AWS); for authenticating and accessing data from social media services (including Twitter and Facebook); and for integrating with Web frameworks for building your own Web-aware applications with R. It also includes an extensive list of web-based data sources you can access with R.

This useful guide is maintained by Scott Chamberlain, Karthik Ram, Christopher Gandrud, and Patrick Mair. If you have suggestions for other Web-related packages that should be included, you can submit a pull request to the webservices task view project on Github.

CRAN Task Views: Web Technologies and Services

by David Smith at April 14, 2014 05:07 PM


New package rHealthDataGov with initial version 1.0.0

Package: rHealthDataGov
Type: Package
Title: Retrieve data sets from the data API
Version: 1.0.0
Date: 2014-04-13
Author: Erin LeDell
Maintainer: Erin LeDell
Description: An R interface for the data API. For each data resource, you can filter results (server-side) to select subsets of data.
License: GPL-2
Depends: R (>= 3.0.1), bit64, httr, jsonlite
LazyLoad: yes
LazyData: yes
Packaged: 2014-04-14 00:08:14 UTC; ledell
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-14 14:14:26

More information about rHealthDataGov at CRAN

April 14, 2014 01:14 PM

New package mdatools with initial version 0.5.1

Package: mdatools
Title: Multivariate data analysis for chemometrics
Version: 0.5.1
Date: 2014-04-13
Author: Sergey Kucheryavskiy
Maintainer: Sergey Kucheryavskiy
Description: The package implements projection based methods for preprocessing, exploring and analysis of multivariate data used in chemometrics.
License: GPL-3
Packaged: 2014-04-14 06:03:36 UTC; svkucheryavski
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-14 14:13:42

More information about mdatools at CRAN

April 14, 2014 01:13 PM

New package ibs with initial version 1.0

Package: ibs
Type: Package
Title: Integrated B-spline
Version: 1.0
Date: 2014-04-12
Author: Feng Chen
Maintainer: Feng Chen
Description: Calculate B-spline basis functions with a given set of knots and order, or a B-spline function with a given set of knots and order and set of de Boor points (coefficients), or the integral of a B-spline function.
License: GPL (>= 2)
Packaged: 2014-04-13 22:43:15 UTC; z3243864
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-04-14 14:12:30

More information about ibs at CRAN

April 14, 2014 01:13 PM

New package binda with initial version 1.0.0

Package: binda
Version: 1.0.0
Date: 2014-04-14
Title: Multi-Class Discriminant Analysis using Binary Predictors
Author: Sebastian Gibb and Korbinian Strimmer.
Maintainer: Korbinian Strimmer
Depends: R (>= 2.15.1), entropy (>= 1.2.0)
Suggests: MASS
Description: The "binda" package implements functions for multi-class discriminant analysis using binary predictors, for corresponding variable selection, and for dichotomizing continuous data.
License: GPL (>= 3)
Packaged: 2014-04-14 00:26:52 UTC; strimmer
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-14 14:54:59

More information about binda at CRAN

April 14, 2014 01:13 PM

Statistical Modelling

Modelling multivariate, overdispersed binomial data with additive and multiplicative random effects

When modelling multivariate binomial data, it often occurs that it is necessary to take into consideration both clustering and overdispersion, the former arising from the dependence between data, and the latter due to the additional variability in the data not prescribed by the distribution. If interest lies in accommodating both phenomena at the same time, we can use separate sets of random effects that capture the within-cluster association and the extra variability. In particular, the random effects for overdispersion can be included in the model either additively or multiplicatively. For this purpose, we propose a series of Bayesian hierarchical models that deal simultaneously with both phenomena. The proposed models are applied to bivariate repeated prevalence data for hepatitis C virus (HCV) and human immunodeficiency virus (HIV) infection in injecting drug users in Italy from 1998 to 2007.

by Del Fava, E., Shkedy, Z., Aregay, M., Molenberghs, G. at April 14, 2014 05:48 AM

A parametric time series model with covariates for integers in Z

While models for integer valued time series are now abundant, there is a shortage of similar models when the time series refer to data defined on Z, i.e., in both the positive and negative integers. Such data occur in certain disciplines and the need for such models also appears when taking differences of a positive integer count time series. In addition one would often like to include covariates to explain variations in the variable of interest. In this article we construct a model doing all of these, assuming a specific innovation distribution, and provide fully parametric inference, including prediction. Real data applications on accidents and financial returns are given. Finally we also discuss alternative models and extensions.

by Andersson, J., Karlis, D. at April 14, 2014 05:48 AM

Regularization and model selection with categorical predictors and effect modifiers in generalized linear models

Varying-coefficient models with categorical effect modifiers are considered within the framework of generalized linear models. We distinguish between nominal and ordinal effect modifiers, and propose adequate Lasso-type regularization techniques that allow for (1) selection of relevant covariates, and (2) identification of coefficient functions that are actually varying with the level of a potentially effect modifying factor. For computation, a penalized iteratively reweighted least squares algorithm is presented. We investigate large sample properties of the penalized estimates; in simulation studies, we show that the proposed approaches perform very well for finite samples, too. In addition, the presented methods are compared with alternative procedures, and applied to real-world data.

by Oelker, M.-R., Gertheiss, J., Tutz, G. at April 14, 2014 05:48 AM

Optimal information in authentication of food and beverages

Food and beverage authentication is the process by which food or beverages are verified as complying with their label descriptions (Winterhalter, 2007). A common way to deal with an authentication process is to measure attributes, such as, groups of chemical compounds on samples of food, and then use these as input for a classification method. In many applications there may be several types of measurable attributes. An important problem thus consists of determining which of these would provide the best information, in the sense of achieving the highest possible classification accuracy at low cost. We approach the problem under a decision theoretic strategy, by framing it as the selection of an optimal test (Geisser and Johnson, 1992) or as the optimal dichotomization of screening tests variables (Wang and Geisser, 2005), where the ‘test’ is defined through a classification model applied to different groups of chemical compounds. The proposed methodology is motivated by data consisting of measurements of 19 chemical compounds (Anthocyanins, Organic Acids and Flavonols) on samples of Chilean red wines. The main goal is to determine the combination of chemical compounds that provides the best information for authentication of wine varieties, considering the losses associated to wrong decisions and the cost of the chemical analysis. The proposed methodology performs well on simulated data, where the best combination of responses is known beforehand.

by Gutierrez, L., Quintana, F. A. at April 14, 2014 05:48 AM

Dirk Eddelbuettel

RcppBDT 0.2.3

A new release of the RcppBDT package is now on CRAN.

Several new modules were added; the package can now work on dates, date durations, "ptime" (aka posix time), and timezones. Most interesting may be the fact that ptime is configured to use 96 bits. This allows a precise representation of dates and times down to nanoseconds, and permits date and time calculations at this level.
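To see why a wider representation matters at this resolution, here is a small base-R sketch. It is background added here, not taken from RcppBDT: it does not use the package at all and only shows that a date-time stored as a double count of seconds since 1970 (which is how R's own POSIXct stores it) cannot resolve a single nanosecond for a 2014 timestamp.

```r
# Seconds since 1970-01-01 UTC for the release date mentioned above
t0 <- as.numeric(as.POSIXct("2014-04-13 00:00:00", tz = "UTC"))

# One nanosecond later, stored as a double
t1 <- t0 + 1e-9

# The added nanosecond is lost to rounding, so the two values compare equal
print(t1 == t0)

# Spacing between adjacent doubles near t0, in seconds
spacing <- 2^(floor(log2(t0)) - 52)
print(spacing)
```

The spacing comes out at a few tenths of a microsecond, so anything finer than that needs more bits than a double provides.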

The complete NEWS entry is below:

Changes in version 0.2.3 (2014-04-13)

  • New module 'bdtDt' replacing the old 'bdtDate' module in a more transparent style using a local class which is wrapped, just like the three other new classes do

  • New module 'bdtTd' providing date durations which can be added to dates.

  • New module 'bdtTz' providing time zone information such as offset to UTC, amount of DST, abbreviated and full timezone names.

  • New module 'bdtDu' using 'posix_time::duration' for time duration types

  • New module 'bdtPt' using 'posix_time::ptime' for posix time, down to nanosecond granularity (where hardware and OS permit it)

  • Now selects C++11 compilation by setting CXX_STD = CXX11 in src/Makevars* and hence depends on R 3.1.0 or later – this gives us long long needed for the nano-second high-resolution time calculations across all builds and platforms.

Courtesy of CRANberries, there is also a diffstat report for the latest release. As always, feedback is welcome and the rcpp-devel mailing list off the R-Forge page for Rcpp is the best place to start a discussion.

Update: I just learned the hard way that the combination of 32-bit OS, g++ at version 4.7 or newer and a Boost version of 1.53 or 1.54 does not work with this new upload. Some Googling suggests that this ought to be fixed in Boost 1.54; seemingly it isn't as our trusted BH package with Boost headers provides that very version 1.54. However, the Googling also suggested a quick two-line fix which I just committed in the Github repo. A new BH package with the fix may follow in a few days.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

April 14, 2014 12:37 AM

April 13, 2014


New package LOGICOIL with initial version 0.99.0

Type: Package
Version: 0.99.0
Date: 2014-04-12
Title: LOGICOIL: multi-state prediction of coiled-coil oligomeric state.
Author: Thomas L. Vincent , Peter J. Green and Derek N. Woolfson
Maintainer: Thomas L. Vincent
Depends: R (>= 2.12), nnet
LazyData: true
ZipData: no
License: GPL (>= 2)
Description: This package contains the functions necessary to run the LOGICOIL algorithm. LOGICOIL can be used to differentiate between antiparallel dimers, parallel dimers, trimers and higher-order coiled-coil sequence. By covering >90 percent of the known coiled-coil structures, LOGICOIL is a net improvement compared with other existing methods, which achieve a predictive coverage of around 31 percent of this population. As such, LOGICOIL is particularly useful for researchers looking to characterize novel coiled-coil sequences or studying coiled-coil containing protein assemblies. It may also be used to assist in the structural characterization of synthetic coiled-coil sequences.
Repository: CRAN
Packaged: 2014-04-13 21:36:51 UTC; ThomasVincent
NeedsCompilation: no
Date/Publication: 2014-04-14 00:04:00

More information about LOGICOIL at CRAN

April 13, 2014 11:12 PM

New package freqweights with initial version 0.0.1

Package: freqweights
Type: Package
Title: Working with frequency tables
Version: 0.0.1
Date: 2014-04-03
Author: Emilio Torres-Manzanera
Maintainer: Emilio Torres-Manzanera
Description: The frequency of a particular data value is the number of times it occurs. A frequency table is a table of values with their corresponding frequencies. Frequency weights are integer numbers that indicate how many cases each case represents. This package provides some functions to work with such type of collected data.
License: GPL-3
Imports: plyr, data.table, dplyr, biglm, fastcluster
Suggests: MASS, hflights, cluster, ggplot2
Packaged: 2014-04-13 00:49:45 UTC; emilio
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-14 00:05:17

More information about freqweights at CRAN

April 13, 2014 11:12 PM

New package FisHiCal with initial version 1.0

Package: FisHiCal
Type: Package
Title: Iterative FISH-based Calibration of Hi-C Data
Version: 1.0
Date: 2014-04-09
Author: Yoli Shavit, Fiona Kathryn Hamey and Pietro Lio'
Maintainer: Yoli Shavit
Description: FisHiCal integrates Hi-C and FISH data, offering a modular and easy-to-use tool for chromosomal spatial analysis.
License: GPL
Depends: R (>= 3.0.2), igraph, RcppArmadillo (>=
Suggests: rgl
LinkingTo: Rcpp (>= 0.11.1), RcppArmadillo (>=
Packaged: 2014-04-13 18:55:42 UTC; ys388
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-04-13 23:26:01

More information about FisHiCal at CRAN

April 13, 2014 11:12 PM

New package FGSG with initial version 1.0

Package: FGSG
Title: Feature grouping and selection over an undirected graph
Version: 1.0
Author: Xiaotong Shen, Yiwen Sun, Julie Langou
Maintainer: Yiwen Sun
Description: FGSG package implements algorithms for feature grouping and selection over an undirected graph.
License: GPL-2
Note: The header file blaswrap.h, f2c.h and fgsg.h are from the VisualStudio library created by Julie Langou.
Packaged: 2014-04-13 17:25:11 UTC; sunxx847
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-04-13 23:04:06

More information about FGSG at CRAN

April 13, 2014 09:12 PM

New package anominate with initial version 0.3

Package: anominate
Type: Package
Title: alpha-NOMINATE Ideal Point Estimator
Version: 0.3
Date: 2014-04-11
Author: Christopher Hare , Jeffrey Lewis , Keith Poole , Howard Rosenthal , Royce Carroll , James Lo
Maintainer: Christopher Hare
Depends: wnominate, oc, pscl, MCMCpack
Description: Fits ideal point model described in Carroll, Lewis, Lo, Poole and Rosenthal, "The Structure of Utility in Models of Spatial Voting," American Journal of Political Science 57(4): 1008--1028.
License: GPL-2
Packaged: 2014-04-13 01:45:17 UTC; Christopher
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-04-13 19:57:16

More information about anominate at CRAN

April 13, 2014 07:12 PM

New package helsinki with initial version 0.9.12

Package: helsinki
Type: Package
Title: Helsinki open data R tools
Version: 0.9.12
Date: 2014-04-13
Author: Juuso Parkkinen, Leo Lahti, Joona Lehtomaki
Maintainer: Juuso Parkkinen
Description: Tools for accessing various open data sources in the Helsinki region in Finland. Current data sources include the Real Estate Department and the Environmental Services Authority.
License: BSD_2_clause + file LICENSE
Depends: R (>= 3.0.1), rjson, RCurl, maptools, utils
LazyLoad: yes
Packaged: 2014-04-13 07:32:03 UTC; juusoparkkinen
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-13 16:05:54

More information about helsinki at CRAN

April 13, 2014 03:12 PM

April 12, 2014


New package hasseDiagram with initial version 0.1

Package: hasseDiagram
Type: Package
Title: Drawing Hasse diagram
Version: 0.1
Date: 2014-02-13
Author: Krzysztof Ciomek
Maintainer: Krzysztof Ciomek
Depends: Rgraphviz (>= 2.6.0), grid (>= 3.0.2)
Description: Drawing Hasse diagram - visualization of transitive reduction of a finite partially ordered set.
License: MIT + file LICENSE
Packaged: 2014-04-12 10:09:02 UTC; Krzysztof
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-12 23:27:47

More information about hasseDiagram at CRAN

April 12, 2014 11:12 PM

New package EntropyEstimation with initial version 0.1

Package: EntropyEstimation
Type: Package
Title: Tools for the estimation of entropy and related quantities.
Version: 0.1
Date: 2014-04-06
Author: Lijuan Cao and Michael Grabchak
Maintainer: Michael Grabchak
Description: This package contains methods for the estimation of Shannon's entropy, variants of Renyi's entropy, mutual information, and Kullback-Leibler divergence. The estimators used have a bias that decays exponentially fast.
License: GPL (>= 3)
Packaged: 2014-04-12 04:06:04 UTC; fionacao
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-04-13 00:34:45

More information about EntropyEstimation at CRAN

April 12, 2014 11:12 PM

April 11, 2014


Because it's Friday: About a Company

You've probably seen dozens of those brand videos from big corporations, carefully designed to make you feel good about how wonderful the company is. This one is just like all the others, except that it's based on a satirical poem by Kendra Eash and made entirely of stock video footage (which you can buy from the company that made the video, which is rather meta).


That's all from us this week. Enjoy the fine weekend, and we'll see you back here on Monday.

by David Smith at April 11, 2014 11:34 PM

Bioconductor Project Working Papers

Nonparametric Identifiability of Finite Mixture Models with Covariates for Estimating Error Rate without a Gold Standard

Finite mixture models provide a flexible framework to study unobserved entities and have arisen in many statistical applications. The flexibility of these models in adapting to various complicated structures makes it crucial to establish model identifiability when applying them in practice to ensure study validity and interpretation. However, research to establish the identifiability of finite mixture models is limited and is usually restricted to a few specific model configurations. Conditions for model identifiability in the general case have not been established. In this paper, we provide conditions for both local identifiability and global identifiability of a finite mixture model. The former is based on the Jacobian matrix of the model, and the latter is based on the decomposition of a three-way contingency table. The results are derived for a general finite mixture model, which allows for continuous, discrete or mix-typed manifest variables, ordinal or nominal latent groups, and flexible inclusion of covariates. We also provide an intuitive explanation of the conditions and discuss the effect of including covariates in the model.

by Zheyu Wang et al. at April 11, 2014 09:16 PM


New package rorutadis with initial version 0.1.1

Package: rorutadis
Type: Package
Title: Robust Ordinal Regression UTADIS
Version: 0.1.1
Date: 2014-04-10
Author: Krzysztof Ciomek
Maintainer: Krzysztof Ciomek
Depends: Rglpk (>= 0.5-1), ggplot2 (>=, gridExtra (>= 0.9.1)
Description: Implementation of Robust Ordinal Regression for value-based sorting with some extensions and additional tools. It is a novel Multiple-Criteria Decision Aiding (MCDA) framework.
License: GPL-3
Suggests: testthat (>= 0.7.1)
Packaged: 2014-04-11 09:47:35 UTC; Krzysztof
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-11 22:44:27

More information about rorutadis at CRAN

April 11, 2014 09:13 PM

New package gRapfa with initial version 1.0

Package: gRapfa
Type: Package
Title: Acyclic Probabilistic Finite Automata
Version: 1.0
Date: 2014-04-10
Author: Smitha Ankinakatte and David Edwards
Maintainer: Smitha Ankinakatte
Description: gRapfa is for modelling discrete longitudinal data using acyclic probabilistic finite automata (APFA). The package contains functions for constructing APFA models from a given data using penalized likelihood methods. For graphical display of APFA models, gRapfa depends on 'igraph package'. gRapfa also contains an interface function to Beagle software that implements an efficient model selection algorithm.
License: GPL (>= 2)
Depends: R (>= 3.0.2), igraph
Packaged: 2014-04-11 12:02:26 UTC; SMAA
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-11 22:47:57

More information about gRapfa at CRAN

April 11, 2014 09:12 PM

New package CircE with initial version 1.0

Package: CircE
Version: 1.0
Date: 2014/3/4
Title: Circumplex models Estimation
Author: Michele Grassi
Maintainer: Michele Grassi
Depends: R (>= 1.6.0), stats
Description: This package contains functions for fitting circumplex structural models for correlation matrices (with negative correlation) by the method of maximum likelihood.
License: GPL (>= 2)
LazyLoad: true
Packaged: 2014-04-11 16:11:21 UTC; michelegrassi
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-11 22:52:57

More information about CircE at CRAN

April 11, 2014 09:12 PM


Create an impressionist self-portrait from your Twitter followers

Here's something fun you can do with R and its interface to Twitter, the twitteR package. An R script by CMU student Mark Patterson downloads your Twitter profile picture, counts the number of Twitter followers you have, and then creates a pointillist version of your profile picture with as many dots as you have followers. Here's mine:


Note that to use Mark's script you will need to install the Bioconductor package EBImage (follow the instructions in yellow on that page) and create a Twitter app and authenticate it. Once you've got those set up, call general.func("yourTwitterHandle") to create your own!

Decisions and R: Visualizing Twitter Followers Using Pointillism

by David Smith at April 11, 2014 06:30 PM


New package Rcpp11 with initial version 3.1.0

Package: Rcpp11
Title: R and C++11
Version: 3.1.0
Date: 2014-04-10
Authors@R: c( person("Romain", "Francois", role = c("aut", "cre"), email = ""), person("Kevin", "Ushey", role = "aut", email = ""), person("John", "Chambers", role = "ctb", email = "") )
Description: R and C++11
Depends: R (>= 3.1.0)
License: MIT + file LICENSE
SystemRequirements: C++11
Packaged: 2014-04-10 13:08:01 UTC; romain
Author: Romain Francois [aut, cre], Kevin Ushey [aut], John Chambers [ctb]
Maintainer: Romain Francois
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-11 10:46:36

More information about Rcpp11 at CRAN

April 11, 2014 09:12 AM

New package rvTDT with initial version 1.0

Package: rvTDT
Type: Package
Title: population control weighted rare-variants TDT
Version: 1.0
Date: 2014-04-07
Author: Yu Jiang, Andrew S. Allen
Maintainer: Yu Jiang
Description: Used to compute population controls weighted rare variants transmission distortion test
License: GPL-3
Depends: CompQuadForm (>= 1.4.1)
Packaged: 2014-04-10 20:16:07 UTC; yujiang
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-04-11 07:52:45

More information about rvTDT at CRAN

April 11, 2014 07:12 AM

New package extWeibQuant with initial version 1.0

Package: extWeibQuant
Type: Package
Title: Estimate the lower extreme quantile with the censored Weibull MLE and censored Weibull Mixture
Version: 1.0
Date: 2014-04-09
Author: Yang (Seagle) Liu
Maintainer: Yang (Seagle) Liu
Description: This package implements the subjectively censored Weibull MLE and censored Weibull mixture for the lower quantile estimation. It also includes functions to evaluation the standard error of the resulting quantile estimates. Also, the methods here can be used to fit the Weibull or Weibull mixture for the Type-I or Type-II right censored data.
License: GPL (>= 2)
Packaged: 2014-04-10 21:33:42 UTC; User
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2014-04-11 07:52:38

More information about extWeibQuant at CRAN

April 11, 2014 07:12 AM

Dirk Eddelbuettel

RcppCNPy 0.2.3

R 3.1.0 came out today. Among the (impressive and long as usual) list of changes is the added ability to specify CXX_STD = CXX11 in order to get C++11 (or the best available subset on older compilers). This brings a number of changes and opportunities which are frankly too numerous to be discussed in this short post. But it also permits us, at long last, to use long long integer types.

For RcppCNPy, this means that we can finally cover NumPy integer data (along with the double precision we had from the start) on all platforms. Python encodes these as an int64, and that type was unavailable (at least in 32-bit OSs) until we got long long made available to us by R. So today I made the change to depend on R 3.1.0 and to select C++11, which allowed us to free the code from a number of #ifdef tests. This all worked out swimmingly and the new package has already been rebuilt for Windows.
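As background on why 64-bit integers need care on the R side as well, the base-R sketch below shows the limits of R's two native numeric types. It is added here for illustration and says nothing about how RcppCNPy itself maps int64 values into R, which is not described in this post.

```r
# R's native integer type is 32-bit
print(.Machine$integer.max)   # 2147483647

# Going past that limit in integer arithmetic yields NA (with a warning)
overflow <- suppressWarnings(.Machine$integer.max + 1L)
print(is.na(overflow))

# Doubles represent every integer exactly only up to 2^53,
# so 2^53 + 1 rounds back to 2^53
print(2^53 + 1 == 2^53)
```

An int64 can hold values up to 2^63 - 1, well beyond both limits, so very large values cannot be carried exactly by either native R type.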

I also updated the vignette, and refreshed its look and feel. Full changes are listed below.

Changes in version 0.2.3 (2014-04-10)

  • src/Makevars now sets CXX_STD = CXX11 which also provides the long long type on all platforms, so integer file support is no longer conditional.

  • Consequently, code conditional on RCPP_HAS_LONG_LONG_TYPES has been simplified and is no longer conditional.

  • The package now depends on R 3.1.0 or later to allow this.

  • The vignette has been updated and refreshed to reflect this.

CRANberries also provides a diffstat report for the latest release. As always, feedback is welcome and the rcpp-devel mailing list off the R-Forge page for Rcpp is the best place to start a discussion.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

April 11, 2014 12:49 AM