
Sunday morning puzzle


By xi’an

(This article was originally published at Xi’an’s Og » R, and syndicated at StatsBlogs.)

A question from X validated took me quite a while to fathom, and then the solution suddenly became quite obvious:

If a sample taken from an arbitrary distribution on {0,1}⁶ is censored from its (0,0,0,0,0,0) elements, and if the marginal probabilities are known for all six components of the random vector, what is an estimate of the proportion of (missing) (0,0,0,0,0,0) elements?

Since the censoring modifies all probabilities by the same renormalisation, i.e. divides them by ρ, the probability of being different from (0,0,0,0,0,0), this probability can be estimated from the marginal probabilities of being equal to 1 in the censored sample: as the (0,0,0,0,0,0) vector has no component equal to 1, these are equal to the original and known marginal probabilities divided by ρ. The proportion of missing (0,0,0,0,0,0) elements in the original sample is then 1−ρ. Here is a short R code illustrating the approach that I wrote in the taxi home last night:

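#generate vectors
N=1e5
zprobs=c(.1,.9) #iid example: P(0)=.1, P(1)=.9
smpl=matrix(sample(0:1,6*N,replace=TRUE,prob=zprobs),ncol=6)
#censor the (0,0,0,0,0,0) rows
smpl=smpl[rowSums(smpl)>0,]
#censored frequencies of 1, each estimating p_i/rho
ps=colMeans(smpl)
#average ratio over the six components, estimating 1/rho
#(named infl rather than cor to avoid masking stats::cor)
infl=mean(ps/rep(zprobs[2],6))
#estimated original size
nrow(smpl)*infl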
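
Since the question allows for an arbitrary distribution, the same recipe can be tried on dependent components and turned into an explicit estimate of the proportion of (0,0,0,0,0,0) elements. The following is only a sketch under my own assumptions: the latent coin z, the success rates .05 and .35, the seed, and all variable names are made up for the illustration and are not part of the original question.

```r
set.seed(1) # arbitrary seed, only for reproducibility
N=1e5
# hypothetical dependent example: a latent coin z switches the
# success rate shared by the six components of each vector
z=rbinom(N,1,.5)
q=ifelse(z==1,.35,.05)
# the matrix is filled column by column, so row i uses rate q[i]
full=matrix(rbinom(6*N,1,rep(q,6)),ncol=6)
# known marginal probabilities P(X_i=1)=.5*.35+.5*.05=.2
marg=rep(.5*.35+.5*.05,6)
# proportion of all-zero vectors, unknown in practice
true0=mean(rowSums(full)==0)
# censored sample, the only data assumed available
smpl=full[rowSums(full)>0,]
# censored frequencies estimate marg/rho, so their ratio estimates 1/rho
infl=mean(colMeans(smpl)/marg)
# estimated proportion of missing (0,0,0,0,0,0) elements, 1-rho
est0=1-1/infl
# compare with the realised proportion and the exact probability
c(true=true0,estimate=est0,exact=.5*.95^6+.5*.65^6)
```

Only the known marginals and the censored rows enter est0, so the dependence between the components plays no role in the estimate itself.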

A broader question is how many values (and which values) of the sample can be removed before this recovery gets impossible (with the same amount of information).

Filed under: Books, Kids, R Tagged: conditional probability, cross validated, mathematical puzzle, R

Please comment on the article here: Xi’an’s Og » R


Source:: statsblogs.com

