
What is vtreat?

vtreat is a DataFrame processor/conditioner that prepares real-world data for supervised machine learning or predictive modeling in a statistically sound manner. vtreat takes an input DataFrame that has a specified column called "the outcome variable" (or "y"), which is the quantity to be predicted (and must not have missing values). Other input columns are possible [...]

The post What is vtreat? appeared first on StatsBlogs. [...]
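One core piece of what a tool like vtreat automates is handling missing values in predictor columns without losing the missingness signal. The following is a hand-rolled pandas sketch of that idea only, not vtreat's actual API; the column names and the toy frame are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Toy frame: one numeric predictor with missing values, plus the outcome
# column "y" (which, as the post notes, must itself be complete).
d = pd.DataFrame({"x": [1.0, np.nan, 3.0, np.nan],
                  "y": [0, 1, 0, 1]})

# vtreat-style treatment of a numeric column: add an indicator recording
# where the value was missing, then impute with the column mean, so a
# downstream model never sees NaN but still sees the missingness signal.
d["x_is_bad"] = d["x"].isna().astype(int)
d["x"] = d["x"].fillna(d["x"].mean())
```

The real library does considerably more (categorical re-encoding, impact coding, cross-validated treatment plans); this only shows the flavor of "statistically sound" conditioning.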

Wed, Aug 14, 2019 | Source: Statistics Blog

“We do so much for Sweden but it doesn’t seem to work the other way around. Sweden should focus on its real crime problem.” DT, 25 July
“The Justice Department upholds the rule of law—and we owe it to the victims and their families to carry forward the sentence imposed by our justice system.” William Barr, Attorney General, 25 July
“And you had the Nobel Prize? That’s incredible. They gave it to you for what reason?” DT, 17 July
“Why don’t they go back and help fix the totally broken and crime infested places from which they came.” DT, 14 July
“Donald Trump is [...]

Tue, Aug 13, 2019 | Source: Xian Blog

At ICML last year, Ciwan Ceylan and Michael Gutmann presented a new version of noise contrastive estimation to deal with intractable constants. While noise contrastive estimation relies upon a second independent sample to contrast with the observed sample, this approach instead uses a perturbed or noisy version of the original sample, for instance a Normal generation centred at the original datapoint. It eliminates the annoying constant by breaking the (original and noisy) samples into two groups: the probability of belonging to one group or the other then does not depend on the constant, which is a very effective trick. And [...]
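The constant-cancellation trick can be seen in a toy numpy sketch: with an unnormalised log-density and a symmetric noise kernel, the probability that a given point of the pair is the original one involves only a difference of unnormalised log-densities, so any additive constant drops out. The function names and the Gaussian example are mine, not the paper's.

```python
import numpy as np

def log_p_tilde(x, log_Z=0.0):
    # Unnormalised log-density (standard Normal target up to a constant);
    # subtracting log_Z mimics an unknown normalising constant.
    return -0.5 * x**2 - log_Z

def membership_prob(x, y, log_Z=0.0):
    """Probability that x (rather than y) is the original datapoint, given
    the pair. With a symmetric noise kernel q(y|x) = q(x|y), the kernel
    terms cancel, and so does the normalising constant log_Z."""
    lx, ly = log_p_tilde(x, log_Z), log_p_tilde(y, log_Z)
    return 1.0 / (1.0 + np.exp(ly - lx))

x, y = 0.3, 0.3 + 0.5          # original point and a noisy perturbation
p1 = membership_prob(x, y, log_Z=0.0)
p2 = membership_prob(x, y, log_Z=123.4)  # any constant gives the same answer
```

Because `p1 == p2` for every choice of `log_Z`, a classifier trained on these membership probabilities never needs the intractable constant.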

Mon, Aug 12, 2019 | Source: Xian Blog

Samuel Wiqvist and co-authors from Scandinavia have recently arXived a paper on a new version of delayed acceptance MCMC. The ADA in the novel algorithm stands for approximate and accelerated, where the approximation in the first stage is to use a Gaussian process to replace the likelihood. (In our approach, we used subsets for partial likelihoods, ordering them so that the most varying sub-likelihoods were evaluated first.) Furthermore, if a parameter reaches the second stage, the likelihood is not necessarily evaluated, based on the global probability that a second stage is rejected or accepted. Which of course creates an approximation. [...]
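The basic two-stage mechanism that ADA-MCMC builds on can be sketched as a generic delayed-acceptance Metropolis step: a cheap surrogate screens the proposal, and the expensive likelihood is only evaluated (and the surrogate error corrected) when the first stage passes. The surrogate below is a deliberately biased stand-in, not the Gaussian-process emulator of the paper, and the target is a toy standard Normal.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_lik(theta):
    # "Expensive" exact log-likelihood (toy standard Normal target).
    return -0.5 * theta**2

def cheap_log_lik(theta):
    # Cheap, slightly biased surrogate; in ADA-MCMC this role is played
    # by a Gaussian-process approximation of the likelihood.
    return -0.5 * 1.05 * theta**2

def delayed_acceptance_step(theta, step=1.0):
    prop = theta + step * rng.normal()
    # Stage 1: screen the proposal using only the surrogate ratio.
    a1 = min(1.0, np.exp(cheap_log_lik(prop) - cheap_log_lik(theta)))
    if rng.uniform() > a1:
        return theta  # rejected cheaply; full likelihood never evaluated
    # Stage 2: correct with the ratio of exact to surrogate likelihoods,
    # which restores the exact target as invariant distribution.
    a2 = min(1.0, np.exp((log_lik(prop) - log_lik(theta))
                         - (cheap_log_lik(prop) - cheap_log_lik(theta))))
    return prop if rng.uniform() <= a2 else theta

theta, samples = 0.0, []
for _ in range(5000):
    theta = delayed_acceptance_step(theta)
    samples.append(theta)
```

The novelty discussed in the post is that even this second-stage evaluation is sometimes skipped, based on an estimated global second-stage acceptance probability, which is where exactness is traded for acceleration.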

Sat, Aug 10, 2019 | Source: Xian Blog