
Lasso is a machine-learning technique used for model selection, prediction, and inference.
The new lasso command selects “optimal” predictors for continuous, count, and binary outcomes using deviances from linear, Poisson, logit, or probit regression models.
For instance, if you type
. lasso linear y x1-x500
lasso will select a subset of the specified covariates—say, x2, x10, x11, and x21. You can then use the standard predict command to obtain predictions of y.
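A minimal sketch of that workflow, assuming a dataset in memory that contains the variables y and x1-x500 named above. The new variable name yhat and the seed value are arbitrary choices made for illustration.

```stata
* Fit the lasso; rseed() makes the cross-validation folds reproducible
lasso linear y x1-x500, rseed(12345)

* List the covariates the lasso selected
lassocoef

* Store predictions of y in a new variable (yhat is an arbitrary name)
predict yhat
```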
If you instead have a binary or count outcome, you can use lasso logit, lasso probit, or lasso poisson in the same way. And if you prefer to select variables using the elastic net or square-root lasso method, you can use the elasticnet or sqrtlasso command.
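The other commands follow the same pattern. The sketch below assumes hypothetical outcome variables ybin (binary) and ycount (count) alongside y and x1-x500; the alpha() values are illustrative, not recommendations.

```stata
* Binary outcome: lasso for logit or probit models
lasso logit ybin x1-x500
lasso probit ybin x1-x500

* Count outcome: lasso for Poisson models
lasso poisson ycount x1-x500

* Elastic net, searching over several candidate mixing parameters
elasticnet linear y x1-x500, alpha(0.25 0.5 0.75)

* Square-root lasso for a continuous outcome
sqrtlasso y x1-x500
```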
Sometimes, variable selection or prediction is the final goal of lasso. Other times, you are interested in estimating and testing coefficients. Stata 16 provides 11 commands that allow you to estimate coefficients, standard errors, and confidence intervals and to perform tests for variables of interest while using lasso methods to select from among potential control variables. The commands are
dsregress, dslogit, dspoisson, poregress, pologit, popoisson, poivregress, xporegress, xpologit, xpopoisson, and xpoivregress.
The ds commands perform double-selection lasso, the po commands perform partialing-out lasso, and the xpo commands perform cross-fit partialing-out lasso. They do this for models with continuous, binary, and count outcomes. They can even handle endogenous covariates in models for continuous outcomes. The literature currently discusses many methods for lasso-based inference. We make some of these methods available so that researchers can select their favorite. The literature contains still more lasso-based methods of inference, and researchers can often use the tools available in lasso, sqrtlasso, and elasticnet to implement them.
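As a rough sketch of how the three families are called, assume a hypothetical variable of interest d1, a hypothetical endogenous covariate d2 with candidate instruments z1-z100, and the potential controls x1-x500. None of these names except y and x1-x500 come from the text above.

```stata
* Double selection: inference on d1, with controls chosen by lasso
dsregress y d1, controls(x1-x500)

* Partialing out: same model, different estimator
poregress y d1, controls(x1-x500)

* Cross-fit partialing out; rseed() fixes the random fold assignment
xporegress y d1, controls(x1-x500) rseed(12345)

* Endogenous covariate d2 with candidate instruments z1-z100
poivregress y d1 (d2 = z1-z100), controls(x1-x500)
```

In each case the reported coefficient, standard error, and confidence interval refer to the variables of interest only; the controls are treated as nuisance terms selected by the lasso.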
The lasso and elasticnet commands are standard lasso tools often requested for variable selection and prediction. The lasso tools for inference implement newer methods developed primarily by econometricians. However, these inference methods are likely to be useful in all disciplines because they provide a way to test and interpret coefficients on variables of interest.
Users can easily learn all about the lasso features in the new Lasso Reference Manual.

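Stata is a complete, integrated statistical software package that gives its users data analysis, data management, and professional graphics. It offers a wide range of features, including linear mixed models, balanced repeated replication, and multinomial probit models. The statistical graphs Stata produces are of high quality.
In addition, Stata can update itself over the internet to pick up the latest features, and users can see the questions that users around the world have put to StataCorp along with their solutions. Users can also find a wealth of related information, book reviews, and more in the Stata Journal. Another rich source of resources is Statalist, an independent listserver on which users exchange more than 1,000 messages and 50 programs each month.
Stata's statistical capabilities are extensive. Beyond traditional methods of statistical analysis, it includes methods developed over the past 20 years, such as Cox proportional hazards regression, exponential and Weibull regression, logistic regression for multinomial and ordered outcomes, Poisson regression, negative binomial and generalized negative binomial regression, and random-effects models. Specifically, Stata offers the following statistical analysis capabilities:
General analysis of categorical data: parameter estimation, contingency table analysis (contingency coefficients, exact probabilities), epidemiological table analysis, and so on.
Correlation and regression analysis: simple correlation, partial correlation, canonical correlation, and dozens of regression methods, such as multiple linear regression, stepwise regression, weighted regression, robust regression, two-stage regression, quantile (median) regression, residual analysis, influential-point analysis, curve fitting, and random-effects linear regression models.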
