(Note: an overall list of publications is available on another page)
Abstract. A new class of density functions depending on a shape parameter is introduced, such that the value 0 for this parameter corresponds to the standard normal density. The properties of this class of density functions are studied.
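As an illustration, the density is usually written as 2 φ(x) Φ(αx), where φ and Φ are the standard normal density and distribution function and α is the shape parameter. The abstract does not state this form; it is recalled here from the standard literature. The following is a minimal sketch, with hypothetical function names, showing that α = 0 gives back the standard normal density.

```python
import math


def std_normal_pdf(x: float) -> float:
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x: float) -> float:
    """Standard normal distribution function Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def skew_normal_pdf(x: float, alpha: float) -> float:
    """Scalar skew-normal density 2 * phi(x) * Phi(alpha * x)."""
    return 2.0 * std_normal_pdf(x) * std_normal_cdf(alpha * x)


# With alpha = 0 the factor Phi(0) equals 1/2, so the density is exactly phi(x).
same_as_normal = abs(skew_normal_pdf(0.7, 0.0) - std_normal_pdf(0.7)) < 1e-15
```

Positive values of α shift mass to the right and negative values to the left, while the function integrates to one for any α.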
Abstract. Further results are presented about a class of density functions considered by the author in a previous paper (1985). In particular, an additional shape parameter is introduced which allows a wide range for the indices of skewness and kurtosis.
Note. The paper has been reprinted, along with comments and corrigenda, in Statistica 80 (2020), and it is freely accessible.
Abstract. The paper extends earlier work on the so-called skew-normal distribution, a family of distributions including the normal, but with an extra parameter to regulate skewness. The present work introduces a multivariate parametric family such that the marginal densities are scalar skew-normal, and studies its properties, with special emphasis on the bivariate case.
Abstract. Azzalini & Dalla Valle (1996) have recently discussed the multivariate skew-normal distribution which extends the class of normal distributions by the addition of a shape parameter. The first part of the present paper examines further probabilistic properties of the distribution, with special emphasis on aspects of statistical relevance. Inferential and other statistical issues are discussed in the following part, with applications to some multivariate statistics problems, illustrated by numerical examples. Finally, a further extension is described which introduces a skewing factor of an elliptical density.
Full-length paper. This paper is the abridged version of the full-length paper, which is available here and at arXiv.org
Abstract. The problem of finding the smallest region with given probability mass is considered for the case of a multivariate random variable with skew-normal distribution. A simple but accurate approximate solution is proposed.
Abstract. A fairly general procedure is studied to perturb a multivariate density satisfying a weak form of multivariate symmetry, and to generate a whole set of non-symmetric densities. The approach is general enough to encompass a number of recent proposals in the literature, variously related to the skew-normal distribution. The special case of skew-elliptical densities is examined in detail, establishing connections with existing similar work. The final part of the paper specializes further to a form of multivariate skew-t density. Likelihood inference for this distribution is examined, and it is illustrated with numerical examples.
Full-length paper. This paper is the abridged version of the full-length paper, which is available here and at arXiv.org
Abstract. This paper explores the usefulness of the multivariate skew-normal distribution in the context of graphical models. A slight extension of the family recently discussed by Azzalini & Dalla Valle (1996) and Azzalini & Capitanio (1999) is described, the main motivation being the additional property of closure under conditioning. After considerations of the main probabilistic features, the focus of the paper is on the construction of conditional independence graphs for skew-normal variables. Necessary and sufficient conditions for conditional independence are stated, and the admissible structures of a graph under restriction on univariate marginal distribution are studied. Finally, parameter estimation is considered. It is shown how the factorization of the likelihood function according to a graph can be rearranged in order to obtain a parameter based factorization.
Abstract. The U.S. family income data for the years 1970, 1975, 1978, 1980, 1985 and 1990 was fitted using the log-normal, Gamma, Singh-Maddala, Dagum type I and generalized Beta of second kind distributions, among others, in earlier publications. Here we supplement these fittings by adding the log-skew-normal and log-skew-t distributions. In addition, we have performed similar numerical comparisons using 1997 income data collected in a sample survey from several European countries. The overall picture emerging from these numerical comparisons indicates that, while the log-skew-normal distribution provides a somewhat variable degree of goodness-of-fit, the log-skew-t distribution seems to fit the data satisfactorily in a quite consistent way, and on a par with the most creditable distributions.
Abstract. The stress-strength model is considered in the case of skew-normal variates. Some exact probability results are given. For the case that either the stress or the strength has a skew-normal distribution, inferential issues are considered in a likelihood context, and simulation results provided which indicate a satisfactory agreement of nominal and actual confidence level for interval estimation.
Abstract. This paper provides an introductory overview of a portion of distribution theory which is currently under intense development. The starting point of this topic has been the so-called skew-normal distribution, but the connected area is becoming increasingly broad, and its branches now include many extensions, such as the skew-elliptical families, and some forms of semi-parametric formulations, extending the relevance of the field much beyond the original theme of 'skewness'. The final part of the paper illustrates connections with various areas of application, including selective sampling, models for compositional data, robust methods, some problems in econometrics, non-linear time series, especially in connection with financial data, and more.
The paper is followed by a discussion by Marc Genton (pp. 189-198) and the author (pp. 199-200).
Abstract. The distribution theory literature connected to the multivariate skew-normal distribution has grown rapidly in recent years, and a number of extensions and alternative formulations have been put forward. Presently there are various coexisting proposals, similar but not identical, and with rather unclear connections. The purpose of this paper is to unify these proposals under a new general formulation, clarifying at the same time their relationships. The final part sketches an extension of the argument to the skew-elliptical family.
Abstract. In neuropsychological single-case research inferences concerning a patient's cognitive status are often based on referring the patient's test score to those obtained from a modestly sized control sample. Two methods of testing for a deficit (z and a method proposed by Crawford and Howell [Crawford, J. R. & Howell, D. C. (1998). Comparing an individual's test score against norms derived from small samples. The Clinical Neuropsychologist, 12, 482-486]) both assume the control distribution is normal but this assumption will often be violated in practice. Monte Carlo simulation was employed to study the effects of leptokurtosis and the combination of skew and leptokurtosis on the Type I error rates for these two methods. For Crawford and Howell's method, leptokurtosis produced only a modest inflation of the Type I error rate when the control sample N was small-to-modest in size and error rates were lower than the specified rates at larger N. In contrast, the combination of leptokurtosis and skew produced marked inflation of error rates for small Ns. With a specified error rate of 5%, actual error rates as high as 14.31% and 9.96% were observed for z and Crawford and Howell's method respectively. Potential solutions to the problem of non-normal data are evaluated.
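For readers unfamiliar with the two test statistics being compared, the sketch below computes both for one case score and a small control sample. It assumes the usual textbook form of the Crawford and Howell statistic, t = (x - mean) / (s * sqrt((n + 1) / n)) on n - 1 degrees of freedom; this form is not given in the abstract. The function names and the control scores are hypothetical.

```python
import math
import statistics


def z_score(case_score: float, controls: list[float]) -> float:
    """Plain z: treats the control mean and SD as if they were population values."""
    return (case_score - statistics.mean(controls)) / statistics.stdev(controls)


def crawford_howell_t(case_score: float, controls: list[float]) -> tuple[float, int]:
    """Assumed form of the Crawford-Howell statistic and its degrees of freedom.

    The control statistics are treated as sample estimates, which widens
    the denominator by sqrt((n + 1) / n) relative to z.
    """
    n = len(controls)
    mean = statistics.mean(controls)
    sd = statistics.stdev(controls)  # sample SD with divisor n - 1
    t = (case_score - mean) / (sd * math.sqrt((n + 1) / n))
    return t, n - 1


# Hypothetical control scores, for illustration only.
controls = [10.0, 12.0, 14.0, 16.0, 18.0]
z = z_score(8.0, controls)
t, df = crawford_howell_t(8.0, controls)
```

The t statistic is always smaller in magnitude than z for the same data, and the gap shrinks as the control sample grows.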
Abstract. For statistical inference connected to the scalar skew-normal distribution, it is known that the so-called centred parametrization provides a more convenient parametrization than the one commonly employed for writing the density function. We extend the definition of the centred parametrization to the multivariate case, and study the corresponding information matrix.
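In the scalar case, the centred parametrization replaces the direct parameters (location ξ, scale ω, shape α) with the mean, the standard deviation and the coefficient of skewness. The sketch below uses the standard moment formulas of the skew-normal distribution, which are not spelled out in the abstract; the function name is hypothetical and only the scalar case is covered, not the multivariate extension developed in the paper.

```python
import math


def dp_to_cp(xi: float, omega: float, alpha: float) -> tuple[float, float, float]:
    """Map scalar direct parameters (xi, omega, alpha) to centred ones (mean, sd, gamma1)."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    mu_z = math.sqrt(2.0 / math.pi) * delta      # mean of the standardized variable
    sigma_z = math.sqrt(1.0 - mu_z * mu_z)       # sd of the standardized variable
    mean = xi + omega * mu_z
    sd = omega * sigma_z
    gamma1 = 0.5 * (4.0 - math.pi) * (mu_z / sigma_z) ** 3
    return mean, sd, gamma1


# With alpha = 0 the two parametrizations coincide and the skewness is zero.
mean0, sd0, gamma0 = dp_to_cp(1.0, 2.0, 0.0)
```

The skewness coefficient stays within a bounded interval, approximately (-0.995, 0.995), however large the shape parameter becomes.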
Abstract. The robustness problem is tackled by adopting a parametric class of distributions flexible enough to match the behaviour of the observed data. In a variety of practical cases, one reasonable option is to consider distributions which include parameters to regulate their skewness and kurtosis. As a specific representative of this approach, the skew-t distribution is explored in more detail, and reasons are given to adopt this option as a sensible general-purpose compromise between robustness and simplicity, both of treatment and of interpretation of the outcome. Some theoretical arguments, outcomes of a few simulation experiments and various wide-ranging examples with real data are provided in support of the claim.
Abstract. In the context of clinical trials where one of several doses or treatments is selected in a phase II study to be examined further in a phase III study, we develop a formulation for the combination of the overall information obtained from such studies, which mimics the logic followed in actual drug development. The associated distribution theory is exact under the normality assumption. Extensions to more complex situations are sketched briefly.
Abstract. We develop estimating equations for the parameters of the base density of a skew-symmetric distribution. The method is based on an invariance property with respect to asymmetry. Various properties of this approach and the selection of a root are discussed. We also present several extensions of the methodology, namely to the regression setting, the multivariate case, and the skew-t distribution. The approach is illustrated on several simulations and a numerical example.
Abstract. The skew-t family, in its univariate and multivariate versions, is a parametric family of probability distributions which is currently under intense investigation because of several appealing properties. The present paper addresses the question of the choice of its parameterization, and more generally of the selection of quantities of interest associated to this distribution.
Abstract. An active stream of literature has followed up the idea of skew-elliptical densities initiated by Azzalini and Capitanio (1999). Their original formulation was based on a general lemma which is however of broader applicability than usually perceived. This note examines new directions of its use, and illustrates them with the construction of some probability distributions falling outside the family of the so-called skew-symmetric densities.
Abstract. The family of skew-symmetric distributions is a wide set of probability density functions obtained by suitably combining a few components which can be quite freely selected provided some simple requirements are satisfied. Although intense recent work has produced several results for certain sub-families of this construction, much less is known in general terms. The present paper explores some questions within this framework, and provides conditions for the above-mentioned components to ensure that the final distribution enjoys specific properties.
Abstract. The skew-normal and the skew-t distributions are parametric families which are currently under intense investigation since they provide a more flexible formulation compared to the classical normal and t distributions by introducing a parameter which regulates their skewness. While these families enjoy attractive formal properties from the probability viewpoint, a practical problem with their usage in applications is the possibility that the maximum likelihood estimate of the parameter which regulates skewness diverges. This situation has vanishing probability for increasing sample size, but for finite samples it occurs with non-negligible probability, and its occurrence has unpleasant effects on the inferential process. Methods for overcoming this problem have been put forward both in the classical and in the Bayesian formulation, but their applicability is restricted to simple situations. We formulate a proposal based on the idea of penalized likelihood, which has connections with some of the existing methods, but it applies more generally, including the multivariate case.
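The divergence of the estimate can be seen in a simplified setting. The sketch below assumes a scalar skew-normal model with location 0 and scale 1 held fixed, so that the shape parameter α is the only unknown; this simplification and the sample values are assumptions made for illustration. When every observation is positive, the log-likelihood keeps increasing with α, so no finite maximum exists. The penalty proposed in the paper is not reproduced here.

```python
import math


def std_normal_pdf(x: float) -> float:
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x: float) -> float:
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def loglik_alpha(alpha: float, sample: list[float]) -> float:
    """Log-likelihood of the shape parameter, with location 0 and scale 1 fixed."""
    return sum(
        math.log(2.0 * std_normal_pdf(x) * std_normal_cdf(alpha * x))
        for x in sample
    )


# Hypothetical sample in which all observations happen to be positive.
sample = [0.3, 0.8, 1.1, 0.5, 1.7]
alphas = [0.0, 1.0, 2.0, 5.0, 10.0]
values = [loglik_alpha(a, sample) for a in alphas]
# 'values' is increasing in alpha, so the maximum likelihood estimate diverges.
```

A penalty term that grows with |α| would turn this monotone function into one with a finite maximum, which is the general idea behind penalized likelihood.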
Abstract. The current literature on so-called skew-symmetric distributions is closely linked to the idea of a selection mechanism operated by some latent variable. We illustrate the pioneering work of Fernando de Helguero who in 1908 put forward a formulation for the genesis of non-normal distributions via a selection mechanism which perturbs a normal distribution, hence employing an argument closely connected with the one now widely used in this context. Arguably, de Helguero can then be considered the precursor of the current idea of skew-symmetric distributions. Unfortunately, a tragic quirk of fate did not allow him to pursue his project beyond the initial formulation and his work went unnoticed for the rest of the 20th century.
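The selection idea can be illustrated with the representation commonly used in the modern literature, which is not de Helguero's original formulation: draw two independent standard normal variables X0 and X1 and keep X0 only when X1 < α X0. The retained values then follow the scalar skew-normal distribution with shape α. The sketch below is a minimal simulation under that assumption, with a hypothetical function name.

```python
import math
import random


def sample_skew_normal_by_selection(alpha: float, size: int, seed: int = 0) -> list[float]:
    """Draw skew-normal values by keeping x0 only when the latent x1 falls below alpha * x0."""
    rng = random.Random(seed)
    kept: list[float] = []
    while len(kept) < size:
        x0 = rng.gauss(0.0, 1.0)
        x1 = rng.gauss(0.0, 1.0)  # latent variable driving the selection
        if x1 < alpha * x0:
            kept.append(x0)
    return kept


alpha = 3.0
draws = sample_skew_normal_by_selection(alpha, 50000, seed=1)
sample_mean = sum(draws) / len(draws)
# Theoretical mean of the skew-normal: sqrt(2 / pi) * alpha / sqrt(1 + alpha^2).
expected_mean = math.sqrt(2.0 / math.pi) * alpha / math.sqrt(1.0 + alpha * alpha)
```

On average half of the candidate pairs are retained whatever the value of α, since the selection event has probability 1/2.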
Abstract. Substantial work has been dedicated in recent years to the construction of families of continuous distributions obtained by applying a modulation factor to a base symmetric density so as to obtain non-symmetric variant forms, often denoted skew-symmetric distributions. All this development has dealt with the case of continuous variables, while here we extend the formulation to the discrete case; moreover, some of the statements are of general validity. The results are illustrated with an application to the distribution of the score difference in sport matches.
Related information is available here
Abstract. We examine some distributions used extensively within the model-based clustering literature in recent years, paying special attention to claims that have been made about their relative efficacy. Theoretical arguments are provided as well as real data examples.
Abstract. A standard method of obtaining non-symmetrical distributions is that of modulating symmetrical distributions by multiplying the densities by a perturbation factor. This has been considered mainly for central symmetry of a Euclidean space in the origin. This paper enlarges the concept of modulation to the general setting of symmetry under the action of a compact topological group on the sample space. The main structural result relates the density of an arbitrary distribution to the density of the corresponding symmetrised distribution. Some general methods for constructing modulating functions are considered. The effect that transformations of the sample space have on symmetry of distributions is investigated. The results are illustrated by general examples, many of them in the setting of directional statistics.
Abstract. The use of flexible distributions with adaptive tails as a route to robustness has a long tradition. Recent developments in distribution theory, especially of non-symmetric form, provide additional tools for this purpose. We discuss merits and limitations of this approach to robustness as compared with classical methodology. Operationally, we adopt the skew-t as the working family of distributions used to implement this line of thinking.
Abstract. In the context of modulated-symmetry distributions, there exist various forms of skew-elliptical families. We present yet another one, but with an unusual feature: the modulation factor of the baseline elliptical density is represented by a distribution function with an argument that is not an odd function, as it occurs instead with the overwhelming majority of similar formulations, not only with other skew-elliptical families. The proposal is obtained by going back to a lemma known since 1999, which can be seen as the general frame for a vast number of existing formulations, and using it along a different route. The broader target is to show that this "mother lemma" can still generate novel progeny. The final part of the paper examines a further level of generalization of the "mother lemma" where symmetry conditions are removed.
Abstract. Despite a flourishing activity, especially in recent times, for the study of flexible parametric classes of distributions, little work has dealt with the case where the tail weight and degree of peakedness are regulated by two parameters instead of a single one, as is usually the case. The present contribution starts off from the symmetric distributions introduced by Kotz in 1975, subsequently evolved into the so-called Kotz-type distribution, and builds on their univariate versions by introducing a parameter which allows for the presence of asymmetry. We study some formal properties of these distributions and examine their practical usefulness in some real-data illustrations, considering both symmetric and asymmetric variants of the distributions.
Abstract. In the context of measurement error models, the true unobservable covariates are commonly assumed to have a normal distribution. This assumption is replaced here by a more flexible two-piece normal distribution, which allows for asymmetry. After setting up a general formulation for two-piece distributions, we focus on the case of the normal two-piece construction. It turns out that the joint distribution of the actual observations (the multivariate observed covariates and the response) is a two-component mixture of multivariate skew-normal distributions. This connection facilitates the construction of an EM-type algorithm for performing maximum likelihood estimation. Some numerical experimentation with two real datasets indicates a substantial improvement of the present formulation with respect to the classical normal-theory construction, which amply compensates for the introduction of a single parameter for regulation of skewness.
Abstract. Within the context of flexible parametric families of distributions, much work has been dedicated in recent years to the theme of skew-symmetric distributions, or symmetry-modulated distributions as we prefer to call them. The present contribution constitutes a review of this area, with special emphasis on multivariate skew-elliptical families, which represent the subset with more immediate impact on applications. After providing background information of the distribution theory aspects, we focus on the aspects more relevant for applied work. The exposition is targeted to non-specialists in this domain, although some general knowledge of probability and multivariate statistics is assumed. Given this aim, the mathematical profile is kept to the minimum required.
Abstract. Annotations and corrigenda to the 1986 paper with the same root title, in connection with its reprint.
Abstract. Since its introduction, the skew-t distribution has received much attention in the literature both for the study of theoretical properties and as a model for data fitting in empirical work. A major motivation for this interest is the high degree of flexibility of the distribution as the parameters span their admissible range, with ample variation of the associated measures of skewness and kurtosis. While this high flexibility allows one to adapt a member of the parametric family to a wide range of data patterns, it also implies that parameter estimation is a more delicate operation than with less flexible parametric families, given that a small variation of the parameters can have a substantial effect on the selected distribution. In this context, the aim of the present contribution is to deal with some computational aspects of maximum likelihood estimation. A problem of interest is the possible presence of multiple local maxima of the log-likelihood function. Another one, to which most of our attention is dedicated, is the development of a quick and reliable initialization method for the subsequent numerical maximization of the log-likelihood function, both in the univariate and the multivariate context.
Abstract. For the family of multivariate probability distributions variously denoted as unified skew-normal, closed skew-normal and other names, a number of properties are already known, but many others are not, even some basic ones. The present contribution aims at filling some of these gaps. Specifically, the moments up to the fourth order are obtained, and from here the expressions of Mardia's measures of multivariate skewness and kurtosis. Other results concern the property of log-concavity of the distribution, and closure with respect to conditioning on intervals.
Abstract. In the last two decades or so, much work has been dedicated to the portion of distribution theory stemming from the skew-normal distribution and its ramifications. This contribution presents an outline of the theme, without attempting a detailed review, which would be unfeasible, given the amount of available material. The aim is to present a panoramic view of the theme, leaving out the fine details, with rather more emphasis on the evolution of the underlying ideas and on the breadth of the overall developments, as regards the range of specific directions considered.
Abstract. We investigate the non-identifiability of the multivariate unified skew-normal distribution under permutation of its latent variables. We show that the non-identifiability issue also holds with other parameterizations and extends to the family of unified skew-elliptical distributions and more generally to selection distributions. We provide several suggestions to make the unified skew-normal model identifiable and describe various sub-models that are identifiable.
Abstract. For the extended skew-normal distribution, which represents an extension of the normal (or Gaussian) distribution, we focus on the properties of the log-likelihood function and derived quantities in the bivariate case. Specifically, we derive explicit expressions for the score function and the information matrix, in the observed and the expected form; these do not appear to have been examined before in the literature. Corresponding computing code in R language is provided, which implements the formal expressions.