3

The generalized gamma distribution

 3 years ago
source link: https://blogs.sas.com/content/iml/2021/03/15/generalized-gamma-distribution.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

The generalized gamma distribution

0

A SAS customer wanted to compute the cumulative distribution function (CDF) of the generalized gamma distribution. For any continuous distribution, the CDF is the integral of the probability density function (PDF), which usually has an explicit formula. Accordingly, he wanted to compute the CDF by using the QUAD function in PROC IML to integrate the PDF.

I have previously shown how to integrate the PDF to obtain the CDF. I have also shown how to use the trapezoid rule of numerical integration to obtain the entire CDF curve from a curve of the PDF. But before I sent him those links, I looked closer at the distribution that he wanted to compute. It turns out that you don't have to perform an integral to get the CDF of the generalized gamma distribution.

What is the generalized gamma distribution

I was not familiar with the generalized gamma distribution, so I looked at an article on Wikipedia. The original formulation is due to Stacy, E. (1962), "A Generalization of the Gamma Distribution." The distribution is "generalized" in the sense that it contains many other familiar distributions for certain parameter values, including the gamma, chi-square, exponential, half-normal, and Weibull distributions.

The following definition is from Wikipedia, but I changed the notation for the incomplete gamma function to agree with my previous article. The generalized gamma has three parameters: a>0, d>0, and p>0. For non-negative x, the probability density function of the generalized gamma is
f(x;a,d,p)=(p/ad)xd−1e−(x/a)p/Γ(d/p)f(x;a,d,p)=(p/ad)xd−1e−(x/a)p/Γ(d/p)
where Γ(⋅)Γ(⋅) denotes the complete gamma function.

The cumulative distribution function is
F(x;a,d,p)=γ((x/a)p,d/p)/Γ(d/p)F(x;a,d,p)=γ((x/a)p,d/p)/Γ(d/p),
where γ(⋅)γ(⋅) denotes the lower incomplete gamma function.

Compute the generalized gamma distribution in SAS

Evaluating the PDF is usually easy because it has an explicit formula. It is the CDF that requires thought and effort. In this case, however, the CDF is given in terms of the lower incomplete gamma function, and I recently showed how to compute the lower incomplete gamma function in SAS. As a rule of thumb, if a distribution has 'gamma' as part of its name, you might be able to use this trick to evaluate the CDF.

In my previous article, I showed that the lower incomplete gamma function is nothing more than the "unstandardized" CDF of the gamma distribution, as represented by this SAS macro:

%macro LowerIncGamma(x, alpha); /* lower incomplete gamma function */
   (Gamma(&alpha) * CDF('Gamma', &x, &alpha))
%mend;

But the formula for the CDF of the generalized gamma distribution divides by the complete gamma function, which just leaves the call to the CDF function. Instead of the parameter x, you use (x/a)p. For the parameter α, you substitute α=d/p. This means that you can compute the CDF for the generalized gamma distribution by using the CDF('GAMMA') function in SAS. The following SAS DATA step evaluates the PDF and CDF for five combinations of the a, d, and p parameters. These are the same parameter values that are used in the Wikipedia article.

data GenGamma;
label PDF = "Density" CDF = "Cumulative Probability";
array _a[5] _temporary_ (2,   1,   2, 5, 7);
array _d[5] _temporary_ (0.5, 1,   1, 1, 1);
array _p[5] _temporary_ (0.5, 0.5, 2, 5, 7);
do i = 1 to dim(_a);
   a = _a[i]; d = _d[i]; p = _p[i];
   params = cats('a=',a,', d=',d,', p=',p);
   do x = 0.0001, 0.001, 0.01 to 7 by 0.01;
      PDF = (p/a**d) * x**(d-1) * exp(-(x/a)**p) / Gamma(d/p);
      CDF = CDF('Gamma', (x/a)**p, d/p);
      output;
   end;
end;
run;

The following call to PROC SGPLOT creates a graph of the PDF for the five sets of parameter values:

title "PDF of the Generalized Gamma Distribution";
proc sgplot data=GenGamma;
   series x=x y=PDF / group=params;
   yaxis max = 0.7 grid;
   xaxis grid;
   keylegend / location=inside position=NE across=1 title="" opaque;
run;

The PDF curves are shown in the graph in the previous section. You can use almost the same statements to create a graph of the five CDF functions:

title "CDF of the Generalized Gamma Distribution";
proc sgplot data=GenGamma;
   series x=x y=CDF / group=params;
   yaxis grid;
   xaxis grid;
   keylegend / location=inside position=SE across=1 title="" opaque;
run;

Quantiles of the generalized gamma distribution

Although the SAS programmer did not request the quantile function for the generalized gamma distribution, it can be evaluated by using the QUANTILE('GAMMA") function in SAS. The details are in the Wikipedia article. For completeness, the following DATA step computes the quantiles. The graph is not shown, since it is the same as the graph of the cumulative distribution, but with the axes exchanged.

data GenGammaQuantile;
label prob = "Cumulative Probability" q = "Quantile";
array _a[5] _temporary_ (2,   1,   2, 5, 7);
array _d[5] _temporary_ (0.5, 1,   1, 1, 1);
array _p[5] _temporary_ (0.5, 0.5, 2, 5, 7);
do i = 1 to dim(_a);
   a = _a[i]; d = _d[i]; p = _p[i];
   params = cats('a=',a,', d=',d,', p=',p);
   do prob = 0.001, 0.01 to 0.99 by 0.01;
      y = quantile("Gamma", prob, d/p);
      q = a * y**(1/p);
      output;
   end;
end;
run;
 
title "Quantiles of the Generalized Gamma Distribution";
proc sgplot data=GenGammaQuantile;
   series x=prob y=q / group=params;
   yaxis grid max=7;
   xaxis grid;
   keylegend / location=inside position=NW across=1 title="" opaque;
run;

Summary

Although you can integrate the PDF to obtain the CDF for a continuous probability distribution, sometimes it is possible to evaluate the CDF in an easier way. In the case of the generalized gamma distribution, the CDF can be evaluated by using the CDF of the ordinary gamma distribution. You just need to be clever about how to pass in the parameter values.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK