JPEG: Image compression algorithm
source link: http://pi.math.cornell.edu/~web6140/TopTenAlgorithms/JPEG.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
JPEG: Image compression algorithm
Alex Townsend, March 2017
Contents
What is JPEG?
JPEG stands for Joint Photographic Experts Group, which was a group of image processing experts that devised a standard for compressing images (ISO).
So, JPEG (or JPG) is not really a file format but rather an image compression standard. The JPEG standard is complicated with many different options and color space regulations. It was not widely adopted. A much simpler standard version was advocated at the same time, called JFIF. This is the image compression algorithm that most people mean when they say JPEG compression, and the one that we will be describing in this class. Note that the file extensions .jpeg and .jpg have stuck, even though the underneath algorithm is (strictly speaking) JFIF compression.
The underlying assumptions of the JPEG algorithm
The JPEG algorithm is designed specifically for the human eye. It exploits the following biological properties of human sight:
(1) We are more sensitive to the illuminocity of color, rather than the chromatric value of an image, and
(2) We are not particularly sensitive to high-frequency content in images.
The algorithm can be neatly broken up into several stages: There is an input image I, which goes through the following process:
1) A colour transform, 2) A 2D discrete cosine transform on 8x8 blocks, 3) A quantization (filtering) stage, 4) Huffman encoding.
Finally, a compressed image is returned in the .jpg file format. This format contains the compressed image as well as information that is needed to uncompressed, with other information to allow for reexpanding the image.
An uncompressed image of mixed peppers
MATLAB has various images uploaded into MATLAB. Here, is one of some peppers:
I = imread('peppers.png'); imshow(I)
It currently requires about the following number of bits to store:
ImageSize = 8*prod(size(I))
ImageSize = 4718592
Color transform: Convert RGB to YCbCr
Right now, the image is stored in RGB format. While this colorspace is convenient for projecting the image on the computer screen, it does not isolate the illuminance and color of an image. The intensity of color is intermixed in the colorspace. The YCbCr is a more convenitent colorspace for image compression because it separates the illuminance and the chromatic strength of an image.
Y = I; A = [.229 .587 .114 ; -.168736 -.331264 .5 ; .5 -.418688 -.081312]; Y(:,:,1) = A(1,1)*I(:,:,1) + A(1,2)*I(:,:,2) + A(1,3)*I(:,:,3); Y(:,:,2) = A(2,1)*I(:,:,1) + A(2,2)*I(:,:,2) + A(2,3)*I(:,:,3) + 128; Y(:,:,3) = A(3,1)*I(:,:,1) + A(3,2)*I(:,:,2) + A(3,3)*I(:,:,3) + 128; % Let's use MATLAB's inbuilt convert (because of the normalizations): Y = rgb2ycbcr(I);
Plot Y'CbCr colorspace
Let's see what that colorspace looks like.
lb = {'Y', 'Cb', 'Cr'}; for channel = 1:3 subplot(1,3,channel) Y_C = Y; Y_C(:,:,setdiff(1:3,channel)) = intmax(class(Y_C))/2; imshow(ycbcr2rgb(Y_C)) title([lb{channel} ' component'],'fontsize',16) end
Our eyes are senstitive to illuminance, but not color
Since our eyes are not particularly sensitive to chrominance, we can "downsample" that stuff. Here, we remove x100 amount of "color" from the image and see that it has barely changed:
subplot(1,2,1) imshow( I ) title('Original') subplot(1,2,2) Y_d = Y; Y_d(:,:,2) = 10*round(Y_d(:,:,2)/10); Y_d(:,:,3) = 10*round(Y_d(:,:,3)/10); imshow(ycbcr2rgb(Y_d)) title('Downsampled image')
Eyes are senstitive to intensity
However, if we downsample the illuminance by x10, then there is a noticeable difference. (You'll have to zoom in to see it.)
subplot(1,2,1) imshow( I ) title('original') subplot(1,2,2) Y_d = Y; Y_d(:,:,1) = 10*round(Y_d(:,:,1)/10); imshow(ycbcr2rgb(Y_d)) title('Downsample illuminance')
JPEG downsampling
Here, we will be a little conservative and downsample the chrominance by only a factor of 2.
Y_d = Y; Y_d(:,:,2) = 2*round(Y_d(:,:,2)/2); Y_d(:,:,3) = 2*round(Y_d(:,:,3)/2);
The 2D discrete cosine transform
Once the image is in YCrCb color space and downsampled, it is partitioned into 8x8 blocks. Each block is transformed by the two-dimensional discrete cosine transform (DCT). Let's extract one 8x8 block of pixels for demonstration, shown here in white:
We apply the DCT to that box:
clf box = Y_d; II = box(200:207,200:207,1); Y = chebfun.dct(chebfun.dct(II).').'; surf(log10(abs(Y))), title('DCT coefficients') set(gca,'fontsize',16)
Quantization table
Next we apply a quantization table to Y, which filters out the high frequency DCT coefficients:
Q = [16 11 10 16 24 40 51 61 ; 12 12 14 19 26 28 60 55 ; 14 13 16 24 40 57 69 56 ; 14 17 22 29 51 87 80 62 ; 18 22 37 56 68 109 103 77 ; 24 35 55 64 81 104 113 92 ; 49 64 78 87 103 121 120 101; 72 92 95 98 112 100 103 99]; before = nnz(Y); Y = Q.*round(Y./Q); after = nnz(Y);
The number of nonzero DCT coefficients after quantization is:
before, after
before = 64 after = 6
We now apply this compression to every 8x8 block. We obtain:
I = imread('peppers.png'); Y_d = rgb2ycbcr( I ); % Downsample: Y_d(:,:,2) = 2*round(Y_d(:,:,2)/2); Y_d(:,:,3) = 2*round(Y_d(:,:,3)/2); % DCT compress: A = zeros(size(Y_d)); B = A; for channel = 1:3 for j = 1:8:size(Y_d,1)-7 for k = 1:8:size(Y_d,2) II = Y_d(j:j+7,k:k+7,channel); freq = chebfun.dct(chebfun.dct(II).').'; freq = Q.*round(freq./Q); A(j:j+7,k:k+7,channel) = freq; B(j:j+7,k:k+7,channel) = chebfun.idct(chebfun.idct(freq).').'; end end end subplot(1,2,1) imshow(ycbcr2rgb(Y_d)) title('Original') subplot(1,2,2) imshow(ycbcr2rgb(uint8(B))); title('Compressed') shg
Compression so far
The quantization step has successfully compressed the image by about a factor of 7.
CompressedImageSize = 8*nnz(A(:,:,1)) + 7*nnz(A(:,:,2)) + 7*nnz(A(:,:,3)) CompressedImageSize/ImageSize
CompressedImageSize = 701189 ans = 0.148601320054796
The formula above is obtained by noting that we downsampled in Cr and Cb are downsampled.
Huffman encoding
We now encode the DCT blocks using Huffman encoding:
b = A(:); b = b(:); b(b==0)=[]; %remove zeros. b = floor(255*(b-min(b))/(max(b)-min(b))); symbols = unique(b); prob = histcounts(b,length(symbols))/length(b); dict = huffmandict(symbols, prob); enco = huffmanenco(b, dict); FinalCompressedImage = length(enco)
FinalCompressedImage = 695755
Final compression
We have successfully reduced the pepper image by about x7, while being extremely conservative:
length(enco)/ImageSize % x7 compression.
ans = 0.147449705335829
A different image
The images of peppers is not ideal for JPEG. Here is an image for which JPEG gets a better compression rate (of about x30).
I = imread('saturn.png'); ImageSize = 8*prod(size(I)); Y_d = rgb2ycbcr( I ); % Downsample: Y_d(:,:,2) = 2*round(Y_d(:,:,2)/2); Y_d(:,:,3) = 2*round(Y_d(:,:,3)/2); % DCT compress: A = zeros(size(Y_d)); B = A; for channel = 1:3 for j = 1:8:size(Y_d,1)-7 for k = 1:8:size(Y_d,2)-7 II = Y_d(j:j+7,k:k+7,channel); freq = chebfun.dct(chebfun.dct(II).').'; freq = Q.*round(freq./Q); A(j:j+7,k:k+7,channel) = freq; % do the inverse at the same time: B(j:j+7,k:k+7,channel) = chebfun.idct(chebfun.idct(freq).').'; end end end b = A(:); b = b(:); b(b==0)=[]; %remove zeros. b = floor(255*(b-min(b))/(max(b)-min(b))); symbols = unique(b); prob = histcounts(b,length(symbols))/length(b); dict = huffmandict(symbols, prob); enco = huffmanenco(b, dict); FinalCompressedImage = length(enco); FinalCompressedImage/ImageSize
ans = 0.031958287037037
Here's what the images look like:
subplot(1,2,1) imshow(I) title('Original') subplot(1,2,2) imshow(ycbcr2rgb(uint8(B))); title('Compressed')
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK