Everyone knows that digital images can contain metadata inside the actual file that gives information about the image or photo. A digital camera, whether it be a dSLR or a smartphone, will embed EXIF data, which includes the date and time at which the photo was taken as well as the camera’s exposure settings for that photo (shutter speed, aperture, ISO, focal length, etc.).
In my photo workflow, whether it’s a portrait or just some generic stock photo, I always apply a metadata template in Adobe Lightroom which overlays specific IPTC data on my photos, including the year of production, my mailing address, and other contact information (email, web).
If it’s a specific project, I’ll overlay additional IPTC data including information such as the location of the project (where the photo was taken) and the name of the subject/model.
After I apply all this metadata to the photos as applicable, I’ll run through my standard Lightroom workflow and then export the final JPEG images for the target media platforms (whether it be my blog, my photo portfolio, print, Instagram, etc).
Running Google PageSpeed Insights against my blog to see what types of slowdowns are occurring when loading my site, I noticed a lot of warnings about optimizing my images with lossless compression.
Losslessly compressing http://somerandom.jpg could save 42.7KiB (46% reduction).
The PageSpeed Insights docs about this message mentioned using jpegoptim with the -s flag.
I compiled jpegoptim to see what it was all about and found that the -s flag meant --strip-all, or remove metadata.
-s, --strip-all Strip all markers from output file. (NOTE! by default only Comment & Exif/IPTC/PhotoShop/ICC/XMP markers are kept, everything else is discarded). Output JPEG still likely will contains one or two markers (JFIF and Adobe APP14) depending on colorspace used in the image, as these markers are generated by the libjpeg encoder automatically.
I decided to test some jpegoptim runs on some of my stock images that I already uploaded to my blog using the -n flag (which only does a compression check without committing/writing the changes) and found a reduction in file sizes by anywhere from 30% to 50%+.
Ocab-20130628-171140-600.jpg 600x400 24bit N Exiff IPTC ICC XMP Adobe [OK] 72869 --> 38116 bytes (47.69%), optimized.
Ocab-20130728-145315-600.jpg 600x400 24bit N Exiff IPTC ICC XMP Adobe [OK] 118727 --> 49450 bytes (58.35%), optimized.
Ocab-20131113-081218-600.jpg 600x600 24bit N Exiff IPTC ICC XMP Adobe [OK] 144379 --> 87768 bytes (39.21%), optimized.
Ocab-20131115-063024-600.jpg 600x600 24bit N Exiff IPTC ICC XMP Adobe [OK] 105424 --> 64389 bytes (38.92%), optimized.
Ocab-20131116-183408-600.jpg 600x600 24bit N Exiff IPTC ICC XMP Adobe [OK] 125470 --> 80658 bytes (35.72%), optimized.
Ocab-20131117-090856-600.jpg 600x400 24bit N Exiff IPTC ICC XMP Adobe [OK] 124083 --> 70683 bytes (43.04%), optimized.
Ocab-20131117-122617-600.jpg 600x600 24bit N Exiff IPTC ICC XMP Adobe [OK] 137011 --> 91351 bytes (33.33%), optimized.
Ocab-20131126-115229-600.jpg 600x600 24bit N Exiff IPTC ICC XMP Adobe [OK] 121879 --> 76022 bytes (37.63%), optimized.
Ocab-20131129-202244-600.jpg 600x600 24bit N Exiff IPTC ICC XMP Adobe [OK] 96800 --> 58526 bytes (39.54%), optimized.
Dropping an average of 40% per file just for metadata? In terms of web page load times, this is quite significant.
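For anyone who wants to check the math, the percentage jpegoptim prints is just the bytes saved divided by the original size. Here is a minimal sketch using awk; pct_saved is a made-up helper name for illustration, not something that ships with jpegoptim.

```shell
# pct_saved BEFORE AFTER
# Hypothetical helper: prints the percentage of bytes saved, to two decimals,
# given the file size before and after optimization.
pct_saved() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", (before - after) * 100 / before }'
}

# Byte counts taken from the first line of the jpegoptim listing above.
pct_saved 72869 38116
```

Feeding it the before and after byte counts from any line of the listing should give the same figure jpegoptim reported for that file.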
Let’s take the following example which is a 600×600 pixel photo, exported from Adobe Lightroom with all EXIF and IPTC data intact:
It has a file size of 293,645 bytes or 294KB. What happens when I run the above photo through jpegoptim?
$ jpegoptim --strip-xmp --strip-icc --strip-com -n Ocab-20140607-080324-600.jpg
Ocab-20140607-080324-600.jpg 600x600 24bit N Exiff IPTC ICC XMP Adobe [OK] 293645 --> 223336 bytes (23.94%), optimized.
The above is a 24% reduction (roughly 70KB) just by stripping out all the metadata except EXIF and IPTC.
$ jpegoptim --strip-xmp --strip-icc --strip-com --strip-iptc -n Ocab-20140607-080324-600.jpg
Ocab-20140607-080324-600.jpg 600x600 24bit N Exiff IPTC ICC XMP Adobe [OK] 293645 --> 198432 bytes (32.42%), optimized.
The above is a 32% reduction (roughly 95KB) from removing all metadata but EXIF.
$ jpegoptim --strip-xmp --strip-icc --strip-com --strip-iptc --strip-exif -n Ocab-20140607-080324-600.jpg
Ocab-20140607-080324-600.jpg 600x600 24bit N Exiff IPTC ICC XMP Adobe [OK] 293645 --> 172860 bytes (41.13%), optimized.
The above shows that finally adding the --strip-exif flag (essentially a --strip-all) gives a 41% reduction (roughly 121KB). 121KB of this sample photo is metadata.
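Subtracting the three results from one another gives a rough idea of where those bytes go. This is only a back-of-the-envelope sketch: the variable names are made up for illustration, and the first figure also includes whatever jpegoptim's lossless Huffman optimization saves on top of the stripped markers, so it overstates the XMP/ICC/comment share somewhat.

```shell
# Byte counts reported by the three jpegoptim -n runs above.
original=293645
after_xmp_icc_com=223336   # --strip-xmp --strip-icc --strip-com
after_plus_iptc=198432     # ... plus --strip-iptc
after_plus_exif=172860     # ... plus --strip-exif

# Each difference approximates the bytes attributable to that step.
# Note: the first one also includes the lossless optimization savings.
xmp_icc_com=$((original - after_xmp_icc_com))
iptc=$((after_xmp_icc_com - after_plus_iptc))
exif=$((after_plus_iptc - after_plus_exif))
total=$((original - after_plus_exif))

printf 'XMP + ICC + comments (plus optimization): %d bytes\n' "$xmp_icc_com"
printf 'IPTC: %d bytes\n' "$iptc"
printf 'EXIF: %d bytes\n' "$exif"
printf 'Total: %d bytes\n' "$total"
```

By this estimate IPTC and EXIF account for roughly 25KB each, with the remaining 70KB or so coming from the first step.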
Here is the same photo, but after being run through a jpegoptim -s execution:
So it would seem like in order to decrease the page load times on my blog, I need to start stripping metadata from my photo assets.
But I was a bit torn. I like metadata in my photos, especially on the web. Aside from visual watermarks, the metadata is my way of tagging my digital images with my information so it’s known that they are mine. Granted, stripping EXIF, IPTC, and other metadata is pretty easy if someone were to steal an image (and some services such as Twitter will strip the metadata upon upload, anyway), but it is just a formality I wanted to use.
Unfortunately, in terms of my blog, I figure it would be a good idea just to start stripping metadata from the stock photos, at least the ones loaded immediately on a given page or article. This way, page load times are faster for a better user experience (and for higher search engine ranking).
I’m keeping my Lightroom workflow the same, but I will now run jpegoptim on my image assets after they are uploaded to my web server, since stripping metadata is something I want to limit to my ocabj.net blog (at least for now). I also went through my web server and applied jpegoptim -s to a bunch of the existing photo assets.
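A bulk run over existing uploads might look something like the following sketch. The strip_uploads function name and the directory path in the usage note are hypothetical; only the -s and -n flags come from the jpegoptim options discussed above. It assumes a find that supports -iname (GNU and BSD find both do).

```shell
# strip_uploads DIR [extra jpegoptim options...]
# Hypothetical helper: runs jpegoptim -s on every .jpg/.jpeg file under DIR.
# Any extra arguments (for example -n for a check-only run) are passed
# straight through to jpegoptim.
strip_uploads() {
    dir=$1
    shift
    find "$dir" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) \
        -exec jpegoptim -s "$@" {} +
}
```

For example, strip_uploads /path/to/uploads -n would only report the potential savings, which is a cheap way to preview the result. Dropping the -n rewrites the files in place, so it is worth keeping the Lightroom originals around.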
While I have accepted this new process as a necessary evil for me, I am quite surprised by how much the metadata counts towards a digital image’s file size.
Note that jpegoptim can also reduce JPEG file sizes by adjusting quality. I do not want to do this. I prefer to control the quality of the JPEG via my Lightroom workflow.