{"id":8049,"date":"2022-09-30T13:20:00","date_gmt":"2022-09-30T20:20:00","guid":{"rendered":"https:\/\/mattfife.com\/?p=8049"},"modified":"2023-02-07T13:28:15","modified_gmt":"2023-02-07T20:28:15","slug":"using-stable-diffusion-for-compression","status":"publish","type":"post","link":"https:\/\/mattfife.com\/?p=8049","title":{"rendered":"Using Stable Diffusion for compression"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Last week, Swiss software engineer <a rel=\"noreferrer noopener\" href=\"https:\/\/pub.towardsai.net\/stable-diffusion-based-image-compresssion-6f1f0a399202\" data-type=\"URL\" data-id=\"https:\/\/pub.towardsai.net\/stable-diffusion-based-image-compresssion-6f1f0a399202\" target=\"_blank\">Matthias B\u00fchlmann\u00a0discovered<\/a>\u00a0that the popular image synthesis model\u00a0<a href=\"https:\/\/arstechnica.com\/information-technology\/2022\/09\/with-stable-diffusion-you-may-never-believe-what-you-see-online-again\/\">Stable Diffusion<\/a>\u00a0could compress existing 2D images with fewer visual artifacts than JPEG or WebP at high compression ratios, though there are some important limitations.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When Stable Diffusion analyzes and &#8220;compresses&#8221; images into weight form, they reside in what researchers call &#8220;latent space,&#8221; which is a way of saying that they exist as a sort of fuzzy potential that can be realized into images once they&#8217;re decoded. With Stable Diffusion 1.4, the weights file is roughly 4GB, but it represents knowledge about hundreds of millions of images.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">While most people use Stable Diffusion with text prompts, B\u00fchlmann cut out the text encoder and instead forced his images through Stable Diffusion&#8217;s image encoder process, which takes a low-precision 512\u00d7512 image and turns it into a higher-precision 64\u00d764 latent space representation. 
At this point, the image exists at a much smaller data size than the original, but it can still be expanded (decoded) back into a 512\u00d7512 image with fairly good results.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"479\" data-attachment-id=\"8050\" data-permalink=\"https:\/\/mattfife.com\/?attachment_id=8050\" data-orig-file=\"https:\/\/i0.wp.com\/mattfife.com\/wp-content\/themes\/mattTheme\/headerimgs\/2023\/02\/compression_comparison.jpg?fit=893%2C669&amp;ssl=1\" data-orig-size=\"893,669\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"compression_comparison\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/mattfife.com\/wp-content\/themes\/mattTheme\/headerimgs\/2023\/02\/compression_comparison.jpg?fit=640%2C479&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/mattfife.com\/wp-content\/themes\/mattTheme\/headerimgs\/2023\/02\/compression_comparison.jpg?resize=640%2C479&#038;ssl=1\" alt=\"\" class=\"wp-image-8050\" srcset=\"https:\/\/i0.wp.com\/mattfife.com\/wp-content\/themes\/mattTheme\/headerimgs\/2023\/02\/compression_comparison.jpg?w=893&amp;ssl=1 893w, https:\/\/i0.wp.com\/mattfife.com\/wp-content\/themes\/mattTheme\/headerimgs\/2023\/02\/compression_comparison.jpg?resize=300%2C225&amp;ssl=1 300w, https:\/\/i0.wp.com\/mattfife.com\/wp-content\/themes\/mattTheme\/headerimgs\/2023\/02\/compression_comparison.jpg?resize=768%2C575&amp;ssl=1 768w, 
https:\/\/i0.wp.com\/mattfife.com\/wp-content\/themes\/mattTheme\/headerimgs\/2023\/02\/compression_comparison.jpg?resize=360%2C270&amp;ssl=1 360w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">B\u00fchlmann&#8217;s method currently comes with significant limitations. It&#8217;s not good with faces or text, and in some cases, it can inject details into the decoded image that were not present in the source image. (You probably don&#8217;t want your image compressor inventing details in an image that don&#8217;t exist.) Also, decoding requires the 4GB Stable Diffusion weights file and incurs the extra decoding time inherent to Stable Diffusion.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is not the first time that AI has been explored as a method of compression as well as generation. <a href=\"https:\/\/mattfife.com\/?p=7798\" data-type=\"URL\" data-id=\"https:\/\/mattfife.com\/?p=7798\" target=\"_blank\" rel=\"noreferrer noopener\">Daniel Holden of Ubisoft presented an astounding paper at GDC in 2018 about using neural nets to compress animation data used in video game character animation.<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Links:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/arstechnica.com\/information-technology\/2022\/09\/better-than-jpeg-researcher-discovers-that-stable-diffusion-can-compress-images\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/arstechnica.com\/information-technology\/2022\/09\/better-than-jpeg-researcher-discovers-that-stable-diffusion-can-compress-images\/<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/mattfife.com\/?p=7798\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/mattfife.com\/?p=7798<\/a><\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last week, Swiss software engineer Matthias B\u00fchlmann\u00a0discovered\u00a0that the popular image synthesis model\u00a0Stable 
Diffusion\u00a0could compress existing 2D images with fewer visual artifacts than JPEG or WebP at high compression ratios, though there are some important limitations. When Stable Diffusion analyzes and &#8220;compresses&#8221; images into weight form, they reside in what researchers call &#8220;latent space,&#8221; which is a way of saying that they exist as a sort of fuzzy potential that can be realized into images once they&#8217;re decoded. With Stable Diffusion&#8230;<\/p>\n<p class=\"read-more\"><a class=\"btn btn-default\" href=\"https:\/\/mattfife.com\/?p=8049\"> Read More<span class=\"screen-reader-text\">  Read More<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[28,9,5],"tags":[],"class_list":["post-8049","post","type-post","status-publish","format-standard","hentry","category-ai","category-cool","category-technical"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p4WECr-25P","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/mattfife.com\/index.php?rest_route=\/wp\/v2\/posts\/8049","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mattfife.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mattfife.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":
true,"href":"https:\/\/mattfife.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mattfife.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=8049"}],"version-history":[{"count":3,"href":"https:\/\/mattfife.com\/index.php?rest_route=\/wp\/v2\/posts\/8049\/revisions"}],"predecessor-version":[{"id":8053,"href":"https:\/\/mattfife.com\/index.php?rest_route=\/wp\/v2\/posts\/8049\/revisions\/8053"}],"wp:attachment":[{"href":"https:\/\/mattfife.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=8049"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mattfife.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=8049"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mattfife.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=8049"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}