Embed HTML in JPEG images

I recently came across an interesting web page, lcamtuf.coredump.cx/squirrel/, or rather an interesting picture: open it in a web browser and you see a web page; open it in an image viewer and you see a picture.

Serving different responses from the same URL is easy on the server side, for example by inspecting the Accept request header:

  • Accept: text/html returns hypertext;
  • Accept: image/* returns the image;
  • ......
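
As a rough sketch of that server-side approach, the PHP function below picks a response type from the Accept header. The name pickResponseType and the fallback to HTML are assumptions made for illustration, and the parsing ignores q-values, which a real implementation would need to honor.

```php
<?php
// Hypothetical helper: choose what to send based on the Accept request header.
// Simplification: returns on the first matching media type and ignores q-values.
function pickResponseType(string $accept): string
{
    foreach (explode(',', $accept) as $item) {
        // Drop parameters such as ";q=0.8" and normalize case and whitespace.
        $type = strtolower(trim(explode(';', $item)[0]));
        if ($type === 'text/html') {
            return 'html';
        }
        if (strpos($type, 'image/') === 0) {
            return 'image';
        }
    }
    return 'html'; // assumed default when nothing matches
}

// In a request handler this would be fed from $_SERVER['HTTP_ACCEPT'].
echo pickResponseType('text/html,application/xhtml+xml'), "\n"; // html
echo pickResponseType('image/avif,image/webp,*/*;q=0.8'), "\n"; // image
```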

But the lcamtuf.coredump.cx/squirrel/ page (or picture) still presents both kinds of content, the web page and the picture, even when it is saved locally and detached from the server. How is this achieved?

There are no secrets on the front end: open the web browser's developer tools and inspect the response body for the URL.

The response contains familiar HTML tags, but also a lot of garbled characters. The garbled characters are the raw bytes of the picture file rendered as text, so why can't they be seen on the web page? It turns out that the HTML uses the style body { visibility: hidden; } together with an unclosed <!-- comment opener, which the browser treats as running to the end of the file. The bytes before the HTML presumably land in the hidden body, and the bytes after it are swallowed by the comment, so the garbled part never shows.

HTML content:

<html><body><style>body { visibility: hidden; } .n { visibility: visible; position: absolute; padding: 0 1ex 0 1ex; margin: 0; top: 0; left: 0; } h1 { margin-top: 0.4ex; margin-bottom: 0.8ex; }</style><div class=n><h1><i>Hello, squirrel fans<img src="http://lcamtuf.coredump.cx/squirrel/">Click here!</a></xmp><xmp><img src="http://lcamtuf.coredump.cx/squirrel/"></xmp><p>No server-side hacks involved - the magic happens in your browser. Let's try embedding the current page as an image right now (INCEPTION!):<p><img src="#" style="border: 1px solid crimson"><p>Pretty radical, eh? Send money to: lcamtuf@coredump.cx<!--

However, when the file is viewed as a picture, the HTML content does not show up. Why?

JPEG related

First of all, the picture is clearly in JPEG format, because an obvious JFIF signature appears at the beginning of the content (JFIF, the JPEG File Interchange Format, is the most common file format for storing and exchanging JPEG data). The JPEG format defines a series of markers, each a two-byte code starting with 0xFF. The common ones are:

  • 0xFFD8, SOI (Start Of Image), marks the start of the image;
  • 0xFFD9, EOI (End Of Image), marks the end of the image;
  • 0xFFDA, SOS (Start Of Scan), marks the start of a scan;
  • 0xFFDB, DQT (Define Quantization Table), defines a quantization table;
  • 0xFFC4, DHT (Define Huffman Table), defines a Huffman table;
  • 0xFFEn, APPn (Application Specific), carries application-specific data;
  • 0xFFFE, COM (Comment), carries a comment;
  • ......
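
To make the marker layout concrete, here is a minimal PHP sketch that walks the segments at the start of a JPEG byte string. The function name listJpegSegments is made up for this example, and the walk is simplified: it assumes every marker after SOI is followed by a 2-byte big-endian length, and it does not handle fill bytes or standalone markers.

```php
<?php
// Walk the marker segments at the start of a JPEG byte string.
// Returns a list of [marker as hex string, declared segment length] pairs.
// Stops at SOS (0xFFDA), after which entropy-coded data follows.
function listJpegSegments(string $bytes): array
{
    if (substr($bytes, 0, 2) !== "\xFF\xD8") {
        return []; // no SOI marker, so not treated as a JPEG
    }
    $segments = [['FFD8', 0]]; // SOI carries no length field
    $pos = 2;
    $size = strlen($bytes);
    while ($pos + 4 <= $size && $bytes[$pos] === "\xFF") {
        $marker = ord($bytes[$pos + 1]);
        // Big-endian 16-bit length; it includes its own 2 bytes.
        $length = unpack('n', substr($bytes, $pos + 2, 2))[1];
        $segments[] = [sprintf('FF%02X', $marker), $length];
        if ($marker === 0xDA) {
            break;
        }
        $pos += 2 + $length;
    }
    return $segments;
}

// Tiny hand-made input: SOI, a COM segment holding "hi", then EOI.
$sample = "\xFF\xD8" . "\xFF\xFE\x00\x04hi" . "\xFF\xD9";
foreach (listJpegSegments($sample) as [$marker, $length]) {
    echo $marker, ' ', $length, "\n";
}
```

For the hand-made sample this prints "FFD8 0" and "FFFE 4"; the trailing EOI is not listed because it has no length field for the loop to read.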

Image viewers do not display the comment content of a picture, so the HTML can evidently be hidden in an image comment. A hex dump of the picture confirms this.

The picture contains the comment marker 0xFFFE, followed by the length field 0x0372. 0x0372 is 882 in decimal, and the HTML content is exactly 880 bytes long (the length field counts its own 2 bytes), so the HTML is indeed stored in the picture's comment segment.
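
The length arithmetic can be checked with a few lines of PHP. This is only an illustration: the 880-byte payload is a stand-in string rather than the actual HTML from the page, and buildComSegment is a hypothetical helper name.

```php
<?php
// Hypothetical helper: wrap a payload in a JPEG COM (0xFFFE) segment.
function buildComSegment(string $payload): string
{
    // The length field counts its own 2 bytes plus the payload.
    $length = strlen($payload) + 2;
    if ($length > 0xFFFF) {
        throw new InvalidArgumentException('payload too long for one segment');
    }
    return "\xFF\xFE" . pack('n', $length) . $payload;
}

$payload = str_repeat('x', 880);          // stand-in for 880 bytes of HTML
$segment = buildComSegment($payload);

echo strtoupper(bin2hex(substr($segment, 0, 4))), "\n"; // FFFE0372
$declared = unpack('n', substr($segment, 2, 2))[1];
echo $declared, ' ', $declared - 2, "\n";               // 882 880
```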


The following PHP function is a simple implementation that embeds HTML content in a JPEG file:

function embedHtmlInJpeg($jpeg_file, $html_str, $html_file) {
    // The 2-byte length field counts itself plus the payload, so it caps at 0xFFFF.
    $length = strlen($html_str) + 2;
    if ($length > 0xFFFF) {
        return false;
    }
    $reader = fopen($jpeg_file, 'rb');
    if ($reader === false) {
        return false;
    }
    $soi = fread($reader, 2); // read SOI (0xFFD8)
    $writer = ($soi === "\xFF\xD8") ? fopen($html_file, 'wb') : false;
    if ($writer === false) { // not a JPEG, or the output cannot be opened
        fclose($reader);
        return false;
    }
    fwrite($writer, $soi); // write SOI (0xFFD8)
    // write the COM marker (0xFFFE), the big-endian length, then the HTML
    fwrite($writer, "\xFF\xFE" . pack('n', $length) . $html_str);
    while (!feof($reader)) {
        fwrite($writer, fread($reader, 8192)); // copy the rest of the image unchanged
    }
    fclose($reader);
    fclose($writer);
    return true;
}

// call it
embedHtmlInJpeg('lena.jpg','<html><body><style>body { visibility: hidden; } .n { visibility: visible; position: absolute; padding: 0 1ex 0 1ex; margin: 0; top: 0; left: 0; } h1 { margin-top: 0.4ex; margin-bottom: 0.8ex; }</style><div class=n><h1><i>This image is a page.</i></h1>Just open it in new tab.<p><img src="#" style="border: 1px solid crimson"><!--','lena.html'); 

This picture is also a web page; if you don't believe it, just open it in a new tab.


Practical application

Embedding a piece of HTML text in a picture has, in fact, no real application, hahaha 😂. It would be interesting, though, if a JS or shell script could be hidden in a picture and executed later; since the file is itself a picture, it might slip past the inspection of some security software.


Tags: html Javascript Cyber Security Front-end

Posted by BuzzLY on Thu, 16 Feb 2023 02:58:38 +0530