Week-12 - Network Forensics Lab 3: HTTP Traffic Analysis & Image Extraction
Aim
Extract images transmitted over HTTP from network traffic captures. You will identify image downloads, extract the raw image data using magic numbers and trailers, and verify file integrity using hash values.
Learning Objectives
- Understand HTTP request/response structure
- Identify image transfers in HTTP traffic
- Extract files using Wireshark's Export Objects feature
- Manually extract files using magic numbers and hex editors
- Verify extracted files using MD5 hash values
Resources Needed
- Wireshark
- Hex editor (HxD, Bless, or similar)
- Lab file: rhino2.log. You can find the file in your Cyberlab, or you can download it from here.
Background
Scenario: Illegal Possession Investigation
A suspect is under investigation for possession of illegal images of endangered species (rhinos). Network traffic has been captured from their computer. Your task is to:
- Identify any image downloads via HTTP
- Extract the images from the network traffic
- Document evidence with hash values
HTTP Protocol
HTTP (Hypertext Transfer Protocol) transfers data across the web using request/response pairs:
| Method | Purpose |
|---|---|
| GET | Request/download data from server |
| POST | Send data to server |
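The request half of a request/response pair starts with a request line of the form "METHOD URI VERSION". The following is a minimal sketch, not part of the lab tooling, showing how that line breaks down and how an image download might be recognised from it. The helper names (`parse_request_line`, `is_image_request`) and the list of extensions are illustrative assumptions.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str   # e.g. "GET" or "POST"
    uri: str      # path requested from the server
    version: str  # e.g. "HTTP/1.1"


def parse_request_line(line: str) -> RequestLine:
    """Split an HTTP request line into its three space-separated parts."""
    method, uri, version = line.strip().split(" ", 2)
    return RequestLine(method, uri, version)


def is_image_request(request: RequestLine) -> bool:
    """True for GET requests whose path ends in a common image extension."""
    path = request.uri.split("?", 1)[0].lower()  # ignore any query string
    return request.method == "GET" and path.endswith(
        (".jpg", ".jpeg", ".gif", ".png")
    )


# Example request line; the HTTP version shown here is an assumption.
request = parse_request_line("GET /~gnome/rhino4.jpg HTTP/1.1")
print(request.method, request.uri, is_image_request(request))
```

This mirrors what the Wireshark display filters on http.request.uri do, only on a single line of text rather than a whole capture.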
File Magic Numbers
Many file types begin with a fixed signature (magic number), and some also end with a recognisable trailer:
| File Type | Magic Number (Hex) | Trailer (Hex) |
|---|---|---|
| JPEG | FF D8 FF | FF D9 |
| GIF | 47 49 46 38 (GIF8) | 00 3B |
| PNG | 89 50 4E 47 | AE 42 60 82 |
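As a rough illustration of how the signatures in the table can be used, the sketch below checks the first and last bytes of a buffer against them. The byte values come from the table above. The function names and the dictionary layout are assumptions made for this example only.

```python
from typing import Optional

# (magic number, trailer) pairs copied from the table above.
SIGNATURES = {
    "JPEG": (bytes.fromhex("FFD8FF"), bytes.fromhex("FFD9")),
    "GIF": (bytes.fromhex("47494638"), bytes.fromhex("003B")),
    "PNG": (bytes.fromhex("89504E47"), bytes.fromhex("AE426082")),
}


def identify_file_type(data: bytes) -> Optional[str]:
    """Return the first type whose magic number matches the start of data."""
    for name, (magic, _trailer) in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def has_expected_trailer(data: bytes, file_type: str) -> bool:
    """Check whether data ends with the trailer listed for file_type."""
    _magic, trailer = SIGNATURES[file_type]
    return data.endswith(trailer)


# Tiny synthetic buffer, not a real image.
sample = bytes.fromhex("FFD8FFE000FFD9")
print(identify_file_type(sample), has_expected_trailer(sample, "JPEG"))
```

A matching signature is only a hint about the file type; it does not prove that the rest of the file is well formed.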
Part A: Initial Traffic Analysis
Task 1: Open and Explore the Capture
- Open rhino2.log in Wireshark
- Examine the packet list
Q1.1: How many packets are in this capture?
Your Answer: _______________________________________________
Click to reveal answer
Answer: 370 packets
Q1.2: What IP addresses are involved in this capture? List them:
Your Answer: _______________________________________________
Click to reveal answer
Answer:
- 137.30.123.234 (suspect's machine)
- 137.30.120.37 (web server - cs.uno.edu)
- 64.233.167.104 (Google)
- 137.30.120.39
Q1.3: What protocols are present? (Check Statistics → Protocol Hierarchy)
Your Answer: _______________________________________________
Click to reveal answer
Answer: HTTP, TCP, IMAP (email), and others
Task 2: Filter HTTP Traffic
- Apply the display filter:
http
Q2.1: How many HTTP packets are displayed?
Your Answer: _______________________________________________
Click to reveal answer
Answer: Approximately 20-30 HTTP packets (23 ideally)
Q2.2: What is the suspect's IP address (the one making GET requests)?
Your Answer: _______________________________________________
Click to reveal answer
Answer: 137.30.123.234
Part B: Finding Image Downloads
Task 3: Search for JPEG Downloads
- Apply the filter to find HTTP requests for .jpg files:
http.request.uri contains ".jpg"
Q3.1: How many HTTP requests for .jpg files are there?
Your Answer: _______________________________________________
Click to reveal answer
Answer: 1 (rhino4.jpg)
Q3.2: What is the filename of the JPEG being requested?
Your Answer: _______________________________________________
Click to reveal answer
Answer: rhino4.jpg
Q3.3: What is the full URL path of this request?
Your Answer: _______________________________________________
Click to reveal answer
Answer: /~gnome/rhino4.jpg
Q3.4: What packet number contains the GET request for this image?
Your Answer: _______________________________________________
Click to reveal answer
Answer: Packet 49
Task 4: Examine the Server Response
- Find the server's response to the rhino4.jpg request (packet 50)
- Examine the HTTP headers
Q4.1: What HTTP response code was returned?
Your Answer: _______________________________________________
Click to reveal answer
Answer: 200 OK
Q4.2: What is the Content-Type header value?
Your Answer: _______________________________________________
Click to reveal answer
Answer: image/jpeg
Q4.3: What is the Content-Length (file size in bytes)?
Your Answer: _______________________________________________
Click to reveal answer
Answer: 153191 bytes (approximately 150 KB)
Q4.4: What web server software is hosting this file?
Your Answer: _______________________________________________
Click to reveal answer
Answer: Apache/1.3.29 (Unix)
Task 5: Search for GIF Downloads
- Apply the filter to find HTTP requests for .gif files:
http.request.uri contains ".gif"
Q5.1: How many HTTP requests for .gif files are there?
Your Answer: _______________________________________________
Click to reveal answer
Answer: 5 GIF requests
Q5.2: List the GIF filenames requested:
Your Answer:
Click to reveal answer
Answer:
- logo.gif (Google logo)
- blank.gif
- image2.gif
- back.gif
- rhino5.gif
Q5.3: Which of these GIFs is related to the investigation?
Your Answer: _______________________________________________
Click to reveal answer
Answer: rhino5.gif
Q5.4: What packet number contains the GET request for rhino5.gif?
Your Answer: _______________________________________________
Click to reveal answer
Answer: Packet 217
Part C: Extracting Images - Method 1 (Export Objects)
Task 6: Use Wireshark's Export Objects Feature
- Go to File → Export Objects → HTTP
- A window will display all HTTP objects that can be exported
Q6.1: How many objects are listed in the Export HTTP Objects window?
Your Answer: _______________________________________________
Click to reveal answer
Answer: Multiple objects including HTML pages, GIFs, and JPEGs
Q6.2: Can you see rhino4.jpg in the list?
Your Answer: _______________________________________________
Click to reveal answer
Answer: Yes
- Select rhino4.jpg and click Save
- Save the file to your working directory
Q6.3: Were you able to successfully export and open rhino4.jpg?
Your Answer: _______________________________________________
Click to reveal answer
Answer: Yes - Export Objects is the easiest method for extracting HTTP-transferred files
- Repeat for rhino5.gif
Part D: Extracting Images - Method 2 (Manual Hex Extraction)
Export Objects can fail when a stream is corrupted or the capture is incomplete, so it is worth learning the manual method:
Task 7: Follow the HTTP Stream
- Find packet 49 (GET request for rhino4.jpg)
- Right-click → Follow → HTTP Stream
- Change "Show data as" to Raw
Q7.1: Can you identify the HTTP headers at the start of the response?
Your Answer: _______________________________________________
Click to reveal answer
Answer: Yes - you can see HTTP/1.1 200 OK followed by headers like Content-Type, Content-Length, etc.
Task 8: Find the JPEG Magic Number
- In the raw data, look for the JPEG magic number: FF D8 FF
- This marks the start of the actual image data (after the HTTP headers)
Q8.1: Can you locate the JPEG magic number (FF D8 FF) in the stream?
Your Answer: _______________________________________________
Click to reveal answer
Answer: Yes - it appears after the HTTP headers (after the blank line following headers)
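The blank line that separates headers from body is the byte sequence CR LF CR LF. The sketch below shows one way to split a raw response at that point, assuming the buffer holds a single, complete, non-chunked response. The function name `split_http_response` and the sample bytes are made up for illustration and are not taken from rhino2.log.

```python
from typing import Dict, Tuple


def split_http_response(raw: bytes) -> Tuple[Dict[str, str], bytes]:
    """Split a raw HTTP response into a header dict and the body bytes."""
    head, separator, body = raw.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no blank line found; headers may be incomplete")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:  # lines[0] is the status line, e.g. "HTTP/1.1 200 OK"
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


# Synthetic response: three header lines followed by six body bytes.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: image/jpeg\r\n"
    b"Content-Length: 6\r\n"
    b"\r\n" + bytes.fromhex("FFD8FF00FFD9")
)
headers, body = split_http_response(sample)
print(headers["content-type"], len(body) == int(headers["content-length"]))
```

Comparing the body length with Content-Length is a quick sanity check that the whole file was captured.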
Task 9: Find the JPEG Trailer
- Scroll to the end of the data
- Look for the JPEG trailer: FF D9
Q9.1: Can you locate the JPEG trailer (FF D9) at the end?
Your Answer: _______________________________________________
Click to reveal answer
Answer: Yes - FF D9 marks the end of the JPEG file
Task 10: Manual Extraction with Hex Editor
- In Wireshark's hex pane, select from FF D8 FF to FF D9 (inclusive)
- Copy the raw bytes
- Open a hex editor (HxD, Bless, etc.)
- Paste the bytes
- Save as rhino4_manual.jpg
Q10.1: Does the manually extracted file open correctly as an image?
Your Answer: _______________________________________________
Click to reveal answer
Answer: Yes - if you correctly selected from the magic number to the trailer, the image should display correctly
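The same select-from-magic-number-to-trailer step can be sketched in code. This is an illustrative alternative to the hex editor, not the lab's required method. It assumes the raw stream has already been saved to disk and contains a single image. The function name `carve_jpeg` is hypothetical. The last FF D9 is used because the same two bytes can also occur earlier in the data, for example at the end of an embedded thumbnail.

```python
JPEG_MAGIC = bytes.fromhex("FFD8FF")
JPEG_TRAILER = bytes.fromhex("FFD9")


def carve_jpeg(stream: bytes) -> bytes:
    """Return the bytes from the first FF D8 FF to the last FF D9, inclusive."""
    start = stream.find(JPEG_MAGIC)
    if start == -1:
        raise ValueError("JPEG magic number not found")
    end = stream.rfind(JPEG_TRAILER)
    if end == -1 or end < start:
        raise ValueError("JPEG trailer not found after the magic number")
    return stream[start:end + len(JPEG_TRAILER)]


# Synthetic stream: header text, a fake JPEG body, then trailing bytes.
demo = b"HTTP/1.1 200 OK\r\n\r\n" + bytes.fromhex("FFD8FFE0AABBFFD9") + b"junk"
print(carve_jpeg(demo).hex(" "))
```

If the stream holds more than one response, the last FF D9 may belong to a later file, so the carved size should be checked against the Content-Length header.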
Task 11: Extract rhino5.gif Manually
- Find packet 217 (GET request for rhino5.gif)
- Follow HTTP Stream → Raw
- Find GIF magic number: 47 49 46 38 (or "GIF8" in ASCII)
- Find GIF trailer: 00 3B
- Extract and save as rhino5_manual.gif
Q11.1: What text can you see at the start of the GIF data (the magic number in ASCII)?
Your Answer: _______________________________________________
Click to reveal answer
Answer: GIF89a or GIF87a (the "GIF8" followed by version)
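A GIF file opens with the six ASCII bytes of the signature and version, followed by the image width and height as 16-bit little-endian integers. The sketch below reads those fields as a quick check that an extracted file really starts where the GIF data starts. The function name `parse_gif_header` and the sample dimensions are assumptions for illustration and say nothing about rhino5.gif itself.

```python
import struct
from typing import Tuple


def parse_gif_header(data: bytes) -> Tuple[str, int, int]:
    """Return (version string, width, height) from the start of GIF data."""
    if len(data) < 10 or data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("data does not start with a GIF87a/GIF89a signature")
    version = data[:6].decode("ascii")
    # Logical screen width and height: two unsigned 16-bit little-endian values.
    width, height = struct.unpack("<HH", data[6:10])
    return version, width, height


# Synthetic header for a 320x200 image.
print(parse_gif_header(b"GIF89a" + struct.pack("<HH", 320, 200)))
```

If the selection in the hex editor started a few bytes too early or too late, this check fails immediately, which is easier to spot than a viewer refusing to open the file.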
Part E: Verification with Hash Values
Task 12: Calculate MD5 Hashes
- Calculate MD5 hash of your extracted rhino4.jpg:
md5sum rhino4.jpg
Or on Windows:
certutil -hashfile rhino4.jpg MD5
Q12.1: What is the MD5 hash of rhino4.jpg?
Your Answer: _______________________________________________
Click to reveal answer
Answer: a64102afff71b93ed61fb100af8d52a (as printed, this value has 31 hex characters, one short of a full 32-character MD5 digest, so verify it against your own extraction)
- Calculate MD5 hash of rhino5.gif using the same command as for rhino4.jpg
Q12.2: What is the MD5 hash of rhino5.gif?
Your Answer: _______________________________________________
Click to reveal answer
Answer: 1e90b7f70b2ecb605898524a88269029
Q12.3: Why are MD5 hashes important in forensic investigations?
Your Answer: _______________________________________________
Click to reveal answer
Answer:
- Prove file integrity (file hasn't been modified)
- Create unique identifier for evidence
- Enable searching hash databases for known illegal content
- Chain of custody verification
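Where md5sum or certutil is not available, the same digest can be computed with Python's standard hashlib module. The sketch below is one possible approach; the helper names and the chunk size are arbitrary choices. It also shows how to confirm that two extractions, for example the Export Objects copy and the manual copy, are byte-for-byte identical.

```python
import hashlib


def md5_of_bytes(data: bytes) -> str:
    """Return the MD5 digest of data as 32 lowercase hex characters."""
    return hashlib.md5(data).hexdigest()


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a: str, path_b: str) -> bool:
    """True when both files produce the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)


# Well-known test input: the MD5 of b"abc".
print(md5_of_bytes(b"abc"))
```

For example, same_content("rhino4.jpg", "rhino4_manual.jpg") would be expected to return True if the manual extraction selected exactly the right bytes.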
Part F: Evidence Documentation
Task 13: Complete the Evidence Table
Fill in this evidence summary:
| Evidence # | Filename | Type | Source IP | Server IP | File Size | MD5 Hash |
|---|---|---|---|---|---|---|
| 1 | | JPEG | | | | |
| 2 | | GIF | | | | |
Click to reveal answer
| Evidence # | Filename | Type | Source IP | Server IP | File Size | MD5 Hash |
|---|---|---|---|---|---|---|
| 1 | rhino4.jpg | JPEG | 137.30.123.234 | 137.30.120.37 | 153,191 bytes | a64102afff71b93ed61fb100af8d52a |
| 2 | rhino5.gif | GIF | 137.30.123.234 | 137.30.120.37 | 85,137 bytes | 1e90b7f70b2ecb605898524a88269029 |
Task 14: Determine the Server
- Use the filter ip.addr == 137.30.120.37 to see traffic to/from the server
Q14.1: What organisation owns the server hosting the rhino images? (Hint: look at the URL path /~gnome/)
Your Answer: _______________________________________________
Click to reveal answer
Answer: cs.uno.edu (University of New Orleans Computer Science department) - the ~gnome suggests a user directory
Key Findings
In this lab, you extracted images from HTTP traffic:
| Item | Details |
|---|---|
| Suspect IP | 137.30.123.234 |
| Server IP | 137.30.120.37 (cs.uno.edu) |
| Image 1 | rhino4.jpg (JPEG, 153KB) |
| Image 2 | rhino5.gif (GIF, 85KB) |
| Date | Wed, 28 Apr 2004 |
Useful Wireshark Filters
| Filter | Purpose |
|---|---|
| http | All HTTP traffic |
| http.request.method == "GET" | HTTP GET requests only |
| http.request.uri contains ".jpg" | Requests for JPEG files |
| http.request.uri contains ".gif" | Requests for GIF files |
| http.content_type contains "image" | Responses containing images |
| http.response.code == 200 | Successful HTTP responses |
Magic Numbers Reference
| Type | Start (Hex) | End (Hex) |
|---|---|---|
| JPEG | FF D8 FF | FF D9 |
| GIF | 47 49 46 38 | 00 3B |
| PNG | 89 50 4E 47 | AE 42 60 82 |
Best,
Ali.