2-3) Building a Physical Imaging Acquisition Tool (Basics)

How to write and run the source code with Python

Environment setup: 0-2) Setting Up a Development Environment with Python
Building an executable from Python source: 0-3) Building an Executable from Python Source

What we know so far

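Physical imaging copies a storage medium verbatim, from sector 0 through the last sector, into a file.
It is essential to capture every sector of the physical disk, from sector 0 through the last, without omission.
The destination needs more free space than the capacity of the source medium.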
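
Both points above (the image spans sector 0 through the last sector, and the destination must be able to hold all of it) can be checked with a little arithmetic before imaging starts. The following is a minimal sketch and not part of the tool's source: `last_sector_number` and `has_enough_space` are hypothetical helper names, and the 512-byte sector size is only an assumed default that real disks may not share.

```python
import shutil


def last_sector_number(total_bytes: int, bytes_per_sector: int = 512) -> int:
    """Return the index of the last sector on a medium of the given size."""
    if total_bytes <= 0 or total_bytes % bytes_per_sector != 0:
        raise ValueError("size must be a positive multiple of the sector size")
    # Sectors are numbered from 0, so the last index is the count minus one.
    return total_bytes // bytes_per_sector - 1


def has_enough_space(dest_dir: str, required_bytes: int) -> bool:
    """Check whether the volume holding dest_dir has at least required_bytes free."""
    return shutil.disk_usage(dest_dir).free >= required_bytes


# A hypothetical 1 MiB medium with 512-byte sectors has sectors 0..2047.
print(last_sector_number(1024 * 1024))
# Whether the current directory could hold an image of that size.
print(has_enough_space(".", 1024 * 1024))
```

Running the free-space check before the copy starts avoids discovering a full destination disk hours into an acquisition.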

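Key requirements for a tool that acquires a physical image
Open the physical disk and read it from sector 0 through the last sector, saving the result.
In the GUI, show the physical disk information, let the user click a disk, and add a button that saves to a chosen path under a chosen file name.
When imaging finishes, save the MD5 hash value to "<image file name>.log".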
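
The read, save, and hash requirements above can be sketched in a few lines. This is an illustration under stated assumptions rather than the tool's actual method: `copy_with_md5` and `write_hash_log` are hypothetical names, an in-memory buffer stands in for the physical disk, and the hash is computed while copying, whereas the source below hashes the image file after it is written.

```python
import hashlib
import io


def copy_with_md5(src, dst, total_bytes, chunk_size=1024 * 1024):
    """Copy exactly total_bytes from src to dst, hashing each chunk as it is written."""
    md5 = hashlib.md5()
    copied = 0
    while copied < total_bytes:
        data = src.read(min(chunk_size, total_bytes - copied))
        if not data:
            # A short read means the image would be incomplete, so fail loudly.
            raise IOError(f"source ended early at byte {copied}")
        dst.write(data)
        md5.update(data)
        copied += len(data)
    return md5.hexdigest()


def write_hash_log(image_path, md5_hex):
    """Append the image name and its MD5 to '<image_path>.log' and return the log path."""
    log_path = f"{image_path}.log"
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(f"Image file: {image_path}\nMD5: {md5_hex}\n")
    return log_path


# Demo: a fake 4-sector "disk" of zero bytes, copied one 512-byte sector at a time.
source = io.BytesIO(bytes(512 * 4))
image = io.BytesIO()
digest = copy_with_md5(source, image, total_bytes=512 * 4, chunk_size=512)
print(digest)
```

Hashing during the copy means the image never has to be read back or held in memory, which matters once the source is hundreds of gigabytes.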

1. Physical imaging acquisition tool: source and usage

If you download and use the uploaded executable, first verify that the program I shared is the genuine one.

You can build a physical imaging acquisition tool from Python source. I am sharing the program and source I wrote first; please treat them as a reference only.

Building a physical imaging tool with ChatGPT

History of building the physical imaging tool (the source was not produced by a single question; the questions were consolidated after a good deal of trial and error.)

Final source

물리이미징 도구.py (14KB)

Final Python source

import ctypes
from ctypes import wintypes
import subprocess
import logging
from tkinter import messagebox
import tkinter as tk
from tkinter import filedialog, ttk

import threading
import os
import hashlib
import atexit
# Windows constant definitions
GENERIC_READ = 0x80000000
FILE_SHARE_READ = 0x00000001
FILE_SHARE_WRITE = 0x00000002
OPEN_EXISTING = 3
IOCTL_DISK_GET_DRIVE_GEOMETRY = 0x00070000
IOCTL_DISK_GET_LENGTH_INFO = 0x0007405C

# DISK_GEOMETRY structure
class DISK_GEOMETRY(ctypes.Structure):
    _fields_ = [
        ("Cylinders", wintypes.LARGE_INTEGER),
        ("MediaType", wintypes.DWORD),
        ("TracksPerCylinder", wintypes.DWORD),
        ("SectorsPerTrack", wintypes.DWORD),
        ("BytesPerSector", wintypes.DWORD)
    ]

# GET_LENGTH_INFORMATION structure
class GET_LENGTH_INFORMATION(ctypes.Structure):
    _fields_ = [("Length", wintypes.LARGE_INTEGER)]

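# Load the Kernel32 DLL
kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
# Declare the HANDLE return type so 64-bit handles (and INVALID_HANDLE_VALUE) are not truncated to a C int
kernel32.CreateFileW.restype = wintypes.HANDLE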

# Wrapper around CreateFileW
def create_file(filename, access, mode, creation, flags):
    return kernel32.CreateFileW(
        ctypes.c_wchar_p(filename),
        ctypes.c_ulong(access),
        ctypes.c_ulong(mode),
        None,
        ctypes.c_ulong(creation),
        ctypes.c_ulong(flags),
        None
    )

# Wrapper around DeviceIoControl
def device_io_control(device, io_control_code, in_buffer, out_buffer):
    bytes_returned = wintypes.DWORD(0)
    status = kernel32.DeviceIoControl(
        ctypes.wintypes.HANDLE(device),
        wintypes.DWORD(io_control_code),
        None,
        0,
        ctypes.byref(out_buffer),
        wintypes.DWORD(ctypes.sizeof(out_buffer)),
        ctypes.byref(bytes_returned),
        None
    )
    return status

# Compute the number of the last sector
def get_last_sector_number(disk_number):
    drive_path = f"\\\\.\\PhysicalDrive{disk_number}"
    h_drive = create_file(drive_path, GENERIC_READ, FILE_SHARE_READ | FILE_SHARE_WRITE, OPEN_EXISTING, 0)
    if h_drive == ctypes.wintypes.HANDLE(-1).value:
        raise ctypes.WinError(ctypes.get_last_error())

    try:
        length_info = GET_LENGTH_INFORMATION()
        if not device_io_control(h_drive, IOCTL_DISK_GET_LENGTH_INFO, None, length_info):
            raise ctypes.WinError(ctypes.get_last_error())

        disk_geometry = DISK_GEOMETRY()
        if not device_io_control(h_drive, IOCTL_DISK_GET_DRIVE_GEOMETRY, None, disk_geometry):
            raise ctypes.WinError(ctypes.get_last_error())

        total_length = length_info.Length
        bytes_per_sector = disk_geometry.BytesPerSector
        last_sector_number = (total_length // bytes_per_sector) - 1

        return last_sector_number

    finally:
        kernel32.CloseHandle(ctypes.wintypes.HANDLE(h_drive))


def is_admin():
    try:
        return ctypes.windll.shell32.IsUserAnAdmin()
    except Exception:
        return False

# Fetch information about the physical drives
def get_physical_drives():
    logging.info("Fetching physical drives")
    try:
        result = subprocess.check_output("wmic diskdrive get DeviceID, Model, Size", shell=True).decode('cp949').strip()
        logging.debug(f"Raw wmic output: {result}")

        lines = result.split('\n')[1:]  # Skip the first line, which is the header
        drives_info = []
        for index, line in enumerate(lines):
            parts = line.strip().split()
            if len(parts) < 3:
                continue

            drive = parts[0]
            model = ' '.join(parts[1:-1])

            try:
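                # Compute the size from the last sector number
                disk_number = int(drive.upper().rsplit("PHYSICALDRIVE", 1)[-1])  # Also handles two-digit drive numbers
                last_sector_number = get_last_sector_number(disk_number)
                bytes_per_sector = 512  # Assumes 512-byte sectors; a disk with 4096-byte sectors would be under-reported here.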
                total_size = (last_sector_number + 1) * bytes_per_sector

                # Convert the size and choose the display unit
                size_in_gb = total_size / (1024 ** 3)  # bytes to gigabytes
                if size_in_gb < 1:
                    size_in_mb = size_in_gb * 1024  # Convert GB to MB
                    display_size = f"{size_in_mb:.2f} MB"
                else:
                    display_size = f"{size_in_gb:.2f} GB"

            except Exception as e:
                logging.error(f"Failed to get size for {drive}: {e}")
                display_size = "Unknown"

            drives_info.append((drive, model, display_size))
            logging.debug(f"Drive: {drive}, Model: {model}, Size: {display_size}")

        return drives_info

    except subprocess.CalledProcessError as e:
        messagebox.showerror("Error", "An error occurred while querying the physical drives.")
        logging.error("Failed to get physical drives: %s", str(e))
        return []
    except Exception as e:
        messagebox.showerror("Error", "An unexpected error occurred.")
        logging.error("Unexpected error: %s", str(e))
        return []

# Logging configuration
logging.basicConfig(filename='imaging_process.log', level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')

# GUI application definition
class App:
    def __init__(self, root):
        self.root = root
        self.root.title("Physical Imaging Tool")

        self.current_process = None  # Holds the thread object of the imaging job currently running
        self.progress_var = tk.DoubleVar()

        tk.Label(root, text="Select a physical drive:").pack(padx=10, pady=5)
        self.drive_list = tk.Listbox(root, width=60, height=10)
        self.drive_list.pack(padx=10, pady=5)

        # Fetch the drive information and show it in the list box.
        self.drives = get_physical_drives()
        for idx, (drive, model, size) in enumerate(self.drives):
            self.drive_list.insert(tk.END, f"{drive} - {model} - {size}")

        tk.Label(root, text="Image file path to save to:").pack(padx=10, pady=5)
        self.file_path = tk.Entry(root, width=50)
        self.file_path.pack(padx=10, pady=5)
        tk.Button(root, text="Choose File", command=self.select_output_file).pack(padx=10, pady=5)

        self.progress_bar = ttk.Progressbar(root, variable=self.progress_var, maximum=100, length=400)
        self.progress_bar.pack(padx=10, pady=10)

        self.status_label = tk.Label(root, text="Select the drive to acquire a physical image from.\n(Make sure there is enough free space!)")
        self.status_label.pack(padx=10, pady=5)

        button_frame = tk.Frame(root)
        button_frame.pack(padx=10, pady=10)

        self.start_button = tk.Button(button_frame, text="Start Imaging", command=self.start_imaging)
        self.start_button.pack(side=tk.LEFT, padx=5)

        self.stop_button = tk.Button(button_frame, text="Stop Imaging", command=self.stop_imaging)
        self.stop_button.pack(side=tk.LEFT, padx=5)

        self.quit_button = tk.Button(root, text="Quit", command=root.quit)
        self.quit_button.pack(padx=10, pady=10)

    def select_output_file(self):
        logging.info("Prompting user to select an output file")
        file_path = filedialog.asksaveasfilename(filetypes=[("Image Files", "*.img"), ("All Files", "*.*")])
        if file_path:
            self.file_path.delete(0, tk.END)
            self.file_path.insert(0, file_path)
            logging.info(f"Output file selected: {file_path}")

    def update_progress(self, progress):
        self.progress_var.set(progress)
        self.progress_bar['value'] = progress

    def update_status_label(self, status):
        self.status_label.config(text=status)

    def start_imaging(self):
        if self.current_process and self.current_process.is_alive():
            messagebox.showinfo("Notice", "An imaging job is already in progress.")
            return

        if not self.drive_list.curselection():
            messagebox.showerror("Error", "No physical drive is selected.")
            logging.error("No physical drive was selected by the user")
            return

        selected_index = self.drive_list.curselection()[0]
        selected_drive, _, display_size = self.drives[selected_index]
        output_file = self.file_path.get()

        if not output_file:
            messagebox.showerror("Error", "No output file path is specified.")
            logging.error("No output file path was provided")
            return

        logging.info(f"Selected drive: {selected_drive}, Output file: {output_file}")
        self.current_process = threading.Thread(target=self.create_image, args=(selected_drive, output_file), daemon=True)
        self.current_process.start()

    def stop_imaging(self):
        if self.current_process and self.current_process.is_alive():
            self.current_process = None            
            # When imaging is stopped, use the atexit module to delete the partial image file at program exit.
            output_file = self.file_path.get()
            
            def delete_files():
                if os.path.exists(output_file):
                    os.remove(output_file)
                    logging.info(f"Deleted image file: {output_file}")
            
            # Schedule the delete function to run when the program exits.
            atexit.register(delete_files)
        else:
            messagebox.showinfo("Notice", "No imaging job is in progress.")

    def create_image(self, source_drive, output_file):
        logging.info(f"Starting imaging process for {source_drive} to {output_file}")
        fd = None
        try:
            fd = os.open(source_drive, os.O_RDONLY | os.O_BINARY)
            logging.info("Opened the physical drive successfully")

            with open(output_file, 'wb') as img:
                logging.info(f"Attempting to write to the output file: {output_file}")
                total_bytes = 0
                try:
                    total_size = self.get_drive_size(source_drive)  # Total size of the source drive in bytes
                except Exception as e:
                    logging.error(f"Error getting total size: {e}")
                    total_size = 0

                while total_bytes < total_size:
                    if not self.current_process:
                        logging.info("Imaging process was stopped by the user")
                        self.update_status_label("Imaging was stopped.\n(The partial file is deleted when the program exits. You may also delete it manually first.)")
                        return

                    data = os.read(fd, 1024 * 1024)
                    if not data:
                        logging.info("No more data to read from source drive, end of file reached.")
                        break

                    img.write(data)
                    total_bytes += len(data)
                    #logging.info(f"{total_bytes} bytes written successfully")

                    if total_size > 0:
                        progress = (total_bytes / total_size) * 100
                        self.root.after(0, self.update_progress, progress)
                        self.root.after(0, self.update_status_label, f"Creating image: {total_bytes} / {total_size} bytes\n({progress:.3f}%)")
                    else:
                        logging.warning("Total size of the drive is reported as zero, unable to calculate progress.")

                    self.root.update_idletasks()

                logging.info(f"Data written to the image file successfully: {total_bytes} bytes")

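                # Flush buffered writes first; the image is still open, so unflushed data would be missing from the hash.
                img.flush()
                md5_hash = "unknown"  # Shown in the status label if hashing fails
                try:
                    md5 = hashlib.md5()
                    with open(output_file, 'rb') as img_file:
                        # Hash in 1 MiB chunks instead of loading the whole image into memory
                        for chunk in iter(lambda: img_file.read(1024 * 1024), b''):
                            md5.update(chunk)
                    md5_hash = md5.hexdigest()
                    with open(f"{output_file}.log", "a") as log_file:
                        log_file.write(f"Image file: {output_file}\nMD5: {md5_hash}\n")
                    logging.info("Image file name and MD5 hash logged successfully.")
                except Exception as e:
                    logging.error(f"Error logging image file name and MD5 hash: {e}")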

                self.update_status_label(f"The image has been created.\nImage file: {output_file}\nMD5: {md5_hash}\nYou can acquire another image.")
                messagebox.showinfo("Done", "Imaging is complete.")
                logging.info("Imaging completed successfully and user informed")

        except PermissionError as e:
            logging.error(f"PermissionError during read/write operation: {e}")
            messagebox.showerror("Permission Error", f"You do not have permission to access the drive: {e}")
        except Exception as e:
            logging.error("Error during imaging: %s", str(e))
            messagebox.showerror("Error", f"An error occurred during imaging: {str(e)}")
        finally:
            if fd is not None:
                os.close(fd)
                logging.info("Closed the file descriptor for the drive")

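    def get_drive_size(self, source_drive):
        # Query the exact byte length from the device. Parsing the rounded display
        # string (e.g. "465.76 GB") would stop the copy early or read past the end.
        h_drive = create_file(source_drive, GENERIC_READ, FILE_SHARE_READ | FILE_SHARE_WRITE, OPEN_EXISTING, 0)
        if h_drive == ctypes.wintypes.HANDLE(-1).value:
            raise ctypes.WinError(ctypes.get_last_error())
        try:
            length_info = GET_LENGTH_INFORMATION()
            if not device_io_control(h_drive, IOCTL_DISK_GET_LENGTH_INFO, None, length_info):
                raise ctypes.WinError(ctypes.get_last_error())
            return length_info.Length
        finally:
            kernel32.CloseHandle(ctypes.wintypes.HANDLE(h_drive))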

if __name__ == "__main__":
    if not is_admin():
        messagebox.showerror("Permission Error", "Please run this program as administrator.")
    else:
        root = tk.Tk()
        app = App(root)
        root.mainloop()

To turn the source above into an executable, see 0-3) Building an Executable from Python Source. You can build it with auto-py-to-exe (pyinstaller version 5.13.2). Install any modules you need with the pip command.

Program download

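After downloading the program below, check the file's hash values (MD5, SHA-256); if they do not match, you cannot verify which source the file was built from, so investigate carefully before using it. Continue with the practice screens below. This is not a polished program, so please treat it only as a reference for how such a tool can be built.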
Executable download

물리이미징 도구.zip (10MB)
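
Checking a downloaded file against known hash values can be sketched as follows. This is a hedged illustration only: `file_digests` and `matches_published` are hypothetical helper names, and the expected values must come from the distributor, since none are reproduced here.

```python
import hashlib


def file_digests(path, algorithms=("md5", "sha256"), chunk_size=1024 * 1024):
    """Compute several hashes of a file in one pass, reading it in chunks."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_published(path, expected):
    """Return True only if every expected digest matches the file.

    expected is a dict such as {"md5": "...", "sha256": "..."} holding the
    values published by whoever distributed the file (placeholders here).
    """
    actual = file_digests(path, tuple(expected))
    return all(actual[name] == value.strip().lower() for name, value in expected.items())
```

A call would look like `matches_published("물리이미징 도구.zip", {"sha256": "<published value>"})`; if it returns False, treat the file as unverified.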

2. Wrap-up

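I recommend building features like this because, while using analysis tools well matters, it also matters to understand the principle behind them and to learn which parts are hard to implement.
Of course, this tool still lacks many features. Anyone who has used forensic tools extensively will notice plenty that is missing: E01 support, split image files, and, going further, saving the physical image to a remote location. There is ample room to extend it.
Once you understand the principle, you can develop forensic tools like this one, and implementing what you have learned in working code is a very valuable exercise.
This is also useful for personal research. I normally rely on well-known forensic tools, but when such a tool is overloaded with features or slow to work through, a small tool or script that does only what I need lets me analyze or test quickly. And when existing tools do not support something you need, you should be able to build it yourself like this.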

3-line summary

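The unexpectedly difficult part of physical imaging was gaining physical access to the storage medium and reading from sector 0 through the last sector.
Analysis tools, ChatGPT, and the like effectively automate work we would otherwise check by hand, such as reading values and converting hexadecimal to decimal, so make the most of them.
However, you must understand the principle precisely; only then can you tell what ChatGPT or an analysis tool is getting wrong and how to fix it.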

