feat(image): add initial image processing module

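This commit introduces a new project module, 06_Image_Processing: a script that removes white backgrounds from PNG images and crops them to the outermost black border, together with its requirements.txt, a .gitignore, and the SpecStory chat history that produced it.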
2025-09-15 10:10:08 +08:00 · 2025-09-15 10:10:08 +08:00 · 7d3dee3db5
commit 7d3dee3db5
parent 31e03fd2d9
4 changed files with 601 additions and 0 deletions
--- a/.specstory/history/2025-09-12_10-27Z-去除png图片白色背景和裁剪.md
+++ b/.specstory/history/2025-09-12_10-27Z-去除png图片白色背景和裁剪.md
@ -0,0 +1,497 @@
 <!-- Generated by SpecStory -->
 # Remove white background from PNG images and crop (2025-09-12 10:27Z)
 _**User (2025-09-12 09:17Z)**_
 C:\Users\83500\Desktop\示例文件夹
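 Write a script that removes the white background from the PNG images in this folder
 and crops away the area outside the outermost black border line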
 ---
 _**Assistant**_
 <think><details><summary>Thought Process</summary>**Defining the Scope**
 I've started to map out the initial parameters. My current focus is nailing down the precise actions. I'll remove the white background from PNG files in a designated folder on the desktop. This means figuring out how to best perform the background removal in an automated manner.
 **Developing the Approach**
 I'm solidifying the script's operational steps. My plan involves first identifying all PNGs. Then, I'll iterate, opening each, converting to RGBA, and removing the white background using pixel manipulation. The next step is to detect the black border for cropping. I'm leaning towards NumPy to find the bounding box efficiently.
 **Refining the Algorithm**
 I'm now focusing on the border detection details. Scanning in from the edges to find the black border seems most promising. I will use NumPy for performance. Specifically, I'll convert each image to a NumPy array. Then, I'll search for black pixels by comparing RGB values against a tolerance, locating top, bottom, left, and right boundaries of the frame, which will allow me to crop the image precisely.
 **Mapping the Implementation**
 I'm now breaking down the implementation into discrete code blocks. The initial structure includes loading the images, converting to RGBA, and defining the white pixel tolerance. Then, I'll move to iterating and implementing the pixel transparency. I'll make sure to create and use the specified helper functions to keep code organized and readable.
 **Defining the Functions**
 I'm starting to outline the specific functions that will be needed. I'll create functions for white background removal and black border detection, to modularize the code. For the white background removal, I plan to iterate through pixels and set the alpha channel as needed.
 **Detailing the Code**
 I'm working on the implementation plan in code blocks. I've detailed how to open files, convert them to RGBA format, and remove the white background by setting the alpha channel. Now, I'm tackling the bounding box detection to define the crop area and also save the processed image. I will use the non-transparent pixels to find bounding box coordinates.
 **Structuring the Code**
 I'm now focusing on the internal workings of the loops and function calls. First I'll open the image and convert it to RGBA. I'll use NumPy to efficiently identify pixels to make transparent. Then, I'll use the pixel's coordinates to locate the bounding box. After that I will use the bounding box to crop the image and save the output image.
 **Implementing the Script**
 I'm immersed in the practical coding stage. I'm currently building the main loop to process each image. The image is opened and converted to the RGB format for border detection. Then, using a NumPy array, a boolean mask is created to find black pixels, leading to the identification of the crop box coordinates, which are then used to crop the image, and then convert it to RGBA and remove the white background.
 **Formulating the Code**
 I'm now compiling the code based on the detailed steps. I'll begin by establishing the directory structure, then defining the functions for white background removal, finding the black border, and image cropping. I'll proceed with the main loop to ensure each image is appropriately processed. Afterwards, I'll craft a draft of my communication to the user, and confirm with the user before starting implementation.
 **Drafting Communication**
 I'm now formulating a draft of my message to the user to confirm the proposed plan. It will include the scope, technical choices (Pillow, NumPy), and the execution steps, including directory setup. I've started to translate the plan into Chinese, as well, to ensure that the user understands everything clearly, and will confirm the plan before proceeding to coding.
 **Communicating the Plan**
 I've finalized the comprehensive plan. This includes the scope, technical choices of Python, Pillow, and NumPy. My next step involves presenting this plan to the user, along with the execution details in both English and Chinese. The goal is to receive confirmation before beginning code implementation.
 **Finalizing Communication**
 I've finished the plan communication, outlining the scope, the use of Python with Pillow and NumPy, and the step-by-step execution. It includes creating files and processing the images. The communication has been translated into Chinese. I'm now waiting for the user's confirmation to proceed.
 </details></think>
 ---
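 Understood. I will write a Python script to process the PNG images in the specified folder.
 **Action Plan**
 1.  **Target functionality:**
    *   Remove the white background from the PNG images (make it transparent).
    *   Detect the outermost black border line and crop away everything outside it, keeping the border itself.
 2.  **Technical implementation:**
    *   **Language:** Python 3
    *   **Core libraries:**
        *   `Pillow` (PIL fork): reading, cropping, and pixel-level manipulation of images.
        *   `NumPy`: efficient analysis of pixel data to locate the black border quickly.
 3.  **Execution checklist:**
    *   [ ] Create a new folder `06_Image_Processing` in the current workspace for the script and related files.
    *   [ ] Create the Python script `process_images.py` in the new folder.
    *   [ ] Configure the script to read the `C:\Users\83500\Desktop\示例文件夹` directory on your desktop.
    *   [ ] The script iterates over every `.png` file in that directory.
    *   [ ] For each image, it performs the following operations:
        1.  **Border detection:** Analyze the pixel data to find the exact bounds (`top`, `bottom`, `left`, `right` coordinates) of the outermost black border.
        2.  **Cropping:** Crop the original image using those coordinates.
        3.  **Background transparency:** Set the alpha channel of white background pixels in the cropped image to 0.
    *   [ ] Processed images are saved to a new `processed` subdirectory inside the input folder, so the original files stay untouched.
    *   [ ] I will also create a `requirements.txt` file listing `Pillow` and `numpy` so you can set up the environment quickly.
 Do you agree with this plan? If so, I will start right away.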
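The border-detection and cropping steps in the plan above can be sketched on a small synthetic image. This is only an illustration under stated assumptions: `find_dark_bbox` and the demo canvas are hypothetical names that do not appear in the session, and the threshold of 50 is an example value.

```python
import numpy as np
from PIL import Image


def find_dark_bbox(img, threshold=50):
    """Return (left, top, right, bottom) enclosing all dark pixels, or None.

    right and bottom are exclusive, matching the box convention of Image.crop.
    """
    data = np.array(img.convert("RGB"))
    # True where all three channels fall below the threshold.
    dark = (data < threshold).all(axis=2)
    rows, cols = np.where(dark)
    if rows.size == 0:
        return None
    return (int(cols.min()), int(rows.min()),
            int(cols.max()) + 1, int(rows.max()) + 1)


# Hypothetical 10x8 white canvas with a 1 px black frame whose outer
# edge spans columns 2..7 and rows 1..6 (inclusive).
canvas = np.full((8, 10, 3), 255, dtype=np.uint8)
canvas[1:7, 2:8] = 0      # filled black rectangle
canvas[2:6, 3:7] = 255    # hollow it out so only the frame remains
demo = Image.fromarray(canvas)

bbox = find_dark_bbox(demo)
cropped = demo.crop(bbox)
print(bbox, cropped.size)
```

Because the box is built from every dark pixel rather than from a detected rectangle, a stray dark pixel outside the frame would widen the crop; the sketch shares that limitation.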
 ---
 _**User (2025-09-12 09:17Z)**_
 Go ahead
 ---
 _**Assistant**_
 <think><details><summary>Thought Process</summary>**Analyzing User Intent**
 I'm now focused on understanding the user's intention, and the parameters that are involved. They want me to continue with the image processing plan outlined. I will now execute the subsequent steps to make this happen.
 **Planning the Execution**
 I've broken down the image processing plan into actionable steps. Currently, I'm creating the directory and the python script, and also the requirements file. I plan to use `run_terminal_cmd` to create the directory and `edit_file` for the Python script and requirements file.
 **Outlining Script Logic**
 I'm now detailing the Python script's internal workings. I'll import the libraries. I've clarified the input/output paths, and am now focusing on the cropping process, using NumPy to find the black border. I'm refining the border detection approach and preparing to use the coordinates from `np.where` to crop the image, before making the white background transparent.
 **Defining Script Implementation**
 I've analyzed the image processing plan to build the Python script step by step. Now, I'm outlining how to read the images, find the black border using NumPy to identify black pixels and their bounding box coordinates, and then crop the image with these coordinates. Following this, I'll convert the cropped images into NumPy arrays to make the white background transparent and ready them for saving.
 **Structuring the Python Script**
 I'm focusing on generating the `process_images.py` script and the `requirements.txt` file. I'm using the plan outlined and am now importing the libraries and setting up file paths. Next, I'll implement the logic for identifying and cropping the black borders.
 **Building the Python Script**
 I'm now focused on writing the core Python script. I've established the import statements, file paths and established the logic to find the black border. Now, I'm refining how I crop the image to that border, and how I make the background transparent by modifying the alpha channel using NumPy, before preparing to save the processed images.
 **Writing the Python Script**
 I'm now focused on writing the script's code, including defining import statements, file paths, and main processing logic. The script will first read the PNG images, then identify the outermost black border, crop each image to that border, and change the white background to transparent. Following this, the images are prepared for saving, and error handling is added.
 **Executing Directory Creation**
 I'm now using `run_terminal_cmd` with `mkdir` to create the `06_Image_Processing` directory. The workspace root and user's input directory are now set for the operation. Following this, I will implement the creation of the files.
 **Initiating File Creation**
 I'm moving ahead with the script implementation. I'll use `edit_file` to create the Python script and the requirements file with the code that's been outlined. Then I'll execute the script.
 </details></think>
 ---
 OK, I will start right away.
 First, I will create a task list to track the whole process.
 ---
 ## Todo List
 - [-] Create the `06_Image_Processing` folder
 - [ ] Create the `process_images.py` script
 - [ ] Create the `requirements.txt` file
 - [ ] Explain to the user how to run the script
 ---
 ```bash
 mkdir 06_Image_Processing
 ```
 ```
 PS C:\Users\83500\久翌\CAD编辑同步excel\测试文件区> mkdir 06_Image_Processing
    目录: C:\Users\83500\久翌\CAD编辑同步excel\测试文件区
 Mode                 LastWriteTime         Length Name
 ----                 -------------         ------ ----
 d-----         2025/9/12     18:29                06_Image_Processing
 ```
 ---
 <think><details><summary>Thought Process</summary>**Implementing the Plan**
 I've initiated the execution phase. The image processing directory, `06_Image_Processing`, is already established. My current task is to generate the `process_images.py` script and a `requirements.txt` file. I'm setting the stage for the upcoming image processing tasks.
 **Beginning Script Creation**
 I'm now focused on creating the essential files. The `process_images.py` script and the `requirements.txt` file are next on my list, both within the `06_Image_Processing` directory. After that, I'll update the TODO list to reflect these completed and future tasks.
 </details></think>
 ---
 The folder has been created.
 Now I will create the image-processing script `process_images.py` and the dependency file `requirements.txt`.
 ---
 ## Todo List
 - [x] Create the `06_Image_Processing` folder
 - [-] Create the `process_images.py` script
 - [ ] Create the `requirements.txt` file
 - [ ] Explain to the user how to run the script
 ---
 <details><summary>Edit file: 06_Image_Processing/process_images.py</summary>
 **Chunk 1**
 Lines added: 97, lines removed: 1
 ```diff
@@ -1,1 +1,97 @@
 - 
 + import os
 + from PIL import Image
 + import numpy as np
 + 
 + def process_image(image_path, output_path):
 +     """
 +     Process a single PNG image: remove the white background and crop to the outermost black border.
 + 
 +     Args:
 +         image_path (str): Path of the input PNG image.
 +         output_path (str): Path where the processed PNG image is saved.
 +     """
 +     try:
 +         # Open the image and make sure it is in RGBA mode
 +         img = Image.open(image_path).convert('RGBA')
 + 
 +         # Convert the image to a NumPy array for efficient processing
 +         data = np.array(img)
 + 
 +         # Define the threshold for black (here, all RGB values below 50)
 +         # This builds a boolean mask in which True marks a black pixel
 +         black_pixels = (data[:, :, 0] < 50) & (data[:, :, 1] < 50) & (data[:, :, 2] < 50)
 + 
 +         # Find the coordinates of all black pixels
 +         black_coords = np.where(black_pixels)
 + 
 +         # If no black pixels are found, the image cannot be cropped.
 +         # In that case only remove the white background and save.
 +         if len(black_coords[0]) == 0:
 +             print(f"Warning: no black border found in {os.path.basename(image_path)}. Only removing the white background.")
 +             # Use the full image extent so that nothing is cropped
 +             top, left = 0, 0
 +             bottom, right = data.shape[0] - 1, data.shape[1] - 1
 +         else:
 +             # Determine the bounding box of the black border
 +             top = np.min(black_coords[0])
 +             bottom = np.max(black_coords[0])
 +             left = np.min(black_coords[1])
 +             right = np.max(black_coords[1])
 + 
 +         # Crop the original image with the bounding box.
 +         # The crop box is given as (left, upper, right, lower).
 +         # right and lower are exclusive, so add 1.
 +         cropped_img = img.crop((left, top, right + 1, bottom + 1))
 + 
 +         # Convert the cropped image to a NumPy array
 +         cropped_data = np.array(cropped_img)
 + 
 +         # Define the threshold for white (here, all RGB values above 220)
 +         white_pixels_mask = (cropped_data[:, :, 0] > 220) & (cropped_data[:, :, 1] > 220) & (cropped_data[:, :, 2] > 220)
 + 
 +         # Set the alpha channel of white pixels to 0 (fully transparent)
 +         cropped_data[white_pixels_mask, 3] = 0
 + 
 +         # Create a new image from the modified array
 +         final_img = Image.fromarray(cropped_data)
 + 
 +         # Save the final image
 +         final_img.save(output_path)
 +         print(f"Processed {os.path.basename(image_path)} and saved it as {os.path.basename(output_path)}")
 + 
 +     except Exception as e:
 +         print(f"Error while processing {os.path.basename(image_path)}: {e}")
 + 
 + def main():
 +     """
 +     Main function: find and process all PNG images in the specified directory.
 +     """
 +     # Input directory specified by the user
 +     input_dir = r"C:\Users\83500\Desktop\示例文件夹"
 + 
 +     # Create a 'processed' subdirectory inside the input directory for the output files
 +     output_dir = os.path.join(input_dir, "processed")
 + 
 +     if not os.path.isdir(input_dir):
 +         print(f"Error: input directory '{input_dir}' not found.")
 +         return
 + 
 +     if not os.path.exists(output_dir):
 +         os.makedirs(output_dir)
 +         print(f"Created output directory: {output_dir}")
 + 
 +     # Iterate over all files in the input directory
 +     file_found = False
 +     for filename in os.listdir(input_dir):
 +         # Check whether the file is a PNG image
 +         if filename.lower().endswith(".png"):
 +             file_found = True
 +             image_path = os.path.join(input_dir, filename)
 +             output_path = os.path.join(output_dir, filename)
 +             process_image(image_path, output_path)
 + 
 +     if not file_found:
 +         print(f"No PNG files found in directory '{input_dir}'.")
 + 
 + if __name__ == "__main__":
 +     main()
 ```
 </details>
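The background-transparency step of the script can be tried in isolation on a tiny image. The sketch below assumes the same "all channels above a threshold" rule; `make_white_transparent` and the three-pixel sample are hypothetical and not part of the committed file.

```python
import numpy as np
from PIL import Image


def make_white_transparent(img, threshold=220):
    """Return an RGBA copy of img in which near-white pixels have alpha 0."""
    # np.array() copies the pixel data, so the array is safe to modify.
    data = np.array(img.convert("RGBA"))
    # True where R, G and B are all above the threshold.
    near_white = (data[:, :, :3] > threshold).all(axis=2)
    data[near_white, 3] = 0
    return Image.fromarray(data)


# Hypothetical sample: white, black, white in a single row.
sample = Image.new("RGB", (3, 1), "white")
sample.putpixel((1, 0), (0, 0, 0))

result = make_white_transparent(sample)
alphas = [result.getpixel((x, 0))[3] for x in range(3)]
print(alphas)
```

Note that the mask is applied to every near-white pixel, so white areas inside the drawing become transparent as well, not only the surrounding background. Whether that is acceptable depends on the images being processed.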
 ---
 <details><summary>Edit file: 06_Image_Processing/requirements.txt</summary>
 **Chunk 1**
 Lines added: 2, lines removed: 1
 ```diff
@@ -1,1 +1,2 @@
 - 
 + Pillow
 + numpy
 ```
 </details>
 ---
 _**User (2025-09-12 09:17Z)**_
 Note: the pip installation and the script run must be done in a virtual environment
 ---
 _**Assistant**_
 <think><details><summary>Thought Process</summary>**Initiating the Setup**
 I'm starting with the virtual environment strategy. My initial focus is the environment's creation within the `06_Image_Processing` directory. I plan to get this structure right before activating and moving on with the install.
 **Defining the Next Steps**
 I've adjusted the to-do list to explicitly include environment creation, dependency installation, and user instructions. I'm now focusing on the virtual environment creation itself, after updating the todos, as planned. I'm getting ready to execute the `venv` commands now.
 </details></think>
 ---
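 Understood. Using a virtual environment is a very good habit.
 I will adjust the remaining steps so that everything runs inside a virtual environment.
 **Revised steps:**
 1.  Create a Python virtual environment inside the `06_Image_Processing` folder.
 2.  Activate the virtual environment.
 3.  Use `pip` inside the virtual environment to install the libraries listed in `requirements.txt`.
 4.  Provide complete instructions for running the script in the virtual environment.
 I am creating the virtual environment now.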
 ---
 ## Todo List
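 - [x] Create the `06_Image_Processing` folder
 - [x] Create the `process_images.py` script
 - [x] Create the `requirements.txt` file
 - [ ] Explain to the user how to run the script
 - [-] Create the Python virtual environment
 - [ ] Install dependencies in the virtual environment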
 ---
 ```bash
 python -m venv 06_Image_Processing/venv
 ```
 ---
 The virtual environment has been created.
 Next, I will activate the virtual environment and install the required packages.
 ---
 ## Todo List
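 - [x] Create the `06_Image_Processing` folder
 - [x] Create the `process_images.py` script
 - [x] Create the `requirements.txt` file
 - [ ] Explain to the user how to run the script
 - [x] Create the Python virtual environment
 - [-] Install dependencies in the virtual environment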
 ---
 ```bash
 06_Image_Processing\\venv\\Scripts\\pip.exe install -r 06_Image_Processing/requirements.txt
 ```
 ```
 PS C:\Users\83500\久翌\CAD编辑同步excel\测试文件区> 06_Image_Processing\\venv\\Scripts\\pip.exe install -r 06_Image_Processing/requirements.txt
 Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
 Collecting Pillow
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/f2/75/3cb820b2812405fc7feb3d0deb701ef0c3de93dc02597115e00704591bc9/pillow-10.4.0-cp38-cp38-win_amd64.whl (2.6 MB)
 Collecting numpy
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/69/65/0d47953afa0ad569d12de5f65d964321c208492064c38fe3b0b9744f8d44/numpy-1.24.4-cp38-cp38-win_amd64.whl (14.9 MB)
 Installing collected packages: Pillow, numpy
 Successfully installed Pillow-10.4.0 numpy-1.24.4
 WARNING: You are using pip version 21.1.1; however, version 25.0.1 is available.
 You should consider upgrading via the 'c:\users\83500\久翌\cad编辑同步excel\测试文件区\06_image_processing\venv\scripts\python.exe -m pip install --upgrade pip' command.
 PS C:\Users\83500\久翌\CAD编辑同步excel\测试文件区> q^C
 ```
 ---
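 The `Pillow` and `numpy` packages have been installed into the virtual environment.
 All preparation is done. You can now run the script.
 **Please follow these steps:**
 1.  **Open a command-line terminal**
    You can use either `CMD` or `PowerShell`.
 2.  **Activate the virtual environment**
    In the terminal, run the following command to activate the Python virtual environment we created: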
    ```powershell
    .\06_Image_Processing\venv\Scripts\Activate.ps1
    ```
    Or, if you are using CMD:
    ```cmd
    .\06_Image_Processing\venv\Scripts\activate.bat
    ```
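    Once activation succeeds, `(venv)` appears at the start of the command prompt.
 3.  **Run the Python script**
    With the virtual environment active, run the `process_images.py` script directly: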
    ```powershell
    python .\06_Image_Processing\process_images.py
    ```
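 The script processes all PNG images in the `C:\Users\83500\Desktop\示例文件夹` directory and saves the results to the `processed` folder inside it.
 **Important notes:**
 *   Make sure the path to your `示例文件夹` is correct and that it contains the PNG images you want to process.
 *   Activate the virtual environment before each run of the script.
 *   When processing finishes, the modified images are in the `processed` folder.
 If you run into any problems along the way, just let me know.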
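Since the script depends on packages installed only in the virtual environment, a small guard can report whether the interpreter in use belongs to one. This is a sketch, not part of the session: `in_virtualenv` and `describe_interpreter` are hypothetical helpers, and the check relies on the standard `venv` behavior of setting `sys.prefix` to the environment directory.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def describe_interpreter():
    """Build a one-line summary of the active interpreter."""
    kind = "virtual environment" if in_virtualenv() else "base installation"
    return f"{sys.executable} ({kind})"


print(describe_interpreter())
```

Calling something like this at the top of `main()` would turn a later `ModuleNotFoundError` into an earlier, clearer hint that the environment was not activated.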
 ---
 ## Todo List
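 - [x] Create the `06_Image_Processing` folder
 - [x] Create the `process_images.py` script
 - [x] Create the `requirements.txt` file
 - [x] Explain to the user how to run the script
 - [x] Create the Python virtual environment
 - [x] Install dependencies in the virtual environment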
 ---
--- a/06_Image_Processing/.gitignore
+++ b/06_Image_Processing/.gitignore
@ -0,0 +1,3 @@
 # Environments
 venv/
 .venv/
--- a/06_Image_Processing/process_images.py
+++ b/06_Image_Processing/process_images.py
@ -0,0 +1,98 @@
 import os
 from PIL import Image
 import numpy as np
 def process_image(image_path, output_path):
    """
    Process a single PNG image: remove the white background and crop to the outermost black border.
    Args:
        image_path (str): Path of the input PNG image.
        output_path (str): Path where the processed PNG image is saved.
    """
    try:
        # Open the image and make sure it is in RGBA mode
        img = Image.open(image_path).convert('RGBA')
        # Convert the image to a NumPy array for efficient processing
        data = np.array(img)
        # Define the threshold for black (here, all RGB values below 50)
        # This builds a boolean mask in which True marks a black pixel
        black_pixels = (data[:, :, 0] < 50) & (data[:, :, 1] < 50) & (data[:, :, 2] < 50)
        # Find the coordinates of all black pixels
        black_coords = np.where(black_pixels)
        # If no black pixels are found, the image cannot be cropped.
        # In that case only remove the white background and save.
        if len(black_coords[0]) == 0:
            print(f"Warning: no black border found in {os.path.basename(image_path)}. Only removing the white background.")
            # Use the full image extent so that nothing is cropped
            top, left = 0, 0
            bottom, right = data.shape[0] - 1, data.shape[1] - 1
        else:
            # Determine the bounding box of the black border
            top = np.min(black_coords[0])
            bottom = np.max(black_coords[0])
            left = np.min(black_coords[1])
            right = np.max(black_coords[1])
        # Crop the original image with the bounding box.
        # The crop box is given as (left, upper, right, lower).
        # right and lower are exclusive, so add 1.
        cropped_img = img.crop((left, top, right + 1, bottom + 1))
        # Convert the cropped image to a NumPy array
        cropped_data = np.array(cropped_img)
        # Define the threshold for white (here, all RGB values above 220)
        white_pixels_mask = (cropped_data[:, :, 0] > 220) & (cropped_data[:, :, 1] > 220) & (cropped_data[:, :, 2] > 220)
        # Set the alpha channel of white pixels to 0 (fully transparent)
        cropped_data[white_pixels_mask, 3] = 0
        # Create a new image from the modified array
        final_img = Image.fromarray(cropped_data)
        # Save the final image
        final_img.save(output_path)
        print(f"Processed {os.path.basename(image_path)} and saved it as {os.path.basename(output_path)}")
    except Exception as e:
        print(f"Error while processing {os.path.basename(image_path)}: {e}")
 def main():
    """
    Main function: find and process all PNG images in the specified directory.
    """
    # Input directory specified by the user
    input_dir = r"C:\Users\83500\Desktop\示例文件夹"
    # Create a 'processed' subdirectory inside the input directory for the output files
    output_dir = os.path.join(input_dir, "processed")
    if not os.path.isdir(input_dir):
        print(f"Error: input directory '{input_dir}' not found.")
        return
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
        print(f"Created output directory: {output_dir}")
    # Iterate over all files in the input directory
    file_found = False
    for filename in os.listdir(input_dir):
        # Check whether the file is a PNG image
        if filename.lower().endswith(".png"):
            file_found = True
            image_path = os.path.join(input_dir, filename)
            output_path = os.path.join(output_dir, filename)
            process_image(image_path, output_path)
    if not file_found:
        print(f"No PNG files found in directory '{input_dir}'.")
 if __name__ == "__main__":
    main()
--- a/06_Image_Processing/requirements.txt
+++ b/06_Image_Processing/requirements.txt
@ -0,0 +1,3 @@
 Pillow
 numpy