

File Encoding Conversion

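Legacy projects tend to come with plenty of problems, and encoding is usually the first to surface. GBK and UTF-8 are not compatible with each other, so IDEA shows garbled text as soon as it opens such a file with its default settings. Converting files one at a time is slow and tedious, so I put together a small tool that batch-converts everything to UTF-8, the internationally used encoding.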
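To make the incompatibility concrete, here is a minimal sketch using only Python's built-in codecs. The sample string is an arbitrary choice for illustration; it shows what happens when GBK bytes are read as UTF-8, and where the familiar 锟斤拷 comes from.

```python
text = "你好"
gbk_bytes = text.encode("gbk")

# Reading GBK bytes as UTF-8 with strict error handling fails outright.
try:
    gbk_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With errors="replace", the invalid bytes silently become U+FFFD placeholders.
lossy = gbk_bytes.decode("utf-8", errors="replace")

# If two such placeholders are saved as UTF-8 and later read back as GBK,
# they show up as the classic mojibake string.
mojibake = "\ufffd\ufffd".encode("utf-8").decode("gbk")

# Decoding with the codec the file was actually written in round-trips cleanly.
restored = gbk_bytes.decode("gbk")
```

Once the placeholders have been written back to disk the original characters are gone, which is why the conversion has to decode with the real source encoding rather than guess and replace.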

import os
import chardet
import codecs

def write_file(file_path, content, encoding="utf-8"):
    with codecs.open(file_path, "w", encoding) as f:
        f.write(content)

def convert_to_utf8(src_path):
    with open(src_path, "rb") as f:
        raw_data = f.read()
        detected = chardet.detect(raw_data)
        original_encoding = detected["encoding"]

    if original_encoding is None:
        print(f"[SKIP] {src_path}: encoding not detected")
        return

    if original_encoding.lower() != "utf-8":
        try:
            with codecs.open(src_path, "r", original_encoding) as f:
                content = f.read()
            write_file(src_path, content, encoding="utf-8")
            print(f"[OK] {src_path}: {original_encoding} → utf-8")
        except Exception as e:
            print(f"[ERROR] {src_path}: failed to convert ({original_encoding}) - {e}")
    else:
        print(f"[SKIP] {src_path}: already utf-8")

def process_directory(root_dir):
    for parent, dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            if filename.endswith((".java", ".jsp")):
                full_path = os.path.join(parent, filename)
                convert_to_utf8(full_path)

if __name__ == "__main__":
    src_path = "C:/Users/File"
    process_directory(src_path)
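One thing the script above does not handle: as far as I know, chardet tends to label GBK text as GB2312, and Python's gb2312 codec covers fewer characters than GBK, so some files may land in the [ERROR] branch. The sketch below is one possible way to normalize the detection result before decoding. The helper name `pick_codec` and the 0.5 confidence threshold are assumptions for illustration, not part of the original tool.

```python
def pick_codec(detected, min_confidence=0.5):
    """Map a chardet-style result dict to a codec name, or None to skip the file.

    `detected` is expected to look like {"encoding": "GB2312", "confidence": 0.99}.
    The threshold is an arbitrary assumption; tune it for your code base.
    """
    encoding = detected.get("encoding")
    confidence = detected.get("confidence") or 0.0
    if encoding is None or confidence < min_confidence:
        return None
    name = encoding.lower()
    if name in ("ascii", "utf-8"):
        # Pure ASCII is already valid UTF-8, so there is nothing to convert.
        return "utf-8"
    if name in ("gb2312", "gbk"):
        # GB18030 is a superset of both, so it decodes everything they can.
        return "gb18030"
    return name
```

In `convert_to_utf8`, this could stand in for reading `detected["encoding"]` directly, e.g. `original_encoding = pick_codec(detected)`, leaving the rest of the function unchanged.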

The script mainly converts java and jsp files. If you need other file types, add their extensions inside the parentheses on the line if filename.endswith((".java", ".jsp")).
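If the list of extensions changes often, one option is to lift it into a parameter instead of editing that line. This is only a sketch: the function name `iter_source_files` is hypothetical, and the case-insensitive match (so that `.JAVA` is also picked up) is an assumption that differs from the original script.

```python
import os

def iter_source_files(root_dir, extensions=(".java", ".jsp")):
    """Yield paths under root_dir whose file names end with one of the extensions."""
    suffixes = tuple(ext.lower() for ext in extensions)
    for parent, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            # str.endswith accepts a tuple, so one call checks every suffix.
            if filename.lower().endswith(suffixes):
                yield os.path.join(parent, filename)

# Possible usage with the original converter:
# for path in iter_source_files("C:/Users/File", (".java", ".jsp", ".xml")):
#     convert_to_utf8(path)
```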

Prerequisite: a working Python environment

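  1. First, copy the code and save it as a .py file. Any name works, e.g. convert.py

  2. Replace the path with the directory you want to convert. The root directory is enough, since it is walked recursively

    # Note: use forward slashes / as the path separator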
    src_path = "C:/Users/File"
  3. Open a terminal in the folder where you saved the file and run

    pip install chardet
  4. Finally, run

    python convert.py

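Lines starting with [OK] mean the conversion succeeded. Also, don't forget to change the pageEncoding at the top of each jsp file to UTF-8 as well; otherwise the deployed pages will be full of mojibake like 锟斤拷.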
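The pageEncoding change can also be scripted. The following is a rough sketch rather than part of the original tool: the helper name `set_page_encoding` and the regular expression are assumptions, and it only touches the `pageEncoding` attribute, not any `charset=` value inside `contentType`.

```python
import re

# Matches pageEncoding="..." or pageEncoding='...' and remembers which quote was used.
PAGE_ENCODING = re.compile(r"""(pageEncoding\s*=\s*)(["'])[^"']*\2""")

def set_page_encoding(jsp_text, target="UTF-8"):
    """Return jsp_text with every pageEncoding attribute value replaced by target."""
    def replace(match):
        prefix, quote = match.group(1), match.group(2)
        return f"{prefix}{quote}{target}{quote}"
    return PAGE_ENCODING.sub(replace, jsp_text)

# Possible usage on a file that has already been converted to UTF-8:
# with open(path, "r", encoding="utf-8", newline="") as f:
#     updated = set_page_encoding(f.read())
# with open(path, "w", encoding="utf-8", newline="") as f:
#     f.write(updated)
```

Run it only after the byte-level conversion, so that the declared encoding and the actual encoding of the file agree.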

Finally, here is a video for anyone unfamiliar with encodings. It is the same video on both platforms.

YouTube:

BiliBili:

File Encoding Conversion
https://vnsnclo.cn/blog/transcoding
Author Corleone
Published at June 7, 2025