法式牛排烹饪实验(AI创造营 第一期)
时间:2025-08-12
法式牛排烹饪实验(AI创造营 第一期)
本文详细介绍了将法语字幕电影转换成中文字幕的实用方法。首先,利用PaddleHub的法语文本识别功能和百度云提供的文本翻译服务,通过逐帧从影片中提取出包含字母的区域并进行识别,接着调用百度翻译获得相应的中文内容。最后,将译好的中文字幕嵌入到原始电影的图像上,最终形成一个新的视频文件。本文还提供了具体的代码示例来帮助读者实现这一过程。

法语字幕电影如何转为中文字幕?
看到有些法国电影视频只有法语字幕,没有被翻译,我们称之为“生肉”,那么如何使这些“生肉”变得美味可口呢?答案是:使用两个工具的巧妙结合!PaddleHub提供的文本识别功能可以帮助你读懂每句台词,而百度云提供的文本翻译功能则能将其翻译成中文。这样一来,“生肉”就能被轻松烹饪成了既保留法式风味又易于理解的经典作品了!
原图

翻译后

实现思路
这个示例详细展示了如何将法语字幕电影转换成中文字幕的过程。首先,逐帧提取影片为静态图像;接着从含有字母的区域提取并利用Paddle Hub的法文识别技术解析文本;随后调用百度翻译服务获得准确的中文字幕内容;最后,在每一帧处理后的图像上显示中文字幕,并将处理过的图片制作成视频文件。
In []
# 进度条 def process_bar(percent, start_str='', end_str='', total_length=0): bar = ''.join(["="] * int(percent * total_length)) + '' bar = '\r' + start_str +' ['+ bar.ljust(total_length) + ']{:0>4.1f}% '.format(percent*100) + end_str print(bar, end='', flush=True) process_bar(75/100, start_str='f_name', end_str='', total_length=50)登录后复制
f_name [===================================== ]75.0%登录后复制
In []
# 针对PaddleHub的更新频率较高,建议尽快升级至最新版本。以下是升级步骤: - 使用以下命令更新PaddleHub到最新版本:`!pip install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple` - 此模块依赖于第三方库shapely和pyclipper,请确保先安装这两个库。运行以下命令来安装它们: - `!pip install shapely -i https://pypi.tuna.tsinghua.edu.cn/simple` - `!pip install pyclipper -i https://pypi.tuna.tsinghua.edu.cn/simple`# 安装法语识别预训练模型 - 下载并安装法国OCR数据库的CRNN移动版本:`!hub install french_ocr_db_crnn_mobile==# 安装视频处理库 - 使用moviepy处理和编辑视频片段,可执行以下命令: - `pip install moviepy -i https://pypi.tuna.tsinghua.edu.cn/simple`# 操作指南已完成,请确保已完成上述步骤,并根据需求进行操作。
In []
from PIL import Image import numpy as np import os import cv2 import matplotlib.pyplot as plt import matplotlib.image as mpimg import paddlehub as hub import cv2 ocr = hub.Module(name="french_ocr_db_crnn_mobile") #result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])登录后复制
[2021-03-25 19:27:36,079] [ WARNING] - The _initialize method in HubModule will soon be deprecated, you can use the __init__() to handle the initialization of the object登录后复制
In []
!ls data/data76619/video-001.mp4登录后复制
data/data76619/video-001.mp4登录后复制
In []
from IPython.display import HTML HTML("""<h3>处理前视频</h3><iframe style="width:98%;height: 450px;" src="//player.bilibili.com/player.html"appid":"2021009693695","appkey":"Jwo)n2mdwxsAdRfgwp"}
In []
# 加载配置 import json cfg=json.loads(open("config.txt","r").read())登录后复制
In []
# -*- coding: utf-8 -*- # This code shows an example of text translation from English to Simplified-Chinese. # This code runs on Python 2.7.x and Python 3.x. # You may install `requests` to run this code: pip install requests # Please refer to `https://api.fanyi.baidu.com/doc/21` for complete api document import requests import random import json from hashlib import md5 # 调用翻译 def translate_to_cn(query,from_lang='zh'): # Set your own appid/appkey. appid = cfg["appid"] #你的appid appkey = cfg["appkey"] #你的密钥 # For list of language codes, please refer to `https://api.fanyi.baidu.com/doc/21` #from_lang = 'fra' to_lang = 'zh' endpoint = 'http://api.fanyi.baidu.com' path = '/api/trans/vip/translate' url = endpoint + path #query = "La folle histoire de Max et Léon " # Generate salt and sign def make_md5(s, encoding='utf-8'): return md5(s.encode(encoding)).hexdigest() salt = random.randint(32768, 65536) sign = make_md5(appid + query + str(salt) + appkey) # Build request headers = {'Content-Type': 'application/x-www-form-urlencoded'} payload = {'appid': appid, 'q': query, 'from': from_lang, 'to': to_lang, 'salt': salt, 'sign': sign} # Send request r = requests.post(url, params=payload, headers=headers) svr_result = r.json() #svr_result_dic = json.dumps(svr_result, indent=4, ensure_ascii=False) try : tmp = svr_result["trans_result"] if len(tmp)<=0: return "" else: return tmp[-1]["dst"] except Exception: print(r) return "" # 测试一下 rst = translate_to_cn("La folle histoire de Max et Léon ","fra") print(rst)登录后复制
麦克斯和利昂的疯狂故事登录后复制
视频处理框架
In []
在视频处理中,我们通过使用PIL库实现逐帧处理。首先,读取输入的MP件,然后将源路径中的文件名转换为适当格式以适应输出avi文件。接着定义了视频和帧处理函数,包括视频大小、帧率和帧计数的获取,以及逐帧处理过程中显示进度条的功能。使用这些参数,我们可以对输入的MP件进行逐帧处理,并将其结果存储在指定目录中的.avi文件中。示例代码展示了如何将“data/datavideo-mp文件中的第一个保存为.avi格式的avi文件,同时显示进度条来跟踪处理过程。
/home/aistudio work/video-001.avi [==================================================]100.0% processed 1919 frames in total. video-001.sample_100.avi [==================================================]100.0% processed 1919 frames in total.登录后复制
处理图片
In [116]
使用Python中的PIL库,你可以创建一个自动识别和翻译视频中法语字幕的脚本。该方法涉及从图像中裁剪出文本区域并将其传递给OCR模块进行识别人工智能识别的文字内容。如果识别结果为空或为空,则保留原始图像;否则,使用翻译函数将识别的文字转换为中文,并在原位置生成翻译后的文字图像。为了提高效率和准确性,可以考虑引入缓存机制以减少重复计算的时间消耗。
[2021-03-25 22:43:37,006] [ WARNING] - The _initialize method in HubModule will soon be deprecated, you can use the __init__() to handle the initialization of the object登录后复制
In [117]
print('processed {} frames in total.'.format( video_process("data/data76619/video-001.mp4",'',proc=process_frame_img,sample=1) ))登录后复制
[2021-03-25 22:43:49,277] [ WARNING] - The _initialize method in HubModule will soon be deprecated, you can use the __init__() to handle the initialization of the object登录后复制
video-001.sample_1.avi [==================================================]100.0% processed 1919 frames in total.登录后复制
In [118]
from IPython.display import HTML HTML("""<h3>处理后视频</h3><iframe style="width:98%;height: 450px;" src="//player.bilibili.com/player.html?bvid=BV1554y1h7im&cid=186472861&page=1" scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="true" > </iframe>""")登录后复制
<IPython.core.display.HTML object>登录后复制登录后复制
华丽的分割线
下面是实验用到的
In []
# 视频逐帧存图片文件 import os import cv2 import matplotlib.pyplot as plt import matplotlib.image as mpimg %cd ~ def extract_images(src_video_, dst_dir_): video_ = cv2.VideoCapture(src_video_) count = 0 while True: flag, frame = video_.read() if not flag: break cv2.imwrite(os.path.join(dst_dir_, str(count) + '.png'), frame) #print(os.path.join(dst_dir_, str(count) + '.png')) #print(count) count = count + 1 print('extracted {} frames in total.'.format(count)) src_video = '/home/aistudio/data/data76619/video-001.mp4' dst_dir_ = '/home/aistudio/work/video' extract_images(src_video, dst_dir_)登录后复制
/home/aistudio extracted 1919 frames in total.登录后复制
In []
%cd ~ os.listdir('/home/aistudio/data/data76619/video-001.mp4')登录后复制
/home/aistudio登录后复制
['video-001.mp4']登录后复制
In []
# 以下示例在pil中剪切部分图片 from PIL import Image #im1 = Image.open("work/video/278.png") im1 = Image.open("work/video/1.png") print(im1.size) # (352, 288) region = (0,323,624,408) #裁切图片 cropImg = im1.crop(region) cropImg.save('sample.jpg') img_mask_2 = Image.new( 'RGB', (624, 408), ( 0, 0, 0 ) ) #img_mask_2.paste( cropImg,( 0,323,624 ,85 ) ) img_mask_2.paste( cropImg,( 0,323,624 ,408 ) ) img_mask_2.save('sample2.jpg') import paddlehub as hub import cv2 ocr = hub.Module(name="french_ocr_db_crnn_mobile") result = ocr.recognize_text(images=[cv2.imread('sample2.jpg')])#,cv2.imread("work/video/278.png") print(result)登录后复制
(624, 408)登录后复制
[2021-03-25 21:12:53,272] [ WARNING]登录后复制
- The _initialize method in HubModule will soon be deprecated, you can use the __init__() to handle the initialization of the object登录后复制
[2021-03-25 21:12:53,730] [ WARNING] - The _initialize method in HubModule will soon be deprecated, you can use the __init__() to handle the initialization of the object登录后复制
[{'save_path': '', 'data': []}]登录后复制
In []
将字体文件从“数据”目录中的`datasimhei.ttf`复制到您的机器的matplotlib字体路径,方法如下:```bash !cp data/datasimhei.ttf /opt/conda/envs/pythonpaddleenv/lib/pythonsite-packages/matplotlib/mpl-data/fonts/ttf/ ```或者,如果在aistudio上没有足够的权限复制到`/usr/share/fonts/`,可以尝试将字体文件直接复制到`~/.fonts/`:```bash !cp simhei.ttf ~/.fonts/ ```确保字体文件已正确复制到上述路径后,您可能还需要重启您的终端或Python环境以使更改生效。
In []
print(trans_cn_txt) from PIL import Image , ImageOps , ImageDraw, ImageFont font_size = 20 str_out = trans_cn_txt font = ImageFont.truetype("data/data76619/simhei.ttf", font_size, encoding="unic")#设置字体 back_color = 'black' d = Image.new("RGB", (624 ,85), back_color) t = ImageDraw.Draw(d) t.text((font_size, font_size), str_out, 'white', font) #d.save('sample3.jpg') from PIL import Image im1 = Image.open("work/video/278.png") im1.paste( d,( 0,323,624 ,408 ) ) im1.save('sample3.jpg')登录后复制
在过去的几年里登录后复制
In [1]
<br/>登录后复制
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Requirement already satisfied: moviepy in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (1.0.1) Requirement already satisfied: decorator<5.0,>=4.0.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from moviepy) (4.4.0) Requirement already satisfied: imageio<3.0,>=2.5; python_version >= "3.4" in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from moviepy) (2.6.1) Requirement already satisfied: imageio-ffmpeg>=0.2.0; python_version >= "3.4" in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from moviepy) (0.3.0) Requirement already satisfied: tqdm<5.0,>=4.11.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from moviepy) (4.36.1) Requirement already satisfied: numpy in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from moviepy) (1.16.4) Requirement already satisfied: requests<3.0,>=2.8.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from moviepy) (2.22.0) Requirement already satisfied: proglog<=1.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from moviepy) (0.1.9) Requirement already satisfied: pillow in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imageio<3.0,>=2.5; python_version >= "3.4"->moviepy) (7.1.2) Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests<3.0,>=2.8.1->moviepy) (1.25.6) Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests<3.0,>=2.8.1->moviepy) (3.0.4) Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests<3.0,>=2.8.1->moviepy) (2019.9.11) Requirement already satisfied: idna<2.9,>=2.5 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests<3.0,>=2.8.1->moviepy) (2.8)登录后复制
In [13]
#from moviepy.editor import AudioFileCLip #audio = AudioFileCLip("data/data76619/video-001.mp4") # 返回音频 from moviepy.editor import VideoFileClip audio = VideoFileClip("data/data76619/video-001.mp4").audio # 返回音频return video.audio video = VideoFileClip("data/data76619/video-001.mp4")# ("video-001.sample_1.avi")# 设置视频的音频 video = video.set_audio(audio)# 保存新的视频文件 video.write_videofile("video-001.audio_1.mp4") !ls -l登录后复制
chunk: 9%| | 122/1412 [00:00<00:02, 632.17it/s, now=None]登录后复制
Moviepy - Building video video-001.audio_1.mp4. MoviePy - Writing audio in video-001.audio_1TEMP_MPY_wvf_snd.mp3登录后复制
chunk: 12%| | 169/1412 [00:00<00:02, 566.89it/s, now=None]登录后复制
t: 3%| | 54/1920 [00:00<00:03, 493.29it/s, now=None]登录后复制
MoviePy - Done. Moviepy - Writing video video-001.audio_1.mp4登录后复制
t: 4%| | 69/1920 [00:00<00:06, 288.91it/s, now=None]登录后复制
<br/>登录后复制
Moviepy - Done ! Moviepy - video ready video-001.audio_1.mp4 total 15936 -rw-r--r-- 1 aistudio users 27504 Mar 26 08:37 1709934.ipynb -rw-r--r-- 1 aistudio aistudio 61 Mar 25 21:28 config.txt drwxrwxrwx 3 aistudio users 4096 Mar 26 08:16 data -rw-r--r-- 1 aistudio users 4610545 Mar 26 08:37 video-001.audio_1.mp4 -rw-r--r-- 1 aistudio aistudio 11664928 Mar 25 23:34 video-001.sample_1.avi drwxr-xr-x 4 aistudio aistudio 4096 Mar 25 23:33 work登录后复制
以上就是法式牛排烹饪实验(AI创造营 第一期)的详细内容,更多请关注其它相关文章!