Python code to merge text layer from one PDF with image layer of another PDF
$30-250 USD
Completed
Posted almost 4 years ago
Paid on delivery
We are using the img2pdf and ocrmypdf libraries with Python 3 to take a folder of JPGs, compile them into a PDF, and run OCR on that PDF. We have found a variety of post-processing techniques (contrast, sharpening, etc.) that we want to apply to the JPGs we feed to the OCR engine, but we want to keep the unadjusted JPGs as the image layer of the resulting PDF. In short, we want to take two paired folders of PDFs and, for each pair, combine the image layer from the first folder with the text layer (the text that the OCR engine found and placed as a layer) from the second folder. The code would be written as work for hire, and you would be responsible only for the code discussed above, not the larger project.
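To make the request concrete, here is a minimal sketch of one way the pairing and merging could look. It is not a finished implementation, and it rests on several assumptions that are not in the posting: the two folders hold PDFs with matching file names, each pair has the same page count and page size, and the PDFs in the second folder contain only the invisible OCR text with no image (for example, Tesseract can produce such files when run with `-c textonly_pdf=1`). If the OCR PDFs still contain the adjusted image, overlaying them would draw that image on top of the original, so the image would have to be stripped first. The sketch uses the third-party pypdf package (the successor to PyPDF2); the function names and folder names are hypothetical.

```python
from pathlib import Path


def pair_pdfs(image_dir, text_dir):
    """Match PDFs in the two folders by file stem and return sorted pairs.

    Assumption: a file named scan01.pdf in one folder corresponds to
    scan01.pdf in the other. Any unmatched file is treated as an error.
    """
    image_pdfs = {p.stem: p for p in Path(image_dir).glob("*.pdf")}
    text_pdfs = {p.stem: p for p in Path(text_dir).glob("*.pdf")}
    unmatched = sorted(set(image_pdfs) ^ set(text_pdfs))
    if unmatched:
        raise ValueError(f"PDFs without a partner: {unmatched}")
    return [(image_pdfs[stem], text_pdfs[stem]) for stem in sorted(image_pdfs)]


def merge_layers(image_pdf, text_pdf, output_pdf):
    """Overlay each text-only page onto the matching image page.

    Assumption: text_pdf holds only the invisible OCR text layer.
    """
    # Imported here so pair_pdfs can be used without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    image_reader = PdfReader(image_pdf)
    text_reader = PdfReader(text_pdf)
    if len(image_reader.pages) != len(text_reader.pages):
        raise ValueError(f"Page count differs: {image_pdf} vs {text_pdf}")

    writer = PdfWriter()
    for image_page, text_page in zip(image_reader.pages, text_reader.pages):
        page = writer.add_page(image_page)  # unadjusted image stays as the base
        page.merge_page(text_page)          # OCR text is drawn on top of it
    with open(output_pdf, "wb") as handle:
        writer.write(handle)


def merge_folders(image_dir, text_dir, output_dir):
    """Merge every matched pair and write the result under output_dir."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for image_pdf, text_pdf in pair_pdfs(image_dir, text_dir):
        merge_layers(image_pdf, text_pdf, out / image_pdf.name)
```

A call such as `merge_folders("pdf_original", "pdf_text_only", "pdf_merged")` would then write one merged PDF per pair. Failing loudly on unmatched files or differing page counts is a deliberate choice here, since a silent mismatch would put the wrong text behind a page.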
Hi there,
Yes, sure, I can do that. I have strong data processing skills and will use ImageMagick and the PyPDF2 module for your task. No worries :) Your script can have a GUI too, if preferred.
Best regards,
Ishaq