📥 Batched PDF to HTML Conversion (Google Apps Script)
This guide shows how to convert the PDFs in a Google Drive folder to HTML files in batches using Google Apps Script. It works around the Apps Script execution time limit by recording which files have already been processed and converting only a few files per run.
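The batching idea can be sketched independently of Drive. The following is a minimal illustration in plain JavaScript, assuming a hypothetical `processBatch` helper and an in-memory object (`doneStore`) standing in for the script's property store; neither name is part of the script in this guide.

```javascript
// Minimal sketch of "process a few items per run, remember what is done".
// processBatch, doneStore, and convert are hypothetical names for illustration.
function processBatch(itemIds, doneStore, batchSize, convert) {
  let processedCount = 0;
  for (const id of itemIds) {
    if (processedCount >= batchSize) break; // leave the rest for the next run
    if (doneStore[id]) continue;            // already handled in an earlier run
    try {
      convert(id);
      doneStore[id] = 'done';               // record success so later runs skip it
      processedCount++;
    } catch (e) {
      // A failed item is not marked, so a later run retries it.
      console.log('Error converting ' + id + ': ' + e.message);
    }
  }
  return processedCount;
}

// Three runs over five items with a batch size of 2.
const store = {};
const ids = ['a', 'b', 'c', 'd', 'e'];
const runs = [];
for (let i = 0; i < 3; i++) {
  runs.push(processBatch(ids, store, 2, function () {}));
}
console.log(runs); // three runs handle 2, 2, and 1 items
```

Because progress lives outside the function, each run picks up where the previous one stopped, which is the same role `PropertiesService` plays in the real script.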
📁 Folder Setup
1. Place all the PDFs you want to convert in a single Google Drive folder.
2. Make sure the Google account that runs the Apps Script can access this folder; share the folder with that account if it belongs to someone else.
3. The script creates a new folder (e.g., `MyFolder_html`) next to the original and stores the converted `.html` files there.
🛠️ Script: Batched PDF Conversion with Resume Support
    // Requires the Drive advanced service (Drive API v2), which provides Drive.Files.insert.
    function convertPdfsToHtml_Batched() {
      const BATCH_SIZE = 5; // Process 5 files per run
      const originalFolderId = 'YOUR_FOLDER_ID_HERE';

      const originalFolder = DriveApp.getFolderById(originalFolderId);
      const newFolderName = originalFolder.getName() + '_html';

      // Reuse the output folder next to the original if it exists; otherwise create it.
      // Falls back to the Drive root when the original folder has no accessible parent.
      const parents = originalFolder.getParents();
      const parentFolder = parents.hasNext() ? parents.next() : DriveApp.getRootFolder();
      const folders = parentFolder.getFoldersByName(newFolderName);
      const newFolder = folders.hasNext() ? folders.next() : parentFolder.createFolder(newFolderName);
      Logger.log('Using folder: ' + newFolder.getName());

      // Script properties record which file IDs have already been converted.
      const scriptProperties = PropertiesService.getScriptProperties();
      const processed = scriptProperties.getProperties();

      const files = originalFolder.getFilesByType(MimeType.PDF);
      let processedCount = 0;

      while (files.hasNext() && processedCount < BATCH_SIZE) {
        const file = files.next();
        const fileId = file.getId();

        if (processed[fileId]) {
          Logger.log('Skipping already processed: ' + file.getName());
          continue;
        }

        try {
          Logger.log('Processing PDF: ' + file.getName());
          const pdfBlob = file.getBlob();

          // Upload the PDF as a temporary Google Doc so its text can be read.
          const convertedFile = Drive.Files.insert(
            { title: file.getName(), mimeType: MimeType.GOOGLE_DOCS },
            pdfBlob
          );

          // getText() returns plain text, so the .html file holds unformatted text.
          const doc = DocumentApp.openById(convertedFile.id);
          const docContent = doc.getBody().getText();

          // Swap a trailing .pdf (any case) for .html.
          const htmlFileName = file.getName().replace(/\.pdf$/i, '') + '.html';
          newFolder.createFile(htmlFileName, docContent, MimeType.HTML);
          Logger.log('✅ Converted and created: ' + htmlFileName);

          // Trash the temporary Google Doc and mark the PDF as done.
          DriveApp.getFileById(convertedFile.id).setTrashed(true);
          scriptProperties.setProperty(fileId, 'done');
          processedCount++;
        } catch (e) {
          // Failed files are not marked, so the next run retries them.
          Logger.log('Error converting: ' + file.getName() + ': ' + e.message);
        }
      }

      Logger.log(`Batch complete: ${processedCount} file(s) processed`);
    }
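Note that `getBody().getText()` returns plain text, so the saved `.html` file contains unformatted text rather than markup. One option is to escape the text and wrap it in a minimal page before saving. The sketch below assumes two hypothetical helpers, `escapeHtml` and `textToHtml`, that are not part of the script; the `<pre>` wrapper is just one possible layout choice.

```javascript
// Hypothetical helper: escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: wrap plain text in a minimal HTML page.
// <pre> keeps the line breaks of the extracted text.
function textToHtml(title, text) {
  return [
    '<!DOCTYPE html>',
    '<html>',
    '<head><meta charset="utf-8"><title>' + escapeHtml(title) + '</title></head>',
    '<body><pre>' + escapeHtml(text) + '</pre></body>',
    '</html>'
  ].join('\n');
}

console.log(textToHtml('report', 'a < b && c > d'));
```

In the script, this would mean passing `textToHtml(file.getName(), docContent)` instead of `docContent` to `newFolder.createFile(...)`.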
▶️ How to Use
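1. Go to https://script.google.com and open or create a project.
2. Paste the code into a new `.gs` file.
3. In the editor, add the Drive API under *Services* and select version v2; the script calls `Drive.Files.insert`, which is part of the Drive advanced service.
4. Replace `'YOUR_FOLDER_ID_HERE'` with your actual Drive folder ID.
5. Run `convertPdfsToHtml_Batched` and approve the authorization prompt on the first run.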
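The folder ID is the path segment that follows `/folders/` in the folder's browser URL. As a convenience, a small helper can pull it out of a pasted link. This is a sketch only: `extractFolderId` is a hypothetical name, the ID in the example is made up, and the pattern assumes IDs consist of letters, digits, hyphens, and underscores.

```javascript
// Hypothetical helper: extract the folder ID from a Google Drive folder URL.
// Returns null when the URL has no /folders/<id> segment.
function extractFolderId(url) {
  const match = /\/folders\/([A-Za-z0-9_-]+)/.exec(url);
  return match ? match[1] : null;
}

// The ID below is invented for illustration.
console.log(extractFolderId('https://drive.google.com/drive/folders/1AbC_dEf-123?usp=sharing'));
```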
🔁 Optional: Set Up Trigger for Automation
- Go to *Triggers* in Apps Script
- Click “Add Trigger”
- Choose:
  - Function: `convertPdfsToHtml_Batched`
  - Event: *Time-driven* → *Every 5 minutes*
- Save
With this trigger, a new batch runs every 5 minutes until every PDF has been converted. After that, each run only skips files, so you can delete the trigger.
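The same trigger can also be managed from code instead of the Triggers page. The sketch below uses the `ScriptApp` trigger API and therefore runs only inside Apps Script; `installBatchTrigger` and `removeBatchTrigger` are hypothetical helper names, not part of the script in this guide.

```javascript
// Sketch: manage the time-driven trigger from code (Apps Script only).
const HANDLER = 'convertPdfsToHtml_Batched';

// Delete every project trigger that points at the batch function.
function removeBatchTrigger() {
  ScriptApp.getProjectTriggers()
    .filter(function (t) { return t.getHandlerFunction() === HANDLER; })
    .forEach(function (t) { ScriptApp.deleteTrigger(t); });
}

// Create a trigger that runs the batch function every 5 minutes.
function installBatchTrigger() {
  removeBatchTrigger(); // avoid stacking duplicate triggers
  ScriptApp.newTrigger(HANDLER)
    .timeBased()
    .everyMinutes(5)
    .create();
}
```

Run `installBatchTrigger` once to start the schedule and `removeBatchTrigger` when all files are converted.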
✅ Notes
- The script avoids reprocessing by tracking completed files with `PropertiesService`.
- You can increase or decrease `BATCH_SIZE` to suit your needs.
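Because completed files stay marked in the script properties, re-converting the folder requires clearing those markers first. A minimal sketch, assuming a hypothetical `resetConversionProgress` helper that removes only properties whose value is `'done'` (the marker the script writes), so that unrelated script properties are left alone:

```javascript
// Sketch: forget which files were converted so the next run starts over.
// Only properties whose value is 'done' are removed.
function resetConversionProgress() {
  const props = PropertiesService.getScriptProperties();
  const all = props.getProperties();
  let cleared = 0;
  Object.keys(all).forEach(function (key) {
    if (all[key] === 'done') {
      props.deleteProperty(key);
      cleared++;
    }
  });
  return cleared; // number of markers removed
}
```

This function runs only inside Apps Script, where `PropertiesService` is available.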