====== 📥 Step 5: Download HTML Files from Google Drive to AWS Server ======

This guide shows how to download `.html` files from a specific Google Drive folder directly to your AWS server using a Python script and a service account.

===== 📁 Requirements =====

Before you start, you must have:

  * Created a service account and downloaded its key.
  * Shared your Google Drive folder with the service account email.
  * Installed the required Python packages in a virtual environment.
  * Stored your `service-account.json` in a secure folder, such as `/home/ec2-user/credentials/service-account.json`.

===== 📝 1. Create the Python Script =====

  - Open a terminal on your AWS server and navigate to your working directory (e.g., your home directory): <code bash>
cd ~
</code>
  - Create the Python script: <code bash>
nano download_html.py
</code>
  - Paste the following code into the editor. **Replace** `YOUR_HTML_FOLDER_ID` with your actual Google Drive folder ID. <code python>
import io
import os

from google.oauth2 import service_account
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload

# Service account key file and the read-only Drive scope.
SERVICE_ACCOUNT_FILE = '/home/ec2-user/credentials/service-account.json'
SCOPES = ['https://www.googleapis.com/auth/drive.readonly']

HTML_FOLDER_ID = 'YOUR_HTML_FOLDER_ID'  # Replace with your folder ID.
DESTINATION_FOLDER = './downloaded_html_files'  # Relative to the current directory.

credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE, scopes=SCOPES
)
service = build('drive', 'v3', credentials=credentials)


def list_html_files(folder_id):
    """Return every non-trashed HTML file in the folder, following result pages."""
    query = f"'{folder_id}' in parents and mimeType='text/html' and trashed=false"
    files = []
    page_token = None
    while True:
        results = service.files().list(
            q=query,
            fields="nextPageToken, files(id, name)",
            pageToken=page_token,
        ).execute()
        files.extend(results.get('files', []))
        page_token = results.get('nextPageToken')
        if not page_token:
            return files


def download_files_from_folder(folder_id, destination_folder):
    if not os.path.exists(destination_folder):
        os.makedirs(destination_folder)
        print(f"[+] Created folder: {destination_folder}")

    files = list_html_files(folder_id)
    if not files:
        print("[-] No HTML files found.")
        return

    for file in files:
        print(f"[~] Downloading {file['name']}")
        request = service.files().get_media(fileId=file['id'])
        file_path = os.path.join(destination_folder, file['name'])

        # Stream the file to disk chunk by chunk.
        with io.FileIO(file_path, 'wb') as fh:
            downloader = MediaIoBaseDownload(fh, request)
            done = False
            while not done:
                status, done = downloader.next_chunk()
                if status:
                    print(f"    {int(status.progress() * 100)}% complete")

        print(f"[+] Downloaded: {file['name']}")


if __name__ == '__main__':
    download_files_from_folder(HTML_FOLDER_ID, DESTINATION_FOLDER)
</code>
  - Save and exit:
    * Press `Ctrl + O`, then `Enter` to save.
    * Press `Ctrl + X` to exit the editor.

===== ▶️ 2. Run the Script =====

  - Make sure your virtual environment is active: <code bash>
source ~/gdrive-env/bin/activate
</code>
  - Run the script: <code bash>
python download_html.py
</code>

===== ✅ Result =====

All files in the specified Google Drive folder whose MIME type is `text/html` are downloaded to a `downloaded_html_files` folder inside the directory you ran the script from. If you ran it from your home directory, that is:

<code>
~/downloaded_html_files/
</code>

You’ll see logs like:

<code>
[~] Downloading filename.html
    100% complete
[+] Downloaded: filename.html
</code>

===== 🧼 Optional: Deactivate Virtual Environment =====

Once you’re done:

<code bash>
deactivate
</code>
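
===== 🧩 Optional: Avoid File Name Collisions =====

The script builds each local path with `os.path.join(destination_folder, file['name'])`. Google Drive allows several files in one folder to share a name, and a name may contain `/`, so one download can silently overwrite another or point at a path outside the destination folder. The sketch below is one possible way to guard against that. The helper name `unique_local_path` and the `name (1).html` suffix scheme are illustrative assumptions, not part of the original script. It only tracks names used during the current run, so re-running the script still refreshes existing files.

```python
import os


def unique_local_path(destination_folder, drive_name, used_paths):
    """Return a local path for drive_name that is not yet in used_paths.

    Hypothetical helper: used_paths is a set that the caller keeps for
    the duration of one run.
    """
    # Replace path separators so the file stays inside destination_folder.
    safe_name = drive_name.replace('/', '_').replace(os.sep, '_')
    if safe_name in ('', '.', '..'):
        safe_name = 'unnamed'

    stem, ext = os.path.splitext(safe_name)
    candidate = os.path.join(destination_folder, safe_name)
    counter = 1
    # Append " (1)", " (2)", ... until the path is unused in this run.
    while candidate in used_paths:
        candidate = os.path.join(destination_folder, f"{stem} ({counter}){ext}")
        counter += 1

    used_paths.add(candidate)
    return candidate
```

To try it, create `used_paths = set()` before the download loop and replace the `os.path.join(...)` line with `file_path = unique_local_path(destination_folder, file['name'], used_paths)`.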