Form File Attachment Download

This scenario demonstrates a form submission whose response carries a Content-Disposition: attachment header, triggering a file download in a new tab.

Download Sample File

Click the button below to submit a form via a POST request. The response is a PDF file that downloads automatically.

Technical Details:
  • Method: POST
  • Endpoint: /api/download-file
  • Button ID: download-btn
  • Target: _blank (opens in new tab)
  • Response Header: Content-Disposition: attachment; filename=download-sample.pdf
  • Content Type: application/pdf
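
Before automating anything, the attachment behavior can be verified with a quick header check. A minimal sketch (the is_attachment helper is illustrative, not part of the site or any library):

```python
def is_attachment(headers):
    """True if a Content-Disposition header asks the client to download."""
    disposition = headers.get("Content-Disposition", "")
    return disposition.lower().startswith("attachment")

# The headers this endpoint is documented to send:
sample = {
    "Content-Disposition": "attachment; filename=download-sample.pdf",
    "Content-Type": "application/pdf",
}
print(is_attachment(sample))  # True
```

With the requests library, the same check works directly on response.headers (which is case-insensitive, unlike the plain dict used here).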

Code Examples

Choose your preferred method for downloading files:

Method 1: Direct API Request (Python Requests)

The simplest approach: POST directly to the API endpoint, no browser required.

import requests

# Submit the form (it has no fields, so an empty POST body is fine)
response = requests.post(
    "http://web-scraping.dev/api/download-file",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

# Save the PDF file
if response.status_code == 200:
    with open("downloaded_file.pdf", "wb") as f:
        f.write(response.content)
    print("File downloaded successfully!")
else:
    print(f"Request failed with status {response.status_code}")
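
Rather than hard-coding downloaded_file.pdf, the name the server suggests can be recovered from the Content-Disposition header. A small sketch (the regex covers the common quoted and unquoted forms, not the full RFC 6266 grammar):

```python
import re

def filename_from_disposition(header_value, default="download.pdf"):
    """Pull filename=... out of a Content-Disposition header value."""
    match = re.search(r'filename="?([^";]+)"?', header_value)
    return match.group(1) if match else default

print(filename_from_disposition("attachment; filename=download-sample.pdf"))
# download-sample.pdf
```

In the Method 1 code above, you would pass response.headers.get("Content-Disposition", "") instead of a literal string.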

Method 2: Playwright (Python)

Modern browser automation with built-in download handling. Recommended for headless automation.

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()

    # Navigate to the download page
    page.goto("http://web-scraping.dev/file-download")

    # Set up download handler before clicking
    with page.expect_download() as download_info:
        # Click the download button (ID: download-btn)
        page.click("#download-btn")

    # Wait for download and save it
    download = download_info.value
    download.save_as("downloaded_file.pdf")
    print(f"Downloaded: {download.suggested_filename}")

    browser.close()
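
Whichever method saves the file, a cheap sanity check is to confirm it is actually a PDF rather than an HTML error page: real PDFs begin with the magic bytes %PDF. A minimal sketch (looks_like_pdf is an illustrative helper):

```python
def looks_like_pdf(path):
    """Cheap sanity check: PDF files start with the magic bytes %PDF."""
    with open(path, "rb") as f:
        return f.read(4) == b"%PDF"
```

For example, call looks_like_pdf("downloaded_file.pdf") after download.save_as(...) above and retry or log if it returns False.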

Method 3: Selenium with Chrome DevTools Protocol (Python)

Configure Chrome to automatically download PDFs to a specific directory.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
import time
import os

# Set up Chrome options
chrome_options = Options()
download_dir = os.path.abspath("./downloads")
os.makedirs(download_dir, exist_ok=True)

prefs = {
    "download.default_directory": download_dir,
    "download.prompt_for_download": False,
    "plugins.always_open_pdf_externally": True
}
chrome_options.add_experimental_option("prefs", prefs)

# Initialize driver
driver = webdriver.Chrome(options=chrome_options)

# Navigate and download
driver.get("http://web-scraping.dev/file-download")

# Click the download button using its ID
download_button = driver.find_element(By.ID, "download-btn")
download_button.click()

# Wait for the download to finish (a fixed sleep is simple but can be flaky)
time.sleep(3)

# Check downloaded file
files = os.listdir(download_dir)
if files:
    print(f"Downloaded: {files[0]}")

driver.quit()
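
The fixed time.sleep(3) above is the weak point: on a slow connection the download may still be in flight, and Chrome keeps a partial .crdownload file until it finishes. A more robust sketch polls the directory instead (wait_for_download is an illustrative helper, not a Selenium API):

```python
import os
import time

def wait_for_download(directory, timeout=30):
    """Poll until a finished (non-.crdownload) file appears, or time out."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        finished = [f for f in os.listdir(directory)
                    if not f.endswith(".crdownload")]
        if finished:
            return finished[0]
        time.sleep(0.5)
    raise TimeoutError(f"no finished download in {directory} after {timeout}s")
```

In the Selenium example, the sleep-and-listdir block could become filename = wait_for_download(download_dir).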

Method 4: cURL with ScrapFly API

Use ScrapFly's API to download the file via an HTTP request with automatic anti-bot bypass.

# Download file via ScrapFly API
curl -X POST "https://api.scrapfly.io/scrape" \
  -H "Content-Type: application/json" \
  -d '{
    "key": "YOUR_API_KEY",
    "url": "https://web-scraping.dev/api/download-file",
    "method": "POST"
  }' | jq -r '.result.content' | base64 -d > downloaded_file.pdf

echo "File downloaded successfully!"

Note: Sign up at scrapfly.io to get your API key. Free tier includes 1,000 API credits/month.

Method 5: Puppeteer (Node.js)

Node.js browser automation using Chrome DevTools Protocol.

const puppeteer = require('puppeteer');
const fs = require('fs');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  // Enable download interception
  const client = await page.target().createCDPSession();
  await client.send('Page.setDownloadBehavior', {
    behavior: 'allow',
    downloadPath: './downloads'
  });

  await page.goto('http://web-scraping.dev/file-download');

  // Click the download button by ID
  await page.click('#download-btn');

  // Wait for download (adjust timeout as needed)
  // Wait for the download (page.waitForTimeout was removed in newer
  // Puppeteer versions, so use a plain timer; adjust the delay as needed)
  await new Promise((resolve) => setTimeout(resolve, 3000));

  console.log('Download completed!');
  await browser.close();
})();

Method 6: ScrapFly Python SDK

Use the ScrapFly Python SDK to download files with automatic proxy rotation and anti-bot protection.

from scrapfly import ScrapflyClient, ScrapeConfig

# Initialize ScrapFly client
client = ScrapflyClient(key='YOUR_API_KEY')

# Download the file via POST request
result = client.scrape(ScrapeConfig(
    url='https://web-scraping.dev/api/download-file',
    method='POST',
    render_js=False,
))

# Save the downloaded file (result.content is returned as text, so
# re-encode it with latin-1 to recover the raw PDF bytes)
with open('downloaded_file.pdf', 'wb') as f:
    f.write(result.content.encode('latin-1'))

print("File downloaded successfully!")

Install the SDK: pip install scrapfly-sdk