If Python is the language of data automation, the requests library is one of its most-used tools. Most Python automation scripts that interact with the outside world do so over HTTP, and knowing how to use Python requests professionally, handle pagination, and manage rate limits is the difference between automation scripts that work in demos and automation pipelines that run reliably in production. This guide covers everything from basic Python requests usage to pagination patterns, rate limit handling, retry logic, session management, and authentication.
The Python Requests Library: Your API Automation Foundation
The Python requests library is consistently among the most downloaded packages on PyPI, and for good reason. It is built on top of the urllib3 library (not the standard library's urllib) and exposes a clean, human-friendly API that makes HTTP calls in Python automation scripts feel natural:
import requests
import os

# Basic GET request with Python requests
response = requests.get(
    'https://api.example.com/contacts',
    headers={
        'Authorization': f'Bearer {os.environ["API_TOKEN"]}',
        'Accept': 'application/json'
    },
    params={
        'status': 'active',
        'limit': 100
    },
    timeout=30  # Always set a timeout in Python requests
)

# Check response status in Python
response.raise_for_status()  # Raises exception for 4xx/5xx responses

# Parse JSON response in Python
data = response.json()
contacts = data['contacts']

# POST request with Python requests
new_contact = requests.post(
    'https://api.example.com/contacts',
    headers={'Authorization': f'Bearer {os.environ["API_TOKEN"]}'},
    json={
        'email': 'jane@example.com',
        'name': 'Jane Smith',
        'plan': 'enterprise'
    },
    timeout=30
)
new_contact.raise_for_status()
created = new_contact.json()
Always set a timeout on every Python requests call — without it, a hung API connection will block your automation script indefinitely.
Python Requests Sessions for Efficient API Automation
For Python automation scripts that make many calls to the same API, using a requests.Session object is dramatically more efficient than making individual requests — it reuses the underlying TCP connection and automatically applies shared headers and authentication:
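import os
from urllib.parse import urljoin

import requests


class APISession(requests.Session):
    """requests.Session that resolves relative paths against a base URL"""

    def __init__(self, base_url: str):
        super().__init__()
        # requests.Session has no built-in base URL support, so store it here
        self.base_url = base_url.rstrip('/') + '/'

    def request(self, method, url, *args, **kwargs):
        # Join relative paths like '/contacts' onto the base URL;
        # absolute URLs pass through urljoin unchanged
        full_url = urljoin(self.base_url, url.lstrip('/'))
        return super().request(method, full_url, *args, **kwargs)


def create_api_session(base_url: str, token: str) -> requests.Session:
    """Create a configured Python requests session for API automation"""
    session = APISession(base_url)
    session.headers.update({
        'Authorization': f'Bearer {token}',
        'Content-Type': 'application/json',
        'Accept': 'application/json'
    })
    return session


# Use the Python requests session throughout your automation script
session = create_api_session(
    'https://api.example.com',
    os.environ['API_TOKEN']
)

# All requests reuse the session's connection pool and headers
contacts_response = session.get('/contacts', params={'limit': 100}, timeout=30)
deals_response = session.get('/deals', params={'stage': 'open'}, timeout=30)
session.post('/events', json={'type': 'sync_started'}, timeout=30)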
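A plain requests.Session does not resolve relative paths such as '/contacts' against a base URL on its own, so some joining step is needed before the request is sent. The sketch below shows one way to do that join with the standard library's urljoin, and the pitfall it has to avoid. The build_url helper is a hypothetical name introduced for illustration, not part of requests.

```python
from urllib.parse import urljoin


def build_url(base_url: str, path: str) -> str:
    """Join a relative API path onto a base URL (hypothetical helper)."""
    # Normalise so the base ends with exactly one slash and the path has no
    # leading slash; otherwise urljoin would discard part of the base path.
    base = base_url.rstrip('/') + '/'
    return urljoin(base, path.lstrip('/'))


# The pitfall: with a bare urljoin, a leading slash replaces the whole base path.
naive = urljoin('https://api.example.com/v1', '/contacts')
safe = build_url('https://api.example.com/v1', '/contacts')

print(naive)  # https://api.example.com/contacts
print(safe)   # https://api.example.com/v1/contacts
```

This matters mostly for versioned APIs, where the base URL carries a path prefix such as /v1 that must survive the join.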
Pagination Patterns in Python API Automation
Pagination is one of the most important concepts in production Python API automation. Most real-world APIs cap the number of records returned in a single response and paginate the rest to protect their infrastructure. Your Python automation script must handle all records across all pages, not just the first one.
Pattern 1: Page Number Pagination in Python
def fetch_all_contacts_page_number(session: requests.Session) -> list:
    """Python automation: handle page-number based pagination"""
    # Assumes a session that resolves relative paths against a base URL
    all_contacts = []
    page = 1
    while True:
        response = session.get(
            '/contacts',
            params={'page': page, 'per_page': 100},
            timeout=30
        )
        response.raise_for_status()
        data = response.json()
        contacts = data.get('contacts', [])
        all_contacts.extend(contacts)
        total_pages = data['pagination']['total_pages']
        # Log before incrementing so the number matches the page just fetched
        print(f'Fetched page {page}/{total_pages}, total: {len(all_contacts)}')
        # Check pagination termination condition
        if page >= total_pages or not contacts:
            break
        page += 1
    return all_contacts
Pattern 2: Cursor-Based Pagination in Python
def fetch_all_records_cursor(session: requests.Session, endpoint: str) -> list:
    """Python automation: handle cursor-based pagination"""
    all_records = []
    cursor = None
    while True:
        params = {'limit': 100}
        if cursor:
            params['cursor'] = cursor
        response = session.get(endpoint, params=params, timeout=30)
        response.raise_for_status()
        data = response.json()
        all_records.extend(data.get('data', []))
        # Get next cursor from response
        cursor = data.get('pagination', {}).get('next_cursor')
        if not cursor:
            break  # No more pages
        print(f'Fetched {len(all_records)} records so far, continuing...')
    return all_records
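The cursor loop above is easy to get subtly wrong, for example when an API keeps returning the same cursor. The sketch below separates the loop from the HTTP call so the termination logic can be exercised offline against an in-memory stand-in. The names iter_cursor_pages, fetch_page, max_pages, and FAKE_PAGES are assumptions made for this illustration, and the response shape ('data', 'pagination', 'next_cursor') simply mirrors the example above rather than any particular API.

```python
from typing import Callable, Iterator, Optional


def iter_cursor_pages(
    fetch_page: Callable[[Optional[str]], dict],
    records_key: str = 'data',
    max_pages: int = 10_000,
) -> Iterator[dict]:
    """Yield records from a cursor-paginated source until the cursor runs out."""
    cursor = None
    seen = set()
    for _ in range(max_pages):
        page = fetch_page(cursor)
        yield from page.get(records_key, [])
        cursor = page.get('pagination', {}).get('next_cursor')
        if not cursor:
            return  # No more pages
        if cursor in seen:
            # Guard against an API that hands back a cursor it already issued
            raise RuntimeError(f'Cursor {cursor!r} repeated; aborting to avoid a loop')
        seen.add(cursor)
    raise RuntimeError(f'Stopped after {max_pages} pages without reaching the end')


# In-memory stand-in for an API: maps a cursor to a response body
FAKE_PAGES = {
    None: {'data': [{'id': 1}, {'id': 2}], 'pagination': {'next_cursor': 'abc'}},
    'abc': {'data': [{'id': 3}], 'pagination': {'next_cursor': 'def'}},
    'def': {'data': [{'id': 4}], 'pagination': {}},
}

records = list(iter_cursor_pages(lambda cursor: FAKE_PAGES[cursor]))
print([r['id'] for r in records])  # [1, 2, 3, 4]
```

In a real script, fetch_page would wrap a session.get call and return response.json(); the loop itself stays unchanged.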
Pattern 3: Link Header Pagination in Python
def fetch_all_github_issues(session: requests.Session, repo: str) -> list:
    """Python automation: handle Link header pagination (GitHub style)"""
    all_issues = []
    url = f'https://api.github.com/repos/{repo}/issues'
    params = {'per_page': 100}  # Only needed on the first request
    while url:
        response = session.get(url, params=params, timeout=30)
        response.raise_for_status()
        all_issues.extend(response.json())
        # requests parses the Link header into response.links, keyed by rel
        url = response.links.get('next', {}).get('url')
        # The next URL already carries its own query string, so do not
        # append per_page a second time
        params = None
    return all_issues
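For cases where the Link header has to be parsed by hand (for example outside of requests), splitting on commas is fragile because URLs may themselves contain commas. The following is a minimal standard-library sketch that extracts each `<url>; rel="..."` pair with a regular expression instead. The function name parse_link_header and the sample URLs are assumptions for illustration, and the parser covers only the common rel parameter, not the full Web Linking grammar.

```python
import re
from typing import Dict

# One link-value: <url> followed by zero or more ;name="value" parameters
_LINK_PATTERN = re.compile(r'<([^>]+)>((?:\s*;\s*[\w-]+="?[^",;]*"?)*)')
_REL_PATTERN = re.compile(r'\brel="?([^";,]+)"?')


def parse_link_header(value: str) -> Dict[str, str]:
    """Map each rel name (next, prev, last, ...) to its URL."""
    links = {}
    for match in _LINK_PATTERN.finditer(value):
        url, params = match.group(1), match.group(2)
        rel = _REL_PATTERN.search(params)
        if rel:
            # A single rel attribute may list several space-separated names
            for name in rel.group(1).split():
                links[name] = url
    return links


header = (
    '<https://api.example.com/issues?per_page=100&page=2>; rel="next", '
    '<https://api.example.com/issues?per_page=100&page=5>; rel="last"'
)
links = parse_link_header(header)
print(links.get('next'))  # https://api.example.com/issues?per_page=100&page=2
```

When the header is absent or has no rel="next" entry, links.get('next') returns None, which ends a `while url:` style loop naturally.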
Rate Limit Handling in Python API Automation
Rate limits are one of the most common production failure points in Python API automation scripts. Most APIs impose limits such as requests per minute, requests per hour, or requests per day. Exceeding a limit typically returns HTTP 429 (Too Many Requests), and your Python automation must handle it gracefully rather than crashing.
Basic Rate Limit Handling in Python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def create_resilient_session() -> requests.Session:
    """Python requests session with automatic retry and rate limit handling"""
    session = requests.Session()
    # Automatic retry with exponential backoff for Python requests.
    # By default Retry only retries idempotent methods, so POST is not retried.
    retry_strategy = Retry(
        total=5,
        backoff_factor=2,
        status_forcelist=[429, 500, 502, 503, 504],
        respect_retry_after_header=True  # Honors Retry-After header for rate limits
    )
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount('https://', adapter)
    session.mount('http://', adapter)
    return session
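To make the idea of exponential backoff concrete, the sketch below computes a delay schedule in isolation. It is a generic illustration of capped exponential backoff with optional random jitter, not a reproduction of urllib3's internal formula; the function name backoff_delays and its base, cap, and jitter parameters are assumptions chosen for this example.

```python
import random
from typing import List, Optional


def backoff_delays(
    retries: int,
    base: float = 1.0,
    cap: float = 60.0,
    jitter: bool = True,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one delay (in seconds) per retry attempt."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(retries):
        # Double the ceiling on every attempt, but never exceed the cap
        ceiling = min(cap, base * (2 ** attempt))
        # With jitter, pick a random point below the ceiling so that many
        # clients recovering at once do not retry in lockstep
        delays.append(rng.uniform(0, ceiling) if jitter else ceiling)
    return delays


print(backoff_delays(5, jitter=False))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Jitter is worth enabling when many workers share one API key, since identical deterministic delays tend to produce synchronized bursts.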
Advanced Rate Limit Handler in Python
import time
import logging

import requests

logger = logging.getLogger(__name__)


def api_call_with_rate_limit(
    session: requests.Session,
    method: str,
    url: str,
    max_retries: int = 5,
    **kwargs
) -> requests.Response:
    """Python automation: handle rate limits with exponential backoff"""
    kwargs.setdefault('timeout', 30)  # Never send a request without a timeout
    for attempt in range(max_retries):
        response = session.request(method, url, **kwargs)
        if response.status_code == 429:
            # Respect Retry-After header if provided. The header may also hold
            # an HTTP-date; int() fails on that, so fall back to backoff.
            try:
                retry_after = int(response.headers.get('Retry-After', ''))
            except ValueError:
                retry_after = 2 ** attempt
            logger.warning(f'Rate limited. Waiting {retry_after}s before retry {attempt + 1}/{max_retries}')
            time.sleep(retry_after)
            continue
        if response.status_code in (500, 502, 503, 504):
            wait_time = 2 ** attempt  # Exponential backoff
            logger.warning(f'Server error {response.status_code}. Waiting {wait_time}s')
            time.sleep(wait_time)
            continue
        response.raise_for_status()
        return response
    raise RuntimeError(f'Max retries exceeded for {method} {url}')
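The Retry-After header can carry either a number of seconds or an HTTP-date, so a handler that only calls int() on it will not cover the second form. The following sketch, using only the standard library, converts both forms into a wait in seconds. The helper name parse_retry_after and its injectable now parameter are assumptions made so the behaviour can be checked without waiting on a real clock.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Return the wait in seconds encoded by a Retry-After value, or None."""
    if not value:
        return None
    value = value.strip()
    # Form 1: delay in whole seconds, e.g. "120"
    try:
        return max(0.0, float(int(value)))
    except ValueError:
        pass
    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    try:
        retry_at = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # Unparseable: let the caller fall back to backoff
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no wait is required
    return max(0.0, (retry_at - now).total_seconds())


reference = datetime(2015, 10, 21, 7, 27, 0, tzinfo=timezone.utc)
print(parse_retry_after('120'))                                      # 120.0
print(parse_retry_after('Wed, 21 Oct 2015 07:28:00 GMT', reference))  # 60.0
```

A caller would typically use the returned value when it is not None and otherwise fall back to its own exponential backoff delay.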
Combining Pagination and Rate Limits in Python
Production Python API automation scripts must handle both pagination and rate limits at the same time. Here is a complete Python pattern that combines cursor pagination with client-side throttling:
import time
import logging
from typing import Generator

import requests

logger = logging.getLogger(__name__)


def paginate_api(
    session: requests.Session,
    endpoint: str,
    records_key: str = 'data',
    page_size: int = 100,
    requests_per_minute: int = 60
) -> Generator[dict, None, None]:
    """
    Python generator for paginated API automation with rate limiting.
    Yields individual records across all pages.
    """
    cursor = None
    request_count = 0
    window_start = time.time()
    while True:
        # Rate limit enforcement in Python (fixed 60-second window)
        elapsed = time.time() - window_start
        if elapsed >= 60:
            # Window has expired: start a fresh one so the limit stays enforced
            request_count = 0
            window_start = time.time()
        elif request_count >= requests_per_minute:
            sleep_time = 60 - elapsed
            logger.info(f'Rate limit pause: sleeping {sleep_time:.1f}s')
            time.sleep(sleep_time)
            request_count = 0
            window_start = time.time()
        request_count += 1
        # Build request params
        params = {'limit': page_size}
        if cursor:
            params['cursor'] = cursor
        response = session.get(endpoint, params=params, timeout=30)
        response.raise_for_status()
        data = response.json()
        records = data.get(records_key, [])
        for record in records:
            yield record  # Python generator yields one record at a time
        cursor = data.get('pagination', {}).get('next_cursor')
        if not cursor:
            break  # No more pages


# Usage: process millions of records with Python pagination + rate limiting
# (session and process_contact are defined elsewhere in your script)
for contact in paginate_api(session, '/contacts', 'contacts'):
    process_contact(contact)  # Process one at a time, memory-efficient
Using a Python generator (yield) for pagination is memory-efficient: it holds only one page of records in memory at a time and hands them to the caller one by one instead of loading the entire dataset, making it suitable for APIs with millions of records.
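The throttling logic inside the generator above can also be factored out into a small reusable object, which makes it easier to test and to share across endpoints. The sketch below is one possible fixed-window limiter; the class name FixedWindowLimiter and its injectable clock and sleep parameters are assumptions introduced here so the behaviour can be verified without real waiting.

```python
import time
from typing import Callable


class FixedWindowLimiter:
    """Allow at most max_calls per window; sleep when the budget is spent."""

    def __init__(
        self,
        max_calls: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._window_start = clock()
        self._count = 0

    def acquire(self) -> None:
        """Call once before each request; blocks if the window is full."""
        now = self._clock()
        elapsed = now - self._window_start
        if elapsed >= self.window_seconds:
            # Window expired on its own: start a new one
            self._window_start = now
            self._count = 0
        elif self._count >= self.max_calls:
            # Budget spent: wait out the rest of the window, then reset
            self._sleep(self.window_seconds - elapsed)
            self._window_start = self._clock()
            self._count = 0
        self._count += 1


# Usage sketch: call limiter.acquire() immediately before each session.get(...)
limiter = FixedWindowLimiter(max_calls=60, window_seconds=60.0)
limiter.acquire()
```

A fixed window is the simplest scheme and matches the generator above; APIs that enforce sliding windows or token buckets may need a correspondingly stricter client-side limiter.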
Python Requests Authentication Patterns
Beyond Bearer tokens, Python automation scripts encounter several authentication patterns:
import hashlib
import hmac
import json

import requests
from requests.auth import HTTPBasicAuth

# url, api_key, oauth2_token and secret_key below are placeholders for your own values

# Basic Auth in Python requests
response = requests.get(url, auth=HTTPBasicAuth('username', 'password'), timeout=30)

# API Key in header
response = requests.get(url, headers={'X-API-Key': api_key}, timeout=30)

# API Key as query parameter
response = requests.get(url, params={'api_key': api_key}, timeout=30)

# OAuth2 Bearer token
response = requests.get(url, headers={'Authorization': f'Bearer {oauth2_token}'}, timeout=30)


# HMAC signature authentication (common in webhook verification)
def make_signed_request(session, url, payload, secret_key):
    # Serialize once so the signed bytes are exactly the bytes that are sent
    body = json.dumps(payload)
    signature = hmac.new(
        secret_key.encode(),
        body.encode(),
        hashlib.sha256
    ).hexdigest()
    return session.post(
        url,
        data=body,
        headers={
            'Content-Type': 'application/json',
            'X-Signature': f'sha256={signature}'
        },
        timeout=30
    )
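Since HMAC signatures are described above as common in webhook verification, it may help to see the receiving side as well. The sketch below recomputes the signature over the raw request body and compares it in constant time. The helper names sign_body and verify_signature are assumptions for illustration, and the 'sha256=' prefix simply mirrors the X-Signature format used in the example above; real providers document their own header names and formats.

```python
import hashlib
import hmac


def sign_body(secret_key: str, body: bytes) -> str:
    """Compute the signature string for a raw request body."""
    digest = hmac.new(secret_key.encode(), body, hashlib.sha256).hexdigest()
    return f'sha256={digest}'


def verify_signature(secret_key: str, body: bytes, header_value: str) -> bool:
    """Check a received signature header against the raw body."""
    expected = sign_body(secret_key, body)
    # compare_digest takes time independent of where the inputs differ,
    # which avoids leaking information through response timing
    return hmac.compare_digest(expected.encode(), header_value.encode())


raw_body = b'{"event": "contact.created"}'
header = sign_body('s3cret', raw_body)
print(verify_signature('s3cret', raw_body, header))          # True
print(verify_signature('s3cret', raw_body + b' ', header))   # False
```

Verification should run on the exact bytes received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the signature.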
Why Python Requests Mastery Is Core to Automation Engineering
Every Python automation engineer who works with data operations will spend a significant portion of their career writing Python requests code — fetching data from APIs, handling pagination across thousands of pages, respecting rate limits with exponential backoff, and building resilient HTTP clients that keep running when APIs misbehave. The patterns in this guide — session reuse, cursor pagination, rate limit backoff, generator-based streaming — are the building blocks of production-grade Python API automation that runs reliably at scale.