SERP API for Python
Python is one of the easiest ways to start working with Google search result data, but the first successful request is only part of the job. This tutorial moves from a simple requests-based call into reusable session setup, response parsing, timeout handling, retry behavior, and storage patterns that work better for recurring rank-tracking or search-monitoring jobs.
Before you start, create a NovaDataHub account, enable the Google SERP API, and keep your x-api-key available in a local environment variable or secret manager. Install requests if it is not already available, and decide which market context you want to test first so your first payload is meaningful.
pip install requests
For production code, use a Session so common headers and connection reuse stay in one place. Keep the API key in an environment variable rather than in source files.
import os
import requests
api_key = os.environ['NOVADATAHUB_SERP_KEY']
session = requests.Session()
session.headers.update({'x-api-key': api_key})
url = 'https://novadatahub.com/search'
Start with sync mode so you can inspect the full payload in one response while you validate the endpoint, query parameters, authentication, and locale settings.
params = {
    'q': 'google serp api python',
    'gl': 'us',
    'hl': 'en',
    'device': 'desktop',
    'sync': 'true'
}
response = session.get(url, params=params, timeout=60)
response.raise_for_status()
payload = response.json()
print(payload)
A typical sync response includes top-level fields such as ok, status, jobId, and result. Inside result, look for arrays such as organic, ads_top, paa, related_searches, and local_pack so your code can route each block into the right downstream model.
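The routing step described above can be sketched as a small helper that splits result into one list per block. This is a minimal sketch, not NovaDataHub's own client code: the block names come from the paragraph above, while split_result_blocks, the sample payload, its status value, and the fields inside each row are assumptions made up for illustration.

```python
# Block names are taken from the tutorial text. Everything in sample_payload
# is invented for illustration and is not a real NovaDataHub response.
BLOCK_NAMES = ('organic', 'ads_top', 'paa', 'related_searches', 'local_pack')


def split_result_blocks(payload):
    """Return a dict mapping each known block name to a list (empty if absent)."""
    result = payload.get('result') or {}
    blocks = {}
    for name in BLOCK_NAMES:
        value = result.get(name)
        # Treat missing or non-list values as "no rows" so callers can iterate safely.
        blocks[name] = value if isinstance(value, list) else []
    return blocks


sample_payload = {
    'ok': True,
    'status': 'done',  # hypothetical status value
    'jobId': 'example-job-id',
    'result': {
        'organic': [{'position': 1, 'title': 'Example', 'url': 'https://example.com'}],
        'paa': [{'question': 'What is a SERP API?'}],  # hypothetical row shape
    },
}

blocks = split_result_blocks(sample_payload)
for name, rows in blocks.items():
    print(name, len(rows))
```

Returning an empty list for absent blocks keeps downstream loops free of None checks, which helps when some queries return no ads or no local pack.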
from datetime import datetime, timezone
result = payload.get('result', {})
collected_at = datetime.now(timezone.utc).isoformat()
organic_rows = []
for row in result.get('organic', []):
    organic_rows.append({
        'query': params['q'],
        'gl': params['gl'],
        'hl': params['hl'],
        'device': params['device'],
        'collected_at': collected_at,
        'position': row.get('position'),
        'title': row.get('title'),
        'url': row.get('url')
    })
print(organic_rows[:3])
Your client should treat malformed requests, invalid keys, quota issues, rate limits, and sync timeouts differently. Keep network timeouts separate from HTTP responses so your logs stay useful. For sync timeout responses, capture the returned jobId and continue through the async workflow instead of discarding the request.
try:
    response = session.get(url, params=params, timeout=60)
except requests.Timeout as exc:
    raise TimeoutError('Network timeout while waiting for SERP response.') from exc
if response.status_code == 400:
    raise ValueError('Check required query parameters such as q.')
elif response.status_code == 401:
    raise PermissionError('Verify x-api-key and enabled service access.')
elif response.status_code == 402:
    raise RuntimeError('Check active plan or quota state before retrying.')
elif response.status_code == 429:
    raise RuntimeError('Back off and retry later to avoid a retry storm.')
elif response.status_code == 504:
    body = response.json()
    print('Sync request timed out. Continue with async workflow using jobId:', body.get('jobId'))
else:
    response.raise_for_status()
Transient network failures and 429 responses should not use tight loops. Prefer a capped exponential backoff pattern and log the request context that caused the failure so you can debug production jobs later.
import time
for attempt in range(5):
    resp = session.get(url, params=params, timeout=60)
    if resp.status_code != 429:
        break
    delay = min(60, 2 ** attempt)
    print(f'rate limited, sleeping {delay}s before retry')
    time.sleep(delay)
resp.raise_for_status()
If you are running many searches on a schedule, async mode is often a cleaner fit than waiting for every result inline. Submit the job, store the jobId with the request context, and poll terminal states on a measured cadence rather than tying the whole workflow to a single long sync request.
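The polling half of that workflow might look like the sketch below. The tutorial does not show the async status endpoint or its state names, so both are stand-ins here: fetch_status is a hypothetical caller-supplied function that returns the job body for a jobId, and the 'done' and 'failed' states are placeholders to replace with the values from the docs page.

```python
import time

# Assumption: these status names are placeholders. Check the NovaDataHub
# docs for the real terminal states before relying on them.
TERMINAL_STATES = {'done', 'failed'}


def poll_job(fetch_status, job_id, interval=5.0, max_wait=300.0, sleep=time.sleep):
    """Call fetch_status(job_id) until a terminal state or until max_wait elapses.

    fetch_status is a caller-supplied function (hypothetical here) that
    returns a dict with at least a 'status' key.
    """
    waited = 0.0
    while True:
        body = fetch_status(job_id)
        if body.get('status') in TERMINAL_STATES:
            return body
        # Stop before sleeping past the overall budget.
        if waited + interval > max_wait:
            raise TimeoutError(f'Job {job_id} did not finish within {max_wait}s')
        sleep(interval)
        waited += interval


# Usage with a fake status function, so the sketch runs without network access.
_fake_responses = iter([
    {'status': 'queued'},
    {'status': 'running'},
    {'status': 'done', 'result': {'organic': []}},
])


def fake_fetch_status(job_id):
    return next(_fake_responses)


final = poll_job(fake_fetch_status, 'example-job-id', interval=1.0, sleep=lambda s: None)
print(final['status'])
```

Passing the fetch and sleep functions in as arguments keeps the cadence logic testable without real requests. In a real job, fetch_status would wrap a session.get call against whatever status endpoint the docs specify.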
Do not store only position numbers. Keep the query, locale, device, timestamp, and raw JSON payload or normalized result rows together so you can compare matched result sets over time and explain later changes in ranking or result composition.
record = {
    'query': params['q'],
    'gl': params['gl'],
    'hl': params['hl'],
    'device': params['device'],
    'collected_at': collected_at,
    'payload': payload,
    'organic_rows': organic_rows
}
print(record.keys())
Once one sync request works, connect the same request pattern to your storage layer, monitoring jobs, and alerting. Then use the docs page for parameter details, the async tutorial for larger jobs, and the rank-tracking tutorials for comparison-ready storage and reporting flows.
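As one possible starting point for the storage layer, each record can be appended to a JSON Lines file, which keeps the request context and payload together in a single row. This is a sketch under assumptions rather than a recommended schema: append_record, read_records, the file name, and the example values are all made up for illustration, and a database may suit larger jobs better.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one record as a single JSON line (JSON Lines format)."""
    with open(path, 'a', encoding='utf-8') as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + '\n')


def read_records(path):
    """Load every stored record back into a list of dicts."""
    with open(path, encoding='utf-8') as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Illustrative record with the same keys as the tutorial's record dict;
# the values are invented.
example_record = {
    'query': 'google serp api python',
    'gl': 'us',
    'hl': 'en',
    'device': 'desktop',
    'collected_at': '2024-01-01T00:00:00+00:00',
    'payload': {'ok': True, 'result': {'organic': []}},
    'organic_rows': [],
}

# A temporary directory is used here only so the sketch leaves no files behind
# in the working directory.
path = os.path.join(tempfile.mkdtemp(), 'serp_records.jsonl')
append_record(path, example_record)
append_record(path, example_record)
stored = read_records(path)
print(len(stored), stored[0]['query'])
```

Appending one line per collection run makes it straightforward to compare matched result sets later by filtering on query, gl, hl, and device.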