Extracts
Updated 9/17/2025. See our OpenAPI doc for the most recent changes.
Core Endpoints
List Extract Metadata
Endpoint: POST /v1/extracts
Retrieve metadata about available data extracts, including file information, creation dates, and download URLs.
Extract Types
The system provides the following types of data extracts:
Available Extract Types
opportunities_json: Complete opportunity data in JSON format
opportunities_csv: Complete opportunity data in CSV format
Extract Metadata API
Request Parameters
The extract metadata endpoint accepts the following parameters:
Filters
extract_type (enum, optional): Filter by specific extract type
Values: "opportunities_json", "opportunities_csv"
Example: "opportunities_json"
created_at (date range, optional): Filter by extract creation date
{ "start_date": "2024-01-01", "end_date": "2024-12-31" }
Pagination (Required)
pagination: Controls result pagination and sorting
{
  "page_offset": 1,
  "page_size": 25,
  "sort_order": [
    { "order_by": "created_at", "sort_direction": "descending" }
  ]
}
Sort Options:
created_at: When the extract was created
Code Examples
import requests
import json
from datetime import datetime, timedelta

# Your API configuration
API_KEY = "your_api_key_here"
BASE_URL = "https://api.simpler.grants.gov"

headers = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

def get_latest_extracts(extract_type=None, days_back=30):
    """Get extract metadata for the last N days"""
    end_date = datetime.now()
    start_date = end_date - timedelta(days=days_back)
    filters = {
        "created_at": {
            "start_date": start_date.strftime("%Y-%m-%d"),
            "end_date": end_date.strftime("%Y-%m-%d")
        }
    }
    if extract_type:
        filters["extract_type"] = extract_type
    payload = {
        "filters": filters,
        "pagination": {
            "page_offset": 1,
            "page_size": 50,
            "sort_order": [
                {
                    "order_by": "created_at",
                    "sort_direction": "descending"
                }
            ]
        }
    }
    response = requests.post(
        f"{BASE_URL}/v1/extracts",
        headers=headers,
        json=payload
    )
    if response.status_code == 200:
        data = response.json()
        extracts = data["data"]
        print(f"Found {len(extracts)} extracts")
        for extract in extracts:
            print(f"- {extract['extract_type']} created {extract['created_at']}")
            print(f" File: {extract.get('file_name', 'N/A')}")
            print(f" Size: {extract.get('file_size', 'Unknown')} bytes")
            if extract.get('download_url'):
                print(f" Download: {extract['download_url']}")
            print()
        return extracts
    else:
        print(f"Error: {response.status_code} - {response.text}")
        return []

def download_extract_file(extract_metadata, local_filename):
    """Download an extract file to local storage"""
    download_url = extract_metadata.get('download_url')
    if not download_url:
        print("No download URL available for this extract")
        return False
    try:
        # Note: Download URLs may be pre-signed and not require API key
        response = requests.get(download_url, stream=True)
        response.raise_for_status()
        with open(local_filename, 'wb') as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        print(f"Downloaded {local_filename}")
        return True
    except requests.exceptions.RequestException as e:
        print(f"Error downloading file: {e}")
        return False

# Usage examples
print("Getting latest opportunity JSON extracts...")
json_extracts = get_latest_extracts(extract_type="opportunities_json", days_back=7)
if json_extracts:
    latest_extract = json_extracts[0]
    filename = f"opportunities_{latest_extract['created_at'][:10]}.json"
    download_extract_file(latest_extract, filename)

print("\nGetting all recent extracts...")
all_extracts = get_latest_extracts(days_back=14)
Response Format
Extract Metadata Response Structure
{
  "message": "Success",
  "data": [
    {
      "extract_metadata_id": "12345678-1234-1234-1234-123456789012",
      "extract_type": "opportunities_json",
      "file_name": "opportunities_2024-01-15.json",
      "file_size": 15728640,
      "download_url": "https://example-bucket.s3.amazonaws.com/extracts/opportunities_2024-01-15.json?signature=...",
      "created_at": "2024-01-15T02:30:00Z",
      "updated_at": "2024-01-15T02:35:00Z"
    },
    {
      "extract_metadata_id": "87654321-4321-4321-4321-210987654321",
      "extract_type": "opportunities_csv",
      "file_name": "opportunities_2024-01-15.csv",
      "file_size": 8294400,
      "download_url": "https://example-bucket.s3.amazonaws.com/extracts/opportunities_2024-01-15.csv?signature=...",
      "created_at": "2024-01-15T02:30:00Z",
      "updated_at": "2024-01-15T02:35:00Z"
    }
  ],
  "pagination_info": {
    "page_offset": 1,
    "page_size": 25,
    "total_pages": 3,
    "total_records": 67
  }
}
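The pagination_info block makes it possible to walk every page of results instead of only the first. The following is a minimal sketch, assuming page_offset is 1-based and total_pages reflects the filters in use. fetch_page is a hypothetical callable (not part of the API) that takes a page number and returns the parsed response body, for example a thin wrapper around the requests.post call shown earlier.

```python
from typing import Any, Callable, Dict, Iterator


def iter_all_extracts(
    fetch_page: Callable[[int], Dict[str, Any]]
) -> Iterator[Dict[str, Any]]:
    """Yield every extract record across all result pages.

    fetch_page is a hypothetical helper: given a 1-based page number, it
    returns the parsed JSON body of POST /v1/extracts for that page.
    """
    page = 1
    while True:
        body = fetch_page(page)
        for record in body.get("data", []):
            yield record
        # Assumption: total_pages is accurate for the current query.
        # If pagination_info is missing, stop after the current page.
        total_pages = body.get("pagination_info", {}).get("total_pages", page)
        if page >= total_pages:
            break
        page += 1
```

Passing the page fetcher in as a callable keeps the paging logic separate from HTTP details, so it can be exercised with canned responses.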
Extract File Formats
JSON Extract Structure
The opportunities JSON extract contains an array of opportunity objects with complete data:
[
  {
    "opportunity_id": "12345678-1234-1234-1234-123456789012",
    "opportunity_number": "EPA-R9-SFUND-23-003",
    "opportunity_title": "Superfund Site Remediation Research",
    "agency_code": "EPA",
    "agency_name": "Environmental Protection Agency",
    "post_date": "2024-01-15",
    "close_date": "2024-06-30",
    "opportunity_status": "posted",
    "funding_instrument": "grant",
    "funding_category": "environment",
    "award_floor": 50000,
    "award_ceiling": 500000,
    "estimated_total_program_funding": 2000000,
    "expected_number_of_awards": 4,
    "applicant_types": ["nonprofits", "universities"],
    "summary": "Funding for research into innovative remediation technologies...",
    "is_cost_sharing": false,
    "attachments": [
      {
        "attachment_id": "attachment-123",
        "file_name": "funding_announcement.pdf",
        "download_url": "https://example.com/attachments/funding_announcement.pdf"
      }
    ]
  }
]
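Once downloaded, a JSON extract can be read with the standard library. The sketch below assumes the file is a single top-level array as shown above, and that close_date is either an ISO YYYY-MM-DD string or missing. The helper names load_extract and open_opportunities are illustrative only and are not part of any SDK.

```python
import json
from datetime import date
from typing import Any, Dict, List, Optional


def load_extract(path: str) -> List[Dict[str, Any]]:
    """Read a downloaded JSON extract into memory."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def open_opportunities(
    records: List[Dict[str, Any]],
    on: date,
    agency_code: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Return records whose close_date falls on or after `on`.

    Records without a close_date are skipped (an assumption made for
    this sketch; adjust to your own needs).
    """
    selected = []
    for record in records:
        close = record.get("close_date")
        if not close:
            continue
        if date.fromisoformat(close) < on:
            continue
        if agency_code and record.get("agency_code") != agency_code:
            continue
        selected.append(record)
    return selected
```

For example, open_opportunities(load_extract("opportunities_2024-01-15.json"), date(2024, 3, 1), agency_code="EPA") would keep only EPA opportunities still open on March 1, 2024.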
CSV Extract Structure
The opportunities CSV extract contains the same data as the JSON output in tabular format with the following columns:
opportunity_id
opportunity_number
opportunity_title
agency_code
agency_name
post_date
close_date
opportunity_status
funding_instrument
funding_category
award_floor
award_ceiling
estimated_total_program_funding
expected_number_of_awards
applicant_types (pipe-separated values)
summary
is_cost_sharing
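Because applicant_types is pipe-separated inside a single CSV column, it needs one extra split after parsing. A minimal sketch, assuming the file has a header row matching the column names above; read_opportunities_csv is an illustrative helper name, and the in-memory sample uses only a subset of the columns.

```python
import csv
import io
from typing import Any, Dict, Iterable, Iterator


def read_opportunities_csv(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one dict per CSV row, splitting applicant_types on '|'."""
    for row in csv.DictReader(lines):
        raw_types = row.get("applicant_types") or ""
        # An empty cell becomes an empty list rather than [""].
        row["applicant_types"] = [t for t in raw_types.split("|") if t]
        yield row


# Tiny in-memory sample standing in for a downloaded extract file.
sample = io.StringIO(
    "opportunity_id,agency_code,applicant_types\n"
    "12345678-1234-1234-1234-123456789012,EPA,nonprofits|universities\n"
)
for row in read_opportunities_csv(sample):
    print(row["agency_code"], row["applicant_types"])
```

When reading a real file, pass an open file object instead (opened with newline="" as the csv module recommends). All other values, including numeric ones such as award_floor, come back as strings and need converting by the caller.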
Error Handling
Common HTTP Status Codes
200 OK: Request successful
400 Bad Request: Invalid request parameters
401 Unauthorized: Missing or invalid API key
403 Forbidden: API key lacks required permissions
404 Not Found: Extract not found
429 Too Many Requests: Rate limit exceeded
500 Internal Server Error: Server error
Error Response Format
{
  "message": "Error description",
  "status_code": 400,
  "errors": [
    {
      "field": "filters.extract_type",
      "message": "Must be one of: opportunities_json, opportunities_csv"
    }
  ]
}
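An error body in this shape can be flattened into readable lines for logging. The sketch below assumes the fields shown above (message, status_code, errors) and tolerates any of them being absent. Treating 429 and 500 as retryable is a common client convention and an assumption here, not documented API behavior; describe_error and is_retryable are illustrative names.

```python
from typing import Any, Dict, List

# Assumption: rate-limit and server errors are worth retrying after a delay;
# the other listed codes indicate a problem with the request itself.
RETRYABLE_STATUS_CODES = {429, 500}


def describe_error(body: Dict[str, Any]) -> List[str]:
    """Flatten an error response into human-readable lines."""
    status = body.get("status_code", "?")
    lines = [f"{status}: {body.get('message', 'Unknown error')}"]
    for err in body.get("errors") or []:
        field = err.get("field", "<unknown field>")
        lines.append(f"  {field}: {err.get('message', '')}")
    return lines


def is_retryable(status_code: int) -> bool:
    """Return True for status codes this sketch treats as transient."""
    return status_code in RETRYABLE_STATUS_CODES
```

A caller might print "\n".join(describe_error(response.json())) when the status code is not 200, and wait before retrying only when is_retryable(response.status_code) is true.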