Search API
Extend and transform the opportunity data model
Summary details
Field | Value |
---|---|
Deliverable status | In progress |
Link to GitHub issue | |
Key sections |
Overview
Summary
What
Expand the data model beyond the single opportunity table
Set up a transformation workflow so the data from the current grants.gov live data model transforms into the new data model for simpler.grants.gov
Enhance the Search API endpoint to allow basic search on the title and description fields, sorting by title or date (or other low-hanging fruit), and filtering
Get security approval for storing (but not sharing) non-public opportunities in Simpler AWS account
Why
Allows us to expand from a single title field to fields that support sorting, filtering, and description search, along with others that produce a meaningful search experience, such as "status" and "agency"
Enables a broader and more robust API and enhances functionality, making it more versatile.
Who
Subset of system-to-system users, including but not limited to:
Federal Demonstration Partnership (FDP)
U.S. Digital Response (USDR)
S2S Federal User Group
Internal development team
Out of scope
Search engine, full text search, or any other keyword search beyond basic query in title or description
Enhanced authentication
Rate limiting and infrastructure changes, since those would prepare for a larger number of users than the limited audience in scope for this deliverable
Business value
Problem
The current GET Opp API, based on a single opportunity table with excessive and redundant data, provides limited value to consumers due to complexity and difficulty in obtaining desired information. This restricts the team's ability to build future functionalities and expand beyond the existing structure.
Value
By expanding the opportunity data model and transforming data into a new, simpler structure, we will:
Enhance API functionality: Create a broader and more robust API with a well-defined endpoint, simplifying data access and retrieval for consumers
Increase developer productivity: Decouple the API from the legacy grants.gov schema, enabling faster development of new features and functionalities
Improve data usability: Organize data logically and eliminate redundancy, allowing easier searching, filtering, and analysis
Lay foundation for future expansion: Enable building additional functionalities like advanced search based on the enriched data model
Goals
This effort aims to...
Data model design: Create the first version of a new, optimized data model with clear tables, relationships, and attributes that represent opportunities efficiently
Data transformation: Implement a process to automatically convert data from the existing grants.gov model to the new simpler.grants.gov model
API development: Update and expand the API based on the new data model, offering a well-defined endpoint for various data access needs
Testing and deployment: Thoroughly test the new data model, transformation process, and API before deploying them to production
Documentation and training: Document the new data model, API changes, and transformation process for developers and consumers
User stories
As a member of HHS staff, I want to:
track the usage and impact of the API, allowing me to assess its effectiveness and make improvements
As a consumer of the API, I want to:
easily access relevant opportunity data through well-defined API endpoints and documentation, enabling me to integrate the API into my systems so that I can efficiently utilize grants information
As a system-to-system user, I want:
to be able to access search features via the API, so that search results are the same whether I'm using the API or the UI
the search functionality to be outlined in the API docs, so that I don't have to rely on experimentation to learn how to search for opportunities
As an HHS contracting engineer, I want:
a clear and well-documented data model, API endpoints, and ETL processes, enabling me to maintain and update the system efficiently
to easily understand the data transformation process and its dependencies, allowing me to troubleshoot issues and ensure data integrity
Technical description
Expanded data model
Goals
Enable the team to build future endpoints against a simplified data model instead of one that is tightly coupled to the existing schema in grants.gov live
Create the foundation for expanding beyond the existing opportunity table, which will be useful for things like search
Scope
Expand the data model in the new Postgres database beyond the single opportunity table.
Propose an approach for the data model that will be reviewed by key stakeholders
Investigation into current database structure
Determine data conversions that will be needed
Data Transformations
Goals
Create a process that translates data between the current and future data model
Scope
Set up a tool that transforms the data from the current grants.gov live data model into the new data model for simpler.grants.gov.
ADR to determine the technology for the ELT
Infrastructure set up for running the ELT job(s)
Set up logic for converting data
Configure transformation tool to connect data sources
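The conversion logic in this scope can be pictured with a minimal sketch. It assumes a legacy row arrives as a dictionary of loosely typed strings and is mapped to a typed record in the new model. The column names (`OPPORTUNITY_ID`, `OPPTITLE`, `OWNINGAGENCY`, `POSTING_DATE`, `IS_DRAFT`), the date format, and the `Opportunity` fields are all hypothetical placeholders, not the actual grants.gov or simpler.grants.gov schema, and the transformation tool itself is still to be chosen through the ADR.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass(frozen=True)
class Opportunity:
    """Hypothetical record in the new data model (field names are assumptions)."""
    opportunity_id: int
    title: str
    agency: Optional[str]
    posted_date: Optional[date]
    is_draft: bool


def transform_legacy_row(row: dict) -> Opportunity:
    """Convert one legacy row (hypothetical column names) into the new model."""
    raw_date = (row.get("POSTING_DATE") or "").strip()
    return Opportunity(
        opportunity_id=int(row["OPPORTUNITY_ID"]),
        # Trim whitespace and treat a missing title as an empty string.
        title=(row.get("OPPTITLE") or "").strip(),
        # Empty strings become None so "no agency" has one representation.
        agency=(row.get("OWNINGAGENCY") or "").strip() or None,
        # Assumed MM/DD/YYYY text dates in the legacy data.
        posted_date=datetime.strptime(raw_date, "%m/%d/%Y").date() if raw_date else None,
        # Assumed "Y"/"N" flag converted to a real boolean.
        is_draft=row.get("IS_DRAFT") == "Y",
    )
```

Keeping each conversion as a pure function of one row makes the data conversions easy to unit test independently of whichever tool ends up running the job.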
API development
Goals
Incorporate new data model into API
Improve data accuracy by incorporating lookup value logic
Update search API to provide filtering, sorting, and query search term for Search UI
Scope
Enhance GET Opp API functionality to accommodate new data models, ensuring seamless integration with the DMS copy, and incorporating lookup value logic for improved data accuracy. We should consider how to support basic logic such as greater than, less than, as well as and/or/not conditions.
Addition of opportunity tables for DMS copy.
Inclusion of new tables based on data modeling work.
Authentication that requires users to identify who they are, such as key management
Update API model with new fields from the introduced tables.
Documentation and tech spec for lookup value work
Implementation of lookup value logic for API and DB to recognize allowed values.
Setup and implementation of an approach for loading lookup data
Document API changes
Determine how search and filter criteria will be passed to the API endpoint used for search. This may mean releasing functionality in a future version
Determine how to handle basic logic such as greater than, less than, and/or/not. This may mean releasing functionality in a future version
Determine which fields should support filtering and sorting. This may also be determined by design work in the Search UI effort
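One way to think about the open question of greater than, less than, and and/or/not conditions is as a small nested filter expression evaluated against each record. The sketch below is only an illustration of that idea: the expression shape (`and`, `or`, `not`, and leaves with `field`, `op`, `value`), the operator names, and the example field names are assumptions, not a decided request format.

```python
import operator
from typing import Any

# Comparison operators a filter leaf may use (names are illustrative).
COMPARATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
}


def matches(record: dict[str, Any], expr: dict[str, Any]) -> bool:
    """Evaluate a nested and/or/not filter expression against one record."""
    if "and" in expr:
        return all(matches(record, sub) for sub in expr["and"])
    if "or" in expr:
        return any(matches(record, sub) for sub in expr["or"])
    if "not" in expr:
        return not matches(record, expr["not"])
    # Leaf node: {"field": ..., "op": ..., "value": ...}
    compare = COMPARATORS[expr["op"]]
    value = record.get(expr["field"])
    # A record with no value for the field never satisfies a comparison.
    return value is not None and compare(value, expr["value"])


# Hypothetical example: ceiling above 100000 and status not "closed".
example_filter = {
    "and": [
        {"field": "award_ceiling", "op": "gt", "value": 100000},
        {"not": {"field": "status", "op": "eq", "value": "closed"}},
    ]
}
```

In a real endpoint the same expression would more likely be translated into a database query than evaluated in memory; the in-memory version is used here only to keep the example self-contained.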
Security for API
Goal
Get approval to store non-public opportunities in the Simpler AWS production account.
Not in scope
List of functionality or features that are explicitly out of scope for this deliverable.
Static site updates will be completed in the Search UI deliverable.
Advanced data processing will not be implemented in this effort
Data visualization and reporting capabilities within the API will not be completed in this effort
Search engine, full text search, or any other keyword search beyond basic query in title or description
Enhanced authentication
Rate limiting and infrastructure changes, since those would prepare for a larger number of users than the limited audience in scope for this deliverable
Definition of done
The following sections describe the conditions that must be met to consider this deliverable "done".
Must have
Functional requirements
The API reads data from a new opportunity data model with clearly defined tables, relationships, and attributes
The API pulls data from more than just the opportunity data of the Grants.gov live database
An ERD for the new data model is documented in a publicly accessible location, and the ERD is automatically updated with future changes to the data model
There is a service in place which transforms data from the old data model from Grants.gov live to a new, intuitive, easier-to-use data model.
Changes made to data on grants.gov live are propagated to the new simpler.grants.gov data model within 1 hour
A new (minor) version of the GET /opportunities API endpoint has been released and includes fields from the expanded data model (e.g. status, agency)
A new search endpoint has been released that allows API consumers to:
Search for opportunities by keyword
Filter opportunities by at least one structured field from the new data model
Sort opportunities by at least one structured field from the new data model
We've received security approval to host (but not share) non-public data in the AWS Simpler environments
Select a logging and monitoring tool for backend and frontend
Started the procurement process for the logging and monitoring tool
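The three search requirements in the list above (keyword search, filtering by one structured field, and sorting by one structured field) can be sketched together as follows. This is a minimal in-memory illustration under stated assumptions: the `Opportunity` fields, the `OpportunityStatus` lookup values, and the `search` function and its parameters are hypothetical, and a released endpoint would query the database rather than a Python list.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class OpportunityStatus(str, Enum):
    """Hypothetical lookup values; the real allowed values may differ."""
    POSTED = "posted"
    CLOSED = "closed"


@dataclass
class Opportunity:
    title: str
    description: str
    status: OpportunityStatus
    posted_date: date


def search(opportunities, query: str = "", status: Optional[str] = None,
           sort_by: str = "title", descending: bool = False):
    """Keyword match on title/description, optional status filter, then sort."""
    if sort_by not in ("title", "posted_date"):
        raise ValueError(f"unsupported sort field: {sort_by}")
    # OpportunityStatus(...) raises ValueError for values outside the lookup set.
    wanted = OpportunityStatus(status) if status is not None else None
    needle = query.lower()
    hits = [
        o for o in opportunities
        if (needle in o.title.lower() or needle in o.description.lower())
        and (wanted is None or o.status == wanted)
    ]
    return sorted(hits, key=lambda o: getattr(o, sort_by), reverse=descending)
```

Validating the filter value against an enumerated lookup set is one way to reject unknown values early instead of silently returning no results.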
Nice to have
S2S users can sign up for the API with a self-service authentication option that replaces the key management method established previously
S2S users can learn how to consume from the API by following a publicly documented user guide, in addition to referencing our OpenAPI specification
Observability metrics set up to display metrics in the application logging/monitoring tool selected
Proposed metrics
Total number of unique users of the search endpoint
Total number of API requests to the search endpoint
Track keywords searched in the API request
Track filters used in the API request
Regular reporting of error rate of API responses
Regular reporting of latency of API
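The proposed metrics above could be derived from structured request logs along these lines. The sketch assumes each request to the search endpoint is logged with a user identifier, keyword, filter names, status code, and latency. The `SearchRequestLog` shape, the `summarize` function, and the choice to count any status code of 400 or higher as an error are illustrative assumptions, independent of whichever logging and monitoring tool is selected.

```python
from collections import Counter
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class SearchRequestLog:
    """Hypothetical shape of one logged request to the search endpoint."""
    user_id: str
    keyword: str
    filters: list[str] = field(default_factory=list)
    status_code: int = 200
    latency_ms: float = 0.0


def summarize(logs: list[SearchRequestLog]) -> dict:
    """Aggregate the proposed metrics from a batch of request logs."""
    if not logs:
        return {"total_requests": 0}
    # Assumption: any 4xx or 5xx response counts toward the error rate.
    errors = sum(1 for log in logs if log.status_code >= 400)
    return {
        "unique_users": len({log.user_id for log in logs}),
        "total_requests": len(logs),
        # Keywords are lowercased so "Health" and "health" count together.
        "top_keywords": Counter(
            log.keyword.lower() for log in logs if log.keyword
        ).most_common(5),
        "filters_used": Counter(f for log in logs for f in log.filters),
        "error_rate": errors / len(logs),
        "mean_latency_ms": mean(log.latency_ms for log in logs),
    }
```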
Planning
Assumptions and dependencies
What functionality do we expect to be in place before work starts on this deliverable?
API: The search API will build on the existing backend work completed to launch the GET Opportunities endpoint which delivered the following functionality:
Backend CI/CD: Automatically tests and deploys backend code
Database Replica: Maintains eventual consistency (with low latency) between the data in grants.gov and simpler.grants.gov and ensures that simpler.grants.gov services remain available when grants.gov services experience downtime
Data Architecture: Enables simpler.grants.gov to read data from an updated (and simplified) data model
API Docs: Documents the API endpoints released with each deliverable
Is there any notable functionality we do not expect to be in place before work starts on this deliverable?
The Search UI 30k deliverable will be happening in parallel with this effort. The Search UI effort will use the Search API
The data model for Grants as a Protocol will not be completed before this work starts and we will need to update the API in a future deliverable
Open questions
Integrations
Translations
Does this deliverable involve delivering any content that needs translation?
Not at this time
If so, when will English-language content be locked? Then when will translation be started and completed?
n/a
Services going into PROD for the first time
This can include services going into PROD behind a feature flag that is not turned on.
Tool that we will use for transformation of data
Services being integrated in PROD for the first time
Are there multiple services that are being connected for the first time in PROD?
We will select a tool for ETL transformation and that will be going into production for the first time
Data being shared publicly for the first time
Are there any fields being shared publicly that have never been shared in PROD before?
Yes, new fields will be shared through this API but these fields have been shared publicly through grants.gov
Security considerations
Does this deliverable expose any new attack vectors or expand the attack surface of the product?
There will be an increase in the number of tables.
There is the potential for unauthorized access if it's not properly secured
There are risks associated with the transformation tool we choose if it is not well-vetted and configured securely
If so, how are we addressing these risks?
Implement secure coding practices
Enforce strong authentication and authorization mechanisms such as key management
The ETL tool will go through our ADR process, and we will select a tool with security factors among the decision criteria
Other security measures include scanning for vulnerabilities using automated tools and manual reviews, logging to track data access and usage, and going through the formal security review process to ensure that we are aligned with the SIA and the required security controls
Logs
Change log
Major updates to the content of this page will be added here.
Date | Update | Notes |
---|---|---|
4/5/2024 | Added change log and implementation log | This is part of the April onsite follow-up |
5/10/2024 | Updated deliverable status to "In progress" to reflect current state | |
Implementation log
Use this section to indicate when acceptance criteria in the "Definition of done" section have been completed, and provide notes on steps taken to satisfy these criteria when appropriate.
Date | Criteria completed | Notes |
---|---|---|
4/30/24 | All new services have completed a 508 compliance review (as needed) | There is not a UI component to this 30k |
4/30/24 | We've received security approval to host (but not share) non-public data in the AWS Simpler environments | The SIA and PIA update were submitted |
4/12/24 | The API reads data from a new opportunity data model with clearly defined tables, relationships, and attributes | Based on data model discovery and proposal that was reviewed with key stakeholders. Documented here: https://app.gitbook.com/o/cFcvhi6d0nlLyH2VzVgn/s/v1V0jIH7mb7Yb3jlNrgk/engineering/learnings/opportunity-endpoint-data-model |
4/12/24 | The API pulls data from more than just the opportunity data of the Grants.gov live database | The API pulls data from more than the opportunity table from the g.gov replicated database |
4/12/24 | An ERD for the new data model is documented in a publicly accessible location, and the ERD is automatically updated with future changes to the data model | |
4/12/24 | Select a logging and monitoring tool for backend and frontend | |
4/12/24 | Started the procurement process for the logging and monitoring tool | Procurement has been started: [Task]: Decide on Strategy/Workflow for NewRelic Procurement. Implementation steps have been outlined here: https://github.com/HHS/simpler-grants-gov/milestone/101 |
4/12/24 | Nice-to-have: S2S users can sign up for the API with a self-service authentication option that replaces the key management method established previously | Not doing, this is part of the API authn deliverable we'll do in the future. We've started conversations with Login to see what's feasible. |