# Search Engine

* **Status:** Active
* **Last Modified:** 2024-10-03
* **Related Issue:** [#2200](https://github.com/HHS/simpler-grants-gov/issues/2200)
* **Deciders:** Michael, Kai, Aaron
* **Tags:** search, api, backend, infra, opensearch, elastic, database

## Context and Problem Statement

Simpler needs to surface Grant Opportunities to Grant Seekers to ensure that they're able to discover grants that could support their communities. This begins with a powerful and performant search page with strong filtering and sorting capabilities, but also includes the ability to do recommendation and other second-order matching/similarity functionality that goes beyond search. We want to select a search backend that will support rapid development, and long term features and success.

## Decision Drivers

* Better Search relevance for grant seakers
* Ease of data ingestion/querying of data via the API
* Operational ease/performance
* Project mandate to lean in to Open Source

## Options Considered

* Roll our own search engine in [PostgreSQL](https://postgresql.org)
* [Elastic Search](https://www.elastic.co/)
* [OpenSearch](https://opensearch.org/)

## Decision Outcome

Chosen option: "OpenSearch" because:

* Rolling our own is more complex than we would like, even relative to deploying Opensearch, so that eliminates PostgreSQL
* Elastic had licensing issues/controversy that they've since walked back but has lost the trust of the Open Source community
* OpenSearch largely tracks Elastic, but has a dedicated community and is direcly committed to avoiding situations like Elastic experienced.

### Positive Consequences

* Using a dedicated search product provides a ton of functionality and features out of the box
* An isolated search engine allows us to scale/tune independently of the operational database.
* The "search queries" have documented syntax, an existing testing/troubleshooting environment, and can share expertise across Nava teams.

### Negative Consequences

* Additional infra/spend compared to utilizing an existing product we're already hosting

## Pros and Cons of the Options

### PostgreSQL

Utilize some PostgreSQL supported SQL syntax to do searches as SQL SELECT queries

* **Pros**
  * No additional infra/cost
  * Infra already exists
  * Works with existing data we're already tracking, no data sync needed
* **Cons**
  * Not a core/frequently utilized portion of the product functionality
  * Still wasn't really "utilizing" the existing data/skillset as special indexes/syntax are needed
  * Creates issues scaling since it would require scaling the entire DB instance to account for search traffic
  * Harder to support new search features, onboard new developers, etc.
  * Slow queries

#### Elastic Search

Stand up Elastic Search, pump Opportunity data in, use search DSL to power search

* **Pros**
  * Managed AWS service
  * Scales up/down independently of DB load
  * Purpose built for exact use case
  * Fast queries
  * Developed/Supported by a commercial entity
* **Cons**
  * Additional infra/cost
  * Recent licensing/open source feud/issues hurt the product's standing in the developer/customer realms
  * Data from DB must be kept up-to-date in the search engine

### OpenSearch

Stand up OpenSearch, pump Opportunity data in, use search DSL to power search

* **Pros**
  * Managed AWS service
  * Scales up/down independently of DB load
  * Purpose built for exact use case
  * Fast queries
  * Actively developed/supported by the Open Source Community
  * Open Source/Licensing in good standing
* **Cons**
  * Additional infra/cost
  * Data from DB must be kept up-to-date in the search engine
  * Features/improvements may lag Elastic Search a bit since it's not commercially funded


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://wiki.simpler.grants.gov/product/decisions/adr/2024-10-02-search-engine.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
