NewandModern¶
This document provides comprehensive documentation for all tables and columns in the NewandModern database schema.
Table of Contents¶
- Database Schema Diagrams
- Core Data Tables
- Classification Tables
- Mapping Tables
- Property Finder Tables
- Helper Tables
Database Schema Diagrams¶
High-Level Entity Relationship Diagram¶
The following diagram shows the main entities and their relationships:
erDiagram
attom_properties ||--o{ attom_transactions : "has"
attom_properties ||--o{ xmap_mls_attom_properties : "maps to"
attom_properties ||--o{ xmap_parcels_attom_properties : "maps to"
attom_properties ||--o{ xmap_permits_attom_properties : "maps to"
attom_transactions ||--o{ xmap_attom_transactions_companies : "links to"
companies ||--o{ companies_members : "has"
companies_members ||--o{ companies_members_contacts : "has"
registered_builders ||--o{ xmap_registered_builders_companies : "maps to"
attom_transactions ||--o{ xmap_attom_transactions_registered_builders_companies : "links to"
mls ||--o{ xmap_mls_attom_properties : "maps to"
permits ||--o{ xmap_permits_attom_properties : "links to"
parcels ||--o{ xmap_parcels_attom_properties : "maps to"
neighborhoods ||--o{ xmap_attom_properties_neighborhoods : "contains"
Core Property Data Flow¶
graph TB
A[ATTOM Properties] --> B[Property Transactions]
A --> C[Parcels]
A --> D[Neighborhoods]
A --> E[MLS Listings]
A --> F[Building Permits]
B --> G[Transaction Classifications]
B --> H[Builder Mappings]
B --> I[Company Mappings]
F --> J[Permit Classifications]
E --> K[New Construction Classifications]
Classification and Mapping Flow¶
graph LR
A[Raw Data Sources] --> B[Core Tables]
B --> C[Classification Tables]
B --> D[Mapping Tables]
C --> E[Property Finder]
D --> E
C --> F[Analytics & Rankings]
D --> F
Core Data Tables¶
The core data tables store the primary entities in the system, representing real-world objects and transactions from various data sources. These tables form the foundation of the database and are referenced by classification and mapping tables.
Entity Relationship: Core Data Tables¶
erDiagram
attom_properties {
int attom_id PK
varchar situs_address
numeric latitude
numeric longitude
int tax_assessed_value
}
attom_transactions {
int8 transaction_id PK
int4 attom_id FK
date instrument_date
numeric transfer_amount
}
companies {
varchar entity_id PK
varchar company_name
varchar registered_agent
}
companies_members {
varchar member_id PK
text entity_id FK
text member_name
}
mls {
int4 id PK
varchar mls_id
date list_date
numeric current_price
}
permits {
varchar permit_id PK
varchar muni_code
date permit_date
varchar permit_type
}
parcels {
int4 parcel_id PK
varchar pams_pin
varchar owner_name
}
registered_builders {
varchar builder_registration_id PK
varchar company_name
varchar registration_status
}
home_improvement_contractors {
int4 id PK
varchar contractor_name
varchar license_number
}
neighborhoods {
varchar id PK
varchar name
geometry geometry
}
attom_properties ||--o{ attom_transactions : "has transactions"
companies ||--o{ companies_members : "has members"
attom_properties¶
Stores property information from ATTOM Data Solutions, including address details, property characteristics, ownership information, and tax assessment data. This is the central table in the system, serving as the primary reference for all property-related data.
| Column | Type | Description |
|---|---|---|
attom_id |
int | Primary key. Unique identifier for the property in the ATTOM system. |
county_key |
int | County identifier key. |
situs_address |
varchar(150) | Full situs (legal) address of the property. |
situs_house_number |
varchar(25) | House number component of the situs address. |
situs_direction |
varchar(10) | Directional prefix (N, S, E, W, etc.) for the street. |
situs_street_name |
varchar(100) | Street name component of the situs address. |
situs_address_suffix |
varchar(25) | Street suffix (St, Ave, Blvd, etc.). |
situs_unit_prefix |
varchar(20) | Unit prefix (Apt, Unit, Suite, etc.). |
situs_unit_value |
varchar(25) | Unit number or identifier. |
situs_city |
varchar(50) | City where the property is located. |
situs_state |
varchar(15) | State where the property is located. |
situs_zip |
varchar(10) | ZIP code of the property. |
situs_zip_4 |
varchar(4) | ZIP+4 extension code. |
latitude |
numeric(9,6) | Geographic latitude coordinate. |
longitude |
numeric(9,6) | Geographic longitude coordinate. |
block_number_1 |
varchar(10) | Block number identifier. |
lot_number |
varchar(10) | Lot number identifier. |
year_built |
int | Year the property was constructed. |
property_type_key |
int | Key referencing the property type classification. |
lot_size_square_feet |
numeric(12,2) | Size of the lot in square feet. |
lot_depth |
numeric(9,1) | Depth of the lot in feet. |
lot_width |
numeric(9,1) | Width of the lot in feet. |
property_use_muni |
varchar(5) | Municipal property use code. |
property_zoning |
varchar(50) | Zoning classification of the property. |
building_square_feet |
int | Total square footage of the building. |
minor_civil_division_code |
int | Code for minor civil division. |
minor_civil_division_name |
varchar(100) | Name of the minor civil division. |
owner_1_full_name |
varchar(150) | Full name of the primary owner. |
tax_bill_mailing_address |
varchar(100) | Mailing address for tax bills. |
tax_bill_mailing_city |
varchar(50) | City for tax bill mailing address. |
tax_bill_mailing_state |
varchar(15) | State for tax bill mailing address. |
tax_bill_mailing_zip |
varchar(10) | ZIP code for tax bill mailing address. |
deed_last_sale_date |
timestamp | Date of the last recorded sale. |
last_ownership_transfer_transaction_id |
int8 | Reference to the last ownership transfer transaction. |
assessor_year |
int | Year of the tax assessment. |
tax_assessed_value |
int | Total assessed value for tax purposes. |
tax_improvement_value |
int | Assessed value of improvements (buildings). |
tax_land_value |
int | Assessed value of the land. |
tax_improvement_percent |
numeric(6,2) | Percentage of total value attributed to improvements. |
previous_assessed_value |
int | Previously assessed value for comparison. |
Relationships:
- Referenced by: attom_transactions, xclass_attom_properties_property_type, xclass_attom_properties_tax_new_const, xclass_attom_transactions_historical_classification, xclass_mls_attom_transactions_builder_lot_purchases, xclass_new_construction_by_price_increase, xclass_ranking_attom_properties, xmap_attom_properties_neighborhoods, xmap_mls_attom_properties, xmap_parcels_attom_properties, xmap_parcels_attom_properties_staging, xmap_permits_attom_properties, xproperty_finder_classifications
attom_transactions¶
Stores real estate transaction records from ATTOM Data Solutions, including deed transfers, mortgage information, and party details.
| Column | Type | Description |
|---|---|---|
transaction_id |
int8 | Primary key. Unique identifier for the transaction. |
attom_id |
int4 | Foreign key to attom_properties.attom_id. Property associated with this transaction. |
detailed_document_type_key |
int4 | Key for detailed document type classification. |
instrument_date |
date | Date the instrument (deed, mortgage, etc.) was executed. |
recording_date |
date | Date the document was recorded with the county recorder. |
transfer_type_key |
int4 | Key for transfer type classification. |
transfer_category_key |
int4 | Key for transfer category classification. |
transfer_amount |
numeric | Monetary amount of the transfer. |
grantor_1_name_full |
varchar(150) | Full name of the first grantor (seller). |
grantor_1_name_first |
varchar(50) | First name of the first grantor. |
grantor_1_name_middle |
varchar(20) | Middle name of the first grantor. |
grantor_1_name_last |
varchar(50) | Last name of the first grantor. |
grantee_1_name_full |
varchar(150) | Full name of the first grantee (buyer). |
grantee_1_name_first |
varchar(50) | First name of the first grantee. |
grantee_1_name_middle |
varchar(20) | Middle name of the first grantee. |
grantee_1_name_last |
varchar(50) | Last name of the first grantee. |
grantee_mail_address_full |
varchar(100) | Complete mailing address of the grantee. |
grantee_mail_address_house_number |
varchar(25) | House number in grantee's mailing address. |
grantee_mail_address_street_direction |
varchar(10) | Street direction in grantee's mailing address. |
grantee_mail_address_street_name |
varchar(100) | Street name in grantee's mailing address. |
grantee_mail_address_street_suffix |
varchar(25) | Street suffix in grantee's mailing address. |
grantee_mail_address_unit_prefix |
varchar(20) | Unit prefix in grantee's mailing address. |
grantee_mail_address_unit_value |
varchar(25) | Unit value in grantee's mailing address. |
grantee_mail_address_city |
varchar(50) | City in grantee's mailing address. |
grantee_mail_address_state |
varchar(2) | State abbreviation in grantee's mailing address. |
grantee_mail_address_zip |
varchar(5) | ZIP code in grantee's mailing address. |
grantee_mail_address_zip_4 |
varchar(4) | ZIP+4 extension in grantee's mailing address. |
mortgage_1_loan_type_key |
int4 | Key for the first mortgage loan type. |
mortgage_1_amount |
int4 | Amount of the first mortgage loan. |
transaction_type_key |
int4 | Key for transaction type classification. |
record_last_updated |
timestamp | Timestamp when the record was last updated. |
Relationships:
- References: attom_properties.attom_id
- Referenced by: xclass_attom_properties_image_new_construction, xclass_attom_transactions_historical_classification, xclass_attom_transactions_purchase_type, xclass_is_company_attom_transactions, xclass_mls_attom_transactions_builder_lot_purchases, xclass_new_construction_by_price_increase, xmap_attom_transactions_companies, xmap_attom_transactions_historical_builder_purchases_companies, xmap_attom_transactions_registered_builders_companies, xmap_attom_transactions_registered_builders_companies_members, xproperty_finder_classifications
companies¶
Stores company entity information extracted from business registration documents.
| Column | Type | Description |
|---|---|---|
entity_id |
varchar(25) | Primary key. Unique identifier for the company entity. |
filename |
varchar(100) | Name of the source file from which the data was extracted. |
url |
text | URL or path to the source document. |
company_name |
varchar(100) | Official name of the company. |
registered_agent |
varchar(100) | Name of the registered agent for the company. |
registered_office |
varchar(150) | Address of the registered office. |
business_purpose |
text | Description of the company's business purpose. |
main_business_address |
varchar(150) | Primary business address of the company. |
alternate_business_name |
varchar(100) | Alternative or DBA (Doing Business As) name. |
signatures |
varchar(100) | Signatures or signatories associated with the company. |
extraction_timestamp |
timestamptz | Timestamp when the data was extracted from the source document. |
Relationships:
- Referenced by: companies_members, companies_members_contacts, xmap_attom_transactions_companies, xmap_attom_transactions_historical_builder_purchases_companies, xmap_registered_builders_companies, xhelper_document_segments
companies_members¶
Stores information about members (owners, officers, etc.) associated with companies.
| Column | Type | Description |
|---|---|---|
member_id |
varchar(64) | Primary key. Unique identifier for the member record. |
id |
int4 | Internal ID. |
company_extraction_id |
int4 | Reference to the company extraction record. |
document_segment_id |
int4 | Reference to the document segment where this member was found. |
entity_id |
text | Foreign key to companies.entity_id. Company this member belongs to. |
member_name |
text | Full name of the member. |
member_address |
text | Address of the member. |
extraction_timestamp |
timestamptz | Timestamp when the member data was extracted. |
Relationships:
- References: companies.entity_id
- Referenced by: companies_members_contacts, xmap_attom_transactions_registered_builders_companies_members
companies_members_contacts¶
Stores contact information (phone numbers, email addresses) for company members, including TCPA compliance status.
| Column | Type | Description |
|---|---|---|
entity_id |
varchar | Primary key (part 1). Foreign key to companies.entity_id. |
member_id |
varchar | Primary key (part 2). Foreign key to companies_members.member_id. |
phone_numbers |
text[] | Array of phone numbers associated with the member. |
email_addresses |
text[] | Array of email addresses associated with the member. |
mailing_address |
text | Mailing address for the member. |
tcpa_status |
varchar(100) | TCPA (Telephone Consumer Protection Act) compliance status. |
batch_data_obj |
jsonb | JSON object containing batch processing data. |
created_at |
timestamptz | Timestamp when the record was created. Default: CURRENT_TIMESTAMP. |
updated_at |
timestamptz | Timestamp when the record was last updated. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: companies.entity_id, companies_members.member_id
home_improvement_contractors¶
Stores information about licensed home improvement contractors.
| Column | Type | Description |
|---|---|---|
id |
int4 | Primary key. Auto-incrementing identifier. |
contractor_name |
varchar(255) | Name of the contractor or company. |
license_number |
varchar(100) | License number issued to the contractor. |
profession |
varchar(100) | Professional category or trade. |
license_type |
varchar(100) | Type of license held. |
license_status |
varchar(100) | Current status of the license (active, expired, etc.). |
city |
varchar(100) | City where the contractor is located. |
state |
varchar(100) | State where the contractor is licensed. |
owner_name |
varchar(255) | Name of the owner of the contracting business. |
street_address |
text | Street address of the contractor. |
issue_date |
date | Date the license was issued. |
doing_business_as |
varchar(255) | DBA (Doing Business As) name if applicable. |
expiration_date |
date | Date when the license expires. |
status_change_reason |
varchar(100) | Reason for any status change. |
zip |
varchar(25) | ZIP code of the contractor's address. |
mls¶
Stores Multiple Listing Service (MLS) property listing data.
| Column | Type | Description |
|---|---|---|
id |
int4 | Primary key. Auto-incrementing identifier. |
mls_id |
varchar(50) | Unique identifier for the MLS listing. |
address |
varchar(100) | Full address of the property. |
street_number |
varchar(50) | Street number component. |
street_direction |
varchar(50) | Street direction (N, S, E, W, etc.). |
street_name |
varchar(50) | Street name. |
street_suffix |
varchar(50) | Street suffix (St, Ave, etc.). |
unit_number |
varchar(50) | Unit or apartment number. |
state |
varchar(50) | State where the property is located. |
county |
varchar(50) | County where the property is located. |
city |
varchar(50) | City where the property is located. |
municipality |
varchar(50) | Municipality name. |
zip |
varchar(50) | ZIP code. |
list_date |
date | Date the property was listed. |
current_price |
numeric | Current listing price. |
status |
varchar(50) | Current listing status (active, pending, sold, etc.). |
status_date |
date | Date when the status was last updated. |
sold_price |
numeric | Price at which the property was sold (if applicable). |
sold_date |
date | Date the property was sold (if applicable). |
original_list_price |
numeric | Original listing price. |
property_type |
varchar(50) | Type of property (single family, condo, etc.). |
beds |
numeric | Number of bedrooms. |
baths |
numeric | Number of bathrooms. |
home_sqft |
numeric | Square footage of the home. |
lot_sqft |
numeric | Square footage of the lot. |
lot_dimension |
varchar(50) | Lot dimensions description. |
year_built |
int4 | Year the property was built. |
advertising_remarks |
text | Marketing and advertising remarks. |
image |
text | URL or path to property image. |
block |
varchar(50) | Block number. |
lot |
varchar(50) | Lot number. |
latitude |
numeric | Geographic latitude coordinate. |
longitude |
numeric | Geographic longitude coordinate. |
mls_source |
varchar(50) | Source MLS system or organization. |
source_file |
varchar(100) | Source file from which the data was imported. |
created_at |
timestamp | Timestamp when the record was created. Default: now(). |
updated_at |
timestamp | Timestamp when the record was last updated. Default: now(). |
Relationships:
- Referenced by: xclass_mls_attom_transactions_builder_lot_purchases, xclass_mls_new_construction, xclass_mls_new_construction_new, xclass_new_construction_by_price_increase, xmap_mls_attom_properties
neighborhoods¶
Stores neighborhood boundary definitions with geometric data.
| Column | Type | Description |
|---|---|---|
id |
varchar(50) | Primary key. Unique identifier for the neighborhood. |
name |
varchar(255) | Name of the neighborhood. |
created_date |
timestamp | Date when the neighborhood record was created. |
updated_date |
timestamp | Date when the neighborhood record was last updated. |
geometry |
geometry(polygon, 4326) | Geographic polygon defining the neighborhood boundaries (WGS84). |
Relationships:
- Referenced by: xmap_attom_properties_neighborhoods
parcels¶
Stores parcel data from the PAMS (Property Assessment Management System) with geographic boundaries.
| Column | Type | Description |
|---|---|---|
parcel_id |
int4 | Primary key. Auto-incrementing identifier. |
pams_pin |
varchar | PAMS parcel identification number. |
pcl_mun |
varchar | Municipality code. |
pclblock |
varchar | Block identifier. |
pcllot |
varchar | Lot identifier. |
pclqcode |
varchar | Quality code. |
pcllastupd |
timestamptz | Last update timestamp. |
pin_nodup |
varchar | PIN without duplicates. |
gis_pin |
varchar | GIS parcel identification number. |
cd_code |
varchar | Code identifier. |
prop_class |
varchar | Property class code. |
county |
varchar | County name. |
mun_name |
varchar | Municipality name. |
prop_loc |
varchar | Property location description. |
owner_name |
varchar | Name of the property owner. |
st_address |
varchar | Street address. |
city_state |
varchar | City and state. |
zip_code |
varchar | ZIP code. |
land_val |
int4 | Land value assessment. |
imprvt_val |
int4 | Improvement value assessment. |
net_value |
int4 | Net assessed value. |
last_yr_tx |
varchar | Last year tax information. |
bldg_desc |
varchar | Building description. |
land_desc |
varchar | Land description. |
calc_acre |
float8 | Calculated acreage. |
add_lots1 |
varchar | Additional lot identifier 1. |
add_lots2 |
varchar | Additional lot identifier 2. |
fac_name |
varchar | Facility name. |
prop_use |
varchar | Property use code. |
bldg_class |
varchar | Building class code. |
deed_book |
varchar | Deed book reference. |
deed_page |
varchar | Deed page reference. |
deed_date |
varchar | Deed date. |
yr_constr |
varchar | Year of construction. |
sales_code |
varchar | Sales code. |
sale_price |
int4 | Sale price. |
dwell |
varchar | Dwelling information. |
comm_dwell |
varchar | Commercial dwelling information. |
old_propid |
varchar | Old property identifier. |
zip5 |
varchar | 5-digit ZIP code. |
zip_plus4 |
varchar | ZIP+4 code. |
pcl_pbdate |
timestamptz | Publication date. |
pcl_guid |
varchar | GUID identifier. |
shape_length |
float8 | Length of the parcel shape boundary. |
shape_area |
float8 | Area of the parcel shape. |
shape |
geometry(geometry, 3424) | Geographic geometry of the parcel (State Plane coordinate system). |
Relationships:
- Referenced by: xmap_parcels_attom_properties, xmap_parcels_attom_properties_staging
permits¶
Stores building permit information from municipal sources.
| Column | Type | Description |
|---|---|---|
permit_id |
varchar(25) | Primary key. Unique identifier for the permit. |
muni_code |
varchar(4) | Municipality code. |
treasury_code |
varchar(4) | Treasury code. |
municipality_name |
varchar(100) | Name of the municipality. |
muni_type |
varchar(20) | Type of municipality. |
county |
varchar(50) | County name. |
record_id |
varchar(15) | Record identifier. |
block |
varchar(20) | Block number. |
lot |
varchar(20) | Lot number. |
permit_number |
varchar(30) | Permit number. |
status |
bpchar(1) | Single character status code. |
permit_status_description |
varchar(50) | Description of the permit status. |
permit_date |
date | Date the permit was issued. |
certificate_date |
date | Date the certificate was issued. |
permit_type |
varchar(5) | Type code for the permit. |
permit_type_description |
varchar(50) | Description of the permit type. |
update |
varchar(10) | Update indicator. |
certificate_type |
varchar(5) | Type code for the certificate. |
certificate_type_description |
varchar(100) | Description of the certificate type. |
certificate_count |
int4 | Number of certificates. |
building_fee |
numeric(12,2) | Building permit fee. |
plumbing_fee |
numeric(12,2) | Plumbing permit fee. |
electrical_fee |
numeric(12,2) | Electrical permit fee. |
fire_fee |
numeric(12,2) | Fire permit fee. |
dca_fee |
numeric(12,2) | DCA (Department of Community Affairs) fee. |
certificate_fee |
numeric(12,2) | Certificate fee. |
elevator_fee |
numeric(12,2) | Elevator permit fee. |
other_fee |
numeric(12,2) | Other fees. |
total_fee |
numeric(14,2) | Total of all fees. |
volume |
int8 | Volume reference. |
square_feet |
int8 | Square footage covered by the permit. |
construction_cost |
numeric(16,2) | Estimated or actual construction cost. |
sale_units_gained |
int4 | Number of sale units gained from the project. |
rental_units_gained |
int4 | Number of rental units gained from the project. |
use_group |
varchar(10) | Use group code. |
use_group_description |
varchar(255) | Description of the use group. |
census_item_number |
varchar(10) | Census item number. |
census_item_number_description |
varchar(100) | Description of the census item. |
public |
varchar(10) | Public indicator. |
storage |
varchar(10) | Storage indicator. |
manufactured |
varchar(10) | Manufactured home indicator. |
hud_seal |
varchar(10) | HUD seal indicator. |
source |
varchar(10) | Source code. |
source_description |
varchar(50) | Description of the data source. |
version |
varchar(10) | Version identifier. |
process_date |
date | Date the record was processed. |
attom_id |
int8 | Foreign key to attom_properties.attom_id (if mapped). |
Relationships:
- Referenced by: xclass_permits_muni_publishing_status, xmap_permits_attom_properties
registered_builders¶
Stores information about registered builders from various sources.
| Column | Type | Description |
|---|---|---|
builder_registration_id |
varchar(100) | Primary key. Unique identifier for the builder registration. |
company_name |
varchar(100) | Name of the builder company. |
company_location |
varchar(100) | Location of the builder company. |
registration_status |
varchar(100) | Current status of the registration. |
source |
varchar(100) | Source of the registration data. |
Relationships:
- Referenced by: xclass_registered_builders_is_company, xmap_attom_transactions_registered_builders_companies, xmap_attom_transactions_registered_builders_companies_members, xmap_registered_builders_companies, xproperty_finder_classifications
log_ingestion¶
Stores logging information for data ingestion processes, tracking script execution, status, and results.
| Column | Type | Description |
|---|---|---|
id |
int4 | Primary key. Auto-incrementing identifier. |
script_name |
varchar(255) | Name of the ingestion script that was executed. |
start_time |
timestamp | Timestamp when the script started execution. |
end_time |
timestamp | Timestamp when the script finished execution. |
status |
varchar(50) | Status of the ingestion (success, failed, in_progress, etc.). |
comments |
text | Additional comments or notes about the ingestion. |
rows_inserted |
int4 | Number of rows inserted during the ingestion. |
rows_updated |
int4 | Number of rows updated during the ingestion. |
rows_deleted |
int4 | Number of rows deleted during the ingestion. |
error_message |
text | Error message if the ingestion failed. |
metadata |
jsonb | JSON object containing additional metadata about the ingestion. |
created_at |
timestamp | Timestamp when the log record was created. Default: CURRENT_TIMESTAMP. |
Classification Tables¶
Classification tables store AI-powered analysis and categorization of data from core tables. These tables use machine learning and rule-based logic to classify properties, transactions, and entities for business intelligence purposes. Common classifications include identifying new construction, builder purchases, company transactions, and property rankings.
Classification System Overview¶
graph TB
A[Core Data Tables] --> B[Classification Engine]
B --> C{Classification Types}
C --> D[New Construction]
C --> E[Builder Purchases]
C --> F[Company Transactions]
C --> G[Property Rankings]
C --> H[Transaction Types]
D --> I[xclass_attom_properties_tax_new_const]
D --> J[xclass_mls_new_construction]
E --> K[xclass_mls_attom_transactions_builder_lot_purchases]
F --> L[xclass_is_company_attom_transactions]
G --> M[xclass_ranking_attom_properties]
H --> N[xclass_attom_transactions_purchase_type]
Classification Table Relationships¶
erDiagram
attom_properties ||--o{ xclass_attom_properties_property_type : "classified as"
attom_properties ||--o{ xclass_attom_properties_tax_new_const : "analyzed for"
attom_properties ||--o{ xclass_ranking_attom_properties : "ranked in"
attom_transactions ||--o{ xclass_attom_transactions_historical_classification : "classified as"
attom_transactions ||--o{ xclass_attom_transactions_purchase_type : "typed as"
attom_transactions ||--o{ xclass_is_company_attom_transactions : "flagged as"
attom_transactions ||--o{ xclass_mls_attom_transactions_builder_lot_purchases : "identified as"
mls ||--o{ xclass_mls_new_construction : "classified as"
mls ||--o{ xclass_new_construction_by_price_increase : "analyzed for"
permits ||--o{ xclass_permits_muni_publishing_status : "aggregated by"
registered_builders ||--o{ xclass_registered_builders_is_company : "identified as"
xclass_attom_properties_image_new_construction¶
Classifies properties as construction sites based on image analysis.
| Column | Type | Description |
|---|---|---|
transaction_id |
int8 | Primary key. Foreign key to attom_transactions.transaction_id. |
is_construction_site |
bool | Boolean flag indicating if the property is identified as a construction site. |
processed_timestamp |
timestamptz | Timestamp when the classification was processed. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: attom_transactions.transaction_id
xclass_attom_properties_property_type¶
Stores property type classifications for ATTOM properties.
| Column | Type | Description |
|---|---|---|
attom_id |
int8 | Primary key. Foreign key to attom_properties.attom_id. |
property_type |
varchar | Classification of the property type. |
created_at |
timestamp | Timestamp when the classification was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: attom_properties.attom_id
xclass_attom_properties_tax_new_const¶
Identifies likely new construction based on tax assessment value changes.
| Column | Type | Description |
|---|---|---|
attom_id |
int4 | Foreign key to attom_properties.attom_id. |
situs_address |
varchar(150) | Property address. |
situs_city |
varchar(50) | Property city. |
situs_state |
varchar(15) | Property state. |
situs_zip |
varchar(10) | Property ZIP code. |
assessor_year |
int4 | Year of the assessment. |
tax_assessed_value |
int4 | Current tax assessed value. |
previous_assessed_value |
int4 | Previous tax assessed value. |
tax_land_value |
int4 | Current tax land value. |
tax_improvement_value |
int4 | Current tax improvement value. |
total_value_delta |
int4 | Change in total assessed value. |
improvement_value_delta |
int4 | Change in improvement value. |
total_percent_change |
float8 | Percentage change in total value. |
improvement_percent_change |
float8 | Percentage change in improvement value. |
likely_new_construction |
bool | Boolean flag indicating likely new construction based on value changes. |
Relationships:
- References: attom_properties.attom_id
xclass_attom_transactions_historical_classification¶
Stores historical transaction classifications with reasoning.
| Column | Type | Description |
|---|---|---|
transaction_id |
int8 | Primary key. Foreign key to attom_transactions.transaction_id. |
attom_id |
int8 | Foreign key to attom_properties.attom_id. |
year_built |
int4 | Year the property was built. |
instrument_date |
date | Date of the instrument. |
grantor_name |
varchar | Name of the grantor. |
grantee_name |
varchar | Name of the grantee. |
transfer_amount |
numeric | Amount of the transfer. |
document_type_description |
varchar | Description of the document type. |
mortgage_amount |
numeric | Mortgage amount if applicable. |
mortgage_type_description |
varchar | Description of the mortgage type. |
classification |
varchar | Classification assigned to the transaction. |
reasoning |
text | Text explaining the reasoning behind the classification. |
classified_at |
timestamp | Timestamp when the classification was made. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: attom_transactions.transaction_id, attom_properties.attom_id
xclass_attom_transactions_purchase_type¶
Classifies transaction purchase types.
| Column | Type | Description |
|---|---|---|
transaction_id |
int8 | Primary key. Foreign key to attom_transactions.transaction_id. |
transaction_type |
varchar | Type of transaction purchase classification. |
notes |
text | Additional notes about the classification. |
status |
varchar | Status of the classification. |
created_at |
timestamp | Timestamp when the classification was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: attom_transactions.transaction_id
xclass_is_company_attom_transactions¶
Flags transactions where the grantee is identified as a company.
| Column | Type | Description |
|---|---|---|
transaction_id |
int8 | Primary key. Foreign key to attom_transactions.transaction_id. |
is_company_flag_nm |
bool | Boolean flag indicating if the transaction involves a company (name matching). Default: false. |
Relationships:
- References: attom_transactions.transaction_id
xclass_mls_attom_transactions_builder_lot_purchases¶
Identifies MLS listings that represent builder lot purchases, with confidence scoring and AI reasoning.
| Column | Type | Description |
|---|---|---|
id |
int4 | Primary key. Auto-incrementing identifier. |
mls_id |
varchar | Unique identifier. Foreign key to mls.mls_id. |
attom_id |
int4 | Foreign key to attom_properties.attom_id. |
transaction_id |
int8 | Foreign key to attom_transactions.transaction_id. |
confidence_score |
int4 | Confidence score for the classification (typically 0-100). |
gemini_reasoning |
text | Reasoning provided by Gemini AI for the classification. |
property_summary |
jsonb | JSON object containing property summary information. |
created_at |
timestamptz | Timestamp when the classification was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: mls.mls_id, attom_properties.attom_id, attom_transactions.transaction_id
xclass_mls_new_construction¶
Classifies MLS listings as new construction with AI reasoning.
| Column | Type | Description |
|---|---|---|
mls_id |
text | Primary key. Foreign key to mls.mls_id. |
construction_classification |
text | Classification of the construction type. |
gemini_reasoning |
text | Reasoning provided by Gemini AI for the classification. |
created_at |
timestamp | Timestamp when the classification was created. Default: now(). |
Relationships:
- References: mls.mls_id
xclass_mls_new_construction_new¶
Alternative table for classifying MLS listings as new construction (newer version).
| Column | Type | Description |
|---|---|---|
mls_id |
text | Primary key. Foreign key to mls.mls_id. |
construction_classification |
text | Classification of the construction type. |
gemini_reasoning |
text | Reasoning provided by Gemini AI for the classification. |
created_at |
timestamp | Timestamp when the classification was created. Default: now(). |
Relationships:
- References: mls.mls_id
xclass_new_construction_by_price_increase¶
Identifies new construction properties based on significant price increases between purchase and sale transactions.
| Column | Type | Description |
|---|---|---|
id |
int4 | Primary key. Auto-incrementing identifier. |
mls_id |
varchar | Unique identifier. Foreign key to mls.mls_id. |
attom_id |
int4 | Foreign key to attom_properties.attom_id. |
purchase_transaction_id |
int8 | Foreign key to attom_transactions.transaction_id for the purchase transaction. |
sale_transaction_id |
int8 | Foreign key to attom_transactions.transaction_id for the sale transaction. |
purchase_date |
date | Date of the purchase transaction. |
sale_date |
date | Date of the sale transaction. |
purchase_amount |
numeric | Amount of the purchase transaction. |
sale_amount |
numeric | Amount of the sale transaction. |
price_increase_pct |
numeric | Percentage increase in price between purchase and sale. |
days_between_transactions |
int4 | Number of days between purchase and sale transactions. |
confidence_score |
int4 | Confidence score for the classification. |
gemini_reasoning |
text | Reasoning provided by Gemini AI for the classification. |
property_summary |
jsonb | JSON object containing property summary information. |
created_at |
timestamptz | Timestamp when the classification was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: mls.mls_id, attom_properties.attom_id, attom_transactions.transaction_id (purchase), attom_transactions.transaction_id (sale)
xclass_permits_muni_publishing_status¶
Analyzes and classifies the publishing status of permit data by municipality.
| Column | Type | Description |
|---|---|---|
muni_code |
varchar(4) | Primary key. Municipality code. Foreign key to permits.muni_code. |
mcdc |
varchar(10) | MCDC code. |
municipality_name |
varchar(100) | Name of the municipality. |
county |
varchar(50) | County name. |
total_permits |
int4 | Total number of permits for this municipality. |
earliest_permit_date |
date | Earliest permit date in the dataset. |
latest_permit_date |
date | Latest permit date in the dataset. |
date_range_days |
int4 | Number of days between earliest and latest permit dates. |
unique_permit_years |
int4 | Number of unique years with permit data. |
avg_permits_per_year |
numeric(10,2) | Average number of permits per year. |
publishing_status |
varchar(50) | Classification of the publishing status (active, inactive, etc.). |
publishing_confidence |
varchar(20) | Confidence level of the publishing status assessment. |
analysis_notes |
text | Notes from the analysis. |
created_at |
timestamp | Timestamp when the analysis was created. Default: CURRENT_TIMESTAMP. |
updated_at |
timestamp | Timestamp when the analysis was last updated. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: permits.muni_code
xclass_ranking_attom_properties¶
Stores ranking and scoring information for ATTOM properties, including hotspot scores and tax improvement scores.
| Column | Type | Description |
|---|---|---|
attom_id |
int4 | Foreign key to attom_properties.attom_id. |
city |
text | City where the property is located. |
ranking |
float8 | Overall ranking score for the property. |
hotspot_score |
float8 | Hotspot score indicating desirability or activity level. |
tax_improvement_score |
numeric | Score based on tax improvement value changes. |
Relationships:
- References: attom_properties.attom_id
xclass_registered_builders_is_company¶
Classifies registered builders as companies using AI reasoning.
| Column | Type | Description |
|---|---|---|
builder_registration_id |
text | Primary key. Foreign key to registered_builders.builder_registration_id. |
is_company |
bool | Boolean flag indicating if the builder is classified as a company. |
gemini_reasoning |
text | Reasoning provided by Gemini AI for the classification. |
created_at |
timestamptz | Timestamp when the classification was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: registered_builders.builder_registration_id
Mapping Tables¶
Mapping tables establish relationships between entities from different data sources. These tables enable cross-referencing and correlation of data from ATTOM properties, transactions, MLS listings, permits, parcels, companies, and builders. Mapping tables often include match scores and AI-powered confirmation flags to validate the quality of relationships.
Mapping System Overview¶
graph LR
A[ATTOM Properties] --> B[Mapping Engine]
C[MLS Listings] --> B
D[Permits] --> B
E[Parcels] --> B
F[Transactions] --> B
G[Companies] --> B
H[Builders] --> B
I[Neighborhoods] --> B
B --> J[Match Scoring]
J --> K[AI Confirmation]
K --> L[Mapping Tables]
Mapping Table Relationships¶
erDiagram
attom_properties ||--o{ xmap_mls_attom_properties : "maps to"
attom_properties ||--o{ xmap_parcels_attom_properties : "maps to"
attom_properties ||--o{ xmap_permits_attom_properties : "maps to"
attom_properties ||--o{ xmap_attom_properties_neighborhoods : "located in"
attom_transactions ||--o{ xmap_attom_transactions_companies : "involves"
attom_transactions ||--o{ xmap_attom_transactions_registered_builders_companies : "linked to"
companies ||--o{ xmap_attom_transactions_companies : "participates in"
registered_builders ||--o{ xmap_registered_builders_companies : "maps to"
registered_builders ||--o{ xmap_attom_transactions_registered_builders_companies : "linked to"
mls ||--o{ xmap_mls_attom_properties : "maps to"
permits ||--o{ xmap_permits_attom_properties : "maps to"
parcels ||--o{ xmap_parcels_attom_properties : "maps to"
neighborhoods ||--o{ xmap_attom_properties_neighborhoods : "contains"
xmap_attom_properties_neighborhoods¶
Maps ATTOM properties to neighborhoods.
| Column | Type | Description |
|---|---|---|
attom_id |
int8 | Primary key. Foreign key to attom_properties.attom_id. |
neighborhood_id |
varchar(255) | Foreign key to neighborhoods.id. |
created_at |
timestamp | Timestamp when the mapping was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: attom_properties.attom_id, neighborhoods.id
xmap_attom_transactions_companies¶
Maps ATTOM transactions to companies, with match scoring and AI confirmation.
| Column | Type | Description |
|---|---|---|
transaction_id |
int8 | Primary key (part 1). Foreign key to attom_transactions.transaction_id. |
entity_id |
text | Primary key (part 2). Foreign key to companies.entity_id. |
grantee_1_name_full |
text | Full name of the grantee from the transaction. |
company_name |
text | Name of the matched company. |
match_score |
float4 | Numeric score indicating the quality of the match. |
gemini_confirmed |
bool | Boolean flag indicating if Gemini AI confirmed the match. Default: false. |
gemini_reasoning |
text | Reasoning provided by Gemini AI for the match confirmation. |
created_at |
timestamptz | Timestamp when the mapping was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: attom_transactions.transaction_id, companies.entity_id
xmap_attom_transactions_historical_builder_purchases_companies¶
Maps historical builder purchase transactions to companies.
| Column | Type | Description |
|---|---|---|
transaction_id |
int8 | Primary key (part 1). Foreign key to attom_transactions.transaction_id. |
entity_id |
text | Primary key (part 2). Foreign key to companies.entity_id. |
grantee_1_name_full |
text | Full name of the grantee from the transaction. |
company_name |
text | Name of the matched company. |
match_score |
float4 | Numeric score indicating the quality of the match. |
gemini_confirmed |
bool | Boolean flag indicating if Gemini AI confirmed the match. Default: false. |
gemini_reasoning |
text | Reasoning provided by Gemini AI for the match confirmation. |
created_at |
timestamptz | Timestamp when the mapping was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: attom_transactions.transaction_id, companies.entity_id
xmap_attom_transactions_registered_builders_companies¶
Maps ATTOM transactions to registered builders and companies with detailed match scoring across multiple attributes.
| Column | Type | Description |
|---|---|---|
id |
int4 | Primary key. Auto-incrementing identifier. |
transaction_id |
int4 | Foreign key to attom_transactions.transaction_id. |
builder_registration_id |
varchar | Foreign key to registered_builders.builder_registration_id. |
transaction_entity_id |
varchar | Entity ID from the transaction side. |
builder_entity_id |
varchar | Entity ID from the builder side. |
grantee_name_transaction |
text | Grantee name from the transaction. |
grantee_name_builder |
text | Grantee name from the builder record. |
grantee_name_score |
float8 | Match score for grantee name. |
grantee_name_confirmed |
bool | Boolean flag indicating if grantee name match was confirmed. Default: false. |
grantee_name_reasoning |
text | Reasoning for grantee name match. |
grantee_address_transaction |
text | Grantee address from the transaction. |
grantee_address_builder |
text | Grantee address from the builder record. |
grantee_address_score |
float8 | Match score for grantee address. |
grantee_address_confirmed |
bool | Boolean flag indicating if grantee address match was confirmed. Default: false. |
grantee_address_reasoning |
text | Reasoning for grantee address match. |
business_address_transaction |
text | Business address from the transaction. |
business_address_builder |
text | Business address from the builder record. |
business_address_score |
float8 | Match score for business address. |
business_address_confirmed |
bool | Boolean flag indicating if business address match was confirmed. Default: false. |
business_address_reasoning |
text | Reasoning for business address match. |
agent_name_transaction |
text | Agent name from the transaction. |
agent_name_builder |
text | Agent name from the builder record. |
agent_name_score |
float8 | Match score for agent name. |
agent_name_confirmed |
bool | Boolean flag indicating if agent name match was confirmed. Default: false. |
agent_name_reasoning |
text | Reasoning for agent name match. |
agent_address_transaction |
text | Agent address from the transaction. |
agent_address_builder |
text | Agent address from the builder record. |
agent_address_score |
float8 | Match score for agent address. |
agent_address_confirmed |
bool | Boolean flag indicating if agent address match was confirmed. Default: false. |
agent_address_reasoning |
text | Reasoning for agent address match. |
created_at |
timestamp | Timestamp when the mapping was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: attom_transactions.transaction_id, registered_builders.builder_registration_id
xmap_attom_transactions_registered_builders_companies_members¶
Maps ATTOM transactions to registered builder company members with match scoring.
| Column | Type | Description |
|---|---|---|
id |
int4 | Primary key. Auto-incrementing identifier. |
transaction_id |
int4 | Foreign key to attom_transactions.transaction_id. |
builder_registration_id |
varchar | Foreign key to registered_builders.builder_registration_id. |
transaction_member_id |
varchar | Member ID from the transaction side. |
builder_member_id |
varchar | Member ID from the builder side. |
xmatcm_member_name |
text | Member name from transaction-company mapping. |
xmrbcm_member_name |
text | Member name from registered builder-company mapping. |
match_score_member_name |
float8 | Match score for member name. |
gemini_confirmed_member_name |
bool | Boolean flag indicating if member name match was confirmed by Gemini AI. |
xmatcm_member_address |
text | Member address from transaction-company mapping. |
xmrbcm_member_address |
text | Member address from registered builder-company mapping. |
match_score_member_address |
float8 | Match score for member address. |
gemini_confirmed_member_address |
bool | Boolean flag indicating if member address match was confirmed by Gemini AI. |
created_at |
timestamp | Timestamp when the mapping was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: attom_transactions.transaction_id, registered_builders.builder_registration_id
xmap_mls_attom_properties¶
Maps MLS listings to ATTOM properties with match scoring and AI confirmation.
| Column | Type | Description |
|---|---|---|
id |
int4 | Primary key. Auto-incrementing identifier. |
mls_id |
varchar | Foreign key to mls.mls_id. |
attom_id |
int4 | Foreign key to attom_properties.attom_id. |
mls_address |
text | Address from the MLS listing. |
situs_address |
text | Situs address from the ATTOM property. |
match_score |
float4 | Numeric score indicating the quality of the address match. |
confidence_score |
float4 | Confidence score for the match. |
gemini_confirmed |
bool | Boolean flag indicating if Gemini AI confirmed the match. Default: false. |
gemini_reasoning |
text | Reasoning provided by Gemini AI for the match confirmation. |
match_rank |
int4 | Ranking of this match among multiple potential matches. |
created_at |
timestamptz | Timestamp when the mapping was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: mls.mls_id, attom_properties.attom_id
xmap_parcels_attom_properties¶
Maps parcels to ATTOM properties.
| Column | Type | Description |
|---|---|---|
parcel_id |
int4 | Primary key (part 1). Foreign key to parcels.parcel_id. |
attom_id |
int8 | Primary key (part 2). Foreign key to attom_properties.attom_id. |
match_type |
varchar(10) | Type of match used to link the parcel to the property. |
created_at |
timestamp | Timestamp when the mapping was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: parcels.parcel_id, attom_properties.attom_id
xmap_parcels_attom_properties_staging¶
Staging table for mapping parcels to ATTOM properties before final processing.
| Column | Type | Description |
|---|---|---|
parcel_id |
int4 | Primary key (part 1). Foreign key to parcels.parcel_id. |
attom_id |
int8 | Primary key (part 2). Foreign key to attom_properties.attom_id. |
match_type |
varchar(20) | Type of match used to link the parcel to the property. |
created_at |
timestamp | Timestamp when the mapping was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: parcels.parcel_id, attom_properties.attom_id
xmap_permits_attom_properties¶
Maps building permits to ATTOM properties.
| Column | Type | Description |
|---|---|---|
permit_id |
varchar(25) | Primary key (part 1). Foreign key to permits.permit_id. |
attom_id |
int8 | Primary key (part 2). Foreign key to attom_properties.attom_id. |
created_at |
timestamp | Timestamp when the mapping was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: permits.permit_id, attom_properties.attom_id
xmap_registered_builders_companies¶
Maps registered builders to companies with match scoring and AI confirmation.
| Column | Type | Description |
|---|---|---|
builder_registration_id |
text | Primary key (part 1). Foreign key to registered_builders.builder_registration_id. |
entity_id |
text | Primary key (part 2). Foreign key to companies.entity_id. |
original_builder_name |
text | Original builder name from the registration. |
matched_company_name |
text | Name of the matched company. |
match_score |
float4 | Numeric score indicating the quality of the match. |
gemini_confirmed |
bool | Boolean flag indicating if Gemini AI confirmed the match. Default: false. |
gemini_reasoning |
text | Reasoning provided by Gemini AI for the match confirmation. |
created_at |
timestamptz | Timestamp when the mapping was created. Default: CURRENT_TIMESTAMP. |
Relationships:
- References: registered_builders.builder_registration_id, companies.entity_id
Property Finder Tables¶
Tables used for the property finder application functionality.
xproperty_finder_classifications¶
Stores property classifications for the property finder application with versioning support.
| Column | Type | Description |
|---|---|---|
id |
int4 | Primary key. Auto-incrementing identifier. |
attom_id |
int4 | Unique identifier. Foreign key to attom_properties.attom_id. |
transaction_id |
int4 | Foreign key to attom_transactions.transaction_id. |
property_type |
varchar(50) | Type of property classification. |
transaction_type |
varchar(50) | Type of transaction classification. |
current_status |
varchar(50) | Current status of the property listing. |
beds |
int4 | Number of bedrooms. |
baths |
numeric(3,1) | Number of bathrooms. |
price |
numeric(12,2) | Property price. |
home_sqft |
int4 | Square footage of the home. |
image_url |
text | URL to the property image. |
is_published |
bool | Boolean flag indicating if the property is published. Default: false. |
is_featured |
bool | Boolean flag indicating if the property is featured. Default: false. |
builder_registration_id |
varchar(100) | Foreign key to registered_builders.builder_registration_id. |
notes |
text | Additional notes about the property. |
created_at |
timestamp | Timestamp when the record was created. Default: CURRENT_TIMESTAMP. |
updated_at |
timestamp | Timestamp when the record was last updated. Default: CURRENT_TIMESTAMP. |
created_by |
varchar(100) | User who created the record. |
updated_by |
varchar(100) | User who last updated the record. |
version |
int4 | Version number for optimistic locking. Default: 1. |
Relationships:
- References: attom_properties.attom_id, attom_transactions.transaction_id, registered_builders.builder_registration_id
xproperty_finder_classifications_history¶
Stores historical versions of property finder classifications for audit trail purposes.
| Column | Type | Description |
|---|---|---|
history_id |
int4 | Primary key. Auto-incrementing identifier. |
id |
int4 | Reference to the original record ID in xproperty_finder_classifications. |
attom_id |
int4 | ATTOM property ID. |
transaction_id |
int4 | Transaction ID. |
property_type |
varchar(50) | Type of property classification. |
transaction_type |
varchar(50) | Type of transaction classification. |
current_status |
varchar(50) | Current status of the property listing. |
beds |
int4 | Number of bedrooms. |
baths |
numeric(3,1) | Number of bathrooms. |
price |
numeric(12,2) | Property price. |
home_sqft |
int4 | Square footage of the home. |
image_url |
text | URL to the property image. |
is_published |
bool | Boolean flag indicating if the property was published. |
is_featured |
bool | Boolean flag indicating if the property was featured. |
builder_registration_id |
varchar(100) | Builder registration ID. |
notes |
text | Additional notes about the property. |
created_at |
timestamp | Original creation timestamp. |
updated_at |
timestamp | Original update timestamp. |
created_by |
varchar(100) | User who originally created the record. |
updated_by |
varchar(100) | User who last updated the record. |
version |
int4 | Version number. |
action |
varchar(20) | Action that created this history record (INSERT, UPDATE, DELETE). |
changed_at |
timestamp | Timestamp when the history record was created. Default: CURRENT_TIMESTAMP. |
changed_by |
varchar(100) | User who performed the action. |
Helper Tables¶
Helper tables support data processing and management operations.
xhelper_mls_file_registry¶
Tracks MLS data files that have been processed or are pending processing.
| Column | Type | Description |
|---|---|---|
id |
int4 | Primary key. Auto-incrementing identifier. |
file_id |
varchar(255) | Unique identifier for the file. |
file_name |
varchar(255) | Name of the file. |
mls_source |
varchar(50) | Source MLS system or organization. |
file_size |
int8 | Size of the file in bytes. |
file_modified_time |
timestamp | Last modified time of the file. |
processed_at |
timestamp | Timestamp when the file was processed. |
status |
varchar(20) | Processing status (pending, processed, failed, etc.). Default: 'pending'. |
record_count |
int4 | Number of records extracted from the file. |
error_message |
text | Error message if processing failed. |
created_at |
timestamp | Timestamp when the record was created. Default: now(). |
updated_at |
timestamp | Timestamp when the record was last updated. Default: now(). |
xhelper_document_segments¶
Tracks document segments for company entity extraction, storing page ranges and URLs for segmented documents.
| Column | Type | Description |
|---|---|---|
id |
int4 | Primary key. Auto-incrementing identifier. |
entity_id |
text | Foreign key to companies.entity_id. Company entity this segment belongs to. |
document_type |
text | Type of document (articles of incorporation, annual report, etc.). |
start_page |
int4 | Starting page number of the segment. |
end_page |
int4 | Ending page number of the segment. |
segment_url |
text | URL or path to the document segment. |
processed_at |
timestamptz | Timestamp when the segment was processed. Default: now(). |
Relationships:
- References: companies.entity_id
Notes¶
- All timestamps use the database server's timezone unless otherwise specified.
- Foreign key relationships are enforced at the database level where defined.
- Many classification and mapping tables use AI (Gemini) for confirmation and reasoning, indicated by
gemini_confirmedandgemini_reasoningcolumns. - Match scores are typically numeric values (float4 or float8) representing the quality or confidence of a match between entities.
- JSONB columns (
jsonbtype) store structured JSON data and support efficient querying in PostgreSQL.