Home /
Data Imports /
Master Input Catalog
1. Purpose
This chapter is the single canonical inventory of everything travoBooks ingests — every file format, API feed, scheduled pull, manual upload, partner-initiated import, and integration push. For each input it lists the source, format, frequency, validation, target tables, security posture, and operational ownership. Detailed parsers for each input live in subsequent chapters (11.2 CSV, 11.3 BSP/ARC, 11.4 Bank, 11.5 Supplier, 11.6 Tax); this chapter is the index.
2. Why it matters
Imports are the highest-risk surface for data quality. A malformed BSP file mis-categorises a million in commission; a duplicate bank statement double-counts cash; a stale FX-rate file mis-prices new bookings for a day. Treating imports as first-class, versioned, idempotent, auditable artefacts — rather than ad-hoc cron jobs — is what separates a reliable platform from one that quietly breaks.
3. Import principles
Every import follows the same skeleton:
- Source identification — partner, supplier, gateway, or platform.
- Idempotency — file checksum or message-ID deduplication; re-import is safe.
- Staging — raw rows go into a
*_staging table unmodified before any business interpretation.
- Validation — schema and business rules; failures quarantined.
- Application — committed to operational/financial tables.
- Audit — every import produces an
import_run record with stats.
- Notification — operational owners notified of success / exceptions.
No import bypasses staging. No import silently drops rows. No import is one-shot — replay must produce the same end state.
4. The catalog
| Input |
Source |
Format |
Frequency |
Target |
Notes |
| Bulk customer onboarding |
Partner upload |
CSV / Excel |
Ad-hoc |
customers, customer_addresses, customer_contacts |
At-onboarding migration |
| Customer credit-limit batch update |
Partner upload |
CSV |
Ad-hoc |
customers.credit_limit + history |
Maker-checker; threshold rules |
| Customer status sync |
Partner ERP via API |
JSON |
Daily |
customers.status |
Optional partner integration |
| Sanctions watchlist refresh |
Compliance feed (OFAC, UN) |
CSV / XML |
Daily |
compliance_watchlists |
Triggers re-screening |
| Input |
Source |
Format |
Frequency |
Target |
| Supplier master onboarding |
Partner upload |
CSV |
Ad-hoc |
suppliers, supplier_addresses, supplier_bank_details |
| Commission rate updates |
Supplier email + partner upload |
CSV / PDF |
Ad-hoc |
supplier_commission_rules |
| Supplier statement of account |
Supplier email / portal |
PDF / Excel / CSV |
Monthly |
supplier_statement_files → supplier_statement_lines |
| Net rate agreements |
Negotiated contracts |
PDF (manual capture) |
Per contract |
supplier_pricing_profiles |
4.3 Operational imports
| Input |
Source |
Format |
Frequency |
Target |
| GDS booking pull |
Amadeus / Sabre / Travelport |
XML / NDC JSON |
On-demand + queued |
bookings, booking_segments, booking_pnrs |
| LCC booking pull |
Aggregator / direct API |
JSON |
On-demand |
bookings, etc. |
| NDC offer creation |
Supplier OfferPriceRQ |
JSON |
On-demand |
booking_tsts |
| Hotel availability cache |
Hotel aggregator |
JSON |
Cache TTL |
hotel_availability_cache |
| Schedule-change feed |
GDS PSS |
XML |
Daily |
booking_segments.status updates |
| Flown-status updates |
GDS / Airline NDC |
XML / JSON |
Daily |
booking_segments.status = FLOWN |
4.4 Financial imports
| Input |
Source |
Format |
Frequency |
Target |
| BSP/BSPlink billing analysis |
BSPlink (per BSP) |
HOT / AGTDATA / CSV |
Daily/weekly |
bsp_import_files → various |
| BSP memos (ADM/ACM) |
BSPlink |
CSV |
Daily |
memos |
| ARC report |
ARC (US BSP equivalent) |
CSV |
Weekly |
analogous to BSP |
| Bank statement |
Bank |
MT940 / OFX / CSV / BAI2 |
Daily |
bank_statement_files → bank_statement_lines |
| Gateway settlement |
Stripe / SSLCommerz / Razorpay |
CSV / JSON |
Daily |
gateway_settlement_files |
| FX rate feed |
Provider (XE, ECB, OANDA) |
JSON / CSV |
Hourly |
fx_rates |
| Manual journal upload |
Accountant |
CSV / Excel |
Ad-hoc |
journal_entries (staging-validated) |
| Trial balance import (migration) |
Predecessor system |
CSV / Excel |
One-time |
Opening balance JEs |
| Bank charges & interest |
Bank statement parsing |
derived |
Daily |
JE adjustments |
| Input |
Source |
Format |
Frequency |
Target |
| Tax rate table refresh |
Vetted tax-counsel update |
CSV / partner-managed |
As needed |
tax_rules (with maker-checker) |
| Withholding tax rates |
Government publications |
manual |
Annual |
withholding_tax_rates |
| VAT return submission receipts |
Tax authority portal |
PDF |
After filing |
tax_returns.filed_reference |
4.6 Reference data imports
| Input |
Source |
Format |
Frequency |
Target |
| IATA airport codes |
IATA / OAG |
CSV |
Quarterly |
airport_codes (~9,800 rows) |
| City / location codes |
OAG / curated |
CSV |
Quarterly |
location_codes (~81,000 rows) |
| IATA airline directory |
IATA |
CSV |
Quarterly |
airlines |
| ISO currency table |
ISO + corrections |
CSV |
Annually |
currencies |
| ISO country table |
ISO |
CSV |
Annually |
countries |
| Equipment / aircraft type |
IATA |
CSV |
Annually |
aircraft_types |
4.7 Configuration imports
| Input |
Source |
Format |
Frequency |
Target |
| Chart of accounts template |
Partner setup |
CSV |
At onboarding |
chart_of_accounts |
| Approval thresholds |
Partner config |
YAML/JSON |
Ad-hoc |
partner_config |
| Notification templates |
Partner customisation |
HTML/Markdown |
Ad-hoc |
notification_templates |
4.8 Integration push-ins (webhooks received)
| Input |
Source |
Format |
Trigger |
| Gateway webhook (payment_intent.succeeded) |
Stripe etc. |
JSON HTTPS |
Customer payment |
| Gateway webhook (charge.refunded) |
Stripe etc. |
JSON |
Refund settled |
| Gateway webhook (chargeback.created) |
Stripe etc. |
JSON |
Chargeback raised |
| Supplier NDC push (order.changed) |
NDC supplier |
JSON |
Order change |
| BSPlink notification |
BSPlink |
Email/portal |
New file available |
4.9 Manual operational imports
| Input |
Source |
Format |
Frequency |
| Opening balance migration |
Predecessor system |
Excel |
Once at onboarding |
| Historical booking import |
Predecessor system |
CSV |
Once at onboarding |
| Document attachments |
Operations |
PDF, image |
Ad-hoc |
5.1 CSV / Excel
- UTF-8 with BOM expected; UTF-8 without BOM accepted; other encodings rejected with hint.
- Header row mandatory; column names matched case-insensitive against expected schema.
- Date format ISO-8601 by default (YYYY-MM-DD); partner-config override allowed for locale-specific.
- Numeric: dot decimal separator default; partner-config override for comma-decimal.
5.2 XML (GDS)
- Schema validation against captured XSDs (per supplier version).
- Namespace-aware parsing.
- Raw payload always archived in S3 for replay/audit.
5.3 JSON (NDC / REST APIs)
- JSON Schema validation per endpoint version.
- Schema versions captured; deprecation warnings logged.
5.4 MT940 / BAI2 (banking)
- Standard banking formats; parser per format.
- Per-bank quirks handled by adapter pattern.
5.5 HOT / AGTDATA (BSP)
- IATA-defined fixed-width formats.
- Per-country BSP variants handled.
6. Idempotency strategy
| Input type |
Dedup key |
| File-based |
SHA-256 of file content + partner_id |
| API call replay |
Idempotency-Key header |
| Webhook |
Provider message ID + partner_id |
| GDS pull |
(supplier_ref, partner_id, snapshot_at quantised to minute) |
The idempotency_keys table holds 30-90 days of recent keys depending on category.
7. Validation tiers
| Tier |
Failure handling |
| Schema (shape, types) |
Reject entire file; alert operations |
| Referential (FK exists, partner-scope) |
Quarantine row; continue file |
| Business (amount range, status valid) |
Quarantine row; continue file |
| Cross-row (sum matches header) |
Reject file or flag for review per config |
| Cross-file (matches expected period) |
Warn; allow override |
8. Security & permissions
| Permission |
Allows |
import.upload.partner |
Upload file |
import.validate.partner |
Run validation pass |
import.commit.partner |
Commit staged data |
import.commit_financial.partner |
Commit financial imports (higher threshold; maker-checker) |
import.history.read.partner |
View import runs |
Uploaded files are encrypted at rest in S3 with KMS keys per partner; access logged.
9. Validation rules (cross-cutting)
| Code |
Condition |
IMPORT_FILE_DUPLICATE |
Same checksum already imported |
IMPORT_FILE_TOO_LARGE |
Exceeds per-type size limit |
IMPORT_ENCODING_INVALID |
Not UTF-8 |
IMPORT_SCHEMA_MISMATCH |
Headers don't match expected |
IMPORT_PARTNER_MISMATCH |
File appears to belong to different partner |
IMPORT_PERIOD_CLOSED |
Targets a closed accounting period |
IMPORT_TOTAL_MISMATCH |
Header total ≠ sum of lines |
10. Operational ownership
| Input type |
Operational owner |
| BSP / ADM / ACM |
BSP Operations Lead |
| Bank statements |
Treasury / Accountant |
| Gateway settlement |
Treasury |
| GDS / Booking imports |
Travel Ops |
| Customer / Supplier master uploads |
Account Management |
| Reference data (IATA, currencies) |
Platform Engineering |
| FX rates |
Platform Engineering (automated) |
| Tax rate updates |
Tax Compliance |
Each input has a documented runbook in the operations wiki referencing this catalog row.
11. Real-world examples
A — Daily BSP file processing
04:00 — BSPlink SFTP poll picks up the day's file.
04:01 — Checksum verified; idempotent; new file.
04:02 — Parsed into bsp_staging_lines.
04:05 — Schema validation passes.
04:06 — Business validation: 99.7% match expected tickets; 12 orphans quarantined.
04:07 — Committed; memo entries created; import_run summary email to BSP-ops.
04:30 — BSP-ops works orphan queue.
B — Bank statement of small partner
08:30 — Accountant downloads MT940 from bank portal.
08:31 — Uploads via UI; idempotent check passes.
08:32 — Parser handles bank-specific quirks via adapter.
08:33 — Auto-match against payments / payouts: 92% matched.
08:40 — Accountant works exceptions: links three unmatched credits to invoices, posts one bank-charge adjustment.
08:55 — Reconciliation report shows balanced; saved to period record.
C — Onboarding migration
New partner imports 12,000 historical customers, 2,400 suppliers, 86,000 bookings, and an opening trial balance.
- Files reviewed by data team for schema fit.
- Per-batch staging upload + validation.
- Final commit by partner_admin under maker-checker.
- One-time "migration period" — opening JE posted with effective date of cutover.
12. Step-by-step generic import flow
flowchart TD
A[Source produces file / event] --> B{Idempotent?}
B -->|Already processed| C[Return prior result]
B -->|New| D[Persist to S3 + import_runs row]
D --> E[Parse → Staging Table]
E --> F{Schema valid?}
F -->|No| G[Reject; alert ops]
F -->|Yes| H[Row-level validation]
H --> I{All rows valid?}
I -->|All| J[Commit to live tables]
I -->|Some| K[Quarantine invalid; commit valid; alert ops]
J --> L[Update materialised balances]
K --> L
L --> M[Emit webhooks / notifications]
M --> N[Mark import_run complete with stats]
13. Database tables touched
| Table |
Role |
import_runs |
One row per import attempt |
import_files |
Uploaded file metadata + S3 ref |
*_staging tables (per input type) |
Raw parsed rows pre-commit |
*_quarantine tables |
Failed rows with reason |
audit_logs |
Every import event |
14. Future scalability
- Universal import UI — drag-and-drop with auto-detect of format and supplier.
- Streaming imports — for partners with high-volume real-time events.
- AI-assisted column mapping — accept arbitrary CSV layouts and map to canonical schema.
- Self-service partner imports — extend beyond admin-only to selected partner users with thresholds.
- Compliance evidence linking — auto-attach evidence to relevant audit records.
15. Common pitfalls
- ⚠️ Skipping staging. Direct-to-live commits make replay impossible.
- ⚠️ Trusting header totals. Cross-validate sum of lines.
- ⚠️ Mixed-encoding files. Reject early; don't half-import.
- ⚠️ Manual fix-up after a bad import without a corresponding correcting import. Always go through the same path.
- 🔒 Imported files with PII must inherit the same encryption + retention as primary data.
- ⚠️ Closed-period imports. Block at validation tier; never silently shift dates.