Skip to Content
Welcome to xfrukt documentation!
Generating Schema for Securities Trading

Generating Schema for Securities Trading


TD;DR Jump to the resulting schema


What you’ll build

This tutorial walks you through generating a database schema for a financial analytics application. For example, a RAG system that answers questions from investment analysts, financial advisers, or securities traders. It generates signals by correlating securities price movements with facts from SEC filings:

  • 10-K — Annual reports
  • 10-Q — Quarterly reports
  • 8-K — Current reports

Implementation

step 1/5. Get API key

Request early access to receive your API key.

step 2/5. Prepare specification and data samples

Fill in the specification template

Access patterns and downstream usage are the biggest drivers of architectural decisions. Use the specification template to outline your business goals and functional requirements. Xfrukt uses these inputs to tailor the architecture and schema to your use case.


# Industry financial markets # Users securities traders our CFO who gambles on stocks of competitors using company's budgets # Key Responsibilities and Tasks of the Users market analysis trading strategy keeping us profitable # Functionality Of The Future Application It a combination of a data analytics console with text form to ask all sorts of questions. The sofware should assist in answering questions like these (but not limited too): Which companies have 10-Q revenue acceleration + rising gross margins + low relative valuation vs peers, but stock hasn't moved yet? Which companies reported lower-than-expected bad debt expense / allowances in 10-Q despite rising DSO? Which companies have CFOs who sold >50% of their holdings within 30 days of filing 10-Q, where the subsequent 10-Q revealed deteriorating working capital ratios? # Data sources SEC reports End-of-day trading pricing Companies announcements # Other Requirements and Notes none


Note: You don’t need to be 100% precise with terminology or work out the architecture upfront. That’s exactly what Xfrukt provides:

  • standardization of business terminology,
  • best practices for vertical and operational tasks,
  • battle-tested data architectures suited to the type of system you’re building.

You just define initial guidelines on the overall direction.


Save the completed specification to a file and note its path; you’ll reference it in the next step.

files_metas = [ { "filename": "spec_securities_trader.md", "file_path": "/tmp/spec_securities_trader.md", "purpose": "spec", "content_type": "text/markdown" } ]
Prepare data samples

Include representative samples of each filing type: 10-K, 10-Q, and 8-K.

https://www.sec.gov/submit-filings/forms-index 

https://edgartools.readthedocs.io/en/latest/guides/working-with-filing/ 

import requests import edgar dirname = "/tmp" TICKER = "NVDA" # Downloading reports edgar.set_identity("Moly moly@email.com") edgar.use_local_storage( path_or_enable=dirname, use_local=True ) company = edgar.Company(cik_or_ticker=TICKER) forms = ["10-K", "10-Q", "8-K"] for form in forms: filing_meta = company.latest(form) primary = filing_meta.document.text() if form == "8-K": filing_date = str(filing_meta.report_date).replace("-", "") else: filing_date = filing_meta.period_of_report filename = f"{form}_{TICKER}_{filing_date}.txt" filename = filename.lower().replace("-", "_") filepath = os.path.join(dirname, filename) with open(os.path.join(dirname, filename), "w", encoding="utf-8") as txt_file: txt_file.write(primary) files_metas.append( { "filename": filename, "filepath": filepath, "purpose": "data", # how to treat this file "content_type": "text/plain" } ) # Downloading trading data try: API_KEY = os.environ["API_KEY_ALPHAVANTAGE"] TICKER = "NVDA" except: API_KEY = "demo" TICKER = "IBM" # outputsize=compact --> the latest 100 data points url = f"https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol={TICKER}&outputsize=compact&apikey={API_KEY}" r = requests.get(url) data = r.json() filename = f"ohlcv_{TICKER}_latest.json".lower() filepath = os.path.join(dirname, filename) with open(filepath, 'w') as f: json.dump(data, f, indent=2, ensure_ascii=False) files_metas.append( { "filename": filename, "file_path": str(filepath), "purpose": "data", "content_type": "application/json" } )

Once prepared, you should have the following metadata for each file


[ { "filename": "spec_securities_trader.md", "file_path": "/tmp/spec_securities_trader.md", "purpose": "spec", "content_type": "text/markdown" }, { "filename": "ohlcv_nvda_latest.json", "file_path": "/tmp/ohlcv_nvda_latest.json", "purpose": "data", "content_type": "application/json" }, { "filename": "10_k_nvda_2025_01_26.txt", "file_path": "/tmp/10_k_nvda_2025_01_26.txt", "purpose": "data", "content_type": "text/plain" }, { "filename": "10_q_nvda_2025_10_26.txt", "file_path": "/tmp/10_q_nvda_2025_10_26.txt", "purpose": "data", "content_type": "text/plain" }, { "filename": "8_k_nvda_2026_01_20.txt", "file_path": "/tmp/8_k_nvda_2026_01_20.txt", "purpose": "data", "content_type": "text/plain" }, ]

where

content_type
text/plain
text/markdown
application/json

purposeDescription
specYour description of the future application and business context
dataone sample of a data

step 3/5. Upload data samples and specification

Upload each file individually.

import requests XFRUKT_API_KEY = "YOUR_API_KEY" BASE_URL = "https://api.xfrukt.com" headers = { "X-Api-Key": XFRUKT_API_KEY, } for ix, file in enumerate(files_metas): with open(file["file_path"], "rb") as f: file = { "file": (file["filename"], f, "text/markdown"), } response = requests.post( f"{BASE_URL}/v1/files", headers=headers, files=file, ) response_json = response.json() file_id = response_json["file_id"] # 🔑 will be used in the next step files_metas[ix]["file_id"] = response_json["file_id"]

Save every file_id and filename returned by the API. You’ll pass the file_id values into the next request to create the schema.

Response

Each successful upload returns a JSON body like

Success — 200 OK
{ "file_id": "a68d2bf16a44efc0efbc35c5564d5230a4d0480e900d495e787688c8ef353dfa", "object": "file", "filename": "8_k_nvda_2026_01_20.txt", "status": "uploaded", "content_type": "text/plain", "content_size": 9479, "created_at": 1776836274 }

step 4/5. Initiate generation of a schema

Reference all file_id from step 3.

response = requests.post( f"{BASE_URL}/v1/xfrukt_schema", json=files_metas, headers=headers ) response_json = response.json() assert response_json["status"] == "RUNNING" # 🔑 will be used in the next step request_id = response_json["request_id"]

step 5/5. Download the schema

Poll the job status. Once it reports FINISHED, download the result.

response = requests.get( f"{BASE_URL}/v1/xfrukt_schema?request_id={request_id}", headers=headers ) response_json = response_get.json() assert response_json.get("status", "") == "FINISHED"
Success — 200 OK
{ "request_id": "293db4a4-8a5e-43ec-a530-b187a221f804", "status": "FINISHED", "results": [ { "file_id": "d7ff9ada-7f1d-4682-8885-b4d993de0701", "filename": "securities_trading_schema.jsonl", "content": "...", "format": "application/jsonl", "error": "" } ] }

Download the schema

import tempfile file_ids = [r["file_id"] for r in response_json["results"]] data = {"file_ids": file_ids} response = requests.post( f"{BASE_URL}/v1/download", json=data, headers=headers ) with tempfile.NamedTemporaryFile(mode='wb', delete=False, suffix='.jsonl') as temp_jsonl: jsonl_path = pathlib.Path(temp_jsonl.name) for chunk in response.iter_bytes(chunk_size=8192): temp_jsonl.write(chunk) all_files = [] line_number = 0 with open(jsonl_path, "r", encoding="utf-8") as jsonl_file: for line in jsonl_file: line_number += 1 out = {} line = line.strip() if not line: continue try: row = json.loads(line) except json.JSONDecodeError as err: # log but continue processing remaining records print(f"Warning: Malformed JSON at line {line_number}: {err}") out["error"] = err all_files.append(out) continue file_id = row["file_id"] filename = row["filename"] content = row["content"] out = { "file_id": file_id, "filename": filename, } all_files.append(out)

Resulting schema


Loading flow…

Schemas are returned as JSON Structure notation  by default. Contact us to get another format like SQL DDL.


{ "$schema": "https://json-structure.org/meta/core/v0/#", "$id": "https://xfrukt.com/schemas/917c26c8-b43d-460b-ad0c-29c25151e777", "name": "securities_trading", "definitions": { "types": { "entity_type_enum": { "name": "entity_type_enum", "type": "string", "enum": [ "CORPORATION", "LLC", ... ] }, "period_type_enum": { "name": "period_type_enum", "type": "string", "enum": [ "FY", "Q1", ... ] } }, "tables": { "entities": { "type": "object", "properties": { "entity_id": { "name": "entity_id", "type": "uuid", "description": "Primary key. May align with external identifiers such as CIK, LEI, or internally generated." }, "legal_name": { "name": "legal_name", "type": "string", "maxLength": 512 }, "names": { "name": "names", "type": "array", "description": "All known aliases, trading names, former names.", "items": { "type": "string" } }, "entity_type": { "name": "entity_type", "type": { "$ref": "#/definitions/types/entity_type_enum" } }, "tickers": { "name": "tickers", "type": "array", "items": { "type": "string", "maxLength": 10 } }, "lei": { "name": "lei", "type": "string", "description": "ISO 17442 Legal Entity Identifier.", "minLength": 20, "maxLength": 20 }, "metadata": { "name": "metadata", "type": "map", "description": "Extensible key-value store for jurisdiction, SIC code, incorporation date, parent entity, and other domain-specific attributes.", "values": { "type": "any" } } }, "required": [ "entity_id" ], "identity": [ "entity_id" ], "description": "Universal legal or organizational entity. Synonyms: issuer, company, counterparty, filer, fund, SPV." }, "publications": { "type": "object", "properties": { "publication_id": { "name": "publication_id", "type": "uuid" }, "entity_id": { "name": "entity_id", "type": "uuid", "description": "FK to entities. The filing or reporting entity." }, "document_ids": { "name": "document_ids", "type": "array", "description": "External identifiers for constituent documents and attachments.", "items": { "type": "string" } }, "title": { "name": "title", "type": "string", "description": "Human-readable title of the publication." }, "publication_types": { "name": "publication_types", "type": "array", "description": "All applicable classification labels. Includes SEC form types (10-Q, 8-K, DEF 14A), press releases, etc. Values from dei:DocumentType or 'Other'.", "items": { "type": "string" } }, "fiscal_year": { "name": "fiscal_year", "type": "int16", "description": "Fiscal year of the reporting period." } }, "required": [ "publication_id", "entity_id" ], "identity": [ "publication_id" ], "relations": { "entity_id": { "name": "entity_id", "cardinality": "single", "targettype": { "$ref": "#/definitions/tables/entities" }, "description": "The entity that authored or filed this publication." } } }, "sections": { "type": "object", "properties": { "section_id": { "name": "section_id", "type": "uuid", }, "publication_id": { "name": "publication_id", "type": "uuid", }, "seq": { "name": "seq", "type": "int32", "description": "Ordinal position within the publication across all documents." }, "text": { "name": "text", "type": "string", "description": "Raw or normalized textual content of the section." }, "metadata": { "name": "metadata", "type": "map", "description": "Extensible attributes: page number, language, extraction confidence, etc.", "values": { "type": "any" } } }, "required": [ "section_id", "publication_id", "seq", "text", "metadata" ], "identity": [ "section_id" ], "description": "Together all sections reconstitute the full document.", "relations": { "publication_id": { "name": "publication_id", "cardinality": "single", "targettype": { "$ref": "#/definitions/tables/publications" } } } }, "securities": { "type": "object", "properties": { "security_id": { "name": "security_id", "type": "uuid" }, "entity_id": { "name": "entity_id", "type": "uuid" }, "isin": { "name": "isin", "type": "string", "minLength": 12, "maxLength": 12 }, "metadata": { "name": "metadata", "type": "map", "description": "Extensible attributes: currency, maturity date, coupon rate, strike price, underlying_security_id, contract size, settlement type, etc.", "values": { "type": "any" } } }, "required": [ "security_id", "entity_id", "isin" "metadata" ], "identity": [ "security_id" ], "description": "Tradable financial instruments. Covers equities, fixed income, derivatives, funds, and structured products. One security per distinct instrument listing.", "relations": { "entity_id": { "name": "entity_id", "cardinality": "single", "targettype": { "$ref": "#/definitions/tables/entities" } } } }, "price_history": { "type": "object", "properties": { "price_id": { "name": "price_id", "type": "uuid" }, "security_id": { "name": "security_id", "type": "uuid" }, "price_date": { "name": "price_date", "type": "date" }, "open_price": { "name": "open_price", "type": "decimal" "precision": 18, "scale": 6 }, "high_price": { "name": "high_price", "type": "decimal" "precision": 18, "scale": 6 }, "low_price": { "name": "low_price", "type": "decimal" "precision": 18, "scale": 6 }, "close_price": { "name": "close_price", "type": "decimal", "precision": 18, "scale": 6 }, "adjusted_close": { "name": "adjusted_close", "type": "decimal", "precision": 18, "scale": 6 }, "volume": { "name": "volume", "type": "int64" }, "vwap": { "name": "vwap", "type": "decimal", "precision": 18, "scale": 6 }, "trade_count": { "name": "trade_count", "type": "int32", "description": "Number of discrete trades in the session." }, "source_name": { "name": "source_name", "type": "string", "description": "Data vendor or exchange feed identifier.", "maxLength": 50 } }, "required": [ "price_id", "security_id", "price_date", "open_price", "high_price", "low_price", "close_price", "adjusted_close", "volume", "vwap", "trade_count", "source_name" ], "identity": [ "price_id" ], "description": "End-of-day and intraday OHLCV price records per security per date. Source of truth for market price time series.", "relations": { "security_id": { "name": "security_id", "cardinality": "single", "targettype": { "$ref": "#/definitions/tables/securities" } } } } } } }

Request Early Access

This site is protected by reCAPTCHA and the Google Privacy Policyand Terms of Service apply.