From 4112685b696226cf9950160416a66a531b9cd2cb Mon Sep 17 00:00:00 2001
From: "Peter.Morton"
Date: Fri, 27 Mar 2026 16:06:18 -0500
Subject: [PATCH] docs: Update README with X-Verint-KAB-Original-URL header
 feature

- Add Source URL Header to features list
- Document /documents/{documentId} endpoint
- Add X-Verint-KAB-Original-URL header to response headers section
- Include curl examples for document export with header inspection
- Add documentation links for new feature spec and API contract

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 README.md | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/README.md b/README.md
index 7b95d6d..b022b8c 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,8 @@ HTTP service that generates XML sitemaps listing all accessible documents in a G
 ## Features
 
 - **Sitemap Generation**: XML sitemap at `/sitemap.xml` listing all accessible Google Drive documents
+- **Document Export**: Export Google Drive documents with original source URL tracking
+- **Source URL Header**: X-Verint-KAB-Original-URL response header for content traceability
 - **RESTful URLs**: Document links in format `/documents/{documentId}` per sitemap protocol
 - **Service Account Auth**: JWT-based authentication using Google Service Account credentials
 - **Pagination Support**: Handles large document sets (up to 50,000 URLs per sitemap protocol)
@@ -64,6 +66,12 @@ curl http://localhost:3000/sitemap.xml | xmllint --noout -
 
 # Count documents in sitemap
 curl http://localhost:3000/sitemap.xml | grep -c ''
+
+# Export a document and view source URL header
+curl -I http://localhost:3000/documents/{documentId}
+
+# Export document and extract original Google Drive URL
+curl -D - http://localhost:3000/documents/{documentId} | grep X-Verint-KAB-Original-URL
 ```
 
 ## Architecture
@@ -162,6 +170,7 @@ Environment variables override JSON config (e.g., `PORT`, `GOOGLE_SERVICE_ACCOUN
 ### Endpoints
 
 - `GET /sitemap.xml` - XML sitemap of all accessible documents (200 OK with XML body)
+- `GET /documents/{documentId}` - Export Google Drive document with source URL tracking
 - `GET /*` - All other paths return 404 Not Found (empty body)
 
 ### Response Headers
@@ -171,6 +180,11 @@ Successful sitemap response (200 OK):
 - `X-Request-Id: req_` - Request tracing ID
 - `X-Document-Count: ` - Number of documents in sitemap
 
+Successful document export response (200 OK):
+- `Content-Type: application/pdf` (or appropriate MIME type)
+- `X-Request-Id: req_` - Request tracing ID
+- `X-Verint-KAB-Original-URL: https://drive.google.com/file/d/{fileId}` - Original Google Drive URL for content traceability
+
 ### Error Responses
 
 All errors return **HTTP status code only** with **no response body** (per specification):
@@ -272,7 +286,15 @@ ISC
 ## Documentation
 
 For detailed setup and usage instructions, see:
+
+### Sitemap Feature
 - [Quick Start Guide](specs/001-drive-proxy-adapter/quickstart.md)
 - [Feature Specification](specs/001-drive-proxy-adapter/spec.md)
 - [Implementation Plan](specs/001-drive-proxy-adapter/plan.md)
 - [Data Model](specs/001-drive-proxy-adapter/data-model.md)
+
+### Source URL Header Feature
+- [Quick Start Guide](specs/001-gdrive-url-header/quickstart.md)
+- [Feature Specification](specs/001-gdrive-url-header/spec.md)
+- [API Contract](specs/001-gdrive-url-header/contracts/response-headers.md)
+- [Implementation Plan](specs/001-gdrive-url-header/plan.md)