Schema Changes and Reference File Download Link

https://eu2.contabostorage.com/00f3241116844f24b628f46d81abb929:st1/folder6/6600/1656108001_pds_v2_6_1_change_log_consolidated_-_Standar_Format.xlsx

2026-05-30 06:58:04 - Admin

<style> body{ font-family: Arial, Helvetica, sans-serif; line-height: 1.6; margin:0; padding:0 1rem; background:#f9f9f9; color:#333; } header{ background:#4A90E2; color:#fff; padding:1.5rem 1rem; text-align:center; } main{ max-width:800px; margin:2rem auto; background:#fff; padding:2rem; box-shadow:0 0 10px rgba(0,0,0,0.1); } h1,h2,h3{ color:#2c3e50; } pre{ background:#eee; padding:1rem; overflow-x:auto; } table{ width:100%; border-collapse:collapse; margin:1rem 0; } th, td{ border:1px solid #ddd; padding:0.5rem; text-align:left; } th{ background:#f2f2f2; } a{ color:#4A90E2; } </style><header> <h1>Schema Changes Concepts, Strategies, and Best Practices</h1></header><main> <section> <h2>What Is a Schema?</h2> <p>A schema is a formal definition of the structure of data stored in a database, a file format, or an API contract. It describes tables, columns, data types, constraints, relationships, indexes, and sometimes business rules. In relational databases a schema is often expressed in <code>CREATE TABLE</code> statements; in NoSQL stores it may be a JSON schema or a set of validation rules. In dataexchange scenarios, schemas appear as XML Schema (XSD), JSON Schema, or Avro/ProtoBuf definitions.</p> <p>Because applications, reports, and integrations rely on the shape of data, any change to a schema can have farreaching consequences. Understanding the types of changes, their impact, and how to manage them safely is essential for maintaining data integrity and system reliability.</p> </section> <section> <h2>Categories of Schema Changes</h2> <h3>1. Additive Changes</h3> <p>Additive changes extend the schema without breaking existing consumers. Typical examples include:</p> <ul> <li>Adding a new column with a <code>NULL</code> default.</li> <li>Introducing a new optional field in a JSON schema.</li> <li>Creating a new view or index.</li> </ul> <p>Since existing queries and code paths do not reference the new objects, these changes can often be deployed with zerodowntime.</p> <h3>2. Restrictive Changes</h3> <p>These tighten the definition and can cause failures for data that previously complied. Examples:</p> <ul> <li>Changing a column from <code>VARCHAR(255)</code> to <code>VARCHAR(100)</code>.</li> <li>Adding a <code>NOT NULL</code> constraint without a default value.</li> <li>Removing an optional field from an API contract.</li> </ul> <p>Restrictive changes usually require a data migration, backfill, or a feature toggle to avoid breaking production workloads.</p> <h3>3. Renaming / Restructuring</h3> <p>Renaming tables, columns, or moving data between tables is the most disruptive category. It typically requires:</p> <ul> <li>Updating all dependent code, queries, stored procedures, and ETL pipelines.</li> <li>Providing a migration window or using views/aliases to maintain compatibility.</li> </ul> <h3>4. Deleting Objects</h3> <p>Dropping tables, columns, or fields is a permanent operation. Before deletion you should:</p> <ul> <li>Confirm that the data is no longer required for reporting, audit, or compliance.</li> <li>Archive the data in a separate location if needed.</li> <li>Communicate the change to all stakeholders.</li> </ul> </section> <section> <h2>Why Schema Changes Matter</h2> <p>Changes affect three main dimensions:</p> <ol> <li><strong>Application Compatibility</strong> Code that expects a certain column or field will throw errors if it disappears or alters type.</li> <li><strong>Data Quality</strong> New constraints can surface hidden data problems, causing load failures or silent truncation.</li> <li><strong>Performance</strong> Adding indexes can improve read speed but slow writes; removing them can have the opposite effect.</li> </ol> <p>Neglecting any of these dimensions can lead to production incidents, data loss, or degraded user experience.</p> </section> <section> <h2>BestPractice Workflow for Schema Changes</h2> <h3>1. Plan and Document</h3> <p>Start with a clear description of the change, its rationale, and the impacted objects. Use a changerequest ticket or a migration ticket that includes:</p> <ul> <li>Scope (tables, columns, APIs).</li> <li>Risk assessment (additive, restrictive, breaking).</li> <li>Rollback strategy.</li> <li>Test plan.</li> </ul> <h3>2. Version the Schema</h3> <p>Treat the schema like source code. Store <code>.sql</code> files, JSON schema definitions, or Avro files in a versioncontrol repository. Tag releases so you can trace which application version expects which schema version.</p> <h3>3. Automate Migrations</h3> <p>Use migration tools (Flyway, Liquibase, Alembic, dbmate) that apply incremental scripts in a controlled order. Automation ensures:</p> <ul> <li>Idempotent execution.</li> <li>Consistent state across environments.</li> <li>Auditable history.</li> </ul> <h3>4. Test Extensively</h3> <p>Run migrations against:</p> <ul> <li>Unittest databases (inmemory or Docker containers).</li> <li>Staging environment with a realistic data volume.</li> <li>Performance benchmarks for index changes.</li> </ul> <p>Include regression tests for downstream services that consume the data.</p> <h3>5. Deploy with Feature Flags</h3> <p>When a change is restrictive, consider a twostep rollout:</p> <ol> <li>Deploy the new schema and backfill data behind a flag.</li> <li>Once verification passes, enable the flag for all traffic.</li> </ol> <h3>6. Monitor and Verify</h3> <p>After deployment, watch for:</p> <ul> <li>SQL error rates and slowquery logs.</li> <li>API validation failures.</li> <li>Data quality alerts (e.g., rows with <code>NULL</code> in a newly nonnull column).</li> </ul> <h3>7. Rollback If Needed</h3> <p>Because schema changes can be destructive, always have a rollback plan. For additive changes, a rollback may be as simple as undoing the migration script; for restrictive or rename operations, you might need to restore from a backup or run a reverse migration.</p> </section> <section> <h2>Common Scenarios and Sample SQL</h2> <h3>Adding a Nullable Column</h3> <pre>ALTER TABLE customersADD COLUMN preferred_language VARCHAR(10) NULL;</pre> <h3>Adding a Column with a Default (NonNullable)</h3> <pre>ALTER TABLE ordersADD COLUMN status VARCHAR(20) NOT NULL DEFAULT 'pending';</pre> <h3>Changing a Column Type Safely</h3> <p>First add a new column, copy data, then drop the old column.</p> <pre>ALTER TABLE products ADD COLUMN price_cents BIGINT;UPDATE products SET price_cents = ROUND(price * 100);ALTER TABLE products DROP COLUMN price;ALTER TABLE products RENAME COLUMN price_cents TO price;</pre> <h3>Renaming a Column (PostgreSQL example)</h3> <pre>ALTER TABLE employees RENAME COLUMN birthdate TO date_of_birth;</pre> <h3>Dropping an Index</h3> <pre>DROP INDEX IF EXISTS idx_orders_customer_id;</pre> </section> <section> <h2>Schema Changes in NoSQL & APIFirst Environments</h2> <p>In document stores (MongoDB, Couchbase) and API contracts, the same principles apply but the tools differ.</p> <ul> <li><strong>MongoDB</strong>: Use <code>db.collection.updateMany</code> to backfill new fields, and define validation rules with <code>$jsonSchema</code> for future writes.</li> <li><strong>JSON Schema</strong>: Mark new properties as <code>"required": false</code> initially, then evolve to required after existing payloads are verified.</li> <li><strong>GraphQL</strong>: Add new fields to types only; deprecate old fields with the <code>@deprecated</code> directive before removal.</li> </ul> </section> <section> <h2>Impact on Data Warehousing & BI</h2> <p>Data warehouses often replicate production schemas. When a source schema changes:</p> <ol> <li>Update the extraction layer (e.g., change CDC mapping).</li> <li>Adjust dimension tables slowly changing dimensions may need new attributes.</li> <li>Refresh downstream dashboards; use versioned view names to avoid breaking reports.</li> </ol> </section> <section> <h2>Key Takeaways</h2> <ul> <li>Classify changes as additive, restrictive, rename, or delete to gauge risk.</li> <li>Versioncontrol every schema artifact and automate migrations.</li> <li>Test migrations on realistic data and monitor closely after release.</li> <li>Use feature flags or twostep rollouts for breaking changes.</li> <li>Document rollback procedures and, when possible, avoid irreversible operations.</li> </ul> <p>By treating schema evolution as a disciplined engineering activity, teams can deliver new features and improvements without sacrificing stability, performance, or data quality.</p> </section></main>

Lebih banyak