SQL Issues

Scaling Issues with SQL DB


          Customer Table      Orders Table
          Id(pk) | name       OrderId | CustId (FK) | Amnt
          --------------      ----------------------------
            1    | A            101   |   1         | -
            2    | B            102   |   2         | -
            3    | C            103   |   1         | -

Referential Integrity

Definition:

This is a concept that ensures the relationships between tables are maintained properly by enforcing valid data in the foreign key relationships. It is a key aspect of database integrity, ensuring that the foreign key in one table correctly refers to a valid primary key in another table.

CASCADE OPERATIONS

When a row in the referenced table (the table with the primary key) is updated or deleted, the related foreign key values in the other table may need to be adjusted or removed to maintain integrity. This is often handled with CASCADE operations: delete, add

Challenges in Referential Integrity across Shards:

1. Transaction Coordination: if transaction requires to write data on same table in different shards, it becomes challenging.
2. Latency in data synchronization: Every time you need to validate a foreign key, it may require a network call to another shard, which slows down the query performances
3. Joins: joins across shards are less efficient, and careful query planning is required to minimize the impact on performance
4. Orphaned Records and Inconsistent State: For example, an Order record in one shard might refer to a Customer record in another shard that has been deleted, leading to data inconsistencies
Data replication lag and eventual consistency models, can further complicate maintaining consistency between dependent records.

SQL Cannot support Large data

SQL stores data in tables, if more data size of table will grow.
Problem with Huge tables:
- Searching will take time, indexing will take time
- Horizontal scaling not possible

SQL Cannot support Horizontal scaling=Sharding efficiently

Sharding a relational database involves dividing the dataset into shards (partitions) and distributing them across multiple servers.
Problem with Huge tables:
- Searching will take time, indexing will take time
- Horizontal scaling not possible - Referential Integrity