DEV Community

Pranav Bakare
Pranav Bakare

Posted on

Advanced SQL interview questions

Here are concise and clear answers to the advanced SQL interview questions listed previously:

  1. Clustered vs. Non-Clustered Indexes

Clustered Index: Determines the physical order of data in the table. Each table can have only one clustered index. Use it for columns frequently used in ORDER BY or JOIN clauses.

Non-Clustered Index: Stores a separate structure with pointers to the data rows. Multiple non-clustered indexes can exist on a table. Use it when you need to speed up searches on columns that are not the primary key.

  1. Optimizing a Slow-Running Query

Steps:

  1. Analyze the query execution plan.

  2. Identify bottlenecks, such as table scans.

  3. Check for missing indexes and add them if needed.

  4. Simplify complex joins or subqueries.

  5. Use appropriate filtering and pagination.

Example: Optimized a query that used full table scans by adding indexes and rewriting joins, reducing execution time from minutes to seconds.

  1. Window Functions

Definition: Window functions perform calculations across a set of table rows related to the current row.

Difference: Unlike aggregate functions that return a single value for multiple rows, window functions return a value for each row.

Example:

SELECT employee_id, salary, AVG(salary) OVER (PARTITION BY department) AS avg_department_salary
FROM employees;

  1. Data Partitioning

Scenario: Implemented partitioning on a large sales table based on date to improve query performance.

Challenges: Handling data distribution and ensuring queries remain efficient.

Solution: Used range partitioning to separate data by month, allowing quicker access to recent records.

  1. Transactions and ACID Properties

Transactions: A sequence of operations treated as a single unit.

ACID:

Atomicity: All operations must complete or none.

Consistency: Transactions bring the database from one valid state to another.

Isolation: Transactions do not affect each other.

Durability: Once committed, changes are permanent.

Example: A bank transfer operation where both debit and credit must succeed for the transaction to be valid.

  1. Common Table Expressions (CTEs)

Definition: A CTE provides a temporary result set that can be referenced within a SELECT, INSERT, UPDATE, or DELETE.

Difference: CTEs can be recursive and are often easier to read than subqueries.

Example:

WITH RecursiveCTE AS (
SELECT employee_id, manager_id FROM employees
UNION ALL
SELECT e.employee_id, e.manager_id FROM employees e
INNER JOIN RecursiveCTE r ON e.manager_id = r.employee_id
)
SELECT * FROM RecursiveCTE;

  1. Normalization

Definition: The process of organizing data to minimize redundancy.

Normal Forms:

1NF: No repeating groups or arrays.

2NF: All non-key attributes are fully functional dependent on the primary key.

3NF: No transitive dependency.

Importance: Ensures data integrity and reduces redundancy.

Example: A denormalized schema may combine customer and order tables to improve read performance at the expense of data redundancy.

  1. Triggers

Definition: A trigger is a stored procedure that automatically runs when certain events occur in a table (e.g., INSERT, UPDATE, DELETE).

Difference: Triggers execute automatically, while stored procedures require explicit calls.

Advantageous Scenario: Automatically logging changes to a table without modifying the application code.

  1. Indexing Strategies

Approach: Analyze query patterns, identify frequently accessed columns, and determine appropriate indexes.

Example: A covering index for a frequently run query includes all columns in the SELECT statement, reducing the need to access the main table.

  1. Data Integrity in Distributed Databases

Approach: Use distributed transactions, replication, and conflict resolution strategies to maintain consistency.

CAP Theorem: States that a distributed database can only guarantee two out of three properties: Consistency, Availability, and Partition Tolerance.

  1. SQL Injection Prevention

Definition: SQL injection is a code injection technique that exploits vulnerabilities in an application's software.

Prevention: Use prepared statements and parameterized queries to separate SQL code from data input.

Best Practices: Validate and sanitize user inputs.

  1. JSON Data Types in SQL

Definition: JSON data types allow storage and manipulation of JSON-formatted data within a relational database.

Advantages: Flexible schema, nested structures, and easy data interchange.

Example:

SELECT * FROM employees WHERE data->>'department' = 'Sales';

  1. Database Migration

Experience: Successfully migrated a large dataset from MySQL to PostgreSQL.

Challenges: Data type incompatibilities and maintaining referential integrity.

Solution: Used ETL tools to transform and load data while ensuring data integrity.

  1. Materialized Views

Definition: A materialized view stores the result of a query physically, allowing faster access at the cost of storage space.

Difference: Unlike regular views, materialized views can be indexed and refreshed periodically.

Use Case: Suitable for reporting purposes where data does not need to be real-time.

  1. Auditing and Logging Data Changes

Implementation: Use triggers to log changes into an audit table for data modifications.

Example:

CREATE TRIGGER log_changes
AFTER UPDATE ON employees
FOR EACH ROW
INSERT INTO audit_log (employee_id, change_time, old_salary, new_salary)
VALUES (OLD.employee_id, NOW(), OLD.salary, NEW.salary);

Conclusion

These precise answers provide a clear understanding of advanced SQL concepts and demonstrate practical knowledge that candidates can discuss in interviews.

Top comments (0)