Contents
1. Basic SQL Syntax
SELECT column1, column2
FROM table_name
WHERE condition
ORDER BY column ASC/DESC
LIMIT n;SELECT * → All columns
DISTINCT → Remove duplicates
AS → Create alias
ORDER BY → Sort results
2. WHERE Clause Operators
| Operator | Meaning | Example |
|---|---|---|
| = | Equal | age = 25 |
| <> or != | Not equal | name <> 'John' |
| BETWEEN | Range (inclusive) | age BETWEEN 20 AND 30 |
| IN | Match any value | dept IN ('CS', 'IT') |
| IS NULL | Check NULL | phone IS NULL |
| IS NOT NULL | Check NOT NULL | email IS NOT NULL |
3. Pattern Matching with LIKE
Wildcards
% → Zero or more characters_ → Exactly one characterCommon Patterns
'A%' → Starts with A'%n' → Ends with n'_A%' → Second char is A'%help%' → Contains 'help'-- Authors with single-char first name
SELECT author_fname, author_lname
FROM book_authors
WHERE author_fname LIKE '_';
-- Names starting with 'S' but not ending with 'n'
SELECT name FROM players WHERE name LIKE 'S%'
EXCEPT
SELECT name FROM players WHERE name LIKE '%n';4. Set Operations
UNION
Combines results, removes duplicates
A ∪ BINTERSECT
Common records only
A ∩ BEXCEPT
In first but not second
A - B-- UNION: Combine students and teachers
SELECT name FROM students
UNION
SELECT name FROM teachers;
-- INTERSECT: Courses in both Fall and Spring
SELECT course_id FROM fall_courses
INTERSECT
SELECT course_id FROM spring_courses;
-- EXCEPT: Members who haven't issued books
SELECT member_no FROM members
EXCEPT
SELECT member_no FROM book_issue;5. Aggregate Functions & GROUP BY
COUNT()Count rows
COUNT(*)SUM()Total sum
SUM(salary)AVG()Average
AVG(marks)MAX()Maximum
MAX(age)MIN()Minimum
MIN(price)GROUP BYGroup rows
GROUP BY dept⚡ WHERE vs HAVING
-- Count female students per department
SELECT department_code, COUNT(*) AS no_of_females
FROM students
WHERE gender='F'
GROUP BY department_code;
-- Departments with avg salary > 50000
SELECT dept_name, AVG(salary)
FROM instructor
GROUP BY dept_name
HAVING AVG(salary) > 50000;6. JOIN Operations
NATURAL JOIN
Automatically joins on common columns
SELECT * FROM students NATURAL JOIN departments;INNER JOIN (Explicit)
Join with ON condition
SELECT s.name, d.dept_name
FROM students s
INNER JOIN departments d
ON s.dept_id = d.dept_id;USING Clause
When column names are the same
SELECT * FROM book_catalogue INNER JOIN book_copies USING (isbn_no);Self-Join
Join table with itself (use aliases)
-- Teams in same city as 'Amigos'
SELECT t2.name
FROM teams t1, teams t2
WHERE t1.city = t2.city
AND t1.name = 'Amigos'
AND t2.name <> 'Amigos';Join Selection Guide
- ✓ Same column name →
NATURAL JOINorUSING - ✓ Different column names →
INNER JOIN ... ON - ✓ Compare within same table → Self-join with aliases
- ✓ All combinations → Cartesian product
7. Subqueries
IN / NOT IN
SELECT name
FROM students
WHERE dept_id IN (
SELECT dept_id
FROM departments
WHERE building = 'Block_2'
);EXISTS / NOT EXISTS
SELECT *
FROM members m
WHERE NOT EXISTS (
SELECT *
FROM book_issue bi
WHERE m.member_no = bi.member_no
);ANY
More than at least one (like OR)
SELECT * FROM instructor
WHERE salary > ANY (
SELECT salary
FROM instructor
WHERE dept = 'Psychology'
);ALL
More than every single one (like AND)
SELECT * FROM instructor
WHERE salary > ALL (
SELECT salary
FROM instructor
WHERE dept = 'Psychology'
);8. Common Table Expressions (CTE)
CTEs make complex queries more readable by breaking them into named temporary result sets.
-- Max salary per department with avg > 50000
WITH temp_table(iname, isalary, idept) AS (
SELECT name, salary, dept_name
FROM instructor
WHERE dept_name IN (
SELECT dept_name
FROM instructor
GROUP BY dept_name
HAVING AVG(salary) > 50000
)
)
SELECT iname, isalary
FROM temp_table
WHERE isalary IN (
SELECT MAX(isalary)
FROM temp_table
GROUP BY idept
);💡 When to Use CTE
- ✓ Complex nested queries
- ✓ Need to reference result multiple times
- ✓ Makes query more readable
9. Relational Algebra
| Symbol | Operation | SQL Equivalent | Example |
|---|---|---|---|
| σ | Selection (filter rows) | WHERE | σage>20(Student) |
| Π | Projection (select columns) | SELECT | Πname,age(Student) |
| × | Cartesian Product | FROM t1, t2 | Student × Course |
| ⋈ | Natural Join | NATURAL JOIN | Student ⋈ Enrolls |
| ∪ | Union | UNION | R ∪ S |
| ∩ | Intersection | INTERSECT | R ∩ S |
| − | Difference | EXCEPT | R − S |
Converting SQL to Relational Algebra
SQL
SELECT name, age
FROM Student
WHERE age > 20
AND dept = 'CS';Relational Algebra
Π name, age (
σ age>20 ∧ dept='CS'
(Student)
)10. Tuple Relational Calculus (TRC)
Basic Syntax
{ t | Condition }Read as: "Set of tuples t such that Condition is true"
| Symbol | Meaning | Example |
|---|---|---|
| ∃ | There exists | ∃p ∈ Students |
| ∀ | For all | ∀s ∈ Faculty |
| ∧ | AND | condition1 ∧ condition2 |
| ∨ | OR | condition1 ∨ condition2 |
| t[A] | Attribute A of tuple t | t[name] = 'John' |
Converting SQL to TRC
SQL
SELECT P, Z
FROM a, b
WHERE a.R = b.X;TRC
{ t | ∃p ∈ a, ∃q ∈ b (
t[P] = p[P] ∧
t[Z] = q[Z] ∧
p[R] = q[X]
)}11. Common Problem Patterns
| Problem Type | Solution |
|---|---|
| "Find names where..." | SELECT name FROM ... WHERE |
| "Count/Total number" | COUNT() + GROUP BY |
| "Has NOT done..." | EXCEPT or NOT EXISTS |
| "For each department..." | GROUP BY department |
| "Pattern matching" | LIKE '%pattern%' |
| "Records in both A and B" | INTERSECT |
| "Records in both A and B" | INTERSECT |
| "More than at least one" | > ANY (subquery) |
| "Same city as X" | Self-join |
| "Oldest/Youngest" | MIN(dob) or MAX(dob) |
◉ Quick Tips
✓ Always use IS NULL, never = NULL
✓ Use WHERE for rows, HAVING for groups
✓ BETWEEN includes both endpoints
✓ Use aliases for self-joins and complex queries
✓ DISTINCT removes duplicates
✓ Chain Rule: Compose operations from inside out
Real-World Practice Examples
Example 1: University Management System
Tables & Schema:
Student(student_id, name, age, dept_code, city, email)Course(course_id, course_name, credits, dept_code)Enrollment(student_id, course_id, semester, grade, year)
FK: student_id → Student, course_id → CourseInstructor(instructor_id, name, dept_code, salary, office)Teaches(instructor_id, course_id, semester, year)
FK: instructor_id → Instructor, course_id → CourseRelationship Visualization:
Relationship Visualization:
Student ──(enrolls in)──▶ Enrollment ◀──(has)── Course
│ │
│ │
└──(belongs to)──▶ Department ◀──(offers)─────┘
▲
│
(employs)
│
Instructor ──(teaches)──▶ Teaches ──▶ Course← Scroll horizontally to view full diagram
Questions:
Q1. Find names of all students enrolled in 'Database Systems' course.
Show Solution
SQL Query:
SELECT DISTINCT s.name
FROM Student s
NATURAL JOIN Enrollment e
NATURAL JOIN Course c
WHERE c.course_name = 'Database Systems';Explanation:
We join Student → Enrollment → Course to connect students with their courses. NATURAL JOIN works because all tables share common column names (student_id, course_id). DISTINCT removes duplicate names if a student enrolled multiple times.
Relational Algebra:
Πname(σcourse_name='Database Systems'(Student ⋈ Enrollment ⋈ Course))Q2. List all courses taught by instructor 'Dr. Smith' in Fall 2023.
Show Solution
SQL Query:
SELECT c.course_name, c.credits
FROM Course c
INNER JOIN Teaches t ON c.course_id = t.course_id
INNER JOIN Instructor i ON t.instructor_id = i.instructor_id
WHERE i.name = 'Dr. Smith'
AND t.semester = 'Fall'
AND t.year = 2023;Explanation:
We use INNER JOIN with explicit ON conditions because we need to filter by multiple attributes (semester, year) from the Teaches table. The join path: Course → Teaches → Instructor connects courses with who teaches them.
Q3. Find students who have NOT enrolled in any course.
Show Solution
SQL Query (Method 1 - EXCEPT):
SELECT student_id, name
FROM Student
EXCEPT
SELECT s.student_id, s.name
FROM Student s
INNER JOIN Enrollment e ON s.student_id = e.student_id;SQL Query (Method 2 - NOT EXISTS):
SELECT student_id, name
FROM Student s
WHERE NOT EXISTS (
SELECT 1
FROM Enrollment e
WHERE e.student_id = s.student_id
);Explanation:
Method 1: EXCEPT returns all students minus those who appear in Enrollment.
Method 2: NOT EXISTS checks if there's no enrollment record for each student. Both methods achieve the same result, but NOT EXISTS is often more efficient.
Q4. Count the number of students in each department.
Show Solution
SQL Query:
SELECT dept_code, COUNT(*) AS student_count
FROM Student
GROUP BY dept_code
ORDER BY student_count DESC;Explanation:
GROUP BY groups students by department. COUNT(*) counts rows in each group. ORDER BY DESC shows departments with most students first. This is a simple aggregation query without joins since all data is in one table.
Q5. Find students who enrolled in both 'Database Systems' AND 'Algorithms' courses.
Show Solution
SQL Query (Method 1 - INTERSECT):
SELECT s.student_id, s.name
FROM Student s
NATURAL JOIN Enrollment e
NATURAL JOIN Course c
WHERE c.course_name = 'Database Systems'
INTERSECT
SELECT s.student_id, s.name
FROM Student s
NATURAL JOIN Enrollment e
NATURAL JOIN Course c
WHERE c.course_name = 'Algorithms';SQL Query (Method 2 - GROUP BY with HAVING):
SELECT s.student_id, s.name
FROM Student s
NATURAL JOIN Enrollment e
NATURAL JOIN Course c
WHERE c.course_name IN ('Database Systems', 'Algorithms')
GROUP BY s.student_id, s.name
HAVING COUNT(DISTINCT c.course_name) = 2;Explanation:
Method 1: INTERSECT finds students common to both queries.
Method 2: GROUP BY counts distinct courses per student. HAVING = 2 ensures they took BOTH courses. This method is more efficient for large datasets.
Relational Algebra:
Πstudent_id,name(σcourse_name='Database Systems'(Student ⋈ Enrollment ⋈ Course))
∩
Πstudent_id,name(σcourse_name='Algorithms'(Student ⋈ Enrollment ⋈ Course))Q6. Find the average number of courses per student.
Show Solution
SQL Query:
SELECT AVG(course_count) AS avg_courses_per_student
FROM (
SELECT student_id, COUNT(*) AS course_count
FROM Enrollment
GROUP BY student_id
) AS student_courses;Explanation:
This is a nested aggregation. Inner query: COUNT courses per student (GROUP BY student_id). Outer query: AVG of those counts. We need subquery because we can't directly average grouped results. Note: This only counts students who have enrolled in at least one course.
Q7. Find students who are in the same city as student 'Alice Johnson'.
Show Solution
SQL Query (Self-Join):
SELECT s2.student_id, s2.name, s2.city
FROM Student s1, Student s2
WHERE s1.city = s2.city
AND s1.name = 'Alice Johnson'
AND s2.name <> 'Alice Johnson';Explanation:
Self-join compares Student table with itself using aliases (s1, s2). s1 represents Alice, s2 represents other students. Match on city, exclude Alice herself. Self-joins are useful when comparing rows within the same table.
Q8. List courses that have more than 50 students enrolled.
Show Solution
SQL Query:
SELECT c.course_name, COUNT(*) AS enrollment_count
FROM Course c
NATURAL JOIN Enrollment e
GROUP BY c.course_id, c.course_name
HAVING COUNT(*) > 50
ORDER BY enrollment_count DESC;Explanation:
GROUP BY groups enrollments by course. COUNT(*) counts students per course. HAVING filters groups (not rows), so we use it instead of WHERE. We must include course_id in GROUP BY even though we only show course_name (some databases require this).
Q9. Find instructors who teach more courses than the average instructor.
Show Solution
SQL Query:
SELECT i.name, COUNT(*) AS course_count
FROM Instructor i
NATURAL JOIN Teaches t
GROUP BY i.instructor_id, i.name
HAVING COUNT(*) > (
SELECT AVG(course_count)
FROM (
SELECT instructor_id, COUNT(*) AS course_count
FROM Teaches
GROUP BY instructor_id
) AS instructor_courses
);Explanation:
Complex nested query: Inner subquery counts courses per instructor. Middle layer computes AVG of those counts. Outer query filters instructors whose count exceeds the average. HAVING clause can contain subqueries for comparison.
Q10. Find students who have taken all courses offered by the CS department.
Show Solution
SQL Query (Division Operation):
SELECT s.student_id, s.name
FROM Student s
WHERE NOT EXISTS (
-- Find CS courses that this student has NOT taken
SELECT c.course_id
FROM Course c
WHERE c.dept_code = 'CS'
AND NOT EXISTS (
SELECT e.course_id
FROM Enrollment e
WHERE e.student_id = s.student_id
AND e.course_id = c.course_id
)
);Explanation:
This is "division" in relational algebra - find students who took ALL CS courses. Double negative logic: "There does NOT exist a CS course that the student has NOT taken." Outer NOT EXISTS checks if set of untaken CS courses is empty (meaning took all). This is a challenging pattern but important for "for all" type questions.
Relational Algebra (Division):
Πstudent_id,course_id(Enrollment) ÷ Πcourse_id(σdept_code='CS'(Course))Division (÷) finds students whose enrollment set contains all CS courses.
Example 2: Hospital Management System
Tables & Schema:
Patient(patient_id, name, age, gender, city, phone)Doctor(doctor_id, name, specialization, experience_years, dept_id)Appointment(appointment_id, patient_id, doctor_id, date, time, reason)
FK: patient_id → Patient, doctor_id → DoctorPrescription(prescription_id, appointment_id, medicine_name, dosage, duration)
FK: appointment_id → AppointmentDepartment(dept_id, dept_name, building, floor)Relationship Visualization:
Relationship Visualization:
Patient ──(makes)──▶ Appointment ◀──(with)── Doctor
│ │
│ │
(generates) (belongs to)
│ │
▼ ▼
Prescription Department← Scroll to see full diagram
Questions:
Q1. Find all patients who have appointments with cardiologists.
Show Solution
SQL Query:
SELECT DISTINCT p.patient_id, p.name, p.phone
FROM Patient p
INNER JOIN Appointment a ON p.patient_id = a.patient_id
INNER JOIN Doctor d ON a.doctor_id = d.doctor_id
WHERE d.specialization = 'Cardiology';Explanation:
Three-way join: Patient → Appointment → Doctor. Filter by doctor's specialization. DISTINCT ensures each patient listed once even if they have multiple cardiology appointments.
Q2. List doctors who have never given any appointments.
Show Solution
SQL Query:
SELECT doctor_id, name, specialization
FROM Doctor
WHERE doctor_id NOT IN (
SELECT DISTINCT doctor_id
FROM Appointment
);Explanation:
Subquery finds all doctor_ids that appear in Appointment table. Main query selects doctors NOT IN that list. Alternative: use LEFT JOIN with NULL check.
Q3. Find the most prescribed medicine.
Show Solution
SQL Query:
SELECT medicine_name, COUNT(*) AS prescription_count
FROM Prescription
GROUP BY medicine_name
ORDER BY prescription_count DESC
LIMIT 1;Explanation:
GROUP BY medicine_name groups all prescriptions by medicine. COUNT(*) counts occurrences. ORDER BY DESC puts highest count first. LIMIT 1 returns only the top result. Simple but effective for "most/least" questions.
Q4. Find patients who have appointments on the same date as patient 'John Doe'.
Show Solution
SQL Query:
SELECT DISTINCT p2.patient_id, p2.name, a2.date
FROM Patient p1
INNER JOIN Appointment a1 ON p1.patient_id = a1.patient_id
INNER JOIN Appointment a2 ON a1.date = a2.date
INNER JOIN Patient p2 ON a2.patient_id = p2.patient_id
WHERE p1.name = 'John Doe'
AND p2.name <> 'John Doe';Explanation:
Self-join pattern on Appointment table (a1, a2) matching by date. p1 represents John Doe, p2 represents other patients. Join appointments with patients on both sides to get patient names.
Q5. Count appointments per doctor, showing only doctors with more than 10 appointments.
Show Solution
SQL Query:
SELECT d.name, d.specialization, COUNT(*) AS appointment_count
FROM Doctor d
INNER JOIN Appointment a ON d.doctor_id = a.doctor_id
GROUP BY d.doctor_id, d.name, d.specialization
HAVING COUNT(*) > 10
ORDER BY appointment_count DESC;Explanation:
Standard aggregation with HAVING filter. GROUP BY must include all non-aggregated columns in SELECT. HAVING filters groups after aggregation (can't use WHERE for this). Shows busy doctors ranked by appointment count.
Q6. Find patients prescribed 'Aspirin' by doctors from the 'Emergency' department.
Show Solution
SQL Query:
SELECT DISTINCT p.patient_id, p.name
FROM Patient p
INNER JOIN Appointment a ON p.patient_id = a.patient_id
INNER JOIN Prescription pr ON a.appointment_id = pr.appointment_id
INNER JOIN Doctor d ON a.doctor_id = d.doctor_id
INNER JOIN Department dept ON d.dept_id = dept.dept_id
WHERE pr.medicine_name = 'Aspirin'
AND dept.dept_name = 'Emergency';Explanation:
Complex join chain: Patient → Appointment → Prescription (for medicine) AND Appointment → Doctor → Department (for dept). Two filters combine: medicine AND department. Shows how to navigate multiple relationships in a single query.
Relational Algebra:
Πpatient_id,name( σmedicine='Aspirin' ∧ dept_name='Emergency'( Patient ⋈ Appointment ⋈ Prescription ⋈ Doctor ⋈ Department ) )Q7. Find the average age of patients by city.
Show Solution
SQL Query:
SELECT city, AVG(age) AS avg_age, COUNT(*) AS patient_count
FROM Patient
GROUP BY city
HAVING COUNT(*) >= 5
ORDER BY avg_age DESC;Explanation:
GROUP BY city aggregates patients by location. AVG(age) calculates average age per city. HAVING filters to show only cities with at least 5 patients (for statistical significance). Also shows patient count for context.
Q8. Find doctors who have treated patients from all cities.
Show Solution
SQL Query:
SELECT d.doctor_id, d.name
FROM Doctor d
INNER JOIN Appointment a ON d.doctor_id = a.doctor_id
INNER JOIN Patient p ON a.patient_id = p.patient_id
GROUP BY d.doctor_id, d.name
HAVING COUNT(DISTINCT p.city) = (
SELECT COUNT(DISTINCT city)
FROM Patient
);Explanation:
"For all" pattern using COUNT DISTINCT. Count distinct cities treated by each doctor. Compare with total distinct cities in Patient table. If equal, doctor has treated patients from all cities. Elegant alternative to complex NOT EXISTS pattern.
Q9. List appointments where the patient age is greater than the doctor's experience years.
Show Solution
SQL Query:
SELECT a.appointment_id, p.name AS patient_name, p.age,
d.name AS doctor_name, d.experience_years
FROM Appointment a
INNER JOIN Patient p ON a.patient_id = p.patient_id
INNER JOIN Doctor d ON a.doctor_id = d.doctor_id
WHERE p.age > d.experience_years;Explanation:
Compares attributes from different tables in WHERE clause. Join brings together patient and doctor attributes. WHERE compares numerical values across tables. Use column aliases (AS) for clarity when showing both names.
Q10. Find the department with the highest average doctor experience.
Show Solution
SQL Query:
SELECT dept.dept_name, AVG(d.experience_years) AS avg_experience
FROM Department dept
INNER JOIN Doctor d ON dept.dept_id = d.dept_id
GROUP BY dept.dept_id, dept.dept_name
ORDER BY avg_experience DESC
LIMIT 1;Explanation:
JOIN Department with Doctor to link departments with their doctors. GROUP BY department calculates AVG experience per department. ORDER BY DESC + LIMIT 1 finds maximum without using MAX subquery. Efficient pattern for "find the highest/lowest" questions.
Example 3: E-commerce System
Tables & Schema:
Customer(customer_id, name, email, city, phone, join_date)Product(product_id, product_name, category, price, stock_quantity, supplier_id)Orders(order_id, customer_id, order_date, total_amount, status)
FK: customer_id → CustomerOrderItem(order_id, product_id, quantity, unit_price)
FK: order_id → Orders, product_id → ProductReview(review_id, product_id, customer_id, rating, review_text, review_date)
FK: product_id → Product, customer_id → CustomerRelationship Visualization:
Customer ──(places)──▶ Orders ──(contains)──▶ OrderItem ──▶ Product
│ │
│ │
└──────────────(reviews)──────────────────────────────────┘Questions:
Q1. Find all products with names starting with 'Smart' and price less than $500.
Show Solution
SQL Query:
SELECT product_id, product_name, price, stock_quantity
FROM Product
WHERE product_name LIKE 'Smart%'
AND price < 500
ORDER BY price ASC;Explanation:
LIKE with 'Smart%' matches any product starting with "Smart". % wildcard matches zero or more characters after "Smart". Combine pattern matching with numerical comparison using AND. ORDER BY shows cheapest products first.
Relational Algebra:
Πproduct_id,product_name,price,stock_quantity(σproduct_name LIKE 'Smart%' ∧ price<500(Product))Q2. Find customers who have placed orders but never written a review.
Show Solution
SQL Query:
SELECT customer_id, name, email
FROM Customer
WHERE customer_id IN (
SELECT DISTINCT customer_id FROM Orders
)
AND customer_id NOT IN (
SELECT DISTINCT customer_id FROM Review
);Explanation:
First condition: customer_id IN Orders (has placed orders). Second condition: customer_id NOT IN Review (never reviewed). AND combines both conditions. This finds customers who buy but don't review. Useful for marketing campaigns to encourage reviews.
Q3. List the top 5 best-selling products by total quantity sold.
Show Solution
SQL Query:
SELECT p.product_name, p.category,
SUM(oi.quantity) AS total_sold,
COUNT(DISTINCT oi.order_id) AS num_orders
FROM Product p
INNER JOIN OrderItem oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.product_name, p.category
ORDER BY total_sold DESC
LIMIT 5;Explanation:
SUM(quantity) aggregates total units sold per product across all orders. COUNT(DISTINCT order_id) shows how many different orders included this product. GROUP BY product groups all order items by product. LIMIT 5 shows top 5 only. Essential query for sales analytics.
Q4. Find products that have been ordered but never reviewed.
Show Solution
SQL Query:
SELECT DISTINCT p.product_id, p.product_name, p.price
FROM Product p
INNER JOIN OrderItem oi ON p.product_id = oi.product_id
WHERE p.product_id NOT IN (
SELECT product_id FROM Review
);Explanation:
JOIN ensures product has been ordered (appears in OrderItem). NOT IN subquery excludes products that have reviews. DISTINCT removes duplicates if product ordered multiple times. Shows products needing customer feedback.
Q5. Calculate the average order value per customer, showing only customers with average > $200.
Show Solution
SQL Query:
SELECT c.customer_id, c.name,
AVG(o.total_amount) AS avg_order_value,
COUNT(o.order_id) AS num_orders
FROM Customer c
INNER JOIN Orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name
HAVING AVG(o.total_amount) > 200
ORDER BY avg_order_value DESC;Explanation:
AVG(total_amount) calculates average spending per customer. GROUP BY customer aggregates all their orders. HAVING filters groups (not individual rows) to show high-value customers. Identifies premium customers for loyalty programs or special offers.
Q6. Find customers from the same city as customer with email 'john@email.com'.
Show Solution
SQL Query (Subquery Method):
SELECT customer_id, name, city, email
FROM Customer
WHERE city = (
SELECT city
FROM Customer
WHERE email = 'john@email.com'
)
AND email <> 'john@email.com';SQL Query (Self-Join Method):
SELECT c2.customer_id, c2.name, c2.city
FROM Customer c1, Customer c2
WHERE c1.city = c2.city
AND c1.email = 'john@email.com'
AND c2.email <> 'john@email.com';Explanation:
Method 1: Subquery finds John's city, main query matches that city.
Method 2: Self-join compares Customer table with itself. Both exclude John from results. Self-join is more efficient if joining multiple conditions.
Q7. Find products with average rating greater than 4.5.
Show Solution
SQL Query:
SELECT p.product_name, p.price,
AVG(r.rating) AS avg_rating,
COUNT(r.review_id) AS review_count
FROM Product p
INNER JOIN Review r ON p.product_id = r.product_id
GROUP BY p.product_id, p.product_name, p.price
HAVING AVG(r.rating) > 4.5
ORDER BY avg_rating DESC, review_count DESC;Explanation:
AVG(rating) calculates mean rating per product. COUNT(review_id) shows how many reviews (for confidence). INNER JOIN excludes products with no reviews. HAVING filters aggregated results. Shows highly-rated products for featuring on homepage.
Relational Algebra:
Πproduct_name,price,avg_rating(σavg_rating>4.5(Product ⋈ Review))Q8. Find customers who bought products from both 'Electronics' AND 'Books' categories.
Show Solution
SQL Query:
SELECT c.customer_id, c.name
FROM Customer c
INNER JOIN Orders o ON c.customer_id = o.customer_id
INNER JOIN OrderItem oi ON o.order_id = oi.order_id
INNER JOIN Product p ON oi.product_id = p.product_id
WHERE p.category IN ('Electronics', 'Books')
GROUP BY c.customer_id, c.name
HAVING COUNT(DISTINCT p.category) = 2;Explanation:
Join chain: Customer → Orders → OrderItem → Product connects customers to purchased categories. WHERE filters to only these two categories. COUNT(DISTINCT category) counts unique categories per customer. HAVING = 2 ensures they bought from BOTH categories. Cross-category buyers for cross-selling opportunities.
Q9. Find products that appear in more than 100 orders.
Show Solution
SQL Query:
SELECT p.product_id, p.product_name, p.category,
COUNT(DISTINCT oi.order_id) AS order_count,
SUM(oi.quantity) AS total_quantity_sold
FROM Product p
INNER JOIN OrderItem oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.product_name, p.category
HAVING COUNT(DISTINCT oi.order_id) > 100
ORDER BY order_count DESC;Explanation:
COUNT(DISTINCT order_id) counts unique orders containing this product. Different from SUM(quantity) which counts total units. HAVING filters to popular products. Useful for identifying best-sellers and ensuring adequate stock levels.
Q10. Find the month with the highest sales revenue in 2023.
Show Solution
SQL Query:
SELECT EXTRACT(MONTH FROM order_date) AS month,
TO_CHAR(order_date, 'Month') AS month_name,
SUM(total_amount) AS monthly_revenue,
COUNT(order_id) AS order_count
FROM Orders
WHERE EXTRACT(YEAR FROM order_date) = 2023
GROUP BY EXTRACT(MONTH FROM order_date), TO_CHAR(order_date, 'Month')
ORDER BY monthly_revenue DESC
LIMIT 1;Explanation:
EXTRACT(MONTH FROM date) gets month number. TO_CHAR formats as month name for readability. SUM(total_amount) calculates total revenue per month. WHERE filters to 2023 only. GROUP BY month aggregates all orders in that month. Date functions are essential for time-series analysis.
Example 4: Library Management System
Tables & Schema:
Book(book_id, title, isbn, publication_year, category, total_copies, available_copies)Author(author_id, name, country, birth_year)BookAuthor(book_id, author_id)
FK: book_id → Book, author_id → AuthorMember(member_id, name, email, phone, membership_date, status)Borrowing(borrowing_id, book_id, member_id, borrow_date, due_date, return_date)
FK: book_id → Book, member_id → MemberRelationship Visualization:
Book ──(written by)──▶ BookAuthor ◀──(writes)── Author │ │ └──(borrowed in)──▶ Borrowing ◀──(borrows)── Member
Questions:
Q1. Find all books written by authors from 'USA'.
Show Solution
SQL Query:
SELECT DISTINCT b.book_id, b.title, b.publication_year
FROM Book b
INNER JOIN BookAuthor ba ON b.book_id = ba.book_id
INNER JOIN Author a ON ba.author_id = a.author_id
WHERE a.country = 'USA'
ORDER BY b.publication_year DESC;Explanation:
Many-to-many relationship requires junction table (BookAuthor). Join chain: Book → BookAuthor → Author connects books with their authors' countries. DISTINCT handles books with multiple USA authors (listed once). Bridge table pattern common in many-to-many relationships.
Relational Algebra:
Πbook_id,title,publication_year(σcountry='USA'(Book ⋈ BookAuthor ⋈ Author))Q2. Find members who have overdue books (past due_date and not returned).
Show Solution
SQL Query:
SELECT DISTINCT m.member_id, m.name, m.email, m.phone,
COUNT(br.borrowing_id) AS overdue_count
FROM Member m
INNER JOIN Borrowing br ON m.member_id = br.member_id
WHERE br.return_date IS NULL
AND br.due_date < CURRENT_DATE
GROUP BY m.member_id, m.name, m.email, m.phone
ORDER BY overdue_count DESC;Explanation:
return_date IS NULL identifies books not yet returned. due_date < CURRENT_DATE checks if deadline passed. Both conditions combined find overdue books. COUNT shows how many overdue books per member. Important for sending reminder notices and calculating fines.
Q3. Find books that have never been borrowed.
Show Solution
SQL Query (Method 1 - LEFT JOIN):
SELECT b.book_id, b.title, b.category, b.publication_year
FROM Book b
LEFT JOIN Borrowing br ON b.book_id = br.book_id
WHERE br.book_id IS NULL;SQL Query (Method 2 - NOT EXISTS):
SELECT book_id, title, category, publication_year
FROM Book b
WHERE NOT EXISTS (
SELECT 1
FROM Borrowing br
WHERE br.book_id = b.book_id
);Explanation:
Method 1: LEFT JOIN keeps all books, even with no borrowings. WHERE br.book_id IS NULL finds books with no matches in Borrowing.
Method 2: NOT EXISTS checks if borrowing record doesn't exist. Both identify unpopular books that might need marketing or removal.
Q4. Find the most prolific author (written the most books).
Show Solution
SQL Query:
SELECT a.author_id, a.name, a.country,
COUNT(ba.book_id) AS book_count
FROM Author a
INNER JOIN BookAuthor ba ON a.author_id = ba.author_id
GROUP BY a.author_id, a.name, a.country
ORDER BY book_count DESC
LIMIT 1;Explanation:
COUNT(book_id) counts books per author through BookAuthor junction table. GROUP BY author aggregates their book count. ORDER BY DESC + LIMIT 1 finds author with maximum books. Simple pattern for "most/least" questions without complex subqueries.
Q5. Find members who borrowed books from both 'Fiction' AND 'Science' categories.
Show Solution
SQL Query:
SELECT m.member_id, m.name
FROM Member m
INNER JOIN Borrowing br ON m.member_id = br.member_id
INNER JOIN Book b ON br.book_id = b.book_id
WHERE b.category IN ('Fiction', 'Science')
GROUP BY m.member_id, m.name
HAVING COUNT(DISTINCT b.category) = 2;Explanation:
Classic "both categories" pattern using COUNT DISTINCT. WHERE filters to only these two categories. COUNT(DISTINCT category) per member counts unique categories borrowed. HAVING = 2 ensures they borrowed from BOTH. Useful for identifying diverse readers for recommendation systems.
Q6. Calculate average borrowing duration per book category.
Show Solution
SQL Query:
SELECT b.category,
AVG(br.return_date - br.borrow_date) AS avg_days_borrowed,
COUNT(br.borrowing_id) AS total_borrowings
FROM Book b
INNER JOIN Borrowing br ON b.book_id = br.book_id
WHERE br.return_date IS NOT NULL
GROUP BY b.category
ORDER BY avg_days_borrowed DESC;Explanation:
Date arithmetic: return_date - borrow_date calculates days kept. AVG aggregates across all borrowings in each category. WHERE return_date IS NOT NULL excludes currently borrowed books. Shows which categories need longer borrowing periods. Date calculations vary by database (use DATEDIFF in MySQL, etc.).
Q7. Find books co-authored (written by more than one author).
Show Solution
SQL Query:
SELECT b.book_id, b.title, COUNT(ba.author_id) AS author_count
FROM Book b
INNER JOIN BookAuthor ba ON b.book_id = ba.book_id
GROUP BY b.book_id, b.title
HAVING COUNT(ba.author_id) > 1
ORDER BY author_count DESC;Explanation:
JOIN with junction table (BookAuthor) links books to their authors. COUNT(author_id) per book counts how many authors. HAVING > 1 filters to books with multiple authors. Shows collaborative works, useful for library cataloging and search features.
Q8. Find members who have never borrowed any book.
Show Solution
SQL Query:
SELECT member_id, name, email, membership_date
FROM Member
WHERE member_id NOT IN (
SELECT DISTINCT member_id
FROM Borrowing
);Explanation:
Subquery finds all members who appear in Borrowing table. NOT IN excludes those members, leaving only non-borrowers. Identifies inactive members for engagement campaigns or membership renewal reminders. Alternative: use LEFT JOIN with NULL check.
Q9. Find books published before 2000 that are still being actively borrowed (borrowed in last 6 months).
Show Solution
SQL Query:
SELECT DISTINCT b.book_id, b.title, b.publication_year,
COUNT(br.borrowing_id) AS recent_borrowings
FROM Book b
INNER JOIN Borrowing br ON b.book_id = br.book_id
WHERE b.publication_year < 2000
AND br.borrow_date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY b.book_id, b.title, b.publication_year
ORDER BY recent_borrowings DESC;Explanation:
Combines publication year filter with recent borrowing activity. INTERVAL '6 months' creates date range for recent activity. Shows classic/older books that remain popular. Useful for collection management decisions. Date intervals vary by database system.
Q10. Find authors who have written books in more than 3 different categories.
Show Solution
SQL Query:
SELECT a.author_id, a.name, a.country,
COUNT(DISTINCT b.category) AS category_count
FROM Author a
INNER JOIN BookAuthor ba ON a.author_id = ba.author_id
INNER JOIN Book b ON ba.book_id = b.book_id
GROUP BY a.author_id, a.name, a.country
HAVING COUNT(DISTINCT b.category) > 3
ORDER BY category_count DESC;Explanation:
Join chain: Author → BookAuthor → Book connects authors to book categories. COUNT(DISTINCT category) counts unique categories per author. HAVING filters to versatile authors who write across multiple genres. Identifies prolific, diverse authors for library events or recommendations.
Relational Algebra:
Πauthor_id,name,country(σcategory_count>3(Author ⋈ BookAuthor ⋈ Book))Continue Learning
You've completed Part 1! Next, learn about database design with E-R diagrams, functional dependencies, and normalization.
Next: Part 2 - Database Design
