I resolved all the database questions on Leetcode.com recently. The questions cover most of the SQL common queries inlcuding JOIN, Ranking and other SQL basics. I provided the answers as well as explanations in this blog, as a way to consolidate the SQL knowledge.
The questions on Leetcode only support MySQL, so you can install MySQL on your laptop for testing purpose before submitting the solution, or use online SQL platforms to test you query. E.g. rextester.
1. JOIN¶
175. Combining Two Tables¶
Solution:¶
Left join two tables, select the columns required.
Select FirstName, LastName, City, State
From Person Left Join Address
On Address.PersonId = Person.PersonId;
181. Employees Earning More Than Their Managers¶
Create test table¶
CREATE TABLE IF NOT EXISTS Employee (
Id INT,
Name VARCHAR(50),
Salary INT,
ManagerId INT
);
DELETE FROM Employee;
INSERT INTO Employee VALUES
(1, 'Joe', 70000, 3),
(2, 'Henry', 80000, 4),
(3, 'Sam', 60000, NULL),
(4, 'Max', 90000, NULL);
Solution:¶
A typical problem of self join. Join the table with itself, for each row compare the salaries of employee and manager.
Select a.Name
From Employee a JOIN Employee b
ON a.ManagerId = b.Id
Where a.Salary > b.Salary;
183. Customers Who Never Order¶
Create test table¶
Create table Customers (
Id INT,
Name VARCHAR(50)
);
Create table Orders (
Id INT,
CustomerId INT
);
Insert into Customers (Id, Name) Values (1, 'Joe'), (2, 'Henry'), (3, 'Sam'), (4, 'Max');
Insert into Orders (Id, CustomerId) Values (1, 3), (2, 1);
Solution:¶
Method 1: Use Left Join
, then select the rows whose CustomerId is not null.
Select C.Name As Customers
From Customers C left Join Orders O
On C.Id = O.CustomerId
Where O.CustomerId is null;
Method 2: (not sure why the code doesn't pass): Use Join
, then select the Name with Not In
.
Note: Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
Select Name As Customers from Customers
Where Name Not In (
Select Name from Customers C JOIN Orders O
ON C.Id = O.CustomerId
);
184. Department Highest Salary¶
Create test table¶
Create table Employee(
Id INT,
Name varchar(20),
Salary INT,
DepartmentId INT
);
Insert Into Employee (Id, Name, Salary, DepartmentId) Values
(1, 'Joe', 70000, 1),
(2, 'Henry', 80000, 2),
(3, 'Sam', 60000, 2),
(4, 'Max', 90000, 1);
Create table Department(
Id INT,
Name varchar(20)
);
Insert Into Department (Id, Name) Values
(1, 'IT'),
(2, 'Sales');
Solution¶
First select maxmium salary in each department with corresponding Id. Then join the result table with Employee
to get the Employee name who has maximum salary, DepartmentId and maximum salary. Then join the result table with the Department table to get the Department Name based on Department id.
Select D.Name AS Department, T.Name AS Employee, T.Salary AS Salary
From Department D,
(
Select Name, Salary, E.DepartmentId from Employee E JOIN (
Select DepartmentId, Max(Salary) AS MaxSalary From Employee
Group by DepartmentId) tmp
ON E.DepartmentId = tmp.DepartmentId
Where E.Salary = tmp.MaxSalary
) AS T
Where D.Id = T.DepartmentId;
2. Rank¶
176. Second Highest Salary¶
Create test table¶
Create table Employee(
Id INT,
Salary INT
);
Insert into Employee (Id, Salary) Values
(1, 100),
(2, 200),
(3, 300),
(4, 200);
Solution:¶
Method 1: Since the second highest salary is the highest salary after maximum salary is removed, we add a where
clause to inquery it, within which there is a subquery to select the highest salary.
Select Max(Salary) AS SecondHighestSalary
From Employee
Where Salary < (Select Max(Salary) From Employee);
Method 2: A more general method is to generate the rank of each row according to salary, then select the row whose value in rank
column is 2. The generalized ranking questions are showed later.
177. Nth Highest Salary¶
Create test table¶
The test table is the same as question 176.
Solution:¶
Method 1: The key is to generate the rank of rows. In MySQL, we can set a variable to help do that (which has the same output as dense_rank()
). Note there could be rows with the same rank, so we need select the distinct salary.
Select Distinct Salary From (
Select Id, Salary,
@rank := if(@preSalary = Salary, @rank, @rank+1) AS Rank,
@preSalary := Salary AS preSalary
From Employee, (Select @rank:=0, @preSalary:=-1) var
Order by Salary DESC
) tmp
Where Rank=N;
Method 2: In SQL server and Oracle, we can use rank()
and dense_rank()
function to generate rank directly. Note if there are ties, dense_rank()
always returns consecutive integers, while rank()
returns discrete ones. For the difference between these two function, see here.
Select Id, Salary, Rank() Over (Order by Salary Desc) From Employee;
Select Id, Salary, Dense_Rank() Over (Order by Salary Desc) From Employee;
178. Rank Scores¶
Create test table¶
Create table Scores(
Id INT,
Score Float
);
Insert Into Scores (Id, Score) Values
(1, 3.5),
(2, 3.65),
(3, 4.0),
(4, 3.85),
(5, 4.0),
(6, 3.65);
Solution:¶
The Solution is the same as question 177. First generate the rank, then select two columns Score
and Rank
.
Select Score, Rank From (
Select Id, Score,
@rank := if(@preScore = Score, @rank, @rank+1) AS Rank,
@preScore := Score AS preScore
From Scores, (Select @rank:=0, @preScore:=-1) var
Order by Score DESC
) tmp;
180. Consecutive Numbers¶
Create test table¶
Create table Logs(
Id INT,
Num INT
);
Insert into Logs (Id, Num) Values
(1,1),
(2,1),
(3,1),
(4,2),
(5,1),
(6,2),
(7,2);
Solution:¶
This is similar to ranking question, in which we set a variable to count consecutive numbers. We create a new column to record the number of consecutive numbers. If the Num
is the same as the previous one, then add the count
by one, and reset the count
to be 1 if the Num
is different.
Note: since it doesn't need rank, in SQL server and Oracle you can't use rank()
to solve this problem.
Select Distinct Num As ConsecutiveNums From (
Select Id, Num,
@count := If(@preNum = Num, @count+1, 1) AS Count,
@preNum := Num AS preNum
From Logs, (Select @preNum:=NULL, @count:=1) var
) tmp
Where Count >=3;
185. Department Top Three Salaries¶
Create test table¶
Create table Employee(
Id INT,
Name varchar(20),
Salary INT,
DepartmentId INT
);
Insert Into Employee (Id, Name, Salary, DepartmentId) Values
(1, 'Joe', 70000, 1),
(2, 'Henry', 80000, 2),
(3, 'Sam', 60000, 2),
(4, 'Max', 90000, 1),
(5, 'Janet', 69000, 1),
(6, 'Randy', 85000, 1),
(7, 'Xiang', 70000, 1);
Create table Department(
Id INT,
Name varchar(20)
);
Insert Into Department (Id, Name) Values
(1, 'IT'),
(2, 'Sales');
Solution:¶
This is also a ranking problem. But different from previous ones, we need generate the rank for each group of each department.
Method 1: In MySQL, we set a variable to generate the rank within each group. If the group(departmentId) changed, then set the rank to default. Then join the generated table with Department table to get the department name.
Select D.Name AS Department, T.Name AS Employee, T.Salary
From
(
Select DepartmentId, Name, Salary,
(CASE WHEN @id=DepartmentId THEN @rank:= IF(@preSalary = Salary, @rank, @rank+1)
ELSE @rank:= 1
END) AS Rank,
@preSalary:= Salary AS preSalary,
@id:= DepartmentId AS preId
From Employee, (Select @rank:=0, @preSalary:=-1, @id:=NULL) var
Order by DepartmentId, Salary DESC
) T JOIN Department D
ON T.DepartmentId = D.Id
Where T.Rank <=3;
Method 2: Similarly, in SQL server and Oracle, we can use Dense_rank()
function to generate the rank easily. Note according to the question, the tie should have the same rank. So Dense_rank()
is used here.
Select D.Name AS Department, T.Name AS Employee, T.Salary
From
(
Select DepartmentId, Name, Salary,
Dense_rank() Over (partition by DepartmentId Order by Salary DESC) AS Rank
From Employee
) T JOIN Department D
ON T.DepartmentId = D.Id
Where T.Rank <=3;
197. Rising Temperature¶
Create test table¶
CREATE TABLE Weather (
Id INT,
Date DATE,
Temperature INT
);
DELETE FROM Weather;
INSERT INTO Weather VALUES
(1, '2015-01-01', 10),
(2, '2015-01-02', 25),
(3, '2015-01-04', 20),
(5, '2015-01-03', 32),
(4, '2015-01-06', 30);
Solution:¶
Method 1: Set dummies variables, compare each day's temperature and date with its previous day. Then select the day in which temperature is rising and date is continuous. (Don't know why this method is very slow, almost exceeded time limit).
Select Id From (
Select Id, Date, Temperature,
@Higher := If(Temperature > @preTemp, 'Yes', 'No') AS Higher,
@DateContinuous := If(Datediff(Date, @preDate) = 1, 'Yes', 'No') AS DateContinuous,
@preTemp := Temperature AS preTemp,
@preDate := Date
From Weather, (Select @preTemp:=NULL, @Higher:=NULL, @preDate:=NULL) var
Order by Date
) tmp
Where Higher = 'Yes' AND DateContinuous = 'Yes';
Method 2: Join the table with itself on where the data difference is one. Then select the dates in which temperature is rising.
Select W1.Id
From Weather W1 JOIN Weather W2
ON to_days(W1.Date) = to_days(W2.Date) + 1
Where W1.Temperature > W2.Temperature;
3. SQL Basics¶
182. Duplicate Emails¶
Create test table¶
Create table Person (
Id INT,
Email VARCHAR(100)
);
Insert into Person (Id, Email) Values
(1, 'a@b.com'),
(2, 'c@d.com'),
(3, 'a@b.com')
Solution:¶
Group by Email, then filter the groups whose count are greater than 1.
Select Email From Person
Group by Email
Having count(*) > 1;
196. Delete Duplicate Emails¶
Create test table¶
Create table Person (
Id INT,
Email VARCHAR(100)
);
Insert into Person (Id, Email) Values
(1, 'john@example.com'),
(2, 'bob@example.com'),
(3, 'john@example.com')
Solution:¶
It's very easy to retain the unique emails with the smallest ids. However, note the question require to delete the duplicate emails, so Delete
clause is needed.
/* retain the unique emails*/
Select Min(Id) AS minId, Email From Person
Group by Email
/* delete the emails which don't appear in the unique emails*/
Delete From Person
Where Id not in (
Select minId From (
Select min(Id) AS minId, Email
From Person
Group by Email
) AS tmp
);
262. Trips and Users¶
Create test table¶
Create table Trips(
Id INT,
Client_id INT,
Driver_id INT,
City_id INT,
Status ENUM('completed', 'cancelled_by_driver', 'cancelled_by_client'),
Request_at DATE
);
Insert into Trips Values
(1,1,10,1,'completed', str_to_date('10/1/13', '%m/%d/%Y')),
(2,2,11,1,'cancelled_by_driver', str_to_date('10/1/13', '%m/%d/%Y')),
(3,3,12,6,'completed', str_to_date('10/1/13', '%m/%d/%Y')),
(4,4,13,6,'cancelled_by_driver', str_to_date('10/1/13', '%m/%d/%Y')),
(5,1,10,1,'completed', str_to_date('10/2/13', '%m/%d/%Y')),
(6,2,11,6,'completed', str_to_date('10/2/13', '%m/%d/%Y')),
(7,3,12,6,'completed', str_to_date('10/2/13', '%m/%d/%Y')),
(8,2,12,12,'completed', str_to_date('10/3/13', '%m/%d/%Y')),
(9,3,10,12,'completed', str_to_date('10/3/13', '%m/%d/%Y')),
(10,4,13,12,'cancelled_by_driver', str_to_date('10/3/13', '%m/%d/%Y'));
Create table Users(
Users_id INT,
Banned ENUM('No', 'Yes'),
Role ENUM('client', 'driver', 'partner')
);
Insert into Users Values
(1, 'No', 'client'),
(2, 'Yes', 'client'),
(3, 'No', 'client'),
(4, 'No', 'client'),
(10, 'No', 'driver'),
(11, 'No', 'driver'),
(12, 'No', 'driver'),
(13, 'No', 'driver');
Solution:¶
The key of this question is computing the cancellation rate. To compute it, we need count the number of trips cancelled by driver, and total number of trips within each group. The group is regularized by the date. Before the group by
clause, use where
clause to filter the rows which meet the requirement.
Select T.Request_at AS Day, Round(Sum(If(Status='cancelled_by_driver',1,0))/count(*), 2) AS 'Cancellation Rate'
From Trips T JOIN Users U
ON T.Client_Id = U.Users_Id
Where Banned='No' And Role='client' And Request_at Between '2013-10-01' and '2013-10-03'
Group by Request_at
Order by Request_at;
Comments
comments powered by Disqus