Common Mistakes In SQL Queries

Serdar A.
4 min readApr 18, 2024

--

Hello Everyone;

For Turkish : https://medium.com/@lastjavabuilder/sql-sorgular%C4%B1nda-yanl%C4%B1%C5%9F-kullan%C4%B1mlar-391347b83268

In this article, I will offer suggestions about the most common mistakes in our SQL queries and solutions to them.

  1. NULL comparison

In queries, comparisons are written as nullParam = NULL in the WHERE section.The result of this query returns FALSE, causing the query result to work incorrectly.Because NULL obviously means “unknow”. We cannot know that two unknown things are the same and therefore cannot compare them.

Solution:

using param IS NULL or param IS NOT NULL

2. Count(expression) function

By default, the count() function ignores null values in select queries.

Solution:

using count(*)

Query example

SELECT COUNT(MyField) AS count_column ,COUNT(*) AS count_all FROM my_table

Result

3. Using reserved word as column name

Select from FROM TableName

from word is reserved word and it can be error.

Solution :

SELECT `from` FROM TableName

or

SELECT TableName.from FROM TableName

or

select [from] from TableName

4. Unawareness of Join Type

Sometimes, when we use JOIN Type as INNER between table A and table B in queries, the result may not come, for example, or let’s say we want all the data in table A, we cannot get the result we want due to the INNER Join Type.

5. Using Between incorrectly

For example, if the column value is 2020–05–01 20:25:19, when we write a query like the following, no results will be returned.

SELECT * FROM TableName WHERE my_date  BETWEEN '2020-05-01' AND '2020-05-01'

this query execute in actually:

SELECT * FROM TableName
WHERE my_date BETWEEN '2020-05-01 00:00:00+00000' AND '2020-05-01 00:00:00+00000'

Solution :

SELECT *
FROM TableName
WHERE cast(my_date _at as date) BETWEEN '2020-05-01' AND '2020-05-01'

or

SELECT *
FROM TableName
WHERE my_date >= '2020-05-01' AND my_date < '2020-05-02'

6. Using Where clause after Group By clause

Produces incorrect results when using where on grouped datasets.

Solution :

  • Where : Filter records
SELECT S_Name, Age FROM Student 
WHERE Age >=18
  • Having : Filter grouping records
SELECT Age, COUNT(Roll_No) AS No_of_Students 
FROM Student GROUP BY Age
HAVING COUNT(Roll_No) > 1

Differences:

7. Using common column name

If there are column names with the same name in more than one table, using them interchangeably in the where section or select section will cause query result errors.

Solution :

using alias for each tables

Select A.name as A_name, B.col1, C.name as C_name 
from TableA A, TableB B, TableC C
WHERE
A.id = C.id
B.col2 = C.col2

8. Misusing the DISTINCT keyword

If DISTINCT statements are used unnecessarily, it increases the running time of the query and causes performance loss.

The queries should be analyzed and unnecessary DISTINCT statements should be removed from the queries.

9. Improper use of Subqueries

Although sub queries are sometimes necessary, they can sometimes cause performance problems. Solutions must be found by analyzing the queries.

Select A.col1,A.col2,
(Select max(B.col3) from TableB as B WHERE A.col4 = B.col4 ) AS maxValue
From TableA as A

Solution :

To optimize the queries, the queries in the sub query can be made more performant with JOIN Type or other techniques.

10. Wrong Normalization

  • Duplicate colons in the same table should be removed. Excess colons should be removed.
  • Related datasets should be grouped separately and a table created
  • If necessary, a separate table can be created for common related data sets in an interest set.
  • Defining primary key for each table
  • For secondary tables, reference keys must be created in the relevant tables.

Thanx to reading.

--

--

Serdar A.

Senior Software Developer & Architect at Havelsan Github: https://github.com/serdaralkancode #Java & #Spring & #BigData & #React & #Microservice