Logo
Back to blog

Filtering, Sorting, and Working with NULLs

7 min read
sql

In our previous post, we explored the basics of the SELECT statement—the foundation of all SQL queries. Now, we're ready to take our SQL skills to the next level by learning how to filter data, sort results, and handle missing values (NULLs).

These techniques are essential for pinpointing exactly the information you need in a database. Think of it as the SQL equivalent of Excel's filter and sort features, but with far more power and flexibility.

Filtering Data with WHERE

While seeing all records in a table can be useful, real-world SQL queries almost always need filtering. The WHERE clause lets us specify exactly which rows we want to see based on conditions we define.

Basic Filtering

Let's find all customers from the United States:

SELECT FirstName, LastName, City, State
FROM Customers
WHERE Country = 'USA';

The WHERE clause filters our results to include only rows where the Country equals 'USA'. The condition acts like a test that each row must pass to be included in the results.

In our coffee shop database, this should return around 15 customers who are based in the United States.

Comparison Operators

SQL supports all the comparison operators you're familiar with from Excel:

  • = Equal to
  • <> or != Not equal to
  • > Greater than
  • < Less than
  • >= Greater than or equal to
  • <= Less than or equal to

For example, to find premium products with a price over $50:

SELECT ProductName, UnitPrice, CategoryID
FROM Products
WHERE UnitPrice > 50;

This returns products like our espresso machines and other high-end items.

Multiple Conditions with AND/OR

Need to filter on multiple conditions? Use the logical operators AND and OR:

-- Products that are both expensive and low in stock (potential reorder candidates)
SELECT ProductName, UnitPrice, UnitsInStock
FROM Products
WHERE UnitPrice > 50 AND UnitsInStock < 20;

-- Customers from North America
SELECT FirstName, LastName, Country
FROM Customers
WHERE Country = 'USA' OR Country = 'Canada' OR Country = 'Mexico';

Understanding how AND and OR work is crucial:

  • AND: Both conditions must be true
  • OR: Either condition can be true

Operator Precedence and Parentheses

SQL evaluates AND before OR, similar to how multiplication happens before addition in math. When combining multiple logical operators, use parentheses to ensure the correct evaluation order:

-- Find expensive beverages or any product that's running low
SELECT ProductName, CategoryID, UnitPrice, UnitsInStock
FROM Products
WHERE (CategoryID = 2 AND UnitPrice > 15) OR UnitsInStock < 20;

Without parentheses, this query would find products that are both in category 2 AND have a price over $15, OR products with less than 20 units in stock. The parentheses make it clear we want expensive beverages or any low-stock product.

The IN Operator

When you need to match against a list of possible values, IN is cleaner than multiple OR conditions:

-- Equivalent to Country = 'USA' OR Country = 'Canada' OR Country = 'Mexico'
SELECT FirstName, LastName, Country
FROM Customers
WHERE Country IN ('USA', 'Canada', 'Mexico');

The IN operator is particularly useful when:

  • You have many possible values to match
  • The values come from a subquery (which we'll cover in a later post)

The NOT Operator

The NOT operator reverses the logic of any condition:

-- Customers NOT from the USA
SELECT FirstName, LastName, Country
FROM Customers
WHERE NOT Country = 'USA';

-- This can also be written as:
SELECT FirstName, LastName, Country
FROM Customers
WHERE Country <> 'USA';

You can combine NOT with other operators:

-- Products that aren't in categories 1, 2, or 3
SELECT ProductName, CategoryID
FROM Products
WHERE CategoryID NOT IN (1, 2, 3);

The BETWEEN Operator

For range checking, BETWEEN provides a cleaner syntax:

-- Products priced from $10 to $20 (inclusive)
SELECT ProductName, UnitPrice
FROM Products
WHERE UnitPrice BETWEEN 10 AND 20;

-- Equivalent to:
SELECT ProductName, UnitPrice
FROM Products
WHERE UnitPrice >= 10 AND UnitPrice <= 20;

BETWEEN includes the boundary values (10 and 20 in this case), making it an inclusive range.

You can also use BETWEEN with dates:

-- Orders placed in the first quarter of 2023
SELECT SaleID, CustomerID, SaleDate
FROM Sales
WHERE SaleDate BETWEEN '2023-01-01' AND '2023-03-31';

Filtering Text with LIKE

Need to search for patterns in text? The LIKE operator combined with wildcards is your tool:

-- Customers whose last name starts with 'S'
SELECT FirstName, LastName
FROM Customers
WHERE LastName LIKE 'S%';

SQL Server supports these wildcards:

  • % - Matches any sequence of characters (including none)
  • _ - Matches exactly one character
  • [abc] - Matches any single character in the brackets
  • [^abc] - Matches any single character NOT in the brackets

Practical examples:

  • 'S%' - Starts with S (Smith, Stevens, Sharma)
  • '%son' - Ends with son (Johnson, Anderson, Wilson)
  • '%ing%' - Contains ing (King, Singer, Washington)
  • '__th' - Four characters long, ending with th (Beth, Seth)
  • '[AM]%' - Starts with either A or M
  • '[A-M]%' - Starts with any letter from A to M
Excel Comparison: LIKE with wildcards is similar to Excel's * and ? wildcards in functions like COUNTIF or when using filters.

Case Sensitivity in Filters

In SQL Server, string comparisons are case-insensitive by default. This means WHERE LastName = 'smith' will match 'SMITH', 'Smith', or any other capitalization.

If you need case-sensitive comparisons, you can use:

SELECT FirstName, LastName
FROM Customers
WHERE LastName COLLATE Latin1_General_CS_AS = 'Smith';
Dialect Difference: PostgreSQL and some other databases are case-sensitive by default, unlike SQL Server.

Sorting Results with ORDER BY

Database results have no inherent order unless you specify one. The ORDER BY clause lets you arrange your results in a meaningful sequence:

-- Products from most to least expensive
SELECT ProductName, UnitPrice
FROM Products
ORDER BY UnitPrice DESC;
  • ASC - Ascending order (lowest to highest, A to Z, earliest to latest)
  • DESC - Descending order (highest to lowest, Z to A, latest to earliest)

If you don't specify, ASC is the default.

Multi-column Sorting

You can sort by multiple columns, establishing a hierarchy of sort criteria:

-- Sort customers by country, then by last name within each country
SELECT FirstName, LastName, Country, City
FROM Customers
ORDER BY Country ASC, LastName ASC;

This sorts first by Country (alphabetically), then within each country, by LastName (also alphabetically).

Sorting by Column Position

You can also sort by column position instead of column name:

SELECT FirstName, LastName, Country
FROM Customers
ORDER BY 3 ASC, 2 ASC;

This sorts by the third column (Country), then by the second column (LastName). While this approach saves typing, it can make your query harder to maintain if you later change the column list.

Sorting by Expressions

You can sort by calculated values:

-- Sort products by inventory value (price × quantity)
SELECT
    ProductName,
    UnitPrice,
    UnitsInStock,
    UnitPrice * UnitsInStock AS InventoryValue
FROM
    Products
ORDER BY
    UnitPrice * UnitsInStock DESC;

You can also sort by the alias:

SELECT
    ProductName,
    UnitPrice,
    UnitsInStock,
    UnitPrice * UnitsInStock AS InventoryValue
FROM
    Products
ORDER BY
    InventoryValue DESC;

Limiting Results with TOP

In SQL Server, the TOP clause limits how many rows are returned:

-- Show the 5 most expensive products
SELECT TOP 5 ProductName, UnitPrice
FROM Products
ORDER BY UnitPrice DESC;

You can also specify a percentage:

-- Show the top 10% most expensive products
SELECT TOP 10 PERCENT ProductName, UnitPrice
FROM Products
ORDER BY UnitPrice DESC;
Dialect Difference: PostgreSQL and MySQL use LIMIT instead of TOP. For example:

TOP is particularly useful for:

  • Creating "Top N" reports
  • Sampling data from large tables
  • Pagination (though more advanced techniques exist)

Working with NULL Values

One of the most confusing aspects of SQL for beginners is working with NULL values. In SQL, NULL represents a missing or unknown value—not zero, not an empty string, but the absence of any value.

What is NULL?