SQL, or Structured Query Language, is the universal language spoken by databases. It's the tool that allows us to interact with, retrieve, and manage data efficiently. If you're wondering where to begin, fear not! "Getting Started with SQL: A Quick Reference for Beginners" is here to guide you through the fundamental concepts in a way that's approachable and easy to understand.
I will break down the basics of SQL, focusing on the keywords and concepts that form the backbone of this powerful language.
To show which databases are there use show database, and in Microsoft SQL used to show the database.
SELECT name FROM sys. databases;
and to access any of the databases use: use eg USE sql_joins.
SELECT table_name = t.name
FROM sys.tables t
Used to show the number of tables in a database.
A table is a collection of related data entries and it consists of columns and rows.
A field is a column in a table that is designed to maintain specific information about every record in the table.
A record, also called a row or a horizontal entity in a table is each entry that exists in a table.
SQL keywords are NOT case sensitive: select is the same as SELECT.
A semicolon is the standard way to separate each SQL Statement in database systems that allow more than one SQL Statement to be executed in the same call to the server.
The SELECT also called a query, is a data manipulation language statement that is used to read and modify data.
1 AGGREGATE FUNCTIONS.
Aggregation is another name for summarizing your data points to get a single value.eg calculating the mean or the minumum.
Aggregating something, it is no longer enough just to say, “I want data from this and that, and only if it’s x, y, or z.” To ask for totals over “this and that” is usually insufficient; typically, we want to see “totals per customer,” “count of orders by day,” or “average sales of each category by month.”
Below are some essential SQL aggregation functions;
1.1:Use DISTINCT to Get Distinct Counts.
The purpose of the COUNT() aggregate function should be obvious from its name.
Note: Typing COUNT(1) has the same effect as COUNT(*). Which one you use is a matter of personal preference.
There are three different ways in which the COUNT() aggregate function can be used to return the number of items in a group.
- COUNT(*) returns the number of items in a group, including null values and duplicates. The regular COUNT function, which counts all rows including duplicates. Here's the general syntax
SELECT COUNT(title)
FROM cinema.films
The above code will return output - 4968
- COUNT(ALL ) (which can be shortened to COUNT() because ALL is the default) evaluates an expression for each row in a group and returns the number of non-null values.
- COUNT(DISTINCT ) evaluates an expression for each row in a group and returns the number of unique, non-null values.
SELECT COUNT(DISTINCT title)
FROM cinema.films
The above code outputs-4844
By specifying distinct, therefore, the count() function examines the values of a column for each member of the group to find and remove duplicates, rather than simply counting the number of values in the group.
1.2:The SQL MIN () and MAX() Functions.
The MIN () function returns the smallest value of the selected column. The **MAX() **function returns the largest value of the selected column.
SELECT MIN (budget) SELECT MAX(budget)
FROM cinema.films FROM cinema.films
1.3 ORDER BY
Order by keyword is used to sort the result set in ascending or descending order by default. To sort in Descending order use DESC.
SELECT * FROM Products
ORDER BY Price DESC;
SELECT * FROM Customers
ORDER BY Country, CustomerName;
The above SQL statement selects all customers from the "Customers" table, sorted by the "Country" and the "CustomerName" columns. This means that it orders by Country, but if some rows have the same Country, it orders them by customer name.
1.4:SQL SUM
SUM is an SQL aggregate function, that totals the values in a given column. Unlike COUNT, you can only use SUM on columns containing numerical values.
SELECT SUM (volume)
FROM apple_historical_stock_price
Note: SUM treats nulls as 0.
SELECT dept, sum(salary) as total_salary
FROM employees
GROUP BY dept;
1.5: Understand How GROUP BY Works.
There are three phases when you group data:
- Split:the dataset is split up into chunks of rows based on the values of the variables we have chosen for the aggregation.
- Apply : Compute an aggragate function, like average ,minimum and maximum returning a single value
- Combine: All these resulting outputs are combined in a unique table and we'll have a single value for each modality of the variable of interest.
The GROUP BY statement is often used with aggregate functions (COUNT (), MAX(), MIN(), SUM(),AVG() to group the result-set by one or more columns. General Syntax
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
ORDER BY column_name(s);
You often need to be able to partition your data into groups (where a group is a set of rows with the same values for all of the grouping columns) to be able to apply some type of
aggregation to your data.
SELECT country, COUNT(*) AS n_comapnies
FROM companies
GROUP BY country
ORDER BY n_companies DESC
LIMIT 10
COUNT- As the name suggest it is used to count rows for each group which corresponds to the country, the keyword AS is used to rename the column into a more explainable name.
SELECT
product_line,
AVG(unit-price) AS avg_price,
SUM(quantity) AS tot_pieces,
SUM(total) AS total_gain
FROM sales
GROUP BY product_line
ORDER BY total_gain DESC
Columns included in the GROUP BY clause are referred to as the grouping columns. Columns included in the GROUP BY clause don't need to be included in the SELECT clause.
You should filter the data in the WHERE clause as much as possible because that will reduce the amount of data that needs to be aggregated.
1.5 SELECT DISTINCT
- Statement is used to return only distinct(Different) values.It ensures that the result set contains only distinct (unique) values, eliminating any duplicate records. WHERE CLause Always require a predicate - Evalautes to True,False or unknown WHERE It is used to extract only those records that fulfill a specified condition. Note: The WHERE clause is not only used in SELECT statements, it is also used in UPDATE, DELETE, etc. You can use other operators than the = operator to filter the search.
SELECT DISTINCT longitude,latitude
FROM Dinosaurs
WHERE length_ma > 8.2
1.6:SQL HAVING CLAUSE
The HAVING clause was added to SQL because the WHERE keyword cannot be used with aggregate functions.
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s);
The HAVING clause filters the aggregated data set. You should use a HAVING clause only when the filtering depends on the results of the aggregation, such as HAVING Count(*) > 5 or HAVING Sum(Price) < 100.
REMEMBER if you want to do a COUNT the output does not have a column name instead write this to have a column name
SELECT COUNT(LastName) AS lastNameCount
FROM EmployeeDemographics
SELECT
product_line
AVG(unit_price) As avg_price,
SUM(quantity) AS tot_pieces,
SUM(total) AS total_gain
FROM sales
GROUP BY product_line
HAVING SUM (total) > 40000
ORDER BY total_gain DESC
WHERE CLAUSE.
The WHERE clause can contain one or many AND operators.( =,<>,<,>, or, LIKE , Null, Not null, IN )
- AND operator is used to filter records based on more than one condition. OR is used to return one true condition. LIKE is used when u want to return a value that starts with a certain letter. eg WHERE LASTNAME = LIKE 'S%O%'- remember the letters must follow each other. IN is used the same way as equals but it is a much-condensed way to return values eg WHERE FIRSTNAME IN ('JIM', Michael')
SELECT
FROM Dinosaurs
WHERE Type = 'small theropod' AND diet LIKE 'OMNIVOROUS';
The AND operator displays a record if all the conditions are TRUE.
SQL OR
The OR operator displays a record if any of the conditions are TRUE. The OR operator is used to filter records based on more than one condition, like if you want to return all Dinosaurs whose diet is omnivorous or whose region is Alberta.
SELECT name, length_m,
FROM Dinosaurs
WHERE diet = omnivorous OR region = Alberta,
The NOT Operator.
The NOT operator is used in combination with other operators to give the opposite result, also called the negative result.
SELECT * FROM Dinosaurs
WHERE NOT DIET = 'omnivorous'
NOT BETWEEN.
Select Dinosaurs with a length not between,
SELECT *FROM Dinosaurs
WHERE length_ma NOT BETWEEN 8.7 AND 10.0
NOT IN
SELECT *FROM Dinosaurs
WHERE region NOT IN ('Liaoning', 'Alberta');
Select Dinosaurs that are not from liaoning and alberta.
NOT Greater Than (NOT >) & NOT Less Than ( NOT <)
Select Dinosaurs with a length not greater than 8.7
SELECT * FROM Dinosaurs
WHERE NOT length > 8.7;
SQL INSERT INTO.
The INSERT INTO statement is used to insert new records in a table. There are two ways to write the INSERT INTO.
- Specify both the column names and the values to be inserted
INSERT INTO Dinosaurs (column1, column2, column3,
VALUES(value1, value2 value3...);
- If you are adding values for all the columns of the table, you do not need to specify the column names in the SQL query. However, make sure the order of the values is in the same order as the columns in the table.
INSERT INTO Dinosaurs
VALUES (value1,value2,value3,....);
NULL Value
Null values serve as placeholders(A field with no value)for data that is missing or not available. They assist us in eradicating data ambiguity, Null values are also useful for maintaining consistent data across the column.
If a field in a table is optional, it is possible to insert a new record or update a record without adding a value to this field. Then, the field will be saved with a NULL value.
To test for NULL VALUES we use IS NULL and IS NOT NULL
SELECT lat SELECT lat
FROM Dinosaurs, FROM Dinosaurs,
WHERE lat IS NULL WHERE lat IS NOT NULL`
The NULL operator is used to test for empty values (NULL Values)
while the IS NOT NULL is used to test for Non-empty Values.
Key points regarding the use of NULL in SQL:
Comparison with NULL
: Comparisons with NULL using regular comparison operators like “=”, “<>”, “<“, “>” do not yield true or false but rather produce a result of unknown or NULL. Instead, you need to use the IS NULL or IS NOT NULL operators to check for NULL values.Handling NULL in expressions:
When performing arithmetic or other operations involving NULL values, the result typically becomes NULL. For example, any arithmetic operation that involves a NULL operand will result in a NULL result.Aggregating NULL values:
Most aggregate functions in SQL, such as SUM, AVG, COUNT, etc., ignore NULL values when calculating results. However, there are some aggregate functions like COUNT(*) that consider NULL values.
SQL UPDATE
Be careful when updating records in a table. Notice the WHERE clause in the UPDATE statement. The WHERE clause specifies which record(s) that should be updated. If you omit the WHERE clause, all records in the table will be updated!
UPDATE Customers
SET ContactName = 'Juan'
WHERE Country = 'mexico'
SQL DELETE
Used to delete existing records in a table.
DELETE FROM Dinosaurs WHERE condition ;
The same case applies to DELETE like the UPDATE about WHERE.
SQL EXISTS Operator.
The EXISTS operator is used to test for the existence of any record in a subquery. It returns TRUE if the Subquery returns one or more records.
SELECT column_name(s)
FROM table_name
WHERE EXISTS
The SQL CASE STATEMENT.
The CASE statement is SQL's way of handling if/then logic. The case statement is followed by one least one pair of WHEN and THEN statements - SQL's equivalent of IF/THEN in Excel.
Every CASE statement must end with an END statement. The ELSE statement is optional and provides a way to capture values not specified in the WHEN/THEN statements.
SELECT certification
CASE - (checks each row to see if the conditional statement
WHEN duration >60 THEN 'long'
WHEN duration <=60 THEN 'Short'
ELSE 'Unknown'
END AS duration_category,
COUNT (*) AS movie_count
AVG(gross) AS Average_count
FROM cinema. films
GROUP BY certification,duration_category
ORDER BY certification,duration_category
Adding multiple conditions to a CASE statement.
You can also define a number of outcomes in a CASE statement by including as many WHEN/THEN statements as you'd like.
SELECT player_name,
weight,
CASE WHEN weight > 250 THEN 'over 250'
WHEN weight > 200 THEN '201-250'
WHEN weight > 175 THEN '176-200'
ELSE '175 or under' END AS weight_group
FROM benn.college_football_players
SELECT
player_name,
location,
CASE WHEN location ='California' THEN 'yes'
ELSE 'no'
END AS California
FROM benn.college_football_players
The SQL query selects the player_name and location columns from the table benn.college_football_players. Additionally, it uses a CASE statement to create a new column called California. If the value in the location column is 'California', the California column will have the value 'yes'; otherwise, it will have the value 'no'.
A QUICK REVIEW OF CASE BASICS.
The CASE statement always goes in the SELECT clause.
CASE must include the following components ;WHEN,THEN, and END . ELSE is an optional component.
3 You can make any conditional statement using any conditional operator (like WHERE) between WHEN and THEN . This includes stringing together multiple conditional statements using AND and OR.
In the above example, you can also string together multiple conditional statements with AND and OR the same way you might in a WHERE clause.
Using CASE with Aggregate functions.
It is often helpful to write a query containing the case statement first and run it on its own.
SELECT CASE WHEN year = 'FR' THEN 'FR'
WHEN year = 'SO' THEN 'SO'
WHEN year = 'JR' THEN 'JR'
WHEN year = 'SR' THEN 'SR'
ELSE 'No Year Data' END AS year_group,
COUNT(1) AS count
FROM benn.college_football_players
GROUP BY year_group
Top comments (0)