Simplifying Data Queries in .NET using LINQ
What is LINQ?
LINQ (Language-Integrated Query), introduced with .NET Framework 3.5, is a set of query capabilities built into the language. It lets you retrieve data from various sources, such as collections of objects, relational databases, ADO.NET datasets, and XML files, using one consistent syntax.
Different Steps of a LINQ Query Operation
A LINQ query operation involves three main steps:
Obtain the data source
Create the query
Execute the query
Obtain the Data Source
A valid LINQ data source must support the IEnumerable<T> interface or an interface that inherits from it.
Let's define a simple data source:
var studentIds = new int[10] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
The studentIds array supports the IEnumerable<T> interface.
Types that support IEnumerable<T> or a derived interface (IQueryable<T>) are called queryable types. A queryable type can be directly used as a LINQ data source. If the data source is not in memory as a queryable type, we need to use LINQ providers to load it into a queryable form.
Create the Query
A query specifies the information we want to retrieve from the data source.
To create a query, we need to import LINQ into our code:
using System.Linq;
Now, let's define the query:
var studentsWithEvenIds = from studentId in studentIds
                          where (studentId % 2) == 0
                          select studentId;
Here, studentsWithEvenIds is a query variable of type IEnumerable<int>. It describes how to produce all the even-numbered student IDs, but it does not hold them until the query is executed.
The query expression has three clauses: from, where, and select. The from clause describes the data source, the where clause applies filters, and the select clause shapes the data to produce the query result.
Execute the Query
There are two ways to execute a LINQ query:
Deferred Execution
Immediate Execution
Deferred Execution
Deferred execution means the actual execution of the query is delayed until we iterate over it, for example with a foreach statement:
foreach (int studentId in studentsWithEvenIds)
{
    Console.WriteLine("Student Id {0} is even.", studentId);
}
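To make the effect of deferred execution visible, here is a minimal sketch that walks through the three steps and changes the data source between creating and executing the query. It assumes a List<int> instead of the array used above so that the source can be modified. The names evenIds, firstRun, and secondRun are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Step 1: obtain the data source (a List<int> so we can modify it later).
var studentIds = new List<int> { 0, 1, 2, 3, 4 };

// Step 2: create the query. Nothing is evaluated yet.
var evenIds = from id in studentIds
              where id % 2 == 0
              select id;

// Change the source after the query has been defined.
studentIds.Add(6);

// Step 3: execute the query. The where clause runs now, so 6 is included.
var firstRun = string.Join(", ", evenIds);
Console.WriteLine(firstRun); // 0, 2, 4, 6

// Enumerating again re-runs the query against the current source.
studentIds.Remove(0); // removes the value 0, not the element at index 0
var secondRun = string.Join(", ", evenIds);
Console.WriteLine(secondRun); // 2, 4, 6
```

Because the query is re-evaluated on every enumeration, it always reflects the current state of the source.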
Immediate Execution
Immediate execution is the opposite of deferred execution. Here, the query is executed, and the result is obtained immediately. Examples include aggregate functions such as Count, Average, Min, Max, and Sum, element operators such as First, Last, and Single, and conversion methods such as ToList, ToArray, and ToDictionary.
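The difference from deferred execution can be sketched as follows. The example assumes a List<int> source, and the names evenIdsQuery and evenIdsSnapshot are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var studentIds = new List<int> { 0, 1, 2, 3, 4 };

// Deferred: this line only describes the query.
var evenIdsQuery = studentIds.Where(id => id % 2 == 0);

// Immediate: ToList() and Count() enumerate the source right away.
List<int> evenIdsSnapshot = evenIdsQuery.ToList();
int countBefore = evenIdsQuery.Count();

studentIds.Add(6);

// The snapshot is unchanged, while the deferred query sees the new element.
Console.WriteLine(evenIdsSnapshot.Count); // 3
Console.WriteLine(evenIdsQuery.Count());  // 4
```

Calling ToList() or ToArray() is a common way to evaluate a query once and reuse the cached result.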
Basic Ways to Write LINQ Queries
There are two basic ways to write LINQ queries:
Query Syntax
Method Syntax
Query Syntax
To start with our example, we define a method that returns our data source:
static IQueryable<Student> GetStudentsFromDb()
{
    return new[]
    {
        new Student { StudentID = 1, StudentName = "John Nigel", Mark = 73, City = "NYC" },
        new Student { StudentID = 2, StudentName = "Alex Roma", Mark = 51, City = "CA" },
        new Student { StudentID = 3, StudentName = "Noha Shamil", Mark = 88, City = "CA" },
        new Student { StudentID = 4, StudentName = "James Palatte", Mark = 60, City = "NYC" },
        new Student { StudentID = 5, StudentName = "Ron Jenova", Mark = 85, City = "NYC" }
    }.AsQueryable();
}
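The Student type is not shown in this article. A minimal sketch of what it might look like follows. The property names are taken from the initializers above, while the property types (int for StudentID and Mark, string for the rest) are assumptions.

```csharp
#nullable enable
using System;

// Top-level statements must come before type declarations.
var student = new Student { StudentID = 1, StudentName = "John Nigel", Mark = 73, City = "NYC" };
Console.WriteLine($"{student.StudentName} ({student.City}): {student.Mark}");

// Hypothetical model: property names come from the article, types are assumed.
public class Student
{
    public int StudentID { get; set; }
    public string StudentName { get; set; } = "";
    public int Mark { get; set; }
    public string City { get; set; } = "";
}
```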
We use LINQ query syntax to find all students with a Mark higher than 80:
var studentList = GetStudentsFromDb();
var highPerformingStudents = from student in studentList
                             where student.Mark > 80
                             select student;
The query syntax starts with a from clause. We can use any standard query operator to join, group, or filter the result. In this example, we use where as the standard query operator. The query syntax ends with either a select or a group clause.
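As a rough illustration of the two possible endings, the sketch below shows one query that ends with select and one that ends with group. To stay self-contained, it uses an array of anonymous objects as a stand-in for the Student data. The names topNames and byCity are illustrative.

```csharp
using System;
using System.Linq;

// Anonymous objects stand in for the article's Student type.
var studentList = new[]
{
    new { StudentName = "John Nigel", Mark = 73, City = "NYC" },
    new { StudentName = "Alex Roma", Mark = 51, City = "CA" },
    new { StudentName = "Noha Shamil", Mark = 88, City = "CA" },
    new { StudentName = "James Palatte", Mark = 60, City = "NYC" },
    new { StudentName = "Ron Jenova", Mark = 85, City = "NYC" }
};

// A query that ends with a select clause.
var topNames = from student in studentList
               where student.Mark > 80
               orderby student.Mark descending
               select student.StudentName;

// A query that ends with a group clause.
var byCity = from student in studentList
             group student by student.City;

Console.WriteLine(string.Join(", ", topNames)); // Noha Shamil, Ron Jenova
Console.WriteLine(byCity.Count());              // 2
```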
Method Syntax
Method syntax uses extension methods provided in the Enumerable and Queryable classes.
To see this syntax in action, let's create another query:
var highPerformingStudents = studentList.Where(s => s.Mark > 80);
In this example, we use the Where() extension method and provide a lambda expression s => s.Mark > 80 as an argument.
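Query syntax and method syntax are two ways of writing the same thing, because the compiler translates query expressions into method calls. A small sketch on a plain array of marks (a simplification of the Student data) shows that both forms produce the same sequence:

```csharp
using System;
using System.Linq;

var marks = new[] { 73, 51, 88, 60, 85 };

// Query syntax...
var highMarksQuery = from mark in marks
                     where mark > 80
                     select mark;

// ...and the equivalent method syntax.
var highMarksMethod = marks.Where(mark => mark > 80);

Console.WriteLine(highMarksQuery.SequenceEqual(highMarksMethod)); // True
```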
Lambda Expressions With LINQ
In LINQ, lambda expressions are used to define anonymous functions conveniently. They can be passed as variables or parameters to method calls. In many LINQ methods, lambda expressions are used as parameters, making the syntax short and precise. A lambda written inline exists only at the call site where it appears, so it cannot be reused elsewhere unless we assign it to a delegate variable such as a Func<T, TResult>.
To see a lambda expression in action, let's create a query:
var studentNames = studentList.Select(x => x.StudentName);
The expression x => x.StudentName is a lambda expression. Here, x is the input parameter of the anonymous function and represents each object inside the collection.
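When the same condition is needed in several places, one option is to store the lambda in a delegate variable. The sketch below assumes an in-memory array of marks, and the name isDistinction is made up for illustration.

```csharp
using System;
using System.Linq;

var marks = new[] { 73, 51, 88, 60, 85 };

// Store the lambda in a delegate variable so it can be reused.
Func<int, bool> isDistinction = mark => mark > 80;

var distinctions = marks.Where(isDistinction).ToArray();
bool anyDistinction = marks.Any(isDistinction);
int distinctionCount = marks.Count(isDistinction);

Console.WriteLine(string.Join(", ", distinctions)); // 88, 85
Console.WriteLine(anyDistinction);                  // True
Console.WriteLine(distinctionCount);                // 2
```

Note that this applies to in-memory sequences. The Queryable methods take an Expression<Func<...>> rather than a plain Func, so passing a Func variable to an IQueryable<T> source binds to the Enumerable overloads instead.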
Frequently Used LINQ Methods
Since we've already seen the Where method in action, let's look at other LINQ methods that are commonly used in everyday C# programming.
Sorting: OrderBy, OrderByDescending
We can use the OrderBy() method to sort a collection in ascending order based on the selected property:
var selectStudentsWithOrderById = studentList.OrderBy(x => x.StudentID);
Similarly, the OrderByDescending() method sorts the collection using the StudentID property in descending order:
var selectStudentsWithOrderByDescendingId = studentList.OrderByDescending(x => x.StudentID);
Projection: Select
We use the Select method to project each element of a sequence into a new form:
var studentsIdentified = studentList.Where(c => c.StudentName == name)
    .Select(stu => new Student { StudentName = stu.StudentName, Mark = stu.Mark });
Here, we keep only the students whose StudentName equals the value of the name variable and then use the Select method to return students with only the StudentName and Mark properties populated. This way, we can easily extract only the required information from our objects.
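Another common way to extract a subset of properties is to project into an anonymous type, which avoids creating partially populated Student objects. This is a sketch rather than the approach used above. It filters by City instead of name and uses anonymous objects in place of the Student data.

```csharp
using System;
using System.Linq;

// Anonymous objects stand in for the article's Student type.
var studentList = new[]
{
    new { StudentID = 1, StudentName = "John Nigel", Mark = 73, City = "NYC" },
    new { StudentID = 2, StudentName = "Alex Roma", Mark = 51, City = "CA" },
    new { StudentID = 3, StudentName = "Noha Shamil", Mark = 88, City = "CA" }
};

// Keep only the two properties we need; the result type is anonymous.
var nameAndMark = studentList
    .Where(s => s.City == "CA")
    .Select(s => new { s.StudentName, s.Mark })
    .ToList();

foreach (var item in nameAndMark)
{
    Console.WriteLine($"{item.StudentName}: {item.Mark}");
}
// Alex Roma: 51
// Noha Shamil: 88
```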
Grouping: GroupBy
We can use the GroupBy() method to group elements based on the specified key selector function. In this example, we use City:
var studentListGroupByCity = studentList.GroupBy(x => x.City);
All the previous methods (Where, OrderBy, OrderByDescending, Select, GroupBy) return sequences, and their execution is deferred. To use the data inside a sequence, we need to iterate over it.
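For a grouped result, iterating means walking over the groups and then over the elements of each group. The following sketch assumes the same five students, reduced to name and city and represented as anonymous objects. The names lines and cityGroup are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Anonymous objects stand in for the article's Student type.
var studentList = new[]
{
    new { StudentName = "John Nigel", City = "NYC" },
    new { StudentName = "Alex Roma", City = "CA" },
    new { StudentName = "Noha Shamil", City = "CA" },
    new { StudentName = "James Palatte", City = "NYC" },
    new { StudentName = "Ron Jenova", City = "NYC" }
};

var studentListGroupByCity = studentList.GroupBy(x => x.City);

// Each group exposes its Key and can itself be enumerated.
var lines = new List<string>();
foreach (var cityGroup in studentListGroupByCity)
{
    var names = string.Join(", ", cityGroup.Select(s => s.StudentName));
    lines.Add($"{cityGroup.Key}: {names}");
}

foreach (var line in lines)
{
    Console.WriteLine(line);
}
// NYC: John Nigel, James Palatte, Ron Jenova
// CA: Alex Roma, Noha Shamil
```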
All, Any, Contains
We can use All() to determine whether all elements of a sequence satisfy a condition:
var hasAllStudentsPassed = studentList.All(x => x.Mark > 50);
Similarly, we can use Any() to determine if any element of a sequence exists or satisfies a condition:
var hasAnyStudentGotDistinction = studentList.Any(x => x.Mark > 86);
The Contains() method determines whether a sequence or a collection contains a specified element:
var studentContainsId = studentList.Contains(new Student { StudentName = "Noha Shamil" }, new StudentNameComparer());
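StudentNameComparer is not defined in this article. One plausible implementation, shown here purely as an assumption, treats two students as equal when their names match exactly. The Student class below is a minimal stand-in, and "Maria Lopez" is a made-up name used to show a negative result.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var studentList = new[]
{
    new Student { StudentID = 3, StudentName = "Noha Shamil" },
    new Student { StudentID = 5, StudentName = "Ron Jenova" }
};

var comparer = new StudentNameComparer();
bool hasNoha = studentList.Contains(new Student { StudentName = "Noha Shamil" }, comparer);
bool hasMaria = studentList.Contains(new Student { StudentName = "Maria Lopez" }, comparer);

Console.WriteLine(hasNoha);  // True
Console.WriteLine(hasMaria); // False

// Minimal stand-in for the article's Student type.
public class Student
{
    public int StudentID { get; set; }
    public string StudentName { get; set; } = "";
}

// Hypothetical comparer: two students are equal when their names match.
public class StudentNameComparer : IEqualityComparer<Student>
{
    public bool Equals(Student? x, Student? y) =>
        string.Equals(x?.StudentName, y?.StudentName, StringComparison.Ordinal);

    public int GetHashCode(Student obj) =>
        StringComparer.Ordinal.GetHashCode(obj.StudentName);
}
```

Without a custom comparer, Contains() on a class type falls back to reference equality unless the class overrides Equals, so a freshly created Student would not be found.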
Partitioning: Skip, Take
Skip() bypasses a specified number of elements in a sequence and returns the remaining elements:
var skipStudentsUptoIndexTwo = studentList.Skip(2);
Take() returns a specified number of contiguous elements from the start of a sequence:
var takeStudentsUptoIndexTwo = studentList.Take(2);
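Skip() and Take() are often combined to page through results. A minimal sketch on an array of IDs follows. The pageSize and pageNumber variables are illustrative and not part of the example above.

```csharp
using System;
using System.Linq;

var studentIds = new[] { 1, 2, 3, 4, 5 };

var afterFirstTwo = studentIds.Skip(2); // 3, 4, 5
var firstTwo = studentIds.Take(2);      // 1, 2

// Paging: skip the earlier pages, then take one page.
int pageSize = 2;
int pageNumber = 2; // 1-based
var page = studentIds.Skip((pageNumber - 1) * pageSize).Take(pageSize);

Console.WriteLine(string.Join(", ", page)); // 3, 4
```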
Aggregation: Count, Max, Min, Sum, Average
Applying the Sum() method on the property Mark gives the sum of all marks:
var sumOfMarks = studentList.Sum(x => x.Mark);
We can use the Count() method to return the number of students with a score higher than 65:
var countOfStudents = studentList.Count(x => x.Mark > 65);
Max() returns the highest Mark scored by a student in the collection:
var maxMarks = studentList.Max(x => x.Mark);
Min() returns the lowest Mark scored by a student in the collection:
var minMarks = studentList.Min(x => x.Mark);
We can use Average() to compute the average of a sequence of numerical values:
var avgMarks = studentList.Average(x => x.Mark);
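To check these aggregates against the sample data, the sketch below applies them to the five marks from GetStudentsFromDb(), held in a plain int array for brevity. The values in the comments follow from those marks.

```csharp
using System;
using System.Linq;

// The Mark values of the five sample students.
var marks = new[] { 73, 51, 88, 60, 85 };

int sumOfMarks = marks.Sum();                 // 357
int countAbove65 = marks.Count(m => m > 65);  // 3 (73, 88, 85)
int maxMarks = marks.Max();                   // 88
int minMarks = marks.Min();                   // 51
double avgMarks = marks.Average();            // 71.4 (357 / 5)

Console.WriteLine($"{sumOfMarks} {countAbove65} {maxMarks} {minMarks}"); // 357 3 88 51
```

Average() on an int sequence returns a double, so the result is not truncated.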
Elements: First, FirstOrDefault, Single, SingleOrDefault
First() returns the first element in the sequence that satisfies the predicate function. However, if the input sequence is null, it throws ArgumentNullException, and if no element satisfies the condition, it throws InvalidOperationException:
var firstStudent = studentList.First(x => x.StudentID % 2 == 0);
FirstOrDefault() works like First() when a matching element exists. If no element is found, it returns null for reference types and the default value for value types:
var firstOrDefaultStudent = studentList.FirstOrDefault(x => x.StudentID == 1);
Single() returns the only element in the collection that satisfies the condition. Like First(), it throws ArgumentNullException if the source or predicate is null and InvalidOperationException if no element satisfies the condition. It also throws InvalidOperationException if more than one element satisfies the condition:
var singleStudent = studentList.Single(x => x.StudentID == 1);
SingleOrDefault() works like Single() when exactly one element matches. If no element meets our condition, the method returns null for reference types or the default value for value types. If more than one element matches, it still throws InvalidOperationException:
var singleOrDefaultStudent = studentList.SingleOrDefault(x => x.StudentID == 1);
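The differences between these four methods are easiest to see side by side. The following sketch uses an int array of marks rather than Student objects, so the default value is 0 instead of null. All variable names are illustrative.

```csharp
using System;
using System.Linq;

var marks = new[] { 73, 51, 88, 60, 85 };

int firstAbove80 = marks.First(m => m > 80);                // 88
int noneAbove95 = marks.FirstOrDefault(m => m > 95);        // 0, the default for int
int onlyAbove86 = marks.Single(m => m > 86);                // 88
int noneAbove95Single = marks.SingleOrDefault(m => m > 95); // 0

// First() throws when no element matches.
bool firstThrew = false;
try { marks.First(m => m > 95); }
catch (InvalidOperationException) { firstThrew = true; }

// SingleOrDefault() still throws when more than one element matches.
bool singleOrDefaultThrew = false;
try { marks.SingleOrDefault(m => m > 80); }
catch (InvalidOperationException) { singleOrDefaultThrew = true; }

Console.WriteLine($"{firstAbove80} {noneAbove95} {onlyAbove86} {noneAbove95Single}"); // 88 0 88 0
Console.WriteLine($"{firstThrew} {singleOrDefaultThrew}"); // True True
```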
Advantages and Disadvantages of Using LINQ
Advantages of using LINQ:
Improves code readability
Provides compile-time object type-checking
Offers IntelliSense support for generic collections
LINQ queries can be reused
Includes built-in methods to write less code and expedite development
Provides a common query syntax for various data sources
Disadvantages of using LINQ:
Difficult to write complex queries compared to SQL
Performance can degrade if queries are written carelessly
Requires recompilation and redeployment for every query change
Doesn't fully utilize SQL features such as cached execution plans for stored procedures