Simplifying Data Queries in .NET using LINQ

What is LINQ?

LINQ (Language-Integrated Query), introduced with .NET Framework 3.5, is a unified query syntax built into the language that lets you retrieve data from various sources, such as collections of objects, relational databases, ADO.NET datasets, and XML files.

Different Steps of a LINQ Query Operation

A LINQ query operation involves three main steps:

  • Obtain the data source

  • Create the query

  • Execute the query

Obtain the Data Source

A valid LINQ data source must support the IEnumerable<T> interface or an interface that inherits from it.

Let's define a simple data source:

var studentIds = new int[10] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

The studentIds array supports the IEnumerable<T> interface.

Types that support IEnumerable<T> or a derived interface (IQueryable<T>) are called queryable types. A queryable type can be directly used as a LINQ data source. If the data source is not in memory as a queryable type, we need to use LINQ providers to load it into a queryable form.

Create the Query

A query specifies the information we want to retrieve from the data source.

To create a query, we need to import LINQ into our code:

using System.Linq;

Now, let's define the query:

var studentsWithEvenIds = from studentId in studentIds
                          where (studentId % 2) == 0
                          select studentId;

Here, studentsWithEvenIds is a query variable of type IEnumerable<int>. It describes a query for all the even-numbered student IDs; it does not hold any results until the query is executed.

The query expression has three clauses: from, where, and select. The from clause describes the data source, the where clause applies filters, and the select clause shapes the data to produce the query result.

Execute the Query

There are two ways to execute a LINQ query:

  • Deferred Execution

  • Immediate Execution

Deferred Execution

Deferred execution means the actual execution of the query is delayed until we iterate over its results, for example with a foreach statement:

foreach (int studentId in studentsWithEvenIds)
{
    Console.WriteLine("Student Id {0} is even.", studentId);
}
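
One way to observe deferred execution is to change the data source after the query is defined but before it is enumerated. The following is a minimal sketch, assuming C# 9 or later for top-level statements; the variable names are illustrative.

```csharp
using System;
using System.Linq;

var studentIds = new int[] { 0, 1, 2, 3, 4 };

// Defining the query does not run it.
var evenIds = from id in studentIds
              where id % 2 == 0
              select id;

// Change the source after the query is defined but before it is enumerated.
studentIds[1] = 10;

// string.Join enumerates the query here, so it sees the updated element.
var result = string.Join(", ", evenIds);
Console.WriteLine(result); // 0, 10, 2, 4
```

Because the query runs only when it is enumerated, the replaced element 10 appears in the output.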

Immediate Execution

Immediate execution is the opposite of deferred execution: the query runs and produces its result as soon as the statement is reached. Operators that trigger it include aggregate functions such as Count, Average, Min, Max, and Sum; element operators such as First, Last, and Single; and conversion operators such as ToList, ToArray, and ToDictionary.
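
The difference can be sketched by comparing a ToList() snapshot with a query that is left deferred. This is an illustrative example (C# 9 or later, top-level statements), not code from the original article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var studentIds = new List<int> { 0, 1, 2, 3, 4 };

// ToList() runs the query now and copies the results.
List<int> snapshot = studentIds.Where(id => id % 2 == 0).ToList();

// Without a conversion operator, the query stays deferred.
IEnumerable<int> deferred = studentIds.Where(id => id % 2 == 0);

// Modify the source after both statements above.
studentIds.Add(6);

int snapshotCount = snapshot.Count;   // fixed when ToList() ran
int deferredCount = deferred.Count(); // Count() executes the query now

Console.WriteLine($"{snapshotCount} {deferredCount}"); // 3 4
```

The snapshot does not see the element added later, while the deferred query does.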

Basic Ways to Write LINQ Queries

There are two basic ways to write LINQ queries:

  • Query Syntax

  • Method Syntax

Query Syntax

To start with our example, we define a method that returns our data source:

static IQueryable<Student> GetStudentsFromDb()
{
    return new[]
    {
        new Student { StudentID = 1, StudentName = "John Nigel", Mark = 73, City = "NYC" },
        new Student { StudentID = 2, StudentName = "Alex Roma", Mark = 51, City = "CA" },
        new Student { StudentID = 3, StudentName = "Noha Shamil", Mark = 88, City = "CA" },
        new Student { StudentID = 4, StudentName = "James Palatte", Mark = 60, City = "NYC" },
        new Student { StudentID = 5, StudentName = "Ron Jenova", Mark = 85, City = "NYC" }
    }.AsQueryable();
}
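
The method above relies on a Student type that the article does not show. A minimal sketch of such a class might look like this; the property types are assumptions inferred from the initializers above.

```csharp
using System;

var student = new Student { StudentID = 1, StudentName = "John Nigel", Mark = 73, City = "NYC" };
Console.WriteLine($"{student.StudentName}: {student.Mark}"); // John Nigel: 73

// Hypothetical model class; property types are inferred from the sample data.
public class Student
{
    public int StudentID { get; set; }
    public string StudentName { get; set; } = string.Empty;
    public int Mark { get; set; }
    public string City { get; set; } = string.Empty;
}
```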

We use LINQ query syntax to find all students with a Mark higher than 80:

var studentList = GetStudentsFromDb();

var highPerformingStudents = from student in studentList
                             where student.Mark > 80
                             select student;

The query syntax starts with a from clause. We can use any standard query operator to join, group, or filter the result. In this example, we use where to filter. A query expression must end with either a select clause or a group clause.
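
As an illustration of a query expression that ends with a group clause instead of select, consider the sketch below. Anonymous objects stand in for the Student class so the snippet is self-contained (C# 9 or later), and the threshold of 55 is arbitrary.

```csharp
using System;
using System.Linq;

// Anonymous objects stand in for Student; the data mirrors the sample above.
var students = new[]
{
    new { StudentName = "John Nigel", Mark = 73, City = "NYC" },
    new { StudentName = "Alex Roma", Mark = 51, City = "CA" },
    new { StudentName = "Noha Shamil", Mark = 88, City = "CA" },
    new { StudentName = "James Palatte", Mark = 60, City = "NYC" },
    new { StudentName = "Ron Jenova", Mark = 85, City = "NYC" }
};

// The query ends with "group ... by ..." rather than "select".
var studentsByCity = from student in students
                     where student.Mark > 55
                     group student by student.City;

foreach (var cityGroup in studentsByCity)
{
    Console.WriteLine($"{cityGroup.Key}: {cityGroup.Count()}");
}
// NYC: 3
// CA: 1
```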

Method Syntax

Method syntax uses extension methods provided in the Enumerable and Queryable classes.

To see this syntax in action, let's create another query:

var highPerformingStudents = studentList.Where(s => s.Mark > 80);

In this example, we use the Where() extension method and provide a lambda expression s => s.Mark > 80 as an argument.
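
Query syntax and method syntax are two ways of writing the same query. The sketch below writes one filter both ways and checks that the results match; anonymous objects stand in for Student, and the snippet assumes C# 9 or later.

```csharp
using System;
using System.Linq;

// Anonymous objects stand in for the article's Student class.
var students = new[]
{
    new { StudentName = "John Nigel", Mark = 73 },
    new { StudentName = "Noha Shamil", Mark = 88 },
    new { StudentName = "Ron Jenova", Mark = 85 }
};

// Query syntax.
var viaQuery = from s in students
               where s.Mark > 80
               select s.StudentName;

// The same query in method syntax.
var viaMethods = students.Where(s => s.Mark > 80)
                         .Select(s => s.StudentName);

bool sameResults = viaQuery.SequenceEqual(viaMethods);
Console.WriteLine(sameResults); // True
```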

Lambda Expressions With LINQ

In LINQ, lambda expressions provide a concise way to define anonymous functions. They can be assigned to variables or passed as arguments to method calls. Many LINQ methods take lambda expressions as parameters, which keeps the syntax short and precise. A lambda written inline exists only at the point where it is used; to reuse it elsewhere, assign it to a delegate variable such as Func<Student, bool>.

To see a lambda expression in action, let's create a query:

var studentNames = studentList.Select(x => x.StudentName);

The expression x => x.StudentName is a lambda expression. x is an input parameter to the anonymous function representing each object inside the collection.
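
An inline lambda is not reusable by itself, but storing it in a delegate variable gives it a name that can be passed to several calls. A small sketch, using a plain array of marks and a hypothetical isHighMark variable:

```csharp
using System;
using System.Linq;

var marks = new[] { 73, 51, 88, 60, 85 };

// Storing the lambda in a Func<int, bool> lets us pass it to several methods.
Func<int, bool> isHighMark = mark => mark > 80;

int highCount = marks.Count(isHighMark); // how many marks are above 80
int firstHigh = marks.First(isHighMark); // first mark above 80

Console.WriteLine($"{highCount} {firstHigh}"); // 2 88
```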

Frequently Used LINQ Methods

Since we've already seen the Where method in action, let's look at other top LINQ methods used in everyday C# programming.

Sorting: OrderBy, OrderByDescending

We can use the OrderBy() method to sort a collection in ascending order based on the selected property:

var selectStudentsWithOrderById = studentList.OrderBy(x => x.StudentID);

Similarly, the OrderByDescending() method sorts the collection using the StudentID property in descending order:

var selectStudentsWithOrderByDescendingId = studentList.OrderByDescending(x => x.StudentID);
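
Sorting produces a new sequence and leaves the source untouched. The following sketch illustrates this with anonymous objects in place of Student (C# 9 or later); sorting by Mark instead of StudentID is only for illustration.

```csharp
using System;
using System.Linq;

var students = new[]
{
    new { StudentID = 2, Mark = 51 },
    new { StudentID = 3, Mark = 88 },
    new { StudentID = 1, Mark = 73 }
};

// OrderByDescending returns a new sorted sequence; ToList() materializes it.
var byMarkDescending = students.OrderByDescending(s => s.Mark).ToList();

int topStudentId = byMarkDescending[0].StudentID; // highest mark
int firstInSource = students[0].StudentID;        // source order is unchanged

Console.WriteLine($"{topStudentId} {firstInSource}"); // 3 2
```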

Projection: Select

We use the Select method to project each element of a sequence into a new form:

var name = "Noha Shamil"; // example name to search for
var studentsIdentified = studentList.Where(c => c.StudentName == name)
                                    .Select(stu => new Student { StudentName = stu.StudentName, Mark = stu.Mark });

Here, we filter only the students with the required name and then use the Select method to return students with only StudentName and Mark properties populated. This way, we can easily extract only the required information from our objects.
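
Select can also project into an anonymous type, which avoids creating partially populated Student objects. This is an alternative sketch rather than the approach used above; the sample objects and the name value are illustrative.

```csharp
using System;
using System.Linq;

var students = new[]
{
    new { StudentID = 1, StudentName = "John Nigel", Mark = 73, City = "NYC" },
    new { StudentID = 3, StudentName = "Noha Shamil", Mark = 88, City = "CA" }
};

string name = "Noha Shamil"; // example value

// Project into an anonymous type holding just the two fields we need.
var summaries = students.Where(s => s.StudentName == name)
                        .Select(s => new { s.StudentName, s.Mark })
                        .ToList();

Console.WriteLine($"{summaries[0].StudentName}: {summaries[0].Mark}"); // Noha Shamil: 88
```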

Grouping: GroupBy

We can use the GroupBy() method to group elements based on the specified key selector function. In this example, we use City:

var studentListGroupByCity = studentList.GroupBy(x => x.City);

All the previous methods (Where, OrderBy, OrderByDescending, Select, GroupBy) return sequences and use deferred execution. To use the data inside the result, we need to iterate over it.
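
For a GroupBy result, iteration happens at two levels: over the groups, and over the elements within each group. A minimal sketch, with anonymous objects standing in for Student:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var students = new[]
{
    new { StudentName = "John Nigel", City = "NYC" },
    new { StudentName = "Alex Roma", City = "CA" },
    new { StudentName = "Ron Jenova", City = "NYC" }
};

var studentsByCity = students.GroupBy(s => s.City);

var lines = new List<string>();
foreach (var cityGroup in studentsByCity)
{
    // Each group exposes its key and is itself a sequence of students.
    var names = string.Join(", ", cityGroup.Select(s => s.StudentName));
    lines.Add($"{cityGroup.Key}: {names}");
}

foreach (var line in lines)
{
    Console.WriteLine(line);
}
// NYC: John Nigel, Ron Jenova
// CA: Alex Roma
```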

All, Any, Contains

We can use All() to determine whether all elements of a sequence satisfy a condition:

var hasAllStudentsPassed = studentList.All(x => x.Mark > 50);

Similarly, we can use Any() to determine if any element of a sequence exists or satisfies a condition:

var hasAnyStudentGotDistinction = studentList.Any(x => x.Mark > 86);

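The Contains() method determines whether a sequence or a collection contains a specified element. Here, a custom equality comparer, StudentNameComparer (not shown), is assumed to compare students by name:

var containsStudentWithName = studentList.Contains(new Student { StudentName = "Noha Shamil" }, new StudentNameComparer());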
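
The article does not define StudentNameComparer. One plausible sketch, assuming it compares students by StudentName only, implements IEqualityComparer<Student>; the trimmed Student class here is likewise hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var students = new[]
{
    new Student { StudentID = 3, StudentName = "Noha Shamil" },
    new Student { StudentID = 5, StudentName = "Ron Jenova" }
};

var comparer = new StudentNameComparer();
bool found = students.Contains(new Student { StudentName = "Noha Shamil" }, comparer);
bool missing = students.Contains(new Student { StudentName = "Nobody" }, comparer);

Console.WriteLine($"{found} {missing}"); // True False

// Hypothetical, trimmed-down model class.
public class Student
{
    public int StudentID { get; set; }
    public string StudentName { get; set; } = string.Empty;
}

// Assumed behavior: two students are equal when their names match exactly.
public class StudentNameComparer : IEqualityComparer<Student>
{
    public bool Equals(Student x, Student y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.StudentName, y.StudentName, StringComparison.Ordinal);
    }

    // The hash code must be consistent with Equals, so it uses the name only.
    public int GetHashCode(Student obj) => obj.StudentName?.GetHashCode() ?? 0;
}
```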

Partitioning: Skip, Take

Skip() bypasses a specified number of elements at the start of a sequence and returns the remaining elements:

var studentsAfterFirstTwo = studentList.Skip(2);

Take() returns a specified number of elements from the start of a sequence:

var firstTwoStudents = studentList.Take(2);
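
Skip() and Take() are often combined to page through a sequence. The sketch below uses a plain range of IDs; pageSize and pageNumber are illustrative names, not part of the original example.

```csharp
using System;
using System.Linq;

var studentIds = Enumerable.Range(1, 10); // 1 through 10

int pageSize = 3;
int pageNumber = 2; // 1-based page index

// Skip the elements on earlier pages, then take one page's worth.
var page = studentIds.Skip((pageNumber - 1) * pageSize)
                     .Take(pageSize)
                     .ToList();

Console.WriteLine(string.Join(", ", page)); // 4, 5, 6
```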

Aggregation: Count, Max, Min, Sum, Average

Applying the Sum() method on the property Mark will give the summation of all marks:

var sumOfMarks = studentList.Sum(x => x.Mark);

We can use the Count() method to return the number of students with a score higher than 65:

var countOfStudents = studentList.Count(x => x.Mark > 65);

Max() returns the highest Mark scored by a student in the collection:

var maxMarks = studentList.Max(x => x.Mark);

Min() returns the lowest Mark scored by a student in the collection:

var minMarks = studentList.Min(x => x.Mark);

We can use Average() to compute the average of a sequence of numerical values:

var avgMarks = studentList.Average(x => x.Mark);
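
To make the results concrete, here are the same aggregates applied to the five marks from the sample data, using a plain int array instead of Student objects. Note that Average() on integers returns a double.

```csharp
using System;
using System.Linq;

// Marks taken from the sample data in this article.
var marks = new[] { 73, 51, 88, 60, 85 };

int sum = marks.Sum();                        // 357
int above65 = marks.Count(mark => mark > 65); // 3
int max = marks.Max();                        // 88
int min = marks.Min();                        // 51
double average = marks.Average();             // 71.4

Console.WriteLine($"{sum} {above65} {max} {min}"); // 357 3 88 51
```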

Elements: First, FirstOrDefault, Single, SingleOrDefault

First() returns the first element in the sequence that satisfies the predicate. If the source sequence or the predicate is null, it throws ArgumentNullException; if no element satisfies the condition, it throws InvalidOperationException:

var firstStudent = studentList.First(x => x.StudentID % 2 == 0);

FirstOrDefault() works like First() when a matching element exists. If no element is found, it returns null for reference types and the default value for value types:

var firstOrDefaultStudent = studentList.FirstOrDefault(x => x.StudentID == 1);

Single() returns the only element in the collection that satisfies the condition. Like First(), it throws ArgumentNullException if the source or predicate is null and InvalidOperationException if no element satisfies the condition. It also throws InvalidOperationException if more than one element satisfies the condition:

var singleStudent = studentList.Single(x => x.StudentID == 1);

SingleOrDefault() works like Single() when exactly one element matches. If no element meets the condition, it returns null for reference types or the default value for value types. If more than one element matches, it still throws InvalidOperationException:

var singleOrDefaultStudent = studentList.SingleOrDefault(x => x.StudentID == 1);
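
The differences between these operators show up in the edge cases: no match, and more than one match. A short sketch over a plain int array (C# 9 or later) illustrates both:

```csharp
using System;
using System.Linq;

var ids = new[] { 1, 2, 3, 4 };

int firstEven = ids.First(id => id % 2 == 0);    // 2
int noMatch = ids.FirstOrDefault(id => id > 10); // 0, the default for int
int single = ids.Single(id => id == 3);          // 3

string outcome;
try
{
    // Two elements (2 and 4) match, so SingleOrDefault throws.
    ids.SingleOrDefault(id => id % 2 == 0);
    outcome = "no exception";
}
catch (InvalidOperationException)
{
    outcome = "InvalidOperationException";
}

Console.WriteLine($"{firstEven} {noMatch} {single} {outcome}");
// 2 0 3 InvalidOperationException
```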

Advantages and Disadvantages of Using LINQ

Advantages of using LINQ:

  • Improves code readability

  • Provides compile-time object type-checking

  • Offers IntelliSense support for generic collections

  • LINQ queries can be reused

  • Includes built-in methods to write less code and expedite development

  • Provides a common query syntax for various data sources

Disadvantages of using LINQ:

  • Complex queries can be harder to write than in SQL

  • Performance can degrade if queries are not written carefully

  • Requires recompilation and redeployment for every query change

  • Doesn't fully utilize SQL features such as cached execution plans for stored procedures