C# Concepts for creating Data
Structures
The most
common and likely well-known data structure is the array, which contains a
contiguous collection of data items that can be accessed by an ordinal index.
Before
jumping into the content for this article, let's first take a quick peek at the
roadmap for this six-part article series, so that you can see what lies ahead.
If there are any topics you think are missing from this outline, I invite you
to e-mail me at mitchell@4guysfromrolla.com and share your thoughts. Space
permitting, I'll be happy to add your suggestions to the appropriate
installment or, if needed, add a seventh part to the series.
In this
first part of the six-part series, we'll look at why data structures are
important, and their effect on the performance of an algorithm. To determine a
data structure's effect on performance, we'll need to examine how the various
operations performed by a data structure can be rigorously analyzed. Finally, we'll
turn our attention to two data structures present in the .NET Framework—the
Array and ArrayList. Chances are you've used both of these data structures in
past projects. In this article, we'll examine what operations they provide and
the efficiency of these operations.
In Part
2, we'll explore the ArrayList class in more detail and examine its
counterparts, the Queue class and Stack class. Like the ArrayList, both the
Queue and Stack classes store a contiguous collection of data and are data
structures available in the .NET Framework Base Class Library. However, unlike
an ArrayList from which you can retrieve any data item, Queues and Stacks only
allow data to be accessed in a predetermined sequential order. We'll examine
some applications of Queues and Stacks, and see how to implement both of these
classes by extending the ArrayList class. After examining Queues and Stacks,
we'll look at HashTables, which allow for direct access like an ArrayList, but
store data indexed by a string key.
While
ArrayLists are ideal for directly accessing and storing contents, they are
suboptimal candidates when the data needs to be searched. In Part 3, we'll
examine the binary search tree data structure, which provides a much more
efficient means for searching than the ArrayList. The .NET Framework does not
include any built-in binary search tree data structures, so we will have to
build our own.
The
efficiency of searching a binary search trees is sensitive to the order with
which the data was inserted into the tree. If the data was inserted in sorted
or near-sorted order, the binary search tree loses virtually all of its
efficiency advantages over the ArrayList. To combat this issue, in Part 4 we'll
examine an interesting randomized data structure—the SkipList. SkipLists
provide the efficiency of searching a binary search tree, but without the
sensitivity to the order with which data is entered.
In Part 5
we'll turn our attention to data structures that can be used to represent
graphs. A graph is a collection of nodes, with a set of edges connecting the
various nodes. For example, a map can be visualized as a graph, with cities as
nodes and the highways between them as edged between the nodes. Many real-world
problems can be abstractly defined in terms of graphs, thereby making graphs an
often-used data structure.
Finally,
in Part 6 we'll look at data structures to represent sets and disjoint sets. A
set is an unordered collection of items. Disjoint sets are a collection of sets
that have no elements in common with one another. Both sets and disjoint sets
have many uses in everyday programs, which we'll examine in detail in this
final part.
Analyzing
the Performance of Data Structures
When
thinking about a particular application or programming problem, many developers
(myself included) find themselves most interested in writing the algorithm to
tackle the problem at hand, or adding cool features to the application to
enhance the user's experience. Rarely, if ever, will you hear someone excited
about what type of data structure they are using. However, the data structures
used for a particular algorithm can greatly impact its performance. A common
example is finding an element in a data structure. With an array, this process
takes time proportional to the number of elements in the array. With binary
search trees or SkipLists, the time required is sub-linear. When searching
large amounts of data, the data structure chosen can make a difference in the
application's performance that can be visibly measured in seconds or even minutes.
Since the
data structure used by an algorithm can greatly affect the algorithm's
performance, it is important that there exists a rigorous method by which to
compare the efficiency of various data structures. What we, as developers
utilizing a data structure, are primarily interested in is how the data
structures performance changes as the amount of data stored increases. That is,
for each new element stored by the data structure, how are the running times of
the data structure's operations effected?
Consider
a scenario in which you have a program that uses the
System.IO.Directory.GetFiles(path) method to return the list of the files in a
specified directory as a string array. Now, imagine that you wanted to search
through the array to determine if an XML file existed in the list of files
(namely one whose extension was .xml). One approach to do this would be to scan
through the array and set some flag once an XML file was encountered. The code
might look like so:
using
System;
using
System.Collections; using System.IO;
public
class MyClass
{
public
static void Main()
{
string []
fs = Directory.GetFiles(@"C:\Inetpub\wwwroot"); bool foundXML =
false;
int i =
0;
for (i =
0; i < fs.Length; i++)
if
(String.Compare(Path.GetExtension(fs[i]), ".xml", true) == 0)
{
foundXML
= true; break;
}
if
(foundXML)
Console.WriteLine("XML file found - " +
fs[i]); else
Console.WriteLine("No
XML files found.");
}
Related Topics
Privacy Policy, Terms and Conditions, DMCA Policy and Compliant
Copyright © 2018-2024 BrainKart.com; All Rights Reserved. Developed by Therithal info, Chennai.