Sunday, 23 September 2012

XDocument in Silverlight

Data often arrives in XML format, so we have to serialize and deserialize it in order to use it. There are several ways of doing that, for example DOM, XQuery and XSLT. DOM is the oldest of the three, but it can still do the job. XQuery and XSLT are not very easy to use and require some time to master. .NET 3.5 brought a big improvement to the programming model with LINQ (Language-Integrated Query), which can be used with objects, databases and XML.

Introduction

LINQ to XML allows us to create, read and edit XML files and, more importantly, it does so in a very easy and understandable way. To use LINQ to XML you should add a reference to System.Xml.Linq.dll, then import the System.Xml.Linq namespace (for XDocument and the other classes) and the System.Linq namespace (for the LINQ query operators).

Important methods of XElement and XDocument

  • Add(object content) – adds the new content as the last child of the element
  • Remove() – removes this element from its parent
  • Descendants( XName name ) – returns a collection of all descendants of this element whose names match the argument, in document order
  • Element( XName name ) – returns the first child element that has a matching name, or null if there is none
  • Elements( XName name ) – returns a collection of all child elements of this element whose names match the argument, in document order
  • Nodes() – returns a collection of all child nodes of the current element (elements, text, comments) in document order
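
To get a feel for these methods, here is a minimal sketch that runs each of them against a tiny in-memory tree. The class name XElementMethodsDemo and the extra first names are made up for illustration; only the LINQ to XML calls themselves come from the library.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XElementMethodsDemo
{
    // Builds a small tree, then changes it with Add and Remove
    public static XElement BuildSample()
    {
        XElement root = new XElement("persons",
            new XElement("person", new XElement("firstName", "Martin")),
            new XElement("person", new XElement("firstName", "Ivan")));

        // Add: appends the new element after the existing children
        root.Add(new XElement("person", new XElement("firstName", "Maria")));

        // Remove: detaches the first <person> (Martin) from its parent
        root.Element("person").Remove();
        return root;
    }

    public static void Main()
    {
        XElement root = BuildSample();

        // Element: the first direct child with a matching name
        Console.WriteLine(root.Element("person").Element("firstName").Value);

        // Elements: direct children only
        Console.WriteLine(root.Elements("person").Count());

        // Descendants: matching elements at any depth below root
        Console.WriteLine(root.Descendants("firstName").Count());

        // Nodes: all child nodes, whatever their type
        Console.WriteLine(root.Nodes().Count());
    }
}
```

Note that Elements( "firstName" ) called on the root would find nothing here, because the firstName elements are grandchildren, while Descendants( "firstName" ) finds both of them.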

Creating an XML file

So let’s have a simple class Person:
 public class Person
 {
     public string FirstName { get; set; }
     public string LastName { get; set; }
     public Location Address { get; set; }
 }
 
 public class Location
 {
     public string Country { get; set; }
     public string City { get; set; }
 }
And we create an object of type Person:
 Person p1 = new Person()
 {
     FirstName = "Martin",
     LastName = "Mihaylov",
     Address = new Location()
     {
         City = "Sofia",
         Country = "Bulgaria"
     }
 };
Now let’s try to create XML from it. For that purpose we use XElement and XAttribute objects:
 XElement persons =
         new XElement( "persons",
             new XElement( "person",
                 new XElement( "firstName", p1.FirstName ),
                 new XElement( "lastName", p1.LastName ),
                 new XElement( "address",
                     new XAttribute( "city", p1.Address.City ),
                     new XAttribute( "country", p1.Address.Country ) ) ) );
We simply create an element “persons” using the XElement object and then nest other elements in it. We can also add attributes to the elements thanks to the XAttribute object. In addition, we can use the XDeclaration object to give the document an XML declaration and XComment to add a comment to it. Here it is:
 XDocument myXml = new XDocument( new XDeclaration( "1.0", "utf-8", "yes" ),
                 new XComment( "A Comment in the XML" ), persons );
So the final output, when the document is saved with its Save method, should look like this (ToString() returns the same markup without the XML declaration):
 <?xml version="1.0" encoding="utf-8" standalone="yes"?>
 <!--A Comment in the XML-->
 <persons>
     <person>
         <firstName>Martin</firstName>
         <lastName>Mihaylov</lastName>
         <address city="Sofia" country="Bulgaria" />
     </person>
 </persons>
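
If you want to see the markup yourself, here is a small sketch of two ways to turn an XDocument into text. The class and method names (SerializeDemo, MarkupOnly, WithDeclaration) are made up for this example, and the document is a trimmed-down version of the one built above.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class SerializeDemo
{
    // Rebuilds a shortened version of the document from the article
    public static XDocument Build()
    {
        return new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XComment("A Comment in the XML"),
            new XElement("persons",
                new XElement("person",
                    new XElement("firstName", "Martin"))));
    }

    // ToString() returns the comment and the elements, but no XML declaration
    public static string MarkupOnly(XDocument doc)
    {
        return doc.ToString();
    }

    // Save() writes the declaration too; here the target is an in-memory writer
    public static string WithDeclaration(XDocument doc)
    {
        using (StringWriter writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        XDocument doc = Build();
        Console.WriteLine(MarkupOnly(doc));
        Console.WriteLine();
        Console.WriteLine(WithDeclaration(doc));
    }
}
```

One thing to watch for: as far as I can tell, the declaration written by Save reports the encoding of the writer it is given, so with a StringWriter it may not say utf-8 even though the XDeclaration asked for it.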

 

Adding an element to the XDocument

First we find the element we want to add something to and then we use its Add method to add our new element. Here is an example, where p2 is a second Person object created the same way as p1:
 myXml.Element( "persons" ).Add( new XElement( "person",
                         new XElement( "firstName", p2.FirstName ),
                         new XElement( "lastName", p2.LastName ),
                         new XElement( "address",
                             new XAttribute( "city", p2.Address.City ),
                             new XAttribute( "country", p2.Address.Country ) ) ) );
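
When there are several Person objects to add, the Add method can also take a whole sequence of elements at once. The sketch below assumes the Person and Location classes from earlier (repeated here so the example compiles on its own); the AddManyDemo class, its method names and the two sample people are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Location
{
    public string Country { get; set; }
    public string City { get; set; }
}

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public Location Address { get; set; }
}

public static class AddManyDemo
{
    // Converts one Person into a <person> element shaped like the ones above
    public static XElement ToElement(Person p)
    {
        return new XElement("person",
            new XElement("firstName", p.FirstName),
            new XElement("lastName", p.LastName),
            new XElement("address",
                new XAttribute("city", p.Address.City),
                new XAttribute("country", p.Address.Country)));
    }

    // Add() enumerates the sequence and adds each element as a separate child
    public static void AddAll(XDocument doc, IEnumerable<Person> people)
    {
        doc.Element("persons").Add(people.Select(p => ToElement(p)));
    }

    public static void Main()
    {
        XDocument doc = new XDocument(new XElement("persons"));
        // Hypothetical sample data
        List<Person> people = new List<Person>
        {
            new Person { FirstName = "Ivan", LastName = "Petrov",
                Address = new Location { City = "Plovdiv", Country = "Bulgaria" } },
            new Person { FirstName = "Anna", LastName = "Smith",
                Address = new Location { City = "London", Country = "UK" } }
        };
        AddAll(doc, people);
        Console.WriteLine(doc);
    }
}
```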

 

Removing an element from the XDocument

To remove an element or attribute you must navigate to the desired element and then call its Remove method:
 myXml.Element( "persons" ).Element( "person" ).Remove();
This will remove the first element with name “person” in “persons”.
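
Often you don't want the first element but a particular one. Here is a sketch of how that could look: filter the elements with a LINQ Where and call Remove on the result, and remove an attribute by navigating to it first. The RemoveDemo class, its method names and the second sample person are made up for this example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class RemoveDemo
{
    // Sample data in the same shape as the article's XML
    public static XElement Sample()
    {
        return XElement.Parse(
            "<persons>" +
            "<person><firstName>Martin</firstName>" +
            "<address city='Sofia' country='Bulgaria' /></person>" +
            "<person><firstName>Ivan</firstName>" +
            "<address city='Plovdiv' country='Bulgaria' /></person>" +
            "</persons>");
    }

    // Removes every <person> whose <firstName> matches the argument
    public static void RemoveByFirstName(XElement persons, string firstName)
    {
        persons.Elements("person")
               .Where(p => (string)p.Element("firstName") == firstName)
               .Remove();
    }

    // Removes the city attribute of a <person>'s <address>, if both exist
    public static void RemoveCity(XElement person)
    {
        XElement address = person.Element("address");
        XAttribute city = address == null ? null : address.Attribute("city");
        if (city != null)
            city.Remove();
    }

    public static void Main()
    {
        XElement persons = Sample();
        RemoveByFirstName(persons, "Martin");
        RemoveCity(persons.Element("person"));
        Console.WriteLine(persons);
    }
}
```

The Remove call on the filtered sequence is an extension method from System.Xml.Linq, so it is safe to use even though the elements are being removed from the tree that the query reads.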

Reading an XML file

Before reading you should load your XML into an XElement or XDocument object. This can be done with the Load method, which accepts a file name or URI string, a TextReader or an XmlReader. To read XML that is already held in a string, use the Parse method instead. Here is an example:
 XDocument myXML = XDocument.Load( "MyXML.xml" );
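
Here is a short sketch of the other ways to get XML into an XDocument without touching the file system. The LoadDemo class and its method names are made up for this example; the XML string is a cut-down version of the one from this article.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class LoadDemo
{
    public const string Xml =
        "<persons><person><firstName>Martin</firstName></person></persons>";

    // Parse() reads XML markup that is already in a string
    public static XDocument FromString()
    {
        return XDocument.Parse(Xml);
    }

    // Load() can read from any TextReader
    public static XDocument FromTextReader()
    {
        using (TextReader reader = new StringReader(Xml))
        {
            return XDocument.Load(reader);
        }
    }

    // Load() can also read from an XmlReader
    public static XDocument FromXmlReader()
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(Xml)))
        {
            return XDocument.Load(reader);
        }
    }

    public static void Main()
    {
        Console.WriteLine(FromString().Root.Name);
        Console.WriteLine(FromTextReader().Root.Name);
        Console.WriteLine(FromXmlReader().Root.Name);
    }
}
```

The easy mistake here is passing XML markup to Load( string ): that overload treats the string as a file name or URI, not as content.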
Now let’s try to read the contents of an XML document. For this example we use the XML document we created at the beginning of the article. Thanks to LINQ we can use the standard query keywords: from, in, where, select. That makes it fairly easy to extract the information you need from an XML file:
 List<Person> personsList =
             ( from person in myXml.Descendants( "person" )
               where ( string )person.Element( "address" ).Attribute( "country" ) == "Bulgaria"
              select new Person()
                  {
                      FirstName = person.Element( "firstName" ).Value,
                      LastName = person.Element( "lastName" ).Value,
                      Address = new Location()
                      {
                           City = person.Element( "address" ).Attribute( "city" ).Value,
                           Country = person.Element( "address" ).Attribute( "country" ).Value
                       }
                   } ).ToList();
The Descendants method returns all descendant elements that have the name “person” (in our case). Then, for each one that has an "address" element with its "country" attribute set to "Bulgaria", we create a new object of type Person and set its properties. The output is a list of Person objects.
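
Real XML files are not always complete, and calling .Value on a missing element or attribute throws a NullReferenceException. Here is a sketch of a more forgiving query that relies on the explicit string casts of XElement and XAttribute, which return null when the node is missing. The SafeQueryDemo class and the two incomplete sample persons are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SafeQueryDemo
{
    // The second person has no <address>, the third has no country attribute
    public const string Xml =
        "<persons>" +
        "<person><firstName>Martin</firstName>" +
        "<address city='Sofia' country='Bulgaria' /></person>" +
        "<person><firstName>Anna</firstName></person>" +
        "<person><firstName>John</firstName>" +
        "<address city='London' /></person>" +
        "</persons>";

    // Returns the first names of all persons whose address is in the country
    public static List<string> FirstNamesIn(string country)
    {
        XDocument doc = XDocument.Parse(Xml);
        return (from person in doc.Descendants("person")
                let address = person.Element("address")
                where address != null
                      // the cast gives null (not an exception) if the attribute is missing
                      && (string)address.Attribute("country") == country
                select (string)person.Element("firstName")).ToList();
    }

    public static void Main()
    {
        foreach (string name in FirstNamesIn("Bulgaria"))
            Console.WriteLine(name);
    }
}
```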

Query your data

Notice how the query above uses the XDocument.Descendants() method. That method walks through all of the elements nested inside the XDocument at any depth (the descendants) and returns them in document order. When you pass it a name, it returns only the elements with that name.
One important thing to keep in mind about XDocument.Descendants() is that it uses deferred execution. That means it returns a sequence (an IEnumerable<XElement>, to be specific), but it doesn't actually descend through the XML document and find the descendants until that sequence is enumerated. If you use a foreach loop to iterate through the descendants, each iteration only reads as far as the next descendant.
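
Deferred execution is easier to believe when you see it. In this small sketch the tree is changed after the query has been created, and the query still picks up the change, while a ToList() snapshot does not. The DeferredDemo class and its Run method are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DeferredDemo
{
    // Returns three counts: the query after one Add, the snapshot, the query after two Adds
    public static int[] Run()
    {
        XElement root = XElement.Parse("<persons><person/><person/></persons>");

        // Nothing is searched yet; this only describes the query
        IEnumerable<XElement> query = root.Descendants("person");

        root.Add(new XElement("person"));
        int afterFirstAdd = query.Count();       // the search happens here

        // ToList() runs the query once and keeps the results
        List<XElement> snapshot = query.ToList();

        root.Add(new XElement("person"));
        int afterSecondAdd = query.Count();      // the search runs again

        return new[] { afterFirstAdd, snapshot.Count, afterSecondAdd };
    }

    public static void Main()
    {
        // The snapshot keeps its old count; the query sees the new element
        Console.WriteLine(String.Join(", ", Run()));
    }
}
```

So if you need a stable set of results, or you plan to loop over them more than once, calling ToList() as in the earlier Person query is a reasonable choice.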

Using LINQ to read XML data from an RSS feed

You can do some pretty powerful things with LINQ to XML, because so much data is stored and transmitted as XML. Like RSS feeds, for example! Open up any RSS feed - like this one from our blog, Building Better Software - and view its source, and you'll see XML data. And that means you can read it into an XDocument and query it with LINQ.
One nice thing about the XDocument.Load() method is that when you pass it a string, you're giving it a URI. A lot of the time, you'll just pass it a simple filename. But a URL will work equally well. Here's how you can read the title of a blog from its RSS feed, using the <rss>, <channel>, and <title> tags:
XDocument ourBlog = XDocument.Load("http://www.stellman-greene.com/feed");
Console.WriteLine(ourBlog.Element("rss").Element("channel").Element("title").Value);
That means it's easy to write a LINQ to XML query to read data from an RSS feed. Here's how we'll do it:
1. Create a new console application
2. Make sure you've got using System.Xml.Linq; at the top of the code
3. We'll use XDocument.Load() to load the XML data from the URL.
4. A simple LINQ query can extract the articles into instances of a Post class that we'll create
5. Instead of using anonymous types, the select new clause will select new Post objects
When you use the XDocument.Element() method, you're really calling the Element() method of its base class, XContainer. The XElement class that we used earlier also extends XContainer, and the Element() method returns an XElement, so its result can be used anywhere an XContainer is expected.
We'll take advantage of that by creating a Post class with a constructor that takes an XContainer object and uses its Element() method to get values. Note its GetElementValue() method that either returns an element's Value or, if that element doesn't exist, returns an empty string. (Again, remember to add using System.Xml.Linq; to the top of the code, for both this and the Main() method below!)
using System;
using System.Xml.Linq;

class Post
{
    public string Title { get; private set; }
    public DateTime? Date { get; private set; }
    public string Url { get; private set; }
    public string Description { get; private set; }
    public string Creator { get; private set; }
    public string Content { get; private set; }

    // Returns the value of the named child element, or an empty string
    // if the container or the element doesn't exist
    private static string GetElementValue(XContainer element, string name)
    {
        if ((element == null) || (element.Element(name) == null))
            return String.Empty;
        return element.Element(name).Value;
    }

    public Post(XContainer post)
    {
        // Get the string properties from the post's element values
        Title = GetElementValue(post, "title");
        Url = GetElementValue(post, "guid");
        Description = GetElementValue(post, "description");
        Creator = GetElementValue(post,
            "{http://purl.org/dc/elements/1.1/}creator");
        Content = GetElementValue(post,
            "{http://purl.org/rss/1.0/modules/content/}encoded");

        // The Date property is a nullable DateTime. If the pubDate element
        // can't be parsed into a valid date, the Date property stays null
        DateTime result;
        if (DateTime.TryParse(GetElementValue(post, "pubDate"), out result))
            Date = result;
    }

    public override string ToString()
    {
        // GetElementValue() returns an empty string rather than null,
        // so check for empty values when choosing the fallback text
        return String.Format("{0} by {1}",
            String.IsNullOrEmpty(Title) ? "no title" : Title,
            String.IsNullOrEmpty(Creator) ? "Unknown" : Creator);
    }
}
Did you notice how the Post constructor uses "{http://purl.org/dc/elements/1.1/}creator" as the name for creator? If you go back to the RSS feed source and search for "creator", you'll find a tag that looks like this:
<dc:creator>Andrew Stellman</dc:creator>
See that "dc:"? At the top of the feed, the root <rss> tag has this attribute:
xmlns:dc="http://purl.org/dc/elements/1.1/"
That's an XML namespace. Put them together and you'll get the element's complete name:
{http://purl.org/dc/elements/1.1/}creator
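If typing the namespace inside curly braces feels clumsy, LINQ to XML also has an XNamespace class that builds the same name with the + operator. Here is a small sketch that runs against a hand-written feed fragment instead of the live blog; the NamespaceDemo class, its method names and the fragment itself are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceDemo
{
    // A tiny stand-in for an RSS feed, with the dc prefix declared on <rss>
    public const string Feed =
        "<rss xmlns:dc='http://purl.org/dc/elements/1.1/'>" +
        "<channel><item><title>Hello</title>" +
        "<dc:creator>Andrew Stellman</dc:creator>" +
        "</item></channel></rss>";

    // Finds the first <dc:creator> using an XNamespace plus the local name
    public static string FirstCreator()
    {
        XNamespace dc = "http://purl.org/dc/elements/1.1/";
        XDocument doc = XDocument.Parse(Feed);
        return (string)doc.Descendants(dc + "creator").FirstOrDefault();
    }

    // Both ways of writing the name produce the same XName
    public static bool NamesMatch()
    {
        XNamespace dc = "http://purl.org/dc/elements/1.1/";
        XName expanded = "{http://purl.org/dc/elements/1.1/}creator";
        return (dc + "creator") == expanded;
    }

    public static void Main()
    {
        Console.WriteLine(FirstCreator());
        Console.WriteLine(NamesMatch());
    }
}
```

Searching for plain "creator" would find nothing in this fragment, because an element's name always includes its namespace.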
Now you're ready for the LINQ query. Notice how it uses select new Post(post) to pass each XElement returned by ourBlog.Descendants("item") into the Post constructor.
static void Main(string[] args)
{
     // Load the blog posts and print the title of the blog
     XDocument ourBlog = XDocument.Load("http://www.stellman-greene.com/feed");
     Console.WriteLine(ourBlog
         .Element("rss")
         .Element("channel")
         .Element("title")
         .Value);
 
     // Query <item>s in the XML RSS data and select each one into a new Post()
     IEnumerable<Post> posts =
         from post in ourBlog.Descendants("item")
         select new Post(post);
 
     // Print each post to the console
     foreach (var post in posts)
         Console.WriteLine(post.ToString());
}
When you run your program, it connects to the blog, retrieves the RSS feed, and prints the list of articles to the console.
