[Dev Tip] Quickly Generate C# Data Objects from XML


Ever needed to read an existing XML document in a .NET app but didn’t want to deal with XPath queries or navigating the DOM?  Wouldn’t it be easier if you could just use data objects instead?  Enter Xsd.exe.

Xsd.exe is a free tool that ships with Visual Studio (including the Express editions).  It allows you to “generate XML schema or common language runtime classes from XDR, XML, and XSD files, or from classes in a runtime assembly”.

Or to put it a different way, Xsd.exe can auto-generate classes from an XML or XSD document.  With these classes in hand you can deserialize an XML document at runtime and work with typed objects instead of the underlying XML.

Here’s how:

Step 1: Create the Schema Definition

First you’ll need a copy of the XML file that you intend to read.  If you already have an XSD schema file for the XML document, skip to Step 2.

Here is a sample XML file provided with the MSXML SDK.  I have removed some of the book entries for brevity and saved this to a file named ‘books.xml’.

<?xml version="1.0"?>
<catalog>
   <book id="bk102">
      <author>Ralls, Kim</author>
      <title>Midnight Rain</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies,
      an evil sorceress, and her own childhood to become queen
      of the world.</description>
   </book>
   <book id="bk106">
      <author>Randall, Cynthia</author>
      <title>Lover Birds</title>
      <genre>Romance</genre>
      <price>4.95</price>
      <publish_date>2000-09-02</publish_date>
      <description>When Carla meets Paul at an ornithology
      conference, tempers fly as feathers get ruffled.</description>
   </book>
   <book id="bk108">
      <author>Knorr, Stefan</author>
      <title>Creepy Crawlies</title>
      <genre>Horror</genre>
      <price>4.95</price>
      <publish_date>2000-12-06</publish_date>
      <description>An anthology of horror stories about roaches,
      centipedes, scorpions  and other insects.</description>
   </book>
   <book id="bk110">
      <author>O'Brien, Tim</author>
      <title>Microsoft .NET: The Programming Bible</title>
      <genre>Computer</genre>
      <price>36.95</price>
      <publish_date>2000-12-09</publish_date>
      <description>Microsoft's .NET initiative is explored in
      detail in this deep programmer's reference.</description>
   </book>
</catalog>

Xsd.exe can generate classes directly from the XML document, but I prefer to create a schema definition first to confirm that the data elements have been interpreted correctly.  If an XML file has no value for a particular element, Xsd.exe has no way of inferring the data type.  In these cases you may need to tweak the schema document before generating the data classes.

Ok, so let’s go ahead and create the XSD.  Xsd.exe is typically located at C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Bin.

Here’s the command to create the XSD schema from the XML file:

>xsd.exe books.xml
Microsoft (R) Xml Schemas/DataTypes support utility
[Microsoft (R) .NET Framework, Version 2.0.50727.42]
Copyright (C) Microsoft Corporation. All rights reserved.
Writing file 'books.xsd'.

Here is the content of books.xsd:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="catalog" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xs:element name="catalog" msdata:IsDataSet="true" msdata:UseCurrentLocale="true">
    <xs:complexType>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="book">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="author" type="xs:string" minOccurs="0" msdata:Ordinal="0" />
              <xs:element name="title" type="xs:string" minOccurs="0" msdata:Ordinal="1" />
              <xs:element name="genre" type="xs:string" minOccurs="0" msdata:Ordinal="2" />
              <xs:element name="price" type="xs:string" minOccurs="0" msdata:Ordinal="3" />
              <xs:element name="publish_date" type="xs:string" minOccurs="0" msdata:Ordinal="4" />
              <xs:element name="description" type="xs:string" minOccurs="0" msdata:Ordinal="5" />
            </xs:sequence>
            <xs:attribute name="id" type="xs:string" />
          </xs:complexType>
        </xs:element>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>

You will notice that Xsd.exe interpreted the publish_date element as a string.  It would be more useful as a date, which you can get by changing the type from xs:string to xs:date like so:

<xs:element name="publish_date" type="xs:date" minOccurs="0" msdata:Ordinal="4" />
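
To see what that change buys you at runtime, here is a minimal sketch of how XmlSerializer treats an element marked as a date.  The DateDemoBook and DateDemo names are hypothetical stand-ins I made up for illustration (they are not what Xsd.exe generates); the only part that matters is DataType = "date", which is what the generated attribute carries for an xs:date element.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for a generated class; reduced to the one element of interest.
[XmlRoot("book")]
public class DateDemoBook
{
    // DataType = "date" tells the serializer to read the text as an xs:date value
    [XmlElement("publish_date", DataType = "date")]
    public DateTime PublishDate;
}

public static class DateDemo
{
    // Deserializes a <book> fragment and returns the parsed publish date.
    public static DateTime ReadPublishDate(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(DateDemoBook));
        using (StringReader reader = new StringReader(xml))
        {
            DateDemoBook book = (DateDemoBook)serializer.Deserialize(reader);
            return book.PublishDate;
        }
    }

    public static void Main()
    {
        DateTime date = ReadPublishDate("<book><publish_date>2000-12-16</publish_date></book>");
        Console.WriteLine(date.Year); // prints 2000
    }
}
```

With the element left as xs:string you would have to call DateTime.Parse yourself everywhere the value is used.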

Step 2: Generating the Data Classes

Now that we have a valid XSD file and we have reviewed it for accuracy, let’s generate the C# classes.

Here’s the command:

>xsd.exe -c books.xsd
Microsoft (R) Xml Schemas/DataTypes support utility
[Microsoft (R) .NET Framework, Version 2.0.50727.42]
Copyright (C) Microsoft Corporation. All rights reserved.
Writing file 'books.cs'.

Note that Xsd.exe accepts additional options, such as /namespace: to set the namespace of the generated classes, but this will do for demonstration purposes.

Here is the content of books.cs:

//------------------------------------------------------------------------------
// <auto-generated>
//     This code was generated by a tool.
//     Runtime Version:2.0.50727.3082
//
//     Changes to this file may cause incorrect behavior and will be lost if
//     the code is regenerated.
// </auto-generated>
//------------------------------------------------------------------------------
using System.Xml.Serialization;
//
// This source code was auto-generated by xsd, Version=2.0.50727.42.
//
/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.42")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType=true)]
[System.Xml.Serialization.XmlRootAttribute(Namespace="", IsNullable=false)]
public partial class catalog {
    
    private catalogBook[] itemsField;
    
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("book", Form=System.Xml.Schema.XmlSchemaForm.Unqualified)]
    public catalogBook[] Items {
        get {
            return this.itemsField;
        }
        set {
            this.itemsField = value;
        }
    }
}
/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.42")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType=true)]
public partial class catalogBook {
    
    private string authorField;
    
    private string titleField;
    
    private string genreField;
    
    private string priceField;
    
    private System.DateTime publish_dateField;
    
    private bool publish_dateFieldSpecified;
    
    private string descriptionField;
    
    private string idField;
    
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute(Form=System.Xml.Schema.XmlSchemaForm.Unqualified)]
    public string author {
        get {
            return this.authorField;
        }
        set {
            this.authorField = value;
        }
    }
    
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute(Form=System.Xml.Schema.XmlSchemaForm.Unqualified)]
    public string title {
        get {
            return this.titleField;
        }
        set {
            this.titleField = value;
        }
    }
    
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute(Form=System.Xml.Schema.XmlSchemaForm.Unqualified)]
    public string genre {
        get {
            return this.genreField;
        }
        set {
            this.genreField = value;
        }
    }
    
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute(Form=System.Xml.Schema.XmlSchemaForm.Unqualified)]
    public string price {
        get {
            return this.priceField;
        }
        set {
            this.priceField = value;
        }
    }
    
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute(Form=System.Xml.Schema.XmlSchemaForm.Unqualified, DataType="date")]
    public System.DateTime publish_date {
        get {
            return this.publish_dateField;
        }
        set {
            this.publish_dateField = value;
        }
    }
    
    /// <remarks/>
    [System.Xml.Serialization.XmlIgnoreAttribute()]
    public bool publish_dateSpecified {
        get {
            return this.publish_dateFieldSpecified;
        }
        set {
            this.publish_dateFieldSpecified = value;
        }
    }
    
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute(Form=System.Xml.Schema.XmlSchemaForm.Unqualified)]
    public string description {
        get {
            return this.descriptionField;
        }
        set {
            this.descriptionField = value;
        }
    }
    
    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute()]
    public string id {
        get {
            return this.idField;
        }
        set {
            this.idField = value;
        }
    }
}

Step 3: Add the Class to the Solution

So far so good.  Now we need to add the C# class to our Visual Studio solution (copy the file into the project folder, right-click the project, then Add > Existing Item...).

At this point we should now have a books.cs file added to our solution and the project should build.

Step 4: Reading the XML File

Almost there! Now we need a little bit of code to deserialize the XML file and new up our data objects.

Here is what that code looks like:

// Requires: using System.IO; using System.Xml.Serialization;
XmlSerializer serializer = new XmlSerializer(typeof(catalog));
catalog catalog;
using (TextReader reader = new StreamReader(@"path\to\xml\file.xml"))
{
     // Deserialize returns object, so cast it to the generated root class
     catalog = (catalog)serializer.Deserialize(reader);
}
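
One thing worth knowing: Deserialize throws an InvalidOperationException when the XML is malformed or the root element is not the one the serializer expects.  Here is a minimal sketch of guarding against that.  It reads from a string so it can run on its own, and SketchCatalog, SketchBook and SafeCatalogReader are hypothetical, hand-written stand-ins for the generated classes, not something Xsd.exe produces.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical, trimmed-down stand-ins for the generated catalog/catalogBook classes.
[XmlRoot("catalog")]
public class SketchCatalog
{
    [XmlElement("book")]
    public SketchBook[] Items;
}

public class SketchBook
{
    [XmlAttribute("id")]
    public string Id;

    [XmlElement("title")]
    public string Title;
}

public static class SafeCatalogReader
{
    // Returns null instead of throwing when the XML cannot be deserialized.
    public static SketchCatalog TryRead(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(SketchCatalog));
        try
        {
            using (StringReader reader = new StringReader(xml))
            {
                return (SketchCatalog)serializer.Deserialize(reader);
            }
        }
        catch (InvalidOperationException)
        {
            // Raised for malformed XML and for an unexpected root element
            return null;
        }
    }

    public static void Main()
    {
        SketchCatalog good = TryRead("<catalog><book id=\"bk102\"><title>Midnight Rain</title></book></catalog>");
        SketchCatalog bad = TryRead("<notacatalog />");
        Console.WriteLine(good != null ? good.Items[0].Title : "failed");
        Console.WriteLine(bad != null ? "parsed" : "failed");
    }
}
```

Whether you swallow the exception like this or let it bubble up is a judgment call; in a real app I would at least log the InnerException, since that is where the line and position of the problem end up.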

Now that we have a “newed up” catalog object we’re good to go.  We can now do something like the following:

foreach (catalogBook book in catalog.Items)
{
    System.Console.WriteLine(book.author);
    if (book.publish_date > DateTime.Now.AddMonths(-1))
    {
        // This book was published within the last month
    }
}
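
You may have noticed the publish_dateSpecified property in the generated class.  Because DateTime is a value type and the schema says the element is optional (minOccurs="0"), the serializer uses that flag to record whether the element was actually present; when it is missing, publish_date is simply left at DateTime.MinValue.  Here is a minimal sketch of checking the flag before trusting the date.  SpecifiedDemoBook and SpecifiedDemo are hypothetical names for illustration only.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical one-element stand-in for the generated catalogBook class.
[XmlRoot("book")]
public class SpecifiedDemoBook
{
    [XmlElement(DataType = "date")]
    public DateTime publish_date;

    // Set to true by XmlSerializer only when <publish_date> appears in the XML.
    [XmlIgnore]
    public bool publish_dateSpecified;
}

public static class SpecifiedDemo
{
    public static SpecifiedDemoBook Parse(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(SpecifiedDemoBook));
        using (StringReader reader = new StringReader(xml))
        {
            return (SpecifiedDemoBook)serializer.Deserialize(reader);
        }
    }

    // True only if the date was present and falls within the month before 'now'.
    public static bool PublishedWithinLastMonth(SpecifiedDemoBook book, DateTime now)
    {
        return book.publish_dateSpecified && book.publish_date > now.AddMonths(-1);
    }

    public static void Main()
    {
        SpecifiedDemoBook dated = Parse("<book><publish_date>2000-12-16</publish_date></book>");
        SpecifiedDemoBook undated = Parse("<book />");
        Console.WriteLine(dated.publish_dateSpecified);   // prints True
        Console.WriteLine(undated.publish_dateSpecified); // prints False
    }
}
```

The flag matters most for comparisons in the other direction (for example “older than a year”), where an absent date would otherwise look like a very old one.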

Step 5: Cleaning Things Up

At this point things are working, but if you are like me you probably aren’t too happy about the auto-generated class and property names.

If you want to clean things up a little, here’s an easy way to do it.

The first thing you need to do is add an explicit ElementName to the attribute on each property in the auto-generated code.  Here is the auto-generated property for the “author” element:

[System.Xml.Serialization.XmlElementAttribute(Form=System.Xml.Schema.XmlSchemaForm.Unqualified)]
public string author
{
    get
    {
        return this.authorField;
    }
    set
    {
        this.authorField = value;
    }
}

And here is the updated property (note that ElementName = "author" has been added).  This explicitly maps the property (currently named author) to the “author” XML element, so the mapping survives a rename.

[System.Xml.Serialization.XmlElementAttribute(ElementName = "author", Form=System.Xml.Schema.XmlSchemaForm.Unqualified)]
public string author
{
    get
    {
        return this.authorField;
    }
    set
    {
        this.authorField = value;
    }
}

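Now we can change the name of the actual property to whatever we want (right-click, Refactor, Rename...).

Note that we can do the same with the class names (catalog and catalogBook).  For the root class, first add ElementName = "catalog" to its XmlRootAttribute; otherwise the serializer will expect a root element that matches the new class name and deserialization will fail.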
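
To make the renaming concrete, here is a minimal sketch of what the cleaned-up classes might look like.  It is hand-trimmed to a few members and uses auto-properties rather than the generated backing fields, so treat it as an illustration of the attribute mapping and not as the exact output of the refactoring.  The Catalog and Book names and the RenamedDemo helper are my own choices.

```csharp
using System;
using System.IO;
using System.Xml.Schema;
using System.Xml.Serialization;

// ElementName keeps the class bound to <catalog> even though the class is now "Catalog".
[XmlRoot(ElementName = "catalog", Namespace = "", IsNullable = false)]
public class Catalog
{
    [XmlElement("book", Form = XmlSchemaForm.Unqualified)]
    public Book[] Items { get; set; }
}

public class Book
{
    [XmlElement(ElementName = "author", Form = XmlSchemaForm.Unqualified)]
    public string Author { get; set; }

    [XmlElement(ElementName = "publish_date", Form = XmlSchemaForm.Unqualified, DataType = "date")]
    public DateTime PublishDate { get; set; }

    // The "Specified" companion must follow the property's new name.
    [XmlIgnore]
    public bool PublishDateSpecified { get; set; }

    [XmlAttribute(AttributeName = "id")]
    public string Id { get; set; }
}

public static class RenamedDemo
{
    public static Catalog Parse(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Catalog));
        using (StringReader reader = new StringReader(xml))
        {
            return (Catalog)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        Catalog catalog = Parse(
            "<catalog><book id=\"bk102\"><author>Ralls, Kim</author>" +
            "<publish_date>2000-12-16</publish_date></book></catalog>");
        Console.WriteLine(catalog.Items[0].Author); // prints Ralls, Kim
    }
}
```

Elements that have no matching property (title, genre and so on in this trimmed version) are skipped by the serializer, which is why the sketch can get away with only a handful of members.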

So after a little bit of editing and refactoring here’s how things look.

Catalog catalog = (Catalog)serializer.Deserialize(reader);
foreach (Book book in catalog.Items)
{
    System.Console.WriteLine(book.Author);
    if (book.PublishDate > DateTime.Now.AddMonths(-1))
    {
        // This book was published within the last month
    }
}

Ah, much better!

Other Solutions

As always, there are other ways to do this type of thing (LINQ to XML, LINQ to XSD, xmlobjects, etc.) but I have found this simple approach to come in quite handy when I need to get up and running quickly with auto-gen’d data objects from an existing XML file.
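
For comparison, here is a minimal sketch of the LINQ to XML route against the same kind of document.  There are no generated classes here; you query the elements by name and convert the values yourself.  The LinqToXmlSketch class and TitlesInGenre method are hypothetical names, and this assumes .NET 3.5 or later for System.Xml.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LinqToXmlSketch
{
    // Returns the titles of all <book> elements whose <genre> matches.
    public static List<string> TitlesInGenre(string xml, string genre)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root
            .Elements("book")
            // Casting an XElement to string yields its text, or null if the element is missing
            .Where(b => (string)b.Element("genre") == genre)
            .Select(b => (string)b.Element("title"))
            .ToList();
    }

    public static void Main()
    {
        string xml =
            "<catalog>" +
            "<book id=\"bk102\"><title>Midnight Rain</title><genre>Fantasy</genre></book>" +
            "<book id=\"bk108\"><title>Creepy Crawlies</title><genre>Horror</genre></book>" +
            "</catalog>";

        foreach (string title in TitlesInGenre(xml, "Horror"))
        {
            Console.WriteLine(title); // prints Creepy Crawlies
        }
    }
}
```

This is less typing up front, but element names live in strings, so a typo only shows up at runtime.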
