Tuesday, July 29, 2008
My buddy Vittorio tagged me for this meme that has been making the rounds. It is interesting to see what some people have answered. Mine is probably pretty tame in comparison.
How old were you when you started programming?
My first computer was the Commodore 64 in the early '80s. I think it was around 1982 or early '83, so that would have made me around 7. My first exposure to programming was typing programs out of magazines and books. I remember computing the checksum as a way of catching errors. We initially had a tape drive for storage and eventually got the 5.25" disk drive to save files. I quickly learned the BASIC language and programmed trivial things with it.
It would be a few years later when we got a 286 10 MHz (it was a JET!) computer with 640K RAM and some obscenely small hard drive whose size I forget now (maybe 20MB?). It had a Hercules graphics card that drove the monochrome monitor (amber, no less). I became very proficient with word processing and using DOS at that time. I think I tried QBASIC around then too.
How did you get started in programming?
I was only interested in computers as a hobby until I started working after college. My degree was in operations management and statistics. I happened to be working when the internet boom started (and then went bust). Since operations management was often about manufacturing and manufacturing jobs were often located in small towns, I watched my colleagues with envy as they got better projects in better locations just by learning Java. I will never forget Shelbyville, TN or Shenandoah, IA - if they weren't such horrible places to work I would probably never have jumped at an opportunity to learn ASP and get out of manufacturing consulting. It was simple to learn ASP and, by extension, VBScript, and I never looked back.
What was your first real language?
I didn't program again after my brief stint with BASIC until college. I had my intro computer science course taught in Turbo Pascal. It wasn't terribly hard and I enjoyed learning the algorithms. I knew BASIC from the C64 and TRS-80 days, but I would say that I learned Turbo Pascal probably better than that and it was my first real language. Of course, I don't remember it at all now, but I was good in the day.
What was the first 'real' program you wrote?
I had to write a final project for my CS course in college. I ended up writing a poker program that emulated the kind of poker you would play on a slot machine. It actually worked pretty well and I wish I still had it.
What languages have you used since you started programming?
BASIC, Turbo Pascal, VBScript, Javascript, VBA (for Access), VB6, TSQL and now mostly C# these days. I don't know if XSLT, XPath, DHTML, etc. count or not.
What was your first professional programming gig?
I think my first program I got paid for was writing some VBA code for a Lotus Notes application (what a gawdawful programming model, btw) at a client. Before I finished it, I was sent to a small startup called Point.com (now defunct) and wrote a whole ton of Javascript and XML code. This was using the XmlHttpRequest object well before people called that 'AJAX'. Unfortunately, the bust came around that time and that code never saw the light of day...
If you knew then what you know now, would you have started programming?
Hard to say. Part of me wishes that I would have gotten a masters in CS or taken a number of other programming courses in college. Part of me however wishes I would have just gone to medical school and kept this as a hobby. It just depends on the day of the week and what I am working on.
If there was one thing you learned along the way that you would tell new developers, what would it be?
Find a better developer, read their code, understand their code, and then emulate them. This is also called "saddle yourself to a better developer and learn". I became a better developer when I watched my friend help me with an Access database that I was attempting (badly) to write for an internship one summer. He was (and probably still is) a much better developer. I learned a ton just by seeing how he was doing things. Later, I would see code from senior developers and I would study their style to learn what they were doing. I always tried to emulate what I saw and make it a part of my style. If I had never read anyone else's code or never tried to incorporate it, I would still be a second rate programmer (I might still be... who knows).
What's the most fun you have ever had... programming?
I actually enjoyed learning Javascript and XML back in the day. I remember being excited about optimizing the rendering speed in IE 5 for a particularly large XML payload. I was learning a lot, and becoming a better developer. I didn't mind working crazy hours and I read constantly on how to be better at ASP, Javascript, and XML. The projects I hated were the ones where you produced code, but never got to see it get implemented and never saw anyone use it.
Now... on to two other suckers: Nino and James - you've been 'tagged'.
Friday, July 25, 2008
With the release of Sprint 3 bits, you might have noticed that you are prompted to download now when you hit the service directly from IE. Because the content type changed from 'application/xml' to 'application/x-ssds+xml', IE just doesn't know how to render the resulting response.
This is simple to fix. Copy the following to a .reg file and merge into your registry.
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\MIME\Database\Content Type\application/x-ssds+xml]
"CLSID"="{48123BC4-99D9-11D1-A6B3-00C04FD91555}"
"Extension"=".xml"
"Encoding"=hex:08,00,00,00
Now, you should be back to the behavior you are used to.
Wednesday, July 02, 2008
Here is my last installment in this series of working with objects in SQL Server Data Services. For background, readers should read the following:
Serialization in SSDS
Working with Objects in SSDS Part 1
Working with Objects in SSDS Part 2
Last time, we concluded with a class called SsdsEntity<T> that became an all-purpose wrapper or veneer around our CLR objects. This made it simple to take our existing classes and serialize them as entities in SSDS.
In this post, I want to discuss how the querying in the REST library works. First a simple example:
var ctx = new SsdsContext(
"authority=http://dunnry.data.beta.mssds.com/v1/;username=dunnry;password=secret"
);
var container = ctx.OpenContainer("foo");
var foo = new Foo { IsPublic = false, Name = "MyFoo", Size = 12 };
//insert it with unique id guid string
container.Insert(foo, Guid.NewGuid().ToString());
//now query for it
var results = container.Query<Foo>(e => e.Entity.IsPublic == false && e.Entity.Size > 2);
//Query<T> returns IEnumerable<SsdsEntity<T>>, so foreach over it
foreach (var item in results)
{
Console.WriteLine(item.Entity.Name);
}
I glossed over it in my previous posts with this library, but I have a class called SsdsContext that acts as my credential store and factory to create SsdsContainer objects where I perform my operations. Here, I have opened a container called 'foo', which would relate to the URI (http://dunnry.data.beta.mssds.com/v1/foo) according to the authority name I passed on the SsdsContext constructor arguments.
I created an instance of my Foo class (see this post if you want to see what a Foo looks like) and inserted it. We know that under the covers we have an XmlSerializer doing the work to serialize that to the proper POX wire format. So far, so good. Now, I want to retrieve that same entity back from SSDS. The key line here is the container.Query<T>() call. It accepts an Expression<Func<SsdsEntity<T>, bool>> argument that represents a strongly typed query.
For the uninitiated, the Expression<TDelegate> is a way to represent lambda expressions in an abstract syntax tree. We can think of them as a way to model what the expression does without generating the bits of code necessary to actually do it. We can inspect the Expression and create new ones based on it until finally we can call Compile and actually convert the representation of the lambda into something that can execute.
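To make that concrete, here is a minimal sketch of the inspect-then-compile idea. It is not part of the library; the ExpressionDemo class and its Describe method are names invented for this illustration, and it assumes the predicate body is a simple binary comparison.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper, for illustration only.
public static class ExpressionDemo
{
    // Describes the shape of a predicate and then runs it against an input.
    public static string Describe(Expression<Func<int, bool>> predicate, int input)
    {
        // The lambda body is plain data we can inspect. This cast assumes a
        // binary comparison such as (x > 2) and throws for anything else.
        var body = (BinaryExpression)predicate.Body;

        // Compile converts the tree into a delegate that can actually execute.
        Func<int, bool> compiled = predicate.Compile();

        return string.Format("{0} {1} -> {2}", body.NodeType, body.Right, compiled(input));
    }
}
```

Calling ExpressionDemo.Describe(x => x > 2, 5) yields "GreaterThan 2 -> True": the first two parts come from reading the tree, the last from executing the compiled delegate.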
The Func<SsdsEntity<T>, bool> represents a delegate that accepts an SsdsEntity<T> as an argument and returns a boolean. This effectively represents the WHERE clause in the SSDS LINQ query syntax. Since SsdsEntity<T> contains an actual type T in the Entity property, you can query directly against it in a strongly typed fashion!
What about those flexible properties that I added to support flexible attributes outside of our T? I mentioned that I wanted to keep the PropertyBucket (a Dictionary<string, object>) property public for querying. In order to use the flexible properties that you add, you simply use it in a weakly typed manner:
var results = container.Query<Foo>(e => e.PropertyBucket["MyFlexProp"] > 10);
As you can see, any boolean expression that you can think of in the string-based SSDS LINQ query syntax can now be expressed in a strongly-typed manner using the Func<SsdsEntity<T>, bool> lambda syntax.
How it works
Since I have the expression tree of what your query looks like in strongly-typed terms, it is a simple matter to take that and convert it to the SSDS LINQ query syntax that looks like "from e in entities where [....] select e" that is appended to the query string in the REST interface. I should say it is a simple matter because Matt Warren did a lot of the heavy lifting for us and provided the abstract expression visitor (ExpressionVisitor) as well as the expression visitor that partially evaluates the tree to evaluate constants (SubTreeEvaluator). This last part is important because it allows us to write this:
int i = 10;
string name = "MyFoo";
var results = container.Query<Foo>(e => e.Entity.Name == name && e.Entity.Size > i);
Without the partial tree evaluation, you would not be able to use local variables on the right-hand side of the comparison. All I had to do was implement an expression visitor that correctly evaluated the lambda expression and converted it to the LINQ syntax that SSDS expects (SsdsExpressionVisitor). It would be a trivial matter to actually implement the IQueryProvider and IQueryable interfaces to make the whole thing work inside LINQ to Objects.
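The translation step can be sketched roughly as follows. This is not the library's SsdsExpressionVisitor: it is a simplified stand-in built on the ExpressionVisitor class that ships in System.Linq.Expressions (.NET 4 and later), it works against a cut-down Foo directly rather than SsdsEntity<Foo>, and the WhereClauseBuilder name is invented here. It also does not add parentheses, so mixing && and || would need more work.

```csharp
using System;
using System.Linq.Expressions;
using System.Text;

// Cut-down stand-in for the Foo class used in the post.
public class Foo
{
    public string Name { get; set; }
    public int Size { get; set; }
}

// Hypothetical translator: turns a predicate over Foo into a query string.
public class WhereClauseBuilder : ExpressionVisitor
{
    private readonly StringBuilder _sb = new StringBuilder();

    public static string Translate(Expression<Func<Foo, bool>> predicate)
    {
        var builder = new WhereClauseBuilder();
        builder.Visit(predicate.Body);
        return "from e in entities where " + builder._sb + " select e";
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        Visit(node.Left);
        _sb.Append(" " + Operator(node.NodeType) + " ");
        Visit(node.Right);
        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        // A property read on the lambda parameter becomes e["PropertyName"].
        if (node.Expression is ParameterExpression)
        {
            _sb.Append("e[\"" + node.Member.Name + "\"]");
            return node;
        }
        // Anything else (such as a captured local) does not depend on the
        // parameter, so evaluate it now and emit the resulting constant.
        object value = Expression.Lambda(node).Compile().DynamicInvoke();
        return VisitConstant(Expression.Constant(value, node.Type));
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        if (node.Value is string) _sb.Append("\"" + node.Value + "\"");
        else _sb.Append(node.Value);
        return node;
    }

    private static string Operator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.Equal: return "==";
            case ExpressionType.NotEqual: return "!=";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.LessThan: return "<";
            case ExpressionType.AndAlso: return "&&";
            case ExpressionType.OrElse: return "||";
            default: throw new NotSupportedException(type.ToString());
        }
    }
}
```

With int i = 10 and string name = "MyFoo" in scope, WhereClauseBuilder.Translate(e => e.Name == name && e.Size > i) produces from e in entities where e["Name"] == "MyFoo" && e["Size"] > 10 select e. The captured locals are folded into constants inside VisitMember, which is a much cruder version of what a partial evaluator does up front.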
Originally, I did supply the IQueryProvider for this implementation but after consideration I have decided that using methods from the SsdsContainer class instead of the standard LINQ syntax is the best way to proceed. Mainly, this has to do with the fact that I want to make it more explicit to the developer what will happen under the covers rather than using the standard Where() extension method.
Querying data
The main interaction to return data is via the Query<T> method. This method is smart enough to add the Kind into the query for you based on the T supplied. So, if you write something like:
var results = container.Query<Foo>(e => e.Entity.Size > 2);
This is actually translated to "from e in entities where e["Size"] > 2 && e.Kind == "Foo" select e". The addition of the kind is important because we want to limit the results as much as possible. If there happened to be many kinds in the container that had the flexible property "Size", then without the Kind filter the query would return those as well in the wire response.
Of course, what if you want that to happen? What if you want to return other kinds that have the "Size" property? To do this, I have introduced a class called SsdsEntityBucket. It is exactly what it sounds like. To use it, you simply specify a query that uses additional types with either the Query<T,U,V> or Query<T,U> methods. Here is an example:
var foo = new Foo
{
IsPublic = true,
MyCheese = new Cheese { LastModified = DateTime.Now, Name = "MyCheese" },
Name = "FooMaster",
Size = 10
};
container.Insert(foo, foo.Name);
container.Insert(foo.MyCheese, foo.MyCheese.Name);
//query for bucket...
var bucket = container.Query<Foo, Cheese>(
(f, c) => f.Entity.Name == "FooMaster" || c.Entity.Name == "MyCheese"
);
var f1 = bucket.GetEntities<Foo>().Single();
var c1 = bucket.GetEntities<Cheese>().Single();
The calls to GetEntities<T> return IEnumerable<SsdsEntity<T>> again. However, this was done in a single call to SSDS instead of multiple calls per T.
Paging
As I mentioned earlier, I wanted the developer to understand what they were doing when they called each method, so I decided to make paging explicit. If I had potentially millions of entities in SSDS, it would be a bad mistake to allow a developer to issue a simple query that seamlessly paged the items back - especially if the query was something like e => e.Id != "". Here is how I handled paging:
var container = ctx.OpenContainer("paging");
List<Foo> items = new List<Foo>();
int i = 1;
container.PagedQuery<Foo>(
e => e.Entity.Size != 0,
c =>
{
Console.WriteLine("Got Page {0}", i++);
items.AddRange(c.Select(s => s.Entity));
}
);
Console.WriteLine(items.Count);
The PagedQuery<T> method takes two arguments. One is the standard Expression<Func<SsdsEntity<T>, bool>> that you use to specify the WHERE clause for SSDS, and the other is Action<IEnumerable<SsdsEntity<T>>> which represents a delegate that takes an IEnumerable<SsdsEntity<T>> and has a void return. This is a delegate you provide that does something with the 500 entities returned per page (it gets called once per page). Here, I am just adding them into a List<T>, but I could easily be doing anything else here. Under the covers, this is adding the paging term dynamically into the expression tree that is evaluated.
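The shape of that paging loop can be sketched as below. This is not the library's code: the Pager class is invented for illustration, it works on bare Id strings instead of SsdsEntity<T>, and fetchPage is a hypothetical stand-in for one round trip to the service. It assumes each page comes back ordered by Id, so that the last Id seen can serve as the paging term for the next request.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical paging loop, for illustration only.
public static class Pager
{
    // fetchPage receives the last Id already seen (null for the first page)
    // and returns the next batch of Ids in ascending order, at most pageSize.
    // Returns the number of pages handed to the callback.
    public static int PagedQuery(
        Func<string, IEnumerable<string>> fetchPage,
        int pageSize,
        Action<IEnumerable<string>> onPage)
    {
        int pages = 0;
        string lastId = null;
        while (true)
        {
            List<string> page = fetchPage(lastId).ToList();
            if (page.Count == 0) break;        // nothing left
            onPage(page);                      // caller's delegate, once per page
            pages++;
            if (page.Count < pageSize) break;  // a short page means we are done
            lastId = page[page.Count - 1];     // becomes the next paging term
        }
        return pages;
    }
}
```

The caller only supplies the per-page delegate, much like the Action<IEnumerable<SsdsEntity<T>>> argument above, and the loop decides when to stop.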
What's next
This is a good head start on using the REST API with SSDS today. However, there are a number of optimizations that could be made to the model: additional overloads, perhaps some extension methods for common operations, etc.
As new features are added, I will endeavor to update this as well (blob support comes to mind here). Additionally, I have a few optimizations planned around concurrency for CRUD operations.
I have published this out to Code Gallery and I welcome feedback and bug fixes. Linked here.
Thursday, June 26, 2008
This is the second post in my series on working with SQL Server Data Services (SSDS) and objects. For background, you should read my post on Serializing Objects in SSDS and the first post in this series.
Last time I showed how to create a general purpose serializer for SSDS using the standard XmlSerializer class in .NET. I created a shell entity or a 'thin veneer' for objects called SsdsEntity<T>, where T was any POCO (plain old C#/CLR object). This allowed me to abstract away the metadata properties required for SSDS without changing my actual POCO object (which, I noted, was lame to do).
If we decide that we will use SSDS to interact with POCO T, an interesting situation arises. Namely, once we have defined T, we have in fact defined a schema - albeit one only enforced in code you write and not by the SSDS service itself. One of the advantages of using something like SSDS is that you have a lot of flexibility in storing entities (hence the term 'flexible entity') without conforming to schema. Since I want to support this flexibility, I need to think of a way to support not only the schema implied by T, but also additional and arbitrary properties that a user might consider.
Some may wonder why we need this flexibility: after all, why not just change T to support whatever we like? The issue comes up most often with code you do not control. If you already have an existing codebase with objects that you would like to store in SSDS, it might not be practical or even possible to change the T to add additional schema.
Even if you completely control the codebase, expressing relationships between CLR objects and expressing relationships between things in your data are two different ideas - sometimes this problem has been termed 'impedance mismatch'.
In the CLR, if two objects are related, they are often part of a collection, or they refer to an instance on another object. This is easy to express in the CLR (e.g. Instance.ChildrenCollection["key"]). In your typical datasource, this same relationship is done using foreign keys to refer to other entities.
Consider the following classes:
public class Employee
{
public string EmployeeId { get; set; }
public string Name { get; set; }
public DateTime HireDate { get; set; }
public Employee Manager { get; set; }
public Project[] Projects { get; set; }
}
public class Project
{
public string ProjectId { get; set; }
public string Name { get; set; }
public string BillCode { get; set; }
}
Here we see that the Employee class refers to itself as well as contains a collection of related projects (Project class) that the employee works on. SSDS only supports simple scalar types and no arrays or nested objects today, so we cannot directly express this in SSDS. However, we can decompose this class and store the bits separately and then reassemble later. First, let's see what that looks like and then we can see how it was done:
var projects = new Project[]
{
new Project { BillCode = "123", Name = "TPS Slave", ProjectId = "PID01"},
new Project { BillCode = "124", Name = "Programmer", ProjectId = "PID02" }
};
var bill = new Employee
{
EmployeeId = "EMP01",
HireDate = DateTime.Now.AddMonths(-1),
Manager = null,
Name = "Bill Lumbergh",
Projects = new Project[] {}
};
var peter = new Employee
{
EmployeeId = "EMP02",
HireDate = DateTime.Now,
Manager = bill,
Name = "Peter Gibbons",
Projects = projects
};
var cloudpeter = new SsdsEntity<Employee>
{
Entity = peter,
Id = peter.EmployeeId
};
var cloudbill = new SsdsEntity<Employee>
{
Entity = bill,
Id = bill.EmployeeId
};
//here is how we add flexible props
cloudpeter.Add<string>("ManagerId", peter.Manager.EmployeeId);
var table = _context.OpenContainer("initech");
table.Insert(cloudpeter);
table.Insert(cloudbill);
var cloudprojects = peter.Projects
.Select(s => new SsdsEntity<Project>
{
Entity = s,
Id = Guid.NewGuid().ToString()
});
//add some metadata to track the project to employee
foreach (var proj in cloudprojects)
{
proj.Add<string>("RelatedEmployee", peter.EmployeeId);
table.Insert(proj);
}
All this code does is create two employees and two projects and set the relationships between them. Using the Add<K> method, I can insert any primitive type to go along for the ride with the POCO. If we query the container now, this is what we see:
<s:EntitySet
xmlns:s="http://schemas.microsoft.com/sitka/2008/03/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:x="http://www.w3.org/2001/XMLSchema">
<Project>
<s:Id>2ffd7a92-2a3b-4cd8-a5f7-55f40c3ba2b0</s:Id>
<s:Version>1</s:Version>
<ProjectId xsi:type="x:string">PID01</ProjectId>
<Name xsi:type="x:string">TPS Slave</Name>
<BillCode xsi:type="x:string">123</BillCode>
<RelatedEmployee xsi:type="x:string">EMP02</RelatedEmployee>
</Project>
<Project>
<s:Id>892dbb1e-ba47-4c87-80e6-64fbb46da935</s:Id>
<s:Version>1</s:Version>
<ProjectId xsi:type="x:string">PID02</ProjectId>
<Name xsi:type="x:string">Programmer</Name>
<BillCode xsi:type="x:string">124</BillCode>
<RelatedEmployee xsi:type="x:string">EMP02</RelatedEmployee>
</Project>
<Employee>
<s:Id>EMP01</s:Id>
<s:Version>1</s:Version>
<EmployeeId xsi:type="x:string">EMP01</EmployeeId>
<Name xsi:type="x:string">Bill Lumbergh</Name>
<HireDate xsi:type="x:dateTime">2008-05-25T23:59:49</HireDate>
</Employee>
<Employee>
<s:Id>EMP02</s:Id>
<s:Version>1</s:Version>
<EmployeeId xsi:type="x:string">EMP02</EmployeeId>
<Name xsi:type="x:string">Peter Gibbons</Name>
<HireDate xsi:type="x:dateTime">2008-06-25T23:59:49</HireDate>
<ManagerId xsi:type="x:string">EMP01</ManagerId>
</Employee>
</s:EntitySet>
As you can see, I have stored extra data in my 'flexible' entity with the ManagerId property (on one entity) and RelatedEmployee property on the Project kinds. This allows me to figure out later what objects are related to each other since we can't model the CLR objects relationships directly. Let's see how this was done.
public class SsdsEntity<T> where T: class
{
Dictionary<string, object> _propertyBucket = new Dictionary<string, object>();
public SsdsEntity() { }
[XmlIgnore]
public Dictionary<string, object> PropertyBucket
{
get { return _propertyBucket; }
}
[XmlAnyElement]
public XElement[] Attributes
{
get
{
//using XElement is much easier than XmlElement to build
//take all properties on object instance and build XElement
var props = from prop in typeof(T).GetProperties()
let val = prop.GetValue(this.Entity, null)
where prop.GetSetMethod() != null
&& allowableTypes.Contains(prop.PropertyType)
&& val != null
select new XElement(prop.Name,
new XAttribute(Constants.xsi + "type",
XsdTypeResolver.Solve(prop.PropertyType)),
EncodeValue(val)
);
//Then stuff in any extra stuff you want
var extra = _propertyBucket.Select(
e =>
new XElement(e.Key,
new XAttribute(Constants.xsi + "type",
XsdTypeResolver.Solve(e.Value.GetType())),
EncodeValue(e.Value)
)
);
return props.Union(extra).ToArray();
}
set
{
//wrap the XElement[] with the name of the type
var xml = new XElement(typeof(T).Name, value);
var xs = new XmlSerializer(typeof(T));
//xml.CreateReader() cannot be used as it won't support base64 content
XmlTextReader reader = new XmlTextReader(
xml.ToString(),
XmlNodeType.Document,
null
);
this.Entity = (T)xs.Deserialize(reader);
//now deserialize the other stuff left over into the property bucket...
var stuff = from v in value.AsEnumerable()
let props = typeof(T).GetProperties().Select(s => s.Name)
where !props.Contains(v.Name.ToString())
select v;
foreach (var item in stuff)
{
_propertyBucket.Add(
item.Name.ToString(),
DecodeValue(
item.Attribute(Constants.xsi + "type").Value,
item.Value)
);
}
}
}
public void Add<K>(string key, K value)
{
if (!allowableTypes.Contains(typeof(K)))
throw new ArgumentException(
String.Format(
"Type {0} not supported in SsdsEntity",
typeof(K).Name)
);
//add the value, or replace it if the key already exists
_propertyBucket[key] = value;
}
}
I have omitted the parts of SsdsEntity<T> from the first post that didn't change. The only other addition you don't see here is a helper method called DecodeValue, which as you might guess, interprets the string value in XML and attempts to cast it to a CLR type based on the xsi:type that comes back.
All we did here was add a Dictionary<string, object> property called PropertyBucket that holds the extra stuff we want to associate with our T instance. Then in the getter and setter for the XElement[] property called Attributes, we are adding them into our array of XElement as well as pulling them back out on deserialization and stuffing them back into the Dictionary. With this simple addition, we have fixed our flexibility (or lack thereof) problem. We are still limited to the simple scalar types, but as you can see you can work around this in a lot of cases by decomposing the objects down enough to be able to recreate them later.
The Add<K> method is a convenience only as we could operate directly against the Dictionary. I also could have chosen to keep the Dictionary property bucket private and not expose it. That would have worked just fine for serialization, but I wanted to also be able to query it later.
In my last post, I said I would introduce a library where all this code is coming from, but I didn't realize at the time how long this post would be and that I still need to cover querying. So... next time, I will finish up this series by explaining how the strongly typed query model works and how all these pieces fit together to recompose the data back into objects (and release the library).
Tuesday, June 17, 2008
Last time we talked about SQL Server Data Services and serializing objects, we discussed how easy it was to use the XmlSerializer to deserialize objects using the REST interface. The problem was that when we serialized objects using the XmlSerializer, it left out the xsi type declarations that we needed. I gave two possible solutions to this problem - one that used the XmlSerializer and 'fixed' the output after the fact, and the other built the XML that we needed using XLINQ and Reflection.
Today, I am going to talk about a third technique that I have been using lately that I like better. It uses some of the previous techniques and leverages a few tricks with XmlSerializer to get what I want. First, let's start with a POCO (plain ol' C# object) class that we would like to use with SSDS.
public class Foo
{
public string Name { get; set; }
public int Size { get; set; }
public bool IsPublic { get; set; }
}
In its correctly serialized form, it looks like this on the wire:
<Foo xmlns:s="http://schemas.microsoft.com/sitka/2008/03/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:x="http://www.w3.org/2001/XMLSchema">
<s:Id>someid</s:Id>
<s:Version>1</s:Version>
<Name xsi:type="x:string">My Foo</Name>
<Size xsi:type="x:decimal">10</Size>
<IsPublic xsi:type="x:boolean">false</IsPublic>
</Foo>
You'll notice that we have the additional system metadata attributes "Id" and "Version" in the markup. We can account for the metadata attributes by doing something cheesy like deriving from a base class:
public abstract class Cheese
{
public string Id { get; set; }
public int Version { get; set; }
}
However this is very unnatural as our classes would all have to derive from our "Cheese" abstract base class (ABC).
public class Foo : Cheese
{
public string Name { get; set; }
public int Size { get; set; }
public bool IsPublic { get; set; }
}
Developers familiar with remoting in .NET should be cringing right now as they remember the hassles associated with deriving from MarshalByRefObject. In a world without multiple inheritance, this can be painful. I want a model where I can use arbitrary POCO objects (redundant, yes I know) and not be forced to derive from anything or do what I would otherwise term unnatural acts.
What if, instead, we defined a generic entity that could contain any other entity?
public class SsdsEntity<T> where T: class
{
string _kind;
public SsdsEntity() { }
[XmlElement(Namespace = @"http://schemas.microsoft.com/sitka/2008/03/")]
public string Id { get; set; }
[XmlIgnore]
public string Kind
{
get
{
if (String.IsNullOrEmpty(_kind))
{
_kind = typeof(T).Name;
}
return _kind;
}
set
{
_kind = value;
}
}
[XmlElement(Namespace = @"http://schemas.microsoft.com/sitka/2008/03/")]
public int Version { get; set; }
[XmlIgnore]
public T Entity { get; set; }
}
In this case, we have simply wrapped the POCO that we care about in a class that knows about the specifics of the SSDS wire format (or more accurately could serialize down to the wire format).
This SsdsEntity<T> is easy to use and provides access to the strongly typed object via the Entity property.
Now, we just have to figure out how to serialize the SsdsEntity<Foo> object and we know that the metadata attributes are taken care of and our original POCO object that we care about is included. I call it wrapping POCOs in a thin SSDS veneer.
The trick to this is to add a bucket of XElement objects on the SsdsEntity<T> class that will hold our public properties on our class T (i.e. 'Foo' class). It looks something like this:
[XmlAnyElement]
public XElement[] Attributes
{
get
{
//using XElement is much easier than XmlElement to build
//take all properties on object instance and build XElement
var props = from prop in typeof(T).GetProperties()
let val = prop.GetValue(this.Entity, null)
where prop.GetSetMethod() != null
&& allowableTypes.Contains(prop.PropertyType)
&& val != null
select new XElement(prop.Name,
new XAttribute(Constants.xsi + "type",
XsdTypeResolver.Solve(prop.PropertyType)),
EncodeValue(val)
);
return props.ToArray();
}
set
{
//wrap the XElement[] with the name of the type
var xml = new XElement(typeof(T).Name, value);
var xs = new XmlSerializer(typeof(T));
//xml.CreateReader() cannot be used as it won't support base64 content
XmlTextReader reader = new XmlTextReader(
xml.ToString(),
XmlNodeType.Document,
null);
this.Entity = (T)xs.Deserialize(reader);
}
}
In the getter, we use Reflection and pull back a list of all the public properties on the T object and build an array of XElement. This is the same technique I used in my first post on serialization. The 'allowableTypes' object is a HashSet<Type> that we use to figure out which property types we can support in the service (DateTime, numeric, string, boolean, and byte[]). When this property serializes, the XElements are simply added to the markup.
The EncodeValue method shown is a simple helper method that correctly encodes string values, boolean, dates, integers, and byte[] values for the attribute. Finally, we are using a helper method that returns from a Dictionary<Type,string> the correct xsi type for the required attribute (as determined from the property type).
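For readers wondering what those two helpers might look like, here is a rough sketch. These are not the library's implementations: the SsdsTypeSketch class name is invented, the type map is inferred from the wire samples in this post (where an int Size is sent as x:decimal), and "x:base64Binary" for byte[] is an assumption based on the standard XSD type name.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Xml;

// Hypothetical versions of the two helpers, for illustration only.
public static class SsdsTypeSketch
{
    // CLR type -> xsi:type value. All numeric types collapse to x:decimal.
    static readonly Dictionary<Type, string> XsdTypes = new Dictionary<Type, string>
    {
        { typeof(string), "x:string" },
        { typeof(int), "x:decimal" },
        { typeof(long), "x:decimal" },
        { typeof(double), "x:decimal" },
        { typeof(decimal), "x:decimal" },
        { typeof(bool), "x:boolean" },
        { typeof(DateTime), "x:dateTime" },
        { typeof(byte[]), "x:base64Binary" } // assumed name for binary content
    };

    public static string Solve(Type type)
    {
        string name;
        if (!XsdTypes.TryGetValue(type, out name))
            throw new ArgumentException("Type " + type.Name + " not supported");
        return name;
    }

    // Encodes a value as the text content of its XML element.
    public static string EncodeValue(object value)
    {
        if (value is bool)
            return XmlConvert.ToString((bool)value); // "true" / "false"
        if (value is DateTime)
            return XmlConvert.ToString((DateTime)value, XmlDateTimeSerializationMode.RoundtripKind);
        if (value is byte[])
            return Convert.ToBase64String((byte[])value);
        // Strings and numbers: culture-invariant so decimals always use '.'.
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }
}
```

Using XmlConvert for booleans and dates keeps the output in the lowercase and ISO 8601 forms that XML Schema expects, rather than the "False" and locale-specific dates that a plain ToString() would produce.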
For deserialization, what happens is that the [XmlAnyElement] attribute causes all unmapped attributes (in this case, all non-system metadata attributes) to be collected in a collection of XElement. When we deserialize, if we simply wrap an enclosing element around this XElement collection, it is exactly what we need for deserialization of T. This is shown in the setter implementation.
It might look a little complicated, but now simple serialization will just work via the XmlSerializer. Here is one such implementation:
public string Serialize(SsdsEntity<T> entity)
{
//add a bunch of namespaces and override the default ones too
XmlSerializerNamespaces namespaces = new XmlSerializerNamespaces();
namespaces.Add("s", Constants.ns.NamespaceName);
namespaces.Add("x", Constants.x.NamespaceName);
namespaces.Add("xsi", Constants.xsi.NamespaceName);
var xs = new XmlSerializer(
entity.GetType(),
new XmlRootAttribute(typeof(T).Name)
);
XmlWriterSettings xws = new XmlWriterSettings();
xws.Indent = true;
xws.OmitXmlDeclaration = true;
using (var ms = new MemoryStream())
{
using (XmlWriter writer = XmlWriter.Create(ms, xws))
{
xs.Serialize(writer, entity, namespaces);
ms.Position = 0; //reset to beginning
using (var sr = new StreamReader(ms))
{
return sr.ReadToEnd();
}
}
}
}
Deserialization is even easier since we are starting with the XML representation and don't have to build a Stream in memory.
public SsdsEntity<T> Deserialize(XElement node)
{
var xs = new XmlSerializer(
typeof(SsdsEntity<T>),
new XmlRootAttribute(typeof(T).Name)
);
//xml.CreateReader() cannot be used as it won't support base64 content
XmlTextReader reader = new XmlTextReader(
node.ToString(),
XmlNodeType.Document,
null);
return (SsdsEntity<T>)xs.Deserialize(reader);
}
Notice that I am passing an XmlTextReader to the XmlSerializer. Unfortunately, the XmlReader from XLINQ does not support handling of base64 content, so this workaround is necessary.
At this point, we have a working serializer/deserializer that can handle arbitrary POCOs. There are some limitations of course:
- We are limited to the same datatypes that SSDS supports. This also means nested objects and arrays are not directly supported.
- We have lost a little of the 'flexible' in the Flexible Entity (the E in the ACE model). We now have a rigid schema defined by SSDS metadata and T public properties and enforced on our objects.
In my next post, I will attempt to address some of those limitations and I will introduce a library that handles most of this for you.
Wednesday, June 11, 2008
I officially love LINQPad. Joe Albahari has done a great job of introducing a lightweight tool that is great for learning and prototyping LINQ queries. From what I gather, Joe and Ben Albahari built this tool as part of their book offering. It was so useful, it has taken on a life of its own.
It may not be entirely obvious, but it turns out you don't have to use LINQPad solely for LINQ queries. You can actually prototype any type of code snippet. I have been using it now instead of SnippetCompiler (another great quick snippet tool).
As an example, here is how to use System.DirectoryServices snippets inside of LINQPad:
Hit F4 to bring up the Advanced Query Properties Window
Add the System.DirectoryServices.dll reference in the Additional References window, and then add "System.DirectoryServices" in the Additional Namespace Imports window.
Now, just type your code normally and hit F5 when you are done:
This is a great little tool to have, as you can query databases, build LINQ expressions, and visually inspect the results that come back pretty easily. And, as you can see, you can execute arbitrary code snippets as well. Highly recommended.
Thursday, June 05, 2008
A member in the book's forum mentioned some code I had originally posted here in the blog for asynchronous, paged searches in System.DirectoryServices.Protocols (SDS.P). He questioned whether or not it was thread safe. I honestly don't know - it might not be as I didn't test it extensively.
Regardless, I had actually moved on from that code and started using anonymous delegates for callbacks instead of events. I liked this pattern a bit better because it also got rid of the shared resources.
After reading Stephen Toub's article on asynchronous stream processing, I learned about the AsyncOperationManager which was something I was missing in my implementation. I have been doing a lot lately with .NET 3.5, LINQ, and lambda expressions, so I also decided to rewrite the anonymous delegates to lambda expressions. That is not as big a change, but it is more concise.
I actively investigated using async iterators, but ultimately I decided closures seemed to be more intuitive for me. I might revisit this at some time and change my mind. Here is my outcome:
public class AsyncSearcher
{
LdapConnection _connect;
public AsyncSearcher(LdapConnection connection)
{
this._connect = connection;
this._connect.AutoBind = true; //will bind on first search
}
public void BeginPagedSearch(
string baseDN,
string filter,
string[] attribs,
int pageSize,
Action<SearchResponse> page,
Action<Exception> completed
)
{
if (page == null)
throw new ArgumentNullException("page");
AsyncOperation asyncOp = AsyncOperationManager.CreateOperation(null);
Action<Exception> done = e =>
{
if (completed != null) asyncOp.Post(delegate
{
completed(e);
}, null);
};
SearchRequest request = new SearchRequest(
baseDN,
filter,
System.DirectoryServices.Protocols.SearchScope.Subtree,
attribs
);
PageResultRequestControl prc = new PageResultRequestControl(pageSize);
//add the paging control
request.Controls.Add(prc);
AsyncCallback rc = null;
rc = readResult =>
{
try
{
var response = (SearchResponse)_connect.EndSendRequest(readResult);
//post results back to the context that started the search
asyncOp.Post(delegate
{
page(response);
}, null);
var cookie = response.Controls
.Where(c => c is PageResultResponseControl)
.Select(s => ((PageResultResponseControl)s).Cookie)
.Single();
if (cookie != null && cookie.Length != 0)
{
prc.Cookie = cookie;
_connect.BeginSendRequest(
request,
PartialResultProcessing.NoPartialResultSupport,
rc,
null
);
}
else done(null); //signal complete
}
catch (Exception ex) { done(ex); }
};
//kick off async
try
{
_connect.BeginSendRequest(
request,
PartialResultProcessing.NoPartialResultSupport,
rc,
null
);
}
catch (Exception ex) { done(ex); }
}
}
It can be consumed very easily using something like this:
class Program
{
static ManualResetEvent _resetEvent = new ManualResetEvent(false);
static void Main(string[] args)
{
//set these to your environment
string servername = "server.yourdomain.com";
string baseDN = "dc=yourdomain,dc=com";
using (LdapConnection connection = CreateConnection(servername))
{
AsyncSearcher searcher = new AsyncSearcher(connection);
searcher.BeginPagedSearch(
baseDN,
"(sn=Dunn)",
null,
100,
f => //runs per page
{
foreach (var item in f.Entries)
{
var entry = item as SearchResultEntry;
if (entry != null)
{
Console.WriteLine(entry.DistinguishedName);
}
}
},
c => //runs on error or when done
{
if (c != null) Console.WriteLine(c.ToString());
Console.WriteLine("Done");
_resetEvent.Set();
}
);
_resetEvent.WaitOne();
}
Console.WriteLine();
Console.WriteLine("Finished.... Press Enter to Continue.");
Console.ReadLine();
}
static LdapConnection CreateConnection(string server)
{
LdapConnection connect = new LdapConnection(
new LdapDirectoryIdentifier(server),
null,
AuthType.Negotiate
);
connect.SessionOptions.ProtocolVersion = 3;
connect.SessionOptions.ReferralChasing = ReferralChasingOptions.None;
connect.SessionOptions.Sealing = true;
connect.SessionOptions.Signing = true;
return connect;
}
}
The important thing to note is that because everything is running asynchronously, it is totally possible for the end delegate to be invoked before the paging delegate has a chance to finish processing results (depending on how complicated your code is). You would need to compensate for this yourself.
This client is a console application, so I am using a ManualResetEvent just to prevent it from closing before finishing. You wouldn't need to do this in a WinForms or WPF app.
I am sure there are other optimizations you could make to pass in parameters or even other directory controls. However, the general pattern should apply.
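As for compensating when the end delegate fires before a page handler has finished, one possible approach is to count the page handlers still in flight and hold back the completion callback until the count drops to zero. This is only a sketch and has not been run against a real directory: PageCompletionGate and its members (Wrap, Enter, Complete) are hypothetical names, not part of SDS.P or of the AsyncSearcher above.

```csharp
using System;
using System.Threading;

// Hypothetical helper: delays the completion callback until every
// page handler that was started has finished.
public sealed class PageCompletionGate
{
    private int _pending = 1;   // starts at 1 because the search itself is still open
    private int _fired;         // becomes 1 once the completion callback has run
    private Exception _error;
    private readonly Action<Exception> _completed;

    public PageCompletionGate(Action<Exception> completed)
    {
        if (completed == null) throw new ArgumentNullException("completed");
        _completed = completed;
    }

    // Call once per page, before handing the page to its handler.
    public void Enter()
    {
        Interlocked.Increment(ref _pending);
    }

    // Wraps the per-page handler so the gate is released when it returns or throws.
    public Action<T> Wrap<T>(Action<T> page)
    {
        return item =>
        {
            try { page(item); }
            finally { Release(); }
        };
    }

    // Call where the searcher would otherwise invoke the completion callback.
    public void Complete(Exception error)
    {
        _error = error;
        Release();
    }

    private void Release()
    {
        // Only the caller that brings the count to zero runs the callback, and only once.
        if (Interlocked.Decrement(ref _pending) == 0 &&
            Interlocked.Exchange(ref _fired, 1) == 0)
        {
            _completed(_error);
        }
    }
}
```

To wire it into the searcher, you would call Enter() before each asyncOp.Post, post the wrapped page handler instead of the raw one, and have the done lambda call Complete(e). The completion callback then runs on whichever thread finishes last, so marshal it back with asyncOp.Post if that matters to you.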
Monday, June 02, 2008
As a frequent flier and former Continental Platinum Elite member, I was always jealous of the American Airlines program where you got lifetime status on the airline whenever you achieved a million air miles with them. However, I was browsing the FlyerTalk forum when I noticed this announcement. I can't wait to see how far away I am on it (probably 600K or more away is my guess).
FlyerTalk Announcement
Sunday, June 01, 2008
According to a report released by Akamai (of CDN fame), Washington is the slowest state in the union when it comes to broadband. More appalling however is the USA's positioning with respect to the rest of the world. For the country that invented the internet (sorry Al), we are sure a long way behind our European and Asian counterparts, ranking only 24th for greater than 2Mbps broadband penetration.
What is the problem here? Why are we so far behind? More troubling, why has the cost of communication services like internet access gone up year over year with no appreciable increase in either service quality or speed? The same cable companies that begged to be released from regulation, claiming a free market would increase competition and lower prices, have instead consolidated, reduced choice, and increased rates.
Let's take a look at what speeds you typically get from your crappy cable provider. For an astonishing ~$50/month, you can order from your local cable provider a 6Mbps/768Kbps package. In some markets, you can order higher speed packages for significantly higher rates. Of course, in the markets that offer those higher speed tiers you will likely run into a cap:
An MSO talking 100 Mbit/s out of one side of its mouth and usage caps out the other is like a bi-polar buffet restaurateur. They continue adding more entrees to an all-you-can-eat spread, and then reduce the size of the plates and tell diners they only have 10 minutes to chow. It's a recipe for dissatisfaction. The buffet looks bigger and tastier – so the patron's hunger grows – and then they are asked to practice portion control. [source]
Most people I know that have cable (all, in fact) use the standard plan because the higher speed plans are so much more expensive. The plans have to be: if they were cheap, people would buy them, and then the cable companies could never meet the SLAs for their customers. So, let's deal with the average for now.
The part that gets most people is that those bandwidth numbers provided by the cable companies don't mean much in our everyday surfing and downloading. Quick, how does 768Kbps in upload speed translate to your Bittorrent client on Comcast?[1] What is the max speed at which you will be able to download the latest SP3 for XP on your 6Mbps connection? Without doing the math, no one really knows - we are talking two different units here.
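For what it's worth, the math is just a division by eight, since there are eight bits in a byte. Here is a quick sketch of it. The Bandwidth class and its method names are made up purely for illustration, and it assumes 1 Mbps = 1024 Kbps, which is the convention the numbers in this post follow.

```csharp
using System;

// Illustrative helper; the names here are hypothetical.
public static class Bandwidth
{
    // Kilobits per second to kilobytes per second (8 bits per byte).
    public static double KbpsToKBps(double kbps)
    {
        return kbps / 8.0;
    }

    // Megabits per second to kilobytes per second (assumes 1 Mb = 1024 Kb).
    public static double MbpsToKBps(double mbps)
    {
        return mbps * 1024.0 / 8.0;
    }
}
```

Bandwidth.MbpsToKBps(6) works out to 768 KB/s on the download side and Bandwidth.KbpsToKBps(768) to 96 KB/s on the upload side. Both are theoretical maximums.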
What most folks are used to seeing is the browser download progress window:
This measurement (usually in KB/s) is what most people can identify with to truly understand how fast they can download something. On a 6Mbps connection, your theoretical max download speed is only 768KB/s. Of course, no one but no one gets that max speed. It might burst in some markets for a bit, but even with a huge pipe on the other end (like Microsoft Download Center), you are lucky if you get 400KB/s max on most cable systems.
An astute reader will notice that I am not using a normal cable provider here in the screenshot. In fact, at 1.24 MB/s, I am getting roughly 3x faster speeds than 'normal' cable. This is because after suffering Comcast's incredibly slow and laggy connection long enough, I bit the bullet and ordered FiOS from Verizon. I chose the 15Mbps/15Mbps package for roughly $70/month. I am a heavy internet user (both up and down) and so far it has been worth every penny.
Here I am uploading my entire MP3 collection to my backup provider (Mozy). Previously on cable, I never dared to do it because I didn't want to tie up my connection for a few weeks at a measly 40KB/s. With FiOS, I can do this easily.

So what is the problem with the status quo? Unless hordes of users start to migrate to FiOS, cloud services in general will suffer for years to come. The value of a cloud service like Mozy is directly proportional to the bandwidth that users can access to get to it. Today, backing up only 30GB worth of data to cloud storage at 40KB/s would take over 9 days! It is impractical for most people to leave a machine on for 9 days and especially tie up all the bandwidth in the house for the same period of time. I know my parents would never do it.
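The 9-day figure is easy to check. Here is a rough sketch of the calculation. TransferTime is a hypothetical helper written for this illustration, and it assumes 1 GB = 1024 * 1024 KB and a perfectly sustained transfer rate, which real connections won't give you.

```csharp
using System;

// Illustrative helper; the names here are hypothetical.
public static class TransferTime
{
    // Days needed to move 'gigabytes' of data at a sustained 'kilobytesPerSecond'.
    public static double Days(double gigabytes, double kilobytesPerSecond)
    {
        if (kilobytesPerSecond <= 0)
            throw new ArgumentOutOfRangeException("kilobytesPerSecond");

        double kilobytes = gigabytes * 1024.0 * 1024.0;   // assumes 1 GB = 1024 * 1024 KB
        double seconds = kilobytes / kilobytesPerSecond;
        return seconds / (60.0 * 60.0 * 24.0);            // seconds in a day
    }
}
```

TransferTime.Days(30, 40) comes out to roughly 9.1 days, and that is with the upload saturating the connection the whole time.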
If we look at some of the more interesting cloud services that could be offered, we see again that bandwidth (especially up) constrains the value of the service. Unfortunately, I don't hold out a lot of hope for things to improve soon. Certain classes of applications that could go in the cloud will be fine (web sites, some services, and simple download-only services), while truly interactive, rich media, or game-changing devices (imagine the Network PC, but for real now) will be hobbled for years. We'll see...
What do you think?
[1] Trick question: Comcast blocks Bittorrent, so it is likely 0 KB/s instead of the theoretical 96 KB/s (max 40-50 KB/s in real use).
Wednesday, May 28, 2008
SQL Server Data Services returns data in POX (plain ol' XML) format. If you look carefully at the way the data is returned, you can see that individual flex entities look somewhat similar to what is produced by the XmlSerializer. I say 'somewhat' because we have the data wrapped in this 'EntitySet' tag.
<s:EntitySet
xmlns:s="http://schemas.microsoft.com/sitka/2008/03/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:x="http://www.w3.org/2001/XMLSchema">
<PictureTag>
<s:Id>1e803f90-e5e5-4524-9e3c-3ba960be9494</s:Id>
<s:Version>1</s:Version>
<PictureId xsi:type="x:string">3a1714bc-8771-4f6c-8d16-93238f126d9f</PictureId>
<TagId xsi:type="x:string">ab696b85-1bdc-4bed-8824-dfbf9b67b5cc</TagId>
</PictureTag>
</s:EntitySet>
I am using the PictureTag from the PhluffyFotos sample application, but this could be any flexible entity. If we extract the PictureTag element and children from the surrounding EntitySet, we can very easily deserialize this into a class.
Given a class 'PictureTag':
public class PictureTag
{
[XmlElement(Namespace="http://schemas.microsoft.com/sitka/2008/03/")]
public string Id { get; set; }
[XmlElement(Namespace = "http://schemas.microsoft.com/sitka/2008/03/")]
public int Version { get; set; }
public string PictureId { get; set; }
public string TagId { get; set; }
}
We can deserialize this class in just 3 lines of code:
string xml = @"<s:EntitySet
xmlns:s=""http://schemas.microsoft.com/sitka/2008/03/""
xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance""
xmlns:x=""http://www.w3.org/2001/XMLSchema"">
<PictureTag>
<s:Id>1e803f90-e5e5-4524-9e3c-3ba960be9494</s:Id>
<s:Version>1</s:Version>
<PictureId xsi:type=""x:string"">3a1714bc-8771-4f6c-8d16-93238f126d9f</PictureId>
<TagId xsi:type=""x:string"">ab696b85-1bdc-4bed-8824-dfbf9b67b5cc</TagId>
</PictureTag>
</s:EntitySet>";
var xmlTag = XElement.Parse(xml).Element("PictureTag");
XmlSerializer xs = new XmlSerializer(typeof(PictureTag));
var tag = (PictureTag)xs.Deserialize(xmlTag.CreateReader());
Now, the 'tag' variable is a PictureTag instance. As you can see, deserialization is a snap. What about serialization, however?
If I reverse the process using the following code, you will notice that something has changed:
using (var ms = new MemoryStream())
{
//add a bunch of namespaces and override the default ones too
XmlSerializerNamespaces namespaces = new XmlSerializerNamespaces();
namespaces.Add("s", @"http://schemas.microsoft.com/sitka/2008/03/");
namespaces.Add("x", @"http://www.w3.org/2001/XMLSchema");
namespaces.Add("xsi", @"http://www.w3.org/2001/XMLSchema-instance");
XmlWriterSettings xws = new XmlWriterSettings();
xws.Indent = true;
xws.OmitXmlDeclaration = true;
using (XmlWriter writer = XmlWriter.Create(ms, xws))
{
xs.Serialize(writer, tag, namespaces);
ms.Position = 0; //reset to beginning
using (var sr = new StreamReader(ms))
{
xmlTag = XElement.Parse(sr.ReadToEnd());
}
}
}
If I look in the 'xmlTag' XElement, I get somewhat different XML back:
<PictureTag
xmlns:s="http://schemas.microsoft.com/sitka/2008/03/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:x="http://www.w3.org/2001/XMLSchema">
<s:Id>1e803f90-e5e5-4524-9e3c-3ba960be9494</s:Id>
<s:Version>1</s:Version>
<PictureId>3a1714bc-8771-4f6c-8d16-93238f126d9f</PictureId>
<TagId>ab696b85-1bdc-4bed-8824-dfbf9b67b5cc</TagId>
</PictureTag>
I lost the 'xsi:type' attributes that I need in order to signal to SSDS how to treat the type. Bummer.
We can manually add the attributes (fix-up) after the serialization. Let's see how that would work:
XNamespace xsi = @"http://www.w3.org/2001/XMLSchema-instance";
XNamespace ns = @"http://schemas.microsoft.com/sitka/2008/03/";
//xmlTag is the XElement holding our serialized PictureTag
var nodes = xmlTag.Descendants();
foreach (var node in nodes)
{
if (node.Name != (ns + "Id") && node.Name != (ns + "Version"))
{
node.Add(
new XAttribute(
xsi + "type",
GetAttributeType(node.Name.LocalName.ToString(), typeof(PictureTag))
)
);
}
}
We need to loop through each node and set the 'xsi:type' attribute appropriately. Here is my quick and dirty implementation:
static Dictionary<Type, string> xsdTypes = new Dictionary<Type, string>()
{
{typeof(string), "x:string"},
{typeof(int), "x:decimal"},
{typeof(long), "x:decimal"},
{typeof(float), "x:decimal"},
{typeof(decimal), "x:decimal"},
{typeof(short), "x:decimal"},
{typeof(DateTime), "x:dateTime"},
{typeof(bool), "x:boolean"},
{typeof(byte[]), "x:base64Binary"}
};
private static string GetAttributeType(string name, Type type)
{
var prop = type.GetProperty(name);
if (prop != null)
{
if (xsdTypes.ContainsKey(prop.PropertyType))
return xsdTypes[prop.PropertyType];
}
return xsdTypes[typeof(string)];
}
When all is said and done, I am back to what I need:
<PictureTag
xmlns:s="http://schemas.microsoft.com/sitka/2008/03/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:x="http://www.w3.org/2001/XMLSchema">
<s:Id>1e803f90-e5e5-4524-9e3c-3ba960be9494</s:Id>
<s:Version>1</s:Version>
<PictureId xsi:type="x:string">3a1714bc-8771-4f6c-8d16-93238f126d9f</PictureId>
<TagId xsi:type="x:string">ab696b85-1bdc-4bed-8824-dfbf9b67b5cc</TagId>
</PictureTag>
However, I am not sure I really like this technique. It seems that if I am going to be using Reflection to 'fix-up' the XML from the XmlSerializer, I might as well just use it to build the entire thing. With that in mind, here is the next implementation of SSDS Serialization:
public static XElement CreateEntity<T>(T instance, string id) where T : class, new()
{
XNamespace ns = @"http://schemas.microsoft.com/sitka/2008/03/";
XNamespace xsi = @"http://www.w3.org/2001/XMLSchema-instance";
XNamespace x = @"http://www.w3.org/2001/XMLSchema";
if (instance == null)
return null;
if (String.IsNullOrEmpty(id))
throw new ArgumentNullException("id");
Type type = typeof(T);
// Create an element for each non-system, non-binary property on the class
var properties =
from p in type.GetProperties()
where xsdTypes.ContainsKey(p.PropertyType) &&
p.Name != "Id" &&
p.Name != "Version" &&
!p.PropertyType.Equals(typeof(byte[]))
select new XElement(p.Name,
new XAttribute(xsi + "type", xsdTypes[p.PropertyType]),
p.GetValue(instance, null)
);
// Binary properties are special, since they must be serialized as Base-64
var binaryProperties =
from p in type.GetProperties()
where p.PropertyType.Equals(typeof(byte[])) && (p.GetValue(instance, null) != null)
select new XElement(p.Name,
new XAttribute(xsi + "type", xsdTypes[p.PropertyType]),
Convert.ToBase64String((byte[])p.GetValue(instance, null))
);
// Construct the Xml
var xml = new XElement(type.Name,
new XElement(ns + "Id", id), //here is the Id element
new XAttribute(XNamespace.Xmlns + "s", ns),
new XAttribute(XNamespace.Xmlns + "xsi", xsi),
new XAttribute(XNamespace.Xmlns + "x", x),
properties,
binaryProperties
);
return xml;
}
In this case, we are using Reflection to build a list of the properties on the object and, depending on the type (byte[] is special), we build each XElement ourselves and assemble the entity by hand. We can use it like this:
XElement entity = CreateEntity<PictureTag>(tag, tag.Id);
Of course, there are a number of other techniques that I am not covering in this already very long post. Perhaps in my next post we will look at a few others.