Wednesday, July 02, 2008
Here is the last installment in my series on working with objects in SQL Server Data Services (SSDS). For background, readers should read the following:
Serialization in SSDS
Working with Objects in SSDS Part 1
Working with Objects in SSDS Part 2
Last time, we concluded with a class called SsdsEntity<T> that became an all-purpose wrapper or veneer around our CLR objects. This made it simple to take our existing classes and serialize them as entities in SSDS.
In this post, I want to discuss how the querying in the REST library works. First a simple example:
var ctx = new SsdsContext(
"authority=http://dunnry.data.beta.mssds.com/v1/;username=dunnry;password=secret"
);
var container = ctx.OpenContainer("foo");
var foo = new Foo { IsPublic = false, Name = "MyFoo", Size = 12 };
//insert it with unique id guid string
container.Insert(foo, Guid.NewGuid().ToString());
//now query for it
var results = container.Query<Foo>(e => e.Entity.IsPublic == false && e.Entity.Size > 2);
//Query<T> returns IEnumerable<SsdsEntity<T>>, so foreach over it
foreach (var item in results)
{
Console.WriteLine(item.Entity.Name);
}
I glossed over it in my previous posts with this library, but I have a class called SsdsContext that acts as my credential store and factory to create SsdsContainer objects where I perform my operations. Here, I have opened a container called 'foo', which would relate to the URI (http://dunnry.data.beta.mssds.com/v1/foo) according to the authority name I passed on the SsdsContext constructor arguments.
I created an instance of my Foo class (see this post if you want to see what a Foo looks like) and inserted it. We know that under the covers we have an XmlSerializer doing the work to serialize that to the proper POX wire format. So far, so good. Now, I want to retrieve that same entity back from SSDS. The key line here is the container.Query<T>() call. It accepts an Expression<Func<SsdsEntity<T>, bool>> argument that represents a strongly typed query.
For the uninitiated, the Expression<TDelegate> is a way to represent lambda expressions in an abstract syntax tree. We can think of them as a way to model what the expression does without generating the bits of code necessary to actually do it. We can inspect the Expression and create new ones based on it until finally we can call Compile and actually convert the representation of the lambda into something that can execute.
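To make the idea concrete, here is a minimal sketch of inspecting an expression tree and then compiling it. The class and method names (ExpressionDemo, Describe, Run) are made up for illustration and are not part of the library.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionDemo
{
    // Builds the tree for "n => n > 2" and describes it without executing it.
    public static string Describe()
    {
        Expression<Func<int, bool>> expr = n => n > 2;
        var body = (BinaryExpression)expr.Body;
        // NodeType identifies the operator the lambda used (GreaterThan here).
        return string.Format("{0} {1} {2}", body.Left, body.NodeType, body.Right);
    }

    // Compiles the same tree into a delegate and runs it against a value.
    public static bool Run(int value)
    {
        Expression<Func<int, bool>> expr = n => n > 2;
        Func<int, bool> compiled = expr.Compile();
        return compiled(value);
    }
}
```

Describe only looks at the tree, while Run is the point where the representation is turned into executable code.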
The Func<SsdsEntity<T>, bool> represents a delegate that accepts a SsdsEntity<T> as an argument and returns a boolean. This effectively represents the WHERE clause in the SSDS LINQ query syntax. Since SsdsEntity<T> contains an actual type T in the Entity property, you can query directly against it in a strongly typed fashion!
What about those flexible properties that I added to support flexible attributes outside of our T? I mentioned that I wanted to keep the PropertyBucket (a Dictionary<string, object>) property public for querying. In order to use the flexible properties that you add, you simply use it in a weakly typed manner:
var results = container.Query<Foo>(e => e.PropertyBucket["MyFlexProp"] > 10);
As you can see, any boolean expression that you can think of in the string-based SSDS LINQ query syntax can now be expressed in a strongly-typed manner using the Func<SsdsEntity<T>, bool> lambda syntax.
How it works
Since I have the expression tree of what your query looks like in strongly-typed terms, it is a simple matter to take that and convert it to the SSDS LINQ query syntax that looks like "from e in entities where [....] select e" that is appended to the query string in the REST interface. I should say it is a simple matter because Matt Warren did a lot of the heavy lifting for us and provided the abstract expression visitor (ExpressionVisitor) as well as the expression visitor that partially evaluates the tree to evaluate constants (SubTreeEvaluator). This last part is important because it allows us to write this:
int i = 10;
string name = "MyFoo";
var results = container.Query<Foo>(e => e.Entity.Name == name && e.Entity.Size > i);
Without the partial tree evaluation, you would not be able to use local variables such as name and i on the right-hand side of the comparison. All I had to do was implement an expression visitor that correctly evaluated the lambda expression and converted it to the LINQ syntax that SSDS expects (SsdsExpressionVisitor). It would be a trivial matter to actually implement the IQueryProvider and IQueryable interfaces to make the whole thing work with the standard LINQ query syntax.
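The translation step can be sketched roughly as follows. This is not the SsdsExpressionVisitor from the library: it swaps in a plain recursive walk instead of Matt Warren's visitor classes, the Foo and SsdsEntity<T> types are cut-down stand-ins, QuerySketch is a made-up name, and the literal formatting (for example how booleans are written) is an assumption. It only handles simple comparisons joined by && and ||.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;

// Hypothetical stand-ins for the types described in the post.
public class Foo
{
    public string Name { get; set; }
    public int Size { get; set; }
    public bool IsPublic { get; set; }
}

public class SsdsEntity<T>
{
    public string Id { get; set; }
    public T Entity { get; set; }
}

public static class QuerySketch
{
    public static string Translate<T>(Expression<Func<SsdsEntity<T>, bool>> predicate)
    {
        string where = Visit(predicate.Body, predicate.Parameters[0]);
        return "from e in entities where " + where + " select e";
    }

    static string Visit(Expression node, ParameterExpression p)
    {
        var binary = node as BinaryExpression;
        if (binary != null)
        {
            return "(" + Visit(binary.Left, p) + " " + Operator(binary.NodeType)
                + " " + Visit(binary.Right, p) + ")";
        }

        var member = node as MemberExpression;
        if (member != null && IsRootedIn(member, p))
        {
            // e.Entity.Name becomes e["Name"]; e.Id stays e.Id (system metadata).
            var parent = member.Expression as MemberExpression;
            return parent != null && parent.Member.Name == "Entity"
                ? "e[\"" + member.Member.Name + "\"]"
                : "e." + member.Member.Name;
        }

        // Partial evaluation: anything else (a constant or a captured local)
        // is compiled and executed so that its value can be written out.
        object value = Expression.Lambda(node).Compile().DynamicInvoke();
        return Literal(value);
    }

    // True when a member chain such as e.Entity.Size starts at the lambda parameter.
    static bool IsRootedIn(Expression node, ParameterExpression p)
    {
        while (node is MemberExpression) node = ((MemberExpression)node).Expression;
        return node == p;
    }

    static string Operator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.Equal: return "==";
            case ExpressionType.NotEqual: return "!=";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.GreaterThanOrEqual: return ">=";
            case ExpressionType.LessThan: return "<";
            case ExpressionType.LessThanOrEqual: return "<=";
            case ExpressionType.AndAlso: return "&&";
            case ExpressionType.OrElse: return "||";
            default: throw new NotSupportedException(type.ToString());
        }
    }

    static string Literal(object value)
    {
        if (value is string) return "\"" + value + "\"";
        if (value is bool) return (bool)value ? "true" : "false";
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }
}
```

With name = "MyFoo" and i = 10 in scope, QuerySketch.Translate<Foo>(e => e.Entity.Name == name && e.Entity.Size > i) yields from e in entities where ((e["Name"] == "MyFoo") && (e["Size"] > 10)) select e. The captured locals show up as closure field accesses in the tree, which is why they have to be evaluated before the query text can be written.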
Originally, I did supply the IQueryProvider for this implementation but after consideration I have decided that using methods from the SsdsContainer class instead of the standard LINQ syntax is the best way to proceed. Mainly, this has to do with the fact that I want to make it more explicit to the developer what will happen under the covers rather than using the standard Where() extension method.
Querying data
The main interaction to return data is via the Query<T> method. This method is smart enough to add the Kind into the query for you based on the T supplied. So, if you write something like:
var results = container.Query<Foo>(e => e.Entity.Size > 2);
This is actually translated to "from e in entities where e["Size"] > 2 && e.Kind == "Foo" select e". The addition of the kind is important because we want to limit the results as much as possible. Without it, entities of any other kind in the container that happened to have a flexible property named "Size" would also come back in the wire response.
But what if you want that to happen? What if you want to return other kinds that have the "Size" property? To do this, I have introduced a class called SsdsEntityBucket. It is exactly what it sounds like. To use it, you simply specify a query that uses additional types with either the Query<T,U,V> or Query<T,U> methods. Here is an example:
var foo = new Foo
{
IsPublic = true,
MyCheese = new Cheese { LastModified = DateTime.Now, Name = "MyCheese" },
Name = "FooMaster",
Size = 10
};
container.Insert(foo, foo.Name);
container.Insert(foo.MyCheese, foo.MyCheese.Name);
//query for bucket...
var bucket = container.Query<Foo, Cheese>(
(f, c) => f.Entity.Name == "FooMaster" || c.Entity.Name == "MyCheese"
);
var f1 = bucket.GetEntities<Foo>().Single();
var c1 = bucket.GetEntities<Cheese>().Single();
The calls to GetEntities<T> return IEnumerable<SsdsEntity<T>> again. However, this was done in a single call to SSDS instead of multiple calls per T.
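One way such a bucket could be put together is sketched below. This is only an illustration of the idea, not the SsdsEntityBucket from the library: EntityBucketSketch, its Add method, and the cut-down SsdsEntity<T>, Foo, and Cheese types are all assumptions.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the types described in the post.
public class Foo { public string Name { get; set; } }
public class Cheese { public string Name { get; set; } }

public class SsdsEntity<T>
{
    public string Id { get; set; }
    public T Entity { get; set; }
}

public class EntityBucketSketch
{
    // Everything that came back from a single query, regardless of kind.
    readonly List<object> _items = new List<object>();

    public void Add<T>(SsdsEntity<T> entity)
    {
        _items.Add(entity);
    }

    // Picks out only the entities whose wrapped type is T.
    public IEnumerable<SsdsEntity<T>> GetEntities<T>()
    {
        return _items.OfType<SsdsEntity<T>>();
    }
}
```

The single response is deserialized once into the bucket, and each GetEntities<T> call is then just an in-memory filter.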
Paging
As I mentioned earlier, I wanted the developer to understand what they were doing when they called each method, so I decided to make paging explicit. If I had potentially millions of entities in SSDS, it would be a bad mistake to allow a developer to issue a simple query that seamlessly paged the items back - especially if the query was something like e => e.Id != "". Here is how I handled paging:
var container = ctx.OpenContainer("paging");
List<Foo> items = new List<Foo>();
int i = 1;
container.PagedQuery<Foo>(
e => e.Entity.Size != 0,
c =>
{
Console.WriteLine("Got Page {0}", i++);
items.AddRange(c.Select(s => s.Entity));
}
);
Console.WriteLine(items.Count);
The PagedQuery<T> method takes two arguments. One is the standard Expression<Func<SsdsEntity<T>, bool>> that you use to specify the WHERE clause for SSDS, and the other is Action<IEnumerable<SsdsEntity<T>>> which represents a delegate that takes an IEnumerable<SsdsEntity<T>> and has a void return. This is a delegate you provide that does something with the 500 entities returned per page (it gets called once per page). Here, I am just adding them into a List<T>, but I could easily be doing anything else here. Under the covers, this is adding the paging term dynamically into the expression tree that is evaluated.
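Adding a paging term to an existing expression tree can be sketched like this. It is a simplified model, not the library code: it runs against an in-memory list rather than SSDS, Item and PagingSketch are made-up names, the page size is passed in rather than fixed at 500, and the paging term is assumed to be "Id greater than the last Id seen", written with string.CompareOrdinal because C# strings have no > operator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical entity used only for this sketch.
public class Item
{
    public string Id { get; set; }
    public int Size { get; set; }
}

public static class PagingSketch
{
    // Returns a new tree equivalent to "predicate && e.Id > lastId".
    public static Expression<Func<Item, bool>> AddPagingTerm(
        Expression<Func<Item, bool>> predicate, string lastId)
    {
        ParameterExpression e = predicate.Parameters[0];
        MethodInfo compare = typeof(string).GetMethod(
            "CompareOrdinal", new[] { typeof(string), typeof(string) });
        Expression afterLast = Expression.GreaterThan(
            Expression.Call(compare,
                Expression.Property(e, "Id"),
                Expression.Constant(lastId, typeof(string))),
            Expression.Constant(0));
        return Expression.Lambda<Func<Item, bool>>(
            Expression.AndAlso(predicate.Body, afterLast), e);
    }

    // Calls onPage once per page until a short (or empty) page comes back.
    public static void PagedQuery(
        IEnumerable<Item> store,
        Expression<Func<Item, bool>> predicate,
        int pageSize,
        Action<IEnumerable<Item>> onPage)
    {
        string lastId = "";
        while (true)
        {
            Func<Item, bool> filter = AddPagingTerm(predicate, lastId).Compile();
            List<Item> page = store.Where(filter)
                .OrderBy(i => i.Id, StringComparer.Ordinal)
                .Take(pageSize)
                .ToList();
            if (page.Count == 0) break;
            onPage(page);
            if (page.Count < pageSize) break;
            lastId = page[page.Count - 1].Id;
        }
    }
}
```

The caller's predicate is never modified. Each round builds a fresh tree that combines it with the current position, which keeps the paging explicit and the callback in control of what happens to each page.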
What's next
This is a good head start on using the REST API with SSDS today. However, there are a number of optimizations that could be made to the model: additional overloads, perhaps some extension methods for common operations, etc.
As new features are added, I will endeavor to update this as well (blob support comes to mind here). Additionally, I have a few optimizations planned around concurrency for CRUD operations.
I have published this out to Code Gallery and I welcome feedback and bug fixes. Linked here.
Thursday, June 26, 2008
This is the second post in my series on working with SQL Server Data Services (SSDS) and objects. For background, you should read my post on Serializing Objects in SSDS and the first post in this series.
Last time I showed how to create a general purpose serializer for SSDS using the standard XmlSerializer class in .NET. I created a shell entity or a 'thin veneer' for objects called SsdsEntity<T>, where T was any POCO (plain old C#/CLR object). This allowed me to abstract away the metadata properties required for SSDS without changing my actual POCO object (which, I noted, was lame to do).
If we decide that we will use SSDS to interact with POCO T, an interesting situation arises. Namely, once we have defined T, we have in fact defined a schema - albeit one only enforced in code you write and not by the SSDS service itself. One of the advantages of using something like SSDS is that you have a lot of flexibility in storing entities (hence the term 'flexible entity') without conforming to schema. Since I want to support this flexibility, it means I need to think of a way to support not only the schema implied by T, but also additional and arbitrary properties that a user might consider.
Some may wonder why we need this flexibility: after all, why not just change T to support whatever we like? The issue comes up most often with code you do not control. If you already have an existing codebase with objects that you would like to store in SSDS, it might not be practical or even possible to change the T to add additional schema.
Even if you completely control the codebase, expressing relationships between CLR objects and expressing relationships between things in your data are two different ideas - sometimes this problem has been termed 'impedance mismatch'.
In the CLR, if two objects are related, they are often part of a collection, or they refer to an instance on another object. This is easy to express in the CLR (e.g. Instance.ChildrenCollection["key"]). In your typical datasource, this same relationship is done using foreign keys to refer to other entities.
Consider the following classes:
public class Employee
{
public string EmployeeId { get; set; }
public string Name { get; set; }
public DateTime HireDate { get; set; }
public Employee Manager { get; set; }
public Project[] Projects { get; set; }
}
public class Project
{
public string ProjectId { get; set; }
public string Name { get; set; }
public string BillCode { get; set; }
}
Here we see that the Employee class refers to itself as well as contains a collection of related projects (Project class) that the employee works on. SSDS only supports simple scalar types and no arrays or nested objects today, so we cannot directly express this in SSDS. However, we can decompose this class and store the bits separately and then reassemble later. First, let's see what that looks like and then we can see how it was done:
var projects = new Project[]
{
new Project { BillCode = "123", Name = "TPS Slave", ProjectId = "PID01"},
new Project { BillCode = "124", Name = "Programmer", ProjectId = "PID02" }
};
var bill = new Employee
{
EmployeeId = "EMP01",
HireDate = DateTime.Now.AddMonths(-1),
Manager = null,
Name = "Bill Lumbergh",
Projects = new Project[] {}
};
var peter = new Employee
{
EmployeeId = "EMP02",
HireDate = DateTime.Now,
Manager = bill,
Name = "Peter Gibbons",
Projects = projects
};
var cloudpeter = new SsdsEntity<Employee>
{
Entity = peter,
Id = peter.EmployeeId
};
var cloudbill = new SsdsEntity<Employee>
{
Entity = bill,
Id = bill.EmployeeId
};
//here is how we add flexible props
cloudpeter.Add<string>("ManagerId", peter.Manager.EmployeeId);
var table = _context.OpenContainer("initech");
table.Insert(cloudpeter);
table.Insert(cloudbill);
var cloudprojects = peter.Projects
.Select(s => new SsdsEntity<Project>
{
Entity = s,
Id = Guid.NewGuid().ToString()
});
//add some metadata to track the project to employee
foreach (var proj in cloudprojects)
{
proj.Add<string>("RelatedEmployee", peter.EmployeeId);
table.Insert(proj);
}
All this code does is create two employees and two projects and set the relationships between them. Using the Add<K> method, I can insert any primitive type to go along for the ride with the POCO. If we query the container now, this is what we see:
<s:EntitySet
xmlns:s="http://schemas.microsoft.com/sitka/2008/03/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:x="http://www.w3.org/2001/XMLSchema">
<Project>
<s:Id>2ffd7a92-2a3b-4cd8-a5f7-55f40c3ba2b0</s:Id>
<s:Version>1</s:Version>
<ProjectId xsi:type="x:string">PID01</ProjectId>
<Name xsi:type="x:string">TPS Slave</Name>
<BillCode xsi:type="x:string">123</BillCode>
<RelatedEmployee xsi:type="x:string">EMP02</RelatedEmployee>
</Project>
<Project>
<s:Id>892dbb1e-ba47-4c87-80e6-64fbb46da935</s:Id>
<s:Version>1</s:Version>
<ProjectId xsi:type="x:string">PID02</ProjectId>
<Name xsi:type="x:string">Programmer</Name>
<BillCode xsi:type="x:string">124</BillCode>
<RelatedEmployee xsi:type="x:string">EMP02</RelatedEmployee>
</Project>
<Employee>
<s:Id>EMP01</s:Id>
<s:Version>1</s:Version>
<EmployeeId xsi:type="x:string">EMP01</EmployeeId>
<Name xsi:type="x:string">Bill Lumbergh</Name>
<HireDate xsi:type="x:dateTime">2008-05-25T23:59:49</HireDate>
</Employee>
<Employee>
<s:Id>EMP02</s:Id>
<s:Version>1</s:Version>
<EmployeeId xsi:type="x:string">EMP02</EmployeeId>
<Name xsi:type="x:string">Peter Gibbons</Name>
<HireDate xsi:type="x:dateTime">2008-06-25T23:59:49</HireDate>
<ManagerId xsi:type="x:string">EMP01</ManagerId>
</Employee>
</s:EntitySet>
As you can see, I have stored extra data in my 'flexible' entity with the ManagerId property (on one entity) and the RelatedEmployee property on the Project kinds. This allows me to figure out later which objects are related to each other, since we can't model the CLR object relationships directly. Let's see how this was done.
public class SsdsEntity<T> where T: class
{
Dictionary<string, object> _propertyBucket = new Dictionary<string, object>();
public SsdsEntity() { }
[XmlIgnore]
public Dictionary<string, object> PropertyBucket
{
get { return _propertyBucket; }
}
[XmlAnyElement]
public XElement[] Attributes
{
get
{
//using XElement is much easier than XmlElement to build
//take all properties on object instance and build XElement
var props = from prop in typeof(T).GetProperties()
let val = prop.GetValue(this.Entity, null)
where prop.GetSetMethod() != null
&& allowableTypes.Contains(prop.PropertyType)
&& val != null
select new XElement(prop.Name,
new XAttribute(Constants.xsi + "type",
XsdTypeResolver.Solve(prop.PropertyType)),
EncodeValue(val)
);
//Then stuff in any extra stuff you want
var extra = _propertyBucket.Select(
e =>
new XElement(e.Key,
new XAttribute(Constants.xsi + "type",
XsdTypeResolver.Solve(e.Value.GetType())),
EncodeValue(e.Value)
)
);
return props.Union(extra).ToArray();
}
set
{
//wrap the XElement[] with the name of the type
var xml = new XElement(typeof(T).Name, value);
var xs = new XmlSerializer(typeof(T));
//xml.CreateReader() cannot be used as it won't support base64 content
XmlTextReader reader = new XmlTextReader(
xml.ToString(),
XmlNodeType.Document,
null
);
this.Entity = (T)xs.Deserialize(reader);
//now deserialize the other stuff left over into the property bucket...
var stuff = from v in value.AsEnumerable()
let props = typeof(T).GetProperties().Select(s => s.Name)
where !props.Contains(v.Name.ToString())
select v;
foreach (var item in stuff)
{
_propertyBucket.Add(
item.Name.ToString(),
DecodeValue(
item.Attribute(Constants.xsi + "type").Value,
item.Value)
);
}
}
}
public void Add<K>(string key, K value)
{
if (!allowableTypes.Contains(typeof(K)))
throw new ArgumentException(
String.Format(
"Type {0} not supported in SsdsEntity",
typeof(K).Name)
);
if (!_propertyBucket.ContainsKey(key))
{
_propertyBucket.Add(key, value);
}
else
{
//replace the value
_propertyBucket.Remove(key);
_propertyBucket.Add(key, value);
}
}
}
I have omitted the parts of SsdsEntity<T> from the first post that didn't change. The only other addition you don't see here is a helper method called DecodeValue, which as you might guess, interprets the string value in XML and attempts to cast it to a CLR type based on the xsi:type that comes back.
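Since DecodeValue is not shown, here is a guess at what such a helper might look like. The class name ValueDecoderSketch is made up, and the set of xsi:type strings is an assumption based on the "x:" prefixed types visible in the wire format above plus base64Binary for byte[] values.

```csharp
using System;
using System.Xml;

public static class ValueDecoderSketch
{
    // Converts the text of an element into a CLR value based on its xsi:type.
    public static object DecodeValue(string xsiType, string text)
    {
        switch (xsiType)
        {
            case "x:string":
                return text;
            case "x:decimal":
                return XmlConvert.ToDecimal(text);
            case "x:boolean":
                return XmlConvert.ToBoolean(text);
            case "x:dateTime":
                return XmlConvert.ToDateTime(text, XmlDateTimeSerializationMode.RoundtripKind);
            case "x:base64Binary":
                return Convert.FromBase64String(text);
            default:
                throw new NotSupportedException("Unknown xsi:type " + xsiType);
        }
    }
}
```

XmlConvert is used rather than the Parse methods so that the values are read with XML Schema rules instead of the current culture.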
All we did here was add a Dictionary<string, object> property called PropertyBucket that holds the extra stuff we want to associate with our T instance. Then in the getter and setter for the XElement[] property called Attributes, we are adding them into our array of XElement as well as pulling them back out on deserialization and stuffing them back into the Dictionary. With this simple addition, we have fixed our flexibility (or lack thereof) problem. We are still limited to the simple scalar types, but as you can see, you can work around this in a lot of cases by decomposing the objects down enough to be able to recreate them later.
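Recreating the object graph from the decomposed entities could look something like the sketch below. It is an assumption about one possible approach, not code from the library: RecomposeSketch is a made-up name, the SsdsEntity<T> here is a cut-down stand-in with only the members the sketch needs, and it relies on the ManagerId and RelatedEmployee flexible properties from the earlier example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Project
{
    public string ProjectId { get; set; }
    public string Name { get; set; }
}

public class Employee
{
    public string EmployeeId { get; set; }
    public string Name { get; set; }
    public Employee Manager { get; set; }
    public Project[] Projects { get; set; }
}

// Hypothetical stand-in exposing only what this sketch uses.
public class SsdsEntity<T>
{
    readonly Dictionary<string, object> _bucket = new Dictionary<string, object>();
    public string Id { get; set; }
    public T Entity { get; set; }
    public Dictionary<string, object> PropertyBucket { get { return _bucket; } }
}

public static class RecomposeSketch
{
    // Wires Manager and Projects back up using the flexible properties as foreign keys.
    public static void Recompose(
        IList<SsdsEntity<Employee>> employees,
        IList<SsdsEntity<Project>> projects)
    {
        Dictionary<string, Employee> byId = employees.ToDictionary(e => e.Id, e => e.Entity);
        foreach (SsdsEntity<Employee> emp in employees)
        {
            object managerId;
            Employee manager;
            if (emp.PropertyBucket.TryGetValue("ManagerId", out managerId)
                && byId.TryGetValue((string)managerId, out manager))
            {
                emp.Entity.Manager = manager;
            }

            string employeeId = emp.Id;
            emp.Entity.Projects = projects
                .Where(p => p.PropertyBucket.ContainsKey("RelatedEmployee")
                    && (string)p.PropertyBucket["RelatedEmployee"] == employeeId)
                .Select(p => p.Entity)
                .ToArray();
        }
    }
}
```

The flexible properties play the role that foreign keys would play in a relational store, and the CLR references are rebuilt in memory after the entities come back.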
The Add<K> method is a convenience only as we could operate directly against the Dictionary. I also could have chosen to keep the Dictionary property bucket private and not expose it. That would have worked just fine for serialization, but I wanted to also be able to query it later.
In my last post, I said I would introduce a library where all this code is coming from, but I didn't realize at the time how long this post would be and that I still need to cover querying. So... next time, I will finish up this series by explaining how the strongly typed query model works and how all these pieces fit together to recompose the data back into objects (and release the library).
Tuesday, June 17, 2008
Last time we talked about SQL Server Data Services and serializing objects, we discussed how easy it was to use the XmlSerializer to deserialize objects using the REST interface. The problem was that when we serialized objects using the XmlSerializer, it left out the xsi type declarations that we needed. I gave two possible solutions to this problem - one that used the XmlSerializer and 'fixed' the output after the fact, and the other built the XML that we needed using XLINQ and Reflection.
Today, I am going to talk about a third technique that I have been using lately that I like better. It uses some of the previous techniques and leverages a few tricks with XmlSerializer to get what I want. First, let's start with a POCO (plain ol' C# object) class that we would like to use with SSDS.
public class Foo
{
public string Name { get; set; }
public int Size { get; set; }
public bool IsPublic { get; set; }
}
In its correctly serialized form, it looks like this on the wire:
<Foo xmlns:s="http://schemas.microsoft.com/sitka/2008/03/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:x="http://www.w3.org/2001/XMLSchema">
<s:Id>someid</s:Id>
<s:Version>1</s:Version>
<Name xsi:type="x:string">My Foo</Name>
<Size xsi:type="x:decimal">10</Size>
<IsPublic xsi:type="x:boolean">false</IsPublic>
</Foo>
You'll notice that we have the additional system metadata attributes "Id" and "Version" in the markup. We can account for the metadata attributes by doing something cheesy like deriving from a base class:
public abstract class Cheese
{
public string Id { get; set; }
public int Version { get; set; }
}
However, this is very unnatural as our classes would all have to derive from our "Cheese" abstract base class (ABC).
public class Foo : Cheese
{
public string Name { get; set; }
public int Size { get; set; }
public bool IsPublic { get; set; }
}
Developers familiar with remoting in .NET should be cringing right now as they remember the hassles associated with deriving from MarshalByRefObject. In a world without multiple inheritance, this can be painful. I want a model where I can use arbitrary POCO objects (redundant, yes I know) and not be forced to derive from anything or do what I would otherwise term unnatural acts.
What if instead, we derived a generic entity that could contain any other entity?
public class SsdsEntity<T> where T: class
{
string _kind;
public SsdsEntity() { }
[XmlElement(Namespace = @"http://schemas.microsoft.com/sitka/2008/03/")]
public string Id { get; set; }
[XmlIgnore]
public string Kind
{
get
{
if (String.IsNullOrEmpty(_kind))
{
_kind = typeof(T).Name;
}
return _kind;
}
set
{
_kind = value;
}
}
[XmlElement(Namespace = @"http://schemas.microsoft.com/sitka/2008/03/")]
public int Version { get; set; }
[XmlIgnore]
public T Entity { get; set; }
}
In this case, we have simply wrapped the POCO that we care about in a class that knows about the specifics of the SSDS wire format (or more accurately could serialize down to the wire format).
This SsdsEntity<T> is easy to use and provides access to the strongly typed object via the Entity property.
Now, we just have to figure out how to serialize the SsdsEntity<Foo> object and we know that the metadata attributes are taken care of and our original POCO object that we care about is included. I call it wrapping POCOs in a thin SSDS veneer.
The trick to this is to add a bucket of XElement objects on the SsdsEntity<T> class that will hold our public properties on our class T (i.e. 'Foo' class). It looks something like this:
[XmlAnyElement]
public XElement[] Attributes
{
get
{
//using XElement is much easier than XmlElement to build
//take all properties on object instance and build XElement
var props = from prop in typeof(T).GetProperties()
let val = prop.GetValue(this.Entity, null)
where prop.GetSetMethod() != null
&& allowableTypes.Contains(prop.PropertyType)
&& val != null
select new XElement(prop.Name,
new XAttribute(Constants.xsi + "type",
XsdTypeResolver.Solve(prop.PropertyType)),
EncodeValue(val)
);
return props.ToArray();
}
set
{
//wrap the XElement[] with the name of the type
var xml = new XElement(typeof(T).Name, value);
var xs = new XmlSerializer(typeof(T));
//xml.CreateReader() cannot be used as it won't support base64 content
XmlTextReader reader = new XmlTextReader(
xml.ToString(),
XmlNodeType.Document,
null);
this.Entity = (T)xs.Deserialize(reader);
}
}
In the getter, we use Reflection and pull back a list of all the public properties on the T object and build an array of XElement. This is the same technique I used in my first post on serialization. The 'allowableTypes' object is a HashSet<Type> that we use to figure out which property types we can support in the service (DateTime, numeric, string, boolean, and byte[]). When this property serializes, the XElements are simply added to the markup.
The EncodeValue method shown is a simple helper method that correctly encodes string values, boolean, dates, integers, and byte[] values for the attribute. Finally, we are using a helper method that returns from a Dictionary<Type,string> the correct xsi type for the required attribute (as determined from the property type).
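Neither helper is listed in the post, so the following is only a sketch of how they might be written. XsdSketch is a made-up class name, and the type table is an assumption: it maps the numeric types to x:decimal because that is how the int Size property appears in the wire format above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Xml;

public static class XsdSketch
{
    // CLR type to xsi:type lookup, as a Dictionary<Type, string>.
    static readonly Dictionary<Type, string> Map = new Dictionary<Type, string>
    {
        { typeof(string), "x:string" },
        { typeof(int), "x:decimal" },
        { typeof(long), "x:decimal" },
        { typeof(decimal), "x:decimal" },
        { typeof(double), "x:decimal" },
        { typeof(bool), "x:boolean" },
        { typeof(DateTime), "x:dateTime" },
        { typeof(byte[]), "x:base64Binary" }
    };

    public static string Solve(Type type)
    {
        string xsiType;
        if (!Map.TryGetValue(type, out xsiType))
            throw new NotSupportedException("Type " + type.Name + " is not supported");
        return xsiType;
    }

    // Writes a value the way XML expects it, independent of the current culture.
    public static string EncodeValue(object value)
    {
        if (value is bool)
            return XmlConvert.ToString((bool)value);
        if (value is DateTime)
            return XmlConvert.ToString((DateTime)value, XmlDateTimeSerializationMode.RoundtripKind);
        if (value is byte[])
            return Convert.ToBase64String((byte[])value);
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }
}
```

Booleans need special handling because Boolean.ToString() produces "False" with a capital letter, which is not a valid xsd:boolean value.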
For deserialization, what happens is that the [XmlAnyElement] attribute causes all unmapped attributes (in this case, all non-system metadata attributes) to be collected in a collection of XElement. When we deserialize, if we simply wrap an enclosing element around this XElement collection, it is exactly what we need for deserialization of T. This is shown in the setter implementation.
It might look a little complicated, but now simple serialization will just work via the XmlSerializer. Here is one such implementation:
public string Serialize(SsdsEntity<T> entity)
{
//add a bunch of namespaces and override the default ones too
XmlSerializerNamespaces namespaces = new XmlSerializerNamespaces();
namespaces.Add("s", Constants.ns.NamespaceName);
namespaces.Add("x", Constants.x.NamespaceName);
namespaces.Add("xsi", Constants.xsi.NamespaceName);
var xs = new XmlSerializer(
entity.GetType(),
new XmlRootAttribute(typeof(T).Name)
);
XmlWriterSettings xws = new XmlWriterSettings();
xws.Indent = true;
xws.OmitXmlDeclaration = true;
using (var ms = new MemoryStream())
{
using (XmlWriter writer = XmlWriter.Create(ms, xws))
{
xs.Serialize(writer, entity, namespaces);
ms.Position = 0; //reset to beginning
using (var sr = new StreamReader(ms))
{
return sr.ReadToEnd();
}
}
}
}
Deserialization is even easier since we are starting with the XML representation and don't have to build a Stream in memory.
public SsdsEntity<T> Deserialize(XElement node)
{
var xs = new XmlSerializer(
typeof(SsdsEntity<T>),
new XmlRootAttribute(typeof(T).Name)
);
//xml.CreateReader() cannot be used as it won't support base64 content
XmlTextReader reader = new XmlTextReader(
node.ToString(),
XmlNodeType.Document,
null);
return (SsdsEntity<T>)xs.Deserialize(reader);
}
If you notice, I am using an XmlTextReader to pass to the XmlSerializer. Unfortunately, the XmlReader from XLINQ does not support handling of base64 content, so this workaround is necessary.
At this point, we have a working serializer/deserializer that can handle arbitrary POCOs. There are some limitations of course:
- We are limited to the same datatypes that SSDS supports. This also means nested objects and arrays are not directly supported.
- We have lost a little of the 'flexible' in the Flexible Entity (the E in the ACE model). We now have a rigid schema defined by SSDS metadata and T public properties and enforced on our objects.
In my next post, I will attempt to address some of those limitations and I will introduce a library that handles most of this for you.
Wednesday, June 11, 2008
I officially love LINQPad. Joe Albahari has done a great job of introducing a light weight tool that is great for learning and prototyping LINQ queries. From what I gather, Joe and Ben Albahari built this tool as part of their book offering. It was so useful, it has taken on a life of its own.
It may not be entirely obvious, but it turns out you don't have to use LINQPad solely for LINQ queries. You can actually prototype any type of code snippet. I have been using it now instead of SnippetCompiler (another great quick snippet tool).
As an example, here is how to use System.DirectoryServices snippets inside of LINQPad:
Hit F4 to bring up the Advanced Query Properties Window
Add the System.DirectoryServices.dll reference in the Additional References window, and then add "System.DirectoryServices" in the Additional Namespace Imports window.
Now, just type your code normally and hit F5 when you are done.
This is a great little tool to have as you can query databases, build LINQ expressions, and visually inspect the results that come back pretty easily. As you can see, you can also execute arbitrary code snippets. Highly recommended.
Thursday, June 05, 2008
A member in the book's forum mentioned some code I had originally posted here in the blog for asynchronous, paged searches in System.DirectoryServices.Protocols (SDS.P). He questioned whether or not it was thread safe. I honestly don't know - it might not be as I didn't test it extensively.
Regardless, I had actually moved on from that code and started using anonymous delegates for callbacks instead of events. I liked this pattern a bit better because it also got rid of the shared resources.
After reading Stephen Toub's article on asynchronous stream processing, I learned about the AsyncOperationManager which was something I was missing in my implementation. I have been doing a lot lately with .NET 3.5, LINQ, and lambda expressions, so I also decided to rewrite the anonymous delegates to lambda expressions. That is not as big a change, but it is more concise.
I actively investigated using async iterators, but ultimately I decided closures seemed to be more intuitive for me. I might revisit this at some time and change my mind. Here is my outcome:
public class AsyncSearcher
{
LdapConnection _connect;
public AsyncSearcher(LdapConnection connection)
{
this._connect = connection;
this._connect.AutoBind = true; //will bind on first search
}
public void BeginPagedSearch(
string baseDN,
string filter,
string[] attribs,
int pageSize,
Action<SearchResponse> page,
Action<Exception> completed
)
{
if (page == null)
throw new ArgumentNullException("page");
AsyncOperation asyncOp = AsyncOperationManager.CreateOperation(null);
Action<Exception> done = e =>
{
if (completed != null) asyncOp.Post(delegate
{
completed(e);
}, null);
};
SearchRequest request = new SearchRequest(
baseDN,
filter,
System.DirectoryServices.Protocols.SearchScope.Subtree,
attribs
);
PageResultRequestControl prc = new PageResultRequestControl(pageSize);
//add the paging control
request.Controls.Add(prc);
AsyncCallback rc = null;
rc = readResult =>
{
try
{
var response = (SearchResponse)_connect.EndSendRequest(readResult);
//let current thread handle results
asyncOp.Post(delegate
{
page(response);
}, null);
var cookie = response.Controls
.Where(c => c is PageResultResponseControl)
.Select(s => ((PageResultResponseControl)s).Cookie)
.Single();
if (cookie != null && cookie.Length != 0)
{
prc.Cookie = cookie;
_connect.BeginSendRequest(
request,
PartialResultProcessing.NoPartialResultSupport,
rc,
null
);
}
else done(null); //signal complete
}
catch (Exception ex) { done(ex); }
};
//kick off async
try
{
_connect.BeginSendRequest(
request,
PartialResultProcessing.NoPartialResultSupport,
rc,
null
);
}
catch (Exception ex) { done(ex); }
}
}
It can be consumed very easily using something like this:
class Program
{
static ManualResetEvent _resetEvent = new ManualResetEvent(false);
static void Main(string[] args)
{
//set these to your environment
string servername = "server.yourdomain.com";
string baseDN = "dc=yourdomain,dc=com";
using (LdapConnection connection = CreateConnection(servername))
{
AsyncSearcher searcher = new AsyncSearcher(connection);
searcher.BeginPagedSearch(
baseDN,
"(sn=Dunn)",
null,
100,
f => //runs per page
{
foreach (var item in f.Entries)
{
var entry = item as SearchResultEntry;
if (entry != null)
{
Console.WriteLine(entry.DistinguishedName);
}
}
},
c => //runs on error or when done
{
if (c != null) Console.WriteLine(c.ToString());
Console.WriteLine("Done");
_resetEvent.Set();
}
);
_resetEvent.WaitOne();
}
Console.WriteLine();
Console.WriteLine("Finished.... Press Enter to Continue.");
Console.ReadLine();
}
static LdapConnection CreateConnection(string server)
{
LdapConnection connect = new LdapConnection(
new LdapDirectoryIdentifier(server),
null,
AuthType.Negotiate
);
connect.SessionOptions.ProtocolVersion = 3;
connect.SessionOptions.ReferralChasing = ReferralChasingOptions.None;
connect.SessionOptions.Sealing = true;
connect.SessionOptions.Signing = true;
return connect;
}
}
The important thing to note is that because everything is running asynchronously, it is totally possible for the end delegate to be invoked before the paging delegate has a chance to finish processing results (depending on how complicated your code is). You would need to compensate for this yourself.
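One way to compensate is to count the page callbacks that are still outstanding and only run the completion logic when that count reaches zero. The sketch below shows the counting part only. PageTrackerSketch and its method names are made up, and using it properly assumes a change inside AsyncSearcher: PageStarted would have to be called just before each page is posted, PageFinished at the end of the page delegate, and SearchCompleted where the done delegate is currently invoked.

```csharp
using System;
using System.Threading;

public sealed class PageTrackerSketch
{
    // Starts at 1: that token is held until the search itself reports completion.
    int _pending = 1;
    readonly Action _allDone;

    public PageTrackerSketch(Action allDone)
    {
        if (allDone == null) throw new ArgumentNullException("allDone");
        _allDone = allDone;
    }

    // Call before a page callback is queued.
    public void PageStarted()
    {
        Interlocked.Increment(ref _pending);
    }

    // Call when a page callback has finished its work.
    public void PageFinished()
    {
        Release();
    }

    // Call once, when the searcher has no more pages to fetch.
    public void SearchCompleted()
    {
        Release();
    }

    void Release()
    {
        // Whoever brings the count to zero runs the completion action exactly once.
        if (Interlocked.Decrement(ref _pending) == 0) _allDone();
    }
}
```

Because the initial token is only released by SearchCompleted, the completion action cannot fire while pages are still being fetched, and because every queued page holds its own token, it cannot fire while a page is still being processed.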
This client is a console application, so I am using a ManualResetEvent just to prevent it from closing before finishing. You wouldn't need to do this in a WinForms or WPF app.
I am sure there are other optimizations you could make to pass in parameters or even other directory controls. However, the general pattern should apply.
Wednesday, April 09, 2008
I just posted the first version of PhluffyFotos, our SQL Server Data Services (SSDS) sample app to CodePlex. PhluffyFotos is a photo sharing site that allows users to upload photos and metadata (tags, description) to SSDS for storage. As the service gets more features and is updated, the sample will be rev'd as well.
Points of interest that will likely also be blog posts in themselves:
- This sample has a LINQ-to-SSDS provider in it. You will notice we don't use any strings for queries, but rather lambda expressions. I had a lot of fun writing the first version of this and I would expect that there are a few more revisions here to go. Of course, Matt Warren should get a ton of credit here for providing the base implementation.
- This sample also uses a very simplistic ASP.NET Role provider for SSDS. Likely updates here will include encryption and hashing support.
- We have a number of PowerShell cmdlets included for managing authorities and containers.
I have many other ideas for this app as time progresses, so you should check back from time to time to see the updates.
In case anyone was wondering about the name: clouds are fluffy... get it?
You need to have SSDS credentials to run this sample. If you don't have credentials yet, you can see an online version until then at http://www.phluffyfotos.com
Even if you don't have access to SSDS credentials yet, the code is worth a look.
Monday, December 03, 2007
If you are interested in learning more about Visual Studio 2008, make sure you check out the Visual Studio 2008 Training Kit. Weighing in at roughly 120MB compressed, it contains, "a full 5-days of technical content including 20 hands-on labs, 28 presentations, and 20 scripted demos. The technologies covered in the kit include: LINQ, C# 3.0, VB 9, WCF, WF, WPF, Windows CardSpace, Silverlight, ASP.NET Ajax, .NET Compact Framework 3.5, VSTO 3.0, Visual Studio Team System, and Team Foundation Server".
Naturally, you will want to have a machine set up to run all these labs and samples... so, thanks to the hard work of David and James, you can now download a VPC with Vista, Visual Studio 2008 (trial), and the .NET 3.5 framework pre-loaded and ready to run. Get it here.
You want more? Ok, how about 17 training videos describing the technologies and running through a number of demos? Get those here.
These are truly some great resources to get you jumpstarted!
Tuesday, October 30, 2007
There are three ways of figuring out things that have changed in Active Directory (or ADAM). These have been documented for some time over at MSDN in the aptly titled "Overview of Change Tracking Techniques". In summary:
- Polling for Changes using uSNChanged. This technique reads the 'highestCommittedUSN' value as a starting point and then, on each later poll, searches for objects with a higher 'uSNChanged' value. The 'uSNChanged' attribute is not replicated between domain controllers, so you must go back to the same domain controller each time for consistency. Essentially, you search for objects whose 'uSNChanged' is at least the highest value you have seen + 1 and then read in the results, tracking them in any way you wish.
  - Benefits
    - This is the most compatible way. All languages and all versions of .NET support it, since it is a simple search.
  - Disadvantages
    - There is a lot here for the developer to take care of. You get the entire object back, and you must determine what has changed on the object (and whether you care about that change).
    - Dealing with deleted objects is a pain.
    - This is a polling technique, so it is only as real-time as how often you query. This can be a good thing depending on the application. Note that intermediate values are not tracked here either.
- Polling for Changes Using the DirSync Control. This technique uses the ADS_SEARCHPREF_DIRSYNC option in ADSI and the LDAP_SERVER_DIRSYNC_OID control under the covers. Simply make an initial search, store the cookie, and then later search again and send the cookie. It will return only the objects that have changed.
  - Benefits
    - This is an easy model to follow. Both System.DirectoryServices and System.DirectoryServices.Protocols support this option.
    - Filtering can reduce what you need to bother with. As an example, if my initial search is for all users "(objectClass=user)", I can subsequently filter on polling with "(sn=dunn)" and only get back the combination of both filters, instead of having to deal with everything from the initial filter.
    - The Windows 2003+ object security option removes the administrative limitation on using DirSync.
    - Another Windows 2003+ option gives you the ability to return only the incremental values that have changed in large multi-valued attributes. This is a really nice feature.
    - Deals well with deleted objects.
  - Disadvantages
    - This is a .NET 2.0 or later only option. Users of .NET 1.1 will need to use uSNChanged tracking. Scripting languages cannot use this method.
    - You can only scope the search to a partition. If you want to track only a particular OU or object, you must sort out those results yourself later.
    - Using this with non-Windows 2003 mode domains comes with the restriction that you must have the replication 'get changes' permission (by default, admins only).
    - This is a polling technique. It does not track intermediate values either. So, if an object you want to track changes multiple times between searches, you will only get the last change. This can be an advantage depending on the application.
- Change Notifications in Active Directory. This technique registers a search on a separate thread that will receive notifications when any object that matches the filter changes. You can register up to 5 notifications per async connection.
  - Benefits
    - Instant notification. The other techniques require polling.
    - Because this is a notification, you will get all changes, even the intermediate ones that would have been lost in the other two techniques.
  - Disadvantages
    - Relatively resource intensive. You don't want to do a whole ton of these as it could cause scalability issues with your controller.
    - This only tells you that the object has changed, but it does not tell you what the change was. You need to figure out whether the attribute you care about has changed or not. That being said, it is pretty easy to tell if the object has been deleted (easier than uSNChanged polling at least).
    - You can only do this in unmanaged code or with System.DirectoryServices.Protocols.
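To make the first technique a bit more concrete, here is a minimal sketch of the uSNChanged bookkeeping. It is an illustration only: the UsnChangePoller class and its search delegate are hypothetical names, and the delegate stands in for whatever LDAP search you run against a single domain controller (it is assumed to return distinguished name and 'uSNChanged' pairs). Only the filter syntax and the "highest value + 1" logic come from the description above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not an API from any library.
public sealed class UsnChangePoller
{
    // In real use this delegate would wrap an LDAP search against ONE domain
    // controller and return (distinguishedName, uSNChanged) pairs.
    private readonly Func<string, IEnumerable<KeyValuePair<string, long>>> _search;
    private long _lastUsn;

    public UsnChangePoller(
        long highestCommittedUsn,
        Func<string, IEnumerable<KeyValuePair<string, long>>> search)
    {
        _lastUsn = highestCommittedUsn; // starting point, read from the rootDSE
        _search = search;
    }

    public long LastUsn { get { return _lastUsn; } }

    // Builds the LDAP filter for everything changed since the last poll.
    public static string BuildFilter(long lastUsn)
    {
        return "(uSNChanged>=" + (lastUsn + 1) + ")";
    }

    // Runs one poll, returns the DNs that changed, and advances the high-water mark.
    public IList<string> Poll()
    {
        var changed = new List<string>();
        long highest = _lastUsn;
        foreach (var hit in _search(BuildFilter(_lastUsn)))
        {
            changed.Add(hit.Key);
            if (hit.Value > highest) highest = hit.Value;
        }
        _lastUsn = highest;
        return changed;
    }
}
```

Each call to Poll only asks for objects changed since the previous call. Working out what changed on each object, and handling deletions, is still left to the caller, which is exactly the disadvantage listed above.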
For the most part, I have found that DirSync has fit the bill for me in virtually every situation. I never bothered to try any of the other techniques. However, a reader asked if there was a way to do the change notifications in .NET. I figured it was possible using SDS.P, but had never tried it. Turns out, it is possible and actually not too hard to do.
My first thought on writing this was to use the sample code found on MSDN (and referenced from option #3) and simply convert this to System.DirectoryServices.Protocols. This turned out to be a dead end. The way you do it in SDS.P and the way the sample code works are different enough that it is of no help. Here is the solution I came up with:
using System;
using System.Collections.Generic;
using System.DirectoryServices.Protocols;

public class ChangeNotifier : IDisposable
{
    LdapConnection _connection;
    HashSet<IAsyncResult> _results = new HashSet<IAsyncResult>();

    public ChangeNotifier(LdapConnection connection)
    {
        _connection = connection;
        _connection.AutoBind = true;
    }

    public void Register(string dn, SearchScope scope)
    {
        SearchRequest request = new SearchRequest(
            dn, //root the search here
            "(objectClass=*)", //very inclusive
            scope, //any scope works
            null //we are interested in all attributes
        );

        //register our search
        request.Controls.Add(new DirectoryNotificationControl());

        //we will send this async and register our callback
        //note how we would like to have partial results
        IAsyncResult result = _connection.BeginSendRequest(
            request,
            TimeSpan.FromDays(1), //set timeout to a day...
            PartialResultProcessing.ReturnPartialResultsAndNotifyCallback,
            Notify,
            request
        );

        //store the async result so we can abort the search on disposal
        _results.Add(result);
    }

    private void Notify(IAsyncResult result)
    {
        //since our search is long running, we don't want to use EndSendRequest
        PartialResultsCollection prc = _connection.GetPartialResults(result);

        foreach (SearchResultEntry entry in prc)
        {
            OnObjectChanged(new ObjectChangedEventArgs(entry));
        }
    }

    private void OnObjectChanged(ObjectChangedEventArgs args)
    {
        if (ObjectChanged != null)
        {
            ObjectChanged(this, args);
        }
    }

    public event EventHandler<ObjectChangedEventArgs> ObjectChanged;

    #region IDisposable Members

    public void Dispose()
    {
        foreach (var result in _results)
        {
            //end each async search
            _connection.Abort(result);
        }

        //forget the aborted searches so a second Dispose call does nothing
        _results.Clear();
    }

    #endregion
}

public class ObjectChangedEventArgs : EventArgs
{
    public ObjectChangedEventArgs(SearchResultEntry entry)
    {
        Result = entry;
    }

    public SearchResultEntry Result { get; set; }
}
It is a relatively simple class that you can use to register searches. The trick is using the GetPartialResults method in the callback method to get only the change that has just occurred. I have also included the very simplified EventArgs class I am using to pass results back. Note, I am not doing anything about threading here and I don't have any error handling (this is just a sample). You can consume this class like so:
static void Main(string[] args)
{