Programming, Politics, and uhhh Pineapples.
# Thursday, August 20, 2009

Thoughts From The Trenches (Giant Brain Dump Incoming!)

Thursday, August 20, 2009 1:16:44 AM UTC

A random assortment of random thoughts (and rants!) from the trenches...

You Know You're In Trouble When...

  1. You have to convene three people to figure out how to create an instance of one of the core objects in your framework.  I think this is directly related to having an anemic domain model - it just isn't obvious which "service" you should be calling to set the properties on the object.  It seems like the whole thing would be easier if you could just call the constructor or a static initializer on the class to get an instance; this is the most basic premise of an object oriented system (and one that gets thrown to the wayside much too often).  Constructors are the most natural way to create an instance of an object; why not use them?
  2. Your team members are afraid to update their code (in fact, they'll wait days before updating because it's always a painful, time-consuming excursion to get the codebase compiling, not to mention getting the environment working afterwards).  This could be a symptom of many different ills.  In this case, the problem is threefold:
    1. The source control system is painful to use.  The culprit is Accurev; it is perhaps one of the worst source control systems I've ever used (not to mention it's very obscure and uses obtuse terms for common source control actions).  A quick search on Dice yields 6 results for the keyword "Accurev" while "svn or subversion" yields some 786 results.  Of course, the big problem with this is that it takes an extraordinarily long time to ramp up a new addition to the team on the peculiarities of the source control system.  (I still haven't figured out how to look at changesets, how to run "blame" on a file, or why it's so slow...)
    2. There are no automated unit tests for the most basic and important of functionality: data access code.  The lack of a structured way to unit test your core data access code makes the entire codebase seem....fragile.  Changes in code that are not regression tested tend to break things, which tends to ruin productivity.  I can understand not testing code that is dependent on external libraries which are difficult to test (it really requires a lot of thinking and work to do right), but I can't understand why any team wouldn't test their core data access code.  
    3. There is no software support for tracking breaking changes.  What I mean by this is, for example, changes to a database schema or a stored procedure.  The standard way some teams "resolve" this issue is by emailing people when a breaking change is entered.  However, the problem with email is that it's easy to forget someone and, even if you remember everyone, it's not easy to backtrack and find all of the different email notices.  For example, if I'm in the process of writing an intense piece of code, I'll ignore a breaking change and deal with it the next time I update.  But by that time, there could be two or three breaking changes.  It's difficult to sort these out in email and much easier to sort them out with some pretty basic software support.  On FirstPoint, we used a Trac discussion to track breaking changes.  Developers checking in breaking changes were required to document the steps that the other developers would need to take to ensure that the environment remained stable.
  3. You're worried about deadlines, but you roll off two people who've been working on your project for two years and replace them with one person who's been working on the project for two months.  Fred Brooks' The Mythical Man-Month covers this pretty succinctly:

    adding manpower to a late software project makes it later
    The problem is that the new resource cannot possibly have the richness of experience with the existing codebase that is required to be productive right away.  In a system that's sparsely documented (and by that I mean there is no documentation on the core object model), it means that a new developer has to interrupt the workstream of more seasoned developers to get anything done.  This is probably okay when the going is slow and steady, but in crunch time, this becomes a big productivity issue.  I know I hate being interrupted when I'm in the zone, so I personally hate to interrupt others, but in this scenario, I have no choice since there is no documentation, the codebase is huge, and it's not at all obvious how to get the data that I need.
  4. There are multiple ways to set the value of a property on a core object in your model.  What I mean by this is, say I have an object called Document and somehow there are two or more ways to set the value of VersionId (each way giving you a different type of value) when you use a data access object to retrieve an instance.  Again, this is a byproduct of an anemic domain model.  Because the rules of how to use the object are external to the object itself, the proper usage of the properties becomes open to interpretation, based on the specific service populating the object.
  5. Your object model is littered with stuff ending in "DAO", "Util", "Service", or "Manager".  It means that you haven't really thought about your object model in terms of object interactions and the structural composition.  These are suffixes that I use only when I can't think of anything better.  More often than not, when I write these classes, they truly are utility classes and are usually static classes.  If this is a big portion of your codebase, you have some serious problems.
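
Points 1 and 4 can be sketched in a few lines.  This is only an illustration under assumptions: the Document class, its Title, and the "new documents start at version 1" rule are hypothetical, not taken from any real codebase.  The idea is simply that the constructor is the one obvious way to get an instance and that VersionId can only change through one method on the object itself.

```csharp
using System;

// Hypothetical Document class: creation and versioning rules live on the object,
// not in an external "service".
public sealed class Document
{
    private readonly string _title;
    private int _versionId;

    // The constructor is the one obvious way to get a valid instance.
    public Document(string title)
    {
        if (string.IsNullOrEmpty(title))
            throw new ArgumentException("A document needs a title.", "title");

        _title = title;
        _versionId = 1; // assumed rule: every new document starts at version 1
    }

    public string Title { get { return _title; } }

    // Read-only from the outside, so no service can assign a different kind of value.
    public int VersionId { get { return _versionId; } }

    // The only way VersionId ever changes.
    public void CheckInNewVersion()
    {
        _versionId++;
    }
}
```

A caller writes `new Document("Spec")` and later `CheckInNewVersion()`; there is nothing else to discover, and nothing that leaves VersionId open to interpretation.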

You Can Make People Productive If...

I think the role of any senior developer, lead, or principal on a project is not to watch over everyone's shoulder and make sure that they are writing good code.  I've learned pretty early on that this doesn't work; you can't control how people write code and if you try to, you'll just get your panties in a twist all the time, raise your blood pressure to unhealthy levels, and piss off everyone around you.  So then the question is how can you get a group of diverse individuals with a diverse level of experience to write consistently good code?

It's a hard question and one that I'm still trying to answer.  However, I've learned a few lessons from my own experiences in working with people:

  1. Make an effort to educate the team.  This means reading assignments, group discussions, and making learning a basic requirement of the job, not an optional extracurricular activity.  Pick a book of the month and commit to reading a chapter a day.
  2. Have code reviews regularly.  One of the surest ways to help get everyone on the same page is through code reviews.  The key is to keep it focused and not let the process devolve into a back-and-forth debate regarding the little things, but rather focus on the structural elements of the objects and interactions.
  3. The smartest guys on the team work on the most "useless" code.  What I mean by "useless" here is that the code doesn't yield immediate benefits; in other words, framework code.  Typically, this involves lots of interfaces, abstract classes, and lots of fancy-pants design patterns.  The idea here is to make it easy for the whole team to write structurally sound code, regardless of skill level, by modeling the core interactions between objects and the core structure of the objects.  I think a key problem is that project managers see this as a zero-sum activity early on in the game (the most important time to establish this type of code) when in reality, it usually returns a huge ROI when done with the right amount of forethought and proper effort to refactor when the need arises.
  4. Document things...thoroughly.  One of the easiest ways to mitigate duplication and misuse is to use documentation in the code.  For framework level code, it's even more important to have solid documentation about the fields, what type of values to expect, how the objects should be used, how instances are created, what special actions need to be performed for cleanup, etc.  Documentation done right can also help improve code consistency if you add examples into your documentation.
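
As a rough sketch of what point 4 might look like in C#, here is a small, hypothetical DateRange type documented with standard XML doc comments, including an example block.  The type itself is made up for illustration; only the comment tags (summary, remarks, example, code, exception, paramref, see) are real C# documentation tags.

```csharp
using System;

/// <summary>
/// A span of time with an inclusive start and an exclusive end.
/// (Hypothetical type, used only to illustrate documentation style.)
/// </summary>
/// <remarks>
/// Instances are immutable; build a new range instead of changing one.
/// No cleanup is required.
/// </remarks>
/// <example>
/// <code>
/// var week = new DateRange(new DateTime(2009, 8, 17), new DateTime(2009, 8, 24));
/// bool inside = week.Contains(new DateTime(2009, 8, 20)); // true
/// </code>
/// </example>
public sealed class DateRange
{
    private readonly DateTime _start;
    private readonly DateTime _end;

    /// <summary>
    /// Creates a range from <paramref name="start"/> up to, but not including, <paramref name="end"/>.
    /// </summary>
    /// <exception cref="ArgumentException">
    /// Thrown when <paramref name="end"/> is earlier than <paramref name="start"/>.
    /// </exception>
    public DateRange(DateTime start, DateTime end)
    {
        if (end < start)
            throw new ArgumentException("end must not be earlier than start.", "end");

        _start = start;
        _end = end;
    }

    /// <summary>The first instant inside the range.</summary>
    public DateTime Start { get { return _start; } }

    /// <summary>The first instant after the range; never earlier than <see cref="Start"/>.</summary>
    public DateTime End { get { return _end; } }

    /// <summary>Returns true when <paramref name="moment"/> falls inside the range.</summary>
    public bool Contains(DateTime moment)
    {
        return moment >= _start && moment < _end;
    }
}
```

The example inside the comment doubles as the "house style" for how the type is meant to be created and used, which is where the consistency benefit comes from.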

Writing good code is productive.  It becomes easier to maintain, easier to bugfix, easier to ramp up new developers, easier for one developer to take over for another, and it means a generally more pleasant and insightful workday, every day.  Which brings us to...

Sound Software Engineering Is Like...

Exercise!  Project managers seem to lose this very basic insight when they make the transition from a developer.  Like exercise, it's always easier to put in the effort to do it regularly and eat a healthy diet than to wait until you're obese and then start worrying about your health and well-being.  Sure, it feels like hard work, waking up at the crack of dawn and going out into the rain/snow/dark, eating granola and oatmeal, skipping the fries and mayonnaise, but it's much easier to keep weight off than to lose weight once you're 200lbs overweight!

Likewise, it's always going to be easier to refactor daily as necessary and address glaring structural issues as soon as possible than to let them linger and keep stuffing donuts in your face.  It's like carrying around 200lbs of fat: you lose agility, it becomes difficult to move, everything seems to take more effort - even simple things like climbing the stairs become a chore.  The lesson is to trim the fat as soon as possible; don't let serious structural issues linger -- if there's a better, cleaner, easier way to do something, do it that way.  Every excuse you make to keep fat, ugly code around will only make it heavier and harder to maintain.

How To Reinvent The Wheel...

It seems like a pretty common problem: a lead or architect doesn't want to use a library because it's not "mature" enough.  What this means, exactly, still baffles me to this day.  Mature is such an arbitrary measure that it's hard to figure out when software becomes mature.  What this usually leads to is reinventing the wheel (several times over).

When evaluating third party libraries, I really only have a handful of criteria for deciding whether I want to use one or not:

  1. Is it open source and is the license friendly for commercial usage?  I'll almost always take a less feature-rich, open source library over a more complete licensed library.  The reason is that there's less lock-in.  I won't feel like I've just wasted $1000 (or whatever) if I encounter a scenario where the library is insufficient or plain doesn't work.
  2. Does it have sufficient documentation to get the basic scenarios working?  This is perhaps the only measure of "maturity" that matters to me.
  3. Does it solve some scenario that would otherwise take the team an inordinate amount of time to implement ourselves? I hate wasting time duplicating work that's freely available and well documented with a community of users who can help if a problem arises.  And yet, time and time again, there is no end to the resistance against using third party libraries.  Part of it is this very abstract definition of "maturity" (objections by technical people) and part of it is a fundamental misunderstanding and general laziness about different licensing models (the business folks).

That's it.  I don't need the Apache Software Foundation to tell me whether log4net is mature or not.  I look at the documentation, I write some test code, I use it and I evaluate it, and I incorporate it once I'm satisfied.

Software Estimation And Baking Cakes...

Fine-grained software estimation is most assuredly the biggest waste of everyone's time.  Once it comes down to the granularity of man-hours, you know that someone has failed at their job since there is no way to even quantify that level of absurdity.  Once you start having meetings about your fine-grained estimates that pull in all of the developers, then you really know that you're FOCKED.

If I handed you a box of cake batter and asked how long it would take you to bake the cake, you'd probably take a look at the directions, read the steps, and estimate how long it would take you to perform all of the steps and add the baking time and come up with 50 minutes.  Okay, we start the timer.  You're off and cracking eggs and cutting open pouches and what not.  But wait, your mother calls and wants to talk about your trip next week.  -5 minutes.  You open the fridge and find that you're half a stick of butter short so you run to the grocery store.  -30 minutes.  Oh shoot!  You forgot to pre-heat the oven.  -5 minutes.  Finally, you've got the batter mixed up and ready to bake.  The directions say to bake for 40 minutes but you've already used up 40 minutes and only 10 minutes left of your original estimate: now what?

Well, you could turn up the heat, but that'd only serve to singe the outside of the cake while leaving the inside uncooked.  You could just bake it for 10 minutes, but your cake would still be uncooked -- but hey, you'd meet your estimate.  More likely than not, you'd just bake the cake for 40 minutes and come in 30 minutes late since late, edible cake is better than burnt or mushy cake.

Software estimation is kinda like that (and look, in the case of baking a cake, all of the directions and exact steps are already well defined and spelled out for you -- writing software is rarely so straightforward).  It's mostly an exercise in futility once it becomes too granular since there are just too many variables to account for.  The answer -- if it must be implemented feature complete -- is that it's going to take as long as it's going to take (and probably longer!).  For most non-trivial tasks, I feel like the only proper level of granularity is weeks.  Don't get me wrong, I'm not saying that you shouldn't estimate, but that you should estimate at the right level of granularity and accept that once you've reached your estimate and the work isn't done, your only real choices are to:

  1. Extend the deadline.
  2. Trim the unnecessary features.

So that's it; feels good after a brain dump!

# Wednesday, August 12, 2009

Advertising Done Right

Wednesday, August 12, 2009 5:21:18 PM UTC

This one made me chuckle:

# Wednesday, July 29, 2009

Domain Models, Anemic Domain Models, and Transaction Scripts (Oh My!)

Wednesday, July 29, 2009 2:44:46 AM UTC

Ever work on a small project (say 5-8 developers, a few hundred thousand lines of code) and get the feeling that the codebase is unreasonably large and difficult to navigate or use/reuse?  Ever notice that other people keep duplicating logic -- like validation logic -- all over the place, violating the DRY principle every which way?  Ever notice how difficult it is to change one part of your system without breaking lots of stuff in another part (I mean, this happens anyways, to a degree, but is it a common occurrence on your project)?

It would seem that your project might be suffering the side effects of an anemic domain model.  As I have observed that these models tend to be used mostly with a transaction script style design, I will use these two terms interchangeably.

First, what are anemic domain models and transaction scripts?  This has been discussed to death and there are tons of resources which describe an anemic domain model.  I'll spare you from repeating these points, but I'll summarize this approach to design as creating a lot of dumb "shell" classes (essentially, all of your core domain objects are just DTOs) with everything essentially public (because an anemic domain model relies on components of a transaction script (typically things called "services") to act on them and modify their state, typically breaking encapsulation).  You'll recognize it if you are constantly using classes with a "Service" or "Util" or "Manager" suffix.

The wiki page for anemic domain model has a nice summary of why you should avoid using them:

  • Logic cannot be implemented in a truly object oriented way unless wrappers are used, which hide the anemic data structure.
  • Violation of the principles of information hiding and encapsulation.
  • Necessitates a separate business layer to contain the logic otherwise located in a domain model. It also means that domain model's objects cannot guarantee their correctness at any moment, because their validation and mutation logic is placed somewhere outside (most likely in multiple places).
  • Necessitates a global access to internals of shared business entities increasing coupling and fragility.
  • Facilitates code duplication among transactional scripts and similar use cases, reduces code reuse.
  • Necessitates a service layer when sharing domain logic across differing consumers of an object model.
  • Makes a model less expressive and harder to understand.

I'll admit: I've been guilty of this very pattern (or anti-pattern, if you want to be an "object bigot").  It's not that it's a bad thing.  In fact, as Greg Young argues, it's a perfectly suitable pattern in some cases.  Fowler himself says that there are virtues to this pattern:

The glory of Transaction Script is its simplicity.  Organizing logic this way is natural for applications with only a small amount of logic, and it involves very little overhead either in performance or in understanding.

It's hard to quantify the cutover level, especially when you're more familiar with one pattern than the other.  You can refactor a Transaction Script design to a Domain Model design, but it's harder than it needs to be.

However much of an object bigot you become, don't rule out Transaction Script.  There are a lot of simple problems out there, and a simple solution will get you up and running faster. (PoEAA p.111-112)

So it's not that it's inherently a bad design; what I've found through my own experience is that it doesn't scale well.  What does this even mean?  Again, I admit a certain level of ignorance; it's not until about a year and a half ago that I finally "got it".  I had read through Fowler's book a few years ago, and found it difficult to grasp what it meant to write an application using a domain model.  Fowler himself only spends some 9 pages discussing the topic and, in his closing paragraph, "chickens out" on providing an end-to-end example.

In working on my current project with several junior consultants, I've found myself trying to explain what it means to design a software system using a domain model as opposed to a transaction script model.  In doing so, I think I've whittled it down to a pretty simple set of examples and the simplest explanation of why a domain model is far superior to a transaction script model when working on a team of more than one :)

While codebases written in the style of DM or TS/ADM will probably contain the same number of classes, there is a big difference in how those classes are wired together and how much surface area a user of the codebase (let's call it an API) will need to know.

As an aside, IMO, as soon as you're programming in a team of more than one, you're writing an API or a framework of some sort, even if on a very small scale and very loosely defined.

One of the key benefits of a domain model approach is that it hides the complexity of the different components behind the core objects of the problem domain.  I like to think that in a transaction script model, the locus of control is placed outside of your core objects.  In a well designed domain model, the locus of control is contained within your core objects (even if big pieces of functionality are still implemented in external services).

Take a calendaring application for example.  In a transaction script implementation, you'd have something like this to move a scheduled event (in C#):

// What you would expect to see in a TS/ADM
Event _anExistingEvent = ... ;

EventScheduler scheduler = new EventScheduler();
scheduler.MoveEventDate(_anExistingEvent, DateTime.Now.AddDays(1));

In contrast, what you would expect to see in a domain model approach:

Event _anExistingEvent = ... ;
_anExistingEvent.MoveEventDate(DateTime.Now.AddDays(1));

The difference is subtle, but there is a huge gain in usability as there is one less class involved in the interactions of your domain object; a user of your API/framework (i.e. your coworker) doesn't have to learn about the EventScheduler to use your code.  Likewise, if we consider the other things that we can do with events, we start to see the benefits of encapsulating (or rather hiding) the business logic and complex interactions within the domain object instead of in external services.  For example, imagine another scenario where you want to send an event in an email:

// TS/ADM style
Event _anExistingEvent = ...;
string _serverAddress = ...;

EventSendingService service = new EventSendingService(_serverAddress);

service.CreateMessageForEvent(_anExistingEvent);
service.Connect();
service.SendEventMessage();

// DM approach:

Event _anExistingEvent = ...;
_anExistingEvent.SendEventMessage(_serverAddress);

In this case, we now need to be aware of two services to interact with events in our calendaring application for moving a date and sending an event.  Not only that, they're relatively hard to discover; whereas a method on the domain object would be easily discoverable via intellisense, it's not clear how a new user to the system would know which classes were involved in which business interactions without sufficient documentation and/or assistance from the long-timers.

Now while I've intentionally left out the implementation of the event class, I'm not implying that there is any less code in the implementation (there may be more and it may be far more complex, involving the usage of dependency injection or inversion of control), but if we consider this code as a public API intended for use by others, clearly, the domain model approach is much more usable and approachable than a transaction script approach where a user of the API has to know many more objects and understand how they are supposed to interact.
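
For the curious, here is one way the Event side of MoveEventDate could look.  This is a minimal sketch under assumptions, not the implementation I left out: the EventScheduler API (IsFree, Book, Release) and the "one event per day" rule are invented purely for illustration.  The point it tries to show is that the service still exists, but the rule and the call sequence sit behind a single method on the domain object.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical service: it still exists in a domain model design,
// but code that moves an event never has to touch it.
public class EventScheduler
{
    private readonly List<DateTime> _bookedDates = new List<DateTime>();

    public bool IsFree(DateTime date) { return !_bookedDates.Contains(date.Date); }
    public void Book(DateTime date) { _bookedDates.Add(date.Date); }
    public void Release(DateTime date) { _bookedDates.Remove(date.Date); }
}

public class Event
{
    private readonly EventScheduler _scheduler;
    private DateTime _date;

    // In a real system a factory, repository, or IoC container would likely
    // supply the scheduler; it is passed in directly here to stay self-contained.
    public Event(DateTime date, EventScheduler scheduler)
    {
        if (scheduler == null) throw new ArgumentNullException("scheduler");

        _scheduler = scheduler;
        _date = date.Date;
        _scheduler.Book(_date);
    }

    public DateTime Date { get { return _date; } }

    // The caller's single entry point for rescheduling.
    public void MoveEventDate(DateTime newDate)
    {
        if (newDate.Date == _date) return; // nothing to do

        // Assumed rule: only one event per day.
        if (!_scheduler.IsFree(newDate))
            throw new InvalidOperationException("That date is already booked.");

        _scheduler.Release(_date);
        _date = newDate.Date;
        _scheduler.Book(_date);
    }
}
```

There is more code here than in the transaction script version, but once a caller has an Event in hand, it only ever sees `MoveEventDate`.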

I would hardly call myself an expert on the subject (as I've written many an anemic domain model in my time).  But to me, externally, the distinguishing feature of a domain model approach over a transaction script approach is that the intended usage of the codebase is more discoverable, even if the LOC in the actual implementation only differs by 1%.

Visually, I think of the difference like this:

Clearly, we can see that one of the key benefits of a domain model approach is that there is less coupling between your calling code and your business logic (making it somewhat less painful to change implementation).  Note that there aren't necessarily any fewer service classes in a domain model approach (although their APIs are likely dramatically different than the APIs of a transaction script model).  We can also see that in a domain model approach, the caller or user of the API only has to know about the domain objects and may or may not know about the services in the background (if we were designing for testability, we'd have some overloads that allow us to pass the concrete service for purposes of mocking).  Interacting primarily with the domain objects has the benefit of making it easier to think about the business scenario and the business problem that you're trying to solve.

Once I grasped this, I started to see the huge benefit that a domain model approach has over an anemic domain model, even on a two person team.  Nowadays, I strongly believe that an anemic domain model/transaction script approach is suitable for only the smallest of application development environments: a one man team.  Because as soon as you are expected to program different, interacting parts of a system in a team, class explosion becomes a real problem and hindrance to usability (which leads to high ramp up time) and discoverability (which leads to duplication and lots of "Oh, I didn't know we had that" or "I already implemented that in that other service").  In such a scenario, documentation (which never exists) becomes even more important (and, if it exists, even more dense).

One very real concern is that then the domain object will become far too complex and bloated.  Fowler addresses this:
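
The testability aside can be sketched roughly like this (overloads, in C# terms).  Everything here is hypothetical and only for illustration: IEventMessageSender, DefaultEventMessageSender, and RecordingSender are made-up names, and the "default" sender just writes to the console instead of talking to a real mail server.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over whatever actually delivers the message.
public interface IEventMessageSender
{
    void Send(string serverAddress, string message);
}

// Stand-in for the real service; a real one would connect to a mail server.
public class DefaultEventMessageSender : IEventMessageSender
{
    public void Send(string serverAddress, string message)
    {
        Console.WriteLine("Sending to {0}: {1}", serverAddress, message);
    }
}

// Test double that records messages instead of sending them.
public class RecordingSender : IEventMessageSender
{
    public readonly List<string> Sent = new List<string>();

    public void Send(string serverAddress, string message)
    {
        Sent.Add(serverAddress + "|" + message);
    }
}

public class Event
{
    private readonly string _title;

    public Event(string title)
    {
        _title = title;
    }

    // What most callers use: no service in sight.
    public void SendEventMessage(string serverAddress)
    {
        SendEventMessage(serverAddress, new DefaultEventMessageSender());
    }

    // Overload for tests (or unusual callers) that need to supply the service.
    public void SendEventMessage(string serverAddress, IEventMessageSender sender)
    {
        if (sender == null) throw new ArgumentNullException("sender");

        sender.Send(serverAddress, "You are invited to: " + _title);
    }
}
```

A unit test passes a RecordingSender to the second overload and inspects `Sent`; everyday callers keep using the one-argument version and never learn that a sender exists.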

A common concern with domain logic is bloated domain objects.  As you build a screen to manipulate your orders you'll notice that some of the order behavior is only needed for it.  If you put these responsibilities on the order, the risk is that the Order class will become too big because it's full of responsibilities that are only used in a single use case.  This concern leads people to consider whether some responsibility is general, in which case it should sit in the order class, or specific, in which case it should sit in some usage-specific class, which might be a Transaction Script or perhaps the presentation itself.

(Incidentally, that last part regarding putting logic in the presentation is what makes ASP.NET webforms so crappy: the design of the framework encourages this, and it doesn't help that most of the examples in books and on MSDN do the same.)

However, Fowler makes the point that:

The problem with separating usage-specific behavior is that it can lead to duplication.  Behavior that's separated from the order is harder to find, so people tend to not see it and duplicate it instead.  Duplication can quickly lead to more complexity and inconsistency, but I've found that bloating occurs much less frequently than predicted.  If it does occur, it's relatively easy to see and not difficult to fix.  My advice is to not separate usage-specific behavior.  Put it all in the object that's the natural fit.  Fix the bloating when, and if, it becomes a problem.

This point is particularly important: I think most of us recognize this scenario: you change some logic in one place and close a bug ticket only to have another one opened up somewhere else regarding the same issue because you forgot to copy your fix over.  Blech!  If your validation code is external to your object, it's easy to end up writing it in two use cases and only updating one (and it happens quite often, in my experience).  Even if you move your validation code into a common "*Service" class, it is still less discoverable to a new user (well, even team members that have been on the project the whole time, actually) than a method on the class itself.  Again, the point is that discoverability and reducing surface area can aid dramatically in terms of cutting down duplication of logic in your codebase.

IMO, a domain model style application design is the way to go.  It's hard to make a case for transaction scripts or anemic domain models.  Granted: a domain model is no silver bullet and can add significantly to the initial difficulty of implementation, but I think that as your codebase matures and grows, the long term savings from an initial investment in setting up the framework (and mindset) to support a domain model are more than worth it, even if you have to build one and throw it away to learn how.
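
Here's a minimal sketch of what "validation on the class itself" might look like, sticking with the calendaring example.  The title rules (required, at most 100 characters) and the member names are assumptions made up for illustration.  The rule is written once, enforced by the object on every change, and exposed so that a screen can ask the same question without copying the logic.

```csharp
using System;

public class Event
{
    public const int MaxTitleLength = 100; // assumed limit, for illustration only

    private string _title = "";

    public Event(string title)
    {
        Rename(title);
    }

    public string Title { get { return _title; } }

    // Every change goes through the same rule, so a fix made here applies everywhere.
    public void Rename(string title)
    {
        if (!IsValidTitle(title))
            throw new ArgumentException("Title is required and must be at most 100 characters.", "title");

        _title = title.Trim();
    }

    // Exposed so UI code can check input up front instead of re-implementing the rule.
    public static bool IsValidTitle(string title)
    {
        if (title == null) return false;

        string trimmed = title.Trim();
        return trimmed.Length > 0 && trimmed.Length <= MaxTitleLength;
    }
}
```

Anyone typing `Event.` or `someEvent.` in the editor finds the rule right next to the data it protects, which is the discoverability argument in miniature.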

# Monday, July 27, 2009

6 Books That Should Be On Every .NET Developer's Bookshelf

Monday, July 27, 2009 1:12:06 AM UTC

As I've been working on my current project, I've found that many freshman developers who want to get better often have a hard time navigating the huge amount of resources out there to improve their skills.  It's hard to commit given the cost of some of these books (usually at least $30 a pop) and the time it takes to make it through them if you don't know where to begin.

IMO, there are six books which are must reads for any serious .NET developer with aspirations for architecture (I'd probably recommend reading them in this order, too):

  1. Pro C# by Andrew Troelsen, which covers just enough about the CLR and the high level concepts in .NET programming with enough readability to not put you to sleep every time you crack it open.  Even if you never use some of the features of the platform, it's important to know what's possible on the platform as it influences your design and implementation choices.  While the "bookend" chapters aren't necessarily that great, the middle chapters are invaluable.
  2. Framework Design Guidelines by Cwalina and Abrams which provides a lot of insight into the guidelines that Microsoft implemented internally in designing the .NET framework.  This is important for writing usable code, in general.  I tend to think that all code that I write is -- on some level -- a framework (even if in miniature).  Many otherwise tedious discussions on standards, naming, type/member design, etc. can be resolved by simply referencing this book as The Guideline.
  3. Design Patterns by GoF.  All too often, I come across terrible, terrible code where the original coder just goes nuts with if-else, switch-case statements...basic knowledge of how to resolve these issues leads to more maintainable and easier to read code.  At the end of the day, design patterns are aimed at helping you organize your code into elements that are easier to understand conceptually, easier to read, easier to use, and easier to maintain.  It's about letting the structure of your objects and the interactions between your objects dictate the flow of logic, not crazy, deeply nested conditional logic (I hope to never again see another 2000+ line method...yes, a single method).
  4. Patterns of Enterprise Application Architecture by Fowler.  My mantra that I repeat to clients and coworkers is that "it's been done before".  There are very few design scenarios where you need to reinvent the wheel from the ground up.  For business applications, many of the common patterns are documented and formalized in this book (to be paired with Design Patterns).  This and Design Patterns are must reads for any developer that is seriously aspiring to be a technical lead or technical architect.  As the .NET framework matures and we diverge from legacy programming, understanding design patterns is becoming more important to grasping the benefits and liabilities of designs and frameworks.  For high level technical resources, it's important to understand how to write "clean" code by designing around object interactions; design patterns document these commonly recurring interactions.  It is also a vocabulary and set of conventions by which programmers can communicate intent and usage of complex graphs of objects.
  5. Code Complete 2nd Edition by McConnell.  While not C# or .NET specific, it delves into the deep nitty-gritty of the art of programming (and it's still very much an art/craft as opposed to a science). Too often, we lose sight of this core principle in our software construction process in our rush to "make it work", often leaving behind cryptic, unreadable, unmaintainable code.  Some of the chapters in this book will definitely put you to sleep, but at the same time, it's filled with so much useful insight that it's worth trudging through this behemoth of a book.
  6. Pragmatic Unit Testing in C# by Hunt.  This book, perhaps more than any of the ones listed above, gives a much more practical view of the how's and why's of good object oriented design.  Designing for testability intrinsically means creating decoupled modules, classes, and methods; it forces you to think about your object structure and interactions in a completely different frame of mind.  Test driven development/design is good to learn in and of itself, but I think the biggest thing I got from reading this book was insight into the small changes in code construction for the purpose of testability that yield big dividends in terms of decoupling your code modules.  I think that once you read this book, you'll start to really understand what it means to write "orthogonal code".
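
To give a taste of what the Design Patterns entry is getting at, here is a small sketch of letting object structure replace a switch-case.  It is roughly the Strategy pattern; the discount scenario and every name in it (IDiscountPolicy, NoDiscount, PercentageDiscount, Invoice) are invented for illustration and don't come from any of the books.

```csharp
using System;

// The style being replaced would look something like:
//   switch (customerType) { case "Gold": ...; case "Silver": ...; default: ...; }
// repeated in every method that cares about discounts.

public interface IDiscountPolicy
{
    decimal Apply(decimal amount);
}

public class NoDiscount : IDiscountPolicy
{
    public decimal Apply(decimal amount) { return amount; }
}

public class PercentageDiscount : IDiscountPolicy
{
    private readonly decimal _percent;

    public PercentageDiscount(decimal percent)
    {
        if (percent < 0m || percent > 100m)
            throw new ArgumentOutOfRangeException("percent");

        _percent = percent;
    }

    public decimal Apply(decimal amount)
    {
        return amount - (amount * _percent / 100m);
    }
}

public class Invoice
{
    private readonly decimal _subtotal;
    private readonly IDiscountPolicy _discount;

    public Invoice(decimal subtotal, IDiscountPolicy discount)
    {
        if (discount == null) throw new ArgumentNullException("discount");

        _subtotal = subtotal;
        _discount = discount;
    }

    // No conditional logic here: the policy object decides.
    public decimal Total { get { return _discount.Apply(_subtotal); } }
}
```

Adding a new kind of discount means adding a class, not hunting down every switch statement.  As a side effect, Invoice can be unit tested with any policy you hand it, which is the decoupling the testing book keeps pointing at.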

Good luck!

# Saturday, July 25, 2009

Automatic Properties (And Why You Should Avoid Them)

Saturday, July 25, 2009 3:36:44 AM UTC

Ah yes, automatic properties.  Isn't it great that you don't have to do all of that extra typing now?  (Well, you wouldn't be doing it anyways with ReSharper, but that's beside the point.) For some reason, they've never sat well with me; they just seem like a pretty useless feature and, not only that, I think they severely impact readability.

Quick, are these members in a class, an abstract class, or an interface?

int Id { get; set; }

string FirstName { get; set; }

string LastName { get; set; }

Can't tell!  Perhaps you code at a leisurely pace, but when I'm in the zone, I'm flying around my desktop, flipping through tabs like crazy, ALT-TABbing between windows, and typing like a madman.  It's happened to me more than once where I've been working in a file, trying to add some logic to a getter and getting weird errors only to realize that I was working in the interface or abstract class instead of the concrete class.  Of course I don't normally write many non-public properties, but it's easy to make the mistake of missing the access modifier if you're working furiously and tabbing back and forth, especially if the file is long (so that you can't see the class/interface declaration at the top of your file).

Look again:

public interface IEmployee
{
	int Id { get; set; }

	string FirstName { get; set; }

	string LastName { get; set; }
}

public class Employee
{
	int Id { get; set; }

	string FirstName { get; set; }

	string LastName { get; set; }
}

public abstract class AbstractEmployee
{
	int Id { get; set; }

	string FirstName { get; set; }

	string LastName { get; set; }
}

It's even more confusing when you're working within an abstract class and there's implementation mixed in.  Not only that, it looks like a complete mess as soon as you have to add custom logic to the getter or setter of a property (and add a backing field); it just looks so...untidy (but that's just me; I like to keep things looking consistent).  I'm also going to stretch a bit and postulate that it may also encourage breaking sound design in scenarios where junior devs don't know any better since they won't think to use a private field when the situation calls for one (just out of laziness).
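
To make the comparison concrete, here's a rough sketch of the same property written both ways.  The class names (`AutoEmployee`, `ExplicitEmployee`) and the trimming/null-check logic are made up for illustration; the point is only that the explicit version already has an obvious place to hang logic, while the automatic one has to be rewritten first.

```csharp
using System;

// Automatic property: the compiler supplies a hidden backing field.
public class AutoEmployee
{
    public string FirstName { get; set; }
}

// Explicit backing field: custom logic slots into the existing shape.
public class ExplicitEmployee
{
    private string _firstName;

    public string FirstName
    {
        get { return _firstName; }
        set
        {
            // Hypothetical validation, purely for illustration.
            if (value == null)
            {
                throw new ArgumentNullException("value");
            }

            _firstName = value.Trim();
        }
    }
}
```

Setting `FirstName` to `"  Ada "` on the explicit version stores `"Ada"`; the automatic version stores the string untouched.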

I get that it's a bit more work (yeah, maybe my opinion would be different if I had to type them out all the time, too - but I don't :P), but seriously, if you're worried about productivity, then I really have to ask why you haven't installed ReSharper yet (I've been using it since 2.0 and can't imagine developing without it).  It's easy to mistake one for the other if you're just flipping through tabs really fast.  I've sworn off using them and I've been sticking to my guns on this.

There are three general arguments that I hear, from time to time, from the opposition:

  1. Too many keystrokes, man!  With R#, you simply define all of your private fields and then ALT+INS and in less than 5 or 6 keystrokes, you've generated all of your properties.  I would say even fewer keystrokes than using automatic properties since it's way easier to just write the private field and generate it using R#.  If you're worried about productivity and keystrokes and you're not using R#, then what the heck are you waiting for? 
  2. Too much clutter, takes up too much space! If that's the case, just surround it in a region and don't look at it.  I mean, if you really think about it, using KNF instead of Allman style bracing throughout your codebase would probably reduce whitespace clutter and raw LOC and yet...
  3. They make the assembly heavier!  Blah!  Not true!  Automatic properties are a compiler trick.  They're still there, just uglier and less readable (in the event that you have to extract code from an assembly (and I have - accidentally deleted some source, but still had the assembly in GAC!)).  In this case, the compiler generates the following fields:

    <FirstName>k__BackingField
    <Id>k__BackingField
    <LastName>k__BackingField

Depending on the project, there may also be unforeseen design scenarios where you may want to get/set a private field by reflection to bypass logic implemented in a property (I dunno, maybe in a serialization scenario?).
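
For what it's worth, here's a minimal sketch of what reaching for one of those generated fields looks like.  `Employee` and `BackingFieldDemo` are hypothetical names, and the `<FirstName>k__BackingField` name is a compiler implementation detail rather than a documented contract, so treat this as an illustration of the ugliness and not something to rely on.

```csharp
using System;
using System.Reflection;

public class Employee
{
    public string FirstName { get; set; }
}

public static class BackingFieldDemo
{
    public static string ReadBackingField(Employee employee)
    {
        // The generated field name is an implementation detail of the
        // C# compiler; it could differ with another compiler or version.
        FieldInfo field = typeof(Employee).GetField(
            "<FirstName>k__BackingField",
            BindingFlags.NonPublic | BindingFlags.Instance);

        // Return null rather than throw if the field isn't found.
        return field == null ? null : (string)field.GetValue(employee);
    }
}
```

With an explicit `_firstName` field, the same lookup would just be `GetField("_firstName", ...)`, which is a name you chose and can see in the source.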

So my take?  Just don't use them, dude!


Update: To clarify, McConnell has a whole section of Code Complete which discusses "code shape" and how it affects readability (see chapter 31). I think this is in the same vein. You gain NOTHING by using automatic properties, but you sacrifice readability and clarity. I don't think that the argument of "saving a couple lines" is a valid one since you can just as easily collapse those into regions and save many more lines or even switch bracing styles.

As McConnell writes:

"Making the code look pretty is worth something, but it's worth less than showing the code's structure. If one technique shows the structure better and another looks better, use the one that shows the structure better."

"The smaller part of the job of programming is writing a program so that the computer can read it; the larger part is writing it so that other humans can read it."

My belief is that using backing fields shows the structure of the class file better than using automatic properties (which was my point in the blog post). Automatic properties are a convenience for the author, but it sacrifices structural cues to the purpose and usage of a given code file, IMO.  There is no benefit except to save a few keystrokes for the author, but in that case, even more keystrokes can be saved using R# and explicit backing fields.

# Thursday, July 23, 2009

Why ASP.NET (webforms) Sucks.

Thursday, July 23, 2009 12:57:02 AM UTC

Somehow, I got into a heated discussion at work today regarding the suckitude of the ASP.NET web forms development model.  As a preface, I wrote Java for four years in college, ASP in VBScript and JScript all throughout college, ASP after college, and then started working with ASP.NET when it was released.

Undoubtedly, .NET and C# were a giant leap forward compared to ASP and JScript (my preferred scripting language for ASP).  But even from the start, I could never love ASP.NET.  Simple things that I used to do quite easily in ASP were now seemingly much more difficult and much more convoluted.  The HTML never quite came out the way I wanted it (remember the days when .NET wouldn't output cross browser compatible Javascript and markup?  And how that markup would always fail validation?  Yeah, I remember those days.) and it was a chore to get many development tasks done (had to overwrite this or implement that -- just to get a freaking icon in a table cell).  By the first year of use, I was already thinking that it was a terrible abomination of a framework; I never wanted to see a DataGrid ever again.

Ever since then, I've avoided writing applications in the traditional webforms model altogether.  First, I stumbled upon AJAX.NET, then I heard about this neat thing called "Atlas" (ASP.NET AJAX), then I picked up prototype, and finally, I have seen the light with jQuery (this is after writing my own Javascript for everything for the first several years of development).  I've never looked back; I've never once missed working with a web DataGrid.  I've never once missed working with most of the controls in ASP.NET web forms.  In fact, the only four non-shitty controls worth using are the ScriptManager, Placeholder, LiteralControl, and Repeater.  That's it; that's where I draw the line.  UI controls?  They all suck.  Every.  Single.  One.  Why? 1) Because they mangle HTML output (IDs anyone?), which requires ridiculous workarounds to do anything cool on the client side (i.e. writing the ClientID to the page?). 2) I hate viewstate...it's just extra...stuff (and yes, I'm intimately familiar with that "stuff" because I had to hack it to pieces at Factiva).

Don't get me wrong, I love .NET and C# to death; they're awesome (so awesome).  But ASP.NET webforms? I could smell it coming from a mile away - like the stink of a dead skunk (or 20) in the middle of the road.  It's clever, I'll give you that, but it's been over-engineered to hell and back to account for the shittiness inherent in how it wants you to write web applications.

Finally, after what? 8, 9 years?  Microsoft has got it right with ASP.NET MVC.

What follows is the summary of my arguments as to why ASP.NET webforms suck (and it's only the tip of the iceberg!).


Undoubtedly, JSF and ASP.NET webforms suck giant elephant balls.  I mean giant balls of epic proportions of suck.  If this were the Richter scale, it would be a magnitude 10 level of suck (by the way, wiki describes that as "Epic.  Never recorded").

A simple google search for "JSF sucks" yields plenty of results.

Freddy D. has a nice summary list:

  • JSF + JSP = templates that are filled with way too much noise. f:verbatim? ack! Facelets helps here but it is still too hard to read.
  • The generated HTML is all kinds of black magic.
  • You can't have a simple link to a JSF bean with some parameters and invoking an action method, from outside the JSF context, without having to jump through all kinds of hoops.
  • Parameter binding and retrieval from a different page involves tricks and digging through the FacesContext.. yuk.
  • faces-config.xml. struts-config.xml all over again. Just too noisy and useless. What is the added value, really?
  • immediate="true"? are you kidding me?
  • the managed beans are supposed to be reusable POJOs. So why do you need to use SelectItem for select boxes? More useless code. This ties you to JSF for no good reason.
  • the data table is inept. Display Tag anyone?
  • writing your own component is way too much work.
  • want some AJAX? you'll need to add yet another framework to handle it, and if it doesn't work, you're SOL. if you want to write your own AJAX-enabled component.. read the item above and add a few more "way way WAY" in front of "too much work".
  • Even if you find ways to solve your problems, just knowing that it would be SO much easier with another framework, just adds to the frustration. Case in point, we switched to Stripes and our code is up to 30-50% more concise, clearer, easier to understand.. plus, as a small bonus, it actually works!
  • The principle of least surprise definitely does not apply when using JSF.

Well I'll be damned if those weren't some of the same reasons that ASP.NET webforms suck.

Since I'm not as familiar with JSF, I'll discuss this from the ASP.NET perspective.  First, why is ASP.NET the way it is?  Why did they design it like this?  Undoubtedly, one of the core reasons was because they thought VB6 developers were too damn stupid to know better.  To make these plebeian, lowest-of-the-low, bottom rung programmers "productive", they put together this abomination of an event model on top of a perfectly nice and awesome .NET and C#.  ASP.NET webforms are shitty beyond compare because of this.

To understand why this sucks, consider how you program for a client-server model when you are building a Windows forms application.  Your winform is responsible, clearly, for rendering data only.  It makes a service call that doesn't understand a damn thing about the winforms application, right?  The remote endpoint, the server side, cares about data and data only.  Once the service call completes, your client handles rendering of that data and putting it into grids and what not.  Your services called by the winforms client don't have silly "OnClick" handlers in the service implementation do they?  Can you imagine what that would be like if you had to write your WCF services or any sort of web service like that?  You'd have to do all this shitty state maintenance and pass around useless data (like what button was clicked); you would have to have UI logic in your service.  It sounds like it would suck to do that with your winforms to a WCF service.  This begs the question:  so why do you have them in your .aspx.cs?  .aspx.cs is an incredibly stupid model and absolutely forces you to mix your controller logic with your view logic, making it difficult to reuse, difficult to maintain, and difficult to understand.

Digest that for a moment.

(Granted, there are some differences in developing a winforms thin client and a web app and obviously some scenarios where it's advantageous to return generated/static markup.)
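
Here's a rough sketch of the separation I'm describing, in plain C# with no framework involved.  `EmployeeService` and `EmployeeListView` are hypothetical names and the data is hard-coded; the only point is that the service hands back data and knows nothing about clicks or controls, while the view code does nothing but turn that data into markup.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical data-only "service": no OnClick handlers, no UI state.
public class EmployeeService
{
    public IList<string> GetEmployeeNames()
    {
        // Hard-coded sample data standing in for a real data source.
        return new List<string> { "Ada", "Grace" };
    }
}

// Hypothetical view code: it only turns data into markup.
public static class EmployeeListView
{
    public static string Render(IList<string> names)
    {
        StringBuilder html = new StringBuilder("<ul>");

        foreach (string name in names)
        {
            // Encode the value so data can't inject markup.
            html.Append("<li>").Append(WebUtility.HtmlEncode(name)).Append("</li>");
        }

        return html.Append("</ul>").ToString();
    }
}
```

Either half can be swapped or tested without touching the other, which is exactly what a code-behind file makes hard.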

ASP.NET MVC is very, very far removed from JSP (I did a bit of JSP) since JSP was never an MVC design (and neither is ASP.NET webforms).  Out of the box JSP and ASP.NET webforms are both an amalgam of the Page Controller and Template patterns with the shittiness of the faux event model layered on top.  ASP.NET MVC is a true MVC model (or at least it's pretty damn close) that uses a Front Controller pattern and truly separated concerns between the view, the model, and the controller.  Most of the shittiness and pain with ASP.NET stems from the fact that the page controller pattern violates all sorts of separations of concern as it encourages mashing up code that should be isolated in a view with code that should be isolated in a controller.  The end result is a giant heaping pile of poo.  The control and event model only make the situation worse by requiring this ugly, terrible thing we've all come to hate known as viewstate.

I'm convinced that these frameworks were created for numbskull rent-a-coders because they're too lazy/stupid/incapable of grasping basic HTML, Javascript, CSS, and simple DOM.  Look, at the end of the day, if you're going to be a web developer and you're not going to put in the time to master HTML, CSS, and Javascript to some degree, you're not doing it right.  That would be like a baker that didn't want to learn the basic composition of breads and dough and the chemical reactions that make tasty baked goods out of simple ingredients like flour, yeast, eggs, water, salt, and sugar.  Web developers who are attached to ASP.NET web controls and fear Javascript (and I know some) are like "bakers" who open a "bakery" and buy their products from Costco and repackage them.  I guess it's efficient and productive, but at the end of the day, such a "baker" is constrained to a very small universe of possibilities...how boring.

I've come to believe that ASP.NET webforms continue to perpetuate generations of developers who have no idea how all this "magic" works and when presented with a design scenario that calls for a truly innovative UI, they stumble because they can't figure out how to do it without a control to hold their hand...boo-hoo (I'm amazed when people are amazed by some stupid ASP.NET web control without realizing how easy it is to accomplish the same thing with some jQuery goodness (or conversely, how hard it was before we had nice things like prototype and jQuery)).  The companies that don't screw around with these stupid frameworks probably have some of the most brilliant developers working for them:  Yahoo!, Google, Facebook, Amazon, etc. -- these guys long ago learned the lesson that ASP.NET webforms/JSF is for simpletons (or masochists).

Okay, I concede that it's great for initial productivity.  Certainly, most developers can probably throw up a page with a grid with some databound recordset much faster than I could write a page with the same set of features.  But I wager that my implementation would be cleaner, easier to read, easier to understand, easier to maintain, easier to reuse, and easier to work with once the need arises to address a usage scenario not supported by the grid out of the box.  If first run development speed is your only concern, then I lose every time because I'm concerned about doing it right, making it maintainable, and making it easy to understand the programming model.  If you care about those things, then it's worth it to do it right.

ASP.NET MVC will be the litmus test that will differentiate the competent developers from the rent-a-coders and, hopefully, will come to replace webforms as the primary model of web application development on the .NET framework.  The "stateful web" was an experimental failure of epic proportions to satisfy the simple minds that couldn't wrap their heads around the fact that the web is inherently stateless.  Let's move on from that dark, dank, dreary past and bask in the light of ASP.NET MVC!

# Wednesday, July 22, 2009

Mercurial vs. SVN

Wednesday, July 22, 2009 2:39:27 AM UTC

As I've been starting on a new project recently, I've been delving into Mercurial (hg) as an alternative to Subversion (svn).

I've been using svn for about 3 years now, and while - for the most part - it has been way better than Visual Source Safe (whatever last version I used...it sucked), it clearly has some pain points that make daily workflows difficult.  The most important of these are the speed of commits, the fact that a commit is global, reverting to a version, and branching/merging (painful...so painful).

hg addresses each of these pain points:

  • Speed of commits and local commits: Mercurial has two separate concepts of commit and push.  A commit in hg creates a local commit as opposed to a global commit in svn.  This is useful because you can work->work->commit->work->commit->work->revert->commit without affecting anyone else's work stream.  In other words, you don't have to wait to commit until you have flawless, working, compiling code.  You can commit as much as you want and keep your work intact; this is a big win as it encourages experimentation.  When you're ready to share your code, a push operation (the equivalent of a svn commit) pushes it to a shared location for others to pull.  Of course, to some degree, you can accomplish this with svn as well using a branch, but...
  • Branching/merging: OMG, so much easier and more intuitive than svn.  I can't believe it.  In comparison, svn is a giant charlie foxtrot.  Branching and merging in svn pre-1.5 was an exercise in futility.  It was extremely difficult to remember the right procedure (always requiring a lookup to the docs) and very much error prone.  So difficult, in fact, that it discouraged branching for fear of wasting a good half day trying to merge it back in later.  Maybe it's better with 1.5?
  • Reverting to a version: ever try it in svn?  'Nuff said.  It's counterintuitive and always confuses the heck out of junior devs or devs new to svn.

I've only been using it for a few days now (primarily TortoiseHg), so perhaps hg has just as many warts as svn, but I'm going to stick with it and find out.

Some good resources on hg/git vs. svn (or DVCS vs CVCS):

I'll post new entries as I learn more about hg in daily use :)

P.S. WebFaction is a pretty awesome webhost.  For the low, low price of some $10 a month, I can host my own hg repository, svn repository, Trac, and (note: not or) my web app.  Hot damn, that's awesome!  Although svn was significantly easier to set up with WebFaction than hg, I've read that they are close to officially supporting hg and it should be just as easy in the future.

# Tuesday, July 14, 2009

TARP Paying Off...At Least For Now.

Tuesday, July 14, 2009 6:40:54 PM UTC

Whoa, caught whiff of this just now:

On June 9, the Treasury Department announced that 10 of the largest financial institutions that participated in the Capital Purchase Program (through TARP) have been approved to repay $68 billion. Yes, they had to be approved to repay the money. The companies had to prove they no longer needed the money, because the government doesn't want them begging for more down the road.

To date, those 10 companies have paid dividends on their preferred stock to the Treasury totaling about $1.8 billion, the Treasury announced. Overall, dividend payments from all of the 600 bank participants has come to about $4.5 billion so far. That's commensurate with the 5 percent (annualized) dividend return that was part of the terms of the program.

Bank analyst Bert Ely said while the government may end up losing money on investments in some financial firms, it's likely the entirety of the bank portion of the TARP will ultimately turn a profit.

The 5 percent paid in dividends on preferred stock purchased by the Treasury will certainly outpace the interest rate on money borrowed to finance the program, he said. And the warrants could also prove profitable.

"People think the government gave banks money," Ely said. "They made investments in banks."

So we could still end up losing money, but at least for now, it seems like it was a wise move.

# Thursday, July 09, 2009

XMPP SASL Challenge-Response Using DIGEST-MD5 In C#

Thursday, July 09, 2009 3:46:19 PM UTC

I've been struggling mightily with implementing the SASL challenge-response portion of an XMPP client I've been working on. By far, this has been the hardest part to implement as it's been difficult to validate whether I've implemented the algorithm correctly as there doesn't seem to be any (easy to find) open source implementations of SASL with the DIGEST-MD5 mechanism (let alone in C#).

The trickiest part of the whole process is building the response token which gets sent back as a part of the message to the server.

RFC2831 documents the SASL DIGEST-MD5 authentication mechanism as such:

   Let { a, b, ... } be the concatenation of the octet strings a, b, ...

   Let H(s) be the 16 octet MD5 hash [RFC 1321] of the octet string s.

   Let KD(k, s) be H({k, ":", s}), i.e., the 16 octet hash of the string
   k, a colon and the string s.

   Let HEX(n) be the representation of the 16 octet MD5 hash n as a
   string of 32 hex digits (with alphabetic characters always in lower
   case, since MD5 is case sensitive).

      response-value  =
         HEX( KD ( HEX(H(A1)),
                 { nonce-value, ":" nc-value, ":",
                   cnonce-value, ":", qop-value, ":", HEX(H(A2)) }))

   If authzid is specified, then A1 is


      A1 = { H( { username-value, ":", realm-value, ":", passwd } ),
           ":", nonce-value, ":", cnonce-value, ":", authzid-value }

   If authzid is not specified, then A1 is


      A1 = { H( { username-value, ":", realm-value, ":", passwd } ),
           ":", nonce-value, ":", cnonce-value }

   where

         passwd   = *OCTET

   If the "qop" directive's value is "auth", then A2 is:

      A2       = { "AUTHENTICATE:", digest-uri-value }

   If the "qop" value is "auth-int" or "auth-conf" then A2 is:

      A2       = { "AUTHENTICATE:", digest-uri-value,
               ":00000000000000000000000000000000" }

Seems simple enough, right?  Not!  It took a bit of time to parse through it mentally and come up with an implementation, but I was still failing (miserably).

The breakthrough came when I stumbled upon a posting by Robbie Hanson:

Here's the trick - normally when you hash stuff you get a result in hex values. But we don't want this result as a string of hex values! We need to keep the result as raw data! If you were to do a hex dump of this data you'd find it to be "3a4f5725a748ca945e506e30acd906f0". But remember, we need to operate on its raw data, so don't convert it to a string.

The most important part of his posting is the last line (and that it included the intermediate hexadecimal string results. Win! Now I finally had some sample data to compare against to figure out where I was going wrong).  At one critical juncture in my implementation of the algorithm, I was converting the MD5 hash value to a hexadecimal string -- thank goodness for Robbie's clarification of that point!
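
To spell out where the raw bytes matter, here's a condensed sketch of the RFC formulas for the `qop=auth`, no-authzid case.  This is not the `SaslChallengeResponse` class from this post; `DigestMd5Sketch` and its method names are made up for illustration, and UTF-8 is assumed for every string.  The thing to notice is that `BuildA1` concatenates the raw 16 hash bytes with the `:nonce:cnonce` tail, while everything in the final response uses the hex strings.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration only.
public static class DigestMd5Sketch
{
    // H(s): the raw 16 octet MD5 hash.
    public static byte[] H(byte[] data)
    {
        using (MD5 md5 = MD5.Create())
        {
            return md5.ComputeHash(data);
        }
    }

    // HEX(n): 32 lowercase hex digits.
    public static string Hex(byte[] hash)
    {
        StringBuilder builder = new StringBuilder(hash.Length * 2);
        foreach (byte b in hash)
        {
            builder.Append(b.ToString("x2"));
        }
        return builder.ToString();
    }

    // A1 = { H(user:realm:passwd), ":", nonce, ":", cnonce }
    // The hash stays as raw bytes here; hex-encoding it is the classic mistake.
    public static byte[] BuildA1(string username, string realm,
        string password, string nonce, string cnonce)
    {
        byte[] userToken = H(Encoding.UTF8.GetBytes(
            username + ":" + realm + ":" + password));
        byte[] tail = Encoding.UTF8.GetBytes(":" + nonce + ":" + cnonce);

        byte[] a1 = new byte[userToken.Length + tail.Length];
        Buffer.BlockCopy(userToken, 0, a1, 0, userToken.Length);
        Buffer.BlockCopy(tail, 0, a1, userToken.Length, tail.Length);
        return a1;
    }

    // response-value = HEX(KD(HEX(H(A1)), nonce:nc:cnonce:qop:HEX(H(A2))))
    public static string BuildResponse(byte[] a1, string nonce, string nc,
        string cnonce, string qop, string digestUri)
    {
        string a1Hex = Hex(H(a1));
        string a2Hex = Hex(H(Encoding.UTF8.GetBytes("AUTHENTICATE:" + digestUri)));

        string kd = a1Hex + ":" + nonce + ":" + nc + ":" + cnonce
            + ":" + qop + ":" + a2Hex;

        return Hex(H(Encoding.UTF8.GetBytes(kd)));
    }
}
```

For the test data above, `BuildA1` yields 63 bytes: the 16 raw hash bytes plus the 47 characters of `:392616736:05E0A6E7-0B7B-4430-9549-0FE1C244ABAB`.  If you hex-encode the hash first you get 79 bytes instead, and every hash downstream comes out wrong.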

Armed with this test data, I was finally able to get it all working.  Here is the test code:

using MbUnit.Framework;
using Xmpp.Client;

namespace Xmpp.Tests
{
    [TestFixture]
    public class SaslChallengeResponseTests
    {
        [Test]
        public void TestCreateResponse()
        {
            // See example here: http://deusty.blogspot.com/2007/09/example-please.html

            // h1=3a4f5725a748ca945e506e30acd906f0
            // a1Hash=b9709c3cdb60c5fab0a33ebebdd267c4
            // a2Hash=2b09ce6dd013d861f2cb21cc8797a64d
            // respHash=37991b870e0f6cc757ec74c47877472b

            SaslChallenge challenge = new SaslChallenge(
                "md5-sess", "utf-8", "392616736", "auth", "osXstream.local");

            SaslChallengeResponse response = new SaslChallengeResponse(
                challenge, "test", "secret", "05E0A6E7-0B7B-4430-9549-0FE1C244ABAB");

            Assert.AreEqual("3a4f5725a748ca945e506e30acd906f0", 
                response.UserTokenMd5HashHex);
            Assert.AreEqual("b9709c3cdb60c5fab0a33ebebdd267c4", 
                response.A1Md5HashHex);
            Assert.AreEqual("2b09ce6dd013d861f2cb21cc8797a64d", 
                response.A2Md5HashHex);
            Assert.AreEqual("37991b870e0f6cc757ec74c47877472b", 
                response.ResponseTokenMd5HashHex);
        }
    }
}

I modeled the SASL challenge like so:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Reflection;
using System.Text;

namespace Xmpp.Client
{
    /// <summary>
    /// Represents a SASL challenge in object code.
    /// </summary>
    public class SaslChallenge
    {
        private static readonly Dictionary<string, FieldInfo> _fields;

        private readonly string _rawDecodedText;
        private string _algorithm;
        private string _charset;
        private string _nonce;
        private string _qop;
        private string _realm;

        /// <summary>
        /// Initializes the <see cref="SaslChallenge"/> class.
        /// </summary>
        /// <remarks>
        /// Caches the fields which are set using reflection on <see cref="Parse"/>.
        /// </remarks>
        static SaslChallenge()
        {
            // Initialize the hash of fields.
            _fields = new Dictionary<string, FieldInfo>();

            FieldInfo[] fields = typeof (SaslChallenge).GetFields(
                BindingFlags.NonPublic | BindingFlags.Instance);

            foreach (FieldInfo field in fields)
            {
                // Trim the _ from the start of the field names.
                string name = field.Name.TrimStart('_');

                _fields[name] = field;
            }
        }

        /// <summary>
        /// Creates a specific SASL challenge message.
        /// </summary>
        public SaslChallenge(string algorithm, string charset, 
            string nonce, string qop, string realm)
        {
            _algorithm = algorithm;
            _charset = charset;
            _nonce = nonce;
            _qop = qop;
            _realm = realm;

            Debug.WriteLine("algorithm=" + _algorithm);
            Debug.WriteLine("charset=" + _charset);
            Debug.WriteLine("nonce=" + _nonce);
            Debug.WriteLine("qop=" + _qop);
            Debug.WriteLine("realm=" + _realm);
        }

        /// <summary>
        /// Initializes a new instance of the <see cref="SaslChallenge"/> 
        /// class based on the raw decoded text.
        /// </summary>
        /// <remarks>
        /// Use the <see cref="Parse"/> method to create an instance from 
        /// a raw encoded message.
        /// </remarks>
        /// <param name="rawDecodedText">The raw decoded text.</param>
        private SaslChallenge(string rawDecodedText)
        {
            _rawDecodedText = rawDecodedText;

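            string[] parts = rawDecodedText.Split(',');

            foreach (string part in parts)
            {
                // Split on the first '=' only; a value such as a base64
                // nonce may itself contain '=' characters.
                string[] components = part.Split(new char[] { '=' }, 2);

                string property = components[0].Trim();

                FieldInfo field;

                // Skip malformed parts and directives that have no
                // matching field instead of throwing.
                if (components.Length == 2
                    && _fields.TryGetValue(property, out field))
                {
                    field.SetValue(this, components[1].Trim('"'));
                }
            }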

            Debug.WriteLine("algorithm=" + _algorithm);
            Debug.WriteLine("charset=" + _charset);
            Debug.WriteLine("nonce=" + _nonce);
            Debug.WriteLine("qop=" + _qop);
            Debug.WriteLine("realm=" + _realm);
        }

        public string Realm
        {
            get { return _realm; }
        }

        public string Nonce
        {
            get { return _nonce; }
        }

        public string Qop
        {
            get { return _qop; }
        }

        public string Charset
        {
            get { return _charset; }
        }

        public string Algorithm
        {
            get { return _algorithm; }
        }

        public string RawDecodedText
        {
            get { return _rawDecodedText; }
        }

        /// <summary>
        /// Parses the specified challenge message.
        /// </summary>
        /// <param name="encodedChallenge">The base64 encoded challenge.</param>
        /// <returns>An instance of <c>SaslChallenge</c> based on 
        /// the message.</returns>
        public static SaslChallenge Parse(string encodedChallenge)
        {
            byte[] bytes = Convert.FromBase64String(encodedChallenge);

            string rawDecodedText = Encoding.ASCII.GetString(bytes);

            return new SaslChallenge(rawDecodedText);
        }
    }
}
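
One caveat with splitting the decoded challenge on every comma: a quoted value that itself contains a comma (RFC 2831 allows something like `qop="auth,auth-int"`) gets cut short.  Here's a rough sketch of a quote-aware directive splitter.  `ChallengeDirectiveSketch` is a hypothetical helper, not part of the class above, and it deliberately ignores backslash escapes inside quoted strings.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only.
public static class ChallengeDirectiveSketch
{
    public static Dictionary<string, string> ParseDirectives(string decoded)
    {
        Dictionary<string, string> result = new Dictionary<string, string>();
        StringBuilder current = new StringBuilder();
        bool inQuotes = false;

        // A trailing comma flushes the last directive through the same path.
        foreach (char c in decoded + ",")
        {
            if (c == '"')
            {
                // Track quoted sections and drop the quotes themselves.
                inQuotes = !inQuotes;
                continue;
            }

            if (c == ',' && !inQuotes)
            {
                string part = current.ToString();
                int index = part.IndexOf('=');

                // Split on the first '=' only; ignore parts without a name.
                if (index > 0)
                {
                    string name = part.Substring(0, index).Trim();
                    result[name] = part.Substring(index + 1);
                }

                current.Length = 0;
                continue;
            }

            current.Append(c);
        }

        return result;
    }
}
```

Feeding it `realm="osXstream.local",nonce="392616736",qop="auth,auth-int",charset=utf-8,algorithm=md5-sess` gives five entries, with `qop` kept whole as `auth,auth-int`.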

And finally, here is the challenge response class which contains the meat of the response building logic:

using System;
using System.Diagnostics;
using System.Security.Cryptography;
using System.Text;

namespace Xmpp.Client
{

    /// <summary>
    /// Partial implementation of the SASL authentication protocol 
    /// using the DIGEST-MD5 mechanism.
    /// </summary>
    /// <remarks>
    /// See <see href="http://www.ietf.org/rfc/rfc4422.txt"/> and 
    /// <see href="http://www.ietf.org/rfc/rfc2831.txt"/> for details.
    /// </remarks>
    public class SaslChallengeResponse
    {
        #region fields

        private static readonly Encoding _encoding;
        private static readonly MD5 _md5;
        private readonly SaslChallenge _challenge;
        private readonly string _cnonce;
        private readonly string _decodedContent;
        private readonly string _digestUri;
        private readonly string _encodedContent;
        private readonly string _password;
        private readonly string _realm;
        private readonly string _username;

        private string _a1Md5HashHex;
        private string _a2Md5HashHex;
        private string _responseTokenMd5HashHex;
        private string _userTokenMd5HashHex;

        #endregion

        #region properties

        /// <summary>
        /// Gets the final, base64 encoded content of the challenge response.
        /// </summary>
        /// <value>A base64 encoded string of the response content.</value>
        public string EncodedContent
        {
            get { return _encodedContent; }
        }

        /// <summary>
        /// Gets the unencoded content of the challenge response.
        /// </summary>
        /// <value>The response content in plain text.</value>
        public string DecodedContent
        {
            get { return _decodedContent; }
        }

        /// <summary>
        /// Gets the hexadecimal string representation of the user token MD5 
        /// hash value.
        /// </summary>
        /// <value>The hexadecimal representation of the user token MD5 hash 
        /// value.</value>
        public string UserTokenMd5HashHex
        {
            get { return _userTokenMd5HashHex; }
        }

        /// <summary>
        /// Gets the hexadecimal string representation of the response token 
        /// MD5 hash value.
        /// </summary>
        /// <value>The hexadecimal string representation of the response token 
        /// MD5 hash value.</value>
        public string ResponseTokenMd5HashHex
        {
            get { return _responseTokenMd5HashHex; }
        }

        /// <summary>
        /// Gets the hexadecimal string representation of the A1 MD5 hash 
        /// value (see RFC4422 and RFC2831)
        /// </summary>
        /// <value>The hexadecimal string representation of the A1 MD5 hash 
        /// value (see RFC4422 and RFC2831)</value>
        public string A1Md5HashHex
        {
            get { return _a1Md5HashHex; }
        }

        /// <summary>
        /// Gets the hexadecimal string representation of the A2 MD5 hash 
        /// value (see RFC4422 and RFC2831)
        /// </summary>
        /// <value>The hexadecimal string representation of the A2 MD5 hash 
        /// value (see RFC4422 and RFC2831)</value>
        public string A2Md5HashHex
        {
            get { return _a2Md5HashHex; }
        }

        #endregion

        /// <summary>
        /// Initializes the <see cref="SaslChallengeResponse"/> class.
        /// </summary>
        static SaslChallengeResponse()
        {
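            // Shared hash instance. HashAlgorithm instance members are not
            // guaranteed to be thread-safe, so avoid building responses
            // concurrently from multiple threads.
            _md5 = MD5.Create();
            _encoding = Encoding.UTF8;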
        }

        /// <summary>
        /// Initializes a new instance of the <see cref="SaslChallengeResponse"/> 
        /// class.
        /// </summary>
        /// <param name="challenge">The challenge.</param>
        /// <param name="username">The username.</param>
        /// <param name="password">The password.</param>
        public SaslChallengeResponse(SaslChallenge challenge, 
            string username, string password)
            : this(challenge, username, password, null, null, null)
        {
        }

        /// <summary>
        /// Initializes a new instance of the <see cref="SaslChallengeResponse"/> 
        /// class.
        /// </summary>
        /// <param name="challenge">The challenge.</param>
        /// <param name="username">The username.</param>
        /// <param name="password">The password.</param>
        /// <param name="cnonce">A specific cnonce to use.</param>
        public SaslChallengeResponse(SaslChallenge challenge, string username, 
            string password, string cnonce)
            : this(challenge, username, password, null, null, cnonce)
        {
        }

        /// <summary>
        /// Initializes a new instance of the <see cref="SaslChallengeResponse"/> 
        /// class.
        /// </summary>
        /// <param name="challenge">The challenge.</param>
        /// <param name="username">The username.</param>
        /// <param name="password">The password.</param>
        /// <param name="realm">A specific realm, different from the one in the 
        /// challenge.</param>
        /// <param name="digestUri">The digest URI.</param>
        /// <param name="cnonce">A specific client nonce to use.</param>
        public SaslChallengeResponse(SaslChallenge challenge, string username, 
            string password, string realm, string digestUri, string cnonce)
        {
            _challenge = challenge;
            _username = username;
            _password = password;

            if (string.IsNullOrEmpty(_challenge.Realm))
            {
                _realm = realm;
            }
            else
            {
                _realm = challenge.Realm;
            }

            if (string.IsNullOrEmpty(_realm))
            {
                throw new ArgumentException("No realm was specified.");
            }

            if (string.IsNullOrEmpty(cnonce))
            {
                // The "N" format yields 32 lowercase hex digits with no 
                // braces or hyphens.
                _cnonce = Guid.NewGuid().ToString("N");
            }
            else
            {
                _cnonce = cnonce;
            }

            if (string.IsNullOrEmpty(digestUri))
            {
                // Use the resolved realm; the challenge may not carry one.
                _digestUri = string.Format("xmpp/{0}", _realm);
            }
            else
            {
                _digestUri = digestUri;
            }

            // Main work here:
            _decodedContent = GetDecodedContent();

            byte[] bytes = _encoding.GetBytes(_decodedContent);

            _encodedContent = Convert.ToBase64String(bytes);
        }

        /// <summary>
        /// Gets the body of the response in a decoded format.
        /// </summary>
        /// <returns>The raw response string.</returns>
        private string GetDecodedContent()
        {
            // Gets the response token according to the algorithm in RFC4422 
            // and RFC2831
            _responseTokenMd5HashHex = GetResponse();

            StringBuilder buffer = new StringBuilder();

            buffer.AppendFormat("username=\"{0}\",", _username);
            // Send the same realm that was hashed into A1.
            buffer.AppendFormat("realm=\"{0}\",", _realm);
            buffer.AppendFormat("nonce=\"{0}\",", _challenge.Nonce);
            buffer.AppendFormat("cnonce=\"{0}\",", _cnonce);
            buffer.Append("nc=00000001,qop=auth,");
            buffer.AppendFormat("digest-uri=\"{0}\",", _digestUri);
            buffer.AppendFormat("response={0},", _responseTokenMd5HashHex);
            buffer.Append("charset=utf-8");

            return buffer.ToString();
        }

        /// <summary>
        /// HEX( KD ( HEX(H(A1)), { nonce-value, ":" nc-value, ":", cnonce-value, 
        /// ":", qop-value, ":", HEX(H(A2)) }))
        /// </summary>
        private string GetResponse()
        {
            byte[] a1 = GetA1();
            string a2 = GetA2();

            Debug.WriteLine("a1=" + ConvertToBase16String(a1));
            Debug.WriteLine("a2=" + a2);

            byte[] a2Bytes = _encoding.GetBytes(a2);

            byte[] a1Hash = _md5.ComputeHash(a1);
            byte[] a2Hash = _md5.ComputeHash(a2Bytes);

            _a1Md5HashHex = ConvertToBase16String(a1Hash);
            _a2Md5HashHex = ConvertToBase16String(a2Hash);

            // Let KD(k, s) be H({k, ":", s})
            string kdString = string.Format("{0}:{1}:{2}:{3}:{4}:{5}",
                _a1Md5HashHex, _challenge.Nonce, "00000001",
                _cnonce, "auth", _a2Md5HashHex);

            Debug.WriteLine("kd=" + kdString);

            byte[] kdBytes = _encoding.GetBytes(kdString);

            byte[] kd = _md5.ComputeHash(kdBytes);
            string kdBase16 = ConvertToBase16String(kd);

            return kdBase16;
        }

        /// <summary>
        /// A1 = { H( { username-value, ":", realm-value, ":", passwd } ), 
        /// ":", nonce-value, ":", cnonce-value }
        /// </summary>
        private byte[] GetA1()
        {
            string userToken = string.Format("{0}:{1}:{2}",
                                             _username, _realm, _password);

            Debug.WriteLine("userToken=" + userToken);

            byte[] bytes = _encoding.GetBytes(userToken);

            byte[] md5Hash = _md5.ComputeHash(bytes);

            // Use this for validation purposes from unit testing.
            _userTokenMd5HashHex = ConvertToBase16String(md5Hash);

            string nonces = string.Format(":{0}:{1}",
                                          _challenge.Nonce, _cnonce);

            byte[] nonceBytes = _encoding.GetBytes(nonces);

            byte[] result = new byte[md5Hash.Length + nonceBytes.Length];

            md5Hash.CopyTo(result, 0);
            nonceBytes.CopyTo(result, md5Hash.Length);

            return result;
        }

        /// <summary>
        /// A2 = { "AUTHENTICATE:", digest-uri-value }
        /// </summary>
        private string GetA2()
        {
            string result = string.Format("AUTHENTICATE:{0}", _digestUri);

            return result;
        }

        /// <summary>
        /// Converts a byte array to a base16 string.
        /// </summary>
        /// <param name="bytes">The bytes to convert.</param>
        /// <returns>The hexadecimal string representation of the contents 
        /// of the byte array.</returns>
        private string ConvertToBase16String(byte[] bytes)
        {
            StringBuilder buffer = new StringBuilder();

            foreach (byte b in bytes)
            {
                // "x2" formats the byte as two lowercase hex digits.
                string s = b.ToString("x2");

                buffer.Append(s);
            }

            Debug.WriteLine(string.Format("Converted {0} bytes", 
                bytes.Length));

            string result = buffer.ToString();

            return result;
        }
    }
}
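
For cross-checking the hashes this class exposes, it helps to have a second, independent implementation of the same calculation.  Here is a minimal sketch in JavaScript for Node.js, assuming qop=auth and no authzid (the same case the class above handles).  The names md5 and digestResponse are my own for illustration and don't come from any library.

```javascript
// Sketch of the DIGEST-MD5 response calculation (RFC 2831, qop=auth, no authzid).
// md5 and digestResponse are illustrative names, not part of any library.
const crypto = require("crypto");

// MD5 of a Buffer, returned as a raw Buffer (not hex).
function md5(data) {
    return crypto.createHash("md5").update(data).digest();
}

function digestResponse(options) {
    const { username, realm, password, nonce, cnonce, digestUri } = options;
    const nc = options.nc || "00000001";
    const qop = options.qop || "auth";

    // A1 = { H(username:realm:password), ":", nonce, ":", cnonce }
    // The inner hash stays as raw bytes; only the outer hashes are hex encoded.
    const userToken = md5(Buffer.from(`${username}:${realm}:${password}`, "utf8"));
    const a1 = Buffer.concat([userToken, Buffer.from(`:${nonce}:${cnonce}`, "utf8")]);

    // A2 = "AUTHENTICATE:" followed by the digest URI.
    const a2 = Buffer.from(`AUTHENTICATE:${digestUri}`, "utf8");

    const a1Hex = md5(a1).toString("hex");
    const a2Hex = md5(a2).toString("hex");

    // response = HEX(H(HEX(H(A1)):nonce:nc:cnonce:qop:HEX(H(A2))))
    const kd = `${a1Hex}:${nonce}:${nc}:${cnonce}:${qop}:${a2Hex}`;
    return md5(Buffer.from(kd, "utf8")).toString("hex");
}
```

Feeding both implementations the same username, realm, password, nonce, cnonce, and digest URI should produce the same 32 character hex string.  RFC 2831 includes a worked exchange in its examples section that can serve as a test vector for either one.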

Happy DIGESTing!

# Wednesday, July 08, 2009

Leveraging The Windows Forms WebBrowser Control (For The Win)

Wednesday, July 08, 2009 1:28:32 PM UTC

I've been working on a little utility to experiment with the XMPP protocol.  The idea was to write a tool that would allow me to send, receive, and display the XML stream and XML stanza messages at the core of XMPP.

Of course, I could have implemented it using a simple multi-line text box for the XML entry, but that would mean that I wouldn't have nice things like syntax highlighting (for XML) and nice (auto) indentation.

On the desktop, I'm not familiar with any free Windows Forms editor controls that are capable of syntax highlighting.  But on the web side, there are several free, open source script libraries at our disposal, such as CodePress, EditArea, and CodeMirror.

I chose CodeMirror for this application as it was the simplest library that met my needs.

There really aren't any tricks to this aside from getting the URL correct.  In this case, I have my static HTML content in a directory in my project (the HTML directory referenced in the code below).

I set the content files to "Copy always" in the properties pane so that they get copied to the output directory of the project.  To set the correct path, I find the executing directory of the application and set the URL properly:

protected override void OnLoad(EventArgs e)
{
    if (!DesignMode)
    {
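        // Application.StartupPath is the directory of the executable.
        // Environment.CurrentDirectory is the working directory, which can
        // differ (for example, when launched from a shortcut).
        string path = Path.Combine(
            Application.StartupPath, 
            "HTML/input-page.html");
        path = path.Replace('\\', '/');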

        _inputBrowser.Url = new Uri(path);
    }

    base.OnLoad(e);
}

Note that I check whether the control is in design mode; the designer throws an error if you don't, since the path then points to Visual Studio's runtime directory instead of your application's output directory.  Now all that's left is to get the script and style references correct in your HTML page:

<script src="../HTML/CodeMirror/js/codemirror.js" type="text/javascript"></script>
<link rel="stylesheet" type="text/css" href="../HTML/CodeMirror/css/docs.css" />  

And in your script:

<script type="text/javascript">
    var editor = CodeMirror.fromTextArea('code', {
        height: "210px",
        parserfile: "parsexml.js",
        stylesheet: "../HTML/CodeMirror/css/xmlcolors.css",
        path: "../HTML/CodeMirror/js/",
        continuousScanning: 500,
        lineNumbers: true,
        textWrapping: false
    });
</script>

The final part is getting your Windows Forms code talking to the JavaScript in the browser.  In this case, I've written some simple JavaScript that interacts with the editor control:

<script type="text/javascript">
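    // Wrap the call rather than aliasing the method; a detached
    // getElementById loses its document binding in some browsers.
    function $g(id) {
        return document.getElementById(id);
    }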

    function resize(height) {
        var textArea = $g("code");

        // Get the iframe.
        var editor = textArea.nextSibling.firstChild;

        editor.style.height = (height - 1) + "px";
    }

    function reset() {
        editor.setCode("");
    }

    function addContent(data) {
        var code = editor.getCode();

        editor.setCode(code + data);

        editor.reindent();
    }

    function setContent(data) {
        editor.setCode(data);

        editor.reindent();

        // Scroll to bottom.
        editor.selectLines(editor.lastLine(), 0);
    }

    function getContents() {
        return editor.getCode();
    }
</script>

From the application code, we can call these script functions using the InvokeScript method:

private void AddStreamMessageInternal(string data)
{
    if (_streamBrowser.Document != null)
    {
        // Get the contents
        string code = (string) _streamBrowser.Document.InvokeScript(
            "getContents", new object[] {});

        code = code + data;

        code = code.Replace("><", ">\r\n<");

        _streamBrowser.Document.InvokeScript(
            "setContent", new object[] {code});
    }
}

private void AdjustSize()
{
    // Call a resize method to change the HTML editor size.
    if (_streamBrowser.Document != null)
    {
        _streamBrowser.Document.InvokeScript(
            "resize", new object[] {_streamBrowser.ClientSize.Height});
    }
}

public void RefreshBrowserContents()
{
    if (_streamBrowser.Document != null)
    {
        _streamBrowser.Document.InvokeScript(
            "reset", new object[] {});
    }
}

Awesome! You can see that I can both pass arguments into the JavaScript functions and read their return values.  The big win is that you can now take advantage of your favorite JavaScript utilities in your Windows Forms applications.
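
One variation worth considering: the append-and-reformat step in AddStreamMessageInternal takes two trips across the script boundary (one to read, one to write).  The same logic could live in the page script instead, so the application makes a single InvokeScript call.  This is only a sketch; appendStanza and addStreamMessage are hypothetical names of mine, and addStreamMessage assumes the getContents and setContent functions shown earlier are on the same page.

```javascript
// Hypothetical pure helper: append incoming XML to the existing text and put
// each element on its own line, mirroring the C# Replace("><", ">\r\n<").
function appendStanza(existing, data) {
    var combined = (existing || "") + (data || "");

    // Only adjacent "><" pairs are touched, so text that was already
    // broken onto separate lines is left unchanged.
    return combined.replace(/></g, ">\r\n<");
}

// Hypothetical entry point for InvokeScript. It relies on the page's
// getContents() and setContent() functions defined earlier.
function addStreamMessage(data) {
    setContent(appendStanza(getContents(), data));
}
```

The application side would then shrink to one call along the lines of InvokeScript("addStreamMessage", new object[] {data}), with the null check on Document kept as before.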
