Random Thoughts of a Scatterbrain.
 Thursday, April 24, 2008

FOR XML Needs More Love

4/24/2008 1:55:27 PM (Eastern Daylight Time, UTC-04:00)

I'm constantly amazed by the number of developers who have never worked with FOR XML EXPLICIT and the new FOR XML PATH.  If I were designing data access, it would be my go-to commands for building queries for complex data structures (nested DataReaders?  yuck!).

In the past, to support paging using FOR XML EXPLICIT queries took tons of lines to accomplish (although there is something about the whole explicitness that makes it surprisingly legible).  Now with the fancy pants ROW_NUMBER function in SQL along with CTEs, a hundred line query can be written with maybe 15-20 lines.

Here's a simple example that you can copy+paste and run:

/* 
Demonstrates usage of ROW_NUMBER and FOR XML PATH to create 
pageable XML results queries.

In this case, the key is to page only on the Route objects.
*/

--// Mock Route table
DECLARE @Route TABLE (
    Id int,
    Title varchar(100)
)

--// Mock Step table
DECLARE @Step TABLE (
    Id int,
    RouteId int,
    Title varchar(100),
    Sequence int
)

--// Insert mock data
INSERT INTO @Route VALUES (1, 'Route 1')
INSERT INTO @Route VALUES (2, 'Route 2')
INSERT INTO @Route VALUES (3, 'Route 3')
INSERT INTO @Route VALUES (4, 'Route 4')
INSERT INTO @Route VALUES (5, 'Route 5')

--// Route 1 Steps
INSERT INTO @Step VALUES (1, 1, 'Step 1.1', 1)
INSERT INTO @Step VALUES (2, 1, 'Step 1.2', 2)
INSERT INTO @Step VALUES (3, 1, 'Step 1.3', 3)
INSERT INTO @Step VALUES (4, 1, 'Step 1.4', 4)
INSERT INTO @Step VALUES (5, 1, 'Step 1.5', 5)

--// Route 2 Steps
INSERT INTO @Step VALUES (6, 2, 'Step 2.1', 1)
INSERT INTO @Step VALUES (7, 2, 'Step 2.2', 2)
INSERT INTO @Step VALUES (8, 2, 'Step 2.3', 3)

--// Route 3 Steps
INSERT INTO @Step VALUES (9, 3, 'Step 3.1', 1)

--// Route 4 Steps
INSERT INTO @Step VALUES (10, 4, 'Step 4.1', 1)

--// Route 5 Steps
INSERT INTO @Step VALUES (11, 5, 'Step 5.1', 1)
INSERT INTO @Step VALUES (12, 5, 'Step 5.2', 2)
INSERT INTO @Step VALUES (13, 5, 'Step 5.3', 3)
INSERT INTO @Step VALUES (14, 5, 'Step 5.4', 4)

/*
Define the page size.
 -- Add sorting and ordering later
*/

DECLARE @PageSize int
DECLARE @CurrentPage int

SET @CurrentPage = 0
SET @PageSize = 3

/*
Calculate starting and ending row.
*/
DECLARE @StartIndex int
DECLARE @EndIndex int

SET @StartIndex = @CurrentPage * @PageSize
SET @EndIndex = @StartIndex + @PageSize

; --// Need to terminate with a semicolon for CTE 
/*
Perform core XML select
*/

WITH Routes AS (
    SELECT 
        *,
        ROW_NUMBER() OVER (ORDER BY Id) AS 'RowNumber'
    FROM
        @Route
)
SELECT
    Routes.Id AS "@Id",
    Routes.Title AS "@Title",
    (
        SELECT
            Step.Id AS '@Id',
            Step.Title AS '@Title'
        FROM
            @Step AS Step
        WHERE
            Step.RouteId = Routes.Id
        ORDER BY 
            Step.Sequence ASC
        FOR XML PATH('Step'), TYPE            
    ) AS 'Steps'
FROM 
    Routes
WHERE
    RowNumber > @StartIndex AND RowNumber <= @EndIndex  --// BETWEEN Results in improper paging
FOR XML PATH('Route'), ROOT('Routes')

What's great about this is that if your object model is properly designed, it's just a matter of deserializing the XML (using precompiled serialization binaries, of course) to rehydrate your data model.

In this case, the output XML looks like this:

<Routes>
  <Route Id="1" Title="Route 1">
    <Steps>
      <Step Id="1" Title="Step 1.1" />
      <Step Id="2" Title="Step 1.2" />
      <Step Id="3" Title="Step 1.3" />
      <Step Id="4" Title="Step 1.4" />
      <Step Id="5" Title="Step 1.5" />
    </Steps>
  </Route>
  <Route Id="2" Title="Route 2">
    <Steps>
      <Step Id="6" Title="Step 2.1" />
      <Step Id="7" Title="Step 2.2" />
      <Step Id="8" Title="Step 2.3" />
    </Steps>
  </Route>
  <Route Id="3" Title="Route 3">
    <Steps>
      <Step Id="9" Title="Step 3.1" />
    </Steps>
  </Route>
</Routes>

Next, you'll need some simple code to deserialize the XML to make it useful:

using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.IO;
using System.Xml.Serialization;

namespace XmlDeserialization {
    internal class Program {
        private static void Main(string[] args) {
            string xml = ...; // XML string here

            RouteList routes = new RouteList(xml);

            Console.Out.WriteLine(routes.Count);

            foreach(Route route in routes) {
                Console.Out.WriteLine(" + Route: {0}", route.Id);

                foreach(Step step in route.Steps) {
                    Console.Out.WriteLine("   + Step: {0}", step.Id);
                }
            }
        }
    }
    
    public abstract class XmlDeserializingList<Titem> : List<Titem> {
        protected XmlDeserializingList() { }

        protected XmlDeserializingList(string xml) {
            StringReader reader = new StringReader(xml);

            XmlSerializer serializer = new XmlSerializer(GetType());
            XmlDeserializingList<Titem> items = (XmlDeserializingList<Titem>)serializer.Deserialize(reader);

            AddRange(items);
        }
    }

    [XmlRoot("Routes")]
    public class RouteList : XmlDeserializingList<Route> {
        public RouteList() {}

        public RouteList(string xml) : base(xml) {
            
        }
    }

    [Serializable]
    public class Route {
        private int id;
        private string title;
        private Collection<Step> steps;

        [XmlAttribute]
        public int Id {
            get { return id; }
            set { id = value; }
        }

        [XmlAttribute]
        public string Title {
            get { return title; }
            set { title = value; }
        }

        [XmlArray("Steps"), XmlArrayItem("Step", typeof(Step))]
        public Collection<Step> Steps {
            get { return steps; }
            set { steps = value; }
        }
    }

    [Serializable]
    public class Step {
        private int id;
        private string title;

        [XmlAttribute]
        public int Id {
            get { return id; }
            set { id = value; }
        }

        [XmlAttribute]
        public string Title {
            get { return title; }
            set { title = value; }
        }
    }
}

It might even be useful to add some abstract methods (and properties to support it) to GetNextPage()

 Thursday, September 27, 2007

Dynamic SQL: Yea or Nay?

9/27/2007 2:42:01 PM (Eastern Daylight Time, UTC-04:00)

I've always been on the side of stored procedures in the classic debate over the merits of dynamic SQL.  In reality, I can only think of one good scenario where dynamic SQL at the application layer should be used: programmatic batch inserts.

I won't go into the performance debate, since there are tons of articles that already cover this area, but rather, I'd like to discuss the usability and development and architectural aspect of it.

In almost all other cases, it seems like the best choice is to have the application not generate dynamic SQL and use a stored procedure...always.  There are certainly times when dynamic SQL is necessary, for example, when generating selects against a dynamic table structure, but in those cases, the variable portions of the query can be parameterized into the stored procedure and the procedure should generate the dynamic SQL.

Some would argue that if the underlying data models change, the application layer will usually be forced to change are ignoring other aspects of model changes that don't necessitate application model changes.  These include performance tuning, filtering by table JOINs and reuse of the data logic in nested stored procedures or functions.

When working with compiled code like .NET, the core issue is that fixing query errors involves a recompile and redeploy, which in most cases, is much more difficult than just fixing a completely disconnected (but not completely decoupled since there is a quasi-interface (the return result type and structure)) stored procedure.

For example, if a dataset today contains data from table A and tomorrow it needs to include data from table A and B (let's say they both contain the same elements, but one is used for archives), it would be easy to update the procedure to UNION the results from the two datasets without affecting the application layer.

This isn't the only scenario, for example, let's say the requirement changes and now the data needs to be filtered by another table.  It would be easy to add a new INNER JOIN to the query without affecting the application layer.  Not only that, it also allows for the recombination of fields (for example a user name field today only needs to show first and last name, but tomorrow, it may need to show the middle initial as well - this change can be done at the database level and not affect the application or UI layers).  It can also make it easy to change the underlying table structure so long as the return data isn't expected to change: it provides a layer of decoupling between the application layer and the raw data storage.

In addition, having a stored procedure allows for easier testing of the data layer without the added overhead of having to execute the application runtime and walk through the debugger line by line just to figure out if the return data is correct; it is much more efficient to simply execute the query and simulate the use case to find if the data that is returned is correct.  It becomes much easier and much less painful to simulate data access tests since they can be run, observed, and analyzed nearly instantly.

In larger organizations with dedicated DBAs, stored procedures have the added benefit of allowing SQL experts to add performance tuning to eek out extra performance without requiring the application to be rewritten or recompiled.  Again, we see this decoupling of the application layer from the data layer.  Of course you could always have templated SQL stored in XML files or something that would get rid of that recompile, but it is still likely to necessitate more redeployment if the application in question is distributed.  This key point is not to be taken lightly since -- as an example -- an error in string formatting may require the replacement of binaries and services across dozens of servers.  Not only that, testing in such a scenario still requires interaction with the application layer, adding to the possible failure points, time required, and general development pain.

My own conclusion is that using dynamic SQL (including LINQ) creates too tight of a coupling between the application layer binaries and the underlying data store; it's great for RAD and testing, but in any application of significance (especially in highly distributed environments), dynamic SQL at the application layer seems like it's a maintenance and testing disaster waiting to happen.

 Tuesday, August 21, 2007

Working With SQL Server Compact Edition 2005

8/21/2007 6:05:02 PM (Eastern Daylight Time, UTC-04:00)

One interesting issue that I just solved involved how to specify the location of the database file for a SQL Server Compact Edition 2005 connection string in a .Net add-in for Microsoft Office.

You see, when the add-in starts, it sets the context directory as the user's documents directory, which of course, makes it impossible to enter a configuration string for the data source of the connection string.

It works fine if the directory is hard coded - which is what I did for testing purposes initially, but of course, when I switched over to XP64, this broke as on XP64, the program is installed to "Program Files (x86)".

The solution lies buried in Microsoft's SQL CE documentation: there's a note that you should use a special token with the connection string like so:

<connectionStrings>
	<add name="ClientDatabase" 
		connectionString="Data Source=|DataDirectory|\data-file.sdf"
		providerName="Microsoft.SqlServerCe.Client" />
</connectionStrings>

The token needs to be included exactly as entered "|DataDirectory|". So how is this token replaced? In the static constructor of my Connect class that was autogenerated by Visual Studio, I added the following code:

/// <summary>
/// Initializes the logging subsystem for the <see cref="Connect"/> class.
/// </summary>
static Connect() {
	string path = Assembly.GetExecutingAssembly().Location;
	path = path.Substring(0, path.LastIndexOf('\\'));

	// Set the DataDirectory for the SQL Server CE connection string.
	AppDomain domain = AppDomain.CurrentDomain;
	domain.SetData("DataDirectory", path);
}
 Tuesday, April 17, 2007

Updating Reporting Services 2000 Connection Info

4/17/2007 2:09:01 PM (Eastern Daylight Time, UTC-04:00)

If you're running Reporting Services 2000 and you accidentally change the password or remove the original account used by Reporting Services, you will have to update this information.

With Reporting Services 2005, there's a nice GUI utility that you can use to do this, but for Reporting Services 2000, you'll need the rsconfig.exe command line utility which can be found, in a typical installation, at: <drive>:\Program Files\Microsoft SQL Server\80\Tools\Binn

To update the account or password information, the following command will probably do:

rsconfig -c -s servername -d ReportServer -a sql -u username -p password -t

Useful to know...odd that with all of the RS development I've done in the past, I've never had to use this utility once until today.

 Monday, January 08, 2007

ContentTypeIds In WSS3

1/8/2007 4:50:43 PM (Eastern Standard Time, UTC-05:00)

In WSS3, if you execute the following SQL:

    SELECT
        ContentTypeId
    FROM 
        ContentTypes

You will notice that the content types are represented in the output as hex.  If you take a look at the table definition, you'll see that the actual data type of the column is VARBINARY(512).

Doing a lookup like so:

    SELECT
        ContentTypeId
    FROM 
        ContentTypes
    WHERE 
        ContentTypeId = '0x101'

Will not work since you cannot perform a comparison between VARBINARY and a character data type directly.

Doing the following will also not work:

    SELECT
        ContentTypeId
    FROM 
        ContentTypes
    WHERE 
        CAST(ContentTypeId AS VARCHAR(512)) = '0x101'

This doesn't work because the underlying type of the binary data isn't character data.  It's integer data.  You can confirm this by running the following query:

    SELECT
        ContentTypeId,
        CAST(ContentTypeId AS VARCHAR(512))
    FROM 
        ContentTypes

You'll see that it's just a bunch of gibberish.  Try the same query with INT and you'll see that the data makes much more sense.  What you'll notice is that content types that inherit from a base content type will have numerical values that increment by 1.

This information is useful, but not nearly as useful as the data that you can glean from the hex string representation of the ContentTypeId.  You see, in the hex string representation, the base ID is a substring of the ID of any inheriting content type.  For example, if I have a content type which has a ID (as a hex string) of 0x0101345346345312234346, then any child content types will have 0x0101345346345312234346 as a substring (e.g. 0x010134534634531223434601, 0x010134534634531223434602).

So how do we get this data in SQL Server for comparison purposes?  We need to use an "undocumented" SQL function: fn_varbintohexstr().

This allows you to do nifty queries to find a given content type and all child content types (or any query where you have to retrieve information about a hierarchy of content types) like so:

    SELECT
        *
    FROM 
        ContentTypes
    WHERE 
        master.dbo.fn_varbintohexstr(ContentTypeId) LIKE 
            '0x0101345346345312234346%'

You can find out more information on this function here.

 Thursday, November 17, 2005

Fun in Philly

11/17/2005 8:56:45 PM (Eastern Standard Time, UTC-05:00)

Wow, I had a lot more fun in Philly than I thought I would have.

Back up a bit.  I had originally registered for the Microsoft launch event in Philly as a participant a few weeks back.  Last week, Rich, my new manager, got 4 of us together, the only 4 that had really played around with SQL Server 2005, and asked if we could come up with a demo for the show as, apparently, we had a presence at the show. 

Long story short, I ended up taking up part of the responsibility of building the demo and also working the booth at the show.  Somehow, we ended up building a database mirroring demo which featured an ASP.Net 2.0 and Atlas frontend.

I also ended up with the responsibility of creating some posters and datasheets that we could hand out along with our demo.  Keep in mind that all of this came up last Thursday.  Having technical and graphical skills is both a curse and a blessing I guess.

Well, in any case, the show was a huge success.  We had people lining up to look at the demo and to talk to Igor and myself.  For an app that was put together in 10-12 hours, it was surprisingly stable.  We only had one major issue and that was a networking issue (dropped IP addresses).  It was so good that we had people coming back to the booth after seeing the datasheet from other people.  We had people coming back with their friends because they were so impressed.  It was quite amazing.  We ran out of the datasheets for the demo in the first 90 minutes of the show (maybe the first 60).  Fortunately, we also had it in poster form (which we didn't give away).  It drew quite a bit of attention and there were a lot of people who just stopped in their tracks to check it out.  The funny story regarding that is I almost didn't have the poster printed.  I was originally only going to print 8.5x11 copies, but Frank, another consultant, suggested that I go with the poster.

poster_web.gif

Believe it or not, I put it together in ~3-4 hours Tuesday night after I got back from a client site.  Everything on this sheet was created from scratch (Except for the computer graphic, which came from Visio).  I swear I was half asleep when I put it all together :-D

I actually heard a lot of comments on the poster, even one from a professor who said she really loved the layout and colors.  Even my wife was impressed that I put this together when I did/in the amount of time that I did it in.

So all in all, a fun but tiring day.  Man, I just love building cool stuff and I love connecting with other developers.  Hopefully, I'll have more opportunities like this in the future.

 Saturday, November 12, 2005

SQL, VS.Net, and BizTalk 2005 Release Event

11/12/2005 10:38:11 AM (Eastern Standard Time, UTC-05:00)

I'll be in Philly next week for the Microsoft release event for SQL Server 2005, Visual Studio.Net 2005, and BizTalk 2006 next week (11/17).

I'm not only going as a participant, but a small group of us will be there representing EMC, one of Microsoft's partners for the event.  Rich Millman, my current MC, will be on some speaking panel and a couple of us will be manning a booth on the floor.

As there were only 4 of us in the group at INS Piscataway that have played around with and read up on SQL Server 2005, I was invited into a brainstorming session on what type of demo that we could put together that would draw people in and hopefully get some new contacts.

My first thought was to create a failover cluster using 4 spare PCs that we had.  Not that it's a great demo of the new features of SQL Server 2005, but I figured that it would draw people's attention and since most developers probably very rarely interact with failover clusters.  It would have been cool to let people walk up and plug/unplug indivitual nodes and watch it failover automatically.  Alas, we didn't have any spare hardware sitting around to build the disk array and it was probably too late to borrow anything from EMC.

My second idea was a little better.  Even though database mirroring isn't officially supported in this release of SQL Server 2005, I figured it would be cool to demonstrate it as it's much more likely to be used than failover clustering due to the low cost of implementation.  And so, myself and Igor went about building a demo setup for database mirroring.

One of the first challenges we had to overcome was to figure out why it's not officially supported in this release.  Obviously, it would have sucked to spend hours working on the architecture and UI only to realize that mirroring was buggy and unstable.  After some research, it turns out that the primary reason for not supporting it in this release is because of the fact that Microsoft couldn't find enough beta testers to fully test the new feature.  With that in mind, we decided we could probably pull it off and Igor and I started to dig in.

It's actually fairly cool and takes advantage of a lot of the new features of the 2005 suite.  The UI is an ASP.Net application that utilizes Atlas to retrieve data from a web service.  Another web service was written to interact with the host machine services to stop and start the individual SQL Server instances.  Very cool.  I'm hoping I can talk them into letting me post the code and walkthrough here for anyone that wants to try to set up mirroing.

demo_screencap_t.jpg

So if you're going to be in Philly at the launch event, look for INS there!  The demo is very cool.

 Monday, October 31, 2005

Installing VS2005 and SQL2005

10/31/2005 1:30:37 PM (Eastern Daylight Time, UTC-04:00)

RTM versions were downloaded over the weekend and I'm setting them up now.

Weird error encountered with SQL Server 2005 setup; On a WinXP SP2, I'm getting the following message:

"SQL Server Setup failed to modify security permissions on WMI namespace \\.\root\Microsoft\SqlServer\ServerEvents\DEV2005. To proceed, verify that the account and domain running SQL Server Setup exist, that the account running SQL Server Setup has administrator privileges, and that the WMI namespace  exists on the destination drive."

I googled the first line and found no results, so I figured I'd post this in case anyone else runs across this error.  Further googling led me to an MSDN article regarding Setting Namespace Security.  I followed the instructions but wasn't able to find the namespace \\.\root\Microsoft\SqlServer\ServerEvents\DEV2005.  Note that I'm installing 2005 as a named instance ("DEV2005" being the name of my instance), so this may not be an issue with default instances.

My current action is to abort the install and try again...I'll update if there are other developments.

Update: I switched to the default, non-named instance and it seems to work fine...however, I get an error when the installer tries to install the sample databases 'doh!  It says that my account is not an admin account, even though it is...

 Thursday, October 06, 2005

Workshop : EditPlus + SQL

10/6/2005 8:32:48 PM (Eastern Daylight Time, UTC-04:00)

Welcome to the second installment of my workshop series of articles.

This time, the workshop focuses on tools, for what is more important to the master craftsperson than the tools used in the craft?

While most people are resigned to using Microsoft's default IDEs for working with SQL Server (Query Analyzer, SQL Management Studio), both utilize terribly subpar text editors which are just too clunky to work with when dealing with complex procedures.

Enter EditPlus, the greatest text editor, ever.  Evar.  I've been using it since my Freshman year in college when a co-worker at iCan Services introduced it to me.  I've been using it for everything from Prolog to Java to C# to DHTML to TSQL.  With a little help from some of the tools that ship with SQL Server, we can hook up EditPlus so that working with TSQL is no longer a pain in the butt.

If you're tired of working with Query Analyzer and you're a cheap bastard like me, then this workshop is for you ;-)

Please feel free to leave comments, questions, criticisms, error reports, and what not in the thread.

Enjoy.

 Wednesday, September 28, 2005

Workshop : Configurable DTS Packages

9/28/2005 5:37:44 PM (Eastern Daylight Time, UTC-04:00)

I finished my first "workshop" article!

Had a bit of free time and decided to finish it as it's something that I've been meaning to do for quite a while now.

As an aside, the "workshop" series of articles are meant to be my gesture of giving back to the community from which I've taken so much.  I know I've found tons of useful blog posts and articles online that have helped me greatly in the past, so I feel that it's only fair that I give back a bit.   I don't claim to be some guru or know all of the answers, but my hope is that the information that I provide in these articles can help someone out there get their job done more easily.  I'm also aware that someone out there may have already done something like this, but my own experience tells me that it's always helpful to work with various sources to gain different insights.

Now onto the article!

In March, I worked on creating a SQL Server DTS package to import data from various data files.  One aspect that grew tiring really fast was having to manually change the paths of the source data files each time I moved the package from one environment to another (or even from my machine to Kent's).  We had 6 datafiles at the time (early testing/development phase) in addition to various other settings that were environment dependent like mail server addresses and database logins.  I was simply too lazy to keep fiddling with the settings in DTS every time I had to send the package to the client or to Kent.

The solution I came up with makes use of DTS global variables, the powerful "Dynamic Properties Task" task, and some simple VBScript.

Still interested?  Then read the full article.

Feel free to leave comments, questions, and criticisms :-) in this thread

RSS 2.0 Atom 1.0 CDF