TechnicalArchitectureWorx

The (Unofficial) ITWorx Technical Architecture Blog

Continuous Integration: Differencing .NET Assemblies – MVID regenerates for every compilation

Posted by archworx on January 22, 2007

Background: This post is about the challenges of differencing binaries, that is, determining whether the assemblies on the testing environment are identical to the assemblies generated from the code in SourceSafe. Read on for more details.

The Problem: At the end of your development efforts, you typically need to subject your code base to testing, and then upon testing approval, you ship your code to the client. However, bear in mind the following:

  1. Your code base is in a source control system.
  2. Your binaries are not; they are on the development and testing servers. It doesn’t make sense from a configuration management perspective to store binaries in source control, because:
    1. You can always (theoretically) regenerate them exactly from the source.
    2. It is a huge burden to ensure object/source consistency.
    3. Even if you do keep them, you’ll have to check that the source and object are indeed consistent, which is extra overhead.
  3. You typically need to ship to your client the binaries that are on your testing server – as those are the ones that have been approved by the testers.
  4. You can’t ship the source unless you are sure it generates the binaries the developers claim they have produced on the testing environment.
  5. So the obvious process is to retrieve your source code from your source repository, recompile it, and check that the result matches the binaries on the testing server.
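
As a rough illustration of the check this process implies, here is a minimal sketch in Python that compares the tested binary with the rebuilt one by hashing both files. The function names and the example paths are hypothetical and not part of any tool mentioned in this post; any byte-for-byte comparison would do the same job.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def binaries_match(tested_path, rebuilt_path):
    """True only if both files are byte-for-byte identical (same hash)."""
    return sha256_of(tested_path) == sha256_of(rebuilt_path)
```

For example, `binaries_match("testing/MyLib.dll", "rebuild/MyLib.dll")` (hypothetical paths) would be expected to return False for most .NET builds, for the reasons discussed in the rest of this post.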

Enter .NET assemblies, which put the following obstacles in the way of compiling the same source into the exact same binary twice:

  1. By default, .NET assemblies change their version number every time you compile. This is a good thing, as it provides very good tracking of version numbers, something that is sadly lacking in many developers’ culture. However, it means that binary differencing will yield false positives: two builds of the same source will differ.
  2. If you need your assembly to be hosted in the GAC, or otherwise want to sign your assembly, it must be strongly named. This can pose challenges if you sign with keys that are not properly controlled.
  3. The assembly’s metadata also contains a field called the “MVID”, the Module Version Identifier. This field’s sole purpose is to be unique each time the module is compiled. This is a rather powerful concept: it is the first time I’ve personally seen someone want to distinguish one compilation instance from another, irrespective of the code being compiled.

The Solution: This article is about attempts to solve the three aspects of the problem described above. At this time, we have a simple solution for the version numbers and a workaround for the signatures, and we have hopeful indicators that the MVID issue can be resolved too.

Resolutions:

  1. Version Numbers – can be fixed explicitly by removing the “*” wildcard from the version for release builds. You can find this field in the AssemblyInfo file.
  2. Strongly Named – let’s ignore this case temporarily.
  3. MVID – we believe this can be controlled via a compiler option, but I have yet to find it.
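
To make the first resolution a little more concrete, here is a minimal sketch in Python of a check that flags version attributes still using the “*” wildcard. It assumes C#-style attributes such as `[assembly: AssemblyVersion("1.0.*")]` in an AssemblyInfo.cs file; the function name is hypothetical and the pattern is only an approximation of the attribute syntax.

```python
import re

# Approximate pattern for C# attributes such as
#   [assembly: AssemblyVersion("1.0.*")]
#   [assembly: AssemblyFileVersion("1.0.0.0")]
VERSION_ATTRIBUTE = re.compile(
    r'\[\s*assembly\s*:\s*(Assembly(?:File)?Version)(?:Attribute)?'
    r'\s*\(\s*"([^"]*)"\s*\)\s*\]'
)


def find_wildcard_versions(assembly_info_text):
    """Return (attribute name, version) pairs whose version contains '*'."""
    hits = []
    for line in assembly_info_text.splitlines():
        if line.lstrip().startswith("//"):
            continue  # ignore commented-out attributes
        for name, version in VERSION_ATTRIBUTE.findall(line):
            if "*" in version:
                hits.append((name, version))
    return hits
```

A release build script could run this over every AssemblyInfo.cs it finds and refuse to continue if the returned list is not empty.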

The rest of this post is mostly dedicated to discussing the MVID issue.

Tools:
Intermediate Language Disassembly:
ildasm /text /all file.dll

Impact:
The MVID is used by the .NET CLR to determine whether or not to reload the precompiled assembly data.
This is to allow caching such precompiled data, and consequently ensuring cache integrity.
This would imply that the MVID is only useful when precompiled information exists in the assembly.
Typically, precompilation only happens when you use NGEN.EXE.
Consequently, not generating an MVID, or always generating it with the same ID, is not necessarily a dangerous idea to contemplate.

Note:
Empirical observation has shown that NAnt manages to automagically generate the same MVID each time it recompiles, thus dispelling the myth that it must be unique for every “compilation”. There must be a way to mimic NAnt’s communication with the C# compiler, as it must be using it to do the compilation. There is no way that NAnt is faking a compilation. Or is there? ;)

Epilogue:
The observations proposed herein are very encouraging, even in so far as they encourage extreme ideas, such as:
1. Manually coercing the same GUID value for the MVID for otherwise identical compilations (by injecting it into the binary, for example), because this would theoretically not jeopardize the sanity of the NGEN-generated data.
2. Doing a manual textual comparison of the assembly’s code via ildasm /text and a script that conceals the MVID information.
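
The second idea could be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: it assumes the two disassemblies have already been produced with ildasm /text and saved as text, and that the MVID shows up as a GUID in braces (for example on a `// MVID: {...}` comment line). The function names are hypothetical. It masks every brace-delimited GUID, which is broader than the MVID alone.

```python
import re

# A GUID in braces, e.g. {A224F460-A049-4A03-9E71-80A36DBBBCD3}.
# Assumption: this is how the MVID appears in the ildasm text output.
GUID = re.compile(
    r"\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}"
)


def normalize_il(il_text):
    """Return the IL text as a list of lines with every braced GUID masked."""
    return [GUID.sub("{GUID}", line.rstrip()) for line in il_text.splitlines()]


def il_equivalent(first_il, second_il):
    """True if the two disassemblies are identical once GUIDs are concealed."""
    return normalize_il(first_il) == normalize_il(second_il)
```

Note that this is deliberately blunt: it would also hide a GUID that is a genuine part of the code, and it does nothing about differences other than GUIDs, so a result of True is a hint rather than a proof that the assemblies are functionally identical.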

More on this later.

13 Responses to “Continuous Integration: Differencing .NET Assemblies – MVID regenerates for every compilation”

  1. I would like to suggest an alternative solution for the problem: why not enforce that all source files are built with a specific version whenever we make a build?

    There are tools that can modify the AssemblyInfo.cs content for you, like NAnt (http://vidmar.net/weblog/archive/2006/10/18/3237.aspx) and MSBuild (http://www.gotdotnet.com/codegallery/codegallery.aspx?id=93d23e13-c653-4815-9e79-16107919f93e).

    So, after we get the latest version of the source code, we call a tool that will change all versions in all AssemblyInfo.cs files and build everything.

    I would say that we have been doing this with CruiseControl.NET for nightly builds and triggered builds; whenever a new build is triggered, a new version number is generated and then all AssemblyInfo.cs files are updated with this version.

  2. Pranav said

    How do I modify the AssemblyInfo.cs file from a NAnt build script, e.g. AssemblyVersion and AssemblyFileVersion?

  3. Matt said

    I still believe you’ll have problems relating to other elements in the general PE header structures. We’ve been working on a differencing tool for a few weeks now, and have delved into modifying timestamps and strings in the #Strings stream (usually based on method names, specifically PrivateImplementationClass{..GUID..} in the examples we’ve come across). Even some bytes in the #BLOB stream were showing as different in some assemblies which we knew were functionally identical. There are more differences than those you have highlighted, and even more than I have mentioned.

  4. archworx said

    Thank you for your comment Matt – I am curious to know if you are interested in collaborating to build a list of items that people ought to look out for in terms of differencing otherwise functionally identical assemblies?

  5. Gandalf Hudlow said

    We are in the process of solving this very problem. What we have run across is that the MVID is mentioned in more than just the top of the ildasm output, it is generated inside. And even more disheartening is that the order of the various blocks of information in the ildasm output changes depending on which machine it was built under. We are in the process of creating a custom diff object that will reconcile two sets of output from two different build machines.

    • scott said

      noticed this just today. very disappointing. fortunately we have a build machine so it is not a problem but it forces me to commit earlier than i may want to just to test for no differences.

  6. Samuel L. said

    If you ever want to see a reader’s feedback :) , I rate this post for 4/5. Decent info, but I have to go to that damn google to find the missed pieces. Thanks, anyway!
