Convention tests – file reading

In the previous article we discussed the possibility of using reflection for convention tests. Today I would like to focus on manual file inspection.

By manual file inspection I mean old-fashioned file reading and applying a regex to assert that something does or does not exist inside the file. Basically we have two scenarios here:

  • Reading source code files – for things that we cannot assert with reflection
  • Reading other files – because we cannot use reflection on those.

Reading Files

To test what is inside a file, we need to write utilities that help us with file reading. First we need to find the directory that contains the *.sln file. From there we would like to be able to go to another directory and get all files from it. The minimum we need is a method like this:

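using System.Collections.Generic;
using System.IO;
using System.Linq;

private IEnumerable<string> GetFiles(string path)
{
    // Walk up from the working directory until we reach the folder with the *.sln file.
    var d = new DirectoryInfo(Directory.GetCurrentDirectory());
    while (d != null && !d.EnumerateFiles("*.sln").Any())
    {
        d = d.Parent;
    }
    if (d == null)
    {
        // Parent is null at the drive root, so no solution file was found on the way up.
        throw new DirectoryNotFoundException("No *.sln file found above the current directory.");
    }

    // Resolve the requested path relative to the solution directory.
    var fullPath = Path.Combine(d.FullName, path);
    return Directory.EnumerateFiles(fullPath);
}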

We can write more helper methods, for example a method that returns files from a predefined directory, or a method that returns all *.cs files in all directories in case we want to test something for all classes.
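As an illustration of such a helper, here is a minimal sketch of a method that collects all *.cs files below a given root. The names SourceFiles and GetAllCsFiles are hypothetical, and skipping the bin and obj folders is my own assumption about what we usually do not want to test.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SourceFiles
{
    // Returns every *.cs file below rootDirectory, skipping bin and obj folders
    // (assumption: generated and compiled output should not be checked).
    public static IEnumerable<string> GetAllCsFiles(string rootDirectory)
    {
        return Directory
            .EnumerateFiles(rootDirectory, "*.cs", SearchOption.AllDirectories)
            .Where(file => !IsBuildOutput(file.Substring(rootDirectory.Length)));
    }

    // True when any segment of the path relative to the root is a bin or obj directory.
    private static bool IsBuildOutput(string relativePath)
    {
        var separators = new[] { Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar };
        return relativePath
            .Split(separators)
            .Any(segment => segment.Equals("bin", StringComparison.OrdinalIgnoreCase)
                         || segment.Equals("obj", StringComparison.OrdinalIgnoreCase));
    }
}
```

Only the part of the path below the root is inspected, so the helper still works when the solution itself lives somewhere under a folder called bin.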

How do we test something inside a file? First of all we need to open it. If we would like to treat the file content as a string, it is just a matter of calling File.ReadAllText.

Now we have a couple of options. If a test is really simple we can just use string.Contains; for more complex checks we might want to use an actual regex with the Regex.IsMatch method. If our file contains XML, we should think about using something that can deal with traversing XML nodes. If we are opening other types of files, we should consider finding a library that helps us read them.
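To make the difference between these options concrete, here is a small sketch that works on content already loaded into a string. The class and method names are made up for this example, and looking for TODO markers is just an arbitrary rule chosen for illustration.

```csharp
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class ContentChecks
{
    // Simplest option: plain substring search. Also matches inside longer words.
    public static bool ContainsTodo(string content)
    {
        return content.Contains("TODO");
    }

    // Regex option: matches TODO only as a whole word.
    public static bool ContainsTodoWord(string content)
    {
        return Regex.IsMatch(content, @"\bTODO\b");
    }

    // XML option: parse the document and look for an element by its local name.
    public static bool HasElement(string xml, string elementName)
    {
        return XDocument.Parse(xml)
            .Descendants()
            .Any(element => element.Name.LocalName == elementName);
    }
}
```

The substring version is the easiest to read, but it reports a hit for a word like AUTODOC, while the regex version does not. That is usually the point at which a regex starts to pay off.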

Reading source files

You might think that we covered source code analysis in the last article. But in fact you cannot test everything with reflection. There are some cases where a simple regex can do better.

For example, you might want to check whether there are calls to a forbidden API. In the talk I linked last time, Maciek gives the example of a call to DateTime.Now, which can be considered bad practice. We can create a regex that checks whether such a call exists.
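A possible shape of such a check is sketched below. It is not the code from the talk; the ForbiddenApi class and its method are hypothetical names, and the method takes the file content as a string so it can be fed from File.ReadAllText.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ForbiddenApi
{
    // Matches DateTime.Now, but not DateTime.UtcNow or MyDateTime.Now.
    private static readonly Regex DateTimeNow = new Regex(@"\bDateTime\s*\.\s*Now\b");

    // Returns the 1-based numbers of all lines that contain a DateTime.Now call.
    public static List<int> FindDateTimeNowLines(string source)
    {
        var hits = new List<int>();
        var lines = source.Split('\n');
        for (var i = 0; i < lines.Length; i++)
        {
            if (DateTimeNow.IsMatch(lines[i]))
            {
                hits.Add(i + 1);
            }
        }
        return hits;
    }
}
```

Returning line numbers instead of a plain bool makes the failure message of the test more useful. Keep in mind that a regex like this also reports matches inside comments and string literals.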

Another case described by Maciek is checking dependencies between namespaces. Usually we would like dependencies to point in a specific direction. For example, views depend on services, but we don’t want dependencies in the opposite direction. We can create a regex that checks whether any class inside the service namespace has a using directive pointing to the view namespace.
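Here is a minimal sketch of how such a check might look. The class name and the namespaces used in the usage note are hypothetical. The sketch only looks at plain using directives, so fully qualified type names, using static and using aliases are not detected.

```csharp
using System.Text.RegularExpressions;

public static class NamespaceDependencies
{
    // True when the source has a using directive for the given namespace
    // or for one of its child namespaces.
    public static bool HasUsingFor(string source, string forbiddenNamespace)
    {
        var pattern = @"^\s*using\s+" + Regex.Escape(forbiddenNamespace) + @"\s*(\.|;)";
        return Regex.IsMatch(source, pattern, RegexOptions.Multiline);
    }
}
```

In a test we would read every file from the services directory and assert that HasUsingFor(content, "MyApp.Views") is false for each of them, where MyApp.Views stands for whatever the view namespace is called in your solution.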

Since convention testing can be used instead of static code analysis tools, we can look at the rulesets used in such tools for inspiration. For example, here are a couple of rules from StyleCop that are easy to implement:

  • We should use built-in type aliases instead of framework type names (so int instead of Int32)
  • We should use string.Empty instead of an empty string literal
  • We should not use regions
  • etc.
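Two of these rules can be sketched with one-line regexes. This is only an approximation of what StyleCop does, and the StyleRules class is a hypothetical name. I left out the empty string literal rule on purpose, because a regex for it has to deal with escaped quotes and verbatim strings.

```csharp
using System.Text.RegularExpressions;

public static class StyleRules
{
    // True when any line starts with a #region directive.
    public static bool UsesRegions(string source)
    {
        return Regex.IsMatch(source, @"^\s*#\s*region\b", RegexOptions.Multiline);
    }

    // True when Int32 appears as a whole word (Convert.ToInt32 does not count).
    public static bool UsesInt32TypeName(string source)
    {
        return Regex.IsMatch(source, @"\bInt32\b");
    }
}
```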

Whether we actually want to have such tests is a different story.

Reading other files

Besides source code, there are other files that we can find in a project. The most important are:

  • “resource” files
  • Solution definition file (*.sln)
  • Project file (*.*proj)
  • Application configuration (app.config/web.config)
  • Nuget packages definition (packages.config)

Let’s take a quick look at each of them.

Resource files

I’m not sure if this is the correct name in the .NET world (it is in the Java one), but I mean files that are used by the application itself. Those files vary between applications, so I don’t have any advice on how to test them.

Solution definition

The sln file is the general description of a solution. We can split the data inside it into three categories: preSolution, postSolution and project data. When the file is opened, everything marked as preSolution is executed first, then the project definitions are loaded. Usually these contain the location of the project file and the project GUID. Based on that information the IDE looks up the project file for each project definition and uses those files to actually load the projects (more about that in the next section). At the end, the postSolution part of the sln file is executed.

From my perspective there is nothing interesting inside the solution definition that can be used for convention tests.

Project file

The project file contains all the information needed to build a project. In the case of C# we are looking for a file with the csproj extension. It contains various data about your project, such as type, version, list of files, dependency references, tasks performed during the build, etc.
 
It is a regular XML file, so it might be easier to use XML libraries to walk through it.

You can find detailed information about the structure of this file on MSDN, so I just want to give a high-level overview here.

The whole project definition is enclosed in the Project tag.

Each PropertyGroup tag contains property definitions (for example project GUID, assembly name, target platform, etc.). The idea is that the properties defined here can be referenced in other parts of the project file.

The ItemGroup section contains information about all items that should be processed during the build: source code files and the directories where they can be found, references needed by your code, etc.

The Target node contains a set of so-called tasks, the definitions of single steps of the build process.

Various things can be tested by reading this file. For example, we might want to check references to other projects in order to assert the correct hierarchy of dependencies between projects. We can also check whether some files have the correct attributes (for example, some files should be marked as embedded resources).
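Both checks boil down to listing the Include attributes of a given item type, which could be sketched like this. The ProjectFile class is a hypothetical name. The sketch compares local element names, on the assumption that the file may or may not declare the MSBuild XML namespace.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ProjectFile
{
    // Returns the Include values of all items with the given name,
    // for example "ProjectReference" or "EmbeddedResource".
    public static List<string> GetItemIncludes(string csprojXml, string itemName)
    {
        return XDocument.Parse(csprojXml)
            .Descendants()
            .Where(element => element.Name.LocalName == itemName)
            .Select(element => (string)element.Attribute("Include"))
            .Where(include => include != null)
            .ToList();
    }
}
```

A dependency test for a services project could then assert that GetItemIncludes(content, "ProjectReference") contains no path to the views project, and a resource test could assert that a particular file shows up under "EmbeddedResource".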

App.config / Web.config

This is the XML file where you store configuration that can later be read by the application. Since it may contain lots of key-value pairs needed by the application, we can add tests that check whether all required keys and values are present, so if someone deletes one by accident, our test will tell us.
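A test like that could be built on a small helper such as the one below. It is a sketch that assumes the keys live in the standard appSettings section as add elements; the AppConfig class and its method are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AppConfig
{
    // Returns the required keys that are not defined in the appSettings section.
    public static List<string> FindMissingKeys(string configXml, IEnumerable<string> requiredKeys)
    {
        var presentKeys = new HashSet<string>(
            XDocument.Parse(configXml)
                .Descendants("appSettings")
                .Elements("add")
                .Select(element => (string)element.Attribute("key"))
                .Where(key => key != null));

        return requiredKeys.Where(key => !presentKeys.Contains(key)).ToList();
    }
}
```

The test itself then only has to assert that the returned list is empty, and the list of missing keys can go straight into the failure message.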

packages.config

This file is part of the NuGet configuration. Basically it contains the definitions of the NuGet packages that we want to use. Based on this file NuGet can execute the package restore action. There is one test case that I have in mind for this file. Each entry can contain an allowedVersions attribute, which specifies which versions of the package can be used when we execute the update action on that package (for example, we know that everything should be fine with the 1.x line, but in 2.x the API changed, so we cannot update). If we know that for some package it is crucial to have allowedVersions set to specific values, then we can use a regex to enforce that rule in a test.

Regex.IsMatch(f, "id=\"NugetPackageName\".*allowedVersions=\"\\[1,2\\)\"")  

Now if someone who does not know about the problems in version 2 of the library changes packages.config, our test will tell us.
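The regex above depends on the order of the attributes within the entry. As an alternative, the same rule can be sketched with XML parsing, which does not care about attribute order or line breaks. The PackagesConfig class is a hypothetical name, and NugetPackageName is the same placeholder as in the regex.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class PackagesConfig
{
    // Returns the allowedVersions value for the given package id,
    // or null when the package or the attribute is missing.
    public static string GetAllowedVersions(string packagesConfigXml, string packageId)
    {
        return XDocument.Parse(packagesConfigXml)
            .Descendants("package")
            .Where(package => (string)package.Attribute("id") == packageId)
            .Select(package => (string)package.Attribute("allowedVersions"))
            .FirstOrDefault();
    }
}
```

The test would then compare the result of GetAllowedVersions(content, "NugetPackageName") with the expected range "[1,2)".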

Summary

Manual analysis of files has some benefits: it allows you to check things that you cannot check with reflection alone, and it allows you to inspect files that are not sources. But there is a price for that. First of all, we need to write a little bit of code that allows us to read files easily. The second problem is with the assertions themselves: the regex that we need might be really complicated, which makes the test less readable (but we can mitigate this with a proper method extraction strategy and good naming). If we are digging inside an XML file, we also need to write some code that traverses the nodes. The last problem is that, since we are basically checking whether strings contain some substrings, such tests can be very fragile.

That’s all for today. In the next chapter we will take a look at the possibilities for convention tests that the Roslyn project gives us.

You can find the source code for this series here: https://github.com/mprzybylak/CSharpConventionTests

Links