Biggy

Biggy Available as Pre-Release on Nuget


Image by fortfan  |  Some Rights Reserved


Biggy is a very fast, synchronized, in-memory document/relational query tool for .NET.

If you have been following the Biggy project over the past year, you are aware that there has been a rapid evolution of ideas, changes in structure, and changes in stewardship. Biggy was originally an experiment by Rob Conery, a man who likes to challenge convention, break rules, and in general stir things up in a way which causes folks to re-think some closely held conventions. All of the above are why I follow Rob’s blog, Twit-stream, and Github repo.

Plus, he’s just a damn fine fellow.

Rob recently withdrew from active development on Biggy, and I am now doing my best to shepherd this interesting project. We’ll see how this goes. My goal is to try to maintain the spirit of the original project, and find some happy medium between what I think Rob would have done, what I would like to see, and what the community wants (if anything) from this unique data access and query tool. I’ll have more to say on that in another post, but I had a fantastic time working on this with Rob, and as you might imagine, I learned a lot.

On that note, I am pleased to announce that there is now a pre-release version of Biggy available on Nuget.

Getting to Know Biggy

Biggy works by loading all of the data in your store (be it a flat JSON file, or from a relational database) into memory, which makes querying super fast. Also, because your data is deserialized into POCO objects behind an ICollection<T> interface, it is all queryable using LINQ.

This works with relational data in the way you might expect, but it really shines when you work with JSON documents. Complex, nested JSON documents are now queryable at blazing speed.

In its current form, Biggy consists of a core package and data extension packages specific to each supported backing store. Biggy directly supports simple file-based JSON as a store, as well as SQLite and PostgreSQL relational databases. However, Biggy is designed to be extensible, so alternative stores can be supported by implementing the IDataStore interface found in Biggy.Core.

The basic Nuget packages are:

  • Biggy.Core
  • Biggy.Data.Json
  • Biggy.Data.Sqlite
  • Biggy.Data.Postgres

The store-specific packages all take a dependency on Biggy.Core, and it is not necessary to pull down Biggy.Core as a separate package, unless you intend to implement your own store against the interfaces and base classes therein.

Inject Stores into Lists

At present, Biggy is built on the idea of synchronizing an in-memory list with a backing store. An implementation of IDataStore<T> knows how to talk to a database of some sort (even if it is just a flat JSON file). The BiggyList<T>, an in-memory implementation of ICollection<T>, doesn’t know or care what specific store it is working with. It just knows how to implement the ICollection interface (plus a few additional methods) and present a LINQ-queryable API to the world.

The relationship between the two is achieved by injecting an instance of IDataStore<T> into an instance of BiggyList<T> upon instantiation. There are a number of ways to do this, depending upon the needs of your application.
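To make the injection idea concrete, here is a minimal sketch of the pattern using plain C#. None of these types come from Biggy: ISimpleStore<T>, FakeStore<T>, and SyncedList<T> are hypothetical stand-ins invented for illustration, and the real IDataStore<T> and BiggyList<T> have their own (richer) members.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical store abstraction for illustration; NOT Biggy's IDataStore<T>.
public interface ISimpleStore<T>
{
    List<T> Load();
    void Add(T item);
    void Remove(T item);
}

// A fake backing store that "persists" items to a private list.
public class FakeStore<T> : ISimpleStore<T>
{
    private readonly List<T> _persisted = new List<T>();

    // Counts writes so we can see that the list forwards its changes.
    public int WriteCount { get; private set; }

    public List<T> Load() { return new List<T>(_persisted); }
    public void Add(T item) { _persisted.Add(item); WriteCount++; }
    public void Remove(T item) { _persisted.Remove(item); WriteCount++; }
}

// In-memory list that forwards every change to whatever store was injected.
public class SyncedList<T> : IEnumerable<T>
{
    private readonly ISimpleStore<T> _store;
    private readonly List<T> _items;

    public SyncedList(ISimpleStore<T> store)
    {
        _store = store;
        _items = store.Load(); // read the whole store into memory once
    }

    public int Count { get { return _items.Count; } }

    public void Add(T item)
    {
        _store.Add(item);  // write through to the store first
        _items.Add(item);  // then update the in-memory copy
    }

    public bool Remove(T item)
    {
        if (!_items.Remove(item)) return false;
        _store.Remove(item);
        return true;
    }

    public IEnumerator<T> GetEnumerator() { return _items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class Program
{
    public static void Main()
    {
        var store = new FakeStore<string>();
        var list = new SyncedList<string>(store); // the store is injected here

        list.Add("Nirvana");
        list.Add("Metallica");

        // Queries run against memory only; the store is not touched.
        var startsWithM = list.Where(n => n.StartsWith("M")).ToList();
        Console.WriteLine("{0} match(es), {1} write(s)", startsWithM.Count, store.WriteCount);
    }
}
```

The list never needs to know which kind of store it was handed, which is the property that lets a JSON file, SQLite, or Postgres sit behind the same collection.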

A Simple How-To: File-Based JSON Data

We’ll take a quick look at the very basics of Biggy in a Visual Studio Console Application. To get started, you’ll need to create a new Console Application, and pull down the Biggy.Data.Json package:

Get the Biggy.Data.Json Package from Nuget:
PM> Install-Package Biggy.Data.Json

 

Now we can work against flat-file JSON data.

The IDataStore<T> interface offers a fairly simple CRUD interface to the world. We can work directly with an implementation of IDataStore<T> to Add, Update, and Delete records, and to read the entirety of the store into an IEnumerable<T> for use in our code.

Since we have started with the JSON implementation, let’s take a look at how the JsonStore<T> works on its own.

We will need to add the following namespaces to the usings at the top of our code file:

Add Biggy References to the Program.cs Code File:
using Biggy.Core;
using Biggy.Extensions;
using Biggy.Data.Json;

 

Then, we might do the following:

Add a Document to the Json Store:
public class ArtistDocument
{
    public ArtistDocument()
    {
        this.Albums = new List<AlbumDocument>();
    }
 
    [PrimaryKey(Auto: false)]
    public int ArtistDocumentId { get; set; }
    public string Name { get; set; }
    public List<AlbumDocument> Albums { get; set; }
}
 
 
public partial class AlbumDocument
{
    public AlbumDocument()
    {
        this.Tracks = new List<Track>();
    }
 
    public int AlbumId { get; set; }
    public string Title { get; set; }
    public int ArtistId { get; set; }
    public virtual List<Track> Tracks { get; set; }
}
 
// Track is not shown in the original listing; this minimal version is
// inferred from the fields that appear in the sample JSON.
public class Track
{
    public int TrackId { get; set; }
    public int AlbumId { get; set; }
    public string Name { get; set; }
}
 
 
class Program
{
    static void Main(string[] args)
    {
        var jsonArtistStore = new JsonStore<ArtistDocument>();
        var newArtist = new ArtistDocument
        {
            ArtistDocumentId = 1,
            Name = "Nirvana"
        };
        newArtist.Albums.Add(new AlbumDocument
        {
            AlbumId = 1,
            ArtistId = 1,
            Title = "Bleach"
        });
        newArtist.Albums.Add(new AlbumDocument
        {
            AlbumId = 2,
            ArtistId = 1,
            Title = "Incesticide"
        });
        jsonArtistStore.Add(newArtist);
    }
}

If we run this code, the following happens behind the scenes:

  • By default, a new directory is created in our project root named ~\Data\Json\BiggyDemo1
  • A file is created in that directory, named artistdocuments.json
  • The new artist data is persisted as raw JSON in the artistdocuments.json file

If we crack open the file, we find:

Contents of Artist Documents JSON File:
[ {
    "Albums": [
      { "AlbumId": 1, "Title": "Bleach", "ArtistId": 1, "Tracks": [ ] },
      { "AlbumId": 2, "Title": "Incesticide", "ArtistId": 1, "Tracks": [ ]
      } ],
    "ArtistDocumentId": 1,
    "Name": "Nirvana"
  } ]

Yup. The POCO objects we used in our .NET code have been serialized to JSON and saved in the file.

We can use the JsonStore<T> this way to our heart’s content if we like, but that’s not really the point of Biggy. Instead, let’s use our store in conjunction with a BiggyList<T> instance, and get some real work done.

Inject IDataStore into BiggyList

Biggy is intended to present an in-memory representation of the data in your store, so you can query away using LINQ, make additions, updates, and deletions, and the data in memory will remain in sync with your backing store.

Let’s expand on what we did above, and inject a store into a BiggyList<T>. In this case, we will use some data from the Chinook Database which I have used to create a full set of JSON artist documents, including albums and tracks for each artist. There are 275 artists in the JSON data set, including even more albums (nested under each artist), and several thousand tracks (nested under the appropriate album).

For example, the JSON data for a single artist in the modified Chinook sample data looks like this:

Single Sample JSON Artist Document Record Using Modified Chinook Data:
[
  {
    "Albums": [
      {
        "AlbumId": 1,
        "Title": "For Those About To Rock We Salute You",
        "ArtistId": 1,
        "Tracks": [
          { "TrackId": 1, "AlbumId": 1, "Name": "For Those About To Rock (We Salute You)" },
          { "TrackId": 6, "AlbumId": 1, "Name": "Put The Finger On You" },
          { "TrackId": 7, "AlbumId": 1, "Name": "Let's Get It Up" },
          { "TrackId": 8, "AlbumId": 1, "Name": "Inject The Venom" },
          { "TrackId": 9, "AlbumId": 1, "Name": "Snowballed" },
          { "TrackId": 10, "AlbumId": 1, "Name": "Evil Walks" },
          { "TrackId": 11, "AlbumId": 1, "Name": "C.O.D." },
          { "TrackId": 12, "AlbumId": 1, "Name": "Breaking The Rules" },
          { "TrackId": 13, "AlbumId": 1, "Name": "Night Of The Long Knives" },
          { "TrackId": 14, "AlbumId": 1, "Name": "Spellbound" }
        ]
      },
      {
        "AlbumId": 4,
        "Title": "Let There Be Rock",
        "ArtistId": 1,
        "Tracks": [
          { "TrackId": 15, "AlbumId": 4, "Name": "Go Down" },
          { "TrackId": 16, "AlbumId": 4, "Name": "Dog Eat Dog" },
          { "TrackId": 17, "AlbumId": 4, "Name": "Let There Be Rock" },
          { "TrackId": 18, "AlbumId": 4, "Name": "Bad Boy Boogie" },
          { "TrackId": 19, "AlbumId": 4, "Name": "Problem Child" },
          { "TrackId": 20, "AlbumId": 4, "Name": "Overdose" },
          { "TrackId": 21, "AlbumId": 4, "Name": "Hell Ain't A Bad Place To Be" },
          { "TrackId": 22, "AlbumId": 4, "Name": "Whole Lotta Rosie" }
        ]
      }
    ],
    "ArtistDocumentId": 1,
    "Name": "AC/DC"
  }
]

I went ahead and overwrote artistdocuments.json in our Data\Json directory with this Chinook file. We could change our Main() method to look like this:

Query Chinook Document Data:
static void Main(string[] args)
{
    var artistDocumentStore = new JsonStore<ArtistDocument>();
    var artistDocuments = new BiggyList<ArtistDocument>(artistDocumentStore);
 
    // Select artists with names beginning with 'M' and write to the console, 
    // along with a count of albums for each:
    var selected = from a in artistDocuments where a.Name.StartsWith("M") select a;
 
    foreach(var artistDoc in selected)
    {
        Console.WriteLine("{0}: {1} Albums", artistDoc.Name, artistDoc.Albums.Count);
    }
    Console.Read();
}

Notice how we simply initialized a store, injected it into the BiggyList<ArtistDocument>, and presto, our document data was loaded and queryable using LINQ?

Again, by default, Biggy is going to look in the ~\Data\Json directory for a folder with the current project name, then look for a file with a name matching the POCO class represented by T. If one is found, the data is loaded. If not, a new file is created the first time data is added.

In this example, a (new) file named artistdocuments.json already exists in the default directory, so the Chinook document data is loaded up during initialization, and is ready for querying.
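The lookup convention described above can be sketched as a small path helper. This is only an illustration of the convention as described in this post: StorePaths and DefaultJsonPath are hypothetical names, and the rule of lowercasing the type name and appending "s" is an assumption inferred from the artistdocuments.json example, not Biggy's actual naming logic.

```csharp
using System;
using System.IO;

// Empty placeholder type; only its name matters for this sketch.
public class ArtistDocument { }

public static class StorePaths
{
    // Hypothetical helper: builds <root>/Data/Json/<projectName>/<typename>s.json.
    // The lowercase + "s" naming is an assumption, not Biggy's real rule.
    public static string DefaultJsonPath(Type type, string projectRoot, string projectName)
    {
        string fileName = type.Name.ToLowerInvariant() + "s.json";
        return Path.Combine(projectRoot, "Data", "Json", projectName, fileName);
    }
}

public static class Program
{
    public static void Main()
    {
        string path = StorePaths.DefaultJsonPath(typeof(ArtistDocument), ".", "BiggyDemo1");
        Console.WriteLine(path);

        // Mirrors the behavior described in the text: load if present,
        // otherwise the file appears the first time data is added.
        Console.WriteLine(File.Exists(path)
            ? "Existing file would be loaded."
            : "File would be created on first add.");
    }
}
```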

Output from the above would resemble the following:

[Image: Console Output from LINQ Query Against Chinook Artist Documents]

Or, we might want to query up a specific artist and examine the albums and tracks on file for that artist. If we change our Main() method again:

Query a Specific Artist and Output the Albums and Tracks to the Console:
static void Main(string[] args)
{
    var artistDocumentStore = new JsonStore<ArtistDocument>();
    var artistDocuments = new BiggyList<ArtistDocument>(artistDocumentStore);
 
    // Select a single artist, and list the albums/tracks for that artist:
    var selected = artistDocuments.FirstOrDefault(a => a.Name == "Metallica");
    Console.WriteLine("Albums by {0}:", selected.Name);
    foreach(var albumDoc in selected.Albums)
    {
        Console.WriteLine(albumDoc.Title);
        foreach(var track in albumDoc.Tracks)
        {
            Console.WriteLine("\t{0}", track.Name);
        }
    }
    Console.Read();
}

Output here is as expected, and makes clear that the maintainers of Chinook Database are pretty big Metallica fans.

Biggy and Relational Database Stores

There are limitations to using a flat-file JSON store. For one, concurrency can become an issue. When your application needs call for a more robust persistence mechanism, you can use a relational database, both in the conventional sense, and for persisting document data.

SQLite is a nice simple solution if a file-based relational data store is a good option. SQLite requires no configuration or administration, and is easily added to your project. Also, like Postgres, SQLite is free, cross-platform, and open-source. In terms of scaling up, SQLite makes a logical next step from flat-file JSON as your application grows.

Postgresql is also free, cross-platform, and open-source, and is our default, large-scale client-server database of choice.

We’ve designed Biggy to minimize the pain associated with moving between different backing stores. While file-based JSON stores, SQLite, and Postgresql all have different capabilities, advantages, and disadvantages, the IDataStore interface and the BiggyList don’t care. Concrete implementations of IDataStore can capitalize on the various strengths of each storage format, but you can also pass them into an existing BiggyList and everything should “just work.”

RelationalStore and DocumentStore

When working with relational data in the traditional sense, use a concrete implementation of RelationalStore<T>. In this case, Biggy will expect to find a table and schema which match the POCO class <T> specified as a type argument (or appropriately mapped using some extensions found in Biggy.Core).

When working with Document data, even stored in a relational database, Biggy will serialize/de-serialize the POCO class <T> into JSON, nesting any child collections or contained objects.

Also, Biggy will create Document tables on the fly, as any document table used by Biggy will have the same schema: id, body, and created_at.

Document Storage in a Relational Data Store

Once again, one of the primary use-cases for Biggy is to work with JSON-format document data. Postgres, because it is awesome, implements its own JSON data type (and now, with the release of version 9.4, additionally supports a binary JSON type, jsonb). Persisting documents in Postgres takes full advantage of the JSON data type.

For SQLite (or any other concrete implementation of IDataStore<T> you choose to make), JSON is persisted as simple string data.

For working with Relational data stores, the Chinook Database once again provides a handy, ready-to-use data set for both SQLite and Postgresql. We will use Chinook in the following examples.

Using Biggy with SQLite

To use Biggy with a SQLite backing store, just pull Biggy.Data.SQLite down from Nuget:

Use Nuget Package Manager Console to get Biggy.Data.SQLite:
PM> Install-Package Biggy.Data.Sqlite

If we drop the Chinook Database file into our project Data directory (if needed, change the file extension to .db), we can get right to work querying and using the relational data present in Chinook out of the box.

When using SQLite with Biggy, the primary constructor argument will at a minimum specify the name of the database file to look for. If no other arguments are provided, Biggy will look in the ~\Data directory at the root of our project directory, and try to match the string value with a file by the same name, with a .db extension.

If no database with that name is found, Biggy will create one.

If we change the using statement to Biggy.Data.SQLite instead of Biggy.Data.Json, we can do the following, working with some slightly different POCO models, and the basic relational tables in the Chinook Database just as they are. Notice that here we specify SqliteRelationalStore<T>, and not SqliteDocumentStore<T>, because we will be working with a relational data set.

Note: Biggy will create DocumentStore<T> tables on the fly. However, Biggy cannot, at present, create standard relational tables – to work against relational data, the tables need to already exist.

Connect BiggyList<T> to a SQLite Database:
public partial class Artist
{
    public int ArtistId { get; set; }
    public string Name { get; set; }
}
 
public partial class Album
{
    public int AlbumId { get; set; }
    public string Title { get; set; }
    public int ArtistId { get; set; }
}
 
 
static void Main(string[] args)
{
    // Pass the name of the database file as a constructor argument:
    var artistStore = new SqliteRelationalStore<Artist>("Chinook");
    var albumStore = new SqliteRelationalStore<Album>("Chinook");
 
    // Pass the store into the list:
    var artists = new BiggyList<Artist>(artistStore);
    var albums = new BiggyList<Album>(albumStore);
 
    var someArtist = artists.FirstOrDefault(a => a.Name == "AC/DC");
    var artistAlbums = albums.Where(a => a.ArtistId == someArtist.ArtistId);
 
    Console.WriteLine("Albums by {0}:", someArtist.Name);
    foreach(var album in artistAlbums)
    {
        Console.WriteLine(album.Title);
    }
}
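
Because the data ends up in ordinary in-memory collections, the two-step lookup above can also be written as a single LINQ join. The sketch below uses plain List<T> instances instead of BiggyList<T> so that it runs without Biggy or a database; AlbumQueries and TitlesByArtist are hypothetical names, and the rows are small illustrative sample data rather than the full Chinook set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Artist
{
    public int ArtistId { get; set; }
    public string Name { get; set; }
}

public class Album
{
    public int AlbumId { get; set; }
    public string Title { get; set; }
    public int ArtistId { get; set; }
}

public static class AlbumQueries
{
    // Joins artists to albums on ArtistId and returns the matching titles.
    // Returns an empty list (not null) when the artist is not found.
    public static List<string> TitlesByArtist(
        IEnumerable<Artist> artists, IEnumerable<Album> albums, string artistName)
    {
        var query = from artist in artists
                    join album in albums on artist.ArtistId equals album.ArtistId
                    where artist.Name == artistName
                    select album.Title;
        return query.ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Illustrative sample rows only.
        var artists = new List<Artist>
        {
            new Artist { ArtistId = 1, Name = "AC/DC" },
            new Artist { ArtistId = 2, Name = "Nirvana" }
        };
        var albums = new List<Album>
        {
            new Album { AlbumId = 1, Title = "For Those About To Rock We Salute You", ArtistId = 1 },
            new Album { AlbumId = 4, Title = "Let There Be Rock", ArtistId = 1 },
            new Album { AlbumId = 5, Title = "Bleach", ArtistId = 2 }
        };

        foreach (var title in AlbumQueries.TitlesByArtist(artists, albums, "AC/DC"))
        {
            Console.WriteLine(title);
        }
    }
}
```

One practical difference from the FirstOrDefault version: if no artist matches, the join simply yields no rows, so there is no null reference to guard against.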

Creating a Document Store Using SQLite

We can use SQLite to persist documents as well as work with traditional relational data, simply by using SqliteDocumentStore<T>, similar to the way we did with the JSON store. The following code will create a new table in the Chinook database we are using for our back-end, and add a single artist document record:

Use SQLite to Create a Document Store in the Chinook Database:
var artistDocumentStore = new SqliteDocumentStore<ArtistDocument>("Chinook");
var artistDocuments = new BiggyList<ArtistDocument>(artistDocumentStore);
 
var newArtist = new ArtistDocument
{
    ArtistDocumentId = 1,
    Name = "Nirvana"
};
 
newArtist.Albums.Add(new AlbumDocument
{
    AlbumId = 1,
    ArtistId = 1,
    Title = "Bleach"
});
 
newArtist.Albums.Add(new AlbumDocument
{
    AlbumId = 2,
    ArtistId = 1,
    Title = "Incesticide"
});
 
artistDocuments.Add(newArtist);

In this case, the artistdocuments table has three simple fields: id, body, and created_at. The id will be the same as the primary key for each artist object. The body field contains the JSON document itself, and the created_at is simply a date-time stamp.
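
As a rough sketch of what one such row holds, the code below maps a document object to an id/body/created_at triple and back. It is not Biggy's implementation: DocumentRow and DocumentMapper are hypothetical names, the POCOs are trimmed-down copies for illustration, and System.Text.Json is used here as an assumption (it ships with .NET Core 3.0 and later), not because Biggy uses it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public class AlbumDocument
{
    public int AlbumId { get; set; }
    public string Title { get; set; }
}

public class ArtistDocument
{
    public ArtistDocument() { Albums = new List<AlbumDocument>(); }
    public int ArtistDocumentId { get; set; }
    public string Name { get; set; }
    public List<AlbumDocument> Albums { get; set; }
}

// Hypothetical in-memory picture of one row in a document table.
public class DocumentRow
{
    public int Id { get; set; }             // same value as the document's key
    public string Body { get; set; }        // the whole document as JSON text
    public DateTime CreatedAt { get; set; } // time stamp set when the row is built
}

public static class DocumentMapper
{
    public static DocumentRow ToRow(ArtistDocument doc)
    {
        return new DocumentRow
        {
            Id = doc.ArtistDocumentId,
            Body = JsonSerializer.Serialize(doc), // nested albums go along for the ride
            CreatedAt = DateTime.UtcNow
        };
    }

    public static ArtistDocument FromRow(DocumentRow row)
    {
        return JsonSerializer.Deserialize<ArtistDocument>(row.Body);
    }
}

public static class Program
{
    public static void Main()
    {
        var artist = new ArtistDocument { ArtistDocumentId = 1, Name = "Nirvana" };
        artist.Albums.Add(new AlbumDocument { AlbumId = 1, Title = "Bleach" });
        artist.Albums.Add(new AlbumDocument { AlbumId = 2, Title = "Incesticide" });

        DocumentRow row = DocumentMapper.ToRow(artist);
        Console.WriteLine(row.Body);

        ArtistDocument roundTripped = DocumentMapper.FromRow(row);
        Console.WriteLine("{0}: {1} Albums", roundTripped.Name, roundTripped.Albums.Count);
    }
}
```

Since every document type reduces to the same three columns, a store can create the table without knowing anything about the shape of T, which is why document tables can be created on the fly while relational tables cannot.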

We could, of course, decide we want to pull all of the artist/album/track data together, and compose it all into artist documents similar to the JSON file I used in the previous section, and then push it out into a new SQLite database.

Aggregate Artists, Albums, and Tracks into a Document Store Using SQLite:
// Pass the name of the database file as a constructor argument:
var artistStore = new SqliteRelationalStore<Artist>("Chinook");
var albumStore = new SqliteRelationalStore<Album>("Chinook");
var trackStore = new SqliteRelationalStore<Track>("Chinook");
 
// Pass the store into the list:
var artists = new BiggyList<Artist>(artistStore);
var albums = new BiggyList<Album>(albumStore);
var tracks = new BiggyList<Track>(trackStore);
 
// Use a list, and do a bulk add when all the artist documents have been created:
var newArtistDocs = new List<ArtistDocument>();
foreach (var artist in artists)
{
    var artistDoc = new ArtistDocument { ArtistDocumentId = artist.ArtistId, Name = artist.Name };
    var artistAlbums = albums.Where(a => a.ArtistId == artist.ArtistId);
 
    foreach (var album in artistAlbums)
    {
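        // Copy the album's own fields; otherwise only the tracks reach the document:
        var albumDoc = new AlbumDocument
        {
            AlbumId = album.AlbumId,
            Title = album.Title,
            ArtistId = album.ArtistId
        };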
        var albumtracks = tracks.Where(t => t.AlbumId == album.AlbumId);
        albumDoc.Tracks.AddRange(albumtracks);
        artistDoc.Albums.Add(albumDoc);
    }
    newArtistDocs.Add(artistDoc);
}
 
// Now let's new up a NEW SQLite-based database:
var artistDocumentStore = new SqliteDocumentStore<ArtistDocument>("ChinookDocuments");
var artistDocumentsList = new BiggyList<ArtistDocument>(artistDocumentStore);
 
// Push our new nested artist documents into the new database:
artistDocumentsList.Add(newArtistDocs);

That’s a large mess of code, but you get the idea. In the above, we materialized some relational data, composed it into document form, and persisted it into a brand-new SQLite database (this time named “ChinookDocuments.db” and again, in our ~\Data\ directory) as JSON string values in a table named “artistdocuments”.

Using Biggy with Postgres

Unlike simple JSON files and SQLite, PostgreSQL is a full-fledged client-server database. This means we need to work with real connection strings and connect to a real database, and that Biggy cannot create a database for us.

However, Postgres is an awesome database, and if you are not familiar, I recommend you go check it out.

With the Json store, and the SQLite store, we were able to provide some minimal initialization arguments to the concrete implementation of IDataStore<T> to get up and moving. With Postgres, we still don’t need much, but we do need to specify a connection string in our App.config or Web.config file.

You can pull down the Biggy.Data.Postgres package from Nuget using the Package Manager Console:

Get Biggy.Data.Postgres from Nuget:
PM> Install-Package Biggy.Data.Postgres

Once again, we can get started by pulling down the Chinook Database for Postgres, running the CREATE script, and voilà.

Specifics may vary, but your App.config should look something like this:

Example App.config File with Postgres Connection:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <startup>
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0" />
  </startup>
  <connectionStrings>
    <add name="chinook" connectionString="server=localhost;user id=biggy;password=password;database=chinook" />
  </connectionStrings>
</configuration>

Note in the above, our connection string is named “chinook” and we are using a local PG instance.

With that, we can once again start with some code that should be looking familiar by now:

Query Chinook Database Using Biggy with Postgres:
// Pass the name of the connection string (from App.config) as a constructor argument:
var artistStore = new PgRelationalStore<Artist>("chinook");
var albumStore = new PgRelationalStore<Album>("chinook");
 
// Pass the store into the list:
var artists = new BiggyList<Artist>(artistStore);
var albums = new BiggyList<Album>(albumStore);
 
var someArtist = artists.FirstOrDefault(a => a.Name == "AC/DC");
var artistAlbums = albums.Where(a => a.ArtistId == someArtist.ArtistId);
 
Console.WriteLine("Albums by {0}:", someArtist.Name);
foreach (var album in artistAlbums)
{
    Console.WriteLine(album.Title);
}
Console.Read();

Here again, we are able to directly query the relational data loaded into the BiggyList<T> from Chinook Database. Biggy looks in the App.config file for a connection string matching the one provided as a constructor argument, and then does its thing.

From here, working with document data in Postgres is much the same as we saw from SQLite. Remember though, that Biggy cannot create a new database on the fly with Postgres as we did using SQLite.

What Next for Biggy?

As mentioned, the packages available on Nuget at the moment are definitely “pre-release” in that we expect to be making changes. I’m not feeling the product is all the way there in terms of its front-side API, and usefulness.

At this point, I would love to hear some feedback from those who have taken some time to play with it. What would make it more useful? Is there a critical feature missing that would also be widely used (as opposed to special-case features)?

My interest now is to arrive at a stable, minimal feature set, and a maximally useful API. I’m less interested in adding additional store options or features, and more in refining what is already there.

Please do check it out, and please do open an issue on the Github Repo for any bugs, suggestions, or other comments you may have.

Want to Contribute?

Again, please do! I will happily accept Pull Requests for bug fixes and features, so long as they fit in the general scheme of things. A word of caution, though: creeping feature-itis was one of the problems we had with the “version 1” repo of Biggy. We really want to make sure any new feature is needed, and I really would like to keep things as simple as possible until we have a ready-for-prime-time release version.

What’s with the Strange Code Formatting in the Repo?

The code in the repo is very much not idiomatic C# format. We are using two spaces for indentation, and same-line braces. This is something I picked up from Rob, who I think picked it up during his years in the Ruby/JS wilderness. However, after struggling with it for a bit, I came to really like the compact style.

For the moment, I plan to maintain the repo using this format. However, any PRs you send my way do not need to match; I’ll fix them.

At some point, I may cave and revert back to idiomatic C#. But I challenge you to check it out, try it out, and see if you don’t find yourself thinking most C# code looks a little spread out after a while…

Where can I Find More Info and Better Docs?

A more fully developed documentation page is coming. There is a lot more to Biggy than you saw here. This was a quick and dirty intro, since the code has changed so much in the past year. There are better ways to do almost everything I discussed in this article, but the examples here were kept purposefully simple.

We are also gratefully accepting contributions to documentation :-)

Additional Resources and Items of Interest

Some History on Biggy:
