Wednesday 6 February 2008

Data is Key

Since this is a software development blog, I thought I should at least once in a while write something remotely related to software development.

Picture the scene... You have some real-world GIS data. You know that the data you have is amongst the most accurate and best quality of its type, anywhere in the world. You have some third-party tools from an SDK which is designed to take such data in one form or another and produce some other kind of output. Great, all's well, only it isn't.

What happens if this fantastic data you have contains values or parameters you're not aware of, or has errors that are not apparent? What happens if the format of your data is not acceptable to the tool(s) you have to process it? What happens if the tool(s) you're using to process the data produce unexpected results, or simply don't work and crash?

If you are familiar with these questions then most likely you're working in Flight Simulator scenery development! And the answer to all these questions is a resounding "it doesn't work!".

Now, since you neither created the data nor wrote the tool(s), you're in for a rough ride most of the time. Occasionally the cause of a problem will be fairly obvious. Occasionally it will be extremely difficult to track down, requiring huge effort and an inordinate amount of time to finally get to the point you wanted to reach. In some cases it will require code-level knowledge and programming skills to develop your own intermediary tools to pre-process the data to make it acceptable to the third-party tool(s). [Edit] If this is not your bag, then as a colleague has just said, "if it doesn't work for me then there's no writing a clever bit of code ... its down to getting my hands in the bucket of crap up to me elbows".

If this all sounds a bit vague then it probably is, but the point I am making is this :-

Data is key. If your process begins with data (of any sort) then that data had better be good or you'll be in for a battle. The old adage "Garbage in, garbage out" applies to any process involving data.
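To make that a little less vague, here is a minimal sketch of the kind of sanity check I mean, run over the data before any third-party tool gets to see it. Everything in it is an assumption for illustration only: the Point record, the NO_DATA sentinel of -9999, the plausible elevation range and the function names are all made up, and real data will need its own rules.

```python
import math
from typing import Iterable, List, NamedTuple, Tuple


class Point(NamedTuple):
    """Hypothetical record: one elevation sample."""
    lat: float
    lon: float
    elevation_m: float


# Assumed "no data" marker; the real sentinel depends on the data set.
NO_DATA = -9999.0


def check_point(p: Point) -> List[str]:
    """Return a list of problems found in one point (empty if it looks sane)."""
    if not all(math.isfinite(v) for v in p):
        return ["non-finite value"]
    problems = []
    if not -90.0 <= p.lat <= 90.0:
        problems.append("latitude out of range")
    if not -180.0 <= p.lon <= 180.0:
        problems.append("longitude out of range")
    if p.elevation_m == NO_DATA:
        problems.append("no-data sentinel")
    elif not -500.0 <= p.elevation_m <= 9000.0:
        # Assumed plausible range for terrain elevations, in metres.
        problems.append("implausible elevation")
    return problems


def split_clean_and_suspect(
    points: Iterable[Point],
) -> Tuple[List[Point], List[Tuple[int, Point, List[str]]]]:
    """Separate points that pass every check from those that need a closer look."""
    clean, suspect = [], []
    for index, point in enumerate(points):
        problems = check_point(point)
        if problems:
            suspect.append((index, point, problems))
        else:
            clean.append(point)
    return clean, suspect
```

A check like this won't catch errors that look plausible, but it does mean the obvious garbage is reported with an index you can go and look at, rather than surfacing later as a crash inside a tool you didn't write.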

Working on scenery is certainly an exercise in deferred gratification and unpleasant surprises. I write a lot of code before I see any results. When people post screenshots of my scenery I am usually surprised by what it looks like, because I spent most if not all of the development time working on code and not playing with the final product.

The unpleasant surprises come from the size of the data sets I'm working with. Given how big and varied the scenery is (and how unreliable data can be at times), there will always be locations where the particular local data causes something to break or produce bogus results.

So spare a thought for the poor software developer working on scenery the next time you're doing circuits at your local airfield or admiring the magnificent views whilst flying through the valleys of Snowdonia!

TTFN

Disclaimer: The views and opinions I post are the views and opinions of me, and me only, and do not reflect views or opinions of anyone or anything else. Views and opinions are subject to change without notice!
