« The Beta pressure is ON! | Importing Data into Excel »
Meta WHAT?
Posted by James Standen on 8/06/08 • Categorized as Datamartist Tool,Meta Data
In the hallowed halls of any serious data analysis shop, you can expect to hear the phrase “meta data” spoken with reverence. But what, to put it bluntly, IS meta data?
The definition I like best is the dictionary one: meta data is data about data.
Sure, it’s much more complicated than that. Some of the data about data is all about things only programmers really love: which table, what data format, etc. But the meta data (or “Reference Data” as it is sometimes called) that analysts are interested in describes the raw data in business terms. What are the market segments that we follow, and for each customer, which segment are they in?
You might have met consultants recently who talk a lot about meta data, or maybe they call it “Master Data” (yet another name). They’ve told you that this is super important stuff, that it’s affecting your business and your bottom line, and thankfully (for a “reasonable” fee) they can fix you and your enterprise’s data right up.
Well, it is important. You might know how many of product X you sold, but if you don’t know what product X is, which categories it is in, and what other products are similar, complementary or conflicting, it’s hard to do actual analysis. So maybe you do need their services. But there is also that report that your boss wants tomorrow, and the IT department says it can get it to you by March. So of course reality sets in, and once again you launch your trusty spreadsheet software.
In a spreadsheet, data is data. Some of the cells are actually meta data, and some are data (some are both), but generally we just sort of wing it and build what we need. Sometimes spreadsheet meta data is easy to see: formulas themselves are certainly meta, and any time you use a VLOOKUP or related function you are referencing common definitions. But overall it’s pretty much the wild west. Which means you can change it quickly, but also means it sometimes gets a bit, well, messy.
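To make the VLOOKUP point concrete, here is a minimal sketch in Python of what that lookup is really doing: joining raw rows to a small table of reference data. The customer names, segments and the `with_segment` helper are all made up for illustration; none of this is how any particular tool does it.

```python
# Hypothetical reference data (the "meta data"): customer -> market segment.
SEGMENT_BY_CUSTOMER = {
    "Acme": "Enterprise",
    "Bobs Bikes": "Small Business",
}

# Hypothetical raw data: (customer, units sold).
sales = [("Acme", 120), ("Bobs Bikes", 8), ("Zed Corp", 15)]


def with_segment(rows, lookup, default="Unknown"):
    """Attach a segment to each row, roughly like an exact-match VLOOKUP."""
    return [
        (customer, units, lookup.get(customer, default))
        for customer, units in rows
    ]


tagged = with_segment(sales, SEGMENT_BY_CUSTOMER)
for row in tagged:
    print(row)
```

The customer with no entry in the lookup gets tagged "Unknown" instead of silently vanishing, which is the sort of thing that tends to get lost when the lookup table lives in a corner of a worksheet.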
In a formal data mart made by the IT department, meta data (one hopes) is tightly controlled, clearly defined and kept in its own tables and systems. As a result, it can often be difficult to change quickly (or cheaply, see “reasonable fee” mentioned above). So things are all squared away, but because the business changes faster than the data mart does, they reflect what you asked for 8 months ago.
In the Datamartist tool, what I’m building is a middle ground. The goal is to allow users to manipulate the meta data quickly, but within a tool that has more structure than just an empty spreadsheet (and to let them do it without an advanced degree in data modelling).
By doing this, operations that are time consuming and repetitive in a spreadsheet (deduplication, maintaining multiple rollup paths and views, re-categorising large data sets) can be done very quickly in a visual drag and drop interface. And of course you can then export the cleaned, structured data to your spreadsheet to do all the things that spreadsheets do best, all in time for your boss tomorrow. Because in the end, she wants her TPS report, not an explanation of why meta data is holding you up.
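For the curious, the operations listed above can be sketched in a few lines of Python. This is only a rough illustration with made-up product names and mappings, assuming duplicates differ just in case and spacing; it says nothing about how Datamartist works internally.

```python
from collections import defaultdict

# Hypothetical raw rows: (product, units). Two rows are the same product
# typed slightly differently.
raw = [("Widget A", 10), ("widget a ", 5), ("Gadget B", 7)]

# Hypothetical meta data: two independent rollup paths over the same products.
ROLLUPS = {
    "category": {"widget a": "Widgets", "gadget b": "Gadgets"},
    "channel": {"widget a": "Retail", "gadget b": "Retail"},
}


def normalise(name):
    """Collapse case and whitespace so near-duplicate names match."""
    return " ".join(name.lower().split())


def rollup(rows, path):
    """Total units along one rollup path, merging duplicate products."""
    totals = defaultdict(int)
    for name, units in rows:
        group = ROLLUPS[path].get(normalise(name), "Uncategorised")
        totals[group] += units
    return dict(totals)


by_category = rollup(raw, "category")
by_channel = rollup(raw, "channel")
print(by_category)
print(by_channel)
```

Re-categorising here means editing one entry in `ROLLUPS` and re-running, rather than hunting through cells. The raw rows never change, only the data about the data.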