Maintaining my own Pageview/Visit Database
I currently use Google Analytics (GA) and I really like it. But ...
- You can't (yet) export your raw visit data from GA
- You can't (yet) browse your GA cube
- I have other data in my MoveableType database (post dates, categories, commentors, tracebacks, etc...) that GA just doesn't know about. This information would be interesting to get into the cube.
What have I done so far?
The handler extracts the tokens and any http header values and pushes them to the SQL Server 2005 database I have setup (also with DiscountASP). Below is a screen shot showing these records ...
Then, I added a linked server connection to this database from my home PC and setup a SQL Job that calls a sproc that moves the data from the remote server to the local PC on my desktop. This keeps the live transaction table relatively empty.
Then, again on my local PC, I setup a scheduled task that processes the visit table and creates a master/detail structure for each of the raw visit records where the header contains stuff like the url requested, the pages title, the referring url, the IP of the client that requested the page and the date. The detail table is filled with the key/value pairs taken from any querystring or header tokens.
Here is a quick screen shot of a handful of records from my header table. Only 4 columns and ~20 records, but still you can see what kind of information you can get from this data (I blacked out all of the IP's except mine).
How is it working?
It seems to be working OK. I am benchmarking my hit counts with what GA is reporting to make sure I have an accurate baseline. So far they seem to be relatively close.
Building the cube. Should be fun!
That's it. Enjoy!