Monday, July 28, 2014

The case of the disappearing users

Google continues its efforts to thwart me. In today's episode, we find that the number of unique users actually goes down when I increase the date range in Google Analytics.

238 users (formerly known as "unique visitors") in February + March:


254 users in March:


280 users in the last 12 days of March:


274 users in the last 11 days of March:



I believe the last two are correct. Once the start of the range goes earlier than March 20, the further you extend the date range, the lower the count of users gets. I wanted to see how far back this went (because eventually it would get to 0, right?), and I found that Sept. 12, 2013 - Mar. 31, 2014 shows 230 users, but if I keep going back (e.g., Jan. 1, 2009 - Mar. 31, 2014), it remains unchanged at 230 users. (Sept. 13, 2013 - Mar. 31, 2014 shows 231 users.) I can't figure out any significance of Sept. 13, 2013 or 230 users.

What is important, however, is that March 20 was the first day the site went live and started collecting analytics.

So the lesson here is: when collecting Google Analytics data over a range that reaches back to when your site went live, the start of the range needs to be exactly the go-live date, not earlier.

If you're doing monthly reports and pulling data for the entire months of June, May, and April, you have to be careful when you reach March -- instead of requesting the entire month from March 1 - 31, you have to request March 20 - 31. If you extend the start of a range to include any dates from before the site went live and started tracking, bogus data ensues. Yay Google!
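Concretely, the fix on my end is to clamp the start of every requested range to the go-live date before building the query. Here's a minimal sketch of that guard (the date and names are illustrative, not the actual import code):

  using System;

  public static class AnalyticsDateRanges
  {
      // Illustrative go-live date; use whatever day your property actually
      // started collecting data.
      private static readonly DateTime GoLiveDate = new DateTime(2014, 3, 20);

      // Never let a query's start date reach back before go-live, or the
      // user counts come back lower than they should be.
      public static DateTime ClampStart(DateTime requestedStart)
      {
          return requestedStart < GoLiveDate ? GoLiveDate : requestedStart;
      }
  }

With something like that in place, asking for "all of March" quietly becomes March 20 - 31, which lines up with the counts Analytics reports for the post-launch ranges.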

Monday, July 21, 2014

Auto-starting a Windows service at build time

My coworker Frank just posted a how-to on setting up a Windows service, which inspired this post.

If you have a Windows service as a C# project, you can set it up to start the service automatically anytime you build. You could even have it work only on your computer so others don't do it by accident, etc.

First, go into your solution configuration (that little dropdown at the top that usually says "Debug" or "Release") and, for your service project, add a new project configuration called "InstallServiceLocally." To make it even more explicit, you can leave the existing solution configurations alone, add a new solution configuration, and point it at the new "InstallServiceLocally" project configuration. Then, whenever you switch the dropdown from "Debug" to "InstallServiceLocally," the build runs a normal debug build and also installs the service.

After you add the project configuration, you will need to edit your csproj file (every C# project has one; for VB.NET it's a vbproj) and add the post-build event with the installation logic. Paste this in toward the end of your csproj, before the closing </Project> tag:

  <PropertyGroup Condition="'$(COMPUTERNAME)' == 'MyComputerNameGoesHere' and '$(Configuration)' == 'InstallServiceLocally'">
    <PreBuildEventDependsOn>SetLatestNetFrameworkPath;$(PreBuildEventDependsOn)</PreBuildEventDependsOn>
    <PostBuildEventDependsOn>SetLatestNetFrameworkPath;$(PostBuildEventDependsOn)</PostBuildEventDependsOn>
    <CleanDependsOn>UninstallDealerOnCmsGoogleAnalyticsImportService</CleanDependsOn>
    <LatestNetFrameworkPath>$(WinDir)\Microsoft.NET\Framework\v4.0.30319\</LatestNetFrameworkPath>
    <InstallUtilPath>$(LatestNetFrameworkPath)InstallUtil.exe</InstallUtilPath>
    <PostBuildEventWithDeployment>
      "$(InstallUtilPath)" "$(TargetPath)"
      net start "$(TargetName)"
    </PostBuildEventWithDeployment>
    <PreBuildEventWithDeployment>
      net stop "$(TargetName)"
      "$(InstallUtilPath)" /u "$(TargetPath)"
      Exit /b 0
    </PreBuildEventWithDeployment>
    <PostBuildEvent>$(PostBuildEventWithDeployment)</PostBuildEvent>
    <PreBuildEvent>$(PreBuildEventWithDeployment)</PreBuildEvent>
  </PropertyGroup>
  <Target Name="SetLatestNetFrameworkPath">
    <GetFrameworkPath>
      <Output TaskParameter="Path" PropertyName="LatestNetFrameworkPath" />
    </GetFrameworkPath>
  </Target>
  <Target Name="UninstallDealerOnCmsGoogleAnalyticsImportService">
    <Exec WorkingDirectory="$(OutDir)" Command="$(PreBuildEvent)" />
  </Target>

Whenever the selected solution configuration builds the project in its "InstallServiceLocally" configuration, and the computer you're building on matches the name hardcoded in the project file (you can remove that check, substitute any other condition, or use none at all), each build will stop and uninstall the existing service before compiling, then reinstall and start the freshly built one afterward.

After that, if you want to debug, you'll need to attach the debugger to the service exe... So if you really want to debug the service as it's running, you'll probably want to put a sleep or timer of some sort at startup so it doesn't start doing real work before your debugger is attached. Or you can always debug by creating a unit test wrapper around it and debugging unit-test methods that call your service's internals. (Remember, you don't have to make things public for a unit test project to access them -- just use the [assembly: InternalsVisibleTo("Name.Of.Friend.Assembly")] attribute in the AssemblyInfo.cs file of the project whose members you want to expose. This is what .NET calls a friend assembly, similar in spirit to a friend class in C++.)
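For example, a bare-bones version of both tricks might look something like this (the class name, test assembly name, and ten-second pause are illustrative, not anything from Frank's how-to):

  // In AssemblyInfo.cs of the service project: expose internals to the test project.
  using System.Runtime.CompilerServices;

  [assembly: InternalsVisibleTo("MyService.Tests")]

  // In the service class: pause briefly in debug builds so you have time to attach.
  using System.ServiceProcess;
  using System.Threading;

  public class MyService : ServiceBase
  {
      protected override void OnStart(string[] args)
      {
  #if DEBUG
          // Roughly ten seconds to do Debug > Attach to Process on the service exe.
          Thread.Sleep(10000);
  #endif
          // ...kick off the real work here...
      }
  }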

Wednesday, July 16, 2014

Stop changing my data!!

Dear Google,

Every day I do an incremental import of your Analytics data by pulling in the data from the previous full day.
Why is it that when I come back the next day and run a new query for that same day, the result is sometimes (but all too often) different? Sometimes it's even different when querying the same completed date range twice in the same day.

Your documentation is confusing. I know you do some mysterious processing. I know there are some things related to data sampling that I need to worry about. I know if I give you $150,000 per account per year then I can upgrade to Premium Analytics and not have to worry (as much) about sampling.

Could you please make it clear, for other users who haven't already figured this out the hard way, that anything we query from your API may possibly be an approximation, and is liable to change if we run the same query an hour or a day later?

I have changed my daily import from 4 a.m. Eastern to 7 a.m. Eastern to ensure that any processing you may be doing on the previous day has had three additional hours to finish.

I have changed all my queries to use the highest-precision sampling level.

I am even going so far as to delete any data that was queried with a range including the previous two days and re-import those days along with the latest day, every time I do an import, because I don't have confidence that the data won't change until at least 48 hours after the day is done.

Just please be a little more up-front about this stuff next time, Google, and it will save your users a lot of pain. It's really not cool to think you're ready to go live with brand-new reports built on imported Googly data, only to discover that one of them is internally inconsistent: one part aggregates daily visits over the last 30 days from 30 separate incremental imports, while another part shows the same 30 days from a single fresh query over the whole range, run anew on each import -- and the two don't agree.

Annoyed,
Samer

_________________________________

The pictures below show the kind of thing you get when you run the same query for the same date range twice in Google Analytics. The numbers at the top came from one query, run yesterday, for the past 30 days. The numbers at the bottom came from 30 one-day queries over those same 30 days, added together. Before you decide I'm doing something wrong: when I delete the 30 daily records and run all 30 of those queries again in one shot, the aggregated numbers suddenly match. I can literally run the same query over the same (finished) date range twice and get two different results. After a while it does stop changing -- 48 hours seems to be the safe cutoff.
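For what it's worth, the 48-hour workaround described in the letter above boils down to something like this sketch (the class, method names, and stubs are illustrative stand-ins for my actual importer; samplingLevel=HIGHER_PRECISION is the Core Reporting API v3 option for requesting the most precise sampling):

  using System;

  public class DailyImportJob
  {
      public void Run(DateTime today)
      {
          // Yesterday is the newest complete day.
          DateTime latestCompleteDay = today.Date.AddDays(-1);

          // The two days before that were already imported, but they're still
          // within the ~48-hour window where Google may quietly revise them.
          DateTime reimportStart = latestCompleteDay.AddDays(-2);

          // Throw away what we previously stored for those two days...
          DeleteImportedDays(reimportStart, latestCompleteDay.AddDays(-1));

          // ...then pull them again along with the newest day, one day per query,
          // asking for the highest-precision sampling (samplingLevel=HIGHER_PRECISION).
          for (DateTime day = reimportStart; day <= latestCompleteDay; day = day.AddDays(1))
          {
              ImportSingleDay(day);
          }
      }

      // Illustrative stubs -- the real importer talks to the database and the API.
      private void DeleteImportedDays(DateTime firstDay, DateTime lastDay) { }
      private void ImportSingleDay(DateTime day) { }
  }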


Google Analytics is just giving us trouble heaped on top of trouble!