Omniture Insight, SiteCatalyst, Coremetrics, WebTrends, Unica NetInsight, Google Analytics and Yahoo! Web Analytics… the list of analytic tools goes on and on. Most analytic solutions, particularly web traffic analytic solutions, claim to have accurate data, yet each one will give very different results. So which one is correct, and what are the differences?
All these solutions are correct in their own way but understanding how the counts are derived is a challenge. In fact, it all seems like a Big Black Box. For most Software as a Service (SaaS) solutions you may never know all the details. You are totally dependent on the documentation and vendor technical support to explain the details. However, with Omniture Insight, the Black Box is opened for the customer to see and reconfigure as they wish… every detail, not just a few customized settings.
So what about those differences? They are probably more numerous than most folks realize, so numerous that counts from different tools are very unlikely to match exactly. Some of these differences are default vendor decisions, while others are configured by the customer. The settings reflect what the decision makers feel is important to count or not count, the purpose of the data, and sometimes simply whether the administrator understands what a setting means.
Omniture Insight is very straightforward in how it determines which data is included and how that data is configured. Three phases within the real-time process define the data included within a dataset. These are…
- Data Collection – determine data sent to logs
- Log Processing – determine data pulled into the dataset
- Transformation – configure the data to be viewed in the dataset
Data Collection within Omniture Insight is incredibly versatile and drives the ability for multi-channel analytics within a single dataset. Data Collection may be based on web logs, page tagging or flat files from any other enterprise systems. If your dataset includes web traffic then the best method is to collect using a Visual Sensor. It puts you in control of what is collected within the special Visual Sensor log files. A Visual Sensor provides several benefits that include…
- Ability to configure the data to be collected independent of the web log settings for Apache or IIS, which are usually dictated by your web operations team.
- Exceptional data compression, with log files many times smaller than standard web logs.
- Ability to drop a configurable cookie for Visitor identification
- Controlled Experimentation for A/B and multivariate testing
Visual Sensors are controlled via a simple configuration file that you can provide to your web operations team. Typically, this configuration file is used to exclude unnecessary data such as graphic images, text files like CSS, and application files like JavaScript. It is also used to define your log file name and any special tracking cookies, and to specify whether Controlled Experimentation is to be used.
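To make the idea of excluding unnecessary requests concrete, here is a minimal Python sketch of the kind of rule such a configuration expresses. It is not Visual Sensor configuration syntax. The extension list and the `should_collect` function are hypothetical names chosen only for illustration.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical exclusion list: static assets that rarely matter for traffic analysis.
EXCLUDED_EXTENSIONS = {".gif", ".jpg", ".png", ".css", ".js"}


def should_collect(uri: str) -> bool:
    """Return True if a request URI should be written to the log."""
    # Look only at the path so that query string values cannot trigger an exclusion.
    path = urlsplit(uri).path
    extension = posixpath.splitext(path)[1].lower()
    return extension not in EXCLUDED_EXTENSIONS


requests = ["/index.html", "/img/logo.GIF", "/css/site.css", "/search?q=shoes.js"]
collected = [uri for uri in requests if should_collect(uri)]
print(collected)
```

In this sketch the image and stylesheet requests are dropped, while the page request and the search request are kept. Whether an extension belongs on the list is exactly the kind of decision that makes counts differ between tools.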
Data Collection via page tags is possible if each applicable page of the web site is coded with a JavaScript page tag that points to a host tracking domain that uses Visual Sensors. Visual Sensors are therefore still used with page-tagged data, and they provide the added benefit of bridging Visitors that jump between domains.
One of the great benefits of Omniture Insight is its multi-channel analytics. Flat delimited data files from your other enterprise systems can be imported into the dataset by building special Decoders to read the data. You will need common data fields that can be used to join the data together, for example by customer or by whatever key suits your needs.
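The role of a shared field can be sketched in a few lines of Python. This is only an illustration of the join idea, not how a Decoder is actually written. The pipe-delimited sample, the `customer_id` field, and the `decode_flat_file` and `join_by_key` helpers are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical pipe-delimited export from another enterprise system.
CALL_CENTER_EXPORT = """customer_id|call_date|reason
C100|2009-06-01|billing
C200|2009-06-02|returns
"""

# Hypothetical web records that already carry the same customer_id field.
web_records = [
    {"customer_id": "C100", "page": "/account/billing"},
    {"customer_id": "C300", "page": "/home"},
]


def decode_flat_file(text, delimiter="|"):
    """Read delimited text into a list of dicts keyed by the header names."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def join_by_key(key, *sources):
    """Group records from every (name, records) source under their shared key value."""
    joined = defaultdict(list)
    for source_name, records in sources:
        for record in records:
            joined[record[key]].append((source_name, record))
    return dict(joined)


customers = join_by_key(
    "customer_id",
    ("web", web_records),
    ("call_center", decode_flat_file(CALL_CENTER_EXPORT)),
)
for customer_id, activity in sorted(customers.items()):
    print(customer_id, [source for source, _ in activity])
```

Only customer C100 appears in both channels here. Without a field present in every source, there would be nothing to tie the channels together.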
Log Processing is what brings all this data together within a dataset. A profile is created to define the entire configuration for the dataset. The profile defines the Log Processing and the Transformation properties. Log Processing is commonly used to…
- Define the Start and Stop Times
- Define your fields
- Define your Log Sources. This includes both Visual Sensor Logs and special log files that use Decoders.
- Identify any special Transformations that need to occur during the Log Processing phase instead of during the Transformation phase. This might be a lookup file that gets automatically updated daily. If the lookup file is read during the Log Processing phase, changes to the lookup will be processed immediately in real time.
- Define your Log Entry Conditions. This is where you define the specifics of the types of data included or excluded. The most common inclusions into the dataset are by domain or DNS value and by HTTP Status Code, which may leave out 400 or 500 level errors. The most common exclusions are robots, filtered by User Agent and IP Address, and URI page extensions (.GIF, .JPG, .JS, etc.).
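The inclusion and exclusion rules in the last item can be pictured as a single predicate applied to each log entry. The Python sketch below is an illustration only, not the profile syntax used by the product. The `LogEntry` fields, the domain, the robot markers, and the IP addresses are hypothetical values, and the URI is assumed to hold the path without a query string.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    domain: str
    uri: str          # assumed to be the path only, without a query string
    status: int
    user_agent: str
    ip: str


# Hypothetical settings, not vendor defaults.
INCLUDED_DOMAINS = {"www.example.com"}
ROBOT_AGENT_MARKERS = ("bot", "spider", "crawler")
EXCLUDED_IPS = {"192.0.2.99"}
EXCLUDED_EXTENSIONS = (".gif", ".jpg", ".js")


def passes_conditions(entry: LogEntry) -> bool:
    """Apply the inclusion rules first, then the exclusion rules."""
    if entry.domain not in INCLUDED_DOMAINS:
        return False
    if entry.status >= 400:  # leave out 400 and 500 level errors
        return False
    agent = entry.user_agent.lower()
    if any(marker in agent for marker in ROBOT_AGENT_MARKERS):
        return False
    if entry.ip in EXCLUDED_IPS:
        return False
    if entry.uri.lower().endswith(EXCLUDED_EXTENSIONS):
        return False
    return True


entries = [
    LogEntry("www.example.com", "/home", 200, "Mozilla/5.0", "192.0.2.10"),
    LogEntry("www.example.com", "/home", 200, "ExampleBot/1.0", "192.0.2.11"),
    LogEntry("www.example.com", "/missing", 404, "Mozilla/5.0", "192.0.2.12"),
    LogEntry("www.example.com", "/logo.GIF", 200, "Mozilla/5.0", "192.0.2.13"),
    LogEntry("other.example.net", "/home", 200, "Mozilla/5.0", "192.0.2.14"),
]
included = [entry for entry in entries if passes_conditions(entry)]
print(len(entries), "entries read,", len(included), "included")
```

Relaxing any one rule changes the count, which is why two tools reading the same traffic rarely report the same number.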
Transformation occurs immediately after the completion of Log Processing and defines what shows up in your dataset. Log Processing brings the data into the dataset, but you would not see it without defining Dimensions and Transformations in the Transformation phase.
Dimensions are defined segments of data for viewing in a workspace. Dimensions can be cross tabbed and must have Metrics applied to them within a table to show any counts. Transformations are manipulations of the data to format it in a special way to be presented in a Dimension. For example, you may want to take only a specific parameter from a query string and union that with another data field such as the Referrer. Transformations can be used in a number of ways such as Changing Case, Copying, Categorizing, Unioning, Splitting, Flattening, Merging, Formatting and many others. Transformations give you near limitless power to define your data exactly as it is needed.
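As a rough illustration of the query string example, the Python sketch below pulls one parameter out of a query string, changes its case, and combines it with the host of the Referrer into a single value that could feed a Dimension. It does not use the product's Transformation syntax. The `campaign` parameter name, the `|` separator, and the `none` and `direct` placeholders are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlsplit


def extract_parameter(query_string: str, name: str) -> str:
    """Return the first value of one query string parameter, or an empty string."""
    values = parse_qs(query_string).get(name, [])
    return values[0] if values else ""


def campaign_source(query_string: str, referrer: str) -> str:
    """Combine a lowercased campaign parameter with the referring host."""
    campaign = extract_parameter(query_string, "campaign").lower()
    referrer_host = (urlsplit(referrer).hostname or "").lower()
    return f"{campaign or 'none'}|{referrer_host or 'direct'}"


value = campaign_source(
    "campaign=Spring_Sale&id=42",
    "http://www.Example.com/search?q=shoes",
)
print(value)
```

Lowercasing both parts keeps `Spring_Sale` and `spring_sale` from being counted as two separate elements of the same Dimension.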
By reviewing the straightforward configuration of the Data Collection, Log Processing and Transformation phases, you open the black box and make its contents understandable. This ability is not to be taken for granted. Through proper understanding of your configuration, you significantly reduce misinterpretation of the data, which is prevalent particularly in web analytic solutions. Proper interpretation leads to accurate, actionable results and will improve your company’s bottom line.
Guest post by Craig Ketner





