• Latest Analytics Opportunities in US Healthcare – An update


    Over the past few years there has been a lot of buzz around the Affordable Care Act (commonly known as Obamacare), whose major provisions took effect in 2014. The reform mandates that every individual buy health insurance, irrespective of which health bracket they fall in, and it also requires larger employers (those with 50 or more full-time employees) to provide health insurance to their employees. While the law was being rolled out, several states filed lawsuits against the federal government, claiming that it was unconstitutional to force citizens to buy health insurance.

    Companies spend billions of dollars every year on health insurance, yet we see very limited initiatives to organize healthcare data and build analytics around it. The major hurdle for healthcare companies is deciding what kind of health plan or deal they can offer small and medium-sized employers so that those employers' interest in providing comprehensive healthcare to their employees goes up. Companies like Blue Cross Blue Shield, Kaiser Permanente, Highmark, and UnitedHealth Group have spent heavily on setting up their IT infrastructure, but their investment in exploratory and predictive analytics lags far behind.

    Exploratory data analysis has proved to be a great starting point in the analysis of B2B healthcare relations, enabling healthcare firms to help companies of all sizes provide comprehensive health insurance. Techniques like classification and segmentation help in strategizing plans for small companies (even those with fewer than 5 employees) by giving them the option to join a pool or consortium and obtain healthcare coverage like a mid-sized company. Individually, these companies may not be in a position to buy healthcare for their employees at all, but joining a bigger umbrella (a consortium or pool) makes the healthcare plan affordable.

    For mid-sized and larger companies, predictive analytics can help healthcare firms estimate the volume of claims likely to arise from employees. This helps them set the right premiums and other costs, such as co-pays and deductibles, for the insured.
    With proper analysis of an individual's health, premium, and claims history, data scientists may be able to suggest a suitable plan for each individual (HMO vs. PPO vs. Consumer-Directed Health Plan, CDHP).
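    The premium-estimation idea above can be sketched in a few lines. This is a minimal illustration only: the function names, loading factors, and input figures are hypothetical, not an actual insurer's pricing model.

    ```python
    # Hypothetical sketch: premium = expected claims plus loadings.
    # All parameter values below are illustrative assumptions.
    def expected_claims(prob_claim, avg_claim_cost, n_employees):
        """Expected annual claims cost for an employer group."""
        return prob_claim * avg_claim_cost * n_employees

    def annual_premium(exp_claims, admin_load=0.15, risk_margin=0.05):
        """Gross premium: expected claims plus admin and risk loadings."""
        return exp_claims * (1 + admin_load + risk_margin)

    claims = expected_claims(prob_claim=0.3, avg_claim_cost=4000, n_employees=120)
    print(round(annual_premium(claims), 2))  # 172800.0
    ```

    A real actuarial model would of course condition the claim probability and cost on risk scores (e.g., DxCG), demographics, and claims history rather than using flat averages.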

    There is a wealth of data available in the healthcare system that requires extensive research and analysis. This includes:

    • DxCG Health Risk Scores Data
    • Claims Data
      • Inpatient Claims
      • Out-Patient Claims
      • Denials Data
      • Resubmissions of Claims
    • Premiums (Co-Pay and Deductible)
    • Dental Insurance
    • Eye Care Insurance
    • TPA Data

    The points mentioned above are just the tip of the iceberg. Data scientists have become deeply interested in the use of big data in healthcare insurance: about 70% of healthcare data is unstructured. By applying big-data techniques, data scientists expect to learn trends from this data and extract information that can serve healthcare firms, employers, and brokers alike.

  • Fortune in the cookies – maximizing online customer acquisition

    A cry in the dark

    Consider a person who has just walked into a Macy's in a mall. Why is she in the store, and what is she looking for? Has she been to other stores, or other Macy's locations, looking for the same item(s) she now wishes to purchase?

    In the traditional world, Macy's can never know any of the above, and that, precisely, has always been the Achilles' heel of traditional marketing. It constitutes a 2-player game (Player 1: Buyer; Player 2: Seller) in which Player 1 has a distinct advantage due to the incomplete information at Player 2's disposal. The Buyer is looking to maximize her utility from the purchase, while the Seller is looking to make a sale and maximize his margins on it. In the traditional setup, the Buyer generally knows what products are being offered, their prices, the potential cost of those products to the Seller, and similar statistics even for the Seller's competitors. The Seller, on the other hand, has little to no information on the Buyer's tastes and preferences, purchase behavior, prior purchase attempts, the urgency of her need, or the trigger for her purchase decision. The best guess the Seller can make concerns her purchasing ability and her intent to buy. This is an incomplete information set, and an asymmetric one at that (given that the Buyer knows more). Any system that improves this information set for the Seller improves his ability to maximize his objective function.

    “I know what you did last summer…and even 5 minutes ago”
    Now consider our current e-commerce environment: chances are all the activities of the buyer are recorded in what we call cookies. This includes how many times she has viewed the product, on how many sites, for what length of time, how many times she has shown purchase intent by adding it to her shopping cart, what related products she has viewed or purchased, and what related searches she has conducted. It is these cookies that hold the key to unlocking the consumer's utility function, by revealing her tastes and preferences, purchase behavior, and the whole nine yards. The question is: how do you wield that key? There are terabytes of data to trudge through before you get to something meaningful and actionable. Every buyer has her own length of history and search pattern for a single purchase. Multiply that by the few dozen purchases she makes in a year and the few million customers a single seller is dealing with.

    In my opinion, the most amazing gift endowed upon us marketers by digital media is this ability to deconstruct the drivers, needs, and aspirations of buyers down to atomic levels. Thanks to marketing organizations religiously farming SERP keywords, cookies, and site navigation data, we are in the lucrative position of unlocking the buyer’s utility function provided we are able to eliminate the noise in the signals by applying advanced quantitative methods on big data. Before we get into the ‘geek talk’ overdrive let us define the fundamental questions we are seeking to answer in order to arrive at individual buyer specific targeting strategies.

    It all begins with a search: Every search is nothing but an ‘expression’ of intent which offers the key to unraveling the buyers need of the hour.

    Understanding the long and short of the buyer's recent web-trail creates an opportunity for the digital marketer to define customized strategies empowered by contextual and behavioral targeting.

    Combining search intent and cookie trail with site navigation (i.e. Pathing) helps understand which acquisition journeys lead to conversion and which paths are, well, roads to nowhere.

    Search (organic + paid) traffic coming into an established e-commerce website often comprises hundreds of thousands of unique keywords. However, these seemingly distinct searches can be assigned to a finite set of 'intent groups' through logical classification of 'semantic' and 'thematic' similarity. Consider these two keywords: "best credit card for small business" vs. "top small business credit cards". The two searches are semantically different, but they clearly express very similar, if not the same, intent on the part of the searchers. Essentially, the intent here can be categorized as COMPARATIVE. Needless to say, by applying various text-mining techniques, we can capture the massive number of searches in distinct intent groups such as INFORMATIONAL ("what is…", "how to…"), CALL TO ACTION ("apply for…", "buy online…"), and so on.
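    A crude version of this intent grouping can be sketched with simple keyword rules. The patterns below are illustrative assumptions only; a production system would use the text-mining and semantic-clustering techniques the text describes rather than hand-written regexes.

    ```python
    import re

    # Hand-written, illustrative rules; order matters (first match wins).
    INTENT_PATTERNS = {
        "CALL TO ACTION": re.compile(r"\b(apply|buy|sign up|order)\b"),
        "COMPARATIVE": re.compile(r"\b(best|top|vs|compare|cheapest)\b"),
        "INFORMATIONAL": re.compile(r"\b(what is|how to|why|guide)\b"),
    }

    def intent_group(keyword):
        """Assign a search keyword to the first matching intent group."""
        kw = keyword.lower()
        for group, pattern in INTENT_PATTERNS.items():
            if pattern.search(kw):
                return group
        return "OTHER"

    print(intent_group("best credit card for small business"))  # COMPARATIVE
    print(intent_group("apply for a business card online"))     # CALL TO ACTION
    ```

    Both example keywords from the text land in COMPARATIVE, while action-oriented searches fall into CALL TO ACTION, reducing hundreds of thousands of raw keywords to a handful of actionable segments.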

    The objective of the entire exercise is to reduce the dimensionality of the massive keyword data into actionable, logical, accurate intent groups. This, when achieved, enables the digital marketer to rank site visitors from search channels by 'purchase propensity'. For instance, a visitor coming into the site having searched "apply today" or "instant approval" is far lower in the sales funnel (i.e., closer to conversion) than one who arrived searching "low interest cards". You can therefore understand how robust search-intent segmentation can create a definitive early advantage for the e-marketer in answering 'WHO IS THE BUYER?'.

    All this is great, but we also know that not everyone in the CALL TO ACTION group converts, and that a few in the weaker intent segments actually do. This is often a function of how visitors interact with the site ('pathing'). A smart e-commerce site can shape a visitor's navigation based on knowledge of her search-intent group and cookie trail, maximizing the likelihood that 'site navigation' culminates in an 'acquisition journey'.

    Once we are able to define concrete intent groups and the recent purchase priorities or needs (i.e., search history) of the visitor, we can serve customized page/content displays that keep the visitor on the 'conversion path'.

    The analytical techniques here get way more complex than standard regression models. This is because one cannot make the oversimplified assumption of identifying the triggers of conversion based on site navigation on the day the conversion happens. Why so? Let’s try and illustrate with an example: A visitor comes to a luxury watch retailer site, landing on the Homepage and takes the following path:

    Homepage–>Products–>Add to Cart–>Checkout

    If our dependent variable were 'conversion' (Y/N) and the independent variables (predictors) were visit/no-visit flags for the website pages, then a traditional logistic regression model would imply that the 'Products', 'Add to Cart', and 'Checkout' pages are the strongest influencers of conversion. But logically we know these are 'self-selected' pages for people who convert: one cannot select the product of choice without clicking on the 'Products' page, and cannot complete the purchase without going through the ritual of visiting the subsequent two pages. This constitutes a classic situation where 'correlation' does not imply 'causality'.

    So where did the model fail?
    It failed because it ignored the buyer’s entire acquisition journey through the sales window. On the day of the purchase, the decision to buy has most likely already been made in the buyer’s mind. The ‘pathing’ on that day is a mere execution of a foregone decision.
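    The self-selection trap above can be demonstrated with toy data. The sessions below are fabricated for illustration: every converting session necessarily contains a 'Checkout' visit, so that flag agrees with conversion perfectly without causing it.

    ```python
    # Fabricated conversion-day sessions:
    # (visited_products, visited_cart, visited_checkout, converted)
    sessions = [
        (1, 1, 1, 1),
        (1, 1, 1, 1),
        (1, 0, 0, 0),
        (0, 0, 0, 0),
        (1, 1, 0, 0),
    ]

    def match_rate(col):
        """Share of sessions where a page-visit flag equals the conversion flag."""
        return sum(1 for s in sessions if s[col] == s[3]) / len(sessions)

    print(match_rate(2))  # 1.0 -- 'Checkout' agrees with conversion in every session
    print(match_rate(0))  # 0.6 -- 'Products' visits also occur without conversion
    ```

    A model fit to such data would credit 'Checkout' with perfect predictive power, which is exactly the tautology the text warns against: the page is a mechanical prerequisite of purchase, not a driver of the decision.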

    The real journey of awareness-to-interest-to-decision, hidden away in the buyer’s prior visits to the site or related sites when she was mulling over the idea of whether or not to commit to the sale, holds the key to unlocking which pages/features on the site actually influenced her decision. These, mis amigos, are the real ‘foot soldiers’, the ‘movers & shakers’ that cradled the visitor to conversion.

    Mathematically, one therefore has to estimate a panel-data-based mixed-effects model in which the pathing of each visitor, converted or not, on each visit is accounted for. One needs to understand the critical importance of integrating search-intent and pathing-based insights into e-commerce strategies. The digital marketing world is a double-edged sword: on the one hand it offers tremendous opportunity to decode the buyer's utility function; on the other, it creates a perilous situation where substituting a competitor for the seller is a mere mouse click away, with the buyer able to compare offerings across multiple competing sellers in real time without moving an inch. The marketing campaigns that fail are often those where the seller puts his 'brand' above the buyer's 'needs'. Digital marketing should not be afflicted by the seller's ego and his self-assurance about the footprint of his brand, because the buyer is simply interested in her own best interest. If, by leveraging intelligent big-data analytics, you can weave yourself into her scheme of things so that she resonates with your brand as "THIS IS WHAT I WANT!", you will convert a site visitor into a customer; otherwise, your competitor surely will.

  • Paid Search Analytics


    Ad Position Analysis

    There are many benefits to being in the top position in Google AdWords: a higher click-through rate (CTR), more impressions, a greater share of search, and a greater likelihood of increased conversions. Unfortunately, along with these benefits come additional costs for the advertiser.

    Assuming an ad with an average quality score, the cost at an average ad position of 1 is about 30% more than at position 2, and the cost at position 2 is about 20% more than at position 3. Consequently, brands may want to economize ad campaigns by reducing their bids and settling for a lower ad position. However, this is a mistake that can cost them dearly, because lowering your cost per click is not useful if you are paying low prices for irrelevant clicks. By discovering new, relevant, and valuable clicks, the distribution of your budget will improve substantially.
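    The cost multipliers above imply position 1 costs roughly 1.56x as much as position 3. A quick sketch, using only the rule-of-thumb figures stated in the text (indexed to position 3 = 1.0):

    ```python
    # Relative CPC by ad position, per the stated multipliers:
    # position 1 is ~30% above position 2; position 2 is ~20% above position 3.
    def relative_cpc(position, base=1.0):
        """CPC indexed to position 3, for positions 1-3 only."""
        if position == 3:
            return base
        if position == 2:
            return base * 1.20
        if position == 1:
            return base * 1.20 * 1.30
        raise ValueError("multipliers only given for positions 1-3")

    for pos in (1, 2, 3):
        print(pos, round(relative_cpc(pos), 2))
    ```

    So the price of moving from position 3 to position 1 is a roughly 56% cost increase, which is the premium the advertiser must weigh against the CTR gains discussed below.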

    Here’s something interesting!

    Click-through rates fall by 80% between an average position of 2-3 and by a further 18% by a position of 3-4. For generic keywords, brands can target an average position as low as 4-5, but for brand-specific terms, accumulating clicks beyond an average position of 3-4 is highly unlikely.
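    Indexing CTR to 100 at the top band makes these decay figures concrete. The function below only encodes the two percentages stated above; the bands and index base are illustrative.

    ```python
    # Indexed CTR decay implied by the text: an 80% drop from average
    # position 1-2 to 2-3, then a further 18% drop by position 3-4.
    def indexed_ctr(position_band, top_ctr=100.0):
        """CTR indexed to 100 at average position 1-2 (illustrative)."""
        decay = {"1-2": 1.0, "2-3": 0.20, "3-4": 0.20 * 0.82}
        return top_ctr * decay[position_band]

    for band in ("1-2", "2-3", "3-4"):
        print(band, round(indexed_ctr(band), 1))  # 100.0, 20.0, 16.4
    ```

    That is, an ad sliding from the top band to position 3-4 retains only about 16% of its original click-through rate, which is why the analysis below focuses on the top three positions.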

    For a global manufacturer and marketer of consumer and professional products, we analyzed paid-search data from Google AdWords for the 6 months ending September 2014 to find the 'sweet spot' of average position for their brands. For all brands, we observed maximum CTR at an average position of 1-2, followed by a steep decline of 80% at an average position of 2-3. The image on the left shows a comparison of CTR with average ad position across brands.

    [Charts: CTR by Ad Position (left); Change in CTR (right)]

    As depicted by the chart on the left, ads beyond position 3 are hardly clicked. Brands need to make sure their ads are placed in the top 3 positions to increase clicks. Since the top 3 positions are more expensive to bid on, one needs to prioritize among campaigns and brands. Also, a good Quality Score reduces the cost involved in bidding for the top positions.


    The image on the right quantifies the fall in CTR at positions 3 and 4. We found that at position 3, CTR for branded keywords falls by 65%, whereas CTR for generic keywords falls by 40%. CTR for branded keywords tends to zero within fewer ad positions than it does for generic keywords.

    More than 90% of clicks are associated with generic keywords for products in general categories, while close to 30% of clicks are associated with branded keywords for personal products. Branded keywords see a steeper fall in both CPC and CTR, compared with a more gradual fall for generic keywords.

    It is clear that CTRs at top positions are considerably higher than those at lower positions, suggesting that most consumers conduct limited search and have small consideration sets. Clicks at lower positions suggest that those consumers may be evaluating more ads before making their purchase decisions, and thus likely have higher purchase intent. In this case, placing ads at intermediate positions may be an effective way to reach high-purchase-intent consumers without paying more for the top positions.

    We conclude by noting that generic keywords are more contextual than branded keywords, and that campaigns require careful design, depending on whether they cover personal or general product categories, to attain maximum clicks.
