How would you check the validity of randomization on the available data?

Ace Health Insurance Inc. (AHI) offers health insurance to millions of customers in the US. AHI management is concerned about the rising customer support costs at its call center. A management consulting firm has recommended that AHI start a self-service web portal for its members. On the portal, members can find relevant health insurance information: they can see the status of their health insurance claims, explore member benefits, and look for treatment options. Members can also conduct transactions with AHI on the portal, such as ordering a new health insurance card. The portal would allow members to find the desired information themselves and make fewer calls to AHI’s call center.

Before rolling out the member web portal on its website, AHI decided to conduct a randomized experiment to examine the effect of the web portal on member calls. In this experiment, 1228 randomly selected members (called treated members) from a sample of 2473 members were given access to the web portal. The remaining 1245 members (called control members) did not have access to the web portal. For one year, AHI collected data on the total web portal visits and calls made by the members in the sample. AHI also collected data on the age, annual income, and the total number and amount of claims filed by these members in the previous year. The attached MS Excel file (Homework 1.xlsx) provides the data and data dictionary.

Answer the following questions:

How would you check the validity of randomization on the available data? Check the validity of randomization and show your results.
Compute the difference-in-means estimate for the treatment effect of availability of a web portal on member calls. Show your results.
What other variables in the data influence the number of members’ calls? Include these variables to compute the multivariable regression estimate for the treatment effect of the availability of a web portal on member calls. Why are the regression estimates better than the difference-in-means estimates?
Is the treatment effect of availability of the web portal higher for the members who receive a higher number (amount) of claims during the experiment? How would you check this with multivariable regression? Estimate the regression model and interpret the results.
While many treated members use the web portal a lot after its availability, others don’t use it at all. Estimate the effect of the number of member web portal visits on their calls with multivariable regressions. Interpret these results.
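
The first two questions (the randomization check and the difference-in-means estimate) can be sketched as follows. This is only an illustrative outline under stated assumptions, not the assignment's solution: the data are synthetic stand-ins generated in the script, the field names (`treated`, `age`, `income`, `calls`) are hypothetical and should be replaced with the names in the data dictionary, and Welch's t-statistic is one reasonable choice of balance test among several.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Return (mean(a) - mean(b), its standard error, Welch t-statistic)."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return diff, se, diff / se


# Synthetic stand-in for the Excel data. All field names and all
# distribution parameters below are hypothetical, chosen for illustration.
random.seed(0)
N_TOTAL, N_TREATED = 2473, 1228
treated_ids = set(random.sample(range(N_TOTAL), N_TREATED))

members = []
for i in range(N_TOTAL):
    is_treated = 1 if i in treated_ids else 0
    members.append({
        "treated": is_treated,
        "age": random.gauss(45, 12),
        "income": random.gauss(60000, 15000),
        # Assumed effect: portal access lowers calls by about 1.5 per year.
        "calls": max(0.0, random.gauss(6 - 1.5 * is_treated, 2)),
    })

treated = [m for m in members if m["treated"] == 1]
control = [m for m in members if m["treated"] == 0]

# Randomization check: pre-treatment covariates should have similar means
# in both groups, so their t-statistics should be small in absolute value.
balance = {}
for name in ("age", "income"):
    balance[name] = welch_t([m[name] for m in treated],
                            [m[name] for m in control])
    diff, se, t = balance[name]
    print(f"{name:>8}: diff={diff:10.3f}  se={se:9.3f}  t={t:6.2f}")

# Difference-in-means estimate of the treatment effect on calls.
effect, effect_se, effect_t = welch_t([m["calls"] for m in treated],
                                      [m["calls"] for m in control])
print(f"   calls: diff={effect:10.3f}  se={effect_se:9.3f}  t={effect_t:6.2f}")
```

With the real data, the synthetic rows would be replaced by rows read from the worksheet, and the balance loop would run over every pre-treatment variable (age, annual income, and the previous year's number and amount of claims). A large absolute t-statistic on any of those variables would cast doubt on the randomization.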
